linux/fs/erofs
Junbeom Yeom 4012d78562 erofs: fix unexpected EIO under memory pressure
erofs readahead could fail with ENOMEM under memory pressure because
it tries to alloc_page with GFP_NOWAIT | __GFP_NORETRY, whereas a regular
read uses GFP_KERNEL. If readahead fails (leaving non-uptodate folios),
the original request then falls back to a synchronous read, and
`.read_folio()` should return appropriate errnos.

However, when readahead and read operations race, the read operation
could return an unintended EIO because of incorrect error
propagation.

To resolve this, this patch modifies the behavior so that, when the
PCL is for read (which means pcl.besteffort is true), it attempts actual
decompression instead of propagating the previous error, except for an
initial EIO.

- Page size: 4K
- The original size of FileA: 16K
- Compress-ratio per PCL: 50% (Uncompressed 8K -> Compressed 4K)
[page0, page1] [page2, page3]
[PCL0]---------[PCL1]

- function declarations:
  . pread(fd, buf, count, offset)
  . readahead(fd, offset, count)
- Process A tries to read the last 4K
- Process B tries to read ahead 8K starting at offset 4K
- RA, besteffort == false
- R, besteffort == true
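
The layout above can be checked with a little arithmetic. The following is a minimal sketch, assuming the fixed geometry of this example (4K pages, two pages per PCL); `page_of`, `pcl_of` and `pcls_touched` are hypothetical helper names for illustration, not erofs functions.

```c
#include <assert.h>

#define EX_PAGE_SIZE     4096u  /* page size in this example: 4K */
#define EX_PAGES_PER_PCL 2u     /* each PCL covers 8K uncompressed */

/* Index of the page containing byte offset `off`. */
static unsigned int page_of(unsigned int off)
{
	return off / EX_PAGE_SIZE;
}

/* Index of the PCL covering page `page` (example geometry only). */
static unsigned int pcl_of(unsigned int page)
{
	return page / EX_PAGES_PER_PCL;
}

/*
 * Bitmask of PCLs touched by the byte range [off, off + len):
 * bit N set means PCL N is involved.
 */
static unsigned int pcls_touched(unsigned int off, unsigned int len)
{
	unsigned int p, mask = 0;

	assert(len > 0);
	for (p = page_of(off); p <= page_of(off + len - 1); p++)
		mask |= 1u << pcl_of(p);
	return mask;
}
```

With these helpers, pread(fd, buf, 4K, 12K) covers only page3, which lies in PCL1, whereas readahead(fd, 4K, 8K) covers page1 and page2 and therefore touches both PCL0 and PCL1. PCL1 is where the two requests meet.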

        <process A>                   <process B>

pread(FileA, buf, 4K, 12K)
  do readahead(page3) // failed with ENOMEM
  wait_lock(page3)
    if (!uptodate(page3))
      goto do_read
                               readahead(FileA, 4K, 8K)
                               // This creates a PCL chain as below:
                               // [null, page1] [page2, null]
                               //   [PCL0:RA]-----[PCL1:RA]
...
  do read(page3)        // finds [PCL1:RA], adds page3 to it,
                        // and then changes PCL1 from RA to R
...
                               // Now, PCL-chain is as below:
                               // [null, page1] [page2, page3]
                               //   [PCL0:RA]-----[PCL1:R]

                                 // try to decompress PCL-chain...
                                 z_erofs_decompress_queue
                                   err = 0;

                                   // failed with ENOMEM, so page1,
                                   // which is only for RA, will not be
                                   // uptodate. That's okay.
                                   err = decompress([PCL0:RA], err)

                                   // However, ENOMEM is propagated to the
                                   // next PCL, even though that PCL is not
                                   // only for RA but also for R. As a
                                   // result, it just fails with ENOMEM
                                   // without trying any decompression, so
                                   // page2 and page3 will not be uptodate.
                ** BUG HERE ** --> err = decompress([PCL1:R], err)

                                   return err as ENOMEM
...
    wait_lock(page3)
      if (!uptodate(page3))
        return EIO      <-- Return an unexpected EIO!
...
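
The trace above can be reproduced in a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct pcl_model`, `walk_chain()` and the `fixed` switch are hypothetical names, and the skip rule simply encodes the description above (a besteffort PCL still attempts decompression unless the chain started with EIO).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCL; not the kernel's structure. */
struct pcl_model {
	bool besteffort;  /* true once a synchronous read joined it (R) */
	int  decomp_err;  /* result decompression would give if attempted */
	bool attempted;   /* set when decompression was actually tried */
	bool uptodate;    /* set when its pages ended up uptodate */
};

/*
 * Walk a PCL chain the way the trace does, carrying an error along.
 * fixed == false: any earlier error makes later PCLs skip decompression.
 * fixed == true:  a besteffort PCL is still decompressed unless the
 *                 chain started with -EIO (assumed reading of the fix).
 */
static int walk_chain(struct pcl_model *chain, size_t n,
		      int initial_err, bool fixed)
{
	int err = initial_err;
	size_t i;

	assert(chain != NULL || n == 0);
	for (i = 0; i < n; i++) {
		struct pcl_model *pcl = &chain[i];
		bool skip = err != 0;

		if (fixed && pcl->besteffort && initial_err != -EIO)
			skip = false;
		if (skip) {
			pcl->uptodate = false;  /* nothing was decompressed */
			continue;
		}
		pcl->attempted = true;
		pcl->uptodate = pcl->decomp_err == 0;
		if (pcl->decomp_err)
			err = pcl->decomp_err;  /* remember the latest failure */
	}
	return err;
}
```

Fed with [PCL0:RA] failing with ENOMEM followed by [PCL1:R], the old rule never attempts PCL1, so page3 stays non-uptodate and the reader reports EIO. With the new rule PCL1 is decompressed even though the chain as a whole still returns ENOMEM.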

Fixes: 2349d2fa02 ("erofs: sunset unneeded NOFAILs")
Cc: stable@vger.kernel.org
Reviewed-by: Jaewook Kim <jw5454.kim@samsung.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Junbeom Yeom <junbeom.yeom@samsung.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-12-22 00:18:53 +08:00
Kconfig erofs: Do not select tristate symbols from bool symbols 2025-08-11 06:02:20 +08:00
Makefile erofs: support DEFLATE decompression by using Intel QAT 2025-05-25 15:27:40 +08:00
compress.h erofs: enable error reporting for z_erofs_fixup_insize() 2025-11-30 23:49:32 +08:00
data.c iomap: add caller-provided callbacks for read and readahead 2025-11-05 12:57:23 +01:00
decompressor.c erofs: enable error reporting for z_erofs_fixup_insize() 2025-11-30 23:49:32 +08:00
decompressor_crypto.c erofs: enable error reporting for z_erofs_fixup_insize() 2025-11-30 23:49:32 +08:00
decompressor_deflate.c erofs: enable error reporting for z_erofs_fixup_insize() 2025-11-30 23:49:32 +08:00
decompressor_lzma.c erofs: enable error reporting for z_erofs_fixup_insize() 2025-11-30 23:49:32 +08:00
decompressor_zstd.c erofs: enable error reporting for z_erofs_fixup_insize() 2025-11-30 23:49:32 +08:00
dir.c erofs: Add support for FS_IOC_GETFSLABEL 2025-09-25 11:26:20 +08:00
erofs_fs.h erofs: switch on-disk header `erofs_fs.h` to MIT license 2025-12-01 15:25:43 +08:00
fileio.c Changes since last update: 2025-12-03 20:14:44 -08:00
fscache.c erofs: get rid of raw bi_end_io() usage 2025-11-30 23:55:13 +08:00
inode.c Coccinelle-based conversion to use ->i_state accessors 2025-10-20 20:22:26 +02:00
internal.h erofs: Add support for FS_IOC_GETFSLABEL 2025-09-25 11:26:20 +08:00
namei.c erofs: get rid of erofs_kmap_type 2025-03-17 01:21:24 +08:00
super.c erofs: limit the level of fs stacking for file-backed mounts 2025-11-24 14:17:29 +08:00
sysfs.c erofs: support to readahead dirent blocks in erofs_readdir() 2025-07-24 19:44:08 +08:00
xattr.c erofs: fix long xattr name prefix placement 2025-09-12 03:37:07 +08:00
xattr.h erofs: remove ENOATTR definition 2025-07-24 19:42:07 +08:00
zdata.c erofs: fix unexpected EIO under memory pressure 2025-12-22 00:18:53 +08:00
zmap.c erofs: consolidate z_erofs_extent_lookback() 2025-10-22 07:54:31 +08:00
zutil.c The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00