linux/fs/netfs
Max Kellermann 4d428dca25
netfs: fix reference leak
Commit 20d72b00ca ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1.  The rationale was that the
requet's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()).  That works most of the
time if all goes well.

However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.

This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch).  This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation().  The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.

All of this is super-fragile code.  Finding out which code paths will
lead to an eventual completion and which do not is hard to see:

- Some functions like netfs_create_write_req() allocate a request, but
  will never submit any I/O.

- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
  and then netfs_put_request(); however, netfs_unbuffered_read() can
  also fail early before submitting the I/O request, therefore another
  netfs_put_request() call must be added there.

A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.

For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile & obscure.  But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.

I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further.  Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE().  It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).

All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.

Fixes: 20d72b00ca ("netfs: Fix the request's work item to not require a ref")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-26 10:14:19 +02:00
..
Kconfig netfs: clean up after renaming FSCACHE_DEBUG config 2024-08-12 22:03:26 +02:00
Makefile netfs: Add support for caching single monolithic objects such as AFS dirs 2024-12-20 22:34:06 +01:00
buffered_read.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
buffered_write.c netfs: Prevent duplicate unlocking 2025-09-15 13:57:15 +02:00
direct_read.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
direct_write.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
fscache_cache.c netfs: Mark __nonstring lookup tables 2025-04-17 10:13:46 +02:00
fscache_cookie.c netfs: Mark __nonstring lookup tables 2025-04-17 10:13:46 +02:00
fscache_internal.h netfs, fscache: Combine fscache with netfs 2023-12-24 15:08:46 +00:00
fscache_io.c netfs: Fix the request's work item to not require a ref 2025-05-21 14:35:20 +02:00
fscache_main.c fscache: delete fscache_cookie_lru_timer when fscache exits to avoid UAF 2024-09-01 10:30:25 +02:00
fscache_proc.c netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" 2024-01-04 13:15:32 +00:00
fscache_stats.c netfs: Fix interaction between write-streaming and cachefiles culling 2024-01-05 15:42:25 +00:00
fscache_volume.c netfs/fscache: Add a memory barrier for FSCACHE_VOLUME_CREATING 2024-11-11 14:39:38 +01:00
internal.h netfs: fix reference leak 2025-09-26 10:14:19 +02:00
iterator.c netfs: Speed up buffered reading 2024-09-12 12:20:41 +02:00
locking.c netfs: Downgrade i_rwsem for a buffered write 2024-10-17 15:33:42 +02:00
main.c netfs: Renumber the NETFS_RREQ_* flags to make traces easier to read 2025-07-01 22:37:14 +02:00
misc.c netfs: Update tracepoints in a number of ways 2025-07-01 22:37:14 +02:00
objects.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
read_collect.c netfs: Fix unbuffered write error handling 2025-08-15 15:56:49 +02:00
read_pgpriv2.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
read_retry.c netfs: Fix wait/wake to be consistent about the waitqueue used 2025-05-21 14:35:21 +02:00
read_single.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
rolling_buffer.c netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits 2025-03-19 10:04:22 +01:00
stats.c netfs: Add retry stat counters 2025-02-13 16:00:48 +01:00
write_collect.c netfs: Fix unbuffered write error handling 2025-08-15 15:56:49 +02:00
write_issue.c netfs: fix reference leak 2025-09-26 10:14:19 +02:00
write_retry.c netfs: Update tracepoints in a number of ways 2025-07-01 22:37:14 +02:00