linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	1ca0f935a1	nfsd-6.15 fixes: - v6.15 libcrc clean-up makes invalid configurations possible - Fix a potential deadlock introduced during the v6.15 merge window -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmgD2mYACgkQM2qzM29m f5fn3xAAlPZP7MInbeszGMeRw9zdWaqbg4C90Gehdvi7sMBIYRo9E36NZuT4vl4D dfpHhoMIPMO5gsGrOeJZAT27La1F8M+O4yZOiuheTfIikaBrbZcZZtpwdRQHayGw qhHz8XoVNPsqvnoG+95u15XWG4IbI4kme2gNbYxEgF8zGERLv1aKPQTh3FqYG7Bk +9ANjoPNCAlnUAYdVHJZ5kwfTR10pDwx+jC45pZA4+hAusLx1Ek43pkcBgLMvoh2 ylbWa6SKceVdh/kb3rWaKXtVptNjcaMWLUZNSpd6Bu9e3zDCfjJN0hKUamHD+hSn HjYKbNU1W/l0wJsq9KKDwrOgE7LDZPP/mQ3MkFtMR/ccyTCAGe9Wy6nv7PnwEAu2 GqRh34VbNDGMPp4bpuTlp2FAL91BtGyYCkuwDZ1lqUDqnCmg4gRjyzFpkFF4lXPJ hq4J3NgA9BjO9wlQ4P3KNKFgE34s/8R0ey7euyrI5hxFoVFKxr5DNgKXVeW0FsY/ zMaDLzVN2Uv37IxjKhc1XvE2Ml6+J3tRv9awv+Yp05tG8/3vAhogs7So3TtwfWdU iWA3G0+zB30ATqdeXwpmJ8XxIqBnixDKhJcnLNOwrT+vAeqqqa0C39M0sOTQTMg3 VyFYkgt7v8jljp34DRfc4DAhTCsHzBzgAh9H/jrTP1xDlp7i7AU= =vCFi -----END PGP SIGNATURE----- Merge tag 'nfsd-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - v6.15 libcrc clean-up makes invalid configurations possible - Fix a potential deadlock introduced during the v6.15 merge window * tag 'nfsd-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: decrease sc_count directly if fail to queue dl_recall nfs: add missing selections of CONFIG_CRC32	2025-04-19 10:38:03 -07:00
Eric Biggers	cd35b6cb46	nfs: add missing selections of CONFIG_CRC32 nfs.ko, nfsd.ko, and lockd.ko all use crc32_le(), which is available only when CONFIG_CRC32 is enabled. But the only NFS kconfig option that selected CONFIG_CRC32 was CONFIG_NFS_DEBUG, which is client-specific and did not actually guard the use of crc32_le() even on the client. The code worked around this bug by only actually calling crc32_le() when CONFIG_CRC32 is built-in, instead hard-coding '0' in other cases. This avoided randconfig build errors, and in real kernels the fallback code was unlikely to be reached since CONFIG_CRC32 is 'default y'. But, this really needs to just be done properly, especially now that I'm planning to update CONFIG_CRC32 to not be 'default y'. Therefore, make CONFIG_NFS_FS, CONFIG_NFSD, and CONFIG_LOCKD select CONFIG_CRC32. Then remove the fallback code that becomes unnecessary, as well as the selection of CONFIG_CRC32 from CONFIG_NFS_DEBUG. Fixes: `1264a2f053` ("NFS: refactor code for calculating the crc32 hash of a filehandle") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-04-13 16:39:42 -04:00
Linus Torvalds	94d471a4f4	NFS client updates for Linux 6.15 Highlights include: Bugfixes: - 3 Fixes for looping in the NFSv4 state manager delegation code. - Fix for the NFSv4 state XDR code from Neil Brown. - Fix a leaked reference in nfs_lock_and_join_requests(). - Fix a use-after-free in the delegation return code. Features: - Implemenation of the NFSv4.2 copy offload OFFLOAD_STATUS operation to allow monitoring of an in-progress copy. - Add a mount option to force NFSv3/NFSv4 to use READDIRPLUS in a getdents() call. - SUNRPC now allows some basic management of an existing RPC client's connections using sysfs. - Improvements to the automated teardown of a NFS client when the container it was initiated from gets killed. - Improvements to prevent tasks from getting stuck in a killable wait state after calling exit_signals(). -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmftuE0ACgkQZwvnipYK APIAAhAAqFdJnh88UUT0/R184Qzpd021lR9XhxkwNA3TzhOIzmpuTgBzNE1iMG1j EHveYqCpTU2orA1aisAyw5c8meJlsCQREPDvUOQ2i4BTCCmsBHOMxg7KDWwwRdNh SVDCezFWrHYz4An81jpgBe3/x6RJaEyAhKC45ZzQruiBtSMeoOX1TAV/DTWwEo0j JcLdAUSGVBsfyrj3qT0oJXoj+96o7rbB80loCdNKy8m8PBWHWp0oILwuU00XdXgu 7jYyjZfxW1013It+vfVFsjTYRVfJ92pq3wiz/U9HXYDe3Arc4oPRw509/Jo3xEWW tdUljc/HepD3459ahiubTCLY39JxILl8/GapWe2Fn0J/JJuOGgZX9lqIMKDn4QCA 6TBOqWK7OEwImj4M7cfPptJQWd+hp91T4AR13xWJeQgp19AR8yOqEW0YX6hVlaBg UrBwdR+l6ys5lJJBReUW+JMDCYZmbH9RjuwcqzXn71JmlACHNFi6odwLnQ1mInvF P5pEf7aXaZkF6kEz2kmZ1eUgdkERAaIGCNFQTui6intlCSlQodNurrEU7Vx146os OvowJYM0HvnVBDOnERrJD04HADKZeDS8jt59ev0uXbP/NFxEJnPRRQgIdiZbfISV beQrc2fpUgwdjYAURbW1qWO7XNTJzK9LHJzn02SytfCazX0IQO0= =zPX4 -----END PGP SIGNATURE----- Merge tag 'nfs-for-6.15-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Bugfixes: - Three fixes for looping in the NFSv4 state manager delegation code - Fix for the NFSv4 state XDR code (Neil Brown) - Fix a leaked reference in nfs_lock_and_join_requests() - Fix a use-after-free in the delegation return code Features: - Implement the NFSv4.2 copy offload OFFLOAD_STATUS operation to allow monitoring of an in-progress copy - Add a mount option to force NFSv3/NFSv4 to use READDIRPLUS in a getdents() call - SUNRPC now allows some basic management of an existing RPC client's connections using sysfs - Improvements to the automated teardown of a NFS client when the container it was initiated from gets killed - Improvements to prevent tasks from getting stuck in a killable wait state after calling exit_signals()" * tag 'nfs-for-6.15-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (29 commits) nfs: Add missing release on error in nfs_lock_and_join_requests() NFSv4: Check for delegation validity in nfs_start_delegation_return_locked() NFS: Don't allow waiting for exiting tasks SUNRPC: Don't allow waiting for exiting tasks NFSv4: Treat ENETUNREACH errors as fatal for state recovery NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client NFSv4: Further cleanups to shutdown loops NFS: Shut down the nfs_client only after all the superblocks SUNRPC: rpc_clnt_set_transport() must not change the autobind setting SUNRPC: rpcbind should never reset the port to the value '0' pNFS/flexfiles: Report ENETDOWN as a connection error pNFS/flexfiles: Treat ENETUNREACH errors as fatal in containers NFS: Treat ENETUNREACH errors as fatal in containers NFS: Add a mount option to make ENETUNREACH errors fatal sunrpc: Add a sysfs file for one-step xprt deletion sunrpc: Add a sysfs file for adding a new xprt sunrpc: Add a sysfs files for rpc_clnt information sunrpc: Add a sysfs attr for xprtsec NFS: Add implid to sysfs NFS: Extend rdirplus mount option with "force\|none" ...	2025-04-02 17:06:31 -07:00
Dan Carpenter	8e5419d654	nfs: Add missing release on error in nfs_lock_and_join_requests() Call nfs_release_request() on this error path before returning. Fixes: `c3f2235782` ("nfs: fold nfs_folio_find_and_lock_request into nfs_lock_and_join_requests") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/3aaaa3d5-1c8a-41e4-98c7-717801ddd171@stanley.mountain Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-04-02 09:53:16 -04:00
Linus Torvalds	b6dde1e527	NFSD 6.15 Release Notes Neil Brown contributed more scalability improvements to NFSD's open file cache, and Jeff Layton contributed a menagerie of repairs to NFSD's NFSv4 callback / backchannel implementation. Mike Snitzer contributed a change to NFS re-export support that disables support for file locking on a re-exported NFSv4 mount. This is because NFSv4 state recovery is currently difficult if not impossible for re-exported NFS mounts. The change aims to prevent data integrity exposures after the re-export server crashes. Work continues on the evolving NFSD netlink administrative API. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.15 development cycle. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmfmpMIACgkQM2qzM29m f5f6DA/+P0YqoRg3Zk/4oWwXZWbfEOMhWFltT+D1PE2QjUfOZpiwUSFQfsfYgXO6 OFu0iDQ4g8BxBeP6Umv61qy7Cv6n4fVzIHqzymXQvymh9JzoQiXlE9/fA8nAHuiH u7kkNPRi7faBz1sMg/WpN9CHctg7STPOhhG/JrZcSFZnh87mU1i4i4bZBNz8tVnK ZWf483OUuSmJY2/bUTkwvr4GbceTKBlLWFFjiRhfAKvJBWvu4myfC0DI5QzxmsgI MJ62do7AFJP1ww2Ih9LLi2kFIt/yyInSVAgyts1CPhlJ4BfPnTSOw/i2+CuF3D/M bZYEAOjH3AqjBZmq58sIQezpD5f9/TOrTSwYwS31zl/THYE413WiW80/MDoWqo0y 9cSNkD3nJlPVLLCfF58vXLoe7wpLoN/ZbTdxoozzUWEFR5A4Jz3XP8F/Cws0cjem uWWAQMItiQpg1+RYJYfu4dg5+iN6dbgYbvzlr7buISwFNXi3Zo99MkJ4wHj9TJbL Tpjth1rWGPwwSOMT6ojKiYMq1oUzx5PuAm9Saq9oIzQAbBySmxHF/LSDz3wEuBoO MK1jzKroEmMk3fJOOAajSDLOdAbL3vfj6H/xi2IHvKnaz9yHCZNu2YGV05BBMprd hWePf69AO5Ky5Q9KuGClEtwvJ9ZR5pb4DO2dqaYu8ximu3O4vPo= =e2E2 -----END PGP SIGNATURE----- Merge tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd updates from Chuck Lever: "Neil Brown contributed more scalability improvements to NFSD's open file cache, and Jeff Layton contributed a menagerie of repairs to NFSD's NFSv4 callback / backchannel implementation. Mike Snitzer contributed a change to NFS re-export support that disables support for file locking on a re-exported NFSv4 mount. This is because NFSv4 state recovery is currently difficult if not impossible for re-exported NFS mounts. The change aims to prevent data integrity exposures after the re-export server crashes. Work continues on the evolving NFSD netlink administrative API. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.15 development cycle" * tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (45 commits) NFSD: Add a Kconfig setting to enable delegated timestamps sysctl: Fixes nsm_local_state bounds nfsd: use a long for the count in nfsd4_state_shrinker_count() nfsd: remove obsolete comment from nfs4_alloc_stid nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault() nfsd: reorganize struct nfs4_delegation for better packing nfsd: handle errors from rpc_call_async() nfsd: move cb_need_restart flag into cb_flags nfsd: replace CB_GETATTR_BUSY with NFSD4_CALLBACK_RUNNING nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY nfsd: prevent callback tasks running concurrently nfsd: disallow file locking and delegations for NFSv4 reexport nfsd: filecache: drop the list_lru lock during lock gc scans nfsd: filecache: don't repeatedly add/remove files on the lru list nfsd: filecache: introduce NFSD_FILE_RECENT nfsd: filecache: use list_lru_walk_node() in nfsd_file_gc() nfsd: filecache: use nfsd_file_dispose_list() in nfsd_file_close_inode_sync() NFSD: Re-organize nfsd_file_gc_worker() nfsd: filecache: remove race handling. fs: nfs: acl: Avoid -Wflex-array-member-not-at-end warning ...	2025-03-31 17:28:17 -07:00
Trond Myklebust	9e8f324bd4	NFSv4: Check for delegation validity in nfs_start_delegation_return_locked() Check that the delegation is still attached after taking the spin lock in nfs_start_delegation_return_locked(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-28 16:38:05 -04:00
Trond Myklebust	8d3ca33102	NFS: Don't allow waiting for exiting tasks Once a task calls exit_signals() it can no longer be signalled. So do not allow it to do killable waits. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-28 16:38:02 -04:00
Trond Myklebust	0af5fb5ed3	NFSv4: Treat ENETUNREACH errors as fatal for state recovery If a containerised process is killed and causes an ENETUNREACH or ENETDOWN error to be propagated to the state manager, then mark the nfs_client as being dead so that we don't loop in functions that are expecting recovery to succeed. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-26 12:19:58 -04:00
Trond Myklebust	c81d5bcb7b	NFSv4: clp->cl_cons_state < 0 signifies an invalid nfs_client If someone calls nfs_mark_client_ready(clp, status) with a negative value for status, then that should signal that the nfs_client is no longer valid. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-26 12:19:51 -04:00
Trond Myklebust	3eba080e4d	NFSv4: Further cleanups to shutdown loops Replace the tests for the RPC client being shut down with tests for whether the nfs_client is in an error state. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-26 12:19:08 -04:00
Trond Myklebust	2d3e998a0b	NFS: Shut down the nfs_client only after all the superblocks The nfs_client manages state for all the superblocks in the "cl_superblocks" list, so it must not be shut down until all of them are gone. Fixes: `7d3e26a054` ("NFS: Cancel all existing RPC tasks when shutdown") Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-26 12:17:48 -04:00
Linus Torvalds	26d8e43079	vfs-6.15-rc1.async.dir -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90rNwAKCRCRxhvAZXjc onBJAP9Z8Ywmlb5KQ1E3HvDmkwyY6yOSyZ9/CmbzrkCJ8ywYkQD/d9/xt0EP/O/q N8YtzXArHWt7u0YbcVpy9WK3F72BdwU= =VJgY -----END PGP SIGNATURE----- Merge tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs async dir updates from Christian Brauner: "This contains cleanups that fell out of the work from async directory handling: - Change kern_path_locked() and user_path_locked_at() to never return a negative dentry. This simplifies the usability of these helpers in various places - Drop d_exact_alias() from the remaining place in NFS where it is still used. This also allows us to drop the d_exact_alias() helper completely - Drop an unnecessary call to fh_update() from nfsd_create_locked() - Change i_op->mkdir() to return a struct dentry Change vfs_mkdir() to return a dentry provided by the filesystems which is hashed and positive. This allows us to reduce the number of cases where the resulting dentry is not positive to very few cases. The code in these places becomes simpler and easier to understand. - Repack DENTRY_* and LOOKUP_* flags" * tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: doc: fix inline emphasis warning VFS: Change vfs_mkdir() to return the dentry. nfs: change mkdir inode_operation to return alternate dentry if needed. fuse: return correct dentry for ->mkdir ceph: return the correct dentry on mkdir hostfs: store inode in dentry after mkdir if possible. Change inode_operations.mkdir to return struct dentry * nfsd: drop fh_update() from S_IFDIR branch of nfsd_create_locked() nfs/vfs: discard d_exact_alias() VFS: add common error checks to lookup_one_qstr_excl() VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry VFS: repack LOOKUP_ bit flags. VFS: repack DENTRY_ flags.	2025-03-24 10:47:14 -07:00
Trond Myklebust	aa42add73c	pNFS/flexfiles: Report ENETDOWN as a connection error If the client should see an ENETDOWN when trying to connect to the data server, it might still be able to talk to the metadata server through another NIC. If so, report the error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-21 12:44:26 -04:00
Trond Myklebust	8c9f0df7a8	pNFS/flexfiles: Treat ENETUNREACH errors as fatal in containers Propagate the NFS_MOUNT_NETUNREACH_FATAL flag to work with the pNFS flexfiles client. In these circumstances, the client needs to treat the ENETDOWN and ENETUNREACH errors as fatal, and should abandon the attempted I/O. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-21 12:44:24 -04:00
Trond Myklebust	9827144bfb	NFS: Treat ENETUNREACH errors as fatal in containers Propagate the NFS_MOUNT_NETUNREACH_FATAL flag to work with the generic NFS client. If the flag is set, the client will receive ENETDOWN and ENETUNREACH errors from the RPC layer, and is expected to treat them as being fatal. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-21 12:44:19 -04:00
Trond Myklebust	487fae09d7	NFS: Add a mount option to make ENETUNREACH errors fatal If the NFS client was initially created in a container, and that container is torn down, there is usually no possibity to go back and destroy any NFS clients that are hung because their virtual network devices have been unlinked. Add a flag that tells the NFS client that in these circumstances, it should treat ENETDOWN and ENETUNREACH errors as fatal to the NFS client. The option defaults to being on when the mount happens from inside a net namespace that is not "init_net". Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-21 12:44:09 -04:00
Anna Schumaker	e171b96500	NFS: Add implid to sysfs The Linux NFS server added support for returning this information during an EXCHANGE_ID in Linux v6.13. This is something and admin might want to query, so let's add it to sysfs. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Link: https://lore.kernel.org/r/20250207204225.594002-2-anna@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-21 09:34:52 -04:00
Benjamin Coddington	cfe1f8776f	NFS: Extend rdirplus mount option with "force\|none" There are certain users that wish to force the NFS client to choose READDIRPLUS over READDIR for a particular mount. Update the "rdirplus" mount option to optionally accept values. For "rdirplus=force", the NFS client will always attempt to use READDDIRPLUS. The setting of "rdirplus=none" is aliased to the existing "nordirplus". Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/c4cf0de4c8be0930b91bc74bee310d289781cd3b.1741885071.git.bcodding@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-21 09:34:52 -04:00
Chuck Lever	2d6a194f4b	NFS: Refactor trace_nfs4_offload_cancel Add a trace_nfs4_offload_status trace point that looks just like trace_nfs4_offload_cancel. Promote that event to an event class to avoid duplicating code. An alternative approach would be to expand trace_nfs4_offload_status to report more of the actual OFFLOAD_STATUS result. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250113153235.48706-16-cel@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Chuck Lever	adcc0aef9e	NFS: Use NFSv4.2's OFFLOAD_STATUS operation We've found that there are cases where a transport disconnection results in the loss of callback RPCs. NFS servers typically do not retransmit callback operations after a disconnect. This can be a problem for the Linux NFS client's current implementation of asynchronous COPY, which waits indefinitely for a CB_OFFLOAD callback. If a transport disconnect occurs while an async COPY is running, there's a good chance the client will never get the completing CB_OFFLOAD. Fix this by implementing the OFFLOAD_STATUS operation so that the Linux NFS client can probe the NFS server if it doesn't see a CB_OFFLOAD in a reasonable amount of time. This patch implements a simplistic check. As future work, the client might also be able to detect whether there is no forward progress on the request asynchronous COPY operation, and CANCEL it. Suggested-by: Olga Kornievskaia <kolga@netapp.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250113153235.48706-15-cel@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Chuck Lever	77dd8a302f	NFS: Implement NFSv4.2's OFFLOAD_STATUS operation Enable the Linux NFS client to observe the progress of an offloaded asynchronous COPY operation. This new operation will be put to use in a subsequent patch. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250113153235.48706-14-cel@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Chuck Lever	8955e7ce61	NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR Add XDR encoding and decoding functions for the NFSv4.2 OFFLOAD_STATUS operation. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/20250113153235.48706-13-cel@kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
NeilBrown	43502f6e8d	NFS: fix open_owner_id_maxsz and related fields. A recent change increased the size of an NFSv4 open owner, but didn't increase the corresponding max_sz defines. This is not know to have caused failure, but should be fixed. This patch also fixes some relates _maxsz fields that are wrong. Note that the XXX_owner_id_maxsz values now are only the size of the id and do NOT include the len field that will always preceed the id in xdr encoding. I think this is clearer. Reported-by: David Disseldorp <ddiss@suse.com> Fixes: `d98f722725` ("nfs: simplify and guarantee owner uniqueness.") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Trond Myklebust	e767b59e29	NFSv4: Avoid unnecessary scans of filesystems for delayed delegations The amount of looping through the list of delegations is occasionally leading to soft lockups. If the state manager was asked to manage the delayed return of delegations, then only scan those filesystems containing delegations that were marked as being delayed. Fixes: `be20037725` ("NFSv4: Fix delegation return in cases where we have to retry") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Trond Myklebust	f163aa81a7	NFSv4: Avoid unnecessary scans of filesystems for expired delegations The amount of looping through the list of delegations is occasionally leading to soft lockups. If the state manager was asked to reap the expired delegations, it should scan only those filesystems that hold delegations that need to be reaped. Fixes: `7f156ef0bf` ("NFSv4: Clean up nfs_delegation_reap_expired()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Trond Myklebust	35a566a24e	NFSv4: Avoid unnecessary scans of filesystems for returning delegations The amount of looping through the list of delegations is occasionally leading to soft lockups. If the state manager was asked to return delegations asynchronously, it should only scan those filesystems that hold delegations that need to be returned. Fixes: `af3b61bf61` ("NFSv4: Clean up nfs_client_return_marked_delegations()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:53 -04:00
Trond Myklebust	47acca884f	NFSv4: Don't trigger uneccessary scans for return-on-close delegations The amount of looping through the list of delegations is occasionally leading to soft lockups. Avoid at least some loops by not requiring the NFSv4 state manager to scan for delegations that are marked for return-on-close. Instead, either mark them for immediate return (if possible) or else leave it up to nfs4_inode_return_delegation_on_close() to return them once the file is closed by the application. Fixes: `b757144fd7` ("NFSv4: Be less aggressive about returning delegations for open files") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2025-03-17 16:51:52 -04:00
Mike Snitzer	9254c8ae9b	nfsd: disallow file locking and delegations for NFSv4 reexport We do not and cannot support file locking with NFS reexport over NFSv4.x for the same reason we don't do it for NFSv3: NFS reexport server reboot cannot allow clients to recover locks because the source NFS server has not rebooted, and so it is not in grace. Since the source NFS server is not in grace, it cannot offer any guarantees that the file won't have been changed between the locks getting lost and any attempt to recover/reclaim them. The same applies to delegations and any associated locks, so disallow them too. Clients are no longer allowed to get file locks or delegations from a reexport server, any attempts will fail with operation not supported. Update the "Reboot recovery" section accordingly in Documentation/filesystems/nfs/reexport.rst Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:08 -04:00
Linus Torvalds	1110ce6a1e	33 hotfixes. 24 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 26 are for MM and 7 are for non-MM. - "mm: memory_failure: unmap poisoned folio during migrate properly" from Ma Wupeng fixes a couple of two year old bugs involving the migration of hwpoisoned folios. - "selftests/damon: three fixes for false results" from SeongJae Park fixes three one year old bugs in the SAMON selftest code. The remainder are singletons and doubletons. Please see the individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ8zgnAAKCRDdBJ7gKXxA jmTzAP9LsTIpkPkRXDpwxPR/Si5KPwOkE6sGj4ETEqbX3vUvcAEA/Lp7oafc7Vqr XxlC1VFush1ZK29Tecxzvnapl2/VSAs= =zBvY -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "33 hotfixes. 24 are cc:stable and the remainder address post-6.13 issues or aren't considered necessary for -stable kernels. 26 are for MM and 7 are for non-MM. - "mm: memory_failure: unmap poisoned folio during migrate properly" from Ma Wupeng fixes a couple of two year old bugs involving the migration of hwpoisoned folios. - "selftests/damon: three fixes for false results" from SeongJae Park fixes three one year old bugs in the SAMON selftest code. The remainder are singletons and doubletons. Please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2025-03-08-16-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (33 commits) mm/page_alloc: fix uninitialized variable rapidio: add check for rio_add_net() in rio_scan_alloc_net() rapidio: fix an API misues when rio_add_net() fails MAINTAINERS: .mailmap: update Sumit Garg's email address Revert "mm/page_alloc.c: don't show protection in zone's ->lowmem_reserve[] for empty zone" mm: fix finish_fault() handling for large folios mm: don't skip arch_sync_kernel_mappings() in error paths mm: shmem: remove unnecessary warning in shmem_writepage() userfaultfd: fix PTE unmapping stack-allocated PTE copies userfaultfd: do not block on locking a large folio with raised refcount mm: zswap: use ATOMIC_LONG_INIT to initialize zswap_stored_pages mm: shmem: fix potential data corruption during shmem swapin mm: fix kernel BUG when userfaultfd_move encounters swapcache selftests/damon/damon_nr_regions: sort collected regiosn before checking with min/max boundaries selftests/damon/damon_nr_regions: set ops update for merge results check to 100ms selftests/damon/damos_quota: make real expectation of quota exceeds include/linux/log2.h: mark is_power_of_2() with __always_inline NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback mm, swap: avoid BUG_ON in relocate_cluster() mm: swap: use correct step in loop to wait all clusters in wait_for_allocation() ...	2025-03-08 14:34:06 -10:00
Mike Snitzer	ce6d9c1c2b	NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback Add PF_KCOMPACTD flag and current_is_kcompactd() helper to check for it so nfs_release_folio() can skip calling nfs_wb_folio() from kcompactd. Otherwise NFS can deadlock waiting for kcompactd enduced writeback which recurses back to NFS (which triggers writeback to NFSD via NFS loopback mount on the same host, NFSD blocks waiting for XFS's call to __filemap_get_folio): 6070.550357] INFO: task kcompactd0:58 blocked for more than 4435 seconds. {--- [58] "kcompactd0" [<0>] folio_wait_bit+0xe8/0x200 [<0>] folio_wait_writeback+0x2b/0x80 [<0>] nfs_wb_folio+0x80/0x1b0 [nfs] [<0>] nfs_release_folio+0x68/0x130 [nfs] [<0>] split_huge_page_to_list_to_order+0x362/0x840 [<0>] migrate_pages_batch+0x43d/0xb90 [<0>] migrate_pages_sync+0x9a/0x240 [<0>] migrate_pages+0x93c/0x9f0 [<0>] compact_zone+0x8e2/0x1030 [<0>] compact_node+0xdb/0x120 [<0>] kcompactd+0x121/0x2e0 [<0>] kthread+0xcf/0x100 [<0>] ret_from_fork+0x31/0x40 [<0>] ret_from_fork_asm+0x1a/0x30 ---} [akpm@linux-foundation.org: fix build] Link: https://lkml.kernel.org/r/20250225022002.26141-1-snitzer@kernel.org Fixes: `96780ca55e` ("NFS: fix up nfs_release_folio() to try to release the page") Signed-off-by: Mike Snitzer <snitzer@kernel.org> Cc: Anna Schumaker <anna.schumaker@oracle.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-03-05 21:36:15 -08:00
NeilBrown	8376583b84	nfs: change mkdir inode_operation to return alternate dentry if needed. mkdir now allows a different dentry to be returned which is sometimes relevant for nfs. This patch changes the nfs_rpc_ops mkdir op to return a dentry, and passes that back to the caller. The mkdir nfs_rpc_op will return NULL if the original dentry should be used. This matches the mkdir inode_operation. nfs4_do_create() is duplicated to nfs4_do_mkdir() which is changed to handle the specifics of directories. Consequently the current special handling for directories is removed from nfs4_do_create() Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-6-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-03-05 11:52:50 +01:00
NeilBrown	88d5baf690	Change inode_operations.mkdir to return struct dentry * Some filesystems, such as NFS, cifs, ceph, and fuse, do not have complete control of sequencing on the actual filesystem (e.g. on a different server) and may find that the inode created for a mkdir request already exists in the icache and dcache by the time the mkdir request returns. For example, if the filesystem is mounted twice the directory could be visible on the other mount before it is on the original mount, and a pair of name_to_handle_at(), open_by_handle_at() calls could instantiate the directory inode with an IS_ROOT() dentry before the first mkdir returns. This means that the dentry passed to ->mkdir() may not be the one that is associated with the inode after the ->mkdir() completes. Some callers need to interact with the inode after the ->mkdir completes and they currently need to perform a lookup in the (rare) case that the dentry is no longer hashed. This lookup-after-mkdir requires that the directory remains locked to avoid races. Planned future patches to lock the dentry rather than the directory will mean that this lookup cannot be performed atomically with the mkdir. To remove this barrier, this patch changes ->mkdir to return the resulting dentry if it is different from the one passed in. Possible returns are: NULL - the directory was created and no other dentry was used ERR_PTR() - an error occurred non-NULL - this other dentry was spliced in This patch only changes file-systems to return "ERR_PTR(err)" instead of "err" or equivalent transformations. Subsequent patches will make further changes to some file-systems to return a correct dentry. Not all filesystems reliably result in a positive hashed dentry: - NFS, cifs, hostfs will sometimes need to perform a lookup of the name to get inode information. Races could result in this returning something different. Note that this lookup is non-atomic which is what we are trying to avoid. Placing the lookup in filesystem code means it only happens when the filesystem has no other option. - kernfs and tracefs leave the dentry negative and the ->revalidate operation ensures that lookup will be called to correctly populate the dentry. This could be fixed but I don't think it is important to any of the users of vfs_mkdir() which look at the dentry. The recommendation to use d_drop();d_splice_alias() is ugly but fits with current practice. A planned future patch will change this. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-27 20:00:17 +01:00
Christian Brauner	71628584df	Merge patch series "prep patches for my mkdir series" NeilBrown <neilb@suse.de> says: These two patches are cleanup are dependencies for my mkdir changes and subsequence directory locking changes. * patches from https://lore.kernel.org/r/20250226062135.2043651-1-neilb@suse.de: (2 commits) nfsd: drop fh_update() from S_IFDIR branch of nfsd_create_locked() nfs/vfs: discard d_exact_alias() Link: https://lore.kernel.org/r/20250226062135.2043651-1-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-27 09:25:34 +01:00
NeilBrown	3ff6c8707c	nfs/vfs: discard d_exact_alias() d_exact_alias() is a descendent of d_add_unique() which was introduced 20 years ago mostly likely to work around problems with NFS servers of the time. It is now not used in several situations were it was originally needed and there have been no reports of problems - presumably the old NFS servers have been improved. This only place it is now use is in NFSv4 code and the old problematic servers are thought to have been v2/v3 only. There is no clear benefit in reusing a unhashed() dentry which happens to have the same name as the dentry we are adding. So this patch removes d_exact_alias() and the one place that it is used. Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250226062135.2043651-2-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-26 09:55:18 +01:00
Stephen Smalley	9084ed79dd	lsm,nfs: fix memory leak of lsm_context commit `b530104f50` ("lsm: lsm_context in security_dentry_init_security") did not preserve the lsm id for subsequent release calls, which results in a memory leak. Fix it by saving the lsm id in the nfs4_label and providing it on the subsequent release call. Fixes: `b530104f50` ("lsm: lsm_context in security_dentry_init_security") Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2025-02-25 15:07:24 -05:00
Trond Myklebust	8f8df955f0	NFSv4: Fix a deadlock when recovering state on a sillyrenamed file If the file is sillyrenamed, and slated for delete on close, it is possible for a server reboot to triggeer an open reclaim, with can again race with the application call to close(). When that happens, the call to put_nfs_open_context() can trigger a synchronous delegreturn call which deadlocks because it is not marked as privileged. Instead, ensure that the call to nfs4_inode_return_delegation_on_close() catches the delegreturn, and schedules it asynchronously. Reported-by: Li Lingfeng <lilingfeng3@huawei.com> Fixes: `adb4b42d19` ("Return the delegation when deleting sillyrenamed files") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2025-02-19 16:45:24 -05:00
Trond Myklebust	88025c67fe	NFS: Adjust delegated timestamps for O_DIRECT reads and writes Adjust the timestamps if O_DIRECT is being combined with attribute delegations. Fixes: `e12912d941` ("NFSv4: Add support for delegated atime and mtime attributes") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2025-02-19 16:45:24 -05:00
Trond Myklebust	fcf857ee19	NFS: O_DIRECT writes must check and adjust the file length While it is uncommon for delegations to be held while O_DIRECT writes are in progress, it is possible. The xfstests generic/647 and generic/729 both end up triggering that state, and end up failing due to the fact that the file size is not adjusted. Reported-by: Chuck Lever <chuck.lever@oracle.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219738 Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2025-02-19 16:45:23 -05:00
NeilBrown	204a575e91	VFS: add common error checks to lookup_one_qstr_excl() Callers of lookup_one_qstr_excl() often check if the result is negative or positive. These changes can easily be moved into lookup_one_qstr_excl() by checking the lookup flags: LOOKUP_CREATE means it is NOT an error if the name doesn't exist. LOOKUP_EXCL means it IS an error if the name DOES exist. This patch adds these checks, then removes error checks from callers, and ensures that appropriate flags are passed. This subtly changes the meaning of LOOKUP_EXCL. Previously it could only accompany LOOKUP_CREATE. Now it can accompany LOOKUP_RENAME_TARGET as well. A couple of small changes are needed to accommodate this. The NFS change is functionally a no-op but ensures nfs_is_exclusive_create() does exactly what the name says. Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250217003020.3170652-3-neilb@suse.de Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-19 14:09:15 +01:00
Linus Torvalds	d3d90cc289	Provide stable parent and name to ->d_revalidate() instances Most of the filesystem methods where we care about dentry name and parent have their stability guaranteed by the callers; ->d_revalidate() is the major exception. It's easy enough for callers to supply stable values for expected name and expected parent of the dentry being validated. That kills quite a bit of boilerplate in ->d_revalidate() instances, along with a bunch of races where they used to access ->d_name without sufficient precautions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZ5gkoQAKCRBZ7Krx/gZQ 6w9FAP4nyxNNWMjE1TwuWR/DNDMYYuw/qn/miZ88B5BUM8hzqgD/W2SjRvcbSaIm xSIYpbtKgtqNU34P1PU+dBvL8Utz2AE= =TWY8 -----END PGP SIGNATURE----- Merge tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs d_revalidate updates from Al Viro: "Provide stable parent and name to ->d_revalidate() instances Most of the filesystem methods where we care about dentry name and parent have their stability guaranteed by the callers; ->d_revalidate() is the major exception. It's easy enough for callers to supply stable values for expected name and expected parent of the dentry being validated. That kills quite a bit of boilerplate in ->d_revalidate() instances, along with a bunch of races where they used to access ->d_name without sufficient precautions" * tag 'pull-revalidate' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: 9p: fix ->rename_sem exclusion orangefs_d_revalidate(): use stable parent inode and name passed by caller ocfs2_dentry_revalidate(): use stable parent inode and name passed by caller nfs: fix ->d_revalidate() UAF on ->d_name accesses nfs{,4}_lookup_validate(): use stable parent inode passed by caller gfs2_drevalidate(): use stable parent inode and name passed by caller fuse_dentry_revalidate(): use stable parent inode and name passed by caller vfat_revalidate{,_ci}(): use stable parent inode passed by caller exfat_d_revalidate(): use stable parent inode passed by caller fscrypt_d_revalidate(): use stable parent inode passed by caller ceph_d_revalidate(): propagate stable name down into request encoding ceph_d_revalidate(): use stable parent inode passed by caller afs_d_revalidate(): use stable name and parent inode passed by caller Pass parent directory inode and expected name to ->d_revalidate() generic_ci_d_compare(): use shortname_storage ext4 fast_commit: make use of name_snapshot primitives dissolve external_name.u into separate members make take_dentry_name_snapshot() lockless dcache: back inline names with a struct-wrapped array of unsigned long make sure that DNAME_INLINE_LEN is a multiple of word size	2025-01-30 09:13:35 -08:00
Linus Torvalds	af13ff1c33	Summary: All ctl_table declared outside of functions and that remain unmodified after initialization are const qualified. This prevents unintended modifications to proc_handler function pointers by placing them in the .rodata section. This is a continuation of the tree-wide effort started a few releases ago with the constification of the ctl_table struct arguments in the sysctl API done in `78eb4ea25c` ("sysctl: treewide: constify the ctl_table argument of proc_handlers") Testing: Testing was done on 0-day and sysctl selftests in x86_64. The linux-next branch was not used for such a big change in order to avoid unnecessary merge conflicts -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmeY6L0ACgkQupfNUreW QU/REwwAizeoFg3XyfwvGsjKUJKvZ8Ltnv3n4+tkd687UAQJnJHPE7/ODR8hKbpE E56G12jFlKQyiFR01wg+cbOy6+TTOT9o5qVmLZbo/zmI491Ygkxqen0Y0Z2mGXqR FMqcI8ZBmAAYfUKDjjUo+xUI70aNikWOOKRSmJp4cpgm5242d/UN7sOuKkOgt5DY GiyjPGlpKFkcYN4bOegKhlfZKdr9BMFxSgN0TZLtensj6cDrkZyLsrdgmVXy1mRT 0xTnmonGehweog4XY4hSPt2l6uCUu1fiY/WUcghKdWxUty43x9J3LahfD9b7DiAA G+DxHStSH0S/czWsa8Z0peyt/2gW8KZcRgk9W4UyVhpyDknXtVxr2sI3nxbTEFGl x2h6C29VCqg9Tn9oljEgGbYUrwlLz5Mah65JLDwlPLTpJmfA4BNbNxaC1V+DiqrX eApet8vaqGPlG7F3DRlyRAn7DoG8rs/eX93qqjbSA/pUjKjQUwCk/VBxNr1JBuNG elX+8QZi =x7aW -----END PGP SIGNATURE----- Merge tag 'constfy-sysctl-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl table constification from Joel Granados: "All ctl_table declared outside of functions and that remain unmodified after initialization are const qualified. This prevents unintended modifications to proc_handler function pointers by placing them in the .rodata section. This is a continuation of the tree-wide effort started a few releases ago with the constification of the ctl_table struct arguments in the sysctl API done in `78eb4ea25c` ("sysctl: treewide: constify the ctl_table argument of proc_handlers")" * tag 'constfy-sysctl-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: treewide: const qualify ctl_tables where applicable	2025-01-29 10:35:40 -08:00
Linus Torvalds	b88fe2b5dd	NFS Client Updates for Linux 6.14 New Features: * Enable using direct IO with localio * Added localio related tracepoints Bugfixes: * Sunrpc fixes for working with a very large cl_tasks list * Fix a possible buffer overflow in nfs_sysfs_link_rpc_client() * Fixes for handling reconnections with localio * Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT * Fix COPY_NOTIFY xdr_buf size calculations * pNFS/Flexfiles fix for retrying requesting a layout segment for reads * Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is expired Cleanups: * Various other nfs & nfsd localio cleanups * Prepratory patches for async copy improvements that are under development * Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other xprts * Add netns inum and srcaddr to debugfs rpc_xprt info -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmeZUzUACgkQ18tUv7Cl QOvArw/9HltIlcJHbi7tApGJ4dFpuJCa/fHbA1n5bHvKrCR5aElmFZoiFDdsM1JX kFAlMED9n1dW9VmzLJcepxmrLo/t7KXueiZNharHynWTxcszSl6jS+tOFBW6OflG Rrrjq/SrsWI2Fu8X4e/7ZV7pqRLGGn5SSMwgbuMbcyzBvVgN8mZM/BneIp1J59AI 5NOsif5KWetVhQc43zlRlbVWR5cvNGcUK4i58LIaPFzPMt0xq/XJI+QWffj6kv4g cHabCNYTdQYMkhiPQC+LLYkw6sMbw2NatajTTYNMWfR/I+7wz9k5ej6CHKPIFCSr xjmscypySTLfMFQjrDFZkpX2CwSp/VIbV6go36DJwAlcCRzqz+I7cajlrRK4zvyr DyrcaZHvClEczP9QqdPj2wqRXbmIOsDMksOu4ACTUImd4o3f2v1K6DcwRj9oUIhV AGR31OEMt2A+RaVvVZYR4PpixJ01vH9LcmsaOu5KkHX8X4q2osQ7eMy+FV4kV09S pMnxDMAyszJU8IuzUG1/HfkonNlDMivIbqpgG4ZaVW08Nq4mCxJll1vTAa9FTLz2 z+9eocqKwf724q1RAgOB7vj4AwOwL4Ul6d18UBtyUitZz3ndLRZ8Yy6r/AhrpCsC 3co0Y3znZbKeRjmReNl0GLG4qiKE+E7Xh23Lf3IqXg8GE2Mu+Ls= =srvH -----END PGP SIGNATURE----- Merge tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client updates from Anna Schumaker: "New Features: - Enable using direct IO with localio - Added localio related tracepoints Bugfixes: - Sunrpc fixes for working with a very large cl_tasks list - Fix a possible buffer overflow in nfs_sysfs_link_rpc_client() - Fixes for handling reconnections with localio - Fix how the NFS_FSCACHE kconfig option interacts with NETFS_SUPPORT - Fix COPY_NOTIFY xdr_buf size calculations - pNFS/Flexfiles fix for retrying requesting a layout segment for reads - Sunrpc fix for retrying on EKEYEXPIRED error when the TGT is expired Cleanups: - Various other nfs & nfsd localio cleanups - Prepratory patches for async copy improvements that are under development - Make OFFLOAD_CANCEL, LAYOUTSTATS, and LAYOUTERR moveable to other xprts - Add netns inum and srcaddr to debugfs rpc_xprt info" * tag 'nfs-for-6.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (28 commits) SUNRPC: do not retry on EKEYEXPIRED when user TGT ticket expired sunrpc: add netns inum and srcaddr to debugfs rpc_xprt info pnfs/flexfiles: retry getting layout segment for reads NFSv4.2: make LAYOUTSTATS and LAYOUTERROR MOVEABLE NFSv4.2: mark OFFLOAD_CANCEL MOVEABLE NFSv4.2: fix COPY_NOTIFY xdr buf size calculation NFS: Rename struct nfs4_offloadcancel_data NFS: Fix typo in OFFLOAD_CANCEL comment NFS: CB_OFFLOAD can return NFS4ERR_DELAY nfs: Make NFS_FSCACHE select NETFS_SUPPORT instead of depending on it nfs: fix incorrect error handling in LOCALIO nfs: probe for LOCALIO when v3 client reconnects to server nfs: probe for LOCALIO when v4 client reconnects to server nfs/localio: remove redundant code and simplify LOCALIO enablement nfs_common: add nfs_localio trace events nfs_common: track all open nfsd_files per LOCALIO nfs_client nfs_common: rename nfslocalio nfs_uuid_lock to nfs_uuids_lock nfsd: nfsd_file_acquire_local no longer returns GC'd nfsd_file nfsd: rename nfsd_serv_ prefixed methods and variables with nfsd_net_ nfsd: update percpu_ref to manage references on nfsd_net ...	2025-01-28 14:23:46 -08:00
Joel Granados	1751f872cc	treewide: const qualify ctl_tables where applicable Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/inifiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function. Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit `78eb4ea25c` ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers. Created this by running an spatch followed by a sed command: Spatch: virtual patch @ depends on !(file in "net") disable optional_qualifier @ identifier table_name != { watchdog_hardlockup_sysctl, iwcm_ctl_table, ucma_ctl_table, memory_allocation_profiling_sysctls, loadpin_sysctl_table }; @@ + const struct ctl_table table_name [] = { ... }; sed: sed --in-place \ -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \ kernel/utsname_sysctl.c Reviewed-by: Song Liu <song@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/ Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2025-01-28 13:48:37 +01:00
Linus Torvalds	f34b580514	NFSD 6.14 Release Notes Jeff Layton contributed an implementation of NFSv4.2+ attribute delegation, as described here: https://www.ietf.org/archive/id/draft-ietf-nfsv4-delstid-08.html This interoperates with similar functionality introduced into the Linux NFS client in v6.11. An attribute delegation permits an NFS client to manage a file's mtime, rather than flushing dirty data to the NFS server so that the file's mtime reflects the last write, which is considerably slower. Neil Brown contributed dynamic NFSv4.1 session slot table resizing. This facility enables NFSD to increase or decrease the number of slots per NFS session depending on server memory availability. More session slots means greater parallelism. Chuck Lever fixed a long-standing latent bug where NFSv4 COMPOUND encoding screws up when crossing a page boundary in the encoding buffer. This is a zero-day bug, but hitting it is rare and depends on the NFS client implementation. The Linux NFS client does not happen to trigger this issue. A variety of bug fixes and other incremental improvements fill out the list of commits in this release. Great thanks to all contributors, reviewers, testers, and bug reporters who participated during this development cycle. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmeVPBUACgkQM2qzM29m f5dClxAAmW4O2bOJaR8neJ54fzeFYXtFYEXRF/XbIh2KdnCy2LoywT9ux8ndzE0k 1tsjtv0g4y84IcYfrxXPRTuhh2GO2pHw5L1kGVRezJg2ODSFbpdzGtcVK7SIrs2S eijTViqdi9xRgfj1jPqRrvxC99RL3/fmCztqorAPLDFsqYAd/6ZRxZ7+IcZ2h4+J cJ0Z6Wx6eh10roacZPXweH13XJ7xWO/ublYZvQFQpK2BAKyO98aXGgLDraNt8k60 X3DZLSKkGB/eBlNeAlTtcrXec/ot6XGJPKr3b/7zhwfMi8B13RGdSmCR8SxMdRQM vCQO4G2YadU4YFS6FFIw9Wc1XDYUuYh2YgcveafjzjXbgi7NnY7rYOxtnTgBi0Xv XGjtGqpvD676gPm+8b3DcwqmWI3c/WUdtQIZ1uYRCFZdFqkVP91bySPT2aHtx2GO 4j3uEyTlypC00kyvu1oL3+tVUG/EFlJCpYvIbOwqDG2m7KWPStzpfJTD5Q9cdlEl fdgs6l82EVqe1YyjLTqajDuOcRrLYK2hlR/5STc03hQV+GpKSo5UypRejzE9WtRV zT/tyelqhj4+0EZJz4ay/8q9s2Jp+5JGVxoVvjujSuH7+Ulb3T+IDkldtMamO8Fm www2y0/fLfU2xIapMJdCoJ+ZKgel2i8RZMPIc0cIfO5ITXm+dOs= =hwXx -----END PGP SIGNATURE----- Merge tag 'nfsd-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd updates from Chuck Lever: "Jeff Layton contributed an implementation of NFSv4.2+ attribute delegation, as described here: https://www.ietf.org/archive/id/draft-ietf-nfsv4-delstid-08.html This interoperates with similar functionality introduced into the Linux NFS client in v6.11. An attribute delegation permits an NFS client to manage a file's mtime, rather than flushing dirty data to the NFS server so that the file's mtime reflects the last write, which is considerably slower. Neil Brown contributed dynamic NFSv4.1 session slot table resizing. This facility enables NFSD to increase or decrease the number of slots per NFS session depending on server memory availability. More session slots means greater parallelism. Chuck Lever fixed a long-standing latent bug where NFSv4 COMPOUND encoding screws up when crossing a page boundary in the encoding buffer. This is a zero-day bug, but hitting it is rare and depends on the NFS client implementation. The Linux NFS client does not happen to trigger this issue. A variety of bug fixes and other incremental improvements fill out the list of commits in this release. Great thanks to all contributors, reviewers, testers, and bug reporters who participated during this development cycle" * tag 'nfsd-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (42 commits) sunrpc: Remove gss_{de,en}crypt_xdr_buf deadcode sunrpc: Remove gss_generic_token deadcode sunrpc: Remove unused xprt_iter_get_xprt Revert "SUNRPC: Reduce thread wake-up rate when receiving large RPC messages" nfsd: implement OPEN_ARGS_SHARE_ACCESS_WANT_OPEN_XOR_DELEGATION nfsd: handle delegated timestamps in SETATTR nfsd: add support for delegated timestamps nfsd: rework NFS4_SHARE_WANT_* flag handling nfsd: add support for FATTR4_OPEN_ARGUMENTS nfsd: prepare delegation code for handing out _ATTRS_DELEG delegations nfsd: rename NFS4_SHARE_WANT_ constants to OPEN4_SHARE_ACCESS_WANT_* nfsd: switch to autogenerated definitions for open_delegation_type4 nfs_common: make include/linux/nfs4.h include generated nfs4_1.h nfsd: fix handling of delegated change attr in CB_GETATTR SUNRPC: Document validity guarantees of the pointer returned by reserve_space NFSD: Insulate nfsd4_encode_fattr4() from page boundaries in the encode buffer NFSD: Insulate nfsd4_encode_secinfo() from page boundaries in the encode buffer NFSD: Refactor nfsd4_do_encode_secinfo() again NFSD: Insulate nfsd4_encode_readlink() from page boundaries in the encode buffer NFSD: Insulate nfsd4_encode_read_plus_data() from page boundaries in the encode buffer ...	2025-01-27 17:27:24 -08:00
Al Viro	ffeeaada2b	nfs: fix ->d_revalidate() UAF on ->d_name accesses Pass the stable name all the way down to ->rpc_ops->lookup() instances. Note that passing &dentry->d_name is safe in e.g. nfs_lookup() - it is stable there, as it is in ->create() et.al. dget_parent() in nfs_instantiate() should be redundant - it'd better be stable there; if it's not, we have more trouble, since ->d_name would also be unsafe in such case. nfs_submount() and nfs4_submount() may or may not require fixes - if they ever get moved on server with fhandle preserved, we are in trouble there... UAF window is fairly narrow here and exfiltration requires the ability to watch the traffic. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-01-27 19:25:24 -05:00
Al Viro	39f644a266	nfs{,4}_lookup_validate(): use stable parent inode passed by caller we can't kill __nfs_lookup_revalidate() completely, but ->d_parent boilerplate in it is gone Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-01-27 19:25:24 -05:00
Al Viro	5be1fa8abd	Pass parent directory inode and expected name to ->d_revalidate() ->d_revalidate() often needs to access dentry parent and name; that has to be done carefully, since the locking environment varies from caller to caller. We are not guaranteed that dentry in question will not be moved right under us - not unless the filesystem is such that nothing on it ever gets renamed. It can be dealt with, but that results in boilerplate code that isn't even needed - the callers normally have just found the dentry via dcache lookup and want to verify that it's in the right place; they already have the values of ->d_parent and ->d_name stable. There is a couple of exceptions (overlayfs and, to less extent, ecryptfs), but for the majority of calls that song and dance is not needed at all. It's easier to make ecryptfs and overlayfs find and pass those values if there's a ->d_revalidate() instance to be called, rather than doing that in the instances. This commit only changes the calling conventions; making use of supplied values is left to followups. NOTE: some instances need more than just the parent - things like CIFS may need to build an entire path from filesystem root, so they need more precautions than the usual boilerplate. This series doesn't do anything to that need - these filesystems have to keep their locking mechanisms (rename_lock loops, use of dentry_path_raw(), private rwsem a-la v9fs). One thing to keep in mind when using name is that name->name will normally point into the pathname being resolved; the filename in question occupies name->len bytes starting at name->name, and there is NUL somewhere after it, but it the next byte might very well be '/' rather than '\0'. Do not ignore name->len. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-01-27 19:25:23 -05:00
Mike Snitzer	eb3fabde15	pnfs/flexfiles: retry getting layout segment for reads If ff_layout_pg_get_read()'s attempt to get a layout segment results in -EAGAIN have ff_layout_pg_init_read() retry it after sleeping. If "softerr" mount is used, use 'io_maxretrans' to limit the number of attempts to get a layout segment. This fixes a long-standing issue of O_DIRECT reads failing with -EAGAIN (11) when using flexfiles Client Side Mirroring (CSM). Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2025-01-22 14:26:46 -05:00
Linus Torvalds	f96a974170	lsm/stable-6.14 PR 20250121 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQFBoUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPvcA//XCdwMz0bGtWKv58nuyP8vkQx08n6 //olz/O8te3uWK5O3kRiarzFLwH8qsHQ6A7GYalwwix34hatR4ndJE0Y/guVRWa1 +aBmJxJ7Jm/q3fvpAEfqiSgreuE6kBoztlDOWEq+hUQGu4qfnQGm2EnvbvfFrAmN VheOfIQSU2KCL/Scc3FGnF6uru4WrqN0JJ9RbvrEpfdQgmcyTGLnQsZLljutWSIq kDWkteIr7cj3O9J45zpxZsTftvYSgVn/y1iKeXbHI4DBA1eheK12vsHB9AADKI1J GwHxOrnLpZtv+ICUKqcfFTmWTl+NmfJJurAT5KXKdBjL3xM5MoJlBvK1A5qE9CMo LaHVG/TZR2MmBaoM3EN+gvWhDgWlvT02Q/0cYaafTlVLMez3HtfctxN6OnCvTXTB Y8dqYClhhlBm/mHQwYfMoeKw4MftUpzEqBd1Nj7Qe8dbP0f/62Ca3K2B3D6Rf8QV pj3ryMlSWYV9mdTerruLNQexTGoN7l66jPwzdWpTbFeL3WmNtfCako8OZGbXgPIu Iahm3P+jnSVx8ZQro2c9zwdKXI5xiI335pCBbDZ8aX+JAsfj0OofHsFx5Q5diber M7tAEhxDqRisbpz7Ei+/LOAEGg2Z619XKg8ks4z6Y4P5PF7zEgeWTkZJk2iLbxXe 6LLOjmF7LLw+G4M= =fgyr -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Improved handling of LSM "secctx" strings through lsm_context struct The LSM secctx string interface is from an older time when only one LSM was supported, migrate over to the lsm_context struct to better support the different LSMs we now have and make it easier to support new LSMs in the future. These changes explain the Rust, VFS, and networking changes in the diffstat. - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are enabled Small tweak to be a bit smarter about when we build the LSM's common audit helpers. - Check for absurdly large policies from userspace in SafeSetID SafeSetID policies rules are fairly small, basically just "UID:UID", it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which helps quiet a number of syzbot related issues. While work is being done to address the syzbot issues through other mechanisms, this is a trivial and relatively safe fix that we can do now. - Various minor improvements and cleanups A collection of improvements to the kernel selftests, constification of some function parameters, removing redundant assignments, and local variable renames to improve readability. * tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lockdown: initialize local array before use to quiet static analysis safesetid: check size of policy writes net: corrections for security_secid_to_secctx returns lsm: rename variable to avoid shadowing lsm: constify function parameters security: remove redundant assignment to return variable lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test binder: initialize lsm_context structure rust: replace lsm context+len with lsm_context lsm: secctx provider check on release lsm: lsm_context in security_dentry_init_security lsm: use lsm_context in security_inode_getsecctx lsm: replace context+len with lsm_context lsm: ensure the correct LSM context releaser	2025-01-21 20:03:04 -08:00
Olga Kornievskaia	0b96c75d86	NFSv4.2: make LAYOUTSTATS and LAYOUTERROR MOVEABLE LAYOUTSTATS and LAYOUTERROR should be marked MOVEABLE for when we need to move tasks off a non-functional transport. Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2025-01-21 11:34:50 -05:00

1 2 3 4 5 ...

6970 Commits