linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	9368f0f941	vfs-6.19-rc1.inode -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaSmOZAAKCRCRxhvAZXjc omMSAP9GLhavxyWQ24Q+49CNWWRQWDY1wTOiUK2BwtIvZ0YEcAD8D1dAiMckL5pC RwEAVA5p+y+qi+bZP0KXCBxQddoTIQM= =zo/J -----END PGP SIGNATURE----- Merge tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode updates from Christian Brauner: "Features: - Hide inode->i_state behind accessors. Open-coded accesses prevent asserting they are done correctly. One obvious aspect is locking, but significantly more can be checked. For example it can be detected when the code is clearing flags which are already missing, or is setting flags when it is illegal (e.g., I_FREEING when ->i_count > 0) - Provide accessors for ->i_state, converts all filesystems using coccinelle and manual conversions (btrfs, ceph, smb, f2fs, gfs2, overlayfs, nilfs2, xfs), and makes plain ->i_state access fail to compile - Rework I_NEW handling to operate without fences, simplifying the code after the accessor infrastructure is in place Cleanups: - Move wait_on_inode() from writeback.h to fs.h - Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb for clarity - Cosmetic fixes to LRU handling - Push list presence check into inode_io_list_del() - Touch up predicts in __d_lookup_rcu() - ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage - Assert on ->i_count in iput_final() - Assert ->i_lock held in __iget() Fixes: - Add missing fences to I_NEW handling" * tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits) dcache: touch up predicts in __d_lookup_rcu() fs: push list presence check into inode_io_list_del() fs: cosmetic fixes to lru handling fs: rework I_NEW handling to operate without fences fs: make plain ->i_state access fail to compile xfs: use the new ->i_state accessors nilfs2: use the new ->i_state accessors overlayfs: use the new ->i_state accessors gfs2: use the new ->i_state accessors f2fs: use the new ->i_state accessors smb: use the new ->i_state accessors ceph: use the new ->i_state accessors btrfs: use the new ->i_state accessors Manual conversion to use ->i_state accessors of all places not covered by coccinelle Coccinelle-based conversion to use ->i_state accessors fs: provide accessors for ->i_state fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb fs: move wait_on_inode() from writeback.h to fs.h fs: add missing fences to I_NEW handling ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage ...	2025-12-01 09:02:34 -08:00
Dominique Martinet	43c36a56cc	Revert "fs/9p: Refresh metadata in d_revalidate for uncached mode too" This reverts commit `290434474c`. That commit broke cache=mmap, a mode that doesn't cache metadata, but still has writeback cache. In commit `290434474c` ("fs/9p: Refresh metadata in d_revalidate for uncached mode too") we considered metadata cache to be enough to not look at the server, but in writeback cache too looking at the server size would make the vfs consider the file has been truncated before the data has been flushed out, making the following repro fail (nothing is ever read back, the resulting file ends up with no data written) ``` #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> char buf[4096]; int main(int argc, char *argv[]) { int ret, i; int fdw, fdr; if (argc < 2) return 1; fdw = openat(AT_FDCWD, argv[1], O_RDWR\|O_CREAT\|O_EXCL\|O_CLOEXEC, 0600); if (fdw < 0) { fprintf(stderr, "cannot open fdw\n"); return 1; } write(fdw, buf, sizeof(buf)); fdr = openat(AT_FDCWD, argv[1], O_RDONLY\|O_CLOEXEC); if (fdr < 0) { fprintf(stderr, "cannot open fdr\n"); close(fdw); return 1; } for (i = 0; i < 10; i++) { ret = read(fdr, buf, sizeof(buf)); fprintf(stderr, "i: %d, read returns %d\n", i, ret); } close(fdr); close(fdw); return 0; } ``` There is a fix for this particular reproducer but it looks like there are other problems around metadata refresh (e.g. around file rename), so revert this to avoid d_revalidate in uncached mode for now. Reported-by: Song Liu <song@kernel.org> Link: https://lkml.kernel.org/r/CAHzjS_u_SYdt5=2gYO_dxzMKXzGMt-TfdE_ueowg-Hq5tRCAiw@mail.gmail.com Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Link: https://lore.kernel.org/bpf/CAEf4BzZbCE4tLoDZyUf_aASpgAGFj75QMfSXX4a4dLYixnOiLg@mail.gmail.com/ Fixes: `290434474c` ("fs/9p: Refresh metadata in d_revalidate for uncached mode too") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2025-10-22 14:25:27 +09:00
Mateusz Guzik	b4dbfd8653	Coccinelle-based conversion to use ->i_state accessors All places were patched by coccinelle with the default expecting that ->i_lock is held, afterwards entries got fixed up by hand to use unlocked variants as needed. The script: @@ expression inode, flags; @@ - inode->i_state & flags + inode_state_read(inode) & flags @@ expression inode, flags; @@ - inode->i_state &= ~flags + inode_state_clear(inode, flags) @@ expression inode, flag1, flag2; @@ - inode->i_state &= ~flag1 & ~flag2 + inode_state_clear(inode, flag1 \| flag2) @@ expression inode, flags; @@ - inode->i_state \|= flags + inode_state_set(inode, flags) @@ expression inode, flags; @@ - inode->i_state = flags + inode_state_assign(inode, flags) @@ expression inode, flags; @@ - flags = inode->i_state + flags = inode_state_read(inode) @@ expression inode, flags; @@ - READ_ONCE(inode->i_state) & flags + inode_state_read(inode) & flags Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-10-20 20:22:26 +02:00
Linus Torvalds	80b7065ec1	Bunch of unrelated fixes - polling fix for trans fd that ought to have been fixed otherwise back in March, but apparently came back somewhere else... - USB transport buffer overflow fix - Some dentry lifetime rework to handle metadata update for currently opened files in uncached mode, or inode type change in cached mode - a double-put on invalid flush found by syzbot - and finally /sys/fs/9p/caches not advancing buffer and overwriting itself for large contents Thanks to everyone involved! -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmjns5cACgkQq06b7GqY 5nDGLg/+Ltsiwj51CDwyJxFPX4ki1jM3j2iA8BNpdjjzCev6Jp1vuWBPiV+sTtgM zKmIw9joyvzaIXKzA3h7tMe7h7y8H4p7KZyD3YSDSNW5XIb5UYXeQwYVw39OxENk nVQkQ41H+bPepmz3R0t/F4la9myVLBtfdcPWJN+JfkOLgdeemCGf+TShJ6F+9Qaq PTd3UKfdv8JRlKQmeLH7pPlBoWYRoC2Tq3pZbyC3D1A50bWkTfxusW0k3MUzRyHO t/jNbhHt0ai+densQ6O5M+ALMI7KIIzR+obVaBHfzSUcwFeGhmW2x84Mo0qX6YMF 0B/fY3mCXi8QO6my1hfwmbRtY5mEV967wM/uF7E/SuvyNVizOMBZn8FgWfVPVszr ix1iNBgQJdoOEMbHvDfCAtGNorFrMiybyLvoabL7YCwFo68LdYqj2br3/feajoYM 5g46nUcidH4fVBLtiBGE8w0ko8aa3zLB4iu/fdATuEZ1r+gPt40SWo1mKahoCgY3 BOtYxEZTnd4l4GPJDQIaYgVckgSiex0xcE9pRJ5LtBwdyw9g4jwfb39WHx4MT2sE u5xWDbadU/ZM6ppLJ/iBkotvU9D7ohKTviN1IfCwjqL7IMBNz5c+nBwLQZWK4YTR 2ySKpv1VOMROglt1KvnHfdpmdJ+CjAJLLNE+XSz9AHXinK5v8aM= =bUN1 -----END PGP SIGNATURE----- Merge tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux Pull 9p updates from Dominique Martinet: "A bunch of unrelated fixes: - polling fix for trans fd that ought to have been fixed otherwise back in March, but apparently came back somewhere else... - USB transport buffer overflow fix - Some dentry lifetime rework to handle metadata update for currently opened files in uncached mode, or inode type change in cached mode - a double-put on invalid flush found by syzbot - and finally /sys/fs/9p/caches not advancing buffer and overwriting itself for large contents Thanks to everyone involved!" * tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux: 9p: sysfs_init: don't hardcode error to ENOMEM 9p: fix /sys/fs/9p/caches overwriting itself 9p: clean up comment typos 9p/trans_fd: p9_fd_request: kick rx thread if EPOLLIN net/9p: fix double req put in p9_fd_cancelled net/9p: Fix buffer overflow in USB transport layer fs/9p: Add p9_debug(VFS) in d_revalidate fs/9p: Invalidate dentry if inode type change detected in cached mode fs/9p: Refresh metadata in d_revalidate for uncached mode too	2025-10-09 11:56:59 -07:00
Al Viro	fb3d71972b	9p: simplify v9fs_vfs_atomic_open() if v9fs_vfs_lookup() returns a preexisting alias, it is guaranteed to be positive. IOW, in that case we will immediately return finish_no_open(), leaving only the case res == NULL past that point. Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-09-16 23:59:38 -04:00
Tingmao Wang	290434474c	fs/9p: Refresh metadata in d_revalidate for uncached mode too Currently if another process keeps a file open, due to existing dentry in the dcache, other processes will not see updated metadata of that file if it is changed on the server, even in uncached mode. This can also manifest as -ENODATA when reading a file that has shrunk on the server (even if it's re-opened in another process), or -ENOTSUPP if the file has changed type (e.g. regular file to directory) on the server. We can end up in a situation where both `readdir` or `read` fails until the file is closed by all processes using it. This commit fixes that, and invalidates the dentry altogether if the inode type is changed (for uncached mode). Signed-off-by: Tingmao Wang <m@maowtm.org> Message-ID: <bfac417f65cc1d6812be822f8913f0d4ba0c1052.1743956147.git.m@maowtm.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2025-08-23 15:34:46 +09:00
NeilBrown	88d5baf690	Change inode_operations.mkdir to return struct dentry * Some filesystems, such as NFS, cifs, ceph, and fuse, do not have complete control of sequencing on the actual filesystem (e.g. on a different server) and may find that the inode created for a mkdir request already exists in the icache and dcache by the time the mkdir request returns. For example, if the filesystem is mounted twice the directory could be visible on the other mount before it is on the original mount, and a pair of name_to_handle_at(), open_by_handle_at() calls could instantiate the directory inode with an IS_ROOT() dentry before the first mkdir returns. This means that the dentry passed to ->mkdir() may not be the one that is associated with the inode after the ->mkdir() completes. Some callers need to interact with the inode after the ->mkdir completes and they currently need to perform a lookup in the (rare) case that the dentry is no longer hashed. This lookup-after-mkdir requires that the directory remains locked to avoid races. Planned future patches to lock the dentry rather than the directory will mean that this lookup cannot be performed atomically with the mkdir. To remove this barrier, this patch changes ->mkdir to return the resulting dentry if it is different from the one passed in. Possible returns are: NULL - the directory was created and no other dentry was used ERR_PTR() - an error occurred non-NULL - this other dentry was spliced in This patch only changes file-systems to return "ERR_PTR(err)" instead of "err" or equivalent transformations. Subsequent patches will make further changes to some file-systems to return a correct dentry. Not all filesystems reliably result in a positive hashed dentry: - NFS, cifs, hostfs will sometimes need to perform a lookup of the name to get inode information. Races could result in this returning something different. Note that this lookup is non-atomic which is what we are trying to avoid. Placing the lookup in filesystem code means it only happens when the filesystem has no other option. - kernfs and tracefs leave the dentry negative and the ->revalidate operation ensures that lookup will be called to correctly populate the dentry. This could be fixed but I don't think it is important to any of the users of vfs_mkdir() which look at the dentry. The recommendation to use d_drop();d_splice_alias() is ugly but fits with current practice. A planned future patch will change this. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-02-27 20:00:17 +01:00
Linus Torvalds	850925a813	Revert patches causing inode collision problems -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmccC7UACgkQq06b7GqY 5nBYbA/7BP/Me80+ofClszxd0F3nlXCGJPe31vC36EeFsap5p1TrIM7jvj0iR0Zr HXLxyTwocrubFgSE28rUrSS2nhmk016g3EUEVeoaO+hYSbvtdrRhoTSCubEwb4N5 68vUz1ebGqZ35oT9oVZeAd+xdwxpKPktqrN6zCJsyWbSiZevE8YMkV+dgYrGde+Y j7BMypcxtQ2+u8donJ1PnGz04k0RNhqcqGGED0VuNw1Sp6quY8n0EWvHEeYjrb+N hFYB1lSeA245fVm8bmnUFHLJ2w47bxfJZ8hANbO9C0zLI98vlmH6xItXQnGofc4P 1VtCHxoZnIFMMjK/iTPjvi4JsanNr0mS1Aa5w8HAHqyHYeIoBCpn78XFf79IhFAl pO3RV6nxE8fELXBM39Yyq7I3rQfyFmNBeob0UWIbazQ7cC54xBHpu0RDMj2gD6xy TBbZL2euRfGEswKWfrdR32U1jzSCv+jTAWew9sdx+z0RX9inRyaNBHFVsjOnVFZS iLJn20cotgw0WU2OhghWvgsV4o/Ckx4eJAC9oNxZ9pT1s7BqSvW1qg/Zlpn9wYgf UGQGXrCdXGY6XZ3W53c2joS+QWtmU6N8mAC6jCeqQm8pCQQeiTHE1QycyNc0xPGd ELhQYzCA1hEiNDki3e/vqzbPScpAer9dc/6hMKUT55SVfjcCBoc= =O6o1 -----END PGP SIGNATURE----- Merge tag '9p-for-6.12-rc5' of https://github.com/martinetd/linux Pull more 9p reverts from Dominique Martinet: "Revert patches causing inode collision problems. The code simplification introduced significant regressions on servers that do not remap inode numbers when exporting multiple underlying filesystems with colliding inodes. See the top-most revert (commit `be2ca38253`) for details. This problem had been ignored for too long and the reverts will also head to stable (6.9+). I'm confident this set of patches gets us back to previous behaviour (another related patch had already been reverted back in April and we're almost back to square 1, and the rest didn't touch inode lifecycle)" * tag '9p-for-6.12-rc5' of https://github.com/martinetd/linux: Revert "fs/9p: simplify iget to remove unnecessary paths" Revert "fs/9p: fix uaf in in v9fs_stat2inode_dotl" Revert "fs/9p: remove redundant pointer v9ses" Revert " fs/9p: mitigate inode collisions"	2024-10-25 15:25:02 -07:00
Dominique Martinet	be2ca38253	Revert "fs/9p: simplify iget to remove unnecessary paths" This reverts commit `724a08450f`. This code simplification introduced significant regressions on servers that do not remap inode numbers when exporting multiple underlying filesystems with colliding inodes, as can be illustrated with simple tmpfs exports in qemu with remapping disabled: ``` # host side cd /tmp/linux-test mkdir m1 m2 mount -t tmpfs tmpfs m1 mount -t tmpfs tmpfs m2 mkdir m1/dir m2/dir echo foo > m1/dir/foo echo bar > m2/dir/bar # guest side # started with -virtfs local,path=/tmp/linux-test,mount_tag=tmp,security_model=mapped-file mount -t 9p -o trans=virtio,debug=1 tmp /mnt/t ls /mnt/t/m1/dir # foo ls /mnt/t/m2/dir # bar (works ok if directry isn't open) # cd to keep first dir's inode alive cd /mnt/t/m1/dir ls /mnt/t/m2/dir # foo (should be bar) ``` Other examples can be crafted with regular files with fscache enabled, in which case I/Os just happen to the wrong file leading to corruptions, or guest failing to boot with: \| VFS: Lookup of 'com.android.runtime' in 9p 9p would have caused loop In theory, we'd want the servers to be smart enough and ensure they never send us two different files with the same 'qid.path', but while qemu has an option to remap that is recommended (and qemu prints a warning if this case happens), there are many other servers which do not (kvmtool, nfs-ganesha, probably diod...), we should at least ensure we don't cause regressions on this: - assume servers can't be trusted and operations that should get a 'new' inode properly do so. commit `d05dcfdf5e` (" fs/9p: mitigate inode collisions") attempted to do this, but v9fs_fid_iget_dotl() was not called so some higher level of caching got in the way; this needs to be fixed properly before we can re-apply the patches. - if we ever want to really simplify this code, we will need to add some negotiation with the server at mount time where the server could claim they handle this properly, at which point we could optimize this out. (but that might not be needed at all if we properly handle the 'new' check?) Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/all/20240408141436.GA17022@redhat.com/ Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-4-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:09 +09:00
Dominique Martinet	f69999b5f9	Revert " fs/9p: mitigate inode collisions" This reverts commit `d05dcfdf5e`. This is a requirement to revert commit `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths"), see that revert for details. Fixes: `724a08450f` ("fs/9p: simplify iget to remove unnecessary paths") Reported-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20240923100508.GA32066@willie-the-truck Cc: stable@vger.kernel.org # v6.9+ Message-ID: <20241024-revert_iget-v1-1-4cac63d25f72@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-10-25 06:26:08 +09:00
Dominique Martinet	f009e946c1	Revert "9p: Enable multipage folios" This reverts commit `1325e4a91a`. using multipage folios apparently break some madvise operations like MADV_PAGEOUT which do not reliably unload the specified page anymore, Revert the patch until that is figured out. Reported-by: Andrii Nakryiko <andrii@kernel.org> Fixes: `1325e4a91a` ("9p: Enable multipage folios") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-10-24 11:24:05 -07:00
David Howells	1325e4a91a	9p: Enable multipage folios Enable support for multipage folios on the 9P filesystem. This is all handled through netfslib and is already enabled on AFS and CIFS also. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox <willy@infradead.org> cc: v9fs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Message-ID: <20240620173137.610345-7-dhowells@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-09-23 05:51:27 +09:00
David Howells	f89ea63f1c	netfs, 9p: Fix race between umount and async request completion There's a problem in 9p's interaction with netfslib whereby a crash occurs because the 9p_fid structs get forcibly destroyed during client teardown (without paying attention to their refcounts) before netfslib has finished with them. However, it's not a simple case of deferring the clunking that p9_fid_put() does as that requires the p9_client record to still be present. The problem is that netfslib has to unlock pages and clear the IN_PROGRESS flag before destroying the objects involved - including the fid - and, in any case, nothing checks to see if writeback completed barring looking at the page flags. Fix this by keeping a count of outstanding I/O requests (of any type) and waiting for it to quiesce during inode eviction. Reported-by: syzbot+df038d463cca332e8414@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005be0aa061846f8d6@google.com/ Reported-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000b86c5e06130da9c6@google.com/ Reported-by: syzbot+1527696d41a634cc1819@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000041f960618206d7e@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/755891.1716560771@warthog.procyon.org.uk Tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Hillf Danton <hdanton@sina.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Reported-and-tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-27 13:12:13 +02:00
Eric Van Hensbergen	d05dcfdf5e	fs/9p: mitigate inode collisions Detect and mitigate inode collsions that now occur since we fixed 9p generating duplicate inode structures. Underlying cause of these appears to be a race condition between reuse of inode numbers in underlying file system and cleanup of inode numbers in the client. Enabling caching makes this much more likely to happen as it increases cleanup latency due to writebacks. Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-04-22 15:34:27 +00:00
Eric Van Hensbergen	6e45a30fe5	fs/9p: remove erroneous nlink init from legacy stat2inode In 9p2000 legacy mode, stat2inode initializes nlink to 1, which is redundant with what alloc_inode should have already set. 9p2000.u overrides this with extensions if present in the stat structure, and 9p2000.L incorporates nlink into its stat structure. At the very least this probably messes with directory nlink accounting in legacy mode. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-04-09 23:53:00 +00:00
Joakim Sindholt	87de39e705	fs/9p: translate O_TRUNC into OTRUNC This one hits both 9P2000 and .u as it appears v9fs has never translated the O_TRUNC flag. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-28 15:10:28 +00:00
Joakim Sindholt	cd25e15e57	fs/9p: only translate RWX permissions for plain 9P2000 Garbage in plain 9P2000's perm bits is allowed through, which causes it to be able to set (among others) the suid bit. This was presumably not the intent since the unix extended bits are handled explicitly and conditionally on .u. Signed-off-by: Joakim Sindholt <opensource@zhasha.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-28 13:59:23 +00:00
Eric Van Hensbergen	6630036b7c	fs/9p: fix uninitialized values during inode evict If an iget fails due to not being able to retrieve information from the server then the inode structure is only partially initialized. When the inode gets evicted, references to uninitialized structures (like fscache cookies) were being made. This patch checks for a bad_inode before doing anything other than clearing the inode from the cache. Since the inode is bad, it shouldn't have any state associated with it that needs to be written back (and there really isn't a way to complete those anyways). Reported-by: syzbot+eb83fe1cce5833cd66a0@syzkaller.appspotmail.com Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-03-25 14:16:06 +00:00
Eric Van Hensbergen	724a08450f	fs/9p: simplify iget to remove unnecessary paths Remove the additional comparison operators and switch to simply lookup by inode number (aka qid.path). Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	b91a26696e	fs/9p: rework qid2ino logic This changes from a function to a macro because we can figure out if we are 32 or 64 bit at compile time. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
Eric Van Hensbergen	f61c906a7d	fs/9p: Eliminate now unused v9fs_get_inode Now with all inode allocation going through get_from_fid functions we can remove v9fs_get_inode and reduce us down to a single inode allocation path. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2024-01-26 16:46:56 +00:00
David Howells	9546ac78b2	9p: Fix initialisation of netfs_inode for 9p The 9p filesystem is calling netfs_inode_init() in v9fs_init_inode() - before the struct inode fields have been initialised from the obtained file stats (ie. after v9fs_stat2inode*() has been called), but netfslib wants to set a couple of its fields from i_size. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org	2024-01-03 14:53:01 +00:00
David Howells	80105ed2fd	9p: Use netfslib read/write_iter Use netfslib's read and write iteration helpers, allowing netfslib to take over the management of the page cache for 9p files and to manage local disk caching. In particular, this eliminates write_begin, write_end, writepage and all mentions of struct page and struct folio from 9p. Note that netfslib now offers the possibility of write-through caching if that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in v9inode->netfs.flags in v9fs_set_netfs_context(). Note also this is untested as I can't get ganesha.nfsd to correctly parse the config to turn on 9p support. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: v9fs@lists.linux.dev cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org	2023-12-28 09:45:28 +00:00
David Howells	100ccd18bb	netfs: Optimise away reads above the point at which there can be no data Track the file position above which the server is not expected to have any data (the "zero point") and preemptively assume that we can satisfy requests by filling them with zeroes locally rather than attempting to download them if they're over that line - even if we've written data back to the server. Assume that any data that was written back above that position is held in the local cache. Note that we have to split requests that straddle the line. Make use of this to optimise away some reads from the server. We need to set the zero point in the following circumstances: (1) When we see an extant remote inode and have no cache for it, we set the zero_point to i_size. (2) On local inode creation, we set zero_point to 0. (3) On local truncation down, we reduce zero_point to the new i_size if the new i_size is lower. (4) On local truncation up, we don't change zero_point. (5) On local modification, we don't change zero_point. (6) On remote invalidation, we set zero_point to the new i_size. (7) If stored data is discarded from the pagecache or culled from fscache, we must set zero_point above that if the data also got written to the server. (8) If dirty data is written back to the server, but not fscache, we must set zero_point above that. (9) If a direct I/O write is made, set zero_point above that. Assuming the above, any read from the server at or above the zero_point position will return all zeroes. The zero_point value can be stored in the cache, provided the above rules are applied to it by any code that culls part of the local cache. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org	2023-12-28 09:45:27 +00:00
David Howells	c9c4ff12df	netfs: Move pinning-for-writeback from fscache to netfs Move the resource pinning-for-writeback from fscache code to netfslib code. This is used to keep a cache backing object pinned whilst we have dirty pages on the netfs inode in the pagecache such that VM writeback will be able to reach it. Whilst we're at it, switch the parameters of netfs_unpin_writeback() to match ->write_inode() so that it can be used for that directly. Note that this mechanism could be more generically useful than that for network filesystems. Quite often they have to keep around other resources (e.g. authentication tokens or network connections) until the writeback is complete. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org	2023-12-24 15:08:49 +00:00
Jeff Layton	d0242a3a61	9p: convert to new timestamp accessors Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-13-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2023-10-18 13:26:17 +02:00
Linus Torvalds	615e95831e	v6.6-vfs.ctime -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTKAAKCRCRxhvAZXjc oifJAQCzi/p+AdQu8LA/0XvR7fTwaq64ZDCibU4BISuLGT2kEgEAuGbuoFZa0rs2 XYD/s4+gi64p9Z01MmXm2XO1pu3GPg0= =eJz5 -----END PGP SIGNATURE----- Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs timestamp updates from Christian Brauner: "This adds VFS support for multi-grain timestamps and converts tmpfs, xfs, ext4, and btrfs to use them. This carries acks from all relevant filesystems. The VFS always uses coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot of metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g., backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. This introduces fine-grained timestamps that are used when they are actively queried. This uses the 31st bit of the ctime tv_nsec field to indicate that something has queried the inode for the mtime or ctime. When this flag is set, on the next mtime or ctime update, the kernel will fetch a fine-grained timestamp instead of the usual coarse-grained one. As POSIX generally mandates that when the mtime changes, the ctime must also change the kernel always stores normalized ctime values, so only the first 30 bits of the tv_nsec field are ever used. Filesytems can opt into this behavior by setting the FS_MGTIME flag in the fstype. Filesystems that don't set this flag will continue to use coarse-grained timestamps. Various preparatory changes, fixes and cleanups are included: - Fixup all relevant places where POSIX requires updating ctime together with mtime. This is a wide-range of places and all maintainers provided necessary Acks. - Add new accessors for inode->i_ctime directly and change all callers to rely on them. Plain accesses to inode->i_ctime are now gone and it is accordingly rename to inode->__i_ctime and commented as requiring accessors. - Extend generic_fillattr() to pass in a request mask mirroring in a sense the statx() uapi. This allows callers to pass in a request mask to only get a subset of attributes filled in. - Rework timestamp updates so it's possible to drop the @now parameter the update_time() inode operation and associated helpers. - Add inode_update_timestamps() and convert all filesystems to it removing a bunch of open-coding" * tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits) btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps tmpfs: add support for multigrain timestamps fs: add infrastructure for multigrain timestamps fs: drop the timespec64 argument from update_time xfs: have xfs_vn_update_time gets its own timestamp fat: make fat_update_time get its own timestamp fat: remove i_version handling from fat_update_time ubifs: have ubifs_update_time use inode_update_timestamps btrfs: have it use inode_update_timestamps fs: drop the timespec64 arg from generic_update_time fs: pass the request_mask to generic_fillattr fs: remove silly warning from current_time gfs2: fix timestamp handling on quota inodes fs: rename i_ctime field to __i_ctime selinux: convert to ctime accessor functions security: convert to ctime accessor functions apparmor: convert to ctime accessor functions sunrpc: convert to ctime accessor functions ...	2023-08-28 09:31:32 -07:00
Jeff Layton	0d72b92883	fs: pass the request_mask to generic_fillattr generic_fillattr just fills in the entire stat struct indiscriminately today, copying data from the inode. There is at least one attribute (STATX_CHANGE_COOKIE) that can have side effects when it is reported, and we're looking at adding more with the addition of multigrain timestamps. Add a request_mask argument to generic_fillattr and have most callers just pass in the value that is passed to getattr. Have other callers (e.g. ksmbd) just pass in STATX_BASIC_STATS. Also move the setting of STATX_CHANGE_COOKIE into generic_fillattr. Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: "Paulo Alcantara (SUSE)" <pc@manguebit.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-2-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2023-08-09 08:56:36 +02:00
Dominique Martinet	cf7c33d332	9p: remove dead stores (variable set again without being read) The 9p code for some reason used to initialize variables outside of the declaration, e.g. instead of just initializing the variable like this: int retval = 0 We would be doing this: int retval; retval = 0; This is perfectly fine and the compiler will just optimize dead stores anyway, but scan-build seems to think this is a problem and there are many of these warnings making the output of scan-build full of such warnings: fs/9p/vfs_inode.c:916:2: warning: Value stored to 'retval' is never read [deadcode.DeadStores] retval = 0; ^ ~ I have no strong opinion here, but if we want to regularly run scan-build we should fix these just to silence the messages. I've confirmed these all are indeed ok to remove. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2023-07-20 19:14:50 +00:00
Jeff Layton	4f87180060	9p: convert to ctime accessor functions In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230705190309.579783-19-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2023-07-13 10:28:03 +02:00
Linus Torvalds	ed23734c23	Including fixes from netfilter. Current release - regressions: - sched: act_pedit: free pedit keys on bail from offset check Current release - new code bugs: - pds_core: - Kconfig fixes (DEBUGFS and AUXILIARY_BUS) - fix mutex double unlock in error path Previous releases - regressions: - sched: cls_api: remove block_cb from driver_list before freeing - nf_tables: fix ct untracked match breakage - eth: mtk_eth_soc: drop generic vlan rx offload - sched: flower: fix error handler on replace Previous releases - always broken: - tcp: fix skb_copy_ubufs() vs BIG TCP - ipv6: fix skb hash for some RST packets - af_packet: don't send zero-byte data in packet_sendmsg_spkt() - rxrpc: timeout handling fixes after moving client call connection to the I/O thread - ixgbe: fix panic during XDP_TX with > 64 CPUs - igc: RMW the SRRCTL register to prevent losing timestamp config - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621 - r8152: - fix flow control issue of RTL8156A - fix the poor throughput for 2.5G devices - move setting r8153b_rx_agg_chg_indicate() to fix coalescing - enable autosuspend - ncsi: clear Tx enable mode when handling a Config required AEN - octeontx2-pf: macsec: fixes for CN10KB ASIC rev Misc: - 9p: remove INET dependency Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmRVeUIACgkQMUZtbf5S IrtTug/9Hhg/L0PTSwrfuGh4W1/cjheMWppNLkwyWQUiKG7FcZQ9vu9PxceE3VRu 2fTqHyvgDMZ8jACovXObeda8z1+g3s/tIPaXELephBIjVlF/h3kG2OaIzlU4jDb4 A4vklwf8eLbfyVBG22QgKl/I70zVMtnmnOo6c6CPuIOTcMPzslndFO9tB0nCg99F DCgCM1BBP1tz+OUch2rLnSzYcqkWqS49BhRk6dhYSliawUFU/5+1tDGDjwWolkfm 0jqP9DjBOSpZKO8m7SpsUNz7NFRIfYErWZ+YebWbggNxj/6TRJTP83MM0tGoK1rE /mz2xpuOki59frlwVOAD6gb/qefjHUp21P4NA7bnhizxFlQL5MHpCeGQ9yLHBSmY 9Q4ArJkM4jXQ0oDA2nII/pz+cDZGEWFGQ14WW3kYUb7WFmISH4I9OiA9i0TBW6OL r1Y/rqzkUvtKWzh9RpiAF9lsdHAm3SX9ES5RfMxzv0x886VOZR4jaMmokRDdPRzq 0r2Oyj75b62+X0r44Fe22Pl/kPS/uh3642xo9h85aAv/EvhT9JNzMvomJm9d6tkb 966I085AVbwxPAy+rl5SWyAq60EWDExNTjZvPv0mSMlmSsQ9iK5//xOF2Saw2zai /44zQ27tVGkCC44Ou5KmfJN3u4OrKkhcuyxtcDr9QeoOdKZRkMg= =9xND -----END PGP SIGNATURE----- Merge tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter. Current release - regressions: - sched: act_pedit: free pedit keys on bail from offset check Current release - new code bugs: - pds_core: - Kconfig fixes (DEBUGFS and AUXILIARY_BUS) - fix mutex double unlock in error path Previous releases - regressions: - sched: cls_api: remove block_cb from driver_list before freeing - nf_tables: fix ct untracked match breakage - eth: mtk_eth_soc: drop generic vlan rx offload - sched: flower: fix error handler on replace Previous releases - always broken: - tcp: fix skb_copy_ubufs() vs BIG TCP - ipv6: fix skb hash for some RST packets - af_packet: don't send zero-byte data in packet_sendmsg_spkt() - rxrpc: timeout handling fixes after moving client call connection to the I/O thread - ixgbe: fix panic during XDP_TX with > 64 CPUs - igc: RMW the SRRCTL register to prevent losing timestamp config - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621 - r8152: - fix flow control issue of RTL8156A - fix the poor throughput for 2.5G devices - move setting r8153b_rx_agg_chg_indicate() to fix coalescing - enable autosuspend - ncsi: clear Tx enable mode when handling a Config required AEN - octeontx2-pf: macsec: fixes for CN10KB ASIC rev Misc: - 9p: remove INET dependency" * tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop() pds_core: fix mutex double unlock in error path net/sched: flower: fix error handler on replace Revert "net/sched: flower: Fix wrong handle assignment during filter change" net/sched: flower: fix filter idr initialization net: fec: correct the counting of XDP sent frames bonding: add xdp_features support net: enetc: check the index of the SFI rather than the handle sfc: Add back mailing list virtio_net: suppress cpu stall when free_unused_bufs ice: block LAN in case of VF to VF offload net: dsa: mt7530: fix network connectivity with multiple CPU ports net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621 9p: Remove INET dependency netfilter: nf_tables: fix ct untracked match breakage af_packet: Don't send zero-byte data in packet_sendmsg_spkt(). igc: read before write to SRRCTL register pds_core: add AUXILIARY_BUS and NET_DEVLINK to Kconfig pds_core: remove CONFIG_DEBUG_FS from makefile ionic: catch failure from devlink_alloc ...	2023-05-05 19:12:01 -07:00
Jason Andryuk	d7385ba137	9p: Remove INET dependency 9pfs can run over assorted transports, so it doesn't have an INET dependency. Drop it and remove the includes of linux/inet.h. NET_9P_FD/trans_fd.o builds without INET or UNIX and is usable over plain file descriptors. However, tcp and unix functionality is still built and would generate runtime failures if used. Add imply INET and UNIX to NET_9P_FD, so functionality is enabled by default but can still be explicitly disabled. This allows configuring 9pfs over Xen with INET and UNIX disabled. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-04 21:46:57 +01:00
Eric Van Hensbergen	21e26d5e54	fs/9p: Fix bit operation logic error This re-introduces a fix that somehow got dropped during rebase of the current series in for-next. When writeback is enabled, opens are forced to support both read and write operations but with the logic error other flags may be dropped unintentionaly. Reported-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2023-04-28 16:59:26 +00:00
Eric Van Hensbergen	4eb3117888	fs/9p: Rework cache modes and add new options to Documentation Switch cache modes to a bit-mask and use legacy cache names as shortcuts. Update documentation to include information on both shortcuts and bitmasks. This patch also fixes missing guards related to fscache. Update the documentation for new mount flags and cache modes. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2023-04-09 21:41:21 +00:00
Eric Van Hensbergen	1543b4c507	fs/9p: remove writeback fid and fix per-file modes This patch removes the creating of an additional writeback_fid for opened files. The patch addresses problems when files were opened write-only or getattr on files with dirty caches. This patch also incorporates information about cache behavior in the fid for every file. This allows us to reflect cache behavior from mount flags, open mode, and information from the server to inform readahead and writeback behavior. This includes adding support for a 9p semantic that qid.version==0 is used to mark a file as non-cachable which is important for synthetic files. This may have a side-effect of not supporting caching on certain legacy file servers that do not properly set qid.version. There is also now a mount flag which can disable the qid.version behavior. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>	2023-03-27 02:33:48 +00:00
Eric Van Hensbergen	d9bc0d11e3	fs/9p: Consolidate file operations and add readahead and writeback We had 3 different sets of file operations across 2 different protocol variants differentiated by cache which really only changed 3 functions. But the real problem is that certain file modes, mount options, and other factors weren't being considered when we decided whether or not to use caches. This consolidates all the operations and switches to conditionals within a common set to decide whether or not to do different aspects of caching. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>	2023-03-27 02:33:39 +00:00
Linus Torvalds	3808330b20	9p patches for 6.3 merge window (part 1) Here is the 9p patches for the 6.3 merge window combining the tested and reviewed patches from both Dominique's for-next tree and my for-next tree. Most of these patches have been in for-next since December with only some reword in the description: - some fixes and cleanup setting up for a larger set of performance patches I've been working on - a contributed fixes relating to 9p/rdma - some contributed fixes relating to 9p/xen I've marked this as part 1, I'm not sure I'll be submitting part 2. There were several performance patches that I wanted to get in, but the revisions after review only went out last week so while they have been tested, I haven't received reviews on the revisions. Its been about a decade since I've submitted a pull request, sorry if I messed anything up. -eric -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElpbw0ZalkJikytFRiP/V+0pf/5gFAmP/frIACgkQiP/V+0pf /5jb2w//dccypyaurk441RSdbsIVZUqf8aiGDzRSyX/iZBrUU4nmavdx8+qycpvc MVdD3WBiv7+PvdyYnt6EUGZYweGbI7vbVrpUpf20aq1q5A5oDWIR8K1EfrTUJQAd j69ALc4Qbq4FmX7kBOKYuj+qUvnnrcyDFQwTHTQ6o8d0MtpmW7MZr2kUiCDKeN3m 8K2uemR83Azjb2m5TvhWRSjb+61cf03W/Jw4+8PsbtfzX8MyGblqUsgODi35IdiJ c2aO+JF0Cw4y9ulw7PR9SkX0yi6CY4Ll/pGV9xW0liIs6E6FQboiwczr+SZNzQoc TUC3S7bGSyodiCNTp685sK3KfHaxj5QHBvilUL/t7cgjOWwWSkl8neg9sbI8Bc7P WxQ4p6cqnMXMJiHAxk70QfIvCsabLSfTjBhN5zAUwjLmjQsxboFWHloKwndea49v 32NHEhGwEt7QSE1WHdJ4m6xfuNRSayriOoQ28Yxyg8ekpK+7bWkI8CXGS3Cq9kGQ SGhX/UZqT5J1YCvBAwh9s7d5hhHUBxaVz74Pssvbd/PkeKI0CiXFoJM5cu32F+p2 4gsY67NcyUjJZOq0NsUxFTqQXw4jdrnkcrnGD4istRFyDMf/3hhuV/F+SvPhBn0l GdMDwXjEkZLSYQi14TADp/V5DfDzRGRpquo++XbqikQyAgoNcKI= =Cvvf -----END PGP SIGNATURE----- Merge tag '9p-6.3-for-linus-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p updates from Eric Van Hensbergen: - some fixes and cleanup setting up for a larger set of performance patches I've been working on - a contributed fixes relating to 9p/rdma - some contributed fixes relating to 9p/xen * tag '9p-6.3-for-linus-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: fix error reporting in v9fs_dir_release net/9p: fix bug in client create for .L 9p/rdma: unmap receive dma buffer in rdma_request()/post_recv() 9p/xen: fix connection sequence 9p/xen: fix version parsing fs/9p: Expand setup of writeback cache to all levels net/9p: Adjust maximum MSIZE to account for p9 header	2023-03-01 08:52:49 -08:00
Eric Van Hensbergen	344504e912	fs/9p: Expand setup of writeback cache to all levels If cache is enabled, make sure we are putting the right things in place (mainly impacts mmap). This also sets us up for more cache levels. Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org> Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>	2023-02-23 22:39:36 +00:00
Christian Brauner	f2d40141d5	fs: port inode_init_owner() to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:28 +01:00
Christian Brauner	e18275ae55	fs: port ->rename() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Christian Brauner	5ebb29bee8	fs: port ->mknod() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Christian Brauner	c54bd91e9e	fs: port ->mkdir() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Christian Brauner	7a77db9551	fs: port ->symlink() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Christian Brauner	6c960e68aa	fs: port ->create() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Christian Brauner	b74d24f7a7	fs: port ->getattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Christian Brauner	c1632a0f11	fs: port ->setattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:02 +01:00
Christophe JAILLET	6e0149a553	9p/fs: Remove unneeded idr.h #include The 9p fs does not use IDR or IDA functionalities. So there is no point in including <linux/idr.h>. Remove it. Link: https://lkml.kernel.org/r/3d1e0ed9714eaee7e18d9f5b0b4bfa49b00b286d.1669553950.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> [Dominique: reword subject] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2022-12-02 23:59:15 +09:00
Dominique Martinet	dafbe68973	9p fid refcount: cleanup p9_fid_put calls Simplify p9_fid_put cleanup path in many 9p functions since the function is noop on null or error fids. Also make the *_add_fid() helpers "steal" the fid by nulling its pointer, so put after them will be noop. This should lead to no change of behaviour Link: https://lkml.kernel.org/r/20220612085330.1451496-7-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2022-07-02 18:52:21 +09:00
Dominique Martinet	b48dbb998d	9p fid refcount: add p9_fid_get/put wrappers I was recently reminded that it is not clear that p9_client_clunk() was actually just decrementing refcount and clunking only when that reaches zero: make it clear through a set of helpers. This will also allow instrumenting refcounting better for debugging next patch Link: https://lkml.kernel.org/r/20220612085330.1451496-5-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2022-07-02 18:52:21 +09:00
Dominique Martinet	e5690f2632	9p: fix fid refcount leak in v9fs_vfs_get_link we check for protocol version later than required, after a fid has been obtained. Just move the version check earlier. Link: https://lkml.kernel.org/r/20220612085330.1451496-3-asmadeus@codewreck.org Fixes: `6636b6dcc3` ("9p: add refcount to p9_fid struct") Cc: stable@vger.kernel.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2022-06-15 12:05:29 +09:00

1 2 3 4 5 ...

303 Commits