linux/fs
Linus Torvalds 7c8a4671dc vfs-7.1-rc1.mount.v2
Please consider pulling these changes from the signed vfs-7.1-rc1.mount.v2 tag.
 
 Thanks!
 Christian

Merge tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

 - Add FSMOUNT_NAMESPACE flag to fsmount() that creates a new mount
   namespace with the newly created filesystem attached to a copy of the
   real rootfs. This returns a namespace file descriptor instead of an
   O_PATH mount fd, similar to how OPEN_TREE_NAMESPACE works for
   open_tree().

   This allows creating a new filesystem and immediately placing it in a
   new mount namespace in a single operation, which is useful for
   container runtimes and other namespace-based isolation mechanisms.

   This accompanies OPEN_TREE_NAMESPACE and avoids a needless detour via
   OPEN_TREE_NAMESPACE to get the same effect. It will be especially
   useful when mounting an actual filesystem to be used as the container
   rootfs.
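
   The single-operation flow described above might look roughly like the
   sketch below. It uses raw syscall(2) invocations of fsopen(),
   fsconfig() and fsmount(). The numeric value of FSMOUNT_NAMESPACE is
   not given in this message, so the sketch takes it as a parameter
   rather than guessing; fsmount_into_new_mntns() and
   SKETCH_FSCONFIG_CMD_CREATE are hypothetical names used only for
   illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef FSOPEN_CLOEXEC
#define FSOPEN_CLOEXEC 0x00000001
#endif
#ifndef FSMOUNT_CLOEXEC
#define FSMOUNT_CLOEXEC 0x00000001
#endif

/* Numeric value of FSCONFIG_CMD_CREATE in the uapi <linux/mount.h>. */
enum { SKETCH_FSCONFIG_CMD_CREATE = 6 };

/*
 * Create a filesystem of type @fstype and ask fsmount() for a mount
 * namespace fd instead of an O_PATH mount fd.
 *
 * @fsmount_namespace_flag is an assumption of this sketch: pass the value
 * of FSMOUNT_NAMESPACE from the installed kernel headers.
 *
 * Returns the fd handed back by fsmount(), or -1 with errno set.
 */
int fsmount_into_new_mntns(const char *fstype, unsigned int fsmount_namespace_flag)
{
	int fs_fd, ns_fd, saved;

	assert(fstype != NULL);

	/* Open a filesystem context for the requested type. */
	fs_fd = (int)syscall(SYS_fsopen, fstype, FSOPEN_CLOEXEC);
	if (fs_fd < 0)
		return -1;

	/* Create the superblock (no mount options are set in this sketch). */
	if (syscall(SYS_fsconfig, fs_fd, SKETCH_FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0) {
		saved = errno;
		close(fs_fd);
		errno = saved;
		return -1;
	}

	/* Request a namespace fd rather than a detached mount fd. */
	ns_fd = (int)syscall(SYS_fsmount, fs_fd,
			     FSMOUNT_CLOEXEC | fsmount_namespace_flag, 0);
	saved = errno;
	close(fs_fd);
	errno = saved;
	return ns_fd;
}
```

   Since the returned descriptor refers to a mount namespace, a caller
   would presumably enter it with setns(fd, CLONE_NEWNS). The sketch
   needs the privileges that fsopen() normally requires and fails with
   -1 otherwise.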

 - Currently, creating a new mount namespace always copies the entire
   mount tree from the caller's namespace. For containers and sandboxes
   that intend to build their mount table from scratch this is wasteful:
   they inherit a potentially large mount tree only to immediately tear
   it down.

   This series adds support for creating a mount namespace that contains
   only a clone of the root mount, with none of the child mounts. Two
   new flags are introduced:

     - CLONE_EMPTY_MNTNS (0x400000000) for clone3(), using the 64-bit flag space
     - UNSHARE_EMPTY_MNTNS (0x00100000) for unshare()

   Both flags imply CLONE_NEWNS. The resulting namespace contains a
   single nullfs root mount with an immutable empty directory. The
   intended workflow is to then mount a real filesystem (e.g., tmpfs)
   over the root and build the mount table from there.
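
   A minimal sketch of how the two flags might be used from C. The
   numeric values are copied from the list above, not from a released
   header, so treat them as assumptions and prefer the definitions in
   the installed uapi headers. fits_unshare_flags() and
   unshare_empty_mntns() are hypothetical helper names.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as quoted in the pull message; installed headers take precedence. */
#ifndef CLONE_EMPTY_MNTNS
#define CLONE_EMPTY_MNTNS 0x400000000ULL
#endif
#ifndef UNSHARE_EMPTY_MNTNS
#define UNSHARE_EMPTY_MNTNS 0x00100000UL
#endif

/*
 * The unshare() wrapper takes an int-sized flags argument, while clone3()
 * carries a 64-bit flags field in struct clone_args. Returns non-zero if
 * @flag can be expressed in the int-sized argument.
 */
int fits_unshare_flags(uint64_t flag)
{
	return flag <= (uint64_t)INT_MAX;
}

/*
 * Ask for a new mount namespace that holds only the root mount. Per the
 * text, the flag implies CLONE_NEWNS, so it is passed on its own.
 * Returns 0 on success, -1 with errno set on failure.
 */
int unshare_empty_mntns(void)
{
	assert(fits_unshare_flags(UNSHARE_EMPTY_MNTNS));
	return (int)syscall(SYS_unshare, (unsigned long)UNSHARE_EMPTY_MNTNS);
}
```

   The width check shows why the clone3() variant needs the 64-bit flag
   space: the quoted CLONE_EMPTY_MNTNS value lies above bit 31. On a
   kernel without this series, unshare_empty_mntns() is expected to
   fail, typically with EINVAL.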

 - Allow MOVE_MOUNT_BENEATH to target the caller's rootfs, so that the
   rootfs can be switched out without pivot_root(2).

   The traditional approach to switching the rootfs involves
   pivot_root(2) or a chroot_fs_refs()-based mechanism that atomically
   updates fs->root for all tasks sharing the same fs_struct. This has
   consequences for fork(), unshare(CLONE_FS), and setns().

   This series instead decomposes root-switching into individually
   atomic, locally-scoped steps:

	fd_tree = open_tree(-EBADF, "/newroot", OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
	fchdir(fd_tree);
	move_mount(fd_tree, "", AT_FDCWD, "/", MOVE_MOUNT_BENEATH | MOVE_MOUNT_F_EMPTY_PATH);
	chroot(".");
	umount2(".", MNT_DETACH);

   Since each step only modifies the caller's own state, the
   fork/unshare/setns races are eliminated by design.

   A key step in making this possible is removing the locked mount
   restriction. Originally, MOVE_MOUNT_BENEATH did not support mounting
   beneath a mount that is locked. A locked mount protects the
   underlying mount from being revealed. This is a core mechanism of
   unshare(CLONE_NEWUSER | CLONE_NEWNS): the mounts in the new mount
   namespace become locked. That effectively makes the new mount table
   useless, as the caller can never get rid of any of the mounts no
   matter how useless they are.

   We can lift this restriction though. We simply transfer the locked
   property from the top mount to the mount beneath. This works because
   what we care about is protecting the underlying mount, i.e. the
   parent. The mount placed between the parent and the top mount takes
   over the job of protecting the parent mount from the top mount. This
   leaves us free to remove the locked property from the top mount, which
   can consequently be unmounted. For example, if we do:

	unshare(CLONE_NEWUSER | CLONE_NEWNS)

   and we inherit a clone of procfs on /proc, then currently we cannot
   unmount it, as:

	umount -l /proc

   will fail with EINVAL because the procfs mount is locked.

   After this series we can now do:

	mount --beneath -t tmpfs tmpfs /proc
	umount -l /proc

   after which a tmpfs mount has been placed beneath the procfs mount.
   The tmpfs mount has become locked and the procfs mount has become
   unlocked.

   This means you can safely modify an inherited mount table after
   unprivileged namespace creation.
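
   The transfer of the locked property can be illustrated with a toy
   model. This is not the kernel implementation: struct toy_mount,
   toy_umount() and toy_mount_beneath() are hypothetical names, and the
   model only tracks a parent pointer and a locked bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a mount: just a name, a parent and a locked bit. */
struct toy_mount {
	const char *name;
	struct toy_mount *parent;
	bool locked;
};

/* Unmounting a locked mount is refused; returns 0 on success, -1 otherwise. */
int toy_umount(struct toy_mount *m)
{
	assert(m != NULL);
	if (m->locked)
		return -1;
	m->parent = NULL;
	return 0;
}

/*
 * Place @beneath between @top and @top's parent. The locked property moves
 * from @top to @beneath, so the parent stays covered by a locked mount
 * while @top becomes removable.
 */
void toy_mount_beneath(struct toy_mount *top, struct toy_mount *beneath)
{
	assert(top != NULL && beneath != NULL);
	beneath->parent = top->parent;
	top->parent = beneath;
	beneath->locked = top->locked;
	top->locked = false;
}
```

   In this model, with a locked procfs mount on top of the root,
   toy_umount() on procfs fails at first. After toy_mount_beneath()
   places a tmpfs mount beneath it, procfs can be unmounted and the
   tmpfs mount is the one that refuses, mirroring the
   mount --beneath / umount -l sequence above.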

   Building on that, we simply make it possible to move a mount beneath
   the rootfs, which allows the rootfs to be upgraded.

   Removing the locked restriction makes this very useful for containers
   created with unshare(CLONE_NEWUSER | CLONE_NEWNS), which can now
   reshuffle an inherited mount table safely. MOVE_MOUNT_BENEATH in turn
   makes it possible to switch out the rootfs instead of using the
   costly pivot_root(2).

* tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/namespaces: remove unused utils.h include from listns_efault_test
  selftests/fsmount_ns: add missing TARGETS and fix cap test
  selftests/empty_mntns: fix wrong CLONE_EMPTY_MNTNS hex value in comment
  selftests/empty_mntns: fix statmount_alloc() signature mismatch
  selftests/statmount: remove duplicate wait_for_pid()
  mount: always duplicate mount
  selftests/filesystems: add MOVE_MOUNT_BENEATH rootfs tests
  move_mount: allow MOVE_MOUNT_BENEATH on the rootfs
  move_mount: transfer MNT_LOCKED
  selftests/filesystems: add clone3 tests for empty mount namespaces
  selftests/filesystems: add tests for empty mount namespaces
  namespace: allow creating empty mount namespaces
  selftests: add FSMOUNT_NAMESPACE tests
  selftests/statmount: add statmount_alloc() helper
  tools: update mount.h header
  mount: add FSMOUNT_NAMESPACE
  mount: simplify __do_loopback()
  mount: start iterating from start of rbtree
2026-04-14 19:59:25 -07:00
9p treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
adfs vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
affs affs-for-7.1-tag 2026-04-13 16:39:01 -07:00
afs vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
autofs vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
befs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
bfs vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
btrfs for-7.1-tag 2026-04-13 16:35:32 -07:00
cachefiles vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
ceph vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
coda treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
configfs
cramfs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
crypto fscrypt updates for 7.1 2026-04-13 17:29:12 -07:00
debugfs debugfs: fix placement of EXPORT_SYMBOL_GPL for debugfs_create_str() 2026-04-02 16:15:23 +02:00
devpts
dlm ipv6: convert CONFIG_IPV6 to built-in only and clean up Kconfigs 2026-03-29 11:21:22 -07:00
ecryptfs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
efivarfs
efs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
erofs Changes since last update: 2026-04-13 16:59:19 -07:00
exfat Description for this pull request: 2026-04-13 16:57:31 -07:00
exportfs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
ext2 vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
ext4 fscrypt updates for 7.1 2026-04-13 17:29:12 -07:00
f2fs fscrypt updates for 7.1 2026-04-13 17:29:12 -07:00
fat vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
freevxfs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
fuse lsm/stable-7.1 PR 20260410 2026-04-13 15:17:28 -07:00
gfs2 Networking changes for 7.1. 2026-04-14 18:36:10 -07:00
hfs vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
hfsplus hfs/hfsplus updates for v7.1 2026-04-13 16:50:38 -07:00
hostfs
hpfs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
hugetlbfs hugetlbfs: Stop using i_private_data 2026-03-26 15:03:30 +01:00
iomap fscrypt updates for 7.1 2026-04-13 17:29:12 -07:00
isofs treewide: fix missed i_ino format specifier conversions 2026-03-06 14:31:30 +01:00
jbd2 vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
jffs2 treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
jfs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
kernfs Driver core changes for 7.1-rc1 2026-04-13 19:03:11 -07:00
lockd treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
minix vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
netfs netfs: Fix the handling of stream->front by removing it 2026-03-26 15:18:45 +01:00
nfs vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
nfs_common
nfsd vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
nilfs2 nilfs2 updates for v7.1 2026-04-13 16:53:19 -07:00
nls
notify treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
ntfs3 vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
ocfs2 vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
omfs vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
openpromfs
orangefs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
overlayfs lsm/stable-7.1 PR 20260410 2026-04-13 15:17:28 -07:00
proc vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
pstore pstore/ftrace: Factor KASLR offset in the core kernel instruction addresses 2026-04-10 23:59:41 -07:00
qnx4 vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
qnx6 vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
quota
ramfs
resctrl fs/resctrl: Add missing return value descriptions 2026-04-07 21:01:22 +02:00
romfs
smb smb: client: allow both 'lease' and 'nolease' mount options 2026-04-13 09:14:54 -05:00
squashfs
sysfs Driver core changes for 7.1-rc1 2026-04-13 19:03:11 -07:00
tests fs/tests: exec: Remove bad test vector 2026-03-18 11:41:53 -07:00
tracefs
ubifs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
udf vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
ufs vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
unicode
vboxsf
verity vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
xfs xfs: new code for Linux 7.1 2026-04-13 17:03:48 -07:00
zonefs treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
Kconfig
Kconfig.binfmt
Makefile
aio.c aio: Stop using i_private_data and i_private_lock 2026-03-26 15:03:30 +01:00
anon_inodes.c
attr.c fs: attr: fix comment formatting and spelling issues 2026-04-07 11:26:11 +02:00
backing-file.c lsm: add backing_file LSM hooks 2026-04-03 16:53:50 -04:00
bad_inode.c
binfmt_elf.c
binfmt_elf_fdpic.c vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
bpf_fs_kfuncs.c
buffer.c fscrypt updates for 7.1 2026-04-13 17:29:12 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
d_path.c dcache: permit dynamic_dname()s up to NAME_MAX 2026-04-07 12:32:22 +02:00
dax.c
dcache.c twenty six smb3 client fixes 2026-04-13 17:09:00 -07:00
direct-io.c
drop_caches.c
eventfd.c
eventpoll.c vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
exec.c exec: use strnlen() in __set_task_comm 2026-04-01 12:26:07 -07:00
fcntl.c
fhandle.c
file.c vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
file_attr.c
file_table.c lsm/stable-7.1 PR 20260410 2026-04-13 15:17:28 -07:00
filesystems.c
fs-writeback.c writeback: don't block sync for filesystems with no data integrity guarantees 2026-03-20 14:18:56 +01:00
fs_context.c vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
fs_dirent.c
fs_parser.c
fs_pin.c
fs_struct.c
fserror.c treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
fsopen.c
init.c
inode.c vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
internal.h lsm/stable-7.1 PR 20260410 2026-04-13 15:17:28 -07:00
ioctl.c
kernel_read_file.c
libfs.c vfs-7.1-rc1.bh.metadata 2026-04-13 12:46:42 -07:00
locks.c treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
mbcache.c vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
mnt_idmapping.c
mount.h
mpage.c mpage: Provide variant of mpage_writepages() with own optional folio handler 2026-03-27 17:01:36 +01:00
namei.c vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
namespace.c mount: always duplicate mount 2026-04-14 09:30:15 +02:00
nsfs.c vfs-7.1-rc1.kino 2026-04-13 12:19:01 -07:00
nullfs.c
open.c fs: remove do_sys_truncate 2026-03-23 12:41:58 +01:00
pidfs.c vfs-7.1-rc1.pidfs 2026-04-13 13:27:11 -07:00
pipe.c treewide: change inode->i_ino from unsigned long to u64 2026-03-06 14:31:28 +01:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c
readdir.c fs: Replace user_access_{begin/end} by scoped user access 2026-03-24 14:44:02 +01:00
remap_range.c
select.c vfs-7.1-rc1.misc 2026-04-13 14:20:11 -07:00
seq_file.c
signalfd.c
splice.c
stack.c
stat.c
statfs.c
super.c
sync.c
sysctls.c
timerfd.c
userfaultfd.c
utimes.c
xattr.c