linux/include/uapi/linux
Linus Torvalds 7c8a4671dc vfs-7.1-rc1.mount.v2
Please consider pulling these changes from the signed vfs-7.1-rc1.mount.v2 tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCad3vFgAKCRCRxhvAZXjc
 onXwAQDwEGvpMUUiuI/JWFqCA5vY5LXXr/36wdcs0iUL1uy9IgEAyOdnYhYkcaX1
 3lm87f6OmYkhlq6enJbco7uT4CUzlQA=
 =1Ls8
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

 - Add FSMOUNT_NAMESPACE flag to fsmount() that creates a new mount
   namespace with the newly created filesystem attached to a copy of the
   real rootfs. This returns a namespace file descriptor instead of an
   O_PATH mount fd, similar to how OPEN_TREE_NAMESPACE works for
   open_tree().

   This allows creating a new filesystem and immediately placing it in a
   new mount namespace in a single operation, which is useful for
   container runtimes and other namespace-based isolation mechanisms.

   This accompanies OPEN_TREE_NAMESPACE and avoids a needless detour via
   OPEN_TREE_NAMESPACE to get the same effect. Will be especially useful
   when you mount an actual filesystem to be used as the container
   rootfs.

 - Currently, creating a new mount namespace always copies the entire
   mount tree from the caller's namespace. For containers and sandboxes
   that intend to build their mount table from scratch this is wasteful:
   they inherit a potentially large mount tree only to immediately tear
   it down.

   This series adds support for creating a mount namespace that contains
   only a clone of the root mount, with none of the child mounts. Two
   new flags are introduced:

     - CLONE_EMPTY_MNTNS (0x400000000) for clone3(), using the 64-bit flag space
     - UNSHARE_EMPTY_MNTNS (0x00100000) for unshare()

   Both flags imply CLONE_NEWNS. The resulting namespace contains a
   single nullfs root mount with an immutable empty directory. The
   intended workflow is to then mount a real filesystem (e.g., tmpfs)
   over the root and build the mount table from there.

 - Allow MOVE_MOUNT_BENEATH to target the caller's rootfs, allowing to
   switch out the rootfs without pivot_root(2).

   The traditional approach to switching the rootfs involves
   pivot_root(2) or a chroot_fs_refs()-based mechanism that atomically
   updates fs->root for all tasks sharing the same fs_struct. This has
   consequences for fork(), unshare(CLONE_FS), and setns().

   This series instead decomposes root-switching into individually
   atomic, locally-scoped steps:

	fd_tree = open_tree(-EBADF, "/newroot", OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
	fchdir(fd_tree);
	move_mount(fd_tree, "", AT_FDCWD, "/", MOVE_MOUNT_BENEATH | MOVE_MOUNT_F_EMPTY_PATH);
	chroot(".");
	umount2(".", MNT_DETACH);

   Since each step only modifies the caller's own state, the
   fork/unshare/setns races are eliminated by design.

   A key step to making this possible is to remove the locked mount
   restriction. Originally MOVE_MOUNT_BENEATH doesn't support mounting
   beneath a mount that is locked. The locked mount protects the
   underlying mount from being revealed. This is a core mechanism of
   unshare(CLONE_NEWUSER | CLONE_NEWNS). The mounts in the new mount
   namespace become locked. That effectively makes the new mount table
   useless as the caller cannot ever get rid of any of the mounts no
   matter how useless they are.

   We can lift this restriction though. We simply transfer the locked
   property from the top mount to the mount beneath. This works because
   what we care about is to protect the underlying mount aka the parent.
   The mount mounted between the parent and the top mount takes over the
   job of protecting the parent mount from the top mount mount. This
   leaves us free to remove the locked property from the top mount which
   can consequently be unmounted:

	unshare(CLONE_NEWUSER | CLONE_NEWNS)

   and we inherit a clone of procfs on /proc then currently we cannot
   unmount it as:

	umount -l /proc

   will fail with EINVAL because the procfs mount is locked.

   After this series we can now do:

	mount --beneath -t tmpfs tmpfs /proc
	umount -l /proc

   after which a tmpfs mount has been placed beneath the procfs mount.
   The tmpfs mount has become locked and the procfs mount has become
   unlocked.

   This means you can safely modify an inherited mount table after
   unprivileged namespace creation.

   Afterwards we simply make it possible to move a mount beneath the
   rootfs allowing to upgrade the rootfs.

   Removing the locked restriction makes this very useful for containers
   created with unshare(CLONE_NEWUSER | CLONE_NEWNS) to reshuffle an
   inherited mount table safely and MOVE_MOUNT_BENEATH makes it possible
   to switch out the rootfs instead of using the costly pivot_root(2).

* tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/namespaces: remove unused utils.h include from listns_efault_test
  selftests/fsmount_ns: add missing TARGETS and fix cap test
  selftests/empty_mntns: fix wrong CLONE_EMPTY_MNTNS hex value in comment
  selftests/empty_mntns: fix statmount_alloc() signature mismatch
  selftests/statmount: remove duplicate wait_for_pid()
  mount: always duplicate mount
  selftests/filesystems: add MOVE_MOUNT_BENEATH rootfs tests
  move_mount: allow MOVE_MOUNT_BENEATH on the rootfs
  move_mount: transfer MNT_LOCKED
  selftests/filesystems: add clone3 tests for empty mount namespaces
  selftests/filesystems: add tests for empty mount namespaces
  namespace: allow creating empty mount namespaces
  selftests: add FSMOUNT_NAMESPACE tests
  selftests/statmount: add statmount_alloc() helper
  tools: update mount.h header
  mount: add FSMOUNT_NAMESPACE
  mount: simplify __do_loopback()
  mount: start iterating from start of rbtree
2026-04-14 19:59:25 -07:00
..
android
byteorder
caif
can
cifs
counter
dvb
genwqe
hdlc
hsi
iio
io_uring io_uring/zcrx: implement device-less mode for zcrx 2026-04-01 10:21:12 -06:00
isdn
media
misc
mmc
netfilter netfilter: nf_tables: add netlink policy based cap on registers 2026-04-08 07:51:31 +02:00
netfilter_arp
netfilter_bridge
netfilter_ipv4
netfilter_ipv6
nfsd
raid
sched
spi
sunrpc
surface_aggregator
tc_act
tc_ematch
usb
a.out.h
acct.h
acrn.h
adb.h
adfs_fs.h
affs_hardblocks.h
agpgart.h
aio_abi.h
am437x-vpfe.h
amt.h
apm_bios.h
arcfb.h
arm_sdei.h
aspeed-lpc-ctrl.h
aspeed-p2a-ctrl.h
aspeed-video.h
atalk.h
atm.h
atm_eni.h
atm_he.h
atm_idt77105.h
atm_nicstar.h
atm_tcp.h
atm_zatm.h
atmapi.h
atmarp.h
atmbr2684.h
atmclip.h
atmdev.h
atmioc.h
atmlec.h
atmmpc.h
atmppp.h
atmsap.h
atmsvc.h
audit.h
auto_dev-ioctl.h
auto_fs.h
auto_fs4.h
auxvec.h
ax25.h
batadv_packet.h
batman_adv.h
baycom.h
bcm933xx_hcs.h
bfs_fs.h
binfmts.h
bits.h
blk-crypto.h
blkdev.h
blkpg.h
blktrace_api.h
blkzoned.h
bpf.h bpf: Clarify BPF_RB_NO_WAKEUP behavior for bpf_ringbuf_discard() 2026-03-31 15:46:34 -07:00
bpf_common.h
bpf_perf_event.h
bpqether.h
bsg.h bsg: add bsg_uring_cmd uapi structure 2026-03-19 11:38:24 -06:00
bt-bmc.h
btf.h btf: Add BTF kind layout encoding to UAPI 2026-03-26 13:53:56 -07:00
btrfs.h
btrfs_tree.h btrfs: tree-checker: introduce checks for FREE_SPACE_INFO 2026-04-07 18:56:01 +02:00
cachefiles.h
can.h
capability.h
capi.h
cciss_defs.h
cciss_ioctl.h
ccs.h
cdrom.h
cec-funcs.h
cec.h
cfm_bridge.h
cgroupstats.h
chio.h
close_range.h
cn_proc.h
coda.h
coff.h
comedi.h
connector.h
const.h
coredump.h
coresight-stm.h
counter.h
cramfs_fs.h
cryptouser.h
cuda.h
cxl_mem.h
cyclades.h
cycx_cfm.h
dcbnl.h
dccp.h
dev_energymodel.h
devlink.h devlink: Add resource scope filtering to resource dump 2026-04-08 19:55:39 -07:00
dlm.h
dlm_device.h
dlm_plock.h
dlmconstants.h
dm-ioctl.h
dm-log-userspace.h
dma-buf.h
dma-heap.h
dns_resolver.h
dpll.h dpll: add frequency monitoring to netlink spec 2026-04-03 16:48:01 -07:00
dqblk_xfs.h
dw100.h
edd.h
efs_fs_sb.h
elf-em.h
elf-fdpic.h
elf.h
errno.h
errqueue.h
erspan.h
ethtool.h
ethtool_netlink.h
ethtool_netlink_generated.h net: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS 2026-03-18 20:01:10 -07:00
eventfd.h
eventpoll.h
exfat.h
ext4.h
f2fs.h
fadvise.h
falloc.h
fanotify.h
fb.h
fcntl.h
fd.h
fdreg.h
fib_rules.h
fiemap.h
filter.h
firewire-cdev.h
firewire-constants.h
fou.h
fpga-dfl.h
fs.h
fscrypt.h
fsi.h
fsl_hypervisor.h
fsl_mc.h
fsmap.h
fsverity.h
fuse.h
futex.h
gameport.h
gen_stats.h
genetlink.h
gfs2_ondisk.h
gpib.h
gpib_ioctl.h
gpio.h
gsmmux.h
gtp.h
handshake.h
hash_info.h
hdlc.h
hdlcdrv.h
hdreg.h
hid.h
hiddev.h
hidraw.h
hpet.h
hsr_netlink.h
hw_breakpoint.h
hyperv.h
i2c-dev.h
i2c.h
i2o-dev.h
i8k.h
icmp.h
icmpv6.h
idxd.h
if.h
if_addr.h
if_addrlabel.h
if_alg.h
if_arcnet.h
if_arp.h
if_bonding.h
if_bridge.h
if_eql.h
if_ether.h
if_fc.h
if_fddi.h
if_hippi.h
if_infiniband.h
if_link.h net: bridge: add stp_mode attribute for STP mode selection 2026-04-10 15:52:24 -07:00
if_ltalk.h
if_macsec.h
if_packet.h
if_phonet.h
if_plip.h
if_ppp.h
if_pppol2tp.h
if_pppox.h
if_slip.h
if_team.h
if_tun.h
if_tunnel.h
if_vlan.h
if_x25.h
if_xdp.h
ife.h
igmp.h
ila.h
in.h
in6.h
in_route.h
inet_diag.h
inotify.h
input-event-codes.h Input: add keycodes for contextual AI usages (HUTRR119) 2026-03-29 22:02:11 +02:00
input.h
io_uring.h
ioam6.h
ioam6_genl.h
ioam6_iptunnel.h
ioctl.h
iommufd.h
ioprio.h
ip.h
ip6_tunnel.h
ip_vs.h
ipc.h
ipmi.h
ipmi_bmc.h
ipmi_msgdefs.h
ipmi_ssif_bmc.h
ipsec.h
ipv6.h
ipv6_route.h
irqnr.h
iso_fs.h
isst_if.h
ivtv.h
ivtvfb.h
jffs2.h
joystick.h
kcm.h
kcmp.h
kcov.h
kd.h
kdev_t.h
kernel-page-flags.h
kernel.h
kernelcapi.h
kexec.h
keyboard.h
keyctl.h
kfd_ioctl.h
kfd_sysfs.h
kvm.h
kvm_para.h
l2tp.h
landlock.h landlock: Control pathname UNIX domain socket resolution by path 2026-04-07 18:51:06 +02:00
libc-compat.h
limits.h
lirc.h
liveupdate.h
llc.h
loadpin.h
lockd_netlink.h
loop.h
lp.h
lsm.h
lwtunnel.h
magic.h
major.h
map_benchmark.h
map_to_7segment.h
map_to_14segment.h
matroxfb.h
max2175.h
mctp.h
mdio.h
media-bus-format.h
media.h
mei.h
mei_uuid.h
membarrier.h
memfd.h
mempolicy.h
mii.h net: phy: qcom: at803x: Use the correct bit to disable extended next page 2026-04-13 14:36:22 -07:00
minix_fs.h
mman.h
mmtimer.h
module.h
module_signature.h module: Move 'struct module_signature' to UAPI 2026-03-24 21:42:37 +00:00
mount.h
mpls.h
mpls_iptunnel.h
mptcp.h
mptcp_pm.h
mqueue.h
mroute.h
mroute6.h
mrp_bridge.h
msdos_fs.h
msg.h
mshv.h
mtio.h
nbd-netlink.h
nbd.h
ncsi.h
ndctl.h
neighbour.h
net.h
net_dropmon.h
net_namespace.h
net_shaper.h
net_tstamp.h
netconf.h
netdev.h net: Add queue-create operation 2026-04-09 18:21:45 -07:00
netdevice.h
netfilter.h
netfilter_arp.h
netfilter_bridge.h
netfilter_ipv4.h
netfilter_ipv6.h
netlink.h
netlink_diag.h
netrom.h
nexthop.h
nfc.h
nfs.h
nfs2.h
nfs3.h
nfs4.h
nfs4_mount.h
nfs_fs.h
nfs_idmap.h
nfs_mount.h
nfsacl.h
nfsd_netlink.h
nilfs2_api.h
nilfs2_ondisk.h
nitro_enclaves.h
nl80211-vnd-intel.h
nl80211.h wifi: nl80211: Add a notification to notify NAN channel evacuation 2026-03-25 20:56:55 +01:00
npcm-video.h
nsfs.h
nsm.h
ntsync.h
nubus.h
nvme_ioctl.h
nvram.h
omap3isp.h
omapfb.h
oom.h
openat2.h
openvswitch.h
ovpn.h ovpn: add support for asymmetric peer IDs 2026-03-17 11:09:05 +01:00
packet_diag.h
papr_pdsm.h
param.h
parport.h
patchkey.h
pci.h
pci_regs.h
pcitest.h
perf_event.h
personality.h
pfkeyv2.h
pfrut.h
pg.h
phantom.h
phonet.h
pidfd.h pidfds: add coredump_code field to pidfd_info 2026-03-23 16:29:15 +01:00
pkt_cls.h
pkt_sched.h
pktcdvd.h
pmu.h
poll.h
posix_acl.h
posix_acl_xattr.h
posix_types.h
ppdev.h
ppp-comp.h
ppp-ioctl.h
ppp_defs.h
pps.h
pps_gen.h
pr.h
prctl.h prctl: cfi: change the branch landing pad prctl()s to be more descriptive 2026-04-04 18:40:58 -06:00
psample.h
psci.h
psp-dbc.h
psp-sev.h
psp-sfs.h
psp.h
ptp_clock.h
ptrace.h
pwm.h
qemu_fw_cfg.h
qnx4_fs.h
qnxtypes.h
qrtr.h
quota.h
radeonfb.h
random.h
rds.h
reboot.h
remoteproc_cdev.h
resource.h
rfkill.h
rio_cm_cdev.h
rio_mport_cdev.h
rkisp1-config.h
romfs_fs.h
rose.h
route.h
rpl.h
rpl_iptunnel.h
rpmsg.h
rpmsg_types.h
rseq.h
rtc.h
rtnetlink.h
rxrpc.h
scc.h
sched.h vfs-7.1-rc1.mount.v2 2026-04-14 19:59:25 -07:00
scif_ioctl.h
screen_info.h
sctp.h
seccomp.h
securebits.h
sed-opal.h sed-opal: Add STACK_RESET command 2026-03-31 07:04:00 -06:00
seg6.h
seg6_genl.h
seg6_hmac.h
seg6_iptunnel.h seg6: add per-route tunnel source address 2026-03-26 18:45:29 -07:00
seg6_local.h
selinux_netlink.h
sem.h
serial.h
serial_core.h
serial_reg.h
serio.h
sev-guest.h
shm.h
signal.h
signalfd.h
smc.h
smc_diag.h
smiapp.h
snmp.h
sock_diag.h
socket.h
sockios.h
sonet.h
sonypi.h
sound.h
soundcard.h
stat.h
stddef.h
stm.h
string.h
suspend_ioctls.h
swab.h
switchtec_ioctl.h
sync_file.h
synclink.h
sysctl.h
sysinfo.h
target_core_user.h
taskstats.h
tcp.h
tcp_metrics.h
tdx-guest.h
tee.h
termios.h
thermal.h
thp7312.h
time.h
time_types.h
timerfd.h
times.h
timex.h
tiocl.h
tipc.h
tipc_config.h
tipc_netlink.h
tipc_sockets_diag.h
tls.h
toshiba.h
tps6594_pfsm.h
trace_mmap.h
tty.h
tty_flags.h
typelimits.h
types.h
ublk_cmd.h ublk: widen ublk_shmem_buf_reg.len to __u64 for 4GB buffer support 2026-04-09 19:08:35 -06:00
udf_fs_i.h
udmabuf.h
udp.h
uhid.h
uinput.h
uio.h
uleds.h
ultrasound.h
um_timetravel.h
un.h
unistd.h
unix_diag.h
usbdevice_fs.h
usbip.h
user_events.h
userfaultfd.h
userio.h
utime.h
utsname.h
uuid.h
uvcvideo.h
v4l2-common.h
v4l2-controls.h
v4l2-dv-timings.h
v4l2-mediabus.h
v4l2-subdev.h
vbox_err.h
vbox_vmmdev_types.h
vboxguest.h
vdpa.h
vduse.h
vesa.h
veth.h
vfio.h
vfio_ccw.h
vfio_zdev.h
vhost.h
vhost_types.h
videodev2.h
virtio_9p.h
virtio_balloon.h
virtio_blk.h
virtio_bt.h
virtio_config.h
virtio_console.h
virtio_crypto.h
virtio_fs.h
virtio_gpio.h
virtio_gpu.h
virtio_i2c.h
virtio_ids.h
virtio_input.h
virtio_iommu.h
virtio_mem.h
virtio_mmio.h
virtio_net.h
virtio_pci.h
virtio_pcidev.h
virtio_pmem.h
virtio_ring.h
virtio_rng.h
virtio_rtc.h
virtio_scmi.h
virtio_scsi.h
virtio_snd.h
virtio_spi.h
virtio_types.h
virtio_vsock.h
vm_sockets.h
vm_sockets_diag.h
vmclock-abi.h
vmcore.h
vsockmon.h
vt.h
vtpm_proxy.h
wait.h
watch_queue.h
watchdog.h
wireguard.h
wireless.h
wmi.h
wwan.h
x25.h
xattr.h
xdp_diag.h
xfrm.h
xilinx-v4l2-controls.h
zorro.h
zorro_ids.h