linux/kernel/bpf
Linus Torvalds 7cd122b552 Some filesystems use a kinda-sorta controlled dentry refcount leak to pin
dentries of created objects in dcache (and undo it when removing those).
 Reference is grabbed and not released, but it's not actually _stored_
 anywhere.  That works, but it's hard to follow and verify; among other
 things, we have no way to tell _which_ of the increments is intended
 to be an unpaired one.  Worse, on removal we need to decide whether
 the reference had already been dropped, which can be non-trivial if
 that removal is on umount and we need to figure out if this dentry is
 pinned due to e.g. unlink() not done.  Usually that is handled by using
 kill_litter_super() as ->kill_sb(), but there are open-coded special
 cases of the same (consider e.g. /proc/self).
 
 Things get simpler if we introduce a new dentry flag (DCACHE_PERSISTENT)
 marking those "leaked" dentries.  Having it set claims responsibility
 for +1 in refcount.
 
 The end result this series is aiming for:
 
 * get these unbalanced dget() and dput() replaced with new primitives that
   would, in addition to adjusting refcount, set and clear persistency flag.
 * instead of having kill_litter_super() mess with removing the remaining
   "leaked" references (e.g. for all tmpfs files that hadn't been removed
   prior to umount), have the regular shrink_dcache_for_umount() strip
   DCACHE_PERSISTENT of all dentries, dropping the corresponding
   reference if it had been set.  After that kill_litter_super() becomes
   an equivalent of kill_anon_super().
 
 Doing that in a single step is not feasible - it would affect too many places
 in too many filesystems.  It has to be split into a series.
 
 This work has really started early in 2024; quite a few preliminary pieces
 have already gone into mainline.  This chunk is finally getting to the
 meat of that stuff - infrastructure and most of the conversions to it.
 
 Some pieces are still sitting in the local branches, but the bulk of
 that stuff is here.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaTEq1wAKCRBZ7Krx/gZQ
 643uAQC1rRslhw5l7OjxEpIYbGG4M+QaadN4Nf5Sr2SuTRaPJQD/W4oj/u4C2eCw
 Dd3q071tqyvm/PXNgN2EEnIaxlFUlwc=
 =rKq+
 -----END PGP SIGNATURE-----

Merge tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull persistent dentry infrastructure and conversion from Al Viro:
 "Some filesystems use a kinda-sorta controlled dentry refcount leak to
  pin dentries of created objects in dcache (and undo it when removing
  those). A reference is grabbed and not released, but it's not actually
  _stored_ anywhere.

  That works, but it's hard to follow and verify; among other things, we
  have no way to tell _which_ of the increments is intended to be an
  unpaired one. Worse, on removal we need to decide whether the
  reference had already been dropped, which can be non-trivial if that
  removal is on umount and we need to figure out if this dentry is
  pinned due to e.g. unlink() not done. Usually that is handled by using
  kill_litter_super() as ->kill_sb(), but there are open-coded special
  cases of the same (consider e.g. /proc/self).

  Things get simpler if we introduce a new dentry flag
  (DCACHE_PERSISTENT) marking those "leaked" dentries. Having it set
  claims responsibility for +1 in refcount.

  The end result this series is aiming for:

   - get these unbalanced dget() and dput() replaced with new primitives
     that would, in addition to adjusting refcount, set and clear
     persistency flag.

   - instead of having kill_litter_super() mess with removing the
     remaining "leaked" references (e.g. for all tmpfs files that hadn't
     been removed prior to umount), have the regular
     shrink_dcache_for_umount() strip DCACHE_PERSISTENT of all dentries,
     dropping the corresponding reference if it had been set. After that
     kill_litter_super() becomes an equivalent of kill_anon_super().

  Doing that in a single step is not feasible - it would affect too many
  places in too many filesystems. It has to be split into a series.

  This work has really started early in 2024; quite a few preliminary
  pieces have already gone into mainline. This chunk is finally getting
  to the meat of that stuff - infrastructure and most of the conversions
  to it.

  Some pieces are still sitting in the local branches, but the bulk of
  that stuff is here"

* tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
  d_make_discardable(): warn if given a non-persistent dentry
  kill securityfs_recursive_remove()
  convert securityfs
  get rid of kill_litter_super()
  convert rust_binderfs
  convert nfsctl
  convert rpc_pipefs
  convert hypfs
  hypfs: swich hypfs_create_u64() to returning int
  hypfs: switch hypfs_create_str() to returning int
  hypfs: don't pin dentries twice
  convert gadgetfs
  gadgetfs: switch to simple_remove_by_name()
  convert functionfs
  functionfs: switch to simple_remove_by_name()
  functionfs: fix the open/removal races
  functionfs: need to cancel ->reset_work in ->kill_sb()
  functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}()
  functionfs: don't abuse ffs_data_closed() on fs shutdown
  convert selinuxfs
  ...
2025-12-05 14:36:21 -08:00
..
preload umd: Remove usermode driver framework 2025-07-26 21:03:04 +02:00
Kconfig bpf: Update the bpf_prog_calc_tag to use SHA256 2025-09-18 19:10:20 -07:00
Makefile bpf, x86: add new map type: instructions array 2025-11-05 17:31:25 -08:00
arena.c mm: consistently use current->mm in mm_get_unmapped_area() 2025-11-16 17:27:57 -08:00
arraymap.c bpf: generalize and export map_get_next_key for arrays 2025-10-21 11:17:25 -07:00
bloom_filter.c bpf: Check bloom filter map value size 2024-03-27 09:56:17 -07:00
bpf_cgrp_storage.c bpf: use rcu_read_lock_dont_migrate() for bpf_cgrp_storage_free() 2025-08-25 18:52:16 -07:00
bpf_inode_storage.c bpf: use rcu_read_lock_dont_migrate() for bpf_inode_storage_free() 2025-08-25 18:52:16 -07:00
bpf_insn_array.c bpf: force BPF_F_RDONLY_PROG on insn array creation 2025-11-28 15:15:43 -08:00
bpf_iter.c bpf: convert bpf_iter_new_fd() to FD_PREPARE() 2025-11-28 12:42:33 +01:00
bpf_local_storage.c bpf: Replace bpf memory allocator with kmalloc_nolock() in local storage 2025-11-18 16:20:25 -08:00
bpf_lru_list.c bpf: Replace get_next_cpu() with cpumask_next_wrap() 2025-08-18 15:11:02 +02:00
bpf_lru_list.h bpf: Adjust free target to avoid global starvation of LRU map 2025-06-18 18:50:14 -07:00
bpf_lsm.c bpf: Disable file_alloc_security hook 2025-11-28 15:18:28 -08:00
bpf_struct_ops.c bpf: Export necessary symbols for modules with struct_ops 2025-11-10 11:07:34 -08:00
bpf_task_storage.c bpf: use rcu_read_lock_dont_migrate() for bpf_task_storage_free() 2025-08-25 18:52:16 -07:00
btf.c bpf: Allow union argument in trampoline based programs 2025-09-23 12:07:46 -07:00
btf_iter.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
btf_relocate.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
cgroup.c bpf: Convert bpf_sock_addr_kern "uaddr" to sockaddr_unsized 2025-11-04 19:10:33 -08:00
cgroup_iter.c bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg 2023-11-07 15:24:25 -08:00
core.c bpf: specify the old and new poke_type for bpf_arch_text_poke 2025-11-24 09:47:03 -08:00
cpumap.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5 2025-09-11 09:34:37 -07:00
cpumask.c bpf: fix missing kdoc string fields in cpumask.c 2025-03-15 11:48:57 -07:00
crypto.c bpf: Fix out-of-bounds dynptr write in bpf_crypto_crypt 2025-09-09 15:07:57 -07:00
devmap.c bpf: Remove redundant __GFP_NOWARN 2025-08-12 14:56:04 -07:00
disasm.c bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X 2025-11-05 17:53:23 -08:00
disasm.h
dispatcher.c bpf: Add kernel symbol for struct_ops trampoline 2024-11-12 17:13:46 -08:00
dmabuf_iter.c bpf: Add open coded dmabuf iterator 2025-05-27 09:51:25 -07:00
hashtab.c bpf: Free special fields when update [lru_,]percpu_hash maps 2025-11-13 09:14:15 -08:00
helpers.c Networking changes for 6.19. 2025-12-03 17:24:33 -08:00
inode.c convert bpf 2025-11-16 01:35:03 -05:00
kmem_cache_iter.c bpf: Add open coded version of kmem_cache iterator 2024-11-01 11:08:32 -07:00
link_iter.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
liveness.c bpf: correct stack liveness for tail calls 2025-11-21 17:45:30 -08:00
local_storage.c bpf: Remove redundant __GFP_NOWARN 2025-08-12 14:56:04 -07:00
log.c bpf, x86: add support for indirect jumps 2025-11-05 17:53:23 -08:00
lpm_trie.c bpf: Convert lpm_trie.c to rqspinlock 2025-03-19 08:03:05 -07:00
map_in_map.c bpf: switch maps to CLASS(fd, ...) 2024-08-13 15:58:17 -07:00
map_in_map.h bpf: Add map and need_defer parameters to .map_fd_put_ptr() 2023-12-04 17:50:26 -08:00
map_iter.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
memalloc.c bpf: replace use of system_unbound_wq with system_dfl_wq 2025-09-08 10:04:37 -07:00
mmap_unlock_work.h
mprog.c bpf: Handle bpf_mprog_query with NULL entry 2023-10-06 17:11:20 -07:00
net_namespace.c bpf: Remove attach_type in bpf_netns_link 2025-07-11 11:01:04 -07:00
offload.c net: move misc netdev_lock flavors to a separate header 2025-03-08 09:06:50 -08:00
percpu_freelist.c bpf: Convert percpu_freelist.c to rqspinlock 2025-03-19 08:03:05 -07:00
percpu_freelist.h bpf: Convert percpu_freelist.c to rqspinlock 2025-03-19 08:03:05 -07:00
prog_iter.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
queue_stack_maps.c bpf: Convert queue_stack map to rqspinlock 2025-04-10 12:51:10 -07:00
range_tree.c bpf: Use kmalloc_nolock() in range tree 2025-11-06 15:55:19 -08:00
range_tree.h bpf: Introduce range_tree data structure and use it in bpf arena 2024-11-13 13:52:45 -08:00
relo_core.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
reuseport_array.c bpf: Use sockfd_put() helper 2024-08-30 08:57:47 -07:00
ringbuf.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.18-rc4 2025-11-03 14:59:55 -08:00
rqspinlock.c rqspinlock: Precede non-head waiter queueing with AA check 2025-11-29 09:35:36 -08:00
rqspinlock.h rqspinlock: Protect waiters in queue from stalls 2025-03-19 08:03:05 -07:00
stackmap.c bpf-next-6.19 2025-12-03 16:54:54 -08:00
stream.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.18-rc5+ 2025-11-14 17:43:41 -08:00
syscall.c Significant patch series in this merge are as follows: 2025-12-05 13:52:43 -08:00
sysfs_btf.c Driver core changes for 6.17-rc1 2025-07-29 12:15:39 -07:00
task_iter.c vfs-6.13.file 2024-11-18 10:30:29 -08:00
tcx.c bpf: Remove location field in tcx_link 2025-07-11 11:00:57 -07:00
tnum.c bpf: Improve the general precision of tnum_mul 2025-08-27 15:00:26 -07:00
token.c bpf: convert bpf_token_create() to FD_PREPARE() 2025-11-28 12:42:33 +01:00
trampoline.c bpf: implement "jmp" mode for trampoline 2025-11-24 09:47:04 -08:00
verifier.c bpf: check for insn arrays in check_ptr_alignment 2025-11-28 15:15:43 -08:00