mirror of https://github.com/torvalds/linux.git
When unpinning a BPF hash table (htab or htab_lru) that contains internal
structures (timer, workqueue, or task_work) in its values, a BUG warning
is triggered:
BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
...
The issue arises from the interaction between BPF object unpinning and
RCU callback mechanisms:
1. BPF object unpinning uses ->free_inode() which schedules cleanup via
   call_rcu(), deferring the actual freeing to an RCU callback that
   executes within the RCU_SOFTIRQ context.
2. During cleanup of hash tables containing internal structures,
   htab_map_free_internal_structs() is invoked, which includes
   cond_resched() or cond_resched_rcu() calls to yield the CPU during
   potentially long operations.
However, neither cond_resched() nor cond_resched_rcu() can be safely called
from atomic RCU softirq context, so the attempt to reschedule triggers the
BUG warning.
Fix this by switching from ->free_inode() to ->destroy_inode() and renaming
bpf_free_inode() to bpf_destroy_inode() for BPF objects (prog, map, link).
This allows direct inode freeing without RCU callback scheduling,
avoiding the invalid context warning.
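The difference between the two callback paths can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code: every model_* name is hypothetical, atomic context is reduced to a
single flag, and the ->destroy_inode() path is assumed to run in a context
where sleeping is allowed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "we are in atomic (RCU softirq) context". */
static bool model_in_atomic;

/* Number of sleeping calls made while model_in_atomic was set. */
static int model_bug_count;

/* Models cond_resched(): only legal when the caller may sleep. */
static void model_cond_resched(void)
{
	if (model_in_atomic)
		model_bug_count++;	/* kernel would print the BUG warning */
}

/* Models htab_map_free_internal_structs(): yields once per bucket. */
static void model_free_internal_structs(int nbuckets)
{
	for (int i = 0; i < nbuckets; i++)
		model_cond_resched();
}

typedef void (*model_inode_cb)(int nbuckets);

/*
 * Models ->free_inode(): the callback is deferred via call_rcu() and
 * therefore runs with the atomic flag set.
 */
static void model_evict_via_free_inode(model_inode_cb cb, int nbuckets)
{
	model_in_atomic = true;
	cb(nbuckets);
	model_in_atomic = false;
}

/*
 * Models ->destroy_inode(): the callback runs directly in the caller's
 * context (assumed here to be sleepable).
 */
static void model_evict_via_destroy_inode(model_inode_cb cb, int nbuckets)
{
	cb(nbuckets);
}

/* Runs one eviction and returns how many invalid-context calls occurred. */
int model_run(bool use_destroy_inode, int nbuckets)
{
	assert(nbuckets >= 0);
	model_bug_count = 0;
	if (use_destroy_inode)
		model_evict_via_destroy_inode(model_free_internal_structs, nbuckets);
	else
		model_evict_via_free_inode(model_free_internal_structs, nbuckets);
	return model_bug_count;
}
```

In this model, model_run(false, n) counts one violation per bucket, while
model_run(true, n) counts none, which mirrors why moving the cleanup out of
the RCU callback avoids the warning.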
Reported-by: Le Chen <tom2cat@sjtu.edu.cn>
Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
Fixes:
| Name | | |
|---|---|---|
| preload | ||
| Kconfig | ||
| Makefile | ||
| arena.c | ||
| arraymap.c | ||
| bloom_filter.c | ||
| bpf_cgrp_storage.c | ||
| bpf_inode_storage.c | ||
| bpf_iter.c | ||
| bpf_local_storage.c | ||
| bpf_lru_list.c | ||
| bpf_lru_list.h | ||
| bpf_lsm.c | ||
| bpf_struct_ops.c | ||
| bpf_task_storage.c | ||
| btf.c | ||
| btf_iter.c | ||
| btf_relocate.c | ||
| cgroup.c | ||
| cgroup_iter.c | ||
| core.c | ||
| cpumap.c | ||
| cpumask.c | ||
| crypto.c | ||
| devmap.c | ||
| disasm.c | ||
| disasm.h | ||
| dispatcher.c | ||
| dmabuf_iter.c | ||
| hashtab.c | ||
| helpers.c | ||
| inode.c | ||
| kmem_cache_iter.c | ||
| link_iter.c | ||
| liveness.c | ||
| local_storage.c | ||
| log.c | ||
| lpm_trie.c | ||
| map_in_map.c | ||
| map_in_map.h | ||
| map_iter.c | ||
| memalloc.c | ||
| mmap_unlock_work.h | ||
| mprog.c | ||
| net_namespace.c | ||
| offload.c | ||
| percpu_freelist.c | ||
| percpu_freelist.h | ||
| prog_iter.c | ||
| queue_stack_maps.c | ||
| range_tree.c | ||
| range_tree.h | ||
| relo_core.c | ||
| reuseport_array.c | ||
| ringbuf.c | ||
| rqspinlock.c | ||
| rqspinlock.h | ||
| stackmap.c | ||
| stream.c | ||
| syscall.c | ||
| sysfs_btf.c | ||
| task_iter.c | ||
| tcx.c | ||
| tnum.c | ||
| token.c | ||
| trampoline.c | ||
| verifier.c | ||