mirror of https://github.com/torvalds/linux.git
In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE first, and then vmap_lazy_nr is increased atomically.

But in __purge_vmap_area_lazy(), while traversing vmap_area_list, nr is counted by checking whether VM_LAZY_FREE is set in va->flags. After counting nr, the kernel reads vmap_lazy_nr atomically and checks a BUG_ON condition (whether nr is greater than vmap_lazy_nr) to prevent vmap_lazy_nr from going negative.

The problem is that, if the free path is interrupted right after marking VM_LAZY_FREE, the increment of vmap_lazy_nr can be delayed. Consequently, the BUG_ON condition can be met because nr is counted as more than vmap_lazy_nr. This is highly probable when vmalloc/vfree are called frequently. The scenario has been verified by adding a delay between marking VM_LAZY_FREE and increasing vmap_lazy_nr in free_unmap_area_noflush().

Although vmap_lazy_nr is used for checking the high watermark, it was never a strict watermark. And although the BUG_ON condition is meant to prevent vmap_lazy_nr from going negative, vmap_lazy_nr is a signed variable, so it can go negative temporarily. Consequently, removing the BUG_ON condition is the proper fix.

A possible BUG_ON message looks like this:

    kernel BUG at mm/vmalloc.c:517!
    invalid opcode: 0000 [#1] SMP
    EIP: 0060:[<c04824a4>] EFLAGS: 00010297 CPU: 3
    EIP is at __purge_vmap_area_lazy+0x144/0x150
    EAX: ee8a8818 EBX: c08e77d4 ECX: e7c7ae40 EDX: c08e77ec
    ESI: 000081fe EDI: e7c7ae60 EBP: e7c7ae64 ESP: e7c7ae3c
     DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    Call Trace:
     [<c0482ad9>] free_unmap_vmap_area_noflush+0x69/0x70
     [<c0482b02>] remove_vm_area+0x22/0x70
     [<c0482c15>] __vunmap+0x45/0xe0
     [<c04831ec>] vmalloc+0x2c/0x30
    Code: 8d 59 e0 eb 04 66 90 89 cb 89 d0 e8 87 fe ff ff 8b 43 20 89 da 8d 48 e0 8d 43 20 3b 04 24 75 e7 fe 05 a8 a5 a3 c0 e9 78 ff ff ff <0f> 0b eb fe 90 8d b4 26 00 00 00 00 56 89 c6 b8 ac a5 a3 c0 31
    EIP: [<c04824a4>] __purge_vmap_area_lazy+0x144/0x150 SS:ESP 0068:e7c7ae3c

[ See also http://marc.info/?l=linux-kernel&m=126335856228090&w=2 ]

Signed-off-by: Yongseok Koh <yongseok.koh@samsung.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

||
|---|---|---|
| .. | ||
| Kconfig | ||
| Kconfig.debug | ||
| Makefile | ||
| backing-dev.c | ||
| bootmem.c | ||
| bounce.c | ||
| debug-pagealloc.c | ||
| dmapool.c | ||
| fadvise.c | ||
| failslab.c | ||
| filemap.c | ||
| filemap_xip.c | ||
| fremap.c | ||
| highmem.c | ||
| hugetlb.c | ||
| hwpoison-inject.c | ||
| init-mm.c | ||
| internal.h | ||
| kmemcheck.c | ||
| kmemleak-test.c | ||
| kmemleak.c | ||
| ksm.c | ||
| maccess.c | ||
| madvise.c | ||
| memcontrol.c | ||
| memory-failure.c | ||
| memory.c | ||
| memory_hotplug.c | ||
| mempolicy.c | ||
| mempool.c | ||
| migrate.c | ||
| mincore.c | ||
| mlock.c | ||
| mm_init.c | ||
| mmap.c | ||
| mmu_context.c | ||
| mmu_notifier.c | ||
| mmzone.c | ||
| mprotect.c | ||
| mremap.c | ||
| msync.c | ||
| nommu.c | ||
| oom_kill.c | ||
| page-writeback.c | ||
| page_alloc.c | ||
| page_cgroup.c | ||
| page_io.c | ||
| page_isolation.c | ||
| pagewalk.c | ||
| percpu.c | ||
| prio_tree.c | ||
| quicklist.c | ||
| readahead.c | ||
| rmap.c | ||
| shmem.c | ||
| slab.c | ||
| slob.c | ||
| slub.c | ||
| sparse-vmemmap.c | ||
| sparse.c | ||
| swap.c | ||
| swap_state.c | ||
| swapfile.c | ||
| thrash.c | ||
| truncate.c | ||
| util.c | ||
| vmalloc.c | ||
| vmscan.c | ||
| vmstat.c | ||