linux/lib
Jens Axboe e39435ce68 lib/percpu_counter.c: fix bad percpu counter state during suspend
I got a bug report yesterday from Laszlo Ersek in which he states that
his kvm instance fails to suspend.  Laszlo bisected it down to this
commit 1cf7e9c68f ("virtio_blk: blk-mq support") where virtio-blk is
converted to use the blk-mq infrastructure.

After digging a bit, it became clear that the issue was with the queue
drain.  blk-mq tracks queue usage in a percpu counter, which is
incremented on request alloc and decremented when the request is freed.
The initial hunt was for an inconsistency in blk-mq, but everything
seemed fine.  In fact, the counter only returned crazy values when
suspend was in progress.

When a CPU is unplugged, the percpu counters merges that CPU state with
the general state.  blk-mq takes care to register a hotcpu notifier with
the appropriate priority, so we know it runs after the percpu counter
notifier.  However, the percpu counter notifier only merges the state
when the CPU is fully gone.  This leaves a state transition where the
CPU going away is no longer in the online mask, yet it still holds
private values.  This means that in this state, percpu_counter_sum()
returns invalid results, and the suspend then hangs waiting for
abs(dead-cpu-value) requests to complete which of course will never
happen.

Fix this by clearing the state earlier, so we never have a case where
the CPU isn't in online mask but still holds private state.  This bug
has been there since forever, I guess we don't have a lot of users where
percpu counters needs to be reliable during the suspend cycle.

Signed-off-by: Jens Axboe <axboe@fb.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08 16:48:51 -07:00
..
fonts partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over built-in ROM fonts 2014-03-23 16:44:42 +01:00
lz4 lz4: fix compression/decompression signedness mismatch 2013-09-11 15:59:45 -07:00
lzo
mpi MPILIB: add module description and license 2013-09-25 17:17:01 +01:00
raid6 md update for v3.12 2013-09-10 13:03:41 -07:00
reed_solomon
xz
zlib_deflate
zlib_inflate
.gitignore
Kconfig Kconfig: rename HAS_IOPORT to HAS_IOPORT_MAP 2014-04-07 16:36:11 -07:00
Kconfig.debug locktorture: Add a lock-torture kernel module 2014-02-23 09:04:29 -08:00
Kconfig.kgdb
Kconfig.kmemcheck
Makefile * Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). 2014-02-07 11:27:30 -08:00
argv_split.c
asn1_decoder.c
assoc_array.c assoc_array: remove global variable 2014-01-23 09:25:10 -08:00
atomic64.c
atomic64_test.c
audit.c
average.c lib: Ensure EWMA does not store wrong intermediate values 2014-01-16 23:46:06 -08:00
bcd.c
bch.c
bitmap.c
bitrev.c
bsearch.c
btree.c
bug.c
build_OID_registry
bust_spinlocks.c
check_signature.c
checksum.c
clz_ctz.c lib/clz_ctz.c: add prototype declarations in lib/clz_ctz.c 2014-04-03 16:21:12 -07:00
clz_tab.c
cmdline.c lib/cmdline.c: declare exported symbols immediately 2014-01-23 16:36:57 -08:00
cordic.c
cpu-notifier-error-inject.c
cpu_rmap.c Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
cpumask.c lib/cpumask.c: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c crypto: crct10dif - Add fallback for broken initrds 2013-09-12 15:31:34 +10:00
crc7.c
crc8.c
crc16.c
crc32.c lib: crc32: reduce number of cases for crc32{, c}_combine 2013-11-04 15:27:08 -05:00
crc32defs.h
ctype.c
debug_locks.c
debugobjects.c lib/debugobjects.c: remove unnecessary work pending test 2013-11-13 12:09:22 +09:00
dec_and_lock.c
decompress.c initramfs: debug detected compression method 2014-04-07 16:36:11 -07:00
decompress_bunzip2.c
decompress_inflate.c lib/decompress_inflate.c: include appropriate header file 2014-04-03 16:21:12 -07:00
decompress_unlz4.c lib/decompress_unlz4.c: always set an error return code on failures 2014-01-23 16:37:04 -08:00
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
devres.c Kconfig: rename HAS_IOPORT to HAS_IOPORT_MAP 2014-04-07 16:36:11 -07:00
digsig.c lib/digsig.c: use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) 2013-11-13 12:09:22 +09:00
div64.c
dma-debug.c dma debug: account for cachelines and read-only mappings in overlap tracking 2014-03-04 07:55:47 -08:00
dump_stack.c
dynamic_debug.c dynamic_debug: replace obselete simple_strtoul() with kstrtouint() 2014-01-27 21:02:39 -08:00
dynamic_queue_limits.c
earlycpio.c
extable.c
fault-inject.c
fdt.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
find_last_bit.c
find_next_bit.c
flex_array.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
flex_proportions.c
gcd.c
gen_crc32table.c
genalloc.c lib/genalloc.c: add check gen_pool_dma_alloc() if dma pointer is not NULL 2014-01-29 16:22:39 -08:00
halfmd4.c
hash.c lib: hash: follow-up fixups for arch hash 2013-12-19 00:14:53 -05:00
hexdump.c lib: introduce upper case hex ascii helpers 2013-09-20 15:38:26 -04:00
hweight.c
idr.c lib/idr.c: use RCU_INIT_POINTER(x, NULL) 2014-04-07 16:36:07 -07:00
inflate.c
int_sqrt.c
interval_tree.c
interval_tree_test_main.c
iomap.c Kconfig: rename HAS_IOPORT to HAS_IOPORT_MAP 2014-04-07 16:36:11 -07:00
iomap_copy.c
iommu-helper.c
ioremap.c
iovec.c
irq_regs.c
is_single_threaded.c
jedec_ddr_data.c
kasprintf.c
kfifo.c kfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation 2013-11-15 09:32:23 +09:00
klist.c
kobject.c sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() 2014-02-07 16:08:57 -08:00
kobject_uevent.c kobject: don't block for each kobject_uevent 2014-04-03 16:21:04 -07:00
kstrtox.c lib/kstrtox.c: remove redundant cleanup 2014-01-23 16:36:57 -08:00
kstrtox.h
lcm.c
libcrc32c.c
list_debug.c
list_sort.c
llist.c llists-move-llist_reverse_order-from-raid5-to-llistc-fix 2013-11-15 09:32:22 +09:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c sched: Introduce preempt_count accessor functions 2013-09-25 14:07:32 +02:00
lockref.c lockref: include mutex.h rather than reinvent arch_mutex_cpu_relax 2013-11-27 20:37:33 -08:00
lru_cache.c
md5.c
memory-notifier-error-inject.c
memweight.c
net_utils.c
nlattr.c netlink: don't compare the nul-termination in nla_strcmp 2014-04-01 15:25:02 -04:00
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c
parser.c lib/parser.c: put EXPORT_SYMBOLs in the conventional place 2014-01-23 16:36:55 -08:00
pci_iomap.c
percpu-refcount.c percpu-refcount: Add a WARN() for ref going negative 2014-01-21 04:40:56 -05:00
percpu_counter.c lib/percpu_counter.c: fix bad percpu counter state during suspend 2014-04-08 16:48:51 -07:00
percpu_ida.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2014-02-14 10:45:18 -08:00
percpu_test.c percpu: add test module for various percpu operations 2013-11-13 12:09:11 +09:00
plist.c
pm-notifier-error-inject.c
prio_heap.c
proportions.c
radix-tree.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
random32.c lib/random32.c: minor cleanups and kdoc fix 2014-04-03 16:21:11 -07:00
ratelimit.c
rational.c
rbtree.c rbtree: add postorder iteration functions 2013-09-11 15:59:19 -07:00
rbtree_test.c rbtree/test: test rbtree_postorder_for_each_entry_safe() 2014-01-23 16:37:03 -08:00
reciprocal_div.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
scatterlist.c lib/scatterlist: export sg_miter_skip() 2013-12-08 17:56:37 -08:00
sha1.c
show_mem.c lib/show_mem.c: show num_poisoned_pages when oom 2014-01-21 16:19:48 -08:00
smp_processor_id.c percpu: add preemption checks to __this_cpu ops 2014-04-07 16:36:14 -07:00
sort.c
stmp_device.c
string.c asmlinkage Make __stack_chk_failed and memcmp visible 2014-02-13 18:13:43 -08:00
string_helpers.c
strncpy_from_user.c
strnlen_user.c
swiotlb.c memblock, nobootmem: add memblock_virt_alloc_low() 2014-01-27 21:02:38 -08:00
syscall.c lib/syscall.c: unexport task_current_syscall() 2014-04-03 16:21:06 -07:00
test-kstrtox.c
test-string_helpers.c
test_module.c test: add minimal module for verification testing 2014-01-23 16:36:57 -08:00
test_user_copy.c test: check copy_to/from_user boundary validation 2014-01-23 16:36:57 -08:00
textsearch.c
timerqueue.c
ts_bm.c
ts_fsm.c
ts_kmp.c
ucs2_string.c
usercopy.c
uuid.c
vsprintf.c vsprintf: remove %n handling 2014-04-03 16:21:07 -07:00