mirror of https://github.com/torvalds/linux.git
This message describes another md RAID10 issue, found by testing the
2.6.24 md RAID10 code with the new SCSI fault injection framework.
Abstract:
When a SCSI error results in a disk being disabled during RAID10 recovery,
the resync thread of md RAID10 can stall.
In this case the RAID array is already broken, so it may not matter much. But
I think a stall is undesirable: once it occurs, even shutdown or reboot will
fail because the resource is busy.
The deadlock mechanism:
The r10bio_s structure has a "remaining" member that keeps track of the BIOs
yet to be handled during recovery. The "remaining" counter is incremented when
a BIO is built in sync_request() and decremented when a BIO finishes in
end_sync_write().
If building a BIO fails for some reason in sync_request(), "remaining"
should be decremented if it has already been incremented. I found a case
where this decrement is forgotten. This causes an md_do_sync() deadlock:
md_do_sync() waits for md_done_sync() to be called by end_sync_write(), but
end_sync_write() never calls md_done_sync() because the "remaining" counter
never reaches zero.
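The accounting described above can be sketched as a small userspace model. This is only an illustration, not the kernel code: struct r10bio_model and the model_* functions are hypothetical names, a plain int stands in for the kernel's counter, and the sync_done flag stands in for the call to md_done_sync().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for r10bio_s; only the counter is modeled. */
struct r10bio_model {
    int remaining;   /* BIOs built but not yet completed */
    bool sync_done;  /* stands in for md_done_sync() having been called */
};

/* Models building one BIO in sync_request(): the counter goes up. */
static void model_build_bio(struct r10bio_model *r)
{
    r->remaining++;
}

/* Models end_sync_write(): completion is signalled only at zero. */
static void model_end_sync_write(struct r10bio_model *r)
{
    assert(r->remaining > 0);
    if (--r->remaining == 0)
        r->sync_done = true;
}

/*
 * Builds `built` BIOs, of which `failed` are abandoned after the
 * increment and never completed. Returns whether completion was
 * signalled, i.e. whether a waiter would ever be woken.
 */
static bool model_recovery_completes(int built, int failed)
{
    struct r10bio_model r = { 0, false };
    int i;

    for (i = 0; i < built; i++)
        model_build_bio(&r);
    for (i = 0; i < built - failed; i++)
        model_end_sync_write(&r);
    return r.sync_done;
}
```

In this model, model_recovery_completes(2, 0) returns true, while model_recovery_completes(2, 1) returns false: one abandoned BIO leaves the counter at 1, so the completion that the waiter depends on is never signalled.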
For example, the problem can be reproduced in the following case:
Personalities : [raid10]
md0 : active raid10 sdf1[4] sde1[5](F) sdd1[2] sdc1[1] sdb1[6](F)
3919616 blocks 64K chunks 2 near-copies [4/2] [_UU_]
[>....................] recovery = 2.2% (45376/1959808) finish=0.7min speed=45376K/sec
In this case, sdf1 is recovering, and sdb1 and sde1 are disabled.
An additional error that detaches sdd will cause a deadlock.
md0 : active raid10 sdf1[4] sde1[5](F) sdd1[6](F) sdc1[1] sdb1[7](F)
3919616 blocks 64K chunks 2 near-copies [4/1] [_U__]
[=>...................] recovery = 5.0% (99520/1959808) finish=5.9min speed=5237K/sec
2739 ? S< 0:17 [md0_raid10]
28608 ? D< 0:00 [md0_resync]
28629 pts/1 Ss 0:00 bash
28830 pts/1 R+ 0:00 ps ax
31819 ? D< 0:00 [kjournald]
The resync thread appears to keep working, but it is actually deadlocked.
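The "D<" entries in the ps output above mark tasks in uninterruptible sleep. As a rough sketch of how that state can be checked programmatically, the helper below extracts the state character from one line in the /proc/<pid>/stat layout ("pid (comm) state ..."). The function names are hypothetical, and the helper only parses a string that the caller has already read.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns the state character from a /proc/<pid>/stat line, or '\0'
 * if the line is malformed. The last ')' is used as the anchor
 * because the command name itself may contain parentheses.
 */
static char proc_stat_state(const char *line)
{
    const char *close;

    assert(line != NULL);
    close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '\0';
    return close[2];
}

/* Non-zero if the task is in uninterruptible sleep ('D'). */
static int proc_stat_is_uninterruptible(const char *line)
{
    return proc_stat_state(line) == 'D';
}
```

A task that stays in state D across repeated samples, while the recovery percentage no longer advances, is consistent with the stall described here, though the state alone does not prove a deadlock.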
Patch:
With this patch, the "remaining" counter is decremented when needed.
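The general shape of such a fix can be sketched in userspace as follows. This is not the patch itself: struct sync_model and the fixed_* functions are hypothetical, and the build_ok flags are an assumed way to simulate which BIO builds fail. The point is only that the error path undoes the increment before giving up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the recovery bookkeeping. */
struct sync_model {
    int remaining;  /* BIOs built but not yet completed */
    bool done;      /* stands in for md_done_sync() having been called */
};

/* Models a BIO completing: signal completion when the count hits zero. */
static void fixed_complete_bio(struct sync_model *s)
{
    assert(s->remaining > 0);
    if (--s->remaining == 0)
        s->done = true;
}

/*
 * Models building one BIO. The counter is taken first; if the build
 * then fails, the error path gives it back so the count stays balanced.
 */
static bool fixed_build_bio(struct sync_model *s, bool build_ok)
{
    s->remaining++;
    if (!build_ok) {
        s->remaining--;  /* the decrement that must not be forgotten */
        return false;
    }
    return true;
}

/* Builds n BIOs, completes the ones that were built successfully. */
static bool fixed_recovery_completes(const bool *build_ok, int n)
{
    struct sync_model s = { 0, false };
    int submitted = 0;
    int i;

    for (i = 0; i < n; i++)
        if (fixed_build_bio(&s, build_ok[i]))
            submitted++;
    for (i = 0; i < submitted; i++)
        fixed_complete_bio(&s);
    return s.done;
}
```

With one successful and one failed build, the counter returns to zero after the single completion and the model signals done, so a waiter would be woken instead of stalling.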
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| Name | | |
|---|---|---|
| raid6test | ||
| .gitignore | ||
| Kconfig | ||
| Makefile | ||
| bitmap.c | ||
| dm-bio-list.h | ||
| dm-bio-record.h | ||
| dm-crypt.c | ||
| dm-delay.c | ||
| dm-emc.c | ||
| dm-exception-store.c | ||
| dm-hw-handler.c | ||
| dm-hw-handler.h | ||
| dm-io.c | ||
| dm-io.h | ||
| dm-ioctl.c | ||
| dm-linear.c | ||
| dm-log.c | ||
| dm-log.h | ||
| dm-mpath-hp-sw.c | ||
| dm-mpath-rdac.c | ||
| dm-mpath.c | ||
| dm-mpath.h | ||
| dm-path-selector.c | ||
| dm-path-selector.h | ||
| dm-raid1.c | ||
| dm-round-robin.c | ||
| dm-snap.c | ||
| dm-snap.h | ||
| dm-stripe.c | ||
| dm-table.c | ||
| dm-target.c | ||
| dm-uevent.c | ||
| dm-uevent.h | ||
| dm-zero.c | ||
| dm.c | ||
| dm.h | ||
| faulty.c | ||
| kcopyd.c | ||
| kcopyd.h | ||
| linear.c | ||
| md.c | ||
| mktables.c | ||
| multipath.c | ||
| raid0.c | ||
| raid1.c | ||
| raid5.c | ||
| raid6.h | ||
| raid6algos.c | ||
| raid6altivec.uc | ||
| raid6int.uc | ||
| raid6mmx.c | ||
| raid6recov.c | ||
| raid6sse1.c | ||
| raid6sse2.c | ||
| raid6x86.h | ||
| raid10.c | ||
| unroll.pl | ||