linux/kernel
Thomas Gleixner f2530dc71c kthread: Prevent unpark race which puts threads on the wrong cpu
The smpboot threads rely on the park/unpark mechanism which binds per
cpu threads on a particular core. Though the functionality is racy:

CPU0	       	 	CPU1  	     	    CPU2
unpark(T)				    wake_up_process(T)
  clear(SHOULD_PARK)	T runs
			leave parkme() due to !SHOULD_PARK  
  bind_to(CPU2)		BUG_ON(wrong CPU)						    

We cannot let the tasks move themself to the target CPU as one of
those tasks is actually the migration thread itself, which requires
that it starts running on the target cpu right away.

The solution to this problem is to prevent wakeups in park mode which
are not from unpark(). That way we can guarantee that the association
of the task to the target cpu is working correctly.

Add a new task state (TASK_PARKED) which prevents other wakeups and
use this state explicitly for the unpark wakeup.

Peter noticed: Also, since the task state is visible to userspace and
all the parked tasks are still in the PID space, its a good hint in ps
and friends that these tasks aren't really there for the moment.

The migration thread has another related issue.

CPU0	      	     	 CPU1
Bring up CPU2
create_thread(T)
park(T)
 wait_for_completion()
			 parkme()
			 complete()
sched_set_stop_task()
			 schedule(TASK_PARKED)

The sched_set_stop_task() call is issued while the task is on the
runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronizaion before
sched_set_stop_task().

Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Peter Ziljstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: dhillf@gmail.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-12 14:18:43 +02:00
..
debug KGDB/KDB fixes and cleanups 2013-03-02 08:31:39 -08:00
events perf: Generate EXIT event only once per task context 2013-03-18 09:47:33 +01:00
gcov
irq Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
power
sched Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block 2013-02-28 12:52:24 -08:00
time clockevents: Don't allow dummy broadcast timers 2013-03-07 17:16:11 +01:00
trace tracing: Fix double free when function profile init failed 2013-04-09 18:54:04 -04:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
Makefile Merge branch 'akpm' (final batch from Andrew) 2013-02-27 20:58:09 -08:00
acct.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
async.c
audit.c
audit.h
audit_tree.c
audit_watch.c
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
cgroup_freezer.c
compat.c
configs.c
context_tracking.c
cpu.c
cpu_pm.c
cpuset.c
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Revert "lockdep: check that no locks held at freeze time" 2013-03-31 11:38:33 -07:00
extable.c
fork.c userns: Don't allow CLONE_NEWUSER | CLONE_FS 2013-03-13 15:00:20 -07:00
freezer.c
futex.c futex: fix kernel-doc notation and spello 2013-03-12 20:42:10 -07:00
futex_compat.c
groups.c
hrtimer.c
hung_task.c
irq_work.c
itimer.c
jump_label.c
kallsyms.c
kcmp.c
kexec.c kexec: avoid freeing NULL pointer in image_crash_alloc() 2013-02-27 19:10:12 -08:00
kmod.c
kprobes.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ksysfs.c
kthread.c kthread: Prevent unpark race which puts threads on the wrong cpu 2013-04-12 14:18:43 +02:00
latencytop.c
lglock.c
lockdep.c Revert "lockdep: check that no locks held at freeze time" 2013-03-31 11:38:33 -07:00
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
modsign_certificate.S
modsign_pubkey.c
module-internal.h
module.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
module_signing.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-02-26 20:16:07 -08:00
padata.c
panic.c
params.c
pid.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
pid_namespace.c pid: Handle the exit of a multi-threaded init. 2013-03-26 03:41:23 -07:00
posix-cpu-timers.c
posix-timers.c posix-timers: convert to idr_alloc() 2013-02-27 19:10:19 -08:00
printk.c printk: Provide a wake_up_klogd() off-case 2013-03-22 16:41:20 -07:00
profile.c
ptrace.c
range.c
rcu.h
rcupdate.c
rcutiny.c
rcutiny_plugin.h
rcutorture.c
rcutree.c
rcutree.h
rcutree_plugin.h
rcutree_trace.c
relay.c
res_counter.c
resource.c
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
seccomp.c
semaphore.c
signal.c kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER 2013-03-13 15:21:45 -07:00
smp.c
smpboot.c kthread: Prevent unpark race which puts threads on the wrong cpu 2013-04-12 14:18:43 +02:00
smpboot.h
softirq.c Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-03-05 18:10:04 -08:00
spinlock.c
srcu.c
stacktrace.c
stop_machine.c stop_machine: Mark per cpu stopper enabled early 2013-02-26 22:25:17 +01:00
sys.c PM / reboot: call syscore_shutdown() after disable_nonboot_cpus() 2013-04-08 22:10:40 +02:00
sys_ni.c
sysctl.c Initial ARC Linux port with some fixes on top for 3.9-rc1 2013-03-02 07:58:56 -08:00
sysctl_binary.c sysctl: fix null checking in bin_dn_node_address() 2013-02-27 19:10:21 -08:00
task_work.c
taskstats.c
test_kprobes.c
time.c
timeconst.bc
timer.c
tracepoint.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
tsacct.c
uid16.c
up.c
user-return-notifier.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
user.c userns: Restrict when proc and sysfs can be mounted 2013-03-27 07:50:08 -07:00
user_namespace.c userns: Restrict when proc and sysfs can be mounted 2013-03-27 07:50:08 -07:00
utsname.c kernel/utsname.c: fix wrong comment about clone_uts_ns() 2013-02-27 19:10:22 -08:00
utsname_sysctl.c kernel/utsname_sysctl.c: put get/get_uts() into CONFIG_PROC_SYSCTL code block 2013-02-27 19:10:22 -08:00
wait.c
watchdog.c
workqueue.c Merge branch 'for-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2013-03-18 18:47:07 -07:00
workqueue_internal.h