linux/kernel/sched
John Stultz ddae0ca2a8 sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath
It was reported that in moving to 6.1, a larger then 10%
regression was seen in the performance of
clock_gettime(CLOCK_THREAD_CPUTIME_ID,...).

Using a simple reproducer, I found:
5.10:
100000000 calls in 24345994193 ns => 243.460 ns per call
100000000 calls in 24288172050 ns => 242.882 ns per call
100000000 calls in 24289135225 ns => 242.891 ns per call

6.1:
100000000 calls in 28248646742 ns => 282.486 ns per call
100000000 calls in 28227055067 ns => 282.271 ns per call
100000000 calls in 28177471287 ns => 281.775 ns per call

The cause of this was finally narrowed down to the addition of
psi_account_irqtime() in update_rq_clock_task(), in commit
52b1364ba0 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ
pressure").

In my initial attempt to resolve this, I leaned towards moving
all accounting work out of the clock_gettime() call path, but it
wasn't very pretty, so it will have to wait for a later deeper
rework. Instead, Peter shared this approach:

Rework psi_account_irqtime() to use its own psi_irq_time base
for accounting, and move it out of the hotpath, calling it
instead from sched_tick() and __schedule().

In testing this, we found the importance of ensuring
psi_account_irqtime() is run under the rq_lock, which Johannes
Weiner helpfully explained, so also add some lockdep annotations
to make that requirement clear.

With this change the performance is back in-line with 5.10:
6.1+fix:
100000000 calls in 24297324597 ns => 242.973 ns per call
100000000 calls in 24318869234 ns => 243.189 ns per call
100000000 calls in 24291564588 ns => 242.916 ns per call

Reported-by: Jimmy Shiu <jimmyshiu@google.com>
Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Reviewed-by: Qais Yousef <qyousef@layalina.io>
Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com
2024-07-01 13:01:44 +02:00
..
Makefile
autogroup.c scheduler: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:54 +02:00
autogroup.h
build_policy.c
build_utility.c
clock.c
completion.c
core.c sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
core_sched.c
cpuacct.c
cpudeadline.c
cpudeadline.h
cpufreq.c
cpufreq_schedutil.c
cpupri.c
cpupri.h
cputime.c sched/vtime: Get rid of generic vtime_task_switch() implementation 2024-04-17 13:37:20 +02:00
deadline.c sched/deadline: Fix task_struct reference leak 2024-07-01 13:01:44 +02:00
debug.c sched/debug: Dump domains' level 2024-05-17 09:48:25 +02:00
fair.c Revert "sched/fair: Make sure to try to detach at least one movable task" 2024-07-01 13:01:43 +02:00
features.h
idle.c
isolation.c sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU 2024-04-28 10:08:21 +02:00
loadavg.c
membarrier.c
pelt.c sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() 2024-04-24 12:08:01 +02:00
pelt.h sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() 2024-04-24 12:08:01 +02:00
psi.c sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
rt.c scheduler: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:54 +02:00
sched-pelt.h
sched.h sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
smp.h
stats.c
stats.h sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
stop_task.c
swait.c
topology.c bitmap patches for 6.10 2024-05-21 15:29:01 -07:00
wait.c
wait_bit.c