mirror of https://github.com/torvalds/linux.git
Add sched_ext_ops operations to init/exit cgroups, and to track task
migrations and config changes. A BPF scheduler may implement none of the
cgroup features or only a subset of them. The implemented features can be
indicated using %SCX_OPS_HAS_CGROUP_* flags. If the cgroup configuration makes
use of features that are not implemented, a warning is triggered.
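The flag check described above can be sketched in plain userspace C. This is only an illustration of the idea, not the kernel code: the `DEMO_*` feature bits, the function names, and the warning text are all hypothetical stand-ins for the %SCX_OPS_HAS_CGROUP_* flags and whatever the kernel actually prints.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical feature bits, standing in for the scheduler's cgroup flags. */
enum demo_cgroup_feature {
	DEMO_HAS_CGROUP_WEIGHT	= 1u << 0,
	DEMO_HAS_CGROUP_QUOTA	= 1u << 1,	/* invented second feature */
};

/* Features that the cgroup configuration uses but the scheduler lacks. */
static unsigned int demo_missing_features(unsigned int implemented,
					  unsigned int in_use)
{
	return in_use & ~implemented;
}

/*
 * Print a warning for each feature in use that is not implemented.
 * Returns true if at least one warning was printed.
 */
static bool demo_check_cgroup_config(unsigned int implemented,
				     unsigned int in_use)
{
	unsigned int missing = demo_missing_features(implemented, in_use);

	if (missing & DEMO_HAS_CGROUP_WEIGHT)
		fprintf(stderr, "warning: cgroup weight is configured but not implemented\n");
	if (missing & DEMO_HAS_CGROUP_QUOTA)
		fprintf(stderr, "warning: cgroup quota is configured but not implemented\n");

	return missing != 0;
}
```

A scheduler that sets only the weight bit would pass `DEMO_HAS_CGROUP_WEIGHT` as `implemented`; a configuration that also uses the invented quota feature would then produce one warning and leave everything else untouched.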
While a BPF scheduler is being enabled and disabled, relevant cgroup
operations are locked out using scx_cgroup_rwsem. This avoids situations
like task prep taking place while the task is being moved across cgroups,
making things easier for BPF schedulers.
v7: - cgroup interface file visibility toggling is dropped in favor of just
      warning messages. Dynamically changing interface visibility caused more
      confusion than help.
v6: - Updated to reflect the removal of SCX_KF_SLEEPABLE.
    - Updated to use CONFIG_GROUP_SCHED_WEIGHT and fixes for
      !CONFIG_FAIR_GROUP_SCHED && CONFIG_EXT_GROUP_SCHED.
v5: - Flipped the locking order between scx_cgroup_rwsem and
      cpus_read_lock() to avoid locking order conflict w/ cpuset. Better
      documentation around locking.
    - sched_move_task() takes an early exit if the source and destination
      are identical. This triggered the warning in scx_cgroup_can_attach()
      as it left p->scx.cgrp_moving_from uncleared. Updated the cgroup
      migration path so that ops.cgroup_prep_move() is skipped for identity
      migrations so that its invocations always match ops.cgroup_move()
      one-to-one.
v4: - Example schedulers moved into their own patches.
    - Fix build failure when !CONFIG_CGROUP_SCHED, reported by Andrea Righi.
v3: - Make scx_example_pair switch all tasks by default.
    - Convert to BPF inline iterators.
    - scx_bpf_task_cgroup() is added to determine the current cgroup from
      CPU controller's POV. This allows BPF schedulers to accurately track
      CPU cgroup membership.
    - scx_example_flatcg added. This demonstrates flattened hierarchy
      implementation of CPU cgroup control and shows significant performance
      improvement when cgroups which are nested multiple levels are under
      competition.
v2: - Build fixes for different CONFIG combinations.
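The v5 note on identity migrations can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: `struct demo_task`, `demo_can_attach()`, `demo_move_task()` and the call counters are hypothetical, and only mimic the relationship between the prep and move callbacks and the recorded source cgroup.

```c
#define DEMO_NONE (-1)

/* Hypothetical task model: current cgroup id and the recorded move source. */
struct demo_task {
	int cgrp;
	int moving_from;	/* DEMO_NONE when no move is pending */
};

/* Counters standing in for invocations of the prep and move callbacks. */
static int demo_prep_calls;
static int demo_move_calls;

/*
 * Attach stage: record the source and run the prep callback, except for
 * identity migrations, which are skipped so that nothing is left recorded.
 */
static void demo_can_attach(struct demo_task *p, int to)
{
	if (p->cgrp == to)
		return;

	p->moving_from = p->cgrp;
	demo_prep_calls++;
}

/*
 * Move stage: exits early when source and destination are identical,
 * otherwise runs the move callback and clears the recorded source.
 */
static void demo_move_task(struct demo_task *p, int to)
{
	if (p->cgrp == to)
		return;

	demo_move_calls++;
	p->cgrp = to;
	p->moving_from = DEMO_NONE;
}
```

Because both stages skip the identity case, every counted prep call is followed by exactly one counted move call, and `moving_from` is always back to `DEMO_NONE` once a migration finishes.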
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Vernet <dvernet@meta.com>
Acked-by: Josh Don <joshdon@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Barret Rhoden <brho@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrea Righi <andrea.righi@canonical.com>
| Name |
|---|
| .gitignore |
| Makefile |
| config |
| create_dsq.bpf.c |
| create_dsq.c |
| ddsp_bogus_dsq_fail.bpf.c |
| ddsp_bogus_dsq_fail.c |
| ddsp_vtimelocal_fail.bpf.c |
| ddsp_vtimelocal_fail.c |
| dsp_local_on.bpf.c |
| dsp_local_on.c |
| enq_last_no_enq_fails.bpf.c |
| enq_last_no_enq_fails.c |
| enq_select_cpu_fails.bpf.c |
| enq_select_cpu_fails.c |
| exit.bpf.c |
| exit.c |
| exit_test.h |
| hotplug.bpf.c |
| hotplug.c |
| hotplug_test.h |
| init_enable_count.bpf.c |
| init_enable_count.c |
| maximal.bpf.c |
| maximal.c |
| maybe_null.bpf.c |
| maybe_null.c |
| maybe_null_fail_dsp.bpf.c |
| maybe_null_fail_yld.bpf.c |
| minimal.bpf.c |
| minimal.c |
| prog_run.bpf.c |
| prog_run.c |
| reload_loop.c |
| runner.c |
| scx_test.h |
| select_cpu_dfl.bpf.c |
| select_cpu_dfl.c |
| select_cpu_dfl_nodispatch.bpf.c |
| select_cpu_dfl_nodispatch.c |
| select_cpu_dispatch.bpf.c |
| select_cpu_dispatch.c |
| select_cpu_dispatch_bad_dsq.bpf.c |
| select_cpu_dispatch_bad_dsq.c |
| select_cpu_dispatch_dbl_dsp.bpf.c |
| select_cpu_dispatch_dbl_dsp.c |
| select_cpu_vtime.bpf.c |
| select_cpu_vtime.c |
| test_example.c |
| util.c |
| util.h |