Commit Graph

176 Commits

Author SHA1 Message Date
K Prateek Nayak d87e4026d1 cpufreq/amd-pstate: Enable ITMT support after initializing core rankings
When working on dynamic ITMT priority support, it was observed that
"asym_prefer_cpu" on AMD systems supporting Preferred Core ranking
was always set to the first CPU in the sched group when the system boots
up despite another CPU in the group having a higher ITMT ranking.

"asym_prefer_cpu" is cached when the sched domain hierarchy is
constructed. On AMD systems that support Preferred Core rankings, sched
domains are rebuilt when ITMT support is enabled for the first time from
amd_pstate*_cpu_init().

Since amd_pstate*_cpu_init() is called to initialize the cpudata for
each CPU, the ITMT support is enabled after the first CPU initializes
its asym priority but this is too early since other CPUs have not yet
initialized their asym priorities and the sched domain is rebuilt only
once when the support is toggled on for the first time.

Initialize the asym priorities first in amd_pstate*_cpu_init() and then
enable ITMT support later in amd_pstate_register_driver() to ensure all
CPUs have correctly initialized their asym priorities before sched
domain hierarchy is rebuilt.

Clear the ITMT support when the amd-pstate driver unregisters since core
rankings cannot be trusted unless the update_limits() callback is
operational.

Remove the delayed work mechanism now that ITMT support is only toggled
from the driver init path which is outside the cpuhp critical section.

Fixes: f3a0523918 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250411081439.27652-1-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-04-11 12:49:57 -05:00
Dhananjay Ugwekar 56a49e19e1 cpufreq/amd-pstate: Fix min_limit perf and freq updation for performance governor
The min_limit perf and freq values can get disconnected with performance
governor, as we only modify the perf value in the special case. Fix that
by modifying the perf and freq values together

Fixes: 009d1c29a4 ("cpufreq/amd-pstate: Move perf values into a union")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250407081925.850473-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-04-07 08:56:21 -05:00
Rafael J. Wysocki 7a6589f1aa ARM cpufreq updates for 6.15
- manage sysfs attributes and boost frequencies efficiently from cpufreq
   core to reduce boilerplate code from drivers (Viresh Kumar).
 
 - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
   Dhananjay Ugwekar, Imran Shaik, and zuoqian).
 
 - Migrate to using for_each_present_cpu (Jacky Bai).
 
 - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).
 
 - Use str_enable_disable() helper (Lifeng Zheng).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAmfXwv8ACgkQ0rkcPK6B
 Ehyx9xAAvPbByizNlClE4Hp8Dg5sv15fpN1klvQuhVbMLQsFeTHryE2+pw/kWZYz
 mmDQoFjUVOBjr8f4ixznlGaUjU2Of6pSS5JZi4G8hRP5aGvDXEielSK/P2AFLRvb
 e33J5/Efb4DEilQVbS1oQfg1kpVZ53bVhtz8CGY/Yk1Dfh/IoUzlM9CCMKUI2h+P
 c12DyGzNeaH9Ne4A4SKcAG//JzOWUc12OAxt4M0a71T4/Hn0qiRb/pz0xVGv8rfa
 0CfSDyFs7fxt4BWHzqHa1q9a6Zvel7Mib0aWqKa9F5ptDzNkFphb6UN0WuaKeBmw
 LHmaxoq7Cn9xWlxK2sHHOGak6CksaBmSFNkrmulXxO9o0Bpt8nqaaXUp2BjqwLlG
 fcnwGGlYCp46mmTDX4NQXYAQpw4Iy7qMpKSlzbjPq/cXsYvlJ6O4+6OHHtVZtj+I
 exjs2HTDe+tS2DEkRSBRtxNYwndhKsnGeRndtSx7oTb9zJGDUPTqIBKH1VuGwHsY
 NBbR5y+b2cRz2LYI1SkurGIKLKEuYP9luR+sxCsqFl7R+yKoSnvPfQk0BXu31rgf
 l+9o/8qYzuZED79unhmqCsTsQCpH+7/s00J1ZOtdZbwNGT78+WuxcP8X2ClAJ8nH
 joJrU0XUBJhexwoDHp4nInpp1k8yjDDGwbRGJK4ZDbZAKCxJZ7k=
 =fyCe
 -----END PGP SIGNATURE-----

Merge tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.15 from Viresh Kumar:

"- manage sysfs attributes and boost frequencies efficiently from cpufreq
   core to reduce boilerplate code from drivers (Viresh Kumar).

 - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
   Dhananjay Ugwekar, Imran Shaik, and zuoqian).

 - Migrate to using for_each_present_cpu (Jacky Bai).

 - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski).

 - Use str_enable_disable() helper (Lifeng Zheng)."

* tag 'cpufreq-arm-updates-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (59 commits)
  dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
  dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
  cpufreq: Init cpufreq only for present CPUs
  cpufreq: tegra186: Share policy per cluster
  cpufreq: tegra194: Allow building for Tegra234
  cpufreq: enable 1200Mhz clock speed for armada-37xx
  cpufreq: Remove cpufreq_enable_boost_support()
  cpufreq: staticize policy_has_boost_freq()
  cpufreq: qcom: Set .set_boost directly
  cpufreq: dt: Set .set_boost directly
  cpufreq: scmi: Set .set_boost directly
  cpufreq: powernv: Set .set_boost directly
  cpufreq: loongson: Set .set_boost directly
  cpufreq: apple: Set .set_boost directly
  cpufreq: Restrict enabling boost on policies with no boost frequencies
  cpufreq: cppc: Set policy->boost_supported
  cpufreq: amd: Set policy->boost_supported
  cpufreq: acpi: Set policy->boost_supported
  ...
2025-03-22 15:00:00 +01:00
Mario Limonciello efb758c8c8 cpufreq/amd-pstate: Drop actions in amd_pstate_epp_cpu_offline()
When the CPU goes offline there is no need to change the CPPC request
because the CPU will go into the deepest C-state it supports already.

Actually changing the CPPC request when it goes offline messes up the
cached values and can lead to the wrong values being restored when
it comes back.

Instead drop the actions and if the CPU comes back online let
amd_pstate_epp_set_policy() restore it to expected values.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:26 -06:00
Mario Limonciello 4e16c11752 cpufreq/amd-pstate: Stop caching EPP
EPP values are cached in the cpudata structure per CPU. This is needless
though because they are also cached in the CPPC request variable.

Drop the separate cache for EPP values and always reference the CPPC
request variable when needed.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:26 -06:00
Mario Limonciello 2064543f5b cpufreq/amd-pstate: Rework CPPC enabling
The CPPC enable register is configured as "write once".  That is
any future writes don't actually do anything.

Because of this, all the cleanup paths that currently exist for
CPPC disable are non-effective.

Rework CPPC enable to only enable after all the CAP registers have
been read to avoid enabling CPPC on CPUs with invalid _CPC or
unpopulated MSRs.

As the register is write once, remove all cleanup paths as well.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello 93039a60fb cpufreq/amd-pstate: Drop debug statements for policy setting
There are trace events that exist now for all amd-pstate modes that
will output information right before programming to the hardware.

This makes the existing debug statements unnecessary remaining
overhead.  Drop them.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello 1905fac6f9 cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
On EPP only writes update the cached variable so that the min/max
performance controls don't need to be updated again.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello 77fbea69b0 cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions
The EPP tracing is done by the caller today, but this precludes the
information about whether the CPPC request has changed.

Move it into the update_perf and set_epp functions and include information
about whether the request has changed from the last one.
amd_pstate_update_perf() and amd_pstate_set_epp() now require the policy
as an argument instead of the cpudata.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello 9f5daa2f2f cpufreq/amd-pstate: Cache CPPC request in shared mem case too
In order to prevent a potential write for shmem_update_perf()
cache the request into the cppc_req_cached variable normally only
used for the MSR case.

This adds symmetry into the code and potentially avoids extra writes.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello b4cc466b97 cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
Bitfield masks are easier to follow and less error prone.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello f458cf79d7 cpufreq/amd-pstate: Drop `cppc_cap1_cached`
The `cppc_cap1_cached` variable isn't used at all, there is no
need to read it at initialization for each CPU.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:25 -06:00
Mario Limonciello 6f0b13f16f cpufreq/amd-pstate: Overhaul locking
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
update the policy state and have nothing to do with the amd-pstate
driver itself.

A global "limits" lock doesn't make sense because each CPU can have
policies changed independently.  Each time a CPU changes values they
will atomically be written to the per-CPU perf member. Drop per CPU
locking cases.

The remaining "global" driver lock is used to ensure that only one
entity can change driver modes at a given time.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello 009d1c29a4 cpufreq/amd-pstate: Move perf values into a union
By storing perf values in a union all the writes and reads can
be done atomically, removing the need for some concurrency protections.

While making this change, also drop the cached frequency values,
using inline helpers to calculate them on demand from perf value.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello a9b9b4c2a4 cpufreq/amd-pstate: Drop min and max cached frequencies
Use the perf_to_freq helpers to calculate this on the fly.
As the members are no longer cached add an extra check into
amd_pstate_epp_update_limit() to avoid unnecessary calls in
amd_pstate_update_min_max_limit().

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello a9ba0fd452 cpufreq/amd-pstate: Show a warning when a CPU fails to setup
I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
populated.  This is an unexpected behavior that is most likely a
BIOS bug. In the event it happens I'd like users to report bugs
to properly root cause and get this fixed.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Mario Limonciello b7a4115658 cpufreq/amd-pstate: Invalidate cppc_req_cached during suspend
During resume it's possible the firmware didn't restore the CPPC request
MSR but the kernel thinks the values line up. This leads to incorrect
performance after resume from suspend.

To fix the issue invalidate the cached value at suspend. During resume use
the saved values programmed as cached limits.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:24 -06:00
Dhananjay Ugwekar a1d1d8fb65 cpufreq/amd-pstate: Fix the clamping of perf values
The clamping in freq_to_perf() is broken right now, as we first typecast
(read wraparound) the overflowing value into a u8 and then clamp it down.
So, use a u32 to store the >255 value in certain edge cases and then clamp
it down into a u8.

Also, use a "explicit typecast + clamp" instead of just a "clamp_t" as the
latter typecasts first and then clamps between the limits, which defeats
our purpose.

Fixes: 620136ced3 ("cpufreq/amd-pstate: Modularize perf<->freq conversion")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250222033221.554976-1-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-03-06 13:01:17 -06:00
Dhananjay Ugwekar 3e93edc58a cpufreq/amd-pstate: Remove the unncecessary driver_lock in amd_pstate_update_limits
There is no need to take a driver wide lock while updating the
highest_perf value in the percpu cpudata struct. Hence remove it.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-13-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar 97a705dc1a cpufreq/amd-pstate: Use scope based cleanup for cpufreq_policy refs
There have been instances in past where refcount decrementing is missed
while exiting a function. Use automatic scope based cleanup to avoid
such errors.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-12-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar 426db24d4d cpufreq/amd-pstate: Add missing NULL ptr check in amd_pstate_update
Check if policy is NULL before dereferencing it in amd_pstate_update.

Fixes: e8f555daac ("cpufreq/amd-pstate: fix setting policy current frequency value")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-11-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar b899434857 cpufreq/amd-pstate: Remove the unnecessary cpufreq_update_policy call
The update_limits callback is only called in two conditions.

* When the preferred core rankings change. In which case, we just need to
change the prefcore ranking in the cpudata struct. As there are no changes
to any of the perf values, there is no need to call cpufreq_update_policy()

* When the _PPC ACPI object changes, i.e. the highest allowed Pstate
changes. The _PPC object is only used for a table based cpufreq driver
like acpi-cpufreq, hence is irrelevant for CPPC based amd-pstate.

Hence, the cpufreq_update_policy() call becomes unnecessary and can be
removed.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-9-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar 620136ced3 cpufreq/amd-pstate: Modularize perf<->freq conversion
Delegate the perf<->frequency conversion to helper functions to reduce
code duplication, and improve readability.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-8-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar 555bbe67a6 cpufreq/amd-pstate: Convert all perf values to u8
All perf values are always within 0-255 range, hence convert their
datatype to u8 everywhere.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-7-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar e9869c836b cpufreq/amd-pstate: Pass min/max_limit_perf as min/max_perf to amd_pstate_update
Currently, amd_pstate_update_freq passes the hardware perf limits as
min/max_perf to amd_pstate_update, which eventually gets programmed into
the min/max_perf fields of the CPPC_REQ register.

Instead pass the effective perf limits i.e. min/max_limit_perf values to
amd_pstate_update as min/max_perf.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-6-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar 932da64896 cpufreq/amd-pstate: Remove the redundant des_perf clamping in adjust_perf
des_perf is later on clamped between min_perf and max_perf in
amd_pstate_update. So, remove the redundant clamping from
amd_pstate_adjust_perf.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-5-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Dhananjay Ugwekar 6ceb877d5c cpufreq/amd-pstate: Modify the min_perf calculation in adjust_perf callback
Instead of setting a fixed floor at lowest_nonlinear_perf, use the
min_limit_perf value, so that it gives the user the freedom to lower the
floor further.

There are two minimum frequency/perf limits that we need to consider in
the adjust_perf callback. One provided by schedutil i.e. the sg_cpu->bw_min
value passed in _min_perf arg, another is the effective value of
min_freq_qos request that is updated in cpudata->min_limit_perf. Modify the
code to use the bigger of these two values.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-4-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-23 18:54:56 -06:00
Viresh Kumar 98f39e93d1 cpufreq: amd: Set policy->boost_supported
With a later commit, the cpufreq core will call the ->set_boost()
callback only if the policy supports boost frequency. The
boost_supported flag is set by the cpufreq core if policy->freq_table is
set and one or more boost frequencies are present.

For other drivers, the flag must be set explicitly.

The policy->boost_enabled flag is set by the cpufreq core once the
policy is initialized, don't set it anymore.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-07 09:45:15 +05:30
Dhananjay Ugwekar 3ace20038e cpufreq/amd-pstate: Fix cpufreq_policy ref counting
amd_pstate_update_limits() takes a cpufreq_policy reference but doesn't
decrement the refcount in one of the exit paths, fix that.

Fixes: 45722e777f ("cpufreq: amd-pstate: Optimize amd_pstate_update_limits()")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-10-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-06 13:19:36 -06:00
Dhananjay Ugwekar db1cafc77a cpufreq: amd-pstate: Remove unnecessary driver_lock in set_boost
set_boost is a per-policy function call, hence a driver wide lock is
unnecessary. Also this mutex_acquire can collide with the mutex_acquire
from the mode-switch path in status_store(), which can lead to a
deadlock. So, remove it.

Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-02-06 09:04:30 +05:30
Dhananjay Ugwekar 55db9b73c3 cpufreq/amd-pstate: Fix max_perf updation with schedutil
In adjust_perf() callback, we are setting the max_perf to highest_perf,
as opposed to the correct limit value i.e. max_limit_perf. Fix that.

Fixes: 3f7b835fa4 ("cpufreq/amd-pstate: Move limit updating code")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-3-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-05 12:18:27 -06:00
Dhananjay Ugwekar d364eee14c cpufreq/amd-pstate: Remove the goto label in amd_pstate_update_limits
Scope based guard/cleanup macros should not be used together with goto
labels. Hence, remove the goto label.

Fixes: 6c093d5a5b ("cpufreq/amd-pstate: convert mutex use to guard()")
Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250205112523.201101-2-dhananjay.ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-05 12:18:27 -06:00
Lifeng Zheng fa803513ab cpufreq/amd-pstate: Fix per-policy boost flag incorrect when fail
Commit c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision
boost state") sets per-policy boost flag to false when boost fail.
However, this boost flag will be set to reverse value in
store_local_boost() and cpufreq_boost_trigger_state() in cpufreq.c. This
will cause the per-policy boost flag set to true when fail to set boost.
Remove the extra assignment in amd_pstate_set_boost() and keep all
operations on per-policy boost flag outside of set_boost() to fix this
problem.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250110091949.3610770-1-zhenglifeng1@huawei.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-02-03 00:04:23 -06:00
Naresh Solanki 857a61c2ce cpufreq/amd-pstate: Refactor max frequency calculation
The previous approach introduced roundoff errors during division when
calculating the boost ratio. This, in turn, affected the maximum
frequency calculation, often resulting in reporting lower frequency
values.

For example, on the Glinda SoC based board with the following
parameters:

max_perf = 208
nominal_perf = 100
nominal_freq = 2600 MHz

The Linux kernel previously calculated the frequency as:
freq = ((max_perf * 1024 / nominal_perf) * nominal_freq) / 1024
freq = 5405 MHz  // Integer arithmetic.

With the updated formula:
freq = (max_perf * nominal_freq) / nominal_perf
freq = 5408 MHz

This change ensures more accurate frequency calculations by eliminating
unnecessary shifts and divisions, thereby improving precision.

Signed-off-by: Naresh Solanki <naresh.solanki@9elements.com>
[ML: trim the changelog from commit message]
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241219201833.2750998-1-naresh.solanki@9elements.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-01-03 23:44:07 -06:00
Mario Limonciello fd604ae6c2 cpufreq/amd-pstate: Fix prefcore rankings
commit 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as
highest perf again") updated the value stored for highest perf to no longer
store the highest perf value but instead the boost numerator.

This is a fixed value for systems with preferred cores and not appropriate
for use ITMT rankings. Update the value used for ITMT rankings to be the
preferred core ranking.

Reported-and-tested-by: Sebastian <sobrus@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219640
Fixes: 50a062a762 ("cpufreq/amd-pstate: Store the boost numerator as highest perf again")
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20250102141204.3413202-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2025-01-03 10:27:32 -06:00
Mario Limonciello 95fad7fb58 cpufreq/amd-pstate: Drop boost_state variable
Currently boost_state is cached for every processor in cpudata structure
and driver boost state is set for every processor.

Both of these aren't necessary as the driver only needs to set once and
the policy stores whether boost is enabled.

Move the driver boost setting to registration and adjust all references
to cached value to pull from the policy instead.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-16-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello f9a378ff64 cpufreq/amd-pstate: Set different default EPP policy for Epyc and Ryzen
For Ryzen systems the EPP policy set by the BIOS is generally configured
to performance as this is the default register value for the CPPC request
MSR.

If a user doesn't use additional software to configure EPP then the system
will default biased towards performance and consume extra battery. Instead
configure the default to "balanced_performance" for this case.

Suggested-by: Artem S. Tashkinov <aros@gmx.com>
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219526
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-15-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello f8fde687c9 cpufreq/amd-pstate: Drop ret variable from amd_pstate_set_energy_pref_index()
The ret variable is not necessary.

Reviewed-and-tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-14-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello fff3957969 cpufreq/amd-pstate: Always write EPP value when updating perf
For MSR systems the EPP value is in the same register as perf targets
and so divding them into two separate MSR writes is wasteful.

In msr_update_perf(), update both EPP and perf values in one write to
MSR_AMD_CPPC_REQ, and cache them if successful.

To accomplish this plumb the EPP value into the update_perf call and
modify all its callers to check the return value.

As this unifies calls, ensure that the MSR write is necessary before
flushing a write out. Also drop the comparison from the passive flow
tracing.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-13-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello b3781f30bf cpufreq/amd-pstate: Cache EPP value and use that everywhere
Cache the value in cpudata->epp_cached, and use that for all callers.
As all callers use cached value merge amd_pstate_get_energy_pref_index()
into show_energy_performance_preference().

Check if the EPP value is changed before writing it to MSR or
shared memory region.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-12-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello 3f7b835fa4 cpufreq/amd-pstate: Move limit updating code
The limit updating code in amd_pstate_epp_update_limit() should not
only apply to EPP updates.  Move it to amd_pstate_update_min_max_limit()
so other callers can benefit as well.

With this move it's not necessary to have clamp_t calls anymore because
the verify callback is called when setting limits.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-11-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello 942718f2a2 cpufreq/amd-pstate: Change amd_pstate_update_perf() to return an int
As msr_update_perf() calls an MSR it's possible that it fails. Pass
this return code up to the caller.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-10-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello 68cb0e77b6 cpufreq/amd-pstate: store all values in cpudata struct in khz
Storing values in the cpudata structure in different units leads
to confusion and hardcoded conversions elsewhere.  After ratios are
calculated store everything in khz for any future use. Adjust all
relevant consumers for this change as well.

Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-9-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello 474e7218e8 cpufreq/amd-pstate: Only update the cached value in msr_set_epp() on success
If writing the MSR MSR_AMD_CPPC_REQ fails then the cached value in the
amd_cpudata structure should not be updated.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-8-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello 88a95ba066 cpufreq/amd-pstate: Use FIELD_PREP and FIELD_GET macros
The FIELD_PREP and FIELD_GET macros improve readability and help
to avoid shifting bugs.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-7-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:53 -06:00
Mario Limonciello 3b43739824 cpufreq/amd-pstate: Drop cached epp_policy variable
epp_policy is not used by any of the current code and there
is no need to cache it.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-6-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Mario Limonciello 6c093d5a5b cpufreq/amd-pstate: convert mutex use to guard()
Using scoped guard declaration will unlock mutexes automatically.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-5-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Mario Limonciello 4dcd130151 cpufreq/amd-pstate: Add trace event for EPP perf updates
In "active" mode the most important thing for debugging whether
an issue is hardware or software based is to look at what was the
last thing written to the CPPC request MSR or shared memory region.

The 'amd_pstate_epp_perf' trace event shows the values being written
for all CPUs.

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241209185248.16301-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar 53ec2101df cpufreq/amd-pstate: Merge amd_pstate_epp_cpu_offline() and amd_pstate_epp_offline()
amd_pstate_epp_offline() is only called from within
amd_pstate_epp_cpu_offline() and doesn't make much sense to have it at all.
Hence, remove it.

Also remove the unncessary debug print in the offline path while at it.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-6-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar b78f8c87ec cpufreq/amd-pstate: Remove the cppc_state check in offline/online functions
Only amd_pstate_epp driver (i.e. cppc_state = ACTIVE) enters the
amd_pstate_epp_offline() and amd_pstate_epp_cpu_online() functions,
so remove the unnecessary if condition checking if cppc_state is
equal to AMD_PSTATE_ACTIVE.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00