Commit Graph

176 Commits

Author SHA1 Message Date
Dhananjay Ugwekar b1089e0c88 cpufreq/amd-pstate: Refactor amd_pstate_epp_reenable() and amd_pstate_epp_offline()
Replace similar code chunks with amd_pstate_update_perf() and
amd_pstate_set_epp() function calls.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-4-Dhananjay.Ugwekar@amd.com
[ML: Fix LKP reported error about unused variable]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar 57a2b25e45 cpufreq/amd-pstate: Move the invocation of amd_pstate_update_perf()
amd_pstate_update_perf() should not be a part of shmem_set_epp() function,
so move it to the amd_pstate_epp_update_limit() function, where it is needed.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Dhananjay Ugwekar 16c977f817 cpufreq/amd-pstate: Convert the amd_pstate_get/set_epp() to static calls
MSR and shared memory based systems have different mechanisms to get and
set the epp value. Split those mechanisms into different functions and
assign them appropriately to the static calls at boot time. This eliminates
the need for the "if(cpu_feature_enabled(X86_FEATURE_CPPC))" checks at
runtime.

Also, propagate the error code from rdmsrl_on_cpu() and cppc_get_epp_perf()
to *_get_epp()'s caller, instead of returning -EIO unconditionally.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241204144842.164178-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-11 10:44:52 -06:00
Mario Limonciello 2993b29b2a cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies
commit 18d9b52271 ("cpufreq/amd-pstate: Use nominal perf for limits
when boost is disabled") introduced different semantics for min/max limits
based upon whether the user turned off boost from sysfs.

This however is not necessary when the highest perf value is the boost
numerator.

Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Fixes: 18d9b52271 ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled")
Link: https://lore.kernel.org/r/20241209185248.16301-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-10 10:17:43 -06:00
Mario Limonciello 50a062a762 cpufreq/amd-pstate: Store the boost numerator as highest perf again
commit ad4caad58d ("cpufreq: amd-pstate: Merge
amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
changed the semantics for highest perf and commit 18d9b52271
("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled")
worked around those semantic changes.

This however is a confusing result and furthermore makes it awkward to
change frequency limits and boost due to the scaling differences. Restore
the boost numerator to highest perf again.

Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Fixes: ad4caad58d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
Link: https://lore.kernel.org/r/20241209185248.16301-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-10 10:17:43 -06:00
K Prateek Nayak 919bfa9b2d cpufreq/amd-pstate: Detect preferred core support before driver registration
Booting with amd-pstate on 3rd Generation EPYC system incorrectly
enabled ITMT support despite the system not supporting Preferred Core
ranking. amd_pstate_init_prefcore() called during amd_pstate*_cpu_init()
requires "amd_pstate_prefcore" to be set correctly however the preferred
core support is detected only after driver registration which is too
late.

Swap the function calls around to detect preferred core support before
registring the driver via amd_pstate_register_driver(). This ensures
amd_pstate*_cpu_init() sees the correct value of "amd_pstate_prefcore"
considering the platform support.

Fixes: 279f838a61 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Fixes: ff2653ded4 ("cpufreq/amd-pstate: Move registration after static function call update")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241210032557.754-1-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-09 21:57:34 -06:00
Linus Torvalds d8d78a90e7 - Add a feature flag which denotes AMD CPUs supporting workload classification
with the purpose of using such hints when making scheduling decisions
 
 - Determine the boost enumerator for each AMD core based on its type: efficiency
   or performance, in the cppc driver
 
 - Add the type of a CPU to the topology CPU descriptor with the goal of
   supporting and making decisions based on the type of the respective core
 
 - Add a feature flag to denote AMD cores which have heterogeneous topology and
   enable SD_ASYM_PACKING for those
 
 - Check microcode revisions before disabling PCID on Intel
 
 - Cleanups and fixlets
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmc7q0UACgkQEsHwGGHe
 VUq27Q//TADIn/rZj95OuWLYFXduOpzdyfF6BAOabRjUpIWTGJ5YdKjj1TCA2wUE
 6SiHZWQxQropB3NgeICcDT+3OGdGzE2qywzpXspUDsBPraWx+9CA56qREYafpRps
 88ZQZJWHla2/0kHN5oM4fYe05mWMLAFgIhG4tPH/7sj54Zqar40nhVksz3WjKAid
 yEfzbdVeRI5sNoujyHzGANXI0Fo98nAyi5Qj9kXL9W/UV1JmoQ78Rq7V9IIgOBsc
 l6Gv/h0CNtH9voqfrfUb07VHk8ZqSJ37xUnrnKdidncWGCWEAoZRr7wU+I9CHKIs
 tzdx+zq6JC3YN0IwsZCjk4me+BqVLJxW2oDgW7esPifye6ElyEo4T9UO9LEpE1qm
 ReAByoIMdSXWwXuITwy4NxLPKPCpU7RyJCiqFzpJp0g4qUq2cmlyERDirf6eknXL
 s+dmRaglEdcQT/EL+Y+vfFdQtLdwJmOu+nPPjjFxeRcIDB+u1sXJMEFbyvkLL6FE
 HOdNxL+5n/3M8Lbh77KIS5uCcjXL2VCkZK2/hyoifUb+JZR/ENoqYjElkMXOplyV
 KQIfcTzVCLRVvZApf/MMkTO86cpxMDs7YLYkgFxDsBjRdoq/Mzub8yzWn6kLZtmP
 ANNH4uYVtjrHE1nxJSA0JgYQlJKYeNU5yhLiTLKhHL5BwDYfiz8=
 =420r
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpuid updates from Borislav Petkov:

 - Add a feature flag which denotes AMD CPUs supporting workload
   classification with the purpose of using such hints when making
   scheduling decisions

 - Determine the boost enumerator for each AMD core based on its type:
   efficiency or performance, in the cppc driver

 - Add the type of a CPU to the topology CPU descriptor with the goal of
   supporting and making decisions based on the type of the respective
   core

 - Add a feature flag to denote AMD cores which have heterogeneous
   topology and enable SD_ASYM_PACKING for those

 - Check microcode revisions before disabling PCID on Intel

 - Cleanups and fixlets

* tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Remove redundant CONFIG_NUMA guard around numa_add_cpu()
  x86/cpu: Fix FAM5_QUARK_X1000 to use X86_MATCH_VFM()
  x86/cpu: Fix formatting of cpuid_bits[] in scattered.c
  x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit
  x86/amd: Use heterogeneous core topology for identifying boost numerator
  x86/cpu: Add CPU type to struct cpuinfo_topology
  x86/cpu: Enable SD_ASYM_PACKING for PKG domain on AMD
  x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES
  x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
  x86/mm: Don't disable PCID when INVLPG has been fixed by microcode
2024-11-19 12:27:19 -08:00
Mario Limonciello ff2653ded4 cpufreq/amd-pstate: Move registration after static function call update
On shared memory designs the static functions need to work before
registration is done or the system can hang at bootup.

Move the registration later in amd_pstate_init() to solve this.

Fixes: b427ac4084 ("cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call")
Reported-by: Klara Modin <klarasmodin@gmail.com>
Closes: https://lore.kernel.org/linux-pm/cf9c146d-bacf-444e-92e2-15ebf513af96@gmail.com/#t
Tested-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241028145542.1739160-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-29 08:50:39 -05:00
Mario Limonciello 3ac757e8db cpufreq/amd-pstate: Push adjust_perf vfunc init into cpu_init
As the driver can be changed in and out of different modes it's possible
that adjust_perf is assigned when it shouldn't be.

This could happen if an MSR design is started up in passive mode and then
switches to active mode.

To solve this explicitly clear `adjust_perf` in amd_pstate_epp_cpu_init().

Tested-by: Klara Modin <klarasmodin@gmail.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241028145542.1739160-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-29 08:31:57 -05:00
Dhananjay Ugwekar a6960e6b1b cpufreq/amd-pstate: Align offline flow of shared memory and MSR based systems
Set min_perf to lowest_perf for shared memory systems, similar to the MSR
based systems.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Dhananjay Ugwekar 796ff50e12 cpufreq/amd-pstate: Call cppc_set_epp_perf in the reenable function
The EPP value being set in perf_ctrls.energy_perf is not being propagated
to the shared memory, fix that.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-4-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Dhananjay Ugwekar 73070a9169 cpufreq/amd-pstate: Do not attempt to clear MSR_AMD_CPPC_ENABLE
MSR_AMD_CPPC_ENABLE is a write once register, i.e. attempting to clear
it is futile, it will not take effect. Hence, return if disable (0)
argument is passed to the msr_cppc_enable()

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Dhananjay Ugwekar 7fb463aac8 cpufreq/amd-pstate: Rename functions that enable CPPC
Explicitly rename functions that enable CPPC as *_cppc_*.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Link: https://lore.kernel.org/r/20241023102108.5980-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Swapnil Sapkal 54ab7d7c59 amd-pstate: Switch to amd-pstate by default on some Server platforms
Currently the default cpufreq driver for all the AMD EPYC servers is
acpi-cpufreq. Going forward, switch to amd-pstate as the default
driver on the AMD EPYC server platforms with CPU family 0x1A or
higher. The default mode will be active mode.

Testing shows that amd-pstate with active mode and performance
governor provides comparable or better performance per-watt against
acpi-cpufreq + performance governor.

Likewise, amd-pstate with active mode and powersave governor with the
energy_performance_preference=power (EPP=255) provides comparable or
better performance per-watt against acpi-cpufreq + schedutil governor
for a wide range of workloads.

Users can still revert to using acpi-cpufreq driver on these platforms
with the "amd_pstate=disable" kernel commandline parameter.

Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241021101836.9047-3-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:36 -05:00
Gautham R. Shenoy 0c411b39e4 amd-pstate: Set min_perf to nominal_perf for active mode performance gov
The amd-pstate driver sets CPPC_REQ.min_perf to CPPC_REQ.max_perf when
in active mode with performance governor. Typically CPPC_REQ.max_perf
is set to CPPC.highest_perf. This causes frequency throttling on
power-limited platforms which causes performance regressions on
certain classes of workloads.

Hence, set the CPPC_REQ.min_perf to the CPPC.nominal_perf or
CPPC_REQ.max_perf, whichever is lower of the two.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241021101836.9047-2-gautham.shenoy@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar b427ac4084 cpufreq/amd-pstate: Remove the redundant amd_pstate_set_driver() call
amd_pstate_set_driver() is called twice, once in amd_pstate_init() and once
as part of amd_pstate_register_driver(). Move around code and eliminate
the redundancy.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-5-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar 162cfa4eba cpufreq/amd-pstate: Remove the switch case in amd_pstate_init()
Replace the switch case with a more readable if condition.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-4-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar e3591eebec cpufreq/amd-pstate: Call amd_pstate_set_driver() in amd_pstate_register_driver()
Replace a similar chunk of code in amd_pstate_register_driver() with
amd_pstate_set_driver() call.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar 6f241fa50a cpufreq/amd-pstate: Call amd_pstate_register() in amd_pstate_init()
Replace a similar chunk of code in amd_pstate_init() with
amd_pstate_register() call.

Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017100528.300143-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar 5d9a354cf8 cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
According to the AMD architectural programmer's manual volume 2 [1], in
section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described
as "Reports the most energy efficient performance level (in terms of
performance per watt). Above this threshold, lower performance levels
generally result in increased energy efficiency. Reducing performance
below this threshold does not result in total energy savings for a given
computation, although it reduces instantaneous power consumption". So
lowest_nonlinear_perf is the most power efficient performance level, and
going below that would lead to a worse performance/watt.

Also, setting the minimum frequency to lowest_nonlinear_freq (instead of
lowest_freq) allows the CPU to idle at a higher frequency which leads
to more time being spent in a deeper idle state (as trivial idle tasks
are completed sooner). This has shown a power benefit in some systems,
in other systems, power consumption has increased but so has the
throughput/watt.

Modify the initial policy_data->min set by cpufreq-core to
lowest_nonlinear_freq, in the ->verify() callback. Also set the
cpudata->req[0] to FREQ_QOS_MIN_DEFAULT_VALUE (i.e. 0), so that it also
gets overriden by the check in verify function.

Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241017053927.25285-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar 205cb215d0 cpufreq/amd-pstate: Remove the redundant verify() function
Merge the two verify() callback functions and rename the
cpufreq_policy_data argument for better readability.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20241017053927.25285-2-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello 508239724b cpufreq/amd-pstate: Drop needless EPP initialization
The EPP value doesn't need to be cached to the CPPC request in
amd_pstate_epp_update_limit() because it's passed as an argument
at the end to amd_pstate_set_epp() and stored at that time.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello 047a2d0c83 cpufreq/amd-pstate: Use amd_pstate_update_min_max_limit() for EPP limits
When the EPP updates are set the maximum capable frequency for the
CPU is used to set the upper limit instead of that of the policy.

Adjust amd_pstate_epp_update_limit() to reuse policy calculation code
from amd_pstate_update_min_max_limit().

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello 67c08d303e cpufreq/amd-pstate: Don't update CPPC request in amd_pstate_cpu_boost_update()
When boost is changed the CPPC value is changed in amd_pstate_cpu_boost_update()
but then changed again when refresh_frequency_limits() and all it's callbacks
occur.  The first is a pointless write, so instead just update the limits for
the policy and let the policy refresh anchor everything properly.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello 7820e8050d cpufreq/amd-pstate: Fix non kerneldoc comment
The comment for amd_cppc_supported() isn't meant to be kernel doc.

Fixes: cb817ec667 ("cpufreq: amd-pstate: show CPPC debug message if CPPC is not supported")
Link: https://lore.kernel.org/r/20240905162351.1345560-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Dhananjay Ugwekar 1bfe6a54d2 cpufreq/amd-pstate: Rename MSR and shared memory specific functions
Existing function names "cppc_*" and "pstate_*" for shared memory and
MSR based systems are not intuitive enough, replace them with "shmem_*" and
"msr_*" respectively.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240917091434.10685-1-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-28 14:54:35 -05:00
Mario Limonciello 104edc6efc x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
This feature is an AMD unique feature of some processors, so put
AMD into the name.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241025171459.1093-2-mario.limonciello@amd.com
2024-10-25 20:09:16 +02:00
Mario Limonciello 18d9b52271 cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled
When boost has been disabled the limit for perf should be nominal perf not
the highest perf.  Using the latter to do calculations will lead to
incorrect values that are still above nominal.

Fixes: ad4caad58d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()")
Reported-by: Peter Jung <ptr1337@cachyos.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219348
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Link: https://lore.kernel.org/r/20241012174519.897-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-15 23:54:15 -05:00
Dhananjay Ugwekar c10e50a469 cpufreq/amd-pstate: Fix amd_pstate mode switch on shared memory systems
While switching the driver mode between active and passive, Collaborative
Processor Performance Control (CPPC) is disabled in
amd_pstate_unregister_driver(). But, it is not enabled back while registering
the new driver (passive or active). This leads to the new driver mode not
working correctly, so enable it back in amd_pstate_register_driver().

Fixes: 3ca7bc818d ("cpufreq: amd-pstate: Add guided mode control support via sysfs")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241004122303.94283-1-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-07 11:32:05 -05:00
Rafael J. Wysocki 9bcf30348f second round of amd-pstate changes for 6.12 (second try):
* Move the calculation of the AMD boost numerator outside of
   amd-pstate, correcting acpi-cpufreq on systems with preferred cores
 * Harden preferred core detection to avoid potential false positives
 * Add extra unit test coverage for mode state machine
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCgA4FiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmbhviEaHG1hcmlvLmxp
 bW9uY2llbGxvQGFtZC5jb20ACgkQLRkspiR3AnYqDA//TrvmXcpk1mnVJw3Y7MG0
 /n8dsLpxqVtEf+USnlGR+iRhgSQ/W/Kr7b5a+jmdCwpHChuWHt2FnNgcHLIxDnZC
 vmEJ02/2BCRoPKvcvV4VTh0ATu3O9nqwQiBVWBdNjDy+Dzr0pzA+SQopt1hCIsO2
 mzUodhpiBqYKlMf/i6+aM1gZCGGqoRC40aGqnJsgegb61vl7zIc2ZcbTxUQlyTfv
 t6J73IXLx8+YtrjejBYc7mRHhMQ2hCKy92C/8cNoGocj5faSKsAA3OUDcWq8qX0U
 zK3GGGdW8MLHSbt3VyntstnfiLL7TnzowcjvrMudIWpjC1987GlE9BApbN9VRZ8e
 ARN3Y7/ltjut/1fRB97BwjI9aDpzA0122Qzy4UOcK8o+be1eIr+ihV3Z9EN/snWg
 0L/oq5+rGHvvIzf1BwGhoPSvgBIu7eMIYDcRxKPlEiKsbXrL4DdJC/nXgaZ/HiGO
 eHx1dNy7LFrdnEwVI1frZWC6ZuZcpmOBdhnfU+leVxzB3Z++Qc266rsxKBsc5taZ
 PPV18pxfbbl3iL85KDIbuBUCmA0aY8WEdCKtfXpl7zlB5g0fZQLyYeUbvahK08Sk
 vyQAnPECbX/4v1Vx54Z70GPk0XD2+TXdg8yApnXrmRc36z/SLdprk5hPKbKhZu/r
 iPxFUnvd0HCtjsLrsq/qUiQ=
 =R4HZ
 -----END PGP SIGNATURE-----

Merge tag 'amd-pstate-v6.12-2024-09-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge the second round of amd-pstate changes for 6.12 from Mario
Limonciello:

"* Move the calculation of the AMD boost numerator outside of
   amd-pstate, correcting acpi-cpufreq on systems with preferred cores
 * Harden preferred core detection to avoid potential false positives
 * Add extra unit test coverage for mode state machine"

* tag 'amd-pstate-v6.12-2024-09-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate-ut: Fix an "Uninitialized variables" issue
  cpufreq/amd-pstate-ut: Add test case for mode switches
  cpufreq/amd-pstate: Export symbols for changing modes
  amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`
  cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`
  cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
  cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
  x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
  x86/amd: Move amd_get_highest_perf() out of amd-pstate
  ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
  ACPI: CPPC: Drop check for non zero perf ratio
  x86/amd: Rename amd_get_highest_perf() to amd_get_boost_ratio_numerator()
  ACPI: CPPC: Adjust return code for inline functions in !CONFIG_ACPI_CPPC_LIB
  x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
2024-09-11 18:22:23 +02:00
Mario Limonciello 8d916815b0 cpufreq/amd-pstate: Export symbols for changing modes
In order to effectively test all mode switch combinations export
everything necessarily for amd-pstate-ut to trigger a mode switch.

Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello 45722e777f cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
Don't take and release the mutex when prefcore isn't present and
avoid initialization of variables that will be initially set
in the function.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello ad4caad58d cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
The special case in amd_pstate_highest_perf_set() is the value used
for calculating the boost numerator.  Merge this into
amd_get_boost_ratio_numerator() and then use that to calculate boost
ratio.

This allows dropping more special casing of the highest perf value.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello 279f838a61 x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
AMD systems that support preferred cores will use "166" as their
numerator for max frequency calculations instead of "255".

Add a function for detecting preferred cores by looking at the
highest perf value on all cores.

If preferred cores are enabled return 166 and if disabled the
value in the highest perf register. As the function will be called
multiple times, cache the values for the boost numerator and if
preferred cores will be enabled in global variables.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Mario Limonciello 2819bfef64 x86/amd: Move amd_get_highest_perf() out of amd-pstate
amd_pstate_get_highest_perf() is a helper used to get the highest perf
value on AMD systems.  It's used in amd-pstate as part of preferred
core handling, but applicable for acpi-cpufreq as well.

Move it out to cppc handling code as amd_get_highest_perf().

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-11 10:23:23 -05:00
Rafael J. Wysocki 6af3aab6c7 ARM cpufreq updates for 6.12
- Several OF related cleanups in cpufreq drivers (Rob Herring).
 
 - Enable COMPILE_TEST for ARM drivers (Rob Herrring).
 
 - Introduce quirks for syscon failures and use socinfo to get revision
   for TI cpufreq driver (Dhruva Gole and Nishanth Menon).
 
 - Minor cleanups in amd-pstate driver (Anastasia Belova and Dhananjay
   Ugwekar).
 
 - Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
   (Danila Tikhonov, Huacai Chen, and Liu Jing).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAmbalyoACgkQ0rkcPK6B
 Ehw9IBAAus+BOdYMzU8VT7j8Y98oOfb5FsJCoTU2KaV2RIIpX4k+6daruCOm0BXP
 RtRiI+ILV5zLUm8CIC15f2GQE6PtDBFmjky7ItEemcbQPlTkpkZFWNFhBqE1u3hw
 jllA4p1LmUwAnr1zkwl2CEUJSRJBxWPeTxPL0Ci6pycFhiNPZwGqOreJQRsIMOh3
 pgohKSBebxpzgwES8fhR32CqaHphrEFCryHafZIqzsXSBuyETGEKg57zTmdo6ojy
 GDuaIz6kQ9lKvW/q9iwTih93SsBnzDD85AAERDZkUDxey5IBLztrJLH5QT/XN77K
 EQOHeygwyKk4su00fXy/LXmMqKHCN/mAHgb6JvWBIm2xbDWx6drBJyV/NdX4YI4w
 4m1SqmFH9Cv41UIcynQR83XthGKgIddjEDKPW0GNMQ+LHWlUS6Qm4Kb2q4rXruqD
 bUWs3NmZEvYD9P2XOKGHgfSPZ0iNXi0Lt5BBIWbeIPNwaikxHisNsNG1W2pMsfke
 n19cvt20aBJgx2s5acIH7Po8qQglrGGK9EKWRg8gInvtB7QRbHBhXVD6ZNwuIk/7
 u2+Y42R4R1GzwsD3EUl+RnnUFgRwhg53OIzcE+AaaMDqGeTdxmG42eg0jGSBA7yx
 KbljH9PAfsMjjEjsVYReiIYxS28PZNyTBaxZJxD2RyxMz53CV9w=
 =BlNP
 -----END PGP SIGNATURE-----

Merge tag 'cpufreq-arm-updates-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.12 from Viresh Kumar:

"- Several OF related cleanups in cpufreq drivers (Rob Herring).

 - Enable COMPILE_TEST for ARM drivers (Rob Herrring).

 - Introduce quirks for syscon failures and use socinfo to get revision
   for TI cpufreq driver (Dhruva Gole and Nishanth Menon).

 - Minor cleanups in amd-pstate driver (Anastasia Belova and Dhananjay
   Ugwekar).

 - Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
   (Danila Tikhonov, Huacai Chen, and Liu Jing)."

* tag 'cpufreq-arm-updates-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: ti-cpufreq: Use socinfo to get revision in AM62 family
  cpufreq: Fix the cacography in powernv-cpufreq.c
  cpufreq: ti-cpufreq: Introduce quirks to handle syscon fails appropriately
  cpufreq: loongson3: Use raw_smp_processor_id() in do_service_request()
  cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
  cpufreq: Add SM7325 to cpufreq-dt-platdev blocklist
  cpufreq: Fix warning on unused of_device_id tables for !CONFIG_OF
  cpufreq/amd-pstate: Add the missing cpufreq_cpu_put()
  cpufreq: Drop CONFIG_ARM and CONFIG_ARM64 dependency on Arm drivers
  cpufreq: Enable COMPILE_TEST on Arm drivers
  cpufreq: armada-8k: Avoid excessive stack usage
  cpufreq: omap: Drop asm includes
  cpufreq: qcom: Add explicit io.h include for readl/writel_relaxed
  cpufreq: spear: Use of_property_for_each_u32() instead of open coding
  cpufreq: Use of_property_present()
2024-09-06 20:50:46 +02:00
Rafael J. Wysocki 222caf5520 amd-pstate development for 6.12:
* Validate return of any attempt to update EPP limits, which fixes
   the masking hardware problems.
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCgA4FiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmbYvoIaHG1hcmlvLmxp
 bW9uY2llbGxvQGFtZC5jb20ACgkQLRkspiR3AnY6ZxAAw0vB5p/mC8QfS5VuDm0O
 Up+dMtjVf9VcIOx4OoqSPNTY4HmNUnBSmACfeY/LzUa22BhM+7SJ74y8UoXGYGc8
 GKzuVDRsbqnuMrGqbOd8u+eJhcc8fln7zJVa1xBOM8V5yHqvnxjAsnwVieVhjmKa
 7m5K8ht/tJsCpaY/BjF7iHMgq950UGjO+pUXzrYP5ARV1O367DMWVQ91X9NBKwvN
 w/mapWmH+mpp0//2hL7wzrtGIfYjdndFP3xM0+7v5MoEd9KuSl7xCVObGidDOEaz
 oxfMCfPPpHMJLwD/L6VOwq/AIomJQVUZS8111ucnbdkIVW5QOlSaFwKJG/KmFvUQ
 98lfX2xAaGB26tvWQ3l+vFQ7Qces0NrJfCLbpS2NdQ2oSq1+1ab3Yeh1wDXYph7o
 hF+Xa/qJHzIFwzeQ2EmdXwthpoXbF5mmIiBZCTE/v1WpBVkYj+lJGtU+WsyAwFJT
 qPASAqrRo8KoCfhdxYT47Rc3AowUWtHoJQlZNzTdbmc7KXrk+7NfDUZ5ifLq3puI
 FAPWg76NM8aefoattsC57+6/V2bygbe1wIQaXRrybZiO9cMtY0eQ9bE4ysIm0ZQ8
 bSPEHg1iYl3OrMzwRYnBHmev8E4jmKdKSnBBPSqD3ObeR3fNZQs1pxWc3CtHP8LA
 p+AEE66CGAzsed5wudcEoOc=
 =nPcu
 -----END PGP SIGNATURE-----

Merge tag 'amd-pstate-v6.12-2024-09-04' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Merge an amd-pstate driver update for 6.12 from Mario Limonciello:

"amd-pstate development for 6.12:
 * Validate return of any attempt to update EPP limits, which fixes
   the masking hardware problems."

* tag 'amd-pstate-v6.12-2024-09-04' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
  cpufreq/amd-pstate: Catch failures for amd_pstate_epp_update_limit()
2024-09-05 13:04:46 +02:00
Mario Limonciello c3e093efbc cpufreq/amd-pstate: Catch failures for amd_pstate_epp_update_limit()
amd_pstate_set_epp() calls cppc_set_epp_perf() which can fail for
a variety of reasons but this is ignored.  Change the return flow
to allow failures.

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-09-04 15:07:34 -05:00
Anastasia Belova 5493f9714e cpufreq: amd-pstate: add check for cpufreq_cpu_get's return value
cpufreq_cpu_get may return NULL. To avoid NULL-dereference check it
and return in case of error.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-09-04 20:43:59 +05:30
Dhananjay Ugwekar 49243adc71 cpufreq/amd-pstate: Add the missing cpufreq_cpu_put()
Fix the reference counting of cpufreq_policy object in amd_pstate_update()
function by adding the missing cpufreq_cpu_put().

Fixes: e8f555daac ("cpufreq/amd-pstate: fix setting policy current frequency value")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-09-04 20:43:58 +05:30
Gautham R. Shenoy 9c68a3b03e cpufreq/amd-pstate: Remove warning for X86_FEATURE_CPPC on certain Zen models
commit bff7d13c19 ("cpufreq: amd-pstate: add debug message while
CPPC is supported and disabled by SBIOS") issues a warning on plaforms
where the X86_FEATURE_CPPC is expected to be enabled, but is not due
to it being disabled in the BIOS.

This feature bit corresponds to CPUID 0x80000008.ebx[27] which is a
reserved bit on the Zen1 processors and a reserved bit on Zen2 based
models 0x70-0x7F, and is expected to be cleared on these
platforms. Thus printing the warning message for these models when
X86_FEATURE_CPPC is unavailable is incorrect. Fix this.

Modify some of the comments, and use switch-case for model range
checking for improved readability while at it.

Fixes: bff7d13c19 ("cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS")
Cc: Xiaojian Du <xiaojian.du@amd.com>
Reported-by: David Wang <00107082@163.com>
Closes: https://lore.kernel.org/lkml/20240730140111.4491-1-00107082@163.com/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-28 10:15:00 -05:00
Gautham R. Shenoy 0d8584d288 cpufreq/amd-pstate: Use topology_logical_package_id() instead of logical_die_id()
After the commit 63edbaa48a ("x86/cpu/topology: Add support for the
AMD 0x80000026 leaf"), the topolgy_logical_die_id() function returns
the logical Core Chiplet Die (CCD) ID instead of the logical socket
ID.

Since this is currently used to set MSR_AMD_CPPC_ENABLE, which needs
to be set on any one of the threads of the socket, it is prudent to
use topology_logical_package_id() in place of
topology_logical_die_id().

Fixes: 63edbaa48a ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
cc: stable@vger.kernel.org # 6.10
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Link: https://lore.kernel.org/lkml/20240801124509.3650-1-Dhananjay.Ugwekar@amd.com/
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-22 15:42:10 -05:00
Dan Carpenter 67d95303c8 cpufreq: amd-pstate: Fix uninitialized variable in amd_pstate_cpu_boost_update()
Smatch complains that "ret" could be uninitialized:

  drivers/cpufreq/amd-pstate.c:734 amd_pstate_cpu_boost_update()
  error: uninitialized symbol 'ret'.

This seems like it probably is a real issue.  Initialize "ret" to zero to
be safe.

Fixes: c8c68c38b5 ("cpufreq: amd-pstate: initialize core precision boost state")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/lkml/7ff53543-6c04-48a0-8d99-7dc010b93b3a@stanley.mountain/T/
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-08-22 15:42:10 -05:00
Rafael J. Wysocki 7ad9eab9d4 ARM cpufreq updates for 6.11
- cpufreq: Add Loongson-3 CPUFreq driver support (Huacai Chen).
 - Make exit() callback return void (Lizhe and Viresh Kumar).
 - Minor cleanups and fixes in several drivers (Bryan Brattlof,
   Javier Carrasco, Jagadeesh Kona, Jeff Johnson, Nícolas F. R. A. Prado,
   Primoz Fiser, Raphael Gallais-Pou, and Riwen Lu).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAmaM3agACgkQ0rkcPK6B
 Ehw2QA//W+HaHbEf3zOFvwDgG23h3ampEzIoZ1LTznU7rsK7as1XgJ12pHk3uZyy
 L9OppUeN0zH9LaIgOCG5C5oVnRujl30LK3jo/vyBkGROdpng6w4Wci/2XIqPEZFJ
 sMC3om+VgbXGu1UaxSTX/fBjuWeuoLY6rrGHjkDcAh52bgEWuRTzgOIrcRTRpcvb
 G8Gy1YU/t2j/UocYkiR3s5JAFyujmiWcoD4fO4wt+JaYRnDmfQXSrE9X0dpjN+Vp
 wxftLn3RgbuIXGmrDnnwUiDa/e6YSTLKgkrdzshSyOeHUzW7SoMfkMqb26bnFsLY
 m2FKnTtT2uQIPdFwrPPseXhUvjklyOAeIZH6tO/QGoteXU3SVWB1kBQNcVbztWF5
 hHGL/qERACIt3xU/WQ0h1nvTMf46+1vc944uArh6F6t/XvmcoXv05YDRymyZBWLx
 mNRqG89gDex/TB+R15GBbXibK2UEGB26Bu84m7nFgbo5B0oM+OPebm49133gfz3V
 b8XaxzQMMFgdV3CpqRxQTNSnPWiwspttBZE7hYULONDxj8Ys/yfY7Gq8khjQxEBO
 xxQ4QRtlwkLSilyNb19i5LM9F+HpmkxdjO6su3SgZW5QVUUKsNA/aY0CbrXuIRiS
 dBGwBz8/EZ/7+/bK+TIU5tdR8UCSrVifF/bVGaQnWRWvB/2gPhw=
 =qMmS
 -----END PGP SIGNATURE-----

Merge tag 'cpufreq-arm-updates-6.11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Merge ARM cpufreq updates for 6.11 from Viresh Kumar:

"- cpufreq: Add Loongson-3 CPUFreq driver support (Huacai Chen).
 - Make exit() callback return void (Lizhe and Viresh Kumar).
 - Minor cleanups and fixes in several drivers (Bryan Brattlof,
   Javier Carrasco, Jagadeesh Kona, Jeff Johnson, Nícolas F. R. A. Prado,
   Primoz Fiser, Raphael Gallais-Pou, and Riwen Lu)."

* tag 'cpufreq-arm-updates-6.11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (21 commits)
  cpufreq: sti: fix build warning
  cpufreq: mediatek: Use dev_err_probe in every error path in probe
  cpufreq: Add Loongson-3 CPUFreq driver support
  cpufreq: Make cpufreq_driver->exit() return void
  cpufreq: pcc: Remove empty exit() callback
  cpufreq: loongson2: Remove empty exit() callback
  cpufreq: nforce2: Remove empty exit() callback
  cpufreq: sti: add missing MODULE_DEVICE_TABLE entry for stih418
  cpufreq: ti: update OPP table for AM62Px SoCs
  cpufreq: ti: update OPP table for AM62Ax SoCs
  cpufreq: sun50i: add Allwinner H700 speed bin
  cpufreq/cppc: Don't compare desired_perf in target()
  OPP: ti: Fix ti_opp_supply_probe wrong return values
  cpufreq: ti-cpufreq: Handle deferred probe with dev_err_probe()
  cpufreq: dt-platdev: add missing MODULE_DESCRIPTION() macro
  cpufreq: longhaul: Fix kernel-doc param for longhaul_setstate
  cpufreq: qcom-nvmem: eliminate uses of of_node_put()
  cpufreq: qcom-nvmem: fix memory leaks in probe error paths
  cpufreq: scmi: Avoid overflow of target_freq in fast switch
  cpufreq: sun50i: replace of_node_put() with automatic cleanup handler
  ...
2024-07-09 17:58:20 +02:00
Lizhe b4b1ddc9df cpufreq: Make cpufreq_driver->exit() return void
The cpufreq core doesn't check the return type of the exit() callback
and there is not much the core can do on failures at that point. Just
drop the returned value and make it return void.

Signed-off-by: Lizhe <sensor1010@163.com>
[ Viresh: Reworked the patches to fix all missing changes together. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # Mediatek
Acked-by: Sudeep Holla <sudeep.holla@arm.com> # scpi, scmi, vexpress
Acked-by: Mario Limonciello <mario.limonciello@amd.com> # amd
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> # bmips
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com> # omap
2024-07-09 08:45:30 +05:30
Dhananjay Ugwekar 738d7d0357 cpufreq/amd-pstate: Fix the scaling_max_freq setting on shared memory CPPC systems
On shared memory CPPC systems, with amd_pstate=active mode, the change
in scaling_max_freq doesn't get written to the shared memory
region. Due to this, the writes to the scaling_max_freq sysfs file
don't take effect. Fix this by propagating the scaling_max_freq
changes to the shared memory region.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Reported-by: David Arcari <darcari@redhat.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240702081413.5688-3-Dhananjay.Ugwekar@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-07-07 10:32:48 -05:00
Perry Yuan 89ac482d51 cpufreq: amd-pstate: Cap the CPPC.max_perf to nominal_perf if CPB is off
When Core Performance Boost is disabled by the user, the
CPPC_REQ.max_perf should not exceed the nominal_perf since by definition
the frequencies between nominal_perf and the highest_perf are in the
boost range. Fix this in amd_pstate_update()

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Link: https://lore.kernel.org/r/66f55232be01092c423f0523f68b82b80c293943.1718988436.git.perry.yuan@amd.com
Link: https://lore.kernel.org/r/20240626042733.3747-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-26 15:48:21 -05:00
Perry Yuan c8c68c38b5 cpufreq: amd-pstate: initialize core precision boost state
The "Core Performance Boost (CPB) feature, when enabled in the BIOS,
allows the OS to control the highest performance for each individual
core. The active, passive and the guided modes of the amd-pstate driver
do support controlling the core frequency boost when this BIOS feature
is enabled. Additionally, the amd-pstate driver provides a sysfs
interface allowing the user to activate/deactivate this core performance
boost feature at runtime.

Add support for the set_boost callback in the active mode driver to
enable boost control via the cpufreq core. This ensures a consistent
boost control interface across all pstate modes, including passive
mode, guided mode, and active mode.

With this addition, all three pstate modes can support the same boost
control interface with unique interface and global CPB control. Each
CPU also supports individual boost control, allowing global CPB to
change all cores' boost states simultaneously. Specific CPUs can
update their boost states separately, ensuring all cores' boost
states are synchronized.

Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20240626042733.3747-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-26 15:48:21 -05:00
Mario Limonciello bc76f57574 cpufreq: amd-pstate: Don't create attributes when registration fails
If driver registration fails then immediately return the failure
instead of continuing to register attributes.

This fixes issues of falling back from amd-pstate to other drivers
when cpufreq init has failed for any reason.

Reported-by: alex.s.cochran@proton.me
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Link: https://lore.kernel.org/r/20240623200918.52104-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-24 13:17:56 -05:00
Meng Li e8f555daac cpufreq/amd-pstate: fix setting policy current frequency value
When scaling min/max freq values were being setted,
the value of policy->cur need to update.

Signed-off-by: Meng Li <li.meng@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240227071133.3405003-1-li.meng@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan 4e4f600ee7 cpufreq: amd-pstate: auto-load pstate driver by default
If the `amd-pstate` driver is not loaded automatically by default,
it is because the kernel command line parameter has not been added.
To resolve this issue, it is necessary to call the `amd_pstate_set_driver()`
function to enable the desired mode (passive/active/guided) before registering
the driver instance.

This ensures that the driver is loaded correctly without relying on the kernel
command line parameter.

When there is no parameter added to command line, Kernel config will
provide the default mode to load.

Meanwhile, user can add driver mode in command line which will override
the kernel config default option.

Reported-by: Andrei Amuraritei <andamu@posteo.net>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218705
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/83301c4cea4f92fb19e14b23f2bac7facfd8bdbb.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan 918263938c cpufreq: amd-pstate: enable shared memory type CPPC by default
The amd-pstate-epp driver has been implemented and resolves the
performance drop issue seen in passive mode for shared memory type
CPPC systems. Users who enable the active mode driver will not
experience a performance drop compared to the passive mode driver.
Therefore, the EPP driver should be loaded by default for shared
memory type CPPC system to get better performance.

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/c705507cf3ee790e544251cfd897ed11e8e57712.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan c9fdaba836 cpufreq: amd-pstate: switch boot_cpu_has() to cpu_feature_enabled()
replace the usage of the deprecated boot_cpu_has() function with
the modern cpu_feature_enabled() function. The switch to cpu_feature_enabled()
ensures compatibility with the latest CPU feature detection mechanisms and
improves code maintainability.

Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/f1567593ac5e1d38343067e9c681a8c4b0707038.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan bff7d13c19 cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS
If CPPC feature is supported by the CPU however the CPUID flag bit is not
set by SBIOS, the `amd_pstate` will be failed to load while system
booting.
So adding one more debug message to inform user to check the SBIOS setting,
The change also can help maintainers to debug why amd_pstate driver failed
to be loaded at system booting if the processor support CPPC.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/42c953616ac121bd1e5c329e83d015a02e6b32c7.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan cb817ec667 cpufreq: amd-pstate: show CPPC debug message if CPPC is not supported
Add CPU ID checking in case the driver attempt to load on systems where
CPPC functionality is unavailable. And the warning message will not
be shown if CPPC is not supported.

It will also print debug message if the CPU has no CPPC support that
helps to debug the driver loading failure issue.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/linux-pm/CYYPR12MB8655D32EA18574C9497E888A9C122@CYYPR12MB8655.namprd12.prod.outlook.com/T/#t
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/437dbd581a4119465581330081d9b1e289482ba2.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan 7bf7f22906 cpufreq: amd-pstate: remove unused variable nominal_freq
removed the unused variable `nominal_freq` for build warning.
This variable was defined and assigned a value in the previous code,
but it was not used in the subsequent code.

Closes: https://lore.kernel.org/oe-kbuild-all/202405080431.BPU6Yg9s-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/b7ef41557f71d40d098393ddb27f0fe1f23648ae.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Perry Yuan 8f8b42c1fc cpufreq: amd-pstate: optimize the initial frequency values verification
To enhance the debugging capability of the driver loading failure for
broken CPPC ACPI tables, it can optimize the expression by moving the
verification of `min_freq`, `nominal_freq`, and other dependency values
to the `amd_pstate_init_freq()` function where they are initialized.
If any of these values are incorrect, the `amd-pstate` driver will not be registered.

By ensuring that these values are correct before they are used, it will facilitate
the debugging process when encountering driver loading failures due to faulty CPPC
ACPI tables from BIOS

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/f9793f8451c1832e34cc9dc35f89c653b39cfe38.1718811234.git.perry.yuan@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Mario Limonciello fc6e083726 cpufreq: amd-pstate: Allow users to write 'default' EPP string
The EPP string for 'default' represents what the firmware had configured
as the default EPP value but once a user changes EPP to another string
they can't reset it back to 'default'.

Cache the firmware EPP value and allow the user to write 'default' using
this value.

Reported-by: Artem S. Tashkinov <aros@gmx.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931#c61
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-20 21:52:05 -05:00
Xiaojian Du c00d476cbc cpufreq: amd-pstate: change cpu freq transition delay for some models
Some of AMD ZEN4 APU/CPU have support for adjusting the CPU core
clock more quickly and presicely according to CPU work loading.
This is advertised by the Fast CPPC x86 feature.
This change will only be effective in the *passive mode* of
AMD pstate driver. From the test results of different
transition delay values, 600us is chosen to make a balance
between performance and power consumption.

Some test results on AMD Ryzen 7840HS(Phoenix) APU:

1. Tbench
(Energy less is better, Throughput more is better,
PPW--Performance per Watt more is better)
============= =================== ============== =============== ============== =============== ============== =============== ===============
 Trans Delay   Tbench              governor:schedutil, 3-iterations average
============= =================== ============== =============== ============== =============== ============== =============== ===============
 1000us        Clients             1              2               4              8              12             16              32
               Energy/Joules       2010           2804            8768           17171          16170          15132           15027
               Throughput/(MB/s)   114            259             1041           3010           3135           4851            4605
               PPW                 0.0567         0.0923          0.1187         0.1752         0.1938         0.3205          0.3064
 600us         Clients             1              2               4              8              12             16              32
               Energy/Joules       2115  (5.22%)  2388  (-14.84%) 10700(22.03%)  16716 (-2.65%) 15939 (-1.43%) 15053 (-0.52%)  15083 (0.37% )
               Throughput/(MB/s)   122   (7.02%)  234   (-9.65% ) 1188 (14.12%)  3003  (-0.23%) 3143  (0.26% ) 4842  (-0.19%)  4603  (-0.04%)
               PPW                 0.0576(1.59%)  0.0979(6.07%  ) 0.111(-6.49%)  0.1796(2.51% ) 0.1971(1.70% ) 0.3216(0.34% )  0.3051(-0.42%)
============= =================== ============== ================ ============= =============== ============== =============== ===============

2.Dbench
(Energy less is better, Throughput more is better,
PPW--Performance per Watt more is better)
============= =================== ============== =============== ============== =============== ============== =============== ===============
 Trans Delay   Dbench              governor:schedutil, 3-iterations average
============= =================== ============== =============== ============== =============== ============== =============== ===============
 1000us        Clients             1             2               4              8               12             16              32
               Energy/Joules       4890          3779            3567           5157            5611           6500            8163
               Throughput/(MB/s)   327           167             220            577             775            938             1397
               PPW                 0.0668        0.0441          0.0616         0.1118          0.1381         0.1443          0.1711
 600us         Clients             1             2               4              8               12             16              32
               Energy/Joules       4915  (0.51%) 4912  (29.98%)  3506  (-1.71%) 4907  (-4.85% ) 5011 (-10.69%) 5672  (-12.74%) 8141  (-0.27%)
               Throughput/(MB/s)   348   (6.42%) 284   (70.06%)  220   (0.00% ) 518   (-10.23%) 712  (-8.13% ) 854   (-8.96% ) 1475  (5.58% )
               PPW                 0.0708(5.99%) 0.0578(31.07%)  0.0627(1.79% ) 0.1055(-5.64% ) 0.142(2.82%  ) 0.1505(4.30%  ) 0.1811(5.84% )
============= =================== ============== =============== ============== =============== ============== =============== ===============

3.Hackbench(less time is better)
============= =========================== ==========================
  hackbench     governor:schedutil
============= =========================== ==========================
  Trans Delay   Process Mode Ave time(s)  Thread Mode Ave time(s)
  1000us        14.484                      14.484
  600us         14.418(-0.46%)              15.41(+6.39%)
============= =========================== ==========================

4.Perf_sched_bench(less time is better)
============= =================== ============== ============== ============== =============== =============== =============
 Trans Delay  perf_sched_bench    governor:schedutil
============= =================== ============== ============== ============== =============== =============== =============
  1000us        Groups             1             2              4              8               12              24
                AveTime(s)        1.64          2.851          5.878          11.636          16.093          26.395
  600us         Groups             1             2              4              8               12              24
                AveTime(s)        1.69(3.05%)   2.845(-0.21%)  5.843(-0.60%)  11.576(-0.52%)  16.092(-0.01%)  26.32(-0.28%)
============= ================== ============== ============== ============== =============== =============== ==============

5.Sysbench(higher is better)
============= ================== ============== ================= ============== ================ =============== =================
  Sysbench    governor:schedutil
============= ================== ============== ================= ============== ================ =============== =================
  1000us      Thread             1               2                4              8                12               24
              Ave events         6020.98         12273.39         24119.82       46171.57         47074.37         47831.72
  600us       Thread             1               2                4              8                12               24
              Ave events         6154.82(2.22%)  12271.63(-0.01%) 24392.5(1.13%) 46117.64(-0.12%) 46852.19(-0.47%) 47678.92(-0.32%)
============= ================== ============== ================= ============== ================ =============== =================

In conclusion, a shorter transition delay
of cpu clock will make a quite positive effect to improve PPW
on Dbench test, in the meanwhile, keep stable performance
on Tbench, Hackbench, Perf_sched_bench and Sysbench.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
2024-06-11 16:12:12 -05:00
Dhananjay Ugwekar e4731baaf2 cpufreq: amd-pstate: Fix the inconsistency in max frequency units
The nominal frequency in cpudata is maintained in MHz whereas all other
frequencies are in KHz. This means we have to convert nominal frequency
value to KHz before we do any interaction with other frequency values.

In amd_pstate_set_boost(), this conversion from MHz to KHz is missed,
fix that.

Tested on a AMD Zen4 EPYC server

Before:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151
409422

After:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151000
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151000
1799527

Fixes: ec437d71db ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Peter Jung <ptr1337@cachyos.org>
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-28 22:03:11 +02:00
Arnd Bergmann 779b8a14af cpufreq: amd-pstate: remove global header file
When extra warnings are enabled, gcc points out a global variable
definition in a header:

In file included from drivers/cpufreq/amd-pstate-ut.c:29:
include/linux/amd-pstate.h:123:27: error: 'amd_pstate_mode_string' defined but not used [-Werror=unused-const-variable=]
  123 | static const char * const amd_pstate_mode_string[] = {
      |                           ^~~~~~~~~~~~~~~~~~~~~~

This header is only included from two files in the same directory,
and one of them uses only a single definition from it, so clean it
up by moving most of the contents into the driver that uses them,
and making shared bits a local header file.

Fixes: 36c5014e54 ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-28 21:59:39 +02:00
Peng Ma cea04f3d9a cpufreq: amd-pstate: fix memory leak on CPU EPP exit
The cpudata memory from kzalloc() in amd_pstate_epp_cpu_init() is
not freed in the analogous exit function, so fix that.

Signed-off-by: Peng Ma <andypma@tencent.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-16 10:38:07 +02:00
Perry Yuan bf202e654b cpufreq: amd-pstate: fix the highest frequency issue which limits performance
To address the performance drop issue, an optimization has been
implemented. The incorrect highest performance value previously set by the
low-level power firmware for AMD CPUs with Family ID 0x19 and Model ID
ranging from 0x70 to 0x7F series has been identified as the cause.

To resolve this, a check has been implemented to accurately determine the
CPU family and model ID. The correct highest performance value is now set
and the performance drop caused by the incorrect highest performance value
are eliminated.

Before the fix, the highest frequency was set to 4200MHz, now it is set
to 4971MHz which is correct.

CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ       MHZ
  0    0      0    0 0:0:0:0          yes 4971.0000 400.0000  400.0000
  1    0      0    0 0:0:0:0          yes 4971.0000 400.0000  400.0000
  2    0      0    1 1:1:1:0          yes 4971.0000 400.0000 4865.8140
  3    0      0    1 1:1:1:0          yes 4971.0000 400.0000  400.0000

Fixes: f3a0523918 ("cpufreq: amd-pstate: Enable amd-pstate preferred core support")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218759
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Gaha Bana <gahabana@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 13:56:00 +02:00
Perry Yuan 5c3fd1edaa cpufreq: amd-pstate: remove unused variable lowest_nonlinear_freq
removed the unused variable `lowest_nonlinear_freq` for build warning.
This variable was defined and assigned a value in the previous code,
but it was not used in the subsequent code.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404271038.em6nJjzy-lkp@intel.com/
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-30 12:23:06 +02:00
Perry Yuan 5131a3ca35 cpufreq: amd-pstate: fix code format problems
get some code format problems fixed in the amd-pstate driver.

Changes Made:

- Fixed incorrect comment format in the functions.

- Removed unnecessary blank line.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404271148.HK9yHBlB-lkp@intel.com/
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-30 12:23:06 +02:00
Perry Yuan eb8b6c3682 cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing
Add quirks table to get CPPC capabilities issue fixed by providing
correct perf or frequency values while driver loading.

If CPPC capabilities are not defined in the ACPI tables or wrongly
defined by platform firmware, it needs to use quick to get those
issues fixed with correct workaround values to make pstate driver
can be loaded even though there are CPPC capabilities errors.

The workaround will match the broken BIOS which lack of CPPC capabilities
nominal_freq and lowest_freq definition in the ACPI table.

$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_freq
0
$ cat /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_freq
0

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Perry Yuan 069a2bb8c4 cpufreq: amd-pstate: get transition delay and latency value from ACPI tables
Make pstate driver initially retrieve the P-state transition delay and
latency values from the BIOS ACPI tables which has more reasonable
delay and latency values according to the platform design and
requirements.

Previously there values were hardcoded at specific value which may
have conflicted with platform and it might not reflect the most
accurate or optimized setting for the processor.

[054h 0084   8]                Preserve Mask : FFFFFFFF00000000
[05Ch 0092   8]                   Write Mask : 0000000000000001
[064h 0100   4]              Command Latency : 00000FA0
[068h 0104   4]          Maximum Access Rate : 0000EA60
[06Ch 0108   2]      Minimum Turnaround Time : 0000

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Perry Yuan 2ddb8a3946 cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
The amd-pstate driver cannot work when the min_freq, nominal_freq or
the max_freq is zero. When this happens it is prudent to error out
early on rather than waiting failing at the time of the governor
initialization.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Gautham R. Shenoy 3cbbe8871a cpufreq: amd-pstate: Remove amd_get_{min,max,nominal,lowest_nonlinear}_freq()
amd_get_{min,max,nominal,lowest_nonlinear}_freq() functions merely
return cpudata->{min,max,nominal,lowest_nonlinear}_freq values.

There is no loss in readability in replacing their invocations by
accesses to the corresponding members of cpudata.

Do so and remove these helper functions.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Li Meng <li.meng@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Perry Yuan 5547c0ebfc cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq
Currently the amd_get_{min, max, nominal, lowest_nonlinear}_freq()
helpers computes the values of min_freq, max_freq, nominal_freq and
lowest_nominal_freq respectively afresh from
cppc_get_perf_caps(). This is not necessary as there are fields in
cpudata to cache these values.

To simplify this, add a single helper function named
amd_pstate_init_freq() which computes all these frequencies at once, and
caches it in cpudata.

Use the cached values everywhere else in the code.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Li Meng <li.meng@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Co-developed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Meng Li 8164f74332 cpufreq: amd-pstate: adjust min/max limit perf
The min/max limit perf values calculated based on frequency
may exceed the reasonable range of perf(highest perf, lowest perf).

Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-29 20:11:40 +01:00
Tor Vic b26ffbf800 cpufreq: amd-pstate: Fix min_perf assignment in amd_pstate_adjust_perf()
In the function amd_pstate_adjust_perf(), the 'min_perf' variable is set
to 'highest_perf' instead of 'lowest_perf'.

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Tor Vic <torvic9@mailbox.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-12 16:40:28 +01:00
Meng Li e571a5e206 cpufreq: amd-pstate: Update amd-pstate preferred core ranking dynamically
Preferred core rankings can be changed dynamically by the
platform based on the workload and platform conditions and
accounting for thermals and aging.
When this occurs, cpu priority need to be set.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-31 14:54:50 +01:00
Meng Li f3a0523918 cpufreq: amd-pstate: Enable amd-pstate preferred core support
amd-pstate driver utilizes the functions and data structures
provided by the ITMT architecture to enable the scheduler to
favor scheduling on cores which can be get a higher frequency
with lower voltage. We call it amd-pstate preferrred core.

Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable ITMT feature.
amd-pstate driver uses the highest performance value to indicate
the priority of CPU. The higher value has a higher priority.

The initial core rankings are set up by amd-pstate when the
system boots.

Add a variable hw_prefcore in cpudata structure. It will check
if the processor and power firmware support preferred core
feature.

Add one new early parameter `disable` to allow user to disable
the preferred core.

Only when hardware supports preferred core and user set `enabled`
in early parameter, amd pstate driver supports preferred core featue.

Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-31 14:54:50 +01:00
Mario Limonciello 22fb4f0419 cpufreq/amd-pstate: Fix setting scaling max/min freq values
Scaling min/max freq values were being cached and lagging a setting
each time.  Fix the ordering of the clamp call to ensure they work.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
Fixes: febab20cae ("cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wkarny@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-22 20:35:58 +01:00
Ayush Jain 142c169b31 cpufreq/amd-pstate: Only print supported EPP values for performance governor
show_energy_performance_available_preferences() to show only supported
values which is performance in performance governor policy.

-------Before--------
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
amd-pstate-epp
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power

-------After--------
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver
amd-pstate-epp
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference
performance
$ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences
performance

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Suggested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Ayush Jain <ayush.jain3@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29 22:04:15 +01:00
Wyes Karny febab20cae cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update
When amd_pstate is running, writing to scaling_min_freq and
scaling_max_freq has no effect. These values are only passed to the
policy level, but not to the platform level. This means that the
platform does not know about the frequency limits set by the user.

To fix this, update the min_perf and max_perf values at the platform
level whenever the user changes the scaling_min_freq and scaling_max_freq
values.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29 17:40:16 +01:00
Gautham R. Shenoy bb87be267b cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch()
cpufreq_driver->fast_switch() callback expects a frequency as a return
value. amd_pstate_fast_switch() was returning the return value of
amd_pstate_update_freq(), which only indicates a success or failure.

Fix this by making amd_pstate_fast_switch() return the target_freq
when the call to amd_pstate_update_freq() is successful, and return
the current frequency from policy->cur when the call to
amd_pstate_update_freq() is unsuccessful.

Fixes: 4badf2eb1e ("cpufreq: amd-pstate: Add ->fast_switch() callback")
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Cc: 6.4+ <stable@vger.kernel.org> # v6.4+
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-28 15:06:25 +01:00
Thomas Weißschuh 5e720f8c8c cpufreq: amd-pstate: fix global sysfs attribute type
In commit 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
the "amd_pstate" attributes where moved from a dedicated kobject to the
cpu root kobject.

While the dedicated kobject expects to contain kobj_attributes the root
kobject needs device_attributes.

As the changed arguments are not used by the callbacks it works most of
the time.
However CFI will detect this issue:

[ 4947.849350] CFI failure at dev_attr_show+0x24/0x60 (target: show_status+0x0/0x70; expected type: 0x8651b1de)
...
[ 4947.849409] Call Trace:
[ 4947.849410]  <TASK>
[ 4947.849411]  ? __warn+0xcf/0x1c0
[ 4947.849414]  ? dev_attr_show+0x24/0x60
[ 4947.849415]  ? report_cfi_failure+0x4e/0x60
[ 4947.849417]  ? handle_cfi_failure+0x14c/0x1d0
[ 4947.849419]  ? __cfi_show_status+0x10/0x10
[ 4947.849420]  ? handle_bug+0x4f/0x90
[ 4947.849421]  ? exc_invalid_op+0x1a/0x60
[ 4947.849422]  ? asm_exc_invalid_op+0x1a/0x20
[ 4947.849424]  ? __cfi_show_status+0x10/0x10
[ 4947.849425]  ? dev_attr_show+0x24/0x60
[ 4947.849426]  sysfs_kf_seq_show+0xa6/0x110
[ 4947.849433]  seq_read_iter+0x16c/0x4b0
[ 4947.849436]  vfs_read+0x272/0x2d0
[ 4947.849438]  ksys_read+0x72/0xe0
[ 4947.849439]  do_syscall_64+0x76/0xb0
[ 4947.849440]  ? do_user_addr_fault+0x252/0x650
[ 4947.849442]  ? exc_page_fault+0x7a/0x1b0
[ 4947.849443]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixes: 3666062b87 ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
Reported-by: Jannik Glückert <jannik.glueckert@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217765
Link: https://lore.kernel.org/lkml/c7f1bf9b-b183-bf6e-1cbb-d43f72494083@gmail.com/
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-07 19:41:48 +02:00
Mario Limonciello c88ad30e3f cpufreq: amd-pstate: Add a kernel config option to set default mode
Users are having more success with amd-pstate since the introduction
of EPP and Guided modes.  To expose the driver to more users by default
introduce a kernel configuration option for setting the default mode.

Users can use an integer to map out which default mode they want to use
in lieu of a kernel command line option.

This will default to EPP, but only if:
 1) The CPU supports an MSR.
 2) The system profile is identified
 3) The system profile is identified as a non-server by the FADT.

Link: https://gitlab.freedesktop.org/hadess/power-profiles-daemon/-/merge_requests/121
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Co-developed-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-21 18:44:56 +02:00
Mario Limonciello 32f80b9adf cpufreq: amd-pstate: Set a fallback policy based on preferred_profile
If a user's configuration doesn't explicitly specify the cpufreq
scaling governor then the code currently explicitly falls back to
'powersave'. This default is fine for notebooks and desktops, but
servers and undefined machines should default to 'performance'.

Look at the 'preferred_profile' field from the FADT to set this
policy accordingly.

Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt
Acked-by: Huang Rui <ray.huang@amd.com>
Suggested-by: Wyes Karny <Wyes.Karny@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-21 18:44:56 +02:00
Wyes Karny f4aad63930 cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated
amd-pstate passive mode driver is hyphenated. So make amd-pstate active
mode driver consistent with that rename "amd_pstate_epp" to
"amd-pstate-epp".

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-16 19:37:13 +02:00
Wyes Karny 217e67784e cpufreq: amd-pstate: Write CPPC enable bit per-socket
Currently amd_pstate sets CPPC enable bit in MSR_AMD_CPPC_ENABLE only
for the CPU where the module_init happened. But MSR_AMD_CPPC_ENABLE is
per-socket. This causes CPPC enable bit to set for only one socket for
servers with more than one physical packages. To fix this write
MSR_AMD_CPPC_ENABLE per-socket.

Also, handle duplicate calls for cppc_enable, because it's called from
per-policy/per-core callbacks and can result in duplicate MSR writes.

Before the fix:
amd@amd:~$ sudo rdmsr -a 0xc00102b1 | uniq --count
	192 0
    192 1

After the fix:
amd@amd:~$ sudo rdmsr -a 0xc00102b1 | uniq --count
    384 1

Suggested-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-16 19:30:47 +02:00
Wyes Karny 3bf8c6307b cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
Driver should update policy->cur after updating the frequency.
Currently amd_pstate doesn't update policy->cur when `adjust_perf`
is used. Which causes /proc/cpuinfo to show wrong cpu frequency.
Fix this by updating policy->cur with correct frequency value in
adjust_perf function callback.

- Before the fix: (setting min freq to 1.5 MHz)

[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
      1 cpu MHz         : 1777.016
      1 cpu MHz         : 1797.160
      1 cpu MHz         : 1797.270
    189 cpu MHz         : 400.000

- After the fix: (setting min freq to 1.5 MHz)

[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
      1 cpu MHz         : 1753.353
      1 cpu MHz         : 1756.838
      1 cpu MHz         : 1776.466
      1 cpu MHz         : 1776.873
      1 cpu MHz         : 1777.308
      1 cpu MHz         : 1779.900
    183 cpu MHz         : 1805.231
      1 cpu MHz         : 1956.815
      1 cpu MHz         : 2246.203
      1 cpu MHz         : 2259.984

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
[ rjw: Subject edits ]
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-25 19:35:13 +02:00
Wyes Karny 249b62c448 cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver
amd_pstate active mode driver is only compatible with static governors.
Therefore it doesn't need fast_switch functionality. Remove
fast_switch_possible flag from amd_pstate active mode driver.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24 19:39:16 +02:00
Gautham R. Shenoy 4badf2eb1e cpufreq: amd-pstate: Add ->fast_switch() callback
Schedutil normally calls the adjust_perf callback for drivers with
adjust_perf callback available and fast_switch_possible flag set.
However, when frequency invariance is disabled and schedutil tries to
invoke fast_switch. So, there is a chance of kernel crash if this
function pointer is not set. To protect against this scenario add
fast_switch callback to amd_pstate driver.

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24 19:39:16 +02:00
Linus Torvalds 556eb8b791 Driver core changes for 6.4-rc1
Here is the large set of driver core changes for 6.4-rc1.
 
 Once again, a busy development cycle, with lots of changes happening in
 the driver core in the quest to be able to move "struct bus" and "struct
 class" into read-only memory, a task now complete with these changes.
 
 This will make the future rust interactions with the driver core more
 "provably correct" as well as providing more obvious lifetime rules for
 all busses and classes in the kernel.
 
 The changes required for this did touch many individual classes and
 busses as many callbacks were changed to take const * parameters
 instead.  All of these changes have been submitted to the various
 subsystem maintainers, giving them plenty of time to review, and most of
 them actually did so.
 
 Other than those changes, included in here are a small set of other
 things:
   - kobject logging improvements
   - cacheinfo improvements and updates
   - obligatory fw_devlink updates and fixes
   - documentation updates
   - device property cleanups and const * changes
   - firwmare loader dependency fixes.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
 LEGadNS38k5fs+73UaxV
 =7K4B
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.4-rc1.

  Once again, a busy development cycle, with lots of changes happening
  in the driver core in the quest to be able to move "struct bus" and
  "struct class" into read-only memory, a task now complete with these
  changes.

  This will make the future rust interactions with the driver core more
  "provably correct" as well as providing more obvious lifetime rules
  for all busses and classes in the kernel.

  The changes required for this did touch many individual classes and
  busses as many callbacks were changed to take const * parameters
  instead. All of these changes have been submitted to the various
  subsystem maintainers, giving them plenty of time to review, and most
  of them actually did so.

  Other than those changes, included in here are a small set of other
  things:

   - kobject logging improvements

   - cacheinfo improvements and updates

   - obligatory fw_devlink updates and fixes

   - documentation updates

   - device property cleanups and const * changes

   - firwmare loader dependency fixes.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
  device property: make device_property functions take const device *
  driver core: update comments in device_rename()
  driver core: Don't require dynamic_debug for initcall_debug probe timing
  firmware_loader: rework crypto dependencies
  firmware_loader: Strip off \n from customized path
  zram: fix up permission for the hot_add sysfs file
  cacheinfo: Add use_arch[|_cache]_info field/function
  arch_topology: Remove early cacheinfo error message if -ENOENT
  cacheinfo: Check cache properties are present in DT
  cacheinfo: Check sib_leaf in cache_leaves_are_shared()
  cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
  cacheinfo: Add arm64 early level initializer implementation
  cacheinfo: Add arch specific early level initializer
  tty: make tty_class a static const structure
  driver core: class: remove struct class_interface * from callbacks
  driver core: class: mark the struct class in struct class_interface constant
  driver core: class: make class_register() take a const *
  driver core: class: mark class_release() as taking a const *
  driver core: remove incorrect comment for device_create*
  MIPS: vpe-cmp: remove module owner pointer from struct class usage.
  ...
2023-04-27 11:53:57 -07:00
Tom Rix 11fa52fe61 cpufreq: amd-pstate: Make varaiable mode_state_machine static
smatch reports
drivers/cpufreq/amd-pstate.c:907:25: warning: symbol
  'mode_state_machine' was not declared. Should it be static?

This variable is only used in one file so it should be static.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-11 20:47:20 +02:00
Wyes Karny 3ca7bc818d cpufreq: amd-pstate: Add guided mode control support via sysfs
amd_pstate driver's `status` sysfs entry helps to control the driver's
mode dynamically by user. After the addition of guided mode the
combinations of mode transitions have been increased (16 combinations).
Therefore optimise the amd_pstate_update_status function by implementing
a state transition table.

There are 4 states amd_pstate supports, namely: 'disable', 'passive',
'active', and 'guided'.  The transition from any state to any other
state is possible after this change.

Sysfs interface:

To disable amd_pstate driver:
 # echo disable > /sys/devices/system/cpu/amd_pstate/status

To enable passive mode:
 # echo passive > /sys/devices/system/cpu/amd_pstate/status

To change mode to active:
 # echo active > /sys/devices/system/cpu/amd_pstate/status

To change mode to guided:
 # echo guided > /sys/devices/system/cpu/amd_pstate/status

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-17 19:06:23 +01:00
Wyes Karny 2dd6d0ebf7 cpufreq: amd-pstate: Add guided autonomous mode
From ACPI spec below 3 modes for CPPC can be defined:

 1. Non autonomous: OS scaling governor specifies operating frequency/
    performance level through `Desired Performance` register and platform
    follows that.

 2. Guided autonomous: OS scaling governor specifies min and max
    frequencies/ performance levels through `Minimum Performance` and
    `Maximum Performance` register, and platform can autonomously select an
    operating frequency in this range.

 3. Fully autonomous: OS only hints (via EPP) to platform for the required
    energy performance preference for the workload and platform autonomously
    scales the frequency.

Currently (1) is supported by amd_pstate as passive mode, and (3) is
implemented by EPP support. This change is to support (2).

In guided autonomous mode the min_perf is based on the input from the
scaling governor. For example, in case of schedutil this value depends
on the current utilization. And max_perf is set to max capacity.

To activate guided auto mode ``amd_pstate=guided`` command line
parameter has to be passed in the kernel.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-17 19:06:23 +01:00
Greg Kroah-Hartman 3666062b87 cpufreq: amd-pstate: move to use bus_get_dev_root()
Direct access to the struct bus_type dev_root pointer is going away soon
so replace that with a call to bus_get_dev_root() instead, which is what
it is there for.

In doing so, remove the unneded kobject structure that was only being
created to cause a subdirectory for the attributes.  The name of the
attribute group is the correct way to do this, saving code and
complexity as well as allowing the attributes to properly show up to
userspace tools (the raw kobject would not allow that.)

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Huang Rui <ray.huang@.amd.com>
Link: https://lore.kernel.org/r/20230313182918.1312597-20-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17 15:30:07 +01:00
Nick Alcock fa0746b11b cpufreq: amd-pstate: remove MODULE_LICENSE in non-modules
Since commit 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations
are used to identify modules. As a consequence, uses of the macro
in non-modules will cause modprobe to misidentify their containing
object file as a module when it is not (false positives), and modprobe
might succeed rather than failing with a suitable error message.

So remove it in amd-pstate.c which cannot be built as a module.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
[ rjw: Subject and changelog adjustments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-23 20:01:23 +01:00
Kai-Heng Feng 7af78020e2 cpufreq: amd-pstate: Let user know amd-pstate is disabled
Commit 202e683df3 ("cpufreq: amd-pstate: add amd-pstate driver
parameter for mode selection") changed the driver to be disabled by
default, and this can surprise users.

Let users know what happened so they can decide what to do next.

Link: https://bugs.launchpad.net/bugs/2006942
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Yuan Perry <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-23 19:56:08 +01:00
Wyes Karny 6e9d12125f cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
`amd_pstate_set_epp` function uses `cppc_req_cached` and `epp` variable
to update the MSR_AMD_CPPC_REQ register for AMD MSR systems. The recent
commit 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized variable
use") changed the sequence of updating cppc_req_cached and writing the
MSR_AMD_CPPC_REQ. Therefore while switching from powersave to
performance governor and vice-versa in active mode MSR_AMD_CPPC_REQ is
set with the previous cached value. To fix this: first update the
`cppc_req_cached` variable and then call `amd_pstate_set_epp` function.

 - Before commit 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized
   variable use"):

With powersave governor:
[    1.652743] amd_pstate_epp_init: writing to cppc_req_cached = 0x1eff
[    1.652744] amd_pstate_set_epp: writing cppc_req_cached = 0x1eff
[    1.652746] amd_pstate_set_epp: writing min_perf = 30, des_perf = 0, max_perf = 255, epp = 0

Changing to performance governor:
[  300.493842] amd_pstate_epp_init: writing to cppc_req_cached = 0xffff
[  300.493846] amd_pstate_set_epp: writing cppc_req_cached = 0xffff
[  300.493847] amd_pstate_set_epp: writing min_perf = 255, des_perf = 0, max_perf = 255, epp = 0

 - After commit 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized
   variable use"):

With powersave governor:
[    1.646037] amd_pstate_set_epp: writing cppc_req_cached = 0xffff
[    1.646038] amd_pstate_set_epp: writing min_perf = 255, des_perf = 0, max_perf = 255, epp = 0
[    1.646042] amd_pstate_epp_init: writing to cppc_req_cached = 0x1eff

Changing to performance governor:
[  687.117401] amd_pstate_set_epp: writing cppc_req_cached = 0x1eff
[  687.117405] amd_pstate_set_epp: writing min_perf = 30, des_perf = 0, max_perf = 255, epp = 0
[  687.117419] amd_pstate_epp_init: writing to cppc_req_cached = 0xffff

 - After this fix:

With powersave governor:
[    2.525717] amd_pstate_epp_init: writing to cppc_req_cached = 0x1eff
[    2.525720] amd_pstate_set_epp: writing cppc_req_cached = 0x1eff
[    2.525722] amd_pstate_set_epp: writing min_perf = 30, des_perf = 0, max_perf = 255, epp = 0

Changing to performance governor:
[ 3440.152468] amd_pstate_epp_init: writing to cppc_req_cached = 0xffff
[ 3440.152473] amd_pstate_set_epp: writing cppc_req_cached = 0xffff
[ 3440.152474] amd_pstate_set_epp: writing min_perf = 255, des_perf = 0, max_perf = 255, epp = 0

Fixes: 7cca9a9851 ("cpufreq: amd-pstate: avoid uninitialized variable use")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-15 15:58:07 +01:00
Arnd Bergmann 7cca9a9851 cpufreq: amd-pstate: avoid uninitialized variable use
The new epp support causes warnings about three separate
but related bugs:

1) failing before allocation should just return an error:

drivers/cpufreq/amd-pstate.c:951:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (!dev)
            ^~~~
drivers/cpufreq/amd-pstate.c:1018:9: note: uninitialized use occurs here
        return ret;
               ^~~

2) wrong variable to store return code:

drivers/cpufreq/amd-pstate.c:963:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (rc)
            ^~
drivers/cpufreq/amd-pstate.c:1019:9: note: uninitialized use occurs here
        return ret;
               ^~~
drivers/cpufreq/amd-pstate.c:963:2: note: remove the 'if' if its condition is always false
        if (rc)
        ^~~~~~~

3) calling amd_pstate_set_epp() in cleanup path after determining
that it should not be called:

drivers/cpufreq/amd-pstate.c:1055:6: error: variable 'epp' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (cpudata->epp_policy == cpudata->policy)
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/cpufreq/amd-pstate.c:1080:30: note: uninitialized use occurs here
        amd_pstate_set_epp(cpudata, epp);
                                    ^~~

All three are trivial to fix, but most likely there are additional bugs
in this function when the error handling was not really tested.

Fixes: ffa5096a7c ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Yuan Perry <Perry.Yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 20:21:42 +01:00
Uwe Kleine-König dd329e1e21 cpufreq: Make cpufreq_unregister_driver() return void
All but a few drivers ignore the return value of
cpufreq_unregister_driver(). Those few that don't only call it after
cpufreq_register_driver() succeeded, in which case the call doesn't
fail.

Make the function return no value and add a WARN_ON for the case that
the function is called in an invalid situation (i.e. without a previous
successful call to cpufreq_register_driver()).

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com> # brcmstb-avs-cpufreq.c
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-09 20:19:18 +01:00
Perry Yuan 3ec32b6d17 cpufreq: amd-pstate: convert sprintf with sysfs_emit()
replace the sprintf with a more generic sysfs_emit function

No intended potential function impact

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:42 +01:00
Perry Yuan abd61c08ef cpufreq: amd-pstate: add driver working mode switch support
While amd-pstate driver was loaded with specific driver mode, it will
need to check which mode is enabled for the pstate driver,add this sysfs
entry to show the current status

$ cat /sys/devices/system/cpu/amd-pstate/status
active

Meanwhile, user can switch the pstate driver mode with writing mode
string to sysfs entry as below.

Enable passive mode:
$ sudo bash -c "echo passive >  /sys/devices/system/cpu/amd-pstate/status"

Enable active mode (EPP driver mode):
$ sudo bash -c "echo active > /sys/devices/system/cpu/amd-pstate/status"

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:42 +01:00
Perry Yuan 50ddd2f782 cpufreq: amd-pstate: implement suspend and resume callbacks
add suspend and resume support for the AMD processors by amd_pstate_epp
driver instance.

When the CPPC is suspended, EPP driver will set EPP profile to 'power'
profile and set max/min perf to lowest perf value.
When resume happens, it will restore the MSR registers with
previous cached value.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:42 +01:00
Perry Yuan d4da12f803 cpufreq: amd-pstate: implement amd pstate cpu online and offline callback
Adds online and offline driver callback support to allow cpu cores go
offline and help to restore the previous working states when core goes
back online later for EPP driver mode.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Tested-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-03 21:59:41 +01:00