update smu v13.0.6 message to allow guest driver set gfx clock.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
voltage_parameters is a point to a struct of type
SET_VOLTAGE_PARAMETERS_V1_3. Passing just voltage_parameters would
not print the right size of the struct variable. So we need to pass
*voltage_parameters to sizeof().
Fixes: 4630d5031c ("drm/amdgpu: check PS, WS index")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu:
- DSC fixes
- DC resource pool fixes
- OTG fix
- DML2 fixes
- Aux fix
- GFX10 RLC firmware handling fix
- Revert a broken workaround for SMU 13.0.2
- DC writeback fix
- Enable gfxoff when ROCm apps are active on gfx11 with the proper FW version
amdkfd:
- Fix dma-buf exports using GEM handles
nouveau:
- fix a unneeded WARN_ON triggering
xe:
- Fix for definition of wakeref_t
- Fix for an error code aliasing
- Fix for VM_UNBIND_ALL in the case there are no bound VMAs
- Fixes for a number of __iomem address space mismatches reported by sparse
- Fixes for the assignment of exec_queue priority
- A Fix for skip_guc_pc not taking effect
- Workaround for a build problem on GCC 11
- A couple of fixes for error paths
- Fix a Flat CCS compression metadata copy issue
- Fix a misplace array bounds checking
- Don't have display support depend on EXPERT (as discussed on IRC)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmWqGdoACgkQDHTzWXnE
hr5tiA//ZIos/mK+70JprAhJkXN/Lo5IDBOsldDQ1BkakkVLU1taHIsrER6iDT8g
WmDzuC4ZIkHZyJ1V8zcIZ4wjE+sIOUeje0fSuMRgwPD+rrdn3WjUiAuvofuxQ5fD
LFW+O/9hzTl3xBoxidPdupf33WGRMAKhNuYvhwfH14LaqNDVVdAHU7MPTORmFsyY
CbCOLze5dwAmlB4rk+LDsO0gFtXjB7/Ewg2vHUzlEmaYmPpRxu9MHCpadQGq2Sal
nwxwI5lb8DqR8jbel3pA0/kz06EdSKKr20YTdw+RVrp/tPqDq4njkZfJMPK+2VYf
VpSPnGJOvvdtehCrKBBOBK4WRZWgbTUSuxrjgtIy+fp9aWt84NOEG2xDk1W+qWHS
sqp5PeFb3meK1bsCdBBVSjj1fKApKmzJWqiCr0Z8dzV8QsZWSpPzhyfp3puxKESV
dORGhMWdtdTMSPRysDUiTMGxn+fxaFTTx9Jsd0LbLizf/sob+ol9EjbNihaa5aKM
FdHpUgO08X3OAppmcGmGQdIXG2K+TmKMsvguRAK98OVYXhVUpB/BrCGnd9eGyjYY
mSQWrwKGm+Y/dGnr+ylHKxcyRQMmhc33gHmPAItNdMC8OemiHWZKKzoKBOzT2aME
pZc6ZRJesG46bfrNqID6ASAIw6SEA+Zj3rk3QsWNPAbHVJcZboY=
=LInS
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
"This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix
thrown in.
The amdgpu ones are just the usual couple of weeks of fixes. The xe
ones are bunch of cleanups for the new xe driver, the fix you put in
on the merge commit and the kconfig fix that was hiding the problem
from me.
amdgpu:
- DSC fixes
- DC resource pool fixes
- OTG fix
- DML2 fixes
- Aux fix
- GFX10 RLC firmware handling fix
- Revert a broken workaround for SMU 13.0.2
- DC writeback fix
- Enable gfxoff when ROCm apps are active on gfx11 with the proper FW
version
amdkfd:
- Fix dma-buf exports using GEM handles
nouveau:
- fix a unneeded WARN_ON triggering
xe:
- Fix for definition of wakeref_t
- Fix for an error code aliasing
- Fix for VM_UNBIND_ALL in the case there are no bound VMAs
- Fixes for a number of __iomem address space mismatches reported by
sparse
- Fixes for the assignment of exec_queue priority
- A Fix for skip_guc_pc not taking effect
- Workaround for a build problem on GCC 11
- A couple of fixes for error paths
- Fix a Flat CCS compression metadata copy issue
- Fix a misplace array bounds checking
- Don't have display support depend on EXPERT (as discussed on IRC)"
* tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits)
nouveau/vmm: don't set addr on the fail path to avoid warning
drm/amdgpu: Enable GFXOFF for Compute on GFX11
drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests.
drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
drm/amdkfd: init drm_client with funcs hook
drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state()
drm/amdgpu: Fix the null pointer when load rlc firmware
drm/amd/display: Align the returned error code with legacy DP
drm/amd/display: Fix DML2 watermark calculation
drm/amd/display: Clear OPTC mem select on disable
drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
drm/amd/display: Add logging resource checks
drm/amd/display: Init link enc resources in dc_state only if res_pool presents
drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
drm/amd/display: Avoid enum conversion warning
drm/amd/pm: Fix smuv13.0.6 current clock reporting
drm/amd/pm: Add error log for smu v13.0.6 reset
drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
drm/amdgpu: drop exp hw support check for GC 9.4.3
drm/amdgpu: move debug options init prior to amdgpu device init
...
(controllers set the flag, but there is no client to use it). Also,
CLASS_SPD support gets simplified to prepare removal in the future.
Class based instantiation is not recommended these days anyhow.
Furthermore, I2C core now creates a debugfs directory per I2C adapter.
Current bus driver users were converted to use it. Then, there are also
quite some driver updates. Standing out are patches for the wmt-driver
which is refactored to support more variants. This is the rebased pull
request where a large series for the designware driver was dropped.
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmWph0UPHHdzYUBrZXJu
ZWwub3JnAAoJEBQN5MwUoCm2kbIQAJotSmX0mM+nNPReYCMMiloxoxUwgpiErNwY
WDrYQSezthAJ1LDsGOEeLcE4f4I+UcUHBO1BoERtOZg3cGtE0Ii5N845sp100S9O
ktyaKS5utoErymThWFFrnZX60/8yKXUMzZmNzy96560gPcxbFyyyVhKfBSPzK9T+
O8CGu7GRNqgWHlvH3yqGeCbreWYrYVSrluEpBu6807cp3zDxrU+autOnsewm5+md
ka3DdqrbxJSblYK8fJKESAUgkRmZgYKbgl0iiCuqX+ib6I4OA3Z68ny7dl0fY3Ws
vwt7d88SaBKDdJmUZyb/sm4aJsW69GN+ECZolxrn4TIw45k4tes2s6Ma5+TV3E9h
Fd1RuqduFEqQ7cj31UPe2x8rgj5Fo5nbjCWxdZv+/3zF8+cHwi8iwkp2PScsPCsa
fmCdehUE5DrgobsRNANe6XJzxY5wp2VNpGEWKeaQz2Z0/d9T1YFS7a8aewvhXoPC
isZboi6GQh2XoE8UgGJa29VUuaIkUW513DwCGw8mz1yKN+kHGcsRXXjkjaZoQn3U
MMvh/zkI2Hpy/m2R8PWeIq5XhLJvmlZ19JJzUHJIjXh9Fn9EVtXhlUleh6mzMfeM
n8NOg7Eukep2sBgmaufkUKz2Jtogs59YDSXZEvqJjIkPM2Wi0hA18Qj+pilES1ff
3ckk3mxY
=8D3Q
-----END PGP SIGNATURE-----
Merge tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"This removes the currently unused CLASS_DDC support (controllers set
the flag, but there is no client to use it).
Also, CLASS_SPD support gets simplified to prepare removal in the
future. Class based instantiation is not recommended these days
anyhow.
Furthermore, I2C core now creates a debugfs directory per I2C adapter.
Current bus driver users were converted to use it.
Finally, quite some driver updates. Standing out are patches for the
wmt-driver which is refactored to support more variants.
This is the rebased pull request where a large series for the
designware driver was dropped"
* tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
MAINTAINERS: use proper email for my I2C work
i2c: stm32f7: add support for stm32mp25 soc
i2c: stm32f7: perform I2C_ISR read once at beginning of event isr
dt-bindings: i2c: document st,stm32mp25-i2c compatible
i2c: stm32f7: simplify status messages in case of errors
i2c: stm32f7: perform most of irq job in threaded handler
i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq
i2c: i801: Add lis3lv02d for Dell XPS 15 7590
i2c: i801: Add lis3lv02d for Dell Precision 3540
i2c: wmt: Reduce redundant: REG_CR setting
i2c: wmt: Reduce redundant: function parameter
i2c: wmt: Reduce redundant: clock mode setting
i2c: wmt: Reduce redundant: wait event complete
i2c: wmt: Reduce redundant: bus busy check
i2c: mux: reg: Remove class-based device auto-detection support
i2c: make i2c_bus_type const
dt-bindings: at24: add ROHM BR24G04
eeprom: at24: use of_match_ptr()
i2c: cpm: Remove linux,i2c-index conversion from be32
i2c: imx: Make SDA actually optional for bus recovering
...
I2C_CLASS_SPD was used to expose the EEPROM content to user space,
via the legacy eeprom driver. Now that this driver has been removed,
we can remove I2C_CLASS_SPD support. at24 driver with explicit
instantiation should be used instead.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Theoretically, it would be possible for a buggy or malicious VBIOS to
overwrite past the bounds of the passed parameters (or its own
workspace); add bounds checking to prevent this from happening.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093
Signed-off-by: Alexander Richards <electrodeyt@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Separate deferred error from UE and CE and log it
individually.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add aca smu backend support for smu v13.0.6.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When current clock is equal to max dpm level clock, the level is not
indicated correctly with *. Fix by comparing current clock against dpm
level value.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x
In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect
structure type is passed to sizeof() in kzalloc, larger structure types
were used, thus using correct type 'struct phm_ppm_table' fixes the
below:
drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table vs _ATOM_Tonga_PPM_Table'
Cc: Eric Huang <JinHuiEric.Huang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On APUs power is SoC power, not just GPU.
Clarify that for UVD/VCE/VCN the IP is powered down,
not disabled which can confusing and lead to concerns
that the IP is actually not available.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hawaii, Bonaire, Fiji, and Tonga support average power, the others
support current power.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The hwmgr->backend, (i.e. data) allocated by kzalloc is not freed in
the error-handling paths of smu7_get_evv_voltages and
smu7_update_edc_leakage_table. However, it did be freed in the
error-handling of phm_initializa_dynamic_state_adjustment_rule_settings,
by smu7_hwmgr_backend_fini. So the lack of free in smu7_get_evv_voltages
and smu7_update_edc_leakage_table is considered a memleak in this patch.
Fixes: 599a7e9fe1 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.")
Fixes: 8f0804c6b7 ("drm/amd/pm: add edc leakage controller setting")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Expose sysfs entry mem_busy_percent for GC version
9.4.3 APU system
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use gpu_metrics_v1_5 for SMUv13.0.6 to fill
gpu metric info
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add new gpu_metrics_v1_5 to acquire vcn/jpeg activity
& pcie nak error counters
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update pmfw metric table to include vcn & jpeg
activity for smu_v_13_0_6
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use separate metric table for APU and Non APU
systems for smu_v_13_0_6 to get metric data
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It is reported that on a Topaz dGPU the kernel emits:
amdgpu: can't get the mac of 5
This is because there is no definition for max levels of VDDGFX
declared for SMU71 or SMU7. The correct definition is VDDC so
use this.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3049
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is a repeated define of smu v14_0_0 driver if version, so delete
one in driver if header.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The amdgpu_free_extended_power_table is called in every error-handling
paths of amdgpu_parse_extended_power_table. However, after the following
call chain of returning:
amdgpu_parse_extended_power_table
|-> kv_dpm_init / si_dpm_init
(the only two caller of amdgpu_parse_extended_power_table)
|-> kv_dpm_sw_init / si_dpm_sw_init
(the only caller of kv_dpm_init / si_dpm_init, accordingly)
|-> kv_dpm_fini / si_dpm_fini
(goto dpm_failed in xx_dpm_sw_init)
|-> amdgpu_free_extended_power_table
As above, the amdgpu_free_extended_power_table is called twice in this
returning chain and thus a double-free is triggered. Similarily, the
last kfree in amdgpu_parse_extended_power_table also cause a double free
with amdgpu_free_extended_power_table in kv_dpm_fini.
Fixes: 84176663e7 ("drm/amd/pm: create a new holder for those APIs used only by legacy ASICs(si/kv)")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When ps allocated by kzalloc equals to NULL, kv_parse_power_table
frees adev->pm.dpm.ps that allocated before. However, after the control
flow goes through the following call chains:
kv_parse_power_table
|-> kv_dpm_init
|-> kv_dpm_sw_init
|-> kv_dpm_fini
The adev->pm.dpm.ps is used in the for loop of kv_dpm_fini after its
first free in kv_parse_power_table and causes a use-after-free bug.
Fixes: a2e73f56fa ("drm/amdgpu: Add support for CIK parts")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When the allocation of
adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries fails,
amdgpu_free_extended_power_table is called to free some fields of adev.
However, when the control flow returns to si_dpm_sw_init, it goes to
label dpm_failed and calls si_dpm_fini, which calls
amdgpu_free_extended_power_table again and free those fields again. Thus
a double-free is triggered.
Fixes: 841686df9f ("drm/amdgpu: add SI DPM support (v4)")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add power save mode workload for smu 13.0.10, so that in compute mode,
pmfw will add margin since some applications requres higher margin.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
pm supports return vpe clock table and soc clock table
Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of software managed counters.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fulfill the SMU13.0.7 support for Wifi RFI mitigation feature.
--
v10->v11:
- downgrade the prompt level on message failure(Lijo)
v13:
- Fix the format issue (IIpo Jarvinen)
- Remove duplicate code (IIpo Jarvinen)
Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fulfill the SMU13.0.0 support for Wifi RFI mitigation feature.
--
v10->v11:
- downgrade the prompt level on message failure(Lijo)
v13:
- Fix the format issue (IIpo Jarvinen)
- Move function smu_v13_0_0_set_wbrf_exclusion_ranges to
smu_v13_0.c as a generic code for later use (IIpo Jarvinen)
Co-developed-by: Evan Quan <quanliangl@hotmail.com>
Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To protect PMFW from being overloaded.
Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With WBRF feature supported, as a driver responding to the frequencies,
amdgpu driver is able to do shadow pstate switching to mitigate possible
interference(between its (G-)DDR memory clocks and local radio module
frequency bands used by Wifi 6/6e/7).
--
v1->v2:
- update the prompt for feature support(Lijo)
v8->v9:
- update parameter document for smu_wbrf_event_handler(Simon)
v9->v10:
v10->v11:
- correct the logics for wbrf range sorting(Lijo)
v13:
- Fix the format issue (IIpo Jarvinen)
Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add those data structures to support Wifi RFI mitigation feature.
Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove redundant functions members of pptable_funcs and change
the function type as static because they are not called by other
files.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace direct usage of adev->ip_versions with amdgpu_ip_version.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix pp_dpm_sclk_od and pp_dpm_mclk_od typos.
Those were defined as pp_*clk_od but used as pp_dpm_*clk_od instead.
This change removes the _dpm part.
Fixes: 8cfd6a0575 ("drm/amd/pm: Hide irrelevant pm device attributes")
Signed-off-by: Dmitrii Galantsev <dmitrii.galantsev@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Increment the driver if version and add new mems to the mertics table.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When kzalloc() for smu_table->ecc_table fails, we should free
the previously allocated resources to prevent memleak.
Fixes: edd7942085 ("drm/amd/pm: add message smu to get ecc_table v2")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Refactor code such that ras block decides the default mca debug mode,
and not swsmu block.
By default mca debug mode is set to false.
v2: squash in uninitialized value fix (Alex)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The smu needs to get the rlc power down message to sync the rlc state
with smu, the rlc state updating message need to be sent at while smu
begin suspend sequence , otherwise SMU will crash while RLC state is not
notified by driver, and rlc state probally changed after that
notification, so it needs to notify rlc state to smu at the end of the
suspend sequence in amdgpu_device_suspend() that can make sure the rlc
state is correctly set to SMU.
[ 101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[ 101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff!
[ 110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
[ 110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features.
[ 110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features!
[ 110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
[ 110.884394] PM: suspend of devices aborted after 21213.620 msecs
[ 110.884402] PM: start suspend of devices aborted after 21213.882 msecs
[ 110.884405] PM: Some devices failed to suspend, or early wake event detected
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add sysfs attribute to read power management metrics. A snapshot is
captured to the buffer when the attribute is read.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support to fetch PM metrics sample from SMU v13.0.6
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add API support to fetch a snapshot of power management metrics from PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
correct mca ipid die/socket/addr decode
v2: squash in fix from Yang
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
smu_v13_0_baco_set_armd3_sequence is not used by other files, so
make it as static type.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use generic functions and remove the duplicate code
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the return value and drop redundant parameter
of get_asic_baco_capability function.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to notify about unload during reset. Also remove the FW version
check.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fill PCIE error counters & instantaneous bandwidth
in gpu metrics v1_4 for smu v_13_0_6
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update pmfw metric table to include pcie
instantaneous bandwidth & pcie error counters
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add pcs xgmi ras error query support for smu v13.0.6.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
disable mca debug mode for smu v13.0.6 by default.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
refine smu mca driver to support query ras error from pmfw path.
- correct gfx smu bank hwid (from mp5 to smu bank)
- retire unused callback function in amdgpu_mca_smu_funcs{}
- add new mca_bank_set{} structure to collect mca bank
- move enum mca_reg_idx into amdgpu_mca.h header
- add mca status register field decode macro
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Resolves Sphinx unexpected indentation warning when compiling
documentation (e.g. `make htmldocs`). Replaces tabs with spaces and adds
a literal block to keep vertical formatting of the
example power state list.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> (v2)
Acked-by: Randy Dunlap <rdunlap@infradead.org> (v2)
Signed-off-by: Hunter Chasens <hunter.chasens18@ncf.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The DS clock may exceed the limit as sclk dfll divider is 16
to target freq.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hide PCIe DPM attribute on SOCs with GC v9.4.2 and GC v9.4.3.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For IMU enabled APUs, after sending the PrepareMp1ForUnload message
to SMU in system_features_control, the RLC registers can't be touched.
The driver to stop the rlc in suspending is no longer required.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change return code to EOPNOTSUPP for unsupported functions. Use the
error code information to hide sysfs nodes not valid for the SOC.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
dorp fw version check and using max table size to init table.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update driver if headers and metrics table in smu v14_0_0 after smu fw promotion.
Drop the legacy metrics table and add warning of checking pmfw version.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current code checks sriov vf flag multiple times when creating
hwmon sysfs. So fix it.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This was fixed in PMFW before launch and is no longer
required.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
For pptable structs that use flexible array sizes, use flexible arrays.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Return 0 as the default min power limit for the asics use
powerplay.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MACO only works if BACO is supported
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
fix the high voltage and temperature issue after the driver is unloaded on smu 13.0.0,
smu 13.0.7 and smu 13.0.10
v2 - fix the code format and make sure it is used on the unload case only.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes warnings:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:286:45:
warning: '%s' directive output may be truncated writing up to 29 bytes
into a region of size 23 [-Wformat-truncation=]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:286:52:
warning: '%s' directive output may be truncated writing up to 29 bytes
into a region of size 23 [-Wformat-truncation=]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:72:45: warning:
'%s' directive output may be truncated writing up to 29 bytes into a
region of size 23 [-Wformat-truncation=]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:72:52: warning:
'%s' directive output may be truncated writing up to 29 bytes into a
region of size 23 [-Wformat-truncation=]
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Modify the print format of the fractional part to avoid display error.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In SR-IOV environment, the value of pcie_table->num_of_link_levels will
be 0, and num_of_levels - 1 will cause array index out of bounds
Signed-off-by: Lin.Cao <lincao12@amd.com>
Acked-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
is_mode1_reset_supported may be called before smu init, when smu_context
is unitialized in driver load/unload test. Call smu_cmn_get_smc_version
explicitly in is_mode1_reset_supported.
v2: apply to aldebaran in case is_mode1_reset_supported will be
uncommented (Candice Li)
Fixes: 710d9caec7 ("drm/amd/pm: drop most smu_cmn_get_smc_version in smu")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rather than individual ASICs checking for the quirk, set the quirk at the
driver level.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the return value in default case and drop
redundant 'break'.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PMFW will handle the features disablement properly for gpu reset case,
driver involvement may cause some unexpected issues.
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Call amdgpu_ras_set_mca_debug_mode when we set mca debug mode in smu
v13_0_6.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The incoming strings might not be terminated by a newline
or a 0.
(found while testing a program that just wrote the string
itself, causing a crash)
Cc: stable@vger.kernel.org
Fixes: e3933f26b6 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs")
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support for getting power1_cap_min value on smu13 and smu11.
For other Asics, we still use 0 as the default value.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing free on an error path.
Fixes: 511a95552e ("drm/amd/pm: Add SMU 13.0.6 support")
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Kunwu.Chan <chentao@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If one of the devices in the hive detects a
fatal error, need to send ras recovery reset
message to PMFW of all devices in the hive.
For that add a flag in hive to indicate that
it's undergoing ras recovery
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the PMFW version check the the ROCm optimizations.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update driver if, pmfw and ppsmc header files.
Add new gpu_metrics_v3_0 for metrics table updated in driver if
and reserve legacy metrics table to maintain backward compatibility.
---
v1:
Update header files and add gpu_metrics_v3_0.
v2:
Update smu_types.h, smu headers and drop smu_cmn_get_smc_version in smu v14_0_0.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add P2S table load support on SMU v13.0.6 ASICs.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
smu_check_fw_version is called in smu hw init, thus smu if version
and version are garenteed to be stored in smu context. No need to
call smu_cmn_get_smc_version again after system boot up.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add reset option for fan_ctrl interfaces on the smu v13.0.7
User can use command "echo r > interface_name" to reset the
interface to boot value
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add reset option for fan_ctrl interfaces.
For example:
User can use the "echo r > acoustic_limit_rpm_threshold" command
to reset acoustic_limit_rpm_threshold to boot value
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Power up/down UMSCH by SMU.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add ppt callback to power up/down UMSCH.
v2: squash in updates (Alex)
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add initial swSMU support for smu 14 series ASIC.
v2: squash in build fixes and updates (Li Ma)
fix warnings (Alex)
v3: squash in updates (Alex)
v4: squash in updates (Alex)
v5: squash in avg/current power updates (Alex)
Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add initial smu v14_0_0 pmfw if file
v2: squash in updates (Alex)
Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add initial smu v14_0_0 ppsmc file
v2: squash in updates (Alex)
v3: squash in updates (Alex)
Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add initial smu v14_0_0 driver if file
v2: squash in updates (Alex)
v3: update interface (Alex)
Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use gpu_metrics_v1_4 for SMUv13.0.6 to fill
gpu metric info
v3: Removed filling gpu metric instantaneous
pcie bw
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update pmfw metric table to include xgmi transfer
data and pci instantaneous bandwidth for smu v13_0_6
v2:
Updated metric table version
v3: Removed inst pcie bw with alignment to metrics table
version 8
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wait for completion of sending the EnableGfxImu message
when using the PSP FW loading.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Keep FRU related information together in a separate structure.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When ROCm is active enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.
v2: move to the swsmu code since we need both bits active in
the workload mask.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For pptable structs that use flexible array sizes, use flexible arrays.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For pptable structs that use flexible array sizes, use flexible arrays.
Suggested-by: Felix Held <felix.held@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove set df cstate as disallow df state is
not required for SMUv13.0.6
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct smu10_voltage_dependency_table.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Evan Quan <evan.quan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Xiaojian Du <Xiaojian.Du@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-1-keescook@chromium.org
Publish max operating temperature of SOC and memory as temp*_emergency
nodes in hwmon. temp*_crit will show the throttle temperature limits.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CTF limit represents the max operating temperature and thermal limit
gives the limit at which throttling starts. Add support for both limits.
SOC and HBM may have different limit values.*_emergency_max gives max
operating temperature and *_crit_max value represents throttle limit.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Several files declare MIN() or MAX() macros that ignore the types of the
values being compared. Drop these macros and switch to min() min_t(),
and max() from `linux/minmax.h`.
Suggested-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The matching values for `pcie_gen_cap` and `pcie_width_cap` when
fetched from powerplay tables are 1 byte, so narrow the arguments
to match to ensure min() and max() comparisons without casts.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
While aligning SMU11 with SMU13 implementation an assumption was made that
`dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization
like in SMU13 but it isn't.
So restore some of the original logic and instead just check for
amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode
values; erring on the side of performance.
Cc: stable@vger.kernel.org # 6.1+
Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382
Fixes: e701156ccc ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
"ret" was checked earlier inside the loop, so we know it is zero here.
No need to check a second time.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
disable pp_power_profile_mode for sriov on gc11.0.3 as not supported
by smu
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pass the correct size to smu_v13_0_6_print_clks, otherwise
the same place in buf will be re-written.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace with set_plpd_mode uniformly for places to use.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The allow_xgmi_power_down(true/false) will be generally replaced by:
- allow: select_xgmi_plpd_policy(XGMI_PLPD_DEFAULT)
- disallow: select_xgmi_plpd_policy(XGMI_PLPD_DISALLOW)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Assign DEFAULT mode if it supports plpd, otherwise keeps NONE
v2: reduce ip version checks
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the interface to change xgmi per-link power down policy.
v2: split from sysfs interface code and miscellaneous updates
v3: check against XGMI_PLPD_DEFAULT/XGMI_PLPD_OPTIMIZED and
pass PPSMC param
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add enum pp_xgmi_plpd_mode to describe PLPD policies.
v2: move the enum from amdgpu_smu.h to kgd_pp_interface.h
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To add message to select PLPD mode.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add reset option for fan_curve.
User can use command "echo r > fan_cure" to reset the fan_curve
to boot value
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Selectively updating feature mask is not supported in SMU v13.0.6.
Remove the callback corresponding to that.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of neglecting fractional part, round the Q10 format values in
SMU v13.0.6 metrics table.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
update smu header to support mca dump interface.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
pp_dpm_*clk nodes also could show the frequencies when a clock is in
'sleep' state. Add documentation related to that.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For SMU v13.0.6, keep GFX deep sleep clock reporting style consistent
with that of other clocks. Sample format below.
S: 78Mhz *
0: 600Mhz
1: 800Mhz
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On SMU v13.0.6, effective clocks are reported by FW which won't exactly
match with DPM level. Report the current clock based on the values
matching closest to the effective clock. Also, when deep sleep is
applied to a clock, report it with a special level "S:" as in sample
clock levels below
S: 19Mhz *
0: 615Mhz
1: 800Mhz
2: 888Mhz
3: 1000Mhz
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use an inline function for version check. Gives more flexibility to
handle any format changes.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This loop will exit with "retry" set to -1 if it fails but the code
checks for if "retry" is zero. Fix this by changing post-op to a
pre-op. --retry vs retry--.
Fixes: e01eeffc3f ("drm/amd/pm: avoid driver getting empty metrics table for the first time")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This loop will exit with "retry" set to -1 if it fails but the code
checks for if "retry" is zero. Fix this by changing post-op to a
pre-op. --retry vs retry--.
Fixes: e01eeffc3f ("drm/amd/pm: avoid driver getting empty metrics table for the first time")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1:
enable smu_v13_0_6 mca debug mode when UMC RAS feature is enabled.
v2:
use amdgpu_ras_is_supported() helper function instead bitmask check.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
update smu firmware header to support smu mca debug feature.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
split switch statement into two and consolidate the common
code for printing most of the types of clock speeds
Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use variables to remove ternary expression in print statement
and improve readability. This will help to optimize the code
duplication in the switch statement
Also Changed:
replaced single_dpm_table->count as iterator in for loops
with safer clocks_num_levels value
replaced dpm_table.value usage with local var clocks_mhz
Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use variables to remove the multiple nested ternary expressions
and improve readability. This will help to optimize the code
duplication in the switch statement
Also Changed:
Modify function aldebaran_get_clk_table to void function as it
always returns 0
Use const string "attempt_string" to cut down on repetition
Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace print_clock_levels with emit_clock_levels for aldebaran
* replace .print_clk_levels with .emit_clk_levels in aldebaran_ppt_funcs
* added extra parameter int *offset
* removed var size, uses arg *offset instead
* removed call to smu_cmn_get_sysfs_buf
* errors are returned to caller
* returns 0 on success
additional incidental changes
* changed type of vars i, now to remove comparing mismatch types
* renamed var s/now/cur_value/
* switch statement default now returns -EINVAL
* RAS Recovery returns -EBUSY
Based on
commit b06b48d7dd ("amdgpu/pm: Implement emit_clk_levels for navi10")
Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If vcn is disabled in kernel parameters, don't touch vcn,
otherwise it may cause vcn hang.
v2: delete unnecessary logs
v3: move "is_vcn_enabled" check to smu_dpm_setvcn/jpeg_enable (Evan)
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For APUs with SMU v13.0.6, mode-2 reset is kept as default and for
others mode-1 is the default reset method.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Tested-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Critical Temperature needs to be reported in
millidegree Celsius.
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add critical temperature message support func for smu v13.0.6
and expose critical temperature as part of hw mon attributes
for GC v9.4.3
v2:
Added comment for pmfw version requirement & move the check
to get_thermal_temperature_range function
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>