Enable RAS EEPROM support for smu v13_0_0 and v13_0_10.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Add a function pointer structure ta_funcs to psp context
2. Make the interfaces generic to all TAs
3. Leverage exisitng TA context and remove unused functions
4. Fix return code bugs
v2: Add comments for ta funcs macros and correct typo
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Save TA unload psp response status
2. Add RAS TA loading status check for initializaiton
3. Drop RAS context teardown to allow RAS TA to be reloaded
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MES scheduler and kiq versions are stored in mes.sched_version and
mes.kiq_version, respectively, which are read from a register after
their queues are initialized. Remove mes.ucode_fw_version and
mes.data_fw_version which tried to read this versioning info from the
firmware headers (which don't contain this information).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use mes.sched_version, mes.kiq_version for debugfs as
mes.ucode_fw_version does not contain correct versioning information.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Before we call psp_securedisplay_invoke(), we call
psp_prep_securedisplay_cmd_buf() to prepare and initialize the command
buffer.
However, we didn't use the mutex_lock to protect the status of command
buffer. So when multiple threads are using the command buffer, after
thread A return from psp_securedisplay_invoke() and the command buffer
status is set to SUCCESS, another thread B may call
psp_prep_securedisplay_cmd_buf() and initialize the status to FAILURE
again, and cause Thread A to get a failure return status.
[How]
Move the mutex_lock out of psp_securedisplay_invoke() to its caller to
cover psp_prep_securedisplay_cmd_buf() and the code checking the return
status of command buffer.
Signed-off-by: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the code simpler.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For MCA poison, if unmap queue fails, only gpu reset should be
triggered without page retirement handling, MCA notifier will do it.
v2: handle MCA poison consumption in umc_poison_handler directly.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the code more readable.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Define page retirement functions for MCA platform.
v2: remove page retirement handling from MCA poison handler,
let MCA notifier do page retirement.
v3: remove specific poison handler for MCA to simplify code.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch to fix the gdm3 start failure with virual display:
/usr/libexec/gdm-x-session[1711]: (II) AMDGPU(0): Setting screen physical size to 270 x 203
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to make import prime FD as pixmap: 22
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument
/usr/libexec/gdm-x-session[1711]: (WW) AMDGPU(0): Failed to set mode on CRTC 0
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to enable any CRTC
gnome-shell[1840]: Running GNOME Shell (using mutter 42.2) as a X11 window and compositing manager
/usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument
vkms doesn't have modifiers support, set fb_modifiers_not_supported to bring the gdm back.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change ttm_resource structure from num_pages to size_t size in bytes.
v1 -> v2: change PFN_UP(dst_mem->size) to ttm->num_pages
v1 -> v2: change bo->resource->size to bo->base.size at some places
v1 -> v2: remove the local variable
v1 -> v2: cleanup cmp_size_smaller_first()
v2 -> v3: adding missing PFN_UP in ttm_bo_vm_fault_reserved
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027091237.983582-1-Amaranath.Somalapuram@amd.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
In the S2idle suspend/resume phase the gfxoff is keeping functional so
some IP blocks will be likely to reinitialize at gfxoff entry and that
will result in failing to program GC registers.Therefore, let disallow
gfxoff until AMDGPU IPs reinitialized completely.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.15.x
UAPI Changes:
- Documentation for page-flip flags
Cross-subsystem Changes:
- dma-buf: Add unlocked variant of vmapping and attachment-mapping
functions
Core Changes:
- atomic-helpers: CRTC primary plane test fixes
- connector: TV API consistency improvements, cmdline parsing
improvements
- crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper
- edid: Fixes for HFVSDB parsing,
- fourcc: Addition of the Vivante tiled modifier
- makefile: Sort and reorganize the objects files
- mode_config: Remove fb_base from drm_mode_config_funcs
- sched: Add a module parameter to change the scheduling policy,
refcounting fix for fences
- tests: Sort the Kunit tests in the Makefile, improvements to the
DP-MST tests
- ttm: Remove unnecessary drm_mm_clean() call
Driver Changes:
- New driver: ofdrm
- Move all drivers to a common dma-buf locking convention
- bridge:
- adv7533: Remove dynamic lane switching
- it6505: Runtime PM support
- ps8640: Handle AUX defer messages
- tc358775: Drop soft-reset over I2C
- ast: Atomic Gamma LUT Support, Convert to SHMEM, various
improvements
- lcdif: Support for YUV planes
- mgag200: Fix PLL Setup on some revisions
- udl: Modesetting improvements, hot-unplug support
- vc4: Fix support for PAL-M
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCY1D3fQAKCRDj7w1vZxhR
xZZgAPoCSfyU3+M3ALT0vQ/OpktE5tuwUBMac2Lxkqgx4dnQ0gEAnYeez0Dedod8
HNdgBH7FTklYa/zT8n17SwHUOzJJ5gc=
=YvuW
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2022-10-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 6.2:
UAPI Changes:
- Documentation for page-flip flags
Cross-subsystem Changes:
- dma-buf: Add unlocked variant of vmapping and attachment-mapping
functions
Core Changes:
- atomic-helpers: CRTC primary plane test fixes
- connector: TV API consistency improvements, cmdline parsing
improvements
- crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper
- edid: Fixes for HFVSDB parsing,
- fourcc: Addition of the Vivante tiled modifier
- makefile: Sort and reorganize the objects files
- mode_config: Remove fb_base from drm_mode_config_funcs
- sched: Add a module parameter to change the scheduling policy,
refcounting fix for fences
- tests: Sort the Kunit tests in the Makefile, improvements to the
DP-MST tests
- ttm: Remove unnecessary drm_mm_clean() call
Driver Changes:
- New driver: ofdrm
- Move all drivers to a common dma-buf locking convention
- bridge:
- adv7533: Remove dynamic lane switching
- it6505: Runtime PM support
- ps8640: Handle AUX defer messages
- tc358775: Drop soft-reset over I2C
- ast: Atomic Gamma LUT Support, Convert to SHMEM, various
improvements
- lcdif: Support for YUV planes
- mgag200: Fix PLL Setup on some revisions
- udl: Modesetting improvements, hot-unplug support
- vc4: Fix support for PAL-M
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020072405.g3o4hxuk75gmeumw@houat
IMU is a new firmware for GFX11.
There are four means by which firmware version can be queried
from the driver: device attributes, vf2pf, debugfs,
and the AMDGPU_INFO_FW_VERSION option in the amdgpu info ioctl.
Add IMU as an option for those four methods.
V2: Added debugfs
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
allow gfxoff on gc_11_0_3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the number of pages from the userptr BO differs from the SG BO then the
allocated memory for the SG table doesn't get freed before returning
-EINVAL, which may lead to a memory leak in some error paths. Fix this by
checking the number of pages before allocating memory for the SG table.
Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MMHUB 2.1.x versions don't have ATCL2. Remove accesses to ATCL2 registers.
Since they are non-existing registers, read access will cause a
'Completer Abort' and gets reported when AER is enabled with the below patch.
Tagging with the patch so that this is backported along with it.
v2: squash in uninitialized warning fix (Nathan Chancellor)
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
In the S2idle suspend/resume phase the gfxoff is keeping functional so
some IP blocks will be likely to reinitialize at gfxoff entry and that
will result in failing to program GC registers.Therefore, let disallow
gfxoff until AMDGPU IPs reinitialized completely.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
IMU is a new firmware for GFX11.
There are four means by which firmware version can be queried
from the driver: device attributes, vf2pf, debugfs,
and the AMDGPU_INFO_FW_VERSION option in the amdgpu info ioctl.
Add IMU as an option for those four methods.
V2: Added debugfs
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
uncovered a bug in amdgpu that required a reordering of the driver
init sequence to avoid accessing a special register on the GPU
before it was properly set up leading to an PCI AER error. This
reordering uncovered a different hw programming ordering dependency
in some APUs where the SDMA doorbells need to be programmed before
the GFX doorbells. To fix this, move the SDMA doorbell programming
back into the soc15 common code, but use the actual doorbell range
values directly rather than the values stored in the ring structure
since those will not be initialized at this point.
This is a partial revert, but with the doorbell assignment
fixed so the proper doorbell index is set before it's used.
Fixes: e3163bc8ff ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: skhan@linuxfoundation.org
allow gfxoff on gc_11_0_3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the number of pages from the userptr BO differs from the SG BO then the
allocated memory for the SG table doesn't get freed before returning
-EINVAL, which may lead to a memory leak in some error paths. Fix this by
checking the number of pages before allocating memory for the SG table.
Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MMHUB 2.1.x versions don't have ATCL2. Remove accesses to ATCL2 registers.
Since they are non-existing registers, read access will cause a
'Completer Abort' and gets reported when AER is enabled with the below patch.
Tagging with the patch so that this is backported along with it.
v2: squash in uninitialized warning fix (Nathan Chancellor)
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[why]
MES response time in sriov may be longer than default value
due to reset or init in other VF. A timeout value specific
to sriov is needed.
[how]
When in sriov, adjust the timeout value to calculated
worst case scenario.
Signed-off-by: Yiqing Yao <yiqing.yao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
0, original pstate X
1, ctx_A_create -> ctx_A->stable_pstate = X
2, ctx_A_set_pstate (Y) -> current pstate is Y (PEAK or STANDARD)
3, ctx_B_create -> ctx_B->stable_pstate = Y
4, ctx_A_destroy -> restore pstate to X
5, ctx_B_destroy -> restore pstate to Y
Above sequence will cause final pstate is wrong (Y), should be original X.
[HOW]
When ctx_B create,
if ctx_A touched pstate setting
(not auto, stable_pstate_ctx != NULL),
set ctx_B->stable_pstate the same value as ctx_A saved,
if stable_pstate_ctx == NULL,
fetch current pstate to fill
ctx_B->stable_pstate.
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[why]
MES response time in sriov may be longer than default value
due to reset or init in other VF. A timeout value specific
to sriov is needed.
[how]
When in sriov, adjust the timeout value to calculated
worst case scenario.
Signed-off-by: Yiqing Yao <yiqing.yao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[WHY]
0, original pstate X
1, ctx_A_create -> ctx_A->stable_pstate = X
2, ctx_A_set_pstate (Y) -> current pstate is Y (PEAK or STANDARD)
3, ctx_B_create -> ctx_B->stable_pstate = Y
4, ctx_A_destroy -> restore pstate to X
5, ctx_B_destroy -> restore pstate to Y
Above sequence will cause final pstate is wrong (Y), should be original X.
[HOW]
When ctx_B create,
if ctx_A touched pstate setting
(not auto, stable_pstate_ctx != NULL),
set ctx_B->stable_pstate the same value as ctx_A saved,
if stable_pstate_ctx == NULL,
fetch current pstate to fill
ctx_B->stable_pstate.
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Fix a buffer overflow in format_helper_test.
- Set DDC pointer in drmm_connector_init.
- Compiler fixes for panfrost.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmNRMiEACgkQ/lWMcqZw
E8MJExAAhVv/bEcg9Ib1e2gIirtOKB3w0V20rEvkvplzsfKALMH3oXHx0cIbk2sj
YBICPiuc2x4sDQW9xlQmPa5Gv+6aoqjSOpu+Zpc5MQAb8qnO/vxDCOlYDJwcicjP
DdxXwJcJ48+x3FFWmrg6gcU8fmH6Rckb2BgBsF3fyzOhIMF2ME6a02S+bYuHNOxK
sQfo1Tlbibi5pfrxHFBE9V+Wjf3ohQwxQHmcpAC6wwgtGKOLQSxkU7o+wCFUNsn1
dlWTVe9+2xgNQenTue091PAkFP5R4T3wv654EPudyUYjybQL8ci8aFjaFTQgR3BT
s4LsViTViTbm5OrSga2hGA+GGH2OGkW70yN+tqXPVxWyPaMA9GuoTh9xKhY3zVsB
69HlWBzfhAoyixp0rWmilZIGAvKXn8xf+HzOxQ/ihDk6a/VsyLbEdYP72AQTRCi8
A+h6vBNxP6AVPwDCA//hpCupG75bI8h/Plf1R/W7uzVTKXCSPAGdzROI8dxjsFAX
Y4OR9kd8yn+bpf6G6D0q+tJV8BqUAzs5AUVkXWU5i1XaAK6fNpqh066yFQt8xCHX
hFRcr7StYqI9ZliGP1ZEjE8nsiaqPZfDLMqSQlHl392JEfmf5uwskKwVadGHVfuC
L+iATevGiNsY4JHHk+YBXHMSWrX9XIRNDio+LZQgFttr/KTK4ac=
=rzib
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2022-10-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v6.1-rc2:
- Fix a buffer overflow in format_helper_test.
- Set DDC pointer in drmm_connector_init.
- Compiler fixes for panfrost.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c4d05683-8ebe-93b8-d24c-d1d2c68f12c4@linux.intel.com
Commit 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
uncovered a bug in amdgpu that required a reordering of the driver
init sequence to avoid accessing a special register on the GPU
before it was properly set up leading to an PCI AER error. This
reordering uncovered a different hw programming ordering dependency
in some APUs where the SDMA doorbells need to be programmed before
the GFX doorbells. To fix this, move the SDMA doorbell programming
back into the soc15 common code, but use the actual doorbell range
values directly rather than the values stored in the ring structure
since those will not be initialized at this point.
This is a partial revert, but with the doorbell assignment
fixed so the proper doorbell index is set before it's used.
Fixes: e3163bc8ff ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: skhan@linuxfoundation.org
Cc: stable@vger.kernel.org
The fb_base in struct drm_mode_config has been unused for a long time.
Some drivers set it and some don't leading to a very confusing state
where the variable can't be relied upon, because there's no indication
as to which driver sets it and which doesn't.
The only usage of fb_base is internal to two drivers so instead of trying
to force it into all the drivers to get it into a coherent state
completely remove it.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Thomas Zimmermann <tzimemrmann@suse.de>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019024401.394617-1-zack@kde.org
A user reported a bug on CAPE VERDE system where uvd_v3_1
IP component failed to initialize as there is an issue with
BO move code from one memory to other.
In function amdgpu_mem_visible() called by amdgpu_bo_move(),
when there are no blocks to compare or if we have a single
block then break the loop.
Fixes: 312b4dc11d ("drm/amdgpu: Fix VRAM BO swap issue")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
If mes is not dequeued during fini, mes will be in an uncleaned state
during reload, then mes couldn't receive some commands which leads to
reload failure.
[How]
Perform MES dequeue via MMIO after all the unmap jobs are done by mes
and before kiq fini.
v2: Move the dequeue operation inside kiq_hw_fini.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
L1 blocks most of GC registers accessing by MMIO.
[How]
Use RLCG interface to program GC registers under SRIOV VF in full access time.
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suggested by PMFW team and same as what did for gfxoff feature.
This can address some Mode1Reset failures observed on SMU13.0.0.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
enable gfx clock gating features on smu_v13_0_10
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- refactor mode2 on v11.0.7 to align with aldebaran
- comment out using mode2 reset as default for now, will introduce
another controller to replace previous reset_level_mask
v2: squash in unused variable removal (Alex)
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit dac6b80818.
This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with
the original design of reset handler. Will redesign it.
Fixes: dac6b80818 ("drm/amdgpu: let mode2 reset fallback to default when failure")
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 5bd8d53f6f.
This commit breaks the reset logic for aldebaran, revert it for now.
Will move the mask inside the reset handler.
Fixes: 5bd8d53f6f ("drm/amdgpu: add debugfs amdgpu_reset_level")
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For asic with VF MMIO access protection avoid using CPU for VM table updates.
CPU pagetable updates have issues with HDP flush as VF MMIO access protection
blocks write to mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL register
during sriov runtime.
v3: introduce virtualization capability flag AMDGPU_VF_MMIO_ACCESS_PROTECT
which indicates that VF MMIO write access is not allowed in sriov runtime
Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
A user reported a bug on CAPE VERDE system where uvd_v3_1
IP component failed to initialize as there is an issue with
BO move code from one memory to other.
In function amdgpu_mem_visible() called by amdgpu_bo_move(),
when there are no blocks to compare or if we have a single
block then break the loop.
Fixes: 312b4dc11d ("drm/amdgpu: Fix VRAM BO swap issue")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
If mes is not dequeued during fini, mes will be in an uncleaned state
during reload, then mes couldn't receive some commands which leads to
reload failure.
[How]
Perform MES dequeue via MMIO after all the unmap jobs are done by mes
and before kiq fini.
v2: Move the dequeue operation inside kiq_hw_fini.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
L1 blocks most of GC registers accessing by MMIO.
[How]
Use RLCG interface to program GC registers under SRIOV VF in full access time.
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kmap() is being deprecated in favor of kmap_local_page().
There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.
With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.
Since its use in amdgpu/amdgpu_ttm.c is safe, it should be preferred.
Therefore, replace kmap() with kmap_local_page() in amdgpu/amdgpu_ttm.c.
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RAS error address translation algorithm is common
across dGPU and A + A platform as along as the SOC
integrates the same generation of UMC IP.
UMC RAS is managed by x86 MCA on A + A platform,
umc_ras in GPU driver is not initialized at all on
A + A platform. In such case, any umc_ras callback
implemented for dGPU config shouldn't be invoked
from A + A specific callback.
The change moves convert_error_address out of dGPU
umc_ras structure and makes it share between A + A
and dGPU config.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Suggested by PMFW team and same as what did for gfxoff feature.
This can address some Mode1Reset failures observed on SMU13.0.0.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
enable gfx clock gating features on smu_v13_0_10
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- refactor mode2 on v11.0.7 to align with aldebaran
- comment out using mode2 reset as default for now, will introduce
another controller to replace previous reset_level_mask
v2: squash in unused variable removal (Alex)
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit dac6b80818.
This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with
the original design of reset handler. Will redesign it.
Fixes: dac6b80818 ("drm/amdgpu: let mode2 reset fallback to default when failure")
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 5bd8d53f6f.
This commit breaks the reset logic for aldebaran, revert it for now.
Will move the mask inside the reset handler.
Fixes: 5bd8d53f6f ("drm/amdgpu: add debugfs amdgpu_reset_level")
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For asic with VF MMIO access protection avoid using CPU for VM table updates.
CPU pagetable updates have issues with HDP flush as VF MMIO access protection
blocks write to mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL register
during sriov runtime.
v3: introduce virtualization capability flag AMDGPU_VF_MMIO_ACCESS_PROTECT
which indicates that VF MMIO write access is not allowed in sriov runtime
Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
After commit 571c053658 ("drm/amdgpu: switch sdma buffer function
tear down to a helper"), the variable 'ring' is not used anymore, it
can be removed.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For consistency with the rest of the code.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For consistency with newer asics.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To match with the definition in psp firmware
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
more ip instances are available
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch to allow secure submission on gfx11 and sdma6.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this patch to add tmz support for GC 11.0.1.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add poison mode query support on umc v8_10_0.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix some issues found by checkpatch script.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
So the code can be simplified.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the code reusable and remove redundant code.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only RAS UE error address is queried currently, no need to check CE status.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update all SDMA versions that support SR-IOV to properly
tear down the ttm buffer functions on suspend.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Switch all of the SDMA implementations to use the helper to
tear down the ttm buffer manager.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Under SRIOV, SDMA engine is shared between VFs. Therefore,
we will not stop SDMA during hw_fini. This is not an issue
with normal dirver loading and unloading.
- However, when we put the SDMA engine to suspend state and resume
it, the issue starts to show up. Something could attempt to use
that SDMA engine to clear or move memory before the engine is
initialized since the DRM entity is still there.
- Therefore, we will call sdma_v5_2_enable(false) during hw_fini,
and if we are under SRIOV, we will call sdma_v5_2_enable(true)
afterwards to allow other VFs to use SDMA. This way, the DRM
entity of SDMA engine is emptied and it will follow the flow
of resume code path.
Tested-by: Bokun Zhang <Bokun.Zhang@amd.com>
Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Here is the big set of driver core and debug printk changes for 6.1-rc1.
Included in here is:
- dynamic debug updates for the core and the drm subsystem. The
drm changes have all been acked by the relevant maintainers.
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they
were not being used and they really did not actually do
anything.)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0BYUA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylozwCdFRlcghaf7XBUyNgRZRwMC+oQI8EAn1G/nEDE
6aFd2er41uK0IGQnSmYO
=OK0k
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core and debug printk changes for
6.1-rc1. Included in here is:
- dynamic debug updates for the core and the drm subsystem. The drm
changes have all been acked by the relevant maintainers
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they were
not being used and they really did not actually do anything)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits)
docs: filesystems: sysfs: Make text and code for ->show() consistent
Documentation: NBD_REQUEST_MAGIC isn't a magic number
a.out: restore CMAGIC
device property: Add const qualifier to device_get_match_data() parameter
drm_print: add _ddebug descriptor to drm_*dbg prototypes
drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print: optimize drm_debug_enabled for jump-label
drm-print: add drm_dbg_driver to improve namespace symmetry
drm-print.h: include dyndbg header
drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
drm_print: interpose drm_*dbg with forwarding macros
drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
drm_print: condense enum drm_debug_category
debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops
driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs()
Documentation: ENI155_MAGIC isn't a magic number
Documentation: NBD_REPLY_MAGIC isn't a magic number
nbd: remove define-only NBD_MAGIC, previously magic number
Documentation: FW_HEADER_MAGIC isn't a magic number
Documentation: EEPROM_MAGIC_VALUE isn't a magic number
...
This reverts commit 66f99628eb.
Unfortunately, that commit causes performance regressions on non-PSR
setups. So, just revert it until FB_DAMAGE_CLIPS support can be added.
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2189
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216554
Fixes: 66f99628eb ("drm/amdgpu: use dirty framebuffer helper")
Fixes: abbc7a3daf ("drm/amdgpu: don't register a dirty callback for non-atomic")
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdkfd_total_mem_size is the size of total GPUs vram plus system memory
to estimate page tables memory usage and leave enough VRAM room for page
tables allocation.
Calculate amdkfd_total_mem_size in amdgpu_amdkfd_device_probe is
incorrect because adev->gmc.real_vram_size is still 0 called from
amdgpu_device_ip_early_init. Move the calculation
to amdgpu_amdkfd_device_init to get the correct VRAM size.
Do reverse calculation in amdgpu_amdkfd_device_fini_sw to support
hot-unplugging GPUs.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Under VRAM usage pression, map to GPU may fail to create pt bo and
vmbo->shadow_list is not initialized, then ttm_bo_release calling
amdgpu_bo_vm_destroy to access vmbo->shadow_list generates below
dmesg and NULL pointer access backtrace:
Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt
bo successfully, otherwise use default callback amdgpu_bo_destroy.
amdgpu: amdgpu_vm_bo_update failed
amdgpu: update_gpuvm_pte() failed
amdgpu: Failed to map bo to gpuvm
amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2
BUG: kernel NULL pointer dereference, address:
RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu]
Call Trace:
<TASK>
ttm_bo_release+0x207/0x320 [amdttm]
amdttm_bo_init_reserved+0x1d6/0x210 [amdttm]
amdgpu_bo_create+0x1ba/0x520 [amdgpu]
amdgpu_bo_create_vm+0x3a/0x80 [amdgpu]
amdgpu_vm_pt_create+0xde/0x270 [amdgpu]
amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu]
amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu]
amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu]
update_gpuvm_pte+0x160/0x420 [amdgpu]
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu]
kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu]
kfd_ioctl+0x24a/0x5b0 [amdgpu]
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
DRM buddy manager allocates the contiguous memory requests in
a single block or multiple blocks. So for the ttm move operation
(incase of low vram memory) we should consider all the blocks to
compute the total memory size which compared with the struct
ttm_resource num_pages in order to verify that the blocks are
contiguous for the eviction process.
v2: Added a Fixes tag
v3: Rewrite the code to save a bit of calculations and
variables (Christian)
Fixes: c9cad937c0 ("drm/amdgpu: add drm buddy support to amdgpu")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch is to fix the SDMA user queue doorbell missing issue on
SDMA 6.0. F32_WPTR_POLL_ENABLE has to be set if doorbell mode is
used. Otherwise ringing SDMA user queue doorbell can't wake up
system from gfxoff.
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
In some error path in amdgpu_sdma_init_microcode(), release_firmware() is
not called, the memory allocated in request_firmware() will be leaked,
calling amdgpu_sdma_destroy_inst_ctx() which calls release_firmware() to
avoid memory leak.
Fixes: 15aa13056d ("drm/amdgpu: add function to init SDMA microcode")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
core:
- convert selftests to kunit
- managed init for more objects
- move to idr_init_base
- rename fb and gem cma helpers to dma
- hide unregistered connectors from getconnector ioctl
- DSC passthrough aux support
- backlight handling improvements
- add dma_resv_assert_held to vmap/vunmap
edid:
- move luminance calculation to core
fbdev:
- fix aperture helper usage
fourcc:
- add more format helpers
- add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx, DRM_FORMAT_Dxx
- add packed AYUV8888, XYUV8888
- add some kunit tests
ttm:
- allow bos without backing store
- rewrite placement to use intersect/compatible functions
dma-buf:
- docs update
- improve signalling when debugging
udmabuf:
- fix failure path GPF
dp:
- drop dp/mst legacy code
- atomic mst state support
- audio infoframe packing
panel:
- Samsung LTL101AL01
- B120XAN01.0
- R140NWF5 RH
- DMT028VGHMCMI-1A T
- AUO B133UAN02.1
- IVO M133NW4J-R3
- Innolux N120ACA-EA1
amdgpu:
- Gang submit support
- Mode2 reset for RDNA2
- New IP support:
DCN 3.1.4, 3.2
SMU 13.x
NBIO 7.7
GC 11.x
PSP 13.x
SDMA 6.x
GMC 11.x
- DSC passthrough support
- PSP fixes for TA support
- vangogh GFXOFF stats
- clang fixes
- gang submit CS cleanup prep work
- fix VRAM eviction issues
amdkfd:
- GC 10.3 IP ISA fixes
- fix CRIU regression
- CPU fault on COW mapping fixes
i915:
- align fw versioning with kernel practices
- add display substruct to i915 private
- add initial runtime info to driver info
- split out HDCP and backlight registers
- MEI XeHP SDV GSC support
- add per-gt sysfs defaults
- TLB invalidation improvements
- Disable PCI BAR resize on 32-bit
- GuC firmware updates and compat changes
- GuC log timestamp translation
- DG2 preemption workaround changes
- DG2 improved HDMI pixel clocks support
- PCI BAR sanity checks
- Enable DC5 on DG2
- DG2 DMC fw bumped
- ADL-S PCI ID added
- Meteorlake enablement
- Rename ggtt_view to gtt_view
- host RPS fixes
- release mmaps on rpm suspend on discrete
- clocking and dpll refactoring
- VBT definitions and parsing updates
- SKL watermark code extracted to separate file
- allow seamless M/N changes on eDP panels
- BUG_ON removal and cleanups
msm:
- DPU: simplified VBIF configuration
- cleanup CTL interfaces
- DSI: removed unused msm_display_dsc_config struct
- switch regulator calls to new API
- switched to PANEL_BRIDGE for direct attached panels
- DSI_PHY: convert drivers to parent_hws
- DP: cleanup pixel_rate handling
- HDMI: turned hdmi-phy-8996 into OF clk provider
- misc dt-bindings fixes
- choose eDP as primary display if it's available
- support getting interconnects from either the mdss or the mdp5/dpu
device nodes
- gem: Shrinker + LRU re-work:
- adds a shared GEM LRU+shrinker helper and moves msm over to that
- reduces lock contention between retire and submit by avoiding the
need to acquire obj lock in retire path (and instead using resv
seeing obj's busyness in the shrinker
- fix reclaim vs submit issues
- GEM fault injection for triggering userspace error paths
- Map/unmap optimization
- Improved robustness for a6xx GPU recovery
virtio:
- Improve error and edge conditions handling
- Convert to use managed helpers
- stop exposing LINEAR modifier
mgag200:
- split modeset handling per model
udl:
- suspend/disconnect handling improvements
vc4:
- rework HDMI power up
- depend on PM
- better unplugging support
ast:
- resolution handling improvements
ingenic:
- Add JZ4760(B) support
- avoid a modeset when sharpness property is unchanged
- use the new PM ops
it6505:
- power seq and clock updates
ssd130x:
- regmap bulk write
- use atomic helpers instead of simple helpers
via:
- rename via_drv to via_dri1, consolidate all code.
radeon:
- drop DP MST experimental support
- delayed work flush fix
- use time_after
ti-sn65dsi86:
- DP support
mediatek:
- MT8195 DP support
- drop of_gpio header
- remove unneeded result
- small DP code improvements
vkms:
- RGB565, XRGB64 and ARGB64 support
sun4i:
- tv: convert to atomic
rcar-du:
- Synopsys DW HDMI bridge DT bindings update
exynos:
- use drm_display_info.is_hdmi
- correct return of mixer_mode_valid and hdmi_mode_valid
omap:
- refcounting fix
rockchip:
- RK3568 support
- RK3399 gamma support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmM894sACgkQDHTzWXnE
hr7EYw//WdVe69TNauCAQiOYdmPp1twmr2o5gDOFLoo4IZw5v+qK0HL/nTrDkBq6
xIu1GLTScOh0AItW1rhFmrtKhO/u/QPQ15P6cO7x8AzlUIhVOYqM79+OA0X6zIV8
IZjpc6EEWPSKJTCRud9HdzsV06DIa+QlwShLCaOFxRiGSuUqsxzacIHUqnFekRnV
PBG7RzcmdWwe6Gy/7T2wegsFjw1mh4S4FypEGs53emru3PGvcau5dwXcE5Jro7Br
k4BFFknuXahVJ2ynVfIFn3QUQRMLgAKRWflqxo7McLeKVQEt4gfB6+PaMwGpSiRQ
iC9QPy69TWEx6X015q2DvvlQDewnCbPOlzyoj9O991QDGLPIim8srPblr8DPeeOz
Y7IW1PRVnPdKReMJvTyrIVED/XT9fUoR7N+F9sfPnEee5HsvjXNGumEHbOE8avFf
rB6CFdby+Ecd9cSeINXowFy4ss0d5zCHMiKEVyQWTZOJysp29vLyKezNqU5m37FK
LAQHtsRdn1+V3o22H5y1PJyqssbOMImMV1ffqW/urRLLefPVHIKCKI8Ycgh0qxqc
B+gebHMgF8j6RR0DHAcQby+PIVi/Pn36TAMI3lPsVjFWGS5s5EQwpKlMNj46H0Cr
yE2Vr4w29+Nsv0b4Uz16AZ0mHcauqx4bvWMyT0frJfNcE86x8MI=
=u/MJ
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2022-10-05' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Lots of stuff all over, some new AMD IP support and gang submit
support. i915 has further DG2 and Meteorlake pieces, and a bunch of
i915 display refactoring. msm has a shrinker rework. There are also a
bunch of conversions to use kunit.
This has two external pieces, some MEI changes needed for future Intel
discrete GPUs. These should be acked by Greg. There is also a cross
maintainer shared tree with some backlight rework from Hans in here.
Core:
- convert selftests to kunit
- managed init for more objects
- move to idr_init_base
- rename fb and gem cma helpers to dma
- hide unregistered connectors from getconnector ioctl
- DSC passthrough aux support
- backlight handling improvements
- add dma_resv_assert_held to vmap/vunmap
edid:
- move luminance calculation to core
fbdev:
- fix aperture helper usage
fourcc:
- add more format helpers
- add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx, DRM_FORMAT_Dxx
- add packed AYUV8888, XYUV8888
- add some kunit tests
ttm:
- allow bos without backing store
- rewrite placement to use intersect/compatible functions
dma-buf:
- docs update
- improve signalling when debugging
udmabuf:
- fix failure path GPF
dp:
- drop dp/mst legacy code
- atomic mst state support
- audio infoframe packing
panel:
- Samsung LTL101AL01
- B120XAN01.0
- R140NWF5 RH
- DMT028VGHMCMI-1A T
- AUO B133UAN02.1
- IVO M133NW4J-R3
- Innolux N120ACA-EA1
amdgpu:
- Gang submit support
- Mode2 reset for RDNA2
- New IP support:
DCN 3.1.4, 3.2
SMU 13.x
NBIO 7.7
GC 11.x
PSP 13.x
SDMA 6.x
GMC 11.x
- DSC passthrough support
- PSP fixes for TA support
- vangogh GFXOFF stats
- clang fixes
- gang submit CS cleanup prep work
- fix VRAM eviction issues
amdkfd:
- GC 10.3 IP ISA fixes
- fix CRIU regression
- CPU fault on COW mapping fixes
i915:
- align fw versioning with kernel practices
- add display substruct to i915 private
- add initial runtime info to driver info
- split out HDCP and backlight registers
- MEI XeHP SDV GSC support
- add per-gt sysfs defaults
- TLB invalidation improvements
- Disable PCI BAR resize on 32-bit
- GuC firmware updates and compat changes
- GuC log timestamp translation
- DG2 preemption workaround changes
- DG2 improved HDMI pixel clocks support
- PCI BAR sanity checks
- Enable DC5 on DG2
- DG2 DMC fw bumped
- ADL-S PCI ID added
- Meteorlake enablement
- Rename ggtt_view to gtt_view
- host RPS fixes
- release mmaps on rpm suspend on discrete
- clocking and dpll refactoring
- VBT definitions and parsing updates
- SKL watermark code extracted to separate file
- allow seamless M/N changes on eDP panels
- BUG_ON removal and cleanups
msm:
- DPU:
simplified VBIF configuration
cleanup CTL interfaces
- DSI:
removed unused msm_display_dsc_config struct
switch regulator calls to new API
switched to PANEL_BRIDGE for direct attached panels
- DSI_PHY: convert drivers to parent_hws
- DP: cleanup pixel_rate handling
- HDMI: turned hdmi-phy-8996 into OF clk provider
- misc dt-bindings fixes
- choose eDP as primary display if it's available
- support getting interconnects from either the mdss or the mdp5/dpu
device nodes
- gem: Shrinker + LRU re-work:
- adds a shared GEM LRU+shrinker helper and moves msm over to that
- reduce lock contention between retire and submit by avoiding the
need to acquire obj lock in retire path (and instead using resv
seeing obj's busyness in the shrinker
- fix reclaim vs submit issues
- GEM fault injection for triggering userspace error paths
- Map/unmap optimization
- Improved robustness for a6xx GPU recovery
virtio:
- improve error and edge conditions handling
- convert to use managed helpers
- stop exposing LINEAR modifier
mgag200:
- split modeset handling per model
udl:
- suspend/disconnect handling improvements
vc4:
- rework HDMI power up
- depend on PM
- better unplugging support
ast:
- resolution handling improvements
ingenic:
- add JZ4760(B) support
- avoid a modeset when sharpness property is unchanged
- use the new PM ops
it6505:
- power seq and clock updates
ssd130x:
- regmap bulk write
- use atomic helpers instead of simple helpers
via:
- rename via_drv to via_dri1, consolidate all code.
radeon:
- drop DP MST experimental support
- delayed work flush fix
- use time_after
ti-sn65dsi86:
- DP support
mediatek:
- MT8195 DP support
- drop of_gpio header
- remove unneeded result
- small DP code improvements
vkms:
- RGB565, XRGB64 and ARGB64 support
sun4i:
- tv: convert to atomic
rcar-du:
- Synopsys DW HDMI bridge DT bindings update
exynos:
- use drm_display_info.is_hdmi
- correct return of mixer_mode_valid and hdmi_mode_valid
omap:
- refcounting fix
rockchip:
- RK3568 support
- RK3399 gamma support"
* tag 'drm-next-2022-10-05' of git://anongit.freedesktop.org/drm/drm: (1374 commits)
drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
drm/amdkfd: Track unified memory when switching xnack mode
drm/amdgpu: Enable sram on vcn_4_0_2
drm/amdgpu: Enable VCN DPG for GC11_0_1
drm/msm: Fix build break with recent mm tree
drm/panel: simple: Use dev_err_probe() to simplify code
drm/panel: panel-edp: Use dev_err_probe() to simplify code
drm/panel: simple: Add Multi-Inno Technology MI0800FT-9
dt-bindings: display: simple: Add Multi-Inno Technology MI0800FT-9 panel
drm/amdgpu: correct the memcpy size for ip discovery firmware
drm/amdgpu: Skip put_reset_domain if it doesn't exist
drm/amdgpu: remove switch from amdgpu_gmc_noretry_set
drm/amdgpu: Fix mc_umc_status used uninitialized warning
drm/amd/display: Prevent OTG shutdown during PSR SU
drm/amdgpu: add page retirement handling for CPU RAS
drm/amdgpu: use RAS error address convert api in mca notifier
drm/amdgpu: support to convert dedicated umc mca address
drm/amdgpu: export umc error address convert interface
drm/amdgpu: fix sdma v4 init microcode error
drm/amd/display: fix array-bounds error in dc_stream_remove_writeback()
...
Highlights:
- AMD Platform Management Framework (PMF) driver with AMT and QnQF support
- AMD PMC: Improved logging for debugging s2idle issues
- Big refactor of the ACPI/x86 backlight handling, ensuring that we only
register 1 /sys/class/backlight device per LCD panel
- Microsoft Surface:
- Surface Laptop Go 2 support
- Surface Pro 8 HID sensor support
- Asus WMI:
- Lots of cleanups
- Support for TUF RGB keyboard backlight control
- Add support for ROG X13 tablet mode
- Siemens Simatic: IPC227G and IPC427G support
- Toshiba ACPI laptop driver: Fan hwmon and battery ECO mode support
- tools/power/x86/intel-speed-select: Various improvements
- Various cleanups
- Various small bugfixes
The following is an automated git shortlog grouped by driver:
ACPI:
- video: Change disable_backlight_sysfs_if quirks to acpi_backlight=native
- s2idle: Add a new ->check() callback for platform_s2idle_ops
- video: Fix indentation of video_detect_dmi_table[] entries
- video: Drop NL5x?U, PF4NU1F and PF5?U?? acpi_backlight=native quirks
- video: Drop "Samsung X360" acpi_backlight=native quirk
- video: Remove acpi_video_set_dmi_backlight_type()
- video: Add Apple GMUX brightness control detection
- video: Add Nvidia WMI EC brightness control detection (v3)
- video: Refactor acpi_video_get_backlight_type() a bit
- video: Remove code to unregister acpi_video backlight when a native backlight registers
- video: Make backlight class device registration a separate step (v2)
- video: Simplify acpi_video_unregister_backlight()
- video: Remove acpi_video_bus from list before tearing it down
- video: Drop backlight_device_get_by_type() call from acpi_video_get_backlight_type()
- video: Add acpi_video_backlight_use_native() helper
acer-wmi:
- Move backlight DMI quirks to acpi/video_detect.c
- Acer Aspire One AOD270/Packard Bell Dot keymap fixes
apple-gmux:
- Stop calling acpi/video.h functions
asus-wmi:
- Expand support of GPU fan to read RPM and label
- Make kbd_rgb_mode_groups static
- Move acpi_backlight=native quirks to ACPI video_detect.c
- Move acpi_backlight=vendor quirks to ACPI video_detect.c
- Drop DMI chassis-type check from backlight handling
- Increase FAN_CURVE_BUF_LEN to 32
- Fix the name of the mic-mute LED classdev
- Implement TUF laptop keyboard power states
- Implement TUF laptop keyboard LED modes
- Support the GPU fan on TUF laptops
- Modify behaviour of Fn+F5 fan key
- Update tablet_mode_sw module-param help text
- Simplify tablet-mode-switch handling
- Simplify tablet-mode-switch probing
- Add support for ROG X13 tablet mode
- Adjust tablet/lidflip handling to use enum
- Support the hardware GPU MUX on some laptops
- Simplify some of the *_check_present() helpers
- Refactor panel_od attribute
- Refactor egpu_enable attribute
- Refactor disable_gpu attribute
- Document the panel_od sysfs attribute
- Document the egpu_enable sysfs attribute
- Document the dgpu_disable sysfs attribute
- Use kobj_to_dev()
- Convert all attr-show to use sysfs_emit
compal-laptop:
- Get rid of a few forward declarations
dell-privacy:
- convert to use dev_groups
dell-smbios-base:
- Use sysfs_emit()
dell-wmi:
- Add WMI event 0x0012 0x0003 to the list
docs:
- ABI: charge_control_end_threshold may not support all values
drivers/platform:
- toshiba_acpi: Call HCI_PANEL_POWER_ON on resume on some models
drm/amdgpu:
- Register ACPI video backlight when skipping amdgpu backlight registration
- Don't register backlight when another backlight should be used (v3)
drm/i915:
- Call acpi_video_register_backlight() (v3)
- Don't register backlight when another backlight should be used (v2)
drm/nouveau:
- Register ACPI video backlight when nv_backlight registration fails (v2)
- Don't register backlight when another backlight should be used (v2)
drm/radeon:
- Register ACPI video backlight when skipping radeon backlight registration
- Don't register backlight when another backlight should be used (v3)
drm/todo:
- Add entry about dealing with brightness control on devices with > 1 panel
gpio-f7188x:
- use unique labels for banks/chips
- Add GPIO support for Nuvoton NCT6116
- add a prefix to macros to keep gpio namespace clean
- switch over to using pr_fmt
hp-wmi:
- Support touchpad on/off
- Setting thermal profile fails with 0x06
int3472/discrete:
- Drop a forward declaration
intel-uncore-freq:
- Use sysfs_emit() to instead of scnprintf()
intel_cht_int33fe:
- Fix comment according to the code flow
leds:
- simatic-ipc-leds-gpio: Make simatic_ipc_led_gpio_table static
- simatic-ipc-leds-gpio: add new model 227G
move from strlcpy with unused retval to strscpy:
- move from strlcpy with unused retval to strscpy
msi-laptop:
- Change DMI match / alias strings to fix module autoloading
- Add msi_scm_disable_hw_fn_handling() helper
- Add msi_scm_model_exit() helper
- Fix resource cleanup
- Simplify ec_delay handling
- Fix old-ec check for backlight registering
- Drop MSI_DRIVER_VERSION
- Use MODULE_DEVICE_TABLE()
nvidia-wmi-ec-backlight:
- Use acpi_video_get_backlight_type()
- Move fw interface definitions to a header (v2)
p2sb:
- Fix UAF when caller uses resource name
platform/mellanox:
- mlxreg-lc: Make error handling flow consistent
- Remove redundant 'NULL' check
- Remove unnecessary code
- mlxreg-lc: Fix locking issue
- mlxreg-lc: Fix coverity warning
platform/surface:
- Split memcpy() of struct ssam_event flexible array
- aggregator_registry: Add HID devices for sensors and UCSI client to SP8
- aggregator_registry: Rename HID device nodes based on new findings
- aggregator_registry: Rename HID device nodes based on their function
- aggregator_registry: Add support for Surface Laptop Go 2
platform/x86:
- use PLATFORM_DEVID_NONE instead of -1
platform/x86/amd:
- pmc: Dump idle mask during "check" stage instead
- pmc: remove CONFIG_DEBUG_FS checks
- pmc: Fix build without debugfs
- pmc: Add sysfs files for SMU
- pmc: Add an extra STB message for checking s2idle entry
- pmc: Always write to the STB
- pmc: Add defines for STB events
platform/x86/amd/pmf:
- Remove unused power_delta instances
- install notify handler after acpi init
- Add sysfs to toggle CnQF
- Add support for CnQF
- Fix clang unused variable warning
- Fix undefined reference to platform_profile
- Force load driver on older supported platforms
- Handle AMT and CQL events for Auto mode
- Add support for Auto mode feature
- Get performance metrics from PMFW
- Add fan control support
- Add heartbeat signal support
- Add debugfs information
- Add support SPS PMF feature
- Add support for PMF APCI layer
- Add support for PMF core layer
- Add ABI doc for AMD PMF
- Add AMD PMF driver entry
platform/x86/intel/wmi:
- thunderbolt: Use dev_groups callback
pmc_atom:
- Amend comment style and grammar
- Make terminator entry uniform
- Improve quirk message to be less cryptic
- Fix SLP_TYPx bitfield mask
samsung-laptop:
- Move acpi_backlight=[vendor|native] quirks to ACPI video_detect.c
simatic-ipc:
- add new model 427G
- enable watchdog for 227G
thinkpad_acpi:
- Explicitly set to balanced mode on startup
tools/power/x86/intel-speed-select:
- Release v1.13
- Optimize CPU initialization
- Utilize cpu_map to get physical id
- Remove unused struct clos_config fields
- Enforce isst_id value
- Do not export get_physical_id
- Introduce is_cpu_in_power_domain helper
- Cleanup get_physical_id usage
- Convert more function to use isst_id
- Add pkg and die in isst_id
- Introduce struct isst_id
- Remove unused core_mask array
- Remove dead code
- Fix cpu count for TDP level display
toshiba_acpi:
- change turn_on_panel_on_resume to static
- Remove duplicate include
- Set correct parent for input device.
- Add fan RPM reading (hwmon interface)
- Add fan RPM reading (internals)
- Stop using acpi_video_set_dmi_backlight_type()
- Fix ECO LED control on Toshiba Z830
- Battery charge mode in toshiba_acpi (internals)
- Battery charge mode in toshiba_acpi (sysfs)
wmi:
- Drop forward declaration of static functions
- Allow duplicate GUIDs for drivers that use struct wmi_driver
x86-android-tablets:
- Fix broken touchscreen on Chuwi Hi8 with Windows BIOS
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmM9eLgUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9zIzAf+JG058wucK0Zsnu4POzGp+uHjnWuu
AJUmTeRvCD7MidwIv5PiwEtTucQ8OuXYR+tPc8LIwCVZtc05FBmh7Y8le/CX0SS6
n9EZIvCk3Owosti5w2TPnCK920kh+Wfxl/fmfDbpi6+lpAL8r+F/mZEGKGFdZpWu
Q+yM/eyxwPH8q8gjrXOUC7rN43aYeO3OCpNG3GYkQ/2S5qrjuz39dXhNVzdSsxm7
aYOqJRNjZQEjQ3kJcp65kC6oWp3UaI1zdpGwhBG/SY8BCtCYZzlRy7gN2FCJfAa/
EyYayOvdy0zNwewbIYzck4W80hUDtfidgZgZ9crQscO/JjbGi6LhveD4YA==
=afGw
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
- AMD Platform Management Framework (PMF) driver with AMT and QnQF
support
- AMD PMC: Improved logging for debugging s2idle issues
- Big refactor of the ACPI/x86 backlight handling, ensuring that we
only register 1 /sys/class/backlight device per LCD panel
- Microsoft Surface:
- Surface Laptop Go 2 support
- Surface Pro 8 HID sensor support
- Asus WMI:
- Lots of cleanups
- Support for TUF RGB keyboard backlight control
- Add support for ROG X13 tablet mode
- Siemens Simatic: IPC227G and IPC427G support
- Toshiba ACPI laptop driver: Fan hwmon and battery ECO mode support
- tools/power/x86/intel-speed-select: Various improvements
- Various cleanups
- Various small bugfixes
* tag 'platform-drivers-x86-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (153 commits)
platform/x86: use PLATFORM_DEVID_NONE instead of -1
platform/x86/amd: pmc: Dump idle mask during "check" stage instead
platform/x86/intel/wmi: thunderbolt: Use dev_groups callback
platform/x86/amd: pmc: remove CONFIG_DEBUG_FS checks
platform/surface: Split memcpy() of struct ssam_event flexible array
platform/x86: compal-laptop: Get rid of a few forward declarations
platform/x86: intel-uncore-freq: Use sysfs_emit() to instead of scnprintf()
platform/x86: dell-smbios-base: Use sysfs_emit()
platform/x86/amd/pmf: Remove unused power_delta instances
platform/x86/amd/pmf: install notify handler after acpi init
Documentation/ABI/testing/sysfs-amd-pmf: Add ABI doc for AMD PMF
platform/x86/amd/pmf: Add sysfs to toggle CnQF
platform/x86/amd/pmf: Add support for CnQF
platform/x86/amd: pmc: Fix build without debugfs
platform/x86: hp-wmi: Support touchpad on/off
platform/x86: int3472/discrete: Drop a forward declaration
platform/x86: toshiba_acpi: change turn_on_panel_on_resume to static
platform/x86: wmi: Drop forward declaration of static functions
platform/x86: toshiba_acpi: Remove duplicate include
platform/x86: msi-laptop: Change DMI match / alias strings to fix module autoloading
...
switch to common helper to initialize rlc firmware
for gfx11
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialzie rlc firmware according to rlc
firmware header version
v2: squash in backwards compat fix
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_4
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_3
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_2
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_1
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_0
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use fw->size instead of discovery_tmr_size for fallback path.
Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For xgmi sriov, the reset is handled by host driver and hive->reset_domain
is not initialized so need to check if it exists before doing a put.
Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Simplify the logic in amdgpu_gmc_noretry_set by getting rid of the
switch. Also set noretry=1 as default for GFX 10.3.0 and greater since
retry faults are not supported.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On ChromeOS clang build, the following warning is seen:
/mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:463:6: error: variable 'mc_umc_status' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (mca_addr == UMC_INVALID_ADDR) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:485:21: note: uninitialized use occurs here
if ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 &&
^~~~~~~~~~~~~
/mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:1208:5: note: expanded from macro 'REG_GET_FIELD'
(((value) & REG_FIELD_MASK(reg, field)) >> REG_FIELD_SHIFT(reg, field))
^~~~~
/mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:463:2: note: remove the 'if' if its condition is always true
if (mca_addr == UMC_INVALID_ADDR) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/mnt/host/source/src/third_party/kernel/v5.15/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c:460:24: note: initialize the variable 'mc_umc_status' to silence this warning
uint64_t mc_umc_status, mc_umc_addrt0;
^
= 0
1 error generated.
make[5]: *** [/mnt/host/source/src/third_party/kernel/v5.15/scripts/Makefile.build:289: drivers/gpu/drm/amd/amdgpu/umc_v6_7.o] Error 1
Fix by initializing mc_umc_status = 0.
Fixes: 1014bd1cb3 ("drm/amdgpu: support to convert dedicated umc mca address")
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the convert interface to simplify code.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update umc error address query interface, the mca address can be read
from register or input from parameter.
TODO: define a common address conversion function to simplify the code.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make it global so we can convert specific mca address.
v2: rename query_error_address_per_channel to
convert_ras_error_address
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix init SDMA microcode error for sdma v4, which caused by mistake when
rearch sdma init microcode function (coding 4.2.2 to 4.2.0).
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor
during the suspend time. Furthermore, we cannot request a
mode 1 reset under SRIOV as VF. Therefore, we will skip it
as it is called in suspend_noirq() function.
- In the resume code path, we need to send REQ_GPU_INIT to the
hypervisor and also resume PSP IP block under SRIOV.
Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Change the type of parameter on amdgpu_gfx_cp_init_microcode to fix
compiler warning.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To allow upload the list via psp
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec
underflows when the wptr is divisible by ring->buf_mask + 1.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update mes_v11_api_def.h add_queue API with is_aql_queue parameter. Also
re-use gds_size for the queue size (unused for KFD). MES requires the
queue size in order to compute the actual wptr offset within the queue
RB since it increases monotonically for AQL queues.
v2: Make is_aql_queue assign clearer
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enables support for software trap for MES >= 4.
Adapted from implementation from Jay Cornwall.
v2: Add IP version check in conditions.
v3: Remove debugger code changes.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
update VF_RB_SETUP_FLAG, add SMU_DPM_INTERFACE_FLAG,
and corresponding change in VCN4.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use common function to init sdma v6 firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support SDMA firmware init on common function for sdma v2 struct.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use common function to init sdma v5 firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use common function to init sdma v4 firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add an common function to init SDMA related microcode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use common function to init gfx v11 CP firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use common function to init gfx v10 CP firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use common function to init gfx v9 CP firmware ucode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add an common function to init CP related microcode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make sure gfxoff is disabled before gfx register accessing.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the simpified API that calculates distance between two devices.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Disable verbose while getting p2p distance. With verbose, it shows
warning if ACS redirect is set between the devices. Adds noise
to dmesg logs when a few GPU devices are on the same platform.
Example log:
amdgpu 0000:34:00.0: ACS redirect is set between the client and provider (0000:31:00.0)
amdgpu 0000:34:00.0: to disable ACS redirect for this path, add the kernel parameter:
pci=disable_acs_redir=0000:30:00.0;0000:2e:00.0;0000:33:00.0;0000:2e:10.0
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For the asic using smu v13_0_2, there is the following
warning when uninstalling amdgpu:
amdgpu: ras disable gfx failed poison:1 ret:-22.
[Why]:
For the asic using smu v13_0_2, the psp .suspend and
mode1reset is called before amdgpu_ras_pre_fini during
amdgpu uninstall, it has disabled all ras features and
reset the psp. Since the psp is reset, calling
amdgpu_ras_disable_all_features in amdgpu_ras_pre_fini
to disable ras features will fail.
[How]:
If all ras features are disabled, amdgpu_ras_disable_all_features
will not be called to disable all ras features again.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
switch to common helper to initialize rlc firmware
for gfx11
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
switch to common helper to initialize rlc firmware
for gfx10
v2: squash in size validation fix (Alex)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor
during the suspend time. Furthermore, we cannot request a
mode 1 reset under SRIOV as VF. Therefore, we will skip it
as it is called in suspend_noirq() function.
- In the resume code path, we need to send REQ_GPU_INIT to the
hypervisor and also resume PSP IP block under SRIOV.
Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec
underflows when the wptr is divisible by ring->buf_mask + 1.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update mes_v11_api_def.h add_queue API with is_aql_queue parameter. Also
re-use gds_size for the queue size (unused for KFD). MES requires the
queue size in order to compute the actual wptr offset within the queue
RB since it increases monotonically for AQL queues.
v2: Make is_aql_queue assign clearer
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make sure gfxoff is disabled before gfx register accessing.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
switch to common helper to initialize rlc firmware
for gfx9
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialzie rlc firmware according to rlc
firmware header version
v2: squash in backwards compat fix
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use DECLARE_DYNDBG_CLASSMAP across DRM:
- in .c files, since macro defines/initializes a record
- in drivers, $mod_{drv,drm,param}.c
ie where param setup is done, since a classmap is param related
- in drm/drm_print.c
since existing __drm_debug param is defined there,
and we ifdef it, and provide an elaborated alternative.
- in drm_*_helper modules:
dp/drm_dp - 1st item in makefile target
drivers/gpu/drm/drm_crtc_helper.c - random pick iirc.
Since these modules all use identical CLASSMAP declarations (ie: names
and .class_id's) they will all respond together to "class DRM_UT_*"
query-commands:
:#> echo class DRM_UT_KMS +p > /proc/dynamic_debug/control
NOTES:
This changes __drm_debug from int to ulong, so BIT() is usable on it.
DRM's enum drm_debug_category values need to sync with the index of
their respective class-names here. Then .class_id == category, and
dyndbg's class FOO mechanisms will enable drm_dbg(DRM_UT_KMS, ...).
Though DRM needs consistent categories across all modules, thats not
generally needed; modules X and Y could define FOO differently (ie a
different NAME => class_id mapping), changes are made according to
each module's private class-map.
No callsites are actually selected by this patch, since none are
class'd yet.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20220912052852.1123868-3-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A user reported that when he starts a game (MTGA) with wine,
he observed an error msg failed to pin framebuffer with error -12.
Found an issue with the VRAM mem type eviction decision condition
logic. This patch will fix the if condition code error.
Gitlab bug link:
https://gitlab.freedesktop.org/drm/amd/-/issues/2159
Fixes: ded910f368 ("drm/amdgpu: Implement intersect/compatible functions")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922151447.265696-1-Arunpravin.PaneerSelvam@amd.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Some asics still support non-atomic code paths.
Fixes: 66f99628eb ("drm/amdgpu: use dirty framebuffer helper")
Reported-by: Arthur Marsh <arthur.marsh@internode.on.net>
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch updates the PTE flags when translate further (TF) is
enabled:
- With translate_further enabled, invalid PTEs can be 0. Reading
consecutive invalid PTEs as 0 is considered a fault. To prevent
this, ensure invalid PTEs have at least 1 bit set.
- The current invalid PTE flags settings to translate a retry fault
into a no-retry fault, doesn't work with TF enabled. As a result,
update invalid PTE flags settings which works for both TF enabled
and disabled case.
Fixes: 352e683b72 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_4
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_3
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_2
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_1
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To initialize rlc firmware in header v2_0
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Free page table BO from vm resv unlocked context generate below
warnings.
Add a pt_free_work in vm to free page table BO from vm->pt_freed list.
pass vm resv unlock status from page table update caller, and add vm_bo
entry to vm->pt_freed list and schedule the pt_free_work if calling with
vm resv unlocked.
WARNING: CPU: 12 PID: 3238 at
drivers/gpu/drm/ttm/ttm_bo.c:106 ttm_bo_set_bulk_move+0xa1/0xc0
Call Trace:
amdgpu_vm_pt_free+0x42/0xd0 [amdgpu]
amdgpu_vm_pt_free_dfs+0xb3/0xf0 [amdgpu]
amdgpu_vm_ptes_update+0x52d/0x850 [amdgpu]
amdgpu_vm_update_range+0x2a6/0x640 [amdgpu]
svm_range_unmap_from_gpus+0x110/0x300 [amdgpu]
svm_range_cpu_invalidate_pagetables+0x535/0x600 [amdgpu]
__mmu_notifier_invalidate_range_start+0x1cd/0x230
unmap_vmas+0x9d/0x140
unmap_region+0xa8/0x110
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use vm_status_lock to protect all vm_status state transitions to allow
them to happen without a reservation lock in unlocked page table
updates.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use vm_status_lock to protect all vm_status state transitions to allow
them to happen without a reservation lock in unlocked page table
updates.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use vm_status_lock to protect all vm_status state transitions to allow
them to happen without a reservation lock in unlocked page table
updates.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use vm_status_lock to protect all vm_status state transitions to allow
them to happen without a reservation lock in unlocked page table
updates.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use vm_status_lock to protect all vm_status state transitions to allow
them to happen without a reservation lock in unlocked page table
updates.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The vm status_lock will be used to protect all vm status lists.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This got lost somewhere along the way, This fixes
audio not working until set_property was called.
Signed-off-by: hongao <hongao@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some asics still support non-atomic code paths.
Fixes: 66f99628eb ("drm/amdgpu: use dirty framebuffer helper")
Reported-by: Arthur Marsh <arthur.marsh@internode.on.net>
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since that has now landed bump the minor to let userspace know about it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The return value is no longer initialized before the loop because of
moving code around.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: c2b08e7a6d ("drm/amdgpu: move entity selection and job init earlier during CS")
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows submitting jobs as gang which needs to run on multiple engines at the
same time.
All members of the gang get the same implicit, explicit and VM dependencies. So
no gang member will start running until everything else is ready.
The last job is considered the gang leader (usually a submission to the GFX
ring) and used for signaling output dependencies.
Each job is remembered individually as user of a buffer object, so there is no
joining of work at the end.
v2: rebase and fix review comments from Andrey and Yogesh
v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang
leader only when necessary
v4: fix order of pushing jobs and adding fences found by Trigger.
v5: fix job index calculation and adding IBs to jobs
v6: fix typo found by Alex
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows submitting jobs as gang which needs to run on multiple
engines at the same time.
Basic idea is that we have a global gang submit fence representing when the
gang leader is finally pushed to run on the hardware last.
Jobs submitted as gang are never re-submitted in case of a GPU reset since this
won't work and will just deadlock the hardware immediately again.
v2: fix logic inversion, improve documentation, fix rcu
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to what we did for VCN3 use the job instead of the parser
entity. Cleanup the coding style quite a bit as well.
v2: merge improved application check into this patch
v3: finally fix the check
v4: limit to the correct engine
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 250195ff74.
The job should now be initialized when we reach the parser functions.
v2: merge improved application check into this patch
v3: back to the original test, but use the right ring
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Initialize the entity for the CS and scheduler job much earlier.
v2: fix job initialisation order and use correct scheduler instance
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Return early on success and so remove all those "if (r)" in the error
path.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cleanup the coding style and function names to represent the data
they process for pass2 as well.
Go over the chunks only twice now instead of multiple times.
v2: fix job initialisation order and use correct scheduler instance
v3: try to move all functional changes into a separate patch.
v4: separate reordering, pass1 and pass2 change
v5: fix va_start calculation
v6: fix user fence check
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
V3:
Fixed psp fence and memory issues for the asic
using smu v13_0_2 when removing amdgpu device.
[Why]:
1. psp_suspend->psp_free_shared_bufs->
psp_ta_free_shared_buf->
amdgpu_bo_free_kernel->
...->amdgpu_bo_release_notify->
amdgpu_fill_buffer
psp will free vram memory used by psp when psp_suspend
is called. But for the asic using smu v13_0_2, because
psp_suspend is called before adev->shutdown is set to
true when removing the first hive device, amdgpu fill_buffer
will be called, which will cause fence issues when evicting
all vram resources in amdgpu vram mgr_fini.
2. Since psp_hw_fini is not called after calling psp_suspend
and psp_suspend only calls psp_ring_stop, the psp ring memory
will not be released when amdgpu device is removed.
[How]:
1. Set shutdown to true before calling amdgpu_device_gpu_recover,
then amdgpu_fill_buffer will not be called when psp_suspend is
called.
2. Free psp ring memory in psp_sw_fini.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Adjust removal control flow for smu v13_0_2:
During amdgpu uninstallation, when removing the first
device, the kernel needs to first send a mode1reset message
to all gpu devices. Otherwise, smu initialization will fail
the next time amdgpu is installed.
V2:
1. Update commit comments.
2. Remove the global variable amdgpu_device_remove_cnt
and add a variable to the structure amdgpu_hive_info.
3. Use hive to detect the first removed device instead of
a global variable.
V3:
1. Update commit comments.
2. Split a patch into multiple patches.
3. The current patch does:
a. Add a work mode of AMDGPU_RESET_FOR_DEVICE_REMOVE into
the existing gpu recover path, which make all devices
in hive list only have HW reset but no resume (except
the base IP).
b. Call AMDGPU_RESET_FOR_DEVICE_REMOVE and
AMDGPU_NEED_FULL_RESET mode of amdgpu_device_gpu_recover
in amdgpu_pci_remove when removing the first device in
hive list.
c. When removing the first device, the IP blocks keyword
function call sequence is as follows:
.suspend->mode1reset->.resume(basic ip)->.hw_fini->.early_fini->.sw_fini.
^ |
|-<----------<---------<----|
The first three sequences are because of a call to
amdgpu_device_gpu_recover. The three sequences will be
executed in a loop until all devices in the hive list
are iterated.
The sequences starting from .hw_fini only apply to the
first device. Since .suspend has been called before,
except the resumed phase1 basic ip blocks, all other ip
blocks .hw_fini of current device will do nothing.
d. When removing other devices, the calling sequences is the
same as legacy:
.hw_fini -> .early_fini -> .sw_fini.
Since .suspend has been called when removing the first device,
except the resumed phase1 basic ip blocks, all of other ip
blocks .hw_fini of current device will do nothing.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch addes MES and MES-KIQ version in debugfs.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
clean up some inconsistent indentings
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2178
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_firmware_info debugfs will show rlcv/rlcp
ucode version info
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch updates the PTE flags when translate further (TF) is
enabled:
- With translate_further enabled, invalid PTEs can be 0. Reading
consecutive invalid PTEs as 0 is considered a fault. To prevent
this, ensure invalid PTEs have at least 1 bit set.
- The current invalid PTE flags settings to translate a retry fault
into a no-retry fault, doesn't work with TF enabled. As a result,
update invalid PTE flags settings which works for both TF enabled
and disabled case.
Fixes: 352e683b72 ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SDMA update page table may be called from unlocked context, this
generate below warning. Use unlocked iterator to handle this case.
WARNING: CPU: 0 PID: 1475 at
drivers/dma-buf/dma-resv.c:483 dma_resv_iter_next
Call Trace:
dma_resv_iter_first+0x43/0xa0
amdgpu_vm_sdma_update+0x69/0x2d0 [amdgpu]
amdgpu_vm_ptes_update+0x29c/0x870 [amdgpu]
amdgpu_vm_update_range+0x2f6/0x6c0 [amdgpu]
svm_range_unmap_from_gpus+0x115/0x300 [amdgpu]
svm_range_cpu_invalidate_pagetables+0x510/0x5e0 [amdgpu]
__mmu_notifier_invalidate_range_start+0x1d3/0x230
unmap_vmas+0x140/0x150
unmap_region+0xa8/0x110
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be
zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd
engine exists and map/unmap SDMA queues to the non-existent engine.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move common IP init before GMC init so that HDP gets
remapped before GMC init which uses it.
This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This mirrors what we do for other asics and this way we are
sure the sdma doorbell range is properly initialized.
There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized. However, the statement says that it applies to
multimedia as well, but the VCN code currently initializes
doorbells after GFX and there are no known issues there. In my
testing at least I don't see any problems on SDMA.
This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This mirrors what we do for other asics and this way we are
sure the ih doorbell range is properly initialized.
There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized. In this case IH is initialized before GFX,
so there should be no issue.
This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Move common IP init before GMC init so that HDP gets
remapped before GMC init which uses it.
This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This mirrors what we do for other asics and this way we are
sure the sdma doorbell range is properly initialized.
There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized. However, the statement says that it applies to
multimedia as well, but the VCN code currently initializes
doorbells after GFX and there are no known issues there. In my
testing at least I don't see any problems on SDMA.
This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This mirrors what we do for other asics and this way we are
sure the ih doorbell range is properly initialized.
There is a comment about the way doorbells on gfx9 work that
requires that they are initialized for other IPs before GFX
is initialized. In this case IH is initialized before GFX,
so there should be no issue.
This is a prerequisite for fixing the Unsupported Request error
reported through AER during driver load.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
for imu_v11_0_3_program_rlc_ram(). Include imu_v11_0_3.h
in imu_v11_0_3.c.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sort the functions in the order they are called
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cleanup the coding style and function names to represent the data
they process. Only initialize and cleanup the CS structure in
init/fini.
Check the size of the IB chunk in pass1.
v2: fix job initialisation order and use correct scheduler instance
v3: try to move all functional changes into a separate patch.
v4: move reordering and pass2 out of this patch as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.
v2: actually update all usages for KFD
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 94f4c4965e.
We found that the bo_list is missing a protection for its list entries.
Since that is fixed now this workaround can be removed again.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move setting the job resources into amdgpu_job.c
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We should not have any different CS constrains based
on the execution environment.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to reset error status since only umc ras supported on psp v13_0_0.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set correct EEPROM I2C address for smu v13_0_0.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
copy ras driver to psp if present
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Was missing before and would have resulted in a write to
a non-existant register. Normally APUs don't use HDP, but
other asics could use this code and APUs do use the HDP
when used in passthrough.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix everything checkpatch.pl complained about in amdgpu_amdkfd_gpuvm.c
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix everything checkpatch.pl complained about in amdgpu_amdkfd.c
Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is a patch to the amdgpu_sync.c file that fixes some warnings found by the checkpatch.pl tool
Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix everything checkpatch.pl complained about in amdgpu_acpi.c
Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Return the sdma_v6_0_start() directly instead of storing it in another
redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
both get_xgmi_hive and put_xgmi_hive can be skipped since the
reset domain is not necessary for VF
Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No need to reset error status since only umc ras supported on psp v13_0_0.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Was missing before and would have resulted in a write to
a non-existant register. Normally APUs don't use HDP, but
other asics could use this code and APUs do use the HDP
when used in passthrough.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
align TMR BO size TO tmr size is not necessary,
modify the size to 1M to avoid re-create BO fail
when serious VRAM fragmentation.
v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable full reset for RAS supported configuration on gc v11_0_0.
v2: simplify the code.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As per PCIE Base Spec r4.0 Section 6.18
'Software must not enable LTR in an Endpoint unless the Root Complex
and all intermediate Switches indicate support for LTR.'
This fixes the Unsupported Request error reported through AER during
ASPM enablement.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216455
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
align TMR BO size TO tmr size is not necessary,
modify the size to 1M to avoid re-create BO fail
when serious VRAM fragmentation.
v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size
Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable full reset for RAS supported configuration on gc v11_0_0.
v2: simplify the code.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only check MCUMC_STATUS for CE counter for umc v8_10.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For SRIOV configuration, host driver control the reset method(either FLR or
heavier chain reset). The host will notify the guest individually with FLR
message if individual GPU within the hive need to be reset. So for guest
side, no need to use hive->reset_domain to replace the original per
device reset_domain
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As per PCIE Base Spec r4.0 Section 6.18
'Software must not enable LTR in an Endpoint unless the Root Complex
and all intermediate Switches indicate support for LTR.'
This fixes the Unsupported Request error reported through AER during
ASPM enablement.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216455
The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.
Fixes: 8795e182b0 ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm-misc-next for v6.1-rc1:
[airlied - fix sun4i_tv build]
UAPI Changes:
- Hide unregistered connectors from GETCONNECTOR ioctl.
- drm/virtio no longer advertises LINEAR modifier, as it doesn't work.
-
Cross-subsystem Changes:
- Fix GPF in udmabuf failure path.
Core Changes:
- Rework TTM placement to use intersect/compatible functions.
- Drop legacy DP-MST support.
- More DP-MST related fixes, and move all state into atomic.
- Make DRM_MIPI_DBI select DRM_KMS_HELPER.
- Add audio_infoframe packing for DP.
- Add logging when some atomic check functions fail.
- Assorted documentation updates and fixes.
Driver Changes:
- Assorted cleanups and fixes in msm, lcdif, nouveau, virtio,
panel/ilitek, bridge/icn6211, tve200, gma500, bridge/*, panfrost, via,
bochs, qxl, sun4i.
- Add add AUO B133UAN02.1, IVO M133NW4J-R3, Innolux N120ACA-EA1 eDP panels.
- Improve DP-MST modeset state handling in amdgpu, nouveau, i915.
- Drop DP-MST from radeon driver, it was broken and only user of legacy
DP-MST.
- Handle unplugging better in vc4.
- Simplify drm cmdparser tests.
- Add DP support to ti-sn65dsi86.
- Add MT8195 DP support to mediatek.
- Support RGB565, XRGB64, and ARGB64 formats in vkms.
- Convert sun4i tv support to atomic.
- Refactor vc4/vec TV Modesetting, and fix timings.
- Use atomic helpers instead of simple display helpers in ssd130x.
Maintainer changes:
- Add Douglas Anderson as reviewer for panel-edp.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a489485b-3ebc-c734-0f80-aed963d89efe@linux.intel.com
current function mixes CSDMA_DOORBELL_RANGE and SDMA0_DOORBELL_RANGE
range/size manipulation, while these 2 registers have difference size
field mask. Remove range/size manipulation for SDMA0_DOORBELL_RANGE.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Xiaojian Du <Xiaojian.Du@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Addresses should be printed in hex format.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
current function mixes CSDMA_DOORBELL_RANGE and SDMA0_DOORBELL_RANGE
range/size manipulation, while these 2 registers have difference size
field mask. Remove range/size manipulation for SDMA0_DOORBELL_RANGE.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Xiaojian Du <Xiaojian.Du@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Addresses should be printed in hex format.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
V1:
The psp_cmd_submit_buf function is called by psp_hw_fini to send
TA unload messages to psp to terminate ras, asd and tmr. But when
amdgpu is uninstalled, drm_dev_unplug is called earlier than
psp_hw_fini in amdgpu_pci_remove, the calling order as follows:
static void amdgpu_pci_remove(struct pci_dev *pdev) {
drm_dev_unplug
......
amdgpu_driver_unload_kms->amdgpu_device_fini_hw->...
->.hw_fini->psp_hw_fini->...
->psp_ta_unload->psp_cmd_submit_buf
......
}
The program will return when calling drm_dev_enter in psp_cmd_submit_buf.
So the call to drm_dev_enter in psp_cmd_submit_buf should be
removed, so that the TA unload messages can be sent to the psp
when amdgpu is uninstalled.
V2:
1. Restore psp_cmd_submit_buf to its original code.
2. Move drm_dev_unplug call after amdgpu_driver_unload_kms in
amdgpu_pci_remove.
3. Since amdgpu_device_fini_hw is called by amdgpu_driver_unload_kms,
remove the unplug check to release device mmio resource in
amdgpu_device_fini_hw before calling drm_dev_unplug.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
UAPI Changes:
Cross-subsystem Changes:
- DMA-buf: documentation updates.
- Assorted small fixes to vga16fb
- Fix fbdev drivers to use the aperture helpers.
- Make removal of conflicting drivers work correctly without fbdev enabled.
Core Changes:
- bridge, scheduler, dp-mst: Assorted small fixes.
- Add more format helpers to fourcc, and use it to replace the cpp usage.
- Add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx (single channel), and DRM_FORMAT_Dxx
("darkness", inverted single channel)
- Add packed AYUV8888 and XYUV8888 formats.
- Assorted documentation updates.
- Rename ttm_bo_init to ttm_bo_init_validate.
- Allow TTM bo's to exist without backing store.
- Convert drm selftests to kunit.
- Add managed init functions for (panel) bridge, crtc, encoder and connector.
- Fix endianness handling in various format conversion helpers.
- Make tests pass on big-endian platforms, and add test for rgb888 -> rgb565
- Move DRM_PLANE_HELPER_NO_SCALING to atomic helpers and rename, so
drm_plane_helper is no longer needed in most drivers.
- Use idr_init_base instead of idr_init.
- Rename FB and GEM CMA helpers to DMA helpers.
- Rework XRGB8888 related conversion helpers, and add drm_fb_blit() that
takes a iosys_map. Make drm_fb_memcpy take an iosys_map too.
- Move edid luminance calculation to core, and use it in i915.
Driver Changes:
- bridge/{adv7511,ti-sn65dsi86,parade-ps8640}, panel/{simple,nt35510,tc358767},
nouveau, sun4i, mipi-dsi, mgag200, bochs, arm, komeda, vmwgfx, pl111:
Assorted small fixes and doc updates.
- vc4: Rework hdmi power up, and depend on PM.
- panel/simple: Add Samsung LTL101AL01.
- ingenic: Add JZ4760(B) support, avoid a modeset when sharpness property
is unchanged, and use the new PM ops.
- Revert some amdgpu commits that cause garbaged graphics when starting
X, and reapply them with the real problem fixed.
- Completely rework vc4 init to use managed helpers.
- Rename via_drv to via_dri1, and move all stuff there only used by the
dri1 implementation in preperation for atomic modeset.
- Use regmap bulk write in ssd130x.
- Power sequence and clock updates to it6505.
- Split panel-sitrox-st7701 init sequence and rework mode programming code.
- virtio: Improve error and edge conditions handling, and convert to use managed
helpers.
- Add Samsung LTL101AL01, B120XAN01.0, R140NWF5 RH, DMT028VGHMCMI-1A T, panels.
- Add generic fbdev support to komeda.
- Split mgag200 modeset handling to make it more model-specific.
- Convert simpledrm to use atomic helpers.
- Improve udl suspend/disconnect handling.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmMA59MACgkQ/lWMcqZw
E8MBihAAtttSIF8fvRIgnoFeswteB5HlXtrSOVRns+0h7mNzcmUL3yp/jV9betND
NEfBCERRIRk5vqkzm2o9P/rTG+KMYCyyCNdcvvh+rVn+HU2V9MwtJdowZqLu7Rxi
Qn59SaL6fMQanJwF8yHcii1ikqelU8TfsMXcFXt2d/yjn5D/h80tNOULEB74zIDH
Vey/xTawcqEz3Qbj2mL4VihEnzU1cziGTpZvAWU2GTdWLxxN88uA4RD2JDyvGaXG
oO6oilYT9TW+2C+0jfMUysu7RcNYuo4KD8dSVLvec+UKDKdO1pSqn40vgTZp0MBw
V0VGg/Lqg6+u33UQkxeRpdzTboD4cMjoeP0/IJdHQVMYYH449tXDgZEjXqVHdz2c
Ea8JIm36CHjqA0Ua3DT8y1qRtdrjuATJauoB6MXW8RpguqMzVuI4e9Dio77WRNYe
Lg/V5AlQEKcwxeqe6v+8K7aB4lvHZ9g0WSMFHoxYz28d2cFWDVeKgK/GCW3WWMKo
AQigRS84z3QBqK+IeTjbfZREoRFtx7NzQpzU9jQdL06XGWExhGOs8yU2OCoDNN8+
iFMBStEPHLPg27HrLV5JYeMRUBPJMnwQhAEnrSv9KqIfPD3/g2jonoGm3J57p+OI
Ag7pwg88spzP1DC7M6+UvzUnH76Dhf4hdnfQjz10c6BfS1ODPbA=
=kU/l
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2022-08-20-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.1:
UAPI Changes:
Cross-subsystem Changes:
- DMA-buf: documentation updates.
- Assorted small fixes to vga16fb
- Fix fbdev drivers to use the aperture helpers.
- Make removal of conflicting drivers work correctly without fbdev enabled.
Core Changes:
- bridge, scheduler, dp-mst: Assorted small fixes.
- Add more format helpers to fourcc, and use it to replace the cpp usage.
- Add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx (single channel), and DRM_FORMAT_Dxx
("darkness", inverted single channel)
- Add packed AYUV8888 and XYUV8888 formats.
- Assorted documentation updates.
- Rename ttm_bo_init to ttm_bo_init_validate.
- Allow TTM bo's to exist without backing store.
- Convert drm selftests to kunit.
- Add managed init functions for (panel) bridge, crtc, encoder and connector.
- Fix endianness handling in various format conversion helpers.
- Make tests pass on big-endian platforms, and add test for rgb888 -> rgb565
- Move DRM_PLANE_HELPER_NO_SCALING to atomic helpers and rename, so
drm_plane_helper is no longer needed in most drivers.
- Use idr_init_base instead of idr_init.
- Rename FB and GEM CMA helpers to DMA helpers.
- Rework XRGB8888 related conversion helpers, and add drm_fb_blit() that
takes a iosys_map. Make drm_fb_memcpy take an iosys_map too.
- Move edid luminance calculation to core, and use it in i915.
Driver Changes:
- bridge/{adv7511,ti-sn65dsi86,parade-ps8640}, panel/{simple,nt35510,tc358767},
nouveau, sun4i, mipi-dsi, mgag200, bochs, arm, komeda, vmwgfx, pl111:
Assorted small fixes and doc updates.
- vc4: Rework hdmi power up, and depend on PM.
- panel/simple: Add Samsung LTL101AL01.
- ingenic: Add JZ4760(B) support, avoid a modeset when sharpness property
is unchanged, and use the new PM ops.
- Revert some amdgpu commits that cause garbaged graphics when starting
X, and reapply them with the real problem fixed.
- Completely rework vc4 init to use managed helpers.
- Rename via_drv to via_dri1, and move all stuff there only used by the
dri1 implementation in preperation for atomic modeset.
- Use regmap bulk write in ssd130x.
- Power sequence and clock updates to it6505.
- Split panel-sitrox-st7701 init sequence and rework mode programming code.
- virtio: Improve error and edge conditions handling, and convert to use managed
helpers.
- Add Samsung LTL101AL01, B120XAN01.0, R140NWF5 RH, DMT028VGHMCMI-1A T, panels.
- Add generic fbdev support to komeda.
- Split mgag200 modeset handling to make it more model-specific.
- Convert simpledrm to use atomic helpers.
- Improve udl suspend/disconnect handling.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f0c71766-61e8-19b7-763a-5fbcdefc633d@linux.intel.com
Typically the acpi_video driver will initialize before amdgpu, which
used to cause /sys/class/backlight/acpi_video0 to get registered and then
amdgpu would register its own amdgpu_bl# device later. After which
the drivers/acpi/video_detect.c code unregistered the acpi_video0 device
to avoid there being 2 backlight devices.
This means that userspace used to briefly see 2 devices and the
disappearing of acpi_video0 after a brief time confuses the systemd
backlight level save/restore code, see e.g.:
https://bbs.archlinux.org/viewtopic.php?id=269920
To fix this the ACPI video code has been modified to make backlight class
device registration a separate step, relying on the drm/kms driver to
ask for the acpi_video backlight registration after it is done setting up
its native backlight device.
Add a call to the new acpi_video_register_backlight() when amdgpu skips
registering its own backlight device because of either the firmware_flags
or the acpi_video_get_backlight_type() return value. This ensures that
if the acpi_video backlight device should be used, it will be available
before the amdgpu drm_device gets registered with userspace.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
These structures are basically ported from MMSCH v3_0,
besides, added RB and RB4 enablement flag to support
unified queue
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable unified queue support for sriov, abandon all previous
multi-queue settings
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Previously since vcn0/vcn1 are not enabled, loading firmware
is skipped. Now add firmware loading back since vcn0/vcn1
has already been enabled on sriov
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For sriov, CG and MG are controlled from hypervisor side,
no need to manage them again in ip init
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
There is no CG(Clock Gating)/PG(Power Gating) requirement on SRIOV VF.
For multi VF, VF should not enable any CG/PG features.
For one VF, PF will program CG/PG related registers.
[How]
Do not set any cg/pg flag bit at early init under sriov.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
KIQ register init requires GRBM_GFX_CNTL to select KIQ.
[How]
As RLCG accessing registers will save the data of GRBM_GFX_CNTL and restore it.
Use RLCG indirect accessing register method to select grbm instead of mmio directly access.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
As SDMA0_SEM_WAIT_FAIL_TIMER_CNTL is a PF-only register,
L1 would block this register for VF access.
[How]
VF do not program it.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
As VF cannot read MMMC_VM_FB_OFFSET with L1 Policy(read 0xffffffff).
It leads to driver get the incorrect vram base offset.
[How]
Since SR-IOV is dGPU only, skip reading this register and set the
fb_offest to 0.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
vm_l2_bank_select_reserved_cid2 is a PF_only register
that cannot be programmed by VF. This feature is only
support HDP using GPUVM page tables to access FB memory
which should be disabled on SRIOV.
[How]
Disable the feature on VF.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
VF should not program these registers, the value were defined in the host.
[How]
Skip writing them in SRIOV environment and program them on host side.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
With L1 Policy applied, IH_RB_CNTL/RING cannot be accessed by VF.
[How]
Use PSP program IH_RB_CNTL in VF.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support for PSP 13.0.10 for SR-IOV VF
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SRIOV needs to initialize mmsch instead of multimedia engines
directly. So currently remove them for SR-IOV until the code and
firmwares are ready.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SR-IOV may need to load different firmwares for different ASIC inside
VF.
So create a new function in amdgpu_virt to check whether FW load needs
to be skipped.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Under SR-IOV, if VF is switched out then its doorbell will be disabled,
SDMA rely on WPTR_POLL to get doorbells which was sent during VF
switched-out time.
[How]
For SR-IOV, set SDMA WPTR_POLL_ENABLE to 1.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Under SR-IOV, we are not sure whether pipe status is
good or not when doing initialization. The compute engine
maybe fail to bringup if pipe status is bad.
[How]
Do an RS64 pipe reset for MEC before we do initialization.
Also apply to bare-metal.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
under SR-IOV, the nbio doorbell range will be defined by PF. So VF
nbio doorbell range registers will be blocked. It will cause violation
if VF access those registers directly.
[How]
create an nbio_v4_3_sriov_funcs for sriov nbio_v4_3 initialization to
skip the setting for the doorbell range registers.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For further chips we will use CHIP_IP_DISCOVERY, so add this
support for virtualization
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be
zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd
engine exists and map/unmap SDMA queues to the non-existent engine.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>