linux

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	1faeeb315f	amd-drm-next-6.16-2025-05-09: amdgpu: - IPS fixes - DSC cleanup - DC Scaling updates - DC FP fixes - Fused I2C-over-AUX updates - SubVP fixes - Freesync fix - DMUB AUX fixes - VCN fix - Hibernation fixes - HDP fixes - DCN 2.1 fixes - DPIA fixes - DMUB updates - Use drm_file_err in amdgpu - Enforce isolation updates - Use new dma_fence helpers - USERQ fixes - Documentation updates - Misc code cleanups - SR-IOV updates - RAS updates - PSP 12 cleanups amdkfd: - Update error messages for SDMA - Userptr updates drm: - Add drm_file_err function dma-buf: - Add a helper to sort and deduplicate dma_fence arrays -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaB6KhwAKCRC93/aFa7yZ 2FYIAQD43sIYoZCo6l7lZt0m54d6Mt1jangVhMK/jH31eOgKYQEAoRJKSMjT6Ktl FLBzG8FXekwQELNu6EgyG8Ywwim9AQ0= =dfkK -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.16-2025-05-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.16-2025-05-09: amdgpu: - IPS fixes - DSC cleanup - DC Scaling updates - DC FP fixes - Fused I2C-over-AUX updates - SubVP fixes - Freesync fix - DMUB AUX fixes - VCN fix - Hibernation fixes - HDP fixes - DCN 2.1 fixes - DPIA fixes - DMUB updates - Use drm_file_err in amdgpu - Enforce isolation updates - Use new dma_fence helpers - USERQ fixes - Documentation updates - Misc code cleanups - SR-IOV updates - RAS updates - PSP 12 cleanups amdkfd: - Update error messages for SDMA - Userptr updates drm: - Add drm_file_err function dma-buf: - Add a helper to sort and deduplicate dma_fence arrays From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250509230951.3871914-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>	2025-05-12 07:14:34 +10:00
Lijo Lazar	afc6053d4c	Reapply: drm/amdgpu: Use generic hdp flush function Except HDP v5.2 all use a common logic for HDP flush. Use a generic function. HDP v5.2 forces NO_KIQ logic, revisit it later. Reapply after fixing up an HDP regression. v2: merge the fix (Alex) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:21:37 -04:00
Alex Deucher	dbc064adfc	drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: `689275140c` ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:21:12 -04:00
Alex Deucher	84141ff615	drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: `abe1cbaec6` ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:20:48 -04:00
Huang Rui	793fa8ce4e	drm/amdgpu: cleanup sriov function for psp v12 PSP v12 won't have SRIOV function. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:20:43 -04:00
Alex Deucher	4a89b7698e	drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: `f756dbac1c` ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:20:19 -04:00
Alex Deucher	a5cb344033	drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: `cf424020e0` ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:18:30 -04:00
Huang Rui	518e22b42c	drm/amdgpu: remove re-route ih in psp v12 APU doesn't have second IH ring, so re-routing action here is a no-op. It will take a lot of time to wait timeout from PSP during the initialization. So remove the function in psp v12. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-08 11:18:24 -04:00
Mario Limonciello	b54695dae9	drm/amd: Add per-ring reset for vcn v5.0.0 use If there is a problem requiring a reset of the VCN engine, it is better to reset the VCN engine rather than the entire GPU. Add a reset callback for the ring which will stop and start VCN if an issue happens. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250506204948.12048-4-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:48:24 -04:00
Mario Limonciello	b8b6e6f165	drm/amd: Add per-ring reset for vcn v4.0.0 use If there is a problem requiring a reset of the VCN engine, it is better to reset the VCN engine rather than the entire GPU. Add a reset callback for the ring which will stop and start VCN if an issue happens. Link: https://lore.kernel.org/r/20250506204948.12048-3-mario.limonciello@amd.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:48:19 -04:00
Mario Limonciello	d1a46cdd00	drm/amd: Add per-ring reset for vcn v4.0.5 use There is a problem occurring on VCN 4.0.5 where in some situations a job is timing out. This triggers a job timeout which then causes a GPU reset for recovery. That has exposed a number of issues with GPU reset that have since been fixed. But also a GPU reset isn't actually needed for this circumstance. Just restarting the ring is enough. Add a reset callback for the ring which will stop and start VCN if the issue happens. Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3909 Link: https://lore.kernel.org/r/20250506204948.12048-2-mario.limonciello@amd.com Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:48:11 -04:00
Alex Deucher	5c937b4a60	drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: `c9b8dcabb5` ("drm/amdgpu/hdp4.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:47:33 -04:00
Alex Deucher	e8614fc769	Revert "drm/amdgpu: Use generic hdp flush function" This reverts commit `18a878fd8a`. Revert this temporarily to make it easier to fix a regression in the HDP handling. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:47:02 -04:00
Dr. David Alan Gilbert	4c83d4538b	drm/amd/pm/smu13: Remove unused smu_v3 functions smu_v13_0_display_clock_voltage_request() and smu_v13_0_set_min_deep_sleep_dcefclk() were added in 2020 by commit `c05d1c4015` ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)") but have remained unused. Remove them. smu_v13_0_display_clock_voltage_request() was the only user of smu_v13_0_set_hard_freq_limited_range(). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:46:42 -04:00
Dr. David Alan Gilbert	2c599d66b9	drm/amd/pm/smu11: Remove unused smu_v11_0_get_dpm_level_range The last use of smu_v11_0_get_dpm_level_range() was removed in 2020 by commit `46a301e14e` ("drm/amd/powerplay: drop unnecessary Navi1x specific APIs") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:46:21 -04:00
Dr. David Alan Gilbert	1d8d8b8d14	drm/amd/pm/smu7: Remove unused smu7_copy_bytes_from_smc smu7_copy_bytes_from_smc() was added in 2016 by commit `1ff55f4651` ("drm/amd/powerplay: implement smu7_smumgr for asics with smu ip version 7.") but never used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:46:00 -04:00
Sunil Khatri	c2a3bac7c8	drm/amdgpu: fix the indentation fix the indentation drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:6992 gfx_v11_ip_dump compiler: gcc-11 (Debian 11.3.0-12) 11.3.0 Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202505071619.7sHTLpNg-lkp@intel.com/ Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:45:27 -04:00
Huang Rui	8465f0a372	drm/amdgpu: remove mdelay in psp v12 Since secure firmware is more stable than bring up phase, I believe we don't need such mdelays any more before wait PSP response on PSP v12. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Trigger Huang <Trigger.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:45:16 -04:00
Shane Xiao	2d274bf709	amd/amdkfd: Trigger segfault for early userptr unmmapping If applications unmap the memory before destroying the userptr, it needs trigger a segfault to notify user space to correct the free sequence in VM debug mode. v2: Send gpu access fault to user space v3: Report gpu address to user space, remove unnecessary params v4: update pr_err into one line, remove userptr log info Signed-off-by: Shane Xiao <shane.xiao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:45:09 -04:00
Shane Xiao	8e320f67d4	drm/amdgpu: Add debug bit for userptr usage In VM debug mode, it is desirable to notify the application to correct the freeing sequence by unmapping the memory before destroying the userptr in the old userptr path. Add a bitmask to decide whether to send gpu vm fault to the applition. Signed-off-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:45:04 -04:00
Prike Liang	def41146b9	drm/amdgpu: unreserve the gem BO before returning from attach error It requires unlocking the reserved gem BO before returning from attaching the eviction fence error. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:44:59 -04:00
Prike Liang	926c79ad6e	drm/amdgpu: promote the implicit sync to the dependent read fences The driver doesn't want to implicitly sync on the DMA_RESV_USAGE_BOOKKEEP usage fences, and the BOOKEEP fences should be synced explicitly. So, as the VM implicit syncing only need to return and sync the dependent read fences. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:44:51 -04:00
Alex Deucher	6edc89645c	drm/amdgpu/psp: mark securedisplay TA as optional This is an optional TA which is only available on certain embedded systems. Mark it as optional to avoid user confusion. This mirrors what we already do for other optional TAs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4181 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:44:45 -04:00
Alex Deucher	06f2dcc241	drm/amdgpu: fix pm notifier handling Set the s3/s0ix and s4 flags in the pm notifier so that we can skip the resource evictions properly in pm prepare based on whether we are suspending or hibernating. Drop the eviction as processes are not frozen at this time, we we can end up getting stuck trying to evict VRAM while applications continue to submit work which causes the buffers to get pulled back into VRAM. v2: Move suspend flags out of pm notifier (Mario) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178 Fixes: `2965e6355d` ("drm/amd: Add Suspend/Hibernate notification callback support") Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:43:18 -04:00
Ellen Pan	086809c82c	drm/amdgpu: Implement unrecoverable error message handling for VFs This notification may arrive in VF mailbox while polling for response from another event. This patches covers the following scenarios: - If VF is already in RMA state, then do not attempt to contact the host. Host will ignore the VF after sending the notification. - If the notification is detected during polling, then set the RMA status, and return error to caller. - If the notification arrives by interrupt, then set the RMA status and queue a reset. This reset will fail and VF will stop runtime services. Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Signed-off-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:43:13 -04:00
Ellen Pan	6be34e1d1f	drm/amdgpu: Add unrecoverable error message definitions for VFs Host may stop runtime services after reaching a bad page threshold. This notification will indicate to the VF that it no longer has access to the GPU. Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Signed-off-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:43:07 -04:00
Alex Deucher	ce8f7d9589	Revert "drm/amd: Stop evicting resources on APUs in suspend" This reverts commit `3a9626c816`. This breaks S4 because we end up setting the s3/s0ix flags even when we are entering s4 since prepare is used by both flows. The causes both the S3/s0ix and s4 flags to be set which breaks several checks in the driver which assume they are mutually exclusive. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3634 Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:42:30 -04:00
Ruijing Dong	5c89ceda99	drm/amdgpu/vcn: using separate VCN1_AON_SOC offset VCN1_AON_SOC_ADDRESS_3_0 offset varies on different VCN generations, the issue in vcn4.0.5 is caused by a different VCN1_AON_SOC_ADDRESS_3_0 offset. This patch does the following: 1. use the same offset for other VCN generations. 2. use the vcn4.0.5 special offset 3. update vcn_4_0 and vcn_5_0 Acked-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:42:19 -04:00
Prike Liang	af7160c25c	drm/amdgpu: fix the eviction fence dereference The dma_resv_add_fence() already refers to the added fence. So when attaching the evciton fence to the gem bo, it needn't refer to it anymore. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:41:54 -04:00
Ellen Pan	5da3d8820d	drm/amdgpu: Implement Runtime Bad Page query for VFs Host will send a notification when new bad pages are available. Uopn guest request, the first 256 bad page addresses will be placed into the PF2VF region. Guest should pause the PF2VF worker thread while the copy is in progress. Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Signed-off-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:41:49 -04:00
Ellen Pan	6615f1ad34	drm/amdgpu: Add Runtime Bad Page message definitions for VFs Currently VFs rely on poison consumption interrupt from HW to kick off the bad page retirement process. Part of this process includes a VF reset. This patch adds the following: 1) Host Bad Pages notification message. 2) Guest request bad pages message. When combined, VFs are able to reserve the pages early, and potentially avoid future poison consumption that will disrupt user services from consequent FLR. Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Signed-off-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:41:43 -04:00
Rodrigo Siqueira	c8305c6327	drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb Add some random documentation associated with the ring buffer manipulations and writeback. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:41:35 -04:00
Eric Huang	f0be138691	drm/amdkfd: change error to warning message for SDMA queues creation SDMA doesn't support oversubsciption, it is the user matter to create queues over HW limit, but not supposed to be a KFD error. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:41:27 -04:00
Harry Wentland	d01ca8708d	drm/amd/display: Don't check for NULL divisor in fixpt code [Why] We check for a NULL divisor but don't act on it. This check does nothing other than throw a warning. It does confuse static checkers though: See https://lkml.org/lkml/2025/4/26/371 [How] Drop the ASSERTs in both DC and SPL variants. Fixes: `4562236b3b` ("drm/amd/dc: Add dc display driver (v2)") Fixes: `6efc0ab3b0` ("drm/amd/display: add back quality EASF and ISHARP and dc dependency changes") Signed-off-by: Harry Wentland <harry.wentland@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Leo Li <sunpeng.li@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:40:15 -04:00
Ivan Shamliev	e2255687c8	drm/amd/display: Use true/false for boolean variables in DML2 core files Replace 0 and 1 with false and true for boolean variables in dml2_core_dcn4_calcs.c and dml2_core_utils.c to align with the Linux kernel coding style guidelines, which recommend using C99 bool type with true/false values. Signed-off-by: Ivan Shamliev <ivan.shamliev.dev@abv.bg> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:40:02 -04:00
James Flowers	3e71fc7c4c	drm/amd/display: adds kernel-doc comment for dc_stream_remove_writeback() Adds kernel-doc for externally linked dc_stream_remove_writeback function. Signed-off-by: James Flowers <bold.zone2373@fastmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-07 17:39:55 -04:00
Dave Airlie	5e0c679981	Linux 6.15-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmgX1CgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGxiIH/A7LHlVatGEQgRFi 0JALDgcuGTMtMU1qD43rv8Z1GXqTpCAlaBt9D1C9cUH/86MGyBTVRWgVy0wkaU2U 8QSfFWQIbrdaIzelHtzmAv5IDtb+KrcX1iYGLcMb6ZYaWkv8/CMzMX1nkgxEr1QT 37Xo3/F17yJumAdNQxdRhVLGy2d3X5rScecpufwh97sMwoddllMCDs2LIoeSAYpG 376/wzni09G2fADa8MEKqcaMue4qcf0FOo/gOkT8YwFGSZLKa6uumlBLg04QoCt0 foK2vfcci1q4H4ZbCu3uQESYGLQHY0f2ICDCwC3m25VF9a81TmlbC3MLum3vhmKe RtLDcXg= =xyaI -----END PGP SIGNATURE----- BackMerge tag 'v6.15-rc5' into drm-next Linux 6.15-rc5, requested by tzimmerman for fixes required in drm-next. Signed-off-by: Dave Airlie <airlied@redhat.com>	2025-05-06 16:39:25 +10:00
Arvind Yadav	3e50b1d625	drm/amdgpu: only keep most recent fence for each context Keep only the latest fences to reduce the number of values given back to userspace v2: - Export this code from dma-fence-unwrap.c(by Christian). v3: - To split this in a dma_buf patch and amd userq patch(by Sunil). - No need to add a new function just re-use existing(by Christian). v4: Export dma_fence_dedub_array function and used it(by Christian). Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:58 -04:00
Srinivasan Shanmugam	68071eb0ae	drm/amdgpu: Add Support for enforcing isolation without Cleaner Shader Adjusted the enforce isolation setting handling to include the ability to disable the cleaner shader without affecting isolation between tasks. v2: Updated enforce isolation documentation and parameters. (Alex) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:53 -04:00
Sunil Khatri	71353c1a4f	drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver update the functions in amdgpu_userqueues.c from DRM_DBG_DRIVER to drm_dbg_driver so multi gpu instance can be logged in. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:38 -04:00
Sunil Khatri	c46a37628a	drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userq.c change the DRM_ERROR and drm_err to drm_file_err to add process name and pid to the logging. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:33 -04:00
Sunil Khatri	8c97cdb1a6	drm/amdgpu: use drm_file_err in fence timeouts use drm_file_err instead of DRM_ERROR which adds process and pid information in the userqueue error logging. Sample log: [ 19.802315] amdgpu 0000:0a:00.0: [drm] ERROR comm: ibus-x11 pid: 2055 client: Unset ... Couldn't unmap all the queues [ 19.802319] amdgpu 0000:0a:00.0: [drm] ERROR comm: ibus-x11 pid: 2055 client: Unset ... Failed to evict userqueue [ 19.838432] amdgpu 0000:0a:00.0: [drm] ERROR comm: systemd-logind pid: 1042 client: Unset ... Couldn't unmap all the queues [ 19.838436] amdgpu 0000:0a:00.0: [drm] ERROR comm: systemd-logind pid: 1042 client: Unset ... Failed to evict userqueue Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:25 -04:00
Sunil Khatri	30ff75809d	drm/amdgpu: add drm_file reference in userq_mgr drm_file will be used in usermode queues code to enable better process information in logging and hence add drm_file part of the userq_mgr struct. update the drm_file pointer in userq_mgr for each amdgpu_driver_open_kms. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:18 -04:00
Taimur Hassan	c38de9db74	drm/amd/display: Promote DC to 3.2.331 Summary * Remove redundant NULL check * Fix invalid context error in dml helper * Prepare for Fused I2C-over-AUX * Allow DSCClock disable * Vmax / Vmin update for Vsync * Fix race condition in DPIA AUX transfer * Fix wrong handling for AUX_DEFER case * Only wait for required space in DMUB mailbox Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:29:06 -04:00
Dillon Varone	dbc5b24fff	drm/amd/display: Only wait for required free space in DMUB mailbox [WHY&HOW] When command submission is blocked by a full mailbox, only wait for enough space to free to submit the command, instead of waiting for idle. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:28:59 -04:00
Meenakshikumar Somasundaram	59510792ba	drm/amd/display: Assign preferred stream encoder instance to dpia [Why] For dpia, preferred engine instance availability is not checked when assigning stream encoder instance. [How] Check for dpia preferred engine id and assign the same stream encoder instance for the stream if available. Reviewed-by: PeiChen Huang <peichen.huang@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:28:47 -04:00
Wayne Lin	3637e457eb	drm/amd/display: Fix wrong handling for AUX_DEFER case [Why] We incorrectly ack all bytes get written when the reply actually is defer. When it's defer, means sink is not ready for the request. We should retry the request. [How] Only reply all data get written when receive I2C_ACK\|AUX_ACK. Otherwise, reply the number of actual written bytes received from the sink. Add some messages to facilitate debugging as well. Fixes: `ad6756b4d7` ("drm/amd/display: Shift dc link aux to aux_payload") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:26:43 -04:00
Wayne Lin	9b540e3fe6	drm/amd/display: Copy AUX read reply data whenever length > 0 [Why] amdgpu_dm_process_dmub_aux_transfer_sync() should return all exact data reply from the sink side. Don't do the analysis job in it. [How] Remove unnecessary check condition AUX_TRANSACTION_REPLY_AUX_ACK. Fixes: `ead08b95fa` ("drm/amd/display: Fix race condition in DPIA AUX transfer") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:25:31 -04:00
Wayne Lin	81b5c6fa62	drm/amd/display: Remove incorrect checking in dmub aux handler [Why & How] "Request length != reply length" is expected behavior defined in spec. It's not an invalid reply. Besides, replied data handling logic is not designed to be written in amdgpu_dm_process_dmub_aux_transfer_sync(). Remove the incorrectly handling section. Fixes: `ead08b95fa` ("drm/amd/display: Fix race condition in DPIA AUX transfer") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:01:50 -04:00
Wayne Lin	1db6c9e9b6	drm/amd/display: Fix the checking condition in dmub aux handling [Why & How] Fix the checking condition for detecting AUX_RET_ERROR_PROTOCOL_ERROR. It was wrongly checking by "not equals to" Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 13:01:44 -04:00
Wayne Lin	d5c9ade755	drm/amd/display: Shift DMUB AUX reply command if necessary [Why] Defined value of dmub AUX reply command field get updated but didn't adjust dm receiving side accordingly. [How] Check the received reply command value to see if it's updated version or not. Adjust it if necessary. Fixes: `ead08b95fa` ("drm/amd/display: Fix race condition in DPIA AUX transfer") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:58:55 -04:00
Dillon Varone	4465dd0e41	drm/amd/display: Refactor SubVP cursor limiting logic [WHY] There are several gaps that can result in SubVP being enabled with incompatible HW cursor sizes, and unjust restrictions to cursor size due to wrong predictions on future usage of SubVP. [HOW] - remove "prediction" logic in favor of tagging based on previous SubVP usage - block SubVP if current HW cursor settings are incompatible - provide interface for DM to determine if HW cursor should be disabled due to an attempt to enable SubVP Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:58:49 -04:00
Austin Zheng	fe3250f108	drm/amd/display: Call FP Protect Before Mode Programming/Mode Support [Why] Memory allocation occurs within dml21_validate() for adding phantom planes. May cause kernel to be tainted due to usage of FP Start. [How] Move FP start from dml21_validate to before mode programming/mode support. Calculations requiring floating point are all done within mode programming or mode support. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:58:36 -04:00
Alex Hung	94da0735b6	drm/amd/display: Remove unnecessary DC_FP_START/DC_FP_END [WHY & HOW] Remove the unnecessary DC_FP_START/DC_FP_END pair to reduce time in preempt_disable. It also fixes "BUG: sleeping function called from invalid context" error messages because of calling kzalloc with GFP_KERNEL which can sleep. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:58:30 -04:00
JinZe Xu	a9cbeb6059	drm/amd/display: Send IPSExit unconditionally. [Why&How] PMFW needs to flush page cache in IPSExit. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: JinZe Xu <JinZe.Xu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:58:26 -04:00
Kevin Gao	18a77bda7a	drm/amd/display: Add skip rIOMMU dc config option [Why] Need option to skip rIOMMU calls for dcn21. [How] Added rIOMMU dc config option and check for whether to skip rIOMMU calls. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Kevin Gao <kgao1003@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:58:17 -04:00
Nicholas Kazlauskas	c00a39f62b	Revert "drm/amd/display: turn off eDP lcdvdd and backlight if not required" This reverts commit `0d93e82186` ("drm/amd/display: turn off eDP lcdvdd and backlight if not required") Reason for revert: Causes S4 lightup regressions. Reviewed-by: Gabe Teeger <gabe.teeger@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:57:09 -04:00
Taimur Hassan	a063ce924e	drm/amd/display: [FW Promotion] Release 0.1.8.0 Undefined unnecessary definition to avoid wrong referring Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:57:06 -04:00
Charlene Liu	1bcd679209	drm/amd/display: disable DPP RCG before DPP CLK enable [why] DPP CLK enable needs to disable DPPCLK RCG first. The DPPCLK_en in dccg should always be enabled when the corresponding pipe is enabled. Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:57:00 -04:00
Aurabindo Pillai	cfb2d41831	drm/amd/display: more liberal vmin/vmax update for freesync [Why] FAMS2 expects vmin/vmax to be updated in the case when freesync is off, but supported. But we only update it when freesync is enabled. [How] Change the vsync handler such that dc_stream_adjust_vmin_vmax() its called irrespective of whether freesync is enabled. If freesync is supported, then there is no harm in updating vmin/vmax registers. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3546 Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:56:37 -04:00
Charlene Liu	4daa5e6c2b	drm/amd/display: allow dscclk disable [why] when dscclk rcg disabled from usr reg option, dsc clock will remain enabled because driver was doing two things both dscclk and dsc rcg in the same routine. Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:56:30 -04:00
Ryan Seto	6f23163365	Revert "drm/amd/display: Refactor SubVP cursor limiting logic" This reverts commit `19e743f0fb` ("drm/amd/display: Refactor SubVP cursor limiting logic") Reason for revert: Corruption Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Ryan Seto <ryanseto@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:55:35 -04:00
Dominik Kaszewski	2f2c97089d	drm/amd/display: Prepare for Fused I2C-over-AUX [Why] Passive DP-HDMI dongles use I2C-over-AUX protocol which is currently not supported using HDCP Locality Check FW path. [How] Prepare code for switching to I2C-over-AUX protocol. Passive dongle detection to be added in another commit. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:55:30 -04:00
Roman Li	bd3e84bc98	drm/amd/display: Fix invalid context error in dml helper [Why] "BUG: sleeping function called from invalid context" error. after: "drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()" The populate_dml_plane_cfg_from_plane_state() uses the GFP_KERNEL flag for memory allocation, which shouldn't be used in atomic contexts. The allocation is needed only for using another helper function get_scaler_data_for_plane(). [How] Modify helpers to pass a pointer to scaler_data within existing context, eliminating the need for dynamic memory allocation/deallocation and copying. Fixes: `366e77cd49` ("drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:54:37 -04:00
Alex Hung	19860f4939	drm/amd/display: Remove redundant null check [WHY & HOW] The null check for connector was dereferenced previously in the same function and the caller. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:54:32 -04:00
Jesse Agate	48337bd15c	drm/amd/display: Always Scale Flag [Why & How] When always scale flag is set at the API level, the number of taps should not be overridden to zero in the identity scaling ratio case, and luma scale should not be set to bypass regardless of luma scale ratio Reviewed-by: Samson Tam <samson.tam@amd.com> Signed-off-by: Jesse Agate <jesse.agate@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:54:29 -04:00
Taimur Hassan	c7e923b8a2	drm/amd/display: Promote DC to 3.2.330 Summary * Update IPS checks to properly include all ASICs. * Refactoring DSC enum dsc_bits_per_comp * Fix ACPI edid parsing issue * Update AUX read interval for LTTPR with old sinks * Correct prefetch calculation Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:54:24 -04:00
Bhuvanachandra Pinninti	f6510641d2	drm/amd/display: Refactoring DSC enum dsc_bits_per_comp. [Why] Previously the 'dsc_bits_per_comp' enumeration was defined in individual .c files, making it unavailable for other files that may need it. [How] The 'dsc_bits_per_comp' enumeration has been relocated to a common header file. Reviewed-by: Mounika Adhuri <Mounika.Adhuri@amd.com> Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Bhuvanachandra Pinninti <bpinnint@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:54:18 -04:00
Ovidiu Bunea	b4db797117	drm/amd/display: Update IPS sequential_ono requirement checks [why & how] ASICs that require special RCG/PG programming are determined based on hw_internal_rev. Update these checks to properly include all such ASICs. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-05-05 12:54:12 -04:00
Sonny Jiang	6718b10a5b	drm/amdgpu: Add DPG pause for VCN v5.0.1 For vcn5.0.1 only, enable DPG PAUSE to avoid DPG resets. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3e5f86c14c`)	2025-05-01 11:02:00 -04:00
Lijo Lazar	79af0604eb	drm/amdgpu: Fix offset for HDP remap in nbio v7.11 APUs in passthrough mode use HDP flush. 0x7F000 offset used for remapping HDP flush is mapped to VPE space which could get power gated. Use another unused offset in BIF space. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d8116a32cd`) Cc: stable@vger.kernel.org	2025-05-01 11:01:46 -04:00
Felix Kuehling	9397204ffa	drm/amdgpu: Fail DMABUF map of XGMI-accessible memory If peer memory is XGMI-accessible, we should never access it through PCIe P2P DMA mappings. PCIe P2P is slower, has different coherence behaviour, limited or no support for atomics, or may not work at all. Fail with a warning if DMABUF mappings of such memory are attempted. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dbe4c63689`)	2025-05-01 11:01:46 -04:00
Chris Bainbridge	be593d9d91	drm/amd/display: Fix slab-use-after-free in hdcp The HDCP code in amdgpu_dm_hdcp.c copies pointers to amdgpu_dm_connector objects without incrementing the kref reference counts. When using a USB-C dock, and the dock is unplugged, the corresponding amdgpu_dm_connector objects are freed, creating dangling pointers in the HDCP code. When the dock is plugged back, the dangling pointers are dereferenced, resulting in a slab-use-after-free: [ 66.775837] BUG: KASAN: slab-use-after-free in event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.776171] Read of size 4 at addr ffff888127804120 by task kworker/0:1/10 [ 66.776179] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.14.0-rc7-00180-g54505f727a38-dirty #233 [ 66.776183] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024 [ 66.776186] Workqueue: events event_property_validate [amdgpu] [ 66.776494] Call Trace: [ 66.776496] <TASK> [ 66.776497] dump_stack_lvl+0x70/0xa0 [ 66.776504] print_report+0x175/0x555 [ 66.776507] ? __virt_addr_valid+0x243/0x450 [ 66.776510] ? kasan_complete_mode_report_info+0x66/0x1c0 [ 66.776515] kasan_report+0xeb/0x1c0 [ 66.776518] ? event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.776819] ? event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.777121] __asan_report_load4_noabort+0x14/0x20 [ 66.777124] event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.777342] ? __lock_acquire+0x6b40/0x6b40 [ 66.777347] ? enable_assr+0x250/0x250 [amdgpu] [ 66.777571] process_one_work+0x86b/0x1510 [ 66.777575] ? pwq_dec_nr_in_flight+0xcf0/0xcf0 [ 66.777578] ? assign_work+0x16b/0x280 [ 66.777580] ? lock_is_held_type+0xa3/0x130 [ 66.777583] worker_thread+0x5c0/0xfa0 [ 66.777587] ? process_one_work+0x1510/0x1510 [ 66.777588] kthread+0x3a2/0x840 [ 66.777591] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777594] ? trace_hardirqs_on+0x4f/0x60 [ 66.777597] ? _raw_spin_unlock_irq+0x27/0x60 [ 66.777599] ? calculate_sigpending+0x77/0xa0 [ 66.777602] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777605] ret_from_fork+0x40/0x90 [ 66.777607] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777609] ret_from_fork_asm+0x11/0x20 [ 66.777614] </TASK> [ 66.777643] Allocated by task 10: [ 66.777646] kasan_save_stack+0x39/0x60 [ 66.777649] kasan_save_track+0x14/0x40 [ 66.777652] kasan_save_alloc_info+0x37/0x50 [ 66.777655] __kasan_kmalloc+0xbb/0xc0 [ 66.777658] __kmalloc_cache_noprof+0x1c8/0x4b0 [ 66.777661] dm_dp_add_mst_connector+0xdd/0x5c0 [amdgpu] [ 66.777880] drm_dp_mst_port_add_connector+0x47e/0x770 [drm_display_helper] [ 66.777892] drm_dp_send_link_address+0x1554/0x2bf0 [drm_display_helper] [ 66.777901] drm_dp_check_and_send_link_address+0x187/0x1f0 [drm_display_helper] [ 66.777909] drm_dp_mst_link_probe_work+0x2b8/0x410 [drm_display_helper] [ 66.777917] process_one_work+0x86b/0x1510 [ 66.777919] worker_thread+0x5c0/0xfa0 [ 66.777922] kthread+0x3a2/0x840 [ 66.777925] ret_from_fork+0x40/0x90 [ 66.777927] ret_from_fork_asm+0x11/0x20 [ 66.777932] Freed by task 1713: [ 66.777935] kasan_save_stack+0x39/0x60 [ 66.777938] kasan_save_track+0x14/0x40 [ 66.777940] kasan_save_free_info+0x3b/0x60 [ 66.777944] __kasan_slab_free+0x52/0x70 [ 66.777946] kfree+0x13f/0x4b0 [ 66.777949] dm_dp_mst_connector_destroy+0xfa/0x150 [amdgpu] [ 66.778179] drm_connector_free+0x7d/0xb0 [ 66.778184] drm_mode_object_put.part.0+0xee/0x160 [ 66.778188] drm_mode_object_put+0x37/0x50 [ 66.778191] drm_atomic_state_default_clear+0x220/0xd60 [ 66.778194] __drm_atomic_state_free+0x16e/0x2a0 [ 66.778197] drm_mode_atomic_ioctl+0x15ed/0x2ba0 [ 66.778200] drm_ioctl_kernel+0x17a/0x310 [ 66.778203] drm_ioctl+0x584/0xd10 [ 66.778206] amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu] [ 66.778375] __x64_sys_ioctl+0x139/0x1a0 [ 66.778378] x64_sys_call+0xee7/0xfb0 [ 66.778381] do_syscall_64+0x87/0x140 [ 66.778385] entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fix this by properly incrementing and decrementing the reference counts when making and deleting copies of the amdgpu_dm_connector pointers. (Mario: rebase on current code and update fixes tag) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006 Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Fixes: `da3fd7ac0b` ("drm/amd/display: Update CP property based on HW query") Reviewed-by: Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250417215005.37964-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d4673f3c3b`) Cc: stable@vger.kernel.org	2025-05-01 11:01:23 -04:00
Antonio Fernando Silva e Cruz Filho	da072da2c8	drm/amd/display: Rename program_timing function for better debugging [WHY] Improve the output when using the ftrace debug feature, making it easier to identify which function is currently being executed. [HOW] Rename the program_timing function to a name that includes the path to the function's file. Signed-off-by: Antonio Fernando Silva e Cruz Filho <fernando.cruz.ctt@gmail.com> Co-developed-by: André Nogueira Ribeiro <r.andrenogueira@gmail.com> Signed-off-by: André Nogueira Ribeiro <r.andrenogueira@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:18:59 -04:00
Dan Carpenter	97c39b4da6	drm/amdgpu/userq: remove unnecessary NULL check The "ticket" pointer points to in the middle of the &exec struct so it can't be NULL. Remove the check. Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:18:52 -04:00
Dan Carpenter	d6c6d5ec66	drm/amdgpu/userq: Call unreserve on error in amdgpu_userq_fence_read_wptr() This error path should call amdgpu_bo_unreserve() before returning. Fixes: `d8675102ba` ("drm/amdgpu: add vm root BO lock before accessing the vm") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:17:42 -04:00
Alex Deucher	aded8b3c36	drm/amdgpu: properly handle GC vs MM in amdgpu_vmid_mgr_init() When kernel queues are disabled, all GC vmids are available for the scheduler. MM vmids are still managed by the driver so make all 16 available. Also fix gmc 10 vs 11 mix up in commit `1f61fc28b9` ("drm/amdgpu/mes: make more vmids available when disable_kq=1") v2: Properly handle pre-GC 10 hardware Fixes: `1f61fc28b9` ("drm/amdgpu/mes: make more vmids available when disable_kq=1") Cc: Arvind Yadav <Arvind.Yadav@amd.com> Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:16:53 -04:00
Alex Deucher	2e828a25f8	drm/amdgpu/mes: use correct MES pipe for resets Use the KIQ pipe for kernel queues and the SCHED pipe for user queues. Fixes: `2408b0272b` ("drm/amdgpu/mes: consolidate on a single mes reset callback") Cc: Michael Chen <Michael.Chen@amd.com> Cc: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:16:14 -04:00
Alex Deucher	2408b0272b	drm/amdgpu/mes: consolidate on a single mes reset callback Use the legacy one as it covers both kernel queues and user queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:16:07 -04:00
Alex Deucher	6535348a3e	drm/amdgpu/mes: remove more unused functions These were leftover from mes bring up and are unused. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:15:57 -04:00
Bagas Sanjaya	4e24c6bb5f	drm/amdgpu/userq: fix user_queue parameters list Sphinx reports htmldocs warning: Documentation/gpu/amdgpu/module-parameters:7: drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:1119: ERROR: Unexpected indentation. [docutils] Fix the warning by using reST bullet list syntax for user_queue parameter options, separated from preceding paragraph by a blank line. Fixes: `fb20954c97` ("drm/amdgpu/userq: rework driver parameter") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250422202956.176fb590@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:15:15 -04:00
Lijo Lazar	0105725e2d	drm/amdgpu: Fix comment style Fix code comment style Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504271826.xy2fFO28-lkp@intel.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:15:10 -04:00
Lijo Lazar	3580440308	drm/amd/pm: Fix comment style Fix code comment style Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504271422.D6cqMlZ0-lkp@intel.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:15:03 -04:00
Lijo Lazar	cf1fcdeec4	drm/amdgpu: Print bootloader status for long waits If it needs a long wait for completion of bootloader execution, report the status in between. That helps to know if there is some issue during bootloader execution. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:14:58 -04:00
Yifan Zha	161949dd71	drm/amdgpu: refine MES register print for devices of hive [Why] Register access print missed device info. [How] Using dev_xxx instead of DRM_xxx to indicate which device of a hive is the message for. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:14:47 -04:00
Lijo Lazar	3805e6959c	drm/amdgpu: Fix query order of XGMI v6.4.1 status Keep the register offsets as per link order for querying XGMI v6.4.1 link status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Fixes: `6dee64e765` ("drm/amdgpu: Fix xgmi v6.4.1 link status reporting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:13:44 -04:00
Asad Kamal	96ac487c12	drm/amd/pm: Add board voltage node to hwmon Add and expose board voltage node as vddboard to hwmon for smu_v13_0_6 v2: Replace ip check with supported sensor attribute(Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:13:40 -04:00
Jesse.Zhang	ad7c088e31	drm/amdgpu: Fix API status offset for MES queue reset The mes_v11_0_reset_hw_queue and mes_v12_0_reset_hw_queue functions were using the wrong union type (MESAPI__REMOVE_QUEUE) when getting the offset for api_status. Since these functions handle queue reset operations, they should use MESAPI__RESET union instead. This fixes the polling of API status during hardware queue reset operations in the MES for both v11 and v12 versions. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-By: Shaoyun.liu <Shaoyun.liu@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:10:06 -04:00
Asad Kamal	3a2191efe4	drm/amd/pm: Add voltage caps for smu_v13_0_6 Add & enable board voltage caps for smu_v13_0_6 v3: Update version check for board voltage support Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:10:02 -04:00
Asad Kamal	ad3d93230d	drm/amd/pm: Fill static metrics data Fill static metrics data for smu_v13_0_6 v2: Proceed with driver load just with warning even if board voltage reads invalid value Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:09:54 -04:00
Asad Kamal	0ee5847849	drm/amd/pm: Use common function to fetch static metrics table Use common function to fetch static metrics table for smu_v13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:09:50 -04:00
Asad Kamal	9eddfcbef4	drm/amd/pn: Fetch static metrics table Fetch static metrics table for smu_v13_0_6 v2: Add static metrics caps check to fetch static metrics table v3: Update version having all fixes for static metrics support Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:09:45 -04:00
Asad Kamal	a2344a9827	drm/amd/pm: Update pmfw headers for smu_v_13_0_6 Update pmfw headers for smu_v_13_0_6 to include static metrics table Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:09:42 -04:00
Alex Deucher	482d485332	drm/amdgpu/userq: take the userq_mgr lock in enforce isolation Add the missing locking. Fixes: `94976e7e5e` ("drm/amdgpu/userq: add helpers to start/stop scheduling") Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:09:26 -04:00
Alex Deucher	c5e02d6588	drm/amdgpu/userq: take the userq_mgr lock in suspend/resume Add the missing locking. Fixes: `73e12e98ec` ("drm/amdgpu/userq: add suspend and resume helpers") Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:08:24 -04:00
Sonny Jiang	3e5f86c14c	drm/amdgpu: Add DPG pause for VCN v5.0.1 For vcn5.0.1 only, enable DPG PAUSE to avoid DPG resets. Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:08:21 -04:00
Asad Kamal	b02a284cc8	drm/amd/pm: Add ip version check for smu_v13_0_12 functions Add ip version check to use smu_v13_0_12 specific functions Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:08:16 -04:00
Aurabindo Pillai	d85212e1ce	drm/amd/display: downgrade HDMI infoframe error to one time warning In certain config, a modeprobe test triggers too many instances of the error related to infoframe. Make it print only once, and also make it a warning. Fixes: `6027cbee19` ("drm/amd/display: Add error check for avi and vendor infoframe setup function") Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Chengjun Yao <Chengjun.Yao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:06:19 -04:00
Eric Huang	7295e00df0	drm/amdkfd: add pasid debugfs entries the entries will be appearing at /sys/kernel/debug/kfd/proc/<pid>/pasid_<gpuid>. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:06:14 -04:00
Arvind Yadav	56801cb83c	drm/amdgpu: remove DRM_AMDGPU_NAVI3X_USERQ config for UQ DRM_AMDGPU_NAVI3X_USERQ config support is not required for usermode queue. v2: rebase. Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:06:00 -04:00
Srinivasan Shanmugam	716ad3c28f	drm/amd/display: Fix NULL pointer dereference for program_lut_mode in dcn401_populate_mcm_luts This commit introduces a NULL pointer check for mpc->funcs->program_lut_mode in the dcn401_populate_mcm_luts function. The previous implementation directly called program_lut_mode without validating its existence, which could lead to a NULL pointer dereference. With this change, the function is now only invoked if mpc->funcs->program_lut_mode is not NULL Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:720 dcn401_populate_mcm_luts() error: we previously assumed 'mpc->funcs->program_lut_mode' could be null (see line 701) drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c 642 void dcn401_populate_mcm_luts(struct dc dc, 643 struct pipe_ctx pipe_ctx, 644 struct dc_cm2_func_luts mcm_luts, 645 bool lut_bank_a) 646 { ... 716 } 717 if (m_lut_params.pwl) { 718 if (mpc->funcs->mcm.populate_lut) 719 mpc->funcs->mcm.populate_lut(mpc, m_lut_params, lut_bank_a, mpcc_id); --> 720 mpc->funcs->program_lut_mode(mpc, MCM_LUT_SHAPER, MCM_LUT_ENABLE, lut_bank_a, mpcc_id); Cc: Yihan Zhu <yihanzhu@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Yihan Zhu <yihanzhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:52 -04:00
Amber Lin	ab9fcc6362	drm/amdkfd: Set SDMA_RLCx_IB_CNTL/SWITCH_INSIDE_IB When submitting MQD to CP, set SDMA_RLCx_IB_CNTL/SWITCH_INSIDE_IB bit so it'll allow SDMA preemption if there is a massive command buffer of long-running SDMA commands. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:46 -04:00
Rodrigo Siqueira	ffc7e11c10	drm/amdgpu: Add documentation associated with CSB Add a description for the get_csb_buffer callback, update the glossary, and add some extra information about RB, which is associated with CSB configuration. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:41 -04:00
Rodrigo Siqueira	e7164c7ade	drm/amdgpu/gfx: Use CSB helpers in gfx_v6_0_get_csb_buffer Remove duplications from gfx_v6_0_get_csb_buffer by using CSB helpers. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:38 -04:00
Rodrigo Siqueira	aff78a6172	drm/amdgpu/gfx: Fix gfx_v7_0_get_csb_buffer to use rb_config Instead of having the hardcoded values for the CSB buffer in gfx_v7_0_get_csb_buffer, use the values calculated in previous steps by accessing raster_config and raster_config_1. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:34 -04:00
Prike Liang	e125a6e8ce	drm/amdgpu: set the evf name to identify the userq case The evf fence name can clearly identify the userq usage. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:27 -04:00
Lijo Lazar	d8116a32cd	drm/amdgpu: Fix offset for HDP remap in nbio v7.11 APUs in passthrough mode use HDP flush. 0x7F000 offset used for remapping HDP flush is mapped to VPE space which could get power gated. Use another unused offset in BIF space. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:14 -04:00
Lijo Lazar	923406e74e	drm/amd/pm: Reset SMU v13.0.x custom settings On SMU v13.0.2 and SMU v13.0.6 variants user may choose custom min/max clocks in manual perf mode. Those custom min/max values need to be reset once user switches to auto or restores default settings. Otherwise, they may get used inadvertently during the next operation. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:11 -04:00
Prike Liang	9d40b05d6d	drm/amdgpu: add the evf attached gem obj resv dump This debug dump will help on debugging the evf attached gem obj fence related issue. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:07 -04:00
Felix Kuehling	dbe4c63689	drm/amdgpu: Fail DMABUF map of XGMI-accessible memory If peer memory is XGMI-accessible, we should never access it through PCIe P2P DMA mappings. PCIe P2P is slower, has different coherence behaviour, limited or no support for atomics, or may not work at all. Fail with a warning if DMABUF mappings of such memory are attempted. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:05:02 -04:00
Rodrigo Siqueira	9bd5a47ee2	drm/amdgpu/gfx: Use CSB helpers in gfx_v7_0_get_csb_buffer Use CSB helpers to remove code duplication from gfx_v7_0_get_csb_buffer. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:48 -04:00
Rodrigo Siqueira	b990cb5234	drm/amdgpu/gfx: Use CSB helpers in gfx_v8_0_get_csb_buffer Remove code duplication from gfx_v8_0_get_csb_buffer by using CSB helpers. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:45 -04:00
Rodrigo Siqueira	9fec2e92fa	drm/amdgpu/gfx: Use CSB helpers in gfx_v9_0_get_csb_buffer Eliminate code duplication in gfx_v9_0_get_csb_buffer by using CSB helpers. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:42 -04:00
Rodrigo Siqueira	0eef0e36ba	drm/amdgpu/gfx: Use CSB helpers in gfx_v10_0_get_csb_buffer Remove duplicate code by using CSB helpers. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:38 -04:00
Rodrigo Siqueira	106172df6e	drm/amdgpu/gfx: Use CSB helpers in gfx_v11_0_get_csb_buffer Part of the code in gfx_v11_0_get_csb_buffer can be removed in favor of some GFX CSB helpers. This commit removes the duplicated part for the GFX 11 CSB function. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:34 -04:00
Rodrigo Siqueira	9718f7457d	drm/amdgpu/gfx: Introduce helpers handling CSB manipulation From GFX6 to GFX11, there is a function for getting the CSB buffer to be put into the hardware. Three common parts are duplicated in all of these GFX functions: 1. Prepare the CSB preamble. 2. Parser the CS data. 3. End the CSB preamble. This commit creates helpers to be used from GFX6 to GFX11. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:29 -04:00
Colin Ian King	5a2658fda4	drm/amdgpu: Fix spelling mistake "rounter" -> "router" There is a spelling mistake with the array utcl2_rounter_str, it appears it should be utcl2_router_str. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:23 -04:00
Kees Cook	e5f7e4e0a4	drm/amdgpu/atom: Work around vbios NULL offset false positive GCC really does not want to consider NULL (or near-NULL) addresses as valid, so calculations based off of NULL end up getting range-tracked into being an offset wthin a 0 byte array. It gets especially mad about this: if (vbios_str == NULL) vbios_str += sizeof(BIOS_ATOM_PREFIX) - 1; ... if (vbios_str != NULL && vbios_str == 0) vbios_str++; It sees this as being "sizeof(BIOS_ATOM_PREFIX) - 1" byte offset from NULL, when building with -Warray-bounds (and the coming -fdiagnostic-details flag): In function 'atom_get_vbios_pn', inlined from 'amdgpu_atom_parse' at drivers/gpu/drm/amd/amdgpu/atom.c:1553:2: drivers/gpu/drm/amd/amdgpu/atom.c:1447:34: error: array subscript 0 is outside array bounds of 'unsigned char[0]' [-Werror=array-bounds=] 1447 \| if (vbios_str != NULL && vbios_str == 0) \| ^~~~~~~~~~ 'amdgpu_atom_parse': events 1-2 1444 \| if (vbios_str == NULL) \| ^ \| \| \| (1) when the condition is evaluated to true ...... 1447 \| if (vbios_str != NULL && *vbios_str == 0) \| ~~~~~~~~~~ \| \| \| (2) out of array bounds here In function 'amdgpu_atom_parse': cc1: note: source object is likely at address zero As there isn't a sane way to convince it otherwise, hide vbios_str from GCC's optimizer to avoid the warning so we can get closer to enabling -Warray-bounds globally. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:04:12 -04:00
Chris Bainbridge	d4673f3c3b	drm/amd/display: Fix slab-use-after-free in hdcp The HDCP code in amdgpu_dm_hdcp.c copies pointers to amdgpu_dm_connector objects without incrementing the kref reference counts. When using a USB-C dock, and the dock is unplugged, the corresponding amdgpu_dm_connector objects are freed, creating dangling pointers in the HDCP code. When the dock is plugged back, the dangling pointers are dereferenced, resulting in a slab-use-after-free: [ 66.775837] BUG: KASAN: slab-use-after-free in event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.776171] Read of size 4 at addr ffff888127804120 by task kworker/0:1/10 [ 66.776179] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.14.0-rc7-00180-g54505f727a38-dirty #233 [ 66.776183] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024 [ 66.776186] Workqueue: events event_property_validate [amdgpu] [ 66.776494] Call Trace: [ 66.776496] <TASK> [ 66.776497] dump_stack_lvl+0x70/0xa0 [ 66.776504] print_report+0x175/0x555 [ 66.776507] ? __virt_addr_valid+0x243/0x450 [ 66.776510] ? kasan_complete_mode_report_info+0x66/0x1c0 [ 66.776515] kasan_report+0xeb/0x1c0 [ 66.776518] ? event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.776819] ? event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.777121] __asan_report_load4_noabort+0x14/0x20 [ 66.777124] event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.777342] ? __lock_acquire+0x6b40/0x6b40 [ 66.777347] ? enable_assr+0x250/0x250 [amdgpu] [ 66.777571] process_one_work+0x86b/0x1510 [ 66.777575] ? pwq_dec_nr_in_flight+0xcf0/0xcf0 [ 66.777578] ? assign_work+0x16b/0x280 [ 66.777580] ? lock_is_held_type+0xa3/0x130 [ 66.777583] worker_thread+0x5c0/0xfa0 [ 66.777587] ? process_one_work+0x1510/0x1510 [ 66.777588] kthread+0x3a2/0x840 [ 66.777591] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777594] ? trace_hardirqs_on+0x4f/0x60 [ 66.777597] ? _raw_spin_unlock_irq+0x27/0x60 [ 66.777599] ? calculate_sigpending+0x77/0xa0 [ 66.777602] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777605] ret_from_fork+0x40/0x90 [ 66.777607] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777609] ret_from_fork_asm+0x11/0x20 [ 66.777614] </TASK> [ 66.777643] Allocated by task 10: [ 66.777646] kasan_save_stack+0x39/0x60 [ 66.777649] kasan_save_track+0x14/0x40 [ 66.777652] kasan_save_alloc_info+0x37/0x50 [ 66.777655] __kasan_kmalloc+0xbb/0xc0 [ 66.777658] __kmalloc_cache_noprof+0x1c8/0x4b0 [ 66.777661] dm_dp_add_mst_connector+0xdd/0x5c0 [amdgpu] [ 66.777880] drm_dp_mst_port_add_connector+0x47e/0x770 [drm_display_helper] [ 66.777892] drm_dp_send_link_address+0x1554/0x2bf0 [drm_display_helper] [ 66.777901] drm_dp_check_and_send_link_address+0x187/0x1f0 [drm_display_helper] [ 66.777909] drm_dp_mst_link_probe_work+0x2b8/0x410 [drm_display_helper] [ 66.777917] process_one_work+0x86b/0x1510 [ 66.777919] worker_thread+0x5c0/0xfa0 [ 66.777922] kthread+0x3a2/0x840 [ 66.777925] ret_from_fork+0x40/0x90 [ 66.777927] ret_from_fork_asm+0x11/0x20 [ 66.777932] Freed by task 1713: [ 66.777935] kasan_save_stack+0x39/0x60 [ 66.777938] kasan_save_track+0x14/0x40 [ 66.777940] kasan_save_free_info+0x3b/0x60 [ 66.777944] __kasan_slab_free+0x52/0x70 [ 66.777946] kfree+0x13f/0x4b0 [ 66.777949] dm_dp_mst_connector_destroy+0xfa/0x150 [amdgpu] [ 66.778179] drm_connector_free+0x7d/0xb0 [ 66.778184] drm_mode_object_put.part.0+0xee/0x160 [ 66.778188] drm_mode_object_put+0x37/0x50 [ 66.778191] drm_atomic_state_default_clear+0x220/0xd60 [ 66.778194] __drm_atomic_state_free+0x16e/0x2a0 [ 66.778197] drm_mode_atomic_ioctl+0x15ed/0x2ba0 [ 66.778200] drm_ioctl_kernel+0x17a/0x310 [ 66.778203] drm_ioctl+0x584/0xd10 [ 66.778206] amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu] [ 66.778375] __x64_sys_ioctl+0x139/0x1a0 [ 66.778378] x64_sys_call+0xee7/0xfb0 [ 66.778381] do_syscall_64+0x87/0x140 [ 66.778385] entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fix this by properly incrementing and decrementing the reference counts when making and deleting copies of the amdgpu_dm_connector pointers. (Mario: rebase on current code and update fixes tag) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006 Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Fixes: `da3fd7ac0b` ("drm/amd/display: Update CP property based on HW query") Reviewed-by: Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250417215005.37964-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:03:08 -04:00
Lijo Lazar	75f138db48	drm/amdgpu: Disallow partition query during reset Reject queries to get current partition modes during reset. Also, don't accept sysfs interface requests to switch compute partition mode while in reset. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:03:02 -04:00
Srinivasan Shanmugam	af0819755c	drm/amd/display: Fix NULL pointer dereferences in dm_update_crtc_state() v2 Added checks for NULL values after retrieving drm_new_conn_state to prevent dereferencing NULL pointers. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:10751 dm_update_crtc_state() warn: 'drm_new_conn_state' can also be NULL drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c 10672 static int dm_update_crtc_state(struct amdgpu_display_manager dm, 10673 struct drm_atomic_state state, 10674 struct drm_crtc crtc, 10675 struct drm_crtc_state old_crtc_state, 10676 struct drm_crtc_state new_crtc_state, 10677 bool enable, 10678 bool lock_and_validation_needed) 10679 { 10680 struct dm_atomic_state dm_state = NULL; 10681 struct dm_crtc_state dm_old_crtc_state, dm_new_crtc_state; 10682 struct dc_stream_state new_stream; 10683 int ret = 0; 10684 ... 10703 10704 /* TODO This hack should go away / 10705 if (connector && enable) { 10706 / Make sure fake sink is created in plug-in scenario */ 10707 drm_new_conn_state = drm_atomic_get_new_connector_state(state, 10708 connector); drm_atomic_get_new_connector_state() can't return error pointers, only NULL. 10709 drm_old_conn_state = drm_atomic_get_old_connector_state(state, 10710 connector); 10711 10712 if (IS_ERR(drm_new_conn_state)) { ^^^^^^^^^^^^^^^^^^ 10713 ret = PTR_ERR_OR_ZERO(drm_new_conn_state); Calling PTR_ERR_OR_ZERO() doesn't make sense. It can't be success. 10714 goto fail; 10715 } 10716 ... 10748 10749 dm_new_crtc_state->abm_level = dm_new_conn_state->abm_level; 10750 --> 10751 ret = fill_hdr_info_packet(drm_new_conn_state, ^^^^^^^^^^^^^^^^^^ Unchecked dereference 10752 &new_stream->hdr_static_metadata); 10753 if (ret) 10754 goto fail; 10755 v2: Modified the NULL pointer check for drm_new_conn_state in the dm_update_crtc_state function to include a warning via WARN_ON and return -EINVAL to indicate an invalid state when the pointer is NULL. Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-30 18:02:52 -04:00
Gergo Koteles	b316727a27	drm/amd/display: do not copy invalid CRTC timing info Since `b255ce4388`, it is possible that the CRTC timing information for the preferred mode has not yet been calculated while amdgpu_dm_connector_mode_valid() is running. In this case use the CRTC timing information of the actual mode. Fixes: `b255ce4388` ("drm/amdgpu: don't change mode in amdgpu_dm_connector_mode_valid()") Closes: https://lore.kernel.org/all/ed09edb167e74167a694f4854102a3de6d2f1433.camel@irl.hu/ Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4085 Signed-off-by: Gergo Koteles <soyer@irl.hu> Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `20232192a5`) Cc: stable@vger.kernel.org	2025-04-22 16:51:17 -04:00
Leo Li	6ed0dc3fd3	drm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF [Why] Recent findings show negligible power savings between IPS2 and RCG during static desktop. In fact, DCN related clocks are higher when IPS2 is enabled vs RCG. RCG_IN_ACTIVE is also the default policy for another OS supported by DC, and it has faster entry/exit. [How] Remove previous logic that checked for IPS2 support, and just default to `DMUB_IPS_RCG_IN_ACTIVE_IPS2_IN_OFF`. Fixes: `199888aa25` ("drm/amd/display: Update IPS default mode for DCN35/DCN351") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8f772d79ef`) Cc: stable@vger.kernel.org	2025-04-22 16:50:45 -04:00
George Shen	d59bddce49	drm/amd/display: Use 16ms AUX read interval for LTTPR with old sinks [Why/How] LTTPR are required to program DPCD 0000Eh to 0x4 (16ms) upon AUX read reply to this register. Since old Sinks witih DPCD rev 1.1 and earlier may not support this register, assume the mandatory value is programmed by the LTTPR to avoid AUX timeout issues. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1594b60d74`)	2025-04-22 16:49:15 -04:00
Mario Limonciello	870bea21fd	drm/amd/display: Fix ACPI edid parsing on some Lenovo systems [Why] The ACPI EDID in the BIOS of a Lenovo laptop includes 3 blocks, but dm_helpers_probe_acpi_edid() has a start that is 'char'. The 3rd block index starts after 255, so it can't be indexed properly. This leads to problems with the display when the EDID is parsed. [How] Change the variable type to 'short' so that larger values can be indexed. Cc: Renjith Pananchikkal <renjith.pananchikkal@amd.com> Reported-by: Mark Pearson <mpearson@lenovo.com> Suggested-by: David Ober <dober@lenovo.com> Fixes: `c6a837088b` ("drm/amd/display: Fetch the EDID from _DDC if available for eDP") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `a918bb4a90`) Cc: stable@vger.kernel.org	2025-04-22 16:48:44 -04:00
Felix Kuehling	a92741e72f	drm/amdgpu: Allow P2P access through XGMI If peer memory is accessible through XGMI, allow leaving it in VRAM rather than forcing its migration to GTT on DMABuf attachment. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `372c8d72c3`)	2025-04-22 16:47:12 -04:00
Nicholas Susanto	756c85e4d0	drm/amd/display: Enable urgent latency adjustment on DCN35 [Why] Urgent latency adjustment was disabled on DCN35 due to issues with P0 enablement on some platforms. Without urgent latency, underflows occur when doing certain high timing configurations. After testing, we found that reenabling urgent latency didn't reintroduce p0 support on multiple platforms. [How] renable urgent latency on DCN35 and setting it to 3000 Mhz. This reverts commit `3412860cc4`. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Susanto <nsusanto@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `cd74ce1f0c`)	2025-04-22 16:46:34 -04:00
Roman Li	67fe574651	drm/amd/display: Force full update in gpu reset [Why] While system undergoing gpu reset always do full update to sync the dc state before and after reset. [How] Return true in should_reset_plane() if gpu reset detected Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2ba8619b9a`) Cc: stable@vger.kernel.org	2025-04-22 16:45:50 -04:00
Roman Li	7eb287beeb	drm/amd/display: Fix gpu reset in multidisplay config [Why] The indexing of stream_status in dm_gpureset_commit_state() is incorrect. That leads to asserts in multi-display configuration after gpu reset. [How] Adjust the indexing logic to align stream_status with surface_updates. Fixes: `cdaae8371a` ("drm/amd/display: Handle GPU reset for DC block") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3808 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d91bc90139`) Cc: stable@vger.kernel.org	2025-04-22 16:45:10 -04:00
Felix Kuehling	5e56935b51	drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY Pinning of VRAM is for peer devices that don't support dynamic attachment and move notifiers. But it requires that all such peer devices are able to access VRAM via PCIe P2P. Any device without P2P access requires migration to GTT, which fails if the memory is already pinned for another peer device. Sharing between GPUs should not require pinning in VRAM. However, if DMABUF_MOVE_NOTIFY is disabled in the kernel build, even DMABufs shared between GPUs must be pinned, which can lead to failures and functional regressions on systems where some peer GPUs are not P2P accessible. Disable VRAM pinning if move notifiers are disabled in the kernel build to fix regressions when sharing BOs between GPUs. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `05185812ae`)	2025-04-22 16:44:28 -04:00
Felix Kuehling	5cf3c602df	drm/amdgpu: Use allowed_domains for pinning dmabufs When determining the domains for pinning DMABufs, filter allowed_domains and fail with a warning if VRAM is forbidden and GTT is not an allowed domain. Fixes: `f5e7fabd1f` ("drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P") Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3940796a6e`)	2025-04-22 16:44:02 -04:00
Sunil Khatri	127e612bf1	drm/amdgpu: update fence ptr with context:seqno log context:seqno of the fence during timeout rather than logging fence pointer. Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Arvind Yadav	cade59abaa	drm/amdgpu/gfx12: Add fw minimum version check for usermode queue This patch is load usermode queue based on FW support for gfx12. CP Ucode FW Vesion: [PFP = 2840, ME = 2780, MEC = 3050, MES = 123] v2: Addressed review comments from Alex - Just check the firmware versions directly. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Arvind Yadav	61ca97e959	drm/amdgpu/gfx11: Add fw minimum version check for usermode queue This patch is load usermode queue based on FW support for gfx11. CP Ucode FW version: [PFP = 2530, ME = 2390, MEC = 2600, MES = 120] v2: Addressed review comments from Alex. - Just check the firmware versions directly. v3: Firmware version checks only for Navi3x(by Alex). Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Shashank Sharma <shashank.sharma@amd.com> Cc: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Srinivasan Shanmugam	3f397cd203	drm/amd/display: Add NULL pointer checks in dm_force_atomic_commit() This commit updates the dm_force_atomic_commit function to replace the usage of PTR_ERR_OR_ZERO with IS_ERR for checking error states after retrieving the Connector (drm_atomic_get_connector_state), CRTC (drm_atomic_get_crtc_state), and Plane (drm_atomic_get_plane_state) states. The function utilized PTR_ERR_OR_ZERO for error checking. However, this approach is inappropriate in this context because the respective functions do not return NULL; they return pointers that encode errors. This change ensures that error pointers are properly checked using IS_ERR before attempting to dereference. Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Alex Deucher	42a6667780	drm/amdgpu/userq: use consistent function naming s/userqueue/userq/ 1. remove the mix of amdgpu_userqueue and amdgpu_userq 2. to be consistent with other amdgpu_userq_fence.c 3. it's shorter Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Alex Deucher	4fdbe3a623	drm/amdgpu/userq: rename eviction helpers suspend/resume -> evict/restore Rename to avoid confusion with the system suspend and resume helpers. v2: update error messages Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:46 -04:00
Alex Deucher	d13e95967e	drm/amdgpu/userq: move waiting for last fence before umap Need to wait for the last fence before unmapping. This also fixes a memory leak in amdgpu_userqueue_cleanup() when the fence isn't signalled. Fixes: `b0db33c8c5` ("drm/amdgpu/userq: rework front end call sequence") Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	36b0bc1731	drm/amdgpu/userq: unmap queues amdgpu_userq_mgr_fini() This was missed when the map and unmap were split out of the mqd create and destroy functions. Fixes: `b0db33c8c5` ("drm/amdgpu/userq: rework front end call sequence") Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	e67b95f0cd	drm/amdgpu: switch from queue_active to queue state Track the state of the queue rather than simple active vs not. This is needed for other states (hung, preempted, etc.). While we are at it, move the state tracking into the user queue front end code. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	ba324ffb25	drm/amdgpu/userq: optimize enforce isolation and s/r If user queues are disabled for all IPs in the case of suspend and resume and for gfx/compute in the case of enforce isolation, we can return early. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Dr. David Alan Gilbert	d61724056a	drm/amd/display: Remove unused *vbios_smu_set_dprefclk rn_vbios_smu_set_dprefclk() was added in 2019 by commit `4edb6fc918` ("drm/amd/display: Add Renoir clock manager") rv1_vbios_smu_set_dprefclk() was also added in 2019 by commit `dc88b4a684` ("drm/amd/display: make clk mgr soc specific") neither have been used. Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Xiang Liu	696e8fa354	drm/amdgpu: Print kernel message when error logged by scrub Print a kernel message when the scrub bit of status register is set to indicate that errors are being logged by the scrub. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Gergo Koteles	20232192a5	drm/amd/display: do not copy invalid CRTC timing info Since `b255ce4388`, it is possible that the CRTC timing information for the preferred mode has not yet been calculated while amdgpu_dm_connector_mode_valid() is running. In this case use the CRTC timing information of the actual mode. Fixes: `b255ce4388` ("drm/amdgpu: don't change mode in amdgpu_dm_connector_mode_valid()") Closes: https://lore.kernel.org/all/ed09edb167e74167a694f4854102a3de6d2f1433.camel@irl.hu/ Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4085 Signed-off-by: Gergo Koteles <soyer@irl.hu> Reviewed-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
TungYu Lu	33bc89949b	drm/amd/display: Correct prefetch calculation [Why] The minimum value of the dst_y_prefetch_equ was not correct in prefetch calculation whice causes OPTC underflow. [How] Add the min operation of dst_y_prefetch_equ in prefetch calculation for legacy DML. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: TungYu Lu <tungyu.lu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Dillon Varone	19e743f0fb	drm/amd/display: Refactor SubVP cursor limiting logic [WHY] There are several gaps that can result in SubVP being enabled with incompatible HW cursor sizes, and unjust restrictions to cursor size due to wrong predictions on future usage of SubVP [HOW] - remove "prediction" logic in favor of tagging based on previous SubVP usage - block SubVP if current HW cursor settings are incompatible - provide interface for DM to determine if HW cursor should be disabled due to an attempt to enable SubVP Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Alex Deucher	11772eb73b	drm/amdgpu/userq: add a helper to check which IPs are enabled Add a helper to get a mask of IPs which support user queues. Use this in the INFO IOCTL to get the IP mask to replace the current code. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:45 -04:00
Arunpravin Paneer Selvam	4b27406380	drm/amdgpu: Add queue id support to the user queue wait IOCTL Add queue id support to the user queue wait IOCTL drm_amdgpu_userq_wait structure. This is required to retrieve the wait user queue and maintain the fence driver references in it so that the user queue in the same context releases their reference to the fence drivers at some point before queue destruction. Otherwise, we would gather those references until we don't have any more space left and crash. v2: Modify the UAPI comment as per the mesa and libdrm UAPI comment. Libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/408 Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34493 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Alex Deucher	4ec2141d23	drm/amdgpu/userq: enable support for secure queues Enable users to create secure GFX/compute queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Tested-by: Jesse.Zhang <Jesse.zhang@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Alex Deucher	87ceff6136	drm/amdgpu/userq/mes: pass the secure flag to mqd init So that we initialize the MQD as a secure queue. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Meenakshikumar Somasundaram	8aaeb25327	drm/amd/display: Fix pixel rate divider policy for 1 pixel per cycle config [Why] Pixel rate dividor was not programmed correctly for 1 pixel per cycle configuration for empty tu case. [How] Included check for empty tu when pixel rate dividor values were selected. Reviewed-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Leo Li	8f772d79ef	drm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF [Why] Recent findings show negligible power savings between IPS2 and RCG during static desktop. In fact, DCN related clocks are higher when IPS2 is enabled vs RCG. RCG_IN_ACTIVE is also the default policy for another OS supported by DC, and it has faster entry/exit. [How] Remove previous logic that checked for IPS2 support, and just default to `DMUB_IPS_RCG_IN_ACTIVE_IPS2_IN_OFF`. Fixes: `199888aa25` ("drm/amd/display: Update IPS default mode for DCN35/DCN351") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Charlene Liu	8e40dd9320	drm/amd/display: Revert "not disable dtb as dto src at dpms off" [why] not all the asic using the same code path. need to revisit and limit the impact. This reverts commit `32be4e39f4`. Reviewed-by: Gabe Teeger <gabe.teeger@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
George Shen	1594b60d74	drm/amd/display: Use 16ms AUX read interval for LTTPR with old sinks [Why/How] LTTPR are required to program DPCD 0000Eh to 0x4 (16ms) upon AUX read reply to this register. Since old Sinks witih DPCD rev 1.1 and earlier may not support this register, assume the mandatory value is programmed by the LTTPR to avoid AUX timeout issues. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Mario Limonciello	a918bb4a90	drm/amd/display: Fix ACPI edid parsing on some Lenovo systems [Why] The ACPI EDID in the BIOS of a Lenovo laptop includes 3 blocks, but dm_helpers_probe_acpi_edid() has a start that is 'char'. The 3rd block index starts after 255, so it can't be indexed properly. This leads to problems with the display when the EDID is parsed. [How] Change the variable type to 'short' so that larger values can be indexed. Cc: Renjith Pananchikkal <renjith.pananchikkal@amd.com> Reported-by: Mark Pearson <mpearson@lenovo.com> Suggested-by: David Ober <dober@lenovo.com> Fixes: `c6a837088b` ("drm/amd/display: Fetch the EDID from _DDC if available for eDP") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:44 -04:00
Taimur Hassan	ce2e117bfb	drm/amd/display: Promote DC to 3.2.329 Summary: * Implement HDMI Read request * RMCM and MCM 3DLUT support * Enable urgent latency adjustment on DCN35 * Enable phy-ssc reduction by default Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-22 08:51:35 -04:00
Felix Kuehling	372c8d72c3	drm/amdgpu: Allow P2P access through XGMI If peer memory is accessible through XGMI, allow leaving it in VRAM rather than forcing its migration to GTT on DMABuf attachment. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:36 -04:00
Roman Li	e15d09f510	drm/amd/display: enable phy-ssc reduction by default [Why] Reduction of phy-ssc is needed to support DP2 high pixel clock on dcn35x/36. There's a special flag to enable it in dmub hw params. [How] Set hbr3_phy_ssc to true for dcn35, dcn351 and dcn36. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:30 -04:00
Nicholas Susanto	cd74ce1f0c	drm/amd/display: Enable urgent latency adjustment on DCN35 [Why] Urgent latency adjustment was disabled on DCN35 due to issues with P0 enablement on some platforms. Without urgent latency, underflows occur when doing certain high timing configurations. After testing, we found that reenabling urgent latency didn't reintroduce p0 support on multiple platforms. [How] renable urgent latency on DCN35 and setting it to 3000 Mhz. This reverts commit `3412860cc4`. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Susanto <nsusanto@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:09 -04:00
Yihan Zhu	652968d996	drm/amd/display: DCN42 RMCM and MCM 3DLUT support [WHY & HOW] Providing hardware programming for the RMCM and MCM IPs for 3DLUT in DCN42. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:05 -04:00
Yihan Zhu	c9646e5a7e	drm/amd/display: DCN32 null data check [WHY & HOW] Avoid null curve data structure used in the cm block for the potential issue. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:28:01 -04:00
Roman Li	2ba8619b9a	drm/amd/display: Force full update in gpu reset [Why] While system undergoing gpu reset always do full update to sync the dc state before and after reset. [How] Return true in should_reset_plane() if gpu reset detected Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:56 -04:00
Roman Li	d91bc90139	drm/amd/display: Fix gpu reset in multidisplay config [Why] The indexing of stream_status in dm_gpureset_commit_state() is incorrect. That leads to asserts in multi-display configuration after gpu reset. [How] Adjust the indexing logic to align stream_status with surface_updates. Fixes: `cdaae8371a` ("drm/amd/display: Handle GPU reset for DC block") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3808 Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:41 -04:00
Austin Zheng	8fc3959cd4	drm/amd/display: Move Mode Support Prefetch Checks To Its Own Function [Why] Large stack size observed in DCN4 mode support when compiling with clang. Additional instrumentation added by compiler adds to stack size. dml_core_mode_support ends up going over the stack size limit due to the size of the function. [How] Move checks and calculations for prefetch to its own function. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:39 -04:00
Felix Kuehling	05185812ae	drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY Pinning of VRAM is for peer devices that don't support dynamic attachment and move notifiers. But it requires that all such peer devices are able to access VRAM via PCIe P2P. Any device without P2P access requires migration to GTT, which fails if the memory is already pinned for another peer device. Sharing between GPUs should not require pinning in VRAM. However, if DMABUF_MOVE_NOTIFY is disabled in the kernel build, even DMABufs shared between GPUs must be pinned, which can lead to failures and functional regressions on systems where some peer GPUs are not P2P accessible. Disable VRAM pinning if move notifiers are disabled in the kernel build to fix regressions when sharing BOs between GPUs. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Hao (Claire) Zhou <hao.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:34 -04:00
Jack Chang	6df7175263	drm/amd/display: Move desync error counter operation up. [Why & How] Move desync error counter operation up to prevent it from being skipped by force disable desync error. Reviewed-by: Robin Chen <robin.chen@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Jack Chang <jack.chang@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:32 -04:00
Mario Limonciello	7e40f64896	drm/amd/display: Avoid divide by zero by initializing dummy pitch to 1 [Why] If the dummy values in `populate_dummy_dml_surface_cfg()` aren't updated then they can lead to a divide by zero in downstream callers like CalculateVMAndRowBytes() [How] Initialize dummy value to a value to avoid divide by zero. Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:29 -04:00
Chris Park	724a4b400b	drm/amd/display: Implement HDMI Read Request [Why] Read Request provides alterative method to polling to the HDMI sinks that support it. [How] Implement Read Request where interrupt can be generated by the sink. Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:27:25 -04:00
Yiling Chen	f53d0f48a8	drm/amd/display: To apply the adjusted DP ref clock for DP devices [Why] For some pixel clock margin sensitive external monitor, we could not keep original DP ref clock for the ASICs supported SSC DP ref clock. [How] From slicon design team's comment, we have to apply the adjusted DP ref clock for DP devices. DP 128b (DP2) signals uses the DTBCLK not DP ref. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Mark Broadworth <mark.broadworth@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:26:59 -04:00
Alex Deucher	eec6444923	drm/amdgpu/gfx12: add support for TMZ queues to mqd_init Set up TMZ for queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:00:43 -04:00
Alex Deucher	9486875408	drm/amdgpu/gfx11: add support for TMZ queues to mqd_init Set up TMZ for queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 11:00:38 -04:00
Felix Kuehling	3940796a6e	drm/amdgpu: Use allowed_domains for pinning dmabufs When determining the domains for pinning DMABufs, filter allowed_domains and fail with a warning if VRAM is forbidden and GTT is not an allowed domain. Fixes: `f5e7fabd1f` ("drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P") Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:59:05 -04:00
Alex Deucher	cb808ab833	drm/amdgpu: add tmz queue parameter to mqd props Use this to track the whether we want TMZ for queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:59:01 -04:00
Srinivasan Shanmugam	d30f610762	drm/amdgpu: Refine Cleaner Shader MEC firmware version for GFX10.1.x GPUs Update the minimum firmware version for the Cleaner Shader in the gfx_v10_0_sw_init function. This change adjusts the minimum required firmware version for the MEC firmware from 152 to 151, allowing for broader compatibility with GFX10.1 GPUs. Fixes: `25961bad92` ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:26 -04:00
Jesse.zhang@amd.com	2200b41428	drm/amdgpu:remove old sdma reset callback mechanism This patch removes the deprecated SDMA reset callback mechanism, which was previously used to register pre-reset and post-reset callbacks for SDMA engine resets. The callback mechanism has been replaced with a more direct and efficient approach using `stop_queue` and `start_queue` functions in the ring's function table. The SDMA reset callback mechanism allowed KFD and AMDGPU to register pre-reset and post-reset functions for handling SDMA engine resets. However, this approach added unnecessary complexity and was no longer needed after the introduction of the `stop_queue` and `start_queue` functions in the ring's function table. 1. Remove Callback Mechanism: - Removed the `amdgpu_sdma_register_on_reset_callbacks` function and its associated data structures (`sdma_on_reset_funcs`). - Removed the callback registration logic from the SDMA v4.4.2 initialization code. 2. Clean Up Related Code: - Removed the `sdma_v4_4_2_set_engine_reset_funcs` function, which was used to register the callbacks. - Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained the pre-reset and post-reset callback functions. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:22 -04:00
Sunil Khatri	9018c7fe68	drm/amdgpu/userq: add context and seqno of the fence Add context and seqno of the fence in error logging rather than printing fence ptr. Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:09 -04:00
Jesse.zhang@amd.com	6a07ac702f	drm/amdgpu: optimize queue reset and stop logic for sdma_v5_2 This patch refactors the SDMA v5.2 queue reset and stop logic to improve code readability, maintainability, and performance. The key changes include: 1. Generalized `sdma_v5_2_gfx_stop` Function: - Added an `inst_mask` parameter to allow stopping specific SDMA instances instead of all instances. This is useful for resetting individual queues. 2. Simplified `sdma_v5_2_reset_queue` Function: - Removed redundant loops and checks by directly using the `ring->me` field to identify the SDMA instance. - Reused the `sdma_v5_2_gfx_stop` function to stop the queue, reducing code duplication. v1: The general coding style is to declare variables like "i" or "r" last. E.g. longest lines first and short lasts. (Chritian) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:07 -04:00
Jesse.zhang@amd.com	574f4b5562	drm/amdgpu: optimize queue reset and stop logic for sdma_v5_0 This patch refactors the SDMA v5.0 queue reset and stop logic to improve code readability, maintainability, and performance. The key changes include: 1. Generalized `sdma_v5_0_gfx_stop` Function: - Added an `inst_mask` parameter to allow stopping specific SDMA instances instead of all instances. This is useful for resetting individual queues. 2. Simplified `sdma_v5_0_reset_queue` Function: - Removed redundant loops and checks by directly using the `ring->me` field to identify the SDMA instance. - Reused the `sdma_v5_0_gfx_stop` function to stop the queue, reducing code duplication. v1: The general coding style is to declare variables like "i" or "r" last. E.g. longest lines first and short lasts. (Chritian) Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:04 -04:00
Jesse.zhang@amd.com	47454f2dc0	drm/amdgpu: Register the new sdma function pointers for sdma_v5_2 Register stop/start/soft_reset queue functions for SDMA IP versions v5.2. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:57:01 -04:00
Jesse.zhang@amd.com	e56d4bf57f	drm/amdgpu/: drm/amdgpu: Register the new sdma function pointers for sdma_v5_0 Register stop/start/soft_reset queue functions for SDMA IP versions v5.0. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:58 -04:00
Jesse.zhang@amd.com	5c3e7c4953	drm/amdgpu: Implement SDMA soft reset directly for v5.x This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly, rather than relying on the DPM interface. 1. New `amdgpu_sdma_soft_reset` Function: - Implements a soft reset for SDMA engines by directly writing to the hardware registers. - Handles SDMA versions 4.x and 5.x separately: - For SDMA 4.x, the existing `amdgpu_dpm_reset_sdma` function is used for backward compatibility. - For SDMA 5.x, the driver directly manipulates the `GRBM_SOFT_RESET` register to reset the specified SDMA instance. 2. Integration into `amdgpu_sdma_reset_engine`: - The `amdgpu_sdma_soft_reset` function is called during the SDMA reset process, replacing the previous call to `amdgpu_dpm_reset_sdma`. v2: r should default to an error (Alex) Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:54 -04:00
Jesse.zhang@amd.com	b22659d5d3	drm/amdgpu: switch amdgpu_sdma_reset_engine to use the new sdma function pointers Replace old callback mechanism with direct calls to stop/start functions. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:49 -04:00
Alex Deucher	a5c34299d8	drm/amdgpu/userq: enable support for queue priorities Enable users to create queues at different priority levels. The highest level is restricted to drm master. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:46 -04:00
Alex Deucher	23a650bb9f	drm/amdgpu/userq/mes: handle user queue priority Handle the queue priority set by the user. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:42 -04:00
Alex Deucher	9546c05628	drm/amdgpu/userq: add priorty to user queue structure So we can track this when we create user queues. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:39 -04:00
Alex Deucher	a83be6e479	drm/amdgpu/mes12: add conversion for priority levels Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:36 -04:00
Alex Deucher	3d0a402e7c	drm/amdgpu/mes11: add conversion for priority levels Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:33 -04:00
Alex Deucher	fced8e7d2d	drm/amdgpu: convert userq UAPI _pad to flags Reuse the _pad field for flags. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Jesse.Zhang <Jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:18 -04:00
Wentao Liang	6027cbee19	drm/amd/display: Add error check for avi and vendor infoframe setup function The function fill_stream_properties_from_drm_display_mode() calls the function drm_hdmi_avi_infoframe_from_display_mode() and the function drm_hdmi_vendor_infoframe_from_display_mode(), but does not check its return value. Log the error messages to prevent silent failure if either function fails. Signed-off-by: Wentao Liang <vulab@iscas.ac.cn> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:13 -04:00
Alex Deucher	8f23a97907	drm/amdgpu/userq: integrate with enforce isolation Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. v2: split out variable renaming, add config guards v3: use new function names Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:11 -04:00
Alex Deucher	28fc3172e4	drm/amdgpu: rename enforce isolation variables Since they will be used for both KFD and KGD user queues, rename them from kfd to userq. No intended functional change. Acked-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:56:06 -04:00
Alex Deucher	94976e7e5e	drm/amdgpu/userq: add helpers to start/stop scheduling This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. v2: use idx v3: only stop compute/gfx queues Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:59 -04:00
Alex Deucher	56a0a80af0	drm/amdgpu/userq: track the xcp_id associated with the queue Track this to align with KFD for enforce isolation handling. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:55 -04:00
Emily Deng	5ae4591f4e	drm/amdgpu: Clear overflow for SRIOV For VF, it doesn't have the permission to clear overflow, clear the bit by reset. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:51 -04:00
Alex Deucher	fb20954c97	drm/amdgpu/userq: rework driver parameter Replace disable_kq parameter with user_queue parameter. The parameter has the following logic: -1 = auto (ASIC specific default) 0 = user queues disabled 1 = user queues enabled and kernel queues enabled (if supported) 2 = user queues enabled and kernel queues disabled The default behavior (-1) is currently the same as 0 for current ASICs. To enable user queues (in addition to kernel queues) set user_queue=1. To enable user queues and disable kernel queues (to make all resources available to user queues), set user_queue=2. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:47 -04:00
Asad Kamal	172494c4e9	drm/amd/pm: Enable host limit metrics support Enable host limit metrics support for smuv_13_0_12 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:43 -04:00
Alex Deucher	0ed032dc7d	drm/amdgpu/sdma7: properly reference trap interrupts for userqs We need to take a reference to the interrupts to make sure they stay enabled even if the kernel queues have disabled them. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:39 -04:00
Alex Deucher	1197cfb730	drm/amdgpu/sdma6: properly reference trap interrupts for userqs We need to take a reference to the interrupts to make sure they stay enabled even if the kernel queues have disabled them. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:36 -04:00
Asad Kamal	2c8b0d628a	drm/amd/pm: Enable host limit metrics support Enable host limit metrics support for smuv_13_0_6 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:31 -04:00
Sathishkumar S	b574729ff0	drm/amdgpu: Enable doorbell for JPEG5_0_1 Enable doorbell for JPEG5_0_1 and adjust index for VCN5_0_1. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-04-21 10:55:25 -04:00

... 2 3 4 5 6 ...

33591 Commits