linux

Commit Graph

Author	SHA1	Message	Date
Lang Yu	fb5b73acf7	drm/amdgpu/umsch: correct IP version format FW uses IP_VERSION_MAJ_MIN_REV format. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:42 -04:00
Lang Yu	1c1f14a472	drm/amdgpu: don't use legacy invalidation on MMHUB v3.3 Legacy invalidation is not supported. This is missed during rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:29 -04:00
Lang Yu	4661482b9c	drm/amdgpu: correct NBIO v7.11 programing Use v7.7 before, switch to v7.11 now. Fix incorrect programing. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:21 -04:00
Xiaogang Chen	ffa88b0019	drm/amdgpu: Correctly use bo_va->ref_count in compute VMs This is needed to correctly handle BOs imported into compute VM from gfx. Both kfd and gfx should use same bo_va and set bo_va->ref_count correctly when map the Bos into same VM, otherwise we may trigger kernel general protection when iterate mappings over bo_va's valids or invalids list. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Tested-by: Xiaogang Chen <Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:08 -04:00
Lijo Lazar	f20f3b0d6c	drm/amd/pm: Add P2S tables for SMU v13.0.6 Add P2S table load support on SMU v13.0.6 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:33:01 -04:00
Lijo Lazar	79daf69246	drm/amdgpu: Add support to load P2S tables Add support to load P2S tables through PSP. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:55 -04:00
Lijo Lazar	cd21cb1fcb	drm/amdgpu: Update PSP interface header Adds FW id for P2S table. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:47 -04:00
Lijo Lazar	a8558fce7a	drm/amdgpu: Avoid FRU EEPROM access on APU FRU EEPROM access is not valid for APU devices. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:41 -04:00
Lin.Cao	f74f19c440	drm/amdgpu: save VCN instances init info before jpeg init JPEG init header will overwirte vcn init header info which will loss some debug information Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:32:34 -04:00
Alex Hung	98a80bb3dd	Revert "drm/amd/display: Hande writeback request from userspace" This reverts commit `cd1a4bc228`. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:24:47 -04:00
Alex Hung	731a20cb89	Revert "drm/amd/display: Add writeback enable field (wb_enabled)" This reverts commit `f6893fcb10`. [WHY & HOW] The writeback series cause a regression in thunderbolt display. Signed-off-by: Alex Hung <alex.hung@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:08:57 -04:00
Asad Kamal	625e5f3851	drm/amdgpu: Expose ras version & schema info Expose ras table version & schema info to sysfs v2: Updated schema to get poison support info from ras context, removed asic specific checks Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:02:54 -04:00
Lijo Lazar	d4a02673b3	drm/amdgpu: Read PSPv13 OS version from register PSP OS updates the version information in register. On APUs with PSPv13, PSP OS will already be loaded with SBIOS. Hence use the version register instead of using information in driver binary header. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:02:43 -04:00
Lang Yu	faeddb6eab	drm/amdgpu/umsch: enable doorbell for umsch Program vcn_doorbell_range with vcn_ring0_1. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:01:37 -04:00
Mario Limonciello	db99889065	drm/amd: Split up UVD suspend into prepare and suspend steps amdgpu_uvd_suspend() allocates memory and copies objects into that allocated memory. This fails under memory pressure. Instead move majority of this code into a prepare step when swap can still be allocated. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:01:04 -04:00
Mario Limonciello	cb11ca3233	drm/amd: Add concept of running prepare_suspend() sequence for IP blocks If any IP blocks allocate memory during their hw_fini() sequence this can cause the suspend to fail under memory pressure. Introduce a new phase that IP blocks can use to allocate memory before suspend starts so that it can potentially be evicted into swap instead. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:58 -04:00
Mario Limonciello	5095d54181	drm/amd: Evict resources during PM ops prepare() callback Linux PM core has a prepare() callback run before suspend. If the system is under high memory pressure, the resources may need to be evicted into swap instead. If the storage backing for swap is offlined during the suspend() step then such a call may fail. So move this step into prepare() to move evict majority of resources and update all non-pmops callers to call the same callback. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:18 -04:00
Li Ma	31715a8620	drm/amdgpu: enable GFX IP v11.5.0 CG and PG support Add CG support for GFX/MC/HDP/ATHUB/IH/BIF. Add PG support for GFX. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:15 -04:00
Li Ma	ad3e54ab9e	drm/amdgpu/discovery: add SMU 14 support add smu 14 into the IP discovery list. Signed-off-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 11:00:00 -04:00
Lang Yu	4acf679f86	drm/amdgpu/umsch: power on/off UMSCH by DLDO VCN 4.0.5 uses DLDO. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:32 -04:00
Lang Yu	617b472431	drm/amdgpu/umsch: fix psp frontdoor loading These changes are missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:24 -04:00
Lijo Lazar	558fcb7d11	drm/amdgpu: Increase IP discovery region size IP discovery region has increased to > 8K on some SOCs.Maximum reserve size is upto 12K, but not used. For now increase to 10K. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:59:16 -04:00
Lin.Cao	b053117e86	drm/amdgpu: Return -EINVAL when MMSCH init status incorrect Return -EINVAL when MMSCH init fail which can be handle by function amdgpu_device_reset_sriov correctly. Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:48 -04:00
Lang Yu	9a37f65c4e	drm/amdgpu/vpe: fix insert_nop ops Avoid infinite loop when count is 0. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:33 -04:00
Srinivasan Shanmugam	54967d5683	drm/amdgpu: Address member 'gart_placement' not described in 'amdgpu_gmc_gart_location' Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c:274: warning: Function parameter or member 'gart_placement' not described in 'amdgpu_gmc_gart_location' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:22 -04:00
Lang Yu	84aa39ab1e	drm/amdgpu/vpe: align with mcbp changes MCBP is decided by adev->gfx.mcbp now. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:13 -04:00
Lang Yu	99ea82f424	drm/amdgpu/vpe: remove IB end boundary requirement Remove IB end boundary requirement, VPE has no such limitions, use existing amdgpu_ring_generic_pad_ib() instead. This is missed in rebase. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:58:01 -04:00
Jay Cornwall	757920585d	drm/amdgpu: Improve MES responsiveness during oversubscription When MES is oversubscribed it may not frequently check for new command submissions from driver if the scheduling load is high. Response latency as high as 5 seconds has been observed. Enable a flag which adds a check for new commands between scheduling quantums. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Cc: Alexandru Tudor <alexandru.tudor@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-13 10:57:46 -04:00
Thomas Zimmermann	57390019b6	Merge drm/drm-next into drm-misc-next Updating drm-misc-next to the state of Linux v6.6-rc2. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-10-11 09:50:59 +02:00
Icenowy Zheng	3806a8c647	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all, however currently the code will try to do the allocation and thus fail, makes SI AMDGPU not usable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:59:29 -04:00
Christian König	ff89f064dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:59:29 -04:00
Icenowy Zheng	219223eca4	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all, however currently the code will try to do the allocation and thus fail, makes SI AMDGPU not usable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:52 -04:00
Aaron Liu	ce862c4995	drm/amdgpu/discovery: enable DCN 3.5.0 support Enable DCN 3.5.0 support. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:46 -04:00
Arvind Yadav	367a0af433	drm/amdkfd: get doorbell's absolute offset based on the db_size Here, Adding db_size in byte to find the doorbell's absolute offset for both 32-bit and 64-bit doorbell sizes. So that doorbell offset will be aligned based on the doorbell size. v2: - Addressed the review comment from Felix. v3: - Adding doorbell_size as parameter to get db absolute offset. v4: Squash the two patches into one. Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:02:34 -04:00
Christian König	31220ee9dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:01:32 -04:00
Yifan Zhang	061863e5db	drm/amdgpu: add hub->ctx_distance in setup_vmid_config add hub->ctx_distance when read CONTEXT1_CNTL, align w/ write back operation. v2: fix coding style errors reported by checkpatch.pl (Christian) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:59:06 -04:00
Stanley.Yang	80285ae1ec	drm/amdgpu: Fix potential null pointer derefernce The amdgpu_ras_get_context may return NULL if device not support ras feature, so add check before using. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:46 -04:00
Lijo Lazar	b3e73b5a8f	Documentation/amdgpu: Add FRU attribute details Add documentation for the newly added manufacturer and fru_id attributes in sysfs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:25 -04:00
Lijo Lazar	ac6b1f275f	drm/amdgpu: Add more FRU field information Add support to read Manufacturer Name and FRU File Id fields. Also add sysfs device attributes for external usage. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:17 -04:00
Lijo Lazar	8a2b51392a	drm/amdgpu: Refactor FRU product information Keep FRU related information together in a separate structure. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:52:08 -04:00
Yang Wang	be2e8aca06	drm/amdgpu: enable FRU device for SMU v13.0.6 v1: enable GFX v9.4.3 FRU device to query board information. v2: use MP1 version to identify different asic Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:51:58 -04:00
Boyuan Zhang	6cb8e3ee3a	drm/amdgpu: update ib start and size alignment Update IB starting address alignment and size alignment with correct values for decode and encode IPs. Decode IB starting address alignment: 256 bytes Decode IB size alignment: 64 bytes Encode IB starting address alignment: 256 bytes Encode IB size alignment: 4 bytes Also bump amdgpu driver version for this update. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 16:51:39 -04:00
Kees Cook	c8e7df374b	drm/amdgpu: Annotate struct amdgpu_bo_list with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct amdgpu_bo_list. Additionally, since the element count member must be set before accessing the annotated flexible array member, move its initialization earlier. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-hardening@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam	e0a3e7bf62	drm/amdgpu: Drop unnecessary return statements There is no reason to call return at the end of function that returns void. Fixes the below: WARNING: void function return statements are not generally useful Thus remove such a statement in the affected functions. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	4798db85b7	Documentation/amdgpu: Add board info details Add documentation for board info sysfs attribute. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	76da73f026	drm/amdgpu: Add sysfs attribute to get board info Add a sysfs attribute which shows the board form factor like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	b0a4553336	drm/amdgpu: Get package types for smuio v13.0 Add support to query package types supported in smuio v13.0 ASICs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	4365d2ed09	drm/amdgpu: Add more smuio v13.0.3 package types Expand support to get other board types like OAM or CEM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Sathishkumar S	cbad0dd13a	drm/amdgpu: fix ip count query for xcp partitions fix wrong ip count INFO on spatial partitions. update the query to return the instance count corresponding to the partition id. v2: initialize variables only when required to be (Christian) move variable declarations to the beginning of function (Christian) Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Lijo Lazar	28a3f49609	drm/amdgpu: Move package type enum to amdgpu_smuio Move definition of package type to amdgpu_smuio header and add new package types for CEM and OAM. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Srinivasan Shanmugam	2b6b29f33f	drm/amdgpu: Fix complex macros error Fixes the below: ERROR: Macros with complex values should be enclosed in parentheses WARNING: macros should not use a trailing semicolon +#define amdgpu_inc_vram_lost(adev) atomic_inc(&((adev)->vram_lost_counter)); Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:59:35 -04:00
Alex Deucher	7d3f1d76f3	drm/amdgpu: refine fault cache updates Don't update the fault cache if status is 0. In the multiple fault case, subsequent faults will return a 0 status which is useless for userspace and replaces the useful fault status, so only update if status is non-0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:49:51 -04:00
Alex Deucher	7a41ed8b59	drm/amdgpu: add new INFO ioctl query for the last GPU page fault Add a interface to query the last GPU page fault for the process. Useful for debugging context lost errors. v2: split vmhub representation between kernel and userspace v3: add locking when fetching fault info in INFO IOCTL Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-05 17:49:39 -04:00
Kees Cook	ac8e62ab25	drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct ip_hw_instance. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-2-keescook@chromium.org	2023-10-05 11:29:09 +02:00
Mario Limonciello	134b8c5d86	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:52:05 -04:00
Luben Tuikov	5d061675b7	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:43:26 -04:00
Philip Yang	fdac890966	drm/amdgpu: ratelimited override pte flags messages Use ratelimited version of dev_dbg to avoid flooding dmesg log. No functional change. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:40:11 -04:00
Mario Limonciello	b8e6aec146	drm/amd: Drop all hand-built MIN and MAX macros in the amdgpu base driver Several files declare MIN() or MAX() macros that ignore the types of the values being compared. Drop these macros and switch to min() min_t(), and max() from `linux/minmax.h`. Suggested-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:39:52 -04:00
Alex Deucher	8dbf1ba867	drm/amdgpu: cache gpuvm fault information for gmc7+ Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:37:07 -04:00
Alex Deucher	2e8ef6a561	drm/amdgpu: add cached GPU fault structure to vm struct When we get a GPU page fault, cache the fault for later analysis. Cc: samuel.pitoiset@gmail.com Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:57 -04:00
Rajneesh Bhardwaj	f4bff6e0b9	drm/amdgpu: Use ttm_pages_limit to override vram reporting On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages limit in NPS1 mode. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:42 -04:00
Rajneesh Bhardwaj	9b37d45d79	drm/amdgpu: Rework KFD memory max limits To allow bigger allocations specially on systems such as GFXIP 9.4.3 that use GTT memory for VRAM allocations, relax the limits to maximize ROCm allocations. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:17 -04:00
Alex Deucher	67318cb843	drm/amdgpu/gmc11: set gart placement GC11 Needed to avoid a hardware issue. v2: force high for all GC11 parts for consistency (Alex) v3: rebase Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:12 -04:00
Alex Deucher	917f91d8d8	drm/amdgpu/gmc: add a way to force a particular placement for GART We normally place GART based on the location of VRAM and the available address space around that, but provide an option to force a particular location for hardware that needs it. v2: Switch to passing the placement via parameter Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 18:36:07 -04:00
Lang Yu	a19d934986	drm/amdgpu: correct gpu clock counter query on cyan skilfish Cayn skilfish uses SMUIO v11.0.8 offset. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: <stable@vger.kernel.org> # v5.15+	2023-10-04 18:35:28 -04:00
Mario Limonciello	c4c8955b8a	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:43:11 -04:00
Alex Hung	f6893fcb10	drm/amd/display: Add writeback enable field (wb_enabled) [WHAT] Add a new field to keep track whether a crtc is previously writeback-enabled. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:42:38 -04:00
Alex Hung	cd1a4bc228	drm/amd/display: Hande writeback request from userspace [WHAT] Handle writeback requests and fill in the required information for DWB programming and setup. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:42:06 -04:00
Alex Deucher	0021d70a06	drm/amdkfd: drop struct kfd_cu_info I think this was an abstraction back from when kfd supported both radeon and amdgpu. Since we just support amdgpu now, there is no more need for this and we can use the amdgpu structures directly. This also avoids having the kfd_cu_info structures on the stack when inlining which can blow up the stack. Cc: Arnd Bergmann <arnd@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-03 15:41:13 -04:00
Dave Airlie	79fb229b88	Merge tag 'drm-misc-next-2023-09-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: UAPI Changes: - drm_file owner is now updated during use, in the case of a drm fd opened by the display server for a client, the correct owner is displayed. - Qaic gains support for the QAIC_DETACH_SLICE_BO ioctl to allow bo recycling. Cross-subsystem Changes: - Disable boot logo for au1200fb, mmpfb and unexport logo helpers. Only fbcon should manage display of logo. - Update freescale in MAINTAINERS. - Add some bridge files to bridge in MAINTAINERS. - Update gma500 driver repo in MAINTAINERS to point to drm-misc. Core Changes: - Move size computations to drm buddy allocator. - Make drm_atomic_helper_shutdown(NULL) a nop. - Assorted small fixes in drm_debugfs, DP-MST payload addition error handling. - Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR handling. - Handle bad (h/v)sync_end in EDID by clipping to htotal. - Build GPUVM as a module. Driver Changes: - Simple drivers don't need to cache prepared result. - Call drm_atomic_helper_shutdown() in shutdown/unbind for a whole lot more drm drivers. - Assorted small fixes in amdgpu, ssd130x, bridge/it6621, accel/qaic, nouveau, tc358768. - Add NV12 for komeda writeback. - Add arbitration lost event to synopsis/dw-hdmi-cec. - Speed up s/r in nouveau by not restoring some big bo's. - Assorted nouveau display rework in preparation for GSP-RM, especially related to how the modeset sequence works and the DP sequence in relation to link training. - Update anx7816 panel. - Support NVSYNC and NHSYNC in tegra. - Allow multiple power domains in simple driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f1fae5eb-25b8-192a-9a53-215e1184ce81@linux.intel.com	2023-09-29 08:27:15 +10:00
Tao Zhou	fc59889071	drm/amdgpu: update retry times for psp vmbx wait Increase the retry loops and replace the constant number with macro. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Tao Zhou	1934907234	drm/amdgpu: exit directly if gpu reset fails No need to perform the full reset operation in case of gpu reset failure. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:44:36 -04:00
Mario Limonciello	93499bd6cd	drm/amd: Move microcode init from sw_init to early_init for CIK SDMA As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:17 -04:00
Mario Limonciello	751e293f2c	drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:11 -04:00
Mario Limonciello	cc76630483	drm/amd: Move microcode init from sw_init to early_init for SDMA v3.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:38:05 -04:00
Mario Limonciello	e0d4fbb58c	drm/amd: Move microcode init from sw_init to early_init for SDMA v5.2 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:50 -04:00
Mario Limonciello	95b456d3b0	drm/amd: Move microcode init from sw_init to early_init for SDMA v6.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:43 -04:00
Mario Limonciello	ed1c1053cd	drm/amd: Move microcode init from sw_init to early_init for SDMA v5.0 As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:37:34 -04:00
Mario Limonciello	161d076c2d	drm/amd: Drop error message about failing to load SDMA firmware The error path for SDMA firmware loading is unnecessarily noisy. When a firmware is missing 3 errors show up: ``` amdgpu 0000:07:00.0: Direct firmware load for amdgpu/green_sardine_sdma.bin failed with error -2 [drm:sdma_v4_0_early_init [amdgpu]] ERROR Failed to load sdma firmware! [drm:amdgpu_device_init [amdgpu]] ERROR early_init of IP block <sdma_v4_0> failed -19 ``` The error code for the device init is bubbled up already, remove the second one. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:36:41 -04:00
Le Ma	3152d01e88	drm/amd/pm: deprecate allow_xgmi_power_down interface Replace with set_plpd_mode uniformly for places to use. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:36:23 -04:00
Mario Limonciello	3657a1d5ac	drm/amd: Limit seamless boot by default to APUs A hang is reported on DCN 3.2 with seamless boot enabled. As the benefits come from an eDP setup, limit it to only enabled by default with APUs. Suggested-by: Alexander.Deucher@amd.com Reported-by: feifei.xu@amd.com Closes: https://lore.kernel.org/amd-gfx/85b427f6-11ec-4249-bf6f-eadf9c375f88@amd.com/T/#m2887e919d7c01b2a4860d2261b366d22e070f309 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-28 15:35:31 -04:00
Alex Deucher	b2e1cbe628	drm/amdgpu/gmc11: disable AGP on GC 11.5 AGP aperture is deprecated and no longer functional. v2: fix typo (Alex) v3: just skip the agp setup call v4: revert back to the original model v5: back to v3 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
David (Ming Qiang) Wu	fa1f1cc09d	drm/amdgpu: not to save bo in the case of RAS err_event_athub err_event_athub will corrupt VCPU buffer and not good to be restored in amdgpu_vcn_resume() and in this case the VCPU buffer needs to be cleared for VCN firmware to work properly. Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
Luben Tuikov	9ed630c5c4	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:23 -04:00
Alex Deucher	de59b69932	drm/amdgpu/gmc: set a default disable value for AGP To disable AGP, the start needs to be set to a higher value than the end. Set a default disable value for the AGP aperture and allow the IP specific GMC code to enable it selectively be calling amdgpu_gmc_agp_location(). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Alex Deucher	29495d8145	drm/amdgpu/gmc6-8: properly disable the AGP aperture The BOT register needs to be larger than the TOP register for this to be properly disabled. The lower 22 bits of the BOT address are always 0 and the lower 22 bits of the TOP register are always 1 so you need to make the upper bits of BOT larger than the upper bits of BOT. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Mangesh Gadre	cd956e7531	drm/amdgpu:Expose physical id of device in XGMI hive This identifies the physical ordering of devices in the hive v2: fix compilation issue Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:22 -04:00
Lang Yu	7021b397c6	drm/amdgpu/vpe: fix truncation warnings Fix truncation warnings. Fixes: `9d4346bdbc` ("drm/amdgpu: add VPE 6.1.0 support") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202309200028.aUVuM8os-lkp@intel.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 17:00:21 -04:00
Philip Yang	101b810430	drm/amdkfd: Move dma unmapping after TLB flush Otherwise GPU may access the stale mapping and generate IOMMU IO_PAGE_FAULT. Move this to inside p->mutex to prevent multiple threads mapping and unmapping concurrently race condition. After kfd_mem_dmaunmap_attachment is removed from unmap_bo_from_gpuvm, kfd_mem_dmaunmap_attachment is called if failed to map to GPUs, and before free the mem attachment in case failed to unmap from GPUs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:10 -04:00
Christian König	08abccc9a7	drm/amdgpu: further move TLB hw workarounds a layer up For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	e2e3788850	drm/amdgpu: rework lock handling for flush_tlb v2 Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. v2: fix typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	3983c9fd2d	drm/amdgpu: drop error return from flush_gpu_tlb_pasid That function never fails, drop the error return. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	041a574388	drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	72cc99205c	drm/amdgpu: cleanup gmc_v10_0_flush_gpu_tlb_pasid The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	e7b90e99fa	drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, invalidate each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	0c525aa406	drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than once VMID, build a mask of VMIDs to invalidate instead of just restting the first one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	fb4c52db69	drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasid Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than once VMID, build a mask of VMIDs to invalidate instead of just restting the first one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	a54db42ff3	drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlb Remove leftovers from copying this from the gmc v10 code. v2: squash in fix from Yifan Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:55:09 -04:00
Christian König	a70cb2176f	drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb v2 Move the SDMA workaround necessary for Navi 1x into a higher layer. v2: use dev_err Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Tao Zhou	8c14a67bdf	drm/amdgpu: change if condition for bad channel bitmap update The amdgpu_ras_eeprom_control.bad_channel_bitmap is u32 type, but the channel index could be larger than 32. For the ASICs whose channel number is more than 32, the amdgpu_dpm_send_hbm_bad_channel_flag interface is not supported, so we simply bypass channel bitmap update under this condition. v2: replace sizeof with BITS_PER_TYPE, we should check bit number instead of byte number. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Tao Zhou	6205b558e1	drm/amdgpu: fix value of some UMC parameters for UMC v12 Prepare for bad page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:52 -04:00
Philip Yang	e61801f162	drm/amdkfd: Don't use sw fault filter if retry cam enabled If retry cam enabled, we don't use sw retry fault filter and add fault into sw filter ring, so we shouldn't remove fault from sw filter. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Christian König	24a6eb92b7	drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Philip Yang	bcfb9cee61	drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPU On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900", because dGPU mode has 272 cam entries. After increasing IH soft ring to 512 entries, no more IH soft ring overflow message and application passed. Fixes: `bf80d34b6c` ("drm/amdgpu: Increase soft IH ring size") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Lijo Lazar	c45e38f217	drm/amdgpu: Restore partition mode after reset On a full device reset, PSP FW gets unloaded. Hence restore the partition mode by placing a new request. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-26 16:54:51 -04:00
Cong Liu	f387bb578d	drm/amdgpu: fix a memory leak in amdgpu_ras_feature_enable This patch fixes a memory leak in the amdgpu_ras_feature_enable() function. The leak occurs when the function sends a command to the firmware to enable or disable a RAS feature for a GFX block. If the command fails, the kfree() function is not called to free the info memory. Fixes: `9f051d6ff1` ("drm/amdgpu: Free ras cmd input buffer properly") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 17:27:04 -04:00
Lijo Lazar	06cce38ef5	Revert "drm/amdgpu: Report vbios version instead of PN" This reverts commit `7748ce5b69`. vbios_version sysfs node is used to identify Part Number also. Revert to the same so that it doesn't break scripts/software which parse this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 17:26:15 -04:00
Lijo Lazar	ff96ddc3f2	drm/amdgpu: Add more fields to IP version Include subrevision and variant fileds also to IP version. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:25:17 -04:00
Tao Zhou	f8754f58d6	drm/amdgpu: print channel index for UMC bad page Print channel index for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:25:17 -04:00
Sathishkumar S	5aba51233b	drm/amdgpu: update IP count INFO query update the query to return the number of functional instances where there is more than an instance of the requested type and for others continue to return one. v2: count must reflect the actual number of engines (Alex) v3: fix wrong number of engines for vcn (Alex) Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
Stanley.Yang	a83f2bf1f4	drm/amdgpu: Fix false positive error log It should first check block ras obj whether be set, it should return 0 directly if block ras obj or hw_ops is not set. If block doesn't support RAS just return 0 is fine. Changed from V1: return 0 directly if block ras obj or hw ops is not set Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
Vignesh Chander	8c95cda3e1	drm/amdgpu/jpeg: skip set pg for sriov Host handles PG. Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:09 -04:00
André Almeida	a769178585	drm/amdgpu: Rework coredump to use memory dynamically Instead of storing coredump information inside amdgpu_device struct, move if to a proper separated struct and allocate it dynamically. This will make it easier to further expand the logged information. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Cong Liu	5838f74c29	drm/amdgpu: fix a memory leak in amdgpu_ras_feature_enable This patch fixes a memory leak in the amdgpu_ras_feature_enable() function. The leak occurs when the function sends a command to the firmware to enable or disable a RAS feature for a GFX block. If the command fails, the kfree() function is not called to free the info memory. Fixes: `9f051d6ff1` ("drm/amdgpu: Free ras cmd input buffer properly") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Cong Liu <liucong2@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:07 -04:00
Lijo Lazar	24f60ddc4b	drm/amdgpu: Fix vbios version string search Search for vbios version string in STRING_OFFSET-ATOM_ROM_HEADER region first. If those offsets are not populated, use the hardcoded region. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
Lijo Lazar	2af351d692	Revert "drm/amdgpu: Report vbios version instead of PN" This reverts commit `7748ce5b69`. vbios_version sysfs node is used to identify Part Number also. Revert to the same so that it doesn't break scripts/software which parse this. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
David Francis	5f248462c6	drm/amdgpu: Add EXT_COHERENT memory allocation flags These flags (for GEM and SVM allocations) allocate memory that allows for system-scope atomic semantics. On GFX943 these flags cause caches to be avoided on non-local memory. On all other ASICs they are identical in functionality to the equivalent COHERENT flags. Corresponding Thunk patch is at https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 Reviewed-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 16:24:06 -04:00
Yang Wang	4051844c66	drm/amdgpu: add amdgpu mca debug sysfs support add amdgpu mca debug sysfs support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:19 -04:00
Alex Deucher	d11bbacee3	drm/amdgpu: add VPE IP discovery info to HW IP info query Add missing IP discovery info. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:11 -04:00
Yang Wang	7ff607e272	drm/amdgpu: add amdgpu smu mca dump feature support add amdgpu smu mca dump feature support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:25:01 -04:00
Mario Limonciello	7f4ce7b50a	drm/amd: Enable seamless boot by default on newer ASICs Seamless boot can technically be supported as far back as DCN1 but to avoid regressions on older hardware, enable it for DCN3 and later. If users report using the module parameter that it works on older ASICs as well, this can be adjusted. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:46 -04:00
Mario Limonciello	5dc270d366	drm/amd: Add a module parameter for seamless boot The module parameter can be used to test more easily enabling seamless boot support on additional ASICs. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:39 -04:00
Timmy Tsai	2fa73a101c	drm/amd: Add HDP flush during jpeg init During jpeg init, CPU writes to frame buffer which can be cached by HDP, occasionally causing invalid header to be sent to MMSCH. Perform HDP flush after writing to frame buffer before continuing with jpeg init sequence. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:28 -04:00
Mario Limonciello	bb0f84293e	drm/amd: Move seamless boot check out of display This will allow base driver to dictate whether seamless should be enabled. No intended functional changes. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:21 -04:00
Mario Limonciello	3ef07651a5	drm/amd: Drop special case for yellow carp without discovery `amdgpu_gmc_get_vbios_allocations` has a special case for how to bring up yellow carp when amdgpu discovery is turned off. As this ASIC ships with discovery turned on, it's generally dead code and worse it causes `adev->mman.keep_stolen_vga_memory` to not be initialized for yellow carp. Remove it. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:24:04 -04:00
Lijo Lazar	4e8303cf2c	drm/amdgpu: Use function for IP version check Use an inline function for version check. Gives more flexibility to handle any format changes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-20 12:23:28 -04:00
Tvrtko Ursulin	1c7a387ffe	drm: Update file owner during use With the typical model where the display server opens the file descriptor and then hands it over to the client(), we were showing stale data in debugfs. Fix it by updating the drm_file->pid on ioctl access from a different process. The field is also made RCU protected to allow for lockless readers. Update side is protected with dev->filelist_mutex. Before: $ cat /sys/kernel/debug/dri/0/clients command pid dev master a uid magic Xorg 2344 0 y y 0 0 Xorg 2344 0 n y 0 2 Xorg 2344 0 n y 0 3 Xorg 2344 0 n y 0 4 After: $ cat /sys/kernel/debug/dri/0/clients command tgid dev master a uid magic Xorg 830 0 y y 0 0 xfce4-session 880 0 n y 0 1 xfwm4 943 0 n y 0 2 neverball 1095 0 n y 0 3 ) More detailed and historically accurate description of various handover implementation kindly provided by Emil Velikov: """ The traditional model, the server was the orchestrator managing the primary device node. From the fd, to the master status and authentication. But looking at the fd alone, this has varied across the years. IIRC in the DRI1 days, Xorg (libdrm really) would have a list of open fd(s) and reuse those whenever needed, DRI2 the client was responsible for open() themselves and with DRI3 the fd was passed to the client. Around the inception of DRI3 and systemd-logind, the latter became another possible orchestrator. Whereby Xorg and Wayland compositors could ask it for the fd. For various reasons (hysterical and genuine ones) Xorg has a fallback path going the open(), whereas Wayland compositors are moving to solely relying on logind... some never had fallback even. Over the past few years, more projects have emerged which provide functionality similar (be that on API level, Dbus, or otherwise) to systemd-logind. """ v2: * Fixed typo in commit text and added a fine historical explanation from Emil. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230621094824.2348732-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2023-09-20 15:27:44 +02:00
Dave Airlie	1216d49178	amd-drm-fixes-6.6-2023-09-13: amdgpu: - GC 9.4.3 fixes - Fix white screen issues with S/G display on system with >= 64G of ram - Replay fixes - SMU 13.0.6 fixes - AUX backlight fix - NBIO 4.3 SR-IOV fixes for HDP - RAS fixes - DP MST resume fix - Fix segfault on systems with no vbios - DPIA fixes amdkfd: - CWSR grace period fix - Unaligned doorbell fix - CRIU fix for GFX11 - Add missing TLB flush on gfx10 and newer -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZQIRSAAKCRC93/aFa7yZ 2O/nAP4zB0fdLB46Hhz11aYsE9Zghe91b2rcmF4EYpEAQs7awwEAhSjy0Wiy6EYb prEGCdW0O8Tq7fdjr7+JrPmF7dasAQk= =SUbg -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.6-2023-09-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.6-2023-09-13: amdgpu: - GC 9.4.3 fixes - Fix white screen issues with S/G display on system with >= 64G of ram - Replay fixes - SMU 13.0.6 fixes - AUX backlight fix - NBIO 4.3 SR-IOV fixes for HDP - RAS fixes - DP MST resume fix - Fix segfault on systems with no vbios - DPIA fixes amdkfd: - CWSR grace period fix - Unaligned doorbell fix - CRIU fix for GFX11 - Add missing TLB flush on gfx10 and newer Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230913195009.7714-1-alexander.deucher@amd.com	2023-09-15 09:50:50 +10:00
Daniel Vetter	15794f9dc3	One doc fix for drm/connector, one fix for amdgpu for an crash when VRAM usage is high, and one fix in gm12u320 to fix the timeout units in the code -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZPl/TAAKCRDj7w1vZxhR xWZZAP0b3k5vIuQdbiZBdXy7+guakiJ2DqOMxJJ+sYS5Mun53AEA73Cu1gmBNMoT d8H1uBjOfvPcXANNI0t0OgJfrESOdg8= =atPC -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2023-09-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes One doc fix for drm/connector, one fix for amdgpu for an crash when VRAM usage is high, and one fix in gm12u320 to fix the timeout units in the code Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/w5nlld5ukeh6bgtljsxmkex3e7s7f4qquuqkv5lv4cv3uxzwqr@pgokpejfsyef	2023-09-14 14:00:51 +02:00
Alex Deucher	addd7aef25	drm/amdgpu: add remap_hdp_registers callback for nbio 7.11 Implement support for remapping the HDP aperture registers for NBIO 7.11. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-12 17:30:22 -04:00
Alex Deucher	b85a17d354	drm/amdgpu: add vcn_doorbell_range callback for nbio 7.11 Implement support for setting up the VCN doorbell range for NBIO 7.11. Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-12 17:30:18 -04:00
Thomas Zimmermann	c900529f3d	Merge drm/drm-fixes into drm-misc-fixes Forwarding to v6.6-rc1. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2023-09-12 08:53:30 +02:00
David Francis	5e7e822542	drm/amdgpu: Handle null atom context in VBIOS info ioctl On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:25:26 -04:00
Hawking Zhang	ffd6bde302	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:20:07 -04:00
Alex Deucher	ab43213e7a	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:19:48 -04:00
Alex Deucher	1832403cd4	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:19:42 -04:00
Hamza Mahfooz	169ed4ece8	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:18:17 -04:00
Mukul Joshi	97e3c6a853	drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3 Currently, we store CU info only for a single XCC assuming that it is the same for all XCCs. However, that may not be true. As a result, store CU info for all XCCs. This info is later used for CU masking. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:16:31 -04:00
Mukul Joshi	81faf9e0c3	drm/amdkfd: Fix reg offset for setting CWSR grace period This patch fixes the case where the code currently passes absolute register address and not the reg offset, which HWS expects, when sending the PM4 packet to set/update CWSR grace period. Additionally, cleanup the signature of build_grace_period_packet_info function as it no longer needs the inst parameter. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 18:15:43 -04:00
David Francis	86f2ec2265	drm/amdgpu: Handle null atom context in VBIOS info ioctl On some APU systems, there is no atom context and so the atom_context struct is null. Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl to handle this case, returning all zeroes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:22:32 -04:00
André Almeida	ffde72107b	drm/amdgpu: Create an option to disable soft recovery Create a module option to disable soft recoveries on amdgpu, making every recovery go through the device reset path. This option makes easier to force device resets for testing and debugging purposes. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:22:23 -04:00
André Almeida	887db1e49a	drm/amdgpu: Merge debug module parameters Merge all developer debug options available as separated module parameters in one, making it obvious that are for developers. Drop the obsolete module options in favor of the new ones. Signed-off-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:19:54 -04:00
Yifan Zhang	0e64c9aad0	drm/amdgpu: add type conversion for gc info gc info usage misses type conversion. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Li Ma <li.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:17:30 -04:00
Mukul Joshi	68fa72a437	drm/amdgpu: Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES to conform with the naming convention followed in amdgpu_gfx.h. No functional change. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:41 -04:00
Tao Zhou	ced575203a	drm/amdgpu: print more address info of UMC bad page Print out row, column and bank value of UMC error address for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:15 -04:00
Hawking Zhang	9f9d4651f7	drm/amdgpu: fallback to old RAS error message for aqua_vanjaram So driver doesn't generate incorrect message until the new format is settled down for aqua_vanjaram Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:15:02 -04:00
Alex Deucher	6a82822b90	drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV Needed for HDP flush to work correctly. Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:14:59 -04:00
Alex Deucher	8a6e26e7ef	drm/amdgpu/soc21: don't remap HDP registers for SR-IOV This matches the behavior for soc15 and nv. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Timmy Tsai <timmtsai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:14:50 -04:00
Hamza Mahfooz	601c63ad8e	Revert "drm/amd: Disable S/G for APUs when 64GB or more host memory" This reverts commit `70e64c4d52`. Since, we now have an actual fix for this issue, we can get rid of this workaround as it can cause pin failures if enough VRAM isn't carved out by the BIOS. Cc: stable@vger.kernel.org # 6.1+ Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:12:20 -04:00
Tao Zhou	3cb9ebc9d6	drm/amdgpu: add channel index table for UMC v12 Get UMC phyical channel index according to node id, umc instance and channel instance. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-09-11 17:10:58 -04:00

1 2 3 4 5 ...

13287 Commits