linux

Commit Graph

Author	SHA1	Message	Date
YiPeng Chai	83d29a5f8a	drm/amdgpu: Fixed psp fence and memory issues when removing amdgpu device V3: Fixed psp fence and memory issues for the asic using smu v13_0_2 when removing amdgpu device. [Why]: 1. psp_suspend->psp_free_shared_bufs-> psp_ta_free_shared_buf-> amdgpu_bo_free_kernel-> ...->amdgpu_bo_release_notify-> amdgpu_fill_buffer psp will free vram memory used by psp when psp_suspend is called. But for the asic using smu v13_0_2, because psp_suspend is called before adev->shutdown is set to true when removing the first hive device, amdgpu fill_buffer will be called, which will cause fence issues when evicting all vram resources in amdgpu vram mgr_fini. 2. Since psp_hw_fini is not called after calling psp_suspend and psp_suspend only calls psp_ring_stop, the psp ring memory will not be released when amdgpu device is removed. [How]: 1. Set shutdown to true before calling amdgpu_device_gpu_recover, then amdgpu_fill_buffer will not be called when psp_suspend is called. 2. Free psp ring memory in psp_sw_fini. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:17:47 -04:00
YiPeng Chai	f5c7e77970	drm/amdgpu: Adjust removal control flow for smu v13_0_2 Adjust removal control flow for smu v13_0_2: During amdgpu uninstallation, when removing the first device, the kernel needs to first send a mode1reset message to all gpu devices. Otherwise, smu initialization will fail the next time amdgpu is installed. V2: 1. Update commit comments. 2. Remove the global variable amdgpu_device_remove_cnt and add a variable to the structure amdgpu_hive_info. 3. Use hive to detect the first removed device instead of a global variable. V3: 1. Update commit comments. 2. Split a patch into multiple patches. 3. The current patch does: a. Add a work mode of AMDGPU_RESET_FOR_DEVICE_REMOVE into the existing gpu recover path, which make all devices in hive list only have HW reset but no resume (except the base IP). b. Call AMDGPU_RESET_FOR_DEVICE_REMOVE and AMDGPU_NEED_FULL_RESET mode of amdgpu_device_gpu_recover in amdgpu_pci_remove when removing the first device in hive list. c. When removing the first device, the IP blocks keyword function call sequence is as follows: .suspend->mode1reset->.resume(basic ip)->.hw_fini->.early_fini->.sw_fini. ^ \| \|-<----------<---------<----\| The first three sequences are because of a call to amdgpu_device_gpu_recover. The three sequences will be executed in a loop until all devices in the hive list are iterated. The sequences starting from .hw_fini only apply to the first device. Since .suspend has been called before, except the resumed phase1 basic ip blocks, all other ip blocks .hw_fini of current device will do nothing. d. When removing other devices, the calling sequences is the same as legacy: .hw_fini -> .early_fini -> .sw_fini. Since .suspend has been called when removing the first device, except the resumed phase1 basic ip blocks, all of other ip blocks .hw_fini of current device will do nothing. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:17:20 -04:00
Yifan Zhang	10faf07871	drm/amdgpu: add MES and MES-KIQ version in debugfs This patch addes MES and MES-KIQ version in debugfs. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:10:04 -04:00
Yang Li	7f89f9973c	drm/amd/display: clean up some inconsistent indentings clean up some inconsistent indentings Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2178 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:07:54 -04:00
Hawking Zhang	670c6edfbb	drm/amdgpu: add rlcv/rlcp version info to debugfs amdgpu_firmware_info debugfs will show rlcv/rlcp ucode version info Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:07:40 -04:00
Hawking Zhang	d5c6ad7296	drm/amdgpu: support print rlc v2_x ucode hdr add rlc v2_x support to print_rlc_hdr helper Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:07:30 -04:00
Hawking Zhang	ed2eee42d3	drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfx cache rlcv/rlcvp ucode version info in amdgpu_gfx structure Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:07:24 -04:00
Mukul Joshi	876552e5d5	drm/amdgpu: Update PTE flags with TF enabled This patch updates the PTE flags when translate further (TF) is enabled: - With translate_further enabled, invalid PTEs can be 0. Reading consecutive invalid PTEs as 0 is considered a fault. To prevent this, ensure invalid PTEs have at least 1 bit set. - The current invalid PTE flags settings to translate a retry fault into a no-retry fault, doesn't work with TF enabled. As a result, update invalid PTE flags settings which works for both TF enabled and disabled case. Fixes: `352e683b72` ("drm/amdgpu: Enable translate_further to extend UTCL2 reach") Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:06:13 -04:00
Philip Yang	3913f0179b	drm/amdgpu: SDMA update use unlocked iterator SDMA update page table may be called from unlocked context, this generate below warning. Use unlocked iterator to handle this case. WARNING: CPU: 0 PID: 1475 at drivers/dma-buf/dma-resv.c:483 dma_resv_iter_next Call Trace: dma_resv_iter_first+0x43/0xa0 amdgpu_vm_sdma_update+0x69/0x2d0 [amdgpu] amdgpu_vm_ptes_update+0x29c/0x870 [amdgpu] amdgpu_vm_update_range+0x2f6/0x6c0 [amdgpu] svm_range_unmap_from_gpus+0x115/0x300 [amdgpu] svm_range_cpu_invalidate_pagetables+0x510/0x5e0 [amdgpu] __mmu_notifier_invalidate_range_start+0x1d3/0x230 unmap_vmas+0x140/0x150 unmap_region+0xa8/0x110 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-19 15:05:54 -04:00
Yifan Zhang	0af4ed0c32	drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1 there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd engine exists and map/unmap SDMA queues to the non-existent engine. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-14 15:00:34 -04:00
Alex Deucher	a8671493d2	drm/amdgpu: make sure to init common IP before gmc Move common IP init before GMC init so that HDP gets remapped before GMC init which uses it. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-09-14 14:21:49 -04:00
Alex Deucher	e3163bc8ff	drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega This mirrors what we do for other asics and this way we are sure the sdma doorbell range is properly initialized. There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. However, the statement says that it applies to multimedia as well, but the VCN code currently initializes doorbells after GFX and there are no known issues there. In my testing at least I don't see any problems on SDMA. This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-09-14 14:21:43 -04:00
Alex Deucher	dc1d85cb79	drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega This mirrors what we do for other asics and this way we are sure the ih doorbell range is properly initialized. There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. In this case IH is initialized before GFX, so there should be no issue. This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-09-14 14:21:25 -04:00
Alex Deucher	c1c39032a0	drm/amdgpu: make sure to init common IP before gmc Move common IP init before GMC init so that HDP gets remapped before GMC init which uses it. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-14 12:38:52 -04:00
Alex Deucher	59c43748c7	drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega This mirrors what we do for other asics and this way we are sure the sdma doorbell range is properly initialized. There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. However, the statement says that it applies to multimedia as well, but the VCN code currently initializes doorbells after GFX and there are no known issues there. In my testing at least I don't see any problems on SDMA. This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-14 12:38:52 -04:00
Alex Deucher	db10109793	drm/amdgpu: move nbio ih_doorbell_range() into ih code for vega This mirrors what we do for other asics and this way we are sure the ih doorbell range is properly initialized. There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. In this case IH is initialized before GFX, so there should be no issue. This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-14 12:38:52 -04:00
Maxime Ripard	3f1a3a28e9	Immutable backlight-detect-refactor branch between acpi, drm-* and pdx86 Tag (immutable branch) with v6.0-rc1 + the (acpi/x86) backlight detect refactor work. For merging into the acpi, drm-* and pdx86 subsystems. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmMVsogUHGhkZWdvZWRl QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yy6wgAlig+7hkq940L62lTpj0g2gNQv8zc HCsMpnU7dnJcZYaEvIjouZhf33ZbN52c0fQq2JWjt7fFX04LLyIiyrJ26Lc293JR ++yXpJcVoewRGqApy/P3Z05TKUCLll5bexvK4t8isnhOtEXD/nDPWKTLIV2Kd1DK nLY4KgRznXZ85RhYheUEdidZ7Lwlzt1JVBMq7tpnzu3nVdDExyZmqlqCUITcLynu ysuASQGr0D2i+1vb9eifHIA3xsQO0S37Bv62aBMBKxB6B8Fz1DYr8VA2YvoT82Hv IFT0hzCCZ/63Ljga05O78TwraxAQX0RvZWqjqGgnZg6fIBh2hxUiqeQY6g== =SA1R -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iHUEABMIAB0WIQTXEe0+DlZaRlgM8LOIQ8rmN6G3ywUCYyG6jgAKCRCIQ8rmN6G3 y3JSAQCKELIhrWPrqAixxWn3OWcaU9PHb4Uhf9Qg3sLL2Bm2TAEAuT8iJ0Kssakg GKWScS5nj+8kbFm0J067JLOOYjvTxEI= =42X0 -----END PGP SIGNATURE----- Merge tag 'backlight-detect-refactor-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 into drm-misc-next Immutable backlight-detect-refactor branch between acpi, drm-* and pdx86 Tag (immutable branch) with v6.0-rc1 + the (acpi/x86) backlight detect refactor work. For merging into the acpi, drm-* and pdx86 subsystems. Signed-off-by: Maxime Ripard <maxime@cerno.tech> # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmMVsogUHGhkZWdvZWRl # QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yy6wgAlig+7hkq940L62lTpj0g2gNQv8zc # HCsMpnU7dnJcZYaEvIjouZhf33ZbN52c0fQq2JWjt7fFX04LLyIiyrJ26Lc293JR # ++yXpJcVoewRGqApy/P3Z05TKUCLll5bexvK4t8isnhOtEXD/nDPWKTLIV2Kd1DK # nLY4KgRznXZ85RhYheUEdidZ7Lwlzt1JVBMq7tpnzu3nVdDExyZmqlqCUITcLynu # ysuASQGr0D2i+1vb9eifHIA3xsQO0S37Bv62aBMBKxB6B8Fz1DYr8VA2YvoT82Hv # IFT0hzCCZ/63Ljga05O78TwraxAQX0RvZWqjqGgnZg6fIBh2hxUiqeQY6g== # =SA1R # -----END PGP SIGNATURE----- # gpg: Signature made Mon 05 Sep 2022 09:25:44 AM IST # gpg: using RSA key BAF03B5D2718411A5E9E177E92EC4779440327DC # gpg: issuer "hdegoede@redhat.com" # gpg: Can't check signature: No public key From: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/261afe3d-7790-e945-adf6-a2c96c9b1eff@redhat.com	2022-09-14 12:27:10 +01:00
Alex Deucher	221bb3a9c3	drm/amdgpu: fix warning about missing imu prototype for imu_v11_0_3_program_rlc_ram(). Include imu_v11_0_3.h in imu_v11_0_3.c. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Christian König	d4e8ad908b	drm/amdgpu: reorder CS code Sort the functions in the order they are called Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Christian König	88c98d54b2	drm/amdgpu: cleanup CS init/fini and pass1 Cleanup the coding style and function names to represent the data they process. Only initialize and cleanup the CS structure in init/fini. Check the size of the IB chunk in pass1. v2: fix job initialisation order and use correct scheduler instance v3: try to move all functional changes into a separate patch. v4: move reordering and pass2 out of this patch as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Christian König	4247084057	drm/amdgpu: use DMA_RESV_USAGE_BOOKKEEP v2 Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence. v2: actually update all usages for KFD Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Christian König	dd80d9c8ee	drm/amdgpu: revert "partial revert "remove ctx->lock" v2" This reverts commit `94f4c4965e`. We found that the bo_list is missing a protection for its list entries. Since that is fixed now this workaround can be removed again. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Christian König	736ec9fadd	drm/amdgpu: move setting the job resources Move setting the job resources into amdgpu_job.c Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Christian König	9b94c609cc	drm/amdgpu: remove SRIOV and MCBP dependencies from the CS We should not have any different CS constrains based on the execution environment. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:33:01 -04:00
Candice Li	6da15a236c	drm/amdgpu: Skip reset error status for psp v13_0_0 No need to reset error status since only umc ras supported on psp v13_0_0. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:59 -04:00
Candice Li	1582252946	drm/amdgpu: Add EEPROM I2C address for smu v13_0_0 Set correct EEPROM I2C address for smu v13_0_0. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
John Clements	c3db1b9065	drm/amdgpu: added support for ras driver loading copy ras driver to psp if present Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
Alex Deucher	7a87040ca6	drm/amdgpu: add HDP remap functionality to nbio 7.7 Was missing before and would have resulted in a write to a non-existant register. Normally APUs don't use HDP, but other asics could use this code and APUs do use the HDP when used in passthrough. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
Jingyu Wang	e7c94bfb74	drm/amdgpu: cleanup coding style in amdgpu_amdkfd_gpuvm.c Fix everything checkpatch.pl complained about in amdgpu_amdkfd_gpuvm.c Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
Jingyu Wang	2f4ca1ba6c	drm/amdgpu: cleanup coding style in amdgpu_amdkfd.c Fix everything checkpatch.pl complained about in amdgpu_amdkfd.c Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
Jingyu Wang	826f03b8ac	drm/amdgpu: cleanup coding style in amdgpu_sync.c file This is a patch to the amdgpu_sync.c file that fixes some warnings found by the checkpatch.pl tool Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
Jingyu Wang	556bdae320	drm/amdgpu: cleanup coding style in amdgpu_acpi.c Fix everything checkpatch.pl complained about in amdgpu_acpi.c Signed-off-by: Jingyu Wang <jingyuwang_vip@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
zhang songyi	79c0d7ddcb	drm/amdgpu: Remove the unneeded result variable Return the sdma_v6_0_start() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:58 -04:00
Vignesh Chander	2efc30f016	drm/amdgpu: Fix hive reference count leak both get_xgmi_hive and put_xgmi_hive can be skipped since the reset domain is not necessary for VF Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:32:57 -04:00
Candice Li	86875d558b	drm/amdgpu: Skip reset error status for psp v13_0_0 No need to reset error status since only umc ras supported on psp v13_0_0. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:26:59 -04:00
Alex Deucher	8c5708d3da	drm/amdgpu: add HDP remap functionality to nbio 7.7 Was missing before and would have resulted in a write to a non-existant register. Normally APUs don't use HDP, but other asics could use this code and APUs do use the HDP when used in passthrough. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:26:59 -04:00
Yang Wang	36de13fdb0	drm/amdgpu: change the alignment size of TMR BO to 1M align TMR BO size TO tmr size is not necessary, modify the size to 1M to avoid re-create BO fail when serious VRAM fragmentation. v2: add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:26:59 -04:00
Candice Li	df2c6e0c95	drm/amdgpu: Enable full reset when RAS is supported on gc v11_0_0 Enable full reset for RAS supported configuration on gc v11_0_0. v2: simplify the code. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:25:39 -04:00
Hamza Mahfooz	66f99628eb	drm/amdgpu: use dirty framebuffer helper Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs struct. Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:25:39 -04:00
Lijo Lazar	6c20490663	drm/amdgpu: Don't enable LTR if not supported As per PCIE Base Spec r4.0 Section 6.18 'Software must not enable LTR in an Endpoint unless the Root Complex and all intermediate Switches indicate support for LTR.' This fixes the Unsupported Request error reported through AER during ASPM enablement. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216455 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 14:25:39 -04:00
Yang Wang	cd3a49af58	drm/amdgpu: change the alignment size of TMR BO to 1M align TMR BO size TO tmr size is not necessary, modify the size to 1M to avoid re-create BO fail when serious VRAM fragmentation. v2: add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 12:56:16 -04:00
Candice Li	34dfca8908	drm/amdgpu: Enable full reset when RAS is supported on gc v11_0_0 Enable full reset for RAS supported configuration on gc v11_0_0. v2: simplify the code. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 12:56:16 -04:00
Candice Li	d27ec594b4	drm/amdgpu: Rely on MCUMC_STATUS for umc v8_10 correctable error counter only Only check MCUMC_STATUS for CE counter for umc v8_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 12:56:16 -04:00
shaoyunl	46c676600c	drm/amdgpu: Use per device reset_domain for XGMI on sriov configuration For SRIOV configuration, host driver control the reset method(either FLR or heavier chain reset). The host will notify the guest individually with FLR message if individual GPU within the hive need to be reset. So for guest side, no need to use hive->reset_domain to replace the original per device reset_domain Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 12:54:23 -04:00
Hamza Mahfooz	542ab49173	drm/amdgpu: use dirty framebuffer helper Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs struct. Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 12:54:23 -04:00
Lijo Lazar	dd6aeb4e5f	drm/amdgpu: Don't enable LTR if not supported As per PCIE Base Spec r4.0 Section 6.18 'Software must not enable LTR in an Endpoint unless the Root Complex and all intermediate Switches indicate support for LTR.' This fixes the Unsupported Request error reported through AER during ASPM enablement. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216455 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: `8795e182b0` ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-13 12:54:22 -04:00
Dave Airlie	47519d8224	Merge tag 'amd-drm-next-6.1-2022-09-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.1-2022-09-08: amdgpu: - Mode2 reset for RDNA2 - Lots of new DC documentation - Add documentation about different asic families - DSC improvements - Aldebaran fixes - Misc spelling and grammar fixes - GFXOFF stats support for vangogh - DC frame size fixes - NBIO 7.7 updates - DCN 3.2 updates - DCN 3.1.4 Updates - SMU 13.x updates - Misc bug fixes - Rework DC register offset handling - GC 11.x updates - PSP 13.x updates - SDMA 6.x updates - GMC 11.x updates - SR-IOV updates - PSP fixes for TA unloading - DSC passthrough support - Misc code cleanups amdkfd: - ISA fixes for some GC 10.3 IPs - Misc code cleanups radeon: - Delayed work flush fix - Use time_after for some jiffies calculations drm: - DSC passthrough aux support Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220908155202.57862-1-alexander.deucher@amd.com	2022-09-12 19:17:41 +10:00
Dave Airlie	fb34d8a04e	Merge tag 'drm-misc-next-2022-09-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.1-rc1: [airlied - fix sun4i_tv build] UAPI Changes: - Hide unregistered connectors from GETCONNECTOR ioctl. - drm/virtio no longer advertises LINEAR modifier, as it doesn't work. - Cross-subsystem Changes: - Fix GPF in udmabuf failure path. Core Changes: - Rework TTM placement to use intersect/compatible functions. - Drop legacy DP-MST support. - More DP-MST related fixes, and move all state into atomic. - Make DRM_MIPI_DBI select DRM_KMS_HELPER. - Add audio_infoframe packing for DP. - Add logging when some atomic check functions fail. - Assorted documentation updates and fixes. Driver Changes: - Assorted cleanups and fixes in msm, lcdif, nouveau, virtio, panel/ilitek, bridge/icn6211, tve200, gma500, bridge/*, panfrost, via, bochs, qxl, sun4i. - Add add AUO B133UAN02.1, IVO M133NW4J-R3, Innolux N120ACA-EA1 eDP panels. - Improve DP-MST modeset state handling in amdgpu, nouveau, i915. - Drop DP-MST from radeon driver, it was broken and only user of legacy DP-MST. - Handle unplugging better in vc4. - Simplify drm cmdparser tests. - Add DP support to ti-sn65dsi86. - Add MT8195 DP support to mediatek. - Support RGB565, XRGB64, and ARGB64 formats in vkms. - Convert sun4i tv support to atomic. - Refactor vc4/vec TV Modesetting, and fix timings. - Use atomic helpers instead of simple display helpers in ssd130x. Maintainer changes: - Add Douglas Anderson as reviewer for panel-edp. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/a489485b-3ebc-c734-0f80-aed963d89efe@linux.intel.com	2022-09-11 22:03:07 +10:00
Guchun Chen	aac4cec1ec	drm/amdgpu: prevent toc firmware memory leak It's missed in psp fini. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:52:43 -04:00
Yifan Zhang	d832db12af	drm/amdgpu: correct doorbell range/size value for CSDMA_DOORBELL_RANGE current function mixes CSDMA_DOORBELL_RANGE and SDMA0_DOORBELL_RANGE range/size manipulation, while these 2 registers have difference size field mask. Remove range/size manipulation for SDMA0_DOORBELL_RANGE. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Xiaojian Du <Xiaojian.Du@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:52:32 -04:00
Yifan Zhang	ae0448bc88	drm/amdkfd: print address in hex format rather than decimal Addresses should be printed in hex format. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:52:19 -04:00
Chengming Gui	992db92b07	drm/amd/amdgpu: add rlc_firmware_header_v2_4 to amdgpu_firmware_header Add missing structure to avoid incorrect size and version check. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:48:35 -04:00
Guchun Chen	096e33f8ce	drm/amdgpu: prevent toc firmware memory leak It's missed in psp fini. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:28:42 -04:00
Yifan Zhang	e00debc283	drm/amdgpu: correct doorbell range/size value for CSDMA_DOORBELL_RANGE current function mixes CSDMA_DOORBELL_RANGE and SDMA0_DOORBELL_RANGE range/size manipulation, while these 2 registers have difference size field mask. Remove range/size manipulation for SDMA0_DOORBELL_RANGE. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Xiaojian Du <Xiaojian.Du@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:28:42 -04:00
Yifan Zhang	fb0a0625f8	drm/amdkfd: print address in hex format rather than decimal Addresses should be printed in hex format. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:28:42 -04:00
Chengming Gui	ba6d29e885	drm/amd/amdgpu: add rlc_firmware_header_v2_4 to amdgpu_firmware_header Add missing structure to avoid incorrect size and version check. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:21:28 -04:00
YiPeng Chai	fac53471d0	drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled V1: The psp_cmd_submit_buf function is called by psp_hw_fini to send TA unload messages to psp to terminate ras, asd and tmr. But when amdgpu is uninstalled, drm_dev_unplug is called earlier than psp_hw_fini in amdgpu_pci_remove, the calling order as follows: static void amdgpu_pci_remove(struct pci_dev *pdev) { drm_dev_unplug ...... amdgpu_driver_unload_kms->amdgpu_device_fini_hw->... ->.hw_fini->psp_hw_fini->... ->psp_ta_unload->psp_cmd_submit_buf ...... } The program will return when calling drm_dev_enter in psp_cmd_submit_buf. So the call to drm_dev_enter in psp_cmd_submit_buf should be removed, so that the TA unload messages can be sent to the psp when amdgpu is uninstalled. V2: 1. Restore psp_cmd_submit_buf to its original code. 2. Move drm_dev_unplug call after amdgpu_driver_unload_kms in amdgpu_pci_remove. 3. Since amdgpu_device_fini_hw is called by amdgpu_driver_unload_kms, remove the unplug check to release device mmio resource in amdgpu_device_fini_hw before calling drm_dev_unplug. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-07 22:21:00 -04:00
Daniel Vetter	8284bae723	drm-misc-next for v6.1: UAPI Changes: Cross-subsystem Changes: - DMA-buf: documentation updates. - Assorted small fixes to vga16fb - Fix fbdev drivers to use the aperture helpers. - Make removal of conflicting drivers work correctly without fbdev enabled. Core Changes: - bridge, scheduler, dp-mst: Assorted small fixes. - Add more format helpers to fourcc, and use it to replace the cpp usage. - Add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx (single channel), and DRM_FORMAT_Dxx ("darkness", inverted single channel) - Add packed AYUV8888 and XYUV8888 formats. - Assorted documentation updates. - Rename ttm_bo_init to ttm_bo_init_validate. - Allow TTM bo's to exist without backing store. - Convert drm selftests to kunit. - Add managed init functions for (panel) bridge, crtc, encoder and connector. - Fix endianness handling in various format conversion helpers. - Make tests pass on big-endian platforms, and add test for rgb888 -> rgb565 - Move DRM_PLANE_HELPER_NO_SCALING to atomic helpers and rename, so drm_plane_helper is no longer needed in most drivers. - Use idr_init_base instead of idr_init. - Rename FB and GEM CMA helpers to DMA helpers. - Rework XRGB8888 related conversion helpers, and add drm_fb_blit() that takes a iosys_map. Make drm_fb_memcpy take an iosys_map too. - Move edid luminance calculation to core, and use it in i915. Driver Changes: - bridge/{adv7511,ti-sn65dsi86,parade-ps8640}, panel/{simple,nt35510,tc358767}, nouveau, sun4i, mipi-dsi, mgag200, bochs, arm, komeda, vmwgfx, pl111: Assorted small fixes and doc updates. - vc4: Rework hdmi power up, and depend on PM. - panel/simple: Add Samsung LTL101AL01. - ingenic: Add JZ4760(B) support, avoid a modeset when sharpness property is unchanged, and use the new PM ops. - Revert some amdgpu commits that cause garbaged graphics when starting X, and reapply them with the real problem fixed. - Completely rework vc4 init to use managed helpers. - Rename via_drv to via_dri1, and move all stuff there only used by the dri1 implementation in preperation for atomic modeset. - Use regmap bulk write in ssd130x. - Power sequence and clock updates to it6505. - Split panel-sitrox-st7701 init sequence and rework mode programming code. - virtio: Improve error and edge conditions handling, and convert to use managed helpers. - Add Samsung LTL101AL01, B120XAN01.0, R140NWF5 RH, DMT028VGHMCMI-1A T, panels. - Add generic fbdev support to komeda. - Split mgag200 modeset handling to make it more model-specific. - Convert simpledrm to use atomic helpers. - Improve udl suspend/disconnect handling. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmMA59MACgkQ/lWMcqZw E8MBihAAtttSIF8fvRIgnoFeswteB5HlXtrSOVRns+0h7mNzcmUL3yp/jV9betND NEfBCERRIRk5vqkzm2o9P/rTG+KMYCyyCNdcvvh+rVn+HU2V9MwtJdowZqLu7Rxi Qn59SaL6fMQanJwF8yHcii1ikqelU8TfsMXcFXt2d/yjn5D/h80tNOULEB74zIDH Vey/xTawcqEz3Qbj2mL4VihEnzU1cziGTpZvAWU2GTdWLxxN88uA4RD2JDyvGaXG oO6oilYT9TW+2C+0jfMUysu7RcNYuo4KD8dSVLvec+UKDKdO1pSqn40vgTZp0MBw V0VGg/Lqg6+u33UQkxeRpdzTboD4cMjoeP0/IJdHQVMYYH449tXDgZEjXqVHdz2c Ea8JIm36CHjqA0Ua3DT8y1qRtdrjuATJauoB6MXW8RpguqMzVuI4e9Dio77WRNYe Lg/V5AlQEKcwxeqe6v+8K7aB4lvHZ9g0WSMFHoxYz28d2cFWDVeKgK/GCW3WWMKo AQigRS84z3QBqK+IeTjbfZREoRFtx7NzQpzU9jQdL06XGWExhGOs8yU2OCoDNN8+ iFMBStEPHLPg27HrLV5JYeMRUBPJMnwQhAEnrSv9KqIfPD3/g2jonoGm3J57p+OI Ag7pwg88spzP1DC7M6+UvzUnH76Dhf4hdnfQjz10c6BfS1ODPbA= =kU/l -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2022-08-20-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.1: UAPI Changes: Cross-subsystem Changes: - DMA-buf: documentation updates. - Assorted small fixes to vga16fb - Fix fbdev drivers to use the aperture helpers. - Make removal of conflicting drivers work correctly without fbdev enabled. Core Changes: - bridge, scheduler, dp-mst: Assorted small fixes. - Add more format helpers to fourcc, and use it to replace the cpp usage. - Add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx (single channel), and DRM_FORMAT_Dxx ("darkness", inverted single channel) - Add packed AYUV8888 and XYUV8888 formats. - Assorted documentation updates. - Rename ttm_bo_init to ttm_bo_init_validate. - Allow TTM bo's to exist without backing store. - Convert drm selftests to kunit. - Add managed init functions for (panel) bridge, crtc, encoder and connector. - Fix endianness handling in various format conversion helpers. - Make tests pass on big-endian platforms, and add test for rgb888 -> rgb565 - Move DRM_PLANE_HELPER_NO_SCALING to atomic helpers and rename, so drm_plane_helper is no longer needed in most drivers. - Use idr_init_base instead of idr_init. - Rename FB and GEM CMA helpers to DMA helpers. - Rework XRGB8888 related conversion helpers, and add drm_fb_blit() that takes a iosys_map. Make drm_fb_memcpy take an iosys_map too. - Move edid luminance calculation to core, and use it in i915. Driver Changes: - bridge/{adv7511,ti-sn65dsi86,parade-ps8640}, panel/{simple,nt35510,tc358767}, nouveau, sun4i, mipi-dsi, mgag200, bochs, arm, komeda, vmwgfx, pl111: Assorted small fixes and doc updates. - vc4: Rework hdmi power up, and depend on PM. - panel/simple: Add Samsung LTL101AL01. - ingenic: Add JZ4760(B) support, avoid a modeset when sharpness property is unchanged, and use the new PM ops. - Revert some amdgpu commits that cause garbaged graphics when starting X, and reapply them with the real problem fixed. - Completely rework vc4 init to use managed helpers. - Rename via_drv to via_dri1, and move all stuff there only used by the dri1 implementation in preperation for atomic modeset. - Use regmap bulk write in ssd130x. - Power sequence and clock updates to it6505. - Split panel-sitrox-st7701 init sequence and rework mode programming code. - virtio: Improve error and edge conditions handling, and convert to use managed helpers. - Add Samsung LTL101AL01, B120XAN01.0, R140NWF5 RH, DMT028VGHMCMI-1A T, panels. - Add generic fbdev support to komeda. - Split mgag200 modeset handling to make it more model-specific. - Convert simpledrm to use atomic helpers. - Improve udl suspend/disconnect handling. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f0c71766-61e8-19b7-763a-5fbcdefc633d@linux.intel.com	2022-09-06 10:56:04 +02:00
Hans de Goede	c0f50c5de9	drm/amdgpu: Register ACPI video backlight when skipping amdgpu backlight registration Typically the acpi_video driver will initialize before amdgpu, which used to cause /sys/class/backlight/acpi_video0 to get registered and then amdgpu would register its own amdgpu_bl# device later. After which the drivers/acpi/video_detect.c code unregistered the acpi_video0 device to avoid there being 2 backlight devices. This means that userspace used to briefly see 2 devices and the disappearing of acpi_video0 after a brief time confuses the systemd backlight level save/restore code, see e.g.: https://bbs.archlinux.org/viewtopic.php?id=269920 To fix this the ACPI video code has been modified to make backlight class device registration a separate step, relying on the drm/kms driver to ask for the acpi_video backlight registration after it is done setting up its native backlight device. Add a call to the new acpi_video_register_backlight() when amdgpu skips registering its own backlight device because of either the firmware_flags or the acpi_video_get_backlight_type() return value. This ensures that if the acpi_video backlight device should be used, it will be available before the amdgpu drm_device gets registered with userspace. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-09-03 12:17:26 +02:00
Jane Jian	63127922e1	drm/amdgpu/vcn: Add MMSCH v4_0 support for sriov These structures are basically ported from MMSCH v3_0, besides, added RB and RB4 enablement flag to support unified queue Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:12:14 -04:00
Jane Jian	aa44beb5f0	drm/amdgpu/vcn: Add sriov VCN v4_0 unified queue support Enable unified queue support for sriov, abandon all previous multi-queue settings Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:12:07 -04:00
Jane Jian	60e9c7ee3f	drm/amdgpu/vcn: Add vcn/vcn1 in white list to load its firmware under sriov Previously since vcn0/vcn1 are not enabled, loading firmware is skipped. Now add firmware loading back since vcn0/vcn1 has already been enabled on sriov Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:12:04 -04:00
Jane Jian	c322b422ab	drm/amdgpu/vcn: Disable CG/PG for SRIOV For sriov, CG and MG are controlled from hypervisor side, no need to manage them again in ip init Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:12:00 -04:00
Yifan Zha	8284182592	drm/admgpu: Skip CG/PG on SOC21 under SRIOV VF [Why] There is no CG(Clock Gating)/PG(Power Gating) requirement on SRIOV VF. For multi VF, VF should not enable any CG/PG features. For one VF, PF will program CG/PG related registers. [How] Do not set any cg/pg flag bit at early init under sriov. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:54 -04:00
Yifan Zha	bbb860d46f	drm/amdgpu: Use RLCG to program GRBM_GFX_CNTL during full access time [Why] KIQ register init requires GRBM_GFX_CNTL to select KIQ. [How] As RLCG accessing registers will save the data of GRBM_GFX_CNTL and restore it. Use RLCG indirect accessing register method to select grbm instead of mmio directly access. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:50 -04:00
Yifan Zha	08c8442c4a	drm/amdgpu: Skip program SDMA0_SEM_WAIT_FAIL_TIMER_CNTL under SRIOV VF [Why] As SDMA0_SEM_WAIT_FAIL_TIMER_CNTL is a PF-only register, L1 would block this register for VF access. [How] VF do not program it. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:47 -04:00
Yifan Zha	40ad3e545b	drm/amdgpu: Skip the VRAM base offset on SRIOV [Why] As VF cannot read MMMC_VM_FB_OFFSET with L1 Policy(read 0xffffffff). It leads to driver get the incorrect vram base offset. [How] Since SR-IOV is dGPU only, skip reading this register and set the fb_offest to 0. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:45 -04:00
Yifan Zha	5818eae501	drm/amdgpu: skip "Issue additional private vm invalidation to MMHUB" on SRIOV [Why] vm_l2_bank_select_reserved_cid2 is a PF_only register that cannot be programmed by VF. This feature is only support HDP using GPUVM page tables to access FB memory which should be disabled on SRIOV. [How] Disable the feature on VF. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:40 -04:00
Yifan Zha	c1026c6f31	drm/amdgpu: Skip the program of MMMC_VM_AGP_* in SRIOV on MMHUB v3_0_0 [Why] VF should not program these registers, the value were defined in the host. [How] Skip writing them in SRIOV environment and program them on host side. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:35 -04:00
Yifan Zha	425fede6e8	drm/amdgpu: Use PSP program IH_RB_CNTL registers under SRIOV [Why] With L1 Policy applied, IH_RB_CNTL/RING cannot be accessed by VF. [How] Use PSP program IH_RB_CNTL in VF. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:31 -04:00
Horace Chen	f8bd73213a	drm/amdgpu: Support PSP 13.0.10 on SR-IOV Add support for PSP 13.0.10 for SR-IOV VF Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:26 -04:00
Horace Chen	dc5f3829a7	drm/amdgpu: sriov remove vcn_4_0 and jpeg_4_0 SRIOV needs to initialize mmsch instead of multimedia engines directly. So currently remove them for SR-IOV until the code and firmwares are ready. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:18 -04:00
Horace Chen	d9d86d085f	drm/amdgpu: refine virtualization psp fw skip check SR-IOV may need to load different firmwares for different ASIC inside VF. So create a new function in amdgpu_virt to check whether FW load needs to be skipped. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:12 -04:00
Horace Chen	afb50906cf	drm/amdgpu: enable WPTR_POLL_ENABLE for sriov on sdma_v6_0 [Why] Under SR-IOV, if VF is switched out then its doorbell will be disabled, SDMA rely on WPTR_POLL to get doorbells which was sent during VF switched-out time. [How] For SR-IOV, set SDMA WPTR_POLL_ENABLE to 1. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:06 -04:00
Horace Chen	ca4ba3394e	drm/amdgpu: add a compute pipe reset for RS64 [Why] Under SR-IOV, we are not sure whether pipe status is good or not when doing initialization. The compute engine maybe fail to bringup if pipe status is bad. [How] Do an RS64 pipe reset for MEC before we do initialization. Also apply to bare-metal. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:03 -04:00
Horace Chen	119dc6c50e	drm/amdgpu: add sriov nbio callback structure [Why] under SR-IOV, the nbio doorbell range will be defined by PF. So VF nbio doorbell range registers will be blocked. It will cause violation if VF access those registers directly. [How] create an nbio_v4_3_sriov_funcs for sriov nbio_v4_3 initialization to skip the setting for the doorbell range registers. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:00 -04:00
Horace Chen	09872b1c24	drm/amdgpu: add CHIP_IP_DISCOVERY support for virtualization For further chips we will use CHIP_IP_DISCOVERY, so add this support for virtualization Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:10:54 -04:00
Yifan Zhang	75efc459ea	drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1 there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd engine exists and map/unmap SDMA queues to the non-existent engine. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:10:35 -04:00
Chengming Gui	68fb37bc2c	drm/amd/amdgpu: skip ucode loading if ucode_size == 0 Restrict the ucode loading check to avoid frontdoor loading error. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:09:07 -04:00
Chengming Gui	39c84b8e92	drm/amd/amdgpu: skip ucode loading if ucode_size == 0 Restrict the ucode loading check to avoid frontdoor loading error. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-31 17:02:20 -04:00
Hawking Zhang	910ab9eee0	drm/amdgpu: only init tap_delay ucode when it's included in ucode binary Not all the gfx10 variants need to integrate global tap_delay and per se tap_delay firmwares Only init tap_delay ucode when it does include in rlc ucode binary so driver doesn't send a null buffer to psp for firmware loading Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 17:10:18 -04:00
Alex Sierra	b97e914552	drm/amdgpu: ensure no PCIe peer access for CPU XGMI iolinks [Why] Devices with CPU XGMI iolink do not support PCIe peer access. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 17:07:43 -04:00
YuBiao Wang	3c93603d95	drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctl [Why] In amdgpu_cs_ioctl, amdgpu_job_free could be performed ealier if there is -ERESTARTSYS error. In this case, job->hw_fence could be not initialized yet. Putting hw_fence during amdgpu_job_free could lead to a use-after-free warning. [How] Check if drm_sched_job_init is performed before job_free by checking s_fence. v2: Check hw_fence.ops instead since it could be NULL if fence is not initialized. Reverse the condition since !=NULL check is discouraged in kernel. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 17:00:03 -04:00
Graham Sider	47e04eed84	drm/amdgpu: Update mes_v11_api_def.h New GFX11 MES FW adds the trap_en bit. For now hardcode to 1 (traps enabled). Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:56:21 -04:00
Guchun Chen	c8fea9273f	drm/amdgpu: disable FRU access on special SIENNA CICHLID card Below driver load error will be printed, not friendly to end user. amdgpu: ATOM BIOS: 113-D603GLXE-077 [drm] FRU: Failed to get size field [drm:amdgpu_fru_get_product_info [amdgpu]] ERROR Failed to read FRU Manufacturer, ret:-5 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:56:00 -04:00
ye xingchen	91a9588789	drm/amdgpu: Remove the unneeded result variable 'r' Return the value sdma_v4_0_start() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:31 -04:00
Hawking Zhang	e7c69a27cb	drm/amdgpu: add new ip block for MES 11.0.3 Add ip block support for mes v11_0_3. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:28 -04:00
Hawking Zhang	a4d002d7d0	drm/amdgpu: add new ip block for GFX 11.0 Add ip block support for gfx v11_0_3. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:25 -04:00
Hawking Zhang	2b5692345f	drm/amdgpu: Set GC family for GC 11.0.3 Set AMDGPU_FAMILY_GC_11_0_0. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:21 -04:00
Hawking Zhang	f926464e59	drm/amdgpu: enable imu_rlc_ram programming for v11_0_3 All gc v11_0_3 registers in gcvml2 range have different register offset from the ones in gc v11_0_0. v11_0_3 imu_rlc_ram programming has to be separated from v11_0_0 implementation v2: fix checkpatch errors (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:14 -04:00
Hawking Zhang	a3813175c4	drm/amdgpu: init gfx config for gfx v11_0_3 initialize some gfx config for gfx v11_0_3 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:09 -04:00
Hawking Zhang	701a4ad97d	drm/amdgpu: declare firmware for new MES 11.0.3 To support new mes ip block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:07 -04:00
Hawking Zhang	c6329e255d	drm/amdgpu: declare firmware for new GC 11.0.3 To support new gfx ip block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:04 -04:00
Hawking Zhang	94ac32338e	drm/amdgpu: add new ip block for GMC 11.0 Add ip block support for gmc v11_0_3. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:37:00 -04:00
Hawking Zhang	fe09f343d5	drm/amdgpu: initialize gmc sw config for v11_0_3 initialize gmc sw config for v11_0_3 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:36:58 -04:00
Yang Wang	9436ac31c7	drm/amdgpu: add gfxhub_v3_0_3 support add gfxhub_v3_0_3 support Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:36:54 -04:00
Hawking Zhang	d5f476edc5	drm/amdgpu: only init tap_delay ucode when it's included in ucode binary Not all the gfx10 variants need to integrate global tap_delay and per se tap_delay firmwares Only init tap_delay ucode when it does include in rlc ucode binary so driver doesn't send a null buffer to psp for firmware loading Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-30 16:36:07 -04:00
Hawking Zhang	de2b2ae34d	drm/amdgpu: add new ip block for LSDMA 6.0 Add ip block support for lsdma v6_0_3. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 18:00:54 -04:00
Hawking Zhang	5bb7173566	drm/amdgpu: add new ip block for sdma 6.0 Add ip block support for sdma v6_0_3. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 18:00:54 -04:00
Hawking Zhang	f66f48471b	drm/amdgpu: declare firmware for new SDMA 6.0.3 To support new sdma ip block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 18:00:54 -04:00
John Clements	773562364a	drm/amdgpu: enable smu block for smu 13.0.10 Force to enable smu block for SMU v13.0.10 Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 18:00:54 -04:00
Frank Min	a60d219137	drm/amdgpu: add new ip block for PSP 13.0 Add ip block support for psp v13_0_10. Signed-off-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
John Clements	10f8927d74	drm/amdgpu: added firmware module for psp 13.0.10 added missing firmware module Signed-off-by: John Clements <john.clements@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
Frank Min	7ab47ba22e	drm/amdgpu: support psp v13_0_10 ip block Add psp v13_0_10 ip block, initialize firmware and psp functions Signed-off-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
Hawking Zhang	fc968efdf0	drm/amdgpu: add new ip block for SOC21 Add ip block support for soc21_common. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
Sonny Jiang	0f05a2e528	drm/amdgpu: Enable pg/cg flags on GC11_0_3 for VCN This enable VCN PG, CG, DPG and JPEG PG, CG Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
Hawking Zhang	6b46251c50	drm/amdgpu: initialize common sw config for v11_0_3 init cp/pg_flags and extenal_rev_id Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
Hawking Zhang	6ec128c3ff	drm/amdgpu: drop gc 11_0_0 golden settings driver doesn't need to program any gc 11_0_0 golden Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
Alex Sierra	ab23c5b9c7	drm/amdgpu: ensure no PCIe peer access for CPU XGMI iolinks [Why] Devices with CPU XGMI iolink do not support PCIe peer access. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:59:30 -04:00
YuBiao Wang	2581c5d85e	drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctl [Why] In amdgpu_cs_ioctl, amdgpu_job_free could be performed ealier if there is -ERESTARTSYS error. In this case, job->hw_fence could be not initialized yet. Putting hw_fence during amdgpu_job_free could lead to a use-after-free warning. [How] Check if drm_sched_job_init is performed before job_free by checking s_fence. v2: Check hw_fence.ops instead since it could be NULL if fence is not initialized. Reverse the condition since !=NULL check is discouraged in kernel. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:45:36 -04:00
Yang Yingliang	6b11af6d1c	drm/amdgpu: add missing pci_disable_device() in amdgpu_pmops_runtime_resume() Add missing pci_disable_device() if amdgpu_device_resume() fails. Fixes: `8e4d5d43cc` ("drm/amdgpu: Handling of amdgpu_device_resume return value for graceful teardown") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:44:15 -04:00
ye xingchen	1d5d194777	drm/amdgpu: Remove the unneeded result variable Return the value sdma_v5_2_start() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:43:59 -04:00
Graham Sider	2aefa9a38f	drm/amdgpu: Update mes_v11_api_def.h New GFX11 MES FW adds the trap_en bit. For now hardcode to 1 (traps enabled). Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:43:50 -04:00
Guchun Chen	73515bbdc4	drm/amdgpu: disable FRU access on special SIENNA CICHLID card Below driver load error will be printed, not friendly to end user. amdgpu: ATOM BIOS: 113-D603GLXE-077 [drm] FRU: Failed to get size field [drm:amdgpu_fru_get_product_info [amdgpu]] ERROR Failed to read FRU Manufacturer, ret:-5 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-29 17:43:29 -04:00
Qu Huang	b8983d4252	drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly The mmVM_L2_CNTL3 register is not assigned an initial value Signed-off-by: Qu Huang <jinsdb@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:54:35 -04:00
Likun Gao	61251b2cff	drm/amdgpu: add MGCG perfmon setting for gfx11 Enable GFX11 MGCG perfmon setting. V2: set rlc to saft mode before setting. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:54:20 -04:00
Mukul Joshi	894c9c540f	drm/amdgpu: Fix page table setup on Arcturus When translate_further is enabled, page table depth needs to be updated. This was missing on Arcturus MMHUB init. This was causing address translations to fail for SDMA user-mode queues. Fixes: `352e683b72` ("drm/amdgpu: Enable translate_further to extend UTCL2 reach") Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:53:58 -04:00
Tim Huang	00047c3d96	drm/amdgpu: add sdma instance check for gfx11 CGCG For some ASICs, like GFX IP v11.0.1, only have one SDMA instance, so not need to configure SDMA1_RLC_CGCG_CTRL for this case. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:53:19 -04:00
Tim Huang	16c01544e3	drm/amdgpu: enable NBIO IP v7.7.0 Clock Gating Enable AMD_CG_SUPPORT_BIF_MGCG and AMD_CG_SUPPORT_BIF_LS support. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:53:03 -04:00
Tim Huang	2037769f99	drm/amdgpu: add NBIO IP v7.7.0 Clock Gating support Add BIF Clock Gating MGCG and LS support for NBIO IP v7.7.0. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:52:55 -04:00
Qu Huang	58dcc22106	drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctly The mmVM_L2_CNTL3 register is not assigned an initial value Signed-off-by: Qu Huang <jinsdb@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:18 -04:00
Rafael J. Wysocki	61ebd2fe6f	drm: amd: amdgpu: ACPI: Add comment about ACPI_FADT_LOW_POWER_S0 According to the ACPI specification [1], the ACPI_FADT_LOW_POWER_S0 flag merely means that it is better to use low-power S0 idle on the given platform than S3 (provided that the latter is supported) and it doesn't preclude using either of them (which of them will be used depends on the choices made by user space). However, on some systems that flag is used to indicate whether or not to enable special firmware mechanics allowing the system to save more energy when suspended to idle. If that flag is unset, doing so is generally risky. Accordingly, add a comment to explain the ACPI_FADT_LOW_POWER_S0 check in amdgpu_acpi_is_s0ix_active(), the purpose of which is otherwise somewhat unclear. Link: https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fixed-acpi-description-table-fadt # [1] Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:18 -04:00
Guchun Chen	638bc30f85	drm/amdgpu: use dev_info to benefit mGPU case 'free PSP TMR buffer' happens in suspend, but sometimes in mGPU config, it mixes with PSP resume log printing from another GPU, which is confusing. So use dev_info instead of DRM_INFO for printing. [drm] PSP is resuming... [drm] reserve 0xa00000 from 0x877e000000 for PSP TMR amdgpu 0000:e3:00.0: amdgpu: GECC is enabled amdgpu 0000:e3:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available amdgpu 0000:e3:00.0: amdgpu: SMU is resuming... amdgpu 0000:e3:00.0: amdgpu: smu driver if version = 0x00000040, smu fw if version = 0x00000041, smu fw program = 0, version = 0x003a5400 (58.84.0) amdgpu 0000:e3:00.0: amdgpu: SMU driver if version not matched amdgpu 0000:e3:00.0: amdgpu: dpm has been enabled amdgpu 0000:e3:00.0: amdgpu: SMU is resumed successfully! [drm] DMUB hardware initialized: version=0x02020014 [drm] free PSP TMR buffer [drm] kiq ring mec 2 pipe 1 q 0 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:18 -04:00
Guchun Chen	a79f56d191	drm/amdgpu: use adev_to_drm to get drm device adev_to_drm is used everywhere in amdgpu code, so modify it to keep consistency. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:18 -04:00
Likun Gao	7201023910	drm/amdgpu: add MGCG perfmon setting for gfx11 Enable GFX11 MGCG perfmon setting. V2: set rlc to saft mode before setting. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:18 -04:00
Chengming Gui	d3ef9d57f2	drm/amd/amdgpu: avoid soft reset check when gpu recovery disabled Avoid soft reset, even ip hang check (ring/ib test) when gpu recovery disabled. v2: add missing "}" Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:18 -04:00
Mukul Joshi	f47f9b2e9c	drm/amdgpu: Fix page table setup on Arcturus When translate_further is enabled, page table depth needs to be updated. This was missing on Arcturus MMHUB init. This was causing address translations to fail for SDMA user-mode queues. Fixes: `352e683b72` ("drm/amdgpu: Enable translate_further to extend UTCL2 reach") Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:35:17 -04:00
Vignesh Chander	7c55b598b3	drm/amdgpu: skip set_topology_info for VF Skip set_topology_info as xgmi TA will now block it and host needs to program it. Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com> Reviewed-By : Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:19:23 -04:00
Tim Huang	345c0bc0a3	drm/amdgpu: add sdma instance check for gfx11 CGCG For some ASICs, like GFX IP v11.0.1, only have one SDMA instance, so not need to configure SDMA1_RLC_CGCG_CTRL for this case. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-25 13:19:04 -04:00
Hans de Goede	da11ef8329	drm/amdgpu: Don't register backlight when another backlight should be used (v3) Before this commit when we want userspace to use the acpi_video backlight device we register both the GPU's native backlight device and acpi_video's firmware acpi_video# backlight device. This relies on userspace preferring firmware type backlight devices over native ones. Registering 2 backlight devices for a single display really is undesirable, don't register the GPU's native backlight device when another backlight device should be used. Changes in v2: - To avoid linker errors when amdgpu is builtin and video_detect.c is in a module, select ACPI_VIDEO and its deps if ACPI is enabled. When ACPI is disabled, ACPI_VIDEO is also always disabled, ensuring the stubs from acpi/video.h will be used. Changes in v3: - Use drm_info(drm_dev, "...") to log messages - ACPI_VIDEO can now be enabled on non X86 too, adjust the Kconfig changes to match this. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-08-25 10:56:20 +02:00
Tim Huang	9407feacd2	drm/amdgpu: enable NBIO IP v7.7.0 Clock Gating Enable AMD_CG_SUPPORT_BIF_MGCG and AMD_CG_SUPPORT_BIF_LS support. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:28 -04:00
Tim Huang	c4d0d69999	drm/amdgpu: add NBIO IP v7.7.0 Clock Gating support Add BIF Clock Gating MGCG and LS support for NBIO IP v7.7.0. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:23 -04:00
shaoyunl	947f63f17e	drm/amdgpu: Remove the additional kfd pre reset call for sriov The additional call is caused by merge conflict Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:09 -04:00
Candice Li	4bb5fed169	drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup. No need to set up rb when no gfx rings. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:09 -04:00
YiPeng Chai	9dfa4860ef	drm/amdgpu: fix hive reference leak when adding xgmi device Only amdgpu_get_xgmi_hive but no amdgpu_put_xgmi_hive which will leak the hive reference. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:09 -04:00
YiPeng Chai	d8adafc7fe	drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini V1: The amdgpu_xgmi_remove_device function will send unload command to psp through psp ring to terminate xgmi, but psp ring has been destroyed in psp_hw_fini. V2: 1. Change the commit title. 2. Restore amdgpu_xgmi_remove_device to its original calling location. Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:09 -04:00
Tim Huang	95a72fb73c	drm/amdgpu: enable GFXOFF allow control for GC IP v11.0.1 Enable GFXOFF allow control when set the GFX power gating. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-22 16:47:08 -04:00
Arunpravin Paneer Selvam	6d3c900c12	drm/ttm: Switch to using the new res callback Apply new intersect and compatible callback instead of having a generic placement range verfications. v2: Added a separate callback for compatiblilty checks (Christian) v3: Cleanups and removal of workarounds Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220820073304.178444-6-Arunpravin.PaneerSelvam@amd.com	2022-08-22 15:36:29 +02:00
Arunpravin Paneer Selvam	ded910f368	drm/amdgpu: Implement intersect/compatible functions Implemented a new intersect and compatible callback function fetching start offset from backend drm buddy allocator. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220820073304.178444-3-Arunpravin.PaneerSelvam@amd.com	2022-08-22 15:35:22 +02:00
shaoyunl	0667173488	drm/amdgpu: Remove the additional kfd pre reset call for sriov The additional call is caused by merge conflict Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-19 17:06:38 -04:00
Candice Li	c351938350	drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup. No need to set up rb when no gfx rings. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-19 17:06:29 -04:00
YiPeng Chai	f5994da72b	drm/amdgpu: fix hive reference leak when adding xgmi device Only amdgpu_get_xgmi_hive but no amdgpu_put_xgmi_hive which will leak the hive reference. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-19 17:06:21 -04:00
YiPeng Chai	9d705d7741	drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini V1: The amdgpu_xgmi_remove_device function will send unload command to psp through psp ring to terminate xgmi, but psp ring has been destroyed in psp_hw_fini. V2: 1. Change the commit title. 2. Restore amdgpu_xgmi_remove_device to its original calling location. Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-19 17:06:13 -04:00
Tim Huang	43ef9db423	drm/amdgpu: enable GFXOFF allow control for GC IP v11.0.1 Enable GFXOFF allow control when set the GFX power gating. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-19 17:05:35 -04:00
Dave Airlie	b1fb6b87ed	Merge tag 'amd-drm-fixes-6.0-2022-08-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.0-2022-08-17: amdgpu: - Revert some DML stack changes - Rounding fixes in KFD allocations - atombios vram info table parsing fix - DCN 3.1.4 fixes - Clockgating fixes for various new IPs - SMU 13.0.4 fixes - DCN 3.1.4 FP fixes - TMDS fixes for YCbCr420 4k modes - DCN 3.2.x fixes - USB 4 fixes - SMU 13.0 fixes - SMU driver unload memory leak fixes - Display orientation fix - Regression fix for generic fbdev conversion - SDMA 6.x fixes - SR-IOV fixes - IH 6.x fixes - Use after free fix in bo list handling - Revert pipe1 support - XGMI hive reset fix amdkfd: - Fix potential crach in kfd_create_indirect_link_prop() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220818025206.6463-1-alexander.deucher@amd.com	2022-08-19 09:45:22 +10:00
André Almeida	a021e2aa4d	drm/amdgpu: Document gfx_off members of struct amdgpu_gfx Add comments to document gfx_off related members of struct amdgpu_gfx. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-16 18:17:32 -04:00
André Almeida	0ad7347a64	drm/amd: Add detailed GFXOFF stats to debugfs Add debugfs interface to log GFXOFF statistics: - Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the time of query since system power-up - Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop. Read it to get average GFXOFF residency % multiplied by 100 during the last logging interval. Both features are designed to be keep the values persistent between suspends. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-16 18:17:31 -04:00
shaoyunl	1d32af4fac	drm/amdgpu: use sjt mec fw on aldebaran for sriov The second jump table is required on live migration or mulitple VF configuration on Aldebaran. With this implemented, the first level jump table(hw used) will be same, mec fw internal will use the second level jump table jump to the real functionality implementation. so the different VF can load different version of MEC as long as they support sjt Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-16 18:17:31 -04:00
Michel Dänzer	085292c3d7	Revert "drm/amd/amdgpu: add pipe1 hardware support" This reverts commit `4c7631800e`. Triggered GFX hangs with GNOME Wayland on Navi 21. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2117 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-16 18:14:31 -04:00
Maíra Canal	bbca24d0a3	drm/amdgpu: Fix use-after-free on amdgpu_bo_list mutex If amdgpu_cs_vm_handling returns r != 0, then it will unlock the bo_list_mutex inside the function amdgpu_cs_vm_handling and again on amdgpu_cs_parser_fini. This problem results in the following use-after-free problem: [ 220.280990] ------------[ cut here ]------------ [ 220.281000] refcount_t: underflow; use-after-free. [ 220.281019] WARNING: CPU: 1 PID: 3746 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110 [ 220.281029] ------------[ cut here ]------------ [ 220.281415] CPU: 1 PID: 3746 Comm: chrome:cs0 Tainted: G W L ------- --- 5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 [ 220.281421] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022 [ 220.281426] RIP: 0010:refcount_warn_saturate+0xba/0x110 [ 220.281431] Code: 01 01 e8 79 4a 6f 00 0f 0b e9 42 47 a5 00 80 3d de 7e be 01 00 75 85 48 c7 c7 f8 98 8e 98 c6 05 ce 7e be 01 01 e8 56 4a 6f 00 <0f> 0b e9 1f 47 a5 00 80 3d b9 7e be 01 00 0f 85 5e ff ff ff 48 c7 [ 220.281437] RSP: 0018:ffffb4b0d18d7a80 EFLAGS: 00010282 [ 220.281443] RAX: 0000000000000026 RBX: 0000000000000003 RCX: 0000000000000000 [ 220.281448] RDX: 0000000000000001 RSI: ffffffff988d06dc RDI: 00000000ffffffff [ 220.281452] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffb4b0d18d7930 [ 220.281457] R10: 0000000000000003 R11: ffffa0672e2fffe8 R12: ffffa058ca360400 [ 220.281461] R13: ffffa05846c50a18 R14: 00000000fffffe00 R15: 0000000000000003 [ 220.281465] FS: 00007f82683e06c0(0000) GS:ffffa066e2e00000(0000) knlGS:0000000000000000 [ 220.281470] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 220.281475] CR2: 00003590005cc000 CR3: 00000001fca46000 CR4: 0000000000350ee0 [ 220.281480] Call Trace: [ 220.281485] <TASK> [ 220.281490] amdgpu_cs_ioctl+0x4e2/0x2070 [amdgpu] [ 220.281806] ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu] [ 220.282028] drm_ioctl_kernel+0xa4/0x150 [ 220.282043] drm_ioctl+0x21f/0x420 [ 220.282053] ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu] [ 220.282275] ? lock_release+0x14f/0x460 [ 220.282282] ? _raw_spin_unlock_irqrestore+0x30/0x60 [ 220.282290] ? _raw_spin_unlock_irqrestore+0x30/0x60 [ 220.282297] ? lockdep_hardirqs_on+0x7d/0x100 [ 220.282305] ? _raw_spin_unlock_irqrestore+0x40/0x60 [ 220.282317] amdgpu_drm_ioctl+0x4a/0x80 [amdgpu] [ 220.282534] __x64_sys_ioctl+0x90/0xd0 [ 220.282545] do_syscall_64+0x5b/0x80 [ 220.282551] ? futex_wake+0x6c/0x150 [ 220.282568] ? lock_is_held_type+0xe8/0x140 [ 220.282580] ? do_syscall_64+0x67/0x80 [ 220.282585] ? lockdep_hardirqs_on+0x7d/0x100 [ 220.282592] ? do_syscall_64+0x67/0x80 [ 220.282597] ? do_syscall_64+0x67/0x80 [ 220.282602] ? lockdep_hardirqs_on+0x7d/0x100 [ 220.282609] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 220.282616] RIP: 0033:0x7f8282a4f8bf [ 220.282639] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00 [ 220.282644] RSP: 002b:00007f82683df410 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 220.282651] RAX: ffffffffffffffda RBX: 00007f82683df588 RCX: 00007f8282a4f8bf [ 220.282655] RDX: 00007f82683df4d0 RSI: 00000000c0186444 RDI: 0000000000000018 [ 220.282659] RBP: 00007f82683df4d0 R08: 00007f82683df5e0 R09: 00007f82683df4b0 [ 220.282663] R10: 00001d04000a0600 R11: 0000000000000246 R12: 00000000c0186444 [ 220.282667] R13: 0000000000000018 R14: 00007f82683df588 R15: 0000000000000003 [ 220.282689] </TASK> [ 220.282693] irq event stamp: 6232311 [ 220.282697] hardirqs last enabled at (6232319): [<ffffffff9718cd7e>] __up_console_sem+0x5e/0x70 [ 220.282704] hardirqs last disabled at (6232326): [<ffffffff9718cd63>] __up_console_sem+0x43/0x70 [ 220.282709] softirqs last enabled at (6232072): [<ffffffff970ff669>] __irq_exit_rcu+0xf9/0x170 [ 220.282716] softirqs last disabled at (6232061): [<ffffffff970ff669>] __irq_exit_rcu+0xf9/0x170 [ 220.282722] ---[ end trace 0000000000000000 ]--- Therefore, remove the mutex_unlock from the amdgpu_cs_vm_handling function, so that amdgpu_cs_submit and amdgpu_cs_parser_fini can handle the unlock. Fixes: `90af0ca047` ("drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2") Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maíra Canal <mairacanal@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-08-16 18:14:31 -04:00

1 2 3 4 5 ...

11419 Commits