linux

Commit Graph

Author	SHA1	Message	Date
Rodrigo Siqueira	7da45e746c	drm/amd/display: Clean up code in DC This commit removes some unnecessary code and makes the required adjustments to replace other parts of the code with a short option. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:45 -04:00
Sunil Khatri	2a8f7464d3	drm/amdgpu: skip ip dump if devcoredump flag is set Do not dump the ip registers during driver reload in passthrough environment. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Arunpravin Paneer Selvam	e362b7c8f8	drm/amdgpu: Modify the contiguous flags behaviour Now we have two flags for contiguous VRAM buffer allocation. If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS, it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the buffer's placement function. This patch will change the default behaviour of the two flags. When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS - This means contiguous is not mandatory. - we will try to allocate the contiguous buffer. Say if the allocation fails, we fallback to allocate the individual pages. When we setTTM_PL_FLAG_CONTIGUOUS - This means contiguous allocation is mandatory. - we are setting this in amdgpu_bo_pin_restricted() before bo validation and check this flag in the vram manager file. - if this is set, we should allocate the buffer pages contiguously. the allocation fails, we return -ENOSPC. v2: - keep the mem_flags and bo->flags check as is(Christian) - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the amdgpu_bo_pin_restricted function placement range iteration loop(Christian) - rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block (Christian) - Keep the kernel BO allocation as is(Christain) - If BO pin vram allocation failed, we need to return -ENOSPC as RDMA cannot work with scattered VRAM pages(Philip) v3(Christian): - keep contiguous flag handling outside of pages_per_block calculation - remove the hacky implementation in contiguous flag error handling code v4(Christian): - use any variable and return value for non-contiguous fallback v5: rebase to amd-staging-drm-next branch Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	f851b078b1	drm/amd/display: Fix uninitialized variables in DC This fixes 29 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	ba3193fa8f	drm/amd/display: Fix uninitialized variables in DC This fixes 49 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	f95bcb041f	drm/amd/display: Fix uninitialized variables in DM This fixes 11 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	e0dd5782f8	drm/amd/display: Remove redundant include file This fixes 1 PW.INCLUDE_RECURSION reported by Coverity. "./drivers/gpu/drm/amd/amdgpu/../display/dc/dc_types.h" includes itself: dc_types.h -> dal_types.h -> dc_types.h Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	01eb50e53c	drm/amd/display: ASSERT when failing to find index by plane/stream id [WHY] find_disp_cfg_idx_by_plane_id and find_disp_cfg_idx_by_stream_id returns an array index and they return -1 when not found; however, -1 is not a valid index number. [HOW] When this happens, call ASSERT(), and return a positive number (which is fewer than callers' array size) instead. This fixes 4 OVERRUN and 2 NEGATIVE_RETURNS issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	3ac31c9a70	drm/amd/display: Do not return negative stream id for array [WHY] resource_stream_to_stream_idx returns an array index and it return -1 when not found; however, -1 is not a valid array index number. [HOW] When this happens, call ASSERT(), and return a zero instead. This fixes an OVERRUN and an NEGATIVE_RETURNS issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Hersen Wu	f1fd8a0a54	drm/amd/display: Fix overlapping copy within dml_core_mode_programming [WHY] &mode_lib->mp.Watermark and &locals->Watermark are the same address. memcpy may lead to unexpected behavior. [HOW] memmove should be used. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	1357b2165d	drm/amd/display: Skip finding free audio for unknown engine_id [WHY] ENGINE_ID_UNKNOWN = -1 and can not be used as an array index. Plus, it also means it is uninitialized and does not need free audio. [HOW] Skip and return NULL. This fixes 2 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	5396a70e8c	drm/amd/display: Check pipe offset before setting vblank pipe_ctx has a size of MAX_PIPES so checking its index before accessing the array. This fixes an OVERRUN issue reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Lancelot SIX	bd31e5026d	drm/amdkfd: Enable SQ watchpoint for gfx10 There are new control registers introduced in gfx10 used to configure hardware watchpoints triggered by SMEM instructions: SQ_WATCH{0,1,2,3}_{CNTL_ADDR_HI,ADDR_L}. Those registers work in a similar way as the TCP_WATCH* registers currently used for gfx9 and above. This patch adds support to program the SQ_WATCH registers for gfx10. The SQ_WATCH?_CNTL.MASK field has one bit more than TCP_WATCH?_CNTL.MASK, so SQ watchpoints can have a finer granularity than TCP_WATCH watchpoints. In this patch, we keep the capabilities advertised to the debugger unchanged (HSA_DBG_WATCH_ADDR_MASK_*_BIT_GFX10) as this reflects what both TCP and SQ watchpoints can do and both watchpoints are configured together. Signed-off-by: Lancelot SIX <lancelot.six@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:44 -04:00
Alex Hung	59d99deb33	drm/amd/display: Check index msg_id before read or write [WHAT] msg_id is used as an array index and it cannot be a negative value, and therefore cannot be equal to MOD_HDCP_MESSAGE_ID_INVALID (-1). [HOW] Check whether msg_id is valid before reading and setting. This fixes 4 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Srinivasan Shanmugam	acce6479e3	drm/amdgpu: Fix buffer size in gfx_v9_4_3_init_ cp_compute_microcode() and rlc_microcode() The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating about potential truncation of output when using the snprintf function. The issue was due to the size of the buffer 'ucode_prefix' being too small to accommodate the maximum possible length of the string being written into it. The string being written is "amdgpu/%s_mec.bin" or "amdgpu/%s_rlc.bin", where %s is replaced by the value of 'chip_name'. The length of this string without the %s is 16 characters. The warning message indicated that 'chip_name' could be up to 29 characters long, resulting in a total of 45 characters, which exceeds the buffer size of 30 characters. To resolve this issue, the size of the 'ucode_prefix' buffer has been reduced from 30 to 15. This ensures that the maximum possible length of the string being written into the buffer will not exceed its size, thus preventing potential buffer overflow and truncation issues. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: In function ‘gfx_v9_4_3_early_init’: drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=] 379 \| snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name); \| ^~ ...... 439 \| r = gfx_v9_4_3_init_rlc_microcode(adev, ucode_prefix); \| ~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30 379 \| snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=] 413 \| snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name); \| ^~ ...... 443 \| r = gfx_v9_4_3_init_cp_compute_microcode(adev, ucode_prefix); \| ~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30 413 \| snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: `8630112969` ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0") Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Hersen Wu	8e65a1b711	drm/amd/display: Add NULL pointer check for kzalloc [Why & How] Check return pointer of kzalloc before using it. Reviewed-by: Alex Hung <alex.hung@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
George Shen	b528cac6de	drm/amd/display: Handle Y carry-over in VCP X.Y calculation Theoretically rare corner case where ceil(Y) results in rounding up to an integer. If this happens, the 1 should be carried over to the X value. CC: stable@vger.kernel.org Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
YiPeng Chai	6f3b69139c	drm/amdgpu: Fix ras mode2 reset failure in ras aca mode Fix ras mode2 reset failure in ras aca mode. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Bob Zhou	506c245f3f	drm/amdgpu: fix double free err_addr pointer warnings In amdgpu_umc_bad_page_polling_timeout, the amdgpu_umc_handle_bad_pages will be run many times so that double free err_addr in some special case. So set the err_addr to NULL to avoid the warnings. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Jesse Zhang	7bfd16d0ec	drm/amdgpu: initialize the last_jump_jiffies in atom_exec_context The parameter "last_jump_jiffies" should be initialized before being used in the function atom_op_jump. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Jesse Zhang	2d10c3dbde	drm/amdgpu: add check before free wb entry Check if ring is not a mes queue before freeing the wb entry, because we only allocate a wb entry when it's not a mes queue. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Bob Zhou	cd48b97ce7	drm/amdgpu: add return result for amdgpu_i2c_{get/put}_byte After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be conducted to put wrong value. So return and check the i2c transfer result. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Bob Zhou	8b2faf1a4f	drm/amdgpu: add error handle to avoid out-of-bounds if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should be stop to avoid out-of-bounds read, so directly return -EINVAL. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
Ma Jun	2e55bcf3d7	drm/amdgpu: Initialize timestamp for some legacy SOCs Initialize the interrupt timestamp for some legacy SOCs to fix the coverity issue "Uninitialized scalar variable" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
YiPeng Chai	48fa90718b	drm/amdgpu: Use new interface to reserve bad page Use new interface to reserve bad page. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:43 -04:00
YiPeng Chai	bcc0934885	drm/amdgpu: Fix address translation defect retired_page is page frame and should be expanded to the full address when querying status. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
Harish Kasiviswanathan	7f11a836e1	drm/amdkfd: Enforce queue BO's adev Queue buffer, though it is in system memory, has to be created using the correct amdgpu device. Enforce this as the BO needs to mapped to the GART for MES Hardware scheduler to access it. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
Dmytro Laktyushkin	4fdd07cec8	drm/amd/display: Increase SAT_UPDATE_PENDING timeout Headless dp 2.0 will take longer to update. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
Rodrigo Siqueira	5e66f6eaa2	drm/amd/display: Add some missing HDMI registers for DCN3x This commit add some missing HDMI control registers to DCN3x. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
YiPeng Chai	e023874081	drm/amdgpu: support ACA logging ecc errors support ACA logging ecc errors. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
YiPeng Chai	370fbff4cc	drm/amdgpu: add poison consumption handler Add poison consumption handler. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
YiPeng Chai	bfa579b38b	drm/amdgpu: prepare to handle pasid poison consumption Prepare to handle pasid poison consumption. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
YiPeng Chai	314c38cde6	drm/amdgpu: retire bad pages for umc v12_0 Retire bad pages for umc v12_0. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:42 -04:00
YiPeng Chai	e74313be5a	drm/amdgpu: add condition check for amdgpu_umc_fill_error_record Add condition check for amdgpu_umc_fill_error_record. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	2cf8e50ec3	drm/amdgpu: Add delay work to retire bad pages Add delay work to retire bad pages. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	f27defca68	drm/amdgpu: umc v12_0 logs ecc errors 1. umc v12_0 logs ecc errors. 2. Reserve newly detected ecc error pages. 3. Add tag for bad pages, so that they can be retired later. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	b2aa6b108d	drm/amdgpu: umc v12_0 converts error address Umc v12_0 converts error address. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	95b4063de4	drm/amdgpu: add interface to update umc v12_0 ecc status Add interface to update umc v12_0 ecc status. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	a734adfbcd	drm/amdgpu: add poison creation handler Add poison creation handler. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	f493dd64ee	drm/amdgpu: prepare for logging ecc errors Prepare for logging ecc errors. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
YiPeng Chai	98b5bc878d	drm/amdgpu: add message fifo to handle RAS poison events Add message fifo to handle RAS poison events. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
Jesse Zhang	88a9a467c5	drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_reloc Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001. V2: To really improve the handling we would actually need to have a separate value of 0xffffffff.(Christian) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
Rodrigo Siqueira	a4812f2fcb	drm/amd/display: Add TMDS DC balancer control Add TMDS balancer control to the list of available encoder registers for DCN 30. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
Srinivasan Shanmugam	442dd0552c	drm/amd/display: Remove unnecessary NULL check in dcn20_set_input_transfer_func This commit removes an unnecessary NULL check in the `dcn20_set_input_transfer_func` function in the `dcn20_hwseq.c` file. The variable `tf` is assigned the address of `plane_state->in_transfer_func` unconditionally, so it can never be `NULL`. Therefore, the check `if (tf == NULL)` is unnecessary and has been removed. The plane_state->in_transfer_func itself cannot be NULL because it's a structure, not a pointer. When we do tf = &plane_state->in_transfer_func;, we're getting the address of that structure, which will always be valid as long as plane_state itself is not NULL. we've already checked if plane_state is NULL with the line if (dpp_base == NULL \|\| plane_state == NULL) return false;. So, if the code execution gets to the point where tf = &plane_state->in_transfer_func; is called, plane_state is guaranteed to be not NULL, and therefore tf will also not be NULL. drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c 1094 bool dcn20_set_input_transfer_func(struct dc dc, 1095 struct pipe_ctx pipe_ctx, 1096 const struct dc_plane_state plane_state) 1097 { 1098 struct dce_hwseq hws = dc->hwseq; 1099 struct dpp dpp_base = pipe_ctx->plane_res.dpp; 1100 const struct dc_transfer_func tf = NULL; ^^^^^^^^^ This assignment is not necessary now. 1101 bool result = true; 1102 bool use_degamma_ram = false; 1103 1104 if (dpp_base == NULL \|\| plane_state == NULL) 1105 return false; 1106 1107 hws->funcs.set_shaper_3dlut(pipe_ctx, plane_state); 1108 hws->funcs.set_blend_lut(pipe_ctx, plane_state); 1109 1110 tf = &plane_state->in_transfer_func; ^^^^^ Before there was an if statement but now tf is assigned unconditionally 1111 --> 1112 if (tf == NULL) { ^^^^^^^^^^^^^^^^^ so these conditions are impossible. 1113 dpp_base->funcs->dpp_set_degamma(dpp_base, 1114 IPP_DEGAMMA_MODE_BYPASS); 1115 return true; 1116 } 1117 1118 if (tf->type == TF_TYPE_HWPWL \|\| tf->type == TF_TYPE_DISTRIBUTED_POINTS) 1119 use_degamma_ram = true; 1120 1121 if (use_degamma_ram == true) { 1122 if (tf->type == TF_TYPE_HWPWL) 1123 dpp_base->funcs->dpp_program_degamma_pwl(dpp_base, Fixes the below Smatch static checker warning: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1112 dcn20_set_input_transfer_func() warn: address of 'plane_state->in_transfer_func' is non-NULL Fixes: `285a7054bf` ("drm/amd/display: Remove plane and stream pointers from dc scratch") Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
Alex Deucher	eef016ba89	drm/amdgpu/mes11: Use a separate fence per transaction We can't use a shared fence location because each transaction should be considered independently. Reviewed-by: Shaoyun.liu <shaoyunl@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:41 -04:00
Rodrigo Siqueira	ce42ba4f92	drm/amd/display: Add missing dwb registers DCN3.0 supports some specific DWB debug registers that are not exposed yet. This commit just adds the missing registers. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Melissa Wen	efce15ec3b	drm/amd/display: use mpcc_count to log MPC state According to [1]: ``` DTN only logs 'pipe_count' instances of MPCC. However in some cases there are different number of MPCC than DPP (pipe_count). ``` As DTN log still relies on pipe_count to print mpcc state, switch to mpcc_count in all occurrences. [1] https://lore.kernel.org/amd-gfx/20240328195047.2843715-39-Roman.Li@amd.com/ Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Alex Deucher	497d7cee24	drm/amdgpu: add a spinlock to wb allocation As we use wb slots more dynamically, we need to lock access to avoid racing on allocation or free. Reviewed-by: Shaoyun.liu <shaoyunl@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Sonny Jiang	754c366e41	drm/amdgpu: update fw_share for VCN5 kmd_fw_shared changed in VCN5 Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
David Tadokoro	770e6c443b	drm/amd/display: Remove duplicated function signature from dcn3.01 DCCG In the header file dc/dcn301/dcn301_dccg.h, the function dccg301_create is declared twice, so remove duplication. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: David Tadokoro <davidbtadokoro@usp.br> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Mukul Joshi	8e1d190595	drm/amdgpu: Fix VRAM memory accounting Subtract the VRAM pinned memory when checking for available memory in amdgpu_amdkfd_reserve_mem_limit function since that memory is not available for use. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Sathishkumar S	c551316e15	drm/amdgpu: update jpeg max decode resolution jpeg ip version v2.1 and higher supports 16kx16k resolution decode Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Jose Fernandez	130afc8a88	drm/amd/display: Fix division by zero in setup_dsc_config When slice_height is 0, the division by slice_height in the calculation of the number of slices will cause a division by zero driver crash. This leaves the kernel in a state that requires a reboot. This patch adds a check to avoid the division by zero. The stack trace below is for the 6.8.4 Kernel. I reproduced the issue on a Z16 Gen 2 Lenovo Thinkpad with a Apple Studio Display monitor connected via Thunderbolt. The amdgpu driver crashed with this exception when I rebooted the system with the monitor connected. kernel: ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447) kernel: ? do_trap (arch/x86/kernel/traps.c:113 arch/x86/kernel/traps.c:154) kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu kernel: ? do_error_trap (./arch/x86/include/asm/traps.h:58 arch/x86/kernel/traps.c:175) kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu kernel: ? exc_divide_error (arch/x86/kernel/traps.c:194 (discriminator 2)) kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu kernel: ? asm_exc_divide_error (./arch/x86/include/asm/idtentry.h:548) kernel: ? setup_dsc_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1053) amdgpu kernel: dc_dsc_compute_config (drivers/gpu/drm/amd/amdgpu/../display/dc/dsc/dc_dsc.c:1109) amdgpu After applying this patch, the driver no longer crashes when the monitor is connected and the system is rebooted. I believe this is the same issue reported for 3113. Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Jose Fernandez <josef@netflix.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3113 Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Rodrigo Siqueira	71dfa617ea	drm/amd/display: Add missing debug registers for DCN2/3/3.1 This commit add some missing debug registers for DPCS and RDPC debug. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Sunil Khatri	af8644121e	drm/amdgpu: add ip dump for each ip in devcoredump Add ip dump for each ip of the asic in the devcoredump for all the ips where a callback is registered for register dump. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:40 -04:00
Sunil Khatri	e043a35dc2	drm/amdgpu: dump ip state before reset for each ip Invoke the dump_ip_state function for each ip before the asic resets and save the register values for debugging via devcoredump. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Sunil Khatri	c8732c80de	drm/amdgpu: add support for gfx v10 print Add support to print ip information to be used to print registers in devcoredump buffer. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Sunil Khatri	40356542c3	drm/amdgpu: add protype for print ip state Add the protoype for print ip state to be used to print the registers in devcoredump during a gpu reset. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Sunil Khatri	c395dbb68b	drm/amdgpu: add support of gfx10 register dump Adding gfx10 gc registers to be used for register dump via devcoredump during a gpu reset. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Sunil Khatri	e21d253bd7	drm/amdgpu: add prototype for ip dump Add the prototype to dump ip registers for all ips of different asics and set them to NULL for now. Based on the requirement add a function pointer for each of them. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
YiPeng Chai	af730e0820	drm/amdgpu: Add interface to reserve bad page Add interface to reserve bad page. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Ma Jun	60c448439f	drm/amdgpu: Fix uninitialized variable warnings return 0 to avoid returning an uninitialized variable r Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Jack Xiao	f88da7fbf6	drm/amdgpu/mes: fix use-after-free issue Delete fence fallback timer to fix the ramdom use-after-free issue. v2: move to amdgpu_mes.c Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Alex Deucher	e0a9bbeea0	drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3 This avoids a potential conflict with firmwares with the newer HDP flush mechanism. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:39 -04:00
Rajneesh Bhardwaj	a16b951586	drm/amdgpu: Update CGCG settings for GFXIP 9.4.3 Tune coarse grain clock gating idle threshold and rlc idle timeout to achieve better kernel launch latency. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Rodrigo Siqueira	ab6a0edb7d	Revert "drm/amd/display: Add fallback configuration when set DRR" This reverts commit `d76c0a23b5`. This change must be reverted since it caused soft hangs when changing the refresh rate to 122 & 144Hz when using a 7000 series GPU. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reported-by: Mark Broadworth <Mark.Broadworth@amd.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Srinivasan Shanmugam	bdc7ee7a35	drm/amdgpu: Fix snprintf buffer size in smu_v14_0_init_microcode This commit addresses buffer overflow in the smu_v14_0_init_microcode function. The issue was about the snprintf function writing more bytes into the fw_name buffer than it can hold. The line of code is: snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); Here, snprintf is used to write a formatted string into fw_name. The format is "amdgpu/%s.bin", where %s is a placeholder for the string ucode_prefix. The sizeof(fw_name) argument tells snprintf the maximum number of bytes it can write into fw_name, including the null-terminating character. In the original code, fw_name is an array of 30 characters. The string "amdgpu/%s.bin" could be up to 41 bytes long, which exceeds the 30 bytes allocated for fw_name. This is because %s could be replaced by ucode_prefix, which can be up to 29 characters long. Adding the 12 characters from "amdgpu/" and ".bin", the total length could be 41 characters. To address this, the size of ucode_prefix has been reduced to 15 characters. This ensures that the maximum length of the string written into fw_name does not exceed its capacity. smu_13/14 etc. don't follow legacy scheme ie., amdgpu_ucode_legacy_naming Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c: In function ‘smu_v14_0_init_microcode’: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:80:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=] 80 \| snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); \| ^~ ~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:80:9: note: ‘snprintf’ output between 12 and 41 bytes into a destination of size 30 80 \| snprintf(fw_name, sizeof(fw_name), "amdgpu/%s.bin", ucode_prefix); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: `fe6cd91524` ("drm/amd/swsmu: add smu14 ip support") Cc: Li Ma <li.ma@amd.com> Cc: Likun Gao <Likun.Gao@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Frank Min	ea9238a81b	drm/amdgpu: replace tmz flag into buffer flag Replace tmz flag into buffer flag to make it easier to understand and extend Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Le Ma	92ed1e9cd5	drm/amdgpu: init microcode chip name from ip versions To adapt to different gc versions in gfx_v9_4_3.c file. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Prike Liang	26de73bc0a	drm/amdgpu: Fix the ring buffer size for queue VM flush Here are the corrections needed for the queue ring buffer size calculation for the following cases: - Remove the KIQ VM flush ring usage. - Add the invalidate TLBs packet for gfx10 and gfx11 queue. - There's no VM flush and PFP sync, so remove the gfx9 real ring and compute ring buffer usage. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Mukul Joshi	63335b383a	drm/amdkfd: Add VRAM accounting for SVM migration Do VRAM accounting when doing migrations to vram to make sure there is enough available VRAM and migrating to VRAM doesn't evict other possible non-unified memory BOs. If migrating to VRAM fails, driver can fall back to using system memory seamlessly. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Lijo Lazar	fa7bb2cac0	drm/amd/pm: Restore config space after reset During mode-2 reset, pci config space registers are affected at device side. However, certain platforms have switches which assign virtual BAR addresses and returns the same even after device is reset. This affects pci_restore_state() as it doesn't issue another config write, if the value read is same as the saved value. Add a workaround to write saved config space values from driver side. Presently, these switches are in platforms with SMU v13.0.6 SOCs, hence restrict the workaround only to those. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Lang Yu	a522ec528c	drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend umsch test needs full GPU functionality(e.g., VM update, TLB flush, possibly buffer moving under memory pressure) which may be not ready under these states. Just skip it to avoid potential issues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Felix Kuehling	f989ecccdf	drm/amdkfd: Fix rescheduling of restore worker Handle the case that the restore worker was already scheduled by another eviction while the restore was in progress. Fixes: `9a1c1339ab` ("drm/amdkfd: Run restore_workers on freezable WQs") Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Yunxiang Li <Yunxiang.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-26 17:22:38 -04:00
Damien Le Moal	6927b01680	drm/amdgpu: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY Use the macro PCI_IRQ_INTX instead of the deprecated PCI_IRQ_LEGACY macro. Link: https://lore.kernel.org/r/20240325070944.3600338-11-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-25 12:53:31 -05:00
Jack Xiao	9482552820	drm/amdgpu/mes: fix use-after-free issue Delete fence fallback timer to fix the ramdom use-after-free issue. v2: move to amdgpu_mes.c Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:23:46 -04:00
Alex Deucher	9792b7cc18	drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3 This avoids a potential conflict with firmwares with the newer HDP flush mechanism. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:23:46 -04:00
Prike Liang	fe93b0927b	drm/amdgpu: Fix the ring buffer size for queue VM flush Here are the corrections needed for the queue ring buffer size calculation for the following cases: - Remove the KIQ VM flush ring usage. - Add the invalidate TLBs packet for gfx10 and gfx11 queue. - There's no VM flush and PFP sync, so remove the gfx9 real ring and compute ring buffer usage. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:23:46 -04:00
Mukul Joshi	1e214f7faa	drm/amdkfd: Add VRAM accounting for SVM migration Do VRAM accounting when doing migrations to vram to make sure there is enough available VRAM and migrating to VRAM doesn't evict other possible non-unified memory BOs. If migrating to VRAM fails, driver can fall back to using system memory seamlessly. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:23:45 -04:00
Lijo Lazar	30d1cda8ce	drm/amd/pm: Restore config space after reset During mode-2 reset, pci config space registers are affected at device side. However, certain platforms have switches which assign virtual BAR addresses and returns the same even after device is reset. This affects pci_restore_state() as it doesn't issue another config write, if the value read is same as the saved value. Add a workaround to write saved config space values from driver side. Presently, these switches are in platforms with SMU v13.0.6 SOCs, hence restrict the workaround only to those. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:23:45 -04:00
Lang Yu	661d71ee5a	drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend umsch test needs full GPU functionality(e.g., VM update, TLB flush, possibly buffer moving under memory pressure) which may be not ready under these states. Just skip it to avoid potential issues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:23:28 -04:00
Felix Kuehling	e26305f369	drm/amdkfd: Fix rescheduling of restore worker Handle the case that the restore worker was already scheduled by another eviction while the restore was in progress. Fixes: `9a1c1339ab` ("drm/amdkfd: Run restore_workers on freezable WQs") Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Yunxiang Li <Yunxiang.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:23:28 -04:00
Felix Kuehling	b0b13d5321	drm/amdgpu: Update BO eviction priorities Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:23:28 -04:00
Peyton Lee	d59198d2d0	drm/amdgpu/vpe: fix vpe dpm setup failed The vpe dpm settings should be done before firmware is loaded. Otherwise, the frequency cannot be successfully raised. Signed-off-by: Peyton Lee <peytolee@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:23:02 -04:00
Lijo Lazar	aebd3eb9d3	drm/amdgpu: Assign correct bits for SDMA HDP flush HDP Flush request bit can be kept unique per AID, and doesn't need to be unique SOC-wide. Assign only bits 10-13 for SDMA v4.4.2. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:22:54 -04:00
Ma Jun	0e95ed6452	drm/amdgpu/pm: Remove gpu_od if it's an empty directory gpu_od should be removed if it's an empty directory Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:22:46 -04:00
Lang Yu	9c783a1121	drm/amdkfd: make sure VM is ready for updating operations When page table BOs were evicted but not validated before updating page tables, VM is still in evicting state, amdgpu_vm_update_range returns -EBUSY and restore_process_worker runs into a dead loop. v2: Split the BO validation and page table update into two separate loops in amdgpu_amdkfd_restore_process_bos. (Felix) 1.Validate BOs 2.Validate VM (and DMABuf attachments) 3.Update page tables for the BOs validated above Fixes: `50661eb1a2` ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:22:23 -04:00
Mukul Joshi	25e9227c6a	drm/amdgpu: Fix leak when GPU memory allocation fails Free the sync object if the memory allocation fails for any reason. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:17:30 -04:00
Felix Kuehling	37865e02e6	drm/amdkfd: Fix eviction fence handling Handle case that dma_fence_get_rcu_safe returns NULL. If restore work is already scheduled, only update its timer. The same work item cannot be queued twice, so undo the extra queue eviction. Fixes: `9a1c1339ab` ("drm/amdkfd: Run restore_workers on freezable WQs") Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Gang BA <Gang.Ba@amd.com> Reviewed-by: Gang BA <Gang.Ba@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-04-23 23:17:30 -04:00
Joshua Ashton	2eb9dd497a	drm/amd/display: Set color_mgmt_changed to true on unsuspend Otherwise we can end up with a frame on unsuspend where color management is not applied when userspace has not committed themselves. Fixes re-applying color management on Steam Deck/Gamescope on S3 resume. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 23:17:29 -04:00
Felix Kuehling	e76691f45a	drm/amdgpu: Update BO eviction priorities Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:31 -04:00
Jiapeng Chong	8e49344e66	drm/amd/display: Remove duplicate dcn32/dcn32_clk_mgr.h header ./drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c: dcn32/dcn32_clk_mgr.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8789 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:31 -04:00
Pierre-Eric Pelloux-Prayer	6e042cee74	drm/amdgpu/vcn: fix unitialized variable warnings Avoid returning an uninitialized value if we never enter the loop. This case should never be hit in practice, but returning 0 doesn't hurt. The same fix is applied to the 4 places using the same pattern. v2: - fixed typos in commit message (Alex) - use "return 0;" before the done label instead of initializing r to 0 Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Alex Deucher	3f0664110a	drm/amdgpu/mes11: print MES opcodes rather than numbers Makes it easier to review the logs when there are MES errors. v2: use dbg for emitted, add helpers for fetching strings v3: fix missing commas (Harish) v4: drop command prefixes (Felix) v5: squash in bounds fix (Jun) Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com> (v2) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Peyton Lee	2476c6bd95	drm/amdgpu/vpe: fix vpe dpm setup failed The vpe dpm settings should be done before firmware is loaded. Otherwise, the frequency cannot be successfully raised. Signed-off-by: Peyton Lee <peytolee@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Lijo Lazar	7b19f1f346	drm/amdgpu: Assign correct bits for SDMA HDP flush HDP Flush request bit can be kept unique per AID, and doesn't need to be unique SOC-wide. Assign only bits 10-13 for SDMA v4.4.2. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Li Ma	455c7f7d9b	drm/amd/swsmu: add if condition for smu v14.0.1 smu v14.0.1 re-used smu v14.0.0 Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Ma Jun	e6f1a1946c	drm/amdgpu/pm: Print od status info Print the od status info if it's not supported. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Stanley.Yang	939c475181	drm/amdgpu: Support setting reset_method at runtime In order to support more test cases, support user change reset_method at runtime. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00
Ma Jun	69bc7a8a61	drm/amdgpu/pm: Remove gpu_od if it's an empty directory gpu_od should be removed if it's an empty directory Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-04-23 12:08:30 -04:00

1 2 3 4 5 ...

29622 Commits