Commit Graph

95 Commits

Author SHA1 Message Date
Lijo Lazar 3f69d5860f drm/amdgpu: Add a read to GFX v9.4.3 ring test
Issue a read to confirm the register write before ringing doorbell. With
multiple XCCs there is chance for race condition.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 19:04:13 -04:00
Tao Zhou d9443ac4f9 drm/amdgpu: drop status query/reset for GCEA 9.4.3 and MMEA 1.8
PMFW will be responsible for them.

v2: remove query interfaces.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 15:11:28 -04:00
Mangesh Gadre 2d955a06a5 Revert "drm/amdgpu: Program xcp_ctl registers as needed"
This reverts commit 0bdebfef3f.

XCP_CTL register is programmed by firmware and
register access is protected.

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 18:26:50 -04:00
Yang Wang 156c2814c2 drm/amdgpu: add RAS error info support for gfx_v9_4_3
add RAS error info support for gfx_v9_4_3.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:35:55 -04:00
Lijo Lazar 4e8303cf2c drm/amdgpu: Use function for IP version check
Use an inline function for version check. Gives more flexibility to
handle any format changes.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-20 12:23:28 -04:00
Mukul Joshi f705a6f021 drm/amdgpu: Store CU info from all XCCs for GFX v9.4.3
Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-11 17:10:19 -04:00
Hawking Zhang c2c23a10f1 drm/amdgpu: Correct se_num and reg_inst for gfx v9_4_3 ras counters
gfx_v9_4_3_ue|ce_reg_list is an array per gfx core instance
correct the settings of se_num and reg_inst for some of
gfx ras counters so all the available register instances
can be polled for ras status.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-06 14:36:57 -04:00
Rajneesh Bhardwaj d30279a9e3 drm/amdgpu: Hide xcp partition sysfs under SRIOV
XCP partitions should not be visible for the VF for GFXIP 9.4.3.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:51:16 -04:00
Tao Zhou ac3343c761 drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting
Instead of using direct update, avoid touching unrelated fields.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:51:16 -04:00
Mangesh Gadre 7caebc8f99 drm/amdgpu: Updated TCP/UTCL1 programming
Update TCP/UTCL1 thrashing control settings

v2: updated rev_id check

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 15:20:14 -04:00
Mangesh Gadre 559259362e drm/amdgpu: Remove SRAM clock gater override by driver
rlc firmware does required setting, driver need not do it.

Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-30 14:59:30 -04:00
Lijo Lazar b5cdadedaa drm/amdgpu: Remove gfxoff check in GFX v9.4.3
GFXOFF feature is not there for GFX 9.4.3 ASICs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:34:50 -04:00
Eric Huang 952ee94593 drm/amdgpu: enable trap of each kfd vmid for gfx v9.4.3
To setup ttmp on as default for gfx v9.4.3 in IP hw init.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-27 14:47:52 -04:00
Lijo Lazar b5ac08806c drm/amdgpu: Restore HQD persistent state register
On GFX v9.4.3, compute queue MQD is populated using the values in HQD
persistent state register. Hence don't clear the values on module
unload, instead restore it to the default reset value so that MQD is
initialized correctly during next module load. In particular, preload
flag needs to be set on compute queue MQD, otherwise it could cause
uninitialized values being used at device reset state resulting in EDC.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 13:47:27 -04:00
Shiwu Zhang 9bc12db4e2 drm/amdgpu: fix the indexing issue during rlcg access ctrl init
In case that the GET_INST() is used for looping, only loops for the
times of actual num of xcc, otherwise GET_INST() will return the invalid
index, a.k.a -1

And also remove the redundant mask checking in case of GET_INST()

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 13:47:26 -04:00
Lijo Lazar 0bdebfef3f drm/amdgpu: Program xcp_ctl registers as needed
XCP_CTL register is expected to be programmed by firmware. Under certain
conditions FW may not have programmed it correctly. As a workaround,
program it when FW has not programmed the right values.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21 16:52:25 -04:00
Victor Lu 8ed49dd1d3 drm/amdgpu: Add RLCG interface driver implementation for gfx v9.4.3 (v3)
Add RLCG interface support for gfx v9.4.3 and multiple XCCs.
Do not enable it yet.

v2: Fix amdgpu_rlcg_reg_access_ctrl init, add support for multiple XCCs
    in amdgpu_mm_wreg_mmio_rlc

v3: Use GET_INST() when indexing amdgpu_rlcg_reg_access_ctrl

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18 11:16:41 -04:00
Tao Zhou bd97449837 drm/amdgpu: add watchdog timer enablement for gfx_v9_4_3
Configure SQ watchdog timer setting.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 11:12:09 -04:00
Lijo Lazar 67769b7cdd drm/amdgpu: Remove redundant GFX v9.4.3 sequence
Programming of XCC id is already taken care with partition mode change.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-10 09:02:36 -04:00
Lijo Lazar 4755bfbd99 drm/amdgpu: Change golden settings for GFX v9.4.3
Change the settings applicable for A0. GRBM_MCM_ADDR setting will be
applied by firmware.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Tested-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-07 13:51:48 -04:00
Lijo Lazar a28eb4871a drm/amdgpu: Keep non-psp path for partition switch
When PSP block is not present, use direct programming.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Tested-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-30 13:12:15 -04:00
Zhigang Luo 2036b34d4a drm/amdgpu: port SRIOV VF missed changes
port SRIOV VF missed changes from gfx_v9_0 to gfx_v9_4_3.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-30 13:11:35 -04:00
Lijo Lazar b00f55374c drm/amdgpu: Use PSP FW API for partition switch
Use PSP firmware interface for switching compute partitions.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-15 11:06:59 -04:00
Yang Wang cab69d36cc drm/amdgpu: skip to resume rlcg for gc 9.4.3 in vf side
skip to resume rlcg, because rlcg is already enabled in pf side.

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 12:47:35 -04:00
Shiwu Zhang 491ae27829 drm/amdgpu: complement the 4, 6 and 8 XCC cases
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 11:02:17 -04:00
Shiwu Zhang 89f8576555 drm/amdgpu: golden settings for ASIC rev_id 0
Suggested by FW team that GB_ADDR_CONFIG is handled by golden
settings in driver to get the expected value

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 11:02:15 -04:00
Srinivasan Shanmugam a09e206510 drm/amdgpu: Fix defined but not used gfx9_cs_data in gfx_v9_4_3.c
gcc with W=1
In file included from drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:33:
drivers/gpu/drm/amd/amdgpu/clearstate_gfx9.h:939:36: warning: ‘gfx9_cs_data’ defined but not used [-Wunused-const-variable=]
  939 | static const struct cs_section_def gfx9_cs_data[] = {
      |

gfx9_cs_data is not used in gfx_v9_4_3.c, hence remove its
include in gfx_v9_4_3.c

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 11:00:53 -04:00
Guchun Chen 232f243189 drm/amdgpu/gfx: set sched.ready status after ring/IB test in gfx
sched.ready is nothing with ring initialization, it needs to set
to be true after ring/IB test in amdgpu_ring_test_helper to tell
the ring is ready for submission.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:57:11 -04:00
Jiapeng Chong e7665d0ca7 drm/amdgpu: Remove duplicate include
./drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: amdgpu_xcp.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=5281
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:54:31 -04:00
Tom Rix 803e4c9efc drm/amdgpu: remove unused variable num_xcc
gcc with W=1 reports
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:2138:13: error: variable
  ‘num_xcc’ set but not used [-Werror=unused-but-set-variable]
 2138 |         int num_xcc;
      |             ^~~~~~~

This variable is not used so remove it.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:52:45 -04:00
Srinivasan Shanmugam d7b8e68dc0 drm/amdgpu: Fix uninitialized variable in gfx_v9_4_3_cp_resume
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1925:6: error: variable 'r' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
        if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr,
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1931:6: note: uninitialized use occurs here
        if (r)
            ^
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1925:2: note: remove the 'if' if its condition is always true
        if (amdgpu_xcp_query_partition_mode(adev->xcp_mgr,
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:1923:7: note: initialize the variable 'r' to silence this warning
        int r, i, num_xcc;
             ^
              = 0
1 error generated.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:41:35 -04:00
Tao Zhou 92ecb92ccc drm/amdgpu: initialize RAS for gfx_v9_4_3
Register GFX RAS functions and initialize GFX RAS.

v2: remove xcp operations.
v3: reuse the return value of gfx_ras_sw_init.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:50 -04:00
Tao Zhou 0386d52d15 drm/amdgpu: add sq timeout status functions for gfx_v9_4_3
Query and reset sq timeout status.

v2: change instance from 0 to xcc_id for register access.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:47 -04:00
Tao Zhou 30feef0676 drm/amdgpu: add RAS error count reset for gfx_v9_4_3
Add GFX RAS error count reset function.

v2: remove xcp operation.
    only select_se_sh when instance number is more than 1.
v3: add check for se_num before select_se_sh.
    change instance from 0 to xcc_id for register access.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:44 -04:00
Tao Zhou bfa84da618 drm/amdgpu: add RAS error count query for gfx_v9_4_3
Query GFX RAS ce/ue count.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:42 -04:00
Tao Zhou 5c1c09a716 drm/amdgpu: add RAS error count definitions for gfx_v9_4_3
Prepare for the query of GFX RAS ce/ue count.

v2: remove xcp operation.
    only select_se_sh when instance number is more than 1.
v3: add more CE/UE registsers to query list.
    add check for se_num before select_se_sh.
    change instance from 0 to xcc_id for register access.
v4: move gfx memory id definitions to gfx_v9_4_3.
v5: create a dedicated patch for adding error count query function.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:39 -04:00
Tao Zhou 47e7f527c8 drm/amdgpu: add RAS status reset for gfx_v9_4_3
Reset GFX RAS status registers.

v2: fix typo in title.
    remove xcp operation.
v3: change instance from 0 to xcc_id for register access.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:32 -04:00
Tao Zhou bf16235b39 drm/amdgpu: add RAS status query for gfx_v9_4_3
Query GFX RAS status.

v2: remove xcp operation.
v3: change instance from 0 to xcc_id for register access.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 10:37:29 -04:00
Lijo Lazar 00e1ab02c2 drm/amdgpu: Skip halting RLC on GFX v9.4.3
RLC-PMFW handshake happens periodically when GFXCLK DPM is enabled and
halting RLC may cause unexpected results. Avoid halting RLC from driver
side.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:58:41 -04:00
Lijo Lazar 1e91a5f791 drm/amdgpu: Fix register accesses in GFX v9.4.3
Access registers with the right xcc id. Also, remove the unused logic as
PG is not used in GFX v9.4.3

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:58:39 -04:00
Lijo Lazar b6b85c8b43 drm/amdgpu: Return error on invalid compute mode
Return error if an invalid compute partition mode is requested.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:57:50 -04:00
Lijo Lazar b6f90baafe drm/amdgpu: Move memory partition query to gmc
GMC block handles memory related information, it makes more sense to
keep memory partition functions in gmc block.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:56:50 -04:00
Lijo Lazar ded7d99eb5 drm/amdgpu: Add flags for partition mode query
It's not required to take lock on all cases while querying partition
mode. Querying partition mode during KFD init process doesn't need to
take a lock. Init process after a switch will already be happening under
lock. Control the behaviour by adding flags to xcp_query_partition_mode.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:55:29 -04:00
Rajneesh Bhardwaj 228ce17643 drm/amdgpu: Handle VRAM dependencies on GFXIP9.4.3
[For 1P NPS1 mode driver bringup]

Changes required to initialize the amdgpu driver with frontdoor firmware
loading and discovery=2 with the native mode SBIOS that enables CPU GPU
unified interleaved memory.

sudo modprobe amdgpu discovery=2

Once PSP TMR region is reported via the ACPI interface, the dependency
on the ip_discovery.bin will be removed.

Choice of where to allocate driver table is given to each IP version. In
general, both GTT and VRAM domains will be considered. If one of the
tables has a strict restriction for VRAM domain, then only VRAM domain
is considered.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
(lijo: Modified the handling for SMU Tables)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:53:52 -04:00
Lijo Lazar c1d3f627ff drm/amdgpu: Fix mqd init on GFX v9.4.3
For MQD init, an XCC's queue is selected with GRBM select. However, for
initialization of MQD, values read from logical XCC0 registers are used.
This results in garbage values being read from XCC0 whose queue is not
selected. Change to read from the right XCC for MQD initialization.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:52:25 -04:00
Lijo Lazar b7c7011e67 drm/amdgpu: Enable CGCG/LS for GC 9.4.3
Enable coarse grain clockgating/light sleep for GC v9.4.3. Remove
programming that is not meant for GC 9.4.3.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:52:12 -04:00
Lijo Lazar 233bb3733b drm/amdgpu: Use unique doorbell range per xcc
Program different ranges in each XCC with MEC_DOORBELL_RANGE_LOWER/HIGHER.
Keeping the same range causes CPF in other XCCs also to be busy when an IB
packet is submitted to KCQ. Only the XCC which processes the packet
comes back to idle afterwards and this causes other CPs not be idle.
This in turn affects clockgating behavior as RLC doesn't get idle
interrupt.

LOWER/HIGHER covers only KIQ/KCQs which are per XCC queues. Assigning
different ranges doesn't seem to have any side effect as user queue ranges
are outside of this range. User queue tests - PM4 through KFD and AQL
through rocr - have the same results after this change.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:52:06 -04:00
Lijo Lazar 34fd9d6867 drm/amdgpu: Add FGCG logic for GFX v9.4.3
Add logic for fine grain clock gating logic for GFX v9.4.3. The feature
will be controlled using CG flags. Also, make a change so that RLC safe
mode entry/exit is done only once during CG update sequence.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:51:03 -04:00
Lijo Lazar d524180b88 drm/amdgpu: Fix GFX v9.4.3 EOP buffer allocation
Each compute cluster gets 8 compute queues in GFX v9.4.3. Fix the EOP
buffer allocation so that compute queue on every XCC gets a unique
address.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Tested-and-Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:50:52 -04:00
Le Ma 98b2e9cad2 drm/amdgpu: correct the vmhub index when page fault occurs
The AMDGPU_GFXHUB was bind to each xcc in the logical order.
Thus convert the node_id to logical xcc_id to index the
correct AMDGPU_GFXHUB. And "node_id / 4" can get the correct
AMDGPU_MMHUB0 index.

Signed-off-by: Le Ma <le.ma@amd.com>
Tested-by: Asad kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09 09:50:30 -04:00