Commit Graph

10937 Commits

Author SHA1 Message Date
sashank saye 4da8b63944 drm/amdgpu: Send Message to SMU on aldebaran passthrough for sbr handling
For Aldebaran chip passthrough case we need to intimate SMU
about special handling for SBR.On older chips we send
LightSBR to SMU, enabling the same for Aldebaran. Slight
difference, compared to previous chips, is on Aldebaran, SMU
would do a heavy reset on SBR. Hence, the word Heavy
instead of Light SBR is used for SMU to differentiate.

Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: sashank saye <sashank.saye@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:03:19 -05:00
Rajneesh Bhardwaj fbcdbfde87 drm/amdgpu: Don't inherit GEM object VMAs in child process
When an application having open file access to a node forks, its shared
mappings also get reflected in the address space of child process even
though it cannot access them with the object permissions applied. With the
existing permission checks on the gem objects, it might be reasonable to
also create the VMAs with VM_DONTCOPY flag so a user space application
doesn't need to explicitly call the madvise(addr, len, MADV_DONTFORK)
system call to prevent the pages in the mapped range to appear in the
address space of the child process. It also prevents the memory leaks
due to additional reference counts on the mapped BOs in the child
process that prevented freeing the memory in the parent for which we had
worked around earlier in the user space inside the thunk library.

Additionally, we faced this issue when using CRIU to checkpoint restore
an application that had such inherited mappings in the child which
confuse CRIU when it mmaps on restore. Having this flag set for the
render node VMAs helps. VMAs mapped via KFD already take care of this so
this is needed only for the render nodes.

To limit the impact of the change to user space consumers such as OpenGL
etc, limit it to KFD BOs only.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:03:08 -05:00
Tao Zhou b6485bed40 drm/amdkfd: reset queue which consumes RAS poison (v2)
CP supports unmap queue with reset mode which only destroys specific queue without affecting others.
Replacing whole gpu reset with reset queue mode for RAS poison consumption
saves much time, and we can also fallback to gpu reset solution if reset
queue fails.

v2: Return directly if process is NULL;
    Reset queue solution is not applicable to SDMA, fallback to legacy
way;
    Call kfd_unref_process after lookup process.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:02:59 -05:00
Tao Zhou f4409ee846 drm/amdgpu: add gpu reset control for umc page retirement
Add a reset parameter for umc page retirement, let user decide whether
call gpu reset in umc page retirement.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:02:32 -05:00
Victor Skvortsov d764fb2af6 drm/amdgpu: Modify indirect register access for gfx9 sriov
Expand RLCG interface for new GC read & write commands.
New interface will only be used if the PF enables the flag in pf2vf msg.

v2: Added a description for the scratch registers

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:02:25 -05:00
Victor Skvortsov 4a0165f060 drm/amdgpu: get xgmi info before ip_init
Driver needs to call get_xgmi_info() before ip_init
to determine whether it needs to handle a pending hive reset.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Reviewed by: shaoyun.liu <Shaoyun.lui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:02:17 -05:00
Victor Skvortsov 4aa325ae54 drm/amdgpu: Modify indirect register access for amdkfd_gfx_v9 sriov
Modify GC register access from MMIO to RLCG if the indirect
flag is set

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:02:10 -05:00
Victor Skvortsov 92f153bb5a drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov
Modify GC register access from MMIO to RLCG if the
indirect flag is set

v2: Replaced ternary operator with if-else for better
readability

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:02:02 -05:00
Victor Skvortsov 0da6f6e587 drm/amdgpu: Add *_SOC15_IP_NO_KIQ() macro definitions
Add helper macros to change register access
from direct to indirect.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:01:55 -05:00
Bokun Zhang b18ff6925d drm/amdgpu: Filter security violation registers
Recently, there is security policy update under SRIOV.
We need to filter the registers that hit the violation
and move the code to the host driver side so that
the guest driver can execute correctly.

Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 16:00:47 -05:00
Alex Deucher ebae897388 drm/amdgpu: no DC support for headless chips
Chips with no display hardware should return false for
DC support.

v2: drop Arcturus and Aldebaran

Fixes: f7f12b2582 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reported-by: Tareque Md.Hanif <tarequemd.hanif@yahoo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28 09:07:30 -05:00
Evan Quan 7be3be2b02 drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
By setting mp1_state as PP_MP1_STATE_UNLOAD, MP1 will do some proper cleanups and
put itself into a state ready for PNP. That can workaround some random resuming
failure observed on BOCO capable platforms.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-27 13:08:28 -05:00
Alex Deucher daf8de0874 drm/amdgpu: always reset the asic in suspend (v2)
If the platform suspend happens to fail and the power rail
is not turned off, the GPU will be in an unknown state on
resume, so reset the asic so that it will be in a known
good state on resume even if the platform suspend failed.

v2: handle s0ix

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-27 12:16:34 -05:00
Alex Deucher 4d625a97a7 drm/amdgpu: fix runpm documentation
It's not only supported by HG/PX laptops.  It's supported
by all dGPUs which supports BOCO/BACO functionality (runtime
D3).

BOCO - Bus Off, Chip Off.  The entire chip is powered off.
       This is controlled by ACPI.
BACO - Bus Active, Chip Off.  The chip still shows up
       on the PCI bus, but the device itself is powered
       down.

v2: fix missed HG/PX reference

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-22 21:51:30 -05:00
Dave Airlie b06103b532 Merge tag 'amd-drm-next-5.17-2021-12-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amdgpu:
- Add some display debugfs entries
- RAS fixes
- SR-IOV fixes
- W=1 fixes
- Documentation fixes
- IH timestamp fix
- Misc power fixes
- IP discovery fixes
- Large driver documentation updates
- Multi-GPU memory use reductions
- Misc display fixes and cleanups
- Add new SMU debug option

amdkfd:
- SVM fixes

radeon:
- Fix typo in comment

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216202731.5900-1-alexander.deucher@amd.com
2021-12-23 11:55:28 +10:00
Yazen Ghannam 91f75eb481 x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank
number "i" is either the same across all CPUs or is Reserved/Read-as-Zero.

For example:

  Bank # | CPUx | CPUy
    0      LS     LS
    1      RAZ    UMC
    2      CS     CS
    3      SMU    RAZ

Future AMD systems will lay out MCA bank types such that the type of
bank number "i" may be different across CPUs.

For example:

  Bank # | CPUx | CPUy
    0      LS     LS
    1      RAZ    UMC
    2      CS     NBIO
    3      SMU    RAZ

Change the structures that cache MCA bank types to be per-CPU and update
smca_get_bank_type() to handle this change.

Move some SMCA-specific structures to amd.c from mce.h, since they no
longer need to be global.

Break out the "count" for bank types from struct smca_hwid, since this
should provide a per-CPU count rather than a system-wide count.

Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The
values in this array should not change at runtime.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com
2021-12-22 17:22:09 +01:00
Alex Deucher 5e713c6afa drm/amdgpu: add support for IP discovery gc_info table v2
Used on gfx9 based systems. Fixes incorrect CU counts reported
in the kernel log.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-12-17 12:47:29 -05:00
chen gong b7865173cf drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
Play a video on the raven (or PCO, raven2) platform, and then do the S3
test. When resume, the following error will be reported:

amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring
vcn_dec test failed (-110)
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block
<vcn_v1_0> failed -110
amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110

[why]
When playing the video: The power state flag of the vcn block is set to
POWER_STATE_ON.

When doing suspend: There is no change to the power state flag of the
vcn block, it is still POWER_STATE_ON.

When doing resume: Need to open the power gate of the vcn block and set
the power state flag of the VCN block to POWER_STATE_ON.
But at this time, the power state flag of the vcn block is already
POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm:
avoid duplicate powergate/ungate setting" patch will return the
amdgpu_dpm_set_powergating_by_smu function directly.
As a result, the gate of the power was not opened, causing the
subsequent ring test to fail.

[how]
In the suspend function of the vcn block, explicitly change the power
state flag of the vcn block to POWER_STATE_OFF.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828
Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-12-17 12:47:29 -05:00
Huang Rui bf67014d6b drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence
The job embedded fence donesn't initialize the flags at
dma_fence_init(). Then we will go a wrong way in
amdgpu_fence_get_timeline_name callback and trigger a null pointer panic
once we enabled the trace event here. So introduce new amdgpu_fence
object to indicate the job embedded fence.

[  156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0
[  156.131804] #PF: supervisor read access in kernel mode
[  156.131811] #PF: error_code(0x0000) - not-present page
[  156.131817] PGD 0 P4D 0
[  156.131824] Oops: 0000 [#1] PREEMPT SMP PTI
[  156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G           OE     5.16.0-rc1-custom #1
[  156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016
[  156.131848] RIP: 0010:strlen+0x0/0x20
[  156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[  156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206
[  156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b
[  156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0
[  156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000
[  156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0
[  156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007
[  156.131914] FS:  0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000
[  156.131923] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0
[  156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  156.131949] Call Trace:
[  156.131953]  <TASK>
[  156.131957]  ? trace_event_raw_event_dma_fence+0xcc/0x200
[  156.131973]  ? ring_buffer_unlock_commit+0x23/0x130
[  156.131982]  dma_fence_init+0x92/0xb0
[  156.131993]  amdgpu_fence_emit+0x10d/0x2b0 [amdgpu]
[  156.132302]  amdgpu_ib_schedule+0x2f9/0x580 [amdgpu]
[  156.132586]  amdgpu_job_run+0xed/0x220 [amdgpu]

v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-17 12:47:28 -05:00
Christian König fc74881c28 drm/amdgpu: fix dropped backing store handling in amdgpu_dma_buf_move_notify
bo->tbo.resource can now be NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1811
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211210083927.1754-1-christian.koenig@amd.com
2021-12-17 11:26:40 +01:00
Alex Deucher 0cd7f378b0 drm/amdgpu: add support for IP discovery gc_info table v2
Used on gfx9 based systems. Fixes incorrect CU counts reported
in the kernel log.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16 14:08:20 -05:00
Victor Skvortsov 892deb4826 drm/amdgpu: Separate vf2pf work item init from virt data exchange
We want to be able to call virt data exchange conditionally
after gmc sw init to reserve bad pages as early as possible.
Since this is a conditional call, we will need
to call it again unconditionally later in the init sequence.

Refactor the data exchange function so it can be
called multiple times without re-initializing the work item.

v2: Cleaned up the code. Kept the original call to init_exchange_data()
inside early init to initialize the work item, afterwards call
exchange_data() when needed.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16 14:08:20 -05:00
chen gong d4c2933fb8 drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
Play a video on the raven (or PCO, raven2) platform, and then do the S3
test. When resume, the following error will be reported:

amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring
vcn_dec test failed (-110)
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block
<vcn_v1_0> failed -110
amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110

[why]
When playing the video: The power state flag of the vcn block is set to
POWER_STATE_ON.

When doing suspend: There is no change to the power state flag of the
vcn block, it is still POWER_STATE_ON.

When doing resume: Need to open the power gate of the vcn block and set
the power state flag of the VCN block to POWER_STATE_ON.
But at this time, the power state flag of the vcn block is already
POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm:
avoid duplicate powergate/ungate setting" patch will return the
amdgpu_dpm_set_powergating_by_smu function directly.
As a result, the gate of the power was not opened, causing the
subsequent ring test to fail.

[how]
In the suspend function of the vcn block, explicitly change the power
state flag of the vcn block to POWER_STATE_OFF.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828
Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16 14:08:10 -05:00
Huang Rui 5c1e6fa49e drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence
The job embedded fence donesn't initialize the flags at
dma_fence_init(). Then we will go a wrong way in
amdgpu_fence_get_timeline_name callback and trigger a null pointer panic
once we enabled the trace event here. So introduce new amdgpu_fence
object to indicate the job embedded fence.

[  156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0
[  156.131804] #PF: supervisor read access in kernel mode
[  156.131811] #PF: error_code(0x0000) - not-present page
[  156.131817] PGD 0 P4D 0
[  156.131824] Oops: 0000 [#1] PREEMPT SMP PTI
[  156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G           OE     5.16.0-rc1-custom #1
[  156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016
[  156.131848] RIP: 0010:strlen+0x0/0x20
[  156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31
[  156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206
[  156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b
[  156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0
[  156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000
[  156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0
[  156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007
[  156.131914] FS:  0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000
[  156.131923] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0
[  156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  156.131949] Call Trace:
[  156.131953]  <TASK>
[  156.131957]  ? trace_event_raw_event_dma_fence+0xcc/0x200
[  156.131973]  ? ring_buffer_unlock_commit+0x23/0x130
[  156.131982]  dma_fence_init+0x92/0xb0
[  156.131993]  amdgpu_fence_emit+0x10d/0x2b0 [amdgpu]
[  156.132302]  amdgpu_ib_schedule+0x2f9/0x580 [amdgpu]
[  156.132586]  amdgpu_job_run+0xed/0x220 [amdgpu]

v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-16 13:42:36 -05:00
Thomas Zimmermann 9758ff2fa2 Merge drm/drm-next into drm-misc-next
Backmerging for v5.16-rc5. Resolves a conflict between drm-misc-next
and drm-misc-fixes in the vc4 driver.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-12-16 14:48:27 +01:00
Evan Quan 17c65d6fca drm/amdgpu: correct the wrong cached state for GMC on PICASSO
Pair the operations did in GMC ->hw_init and ->hw_fini. That
can help to maintain correct cached state for GMC and avoid
unintention gate operation dropping due to wrong cached state.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 17:56:53 -05:00
Hawking Zhang 841933d5b8 drm/amdgpu: don't override default ECO_BITs setting
Leave this bit as hardware default setting

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-12-14 17:50:36 -05:00
Le Ma f3a8076eb2 drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
should count on GC IP base address

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-12-14 17:49:50 -05:00
Yann Dirson 326db0dc00 amdgpu: fix some comment typos
Signed-off-by: Yann Dirson <ydirson@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:10:58 -05:00
Yann Dirson 03f2abb07e amdgpu: fix some kernel-doc markup
Those are not today pulled by the sphinx doc, but better be ready.

Signed-off-by: Yann Dirson <ydirson@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:10:53 -05:00
Jingwen Chen 948e7ce014 drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV
[Why]
gmc bo will be pinned during loading amdgpu and reset in SRIOV while
only unpinned in unload amdgpu

[How]
add amdgpu_in_reset and sriov judgement to skip pin bo

v2: fix wrong judgement

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:10:00 -05:00
Jingwen Chen 85dfc1d692 drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV
[Why]
psp tmr bo will be pinned during loading amdgpu and reset in SRIOV while
only unpinned in unload amdgpu

[How]
add amdgpu_in_reset and sriov judgement to skip pin bo

v2: fix wrong judgement

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:49 -05:00
Evan Quan 17252701ec drm/amdgpu: correct the wrong cached state for GMC on PICASSO
Pair the operations did in GMC ->hw_init and ->hw_fini. That
can help to maintain correct cached state for GMC and avoid
unintention gate operation dropping due to wrong cached state.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:31 -05:00
Guchun Chen e0f943b4f9 drm/amdgpu: use adev_to_drm to get drm_device pointer
Updated for consistency when accessing drm_device from amdgpu driver.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:24 -05:00
Evan Quan 7e31a8585b drm/amdgpu: move smu_debug_mask to a more proper place
As the smu_context will be invisible from outside(of power). Also,
the smu_debug_mask can be shared around all power code instead of
some specific framework(swSMU) only.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:11 -05:00
Victor Skvortsov fa4a427d84 drm/amdgpu: SRIOV flr_work should use down_write
Host initiated VF FLR may fail if someone else is
already holding a read_lock. Change from down_write_trylock
to down_write to guarantee the reset goes through.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-14 16:09:02 -05:00
Daniel Vetter 99b03ca651 Linux 5.16-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3
 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0
 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem
 St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz
 iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+
 VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz
 1m2tVNY=
 =7wRd
 -----END PGP SIGNATURE-----

Merge v5.16-rc5 into drm-next

Thomas Zimmermann requested a fixes backmerge, specifically also for
96c5f82ef0 ("drm/vc4: fix error code in vc4_create_object()")

Just a bunch of adjacent changes conflicts, even the big pile of them
in vc4.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-12-14 10:24:28 +01:00
chiminghao 47d9c6faa7 drm:amdgpu:remove unneeded variable
return value form directly instead of
taking this in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cm>
Signed-off-by: chiminghao <chi.minghao@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:34:27 -05:00
Isabella Basso ba6f8c135a drm/amdgpu: re-format file header comments
Fix the warning below:

 warning: Cannot understand  * \file amdgpu_ioc32.c
 on line 2 - I thought it was a doc line

Changes since v1:
- As suggested by Alexander Deucher:
  1. Reduce diff to minimum as this DOC section doesn't provide much
     value.

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:17 -05:00
Isabella Basso 929bb8e200 drm/amdgpu: fix amdgpu_ras_mca_query_error_status scope
This commit fixes the compile-time warning below:

 warning: no previous prototype for ‘amdgpu_ras_mca_query_error_status’
 [-Wmissing-prototypes]

Changes since v1:
- As suggested by Alexander Deucher:
  1. Make function static instead of adding prototype.

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:17 -05:00
Philip Yang 28fe416466 drm/amdgpu: Reduce SG bo memory usage for mGPUs
For userptr bo, if adev is not in IOMMU isolation mode, RAM direct map
to GPU, multiple GPUs use same system memory dma mapping address, they
can share the original mem->bo in attachment to reduce dma address array
memory usage.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Philip Yang 4a74c38cd6 drm/amdgpu: Detect if amdgpu in IOMMU direct map mode
If host and amdgpu IOMMU is not enabled or IOMMU is pass through mode,
set adev->ram_is_direct_mapped flag which will be used to optimize
memory usage for multi GPU mappings.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Lang Yu 6ff7fddbd1 drm/amdgpu: add support for SMU debug option
SMU firmware expects the driver maintains error context
and doesn't interact with SMU any more when SMU errors
occurred. That will aid in debugging SMU firmware issues.

Add SMU debug option support for this request, it can be
enabled or disabled via amdgpu_smu_debug debugfs file.
Use a 32-bit mask to indicate corresponding debug modes.
Currently, only one mode(HALT_ON_ERROR) is supported.
When enabled, it brings hardware to a kind of halt state
so that no one can touch it any more in the envent of SMU
errors.

The dirver interacts with SMU via sending messages. And
threre are three ways to sending messages to SMU in current
implementation. Handle them respectively as following:

1, smu_cmn_send_smc_msg_with_param() for normal timeout cases

  Halt on any error.

2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response()
for longer timeout cases

  Halt on errors apart from ETIME. Otherwise this way won't work.
  Let the user handle ETIME error in such a case.

3, smu_cmn_send_msg_without_waiting() for no waiting cases

  Halt on errors apart from ETIME. Otherwise second way won't work.

== Command Guide ==

1, enable HALT_ON_ERROR mode

 # echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

2, disable HALT_ON_ERROR mode

 # echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

v5:
 - Use bit mask to allow more debug features.(Evan)
 - Use WRAN() instead of BUG().(Evan)

v4:
 - Set to halt state instead of a simple hang.(Christian)

v3:
 - Use debugfs_create_bool().(Christian)
 - Put variable into smu_context struct.
 - Don't resend command when timeout.

v2:
 - Resend command when timeout.(Lijo)
 - Use debugfs file instead of module parameter.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Lang Yu 34f3a4a98b drm/amdgpu: introduce a kind of halt state for amdgpu device
It is useful to maintain error context when debugging
SW/FW issues. Introduce amdgpu_device_halt() for this
purpose. It will bring hardware to a kind of halt state,
so that no one can touch it any more.

Compare to a simple hang, the system will keep stable
at least for SSH access. Then it should be trivial to
inspect the hardware state and see what's going on.

v2:
 - Set adev->no_hw_access earlier to avoid potential crashes.(Christian)

Suggested-by: Christian Koenig <christian.koenig@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.co>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Hawking Zhang cace4bff75 drm/amdgpu: check df_funcs and its callback pointers
in case they are not avaiable in early phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Hawking Zhang 4ac955baa9 drm/amdgpu: don't override default ECO_BITs setting
Leave this bit as hardware default setting

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Le Ma 2c113b999c drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
should count on GC IP base address

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Hawking Zhang 2cb6577a30 drm/amdgpu: read and authenticate ip discovery binary
read and authenticate ip discovery binary getting from
vram first, if it is not valid, read and authenticate
the one getting from file

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:16 -05:00
Hawking Zhang 32f0e1a330 drm/amdgpu: add helper to verify ip discovery binary signature
To be used to check ip discovery binary signature

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:15 -05:00
Hawking Zhang f6dcaf0c07 drm/amdgpu: rename discovery_read_binary helper
add _from_vram in the funciton name to diffrentiate
the one used to read from file

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:15 -05:00
Hawking Zhang 43a80bd511 drm/amdgpu: add helper to load ip_discovery binary from file
To be used when ip_discovery binary is not carried by vbios

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:15 -05:00
Leslie Shi c40bdfb2ff drm/amdgpu: fix incorrect VCN revision in SRIOV
Guest OS will setup VCN instance 1 which is disabled as an enabled instance and
execute initialization work on it, but this causes VCN ib ring test failure
on the disabled VCN instance during modprobe:

amdgpu 0000:00:08.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 5 on hub 1
amdgpu 0000:00:08.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vcn_dec_0 (-110).
amdgpu 0000:00:08.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vcn_enc_0.0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: drop amdgpu_discovery_get_vcn_version and rename sriov_config to
vcn_config
v3: modify VCN's revision in SR-IOV and bare-metal

Fixes: baf3f8f374 ("drm/amdgpu: handle SRIOV VCN revision parsing")
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:15 -05:00
Leslie Shi 4046afcebf drm/amdgpu: add modifiers in amdgpu_vkms_plane_init()
Fix following warning in SRIOV during modprobe:

amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier
WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu]

Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:33:15 -05:00
Lang Yu 613aa3ea74 drm/amdgpu: only hw fini SMU fisrt for ASICs need that
We found some headaches on ASICs don't need that,
so remove that for them.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Kevin Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:35 -05:00
Philip Yang 0771c80591 drm/amdgpu: Handle fault with same timestamp
Remove not unique timestamp WARNING as same timestamp interrupt happens
on some chips,

Drain fault need to wait for the processed_timestamp to be truly greater
than the checkpoint or the ring to be empty to be sure no stale faults
are handled.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1818
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:35 -05:00
Isabella Basso e105b64a36 drm/amdgpu: fix location of prototype for amdgpu_kms_compat_ioctl
This fixes the warning below by changing the prototype to a location
that's actually included by the .c files that call
amdgpu_kms_compat_ioctl:

 warning: no previous prototype for ‘amdgpu_kms_compat_ioctl’
 [-Wmissing-prototypes]
 37 | long amdgpu_kms_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
    |      ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Isabella Basso 64cf26f04a drm/amd: append missing includes
This fixes warnings caused by global functions lacking prototypes:, such as:

 warning: no previous prototype for 'dcn303_hw_sequencer_construct'
 [-Wmissing-prototypes]
 12 | void dcn303_hw_sequencer_construct(struct dc *dc)
    |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ...
 warning: no previous prototype for ‘amdgpu_has_atpx’
 [-Wmissing-prototypes]
 76 | bool amdgpu_has_atpx(void) {
    |      ^~~~~~~~~~~~~~~

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Isabella Basso 2351b7d4e3 drm/amdgpu: fix function scopes
This turns previously global functions into static, thus removing
compile-time warnings such as:

 warning: no previous prototype for 'amdgpu_vkms_output_init' [-Wmissing-prototypes]
 399 | int amdgpu_vkms_output_init(struct drm_device *dev,
     |     ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Isabella Basso bbe04dec5c drm/amd: fix improper docstring syntax
This fixes various warnings relating to erroneous docstring syntax, of
which some are listed below:

 warning: Function parameter or member 'adev' not described in
 'amdgpu_atomfirmware_ras_rom_addr'
 ...
 warning: expecting prototype for amdgpu_atpx_validate_functions().
 Prototype was for amdgpu_atpx_validate() instead
 ...
 warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new'
 ...
 warning: Cannot understand  * @kfd_get_cu_occupancy - Collect number of
 waves in-flight on this device
 ...
 warning: This comment starts with '/**', but isn't a kernel-doc
 comment. Refer Documentation/doc-guide/kernel-doc.rst

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Zhigang Luo 85a774d9ad drm/amdgpu: extended waiting SRIOV VF reset completion timeout to 10s
For the ASIC has big FB, it need more time to clear FB during reset.
This change extended SRIOV VF waiting reset completion timeout from 5s
to 10s.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Acked-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Zhigang Luo a5f67c939e drm/amdgpu: recover XGMI topology for SRIOV VF after reset
For SRIOV VF, the XGMI topology was not recovered after reset. This
change added code to SRIOV VF reset function to update XGMI topology
for SRIOV VF after reset.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Zhigang Luo dd26e018aa drm/amdgpu: added PSP XGMI initialization for SRIOV VF during recover
For SRIOV VF, XGMI was not initialized in PSP during recover. This change
added PSP XGMI initialization for SRIOV VF during recover.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Zhigang Luo 175ac6ec6b drm/amdgpu: skip reset other device in the same hive if it's SRIOV VF
On SRIOV, host driver can support FLR(function level reset) on individual VF
within the hive which might bring the individual device back to normal without
the necessary to execute the hive reset. If the FLR failed , host driver will
trigger the hive reset, each guest VF will get reset notification before the
real hive reset been executed. The VF device can handle the reset request
individually in it's reset work handler.

This change updated gpu recover sequence to skip reset other device in
the same hive for SRIOV VF.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Tao Zhou 655ff3538e drm/amdgpu: enable RAS poison flag when GPU is connected to CPU
The RAS poison mode is enabled by default on the platform.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-13 16:32:34 -05:00
Dave Airlie f8eb96b4df Merge tag 'amd-drm-next-5.17-2021-12-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.17-2021-12-02:

amdgpu:
- Use generic drm fb helpers
- PSR fixes
- Rework DCN3.1 clkmgr
- DPCD 1.3 fixes
- Misc display fixes can cleanups
- Clock query fixes for APUs
- LTTPR fixes
- DSC fixes
- Misc PM fixes
- RAS fixes
- OLED backlight fix
- SRIOV fixes
- Add STB (Smart Trace Buffer) for supported dGPUs
- IH rework
- Enable seamless boot for DCN3.01

amdkfd:
- Rework more stuff around IP discovery enumeration
- Further clean up of interfaces with amdgpu
- SVM fixes

radeon:
- Indentation fixes

UAPI:
- Add a new KFD header that defines some of the sysfs bitfields and enums that userspace has been using for a while
  The corresponding bit-fields and enums in user mode are defined in
  https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/master/include/hsakmttypes.h

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211202191643.5970-1-alexander.deucher@amd.com
2021-12-10 13:52:51 +10:00
Christian König 21a6732f46 drm/amdgpu: don't skip runtime pm get on A+A config
The runtime PM get was incorrectly added after the check.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206084551.92502-1-christian.koenig@amd.com
2021-12-09 17:13:17 +01:00
Daniel Vetter c8a04cbeed drm-misc-next for 5.17:
UAPI Changes:
 
 Cross-subsystem Changes:
 
  * Move 'nomodeset' kernel boot option into DRM subsystem
 
 Core Changes:
 
  * Replace several DRM_*() logging macros with drm_*() equivalents
  * panel: Add quirk for Lenovo Yoga Book X91F/L
  * ttm: Documentation fixes
 
 Driver Changes:
 
  * Cleanup nomodeset handling in drivers
  * Fixes
  * bridge/anx7625: Fix reading EDID; Fix error code
  * bridge/megachips: Probe both bridges before registering
  * vboxvideo: Fix ERR_PTR usage
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmGklRkACgkQaA3BHVML
 eiPweQf7B6r4Bu5XRdodHs3arqt5A7st1ZIZKOwb6SdgriisViZPdAdmgoHjBxtj
 rMj0eUac80ssPmtXf6si0jDx2G/QS0bOT38g/NQ2rjxiR3jt5mjbKnd4m/fwVzxw
 uD9Qmk/6IlLdAQa4hpgmaF5er11zHxCxhKbDYlNzQX9dPqvHKq+KjN00FWkupqTl
 5CSA6wRcZuQXbEIKrueXBWz0oX9/KX8LraI/U4Ze9I4PmP6iD09Au8dgx5Mc8pUC
 OfA94pTdUVjg9ZHBhI4jm1LRIYKDlQcOkAPSCGgrA61dM/J0JOYmiE0fRAV0KH0S
 7lzKDXqnc/KoyohI8oGFYXNbMKf2cQ==
 =xiFx
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-11-29' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.17:

UAPI Changes:

Cross-subsystem Changes:

 * Move 'nomodeset' kernel boot option into DRM subsystem

Core Changes:

 * Replace several DRM_*() logging macros with drm_*() equivalents
 * panel: Add quirk for Lenovo Yoga Book X91F/L
 * ttm: Documentation fixes

Driver Changes:

 * Cleanup nomodeset handling in drivers
 * Fixes
 * bridge/anx7625: Fix reading EDID; Fix error code
 * bridge/megachips: Probe both bridges before registering
 * vboxvideo: Fix ERR_PTR usage

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YaSVz15Q7dAlEevU@linux-uq9g.fritz.box
2021-12-09 09:31:45 +01:00
Claudio Suarez 3c02193102 drm/amdgpu: replace drm_detect_hdmi_monitor() with drm_display_info.is_hdmi
Once EDID is parsed, the monitor HDMI support information is available
through drm_display_info.is_hdmi. The amdgpu driver still calls
drm_detect_hdmi_monitor() to retrieve the same information, which
is less efficient. Change to drm_display_info.is_hdmi

This is a TODO task in Documentation/gpu/todo.rst

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Claudio Suarez <cssk@net-c.es>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-07 13:13:07 -05:00
Claudio Suarez 20543be93c drm/amdgpu: update drm_display_info correctly when the edid is read
drm_display_info is updated by drm_get_edid() or
drm_connector_update_edid_property(). In the amdgpu driver it is almost
always updated when the edid is read in amdgpu_connector_get_edid(),
but not always.  Change amdgpu_connector_get_edid() and
amdgpu_connector_free_edid() to keep drm_display_info updated.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Claudio Suarez <cssk@net-c.es>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-07 13:12:58 -05:00
Stanley.Yang cf63b70272 drm/amdgpu: skip umc ras error count harvest
remove in recovery stat check, skip umc ras err cnt
harvest in amdgpu_ras_log_on_err_counter

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-07 13:12:19 -05:00
Flora Cui 30c1e39197 drm/amdgpu: free vkms_output after use
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-07 13:12:13 -05:00
Flora Cui f7ed3f90b2 drm/amdgpu: drop the critial WARN_ON in amdgpu_vkms
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-07 13:12:05 -05:00
Stanley.Yang aed1faab9d drm/amdgpu: only skip get ecc info for aldebaran
skip get ecc info for aldebarn through check ip version
do not affect other asic type

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-07 13:07:55 -05:00
Zhou Qingyang b220110e4c drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode()
In amdgpu_connector_lcd_native_mode(), the return value of
drm_mode_duplicate() is assigned to mode, and there is a dereference
of it in amdgpu_connector_lcd_native_mode(), which will lead to a NULL
pointer dereference on failure of drm_mode_duplicate().

Fix this bug add a check of mode.

This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.

Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.

Builds with CONFIG_DRM_AMDGPU=m show no new warnings, and
our static analyzer no longer warns about this code.

Fixes: d38ceaf99e ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-02 12:43:31 -05:00
Alex Deucher baf3f8f374 drm/amdgpu: handle SRIOV VCN revision parsing
For SR-IOV, the IP discovery revision number encodes
additional information.  Handle that case here.

v2: drop additional IP versions

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-02 12:43:25 -05:00
Stanley.Yang bab73f092d drm/amdgpu: skip query ecc info in gpu recovery
this is a workaround due to get ecc info failed during gpu recovery

[  700.236122] amdgpu 0000:09:00.0: amdgpu: Failed to export SMU ecc table!
[  700.236128] amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
[  704.331171] amdgpu: qcm fence wait loop timeout expired
[  704.331194] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
[  704.332445] amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
[  704.332448] amdgpu 0000:09:00.0: amdgpu: Bailing on TDR for s_job:ffffffffffffffff, as another already in progress
[  704.332456] amdgpu: Pasid 0x8000 destroy queue 0 failed, ret -62
[  710.360924] amdgpu 0000:09:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000013 SMN_C2PMSG_82:0x00000007
[  710.360964] amdgpu 0000:09:00.0: amdgpu: Failed to disable smu features.
[  710.361002] amdgpu 0000:09:00.0: amdgpu: Fail to disable dpm features!
[  710.361014] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-02 12:43:06 -05:00
shaoyunl 428890a3fe drm/amdgpu: adjust the kfd reset sequence in reset sriov function
This change revert previous commits:
9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")
271fd38ce5 ("drm/amdgpu: move kfd post_reset out of reset_sriov function")

This change moves the amdgpu_amdkfd_pre_reset to an earlier place
in amdgpu_device_reset_sriov, presumably to address the sequence issue
that the first patch was originally meant to fix.

Some register access(GRBM_GFX_CNTL) only be allowed on full access
mode. Move kfd_pre_reset and  kfd_post_reset back inside reset_sriov
function.

Fixes: 9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")
Fixes: 271fd38ce5 ("drm/amdgpu: move kfd post_reset out of reset_sriov function")
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 17:09:30 -05:00
Philip Yang 494f2e42ce drm/amdkfd: fix double free mem structure
drm_gem_object_put calls release_notify callback to free the mem
structure and unreserve_mem_limit, move it down after the last access
of mem and make it conditional call.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 17:08:00 -05:00
Lijo Lazar e0570f0b6e drm/amdgpu: Don't halt RLC on GFX suspend
On aldebaran, RLC also controls GFXCLK. Skip halting RLC during GFX IP suspend
and keep it running till PMFW disables all DPMs.

    [  578.019986] amdgpu 0000:23:00.0: amdgpu: GPU reset begin!
    [  583.245566] amdgpu 0000:23:00.0: amdgpu: Failed to disable smu features.
    [  583.245621] amdgpu 0000:23:00.0: amdgpu: Fail to disable dpm features!
    [  583.245639] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
    [  583.248504] [drm] free PSP TMR buffer

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 17:02:40 -05:00
Guchun Chen 7551f70ab9 drm/amdgpu: fix the missed handling for SDMA2 and SDMA3
There is no base reg offset or ip_version set for SDMA2
and SDMA3 on SIENNA_CICHLID, so add them.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kevin Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 17:00:55 -05:00
Flora Cui 1053b9c948 drm/amdgpu: check atomic flag to differeniate with legacy path
since vkms support atomic KMS interface

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <aleander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:59:38 -05:00
Flora Cui 3e467e478e drm/amdgpu: cancel the correct hrtimer on exit
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:58:52 -05:00
Jane Jian da3b36a23b drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID
[WHY]
for sriov odd# vf will modify vcn0 engine ip revision(due to multimedia bandwidth feature),
which will be mismatched with original vcn0 revision

[HOW]
add new version check for vcn0 disabled revision(3, 0, 192), typically modified under
sriov mode

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:58:11 -05:00
Yann Dirson ddb267b66a drm/amdgpu: update fw_load_type module parameter doc to match code
amdgpu_ucode_get_load_type() does not interpret this parameter as
documented.  It is ignored for many ASIC types (which presumably
only support one load_type), and when not ignored it is only used
to force direct loading instead of PSP loading.  SMU loading is
only available for ASICs for which the parameter is ignored.

Signed-off-by: Yann Dirson <ydirson@free.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:16:06 -05:00
Philip Yang a899fe8b43 drm/amdkfd: err_pin_bo path leaks kfd_bo_list
Refactor userptr and pin_bo path to make it less confusing, move
err_pin_bo label up to remove mem from process_info kfd_bo_list.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:16:00 -05:00
shaoyunl 992110d747 drm/amdgpu: adjust the kfd reset sequence in reset sriov function
This change revert previous commits:
9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")
271fd38ce5 ("drm/amdgpu: move kfd post_reset out of reset_sriov function")

This change moves the amdgpu_amdkfd_pre_reset to an earlier place
in amdgpu_device_reset_sriov, presumably to address the sequence issue
that the first patch was originally meant to fix.

Some register access(GRBM_GFX_CNTL) only be allowed on full access
mode. Move kfd_pre_reset and  kfd_post_reset back inside reset_sriov
function.

Fixes: 9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")
Fixes: 271fd38ce5 ("drm/amdgpu: move kfd post_reset out of reset_sriov function")
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:07:33 -05:00
Philip Yang a872c152fd drm/amdkfd: fix double free mem structure
drm_gem_object_put calls release_notify callback to free the mem
structure and unreserve_mem_limit, move it down after the last access
of mem and make it conditional call.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:07:13 -05:00
Lijo Lazar 81d104f4af drm/amdgpu: Don't halt RLC on GFX suspend
On aldebaran, RLC also controls GFXCLK. Skip halting RLC during GFX IP suspend
and keep it running till PMFW disables all DPMs.

    [  578.019986] amdgpu 0000:23:00.0: amdgpu: GPU reset begin!
    [  583.245566] amdgpu 0000:23:00.0: amdgpu: Failed to disable smu features.
    [  583.245621] amdgpu 0000:23:00.0: amdgpu: Fail to disable dpm features!
    [  583.245639] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62
    [  583.248504] [drm] free PSP TMR buffer

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:04:23 -05:00
Lijo Lazar fe9c5c9aff drm/amdgpu: Use MAX_HWIP instead of HW_ID_MAX
HW_ID_MAX considers HWID of all IPs, far more than what amdgpu uses.
amdgpu tracks only the IPs defined by amd_hw_ip_block_type whose max
is MAX_HWIP.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:04:10 -05:00
Guchun Chen 3700169886 drm/amdgpu: fix the missed handling for SDMA2 and SDMA3
There is no base reg offset or ip_version set for SDMA2
and SDMA3 on SIENNA_CICHLID, so add them.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kevin Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:04:02 -05:00
Guchun Chen 6c18ecefab drm/amdgpu: declare static function to fix compiler warning
>> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:503:6: warning: no previous prototype for function 'release_psp_cmd_buf' [-Wmissing-prototypes]
   void release_psp_cmd_buf(struct psp_context *psp)
        ^
   drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:503:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void release_psp_cmd_buf(struct psp_context *psp)
   ^
   static
   1 warning generated.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kevin Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:56 -05:00
Philip Yang 3c2d6ea279 drm/amdgpu: handle IH ring1 overflow
IH ring1 is used to process GPU retry fault, overflow is enabled to
drain retry fault because we want receive other interrupts while
handling retry fault to recover range. There is no overflow flag set
when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow
and drain retry fault.

If fault timestamp goes backward, the fault is filtered and should not
be processed. Drain fault is finished if processed_timestamp is equal to
or larger than checkpoint timestamp.

Add amdgpu_ih_functions interface decode_iv_ts for different chips to
get timestamp from IV entry with different iv size and timestamp offset.
amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:34 -05:00
Stanley.Yang 232d1d43b5 drm/amdgpu: fix disable ras feature failed when unload drvier v2
v2:
    still need call ras_disable_all_featrures to handle
    ras initilization failure case.

Function amdgpu_device_fini_hw is called before amdgpu_device_fini_sw,
so ras ta will unload before send ras disable command, ras dsiable operation
must before hw fini.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:22 -05:00
Flora Cui 700de2c8aa drm/amdgpu: check atomic flag to differeniate with legacy path
since vkms support atomic KMS interface

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <aleander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:03:03 -05:00
Flora Cui deefd07eed drm/amdgpu: fix vkms crtc settings
otherwise adev->mode_info.crtcs[] is NULL

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:57 -05:00
Flora Cui 4f7ee199d9 drm/amdgpu: cancel the correct hrtimer on exit
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:46 -05:00
Jane Jian 981b304546 drm/amdgpu/sriov/vcn: add new vcn ip revision check case for SIENNA_CICHLID
[WHY]
for sriov odd# vf will modify vcn0 engine ip revision(due to multimedia bandwidth feature),
which will be mismatched with original vcn0 revision

[HOW]
add new version check for vcn0 disabled revision(3, 0, 192), typically modified under
sriov mode

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01 16:02:04 -05:00
Javier Martinez Canillas 6a2d2ddf2c
drm: Move nomodeset kernel parameter to the DRM subsystem
The "nomodeset" kernel cmdline parameter is handled by the vgacon driver
but the exported vgacon_text_force() symbol is only used by DRM drivers.

It makes much more sense for the parameter logic to be in the subsystem
of the drivers that are making use of it.

Let's move the vgacon_text_force() function and related logic to the DRM
subsystem. While doing that, rename it to drm_firmware_drivers_only() and
make it return true if "nomodeset" was used and false otherwise. This is
a better description of the condition that the drivers are testing for.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211112133230.1595307-4-javierm@redhat.com
2021-11-27 13:52:22 +01:00
Javier Martinez Canillas 35f7775f81
drm: Don't print messages if drivers are disabled due nomodeset
The nomodeset kernel parameter handler already prints a message that the
DRM drivers will be disabled, so there's no need for drivers to do that.

Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211112133230.1595307-2-javierm@redhat.com
2021-11-27 13:52:16 +01:00
Alex Deucher 692cd92e66 drm/amd/display: update bios scratch when setting backlight
Update the bios scratch register when updating the backlight
level.  Some platforms apparently read this scratch register
and do additional operations in their hotkey handlers.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1518
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:14:36 -05:00
Lijo Lazar 57961c4c18 drm/amdgpu: Skip ASPM programming on aldebaran
There is no need for additional programming, keep the default settings.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:14:03 -05:00
Yang Wang fd08953b2d drm/amdgpu: fix byteorder error in amdgpu discovery
fix some byteorder issues about amdgpu discovery.
This will result in running errors on the big end system. (e.g:MIPS)

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:13:54 -05:00
Philip Yang c4ef8a73bf drm/amdgpu: enable Navi retry fault wptr overflow
If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done
while handling retry fault.

Remove VMC from IH storm client, enable ring1 write pointer
overflow, then IH will drop retry fault interrupts and be able to receive
other interrupts while driver is handling retry fault.

IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:13:10 -05:00
Philip Yang 8888e2fe9c drm/amdgpu: enable Navi 48-bit IH timestamp counter
By default this timestamp is 32 bit counter. It gets overflowed in
around 10 minutes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:12:55 -05:00
Philip Yang 4d62555f62 drm/amdgpu: IH process reset count when restart
Otherwise when IH process restart, count is zero, the loop will
not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS
interrupts.

Cc: stable@vger.kernel.org
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:09:46 -05:00
Alex Deucher 53af98c091 drm/amdgpu/gfx9: switch to golden tsc registers for renoir+
Renoir and newer gfx9 APUs have new TSC register that is
not part of the gfxoff tile, so it can be read without
needing to disable gfx off.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:09:17 -05:00
Alex Deucher 244ee39885 drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well
Apply the same check we do for dGPUs for APUs as well.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:09:09 -05:00
shaoyunl 271fd38ce5 drm/amdgpu: move kfd post_reset out of reset_sriov function
Fixes: 9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")

For sriov XGMI  configuration, the host driver will handle the hive reset,
so in guest side, the reset_sriov only be called once on one device. This will
make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already
been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov
function to make them balance .

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:08:58 -05:00
xinhui pan 4eb6bb649f drm/amdgpu: Fix double free of dmabuf
amdgpu_amdkfd_gpuvm_free_memory_of_gpu drop dmabuf reference increased in
amdgpu_gem_prime_export.
amdgpu_bo_destroy drop dmabuf reference increased in
amdgpu_gem_prime_import.

So remove this extra dma_buf_put to avoid double free.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Tested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:04:05 -05:00
Felix Kuehling d3a21f7e35 drm/amdgpu: Fix MMIO HDP flush on SRIOV
Disable HDP register remapping on SRIOV and set rmmio_remap.reg_offset
to the fixed address of the VF register for hdp_v*_flush_hdp.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 15:02:25 -05:00
Alex Deucher 1f57925493 drm/amd/display: update bios scratch when setting backlight
Update the bios scratch register when updating the backlight
level.  Some platforms apparently read this scratch register
and do additional operations in their hotkey handlers.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1518
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:54 -05:00
Lijo Lazar 6ff53495ce drm/amdgpu: Skip ASPM programming on aldebaran
There is no need for additional programming, keep the default settings.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Yang Wang cc7818d709 drm/amdgpu: fix byteorder error in amdgpu discovery
fix some byteorder issues about amdgpu discovery.
This will result in running errors on the big end system. (e.g:MIPS)

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang 23eb49251b drm/amdgpu: enable Navi retry fault wptr overflow
If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done
while handling retry fault.

Remove VMC from IH storm client, enable ring1 write pointer
overflow, then IH will drop retry fault interrupts and be able to receive
other interrupts while driver is handling retry fault.

IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang 71ee9236ab drm/amdgpu: enable Navi 48-bit IH timestamp counter
By default this timestamp is 32 bit counter. It gets overflowed in
around 10 minutes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Philip Yang 514f4a99c7 drm/amdgpu: IH process reset count when restart
Otherwise when IH process restart, count is zero, the loop will
not exit to wake_up_all after processing AMDGPU_IH_MAX_NUM_IVS
interrupts.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Evan Quan 1223c15c78 drm/amdgpu: update the domain flags for dumb buffer creation
After switching to generic framebuffer framework, we rely on the
->dumb_create routine for frame buffer creation. However, the
different domain flags used are not optimal. Add the contiguous
flag to directly allocate the scanout BO as one linear buffer.

Fixes: 087451f372 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Ramesh Errabolu 37ba5bbc89 drm/amdgpu: Declare Unpin BO api as static
Fixes warning report from kernel test robot

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Alex Deucher 7b37c7f8f5 drm/amdgpu/gfx9: switch to golden tsc registers for renoir+
Renoir and newer gfx9 APUs have new TSC register that is
not part of the gfxoff tile, so it can be read without
needing to disable gfx off.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:53 -05:00
Alex Deucher f75de84475 drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well
Apply the same check we do for dGPUs for APUs as well.

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:52 -05:00
shaoyunl 4f30d920d1 drm/amdgpu: move kfd post_reset out of reset_sriov function
Fixes: 9f4f2c1a35 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov")

For sriov XGMI  configuration, the host driver will handle the hive reset,
so in guest side, the reset_sriov only be called once on one device. This will
make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already
been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov
function to make them balance .

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-24 14:06:52 -05:00
Dave Airlie c18c889111 drm-misc-next for 5.17:
UAPI Changes:
 
  * Remove restrictions on DMA_BUF_SET_NAME ioctl
  * connector: State of privacy screen
  * sysfs: Send hotplug uevent
 
 Cross-subsystem Changes:
 
  * clk/bmc-2835: Fixes
 
  * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs
    helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes
 
  * pwm: Introduce of_pwm_single_xlate()
 
 Core Changes:
 
  * Support for privacy screens
  * Make drm_irq.c legacy
  * Fix __stack_depot_* name conflict
  * Documentation fixes
  * Fixes and cleanups
 
  * dp-helper: Reuse 8b/10b link-training delay helpers
 
  * format-helper: Update interfaces
 
  * fb-helper: Allocate shadow buffer of correct size
 
  * gem: Link GEM SHMEM and CMA helpers into separate modules; Use
 	    dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules
 
  * gem/shmem-helper: Interface cleanups
 
  * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies();
    Lockdep fixes
 
  * kms-helpers: Link several files from core into the KMS-helper module
 
 Driver Changes:
 
  * Use dma_resv_iter in several places
  * Fixes and cleanups
 
  * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences
    at once
 
  * bridge: Switch to managed MIPI DSI helpers in several places; Register
    and attach during probe in several places; Convert to YAML in several
    places
 
  * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes
 
  * bridge/dw-hdmi: Allow interlace on bridge
 
  * bridge/ps8640: Enable PM; Support aux-bus
 
  * bridge/tc358768: Enabled reference clock; Support pulse mode;
    Modesetting fixes
 
  * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM
 
  * etnaviv: Get all fences at once
 
  * gma500: GEM object cleanups; Remove generic drivers in probe function
 
  * i915: Support VESA panel backlights
 
  * ingenic: Fixes and cleanups
 
  * kirin: Adjust probe order
 
  * kmb: Enable framebuffer console
 
  * lima: Kconfig fixes
 
  * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER
 
  * msm: Fixes and cleanups
 
  * msm/dsi: Adjust probe order
 
  * omap: Fixes and cleanups
 
  * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB
    quantization to FULL; Fixes and cleanups
 
  * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452,
    Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950,
    BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout
    drivers; Fixes and cleanups
 
  * panel/ili9881c: Orientation fixes
 
  * radeon: Use dma_resv_wait_timeout()
 
  * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock
    fixes; Implement mmap in GEM object functions
 
  * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes
 
  * sun4i: Use CMA helpers without vmap support
 
  * tidss: Fixes and cleanups
 
  * v3d: Cleanups
 
  * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller
    while disabling; Support 4k@60 Hz modes; Fixes and cleanups
 
  * video: Convert to sysfs_emit() in several places
 
  * video/omapfb: Fix fall-through
 
  * virtio: Overflow fixes
 
  * xen: Implement mmap as GEM object functions
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmGWF0wACgkQaA3BHVML
 eiN55ggAr6QN7S7Uxu98XnqfAHC9RErY7r3PoTXXS6ODvxY41bWOpHk8TQzuw626
 JCNnpQCk6Gi8L3yl8r/l1fqoirGXrfDR1YvrnmG4I9xhPxOqBmgxDWw7HQrROm2B
 FctOvgFukvzn5jzQk2FqYgs5JBV20WqLrfEhttPFFMvjLGti/U31/+d+aGMJdRIQ
 kE2eVpPZtAlbBAP+S4mglKp6w+WrrzNHHULSGOPcGS5jwLrNFaQg7w75JiLInzyR
 RWum28USXSkKE0d6XBqHw+PWAwo3B/Vq9XSnbCX4+3GQX4tJh+hosyU7ju+rA0BY
 FJCyN/YgZvc+474o/LVtMh7PbKMEHQ==
 =TV6Q
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-11-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.17:

UAPI Changes:

 * Remove restrictions on DMA_BUF_SET_NAME ioctl
 * connector: State of privacy screen
 * sysfs: Send hotplug uevent

Cross-subsystem Changes:

 * clk/bmc-2835: Fixes

 * dma-buf: Add dma_resv selftest; Error-handling fixes; Add debugfs
   helpers; Remove dma_resv_get_excl_unlocked(); Documentation fixes

 * pwm: Introduce of_pwm_single_xlate()

Core Changes:

 * Support for privacy screens
 * Make drm_irq.c legacy
 * Fix __stack_depot_* name conflict
 * Documentation fixes
 * Fixes and cleanups

 * dp-helper: Reuse 8b/10b link-training delay helpers

 * format-helper: Update interfaces

 * fb-helper: Allocate shadow buffer of correct size

 * gem: Link GEM SHMEM and CMA helpers into separate modules; Use
	    dma_resv iterator; Import DMA_BUF namespace into GEM-helper modules

 * gem/shmem-helper: Interface cleanups

 * scheduler: Grab fence in drm_sched_job_add_implicit_dependencies();
   Lockdep fixes

 * kms-helpers: Link several files from core into the KMS-helper module

Driver Changes:

 * Use dma_resv_iter in several places
 * Fixes and cleanups

 * amdgpu: Use drm_kms_helper_connector_hotplug_event(); Get all fences
   at once

 * bridge: Switch to managed MIPI DSI helpers in several places; Register
   and attach during probe in several places; Convert to YAML in several
   places

 * bridge/anx7625: Support MIPI DPI input; Support HDMI audio; Fixes

 * bridge/dw-hdmi: Allow interlace on bridge

 * bridge/ps8640: Enable PM; Support aux-bus

 * bridge/tc358768: Enabled reference clock; Support pulse mode;
   Modesetting fixes

 * bridge/ti-sn65dsi86: Use regmap_bulk_write(); Implement PWM

 * etnaviv: Get all fences at once

 * gma500: GEM object cleanups; Remove generic drivers in probe function

 * i915: Support VESA panel backlights

 * ingenic: Fixes and cleanups

 * kirin: Adjust probe order

 * kmb: Enable framebuffer console

 * lima: Kconfig fixes

 * meson: Refactoring to supperot DRM_BRIDGE_ATTACH_NO_ENCODER

 * msm: Fixes and cleanups

 * msm/dsi: Adjust probe order

 * omap: Fixes and cleanups

 * nouveau: CRC fixes; Validate LUTs in atomic check; Set HDMI AVI RGB
   quantization to FULL; Fixes and cleanups

 * panel: Support Innolux G070Y2-T02, Vivax TPC-9150, JDI R63452,
   Newhaven 1.8-128160EF, Wanchanglong W552964ABA, Novatek NT35950,
   BOE BF060Y8M, Sony Tulip Truly NT35521; Use dev_err_probe() throughout
   drivers; Fixes and cleanups

 * panel/ili9881c: Orientation fixes

 * radeon: Use dma_resv_wait_timeout()

 * rockchip: Add timeout for DSP hold; Suspend/resume fixes; PLL clock
   fixes; Implement mmap in GEM object functions

 * simpledrm: Support FB_DAMAGE_CLIPS and virtual screen sizes

 * sun4i: Use CMA helpers without vmap support

 * tidss: Fixes and cleanups

 * v3d: Cleanups

 * vc4: Fix HDMI-CEC hang when display is off; Power on HDMI controller
   while disabling; Support 4k@60 Hz modes; Fixes and cleanups

 * video: Convert to sysfs_emit() in several places

 * video/omapfb: Fix fall-through

 * virtio: Overflow fixes

 * xen: Implement mmap as GEM object functions

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YZYZSypIrr+qcih3@linux-uq9g.fritz.box
2021-11-23 09:38:55 +10:00
Christian König 11b4da9827 drm/amdgpu: partially revert "svm bo enable_signal call condition"
Partially revert commit 5f319c5c21.

First of all this is illegal use of RCU to call dma_fence_enable_sw_signaling()
since we don't hold a reference to the fence in question and can crash badly.

Then the code doesn't seem to have the intended effect since only the
exclusive fence is handled, but the KFD fences are always added as shared fence.

Only keep the handling to throw away the content of SVM BOs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122123926.385017-1-christian.koenig@amd.com
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2021-11-22 21:27:04 +01:00
xinhui pan 4aaea9d72e drm/amdgpu: Fix double free of dmabuf
amdgpu_amdkfd_gpuvm_free_memory_of_gpu drop dmabuf reference increased in
amdgpu_gem_prime_export.
amdgpu_bo_destroy drop dmabuf reference increased in
amdgpu_gem_prime_import.

So remove this extra dma_buf_put to avoid double free.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Tested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:59:08 -05:00
Felix Kuehling e39938117e drm/amdgpu: Fix MMIO HDP flush on SRIOV
Disable HDP register remapping on SRIOV and set rmmio_remap.reg_offset
to the fixed address of the VF register for hdp_v*_flush_hdp.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:54 -05:00
Stanley.Yang fdcb279d5b drm/amdgpu: query umc error info from ecc_table v2
if smu support ECCTABLE, driver can message smu to get ecc_table
then query umc error info from ECCTABLE

v2:
    optimize source code makes logical more reasonable

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:46 -05:00
Stanley.Yang 8882f90a3f drm/amdgpu: add new query interface for umc block v2
add message smu to query error information

v2:
    rename message_smu to ecc_info

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:14 -05:00
Alex Deucher 92020e81dd drm/amdgpu/display: set vblank_disable_immediate for DC
Disable vblanks immediately to save power.  I think this was
missed when we merged DC support.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1781
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:03 -05:00
Bernard Zhao 7b833d6804 drm/amd/amdgpu: fix potential memleak
In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed
There is a potential memleak if not call kobject_put.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:03 -05:00
Bernard Zhao 8b11e14bd5 drm/amd/amdgpu: cleanup the code style a bit
This change is to cleanup the code style a bit.

Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:02 -05:00
Bernard Zhao 7b755d6510 drm/amd/amdgpu: remove useless break after return
This change is to remove useless break after return.

Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:02 -05:00
hongao 6c5af7d2f8 drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors
amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode
which assign amdgpu_encoder->native_mode with *preferred_mode result in
amdgpu_encoder->native_mode.clock always be 0. That will cause
amdgpu_connector_set_property returned early on:
if ((rmx_type != DRM_MODE_SCALE_NONE) &&
	(amdgpu_encoder->native_mode.clock == 0))
when we try to set scaling mode Full/Full aspect/Center.
Add the missing function to amdgpu_connector_vga_get_mode can fix this.
It also works on dvi connectors because
amdgpu_connector_dvi_helper_funcs.get_mode use the same method.

Signed-off-by: hongao <hongao@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:02 -05:00
Candice Li d9a69fe512 drm/amdgpu: Add recovery_lock to save bad pages function
Fix race condition failure during UMC UE injection.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:02 -05:00
Evan Quan 6c08e0ef87 drm/amd/pm: avoid duplicate powergate/ungate setting
Just bail out if the target IP block is already in the desired
powergate/ungate state. This can avoid some duplicate settings
which sometimes may cause unexpected issues.

Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:02 -05:00
Ramesh Errabolu d25e35bc26 drm/amdgpu: Pin MMIO/DOORBELL BO's in GTT domain
MMIO/DOORBELL BOs encode control data and should be pinned in GTT
domain before enabling PCIe connected peer devices in accessing it

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:01 -05:00
Ramesh Errabolu f441dd33db drm/amdgpu: Update BO memory accounting to rely on allocation flag
Accounting system to track amount of available memory (system, TTM
and VRAM of a device) relies on BO's domain. The change is to rely
instead on allocation flag indicating BO type - VRAM, GTT, USERPTR,
MMIO or DOORBELL

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-22 14:45:01 -05:00
Thomas Zimmermann a713ca234e Merge drm/drm-next into drm-misc-next
Backmerging from drm/drm-next for v5.16-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-11-18 09:36:39 +01:00
Bernard Zhao 27dfaedc0d drm/amd/amdgpu: fix potential memleak
In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed
There is a potential memleak if not call kobject_put.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Bernard Zhao <bernard@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 23:04:57 -05:00
hongao bf55208391 drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors
amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode
which assign amdgpu_encoder->native_mode with *preferred_mode result in
amdgpu_encoder->native_mode.clock always be 0. That will cause
amdgpu_connector_set_property returned early on:
if ((rmx_type != DRM_MODE_SCALE_NONE) &&
	(amdgpu_encoder->native_mode.clock == 0))
when we try to set scaling mode Full/Full aspect/Center.
Add the missing function to amdgpu_connector_vga_get_mode can fix this.
It also works on dvi connectors because
amdgpu_connector_dvi_helper_funcs.get_mode use the same method.

Signed-off-by: hongao <hongao@uniontech.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-11-17 23:03:08 -05:00
Evan Quan 6ee27ee27b drm/amd/pm: avoid duplicate powergate/ungate setting
Just bail out if the target IP block is already in the desired
powergate/ungate state. This can avoid some duplicate settings
which sometimes may cause unexpected issues.

Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789
Fixes: bf756fb833 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-11-17 17:41:20 -05:00
Guchun Chen 69650a879b drm/amdgpu: add error print when failing to add IP block(v2)
Driver initialization is driven by IP version from IP
discovery table. So add error print when failing to add
ip block during driver initialization, this will be more
friendly to user to know which IP version is not correct.

[   40.467361] [drm] host supports REQ_INIT_DATA handshake
[   40.474076] [drm] add ip block number 0 <nv_common>
[   40.474090] [drm] add ip block number 1 <gmc_v10_0>
[   40.474101] [drm] add ip block number 2 <psp>
[   40.474103] [drm] add ip block number 3 <navi10_ih>
[   40.474114] [drm] add ip block number 4 <smu>
[   40.474119] [drm] add ip block number 5 <amdgpu_vkms>
[   40.474134] [drm] add ip block number 6 <gfx_v10_0>
[   40.474143] [drm] add ip block number 7 <sdma_v5_2>
[   40.474147] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init
[   40.474545] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device.

v2: use dev_err to multi-GPU system

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 17:41:07 -05:00
Guchun Chen 73729a7d07 drm/amdgpu: add error print when failing to add IP block(v2)
Driver initialization is driven by IP version from IP
discovery table. So add error print when failing to add
ip block during driver initialization, this will be more
friendly to user to know which IP version is not correct.

[   40.467361] [drm] host supports REQ_INIT_DATA handshake
[   40.474076] [drm] add ip block number 0 <nv_common>
[   40.474090] [drm] add ip block number 1 <gmc_v10_0>
[   40.474101] [drm] add ip block number 2 <psp>
[   40.474103] [drm] add ip block number 3 <navi10_ih>
[   40.474114] [drm] add ip block number 4 <smu>
[   40.474119] [drm] add ip block number 5 <amdgpu_vkms>
[   40.474134] [drm] add ip block number 6 <gfx_v10_0>
[   40.474143] [drm] add ip block number 7 <sdma_v5_2>
[   40.474147] amdgpu 0000:00:08.0: amdgpu: Fatal error during GPU init
[   40.474545] amdgpu 0000:00:08.0: amdgpu: amdgpu: finishing device.

v2: use dev_err to multi-GPU system

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 17:09:18 -05:00
Graham Sider 02274fc0f6 drm/amdkfd: replace trivial funcs with direct access
These get funcs simply return an adev field. Replace funcs/calls with
direct field accesses instead.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:11 -05:00
Nirmoy Das 26db557e35 drm/amdgpu: return early on error while setting bar0 memtype
We set WC memtype for aper_base but don't check return value
of arch_io_reserve_memtype_wc(). Be more defensive and return
early on error.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:04 -05:00
Nirmoy Das d5a28852e8 drm/amdgpu: remove unnecessary checks
amdgpu_ttm_backend_bind() only needed for TTM_PL_TT
and AMDGPU_PL_PREEMPT.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:04 -05:00
Evan Quan 087451f372 drm/amdgpu: use generic fb helpers instead of setting up AMD own's.
With the shadow buffer support from generic framebuffer emulation, it's
possible now to have runpm kicked when no update for console.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:03 -05:00
Graham Sider b5d1d755c1 drm/amdkfd: remove kgd_dev declaration and initialization
Completes removal of kgd_dev. Direct references to amdgpu_device objects
should now be used instead.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:03 -05:00
Graham Sider 56c5977eae drm/amdkfd: replace/remove remaining kgd_dev references
Remove get_amdgpu_device and other remaining kgd_dev references aside
from declaration/kfd struct entry and initialization.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider dff63da93e drm/amdkfd: replace kgd_dev in gpuvm amdgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_gpuvm_acquire_process_vm
- amdgpu_amdkfd_gpuvm_release_process_vm
- amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu
- amdgpu_amdkfd_gpuvm_free_memory_of_gpu
- amdgpu_amdkfd_gpuvm_map_memory_to_gpu
- amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu
- amdgpu_amdkfd_gpuvm_sync_memory
- amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel
- amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel
- amdgpu_amdkfd_gpuvm_get_vm_fault_info
- amdgpu_amdkfd_gpuvm_import_dmabuf
- amdgpu_amdkfd_get_tile_config

Removed:

- get_amdgpu_device

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider 574c4183ef drm/amdkfd: replace kgd_dev in get amdgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_get_fw_version
- amdgpu_amdkfd_get_local_mem_info
- amdgpu_amdkfd_get_gpu_clock_counter
- amdgpu_amdkfd_get_max_engine_clock_in_mhz
- amdgpu_amdkfd_get_cu_info
- amdgpu_amdkfd_get_dmabuf_info
- amdgpu_amdkfd_get_vram_usage
- amdgpu_amdkfd_get_hive_id
- amdgpu_amdkfd_get_unique_id
- amdgpu_amdkfd_get_mmio_remap_phys_addr
- amdgpu_amdkfd_get_num_gws
- amdgpu_amdkfd_get_asic_rev_id
- amdgpu_amdkfd_get_noretry
- amdgpu_amdkfd_get_xgmi_hops_count
- amdgpu_amdkfd_get_xgmi_bandwidth_mbytes
- amdgpu_amdkfd_get_pcie_bandwidth_mbytes

Also replaces kfd_device_by_kgd with kfd_device_by_adev, now
searching via adev rather than kgd.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider 6bfc7c7e17 drm/amdkfd: replace kgd_dev in various amgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_submit_ib
- amdgpu_amdkfd_set_compute_idle
- amdgpu_amdkfd_have_atomics_support
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_gpu_reset
- amdgpu_amdkfd_alloc_gtt_mem
- amdgpu_amdkfd_free_gtt_mem
- amdgpu_amdkfd_alloc_gws
- amdgpu_amdkfd_free_gws
- amdgpu_amdkfd_ras_poison_consumption_handler

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider 3356c38dc1 drm/amdkfd: replace kgd_dev in various kfd2kgd funcs
Modified definitions:

- program_sh_mem_settings
- set_pasid_vmid_mapping
- init_interrupts
- address_watch_disable
- address_watch_execute
- wave_control_execute
- address_watch_get_offset
- get_atc_vmid_pasid_mapping_info
- set_scratch_backing_va
- set_vm_context_page_table_base
- read_vmid_from_vmfault_reg
- get_cu_occupancy
- program_trap_handler_settings

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider 420185fdad drm/amdkfd: replace kgd_dev in hqd/mqd kfd2kgd funcs
Modified definitions:

- hqd_load
- hiq_mqd_load
- hqd_sdma_load
- hqd_dump
- hqd_sdma_dump
- hqd_is_occupied
- hqd_destroy
- hqd_sdma_is_occupied
- hqd_sdma_destroy

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider c531a58bb6 drm/amdkfd: replace kgd_dev in static gfx v10_3 funcs
Static funcs in amdgpu_amdkfd_gfx_v10_3.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider 4056b03377 drm/amdkfd: replace kgd_dev in static gfx v10 funcs
Static funcs in amdgpu_amdkfd_gfx_v10.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider 9a17c9b79b drm/amdkfd: replace kgd_dev in static gfx v9 funcs
Static funcs in amdgpu_amdkfd_gfx_v9.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider 1cca608742 drm/amdkfd: replace kgd_dev in static gfx v8 funcs
Static funcs in amdgpu_amdkfd_gfx_v8.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Graham Sider 9365fbf3d7 drm/amdkfd: replace kgd_dev in static gfx v7 funcs
Static funcs in amdgpu_amdkfd_gfx_v7.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Christian König fa78e367a2 drm/amdgpu: stop getting excl fence separately
Just grab all fences for the display flip in one go.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211028132630.2330-2-christian.koenig@amd.com
2021-11-17 14:32:27 +01:00
Linus Torvalds 304ac8032d drm next/fixes for 5.16-rc1
bridge:
 - HPD improvments for lt9611uxc
 - eDP aux-bus support for ps8640
 - LVDS data-mapping selection support
 
 ttm:
 - remove huge page functionality (needs reworking)
 - fix a race condition during BO eviction
 
 panels:
 - add some new panels
 
 fbdev:
 - fix double-free
 - remove unused scrolling acceleration
 - CONFIG_FB dep improvements
 
 locking:
 - improve contended locking logging
 - naming collision fix
 
 dma-buf:
 - add dma_resv_for_each_fence iterator
 - fix fence refcounting bug
 - name locking fixesA
 
 prime:
 - fix object references during mmap
 
 nouveau:
 - various code style changes
 - refcount fix
 - device removal fixes
 - protect client list with a mutex
 - fix CE0 address calculation
 
 i915:
 - DP rates related fixes
 - Revert disabling dual eDP that was causing state readout problems
 - put the cdclk vtables in const data
 - Fix DVO port type for older platforms
 - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
 - CCS FBs related fixes
 - Fix recursive lock in GuC submission
 - Revert guc_id from i915_request tracepoint
 - Build fix around dmabuf
 
 amdgpu:
 - GPU reset fix
 - Aldebaran fix
 - Yellow Carp fixes
 - DCN2.1 DMCUB fix
 - IOMMU regression fix for Picasso
 - DSC display fixes
 - BPC display calculation fixes
 - Other misc display fixes
 - Don't allow partial copy from user for DC debugfs
 - SRIOV fixes
 - GFX9 CSB pin count fix
 - Various IP version check fixes
 - DP 2.0 fixes
 - Limit DCN1 MPO fix to DCN1
 
 amdkfd:
 - SVM fixes
 - Fix gfx version for renoir
 - Reset fixes
 
 udl:
 - timeout fix
 
 imx:
 - circular locking fix
 
 virtio:
 - NULL ptr deref fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGN3YwACgkQDHTzWXnE
 hr6aZQ/+Pobf1VE7V3wPUcopxccJYmgBvG/uY8EDyjA8qaxHs2pQqGN2IooOGxr6
 F8G1N94Hem/PCDn3T8JI2Tqw5z4sy4UwLahEWISurFCen1IMAfA7hYfutp9X3O7X
 8h7b+PgkvVruEAHF7z0kqnWGPHmcro29cIHNkXVRjnJuz+Gmn1XRfo6Jj65n6D7u
 NfMeU4/lWRR3767oJQzTqyAYtGxsKaZT3/tBD5WggZBzEKC7hqhAl8EUoOLWwojo
 fDqwiEpLXpraPRIQH8trkXVHhzPeLAmG916WwS8JG3CEk9mUQ+I7Jshhd8cw+bsQ
 XPuk3OBfU9mtuiGgNzrLP3xXJZs/QN3EkpKZWLefTnJY+C4BgiP2RifTnghmwV31
 6/7Pr83CX/cn3BRd7r0xaeBZYvVYBZmwoZcsZFJBM8SVjd/ofKUfAmCzZZKheio2
 5qa6bj9DQoyjEoFAULh23plcX6hvATGP7wzfRTnJ9AlAJ0KyEjVJ3r0qE6jHMDc/
 uzcTAnKIWCxt9kSgE5qwLQtxLBaBpr/iOniZbCqGkPjiZeMzqP/ug1AKVP7kk39x
 FxZVT8ZOKk8Xt4iLZx8jmHi2KKheXYZi9LqieoTrJd44qMXDOmR9DCtQX9FZuWJS
 EJAlMj6sCowAZdODPZMVpoMc3Gti9nZ2Fpu7mLrRcMk1gKfjKwo=
 =qMNk
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm

Pull more drm updates from Dave Airlie:
 "I missed a drm-misc-next pull for the main pull last week. It wasn't
  that major and isn't the bulk of this at all. This has a bunch of
  fixes all over, a lot for amdgpu and i915.

  bridge:
   - HPD improvments for lt9611uxc
   - eDP aux-bus support for ps8640
   - LVDS data-mapping selection support

  ttm:
   - remove huge page functionality (needs reworking)
   - fix a race condition during BO eviction

  panels:
   - add some new panels

  fbdev:
   - fix double-free
   - remove unused scrolling acceleration
   - CONFIG_FB dep improvements

  locking:
   - improve contended locking logging
   - naming collision fix

  dma-buf:
   - add dma_resv_for_each_fence iterator
   - fix fence refcounting bug
   - name locking fixesA

  prime:
   - fix object references during mmap

  nouveau:
   - various code style changes
   - refcount fix
   - device removal fixes
   - protect client list with a mutex
   - fix CE0 address calculation

  i915:
   - DP rates related fixes
   - Revert disabling dual eDP that was causing state readout problems
   - put the cdclk vtables in const data
   - Fix DVO port type for older platforms
   - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
   - CCS FBs related fixes
   - Fix recursive lock in GuC submission
   - Revert guc_id from i915_request tracepoint
   - Build fix around dmabuf

  amdgpu:
   - GPU reset fix
   - Aldebaran fix
   - Yellow Carp fixes
   - DCN2.1 DMCUB fix
   - IOMMU regression fix for Picasso
   - DSC display fixes
   - BPC display calculation fixes
   - Other misc display fixes
   - Don't allow partial copy from user for DC debugfs
   - SRIOV fixes
   - GFX9 CSB pin count fix
   - Various IP version check fixes
   - DP 2.0 fixes
   - Limit DCN1 MPO fix to DCN1

  amdkfd:
   - SVM fixes
   - Fix gfx version for renoir
   - Reset fixes

  udl:
   - timeout fix

  imx:
   - circular locking fix

  virtio:
   - NULL ptr deref fix"

* tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm: (126 commits)
  drm/ttm: Double check mem_type of BO while eviction
  drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
  drm/amdgpu: drop jpeg IP initialization in SRIOV case
  drm/amd/display: reject both non-zero src_x and src_y only for DCN1x
  drm/amd/display: Add callbacks for DMUB HPD IRQ notifications
  drm/amd/display: Don't lock connection_mutex for DMUB HPD
  drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends
  drm/amdkfd: Fix retry fault drain race conditions
  drm/amdkfd: lower the VAs base offset to 8KB
  drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly
  drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
  drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
  drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages
  drm/i915/fb: Fix rounding error in subsampled plane size calculation
  drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown()
  drm/locking: fix __stack_depot_* name conflict
  drm/virtio: Fix NULL dereference error in virtio_gpu_poll
  drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
  drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
  drm/amd/amdkfd: Don't sent command to HWS on kfd reset
  ...
2021-11-12 12:11:07 -08:00
Dave Airlie 951bad0bd9 Merge tag 'amd-drm-fixes-5.16-2021-11-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-5.16-2021-11-10:

amdgpu:
- Don't allow partial copy from user for DC debugfs
- SRIOV fixes
- GFX9 CSB pin count fix
- Various IP version check fixes
- DP 2.0 fixes
- Limit DCN1 MPO fix to DCN1

amdkfd:
- SVM fixes
- Reset fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110222536.7527-1-alexander.deucher@amd.com
2021-11-11 10:14:44 +10:00
Dave Airlie f8ca7b7419 Removed the TTM Huge Page functionnality to address a crash, a timeout
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
 locking depency in imx, a NULL pointer dereference fix for virtio, and a
 naming collision fix for drm/locking.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYYuA2wAKCRDj7w1vZxhR
 xQdZAP4oqiWpCO1JfnZdvEJ/lOULqvdzYkbUZexshGLdbb4ECwEA83TzIbQvXP8p
 jsC1hPNAIsOBkQ+nGZwJkTOtDcpEAQ4=
 =N2pK
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-fixes-2021-11-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Removed the TTM Huge Page functionnality to address a crash, a timeout
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
locking depency in imx, a NULL pointer dereference fix for virtio, and a
naming collision fix for drm/locking.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110082114.vfpkpnecwdfg27lk@gilmour
2021-11-11 08:14:19 +10:00
Guchun Chen 4d395f938a drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
Fixes: 96b8dd4423 ("drm/amdgpu/amdgpu_vcn: convert to IP version checking")
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 12:03:41 -05:00
Guchun Chen b45a36032d drm/amdgpu: drop jpeg IP initialization in SRIOV case
Fixes: b05b9c591f ("drm/amdgpu: clean up set IP function")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 12:00:08 -05:00
shaoyunl 9f4f2c1a35 drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
The KFD pre_reset should be called before reset been executed, it will
hold the lock to prevent other rocm process to sent the packlage to hiq
during host execute the real reset on the HW

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Evan Quan 4fc30ea780 drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
There was a change(below) target for such issue:
d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")
But the fix for VI ASICs was missing there. This is a supplement for
that.

Fixes: d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:06:15 -05:00
Alex Deucher 2d32ffd6e9 drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not
set.

Fixes: f7f12b2582 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:53 -04:00
Felix Kuehling 5702d05295 drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback then the amdgpu_bo is freed.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-By: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:45 -04:00
Evan Quan e6ef9b396b drm/amdgpu: correctly toggle gfx on/off around RLC_SPM_* register access
As part of the ib padding process, accessing the RLC_SPM_* register may
trigger gfx hang. Since gfxoff may be already kicked during the whole period.
To address that, we manually toggle gfx on/off around the RLC_SPM_*
register access.

This can resolve the gfx hang issue observed on running Talos with RDP launched
in parallel.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:29 -04:00
Tao Zhou 7513c9ff44 drm/amdgpu: correct xgmi ras error count reset
The error count reset for xgmi3x16 pcs is missed.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:21 -04:00
YuBiao Wang 6ddc0eb7a2 drm/amd/amdgpu: Fix csb.bo pin_count leak on gfx 9
[Why]
csb bo is not unpinned in gfx 9. It will lead to pin_count leak on
driver unload.

[How]
Call bo_free_kernel corresponding to bo_create_kernel in
gfx_rlc_init_csb. This will also unify the code path with other gfx
versions.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:32 -04:00
YuBiao Wang c4fc13b581 drm/amd/amdgpu: Avoid writing GMC registers under sriov in gmc9
[Why]
For Vega10, disabling gart of gfxhub could mess up KIQ and PSP
under sriov mode, and lead to DMAR on host side.

[How]
Skip writing GMC registers under sriov.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:20 -04:00
Kent Russell 7ef6b7f844 drm/amdgpu: Make sure to reserve BOs before adding or removing
BOs need to be reserved before they are added or removed, so ensure that
they are reserved during kfd_mem_attach and kfd_mem_detach

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:05 -04:00
Jason Gunthorpe 0d97950953 drm/ttm: remove ttm_bo_vm_insert_huge()
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.

get_user_pages_fast() considers any page that passed pmd_huge() as
usable:

	if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
		     pmd_devmap(pmd))) {

And vmf_insert_pfn_pmd_prot() unconditionally sets

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));

eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.

As such gup_huge_pmd() will try to deref a struct page:

	head = try_grab_compound_head(pmd_page(orig), refs, flags);

and thus crash.

Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.

Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.

Fixes: 314b6580ad ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' &Christian's review]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com
2021-11-05 11:13:19 +01:00
Linus Torvalds 5c904c66ed Char/Misc driver update for 5.16-rc1
Here is the big set of char and misc and other tiny driver subsystem
 updates for 5.16-rc1.
 
 Loads of things in here, all of which have been in linux-next for a
 while with no reported problems (except for one called out below.)
 
 Included are:
 	- habanana labs driver updates, including dma_buf usage,
 	  reviewed and acked by the dma_buf maintainers
 	- iio driver update (going through this tree not staging as they
 	  really do not belong going through that tree anymore)
 	- counter driver updates
 	- hwmon driver updates that the counter drivers needed, acked by
 	  the hwmon maintainer
 	- xillybus driver updates
 	- binder driver updates
 	- extcon driver updates
 	- dma_buf module namespaces added (will cause a build error in
 	  arm64 for allmodconfig, but that change is on its way through
 	  the drm tree)
 	- lkdtm driver updates
 	- pvpanic driver updates
 	- phy driver updates
 	- virt acrn and nitr_enclaves driver updates
 	- smaller char and misc driver updates
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPX2A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymUUgCbB4EKysgLuXYdjUalZDx+vvZO4k0AniS14O4k
 F+2dVSZ5WX6wumUzCaA6
 =bXQM
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char and misc and other tiny driver subsystem
  updates for 5.16-rc1.

  Loads of things in here, all of which have been in linux-next for a
  while with no reported problems (except for one called out below.)

  Included are:

   - habanana labs driver updates, including dma_buf usage, reviewed and
     acked by the dma_buf maintainers

   - iio driver update (going through this tree not staging as they
     really do not belong going through that tree anymore)

   - counter driver updates

   - hwmon driver updates that the counter drivers needed, acked by the
     hwmon maintainer

   - xillybus driver updates

   - binder driver updates

   - extcon driver updates

   - dma_buf module namespaces added (will cause a build error in arm64
     for allmodconfig, but that change is on its way through the drm
     tree)

   - lkdtm driver updates

   - pvpanic driver updates

   - phy driver updates

   - virt acrn and nitr_enclaves driver updates

   - smaller char and misc driver updates"

* tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits)
  comedi: dt9812: fix DMA buffers on stack
  comedi: ni_usb6501: fix NULL-deref in command paths
  arm64: errata: Enable TRBE workaround for write to out-of-range address
  arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  coresight: trbe: Work around write to out of range
  coresight: trbe: Make sure we have enough space
  coresight: trbe: Add a helper to determine the minimum buffer size
  coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  coresight: trbe: Add infrastructure for Errata handling
  coresight: trbe: Allow driver to choose a different alignment
  coresight: trbe: Decouple buffer base from the hardware base
  coresight: trbe: Add a helper to pad a given buffer area
  coresight: trbe: Add a helper to calculate the trace generated
  coresight: trbe: Defer the probe on offline CPUs
  coresight: trbe: Fix incorrect access of the sink specific data
  coresight: etm4x: Add ETM PID for Kryo-5XX
  coresight: trbe: Prohibit trace before disabling TRBE
  coresight: trbe: End the AUX handle on truncation
  coresight: trbe: Do not truncate buffer on IRQ
  coresight: trbe: Fix handling of spurious interrupts
  ...
2021-11-04 08:21:47 -07:00
James Zhu 93cec18478 drm/amdgpu: remove duplicated kfd_resume_iommu
Remove duplicated kfd_resume_iommu which already runs
in mdgpu_amdkfd_device_init.

Tested-By: Ken Moffat <zarniwhoop@ntlworld.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:08 -04:00
Aaron Liu e8a423c589 drm/amdgpu: update RLC_PG_DELAY_3 Value to 200us for yellow carp
For yellow carp, the desired CGPG hysteresis value is 0x4E20.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:08 -04:00
Mario Limonciello c92f909614 drm/amdgpu: Convert SMU version to decimal in debugfs
This is more useful when talking to the SMU team to have the information
in this format, save one less step to manually do it.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Oak Zeng 72c148d776 drm/amdgpu: use correct register mask to extract field
Aldebaran has different register mask definitions for
regiter MC_VM_XGMI_LFB_CNTL. Use the correct masks
to interpret fields of this register.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Jingwen Chen 38d4e4638e drm/amd/amdgpu: fix bad job hw_fence use after free in advance tdr
[Why]
In advance tdr mode, the real bad job will be resubmitted twice, while
in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job
is put one more time than other jobs.

[How]
Adding dma_fence_get before resbumit job in
amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-03 12:22:07 -04:00
Linus Torvalds 56d3375448 drm for 5.16-rc1
core:
 - improve dma_fence, lease and resv documentation
 - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin
 - sched fixes/improvements
 - allow empty drm leases
 - add dma resv iterator
 - add more DP 2.0 headers
 - DP MST helper improvements for DP2.0
 
 dma-buf:
 - avoid warnings, remove fence trace macros
 
 bridge:
 - new helper to get rid of panels
 - probe improvements for it66121
 - enable DSI EOTP for anx7625
 
 fbdev:
 - efifb: release runtime PM on destroy
 
 ttm:
 - kerneldoc switch
 - helper to clear all DMA mappings
 - pool shrinker optimizaton
 - remove ttm_tt_destroy_common
 - update ttm_move_memcpy for async use
 
 panel:
 - add new panel-edp driver
 
 amdgpu:
  - Initial DP 2.0 support
  - Initial USB4 DP tunnelling support
  - Aldebaran MCE support
  - Modifier support for DCC image stores for GFX 10.3
  - Display rework for better FP code handling
  - Yellow Carp/Cyan Skillfish updates
  - Cyan Skillfish display support
  - convert vega/navi to IP discovery asic enumeration
  - validate IP discovery table
  - RAS improvements
  - Lots of fixes
 
  i915:
  - DG1 PCI IDs + LMEM discovery/placement
  - DG1 GuC submission by default
  - ADL-S PCI IDs updated + enabled by default
  - ADL-P (XE_LPD) fixed and updates
  - DG2 display fixes
  - PXP protected object support for Gen12 integrated
  - expose multi-LRC submission interface for GuC
  - export logical engine instance to user
  - Disable engine bonding on Gen12+
  - PSR cleanup
  - PSR2 selective fetch by default
  - DP 2.0 prep work
  - VESA vendor block + MSO use of it
  - FBC refactor
  - try again to fix fast-narrow vs slow-wide eDP training
  - use THP when IOMMU enabled
  - LMEM backup/restore for suspend/resume
  - locking simplification
  - GuC major reworking
  - async flip VT-D workaround changes
  - DP link training improvements
  - misc display refactorings
 
 bochs:
 - new PCI ID
 
 rcar-du:
 - Non-contiguious buffer import support for rcar-du
 - r8a779a0 support prep
 
 omapdrm:
 - COMPILE_TEST fixes
 
 sti:
 - COMPILE_TEST fixes
 
 msm:
 - fence ordering improvements
 - eDP support in DP sub-driver
 - dpu irq handling cleanup
 - CRC support for making igt happy
 - NO_CONNECTOR bridge support
 - dsi: 14nm phy support for msm8953
 - mdp5: msm8x53, sdm450, sdm632 support
 
 stm:
 - layer alpha + zpo support
 
 v3d:
 - fix Vulkan CTS failure
 - support multiple sync objects
 
 gud:
 - add R8/RGB332/RGB888 pixel formats
 
 vc4:
 - convert to new bridge helpers
 
 vgem:
 - use shmem helpers
 
 virtio:
 - support mapping exported vram
 
 zte:
 - remove obsolete driver
 
 rockchip:
 - use bridge attach no connector for LVDS/RGB
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmGByPYACgkQDHTzWXnE
 hr6fxA//cXUvTHlEtF7UJDBRAYv+9lXH39NbGYU4aLJuBNlZztCuUi5JOSyDFDH1
 N9VI5biVseev2PEnCzJUubWxTqbUO7FBQTw0TyvZ4Eqn+UZMuFeo0dvdKZRAkvjV
 VHSUc0fm0+WSYanKUK7XK0fwG8aE6JVyYngzgKPSjifhszTdiiRsbU21iTinFhkS
 rgh3HEVELp+LqfoG4qzAYqFUjYqUjvCjd/hX/UkzCII8ZXKr38/4127e95443WOk
 +jes0gWGJe9TvSDrqo9TMx4qukcOniINFUvnzoD2RhOS+Jzr/i5rBh51Xy92g3NO
 Q7hy6byZdk/ZO/MXCDQ2giUOkBiqn5fQjlRGQp4iAZYw9pb3HU+/xrTq0BWVWd8o
 /vmzZYEKKU/sCGpxVDMZxsHV3mXIuVBvuZq6bjmSGcybgOBCiDx5F/Rum4nY2yHp
 lr3cuc0HP3m3f4b/HVvACO4tGd1nDDpVcon7CuhBB7HB7t6Zl9u18qc/qFw0tCTh
 3sgAhno6XFXtPFcSX2KAeeg0mhKDKKrsOnq5y3bDRr05Z0jLocJk95aXEKs6em4j
 gbyHwNaX3CHtiCnFn2/5169+n1K7zqHBtVSGmQlmFDv55rcdx7L3Spk7tCahQeSQ
 ur24r+sEggm8d5Wjl+MYq6wW3oP31s04JFaeV6oCkaSp1wS+alg=
 =jdhH
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Summary below. i915 starts to add support for DG2 GPUs, enables DG1
  and ADL-S support by default, lots of work to enable DisplayPort 2.0
  across drivers. Lots of documentation updates and fixes across the
  board.

  core:
   - improve dma_fence, lease and resv documentation
   - shmem-helpers: allocate WC pages on x86, use vmf_insert_pin
   - sched fixes/improvements
   - allow empty drm leases
   - add dma resv iterator
   - add more DP 2.0 headers
   - DP MST helper improvements for DP2.0

  dma-buf:
   - avoid warnings, remove fence trace macros

  bridge:
   - new helper to get rid of panels
   - probe improvements for it66121
   - enable DSI EOTP for anx7625

  fbdev:
   - efifb: release runtime PM on destroy

  ttm:
   - kerneldoc switch
   - helper to clear all DMA mappings
   - pool shrinker optimizaton
   - remove ttm_tt_destroy_common
   - update ttm_move_memcpy for async use

  panel:
   - add new panel-edp driver

  amdgpu:
   - Initial DP 2.0 support
   - Initial USB4 DP tunnelling support
   - Aldebaran MCE support
   - Modifier support for DCC image stores for GFX 10.3
   - Display rework for better FP code handling
   - Yellow Carp/Cyan Skillfish updates
   - Cyan Skillfish display support
   - convert vega/navi to IP discovery asic enumeration
   - validate IP discovery table
   - RAS improvements
   - Lots of fixes

  i915:
   - DG1 PCI IDs + LMEM discovery/placement
   - DG1 GuC submission by default
   - ADL-S PCI IDs updated + enabled by default
   - ADL-P (XE_LPD) fixed and updates
   - DG2 display fixes
   - PXP protected object support for Gen12 integrated
   - expose multi-LRC submission interface for GuC
   - export logical engine instance to user
   - Disable engine bonding on Gen12+
   - PSR cleanup
   - PSR2 selective fetch by default
   - DP 2.0 prep work
   - VESA vendor block + MSO use of it
   - FBC refactor
   - try again to fix fast-narrow vs slow-wide eDP training
   - use THP when IOMMU enabled
   - LMEM backup/restore for suspend/resume
   - locking simplification
   - GuC major reworking
   - async flip VT-D workaround changes
   - DP link training improvements
   - misc display refactorings

  bochs:
   - new PCI ID

  rcar-du:
   - Non-contiguious buffer import support for rcar-du
   - r8a779a0 support prep

  omapdrm:
   - COMPILE_TEST fixes

  sti:
   - COMPILE_TEST fixes

  msm:
   - fence ordering improvements
   - eDP support in DP sub-driver
   - dpu irq handling cleanup
   - CRC support for making igt happy
   - NO_CONNECTOR bridge support
   - dsi: 14nm phy support for msm8953
   - mdp5: msm8x53, sdm450, sdm632 support

  stm:
   - layer alpha + zpo support

  v3d:
   - fix Vulkan CTS failure
   - support multiple sync objects

  gud:
   - add R8/RGB332/RGB888 pixel formats

  vc4:
   - convert to new bridge helpers

  vgem:
   - use shmem helpers

  virtio:
   - support mapping exported vram

  zte:
   - remove obsolete driver

  rockchip:
   - use bridge attach no connector for LVDS/RGB"

* tag 'drm-next-2021-11-03' of git://anongit.freedesktop.org/drm/drm: (1259 commits)
  drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
  drm/amd/display: MST support for DPIA
  drm/amdgpu: Fix even more out of bound writes from debugfs
  drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts
  drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts
  drm/amdgpu/UAPI: rearrange header to better align related items
  drm/amd/display: Enable dpia in dmub only for DCN31 B0
  drm/amd/display: Fix USB4 hot plug crash issue
  drm/amd/display: Fix deadlock when falling back to v2 from v3
  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
  drm/amd/display: move FPU associated DCN301 code to DML folder
  drm/amd/display: fix link training regression for 1 or 2 lane
  drm/amd/display: add two lane settings training options
  drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings
  drm/amd/display: implement decide lane settings
  drm/amd/display: adopt DP2.0 LT SCR revision 8
  drm/amd/display: FEC configuration for dpia links in MST mode
  drm/amd/display: FEC configuration for dpia links
  drm/amd/display: Add workaround flag for EDID read on certain docks
  drm/amd/display: Set phy_mux_sel bit in dmub scratch register
  ...
2021-11-02 16:47:49 -07:00
Linus Torvalds 6e5772c8d9 Add an interface called cc_platform_has() which is supposed to be used
by confidential computing solutions to query different aspects of the
 system. The intent behind it is to unify testing of such aspects instead
 of having each confidential computing solution add its own set of tests
 to code paths in the kernel, leading to an unwieldy mess.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/uLUACgkQEsHwGGHe
 VUqGbQ/+LOmz8hmL5vtbXw/lVonCSBRKI2KVefnN2VtQ3rjtCq8HlNoq/hAdi15O
 WntABFV8u4daNAcssp+H/p+c8Mt/NzQa60TRooC5ZIynSOCj4oZQxTWjcnR4Qxrf
 oABy4sp09zNW31qExtTVTwPC/Ejzv4hA0Vqt9TLQOSxp7oYVYKeDJNp79VJK64Yz
 Ky7epgg8Pauk0tAT76ATR4kyy9PLGe4/Ry0bOtAptO4NShL1RyRgI0ywUmptJHSw
 FV/MnoexdAs4V8+4zPwyOkf8YMDnhbJcvFcr7Yd9AEz2q9Z1wKCgi1M3aZIoW8lV
 YMXECMGe9DfxmEJbnP5zbnL6eF32x+tbq+fK8Ye4V2fBucpWd27zkcTXjoP+Y+zH
 NLg+9QykR9QCH75YCOXcAg1Q5hSmc4DaWuJymKjT+W7MKs89ywjq+ybIBpLBHbQe
 uN9FM/CEKXx8nQwpNQc7mdUE5sZeCQ875028RaLbLx3/b6uwT6rBlNJfxl/uxmcZ
 iF1kG7Cx4uO+7G1a9EWgxtWiJQ8GiZO7PMCqEdwIymLIrlNksAk7nX2SXTuH5jIZ
 YDuBj/Xz2UUVWYFm88fV5c4ogiFlm9Jeo140Zua/BPdDJd2VOP013rYxzFE/rVSF
 SM2riJxCxkva8Fb+8TNiH42AMhPMSpUt1Nmd1H2rcEABRiT83Ow=
 =Na0U
 -----END PGP SIGNATURE-----

Merge tag 'x86_cc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull generic confidential computing updates from Borislav Petkov:
 "Add an interface called cc_platform_has() which is supposed to be used
  by confidential computing solutions to query different aspects of the
  system.

  The intent behind it is to unify testing of such aspects instead of
  having each confidential computing solution add its own set of tests
  to code paths in the kernel, leading to an unwieldy mess"

* tag 'x86_cc_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
  x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
  x86/sev: Replace occurrences of sev_active() with cc_platform_has()
  x86/sme: Replace occurrences of sme_active() with cc_platform_has()
  powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
  x86/sev: Add an x86 version of cc_platform_has()
  arch/cc: Introduce a function to check for confidential computing features
  x86/ioremap: Selectively build arch override encryption functions
2021-11-01 15:16:52 -07:00
Alex Deucher 403475be6d drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits
The DMA mask on SI parts is 40 bits not 44.  Copy
paste typo.

Fixes: 244511f386 ("drm/amdgpu: simplify and cleanup setting the dma mask")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:27:00 -04:00
Alex Deucher 58f8c7fa88 drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts
Add secondary instance version info for soc15 parts.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Alex Deucher 074b2092d9 drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts
Add secondary instance version info for vega20, arcturure, and
aldebaran.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:59 -04:00
Tao Zhou 3d1a8d950d drm/amdgpu: remove GPRs init for ALDEBARAN in gpu reset (v3)
Remove GPRs init for ALDEBARAN in gpu reset temporarily, will add the init once the
algorithm is stable.

v2: Only remove GPRs init in gpu reset.
v3: Suspend needs it, only skip it in gpu reset.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:13 -04:00
Tao Zhou f7e053435c drm/amdgpu: skip GPRs init for some CU settings on ALDEBARAN
Skip GPRs init in specific condition since current GPRs init algorithm only works for some CU settings.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:13 -04:00
Candice Li 4320e6f86d drm/amdgpu: Update TA version output in driver
TA version should only be displayed in firmware version column.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Lang Yu a5c5d8d50e drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()
amdgpu_fence_driver_sw_fini() should be executed before
amdgpu_device_ip_fini(), otherwise fence driver resource
won't be properly freed as adev->rings have been tore down.

Fixes: 72c8c97b15 ("drm/amdgpu: Split amdgpu_device_fini into early and late")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Lang Yu 68df0f195a drm/amdkfd: Separate pinned BOs destruction from general routine
Currently, all kfd BOs use same destruction routine. But pinned
BOs are not unpinned properly. Separate them from general routine.

v2 (Felix):
Add safeguard to prevent user space from freeing signal BO.
Kunmap signal BO in the event of setting event page error.
Just kunmap signal BO to avoid duplicating the code.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Philip Yang 3b8a23ae52 drm/amdkfd: restore userptr ignore bad address error
The userptr can be unmapped by application and still registered to
driver, restore userptr work return user pages will get -EFAULT bad
address error. Pretend this error as succeed. GPU access this userptr
will have VM fault later, it is better than application soft hangs with
stalled user mode queues.

v2: squash in warning fix (Alex)

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Kent Russell 68daadf3d6 drm/amdgpu: Add kernel parameter support for ignoring bad page threshold
When a GPU hits the bad_page_threshold, it will not be initialized by
the amdgpu driver. This means that the table cannot be cleared, nor can
information gathering be performed (getting serial number, BDF, etc).

If the bad_page_threshold kernel parameter is set to -2,
continue to initialize the GPU, while printing a warning to dmesg that
this action has been done

v2: squash in Luben's fix to restore RAS info reporting

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Kent Russell 8483fdfea7 drm/amdgpu: Warn when bad pages approaches 90% threshold
dmesg doesn't warn when the number of bad pages approaches the
threshold for page retirement. WARN when the number of bad pages
is at 90% or greater for easier checks and planning, instead of waiting
until the GPU is full of bad pages.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28 14:26:12 -04:00
Dave Airlie 970eae1560 Linux 5.15-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF298ceHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGIJYH/1rsEFQQ6caeQdy1
 z9eFIe48DNM4l7bFk+qEj2UAbzPdahVJ299Mg5fW0n2CDemOc9/n0b9TxQ37YObi
 mOzu0xwJVupIxkyFMPQSSc2q8aLm67NSpJy08DsmaNses5hSvu8x15RPHLQTybjt
 SwtKns+jpCq79P1GWbrB5e5UkLb0VNoxNp4L1U4pMrYGcEkJUXbaxNY2V/JcXdM7
 Vtn+qN0T/J6V6QVftv0t8Ecj3bjEnmL3kZHaTaNg3dGeKRpCGyHc5lcBQ0cNFG6t
 vjZ9VbuhBzGI3TN2tHH5hpA1UXo7HPBBCwQqxF1jeGLGHULikYwZ3TAPWqL3QZqC
 9cxr9SY=
 =p75d
 -----END PGP SIGNATURE-----

BackMerge tag 'v5.15-rc7' into drm-next

The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-10-28 14:59:38 +10:00
Maxime Ripard 736638246e
Merge drm/drm-next into drm-misc-next
drm-misc-next hasn't been updated in a while and I need a post -rc2
state to merge some vc4 patches.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-10-25 15:27:56 +02:00
Greg Kroah-Hartman 16b0314aa7 dma-buf: move dma-buf symbols into the DMA_BUF module namespace
In order to better track where in the kernel the dma-buf code is used,
put the symbols in the namespace DMA_BUF and modify all users of the
symbols to properly import the namespace to not break the build at the
same time.

Now the output of modinfo shows the use of these symbols, making it
easier to watch for users over time:

$ modinfo drivers/misc/fastrpc.ko | grep import
import_ns:      DMA_BUF

Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 14:53:08 +02:00
Alex Deucher df9feb1a69 drm/amdgpu/nbio7.4: use original HDP_FLUSH bits
The extended bits were not available for use on vega20 and
presumably arcturus as well.

Fixes: a0f9f85466 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12")
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-22 10:11:41 -04:00
Christian König 25b8a14e88 drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable
Simplifying the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-13-christian.koenig@amd.com
2021-10-22 14:41:59 +02:00
Christian König 930ca2a7cb drm/amdgpu: use the new iterator in amdgpu_sync_resv
Simplifying the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-12-christian.koenig@amd.com
2021-10-22 14:41:07 +02:00
Alex Deucher 8cbc52c207 drm/amdgpu: Workaround harvesting info for some navy flounder boards
Some navy flounder boards do not properly mark harvested
VCN instances.  Fix that here.

v2: use IP versions

Fixes: 1b592d00b4 ("drm/amdgpu/vcn: remove manual instance setting")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1743
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:39:00 -04:00
Alex Deucher 47be978be0 drm/amdgpu/vcn3.0: remove intermediate variable
No need to use the id variable, just use the constant
plus instance offset directly.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:57 -04:00
Alex Deucher 7876c7ea14 drm/amdgpu/vcn2.0: remove intermediate variable
No need to use the tmp variable, just use the constant
directly.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:53 -04:00
Alex Deucher c5dd5667f4 drm/amdgpu: Consolidate VCN firmware setup code
Roughly the same code was present in all VCN versions.
Consolidate it into a single function.

v2: use AMDGPU_UCODE_ID_VCN + i, check if num_inst >= 2

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
2021-10-21 23:38:46 -04:00
Alex Deucher e8ac9e93b4 drm/amdgpu/vcn3.0: handle harvesting in firmware setup
Only enable firmware for the instance that is enabled.

v2: use AMDGPU_UCODE_ID_VCN + i

Fixes: 1b592d00b4 ("drm/amdgpu/vcn: remove manual instance setting")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1743
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:41 -04:00
Jingwen Chen e77f0f5c6a drm/amd/amdgpu: add dummy_page_addr to sriov msg
Add dummy_page_addr to sriov msg for host driver to set
GCVM_L2_PROTECTION_DEFAULT_ADDR* registers correctly.

v2:
should update vf2pf msg instead

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:16 -04:00
Huang Rui a61794bd2f drm/amdgpu: remove grbm cam index/data operations for gfx v10
PSP firmware will be responsible for applying the GRBM CAM remapping in
the production. And the GRBM_CAM_INDEX / GRBM_CAM_DATA registers will be
protected by PSP under security policy. So remove it according to the
new security policy.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-21 23:38:10 -04:00
Aaron Liu 53c2ff8bcb drm/amdgpu: support B0&B1 external revision id for yellow carp
B0 internal rev_id is 0x01, B1 internal rev_id is 0x02.
The external rev_id for B0 and B1 is 0x20.
The original expression is not suitable for B1.

v2: squash in fix for display code (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-20 15:27:31 -04:00
Kent Russell dcd5ea9f94 drm/amdgpu: Clarify error when hitting bad page threshold
Change the error message when the bad_page_threshold is reached,
explicitly stating that the GPU will not be initialized.

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mukul Joshi <Mukul.Joshi@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher 0d055f09e1 drm/amdgpu: drop navi reg init functions
No longer used since IP enumeration is driven by the IP
discovery table now.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher bf99b9b032 drm/amdgpu: drop nv_set_ip_blocks()
No longer used since IP enumeration is now driven by
amdgpu IP discovery code.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher 7092432e3c drm/amdgpu: drop soc15_set_ip_blocks()
No longer used since IP enumeration is now driven by
amdgpu IP discovery code.

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:57 -04:00
Alex Deucher c9c7d18045 drm/amdgpu/gfx10: fix typo in gfx_v10_0_update_gfx_clock_gating()
Check was incorrectly converted to IP version checking.

Fixes: 4b0ad84254 ("drm/amdgpu/gfx10: convert to IP version checking")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Qing Wang 40320159f0 drm/amdgpu: replace snprintf in show functions with sysfs_emit
show() must not use snprintf() when formatting the value to be
returned to user space.

Fix the following coccicheck warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:427:
WARNING: use scnprintf or sprintf.

Signed-off-by: Qing Wang <wangqing@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Aaron Liu 5efacdf072 drm/amdgpu: support B0&B1 external revision id for yellow carp
B0 internal rev_id is 0x01, B1 internal rev_id is 0x02.
The external rev_id for B0 and B1 is 0x20.
The original expression is not suitable for B1.

v2: squash in fix for display code (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-20 11:43:56 -04:00
Christian König a0a8e75948 drm/amdgpu: use new iterator in amdgpu_vm_prt_fini
No need to actually allocate an array of fences here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-14-christian.koenig@amd.com
2021-10-20 14:06:35 +02:00
Guchun Chen dac35c4239 drm/amdgpu/discovery: parse hw_id_name for SDMA instance 2 and 3
Otherwise, hw_id_name string is NULL for SDMA 2 and 3 when dumping
ip version from VBIOS.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:32:52 -04:00
Tao Zhou 42f88ab772 drm/amdgpu: output warning for unsupported ras error inject (v2)
Output a warning message if RAS TA returns UNSUPPORTED_ERROR_INJ status.

v2: implement it in psp_ras_ta_check_status function.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:32:52 -04:00
Tao Zhou 1b5254e8d9 drm/amdgpu: centralize checking for RAS TA status
Create new function to check status returned by RAS TA.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:32:52 -04:00
YuBiao Wang a3848df60b drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early
[Why]
drm_irq_uninstall is called in irq_fini_hw so that irq is disabled in sw
stage. SMU (and maybe other IP blocks) fini_hw will call irq_put for
cleanup and the whole cleanup process will be skipped because of
drm->irq_enable = false.

[How]
Move ip_fini_early before irq_fini_hw.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:16:50 -04:00
Tao Zhou c72942c167 drm/amdgpu: load PSP RL in resume path
Some registers' access will fail without PSP RL after resume.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:14:33 -04:00
Lang Yu 5aeeac6fa3 drm/amdkfd: Fix an inappropriate error handling in allloc memory of gpu
We should unreference a gem object instead of an amdgpu bo here.

Fixes: fd9a9f8801 ("drm/amdgpu: Use GEM obj reference for KFD BOs")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-19 17:13:48 -04:00
Alex Deucher 48737ac4d7 drm/amdgpu/psp: add some missing cases to psp_check_pmfw_centralized_cstate_management
Missed a few asics.

v2: update comment

Fixes: 82d05736c4 ("drm/amdgpu/amdgpu_psp: convert to IP version checking")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 22:51:41 -04:00
Philip Yang 43fc10c187 drm/amdkfd: unregistered svm range not overlap with TTM range
When creating unregistered new svm range to recover retry fault, avoid
new svm range to overlap with ranges or userptr ranges managed by TTM,
otherwise svm migration will trigger TTM or userptr eviction, to evict
user queues unexpectedly.

Change helper amdgpu_ttm_tt_affect_userptr to return userptr which is
inside the range. Add helper svm_range_check_vm_userptr to scan all
userptr of the vm, and return overlap userptr bo start, last.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 22:20:13 -04:00
Yifan Zhang afd18180c0 drm/amdkfd: fix boot failure when iommu is disabled in Picasso.
When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2
init will fail. But this failure should not block amdgpu driver init.

Reported-by: youling <youling257@gmail.com>
Tested-by: youling <youling257@gmail.com>
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:15:52 -04:00
Mukul Joshi 91a1a52d03 drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran
During mode2 reset, the GPU is temporarily removed from the
mgpu_info list. As a result, page retirement fails because it
cannot find the GPU in the GPU list.
To fix this, create our own list of GPUs that support MCE notifier
based page retirement and use that list to check if the UMC error
occurred on a GPU that supports MCE notifier based page retirement.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:48 -04:00
Mukul Joshi a4967a1ebf drm/amdgpu: Enable RAS error injection after mode2 reset on Aldebaran
Add the missing call to re-enable RAS error injections on the Aldebaran
mode2 reset code path.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:35 -04:00
Lang Yu fe04957e26 drm/amdgpu: enable display for cyan skillfish
Display support for cyan skillfish is ready now. Enable it!

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:34 -04:00
Alex Deucher 369b7d04ba drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12
It's used internally by firmware.  Using it in the driver
could conflict with firmware.

v2: squash in fix for navi1x (Alex)

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-13 14:14:34 -04:00
Alex Deucher a0f9f85466 drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12
It's used internally by firmware.  Using it in the driver
could conflict with firmware.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-11 11:57:00 -04:00
Dave Airlie 797d72ce8e drm-misc-next for v5.16:
UAPI Changes:
 - Allow empty drm leases for creating separate GEM namespaces.
 
 Cross-subsystem Changes:
 - Slightly rework dma_buf_poll.
 - Add dma_resv_for_each_fence_unlocked to iterate, and use it inside
   the lockless dma-resv functions.
 
 Core Changes:
 - Allow devm_drm_of_get_bridge to build without CONFIG_OF for compile testing.
 - Add more DP2 headers.
 - fix CONFIG_FB dependency in fb_helper.
 - Add DRM_FORMAT_R8 to drm_format_info, and helpers for RGB332 and RGB888.
 - Fix crash on a 0 or invalid EDID.
 
 Driver Changes:
 - Apply and revert DRM_MODESET_LOCK_ALL_BEGIN.
 - Add mode_valid to ti-sn65dsi86 bridge.
 - Support multiple syncobjs in v3d.
 - Add R8, RGB332 and RGB888 pixel formats to GUD.
 - Use devm_add_action_or_reset in dw-hdmi-cec.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmFdfuwACgkQ/lWMcqZw
 E8OTgg/+Nmsqhj1tsbSCWF1yx81CXHVSOhExPaMl+GPs6+y+sZ+U2rN99dnbULvA
 U56eOmjc8FvgmK89BwhSYNt++QYIRRpzjBGlCYm4bwpgqFOmYsK+en35PYMwHdxM
 Ke8newhzqa6/detvjX52igddZzrBv1Cs8aXuV5rw7Dg0ivlSlQUV0MO8JYwCliWI
 arRT8bg7wzUzhyRZqwqOqKXjvRirqBlFjJmvfL0WgHevZbzYuXbn4eWCUgCVthMH
 pU9QgK6FMW912pBxVppDO2aTDmNvqwj1BsB3RFfRuqS/JJ4s/gf39JxsipnI+/qn
 kPxZVFzzonR8Nl6h9sPi1jZrcVDCBebFgyG8hSgIVb/09U7AVYomtP18VKeh8yCy
 Pp4iQINqOcyMPmXKF491LIL92dcXZAIRaRQFKc/ZSHcfIDA7ZB1+7zf1ixBjlxjP
 GqtjLbmPspI2DzBRlTFEdf58jvX70E5nFYdQyYcy3VprJHuqEgL5PKz2Xcnve6R0
 dEkGA2vMrGtb23YyjbFTNfkdvg9WYXze9HbQLt7kc8mI77TugkG0/rCcwv5pEEu3
 WSwqMeb+5H+7va4AI715MoXbxgnCba2zPTUm1s8kSqTK0Oighc/vWcnnJ4iVuEGE
 8Xt8AIIYUtccufR6ujucVUh7nju2ZOnFE7S92LybnGnByAIADfM=
 =qxpr
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-10-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.16:

UAPI Changes:
- Allow empty drm leases for creating separate GEM namespaces.

Cross-subsystem Changes:
- Slightly rework dma_buf_poll.
- Add dma_resv_for_each_fence_unlocked to iterate, and use it inside
  the lockless dma-resv functions.

Core Changes:
- Allow devm_drm_of_get_bridge to build without CONFIG_OF for compile testing.
- Add more DP2 headers.
- fix CONFIG_FB dependency in fb_helper.
- Add DRM_FORMAT_R8 to drm_format_info, and helpers for RGB332 and RGB888.
- Fix crash on a 0 or invalid EDID.

Driver Changes:
- Apply and revert DRM_MODESET_LOCK_ALL_BEGIN.
- Add mode_valid to ti-sn65dsi86 bridge.
- Support multiple syncobjs in v3d.
- Add R8, RGB332 and RGB888 pixel formats to GUD.
- Use devm_add_action_or_reset in dw-hdmi-cec.

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Wed 06 Oct 2021 20:48:12 AEST
# gpg:                using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3
# gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten@debian.org>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired]
# gpg: Note: This key has expired!
# Primary key fingerprint: B97B D6A8 0CAC 4981 091A  E547 FE55 8C72 A670 13C3
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2602f4e9-a8ac-83f8-6c2a-39fd9ca2e1ba@linux.intel.com
2021-10-11 12:39:15 +10:00
Guchun Chen c58a863b1c drm/amdgpu: use adev_to_drm for consistency when accessing drm_device
adev_to_drm is used everywhere, so improve recent changes
when accessing drm_device pointer from amdgpu_device.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:22:13 -04:00
Alex Deucher 35bdf463de drm/amdgpu: add missing case for HDP for renoir
Missing 4.1.2.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:20:13 -04:00
Alex Deucher 73bf66712d drm/amdgpu/discovery: add missing case for SMU 11.0.5
Was missed when converting the driver over to IP based
initialization.

Tested-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:20:13 -04:00
Nirmoy Das 58144d2837 drm/amdgpu: unify BO evicting method in amdgpu_ttm
Unify BO evicting functionality for possible memory
types in amdgpu_ttm.c.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-07 11:55:46 -04:00
Mukul Joshi 12b2cab790 drm/amdgpu: Register MCE notifier for Aldebaran RAS
On Aldebaran, GPU driver will handle bad page retirement
for GPU memory even though UMC is host managed. As a result,
register a bad page retirement handler on the mce notifier
chain to retire bad pages on Aldebaran.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:53:38 -04:00
Nirmoy Das 5b9581df9f drm/amdgpu: return early if debugfs is not initialized
Check first if debugfs is initialized before creating
amdgpu debugfs files.

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:53:14 -04:00
Marek Olšák f2e7d85680 drm/amd/display: fix DCC settings for DCN3
ind_block_64b_no_128bcl means INDEP_64B && INDEP_128B &&
MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx10.3.

ind_block_64b means INDEP_64B && !INDEP_128B &&
MAX_COMPRESSED_BLOCK_SIZE == 64B. Only used by gfx9 and gfx10.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-06 15:50:43 -04:00
Guchun Chen 248b061689 drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 13:02:31 -04:00
Yifan Zhang 714d9e4574 drm/amdgpu: init iommu after amdkfd device init
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 13:02:20 -04:00
Guchun Chen e17e27f9bd drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:56 -04:00
Christian König 127aedf979 drm/amdgpu: print warning and taint kernel if lockup timeout is disabled
Make sure that we notice this in error reports.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:23:42 -04:00
Christian König c8365dbda0 drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"
This reverts commit 728e7e0cd6.

Further discussion reveals that this feature is severely broken
and needs to be reverted ASAP.

GPU reset can never be delayed by userspace even for debugging or
otherwise we can run into in kernel deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:36 -04:00
Yifan Zhang 286826d7d9 drm/amdgpu: init iommu after amdkfd device init
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 12:22:29 -04:00
Lijo Lazar 1d617c029f drm/amdgpu: During s0ix don't wait to signal GFXOFF
In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-10-05 10:55:07 -04:00
Lang Yu b072ef1215 drm/amdkfd: fix a potential ttm->sg memory leak
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!

Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-05 10:53:45 -04:00
Alex Deucher 630e959f25 drm/amdgpu/gmc9: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Guo Zhengkui 8001ba85d0 drm/amdgpu: remove some repeated includings
Remove two repeated includings in line 46 and 47.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Lijo Lazar d04287d062 drm/amdgpu: During s0ix don't wait to signal GFXOFF
In the rare event when GFX IP suspend coincides with a s0ix entry, don't
schedule a delayed work, instead signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about GFXOFF status before amd-pmc module passes OS HINT
to PMFW telling that everything is ready for a safe s0ix entry.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher 4b3a624c4c drm/amdgpu: consolidate case statements
IP_VERSION(11, 0, 13) does the exact same thing as
IP_VERSION(11, 0, 12) so squash them together.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
James Zhu c60511493b drm/amdgpu/jpeg: add jpeg2.6 start/end
Add jpeg2.6 start/end with updated PCTL0_MMHUB_DEEPSLEEP_IB address.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.lilu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
James Zhu d4b0ee65de drm/amdgpu/jpeg2: move jpeg2 shared macro to header file
Move jpeg2 shared macro to header file

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.lilu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Lang Yu 546dc20fed drm/amdkfd: fix a potential ttm->sg memory leak
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!

Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher a79d3709c4 drm/amdgpu: add an option to override IP discovery table from a file
If you set amdgpu.discovery=2 you can force the the driver to
fetch the IP discovery table from a file rather than from the
table shipped on the device.  This is useful for debugging and
for device bring up and emulation when the tables may be in flux.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher 5b983db8c3 drm/amdkfd: clean up parameters in kgd2kfd_probe
We can get the pdev and asic type from the adev.  No need
to pass them explicitly.

v2: squash in build fix for !CONFIG_HSA_AMD from Anson

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:02 -04:00
Alex Deucher 6d46d419af drm/amdgpu: add support for SRIOV in IP discovery path
Handle SRIOV requirements when adding IP blocks.

v2: add comment about UVD/VCE support on vega20 SR-IOV

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher b05b9c591f drm/amdgpu: clean up set IP function
Split into several smaller per IP functions to make it
easier to handle ordering issues for things like
SR-IOV in a follow up patch.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 1d789535a0 drm/amdgpu: convert IP version array to include instances
Allow us to query instances versions more cleanly.

Instancing support is not consistent unfortunately. SDMA is a
good example.  Sienna cichlid has 4 total SDMA instances, each
enumerated separately (HWIDs 42, 43, 68, 69).  Arcturus has 8
total SDMA instances, but they are enumerated as multiple
instances of the same HWIDs (4x HWID 42, 4x HWID 43).  UMC
is another example.  On most chips there are multiple
instances with the same HWID.  This allows us to support both
forms.

v2: rebase
v3: clarify instancing support

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher d0761fd24e drm/amdgpu: set CHIP_IP_DISCOVERY as the asic type by default
For new chips with no explicit entry in the PCI ID list.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 3ae695d691 drm/amdgpu: add new asic_type for IP discovery
Add a new asic type for asics where we don't have an
explicit entry in the PCI ID list.  We don't need
an asic type for these asics, other than something higher
than the existing ones, so just use this for all new
asics.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher aa9f8cc349 drm/amdgpu/ucode: add default behavior
Default to PSP ucode loading unless the user specifies
direct.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher f174161517 drm/amdgpu: get VCN harvest information from IP discovery table
Use the table rather than asic specific harvest registers.

v2: remove harvesting register checking

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 1b592d00b4 drm/amdgpu/vcn: remove manual instance setting
Handled by IP discovery now.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher fe323f039d drm/amdgpu/sdma: remove manual instance setting
Handled by IP discovery now.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 5c3720be7d drm/amdgpu: get VCN and SDMA instances from IP discovery table
Rather than hardcoding it.  We already have the number of VCN
instances from a previous patch, so just update the VCN
instances for chips with static tables.

v2: squash in checks for SDMA3,4 (Guchun)
v3: clarify VCN changes

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 5eceb20192 drm/amdgpu: add VCN1 hardware IP
So we can store the VCN IP revision for each instance of VCN.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 75a07bcd1d drm/amdgpu/soc15: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:01 -04:00
Alex Deucher 0b64a5a852 drm/amdgpu/vcn2.5: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 96b8dd4423 drm/amdgpu/amdgpu_vcn: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: squash in fix for navy flounder and sienna cichlid

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 1fcc208cd7 drm/amdgpu/psp_v13.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher e47868ea15 drm/amdgpu/psp_v11.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 82d05736c4 drm/amdgpu/amdgpu_psp: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 9d0cb2c318 drm/amdgpu/gfx9.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 24be2d7004 drm/amdgpu/hdp4.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 43bf00f21e drm/amdgpu/sdma4.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher f7f12b2582 drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support
We are not going to support any new chips with the old
non-DC code so make it the default.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 9878844094 drm/amdgpu: drive all vega asics from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 91e9db33be drm/amdgpu/soc15: get rev_id in soc15_common_early_init
for consistency with other SoCs.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher d4c6e870bd drm/amdgpu: add initial IP discovery support for vega based parts
Hardcode the IP versions for asics without IP discovery tables
and then enumerate the asics based on the IP versions.

TODO: fix SR-IOV support

v2: Squash in HDP fix for Renoir

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:23:00 -04:00
Alex Deucher 994470b252 drm/amdgpu/soc15: export common IP functions
So they can be driven by IP discovery table.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 5f93148955 drm/amdgpu: add DCI HWIP
So we can track grab the appropriate DCE info out of the
IP discovery table.  This is a separare IP from DCN.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 75aa18415a drm/amdgpu: drive all navi asics from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

v2: rebase

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 3e67f4f2e2 drm/amdgpu/nv: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 7c69d6153e drm/amdgpu/navi10_ih: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 258fa17d1a drm/amdgpu/athub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 13ebe284a2 drm/amdgpu/athub2.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 4edbbfde89 drm/amdgpu/vcn3.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher bc7c3d1d8a drm/amdgpu/mmhub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher ce2d99a84f drm/amdgpu/mmhub2.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher fac1772374 drm/amdgpu/gfxhub2.1: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:59 -04:00
Alex Deucher 524cf3ab85 drm/amdgpu: drive nav10 from the IP discovery table
Rather than hardcoding based on asic_type, use the IP
discovery table to configure the driver.

Only tested on Navi10 so far.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 63352b7f98 drm/amdgpu: Use IP discovery to drive setting IP blocks by default
Drive the asic setup from the IP discovery table rather than
hardcoded settings based on asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 5db9d0657e drm/amdgpu/gmc10.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: squash in gmc fixes
v3: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher eb4fd29afd drm/amdgpu: bind to any 0x1002 PCI diplay class device
Bind to all 0x1002 GPU devices.

For now we explicitly return -ENODEV for generic bindings.
Remove this check once IP discovery based checking is in place.

v2: rebase (Alex)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher bdbeb0dde4 drm/amdgpu: filter out radeon PCI device IDs
Once we claim all 0x1002 PCI display class devices, we will
need to filter out devices owned by radeon.

v2: rename radeon id array to make it more clear that
the devices are not supported by amdgpu.
    add r128, mach64 pci ids as well

Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 4b0ad84254 drm/amdgpu/gfx10: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase,  squash in navi10 fixes (Alex)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 8f4bb1e784 drm/amdgpu/sdma5.2: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 02200e910c drm/amdgpu/sdma5.0: convert to IP version checking
Use IP versions rather than asic_type to differentiate
IP version specific features.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 795d08391b drm/amdgpu: add initial IP enumeration via IP discovery table
Add initial support for all navi based parts.

v2: rebase

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher a1f62df75b drm/amdgpu/nv: export common IP functions
So they can be driven by IP dicovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 1534db5549 drm/amdgpu: add XGMI HWIP
So we can track grab the appropriate XGMI info out of the
IP discovery table.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 54d2b1f402 drm/amdgpu: fill in IP versions from IP discovery table
Prerequisite for using IP versions in the driver rather
than asic type.

v2: Use IP_VERSION() macro instead of new function

Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 5f52e9a780 drm/amdgpu: store HW IP versions in the driver structure
So we can check the IP versions directly rather than using
asic type.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher 81d1bf01e4 drm/amdgpu: add debugfs access to the IP discovery table
Useful for debugging and new asic validation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
Alex Deucher f76f795a8f drm/amdgpu: move headless sku check into harvest function
Consolidate harvesting information.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:58 -04:00
John Clements eb601e61d3 drm/amdgpu: resolve RAS query bug
clear error count when persistant harvesting is not enabled

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Tao Zhou c749094923 amd/amdkfd: add ras page retirement handling for sq/sdma (v3)
In ras poison mode, page retirement will be handled by the irq handler of the
module which consumes corrupted data.

v2: rename ras_process_cb to ras_poison_consumption_handler.
    move the handler's implementation from ASIC specific file to common
file.

v3: call gpu reset for xGMI connected mode.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Prike Liang e5d59cfa33 drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
In the s2idle stress test sdma resume fail occasionally,in the
failed case GPU is in the gfxoff state.This issue may introduce
by firmware miss handle doorbell S/R and now temporary fix the issue
by forcing exit gfxoff for sdma resume.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Zhan Liu 3f68c01be9 drm/amd/display: add cyan_skillfish display support
[Why]
add display related cyan_skillfish files in.

makefile controlled by CONFIG_DRM_AMD_DC_DCN201 flag.

v2: squash in clang fixes from Harry, Nathan
v3: squash in missing CONFIG_DRM_AMD_DC check (Alex)

Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Jun Lei <jun.lei@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-04 15:22:57 -04:00
Sean Paul 6f67e6fd4d Revert "drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()"
This reverts commit 299f040e85.

This patchset breaks on intel platforms and was previously NACK'd by
Ville.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Fernando Ramos <greenfoo@u92.eu>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20211002154542.15800-2-sean@poorly.run
2021-10-04 09:34:55 -04:00
Tom Lendacky e9d1d2bb75 treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
Replace uses of mem_encrypt_active() with calls to cc_platform_has() with
the CC_ATTR_MEM_ENCRYPT attribute.

Remove the implementation of mem_encrypt_active() across all arches.

For s390, since the default implementation of the cc_platform_has()
matches the s390 implementation of mem_encrypt_active(), cc_platform_has()
does not need to be implemented in s390 (the config option
ARCH_HAS_CC_PLATFORM is not set).

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210928191009.32551-9-bp@alien8.de
2021-10-04 11:47:24 +02:00
Fernando Ramos 299f040e85 drm/amd: cleanup: drm_modeset_lock_all() --> DRM_MODESET_LOCK_ALL_BEGIN()
As requested in Documentation/gpu/todo.rst, replace driver calls to
drm_modeset_lock_all() with DRM_MODESET_LOCK_ALL_BEGIN() and
DRM_MODESET_LOCK_ALL_END()

Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210924064324.229457-16-greenfoo@u92.eu
2021-10-01 13:00:57 -04:00
Andrey Grodzovsky 5c67ff3a4c drm/amdgpu: Add a UAPI flag for hot plug/unplug
To support libdrm tests.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Andrey Grodzovsky 894c6890a2 drm/amdgpu: drm/amdgpu: Handle IOMMU enabled case
Handle all DMA IOMMU group related dependencies before the
group is removed and we try to access it after free.

v2:
Move the actul handling function to TTM

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Ernst Sjöstrand 5039f52988 drm/amd/amdgpu: Validate ip discovery blob
We use the number_instance index that we get from the fw discovery blob
to index into an array for example.

Update error messages (Alex)

Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Arnd Bergmann 335aea75b0 drm/amdgpu: fix warning for overflow check
The overflow check in amdgpu_bo_list_create() causes a warning with
clang-14 on 64-bit architectures, since the limit can never be
exceeded.

drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
        if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list))
            ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The check remains useful for 32-bit architectures, so just avoid the
warning by using size_t as the type for the count.

Fixes: 920990cb08 ("drm/amdgpu: allocate the bo_list array after the list")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Simon Ser 2f350ddadc drm/amdgpu: check tiling flags when creating FB on GFX8-
On GFX9+, format modifiers are always enabled and ensure the
frame-buffers can be scanned out at ADDFB2 time.

On GFX8-, format modifiers are not supported and no other check
is performed. This means ADDFB2 IOCTLs will succeed even if the
tiling isn't supported for scan-out, and will result in garbage
displayed on screen [1].

Fix this by adding a check for tiling flags for GFX8 and older.
The check is taken from radeonsi in Mesa (see how is_displayable
is populated in gfx6_compute_surface).

Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)

[1]: https://github.com/swaywm/wlroots/issues/3185

Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-29 17:30:00 -04:00
Matthew Auld 43d46f0b78 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
It covers more than just ttm_bo_type_sg usage, like with say dma-buf,
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAGS_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 16:17:56 +02:00
Matthew Auld 21856e1e34 drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu
Now that setting page->index shouldn't be needed anymore, we are just
left with setting page->mapping, and here it looks like amdgpu is the
only user, where pointing the page->mapping at the dev_mapping is used
to verify that the pages do indeed belong to the device, if userspace
later tries to touch them.

v2(Christian):
  - Drop the functions altogether and just inline modifying
    the page->mapping

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-3-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 13:55:09 +02:00
Prike Liang 26db706a6d drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
In the s2idle stress test sdma resume fail occasionally,in the
failed case GPU is in the gfxoff state.This issue may introduce
by firmware miss handle doorbell S/R and now temporary fix the issue
by forcing exit gfxoff for sdma resume.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-28 14:40:27 -04:00
Simon Ser 98122e63a7 drm/amdgpu: check tiling flags when creating FB on GFX8-
On GFX9+, format modifiers are always enabled and ensure the
frame-buffers can be scanned out at ADDFB2 time.

On GFX8-, format modifiers are not supported and no other check
is performed. This means ADDFB2 IOCTLs will succeed even if the
tiling isn't supported for scan-out, and will result in garbage
displayed on screen [1].

Fix this by adding a check for tiling flags for GFX8 and older.
The check is taken from radeonsi in Mesa (see how is_displayable
is populated in gfx6_compute_surface).

Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)

[1]: https://github.com/swaywm/wlroots/issues/3185

Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-28 14:40:19 -04:00
Hawking Zhang 9f52c25f59 drm/amdgpu: correct initial cp_hqd_quantum for gfx9
didn't read the value of mmCP_HQD_QUANTUM from correct
register offset

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-28 14:39:29 -04:00
Leslie Shi 66805763a9 drm/amdgpu: fix gart.bo pin_count leak
gmc_v{9,10}_0_gart_disable() isn't called matched with
correspoding gart_enbale function in SRIOV case. This will
lead to gart.bo pin_count leak on driver unload.

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 14:38:16 -04:00
Hawking Zhang e794747622 drm/amdgpu: correct initial cp_hqd_quantum for gfx9
didn't read the value of mmCP_HQD_QUANTUM from correct
register offset

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:08 -04:00
Tao Zhou f524dd54a7 drm/amdgpu: skip umc ras irq handling in poison mode (v2)
In ras poison mode, umc uncorrectable error will be ignored until
the corrupted data consumed by another ras module (such as gfx, sdma).

v2: update the debug message and replace dev_warn with dev_info.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:07 -04:00
Tao Zhou e43488493c drm/amdgpu: set poison supported flag for RAS (v2)
Add RAS poison supported flag and tell PSP RAS TA about the info.

v2: rename poison mode to poison supported, we can also disable poison
mode even we support it.
    print value of poison supported if ras feature enablement fails.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:07 -04:00
Tao Zhou aaca8c3861 drm/amdgpu: add poison mode query for UMC
Add ras poison mode query interface for UMC.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:06 -04:00
Tao Zhou ca5c636dc6 drm/amdgpu: add poison mode query for DF (v2)
Add ras poison mode query interface for DF.

v2: replace RREG32_PCIE with RREG32_SOC15.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:06 -04:00
Candice Li 77ec28eac2 drm/amdgpu: Update PSP TA Invoke to use common TA context as input
Updated invoke to use new common TA structure similarily to load/unload.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:06 -04:00
Leslie Shi 71cf9e72b3 drm/amdgpu: fix gart.bo pin_count leak
gmc_v{9,10}_0_gart_disable() isn't called matched with
correspoding gart_enbale function in SRIOV case. This will
lead to gart.bo pin_count leak on driver unload.

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-28 09:30:05 -04:00
Dave Airlie 1e3944578b Merge tag 'amd-drm-next-5.16-2021-09-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.16-2021-09-27:

amdgpu:
- RAS improvements
- BACO fixes
- Yellow Carp updates
- Misc code cleanups
- Initial DP 2.0 support
- VCN priority handling
- Cyan Skillfish updates
- Rework IB handling for multimedia engine tests
- Backlight fixes
- DCN 3.1 power saving improvements
- Runtime PM fixes
- Modifier support for DCC image stores for gfx 10.3
- Hotplug fixes
- Clean up stack related warnings in display code
- DP alt mode fixes
- Display rework for better handling FP code
- Debugfs fixes

amdkfd:
- SVM fixes
- DMA map fixes

radeon:
- AGP fix

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927212653.4575-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-09-28 17:08:26 +10:00
Alex Deucher 2485e2753e drm/amdgpu: make soc15_common_ip_funcs static
It's not used outside of soc15.c

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:35:27 -04:00
Candice Li 9080a18fc5 drm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init
All code paths under the EAGAIN path in RAS late init are unused.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:35:13 -04:00
John Clements 73490d2658 drm/amdgpu: Consolidate RAS cmd warning messages
Explicity post warning if cmd is issued against unsupported IP

Update to latest RAS TA interface

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:35:04 -04:00
John Clements 640ae42efb drm/amdgpu: Updated RAS infrastructure
Update RAS infrastructure to support RAS query for MCA subblocks

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:43 -04:00
Guchun Chen 6effad8abe drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage
adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu
will still use adev->rmmio for access after amdgpu_device_unmap_mmio.
The patch is to move such SRIOV calling earlier to fini_early stage.

Fixes: 07775fc138 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 16:34:28 -04:00
Andrey Grodzovsky ebe86a57c8 drm/amdgpu: Fix resume failures when device is gone
Problem:
When device goes into suspend and unplugged during it
then all HW programming during resume fails leading
to a bad SW during pci remove handling which follows.
Because device is first resumed and only later removed
we cannot rely on drm_dev_enter/exit here.

Fix:
Use a flag we use for PCIe error recovery to avoid
accessing registres. This allows to successfully complete
pm resume sequence and finish pci remove.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
Andrey Grodzovsky c03509cbc0 drm/amdgpu: Fix MMIO access page fault
Add more guards to MMIO access post device
unbind/unplug

Bug: https://bugs.archlinux.org/task/72092?project=1&order=dateopened&sort=desc&pagenum=1
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
Andrey Grodzovsky d82e2c249c drm/amdgpu: Fix crash on device remove/driver unload
Crash:
BUG: unable to handle page fault for address: 00000000000010e1
RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu]
Call Trace:
pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu]
amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu]
amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu]
vce_v4_0_hw_fini+0x95/0xa0 [amdgpu]
amdgpu_device_fini_hw+0x232/0x30d [amdgpu]
amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu]
amdgpu_pci_remove+0x27/0x40 [amdgpu]
pci_device_remove+0x3e/0xb0
device_release_driver_internal+0x103/0x1d0
device_release_driver+0x12/0x20
pci_stop_bus_device+0x79/0xa0
pci_stop_and_remove_bus_device_locked+0x1b/0x30
remove_store+0x7b/0x90
dev_attr_store+0x17/0x30
sysfs_kf_write+0x4b/0x60
kernfs_fop_write_iter+0x151/0x1e0

Why:
VCE/UVD had dependency on SMC block for their suspend but
SMC block is the first to do HW fini due to some constraints

How:
Since the original patch was dealing with suspend issues
move the SMC block dependency back into suspend hooks as
was done in V1 of the original patches.
Keep flushing idle work both in suspend and HW fini seuqnces
since it's essential in both cases.

Fixes: 859e465927 ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend")
Fixes: bf756fb833 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
xinhui pan 0a2267809f drm/amdgpu: Fix uvd ib test timeout when use pre-allocated BO
Now we use same BO for create/destroy msg. So destroy will wait for the
fence returned from create to be signaled. The default timeout value in
destroy is 10ms which is too short.

Lets wait both fences with the specific timeout.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
xinhui pan b2fe31cf64 drm/amdgpu: Put drm_dev_enter/exit outside hot codepath
We hit soft hang while doing memory pressure test on one numa system.
After a qucik look, this is because kfd invalid/valid userptr memory
frequently with process_info lock hold.
Looks like update page table mapping use too much cpu time.

perf top says below,
75.81%  [kernel]       [k] __srcu_read_unlock
 6.19%  [amdgpu]       [k] amdgpu_gmc_set_pte_pde
 3.56%  [kernel]       [k] __srcu_read_lock
 2.20%  [amdgpu]       [k] amdgpu_vm_cpu_update
 2.20%  [kernel]       [k] __sg_page_iter_dma_next
 2.15%  [drm]          [k] drm_dev_enter
 1.70%  [drm]          [k] drm_prime_sg_to_dma_addr_array
 1.18%  [kernel]       [k] __sg_alloc_table_from_pages
 1.09%  [drm]          [k] drm_dev_exit

So move drm_dev_enter/exit outside gmc code, instead let caller do it.
They are gart_unbind, gart_map, vm_clear_bo, vm_update_pdes and
gmc_init_pdb0. vm_bo_update_mapping already calls it.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:29 -04:00
John Clements 226f4f5a6b drm/amdgpu: Resolve nBIF RAS error harvesting bug
Set correct RAS nBIF error query register offsets on aldebaran

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Candice Li 17c6805a00 drm/amdgpu: Update PSP TA unload function
Update PSP TA unload function to use PSP TA context as input argument.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Candice Li 3f83f17b73 drm/amdgpu: Conform ASD header/loading to generic TA systems
Update asd_context structure and add asd_initialize function to
conform ASD header/loading to generic TA systems.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Paul Menzel 31ea43442d drm/amdgpu: Demote TMZ unsupported log message from warning to info
As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Michel Dänzer 6cd1f9b40a drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.

Fixes compile failure building for ppc64le on RHEL 8:

In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
                 from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call
 to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available
   90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
 1985 |         max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: c84d46707e "drm/amdgpu: validate bad page threshold in ras(v3)"
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-23 15:17:28 -04:00
Dave Airlie 0dfc70818a drm-misc-next for $kernel-version:
UAPI Changes:
 
 Cross-subsystem Changes:
   - dma-buf: Avoid a warning with some allocations, Remove
     DMA_FENCE_TRACE macros
 
 Core Changes:
   - bridge: New helper to git rid of panels in drivers
   - fence: Improve dma_fence_add_callback documentation, Improve
     dma_fence_ops->wait documentation
   - ioctl: Unexport drm_ioctl_permit
   - lease: Documentation improvements
   - fourcc: Add new macro to determine the modifier vendor
   - quirks: Add the Steam Deck, Chuwi HiBook, Chuwi Hi10 Pro, Samsung
     Galaxy Book 10.6, KD Kurio Smart C15200 2-in-1, Lenovo Ideapad D330
   - resv: Improve the documentation
   - shmem-helpers: Allocate WC pages on x86, Switch to vmf_insert_pfn
   - sched: Fix for a timer being canceled too soon, Avoid null pointer
     derefence if the fence is null in drm_sched_fence_free, Convert
     drivers to rely on its dependency tracking
   - ttm: Switch to kerneldoc, new helper to clear all DMA mappings, pool
     shrinker optitimization, Remove ttm_tt_destroy_common, Fix for
     unbinding on multiple drivers
 
 Driver Changes:
   - bochs: New PCI IDs
   - msm: Fence ordering impromevemnts
   - stm: Add layer alpha support, zpos
   - v3d: Fix for a Vulkan CTS failure
   - vc4: Conversion to the new bridge helpers
   - vgem: Use shmem helpers
   - virtio: Support mapping exported vram
   - zte: Remove obsolete driver
 
   - bridge: Probe improvements for it66121, enable DSI EOTP for anx7625,
     errors propagation improvements for anx7625
 
   - panels: 60fps mode for otm8009a, New driver for Samsung S6D27A1
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYULyqgAKCRDj7w1vZxhR
 xVR1AP96dB3rfB0uIEvujMROBqupaKbYvP/7qilfMGIwLotDqQD/RKNB+EAaoHtT
 hRA7zmz7kwYA/l8PihmF1zoFddX21gA=
 =nFnK
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-09-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for $kernel-version:

UAPI Changes:

Cross-subsystem Changes:
  - dma-buf: Avoid a warning with some allocations, Remove
    DMA_FENCE_TRACE macros

Core Changes:
  - bridge: New helper to git rid of panels in drivers
  - fence: Improve dma_fence_add_callback documentation, Improve
    dma_fence_ops->wait documentation
  - ioctl: Unexport drm_ioctl_permit
  - lease: Documentation improvements
  - fourcc: Add new macro to determine the modifier vendor
  - quirks: Add the Steam Deck, Chuwi HiBook, Chuwi Hi10 Pro, Samsung
    Galaxy Book 10.6, KD Kurio Smart C15200 2-in-1, Lenovo Ideapad D330
  - resv: Improve the documentation
  - shmem-helpers: Allocate WC pages on x86, Switch to vmf_insert_pfn
  - sched: Fix for a timer being canceled too soon, Avoid null pointer
    derefence if the fence is null in drm_sched_fence_free, Convert
    drivers to rely on its dependency tracking
  - ttm: Switch to kerneldoc, new helper to clear all DMA mappings, pool
    shrinker optitimization, Remove ttm_tt_destroy_common, Fix for
    unbinding on multiple drivers

Driver Changes:
  - bochs: New PCI IDs
  - msm: Fence ordering impromevemnts
  - stm: Add layer alpha support, zpos
  - v3d: Fix for a Vulkan CTS failure
  - vc4: Conversion to the new bridge helpers
  - vgem: Use shmem helpers
  - virtio: Support mapping exported vram
  - zte: Remove obsolete driver

  - bridge: Probe improvements for it66121, enable DSI EOTP for anx7625,
    errors propagation improvements for anx7625

  - panels: 60fps mode for otm8009a, New driver for Samsung S6D27A1

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Thu 16 Sep 2021 17:30:50 AEST
# gpg:                using EDDSA key 5C1337A45ECA9AEB89060E9EE3EF0D6F671851C5
# gpg: Can't check signature: No public key
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916073132.ptbbmjetm7v3ufq3@gilmour
2021-09-22 15:30:40 +10:00
Paul Menzel b287e49468 drm/amdgpu: Demote TMZ unsupported log message from warning to info
As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:24 -04:00
Michel Dänzer 114518ff3b drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.

Fixes compile failure building for ppc64le on RHEL 8:

In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
                 from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call
 to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available
   90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
 1985 |         max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: c84d46707e "drm/amdgpu: validate bad page threshold in ras(v3)"
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:24 -04:00
James Zhu f02abeb077 drm/amdgpu: move iommu_resume before ip init/resume
Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
James Zhu 8066008482 drm/amdgpu: add amdgpu_amdkfd_resume_iommu
Add amdgpu_amdkfd_resume_iommu for amdgpu.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
James Zhu fefc01f042 drm/amdkfd: separate kfd_iommu_resume from kfd_resume
Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.

v2: squash in fix for !CONFIG_HSA_AMD

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
Nirmoy Das b04ce53eac drm/amdgpu: use IS_ERR for debugfs APIs
debugfs APIs returns encoded error so use
IS_ERR for checking return value.

v2: return PTR_ERR(ent)

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-By: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-14 16:21:15 -04:00
Christian König c92db8d64f drm/amdgpu: fix use after free during BO move
The memory backing old_mem is already freed at that point, move the
check a bit more up.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: bfa3357ef9 ("drm/ttm: allocate resource object instead of embedding it v2")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1699
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-14 16:17:39 -04:00
Ernst Sjöstrand 67a44e6598 drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10
Seems like newer cards can have even more instances now.
Found by UBSAN: array-index-out-of-bounds in
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29
index 8 is out of range for type 'uint32_t *[8]'

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697
Cc: stable@vger.kernel.org
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 16:14:41 -04:00
xinhui pan 0fcfb30019 drm/amdgpu: Fix a race of IB test
Direct IB submission should be exclusive. So use write lock.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
xinhui pan 405a81ae3f drm/amdgpu: VCN avoid memory allocation during IB test
alloc extra msg from direct IB pool.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
xinhui pan cb9038aa8a drm/amdgpu: VCE avoid memory allocation during IB test
alloc extra msg from direct IB pool.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
xinhui pan 68331d7cf3 drm/amdgpu: UVD avoid memory allocation during IB test
move BO allocation in sw_init.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
Candice Li de3a1e3360 drm/amdgpu: Unify PSP TA context
Remove all TA binary structures and add the specific binary
structure in struct ta_context.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
James Zhu 9cec53c18a drm/amdgpu: move iommu_resume before ip init/resume
Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
James Zhu ea20e246f3 drm/amdgpu: add amdgpu_amdkfd_resume_iommu
Add amdgpu_amdkfd_resume_iommu for amdgpu.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:58 -04:00
James Zhu f8846323d5 drm/amdkfd: separate kfd_iommu_resume from kfd_resume
Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.

v2: squash in fix for !CONFIG_HSA_AMD

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:59:46 -04:00
shaoyunl 8e6d0b6996 drm/amdgpu: Get atomicOps info from Host for sriov setup
The AtomicOp Requester Enable bit is reserved in VFs and the PF value applies to all
associated VFs. so guest driver can not directly enable the atomicOps for VF, it
depends on PF to enable it. In current design, amdgpu driver  will get the enabled
atomicOps bits through private pf2vf data

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:57:11 -04:00
xinhui pan a7496559e4 drm/amdgpu: Increase direct IB pool size
Direct IB pool is used for vce/vcn IB extra msg too. Increase its size
to AMDGPU_IB_POOL_SIZE.

v2: Squash in unused variable removal

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:56:49 -04:00
John Clements 3771449bc8 drm/amdgpu: Update RAS trigger error block support
Added trigger error support for MP0/MP1/MPIO blocks

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:37:48 -04:00
John Clements 334f81d164 drm/amdgpu: Update RAS status print
Remove uncessary RAS status prints

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:37:41 -04:00
Likun Gao 02f958a20c drm/amdgpu: refactor function to init no-psp fw
Refactor the code of amdgpu_ucode_init_single_fw to make it more
readable as too many ucode need to handle on this function currently.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:37:34 -04:00
Nirmoy Das 62d266b2bd drm/amdgpu: cleanup debugfs for amdgpu rings
Use debugfs_create_file_size API for creating ring debugfs, and as its a
NULL returning API, change the return type for amdgpu_debugfs_ring_init
API as well. Also cleanup surrounding code.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:35:59 -04:00
Nirmoy Das 59715cffce drm/amdgpu: use IS_ERR for debugfs APIs
debugfs APIs returns encoded error so use
IS_ERR for checking return value.

v2: return PTR_ERR(ent)

References: https://gitlab.freedesktop.org/drm/amd/-/issues/1686
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-By: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-14 15:35:44 -04:00
Maxime Ripard 2f76520561
Merge drm/drm-next into drm-misc-next
Kickstart new drm-misc-next cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-14 09:25:30 +02:00
Linus Torvalds a668acb8f0 drm fixes for 5.15-rc1
ttm:
 - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed.
 - Fix ttm deadlock if target BO isn't idle
 - ttm build fix
 - ttm docs fix
 
 dma-buf:
 - config option fixes
 
 fbdev:
 - limit resolutions to avoid int overflow
 
 i915:
 - stddef change.
 
 amdgpu:
 - Misc cleanups, typo fixes
 - EEPROM fix
 - Add some new PCI IDs
 - Scatter/Gather display support for Yellow Carp
 - PCIe DPM fix for RKL platforms
 - RAS fix
 
 amdkfd:
 - SVM fix
 
 vc4:
 - static function fix
 
 mgag200:
 - fix uninit var
 
 panfrost:
 - lock_region fixes
 
  - Make some dma-buf config options depend on DMA_SHARED_BUFFER.
     - Handle multiplication overflow of fbdev xres/yres in the core.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmE6/HkACgkQDHTzWXnE
 hr4Edw/+PTYtJHSZZbcT/Avcdif1KpEWuBfhq+dd75Tm1SNYXBRe03CqH3d23YnZ
 1I9oZ4TG1St3KaFBrlW5BERyFD2RhAAWJ4bMUz+/bBN9Y2u/r1scVR7YKoqkI2jr
 li1pYoPVLNYrHqdhmhsl7sKOqDRi/0TNvUY/B8tWyEZhTNiMGD9A8Tyv7WJ+iinT
 /mLrR0tCYYrzkvMEVdHt0t8+Bp1nvR/ZSfCS/NavD1CZ4RffENzTnFIhBb1QvCDj
 W1bF4D6930iOS/HXmheVzKygJlz9fj+8PS1DnvIyRPJjXH74dcCn+DPDRVTxyYB1
 3ZSY0I2yFSK0oorN1jYVraDXGB1R0OtIwbdRWvyztqMxaj+gRrSNbSSEcRGAy4YL
 Ipyvd2FyHO1rGxN5CS6FDCkJ/9WxOx1caBF0D3HhZVGxqw/m8qISxS+za8U5lbrT
 90KqHnaWbKL4flfUExjpwPKSvPImgLHN4tqC8l0471i4Tku0unBf8H9RkODkreRU
 fW9GHYCjzxHMwYT0JSHGohsscCvhIhkRYTYlx3bf/1tr0SfYXPiZEJwrJfNTLkZh
 mfm5R+wTL5hGHdDheOldjiGQZsazzxzJv2NK5aAuojVRqJuy3pohiQ72mHP5Wr4M
 9zOKlXbgBDSxTJleN7MJKZhNyanFUaZut+1rhTFeQ4RCUcgqpxc=
 =R62Q
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just an initial bunch of fixes for the merge window, amdgpu is most of
  them with a few ttm fixes and an fbdev avoid multiply overflow fix.

  core:
   - Make some dma-buf config options depend on DMA_SHARED_BUFFER
   - Handle multiplication overflow of fbdev xres/yres in the core

  ttm:
   - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed
   - Fix ttm deadlock if target BO isn't idle
   - ttm build fix
   - ttm docs fix

  dma-buf:
   - config option fixes

  fbdev:
   - limit resolutions to avoid int overflow

  i915:
   - stddef change.

  amdgpu:
   - Misc cleanups, typo fixes
   - EEPROM fix
   - Add some new PCI IDs
   - Scatter/Gather display support for Yellow Carp
   - PCIe DPM fix for RKL platforms
   - RAS fix

  amdkfd:
   - SVM fix

  vc4:
   - static function fix

  mgag200:
   - fix uninit var

  panfrost:
   - lock_region fixes"

* tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm: (36 commits)
  drm/ttm: Fix a deadlock if the target BO is not idle during swap
  fbmem: don't allow too huge resolutions
  dma-buf: DMABUF_SYSFS_STATS should depend on DMA_SHARED_BUFFER
  dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER
  drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers"
  dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER
  drm/amdkfd: drop process ref count when xnack disable
  drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode
  drm/amdgpu: fix fdinfo race with process exit
  drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
  drm/amdgpu: stop scheduler when calling hw_fini (v2)
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amd/display: Initialize lt_settings on instantiation
  drm/amd/display: cleanup idents after a revert
  drm/amd/display: Fix memory leak reported by coverity
  drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource
  drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum"
  drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
  drm/amdgpu: show both cmd id and name when psp cmd failed
  drm/amd/display: setup system context for APUs
  ...
2021-09-10 11:22:23 -07:00
Colin Ian King e8ba4922a2 drm/amdgpu: sdma: clean up identation
There is a statement that is indented incorrectly. Clean it up.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:50 -04:00
Colin Ian King 9ae807f0ec drm/amdgpu: clean up inconsistent indenting
There are a couple of statements that are indented one character
too deeply, clean these up.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:50 -04:00
Christian König a7181b52ea drm/amdgpu: remove unused amdgpu_bo_validate
Just drop some dead code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:50 -04:00
Christian König 101ba90ff0 drm/amdgpu: fix use after free during BO move
The memory backing old_mem is already freed at that point, move the
check a bit more up.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: bfa3357ef9 ("drm/ttm: allocate resource object instead of embedding it v2")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1699
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:49 -04:00
Candice Li ac1509d19e drm/amdgpu: Create common PSP TA load function
Creat common PSP TA load function and update PSP ta_mem_context
with size information.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-07 13:30:49 -04:00
Ernst Sjöstrand cd54323e76 drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10
Seems like newer cards can have even more instances now.
Found by UBSAN: array-index-out-of-bounds in
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29
index 8 is out of range for type 'uint32_t *[8]'

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697
Cc: stable@vger.kernel.org
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-02 12:39:41 -04:00
Christian König d72277b6c3 dma-buf: nuke DMA_FENCE_TRACE macros v2
Only the DRM GPU scheduler, radeon and amdgpu where using them and they depend
on a non existing config option to actually emit some code.

v2: keep the signal path as is for now

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818105443.1578-1-christian.koenig@amd.com
2021-09-02 12:40:52 +02:00
Satyajit Sahu 7d7630fc6b drm/amdgpu:schedule vce/vcn encode based on priority
Schedule the encode job in VCE/VCN encode ring
based on the priority set by UMD.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Satyajit Sahu 0ad29a4eb1 drm/amdgpu/vcn: set the priority for each encode ring
VCN has multiple rings. Set the proper priority level for each
encode ring while initializing.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Satyajit Sahu 080e613c74 drm/amdgpu/vce: set the priority for each ring
VCE has multiple rings. Set the proper priority level for each
ring while initializing.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Candice Li a0a2f7bb22 drm/amd/amdgpu: add mpio to ras block
Add MPIO to RAS block

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Candice Li 25c94b33dd drm/amd/amdgpu: consolidate PSP TA unload function
Create common PSP TA unload function and replace all common TA unloading
sequences.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Tom St Denis 37df9560cd drm/amd/amdgpu: New debugfs interface for MMIO registers (v5)
This new debugfs interface uses an IOCTL interface in order to pass
along state information like SRBM and GRBM bank switching.  This
new interface also allows a full 32-bit MMIO address range which
the previous didn't.  With this new design we have room to grow
the flexibility of the file as need be.

(v2): Move read/write to .read/.write, fix style, add comment
      for IOCTL data structure

(v3): C style comments

(v4): use u32 in struct and remove offset variable

(v5): Drop flag clearing in op function, use 0xFFFFFFFF for broadcast
      instead of 0x3FF, use mutex for op/ioctl.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Nirmoy Das 34eaf30f9a drm/amdgpu: detach ring priority from gfx priority
Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.

Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Nirmoy Das 84d588c3de drm/amdgpu: rework context priority handling
To get a hardware queue priority for a context, we are currently
mapping AMDGPU_CTX_PRIORITY_* to DRM_SCHED_PRIORITY_* and then
to hardware queue priority, which is not the right way to do that
as DRM_SCHED_PRIORITY_* is software scheduler's priority and it is
independent from a hardware queue priority.

Use userspace provided context priority, AMDGPU_CTX_PRIORITY_* to
map a context to proper hardware queue priority.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-01 16:55:11 -04:00
Linus Torvalds 477f70cd2a drm for v5.15-rc1
core:
 - extract i915 eDP backlight into core
 - DP aux bus support
 - drm_device.irq_enabled removed
 - port drivers to native irq interfaces
 - export gem shadow plane handling for vgem
 - print proper driver name in framebuffer registration
 - driver fixes for implicit fencing rules
 - ARM fixed rate compression modifier added
 - updated fb damage handling
 - rmfb ioctl logging/docs
 - drop drm_gem_object_put_locked
 - define DRM_FORMAT_MAX_PLANES
 - add gem fb vmap/vunmap helpers
 - add lockdep_assert(once) helpers
 - mark drm irq midlayer as legacy
 - use offset adjusted bo mapping conversion
 
 vgaarb:
 - cleanups
 
 fbdev:
 - extend efifb handling to all arches
 - div by 0 fixes for multiple drivers
 
 udmabuf:
 - add hugepage mapping support
 
 dma-buf:
 - non-dynamic exporter fixups
 - document implicit fencing rules
 
 amdgpu:
 - Initial Cyan Skillfish support
 - switch virtual DCE over to vkms based atomic
 - VCN/JPEG power down fixes
 - NAVI PCIE link handling fixes
 - AMD HDMI freesync fixes
 - Yellow Carp + Beige Goby fixes
 - Clockgating/S0ix/SMU/EEPROM fixes
 - embed hw fence in job
 - rework dma-resv handling
 - ensure eviction to system ram
 
 amdkfd:
 - uapi: SVM address range query added
 - sysfs leak fix
 - GPUVM TLB optimizations
 - vmfault/migration counters
 
 i915:
 - Enable JSL and EHL by default
 - preliminary XeHP/DG2 support
 - remove all CNL support (never shipped)
 - move to TTM for discrete memory support
 - allow mixed object mmap handling
 - GEM uAPI spring cleaning
   - add I915_MMAP_OBJECT_FIXED
   - reinstate ADL-P mmap ioctls
   - drop a bunch of unused by userspace features
   - disable and remove GPU relocations
 - revert some i915 misfeatures
 - major refactoring of GuC for Gen11+
 - execbuffer object locking separate step
 - reject caching/set-domain on discrete
 - Enable pipe DMC loading on XE-LPD and ADL-P
 - add PSF GV point support
 - Refactor and fix DDI buffer translations
 - Clean up FBC CFB allocation code
 - Finish INTEL_GEN() and friends macro conversions
 
 nouveau:
 - add eDP backlight support
 - implicit fence fix
 
 msm:
 - a680/7c3 support
 - drm/scheduler conversion
 
 panfrost:
 - rework GPU reset
 
 virtio:
 - fix fencing for planes
 
 ast:
 - add detect support
 
 bochs:
 - move to tiny GPU driver
 
 vc4:
 - use hotplug irqs
 - HDMI codec support
 
 vmwgfx:
 - use internal vmware device headers
 
 ingenic:
 - demidlayering irq
 
 rcar-du:
 - shutdown fixes
 - convert to bridge connector helpers
 
 zynqmp-dsub:
 - misc fixes
 
 mgag200:
 - convert PLL handling to atomic
 
 mediatek:
 - MT8133 AAL support
 - gem mmap object support
 - MT8167 support
 
 etnaviv:
 - NXP Layerscape LS1028A SoC support
 - GEM mmap cleanups
 
 tegra:
 - new user API
 
 exynos:
 - missing unlock fix
 - build warning fix
 - use refcount_t
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmEtvn8ACgkQDHTzWXnE
 hr7aqw//WfcIyGdPLjAz59cW8jm+FgihD5colHtOUYRHRO4GeX/bNNufquR8+N3y
 HESsyZdpihFHms/wURMq41ibmHg0EuHA01HZzjZuGBesG4F9I8sP/HnDOxDuYuAx
 N7Lg4PlUNlfFHmw7Y84owQ6s/XWmNp5iZ8e/mTK5hcraJFQKS4QO74n9RbG/F1vC
 Hc3P6AnpqGac2AEGXt0NjIRxVVCTUIBGx+XOhj+1AMyAGzt9VcO1DS9PVCS0zsEy
 zKMj9tZAPNg0wYsXAi4kA1lK7uVY8KoXSVDYLpsI5Or2/e7mfq2b4EWrezbtp6UA
 H+w86axuwJq7NaYHYH6HqyrLTOmvcHgIl2LoZN91KaNt61xfJT3XZkyQoYViGIrJ
 oZy6X/+s+WPoW98bHZrr6vbcxtWKfEeQyUFEAaDMmraKNJwROjtwgFC9DP8MDctq
 PUSM+XkwbGRRxQfv9dNKufeWfV5blVfzEJO8EfTU1YET3WTDaUHe/FoIcLZt2DZG
 JAJgZkIlU8egthPdakUjQz/KoyLMyovcN5zcjgzgjA9PyNEq74uElN9l446kSSxu
 jEVErOdd+aG3Zzk7/ZZL/RmpNQpPfpQ2RaPUkgeUsW01myNzUNuU3KUDaSlVa+Oi
 1n7eKoaQ2to/+LjhYApVriri4hIZckNNn5FnnhkgwGi8mpHQIVQ=
 =vZkA
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights:

   - i915 has seen a lot of refactoring and uAPI cleanups due to a
     change in the upstream direction going forward

     This has all been audited with known userspace, but there may be
     some pitfalls that were missed.

   - i915 now uses common TTM to enable discrete memory on DG1/2 GPUs

   - i915 enables Jasper and Elkhart Lake by default and has preliminary
     XeHP/DG2 support

   - amdgpu adds support for Cyan Skillfish

   - lots of implicit fencing rules documented and fixed up in drivers

   - msm now uses the core scheduler

   - the irq midlayer has been removed for non-legacy drivers

   - the sysfb code now works on more than x86.

  Otherwise the usual smattering of stuff everywhere, panels, bridges,
  refactorings.

  Detailed summary:

  core:
   - extract i915 eDP backlight into core
   - DP aux bus support
   - drm_device.irq_enabled removed
   - port drivers to native irq interfaces
   - export gem shadow plane handling for vgem
   - print proper driver name in framebuffer registration
   - driver fixes for implicit fencing rules
   - ARM fixed rate compression modifier added
   - updated fb damage handling
   - rmfb ioctl logging/docs
   - drop drm_gem_object_put_locked
   - define DRM_FORMAT_MAX_PLANES
   - add gem fb vmap/vunmap helpers
   - add lockdep_assert(once) helpers
   - mark drm irq midlayer as legacy
   - use offset adjusted bo mapping conversion

  vgaarb:
   - cleanups

  fbdev:
   - extend efifb handling to all arches
   - div by 0 fixes for multiple drivers

  udmabuf:
   - add hugepage mapping support

  dma-buf:
   - non-dynamic exporter fixups
   - document implicit fencing rules

  amdgpu:
   - Initial Cyan Skillfish support
   - switch virtual DCE over to vkms based atomic
   - VCN/JPEG power down fixes
   - NAVI PCIE link handling fixes
   - AMD HDMI freesync fixes
   - Yellow Carp + Beige Goby fixes
   - Clockgating/S0ix/SMU/EEPROM fixes
   - embed hw fence in job
   - rework dma-resv handling
   - ensure eviction to system ram

  amdkfd:
   - uapi: SVM address range query added
   - sysfs leak fix
   - GPUVM TLB optimizations
   - vmfault/migration counters

  i915:
   - Enable JSL and EHL by default
   - preliminary XeHP/DG2 support
   - remove all CNL support (never shipped)
   - move to TTM for discrete memory support
   - allow mixed object mmap handling
   - GEM uAPI spring cleaning
       - add I915_MMAP_OBJECT_FIXED
       - reinstate ADL-P mmap ioctls
       - drop a bunch of unused by userspace features
       - disable and remove GPU relocations
   - revert some i915 misfeatures
   - major refactoring of GuC for Gen11+
   - execbuffer object locking separate step
   - reject caching/set-domain on discrete
   - Enable pipe DMC loading on XE-LPD and ADL-P
   - add PSF GV point support
   - Refactor and fix DDI buffer translations
   - Clean up FBC CFB allocation code
   - Finish INTEL_GEN() and friends macro conversions

  nouveau:
   - add eDP backlight support
   - implicit fence fix

  msm:
   - a680/7c3 support
   - drm/scheduler conversion

  panfrost:
   - rework GPU reset

  virtio:
   - fix fencing for planes

  ast:
   - add detect support

  bochs:
   - move to tiny GPU driver

  vc4:
   - use hotplug irqs
   - HDMI codec support

  vmwgfx:
   - use internal vmware device headers

  ingenic:
   - demidlayering irq

  rcar-du:
   - shutdown fixes
   - convert to bridge connector helpers

  zynqmp-dsub:
   - misc fixes

  mgag200:
   - convert PLL handling to atomic

  mediatek:
   - MT8133 AAL support
   - gem mmap object support
   - MT8167 support

  etnaviv:
   - NXP Layerscape LS1028A SoC support
   - GEM mmap cleanups

  tegra:
   - new user API

  exynos:
   - missing unlock fix
   - build warning fix
   - use refcount_t"

* tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm: (1318 commits)
  drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box
  drm/amd/display: Remove duplicate dml init
  drm/amd/display: Update bounding box states (v2)
  drm/amd/display: Update number of DCN3 clock states
  drm/amdgpu: disable GFX CGCG in aldebaran
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amdgpu: Add support for RAS XGMI err query
  drm/amdkfd: Account for SH/SE count when setting up cu masks.
  drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain
  drm/amdgpu: drop redundant cancel_delayed_work_sync call
  drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
  drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
  drm/amdkfd: map SVM range with correct access permission
  drm/amdkfd: check access permisson to restore retry fault
  drm/amdgpu: Update RAS XGMI Error Query
  drm/amdgpu: Add driver infrastructure for MCA RAS
  drm/amd/display: Add Logging for HDMI color depth information
  drm/amd/amdgpu: consolidate PSP TA init shared buf functions
  drm/amd/amdgpu: add name field back to ras_common_if
  drm/amdgpu: Fix build with missing pm_suspend_target_state module export
  ...
2021-09-01 11:26:46 -07:00
Philip Yang d7eff46c21 drm/amdgpu: fix fdinfo race with process exit
Get process vm root BO ref in case process is exiting and root BO is
freed, to avoid NULL pointer dereference backtrace:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
Call Trace:
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
seq_show+0x12c/0x180
seq_read+0x153/0x410
vfs_read+0x91/0x140[ 3427.206183]  ksys_read+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:40 -04:00
xinhui pan 703677d934 drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Fall through to handle the error instead of return.

Fixes: f8aab60422 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")
Cc: stable@vger.kernel.org
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:08 -04:00
Guchun Chen f7d6779df6 drm/amdgpu: stop scheduler when calling hw_fini (v2)
This gurantees no more work on the ring can be submitted
to hardware in suspend/resume case, otherwise a potential
race will occur and the ring will get no chance to stay
empty before suspend.

v2: Call drm_sched_resubmit_job before drm_sched_start to
restart jobs from the pending list.

Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-31 14:19:47 -04:00
John Clements 156872b07e drm/amdgpu: Clear RAS interrupt status on aldebaran
Resolve incorrect register address

Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:19:37 -04:00
Linus Torvalds 7d6e3fa87e Updates to the interrupt core and driver subsystems:
Core changes:
 
    - The usual set of small fixes and improvements all over the place, but nothing
      outstanding
 
 MSI changes:
 
    - Further consolidation of the PCI/MSI interrupt chip code
 
    - Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts
      of platform devices in the same way as PCI exposes them.
 
 Driver changes:
 
    - Support for ARM GICv3 EPPI partitions
 
    - Treewide conversion to generic_handle_domain_irq() for all chained
      interrupt controllers
 
    - Conversion to bitmap_zalloc() throughout the irq chip drivers
 
    - The usual set of small fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmEsnpsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoS+/EACQdpRkzl3IDIYqThxVZ8KQzp2rKKVn
 qisAQiWg/6koNJx/yYy62KNAUyKjCIObNtRnWi7OAOx6OvNtQTD2WOLAwkh3Pgw1
 8ePYYl55k+yCs8VoITsZM9jYeO+Tk878pU2A6R943zR+g6G7bskGJrxEyZ9TbzIe
 qKfusNKnRY9/jMQaRALUAAtA9VIVR867GqORX5X8hKz8yE2rqlpb4y+1CFba5BTV
 Vlxw7cIXvXBn7BKAom5diRqEGDNJEbX+56jJ7yDZshgLo7m11D7QLw72kmb6TNVC
 g7PchvFi4afpc1ifEAAp0tk4RiSIAQ91nS3n0+jLcLbodOjIkl14eY02ZCJGAP29
 uslyzUbmy1wgejG6CA63JtZ4MYdrf/OSMGuoN78qnOKYcIsWFzOvlJmBWWNW34qW
 LCaUF9QdJ/slXu6B4vIx30GfN9q4myml8bFUobE5q9mBRrEk4R0B7iyBvPu1xKYr
 ZEan67prI5VEu+afJGpp4r294m4HNVkMLfl3nYmE5+y4MoLeMNKDY3IPTvI9iP4G
 kaFgoPvQo23WnuclNYpJ+CaA4aRASlB2nTY+oAXIYfehbey9EW5vq4/EK864ek6w
 oyUTepxxNhE81tG2jpQbf2tR4COsEHy986clxqPP4AvsZXcbypCw8O2FcflpQbHO
 5DLEAfTmp7cziQ==
 =qyll
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates to the interrupt core and driver subsystems:

  Core changes:

   - The usual set of small fixes and improvements all over the place,
     but nothing stands out

  MSI changes:

   - Further consolidation of the PCI/MSI interrupt chip code

   - Make MSI sysfs code independent of PCI/MSI and expose the MSI
     interrupts of platform devices in the same way as PCI exposes them.

  Driver changes:

   - Support for ARM GICv3 EPPI partitions

   - Treewide conversion to generic_handle_domain_irq() for all chained
     interrupt controllers

   - Conversion to bitmap_zalloc() throughout the irq chip drivers

   - The usual set of small fixes and improvements"

* tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  platform-msi: Add ABI to show msi_irqs of platform devices
  genirq/msi: Move MSI sysfs handling from PCI to MSI core
  genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
  irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
  irqdomain: Export irq_domain_disconnect_hierarchy()
  irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
  irqchip/apple-aic: Fix irq_disable from within irq handlers
  pinctrl/rockchip: drop the gpio related codes
  gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
  gpio/rockchip: support next version gpio controller
  gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
  gpio/rockchip: add driver for rockchip gpio
  dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
  pinctrl/rockchip: add pinctrl device to gpio bank struct
  pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
  pinctrl/rockchip: always enable clock for gpio controller
  genirq: Fix kernel doc indentation
  EDAC/altera: Convert to generic_handle_domain_irq()
  powerpc: Bulk conversion to generic_handle_domain_irq()
  nios2: Bulk conversion to generic_handle_domain_irq()
  ...
2021-08-30 14:38:37 -07:00
Lang Yu 50c6dedeb1 drm/amdgpu: show both cmd id and name when psp cmd failed
To cover the corner case that people want to know the ID
of an UNKNOWN CMD.

Suggested-by: John Clements <john.clements@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:21 -04:00
Nicholas Kazlauskas c5d3c9a093 drm/amdgpu: Enable S/G for Yellow Carp
Missing code for Yellow Carp to enable scatter gather - follows how
DCN21 support was added.

Tested that 8k framebuffer allocation and display can now succeed after
applying the patch.

v2: Add hookup in DM

Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-30 15:08:38 -04:00
Evan Quan 602e338ffe drm/amdgpu: reenable BACO support for 699F:C7 polaris12 SKU
This reverts the commit below:
"drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily".
As the S3 hang issue has been fixed by another commit:
"drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend".

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
YuBiao Wang 64261a0d06 drm/amd/amdgpu: Add ready_to_reset resp for vega10
Send response to host after received the flr notification from host.
Port NV change to vega10.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Alex Deucher 8f0c93f454 drm/amdgpu: add some additional RDNA2 PCI IDs
New PCI IDs.

Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Yifan Zhang 6333a495f5 drm/amdgpu: correct comments in memory type managers
The parameters were renamed.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Luben Tuikov cc947bf91b drm/amdgpu: Process any VBIOS RAS EEPROM address
We can now process any RAS EEPROM address from
VBIOS. Generalize so as to compute the top three
bits of the 19-bit EEPROM address, from any byte
returned as the "i2c address" from VBIOS.

Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Luben Tuikov a6a355a22f drm/amdgpu: Fixes to returning VBIOS RAS EEPROM address
1) Generalize the function--if the user didn't set
   i2c_address, still return true/false to
   indicate whether VBIOS contains the RAS EEPROM
   address.  This function shouldn't evaluate
   whether the user set the i2c_address pointer or
   not.

2) Don't touch the caller's i2c_address, unless
   you have to--this function shouldn't have side
   effects.

3) Correctly set the function comment as a
   kernel-doc comment.

Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:33 -04:00
Daniel Vetter 0e10e9a1db drm/sched: drop entity parameter from drm_sched_push_job
Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

v2:
Rebase on top of msm adopting drm/sched

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com> (v1)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Melissa Wen <mwen@igalia.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch
2021-08-30 10:54:45 +02:00
Daniel Vetter dbe48d030b drm/sched: Split drm_sched_job_init
This is a very confusingly named function, because not just does it
init an object, it arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that's a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.

v5:
Reshuffle the split between init/arm once more, amdgpu abuses
drm_sched.ready to signal gpu reset failures. Also document this
somewhat. (Christian)

v6:
Rebase on top of the msm drm/sched support. Note that the
drm_sched_job_init() call is completely misplaced, and hence also the
split-out drm_sched_entity_push_job(). I've put in a FIXME which the next
patch will address.

v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't
be a problem where it is now.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Steven Price <steven.price@arm.com> (v2)
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Sean Paul <sean@poorly.run>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
2021-08-30 10:50:44 +02:00
Dave Airlie 8f0284f190 Merge tag 'amd-drm-next-5.15-2021-08-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-08-27:

amdgpu:
- PLL fix for SI
- Misc code cleanups
- RAS fixes
- PSP cleanups
- Polaris UVD/VCE suspend fixes
- aldebaran fixes
- DCN3.x mclk fixes

amdkfd:
- CWSR fixes for arcturus and aldebaran
- SVM fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210827192336.4649-1-alexander.deucher@amd.com
2021-08-30 09:06:03 +10:00
Thomas Gleixner 47fb0cfdb7 irqchip updates for Linux 5.15
API updates:
 
 - Treewide conversion to generic_handle_domain_irq() for anything
   that looks like a chained interrupt controller
 
 - Update the irqdomain documentation
 
 - Use of bitmap_zalloc() throughout the tree
 
 New functionalities:
 
 - Support for GICv3 EPPI partitions
 
 Fixes:
 
 - Qualcomm PDC hierarchy fixes
 
 - Yet another priority decoding fix for the GICv3 pseudo-NMIs
 
 - Fix the apple-aic driver irq_eoi() callback to always unmask
   the interrupt
 
 - Properly handle edge interrupts on loongson-pch-pic
 
 - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmEqIhgPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDjPkP/Rtp6WNZ1QUfJWmHovnh/Wc6ob1DXcBwi9nX
 hy4miIJ1SWuez9G49RlQAiXZoB28B6KKCKKmouiqu7ke7WUhifS0K1ej188wjxRX
 dqRG+m9yBAqKSr0lyWLB5VVCc8XBz4oZTc28n585gHiXfAPv7u0EzW+zNrnloLU5
 NrAj6ppGFUzVT0VxRqcurbymE6OwRWjc3D+z/PhtHZ4SFOhft95CXgsdvMklqyLj
 wwiuZ0Dhj5EruSP/Z7DzbbXnMNmte3HC2/cUNPYkho4/rk+2gVnYv5kVdfPHKQCY
 Wjti/kvuPC3hdTvdw8g7VQfP63R3clZhcQ8s+myoeX5LWzyAHpoxAtdsbX7oVsgs
 aKyrFhddEFVuiFizYyweS89pL0kCkTob8/zlGeuhRiVRTZ3+kG7Zf2UTTnN1ZdLw
 2lMolghiGk4LYJfr83+CDZyYP/VGHDCthfrmd//l39P2wJhkuCDbbeKaElLGWvUt
 abnPf0buCRqMAJe7vh0GHCx7290nEh2IqyHR4AYRVhRaN7bfAXdZH9Xp6ZGT25Fz
 uORgUbGAyhd5Ics/7twE4qeOkfJ6fxwgXsOlx90EfgVYyDJ1sBHNx8Buo2z8Bl/2
 rwCsW49kU7yX/wp11sJctR72RuLKC23dxS6z7aSWkRc6k3u+8xl2eeLIN59FNrKZ
 ToTdbXEQ
 =bdm2
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

- API updates:

  - Treewide conversion to generic_handle_domain_irq() for anything
    that looks like a chained interrupt controller

  - Update the irqdomain documentation

  - Use of bitmap_zalloc() throughout the tree

- New functionalities:

  - Support for GICv3 EPPI partitions

- Fixes:

  - Qualcomm PDC hierarchy fixes

  - Yet another priority decoding fix for the GICv3 pseudo-NMIs

  - Fix the apple-aic driver irq_eoi() callback to always unmask
    the interrupt

  - Properly handle edge interrupts on loongson-pch-pic

  - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE

Link: https://lore.kernel.org/r/20210828121013.2647964-1-maz@kernel.org
2021-08-29 21:19:50 +02:00
Hawking Zhang 192fb630fb drm/amdgpu: disable GFX CGCG in aldebaran
disable GFX CGCG and CGLS to workaround
a hardware issue found in aldebaran.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26 13:56:58 -04:00
John Clements 54e6badbed drm/amdgpu: Clear RAS interrupt status on aldebaran
resolve register address issue for detecting/clearing RAS interrupt

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26 13:56:48 -04:00
John Clements 3c4ff2dcc0 drm/amdgpu: Add support for RAS XGMI err query
Update XGMI RAS to support error query on aldebaran

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26 13:56:14 -04:00
Dave Airlie 697b6e28d0 Merge tag 'amd-drm-next-5.15-2021-08-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-08-20:

amdgpu:
- embed hw fence into job
- Misc SMU fixes
- PSP TA code cleanup
- RAS fixes
- PWM fan speed fixes
- DC workqueue cleanups
- SR-IOV fixes
- gfxoff delayed work fix
- Pin domain check fix

amdkfd:
- SVM fixes

radeon:
- Code cleanup

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210820172335.4190-1-alexander.deucher@amd.com
2021-08-26 12:18:27 +10:00
Yifan Zhang d035f84d83 drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain
amdgpu_bo_get_preferred_pin_domain is used for page tables
creation, which is not involved with page pinning. And it is used in
more cases than display scanout, modify its documentation as well.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-25 18:15:17 -04:00
Evan Quan 416e1fab47 drm/amdgpu: drop redundant cancel_delayed_work_sync call
As those _sw_fini() APIs follow just after _suspend() APIs.
And the cancel_delayed_work_sync was already called in latter.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-25 18:15:10 -04:00
Evan Quan 859e465927 drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
This is a supplement for commit below:
"drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend".

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-25 18:15:02 -04:00
Evan Quan bf756fb833 drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
Perform proper cleanups on UVD/VCE suspend: powergate enablement,
clockgating enablement and dpm disablement. This can fix some hangs
observed on suspending when UVD/VCE still using(e.g. issue
"pm-suspend" when video is still playing).

Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-25 18:14:48 -04:00
Philip Yang ff891a2e64 drm/amdkfd: check access permisson to restore retry fault
Check range access permission to restore GPU retry fault, if GPU retry
fault on address which belongs to VMA, and VMA has no read or write
permission requested by GPU, failed to restore the address. The vm fault
event will pass back to user space.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:50 -04:00
John Clements f24d991bb9 drm/amdgpu: Update RAS XGMI Error Query
Resolve bug querying error on unsupported ASIC

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:43 -04:00
John Clements 3907c49218 drm/amdgpu: Add driver infrastructure for MCA RAS
Add MCA specific IP blocks targetting RAS features

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:18 -04:00
Candice Li 30acef3c4a drm/amd/amdgpu: consolidate PSP TA init shared buf functions
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:36:05 -04:00
Candice Li 355e3e4ccc drm/amd/amdgpu: add name field back to ras_common_if
Adding name field back to ras_common_if to work around error
injection failure with amdgpuras tool.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:57 -04:00
Borislav Petkov a47f6a5806 drm/amdgpu: Fix build with missing pm_suspend_target_state module export
Building a randconfig here triggered:

  ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.

The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for
CONFIG_PM_SLEEP which is defined like this:

  config PM_SLEEP
          def_bool y
          depends on SUSPEND || HIBERNATE_CALLBACKS

and that randconfig has:

  # CONFIG_SUSPEND is not set
  CONFIG_HIBERNATE_CALLBACKS=y

leading to the module export missing.

Change the ifdeffery to depend directly on CONFIG_SUSPEND.

Fixes: 5706cb3c91 ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-24 15:35:50 -04:00
Christophe JAILLET 8a1d1bdb84 drm/amdgpu: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below.

It has been compile tested.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:41 -04:00
Mukul Joshi f270921a17 drm/amdkfd: CWSR with sw scheduler on Aldebaran and Arcturus
Program trap handler settings to enable CWSR with software scheduler
on Aldebaran and Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:33 -04:00
Shashank Sharma 7301757ea1 drm/amdgpu/OLAND: clip the ref divider max value
This patch limits the ref_div_max value to 100, during the
calculation of PLL feedback reference divider. With current
value (128), the produced fb_ref_div value generates unstable
output at particular frequencies. Radeon driver limits this
value at 100.

On Oland, when we try to setup mode 2048x1280@60 (a bit weird,
I know), it demands a clock of 221270 Khz. It's been observed
that the PLL calculations using values 128 and 100 are vastly
different, and look like this:

+------------------------------------------+
|Parameter    |AMDGPU        |Radeon       |
|             |              |             |
+-------------+----------------------------+
|Clock feedback              |             |
|divider max  |  128         |   100       |
|cap value    |              |             |
|             |              |             |
|             |              |             |
+------------------------------------------+
|ref_div_max  |              |             |
|             |  42          |  20         |
|             |              |             |
|             |              |             |
+------------------------------------------+
|ref_div      |  42          |  20         |
|             |              |             |
+------------------------------------------+
|fb_div       |  10326       |  8195       |
+------------------------------------------+
|fb_div       |  1024        |  163        |
+------------------------------------------+
|fb_dev_p     |  4           |  9          |
|frac fb_de^_p|              |             |
+----------------------------+-------------+

With ref_div_max value clipped at 100, AMDGPU driver can also
drive videmode 2048x1280@60 (221Mhz) and produce proper output
without any blanking and distortion on the screen.

PS: This value was changed from 128 to 100 in Radeon driver also, here:
4b21ce1b4b

V1:
Got acks from:
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>

V2:
- Restricting the changes only for OLAND, just to avoid any regression
  for other cards.
- Changed unsigned -> unsigned int to make checkpatch quiet.

V3: Apply the change on SI family (not only oland) (Christian)

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Eddy Qin <Eddy.Qin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24 15:35:25 -04:00
Borislav Petkov c41a4e877a drm/amdgpu: Fix build with missing pm_suspend_target_state module export
Building a randconfig here triggered:

  ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.

The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for
CONFIG_PM_SLEEP which is defined like this:

  config PM_SLEEP
          def_bool y
          depends on SUSPEND || HIBERNATE_CALLBACKS

and that randconfig has:

  # CONFIG_SUSPEND is not set
  CONFIG_HIBERNATE_CALLBACKS=y

leading to the module export missing.

Change the ifdeffery to depend directly on CONFIG_SUSPEND.

Fixes: 5706cb3c91 ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-24 11:57:44 -04:00
Christian König d5f45d1e2f drm/ttm: remove ttm_tt_destroy_common v2
Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead.

We don't need this any more since we removed the unbind from the destroy
code paths in the drivers.

Also add a warning to ttm_tt_fini() if we try to fini a still populated TT
object.

v2: instead of reverting the patch move the functionality to different
places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-5-christian.koenig@amd.com
2021-08-23 13:54:55 +02:00
Christian König b7e8b086ff drm/amdgpu: unbind in amdgpu_ttm_tt_unpopulate
Doing this in amdgpu_ttm_backend_destroy() is to late.

It turned out that this is not a good idea at all because it leaves pointers
to freed up system memory pages in the GART tables of the drivers.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-2-christian.koenig@amd.com
2021-08-23 13:43:04 +02:00
Michel Dänzer 32bc8f8373 drm/amdgpu: Cancel delayed work when GFXOFF is disabled
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 13:35:42 -04:00
Christian König 2a7b9a8437 drm/amdgpu: use the preferred pin domain after the check
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.

Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-20 13:35:35 -04:00
Michel Dänzer 90a9266269 drm/amdgpu: Cancel delayed work when GFXOFF is disabled
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 12:09:44 -04:00
Christian König 9deb0b3dcf drm/amdgpu: use the preferred pin domain after the check
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.

Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-20 12:09:32 -04:00
YuBiao Wang 691191a2f4 drm/amd/amdgpu:flush ttm delayed work before cancel_sync
[Why]
In some cases when we unload driver, warning call trace
will show up in vram_mgr_fini which claims that LRU is not empty, caused
by the ttm bo inside delay deleted queue.

[How]
We should flush delayed work to make sure the delay deleting is done.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:23:00 -04:00
Candice Li ce97f37be8 drm/amd: consolidate TA shared memory structures
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:53 -04:00
Hawking Zhang f2bd514d85 drm/amdgpu: increase max xgmi physical node for aldebaran
aldebaran supports up to 16 xgmi physical nodes.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:46 -04:00
Evan Quan 42447deb88 drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily
We have a S3 issue on that SKU with BACO enabled. Will bring back this
when that root caused.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:25 -04:00
Zhigang Luo 424f2b2e26 drm/amdgpu: correct MMSCH 1.0 version
MMSCH 1.0 doesn't have major/minor version, only verison.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:25 -04:00
Jonathan Kim 44357a1bd5 drm/amdgpu: get extended xgmi topology data
The TA has a limit to the amount of data that can be retrieved from
GET_TOPOLOGY.  For setups that exceed this limit, the xGMI topology
needs to be re-initialized and data needs to be re-fetched from the
extended link records by setting a flag in the shared command buffer.

The number of hops and the number of links must be accumulated by the
driver. Other data points are all fetched from the first request.
Because the TA has already exceeded its link record limit, it
cannot hold bidirectional information.  Otherwise the driver would
have to do more than two fetches so the driver has to reflect the
topology information in the opposite direction.

v2: squashed with internal reviewed fix

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:22:24 -04:00
Evan Quan 0d8318e112 drm/amd/pm: drop the unnecessary intermediate percent-based transition
Currently, the readout of fan speed pwm is transited into percent-based
and then pwm-based. However, the transition into percent-based is totally
unnecessary and make the final output less accurate.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Candice Li 893cf382c0 drm/amd/amdgpu: remove unnecessary RAS context field
Delete ras_if->name in the RAS ctx structure and remove related lines.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:55 -04:00
Candice Li 6457205c07 drm/amd/amdgpu: consolidate PSP TA context
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:18:04 -04:00
Jiange Zhao 3e183e2fae drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.
When guest received FLR notification from host, it would
lock adapter into reset state. There will be no more
job submission and hardware access after that.

Then it should send a response to host that it has prepared
for host reset.

Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:17:57 -04:00
Jack Zhang c530b02f39 drm/amd/amdgpu embed hw_fence into amdgpu_job
Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job
and fence's lifetime, and simplify the design of gpu-scheduler.

How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the
legacy way to create a hw fence separately.
v2:
use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is
embedded in a job.
v3:
remove redundant variable ring in amdgpu_job
v4:
add tdr sequence support for this feature. Add a job_run_counter to
indicate whether this job is a resubmit job.
v5
add missing handling in amdgpu_fence_enable_signaling

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Jack Zhang <Jack.Zhang7@hotmail.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:16:58 -04:00
Dave Airlie 2819cf0e7d drm-misc-next for v5.15:
UAPI Changes:
 
 Cross-subsystem Changes:
 - Add lockdep_assert(once) helpers.
 
 Core Changes:
 - Add lockdep assert to drm_is_current_master_locked.
 - Fix typos in dma-buf documentation.
 - Mark drm irq midlayer as legacy only.
 - Fix GPF in udmabuf_create.
 - Rename member to correct value in drm_edid.h
 
 Driver Changes:
 - Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
 - Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
 - Assorted fixes to bridge/it66121, anx7625.
 - Add custom crtc_state to simple helpers, and use it to
   convert pll handling in mgag200 to atomic.
 - Convert drivers to use offset-adjusted framebuffer bo mappings.
 - Assorted small fixes and fix for a use-after-free in vmwgfx.
 - Convert remaining callers of non-legacy drivers to use linux irqs directly.
 - Small cleanup in ingenic.
 - Small fixes to virtio and ti-sn65dsi86.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmEVd7cACgkQ/lWMcqZw
 E8Pw4hAApFWMPThb1OGxAsozhfOAc+Nhd/XjClWrEeu+auowge0OJzjUTOpUci7o
 NrL5GWLPDoIXm5/8eLm6B9k+hPmsuLQUgriMMpPyulvgckSYiLKzFEygbKtilfsJ
 k+ucLYlZ93ADLIcUIaode5eeWIbtsiuXNdfpH91uLKT2Lo1k4azJTfLJl0t1TB78
 uAumaeLXirmqfyN8U/zgPSmaNyTgBlMjMrJG/0+NGaoiq3sKtC/vCU/4r2aEcsMd
 ZrYbAJph5ND/u449P4vk2mDNmpz6GE1i8y1zeq7OWq+EOFHaUv0M2v1ZGTl9NSCw
 CascPEb+Hzt4eMFEvpVn9Vz/p7NpVeT78733YkF6jO898pruDbJFz/1PFLUcWp5S
 jssvgYyhqTrLTCabg8OMrkww8S5on+qcRPVjrpaND2QJUrF58acrEQVCJmCf/XIg
 G2XXotZxv96upWj1n/bZz/KcNN7xc+J6z9SEZml/elz5er9msbL8OFEq9bxWxfKL
 xjDY0CwvRtkwnxkin6UpZHYlGGciY1j3pCMHiRpQNiHDQOfNE2pFbKHN0lWE7jiS
 wQaSMhxC8cjM1TpfHCeHmM522QS6oqJbPev9Z+2twO8KtfetbbBE3xe+Rm/Rf7eM
 vPRvHavWaRDlkq742qZqsiqP0aiqgr04i73Kw5mGLOCT/42uCuU=
 =G09r
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-08-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:
- Add lockdep_assert(once) helpers.

Core Changes:
- Add lockdep assert to drm_is_current_master_locked.
- Fix typos in dma-buf documentation.
- Mark drm irq midlayer as legacy only.
- Fix GPF in udmabuf_create.
- Rename member to correct value in drm_edid.h

Driver Changes:
- Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
- Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
- Assorted fixes to bridge/it66121, anx7625.
- Add custom crtc_state to simple helpers, and use it to
  convert pll handling in mgag200 to atomic.
- Convert drivers to use offset-adjusted framebuffer bo mappings.
- Assorted small fixes and fix for a use-after-free in vmwgfx.
- Convert remaining callers of non-legacy drivers to use linux irqs directly.
- Small cleanup in ingenic.
- Small fixes to virtio and ti-sn65dsi86.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1cf2d7fc-402d-1852-574a-21cbbd2eaebf@linux.intel.com
2021-08-16 12:57:33 +10:00
Marc Zyngier 66c6594b6d gpu: Bulk conversion to generic_handle_domain_irq()
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-12 11:39:40 +01:00
Tuo Li a211260c34 gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()
The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
  DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
       addr, *val);

Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
  val &= ~amdgpu_connector->router.ddc_mux_control_pin;

To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-11 17:19:54 -04:00
Mukul Joshi b53ef0df1b drm/amdkfd: CWSR with software scheduler
This patch adds support to program trap handler settings
when loading driver with software scheduler (sched_policy=2).

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-11 17:19:54 -04:00
Thomas Zimmermann 450d61794d drm/amdgpu: Convert to Linux IRQ interfaces
Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
don't benefit from using it.

DRM IRQ callbacks are now being called directly or inlined.

The interrupt number returned by pci_msi_vector() is now stored
in struct amdgpu_irq. Calls to pci_msi_vector() can fail and return
a negative errno code. Abort initlaizaton in thi case. The DRM IRQ
midlayer does not handle this correctly.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210803090704.32152-2-tzimmermann@suse.de
2021-08-10 20:00:44 +02:00
Alex Deucher 7cbe08a930 drm/amdgpu: handle VCN instances when harvesting (v2)
There may be multiple instances and only one is harvested.

v2: fix typo in commit message

Fixes: 83a0b86391 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-10 10:38:10 -04:00
Alex Deucher 59066d0083 drm/amdgpu: handle VCN instances when harvesting (v2)
There may be multiple instances and only one is harvested.

v2: fix typo in commit message

Fixes: 83a0b86391 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-10 10:34:25 -04:00
Sergio Miguéns Iglesias 6940db0fd1 drm/amdgpu: Removed unnecessary if statement
There was an "if" statement that did nothing so it was removed.

Signed-off-by: Sergio Miguéns Iglesias <sergio@lony.xyz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-09 15:44:50 -04:00
Randy Dunlap 7b42552be6 drm/amdgpu: fix kernel-doc warnings on non-kernel-doc comments
Don't use "begin kernel-doc notation" (/**) for comments that are
not kernel-doc. This eliminates warnings reported by the 0day bot.

drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:89: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
    * This shader is used to clear VGPRS and LDS, and also write the input
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:209: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
    * The below shaders are used to clear SGPRS, and also write the input
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:301: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
    * This shader is used to clear the uninitiated sgprs after the above

Fixes: 0e0036c7d1 ("drm/amdgpu: fix no full coverage issue for gprs initialization")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-09 15:44:47 -04:00
YuBiao Wang e78b3197db drm/amd/amdgpu: skip locking delayed work if not initialized.
When init failed in early init stage, amdgpu_object has
not been initialized, so hasn't the ttm delayed queue functions.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-09 15:42:30 -04:00
Victor Zhao 124e8b1990 drm/amdgpu: Extend full access wait time in guest
- Extend wait time and add retry, currently 6s * 2times
- Change timing algorithm

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-09 15:42:10 -04:00
Dan Carpenter 420c81c84b drm/amdgpu: check for allocation failure in amdgpu_vkms_sw_init()
Check whether the kcalloc() fails and return -ENOMEM if it does.

Fixes: 84ec374bd5 ("drm/amdgpu: create amdgpu_vkms (v4)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-06 16:12:23 -04:00
Alex Deucher 202ead5a3c drm/amdgpu: don't enable baco on boco platforms in runpm
If the platform uses BOCO, don't use BACO in runtime suspend.
We could end up executing the BACO path if the platform supports
both.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-06 11:35:58 -04:00
John Clements 39932ef758 drm/amdgpu: set RAS EEPROM address from VBIOS
update to latest atombios fw table

[Backport to 5.14 - Alex]

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1670
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-06 11:33:50 -04:00
Tuo Li a204ea8c20 drm/amdgpu: drop redundant null-pointer checks in amdgpu_ttm_tt_populate() and amdgpu_ttm_tt_unpopulate()
The varialbe gtt in the function amdgpu_ttm_tt_populate() and
amdgpu_ttm_tt_unpopulate() is guaranteed to be not NULL in the context.
Thus the null-pointer checks are redundant and can be dropped.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:18:00 -04:00
Alex Deucher 11e612a093 drm/amdgpu: don't enable baco on boco platforms in runpm
If the platform uses BOCO, don't use BACO in runtime suspend.
We could end up executing the BACO path if the platform supports
both.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:18:00 -04:00
Joseph Greathouse 685967b3c1 drm/amdgpu: Put MODE register in wave debug info
Add the MODE register into the per-wave debug information.
This register holds state such as FP rounding and denorm
modes, which exceptions are enabled, and active clamping
modes.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:59 -04:00
John Clements 14fb496a84 drm/amdgpu: set RAS EEPROM address from VBIOS
update to latest atombios fw table

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:59 -04:00
Peng Ju Zhou 564e3dcf79 drm/amd/amdgpu: Recovery vcn instance iterate.
The previous logic is recording the amount of valid vcn instances
to use them on SRIOV, it is a hard task due to the vcn accessment is
based on the index of the vcn instance.

Check if the vcn instance enabled before do instance init.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:59 -04:00
John Clements 4b29652754 drm/amdgpu: added synchronization for psp cmd buf access
resolved race condition accessing psp cmd submission memory

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:59 -04:00
John Clements 9712ee0e44 drm/amdgpu: update PSP BL cmd IDs
resolved bug with incorrect PSP BL cmd IDs

Signed-off-by: John Clements <john.clements@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:59 -04:00
Chengming Gui a2e9b1666e drm/amdgpu: add DID for beige goby
Add device ids.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:59 -04:00
Candice Li 4fb9307154 drm/amd/amdgpu: remove redundant host to psp cmd buf allocations
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Ryan Taylor 733ee71ae0 drm/amdgpu: replace dce_virtual with amdgpu_vkms (v3)
Move dce_virtual into amdgpu_vkms and update all references to
dce_virtual with amdgpu_vkms.

v2: Removed more references to dce_virtual.

v3: Restored display modes from previous implementation.

Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Ryan Taylor fd922f7a0e drm/amdgpu: cleanup dce_virtual
Remove obsolete functions and variables from dce_virtual.

Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Ryan Taylor 84ec374bd5 drm/amdgpu: create amdgpu_vkms (v4)
Modify the VKMS driver into an api that dce_virtual can use to create
virtual displays that obey drm's atomic modesetting api.

v2: Made local functions static.

v3: Switched vkms_output kzalloc for kcalloc.
    Cleanup patches by moving display mode fixes to this patch.

v4: Update atomic_check and atomic_update to comply with new kms api.

Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
zhouchuangao d7b5dae099 gpu/drm/amd: Remove duplicated include of drm_drv.h
Duplicate include header file <drm/drm_drv.h>
line 28: #include <drm/drm_drv.h>
line 44: #include <drm/drm_drv.h>

Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Guchun Chen 067f44c8b4 drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
scheduler in s3 test, otherwise, fence related failure will arrive
after resume. To fix this and for a better clean up, move drm_sched_fini
from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and
should never be called in hw_fini.

v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
to keep sw_init and sw_fini paired.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668
Fixes: 8d35a25961 ("drm/amdgpu: adjust fence driver enable sequence")
Suggested-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Mukul Joshi 719e433ed0 drm/amdgpu: Fix channel_index table layout for Aldebaran
Fix the channel_index table layout to fetch the correct
channel_index when calculating physical address from
normalized address during page retirement.
Also, fix the number of UMC instances and number of channels
within each UMC instance for Aldebaran.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:58 -04:00
Randy Dunlap 8d70136e2d drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled
'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP
is set/enabled. OTOH, when both SUSPEND and HIBERNATION are not set,
PM_SLEEP is not set, so this variable cannot be used.

../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: In function ‘amdgpu_acpi_is_s0ix_active’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1046:11: error: ‘pm_suspend_target_state’ undeclared (first use in this function); did you mean ‘__KSYM_pm_suspend_target_state’?
    return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
           ^~~~~~~~~~~~~~~~~~~~~~~
           __KSYM_pm_suspend_target_state

Also use shorter IS_ENABLED(CONFIG_foo) notation for checking the
2 config symbols.

Fixes: 91e273712a ("drm/amdgpu: Check pmops for desired suspend state")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-next@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:17:57 -04:00
Chengming Gui e00f543d35 drm/amdgpu: add DID for beige goby
Add device ids.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-05 21:02:29 -04:00
Randy Dunlap 5706cb3c91 drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled
'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP
is set/enabled. OTOH, when both SUSPEND and HIBERNATION are not set,
PM_SLEEP is not set, so this variable cannot be used.

../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: In function ‘amdgpu_acpi_is_s0ix_active’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1046:11: error: ‘pm_suspend_target_state’ undeclared (first use in this function); did you mean ‘__KSYM_pm_suspend_target_state’?
    return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
           ^~~~~~~~~~~~~~~~~~~~~~~
           __KSYM_pm_suspend_target_state

Also use shorter IS_ENABLED(CONFIG_foo) notation for checking the
2 config symbols.

Fixes: 91e273712a ("drm/amdgpu: Check pmops for desired suspend state")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-next@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-08-05 21:02:29 -04:00
Yifan Zhang 198fbe15ce drm/amdgpu: fix the doorbell missing when in CGPG issue for renoir.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-02 17:21:25 -04:00
Eric Huang 3cd293a78a Revert "Revert "drm/amdkfd: Only apply TLB flush optimization on ALdebaran""
This reverts commit 53d0533049.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-02 17:21:25 -04:00
Eric Huang b1f21482af Revert "Revert "drm/amdgpu: Fix warning of Function parameter or member not described""
This reverts commit 4e7b93ca52.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-02 17:21:24 -04:00
Eric Huang fce1a7eb35 Revert "Revert "drm/amdkfd: Make TLB flush conditional on mapping""
This reverts commit 7ed9876c97.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-02 17:21:24 -04:00
Eric Huang cc6152ff4f Revert "Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update""
This reverts commit 024d8811c9.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-02 17:21:24 -04:00
xinhui pan de59865042 drm/amdgpu: Fix out-of-bounds read when update mapping
If one GTT BO has been evicted/swapped out, it should sit in CPU domain.
TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node
for sysMem.

Now when we update mapping for such invalidated BOs, we might walk out
of bounds of struct ttm_resource.

Three possible fix:
1) Let sysMem manager alloc struct ttm_range_mgr_node, like
ttm_range_manager does.
2) Pass pages_addr to update_mapping function too, but need memset
pages_addr[] to zero when unpopulate.
3) Init amdgpu_res_cursor directly.

bug is detected by kfence.
==================================================================
BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0

Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167):
 amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu]
 amdgpu_vm_bo_update+0x282/0xa40 [amdgpu]
 amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu]
 amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu]
 amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu]
 drm_ioctl_kernel+0xf3/0x180 [drm]
 drm_ioctl+0x2cb/0x550 [drm]
 amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]

kfence-#167 [0x000000008e11c055-0x000000001f676b3e
 ttm_sys_man_alloc+0x35/0x80 [ttm]
 ttm_resource_alloc+0x39/0x50 [ttm]
 ttm_bo_swapout+0x252/0x5a0 [ttm]
 ttm_device_swapout+0x107/0x180 [ttm]
 ttm_global_swapout+0x6f/0x130 [ttm]
 ttm_tt_populate+0xb1/0x2a0 [ttm]
 ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm]
 ttm_mem_evict_first+0x59d/0x9c0 [ttm]
 ttm_bo_mem_space+0x39f/0x400 [ttm]
 ttm_bo_validate+0x13c/0x340 [ttm]
 ttm_bo_init_reserved+0x269/0x540 [ttm]
 amdgpu_bo_create+0x1d1/0xa30 [amdgpu]
 amdgpu_bo_create_user+0x40/0x80 [amdgpu]
 amdgpu_gem_object_create+0x71/0xc0 [amdgpu]
 amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu]
 kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
 kfd_ioctl+0x461/0x690 [amdgpu]

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-02 16:53:17 -04:00
Yifan Zhang 1c0539a6fc drm/amdgpu: fix the doorbell missing when in CGPG issue for renoir.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-30 11:14:17 -04:00
xinhui pan 4d77f36f2c drm/amdgpu: Fix out-of-bounds read when update mapping
If one GTT BO has been evicted/swapped out, it should sit in CPU domain.
TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node
for sysMem.

Now when we update mapping for such invalidated BOs, we might walk out
of bounds of struct ttm_resource.

Three possible fix:
1) Let sysMem manager alloc struct ttm_range_mgr_node, like
ttm_range_manager does.
2) Pass pages_addr to update_mapping function too, but need memset
pages_addr[] to zero when unpopulate.
3) Init amdgpu_res_cursor directly.

bug is detected by kfence.
==================================================================
BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0

Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167):
 amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu]
 amdgpu_vm_bo_update+0x282/0xa40 [amdgpu]
 amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu]
 amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu]
 amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu]
 drm_ioctl_kernel+0xf3/0x180 [drm]
 drm_ioctl+0x2cb/0x550 [drm]
 amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]

kfence-#167 [0x000000008e11c055-0x000000001f676b3e
 ttm_sys_man_alloc+0x35/0x80 [ttm]
 ttm_resource_alloc+0x39/0x50 [ttm]
 ttm_bo_swapout+0x252/0x5a0 [ttm]
 ttm_device_swapout+0x107/0x180 [ttm]
 ttm_global_swapout+0x6f/0x130 [ttm]
 ttm_tt_populate+0xb1/0x2a0 [ttm]
 ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm]
 ttm_mem_evict_first+0x59d/0x9c0 [ttm]
 ttm_bo_mem_space+0x39f/0x400 [ttm]
 ttm_bo_validate+0x13c/0x340 [ttm]
 ttm_bo_init_reserved+0x269/0x540 [ttm]
 amdgpu_bo_create+0x1d1/0xa30 [amdgpu]
 amdgpu_bo_create_user+0x40/0x80 [amdgpu]
 amdgpu_gem_object_create+0x71/0xc0 [amdgpu]
 amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu]
 kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
 kfd_ioctl+0x461/0x690 [amdgpu]

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-30 11:13:52 -04:00
Dave Airlie 04d505de7f Merge tag 'amd-drm-next-5.15-2021-07-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-07-29:

amdgpu:
- VCN/JPEG power down sequencing fixes
- Various navi pcie link handling fixes
- Clockgating fixes
- Yellow Carp fixes
- Beige Goby fixes
- Misc code cleanups
- S0ix fixes
- SMU i2c bus rework
- EEPROM handling rework
- PSP ucode handling cleanup
- SMU error handling rework
- AMD HDMI freesync fixes
- USB PD firmware update rework
- MMIO based vram access rework
- Misc display fixes
- Backlight fixes
- Add initial Cyan Skillfish support
- Overclocking fixes suspend/resume

amdkfd:
- Sysfs leak fix
- Add counters for vm faults and migration
- GPUVM TLB optimizations

radeon:
- Misc fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210730033455.3852-1-alexander.deucher@amd.com
2021-07-30 16:48:35 +10:00
Huang Rui b8e42844b4 drm/amdgpu: enable psp front door loading by default for cyan_skillfish2
The function is ready on psp firmware, and enable it by default.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:15:44 -04:00
Likun Gao 8d35a25961 drm/amdgpu: adjust fence driver enable sequence
Fence driver was enabled per ring when sw init on per IP block before.
Change to enable all the fence driver at the same time after
amdgpu_device_ip_init finished.
Rename some function related to fence to make it reasonable for read.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:15:44 -04:00
John Clements edc8c81f24 drm/amdgpu: Added PSP13 BL loading support for additional drivers
Added BL loading support for soc/intf/dbg drivers

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:15:44 -04:00
John Clements 8abadab37f drm/amdgpu: Consolidated PSP13 BL FW loading
Remove duplicate code

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:15:43 -04:00
John Clements 6ff34fd690 drm/amdgpu: Added support for added psp driver binaries FW
Detect psp driver binaries packed into FW and try to load the FW

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:15:35 -04:00
John Clements f8e487ce83 drm/amdgpu: Added latest PSP FW header
Improved handling for scalling PSP FW binaries

Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:10:33 -04:00
Huang Rui b84d029d9f drm/amdgpu: remove the access of xxx_PSP_DEBUG on cycan_skillfish
It won't need to clear the xxx_PSP_DEBUG registers, because firmware
will handle this change.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:10:26 -04:00
Alex Deucher 7fd13baeb7 drm/amdgpu/display: add support for multiple backlights
On platforms that support multiple backlights, register
each one separately.  This lets us manage them independently
rather than registering a single backlight and applying the
same settings to both.

v2: fix typo:
Reported-by: kernel test robot <lkp@intel.com>

Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 22:10:15 -04:00
Eric Huang 3b2b254425 Revert "Revert "drm/amdgpu: Fix warning of Function parameter or member not described""
This reverts commit 4e7b93ca52.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 16:37:18 -04:00
Eric Huang 8f0e2d5c99 Revert "Revert "drm/amdkfd: Make TLB flush conditional on mapping""
This reverts commit 7ed9876c97.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 16:37:18 -04:00
Eric Huang e9949dd791 Revert "Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update""
This reverts commit 024d8811c9.

Revert reason: The issue has been resolved.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-28 16:37:17 -04:00
Pratik Vishwakarma 91e273712a drm/amdgpu: Check pmops for desired suspend state
[Why]
User might change the suspend behaviour from OS.

[How]
Check with pm for target suspend state and set s0ix
flag only for s2idle state.

v2: User might change default suspend state, use target state
v3: squash in build fix

Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-27 12:42:17 -04:00
Pratik Vishwakarma d0260f62ee drm/amdgpu: Rename amdgpu_acpi_is_s0ix_supported
Rename amdgpu_acpi_is_s0ix_supported to better explain
functionality by renaming to amdgpu_acpi_is_s0ix_active

Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-27 12:04:26 -04:00
Pratik Vishwakarma 91b03fc6b5 drm/amdgpu: Check pmops for desired suspend state
[Why]
User might change the suspend behaviour from OS.

[How]
Check with pm for target suspend state and set s0ix
flag only for s2idle state.

v2: User might change default suspend state, use target state
v3: squash in build fix

Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-27 12:04:19 -04:00
Jiri Kosina 6aade587d3 drm/amdgpu: Avoid printing of stack contents on firmware load error
In case when psp_init_asd_microcode() fails to load ASD microcode file,
psp_v12_0_init_microcode() tries to print the firmware filename that
failed to load before bailing out.

This is wrong because:

- the firmware filename it would want it print is an incorrect one as
  psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading
  different filenames
- it tries to print fw_name, but that's not yet been initialized by that
  time, so it prints random stack contents, e.g.

    amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2
    amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode
    amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff"

Fix that by bailing out immediately, instead of priting the bogus error
message.

Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-26 12:43:03 -04:00
Jiri Kosina d47255d3f8 drm/amdgpu: Fix resource leak on probe error path
This reverts commit 4192f7b576.

It is not true (as stated in the reverted commit changelog) that we never
unmap the BAR on failure; it actually does happen properly on
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
amdgpu_device_fini() error path.

What's worse, this commit actually completely breaks resource freeing on
probe failure (like e.g. failure to load microcode), as
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
early, leaving all the resources that'd normally be freed in
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
to all sorts of oopses when someone tries to, for example, access the
sysfs and procfs resources which are still around while the driver is
gone.

Fixes: 4192f7b576 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-26 12:42:49 -04:00
Dave Airlie 35482f9dc5 Linux 5.14-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmD95yIeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGqp0H/j/xHL20EHaUJOaV
 iJjnfGyjtnkLC5FCoV/q/v9sFuSW2p4W1nyF8/eIgVKObef94Mg4/xxaHQrWIM56
 cbzK9aIcD9InAuImJ6lju4fqjNmFrt2x7mhfzjPKqmhfINfZ5CohpLFN5XdOwzYC
 l+ZgmUUl7GLDAND2M6rtkc7AOk4qTyAySDvvPFELE/uNgV4EKaENSIWofHhEzW5v
 Yk+4agawaFTfa6H9+uMVYZBOcEKwheQ0E2tcOJvHJT8Mwm8MFoC/B7fLY5zxIdN2
 7A7r/7qbSQmSDSjOgwKS4ZOjom0xGSD+V+596SzET6jkbahR2HJ/mrFvmD7GNEoW
 OWJPjzI=
 =vzIM
 -----END PGP SIGNATURE-----

Backmerge tag 'v5.14-rc3' into drm-next

Linux 5.14-rc3

Daniel said we should pull the nouveau fix from fixes in here, probably
a good plan.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-26 09:27:59 +10:00
Hawking Zhang bdb99dbe3e drm/amdgpu: retire sdma v5_2 golden settings from driver
They are initalized by hardware during power up phase,
starting from sdma v5_2 generation

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:09:40 -04:00
Chengzhe Liu 61a6813f3f drm/amdgpu: Add msix restore for pass-through mode
In pass-through mode, after mode 1 reset, msix enablement status would
lost and never receives interrupt again. So, we should restore msix
status after mode 1 reset.

Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:09:40 -04:00
Roy Sun fe6b1032b2 drm/amdgpu: Change the imprecise output
The fail reason is that the vfgate is disabled

Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:09:40 -04:00
Chengzhe Liu 1bece222ea drm/amdgpu: Clear doorbell interrupt status for Sienna Cichlid
On Sienna Cichlid, in pass-through mode, if we unload the driver in BACO
mode(RTPM), then the kernel would receive thousands of interrupts.
That's because there is doorbell monitor interrupt on BIF, so KVM keeps
injecting interrupts to the guest VM. So we should clear the doorbell
interrupt status after BACO exit.

v2: Modify coding style and commit message

Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:09:40 -04:00
Tao Zhou a8f706966b drm/amdgpu: add pci device id for cyan_skillfish
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:02 -04:00
Lang Yu 7fd74ad880 drm/amdgpu: add autoload_supported check for RLC autoload
Asic cyan_skilfish2 won't support RLC autoload when using
front door loading. We just use PSP to load firmware like
gfx9 here.

So add autoload_supported flag check instead of just
checking firmware load type for RLC autoload.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:02 -04:00
Lang Yu 641df09904 drm/amdgpu: enable SMU for cyan_skilfish
Enable SMU support for cyan_skilfish.

v2: Squash in fix (Alex)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:02 -04:00
Lang Yu c5d0aa482e drm/amdgpu: use direct loading by default for cyan_skillfish2
Will switch to front door loading by default after this function is
stable.

v2: use APU flags (Alex)

Signed-off-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Lang Yu 1c7916af55 drm/amdgpu: enable psp v11.0.8 for cyan_skillfish
Add psp v11.0.8 to ip block initialization.

v2: use APU flags (Alex)

Signed-off-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Lang Yu 3188fd0752 drm/amdgpu: init psp v11.0.8 function for cyan_skillfish
Add psp v11.0.8 function into psp driver.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Lang Yu e330a68f30 drm/amdgpu: add psp v11.0.8 driver for cyan_skillfish
Introduce the psp v11.0.8 driver for cyan_skillfish.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou 338b3cf0b9 drm/amdgpu: add nbio support for cyan_skillfish
nbio version is 2.3.

v2: Make it more explicit (Alex)

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou b515937b41 drm/amdgpu: add chip early init for cyan_skillfish
Set cg/pg flags and rev id for cyan_skillfish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou d9393f9b68 drm/amdgpu: add gc v10 golden settings for cyan_skillfish
v2: squash in updates from Ray

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou 86491ff7c6 drm/amdgpu: add sdma v5 golden settings for cyan_skillfish
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou 9724bb6621 drm/amdgpu: add cyan_skillfish support in gfx v10
Add gfx support for cyan_skillfish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou 9dbd8a1251 drm/amdgpu: add cyan_skillfish support in gmc v10
Add gmc support for cyan_skillfish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou d594e3cc19 drm/amdgpu: load fw direclty for cyan_skillfish
Use backdoor loading.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou bf4759a81b drm/amdgpu: add sdma fw loading support for cyan_skillfish
Same as Navi10.

v2: squash in updates (Alex)

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou 621312a2ac drm/amdgpu: add cp/rlc fw loading support for cyan_skillfish
Add cp/rlc fw loading support and gfx golden setting.

v2: squash in updates (Alex)

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou f36fb5a0e3 drm/amdgpu: set ip blocks for cyan_skillfish
Add ip blocks for cyan_skillfish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:01 -04:00
Tao Zhou 6e80eacd9c drm/amdgpu: init family name for cyan_skillfish
Use FAMILY_NV for cyan_skillfish.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Tao Zhou 708391977b drm/amdgpu: dynamic initialize ip offset for cyan_skillfish
Add ip offset definition for cyan_skillfish and initialize it.

v2: squash in ip_offset updates (Alex)

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Tao Zhou d0f56dc25a drm/amdgpu: add cyan_skillfish asic type
Add cyan_skillfish asic family.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Lang Yu 30ebc16aac drm/amdgpu: adjust fw_name string length for toc
Adjust toc fw_name string length to PSP_FW_NAME_LEN.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Tao Zhou 5ccde01b50 drm/amdgpu: increase size for sdma fw name string
Longer firmware name needs more space.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Aaron Liu 69b30d80ef drm/amdgpu: add yellow carp pci id (v2)
Add Yellow Carp PCI id support.

v2: add another DID

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Aaron Liu e97c8d8677 drm/amdgpu: update yellow carp external rev_id handling
0x1681 has a different external revision id.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Kai-Heng Feng aff890288d drm/amdgpu/acp: Make PM domain really work
Devices created by mfd_add_hotplug_devices() don't really increase the
index of its name, so get_mfd_cell_dev() cannot find any device, hence a
NULL dev is passed to pm_genpd_add_device():
[   56.974926] (NULL device *): amdgpu: device acp_audio_dma.0.auto added to pm domain
[   56.974933] (NULL device *): amdgpu: Failed to add dev to genpd
[   56.974941] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <acp_ip> failed -22
[   56.975810] amdgpu 0000:00:01.0: amdgpu: amdgpu_device_ip_init failed
[   56.975839] amdgpu 0000:00:01.0: amdgpu: Fatal error during GPU init
[   56.977136] ------------[ cut here ]------------
[   56.977143] kernel BUG at mm/slub.c:4206!
[   56.977158] invalid opcode: 0000 [#1] SMP NOPTI
[   56.977167] CPU: 1 PID: 1648 Comm: modprobe Not tainted 5.12.0-051200rc8-generic #202104182230
[   56.977175] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./FM2A68M-HD+, BIOS P5.20 02/13/2019
[   56.977180] RIP: 0010:kfree+0x3bf/0x410
[   56.977195] Code: 89 e7 48 d3 e2 f7 da e8 5f 0d 02 00 80 e7 02 75 3e 44 89 ee 4c 89 e7 e8 ef 5f fd ff e9 fa fe ff ff 49 8b 44 24 08 a8 01 75 b7 <0f> 0b 4c 8b 4d b0 48 8b 4d a8 48 89 da 4c 89 e6 41 b8 01 00 00 00
[   56.977202] RSP: 0018:ffffa48640ff79f0 EFLAGS: 00010246
[   56.977210] RAX: 0000000000000000 RBX: ffff9286127d5608 RCX: 0000000000000000
[   56.977215] RDX: 0000000000000000 RSI: ffffffffc099d0fb RDI: ffff9286127d5608
[   56.977220] RBP: ffffa48640ff7a48 R08: 0000000000000001 R09: 0000000000000001
[   56.977224] R10: 0000000000000000 R11: ffff9286087d8458 R12: fffff3ae0449f540
[   56.977229] R13: 0000000000000000 R14: dead000000000122 R15: dead000000000100
[   56.977234] FS:  00007f9de5929540(0000) GS:ffff928612e80000(0000) knlGS:0000000000000000
[   56.977240] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   56.977245] CR2: 00007f697dd97160 CR3: 00000001110f0000 CR4: 00000000001506e0
[   56.977251] Call Trace:
[   56.977261]  amdgpu_dm_encoder_destroy+0x1b/0x30 [amdgpu]
[   56.978056]  drm_mode_config_cleanup+0x4f/0x2e0 [drm]
[   56.978147]  ? kfree+0x3dd/0x410
[   56.978157]  ? drm_managed_release+0xc8/0x100 [drm]
[   56.978232]  drm_mode_config_init_release+0xe/0x10 [drm]
[   56.978311]  drm_managed_release+0x9d/0x100 [drm]
[   56.978388]  devm_drm_dev_init_release+0x4d/0x70 [drm]
[   56.978450]  devm_action_release+0x15/0x20
[   56.978459]  release_nodes+0x77/0xc0
[   56.978469]  devres_release_all+0x3f/0x50
[   56.978477]  really_probe+0x245/0x460
[   56.978485]  driver_probe_device+0xe9/0x160
[   56.978492]  device_driver_attach+0xab/0xb0
[   56.978499]  __driver_attach+0x8f/0x150
[   56.978506]  ? device_driver_attach+0xb0/0xb0
[   56.978513]  bus_for_each_dev+0x7e/0xc0
[   56.978521]  driver_attach+0x1e/0x20
[   56.978528]  bus_add_driver+0x135/0x1f0
[   56.978534]  driver_register+0x91/0xf0
[   56.978540]  __pci_register_driver+0x54/0x60
[   56.978549]  amdgpu_init+0x77/0x1000 [amdgpu]
[   56.979246]  ? 0xffffffffc0dbc000
[   56.979254]  do_one_initcall+0x48/0x1d0
[   56.979265]  ? kmem_cache_alloc_trace+0x120/0x230
[   56.979274]  ? do_init_module+0x28/0x280
[   56.979282]  do_init_module+0x62/0x280
[   56.979288]  load_module+0x71c/0x7a0
[   56.979296]  __do_sys_finit_module+0xc2/0x120
[   56.979305]  __x64_sys_finit_module+0x1a/0x20
[   56.979311]  do_syscall_64+0x38/0x90
[   56.979319]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   56.979328] RIP: 0033:0x7f9de54f989d
[   56.979335] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 f5 0c 00 f7 d8 64 89 01 48
[   56.979342] RSP: 002b:00007ffe3c395a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   56.979350] RAX: ffffffffffffffda RBX: 0000560df3ef4330 RCX: 00007f9de54f989d
[   56.979355] RDX: 0000000000000000 RSI: 0000560df3a07358 RDI: 000000000000000f
[   56.979360] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
[   56.979365] R10: 000000000000000f R11: 0000000000000246 R12: 0000560df3a07358
[   56.979369] R13: 0000000000000000 R14: 0000560df3ef4460 R15: 0000560df3ef4330
[   56.979377] Modules linked in: amdgpu(+) iommu_v2 gpu_sched drm_ttm_helper ttm drm_kms_helper cec rc_core i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt nft_counter xt_tcpudp ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables libcrc32c nfnetlink ip6_tables iptable_filter bpfilter input_leds binfmt_misc edac_mce_amd kvm_amd ccp kvm snd_hda_codec_realtek snd_hda_codec_generic crct10dif_pclmul snd_hda_codec_hdmi ledtrig_audio ghash_clmulni_intel aesni_intel snd_hda_intel snd_intel_dspcfg snd_seq_midi crypto_simd snd_intel_sdw_acpi cryptd snd_hda_codec snd_seq_midi_event snd_rawmidi snd_hda_core snd_hwdep snd_seq fam15h_power k10temp snd_pcm snd_seq_device snd_timer snd mac_hid soundcore sch_fq_codel nct6775 hwmon_vid drm ip_tables x_tables autofs4 dm_mirror dm_region_hash dm_log hid_generic usbhid hid uas usb_storage r8169 crc32_pclmul realtek ahci xhci_pci i2c_piix4
[   56.979521]  xhci_pci_renesas libahci video
[   56.979541] ---[ end trace cb8f6a346f18da7b ]---

Instead of finding MFD hotplugged device by its name, simply iterate
over the child devices to avoid the issue.

Squash in unused variable removal (Alex)

BugLink: https://bugs.launchpad.net/bugs/1920674
Fixes: 25030321ba ("drm/amd: add pm domain for ACP IP sub blocks")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Candice Li 222e0a71c2 drm/amd/amdgpu: add consistent PSP FW loading size checking
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Jingwen Chen ff99849b00 drm/amd/amdgpu: consider kernel job always not guilty
[Why]
Currently all timedout job will be considered to be guilty. In SRIOV
multi-vf use case, the vf flr happens first and then job time out is
found. There can be several jobs timeout during a very small time slice.
And if the innocent sdma job time out is found before the real bad
job, then the innocent sdma job will be set to guilty. This will lead
to a page fault after resubmitting job.

[How]
If the job is a kernel job, we will always consider it not guilty

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Graham Sider 410e302ea5 drm/amdkfd: Update SMI throttle event bitmask
Update Arcturus/Aldebaran thermal throttle SMI event path to use
ASIC-independent throttler bits when logging.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:08:00 -04:00
Oak Zeng cd5955f401 drm/amdgpu: Change a few function names
Function name "psp_np_fw_load" is not proper as people don't
know _np_fw_ means "non psp firmware". Change the function
name to psp_load_non_psp_fw for better understanding. Same
thing for function psp_execute_np_fw_load.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Oak Zeng 95f71f12aa drm/amdgpu: Fix a printing message
The printing message "PSP loading VCN firmware" is mis-leading because
people might think driver is loading VCN firmware. Actually when this
message is printed, driver is just preparing some VCN ucode, not loading
VCN firmware yet. The actual VCN firmware loading will be in the PSP block
hw_init. Fix the printing message

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Roy Sun 4067cdb1cf drm/amdgpu: Add error message when programing registers fails
Squash in warning fix (Alex)

Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Zhou pengju <pengju.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Roy Sun 1a4772d922 drm/amdgpu: Change the imprecise function name
The callback functions are used for SRIOV read/write instead
of just for rlcg read/write

Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Zhou pengju <pengju.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Veerabadhran Gopalakrishnan f72ac40941 drm/amdgpu - Corrected the video codecs array name for yellow carp
Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Jonathan Kim 9330481038 drm/amdkfd: report pcie bandwidth to the kfd
Similar to xGMI reporting the min/max bandwidth between direct peers, PCIe
will report the min/max bandwidth to the KFD.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Jonathan Kim 3f46c4e9ce drm/amdkfd: report xgmi bandwidth between direct peers to the kfd
Report the min/max bandwidth in megabytes to the kfd for direct
xgmi connections only.  Indirect peers will report 0 since
indirect route is unknown.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Jonathan Kim 331e78187f drm/amdgpu: add psp command to get num xgmi links between direct peers
The TA can now be invoked to provide the number of xgmi links connecting
a direct source and destination peer.
Non-direct peers will report zero links.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Anson Jacob d8c33180c0 drm/amdgpu: Fix documentaion for amdgpu_bo_add_to_shadow_list
make htmldocs complaints about parameter for amdgpu_bo_add_to_shadow_list

./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Function parameter or member 'vmbo' not described in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'

Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23 10:07:59 -04:00
Dave Airlie 8da49a33dd drm-misc-next for v5.15-rc1:
UAPI Changes:
 - Remove sysfs stats for dma-buf attachments, as it causes a performance regression.
   Previous merge is not in a rc kernel yet, so no userspace regression possible.
 
 Cross-subsystem Changes:
 - Sanitize user input in kyro's viewport ioctl.
 - Use refcount_t in fb_info->count
 - Assorted fixes to dma-buf.
 - Extend x86 efifb handling to all archs.
 - Fix neofb divide by 0.
 - Document corpro,gm7123 bridge dt bindings.
 
 Core Changes:
 - Slightly rework drm master handling.
 - Cleanup vgaarb handling.
 - Assorted fixes.
 
 Driver Changes:
 - Add support for ws2401 panel.
 - Assorted fixes to stm, ast, bochs.
 - Demidlayer ingenic irq.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmD5TGAACgkQ/lWMcqZw
 E8PNgxAApjTYQSfjIBbOZnNraxW6w7/bPea35E9A47EdBQsNGnYftNsFjbrn/mCJ
 D+0eRLjCMlg4FF1SHdh9cPJ35py+ygbDeupogboLITfU99eGBth3fM2Xdg9LPcBh
 dbni/JLG9R7gIvSlqdJuweN21trfVrV/9FQEilG5DvQcl27Wx5g8VMRZke1EqGKX
 7Id09Uq50ky18vhDjQRCveYhRqJAxV+XozBatzHyxpDVzjLQvRhlAAYdvrSMHZ5R
 jreGzOfR8awc6Om+w7wx3Jn1oEGmXVZB/VqxEqGtMOr3lpARPucxrqfHsqpam3rv
 yIoEKPrkG+k6fsU7Tbg59jNqe/PbCUW3AlpyuBxf55EbnVGgjLDbq4sRRMkehPfA
 fhC31ujOXQQnAgaxyeQAaAJFKNFJzA8Cq5ZPfG+zztzuomHCiUVQBRowP65hJMzR
 +ZlEDnhUD3STLz39zuO1reZR1ZoPIvKbsokHAA+ZrIwUd6U3D3ia8V51pq+lL5aS
 TGDkyMN9jyZ+SO8Z7+2FnJAv9FAOPU/WCLU/fWW46jAvuezwMIwVcjfSqDU2XbZD
 e7KgHpHhx3BGxI8TThHKlY7mf6IL2Bm7X1Cv1pdZs/eEn3Udh2ax942uTQZu/YOO
 0AT1XchpvYCBNRw05bVI3OlJ+w3I8uV+h+11jHOKeY6cbwdHeKE=
 =BUya
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-07-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.15-rc1:

UAPI Changes:
- Remove sysfs stats for dma-buf attachments, as it causes a performance regression.
  Previous merge is not in a rc kernel yet, so no userspace regression possible.

Cross-subsystem Changes:
- Sanitize user input in kyro's viewport ioctl.
- Use refcount_t in fb_info->count
- Assorted fixes to dma-buf.
- Extend x86 efifb handling to all archs.
- Fix neofb divide by 0.
- Document corpro,gm7123 bridge dt bindings.

Core Changes:
- Slightly rework drm master handling.
- Cleanup vgaarb handling.
- Assorted fixes.

Driver Changes:
- Add support for ws2401 panel.
- Assorted fixes to stm, ast, bochs.
- Demidlayer ingenic irq.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2d0d2fe8-01fc-e216-c3fd-38db9e69944e@linux.intel.com
2021-07-23 11:32:43 +10:00
Dave Airlie 2e41a6696b Short summary of fixes pull:
* Return -ENOTTY for non-DRM ioctls
  * amdgpu: Fix COW checks
  * nouveau: init BO GME fields
  * panel: Avoid double free
  * ttm: Fix refcounting in ttm_global_init(); NULL checks
  * vc4: Fix interrupt handling
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmD5W0UACgkQaA3BHVML
 eiPH0AgAs9RuJzXPdSz4r6zkGQ1q2hGYhcev/BmV0HSxZ6X6YKbeYZqnWhwwARqc
 U/HdlVwSKVIDl9/izTDZYgTMf8zyDx+ZisP51FAccZP7bC0N9VgfXaUlQaMLrZIa
 JdKFgQNXWcaWAcMrdL4tSFKoUXWjsncvC6UrzV9I0bVn5CoWXE87M2Swk2f9J08/
 kMcAXQclWOgP8ul251YRD3PvSZXZ6c4E1dM8xbELMz4lhSDuijCkb5Bb8peoSHD1
 NbFVrbVy/3/onr/+GHGAcC15wmdzpBKPxnYmUNynfpAO/zOze/xhCEAZWJVE9GOt
 rX7+RrHDtWMStXQyoRlH7IU/rdpp7A==
 =G0lh
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2021-07-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * Return -ENOTTY for non-DRM ioctls
 * amdgpu: Fix COW checks
 * nouveau: init BO GME fields
 * panel: Avoid double free
 * ttm: Fix refcounting in ttm_global_init(); NULL checks
 * vc4: Fix interrupt handling

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YPlbkmH6S4VAHP9j@linux-uq9g.fritz.box
2021-07-23 11:17:03 +10:00
Veerabadhran Gopalakrishnan d80cded9cc drm/amdgpu - Corrected the video codecs array name for yellow carp
Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-21 17:47:28 -04:00
Aaron Liu 27f5355f5d drm/amdgpu: add yellow carp pci id (v2)
Add Yellow Carp PCI id support.

v2: add another DID

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-21 14:37:43 -04:00
Aaron Liu ab7a11bd36 drm/amdgpu: update yellow carp external rev_id handling
0x1681 has a different external revision id.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-21 14:37:33 -04:00
Christoph Hellwig bf44e8cecc vgaarb: don't pass a cookie to vga_client_register
The VGA arbitration is entirely based on pci_dev structures, so just pass
that back to the set_vga_decode callback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210716061634.2446357-8-hch@lst.de
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-21 10:29:10 +02:00
Christoph Hellwig f6b1772b25 vgaarb: remove the unused irq_set_state argument to vga_client_register
All callers pass NULL as the irq_set_state argument, so remove it and
the ->irq_set_state member in struct vga_device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210716061634.2446357-7-hch@lst.de
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-21 10:29:05 +02:00
Christoph Hellwig b877947586 vgaarb: provide a vga_client_unregister wrapper
Add a trivial wrapper for the unregister case that sets all fields to
NULL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210716061634.2446357-6-hch@lst.de
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-21 10:29:00 +02:00
Dave Airlie 588b3eee52 drm-misc-next for v5.15:
UAPI Changes:
 
 Cross-subsystem Changes:
 - udmabuf: Add support for mapping hugepages
 - Add dma-buf stats to sysfs.
 - Assorted fixes to fbdev/omap2.
 - dma-buf: Document DMA_BUF_IOCTL_SYNC
 - Improve dma-buf non-dynamic exporter expectations better.
 - Add module parameters for dma-buf size and list limit.
 - Add HDMI codec support to vc4, to replace vc4's own codec.
 - Document dma-buf implicit fencing rules.
 - dma_resv_test_signaled test_all handling.
 
 Core Changes:
 - Extract i915's eDP backlight code into DRM helpers.
 - Assorted docbook updates.
 - Rework drm_dp_aux documentation.
 - Add support for the DP aux bus.
 - Shrink dma-fence-chain slightly.
 - Add alloc/free helpers for dma-fence-chain.
 - Assorted fixes to TTM., drm/of, bridge
 - drm_gem_plane_helper_prepare/cleanup_fb is now the default for gem drivers.
 - Small fix for scheduler completion.
 - Remove use of drm_device.irq_enabled.
 - Print the driver name to dmesg when registering framebuffer.
 - Export drm/gem's shadow plane handling, and use it in vkms.
 - Assorted small fixes.
 
 Driver Changes:
 - Add eDP backlight to nouveau.
 - Assorted fixes and cleanups to nouveau, panfrost, vmwgfx, anx7625,
   amdgpu, gma500, radeon, mgag200, vgem, vc4, vkms, omapdrm.
 - Add support for Samsung DB7430, Samsung ATNA33XC20, EDT ETMV570G2DHU,
   EDT ETM0350G0DH6, Innolux EJ030NA panels.
 - Fix some simple pannels missing bus_format and connector types.
 - Add mks-guest-stats instrumentation support to vmwgfx.
 - Merge i915-ttm topic branch.
 - Make s6e63m0 panel use Mipi-DBI helpers.
 - Add detect() supoprt for AST.
 - Use interrupts for hotplug on vc4.
 - vmwgfx is now moved to drm-misc-next, as sroland is no longer a maintainer for now.
 - vmwgfx now uses copies of vmware's internal device headers.
 - Slowly convert ti-sn65dsi83 over to atomic.
 - Rework amdgpu dma-resv handling.
 - Fix virtio fencing for planes.
 - Ensure amdgpu can always evict to SYSTEM.
 - Many drivers fixed for implicit fencing rules.
 - Set default prepare/cleanup fb for tiny, vram and simple helpers too.
 - Rework panfrost gpu reset and related serialization.
 - Update VKMS todo list.
 - Make bochs a tiny gpu driver, and use vram helper.
 - Use linux irq interfaces instead of drm_irq in some drivers.
 - Add support for Raspberry Pi Pico to GUD.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmDxaBwACgkQ/lWMcqZw
 E8PBYRAAsZgmuQU1urEsDTL931jWoJ8zxHpxSLow8ZtplembyhloGeRXRmGT8erd
 ocw1wAzm0UajbFLvv50XW5N4jPnsn9IBRQVhfNNc06g4OH6qy17PPAA+clHaBJrf
 BFiAcK4rzmUet3+6335ko/OvkD5er0s7ipNljxgB7FkIwP3gh3NEFG0yFcpFpxF4
 fzT5Wz5vMW++XUCXZHMX+vBMjFP2AosxLVvsnxpM/48dyFWTiYRg7jhy5bICKYBM
 3GdRj2e1wm3cAsZISbqtDpXSlstIw6u0w+BB6ryQvD/K5nPTqydE/YMOB85DUWLg
 Sp1tijxM/KtOyC5w/IpDLkf9X24KAIcu0eKffUGbkLvIkP5cSyibelOtZBG6Jmln
 AubXpgi4+mGVyYvMEVngHyrY2tW/rtpNGr/g9To9hYVHKkdRZUtolQk7KgtdV7v3
 pFq60AilYTENJthkjCRoTi66BsocpaJfQOyppp6uD8/a0Spxfrq5tM+POWNylqxB
 70L2ObvM4Xx51GI0ziCZQwkMp2Uzwosr+6CdbrzQKaxxpbQEcr3frkv6cap5V0WY
 lnYgFw3dbA/Ga6YsnInQ87KmF4svnaWB2z/KzfnBF5pNrwoR9/4K5k7Vfb3P9YyN
 w+nrfeHto0r768PjC/05uyD9diDuHOw3RHtljf/C4klBNRDDovU=
 =x8Eo
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-07-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.15:

UAPI Changes:

Cross-subsystem Changes:
- udmabuf: Add support for mapping hugepages
- Add dma-buf stats to sysfs.
- Assorted fixes to fbdev/omap2.
- dma-buf: Document DMA_BUF_IOCTL_SYNC
- Improve dma-buf non-dynamic exporter expectations better.
- Add module parameters for dma-buf size and list limit.
- Add HDMI codec support to vc4, to replace vc4's own codec.
- Document dma-buf implicit fencing rules.
- dma_resv_test_signaled test_all handling.

Core Changes:
- Extract i915's eDP backlight code into DRM helpers.
- Assorted docbook updates.
- Rework drm_dp_aux documentation.
- Add support for the DP aux bus.
- Shrink dma-fence-chain slightly.
- Add alloc/free helpers for dma-fence-chain.
- Assorted fixes to TTM., drm/of, bridge
- drm_gem_plane_helper_prepare/cleanup_fb is now the default for gem drivers.
- Small fix for scheduler completion.
- Remove use of drm_device.irq_enabled.
- Print the driver name to dmesg when registering framebuffer.
- Export drm/gem's shadow plane handling, and use it in vkms.
- Assorted small fixes.

Driver Changes:
- Add eDP backlight to nouveau.
- Assorted fixes and cleanups to nouveau, panfrost, vmwgfx, anx7625,
  amdgpu, gma500, radeon, mgag200, vgem, vc4, vkms, omapdrm.
- Add support for Samsung DB7430, Samsung ATNA33XC20, EDT ETMV570G2DHU,
  EDT ETM0350G0DH6, Innolux EJ030NA panels.
- Fix some simple pannels missing bus_format and connector types.
- Add mks-guest-stats instrumentation support to vmwgfx.
- Merge i915-ttm topic branch.
- Make s6e63m0 panel use Mipi-DBI helpers.
- Add detect() supoprt for AST.
- Use interrupts for hotplug on vc4.
- vmwgfx is now moved to drm-misc-next, as sroland is no longer a maintainer for now.
- vmwgfx now uses copies of vmware's internal device headers.
- Slowly convert ti-sn65dsi83 over to atomic.
- Rework amdgpu dma-resv handling.
- Fix virtio fencing for planes.
- Ensure amdgpu can always evict to SYSTEM.
- Many drivers fixed for implicit fencing rules.
- Set default prepare/cleanup fb for tiny, vram and simple helpers too.
- Rework panfrost gpu reset and related serialization.
- Update VKMS todo list.
- Make bochs a tiny gpu driver, and use vram helper.
- Use linux irq interfaces instead of drm_irq in some drivers.
- Add support for Raspberry Pi Pico to GUD.

Signed-off-by: Dave Airlie <airlied@redhat.com>

# gpg: Signature made Fri 16 Jul 2021 21:06:04 AEST
# gpg:                using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3
# gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten@debian.org>" [expired]
# gpg:                 aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired]
# gpg: Note: This key has expired!
# Primary key fingerprint: B97B D6A8 0CAC 4981 091A  E547 FE55 8C72 A670 13C3
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/444811c3-cbec-e9d5-9a6b-9632eda7962a@linux.intel.com
2021-07-21 11:58:28 +10:00
Tao Zhou cfe4e8f00f drm/amdgpu: update gc golden setting for dimgrey_cavefish
Update gc_10_3_4 golden setting.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-16 14:18:30 -04:00
Likun Gao 3e94b5965e drm/amdgpu: update golden setting for sienna_cichlid
Update GFX golden setting for sienna_cichlid.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-16 14:18:10 -04:00
Xiaojian Du 4fff6fbca1 drm/amdgpu: update the golden setting for vangogh
This patch is to update the golden setting for vangogh.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-16 14:17:35 -04:00
Veerabadhran Gopalakrishnan 6505d6fcc6 amdgpu/nv.c - Optimize code for video codec support structure
Optimized the code for codec info structure initialization

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 14:17:05 -04:00
Veerabadhran Gopalakrishnan ea272ce46f amdgpu/nv.c - Added video codec support for Yellow Carp
Added the supported codecs in the video capabilities query.

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 14:16:57 -04:00
Kevin Wang 03373e2be2 drm/amdgpu/ttm: optimize vram access in amdgpu_ttm_access_memory()
1. using vram aper to access vram if possible
2. avoid MM_INDEX/MM_DATA is not working when mmio protect feature is
enabled.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 14:03:29 -04:00
Kevin Wang 5fb95aa73f drm/amdgpu/ttm: replace duplicate code with exiting function
using exiting function to replace duplicate code blocks in
amdgpu_ttm_vram_write().

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:59:47 -04:00
Kevin Wang 048af66be7 drm/amdgpu: split amdgpu_device_access_vram() into two small parts
split amdgpu_device_access_vram()
1. amdgpu_device_mm_access(): using MM_INDEX/MM_DATA to access vram
2. amdgpu_device_aper_access(): using vram aperature to access vram (option)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:58:13 -04:00
Tao Zhou c5c21a58ec drm/amdgpu: update gc golden setting for dimgrey_cavefish
Update gc_10_3_4 golden setting.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:51:45 -04:00
Likun Gao decd8ce9df drm/amdgpu: update golden setting for sienna_cichlid
Update GFX golden setting for sienna_cichlid.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:51:39 -04:00
Xiaojian Du 0a2ba7b72c drm/amdgpu: update the golden setting for vangogh
This patch is to update the golden setting for vangogh.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:51:33 -04:00
Andrey Grodzovsky 85da6459f4 drm/amdgpu: Switch to LFB for USBC PD FW in psp v13
Add USBC PD FW implementation here to be used with relevant ASICs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:51:14 -04:00
Andrey Grodzovsky 25a3e8ac07 drm/amdgpu: Switch to VRAM buffer for USBC PD FW.
System memory-based implementation for updating the
USBCPD is deprecated for so switching
to LFB based implementation for all the ASICs.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:51:05 -04:00
Veerabadhran Gopalakrishnan 9075096b09 amdgpu/nv.c - Optimize code for video codec support structure
Optimized the code for codec info structure initialization

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:46:09 -04:00
Veerabadhran Gopalakrishnan 554398174d amdgpu/nv.c - Added video codec support for Yellow Carp
Added the supported codecs in the video capabilities query.

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-16 13:45:56 -04:00
Felix Kuehling fa5239f2af drm/amdgpu: workaround failed COW checks for Thunk VMAs
KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE.
is_cow_mapping returns true for these mappings, which causes mmap to fail
in ttm_bo_mmap_obj.

As a workaround, clear VM_MAYWRITE for PROT_NONE-COW mappings. This
should prevent the mapping from ever becoming writable and makes
is_cow_mapping(vm_flags) false.

Fixes: f91142c621 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3")
Suggested-by: Daniel Vetter <daniel.vetter@intel.com>
Tested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210715190537.585456-1-Felix.Kuehling@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-16 09:22:57 +02:00
Jinzhou Su 775da83005 drm/amdgpu: add another Renoir DID
Add new PCI device id.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.11.x
2021-07-14 15:08:55 -04:00
Jinzhou Su 0c492e22ba drm/amdgpu: add another Renoir DID
Add new PCI device id.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-14 14:55:36 -04:00
Eric Huang d605094394 Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update"
This reverts commit 075e8080c1.

Reason for revert: the related commit is reverted.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:30 -04:00
Eric Huang c37387c354 Revert "drm/amdkfd: Make TLB flush conditional on mapping"
This reverts commit 31f3324378.

Reason for revert: it causes regressions on several Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:22 -04:00
Eric Huang 22762e3766 Revert "drm/amdgpu: Fix warning of Function parameter or member not described"
This reverts commit 7a68d188d1.

Reason for revert: the related commit is reverted.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:59:14 -04:00
Emily.Deng 9be26ddf88 drm/amdgpu: Restore msix after FLR
After FLR, the msix will be cleared, so need to re-enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:48 -04:00
Felix Kuehling 99e7d65ccc drm/amdkfd: Allow CPU access for all VRAM BOs
The thunk needs to mmap all BOs for CPU access to allow the debugger to
access them. Invisible ones are mapped with PROT_NONE.

Fixes: 71df0368e9 ("drm/amdgpu: Implement mmap as GEM object function")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:58:38 -04:00
Emily Deng fa8f311e9e drm/amdgpu: Correct the irq numbers for virtual crtc
The irq number should be decided by num_crtc, and the num_crtc could change
by parameter.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:12 -04:00
Luben Tuikov 43a44c5322 drm/amdgpu: Return error if no RAS
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.

Also, if no pointers are set, don't count, since
we've no way of reporting the counts.

Also, give this function a kernel-doc.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: Tom Rix <trix@redhat.com>
Fixes: a46751fbcd ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:10 -04:00
Jingwen Chen 798c511548 drm/amdgpu: SRIOV flr_work should take write_lock
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.

[How]
flr_work should take write_lock to avoid this case.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:48:09 -04:00
John Clements 308ef2ad84 drm/amdgpu: Resolve bug in UMC 6.7 error offset calculation
Use correct channel and instance values

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:42:50 -04:00
Eric Huang 024d8811c9 Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update"
This reverts commit 075e8080c1.

Reason for revert: the related commit is reverted.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:40:01 -04:00
Eric Huang 7ed9876c97 Revert "drm/amdkfd: Make TLB flush conditional on mapping"
This reverts commit 31f3324378.

Reason for revert: it causes regressions on several Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:38:55 -04:00
Eric Huang 4e7b93ca52 Revert "drm/amdgpu: Fix warning of Function parameter or member not described"
This reverts commit 7a68d188d1.

Reason for revert: the related commit is reverted.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:37:45 -04:00
Eric Huang 53d0533049 Revert "drm/amdkfd: Only apply TLB flush optimization on ALdebaran"
This reverts commit 51627f0380.

Reason for revert: it causes regression on Aldebaran.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:35:29 -04:00
Emily.Deng 3c727c1c45 drm/amdgpu: Restore msix after FLR
After FLR, the msix will be cleared, so need to re-enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Emily.Deng <Emily.Deng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:35:04 -04:00
Felix Kuehling 468f04cfbb drm/amdkfd: Allow CPU access for all VRAM BOs
The thunk needs to mmap all BOs for CPU access to allow the debugger to
access them. Invisible ones are mapped with PROT_NONE.

Fixes: 71df0368e9 ("drm/amdgpu: Implement mmap as GEM object function")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-13 11:35:03 -04:00
John Clements 186c8a8585 drm/amdgpu: initialize umc ras function
support umc ras function initialization for aldebaran

v2: squash in compile fix

Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 17:47:28 -04:00
Emily Deng 9604b74bff drm/amdgpu: Correct the irq numbers for virtual crtc
The irq number should be decided by num_crtc, and the num_crtc could change
by parameter.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 17:47:28 -04:00
Dan Carpenter 64598e23de drm/amdgpu: return -EFAULT if copy_to_user() fails
If copy_to_user() fails then this should return -EFAULT instead of
-EINVAL.

Fixes: c65b0805e7 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 17:47:28 -04:00
Dan Carpenter b8badd507a drm/amdgpu: unlock on error in amdgpu_ras_debugfs_table_read()
This error path needs to unlock before returning.  While we're at it,
the correct error code from copy_to_user() failure is -EFAULT, not
-EINVAL.

Fixes: c65b0805e7 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 17:47:16 -04:00
Linus Torvalds f55966571d drm fixes for 5.14-rc1
dma-buf:
 - doc fixes
 
 amdgpu:
 - Misc Navi fixes
 - Powergating fix
 - Yellow Carp updates
 - Beige Goby updates
 - S0ix fix
 - Revert overlay validation fix
 - GPU reset fix for DC
 - PPC64 fix
 - Add new dimgrey cavefish DID
 - RAS fix
 - TTM fixes
 
 amdkfd:
 - SVM fixes
 
 radeon:
 - Fix missing drm_gem_object_put in error path
 - NULL ptr deref fix
 
 i915:
 - display DP VSC fix
 - DG1 display fix
 - IRQ fixes
 - IRQ demidlayering
 
 gma500:
 - bo leaks in error paths fixed
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmDma1oACgkQDHTzWXnE
 hr4l1RAAmUhJbcqPEDGI2Hydl7o4/NMWXyyINYzcuCndeT76+yqdiTVsRQNvmTRO
 MoW8AACjnNc2xktqtbv+hH4/vZinqHQD8z93kk4A+yA6TDzOLdJTnS7XfoKFI/S3
 /lvmwYJgW1nN3TahMI/juTmJanYEnSwVvFVGtRkOEtG7wgomNDYgqdm/NSfmENXV
 6q1rHamzAMoXYFxviST+6jK2kBLFN7jjBAyaVyj4ufnF2CG/oUrsAtdvXpM1QSta
 R1LJ3g43pnOFPpaNtwSlf9wsDQZ0oMSJ2Tt+hLpyhKr+zHVygOBjRR9AGNd/2P6t
 mCQWiYI+B4i0246XDFj6yS/angQHox9/fUYMM+AgExIBH1PAdTE9yZE/JNsNBc/S
 4llmAA/eQco+DM8HQ84BPl8yi9Y3qaUONJJQsejpTf9Cvey5xQ2HnDmPbBACfqd0
 /BxCfhzcJmPxuV5bRh7e/nJt0/Uj83U5f6oQlUImmAdr+t60NnuwxoWV0wt7Uz+0
 v9TkqQ5QE2jpcm56ug8cxOkSKK9U/RDQCfqUY6zSYPMSFu5WL2fOWxJKpyIz9vSo
 E8awltAu59nUT10P/zvzdN64egWrRBwSECg4Bd6I2L9SxfnYuXJwJ+gMObFo0SF5
 cb7KfM2kffxaekf2TashsjEgkrYtCkhh6QMIkDuUOWtIzu4/ikc=
 =HX2N
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-07-08-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Some fixes for rc1 that came in the past weeks, mainly a bunch of
  amdgpu fixes, some i915 and the rest are misc around the place. I'm
  sending this a bit early so some more stuff may show up, but I'll
  probably take tomorrow off.

  dma-buf:
   - doc fixes

  amdgpu:
   - Misc Navi fixes
   - Powergating fix
   - Yellow Carp updates
   - Beige Goby updates
   - S0ix fix
   - Revert overlay validation fix
   - GPU reset fix for DC
   - PPC64 fix
   - Add new dimgrey cavefish DID
   - RAS fix
   - TTM fixes

  amdkfd:
   - SVM fixes

  radeon:
   - Fix missing drm_gem_object_put in error path
   - NULL ptr deref fix

  i915:
   - display DP VSC fix
   - DG1 display fix
   - IRQ fixes
   - IRQ demidlayering

  gma500:
   - bo leaks in error paths fixed"

* tag 'drm-next-2021-07-08-1' of git://anongit.freedesktop.org/drm/drm: (52 commits)
  drm/i915: Drop all references to DRM IRQ midlayer
  drm/i915: Use the correct IRQ during resume
  drm/i915/display/dg1: Correctly map DPLLs during state readout
  drm/i915/display: Do not zero past infoframes.vsc
  drm/amdgpu: Conditionally reset SDMA RAS error counts
  drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
  drm/amdkfd: add invalid pages debug at vram migration
  drm/amdkfd: skip migration for pages already in VRAM
  drm/amdkfd: skip invalid pages during migrations
  drm/amdkfd: classify and map mixed svm range pages in GPU
  drm/amdkfd: use hmm range fault to get both domain pfns
  drm/amdgpu: get owner ref in validate and map
  drm/amdkfd: set owner ref to svm range prefault
  drm/amdkfd: add owner ref param to get hmm pages
  drm/amdkfd: device pgmap owner at the svm migrate init
  drm/amdkfd: inc counter on child ranges with xnack off
  drm/amd/display: Extend DMUB diagnostic logging to DCN3.1
  drm/amdgpu: Update NV SIMD-per-CU to 2
  drm/amdgpu: add new dimgrey cavefish DID
  drm/amd/pm: skip PrepareMp1ForUnload message in s0ix
  ...
2021-07-08 12:28:15 -07:00
Dan Carpenter 1d864f1088 drm/amdgpu: Fix signedness bug in __amdgpu_eeprom_xfer()
The i2c_transfer() function returns negatives or else the number of
messages transferred.  This code does not work because ARRAY_SIZE()
is type size_t and so that means negative values of "r" are type
promoted to high positive values which are greater than the ARRAY_SIZE().

Fix this by changing the < to != which works regardless of type
promotion.

Fixes: 746b584762 ("drm/amdgpu: Fixes to the AMDGPU EEPROM driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:18:14 -04:00
Dan Carpenter 3006c92455 drm/amdgpu: fix a signedness bug in __verify_ras_table_checksum()
If amdgpu_eeprom_read() returns a negative error code then the error
handling checks:

	if (res < buf_size) {

The problem is that "buf_size" is a u32 so negative values are type
promoted to a high positive values and the condition is false.  Fix
this by changing the type of "buf_size" to int.

Fixes: 63d4c081a5 ("drm/amdgpu: Optimize EEPROM RAS table I/O")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:17:02 -04:00
Luben Tuikov 4d9f771e11 drm/amdgpu: Return error if no RAS
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.

Also, if no pointers are set, don't count, since
we've no way of reporting the counts.

Also, give this function a kernel-doc.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Reported-by: Tom Rix <trix@redhat.com>
Fixes: a46751fbcd ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:13:07 -04:00
Jingwen Chen b5840166dc drm/amdgpu: SRIOV flr_work should take write_lock
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.

[How]
flr_work should take write_lock to avoid this case.

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:58 -04:00
Luben Tuikov c0838d3a93 drm/amdgpu: The I2C IP doesn't support 0 writes/reads
The I2C IP doesn't support writes or reads of 0 bytes.

In order for a START/STOP transaction to take
place on the bus, the data written/read has to be
at least one byte.

That is, you cannot generate a write with 0 bytes,
just to get the ACK from a device, just so you can
probe that device if it is on the bus and so to
discover all devices on the bus--you'll have to
read at least one byte. Writes of 0 bytes generate
no START/STOP on this I2C IP--the bus is not
engaged at all.

Set the I2C_AQ_NO_ZERO_LEN to the existing I2C
quirk tables for Aldebaran, Arcturus, Navi10 and
Sienna Cichlid, and add a quirk table to the I2C
driver which drives the bus when the SMU
doesn't--for instance on Vega20.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:51 -04:00
YuBiao Wang 5af4438f1e drm/amdgpu: Read clock counter via MMIO to reduce delay (v5)
[Why]
GPU timing counters are read via KIQ under sriov, which will introduce
a delay.

[How]
It could be directly read by MMIO.

v2: Add additional check to prevent carryover issue.
v3: Only check for carryover for once to prevent performance issue.
v4: Add comments of the rough frequency where carryover happens.
v5: Remove mutex and gfxoff ctrl unused with current timing registers.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.co>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:36 -04:00
Eric Huang 51627f0380 drm/amdkfd: Only apply TLB flush optimization on ALdebaran
It is based on reverting two patches back.
  drm/amdkfd: Make TLB flush conditional on mapping
  drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:26 -04:00
Nirmoy Das 88f7f88159 drm/amdgpu: separate out vm pasid assignment
Use new helper function amdgpu_vm_set_pasid() to
assign vm pasid value. This also ensures that we don't free
a pasid from vm code as pasids are allocated somewhere else.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:20 -04:00
Nirmoy Das dcb388eddb drm/amdgpu: use xarray for storing pasid in vm
Replace idr with xarray as we actually need hash functionality.
Cleanup code related to vm pasid by adding helper function.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-08 15:12:07 -04:00
Dave Airlie 21c355b097 Short summary of fixes pull:
* amdgpu: TTM fixes
  * dma-buf: Doc fixes
  * gma500: Fix potential BO leaks in error handling
  * radeon: Fix NULL-ptr deref
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmDdhbwACgkQaA3BHVML
 eiM/fggAu5y7COqhMIMwdSOXIkLszILovZXAdERzIa6uPeh+zS1TLiW6hqkFam5g
 HtiUm982SP1OZ8p+w5R/BboeRuUymyVXdrAg+0+T6hH7aC1NtN11tfXr8h2aVJhn
 gduh5IixnhDScMrvI+8uVwzLH46fRJDwvfSGGXG3LQT1atWY4npTLozRwLE3uvMM
 w2kSoWsemo3d+40bdilhNKAwolSGNSgBW7OBhE8A8nN+3J8smNkfD9fP6cVG90Je
 HCQ0kYpIckyx43VTWrYJYYS8T7l+x8QdwjgSNmhjZ391A/EyB8oKAUL7ov8VdxJC
 vI8AabZf4JMV0f+2aVhxVMTJPvHhpQ==
 =gX7D
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-fixes-2021-07-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Short summary of fixes pull:

 * amdgpu: TTM fixes
 * dma-buf: Doc fixes
 * gma500: Fix potential BO leaks in error handling
 * radeon: Fix NULL-ptr deref

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YN2GK2SH64yqXqh9@linux-uq9g
2021-07-08 11:17:32 +10:00
Dave Airlie 0d3a1b37ab Merge tag 'amd-drm-next-5.14-2021-07-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-07-01:

amdgpu:
- Misc Navi fixes
- Powergating fix
- Yellow Carp updates
- Beige Goby updates
- S0ix fix
- Revert overlay validation fix
- GPU reset fix for DC
- PPC64 fix
- Add new dimgrey cavefish DID
- RAS fix

amdkfd:
- SVM fixes

radeon:
- Fix missing drm_gem_object_put in error path

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210701042241.25449-1-alexander.deucher@amd.com
2021-07-08 08:45:20 +10:00
Linus Torvalds 757fa80f4e Tracing updates for 5.14:
- Added option for per CPU threads to the hwlat tracer
 
  - Have hwlat tracer handle hotplug CPUs
 
  - New tracer: osnoise, that detects latency caused by interrupts, softirqs
    and scheduling of other tasks.
 
  - Added timerlat tracer that creates a thread and measures in detail what
    sources of latency it has for wake ups.
 
  - Removed the "success" field of the sched_wakeup trace event.
    This has been hardcoded as "1" since 2015, no tooling should be looking
    at it now. If one exists, we can revert this commit, fix that tool and
    try to remove it again in the future.
 
  - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
 
  - New boot command line option "tp_printk_stop", as tp_printk causes trace
    events to write to console. When user space starts, this can easily live
    lock the system. Having a boot option to stop just after boot up is
    useful to prevent that from happening.
 
  - Have ftrace_dump_on_oops boot command line option take numbers that match
    the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
 
  - Bootconfig clean ups, fixes and enhancements.
 
  - New ktest script that tests bootconfig options.
 
  - Add tracepoint_probe_register_may_exist() to register a tracepoint
    without triggering a WARN*() if it already exists. BPF has a path from
    user space that can do this. All other paths are considered a bug.
 
  - Small clean ups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYN8YPhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qhxLAP9Mo5hHv7Hg6W7Ddv77rThm+qclsMR/
 yW0P+eJpMm4+xAD8Cq03oE1DimPK+9WZBKU5rSqAkqG6CjgDRw6NlIszzQQ=
 =WEPR
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - Added option for per CPU threads to the hwlat tracer

 - Have hwlat tracer handle hotplug CPUs

 - New tracer: osnoise, that detects latency caused by interrupts,
   softirqs and scheduling of other tasks.

 - Added timerlat tracer that creates a thread and measures in detail
   what sources of latency it has for wake ups.

 - Removed the "success" field of the sched_wakeup trace event. This has
   been hardcoded as "1" since 2015, no tooling should be looking at it
   now. If one exists, we can revert this commit, fix that tool and try
   to remove it again in the future.

 - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.

 - New boot command line option "tp_printk_stop", as tp_printk causes
   trace events to write to console. When user space starts, this can
   easily live lock the system. Having a boot option to stop just after
   boot up is useful to prevent that from happening.

 - Have ftrace_dump_on_oops boot command line option take numbers that
   match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.

 - Bootconfig clean ups, fixes and enhancements.

 - New ktest script that tests bootconfig options.

 - Add tracepoint_probe_register_may_exist() to register a tracepoint
   without triggering a WARN*() if it already exists. BPF has a path
   from user space that can do this. All other paths are considered a
   bug.

 - Small clean ups and fixes

* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
  tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
  tracing: Simplify & fix saved_tgids logic
  treewide: Add missing semicolons to __assign_str uses
  tracing: Change variable type as bool for clean-up
  trace/timerlat: Fix indentation on timerlat_main()
  trace/osnoise: Make 'noise' variable s64 in run_osnoise()
  tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
  Documentation: Fix a typo on trace/osnoise-tracer
  trace/osnoise: Fix return value on osnoise_init_hotplug_support
  trace/osnoise: Make interval u64 on osnoise_main
  trace/osnoise: Fix 'no previous prototype' warnings
  tracing: Have osnoise_main() add a quiescent state for task rcu
  seq_buf: Make trace_seq_putmem_hex() support data longer than 8
  seq_buf: Fix overflow in seq_buf_putmem_hex()
  trace/osnoise: Support hotplug operations
  trace/hwlat: Support hotplug operations
  trace/hwlat: Protect kdata->kthread with get/put_online_cpus
  trace: Add timerlat tracer
  trace: Add osnoise tracer
  ...
2021-07-03 11:13:22 -07:00
Linus Torvalds e058a84bfd drm pull for 5.14-rc1
core:
 - mark AGP ioctls as legacy
 - disable force probing for non-master clients
 - HDR metadata property helpers
 - HDMI infoframe signal colorimetry support
 - remove drm_device.pdev pointer
 - remove DRM_KMS_FB_HELPER config option
 - remove drm_pci_alloc/free
 - drm_err_*/drm_dbg_* helpers
 - use drm driver names for fbdev
 - leaked DMA handle fix
 - 16bpc fixed point format fourcc
 - add prefetching memcpy for WC
 - Documentation fixes
 
 aperture:
 - add aperture ownership helpers
 
 dp:
 - aux fixes
 - downstream 0 port handling
 - use extended base receiver capability DPCD
 - Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec
 - mst: use khz as link rate during init
 - VCPI fixes for StarTech hub
 
 ttm:
 - provide tt_shrink file via debugfs
 - warn about freeing pinned BOs
 - fix swapping error handling
 - move page alignment into BO
 - cleanup ttm_agp_backend
 - add ttm_sys_manager
 - don't override vm_ops
 - ttm_bo_mmap removed
 - make ttm_resource base of all managers
 - remove VM_MIXEDMAP usage
 
 panel:
 - sysfs_emit support
 - simple: runtime PM support
 - simple: power up panel when reading EDID + caching
 
 bridge:
 - MHDP8546: HDCP support + DT bindings
 - MHDP8546: Register DP AUX channel with userspace
 - TI SN65DSI83 + SN65DSI84: add driver
 - Sil8620: Fix module dependencies
 - dw-hdmi: make CEC driver loading optional
 - Ti-sn65dsi86: refclk fixes, subdrivers, runtime pm
 - It66121: Add driver + DT bindings
 - Adv7511: Support I2S IEC958 encoding
 - Anx7625: fix power-on delay
 - Nwi-dsi: Modesetting fixes; Cleanups
 - lt6911: add missing MODULE_DEVICE_TABLE
 - cdns: fix PM reference leak
 
 hyperv:
 - add new DRM driver for HyperV graphics
 
 efifb:
 - non-PCI device handling fixes
 
 i915:
 - refactor IP/device versioning
 - XeLPD Display IP preperation work
 - ADL-P enablement patches
 - DG1 uAPI behind BROKEN
 - disable mmap ioctl for discerte GPUs
 - start enabling HuC loading for Gen12+
 - major GuC backend rework for new platforms
 - initial TTM support for Discrete GPUs
 - locking rework for TTM prep
 - use correct max source link rate for eDP
 - %p4cc format printing
 - GLK display fixes
 - VLV DSI panel power fixes
 - PSR2 disabled for RKL and ADL-S
 - ACPI _DSM invalid access fixed
 - DMC FW path abstraction
 - ADL-S PCI ID update
 - uAPI headers converted to kerneldoc
 - initial LMEM support for DG1
 - x86/gpu: add Jasperlake to gen11 early quirks
 
 amdgpu:
 - Aldebaran updates + initial SR-IOV
 - new GPU: Beige Goby and Yellow Carp support
 - more LTTPR display work
 - Vangogh updates
 - SDMA 5.x GCR fixes
 - PCIe ASPM support
 - Renoir TMZ enablement
 - initial multiple eDP panel support
 - use fdinfo to track devices/process info
 - pin/unpin TTM fixes
 - free resource on fence usage query
 - fix fence calculation
 - fix hotunplug/suspend issues
 - GC/MM register access macro cleanup for SR-IOV
 - W=1 fixes
 - ACPI ATCS/ATIF handling rework
 - 16bpc fixed point format support
 - Initial smartshift support
 - RV/PCO power tuning fixes
 - new INFO query for additional vbios info
 
 amdkfd:
 - SR-IOV aldebaran support
 - HMM SVM support
 
 radeon:
 - SMU regression fixes
 - Oland flickering fix
 
 vmwgfx:
 - enable console with fbdev emulation
 - fix cpu updates of coherent multisample surfaces
 - remove reservation semaphore
 - add initial SVGA3 support
 - support arm64
 
 msm:
 - devcoredump support for display errors
 - dpu/dsi: yaml bindings conversion
 - mdp5: alpha/blend_mode/zpos support
 - a6xx: cached coherent buffer support
 - gpu iova fault improvement
 - a660 support
 
 rockchip:
 - RK3036 win1 scaling support
 - RK3066/3188 missing register support
 - RK3036/3066/3126/3188 alpha support
 
 mediatek:
 - MT8167 HDMI support
 - MT8183 DPI dual edge support
 
 tegra:
 - fixed YUV support/scaling on Tegra186+
 
 ast:
 - use pcim_iomap
 - fix DP501 EDID
 
 bochs:
 - screen blanking support
 
 etnaviv:
 - export more GPU ID values to userspace
 - add HWDB entry for GPU on i.MX8MP
 - rework linear window calcs
 
 exynos:
 - pm runtime changes
 
 imx:
 - Annotate dma_fence critical section
 - fix PRG modifiers after drmm conversion
 - Add 8 pixel alignment fix for 1366x768
 - fix YUV advertising
 - add color properties
 
 ingenic:
 - IPU planes fix
 
 panfrost:
 - Mediatek MT8183 support + DT bindings
 - export AFBC_FEATURES register to userspace
 
 simpledrm:
 - %pr for printing resources
 
 nouveau:
 - pin/unpin TTM fixes
 
 qxl:
 - unpin shadow BO
 
 virtio:
 - create dumb BOs as guest blob
 
 vkms:
 - drmm_universal_plane_alloc
 - add XRGB plane composition
 - overlay support
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmDdQzkACgkQDHTzWXnE
 hr7bhQ//aSYnp1To3tvPtwQ2H88RTnEbUd+nCi3C03QdLAbHC9dYHVdWuNPw2doh
 aiJO2JyQoqXVo95Jc39qkmpvm1lLDNQuufBweCHxbbpl8wYIUjfkIYq+fnZbWPaA
 aRVSOLE/4DIcgJTimsgOssAOK9klk/WYT9EV7CNIBA/b0R6f9iTUoBxCALDvMeVx
 Pt3Rnfsg3+u8msqBkkpkvFLZRS8lkXx6eZ0LEhUfRsfMcKo5L80cOHgvIhrh9+fN
 yBFv+u7jM3fOxyUYEoBeVY8UqTLfbgM+vdiP9pmiGn66yCZVJWIxCe1Mijk6K143
 f4OxJy1jJAGzo/knLCuCb21qbzyImQzkold9V+h8KAvTXGeMPISjbpLbwGeo8rne
 lfTAisGnu8q3xvYAU9znx9DkFQULgUuWahEYY3jX0ApVCR76hiT6H7AR9EOMhvKY
 PD1n39Bf62p7zK5QQ+XUOiX3PGv8J6Hw/wykFy+AIg4YgT/oK+QJul820MjZiYyt
 7Kt09Ibj4JO+vubxqlbJVsW3xtdg/Oz3BRMIdHs+2l/s0pSwBZa+qTcXhPGZxB5B
 HiyHiUgLsK8MQ0aIw9IK8+nJH8M60t6A179BbmVWxhYpGLH2Wvq0Vxgsedt9trHn
 2RN3mHlpXHSaZJbIbPcvuOewBLKA6K94o2ZZ8xqZbDcCjjC60ts=
 =fFet
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights:

   - AMD enables two more GPUs, with resulting header files

   - i915 has started to move to TTM for discrete GPU and enable DG1
     discrete GPU support (not by default yet)

   - new HyperV drm driver

   - vmwgfx adds arm64 support

   - TTM refactoring ongoing

   - 16bpc display support for AMD hw

  Otherwise it's just the usual insane amounts of work all over the
  place in lots of drivers and the core, as mostly summarised below:

  Core:
   - mark AGP ioctls as legacy
   - disable force probing for non-master clients
   - HDR metadata property helpers
   - HDMI infoframe signal colorimetry support
   - remove drm_device.pdev pointer
   - remove DRM_KMS_FB_HELPER config option
   - remove drm_pci_alloc/free
   - drm_err_*/drm_dbg_* helpers
   - use drm driver names for fbdev
   - leaked DMA handle fix
   - 16bpc fixed point format fourcc
   - add prefetching memcpy for WC
   - Documentation fixes

  aperture:
   - add aperture ownership helpers

  dp:
   - aux fixes
   - downstream 0 port handling
   - use extended base receiver capability DPCD
   - Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec
   - mst: use khz as link rate during init
   - VCPI fixes for StarTech hub

  ttm:
   - provide tt_shrink file via debugfs
   - warn about freeing pinned BOs
   - fix swapping error handling
   - move page alignment into BO
   - cleanup ttm_agp_backend
   - add ttm_sys_manager
   - don't override vm_ops
   - ttm_bo_mmap removed
   - make ttm_resource base of all managers
   - remove VM_MIXEDMAP usage

  panel:
   - sysfs_emit support
   - simple: runtime PM support
   - simple: power up panel when reading EDID + caching

  bridge:
   - MHDP8546: HDCP support + DT bindings
   - MHDP8546: Register DP AUX channel with userspace
   - TI SN65DSI83 + SN65DSI84: add driver
   - Sil8620: Fix module dependencies
   - dw-hdmi: make CEC driver loading optional
   - Ti-sn65dsi86: refclk fixes, subdrivers, runtime pm
   - It66121: Add driver + DT bindings
   - Adv7511: Support I2S IEC958 encoding
   - Anx7625: fix power-on delay
   - Nwi-dsi: Modesetting fixes; Cleanups
   - lt6911: add missing MODULE_DEVICE_TABLE
   - cdns: fix PM reference leak

  hyperv:
   - add new DRM driver for HyperV graphics

  efifb:
   - non-PCI device handling fixes

  i915:
   - refactor IP/device versioning
   - XeLPD Display IP preperation work
   - ADL-P enablement patches
   - DG1 uAPI behind BROKEN
   - disable mmap ioctl for discerte GPUs
   - start enabling HuC loading for Gen12+
   - major GuC backend rework for new platforms
   - initial TTM support for Discrete GPUs
   - locking rework for TTM prep
   - use correct max source link rate for eDP
   - %p4cc format printing
   - GLK display fixes
   - VLV DSI panel power fixes
   - PSR2 disabled for RKL and ADL-S
   - ACPI _DSM invalid access fixed
   - DMC FW path abstraction
   - ADL-S PCI ID update
   - uAPI headers converted to kerneldoc
   - initial LMEM support for DG1
   - x86/gpu: add Jasperlake to gen11 early quirks

  amdgpu:
   - Aldebaran updates + initial SR-IOV
   - new GPU: Beige Goby and Yellow Carp support
   - more LTTPR display work
   - Vangogh updates
   - SDMA 5.x GCR fixes
   - PCIe ASPM support
   - Renoir TMZ enablement
   - initial multiple eDP panel support
   - use fdinfo to track devices/process info
   - pin/unpin TTM fixes
   - free resource on fence usage query
   - fix fence calculation
   - fix hotunplug/suspend issues
   - GC/MM register access macro cleanup for SR-IOV
   - W=1 fixes
   - ACPI ATCS/ATIF handling rework
   - 16bpc fixed point format support
   - Initial smartshift support
   - RV/PCO power tuning fixes
   - new INFO query for additional vbios info

  amdkfd:
   - SR-IOV aldebaran support
   - HMM SVM support

  radeon:
   - SMU regression fixes
   - Oland flickering fix

  vmwgfx:
   - enable console with fbdev emulation
   - fix cpu updates of coherent multisample surfaces
   - remove reservation semaphore
   - add initial SVGA3 support
   - support arm64

  msm:
   - devcoredump support for display errors
   - dpu/dsi: yaml bindings conversion
   - mdp5: alpha/blend_mode/zpos support
   - a6xx: cached coherent buffer support
   - gpu iova fault improvement
   - a660 support

  rockchip:
   - RK3036 win1 scaling support
   - RK3066/3188 missing register support
   - RK3036/3066/3126/3188 alpha support

  mediatek:
   - MT8167 HDMI support
   - MT8183 DPI dual edge support

  tegra:
   - fixed YUV support/scaling on Tegra186+

  ast:
   - use pcim_iomap
   - fix DP501 EDID

  bochs:
   - screen blanking support

  etnaviv:
   - export more GPU ID values to userspace
   - add HWDB entry for GPU on i.MX8MP
   - rework linear window calcs

  exynos:
   - pm runtime changes

  imx:
   - Annotate dma_fence critical section
   - fix PRG modifiers after drmm conversion
   - Add 8 pixel alignment fix for 1366x768
   - fix YUV advertising
   - add color properties

  ingenic:
   - IPU planes fix

  panfrost:
   - Mediatek MT8183 support + DT bindings
   - export AFBC_FEATURES register to userspace

  simpledrm:
   - %pr for printing resources

  nouveau:
   - pin/unpin TTM fixes

  qxl:
   - unpin shadow BO

  virtio:
   - create dumb BOs as guest blob

  vkms:
   - drmm_universal_plane_alloc
   - add XRGB plane composition
   - overlay support"

* tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm: (1570 commits)
  drm/i915: Reinstate the mmap ioctl for some platforms
  drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary crtc
  Revert "drm/msm/mdp5: provide dynamic bandwidth management"
  drm/msm/mdp5: provide dynamic bandwidth management
  drm/msm/mdp5: add perf blocks for holding fudge factors
  drm/msm/mdp5: switch to standard zpos property
  drm/msm/mdp5: add support for alpha/blend_mode properties
  drm/msm/mdp5: use drm_plane_state for pixel blend mode
  drm/msm/mdp5: use drm_plane_state for storing alpha value
  drm/msm/mdp5: use drm atomic helpers to handle base drm plane state
  drm/msm/dsi: do not enable PHYs when called for the slave DSI interface
  drm/msm: Add debugfs to trigger shrinker
  drm/msm/dpu: Avoid ABBA deadlock between IRQ modules
  drm/msm: devcoredump iommu fault support
  iommu/arm-smmu-qcom: Add stall support
  drm/msm: Improve the a6xx page fault handler
  iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info
  iommu/arm-smmu: Add support for driver IOMMU fault handlers
  drm/msm: export hangcheck_period in debugfs
  drm/msm/a6xx: add support for Adreno 660 GPU
  ...
2021-07-01 12:53:43 -07:00
Jiri Kosina 36f5f9d37e drm/amdgpu: Avoid printing of stack contents on firmware load error
In case when psp_init_asd_microcode() fails to load ASD microcode file,
psp_v12_0_init_microcode() tries to print the firmware filename that
failed to load before bailing out.

This is wrong because:

- the firmware filename it would want it print is an incorrect one as
  psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading
  different filenames
- it tries to print fw_name, but that's not yet been initialized by that
  time, so it prints random stack contents, e.g.

    amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2
    amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode
    amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff"

Fix that by bailing out immediately, instead of priting the bogus error
message.

Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 12:09:25 -04:00
Jiri Kosina 4ef87d8f10 drm/amdgpu: Fix resource leak on probe error path
This reverts commit 4192f7b576.

It is not true (as stated in the reverted commit changelog) that we never
unmap the BAR on failure; it actually does happen properly on
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
amdgpu_device_fini() error path.

What's worse, this commit actually completely breaks resource freeing on
probe failure (like e.g. failure to load microcode), as
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
early, leaving all the resources that'd normally be freed in
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
to all sorts of oopses when someone tries to, for example, access the
sysfs and procfs resources which are still around while the driver is
gone.

Fixes: 4192f7b576 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 12:09:19 -04:00
Thomas Zimmermann 97c9bfe3f6 drm/aperture: Pass DRM driver structure instead of driver name
Print the name of the DRM driver when taking over fbdev devices. Makes
the output to dmesg more consistent. Note that the driver name is only
used for printing a string to the kernel log. No UAPI is affected by this
change.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Chen-Yu Tsai <wens@csie.org> # sun4i
Acked-by: Neil Armstrong <narmstrong@baylibre.com> # meson
Link: https://patchwork.freedesktop.org/patch/msgid/20210629135833.22679-1-tzimmermann@suse.de
2021-07-01 11:11:55 +02:00
Boris Brezillon 78efe21b6f drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
2021-07-01 08:53:25 +02:00
Lang Yu 6312333210 drm/amdgpu: show explicit name instead of id in psp_cmd_submit_buf
Use amdgpu_ucode_name to show ucode name and psp_gfx_cmd_name to
show psp_gfx_cmd name in psp_cmd_submit_buf.

v2: adjust function name

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Lang Yu dc739d18c6 drm/amdgpu: add function to show psp_gfx_cmd name via id
Implement function psp_gfx_cmd_name to show cmd name
via cmd id.

v2: rename it to psp_gfx_cmd_name

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Lang Yu aae435c6e8 drm/amdgpu: add function to show ucode name via id
Implement function amdgpu_ucode_name to show ucode name
via ucode id.

v2: rename it to amdgpu_ucode_name

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Luben Tuikov 9de96f3f7e drm/amdgpu: Correctly disable the I2C IP block
On long transfers to the EEPROM device,
i.e. write, it is observed that the driver aborts
the transfer.

The reason for this is that the driver isn't
patient enough--the IC_STATUS register's contents
is 0x27, which is MST_ACTIVITY | TFE | TFNF |
ACTIVITY. That is, while the transmission FIFO is
empty, we, the I2C master device, are still
driving the bus.

Implement the correct procedure to disable
the block, as described in the DesignWare I2C
Databook, section 3.8.3 Disabling DW_apb_i2c on
page 56. Now there are no premature aborts on long
data transfers.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Luben Tuikov e2e04041a2 drm/amdgpu: Use a single loop
In smu_v11_0_i2c_transmit() use a single loop to
transmit bytes, instead of two nested loops.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Luben Tuikov 1d9d2ca85b drm/amdgpu: Fix koops when accessing RAS EEPROM
Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, an also when "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:25:33 -04:00
Alex Deucher d456f3875a drm/amdgpu: fix 64 bit divide in eeprom code
pos is 64 bits.

Fixes: c65b0805e7 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Cc: luben.tuikov@amd.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov c65b0805e7 drm/amdgpu: RAS EEPROM table is now in debugfs
Add "ras_eeprom_size" file in debugfs, which
reports the maximum size allocated to the RAS
table in EEROM, as the number of bytes and the
number of records it could store. For instance,

$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
262144 bytes or 10921 records
$_

Add "ras_eeprom_table" file in debugfs, which
dumps the RAS table stored EEPROM, in a formatted
way. For instance,

$cat ras_eeprom_table
 Signature    Version  FirstOffs       Size   Checksum
0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
Index  Offset ErrType Bank/CU          TimeStamp      Offs/Addr MemChl MCUMCID    RetiredPage
    0 0x00014      ue    0x00 0x00000000607608DC 0x000000000000   0x00    0x00 0x000000000000
    1 0x0002C      ue    0x00 0x00000000607608DC 0x000000001000   0x00    0x00 0x000000000001
    2 0x00044      ue    0x00 0x00000000607608DC 0x000000002000   0x00    0x00 0x000000000002
    3 0x0005C      ue    0x00 0x00000000607608DC 0x000000003000   0x00    0x00 0x000000000003
    4 0x00074      ue    0x00 0x00000000607608DC 0x000000004000   0x00    0x00 0x000000000004
    5 0x0008C      ue    0x00 0x00000000607608DC 0x000000005000   0x00    0x00 0x000000000005
    6 0x000A4      ue    0x00 0x00000000607608DC 0x000000006000   0x00    0x00 0x000000000006
    7 0x000BC      ue    0x00 0x00000000607608DC 0x000000007000   0x00    0x00 0x000000000007
    8 0x000D4      ue    0x00 0x00000000607608DD 0x000000008000   0x00    0x00 0x000000000008
$_

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Xinhui Pan <xinhui.pan@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov 63d4c081a5 drm/amdgpu: Optimize EEPROM RAS table I/O
Split functionality between read and write, which
simplifies the code and exposes areas of
optimization and more or less complexity, and take
advantage of that.

Read and write the table in one go; use a separate
stage to decode or encode the data, as opposed to
on the fly, which keeps the I2C bus busy. Use a
single read/write to read/write the table or at
most two if the number of records we're
reading/writing wraps around.

Check the check-sum of a table in EEPROM on init.

Update the checksum at the same time as when
updating the table header signature, when the
threshold was increased on boot.

Take advantage of arithmetic modulo 256, that is,
use a byte!, to greatly simplify checksum
arithmetic.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov 017dad64db drm/amdgpu: Get rid of test function
The code is now tested from userspace.
Remove already macroed out test function.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov 0686627b3f drm/amdgpu: Some renames
Qualify with "ras_". Use kernel's own--don't
redefine your own.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov d7edde3dea drm/amdgpu: Nerf buff
buff --> buf. Essentially buffer abbreviates to
buf, remove 1/2 of it, or just the iron part, as
opposed to just the Er,

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov e4e6a58935 drm/amdgpu: Use explicit cardinality for clarity
RAS_MAX_RECORD_NUM may mean the maximum record
number, as in the maximum house number on your
street, or it may mean the maximum number of
records, as in the count of records, which is also
a number. To make this distinction whether the
number is ordinal (index) or cardinal (count),
rename this macro to RAS_MAX_RECORD_COUNT.

This makes it easy to understand what it refers
to, especially when we compute quantities such as,
how many records do we have left in the table,
especially when there are so many other numbers,
quantities and numerical macros around.

Also rename the long,
amdgpu_ras_eeprom_get_record_max_length() to the
more succinct and clear,
amdgpu_ras_eeprom_max_record_count().

When computing the threshold, which also deals
with counts, i.e. "how many", use cardinal
"max_eeprom_records_count", than the quantitative
"max_eeprom_records_len".

Simplify the logic here and there, as well.

Cc: Guchun Chen <guchun.chen@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov 803c6ebdd3 drm/amdgpu: Simplify RAS EEPROM checksum calculations
Rename update_table_header() to
write_table_header() as this function is actually
writing it to EEPROM.

Use kernel types; use u8 to carry around the
checksum, in order to take advantage of arithmetic
modulo 8-bits (256).

Tidy up to 80 columns.

When updating the checksum, just recalculate the
whole thing.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov dce4400e65 drm/amdgpu: Fix amdgpu_ras_eeprom_init()
No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes the the lower layers.

Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov cf696091d3 drm/amdgpu: Return result fix in RAS
The low level EEPROM write method, doesn't return
1, but the number of bytes written. Thus do not
compare to 1, instead, compare to greater than 0
for success.

Other cleanup: if the lower layers returned
-errno, then return that, as opposed to
overwriting the error code with one-fits-all
-EINVAL. For instance, some return -EAGAIN.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov 36b1a00d2b drm/amdgpu: Fix width of I2C address
The I2C address is kept as a 16-bit quantity in
the kernel. The I2C_TAR::I2C_TAR field is 10-bit
wide.

Fix the width of the I2C address for Vega20 from 8
bits to 16 bits to accommodate the full spectrum
of I2C address space.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:41 -04:00
Luben Tuikov 16ef797737 drm/amdgpu: EEPROM: add explicit read and write
Add explicit amdgpu_eeprom_read() and
amdgpu_eeprom_write() for clarity.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov 1fab841ff6 drm/amdgpu: RAS xfer to read/write
Wrap amdgpu_ras_eeprom_xfer(..., bool write),
into amdgpu_ras_eeprom_read() and
amdgpu_ras_eeprom_write(), as that makes reading
and understanding the code clearer.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov a43996573a drm/amdgpu: Rename misspelled function
Instead of fixing the spelling in
  amdgpu_ras_eeprom_process_recods(),
rename it to,
  amdgpu_ras_eeprom_xfer(),
to look similar to other I2C and protocol
transfer (read/write) functions.

Also to keep the column span to within reason by
using a shorter name.

Change the "num" function parameter from "int" to
"const u32" since it is the number of items
(records) to xfer, i.e. their count, which cannot
be a negative number.

Also swap the order of parameters, keeping the
pointer to records and their number next to each
other, while the direction now becomes the last
parameter.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov c28aa44de8 drm/amdgpu: RAS: EEPROM --> RAS
In amdgpu_ras_eeprom.c--the interface from RAS to
EEPROM, rename macros from EEPROM to RAS, to
indicate that the quantities and objects are RAS
specific, not EEPROM. We can decrease the RAS
table, or put it in different offset of EEPROM as
needed in the future.

Remove EEPROM_ADDRESS_SIZE macro definition, equal
to 2, from the file and calculations, as that
quantity is computed and added on the stack,
in the lower layer, amdgpu_eeprom_xfer().

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov f4322d80ad drm/amdgpu: I2C class is HWMON
Set the auto-discoverable class of I2C bus to
HWMON. Remove SPD.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov edb63a5308 drm/amdgpu: Fix wrap-around bugs in RAS
Fix the size of the EEPROM from 256000 bytes
to 262144 bytes (256 KiB).

Fix a couple or wrap around bugs. If a valid
value/address is 0 <= addr < size, the inverse of
this inequality (barring negative values which
make no sense here) is addr >= size. Fix this in
the RAS code.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov ccdfbfec9e drm/amdgpu: RAS and FRU now use 19-bit I2C address
Convert RAS and FRU code to use the 19-bit I2C
memory address and remove all "slave_addr", as
this is now absolved into the 19-bit address.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: John Clements <john.clements@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov 025a64a587 drm/amdgpu: I2C EEPROM full memory addressing
* "eeprom_addr" is now 32-bit wide.
* Remove "slave_addr" from the I2C EEPROM driver
  interface. The I2C EEPROM Device Type Identifier
  is fixed at 1010b, and the rest of the bits
  of the Device Address Byte/Device Select Code,
  are memory address bits, where the first three
  of those bits are the hardware selection bits.
  All this is now a 19-bit address and passed
  as "eeprom_addr". This abstracts the I2C bus
  for EEPROM devices for this I2C EEPROM driver.
  Now clients only pass the 19-bit EEPROM memory
  address, to the I2C EEPROM driver, as the 32-bit
  "eeprom_addr", from which they want to read from
  or write to.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov 93ade343bb drm/amdgpu: EEPROM respects I2C quirks
Consult the i2c_adapter.quirks table for
the maximum read/write data length per bus
transaction. Do not exceed this transaction
limit.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov 746b584762 drm/amdgpu: Fixes to the AMDGPU EEPROM driver
* When reading from the EEPROM device, there is no
  device limitation on the number of bytes
  read--they're simply sequenced out. Thus, read
  the whole data requested in one go.

* When writing to the EEPROM device, there is a
  256-byte page limit to write to before having to
  generate a STOP on the bus, as well as the
  address written to mustn't cross over the page
  boundary (it actually rolls over). Maximize the
  data written to per bus acquisition.

* Return the number of bytes read/written, or -errno.

* Add kernel doc.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Luben Tuikov daaa75fd98 drm/amdgpu: Fix Vega20 I2C to be agnostic (v2)
Teach Vega20 I2C to be agnostic. Allow addressing
different devices while the master holds the bus.
Set STOP as per the controller's specification.

v2: Qualify generating ReSTART before the 1st byte
    of the message, when set by the caller, as
    those functions are separated, as caught by
    Andrey G.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Andrey Grodzovsky 6240da4dfc dmr/amdgpu: Add RESTART handling also to smu_v11_0_i2c (VG20)
Also generilize the code to accept and translate to
HW bits any I2C relvent flags both for read and write.

Cc: Jean Delvare <jdelvare@suse.de>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:24:40 -04:00
Andrey Grodzovsky 2485f8cfff drm/amdgpu: Remember to wait 10ms for write buffer flush v2
EEPROM spec requests this.

v2: Only to be done for write data transactions.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Aaron Rice 73a5784a5b drm/amdgpu: rework smu11 i2c for generic operation
Handle things besides EEPROMS.

Signed-off-by: Aaron Rice <wolf@lovehindpa.ws>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher 3e2eae8db2 drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses
Not sure that this really matters that much, but these could
have various other hwmon chips on them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher 39ed82d1d9 drm/amdgpu: i2c subsystem uses 7 bit addresses
Convert from 8 bit to 7 bit.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher 25e5c09f2b drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2)
Use the new helper rather than doing i2c transfers directly.

v2: fix typo

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher 24f55c0559 drm/amdgpu/ras: switch ras eeprom handling to use generic helper
Use the new helper rather than doing i2c transfers directly.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher 00e3a289d9 drm/amdgpu: add new helper for handling EEPROM i2c transfers
Encapsulates the i2c protocol handling so other parts of the
driver can just tell it the offset and size of data to write.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher 6963d6c176 drm/amdgpu: add a mutex for the smu11 i2c bus (v2)
So we lock software as well as hardware access to the bus.

v2: fix mutex handling.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Mukul Joshi 93c5bcd4ea drm/amdgpu: Conditionally reset SDMA RAS error counts
Reset SDMA RAS error counts during init only if persistent
EDC harvesting is not supported.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra 8c21fc49a8 drm/amdkfd: add owner ref param to get hmm pages
The parameter is used in the dev_private_owner to decide if device
pages in the range require to be migrated back to system memory, based
if they are or not in the same memory domain.
In this case, this reference could come from the same memory domain
with devices connected to the same hive.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Deucher 06ac9b6c73 drm/amdgpu: add new dimgrey cavefish DID
Add new PCI device id.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:05:18 -04:00
Huang Rui 9f6a785720 drm/amdgpu: move apu flags initialization to the start of device init
In some asics, we need to adjust the behavior according to the apu flags
at very early stage.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:18 -04:00
Oak Zeng 8dbe43e99f drm/amdgpu: Set ttm caching flags during bo allocation
The ttm caching flags (ttm_cached, ttm_write_combined etc) are
used to determine a buffer object's mapping attributes in both
CPU page table and GPU page table (when that buffer is also
accessed by GPU). Currently the ttm caching flags are set in
function amdgpu_ttm_io_mem_reserve which is called during
DRM_AMDGPU_GEM_MMAP ioctl. This has a problem since the GPU
mapping of the buffer object (ioctl DRM_AMDGPU_GEM_VA) can
happen earlier than the mmap time, thus the GPU page table
update code can't pick up the right ttm caching flags to
decide the right GPU page table attributes.

This patch moves the ttm caching flags setting to function
amdgpu_vram_mgr_new - this function is called during the
first step of a buffer object create (eg, DRM_AMDGPU_GEM_CREATE)
so the later both CPU and GPU mapping function calls will
pick up this flag for CPU/GPU page table set up.

v2: rebase (Alex)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Tested-by: Po Huang <Po.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:12 -04:00
Aaron Liu e2329e74a6 drm/amdgpu: enable sdma0 tmz for Raven/Renoir(V2)
Without driver loaded, SDMA0_UTCL1_PAGE.TMZ_ENABLE is set to 1
by default for all asic. On Raven/Renoir, the sdma goldsetting
changes SDMA0_UTCL1_PAGE.TMZ_ENABLE to 0.
This patch restores SDMA0_UTCL1_PAGE.TMZ_ENABLE to 1.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:02:22 -04:00
Joe Perches 78c14b385c treewide: Add missing semicolons to __assign_str uses
The __assign_str macro has an unusual ending semicolon but the vast
majority of uses of the macro already have semicolon termination.

$ git grep -P '\b__assign_str\b' | wc -l
551
$ git grep -P '\b__assign_str\b.*;' | wc -l
480

Add semicolons to the __assign_str() uses without semicolon termination
and all the other uses without semicolon termination via additional defines
that are equivalent to __assign_str() with the eventual goal of removing
the semicolon from the __assign_str() macro definition.

Link: https://lore.kernel.org/lkml/1e068d21106bb6db05b735b4916bb420e6c9842a.camel@perches.com/
Link: https://lkml.kernel.org/r/48a056adabd8f70444475352f617914cef504a45.camel@perches.com

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-30 09:19:14 -04:00
Dave Airlie 4bac159e59 Short summary of fixes pull:
* amdgpu: Fix test for allocation failures
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmDUMQsACgkQaA3BHVML
 eiNZeQf/Qs8J69XbEUEZ8FvqBbu/4vUr+nREvG69KZsb9oZnUK5Q40GA1IqDt2ig
 vl9DL84ZXpjX2/dwRock2KuZEAC7OO1XWnbQVWn6IGIc38G4RPMfiwLsF+g2Xqot
 mxIZ98wlHdQfJAaAsW5ftjI9pr4yos+Ldo+X6yfzD7zm9KD9VNTJc6oTg/hiAD9Q
 4YeNBMSeRtFoWgLYegQJUkDZTLN8PdweBkxptQswid9C3yrbAtUKfcbuALFuIvm3
 8mDDI/ZV8j0UlJvIYtZPd9PPEkd4iSwG1sjlq+C5e4Vg+cPQmPzdYyPsnkCn/om9
 qTS1Bsb6sBQDHlnNvcHCEMrtKXnCtA==
 =HHZQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-fixes-2021-06-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

Short summary of fixes pull:

 * amdgpu: Fix test for allocation failures

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YNQxVybBGdjLMUQJ@linux-uq9g
2021-06-30 14:24:06 +10:00
Chengming Gui a2f55040cf drm/amd/amdgpu: enable gpu recovery for beige_goby
Enable gpu recovery for beige_goby.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:56 -04:00
Veerabadhran Gopalakrishnan b3a24461f9 amdgpu/nv.c - Added codec query for Beige Goby
Added the Beige Goby capabilities in codec query.

v2: fix build error and indent (James)

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Aaron Liu c8af9390e5 drm/amdgpu: enable tmz on yellow carp
The tmz functions are verified on yellow carp. So enable it by
default.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Evan Quan ff4b601a05 drm/amdgpu: update HDP LS settings
Avoid unnecessary register programming on feature disablement.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Evan Quan 3e7fbfb40f drm/amdgpu: update GFX MGCG settings
Update GFX MGCG related settings.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:22 -04:00
Evan Quan 754e9883d4 drm/amdgpu: correct clock gating settings on feature unsupported
Clock gating setting is still performed even when the corresponding
CG feature is not supported. And the tricky part is disablement is
actually performed no matter for enablement or disablement request.
That seems not logically right.
Considering HW should already properly take care of the CG state, we
will just skip the corresponding clock gating setting when the feature
is not supported.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:22 -04:00
Evan Quan adcf949e66 drm/amdgpu: fix the hang caused by PCIe link width switch
SMU had set all the necessary fields for a link width switch
but the width switch wasn't occurring because the link was idle
in the L1 state. Setting LC_L1_RECONFIG_EN=0x1 will allow width
switches to also be initiated while in L1 instead of waiting until
the link is back in L0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-30 00:18:14 -04:00
Evan Quan 5a5da8ae95 drm/amdgpu: fix NAK-G generation during PCI-e link width switch
A lot of NAK-G being generated when link widht switching is happening.
WA for this issue is to program the SPC to 4 symbols per clock during
bootup when the native PCIE width is x4.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-30 00:17:56 -04:00
Evan Quan 9c26ddb1c5 drm/amdgpu: fix Navi1x tcp power gating hang when issuing lightweight invalidaiton
Fix TCP hang when a lightweight invalidation happens on Navi1x.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:17:47 -04:00
Evan Quan 0dbc2c81a1 drm/amdgpu: correct tcp harvest setting
Add missing settings for SQC bits. And correct some confusing logics
around active wgp bitmap calculation.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:16:55 -04:00
Zhan Liu a51482458d drm/amd/display: Enabling eDP no power sequencing with DAL feature mask
[Why]
Sometimes, DP receiver chip power-controlled externally by an
Embedded Controller could be treated and used as eDP,
if it drives mobile display. In this case,
we shouldn't be doing power-sequencing, hence we can skip
waiting for T7-ready and T9-ready."

[How]
Added a feature mask to enable eDP no power sequencing feature.

To enable this, set 0x10 flag in amdgpu.dcfeaturemask on
Linux command line.

Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-29 23:41:22 -04:00
Liam Howlett da68547d36 drm/amdgpu: use vma_lookup() in amdgpu_ttm_tt_get_user_pages()
Use vma_lookup() to find the VMA at a specific address.  As vma_lookup()
will return NULL if the address is not within any VMA, the start address
no longer needs to be validated.

Link: https://lkml.kernel.org/r/20210521174745.2219620-14-Liam.Howlett@Oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29 10:53:51 -07:00
Nirmoy Das ba2472eaf7 drm/amdgpu: return early for non-TTM_PL_TT type BOs
Return early for non-TTM_PL_TT BOs so that we don't pass
wrong pointer to amdgpu_gtt_mgr_has_gart_addr() which assumes
ttm_resource argument to be TTM_PL_TT type BO's.

v3: remove extra braces.
v2: merge if-conditions.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210629114413.3371-1-nirmoy.das@amd.com
2021-06-29 15:19:18 +02:00
Thomas Zimmermann 0cabcf83b2 drm/amdgpu: Track IRQ state in local device state
Replace usage of struct drm_device.irq_enabled with the driver's
own state field struct amdgpu_device.irq.installed. The field in
the DRM device structure is considered legacy and should not be
used by KMS drivers.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625082222.3845-2-tzimmermann@suse.de
2021-06-29 11:03:17 +02:00
Dave Airlie 5e0e7a4076 A DMA address check for nouveau, an error code return fix for kmb, fixes
to wait for a moving fence after pinning the BO for amdgpu, nouveau and
 radeon, a crtc and async page flip fix for atmel-hlcdc and a cpu hang
 fix for vc4.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYNTXAQAKCRDj7w1vZxhR
 xfgKAP9kNJ3Sx/TN64dwYX6nwSSvkw8E7FfDNrCggLyaEn03UQD+N8fMx1oHy3eV
 vx517gHHauMbyUZT4OuV/YFN2YEn/QI=
 =+lbY
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2021-06-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

A DMA address check for nouveau, an error code return fix for kmb, fixes
to wait for a moving fence after pinning the BO for amdgpu, nouveau and
radeon, a crtc and async page flip fix for atmel-hlcdc and a cpu hang
fix for vc4.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624190353.wyizoil3wqrrxz5d@gilmour
2021-06-25 06:05:13 +10:00
Andrey Grodzovsky ea7acd7c59 drm/amdgpu: Fix BUG_ON assert
With added CPU domain to placement you can have
now 3 placemnts at once.

CC: stable@kernel.org
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-5-andrey.grodzovsky@amd.com
2021-06-23 14:59:39 -04:00
Lang Yu 2b70af79fd drm/amdgpu: switch gtt_mgr to counting used pages
Change mgr->available into mgr->used (invert the value).

Makes more sense to do it this way since we don't need the spinlock any
more to double check the handling.

v3 (chk): separated from the TEMPOARAY FLAG change.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-4-andrey.grodzovsky@amd.com
2021-06-23 14:59:39 -04:00
Christian König 9a22149e95 ydrm/amdgpu: always allow evicting to SYSTEM domain
When we run out of GTT we should still be able to evict VRAM->SYSTEM
with a bounce bufferdrm/amdgpu: always allow evicting to SYSTEM domain

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-3-andrey.grodzovsky@amd.com
2021-06-23 14:59:39 -04:00
Lang Yu 3e640f1bb8 drm/amdgpu: user temporary GTT as bounce buffer
Currently, we have a limitted GTT memory size and need a bounce buffer
when doing buffer migration between VRAM and SYSTEM domain.

The problem is under GTT memory pressure we can't do buffer migration
between VRAM and SYSTEM domain. But in some cases we really need that.
Eespecially when validating a VRAM backing store BO which resides in
SYSTEM domain.

v2: still account temporary GTT allocations
v3 (chk): revert to the simpler change for now

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-2-andrey.grodzovsky@amd.com
2021-06-23 14:59:39 -04:00
Christian König 8ddf5b9bb4 drm/amdgpu: wait for moving fence after pinning
We actually need to wait for the moving fence after pinning
the BO to make sure that the pin is completed.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-3-christian.koenig@amd.com
2021-06-22 15:29:03 +02:00
Christian König 8c505bdc9c drm/amdgpu: rework dma_resv handling v3
Drop the workaround and instead implement a better solution.

Basically we are now chaining all submissions using a dma_fence_chain
container and adding them as exclusive fence to the dma_resv object.

This way other drivers can still sync to the single exclusive fence
while amdgpu only sync to fences from different processes.

v3: add the shared fence first before the exclusive one

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210614174536.5188-2-christian.koenig@amd.com
2021-06-22 11:05:05 +02:00
Christian König 22f0463ae6 drm/amdgpu: unwrap fence chains in the explicit sync fence
Unwrap the explicit fence if it is a dma_fence_chain and
sync to the first fence not matching the owner rules.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210614174536.5188-1-christian.koenig@amd.com
2021-06-22 11:05:04 +02:00
Yifan Zhang 962f2f1ae2 Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."
This reverts commit 631003101c.

Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs fail to enter gfxoff in certain user cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-21 17:44:16 -04:00
Yifan Zhang a334bb6979 Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell."
This reverts commit 1ba7b24ba6.

Reason for revert: Side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs fail to enter gfxoff in certain user cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-21 17:43:56 -04:00
Yifan Zhang ee5468b9f1 Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."
This reverts commit 4cbbe34807.

Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs fail to enter gfxoff in certain user cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-21 17:22:52 -04:00
Yifan Zhang baacf52a47 Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell."
This reverts commit 1c0b0efd14.

Reason for revert: Side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs fail to enter gfxoff in certain user cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-21 17:22:06 -04:00
Michel Dänzer 4c6a23188e drm/amdgpu: Call drm_framebuffer_init last for framebuffer init
Once drm_framebuffer_init has returned 0, the framebuffer is hooked up
to the reference counting machinery and can no longer be destroyed with
a simple kfree. Therefore, it must be called last.

If drm_framebuffer_init returns 0 but its caller then returns non-0,
there will likely be memory corruption fireworks down the road.
The following lead me to this fix:

[   12.891228] kernel BUG at lib/list_debug.c:25!
[...]
[   12.891263] RIP: 0010:__list_add_valid+0x4b/0x70
[...]
[   12.891324] Call Trace:
[   12.891330]  drm_framebuffer_init+0xb5/0x100 [drm]
[   12.891378]  amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu]
[   12.891592]  ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu]
[   12.891794]  amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu]
[   12.891995]  drm_internal_framebuffer_create+0x378/0x3f0 [drm]
[   12.892036]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892075]  drm_mode_addfb2+0x34/0xd0 [drm]
[   12.892115]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892153]  drm_ioctl_kernel+0xe2/0x150 [drm]
[   12.892193]  drm_ioctl+0x3da/0x460 [drm]
[   12.892232]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892274]  amdgpu_drm_ioctl+0x43/0x80 [amdgpu]
[   12.892475]  __se_sys_ioctl+0x72/0xc0
[   12.892483]  do_syscall_64+0x33/0x40
[   12.892491]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: f258907fdd "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-21 17:21:49 -04:00
Dan Carpenter eed75ce7c8 drm/amdgpu: fix amdgpu_preempt_mgr_new()
There is a reversed if statement in amdgpu_preempt_mgr_new() so it
always returns -ENOMEM.

Fixes: 09b020bb05 ("Merge tag 'drm-misc-next-2021-06-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMxbQXg/Wqm0ACxt@mwanda
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-06-21 15:24:29 +02:00
Michel Dänzer 24981fa336 drm/amdgpu: Call drm_framebuffer_init last for framebuffer init
Once drm_framebuffer_init has returned 0, the framebuffer is hooked up
to the reference counting machinery and can no longer be destroyed with
a simple kfree. Therefore, it must be called last.

If drm_framebuffer_init returns 0 but its caller then returns non-0,
there will likely be memory corruption fireworks down the road.
The following lead me to this fix:

[   12.891228] kernel BUG at lib/list_debug.c:25!
[...]
[   12.891263] RIP: 0010:__list_add_valid+0x4b/0x70
[...]
[   12.891324] Call Trace:
[   12.891330]  drm_framebuffer_init+0xb5/0x100 [drm]
[   12.891378]  amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu]
[   12.891592]  ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu]
[   12.891794]  amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu]
[   12.891995]  drm_internal_framebuffer_create+0x378/0x3f0 [drm]
[   12.892036]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892075]  drm_mode_addfb2+0x34/0xd0 [drm]
[   12.892115]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892153]  drm_ioctl_kernel+0xe2/0x150 [drm]
[   12.892193]  drm_ioctl+0x3da/0x460 [drm]
[   12.892232]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892274]  amdgpu_drm_ioctl+0x43/0x80 [amdgpu]
[   12.892475]  __se_sys_ioctl+0x72/0xc0
[   12.892483]  do_syscall_64+0x33/0x40
[   12.892491]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: f258907fdd "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:14:44 -04:00
Bokun Zhang 376002f4b0 drm/amd/amdgpu: Use IP discovery data to determine VCN enablement instead of MMSCH
In the past, we use MMSCH to determine whether a VCN is enabled or not.
This is not reliable since after a FLR, MMSCH may report junk data.

It is better to use IP discovery data.

Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:14:01 -04:00
Yifan Zhang 942ab769c5 drm/amdgpu: remove unused parameter in amdgpu_gart_bind
Pagelist is no long used in amdgpu_gart_bind. Remove it.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:12:47 -04:00
Stanley.Yang 513befa634 drm/amdgpu: message smu to update hbm bad page number
Use SMU to update the bad pages rather than directly
accessing the EEPROM from the driver.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:11:56 -04:00
Ashish Pawar 7c5f3d7d61 drm/amdgpu: PWRBRK sequence changes for Aldebaran
Modify power brake enablement sequence on Aldebaran

Signed-off-by: Ashish Pawar <ashish.pawar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:11:49 -04:00
Stanley.Yang 6ec598cc9d drm/amdgpu: fix bad address translation for sienna_cichlid
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:11:40 -04:00
Rodrigo Siqueira 5fd953a3f6 drm/amd/display: Add Freesync video documentation
Recently, we added support for an experimental feature named Freesync
video; for more details on that, refer to:

commit 6f59f229f8 ("drm/amd/display: Skip modeset for front porch change")
commit d10cd527f5 ("drm/amd/display: Add freesync video modes based on preferred modes")
commit 0eb1af2e82 ("drm/amd/display: Add module parameter for freesync video mode")

Nevertheless, we did not document it in detail in our driver. This
commit introduces a kernel-doc and expands the module parameter
description.

Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:06:43 -04:00
Alex Deucher 26c0504ad3 drm/amdgpu/vcn3: drop extraneous Beige Goby hunk
Probably a rebase leftover.  This doesn't apply to SR-IOV, and
the non-SR-IOV code below it already handles this properly.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:01:35 -04:00
Stanley.Yang e11d5e0d68 drm/amdgpu: add vega20 to ras quirk list
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:01:35 -04:00
xinhui pan 84408d5f38 drm/amdgpu: Set TTM_PAGE_FLAG_SG earlier for userprt BOs
Because TTM do page counting on userptr BOs which is actually not
needed. To avoid that, lets set TTM_PAGE_FLAG_SG after tt_create and
before tt_populate.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-18 17:01:35 -04:00
Dan Carpenter 12fc23a4a3 drm/amdgpu: fix amdgpu_preempt_mgr_new()
There is a reversed if statement in amdgpu_preempt_mgr_new() so it
always returns -ENOMEM.

Fixes: 09b020bb05 ("Merge tag 'drm-misc-next-2021-06-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMxbQXg/Wqm0ACxt@mwanda
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-06-18 19:00:16 +02:00
Yifan Zhang 1c0b0efd14 drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-16 16:04:20 -04:00
Yifan Zhang 4cbbe34807 drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-16 16:04:20 -04:00
Felix Kuehling d760895d55 drm/amdgpu: Use spinlock_irqsave for pasid_lock
This should fix a kernel LOCKDEP warning on Vega10:
[  149.416604] ================================
[  149.420877] WARNING: inconsistent lock state
[  149.425152] 5.11.0-kfd-fkuehlin #517 Not tainted
[  149.429770] --------------------------------
[  149.434053] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  149.440059] swapper/3/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  149.445198] ffff9ac80e005d68 (&adev->vm_manager.pasid_lock){?.+.}-{2:2}, at: amdgpu_vm_get_task_info+0x25/0x90 [amdgpu]
[  149.456252] {HARDIRQ-ON-W} state was registered at:
[  149.461136]   lock_acquire+0x242/0x390
[  149.464895]   _raw_spin_lock+0x2c/0x40
[  149.468647]   amdgpu_vm_handle_fault+0x44/0x380 [amdgpu]
[  149.474187]   gmc_v9_0_process_interrupt+0xa8/0x410 [amdgpu]
...

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Yifan Zhang 1ba7b24ba6 drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Yifan Zhang 631003101c drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Nirmoy Das e18aaea733 drm/amdgpu: move shadow_list to amdgpu_bo_vm
Move shadow_list to struct amdgpu_bo_vm as shadow BOs
are part of PT/PD BOs.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Nirmoy Das 23e24fbb76 drm/amdgpu: parameterize ttm BO destroy callback
Make provision to pass different ttm BO destroy callback
while creating a amdgpu_bo.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Nirmoy Das 391629bdfc drm/amdgpu: remove amdgpu_vm_pt
Page table entries are now in embedded in VM BO, so
we do not need struct amdgpu_vm_pt. This patch replaces
struct amdgpu_vm_pt with struct amdgpu_vm_bo_base.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Hawking Zhang ed4454c384 drm/amdgpu: correct psp ucode arrary start address
For ASICs that need to load sys_drv_aux and sos_aux,
the sys_start_addr is not the start address of psp
ucode array because the sys_drv_aux and sos_aux actaully
located at the end of the ucode array, instead, the
psp ucode arrary start address should be sos_hdr +
sos_hdr_offset.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-15 17:25:42 -04:00
Christian König 440d0f12b5 dma-buf: add dma_fence_chain_alloc/free v3
Add a common allocation helper. Cleaning up the mix of kzalloc/kmalloc
and some unused code in the selftest.

v2: polish kernel doc a bit
v3: polish kernel doc even a bit more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210611120301.10595-3-christian.koenig@amd.com
2021-06-14 19:38:34 +02:00
Hawking Zhang 3a07101b04 drm/amdgpu: disable DRAM memory training when GECC is enabled
GECC and G6 mem training are mutually exclusive
functionalities. VBIOS/PSP will set the flag
(BOOT_CFG_FEATURE_TWO_STAGE_DRAM_TRAINING) in
runtime database to indicate whether dram memory
training need to be disabled or not.

For Navi1x families, two stage mem training is always
enabled.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:06:21 -04:00
Hawking Zhang 8e6e054da6 drm/amdgpu: cache psp runtime boot_cfg_bitmask in sw_int
PSP runtime boot_cfg_bitmask carries various psp bl
feature bit mask that can be used by driver. Cache
it in sw_init for further usage.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:06:15 -04:00
Hawking Zhang 3d689ae4a9 drm/amdgpu: add helper function to query psp runtime db entry (v2)
PSP will dump various boot up information into a
portion of local frame buffer, called runtime database.
The helper function is used for driver to query those
shared information.

v2: init ret and check !ret to exit loop as soon as
found the entry

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:06:03 -04:00
Hawking Zhang 990ec3014d drm/amdgpu: add psp runtime db structures
PSP runtime database is used to share various
boot up information with driver.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:05:57 -04:00
Hawking Zhang 6246a416eb drm/amdgpu: enable dynamic GECC support (v2)
Dynamic GECC allows user to specify GECC enablement
status, which will take effect in next boot cycle.

v2: initialize boot_cfg to 0xFF

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:05:51 -04:00
Hawking Zhang c664223491 drm/amdgpu: add helper function to query gecc status in boot config
Query GECC enablement status in boot config

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:05:46 -04:00
Hawking Zhang 55188d64ed drm/amdgpu: allow different boot configs
More boot configs need to be supported via
BOOTCFG_CMD_SET

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:05:41 -04:00
Hawking Zhang b08be1209e drm/amdgpu: update psp gfx i/f to support dynamic GECC
psp_gfx_uresp_bootcfg is used to inform driver
bootcfg settings maintained by tOS

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:05:33 -04:00
Guchun Chen a3fbb0d810 drm/amdgpu: use adev_to_drm macro for consistency (v2)
Use adev_to_drm() to get to the drm_device pointer.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:03:32 -04:00
YuBiao Wang 29b4ac0ed9 drm/amdgpu: reset psp ring wptr during ring_create
[Why]
psp ring wptr is not initialized properly in ring_create,
which would lead to psp failure after several gpu reset.

[How]
Set ring_wptr to zero in psp_ring_create.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:03:26 -04:00
Peng Ju Zhou c89d2a2fe0 drm/amd/amdgpu: add instance_number check in amdgpu_discovery_get_ip_version
The original code returns IP version of instantce_0 for every IP. This implementation may be correct for most of IPs.

However, for certain IP block (VCN for example), it may have 2 instances and
both of them have the same hw_id, BUT they have different revision number (0 and 1).

In this case, the original amdgpu_discovery_get_ip_version cannot correct reflects
the result and returns false information

Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:02:50 -04:00
Evan Quan 415e51bdcf drm/amdgpu: make audio dev's D-state transition PMFW-aware
To correctly kick into BACO state, the audio dev's D-state
transition(D0->D3) needs to be PMFW-aware. So, if the audio
dev entered D3 state prior to our driver, we need to bring
it back to D0 state and make sure there will be a D-state
transition on runpm suspend.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:02:17 -04:00
Evan Quan 5f0f1727c4 drm/amd/pm: drop the incomplete fix for Navi14 runpm issue
As the fix by adding PPSMC_MSG_PrepareMp1ForUnload is proved to
be incomplete. Another fix(see link below) has been sent out.
Link: https://lore.kernel.org/linux-pci/20210602021255.939090-1-evan.quan@amd.com/

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-11 16:01:24 -04:00
John Clements 2a9a151fe8 drm/amdgpu: Added support for loading auxiliary PSP FW
In the case with xgmi connected to cpu load alternate psp fw

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-10 11:44:24 -04:00
John Clements 79a0f4415c drm/amdgpu: Updated fw header structure source
synchronized fw header with latest source

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-10 11:44:24 -04:00
Nirmoy Das bc05716d4f drm/amdkfd: use allowed domain for vmbo validation
Fixes handling when page tables are in system memory.

v3: remove struct amdgpu_vm_parser.
v2: remove unwanted variable.
    change amdgpu_amdkfd_validate instead of amdgpu_amdkfd_bo_validate.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-10 11:44:24 -04:00
Dave Airlie c707b73f0c Merge tag 'amd-drm-next-5.14-2021-06-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-06-09:

amdgpu:
- SR-IOV fixes
- Smartshift updates
- GPUVM TLB flush updates
- 16bpc fixed point display fix for DCE11
- BACO cleanups and core refactoring
- Aldebaran updates
- Initial Yellow Carp support
- RAS fixes
- PM API cleanup
- DC visual confirm updates
- DC DP MST fixes
- DC DML fixes
- Misc code cleanups and bug fixes

amdkfd:
- Initial Yellow Carp support

radeon:
- memcpy_to/from_io fixes

UAPI:
- Add Yellow Carp chip family id
  Used internally in the kernel driver and by mesa

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610031649.4006-1-alexander.deucher@amd.com
2021-06-10 13:47:13 +10:00
Dave Airlie 691cf8cd7a drm/amdgpu: use correct rounding macro for 64-bit
This fixes 32-bit arm build due to lack of 64-bit divides.

Fixes: cb1c81467a ("drm/ttm: flip the switch for driver allocated resources v2")
Link: https://patchwork.freedesktop.org/patch/438442/
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-06-10 13:34:08 +10:00
Alex Deucher 2c1b1ac708 drm/amdgpu/vcn: drop gfxoff control for VCN2+
Drop disabling of gfxoff during VCN use.  This allows gfxoff
to kick in and potentially save power if the user is not using
gfx for color space conversion or scaling.

VCN1.0 had a bug which prevented it from working properly with
gfxoff, so we disabled it while using VCN.  That said, most apps
today use gfx for scaling and color space conversion rather than
overlay planes so it was generally in use anyway. This was fixed
on VCN2+, but since we mostly use gfx for color space conversion
and scaling and rapidly powering up/down gfx can negate the
advantages of gfxoff, we left gfxoff disabled. As more
applications use overlay planes for color space conversion
and scaling, this starts to be a win, so go ahead and leave
gfxoff enabled.

Note that VCN1.0 uses vcn_v1_0_idle_work_handler() and
vcn_v1_0_ring_begin_use() so they are not affected by this
patch.

Reviewed-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-09 22:15:02 -04:00
Dave Airlie 09b020bb05 drm-misc-next for 5.14:
UAPI Changes:
 
  * drm/panfrost: Export AFBC_FEATURES register to userspace
 
 Cross-subsystem Changes:
 
  * dma-buf: Fix debug printing; Rename dma_resv_*() functions + changes
    in callers; Cleanups
 
 Core Changes:
 
  * Add prefetching memcpy for WC
 
  * Avoid circular dependency on CONFIG_FB
 
  * Cleanups
 
  * Documentation fixes throughout DRM
 
  * ttm: Make struct ttm_resource the base of all managers + changes
    in all users of TTM; Add a generic memcpy for page-based iomem; Remove
    use of VM_MIXEDMAP; Cleanups
 
 Driver Changes:
 
  * drm/bridge: Add TI SN65DSI83 and SN65DSI84 + DT bindings
 
  * drm/hyperv: Add DRM driver for HyperV graphics output
 
  * drm/msm: Fix module dependencies
 
  * drm/panel: KD53T133: Support rotation
 
  * drm/pl111: Fix module dependencies
 
  * drm/qxl: Fixes
 
  * drm/stm: Cleanups
 
  * drm/sun4i: Be explicit about format modifiers
 
  * drm/vc4: Use struct gpio_desc; Cleanups
 
  * drm/vgem: Cleanups
 
  * drm/vmwgfx: Use ttm_bo_move_null() if there's nothing to copy
 
  * fbdev/mach64: Cleanups
 
  * fbdev/mb862xx: Use DEVICE_ATTR_RO
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmDAcD0ACgkQaA3BHVML
 eiMkvwf8CwJk2XBHwejx07UKR09jXD2fdHqXElPSsPCwz/L+zIIAr5NqswQupnKl
 n8WAPgrXAGGpQuQEdkjYbukpL6kWIbg+nqdynWSS7Zf6h0SdZMqdYxGdJ9ciarVs
 Aoc56RLJaD97CaxPD5PmkQxUuRyXlMHINjUGevjWqIcGG3CMmh+AdCGx5RChMG4K
 MiIMgdzdg09AGGmlWTe56y7ihH1RWSfgyh/BHsMJ+bxhIpLQzm7Yul5zMSh/hQY5
 qJdDAdKGOKj99Z+UL9C8ZTU3sAMHfqZR0DyqlFTd7cYvT6ZnFoF1mGJ+Tkpz/DB2
 r4/CX2B6x39sNV1lOF7qKQ1kQLgEBw==
 =VT2X
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2021-06-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.14:

UAPI Changes:

 * drm/panfrost: Export AFBC_FEATURES register to userspace

Cross-subsystem Changes:

 * dma-buf: Fix debug printing; Rename dma_resv_*() functions + changes
   in callers; Cleanups

Core Changes:

 * Add prefetching memcpy for WC

 * Avoid circular dependency on CONFIG_FB

 * Cleanups

 * Documentation fixes throughout DRM

 * ttm: Make struct ttm_resource the base of all managers + changes
   in all users of TTM; Add a generic memcpy for page-based iomem; Remove
   use of VM_MIXEDMAP; Cleanups

Driver Changes:

 * drm/bridge: Add TI SN65DSI83 and SN65DSI84 + DT bindings

 * drm/hyperv: Add DRM driver for HyperV graphics output

 * drm/msm: Fix module dependencies

 * drm/panel: KD53T133: Support rotation

 * drm/pl111: Fix module dependencies

 * drm/qxl: Fixes

 * drm/stm: Cleanups

 * drm/sun4i: Be explicit about format modifiers

 * drm/vc4: Use struct gpio_desc; Cleanups

 * drm/vgem: Cleanups

 * drm/vmwgfx: Use ttm_bo_move_null() if there's nothing to copy

 * fbdev/mach64: Cleanups

 * fbdev/mb862xx: Use DEVICE_ATTR_RO

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YMBw3DF2b9udByfT@linux-uq9g
2021-06-10 11:28:09 +10:00
Rohit Khaire c247c021b1 drm/amdgpu: Fix incorrect register offsets for Sienna Cichlid
RLC_CP_SCHEDULERS and RLC_SPARE_INT0 have different
offsets for Sienna Cichlid

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:03:55 -04:00
Michel Dänzer b71a52f447 drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FB
drm_err meant broken user space could spam dmesg.

Fixes: f258907fdd "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:03:09 -04:00
Changfeng 2a48b5911c drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create
It will cause error when alloc memory larger than 128KB in
amdgpu_bo_create->kzalloc. So it needs to switch kzalloc to kvzalloc.

Call Trace:
   alloc_pages_current+0x6a/0xe0
   kmalloc_order+0x32/0xb0
   kmalloc_order_trace+0x1e/0x80
   __kmalloc+0x249/0x2d0
   amdgpu_bo_create+0x102/0x500 [amdgpu]
   ? xas_create+0x264/0x3e0
   amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
   amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
   amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 14:01:56 -04:00
Rohit Khaire 2b9ced5a96 drm/amdgpu: Use PSP to program IH_RB_CNTL_RING1/2 on SRIOV
This is similar to IH_RB_CNTL programming in
navi10_ih_toggle_ring_interrupts

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:24:26 -04:00
Zhigang Luo e1944deba1 drm/amdgpu: allocate psp fw private buffer from VRAM for sriov vf
psp added new feature to check fw buffer address for sriov vf. the
address range must be in vf fb.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:15:13 -04:00
Zhigang Luo 93cdc1759b drm/amdgpu: add psp ta microcode init for aldebaran sriov vf
need to load xgmi ta for aldebaran sriov vf.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:15:07 -04:00
Zhigang Luo 488b83f4d5 drm/amdgpu: remove sriov vf mmhub system aperture and fb location programming
host driver programmed mmhub system aperture and fb location for vf, no
need to program in guest side.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:15:01 -04:00
Zhigang Luo 95066fd5d2 drm/amdgpu: remove sriov vf gfxhub fb location programming
host driver programmed the gfxhub fb location for vf, no need to
program in guest side.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:14:55 -04:00
Zhigang Luo adbe2e3d34 drm/amdgpu: remove sriov vf checking from getting fb location
host driver programmed fb location registers for vf, no need to
check anymore.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:14:36 -04:00
Nirmoy Das 6ceba306c0 drm/amdgpu: fix shadow bo skip condition
Create shadow BOs only for no-compute VM context and only for dGPU.
The existing if-condition would create shadow bo for compute context
on dGPU which not what we wanted.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:14:18 -04:00
Christophe JAILLET d5c9096541 drm/amdgpu: Fix a a typo in a comment
s/than/then/

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-07 14:58:16 -04:00
Eric Huang 7a68d188d1 drm/amdgpu: Fix warning of Function parameter or member not described
Add the parameter table_freed description on function description.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-07 14:58:01 -04:00
Christian König 0ac8f58760 drm/amdgpu: fix VM handling for GART allocations
For GTT allocations with a GART address the res contains the VMID0
addresses and can't be used for VM handling.

So ignore the res when the pages array is given or we fill the page
tables with nonsense.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-07 14:57:44 -04:00
Peng Ju Zhou 9a3bf287c4 drm/amdgpu: Fixing "Indirect register access for Navi12 sriov" for vega10
The NV12 and VEGA10 share the same interface W/RREG32_SOC15*,
the callback functions in these macros may not be defined,
so NULL pointer must be checked but not in
macro __WREG32_SOC15_RLC__, fixing the lock of NULL pointer check.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-07 14:57:38 -04:00
John Clements 312d9253ec drm/amdgpu: Update psp fw attestation support list
Disable support on APU

Reviewed-by: Changfeng <Changfeng.Zhu@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-07 14:57:32 -04:00
Rohit Khaire cf2a22e408 drm/amdgpu: Modify register access in sdma_v5_2 to use _SOC15 macros
In SRIOV environment, KMD should access SDMA registers
through RLCG if GC indirect access flag enabled.

Using _SOC15 read/write macros ensures that they go
through RLC when the flag is enabled.

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-07 14:56:53 -04:00
Thomas Hellström abb50d67ad drm/ttm, drm/amdgpu: Allow the driver some control over swapping
We are calling the eviction_valuable driver callback at eviction time to
determine whether we actually can evict a buffer object.
The upcoming i915 TTM backend needs the same functionality for swapout,
and that might actually be beneficial to other drivers as well.

Add an eviction_valuable call also in the swapout path. Try to keep the
current behaviour for all drivers by returning true if the buffer object
is already in the TTM_PL_SYSTEM placement. We change behaviour for the
case where a buffer object is in a TT backed placement when swapped out,
in which case the drivers normal eviction_valuable path is run.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210602083818.241793-8-thomas.hellstrom@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-8-thomas.hellstrom@linux.intel.com
2021-06-07 16:07:09 +02:00
Christian König d3fae3b3da dma-buf: drop the _rcu postfix on function names v3
The functions can be called both in _rcu context as well
as while holding the lock.

v2: add some kerneldoc as suggested by Daniel
v3: fix indentation

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com
2021-06-06 11:19:51 +02:00
Christian König fb5ce730f2 dma-buf: rename and cleanup dma_resv_get_list v2
When the comment needs to state explicitly that this is doesn't get a reference
to the object then the function is named rather badly.

Rename the function and use it in even more places.

v2: use dma_resv_shared_list as new name

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-5-christian.koenig@amd.com
2021-06-06 11:18:19 +02:00
Christian König 6edbd6abb7 dma-buf: rename and cleanup dma_resv_get_excl v3
When the comment needs to state explicitly that this
doesn't get a reference to the object then the function
is named rather badly.

Rename the function and use rcu_dereference_check(), this
way it can be used from both rcu as well as lock protected
critical sections.

v2: improve kerneldoc as suggested by Daniel
v3: use dma_resv_excl_fence as function name

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-4-christian.koenig@amd.com
2021-06-06 11:17:58 +02:00
Nicholas Kazlauskas c8b73f7fdb drm/amdgpu: Add DC support and display block for Yellow Carp
To enable output on real display instead of virtual.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:39:19 -04:00
James Zhu bdc974cfd7 drm/amdgpu: add video_codecs query support for yellow carp
Add video_codecs query support for yellow carp.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:25 -04:00
Aaron Liu 7d38d9dc4e drm/amdgpu: add mode2 reset support for yellow carp
This patch adds mode2 reset support for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:24 -04:00
Xiaomeng Hou 0cf6faafc4 drm/amdgpu: correct the cu and rb info for yellow carp
Skip disabled sa to correct the cu_info and active_rbs for yellow carp.

Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Suggested-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:24 -04:00
Xiaomeng Hou b3accd6f66 drm/amdgpu: add gpu harvest support for yellow carp (v2)
Register callback in gfxhub functions to program the bypass groups in
gc_utcl2 corresponding to harvested SA.

v2: update comments (Alex)

Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:24 -04:00
Nicholas Kazlauskas 4b16196752 drm/amdgpu: Load TA firmware for yellow carp
Add TA firmware to module firmware list for yellow carp and call
psp_init_ta_microcode to parse the TA firmware for HDCP support.

Cc: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:22 -04:00
Aaron Liu de8d6375e3 drm/amdgpu: add timestamp counter query support for yellow carp
Allows software to query HW counters to timestamp submissions.
This patch can address KFDPerfCountersTest.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: chen gong <curry.gong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:19 -04:00
Aaron Liu bb763b5f8e drm/amdgpu: add RLC_PG_DELAY_3 for yellow carp
RLC_PG_DELAY_3 is to make RLC in safe mode to
prevent any misalignment or conflict in middle of any power
feature entry/exit sequence when CGPG feature is enabled.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:19 -04:00
Aaron Liu 948b1216c9 drm/amdgpu: enable VCN PG and CG for yellow carp
Enable VCN 3.0 PG and CG for Yellow Carp by setting up flags.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:18 -04:00
James Zhu 54f4f6f359 drm/amdgpu: enable vcn dpg mode on yellow carp
Enable vcn dpg mode on yellow carp.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:18 -04:00
James Zhu ee8d893f0f drm/amdgpu: enable vcn/jpeg on yellow carp
Enable vcn/jpeg IP on yellow carp.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:18 -04:00
James Zhu 737a9f860f drm/amdgpu/vcn: add vcn support for yellow carp
Add vcn firmware support for yellow carp

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:17 -04:00
James Zhu 3d417b5857 drm/amdgpu/jpeg: Remove harvest checking on CHIP_YELLOW_CARP
Register CC_UVD_HARVESTING is obsolete on CHIP_YELLOW_CARP.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:17 -04:00
Aaron Liu db72c3fac9 drm/amdgpu: add IH Clock Gating support for yellow carp
IH CG need to be enabled by driver.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:16 -04:00
Aaron Liu b7dd14c730 drm/amdgpu: add ATHUB Clock Gating support for yellow carp
ATHUB MGCG/MGLS is enabled by default.
Adding ATHUB MGCG/MGLS flag to ensure athub mgcg/ls enabled.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:16 -04:00
Aaron Liu 6bd955723e drm/amdgpu: add HDP Clock Gating support for yellow carp
HDP MGCG is enabled by default.
Adding AMD_CG_SUPPORT_HDP_MGCG to ensure hdp mgcg enabled.
HDP MGLS need to be enabled by driver.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:16 -04:00
Aaron Liu f1e9aa65f8 drm/amdgpu: add SDMA Clock Gating support for yellow carp
Add AMD_CG_SUPPORT_SDMA_LS support.
SDMA MGCG programming is migrated to RLC.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:15 -04:00
Aaron Liu fd0a316e21 drm/amdgpu: add GFX Power Gating support for yellow carp
Add GFX Power Gating support.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:14 -04:00
Aaron Liu 83ae09b52f drm/amdgpu: add MMHUB Clock Gating support for yellow carp
Add AMD_CG_SUPPORT_MC_MGCG/AMD_CG_SUPPORT_MC_LS support.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:14 -04:00
Aaron Liu 9c6c48e623 drm/amdgpu: add GFX Clock Gating support for yellow carp
Add below supports:
GFX Coarse Grain Clock Gating(CGCG)
GFX Coarse grain light sleep/deep sleep(CGLS)
GFX Medium Grain Clock Gating(MGCG)
GFX Medium Grain light sleep/deep sleep(MGLS)
GFX Fine Grain Clock Gating(FGCG)
RLC MGLS
CP  MGLS

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:14 -04:00
Aaron Liu 903bb18bcd drm/amdgpu: enable psp_v13 for yellow carp
This patch enables psp_v13 for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:13 -04:00
Aaron Liu 04a69d20a0 drm/amdgpu: add psp_v13 support for yellow carp
This patch adds psp_v13 support for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:12 -04:00
Alex Deucher 1b3869386e drm/amdgpu: add mmhub client support for yellow carp
To help debugging GPUVM page faults.

Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:12 -04:00
Aaron Liu bea7534994 drm/amdgpu: reserved buffer is not needed with ip discovery enabled
When IP discovery enabled, the reserved buffer has been alloacted.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:12 -04:00
Huang Rui e15a5fb9b6 drm/amdgpu: introduce a stolen reserved buffer to protect specific buffer region (v2)
Some ASICs such as Yellow Carp needs to reserve a region of video memory
to avoid access from driver. So this patch is to introduce a stolen
reserved buffer to protect specific buffer region.

v2: free this buffer in amdgpu_ttm_fini.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-and-Tested-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:12 -04:00
Aaron Liu cba00ce82d drm/amdgpu: add gfx golden settings for yellow carp (v3)
This patch is to add gfx golden settings for yellow carp post si.

v2: squash in updates (Alex)
v3: squash in LDS update (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:11 -04:00
Aaron Liu 120a6db472 drm/amdgpu: add smu ip block for yellow carp(V3)
Yellow carp smu ip version: 13_0_1.
V2: rename smu_v13_0 to smu_v13_0_1.
V3: reuse smu_v13_0 with aldebaran.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:11 -04:00
Aaron Liu 011b514fd8 drm/amdgpu: support nbio_7_2_1 for yellow carp
This patch adds nbio_7_2_1 support yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:09 -04:00
Aaron Liu 5c462ca9a0 drm/amdgpu: set ip blocks for yellow carp
Enable ip blocks for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:08 -04:00
Aaron Liu e88d68e106 drm/amdgpu: add sdma support for yellow carp
This patch adds the sdma v5.2 support for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:08 -04:00
Aaron Liu bbbdc9739e drm/amdgpu: add gfx support for yellow carp
Add yellow carp checks to gfx10 code.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:08 -04:00
Aaron Liu 531d6e5de8 drm/amdgpu: support fw load type for yellow carp
This patch sets fw load type as direct with fw_load_type=0 for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:07 -04:00
Aaron Liu c817cfa313 drm/amdgpu: add gmc v10 supports for yellow carp
Add gfx memory controller support for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:07 -04:00
Aaron Liu f82e7e49a6 drm/amdgpu: add yellow carp support for ih block
This patch adds the support for yellow carp ih block.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:07 -04:00
Aaron Liu e79907216b drm/amdgpu: add nv common ip block support for yellow carp
This patch adds common ip support for yellow carp.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:06 -04:00
Alex Deucher cdf9979be9 drm/amdgpu: add yellow_carp_reg_base_init function for yellow carp (v2)
This patch adds yellow_carp_reg_base_init function to init the register
base for yellow carp.

v2: squash in updates (Alex)

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:06 -04:00
Aaron Liu 8bf84f60c5 drm/amdgpu: add yellow carp support for gpu_info and ip block setting
This patch adds yellow carp support for gpu_info firmware and ip
block setting.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:06 -04:00
Aaron Liu ee9236b78b drm/amdgpu: add yellow carp asic_type enum
This patch adds yellow carp to amd_asic_type enum and amdgpu_asic_name[].

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:05 -04:00
Wan Jiabing 48b033098e drm: amdgpu: Remove unneeded semicolon in amdgpu_vm.c
Fix following coccicheck warning:
./drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1726:2-3: Unneeded semicolon

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:03:04 -04:00
Rohit Khaire 46ed43e67d drm/amdgpu: Modify GC register access to use _SOC15 macros
In SRIOV environment, KMD should access GC registers
with RLCG if GC indirect access flag enabled.

Using _SOC15 read/write macros ensures that they go
through RLC when flag is enabled.

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:02:57 -04:00
Rohit Khaire cec7e80fbf drm/amdgpu: Enable RLCG read/write interface for Sienna Cichlid
Enable this only for Sienna Cichild
since only Navi12 and Sienna Cichlid support SRIOV

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:02:50 -04:00
Rohit Khaire 18703923a6 drm/amdgpu: Fix incorrect register offsets for Sienna Cichlid
RLC_CP_SCHEDULERS and RLC_SPARE_INT0 have different
offsets for Sienna Cichlid

Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:02:44 -04:00
Eric Huang 810085ddb7 drm/amdgpu: Don't flush/invalidate HDP for APUs and A+A
Integrate two generic functions to determine if HDP
flush is needed for all Asics.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 16:02:38 -04:00
Colin Ian King 7bee75a2ba drm/amdgpu: remove redundant assignment of variable k
The variable k is being assigned a value that is never read, the
assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:01 -04:00
Eric Huang 31f3324378 drm/amdkfd: Make TLB flush conditional on mapping
It is to optimize memory mapping latency, and also aviod
a page fault in a corner case of changing valid PDE into
PTE.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:01 -04:00
Eric Huang 075e8080c1 drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update
It is to pass the flag to KFD, and optimize table_freed in
amdgpu_vm_bo_update_mapping.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:01 -04:00
Michel Dänzer 32d6378cab drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FB
drm_err meant broken user space could spam dmesg.

Fixes: f258907fdd "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:00 -04:00
Changfeng 31c759bbe3 drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create
It will cause error when alloc memory larger than 128KB in
amdgpu_bo_create->kzalloc. So it needs to switch kzalloc to kvzalloc.

Call Trace:
   alloc_pages_current+0x6a/0xe0
   kmalloc_order+0x32/0xb0
   kmalloc_order_trace+0x1e/0x80
   __kmalloc+0x249/0x2d0
   amdgpu_bo_create+0x102/0x500 [amdgpu]
   ? xas_create+0x264/0x3e0
   amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
   amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
   amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:00 -04:00
shaoyunl 23e4aa5179 drm/amdgpu: soc15 register access through RLC should only apply to sriov runtime
On SRIOV, driver should only access register through RLC in runtime

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:00 -04:00
Sathishkumar S 30d95a37f4 drm/amdgpu: attr to control SS2.0 bias level (v2)
add sysfs attr to read/write smartshift bias level.
document smartshift_bias sysfs attr.

V2: add attr to amdgpu_device_attrs and use attr_update (Lijo)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-04 12:40:00 -04:00
Christian König cb1c81467a drm/ttm: flip the switch for driver allocated resources v2
Instead of both driver and TTM allocating memory finalize embedding the
ttm_resource object as base into the driver backends.

v2: fix typo in vmwgfx grid mgr and double init in amdgpu_vram_mgr.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-10-christian.koenig@amd.com
2021-06-04 15:16:46 +02:00
Christian König 267501ec2b drm/amdgpu: switch the VRAM backend to self alloc
Similar to the TTM range manager.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-7-christian.koenig@amd.com
2021-06-04 15:16:46 +02:00
Christian König f700b18c85 drm/amdgpu: switch the GTT backend to self alloc
Similar to the TTM range manager.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-6-christian.koenig@amd.com
2021-06-04 15:16:46 +02:00
Christian König d624e1bfa5 drm/amdgpu: revert "drm/amdgpu: stop allocating dummy GTT nodes"
TTM is going to need this again since we are moving the resource
allocation into the backend.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-4-christian.koenig@amd.com
2021-06-04 15:16:45 +02:00
Christian König 3eb7d96e94 drm/ttm: flip over the range manager to self allocated nodes
Start with the range manager to make the resource object the base
class for the allocated nodes.

While at it cleanup a lot of the code around that.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-2-christian.koenig@amd.com
2021-06-04 15:16:45 +02:00
Christian König bfa3357ef9 drm/ttm: allocate resource object instead of embedding it v2
To improve the handling we want the establish the resource object as base
class for the backend allocations.

v2: add missing error handling

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-1-christian.koenig@amd.com
2021-06-04 15:16:45 +02:00
Dave Airlie 5745d647d5 Merge tag 'amd-drm-next-5.14-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-06-02:

amdgpu:
- GC/MM register access macro clean up for SR-IOV
- Beige Goby updates
- W=1 Fixes
- Aldebaran fixes
- Misc display fixes
- ACPI ATCS/ATIF handling rework
- SR-IOV fixes
- RAS fixes
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes for suspend/resume
- More buffer object subclassing work
- Add new INFO query for additional vbios information
- Add new placement for preemptable SG buffers

amdkfd:
- Misc fixes

radeon:
- W=1 Fixes
- Misc cleanups

UAPI:
- Add new INFO query for additional vbios information
  Useful for debugging vbios related issues.  Proposed umr patch:
  https://patchwork.freedesktop.org/patch/433297/
- 16bpc fixed point format support
  IGT test:
  https://lists.freedesktop.org/archives/igt-dev/2021-May/031507.html
  Proposed Vulkan patch:
  a25d480207
- Add a new GEM flag which is only used internally in the kernel driver.  Userspace
  is not allowed to set it.

drm:
- 16bpc fixed point format fourcc

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602214009.4553-1-alexander.deucher@amd.com
2021-06-04 06:13:57 +10:00
Nirmoy Das 07438603a0 drm/amdgpu: make sure we unpin the UVD BO
Releasing pinned BOs is illegal now. UVD 6 was missing from:
commit 2f40801dc5 ("drm/amdgpu: make sure we unpin the UVD BO")

Fixes: 2f40801dc5 ("drm/amdgpu: make sure we unpin the UVD BO")
Cc: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02 17:57:07 -04:00
Victor Zhao 2370eba9f5 drm/amd/amdgpu:save psp ring wptr to avoid attack
[Why]
When some tools performing psp mailbox attack, the readback value
of register can be a random value which may break psp.

[How]
Use a psp wptr cache machanism to aovid the change made by attack.

v2: unify change and add detailed reason

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02 17:56:03 -04:00
Luben Tuikov dce3d8e1d0 drm/amdgpu: Don't query CE and UE errors
On QUERY2 IOCTL don't query counts of correctable
and uncorrectable errors, since when RAS is
enabled and supported on Vega20 server boards,
this takes insurmountably long time, in O(n^3),
which slows the system down to the point of it
being unusable when we have GUI up.

Fixes: ae363a212b ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02 17:52:38 -04:00
Jiansong Chen 5cfc912582 drm/amdgpu: refine amdgpu_fru_get_product_info
1. eliminate potential array index out of bounds.
2. return meaningful value for failure.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02 17:51:12 -04:00
Asher Song 147feb0076 drm/amdgpu: add judgement for dc support
Drop DC initialization when DCN is harvested in VBIOS. The way
doesn't affect virtual display ip initialization.

Signed-off-by: Likun Gao  <Likun.Gao@amd.com>
Signed-off-by: Asher Song <Asher.Song@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02 17:49:15 -04:00
Christian König d3116756a7 drm/ttm: rename bo->mem and make it a pointer
When we want to decouble resource management from buffer management we need to
be able to handle resources separately.

Add a resource pointer and rename bo->mem so that all code needs to
change to access the pointer instead.

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-4-christian.koenig@amd.com
2021-06-02 11:07:25 +02:00
Jiansong Chen 7d9c70d235 drm/amdgpu: remove unsafe optimization to drop preamble ib
Take the situation with gfxoff, the optimization may cause
corrupt CE ram contents. In addition emit_cntxcntl callback
has similar optimization which firmware can handle properly
even for power feature.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Jiawei Gu 29b4c589b4 drm/amdgpu: Add vbios info ioctl interface
Add AMDGPU_INFO_VBIOS_INFO subquery id for detailed vbios info.

Provides a way for the user application to get the VBIOS
information without having to parse the binary.
It is useful for the user to be able to display in a simple way the VBIOS
version in their system if they happen to encounter an issue.

V2:
Use numeric serial.
Parse and expose vbios version string.

V3:
Remove redundant data in drm_amdgpu_info_vbios struct.

V4:
64 bit alignment in drm_amdgpu_info_vbios.

v5: squash together all the reverts, etc. (Alex)

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Alex Deucher 915821a744 drm/amdgpu: bump driver version
For 16bpc display support.

Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
2021-06-01 22:55:39 -04:00
Zheng Yongjun 3b42ca8073 drm/amdgpu: Remove unneeded semicolon
Remove unneeded semicolon.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Jiapeng Chong 66c46621c8 amdgpu: remove unreachable code
In the function amdgpu_uvd_cs_msg(), every branch in the switch
statement will have a return, so the code below the switch statement
will not be executed.

Eliminate the follow smatch warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:845 amdgpu_uvd_cs_msg() warn:
ignoring unreachable code.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Eric Huang f0e0687cf6 drm/amdgpu: Fix a bug on flag table_freed
table_freed will be always true when mapping a memory with size
bigger than 2MB. The problem is page table's entries are always
existed, but existing mapping depends on page talbe's bo, so
using a check of page table's bo existed will resolve the issue.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Kevin Wang ba809007f2 drm/amdgpu: optimize code about format string in gfx_v10_0_init_microcode()
the memset() and snprintf() is not necessary.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Kevin Wang 2b8f731849 drm/amdgpu: fix sdma firmware version error in sriov
Re-adjust the function return order to avoid empty sdma version in the
sriov environment. (read amdgpu_firmware_info)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:39 -04:00
Sathishkumar S 3fa8f89d72 drm/amdgpu: enable smart shift on dGPU (v5)
enable smart shift on dGPU if it is part of HG system and
the platform supports ATCS method to handle power shift.

V2: avoid psc updates in baco enter and exit (Lijo)
    fix alignment (Shashank)
V3: rebased on unified ATCS handling. (Alex)
V4: check for return value and warn on failed update (Shashank)
    return 0 if device does not support smart shift.  (Lizo)
V5: rebased on ATPX/ATCS structures global (Alex)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das 19a1d9350b drm/amdgpu: flush gart changes after all BO recovery
Don't flush gart changes after recovering each BO instead
do it after recovering all the BOs. Flishing gart also needed
for amdgpu_ttm_alloc_gart().

v4: use containerof to retrieve adev struct.
v3: rename amdgpu_gart_tlb_flush() -> amdgpu_gart_invalidate_tlb().
v2: abstract out gart tlb flushing logic to amdgpu_gart.c

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das c7b9aa7a92 drm/amdgpu: do not allocate entries separately
Allocate PD/PT entries while allocating VM BOs and use that
instead of allocating those entries separately.

v2: create a new var for num entries.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das 9c3fec688f drm/amdgpu: remove unused code
Remove unused code related to shadow BO.

v2: removing shadow bo ptr from base class.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das 59276f056f drm/amdgpu: switch to amdgpu_bo_vm for vm code
The subclass, amdgpu_bo_vm is intended for PT/PD BOs which are also
shadowed, so switch to amdgpu_bo_vm BO for PT/PD BOs.

v4: update amdgpu_vm_update_funcs to accept amdgpu_bo_vm.
v3: simplify code.
    check also if shadow bo exist instead of checking bo only type.
v2: squash three related patches.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das 1fdc79f6f9 drm/admgpu: add two shadow BO helper functions
Add amdgpu_bo_add_to_shadow_list() to handle shadow list
additions and amdgpu_bo_shadowed() to check if a BO is shadowed.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das 2a675640bc drm/amdgpu: move shadow bo validation to VM code
Do the shadow bo validation in the VM code as
VM code knows/owns shadow BOs.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das 6fdd6f4aa5 drm/amdgpu: add amdgpu_bo_vm bo type
Add new BO subclass that will be used by amdgpu vm code.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:38 -04:00
Nirmoy Das ae4c0d7674 drm/amdgpu: make sure we unpin the UVD BO
Releasing pinned BOs is illegal now. UVD 6 was missing from:
commit 2f40801dc5 ("drm/amdgpu: make sure we unpin the UVD BO")

Fixes: 2f40801dc5 ("drm/amdgpu: make sure we unpin the UVD BO")
Cc: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:55:14 -04:00
Sathishkumar S 16eb48c62b drm/amdgpu: support atcs method powershift (v4)
add support to handle ATCS method for power shift control.
used to communicate dGPU device state to SBIOS.

V2: use defined acpi func for checking psc support (Lijo)
    fix alignment (Shashank)
V3: rebased on unified ATCS handling (Alex)
V4: rebased on ATPX/ATCS structures global (Alex)

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:36:48 -04:00
Shiwu Zhang 3c609c8b1f drm/amdgpu: free the metadata buffer for sg type BOs as well
Since both sg and device type BOs have metadata buffer, free
the buffer in both cases when to destroy BOs

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Acked-by: Nirmoy Das <Nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:36:23 -04:00
Shiwu Zhang eba9852372 drm/amdgpu: fix metadata_size for ubo ioctl queries
Although the kfd_ioctl_get_dmabuf_info() still fail it will indicate
the caller right metadat_size useful for the same kfd ioctl next time.

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-01 22:35:50 -04:00
Alex Deucher f9b7f3703f drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)
They are global ACPI methods, so maybe the structures
global in the driver. This simplified a number of things
in the handling of these methods.

v2: reset the handle if verify interface fails (Lijo)
v3: fix compilation when ACPI is not defined.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:33:52 -04:00
Victor Zhao f1688bd69e drm/amd/amdgpu:save psp ring wptr to avoid attack
[Why]
When some tools performing psp mailbox attack, the readback value
of register can be a random value which may break psp.

[How]
Use a psp wptr cache machanism to aovid the change made by attack.

v2: unify change and add detailed reason

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:33:52 -04:00
Lee Jones 9d8d96bec5 drm/amd/amdgpu/amdgpu_device: Make local function static
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4624:6: warning: no previous prototype for ‘amdgpu_device_recheck_guilty_jobs’ [-Wmissing-prototypes]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:33:51 -04:00
Alex Deucher 4965257fe6 drm/amdgpu/acpi: fix typo in ATCS handling
Path should be NULL when we already have the handle
to the object.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Tested-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:33:50 -04:00
Luben Tuikov 05adfd80cc drm/amdgpu: Use delayed work to collect RAS error counters
On Context Query2 IOCTL return the correctable and
uncorrectable errors in O(1) fashion, from cached
values, and schedule a delayed work function to
calculate and cache them for the next such IOCTL.

v2: Cancel pending delayed work at ras_fini().
v3: Remove conditionals when dealing with delayed
    work manipulation as they're inherently racy.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:23:06 -04:00
Luben Tuikov a46751fbcd drm/amdgpu: Fix RAS function interface
The correctable and uncorrectable errors
are calculated at each invocation of this
function. Therefore, it is highly inefficient to
return just one of them based on a Boolean
input. If the caller wants both, twice the work
would be done. (And this work is O(n^3) on
Vega20.)

Fix this "interface" to simply return what it had
calculated--both values. Let the caller choose
what it wants to record, inspect, use.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:22:54 -04:00
Luben Tuikov 2871e10199 drm/amdgpu: Don't query CE and UE errors
On QUERY2 IOCTL don't query counts of correctable
and uncorrectable errors, since when RAS is
enabled and supported on Vega20 server boards,
this takes insurmountably long time, in O(n^3),
which slows the system down to the point of it
being unusable when we have GUI up.

Fixes: ae363a212b ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:22:19 -04:00
Mukul Joshi 5a645ff5c6 drm/amdgpu: Correctly clear GCEA error status
While clearing GCEA error status, do not clear the bits
set by RAS TA.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-27 12:22:10 -04:00
Thomas Zimmermann 5a6af54d6e drm/amdgpu: Use %p4cc to print 4CC format
Replace use of struct drm_format_name_buf with %p4cc for printing
4CC formats.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210516121315.30321-2-tzimmermann@suse.de
2021-05-27 08:34:27 +02:00
Thomas Zimmermann 71df0368e9 drm/amdgpu: Implement mmap as GEM object function
Moving the driver-specific mmap code into a GEM object function allows
for using DRM helpers for various mmap callbacks.

This change resolves several inconsistencies between regular mmap and
prime-based mmap. The vm_ops field in vma is now set for all mmap'ed
areas. Previously it way only set for regular mmap calls, prime-based
mmap used TTM's default vm_ops. The function amdgpu_verify_access() is
no longer being called and therefore removed by this patch.

As a side effect, amdgpu_ttm_vm_ops and amdgpu_ttm_fault() are now
implemented in amdgpu's GEM code.

v4:
	* rebased
v3:
	* rename mmap function to amdgpu_gem_object_mmap() (Christian)
	* remove unnecessary checks from mmap (Christian)
v2:
	* rename amdgpu_ttm_vm_ops and amdgpu_ttm_fault() to
	  amdgpu_gem_vm_ops and amdgpu_gem_fault() (Christian)
	* the check for kfd_bo has meanwhile been removed

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210525151055.8174-3-tzimmermann@suse.de
2021-05-26 20:56:23 +02:00
Andrey Grodzovsky 8eca89a108 drm/amdgpu: Fix clang warning: unused label 'exit'
Problem:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:332:1: warning: unused label 'exit' [-Wunused-label]
exit:
^~~~~

Fix: Put #ifdef CONFIG_64BIT around exit

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210525184431.1170373-1-andrey.grodzovsky@amd.com
2021-05-26 10:37:36 -04:00
Jiansong Chen 02b865f88b drm/amdgpu: refine amdgpu_fru_get_product_info
1. eliminate potential array index out of bounds.
2. return meaningful value for failure.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-25 23:47:11 -04:00
Peng Ju Zhou 2a4021ccb8 drm/amdgpu: Change IP init sequence to support PSP program IH_RB_CNTL on NV12 SRIOV
To enable PSP program IH_RB_CNTL,
the PSP IP should be initialized before IH IP, otherwise,
it will hit psp NULL pointer.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-25 23:47:03 -04:00
Dan Carpenter 713305570a drm/amdgpu: Fix an error code in kfd_mem_attach_dmabuf()
If amdgpu_gem_prime_export() fails, then this code accidentally
returns zero/success instead of a negative error code.

Fixes: 5ac3c3e45f ("drm/amdgpu: Add DMA mapping of GTT BOs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-25 23:45:37 -04:00
Dan Carpenter 3e06db4d62 drm/amdgpu: add missing unreserve on error
The amdgpu_bo_unreserve() has to be done on the error path as well.

Fixes: 9e5d275319 ("drm/amdgpu: Move kfd_mem_attach outside reservation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-25 23:44:29 -04:00
Asher Song abaf210c28 drm/amdgpu: add judgement for dc support
Drop DC initialization when DCN is harvested in VBIOS. The way
doesn't affect virtual display ip initialization.

Signed-off-by: Likun Gao  <Likun.Gao@amd.com>
Signed-off-by: Asher Song <Asher.Song@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-25 23:44:16 -04:00
tony.huang_cp 0e9def2108 drm/amdgpu: fix typo
change 'interupt' to 'interrupt'

Signed-off-by: tony.huang_cp <huangwentao@yulong.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-25 23:44:06 -04:00
Andrey Grodzovsky e1543d83ed drm/amdgpu: Fix crash when hot unplug in BACO
Problem:
When device goes into runtime suspend due to prolonged
inactivity (e.g. BACO sleep) and then hot unplugged,
PCI core will try to wake up the device as part of
unplug process. Since the device is gone all HW
programming during rpm resume fails leading
to a bad SW state later during pci remove handling.

Fix:
Use a flag we use for PCIe error recovery to avoid
accessing registres. This allows to successfully complete
rpm resume sequence and finish pci remove.

v2: Renamed HW access block flag

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1081
Link: https://patchwork.freedesktop.org/patch/msgid/20210521204122.762288-2-andrey.grodzovsky@amd.com
2021-05-25 11:56:48 -04:00
Andrey Grodzovsky 7afefb81b7 drm/amdgpu: Rename flag which prevents HW access
Make it's name not feature but function descriptive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521204122.762288-1-andrey.grodzovsky@amd.com
2021-05-25 11:53:52 -04:00
Thomas Zimmermann 304ba5dca4 Merge drm/drm-next into drm-misc-next
Backmerging from drm/drm-next to the patches for AMD devices
for v5.14.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-05-22 07:17:05 +02:00
Jiapeng Chong f43ae2d180 drm/amdgpu: Fix inconsistent indenting
Eliminate the follow smatch warning:

drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:449
sdma_v5_0_ring_emit_mem_sync() warn: inconsistent indenting.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 18:03:08 -04:00
Alex Deucher e0fb14c8dc drm/amdgpu/apci: switch ATIF/ATCS probe order
Try the handle from ATPX first since this is the most
common case.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 18:03:05 -04:00
Alex Deucher 77bf762f8b drm/amdgpu/acpi: unify ATCS handling (v3)
Treat it like ATIF and check both the dGPU and APU for
the method.  This is required because ATCS may be hung
off of the APU in ACPI on A+A systems.

v2: add back accidently removed ACPI handle check.
v3: Fix incorrect atif check (Colin)
    Fix uninitialized variable (Colin)

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 18:03:01 -04:00
Felix Kuehling 5bb198930a drm/amdgpu: Use preemptible placement for KFD
KFD userptr BOs and SG BOs used for DMA mappings can be preempted with
CWSR. Therefore we can use preemptible placement and avoid unwanted
evictions due to GTT accounting.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 18:00:05 -04:00
Felix Kuehling b453e42a6e drm/amdgpu: Add new placement for preemptible SG BOs
SG BOs such as dmabuf imports and userptr BOs do not consume system
resources directly. Instead they point to resources owned elsewhere.
They typically get evicted by DMABuf move notifiers of MMU notifiers.
If those notifiers don't need to wait for hardware fences (i.e. the SG
BOs are used in a preemptible context), then we don't need to limit
them to the GTT size and we don't need TTM to evict them.

Create a new placement for such preemptible SG BOs that does not impose
artificial size limits and TTM evictions.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 17:59:59 -04:00
Lee Jones 20a3e53490 drm/amd/amdgpu/smuio_v13_0: Realign 'smuio_v13_0_is_host_gpu_xgmi_supported()' header
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/smuio_v13_0.c:99: warning: expecting prototype for smuio_v13_0_supports_host_gpu_xgmi(). Prototype was for smuio_v13_0_is_host_gpu_xgmi_supported() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:20 -04:00
Lee Jones f18939021a drm/amd/amdgpu/gfx_v10_0: Demote kernel-doc abuse
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:51: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:20 -04:00
Lee Jones 29ec545844 drm/amd/amdgpu/vcn_v1_0: Fix some function naming disparity
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:775: warning: expecting prototype for vcn_v1_0_start(). Prototype was for vcn_v1_0_start_spg_mode() instead
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:1111: warning: expecting prototype for vcn_v1_0_stop(). Prototype was for vcn_v1_0_stop_spg_mode() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:20 -04:00
Lee Jones ef6f76407c drm/amd/amdgpu/sdma_v5_2: Repair typo in function name
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:501: warning: expecting prototype for sdma_v_0_ctx_switch_enable(). Prototype was for sdma_v5_2_ctx_switch_enable() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:19 -04:00
Lee Jones 1c7f15c700 drm/amd/amdgpu/amdgpu_vce: Fix a few incorrectly named functions
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:98: warning: expecting prototype for amdgpu_vce_init(). Prototype was for amdgpu_vce_sw_init() instead
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:214: warning: expecting prototype for amdgpu_vce_fini(). Prototype was for amdgpu_vce_sw_fini() instead
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:590: warning: expecting prototype for amdgpu_vce_cs_validate_bo(). Prototype was for amdgpu_vce_validate_bo() instead
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:724: warning: expecting prototype for amdgpu_vce_cs_parse(). Prototype was for amdgpu_vce_ring_parse_cs() instead
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:960: warning: expecting prototype for amdgpu_vce_cs_parse_vm(). Prototype was for amdgpu_vce_ring_parse_cs_vm() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:19 -04:00
Lee Jones 8d55be744b drm/amd/amdgpu/sdma_v5_0: Fix typo in function name
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:563: warning: expecting prototype for sdma_v_0_ctx_switch_enable(). Prototype was for sdma_v5_0_ctx_switch_enable() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:19 -04:00