Driver Changes:
- Added bits of DG2 support around page table handling (Stuart Summers, Matthew Auld)
- Fixed wakeref leak in PMU busyness during reset in GuC mode (Umesh Nerlige Ramappa)
- Fixed debugfs access crash if GuC failed to load (John Harrison)
- Bring back GuC error log to error capture, undoing accidental earlier breakage (Thomas Hellström)
- Fixed memory leak in error capture caused by earlier refactoring (Thomas Hellström)
- Exclude reserved stolen from driver use (Chris Wilson)
- Add memory region sanity checking and optional full test (Chris Wilson)
- Fixed buffer size truncation in TTM shmemfs backend (Robert Beckett)
- Use correct lock and don't overwrite internal data structures when stealing GuC context ids (Matthew Brost)
- Don't hog IRQs when destroying GuC contexts (John Harrison)
- Make GuC to Host communication more robust (Matthew Brost)
- Continuation of locking refactoring around VMA and backing store handling (Maarten Lankhorst)
- Improve performance of reading GuC log from debugfs (John Harrison)
- Log when GuC fails to reset an engine (John Harrison)
- Speed up GuC/HuC firmware loading by requesting RP0 (Vinay Belgaumkar)
- Further work on asynchronous VMA unbinding (Thomas Hellström, Christian König)
- Refactor GuC/HuC firmware handling to prepare for future platforms (John Harrison)
- Prepare for future different GuC/HuC firmware signing key sizes (Daniele Ceraolo Spurio, Michal Wajdeczko)
- Add noreclaim annotations (Matthew Auld)
- Remove racey GEM_BUG_ON between GPU reset and GuC communication handling (Matthew Brost)
- Refactor i915->gt with to_gt(i915) to prepare for future platforms (Michał Winiarski, Andi Shyti)
- Increase GuC log size for CONFIG_DEBUG_GEM (John Harrison)
- Fixed engine busyness in selftests when in GuC mode (Umesh Nerlige Ramappa)
- Make engine parking work with PREEMPT_RT (Sebastian Andrzej Siewior)
- Replace X86_FEATURE_PAT with pat_enabled() (Lucas De Marchi)
- Selftest for stealing of guc ids (Matthew Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YcRvKO5cyPvIxVCi@tursulin-mobl2
Big delta, but boils down to moving set_pages to i915_vma.c, and removing
the special handling, all callers use the defaults anyway. We only remap
in ggtt, so default case will fall through.
Because we still don't require locking in i915_vma_unpin(), handle this by
using xchg in get_pages(), as it's locked with obj->mutex, and cmpxchg in
unpin, which only fails if we race a against a new pin.
Changes since v1:
- aliasing gtt sets ZERO_SIZE_PTR, not -ENODEV, remove special case
from __i915_vma_get_pages(). (Matt)
Changes since v2:
- Free correct old pages in __i915_vma_get_pages(). (Matt)
Remove race of clearing vma->pages accidentally from put,
free it but leave it set, as only get has the lock.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-4-maarten.lankhorst@linux.intel.com
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Use to_gt() helper consistently throughout the codebase.
Pure mechanical s/i915->gt/to_gt(i915). No functional changes.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-5-andi.shyti@linux.intel.com
On some platforms the hw has dropped support for 4K GTT pages when
dealing with LMEM, and due to the design of 64K GTT pages in the hw, we
can only mark the *entire* page-table as operating in 64K GTT mode,
since the enable bit is still on the pde, and not the pte. And since we
we still need to allow 4K GTT pages for SMEM objects, we can't have a
"normal" 4K page-table with scratch pointing to LMEM, since that's
undefined from the hw pov. The simplest solution is to just move the 64K
scratch page to SMEM on such platforms and call it a day, since that
should work for all configurations.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211208141613.7251-4-ramalingam.c@intel.com
Certain functions within i915 uses macros that are defined for
specific architectures by the mmu, such as _PAGE_RW and _PAGE_PRESENT
(Some architectures don't even have these macros defined, like ARM64).
Instead of re-using bits defined for the CPU, we should use bits
defined for i915. This patch introduces two new 64 bit macros,
GEN8_PAGE_PRESENT and GEN8_PAGE_RW, to check for bits 0 and 1 and, to
replace all occurrences of _PAGE_RW and _PAGE_PRESENT within i915.
v2(Michael Cheng): Use GEN8_ instead of I915_
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[ Move defines together with other GEN8 defines ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206215245.513677-2-michael.cheng@intel.com
With both integrated and discrete Intel GPUs in a system, the current
global check of intel_iommu_gfx_mapped, as done from intel_vtd_active()
may not be completely accurate.
In this patch we add i915 parameter to intel_vtd_active() in order to
prepare it for multiple GPUs and we also change the check away from Intel
specific intel_iommu_gfx_mapped (global exported by the Intel IOMMU
driver) to probing the presence of IOMMU on a specific device using
device_iommu_mapped().
This will return true both for IOMMU pass-through and address translation
modes which matches the current behaviour. If in the future we wanted to
distinguish between these two modes we could either use
iommu_get_domain_for_dev() and check for __IOMMU_DOMAIN_PAGING bit
indicating address translation, or ask for a new API to be exported from
the IOMMU core code.
v2:
* Check for dmar translation specifically, not just iommu domain. (Baolu)
v3:
* Go back to plain "any domain" check for now, rewrite commit message.
v4:
* Use device_iommu_mapped. (Robin, Baolu)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211126141424.493753-1-tvrtko.ursulin@linux.intel.com
Don't include stuff on behalf of users if they're not strictly necessary
for the header.
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7bcaa1684587b9b008d3c41468fb40e63c54fbc7.1636977089.git.jani.nikula@intel.com
So far the remapped view size in GTT/DPT was padded to the next aligned
offset unnecessarily after the last color plane with an unaligned size.
Remove the unnecessary padding.
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Fixes: 3d1adc3d64 ("drm/i915/adlp: Add support for remapping CCS FBs")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-3-imre.deak@intel.com
(cherry picked from commit 6b6636e176)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Factor out functions that are needed by the next patch to suspend/resume
the memory mappings for DPT FBs.
No functional change, except reordering during suspend the
ggtt->invalidate(ggtt) call wrt. atomic_set(&ggtt->vm.open, open) and
mutex_unlock(&ggtt->vm.mutex). This shouldn't matter due to the i915
suspend sequence being single threaded.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211101183551.3580546-1-imre.deak@intel.com
During remapping CCS FBs the CCS AUX surface mapped size and offset->x,y
coordinate calculations assumed a tiled layout. This works as long as
the CCS surface height is aligned to 64 lines (ensuring a 4k bytes CCS
surface tile layout). However this alignment is not required by the HW
(and the driver doesn't enforces it either).
Add the remapping logic required to remap the pages of CCS surfaces
without the above alignment, assuming the natural linear layout of the
CCS surface (vs. tiled main surface layout).
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: 3d1adc3d64 ("drm/i915/adlp: Add support for remapping CCS FBs")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-5-imre.deak@intel.com
Factor out functions needed to map contiguous FB obj pages to a GTT/DPT
VMA view in the next patch.
While at it s/4096/I915_GTT_PAGE_SIZE/ in add_padding_pages().
No functional changes.
v2: s/4096/I915_GTT_PAGE_SIZE/ (Matthew)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-4-imre.deak@intel.com
So far the remapped view size in GTT/DPT was padded to the next aligned
offset unnecessarily after the last color plane with an unaligned size.
Remove the unnecessary padding.
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Fixes: 3d1adc3d64 ("drm/i915/adlp: Add support for remapping CCS FBs")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211026225105.2783797-3-imre.deak@intel.com
UAPI Changes:
- Add uAPI for using PXP protected objects
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064
- Add PCI IDs and LMEM discovery/placement uAPI for DG1
Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584
- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S
Cross-subsystem Changes:
- Merges 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"
Core Changes:
- Update ttm_move_memcpy for async use (Thomas)
Driver Changes:
- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele,
Sean, Anshuman)
See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)
- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)
- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of
PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
We assumed that for all modern GENs the PTEs and register space are
split in the GTTMMADR BAR, but while it is true, we should rather use
fixed offset as it is defined in the specification.
Bspec: 4409, 4457, 4604, 11181, 9027, 13246, 13321, 44980
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210926201005.1450-1-michal.wajdeczko@intel.com
We really only need memcpy restore for objects that affect the
operability of the migrate context. That is, primarily the page-table
objects of the migrate VM.
Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
restores using memcpy and a way to assign LMEM page-table object flags
to be used by the vms.
Restore objects without this flag with the gpu blitter and only objects
carrying the flag using TTM memcpy.
Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
defer for a later audit which vms actually need it. Most importantly, user-
allocated vms with pinned page-table objects can be restored using the
blitter.
Performance-wise memcpy restore is probably as fast as gpu restore if not
faster, but using gpu restore will help tackling future restrictions in
mappable LMEM size.
v4:
- Don't mark the aliasing ppgtt page table flags for early resume, but
rather the ggtt page table flags as intended. (Matthew Auld)
- The check for user buffer objects during early resume is pointless, since
they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
v5:
- Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
before we fire up the migrate context.
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
Add support for remapping CCS FBs on ADL-P to remove the restriction
of the power-of-two sized stride and the 2MB surface offset alignment
for these FBs.
We can only remap the tiles on the main surface, not the tiles on the
CCS surface, so userspace has to generate the CCS surface aligning to
the POT size padded main surface stride (by programming the AUX
pagetable accordingly). For the required AUX pagetable setup, this
requires that either the main surface stride is 8 tiles or that the
stride is 16 tiles aligned (= 64 kbytes, the area mapped by one AUX
PTE).
v2:
- Init intel_remapped_info::plane_alignment only for remapped views and
do this from intel_fb_view_init().
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210906182715.3915100-6-imre.deak@intel.com
The full audit is quite a bit of work:
- i915_dpt has very simple lifetime (somehow we create a display pagetable vm
per object, so its _very_ simple, there's only ever a single vma in there),
and uses i915_vm_close(), which internally does a i915_vm_put(). No rcu.
Aside: wtf is i915_dpt doing in the intel_display.c garbage collector as a new
feature, instead of added as a separate file with some clean-ish interface.
Also, i915_dpt unfortunately re-introduces some coding patterns from
pre-dma_resv_lock conversion times.
- i915_gem_proto_ctx is fully refcounted and no rcu, all protected by
fpriv->proto_context_lock.
- i915_gem_context is itself rcu protected, and that might leak to anything it
points at. Before
commit cf977e1861
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Dec 2 11:21:40 2020 +0000
drm/i915/gem: Spring clean debugfs
and
commit db80a1294c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jan 18 11:08:54 2021 +0000
drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects
we had a bunch of debugfs files that relied on rcu protecting everything, but
those are gone now. The main one was removed even earlier with
There doesn't seem to be anything left that's actually protecting
stuff now that the ctx->vm itself is invariant. See
commit ccbc1b9794
Author: Jason Ekstrand <jason@jlekstrand.net>
Date: Thu Jul 8 10:48:30 2021 -0500
drm/i915/gem: Don't allow changing the VM on running contexts (v4)
Note that we drop the vm refcount before the final release of the gem context
refcount, so this is all very dangerous even without rcu. Note that aside from
later on creating new engines (a defunct feature) and debug output we're never
looked at gem_ctx->vm for anything functional, hence why this is ok.
Fingers crossed.
Preceeding patches removed all vestiges of rcu use from gem_ctx->vm
derferencing to make it clear it's really not used.
The gem_ctx->rcu protection was introduced in
commit a4e7ccdac3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Oct 4 14:40:09 2019 +0100
drm/i915: Move context management under GEM
The commit message is somewhat entertaining because it fails to
mention this fact completely, and compensates that by an in-commit
changelog entry that claims that ctx->vm is protected by ctx->mutex.
Which was the case _before_ this commit, but no longer after it.
- intel_context holds a full reference. Unfortunately intel_context is also rcu
protected and the reference to the ->vm is dropped before the
rcu barrier - only the kfree is delayed. So again we need to check
whether that leaks anywhere on the intel_context->vm. RCU is only
used to protect intel_context sitting on the breadcrumb lists, which
don't look at the vm anywhere, so we are fine.
Nothing else relies on rcu protection of intel_context and hence is
fully protected by the kref refcount alone, which protects
intel_context->vm in turn.
The breadcrumbs rcu usage was added in
commit c744d50363
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Nov 26 14:04:06 2020 +0000
drm/i915/gt: Split the breadcrumb spinlock between global and contexts
its parent commit added the intel_context rcu protection:
commit 14d1eaf088
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Nov 26 14:04:05 2020 +0000
drm/i915/gt: Protect context lifetime with RCU
given some credence to my claim that I've actually caught them all.
- drm_i915_gem_object's shares_resv_from pointer has a full refcount to the
dma_resv, which is a sub-refcount that's released after the final
i915_vm_put() has been called. Safe.
Aside: Maybe we should have a struct dma_resv_shared which is just dma_resv +
kref as a stand-alone thing. It's a pretty useful pattern which other drivers
might want to copy.
For a bit more context see
commit 4d8151ae53
Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Date: Tue Jun 1 09:46:41 2021 +0200
drm/i915: Don't free shared locks while shared
- the fpriv->vm_xa was relying on rcu_read_lock for lookup, but that
was updated in a prep patch too to just be a spinlock-protected
lookup.
- intel_gt->vm is set at driver load in intel_gt_init() and released
in intel_gt_driver_release(). There seems to be some issue that
in some error paths this is called twice, but otherwise no rcu to be
found anywhere. This was added in the below commit, which
unfortunately doesn't explain why this complication exists.
commit e6ba764802
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Dec 21 16:03:24 2019 +0000
drm/i915: Remove i915->kernel_context
The proper fix most likely for this is to start using drmm_ at large
scale, but that's also huge amounts of work.
- i915_vma->vm is some real pain, because rcu is rcu protected, at
least in the vma lookup in the context lookup cache in
eb_lookup_vma(). This was added in
commit 4ff4b44cbb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jun 16 15:05:16 2017 +0100
drm/i915: Store a direct lookup from object handle to vma
This was changed to a radix tree from the hashtable in, but with the
locking unchanged, in
commit d1b48c1e71
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Aug 16 09:52:08 2017 +0100
drm/i915: Replace execbuf vma ht with an idr
In
commit 93159e1235
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Mar 23 09:28:41 2020 +0000
drm/i915/gem: Avoid gem_context->mutex for simple vma lookup
the locking was changed from dev->struct_mutex to rcu, which added
the requirement to rcu protect i915_vma. Somehow this was missed in
review (or I'm completely blind).
Irrespective of all that the vma lookup cache rcu_read_lock grabs a
full reference of the vma and the rcu doesn't leak further. So no
impact on i915_address_space from that.
I have not found any other rcu use for i915_vma, but given that it
seems broken I also didn't bother to do a careful in-depth audit.
Alltogether there's nothing left in-tree anymore which requires that a
pointer deref to an i915_address_space is safe undre rcu_read_lock
only.
rcu protection of i915_address_space was introduced in
commit b32fa81115
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jun 20 19:37:05 2019 +0100
drm/i915/gtt: Defer address space cleanup to an RCU worker
by mixing up a bugfixing (i915_address_space needs to be released from
a worker) with enabling rcu support. The commit message also seems
somewhat confused, because it talks about cleanup of WC pages
requiring sleep, while the code and linked bugzilla are about a
requirement to take dev->struct_mutex (which yes sleeps but it's a
much more specific problem). Since final kref_put can be called from
pretty much anywhere (including hardirq context through the
scheduler's i915_active cleanup) we need a worker here. Hence that
part must be kept.
Ideally all these reclaim workers should have some kind of integration
with our shrinkers, but for some of these it's rather tricky. Anyway,
that's a preexisting condition in the codeebase that we wont fix in
this patch here.
We also remove the rcu_barrier in ggtt_cleanup_hw added in
commit 60a4233a49
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jul 29 14:24:12 2019 +0100
drm/i915: Flush the i915_vm_release before ggtt shutdown
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-11-daniel.vetter@ffwll.ch
This reverts the rest of 0edbb9ba1b ("drm/i915: Move cmd parser
pinning to execbuffer"). Now that the only user of i915_gem_object_get_sg
without allow_alloc has been removed, we can drop the parameter. This
portion of the revert was broken into its own patch to aid review.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-4-jason@jlekstrand.net
This was done by the following semantic patch:
@@ expression i915; @@
- INTEL_GEN(i915)
+ GRAPHICS_VER(i915)
@@ expression i915; expression E; @@
- INTEL_GEN(i915) >= E
+ GRAPHICS_VER(i915) >= E
@@ expression dev_priv; expression E; @@
- !IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) != E
@@ expression dev_priv; expression E; @@
- IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) == E
@@
expression dev_priv;
expression from, until;
@@
- IS_GEN_RANGE(dev_priv, from, until)
+ IS_GRAPHICS_VER(dev_priv, from, until)
@def@
expression E;
identifier id =~ "^gen$";
@@
- id = GRAPHICS_VER(E)
+ ver = GRAPHICS_VER(E)
@@
identifier def.id;
@@
- id
+ ver
It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
UAPI Changes:
- Add reworked uAPI for DG1 behind CONFIG_BROKEN (Matt A, Abdiel)
Driver Changes:
- Fix for Gitlab issues #3293 and #3450:
Avoid kernel crash on older L-shape memory machines
- Add Wa_14010733141 (VDBox SFC reset) for Gen11+ (Aditya)
- Fix crash in auto_retire active retire callback due to
misalignment (Stephane)
- Fix overlay active retire callback alignment (Tvrtko)
- Eliminate need to align active retire callbacks (Matt A, Ville,
Daniel)
- Program FF_MODE2 tuning value for all Gen12 platforms (Caz)
- Add Wa_14011060649 for TGL,RKL,DG1 and ADLS (Swathi)
- Create stolen memory region from local memory on DG1 (CQ)
- Place PD in LMEM on dGFX (Matt A)
- Use WC when default state object is allocated in LMEM (Venkata)
- Determine the coherent map type based on object location (Venkata)
- Use lmem physical addresses for fb_mmap() on discrete (Mohammed)
- Bypass aperture on fbdev when LMEM is available (Anusha)
- Return error value when displayable BO not in LMEM for dGFX (Mohammed)
- Do release kernel context if breadcrumb measure fails (Janusz)
- Hide modparams for compiled-out features (Tvrtko)
- Apply Wa_22010271021 for all Gen11 platforms (Caz)
- Fix unlikely ref count race in arming the watchdog timer (Tvrtko)
- Check actual RC6 enable status in PMU (Tvrtko)
- Fix a double free in gen8_preallocate_top_level_pdp (Lv)
- Use trylock in shrinker for GGTT on BSW VT-d and BXT (Maarten)
- Remove erroneous i915_is_ggtt check for
I915_GEM_OBJECT_UNBIND_VM_TRYLOCK (Maarten)
- Convert uAPI headers to real kerneldoc (Matt A)
- Clean up kerneldoc warnings headers (Matt A, Maarten)
- Fail driver if LMEM training failed (Matt R)
- Avoid div-by-zero on Gen2 (Ville)
- Read C0DRB3/C1DRB3 as 16 bits again and add _BW suffix (Ville)
- Remove reference to struct drm_device.pdev (Thomas)
- Increase separation between GuC and execlists code (Chris, Matt B)
- Use might_alloc() (Bernard)
- Split DGFX_FEATURES from GEN12_FEATURES (Lucas)
- Deduplicate Wa_22010271021 programming on (Jose)
- Drop duplicate WaDisable4x2SubspanOptimization:hsw (Tvrtko)
- Selftest improvements (Chris, Hsin-Yi, Tvrtko)
- Shuffle around init_memory_region for stolen (Matt)
- Typo fixes (wengjianfeng)
[airlied: fix conflict with fixes in i915_active.c]
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YLCbBR22BsQ/dpJB@jlahtine-mobl.ger.corp.intel.com
We are currently sharing the VM reservation locks across a number of
gem objects with page-table memory. Since TTM will individiualize the
reservation locks when freeing objects, including accessing the shared
locks, make sure that the shared locks are not freed until that is done.
For PPGTT we add an additional refcount, for GGTT we take additional
measures to make sure objects sharing the GGTT reservation lock are
freed at GGTT takedown
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210601074654.3103-3-thomas.hellstrom@linux.intel.com
Add support for DPT (display page table). DPT is a
slightly peculiar two level page table scheme used for
tiled scanout buffers (linear uses direct ggtt mapping
still). The plane surface address will point at a page
in the DPT which holds the PTEs for 512 actual pages.
Thus we require 1/512 of the ggttt address space
compared to a direct ggtt mapping.
We create a new DPT address space for each framebuffer and
track two vmas (one for the DPT, another for the ggtt).
TODO:
- Is the i915_address_space approaach sane?
- Maybe don't map the whole DPT to write the PTEs?
- Deal with remapping/rotation? Need to create a
separate DPT for each remapped/rotated plane I
guess. Or else we'd need to make the per-fb DPT
large enough to support potentially several
remapped/rotated vmas. How large should that be?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Cc: Wilson Chris P <Chris.P.Wilson@intel.com>
Cc: Tang CQ <cq.tang@intel.com>
Cc: Auld Matthew <matthew.auld@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Wilson Chris P <Chris.P.Wilson@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210506161930.309688-5-imre.deak@intel.com
We need to generalise our accessor for the page directories and tables from
using the simple kmap_atomic to support local memory, and this setup
must be done on acquisition of the backing storage prior to entering
fence execution contexts. Here we replace the kmap with the object
mapping code that for simple single page shmemfs object will return a
plain kmap, that is then kept for the lifetime of the page directory.
Note that keeping the mapping around is a potential concern here, since
while the vma is pinned the mapping remains there for the PDs
underneath, or at least until the used_count reaches zero, at which
point we can safely destroy the mapping. For 32b this will be even worse
since the address space is more limited, but since this change mostly
impacts full ppGTT platforms, the justification is that for modern
platforms we shouldn't care too much about 32b.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com
Driver Changes:
- Prepare for local/device memory support on DG1 by starting
to use it for kernel internal allocations: context, ring
and engine scratch (Matt A, CQ, Abdiel, Imre)
- Sandybridge fix to avoid hard hang on ring resume (Chris)
- Limit imported dma-buf size to int32 (Matt A)
- Double check heartbeat timeout before resetting (Chris)
- Use new tasklet API for execution list (Emil)
- Fix SPDX checkpats warnings (Chris)
- Fixes for various checkpatch warnings (Chris)
- Selftest improvements (Chris)
- Move the defer_request waiter active assertion to correct spot (Chris)
- Make local-memory probing a GT operation (Matt, Tvrtko)
- Protect against request freeing during cancellation on wedging (Chris)
- Retire unexpected starting state error dumping (Chris)
- Distinction of memory regions in debugging (Zbigniew)
- Always flush the submission queue on checking for idle (Chris)
- Consolidate 2big error check to helper (Matt)
- Decrease number of subplatform bits (Tvrtko)
- Remove unused internal request priority levels (Chris)
- Document the unused internal header bits in buddy allocator (Matt)
- Cleanup the region class/instance encoding (Matt)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YGxksaZGXHnFxlwg@jlahtine-mobl.ger.corp.intel.com
An upcoming platform has a restriction that the FB stride must be
power-of-two aligned. To support framebuffer layouts that are not in
this layout add a logic that pads the tile rows to the POT aligned size.
The HW won't read the padding PTEs, so these don't have to point to an
allocated address, or even have their valid flag set. So use a NULL PTE
instead for instance the scratch page, which is simple and keeps the SG
table compact.
v2:
- Simplify plane_view_dst_stride(). (Ville)
- Pass pitch_tiles as unsigned int.
v3:
- Drop unintentional s/plane_state->rotation/plane_config->rotation/
change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210325214808.2071517-24-imre.deak@intel.com
For the PTEs we get an LM bit, to signal whether the page resides in
SMEM or LMEM.
Based on a patch from Michel Thierry.
BSpec: 45015
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210203171231.551338-3-matthew.auld@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We may create page table objects on the fly, but we may need to
wait with the ww lock held. Instead of waiting on a freed obj
lock, ensure we have the same lock for each object to keep
-EDEADLK working. This ensures that i915_vma_pin_ww can lock
the page tables when required.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-41-maarten.lankhorst@linux.intel.com
We need to get rid of allocations in the cmd parser, because it needs
to be called from a signaling context, first move all pinning to
execbuf, where we already hold all locks.
Allocate jump_whitelist in the execbuffer, and add annotations around
intel_engine_cmd_parser(), to ensure we only call the command parser
without allocating any memory, or taking any locks we're not supposed to.
Because i915_gem_object_get_page() may also allocate memory, add a
path to i915_gem_object_get_sg() that prevents memory allocations,
and walk the sg list manually. It should be similarly fast.
This has the added benefit of being able to catch all memory allocation
errors before the point of no return, and return -ENOMEM safely to the
execbuf submitter.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com
Using struct drm_device.pdev is deprecated. Convert i915 to struct
drm_device.dev. No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-3-tzimmermann@suse.de
Tigerlake is plagued by spontaneous DMAR faults [reason 7, next page
table ptr is invalid] which lead to GPU hangs. These faults occur when
an iommu map is immediately reused. Adding further clflushes and
barriers around either the GTT PTE or iommu PTE updates do not prevent
the faults. So far the only effect has been from inducing a delay
between reuse of the iommu on the GPU, and applying the delay at the
iommu map allows for the smallest stable delay.
Note that such a delay is hideous and clearly does not fix the root cause,
and so should only be a bandaid until a complete solution is found. The
delay was determined by running igt/gem_exec_fence/parallel in a loop for
a few hours (unpatched MTBF is about 10s).
We have also seen such DMAR fault [reason 7] errors on other platforms,
notably gen9-gen11, but so far it has only been trivially and
consistently reproduced on Tigerlake.
v2: Leave a tell-tale to know when we apply the vt'd quirk, and as a
reminder to remove it again. Hopefully.
Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-2-chris@chris-wilson.co.uk
Cross-subsystem Changes:
- DMA mapped scatterlist fixes in i915 to unblock merging of
https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom)
Driver Changes:
- Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"):
Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris)
- Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce)
- Delay execlist processing for Tigerlake to avoid hang (Chris)
- Fix for Tigerlake RCS engine health check through heartbeat (Chris)
- Fix for Tigerlake reserved MOCS entries (Ayaz, Chris)
- Fix Media power gate sequence on Tigerlake (Rodrigo)
- Enable eLLC caching of display buffers for SKL+ (Ville)
- Support parsing of oversize batches on Gen9 (Matt, Chris)
- Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris)
- Flush engines before Tigerlake breadcrumbs (Chris)
- Use the local HWSP offset during submission (Chris)
- Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew)
- Use the active reference on the vma while capturing to avoid use-after-free (Chris)
- Fix MOCS PTE setting for gen9+ (Ville)
- Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris)
- Avoid NULL dereference from PT/PD stash allocation error (Matt)
- Hold request reference for canceling an active context (Chris)
- Avoid infinite loop on x86-32 when mapping a lot of objects (Chris)
- Disallow WC mappings when processor doesn't support them (Chris)
- Return correct error in i915_gem_object_copy_blt() error path (Dan)
- Return correct error in intel_context_create_request() error path (Maarten)
- Tune down GuC communication enabled/disabled messages to debug (Jani)
- Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris)
- Cancel outstanding work after disabling heartbeats on an engine (Chris)
- Signal cancelled requests (Chris)
- Retire cancelled requests on unload (Chris)
- Scrub HW state on driver remove (Chris)
- Undo forced context restores after trivial preemptions (Chris)
- Handle PCI unbind in PMU code (Tvrtko)
- Fix CPU hotplug with multiple GPUs in PMU code (Trtkko)
- Correctly set SFC capability for video engines (Venkata)
- Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal)
- Improve GuC warnings on loading failure (John)
- Avoid ownership race in buffer pool by clearing age (Chris)
- Use MMIO to read CSB in case of failure (Chris, Mika)
- Show engine properties in engine state dump to indicate changes (Chris, Joonas)
- Break up error capture compression loops with cond_resched() (Chris)
- Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris)
- Serialise debugfs i915_gem_objects with ctx->mutex (Chris)
- Always test execution status on closing the context and close if not persistent (Chris)
- Avoid mixing integer types during batch copies (Chris, Jared)
- Skip over MI_NOOP when parsing to avoid overhead (Chris)
- Hold onto an explicit ref to i915_vma_work.pinned (Chris)
- Perform all asynchronous waits prior to marking payload start (Chris)
- Pull phys pread/pwrite implementations to the backend (Matt)
- Improve record of hung engines in error state (Tvrtko)
- Allow backends to override pread implementation (Matt)
- Reinforce LRC poisoning checks to confirm context survives execution (Chris)
- Fix memory region max size calculation (Matt)
- Fix order when adding blocks to memory region (Matt)
- Eliminate unused intel_virtual_engine_get_sibling func (Chris)
- Cleanup kasan warning for on-stack (unsigned long) casting (Chris)
- Onion unwind for scratch page allocation failure (Chris)
- Poison stolen pages before use (Chris)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
As the previous patch fixed the places where we walk the whole scatterlist
for DMA addresses, this patch fixes the random lookup functionality.
To achieve this we have to add a second lookup iterator and add a
i915_gem_object_get_sg_dma helper, to be used analoguous to existing
i915_gem_object_get_sg_dma. Therefore two lookup caches are maintained per
object and they are flushed at the same point for simplicity. (Strictly
speaking the DMA cache should be flushed from i915_gem_gtt_finish_pages,
but today this conincides with unsetting of the pages in general.)
Partial VMA view is then fixed to use the new DMA lookup and properly
query sg length.
v2:
* Checkpatch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Tom Murphy <murphyt7@tcd.ie>
Cc: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-2-tvrtko.ursulin@linux.intel.com
When using fake lmem for tests, we are overriding the setting in
device info for dgfx devices. Current users of IS_DGFX() except one are
correct. However, as we add support for DG1, we are going to use it in
additional places to trigger dgfx-only code path.
In future if we need we can use HAS_LMEM() instead of IS_DGFX() in the
places that make sense to also contemplate fake lmem use.
v2: update gen8_gmch_probe() to use HAS_LMEM(): we need to steal the
mappable aperture later(which is fine since it doesn't exist on "DGFX"),
and use it as a substitute for LMEMBAR. The !mappable aperture property
is also useful since it exercises some other parts of the code too.
(Matthew Auld)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201001063917.3133475-1-lucas.demarchi@intel.com
(Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added)
UAPI Changes:
(- Potential implicit changes from WW locking refactoring)
Cross-subsystem Changes:
(- WW locking changes should align the i915 locking more with others)
Driver Changes:
- MAJOR: Apply WW locking across the driver (Maarten)
- Reverts for 5 commits to make applying WW locking faster (Maarten)
- Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris)
- Add missing dma_fence_put() for error case of syncobj timeline (Chris)
- Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten)
- Pin engine before pinning all objects (Maarten)
- Rework intel_context pinning to do everything outside of pin_mutex (Maarten)
- Avoid tracking GEM context until registered (Cc: stable, Chris)
- Provide a fastpath for waiting on vma bindings (Chris)
- Fixes to preempt-to-busy mechanism (Chris)
- Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris)
- Switch to object allocations for page directories (Chris)
- Hold context/request reference while breadcrumbs are active (Chris)
- Make sure execbuffer always passes ww state to i915_vma_pin (Maarten)
- Code refactoring to facilitate use of WW locking (Maarten)
- Locking refactoring to use more granular locking (Maarten, Chris)
- Support for multiple pinned timelines per engine (Chris)
- Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris)
- Make active tracking/vma page-directory stash work preallocated (Chris)
- Avoid flushing submission tasklet too often (Chris)
- Reduce context termination list iteration guard to RCU (Chris)
- Reductions to locking contention (Chris)
- Fixes for issues found by CI (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com
The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.
In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.
v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We need to make the DMA allocations used for page directories to be
performed up front so that we can include those allocations in our
memory reservation pass. The downside is that we have to assume the
worst case, even before we know the final layout, and always allocate
enough page directories for this object, even when there will be overlap.
This unfortunately can be quite expensive, especially as we have to
clear/reset the page directories and DMA pages, but it should only be
required during early phases of a workload when new objects are being
discovered, or after memory/eviction pressure when we need to rebind.
Once we reach steady state, the objects should not be moved and we no
longer need to preallocating the pages tables.
It should be noted that the lifetime for the page directories DMA is
more or less decoupled from individual fences as they will be shared
across objects across timelines.
v2: Only allocate enough PD space for the PTE we may use, we do not need
to allocate PD that will be left as scratch.
v3: Store the shift unto the first PD level to encapsulate the different
PTE counts for gen6/gen8.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reuse the ppgtt_bind_vma() for aliasing_ppgtt_bind_vma() so we can
reduce some code near-duplication. The catch is that we need to then
pass along the i915_address_space and not rely on vma->vm, as they
differ with the aliasing-ppgtt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703102519.26539-1-chris@chris-wilson.co.uk
Across suspend/resume, we clear the entire GGTT and rebuild from
scratch. In particular, we want to only preserve the global entries for
use by the HW, and delay reinstating the local binds until required by
the user. This means that we can evict any local binds in the global GTT,
saving any time in preserving their state, as they will be rebound on
demand.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1947
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528082427.21402-2-chris@chris-wilson.co.uk
We should be able to skip restoring LOCAL (user) binds within the GGTT on
resume and let them be restored upon demand. However, our consistency
checks demand that the bind flags match the node state, and we cannot
simply clear the flags, we need to evict as well. For now, make sure we
restore the bind flags exactly upon resume.
Fixes: 0109a16ef3 ("drm/i915/gt: Clear LOCAL_BIND from shared GGTT on resume")
Fixes: bf0840cdb3 ("drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528150452.7880-1-chris@chris-wilson.co.uk
We only restore GLOBAL binds upon resume as we expect these to be pinned
for use by HW, whereas the LOCAL binds can be recreated on demand once
userspace is resumed. For the LOCAL bind to be recreated in the global
GTT (for old systems without ppgtt), we need to clear its presence flag
on deciding not to restore the mapping upon resume.
Fixes: bf0840cdb3 ("drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526150739.26147-1-chris@chris-wilson.co.uk
DMA_MASK bit values are different for different generations.
This will become more difficult to manage over time with the open
coded usage of different versions of the device.
Fix by:
disallow setting of dma mask in AGP path (< GEN(5) for i915,
add dma_mask_size to the device info configuration,
updating open code call sequence to the latest interface,
refactoring into a common function for setting the dma segment
and mask info
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
cc: Brian Welty <brian.welty@intel.com>
cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200417195107.68732-1-michael.j.ruhl@intel.com
When we allocate space in the GGTT we may have to allocate a larger
region than will be populated by the object to accommodate fencing. Make
sure that this space beyond the end of the buffer points safely into
scratch space, in case the HW tries to access it anyway (e.g. fenced
access to the last tile row).
v2: Preemptively / conservatively guard gen6 ggtt as well.
Reported-by: Imre Deak <imre.deak@intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1554
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331152348.26946-1-chris@chris-wilson.co.uk
On TGL, bits 2-4 in the GGTT PTE are not ignored anymore and are
instead used for some extra VT-d capabilities. We don't (yet?) have
support for those capabilities, but, given that we shared the pte_encode
function betweed GGTT and PPGTT, we still set those bits to the PPGTT
PPAT values. The DMA engine gets very confused when those bits are
set while the iommu is enabled, leading to errors. E.g. when loading
the GuC we get:
[ 9.796218] DMAR: DRHD: handling fault status reg 2
[ 9.796235] DMAR: [DMA Write] Request device [00:02.0] PASID ffffffff fault addr 0 [fault reason 02] Present bit in context entry is clear
[ 9.899215] [drm:intel_guc_fw_upload [i915]] *ERROR* GuC firmware signature verification failed
To fix this, just have dedicated gen8_pte_encode function per type of
gtt. Also, explicitly set vm->pte_encode for gen8_ppgtt, even if we
don't use it, to make sure we don't accidentally assign it to the GGTT
one, like we do for gen6_ppgtt, in case we need it in the future.
Reported-by: "Sodhi, Vunny" <vunny.sodhi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200226185657.26445-1-daniele.ceraolospurio@intel.com
The #include has been splattered all over the place, but there are
precious few places, all .c files, that actually need it.
v2: remove leftover double newlines
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200225133131.3301-1-jani.nikula@intel.com
Some DSI and VBT pending patches from Hans will apply
cleanly and with less ugly conflicts if they are rebuilt
on top of other patches that recently landed on drm-next.
Reference: https://patchwork.freedesktop.org/series/70952/
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com
use intel_uc_uses_guc() directly instead, to be consistent in the way we
check what we want to do with the GuC.
v2: split guc_log_info changes to their own patch (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200218223327.11058-2-daniele.ceraolospurio@intel.com
In the rare cases where we are using the global GGTT for execution in
the selftests, we have marked them with PIN_USER knowing that they will
be bound as PIN_GLOBAL as well. However, we need to catch the extra flag
in deciding to use the async worker for such binds as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131081543.2251298-1-chris@chris-wilson.co.uk
On Braswell and Broxton (also known as Valleyview and Apollolake), we
need to serialise updates of the GGTT using the big stop_machine()
hammer. This has the side effect of appearing to lockdep as a possible
reclaim (since it uses the cpuhp mutex and that is tainted by per-cpu
allocations). However, we want to use vm->mutex (including ggtt->mutex)
from within the shrinker and so must avoid such possible taints. For this
purpose, we introduced the asynchronous vma binding and we can apply it
to the PIN_GLOBAL so long as take care to add the necessary waits for
the worker afterwards.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/211
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-3-chris@chris-wilson.co.uk
The i915_ggtt now sits beneath gt/ outside of the auspices of gem/ and
should be given a fresh name to reflect that. We also want to give it a
name that reflects its role in the system suspend/resume, with the
intention of pulling together all the GGTT operations (e.g. restoring
the fence registers once they are pulled under gt/intel_ggtt_detiler.c)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Rreviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-2-chris@chris-wilson.co.uk
uapi:
- dma-buf heaps added (and fixed)
- command line add support for panel oreientation
- command line allow overriding penguin count
drm:
- mipi dsi definition updates
- lockdep annotations for dma_resv
- remove dma-buf kmap/kunmap support
- constify fb_ops in all fbdev drivers
- MST fix for daisy chained hotplug-
- CTA-861-G modes with VIC >= 193 added
- fix drm_panel_of_backlight export
- LVDS decoder support
- more device based logging support
- scanline alighment for dumb buffers
- MST DSC helpers
scheduler:
- documentation fixes
- job distribution improvements
panel:
- Logic PD type 28 panel support
- Jimax8729d MIPI-DSI
- igenic JZ4770
- generic DSI devicetree bindings
- sony acx424AKP panel
- Leadtek LTK500HD1829
- xinpeng XPP055C272
- AUO B116XAK01
- GiantPlus GPM940B0
- BOE NV140FHM-N49
- Satoz SAT050AT40H12R2
- Sharp LS020B1DD01D panels.
ttm:
- use blocking WW lock
i915:
- hw/uapi state separation
- Lock annotation improvements
- selftest improvements
- ICL/TGL DSI VDSC support
- VBT parsing improvments
- Display refactoring
- DSI updates + fixes
- HDCP 2.2 for CFL
- CML PCI ID fixes
- GLK+ fbc fix
- PSR fixes
- GEN/GT refactor improvments
- DP MST fixes
- switch context id alloc to xarray
- workaround updates
- LMEM debugfs support
- tiled monitor fixes
- ICL+ clock gating programming removed
- DP MST disable sequence fixed
- LMEM discontiguous object maps
- prefaulting for discontiguous objects
- use LMEM for dumb buffers if possible
- add LMEM mmap support
amdgpu:
- enable sync object timelines for vulkan
- MST atomic routines
- enable MST DSC support
- add DMCUB display microengine support
- DC OEM i2c support
- Renoir DC fixes
- Initial HDCP 2.x support
- BACO support for Arcturus
- Use BACO for runtime PM power save
- gfxoff on navi10
- gfx10 golden updates and fixes
- DCN support on POWER
- GFXOFF for raven1 refresh
- MM engine idle handlers cleanup
- 10bpc EDP panel fixes
- renoir watermark fixes
- SR-IOV fixes
- Arcturus VCN fixes
- GDDR6 training fixes
- freesync fixes
- Pollock support
amdkfd:
- unify more codepath with amdgpu
- use KIQ to setup HIQ rather than MMIO
radeon:
- fix vma fault handler race
- PPC DMA fix
- register check fixes for r100/r200
nouveau:
- mmap_sem vs dma_resv fix
- rewrite the ACR secure boot code for Turing
- TU10x graphics engine support (TU11x pending)
- Page kind mapping for turing
- 10-bit LUT support
- GP10B Tegra fixes
- HD audio regression fix
hisilicon/hibmc:
- use generic fbdev code and helpers
rockchip:
- dsi/px30 support
virtio:
- fb damage support
- static some functions
vc4:
- use dma_resv lock wrappers
msm:
- use dma_resv lock wrappers
- sc7180 display + DSI support
- a618 support
- UBWC support improvements
vmwgfx:
- updates + new logging uapi
exynos:
- enable/disable callback cleanups
etnaviv:
- use dma_resv lock wrappers
atmel-hlcdc:
- clock fixes
mediatek:
- cmdq support
- non-smooth cursor fixes
- ctm property support
sun4i:
- suspend support
- A64 mipi dsi support
rcar-du:
- Color management module support
- LVDS encoder dual-link support
- R8A77980 support
analogic:
- add support for an6345
ast:
- atomic modeset support
- primary plane garbage fix
arcgpu:
- fixes for fourcc handling
tegra:
- minor fixes and improvments
mcde:
- vblank support
meson:
- OSD1 plane AFBC commit
gma500:
- add pageflip support
- reomve global drm_dev
komeda:
- tweak debugfs output
- d32 support
- runtime PM suppotr
udl:
- use generic shmem helpers
- cleanup and fixes
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJeMm6RAAoJEAx081l5xIa+vN8P/0j4jEOv+KIinAhoH+LG3EpD
m2TUuu5OQIoBrcCoWOgFBk3wqYpw6PdMBdkXh+5sE5lfeBynp8oC3Bin+QsHJE05
eGBpZtHe+70MQb0Eha+Aic0hchvBKzRnq6i0MYSIHn6afs76dLmF8knTjycxrvV5
Xu1Z3WDmjzqgWF9ja5JCD6fby11seP5RrwObYKVikO35QQyJJwGSGKgu5rq/pByK
/n0PCnCOINuL0Lz6J9qexdh/0/XYFQilRC31GJNlKbDSFuECF0GOEzEE/xUBW/pI
dLh2YwIIygm18Gar9PgvMwXJn3BfzQ0qEJsf+HlQeNw9iLgbHpp2AsTxHTE87OGe
R/y85taW3jGjPsNOKZOeLpvg/Ro8l8ZipLApvDCG2O22DThg/cd6NDjZxl1FJfRH
acDG/JdgPo5MbdRAH/cM1WuFS9gEM+0BeSQ5gCjtPakF+X4Vz+ABFDLMRJoaejkJ
q8DG32TQXELQx0RMghsqK7YCWGfl+2alA1u9w6TgJh9Rq4iVckvpDeqAZnK1Adkc
87g957Tl0n6FA4wJj/t5jrceiLRMJAm/rBK+R3GZNfWrgx4bHbCmb4fZDZsrFzph
nbAjNJ5kOchrFCaRR47ULby6+Q14MAFbkWq4Crfu4YDdzUkTPpep6pi2GIe8w0rV
P0hdYOYJf6LUda0utuQX
=oFrI
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Davbe Airlie:
"This is the main pull request for graphics for 5.6. Usual selection of
changes all over.
I've got one outstanding vmwgfx pull that touches mm so kept it
separate until after all of this lands. I'll try and get it to you
soon after this, but it might be early next week (nothing wrong with
code, just my schedule is messy)
This also hits a lot of fbdev drivers with some cleanups.
Other notables:
- vulkan timeline semaphore support added to syncobjs
- nouveau turing secureboot/graphics support
- Displayport MST display stream compression support
Detailed summary:
uapi:
- dma-buf heaps added (and fixed)
- command line add support for panel oreientation
- command line allow overriding penguin count
drm:
- mipi dsi definition updates
- lockdep annotations for dma_resv
- remove dma-buf kmap/kunmap support
- constify fb_ops in all fbdev drivers
- MST fix for daisy chained hotplug-
- CTA-861-G modes with VIC >= 193 added
- fix drm_panel_of_backlight export
- LVDS decoder support
- more device based logging support
- scanline alighment for dumb buffers
- MST DSC helpers
scheduler:
- documentation fixes
- job distribution improvements
panel:
- Logic PD type 28 panel support
- Jimax8729d MIPI-DSI
- igenic JZ4770
- generic DSI devicetree bindings
- sony acx424AKP panel
- Leadtek LTK500HD1829
- xinpeng XPP055C272
- AUO B116XAK01
- GiantPlus GPM940B0
- BOE NV140FHM-N49
- Satoz SAT050AT40H12R2
- Sharp LS020B1DD01D panels.
ttm:
- use blocking WW lock
i915:
- hw/uapi state separation
- Lock annotation improvements
- selftest improvements
- ICL/TGL DSI VDSC support
- VBT parsing improvments
- Display refactoring
- DSI updates + fixes
- HDCP 2.2 for CFL
- CML PCI ID fixes
- GLK+ fbc fix
- PSR fixes
- GEN/GT refactor improvments
- DP MST fixes
- switch context id alloc to xarray
- workaround updates
- LMEM debugfs support
- tiled monitor fixes
- ICL+ clock gating programming removed
- DP MST disable sequence fixed
- LMEM discontiguous object maps
- prefaulting for discontiguous objects
- use LMEM for dumb buffers if possible
- add LMEM mmap support
amdgpu:
- enable sync object timelines for vulkan
- MST atomic routines
- enable MST DSC support
- add DMCUB display microengine support
- DC OEM i2c support
- Renoir DC fixes
- Initial HDCP 2.x support
- BACO support for Arcturus
- Use BACO for runtime PM power save
- gfxoff on navi10
- gfx10 golden updates and fixes
- DCN support on POWER
- GFXOFF for raven1 refresh
- MM engine idle handlers cleanup
- 10bpc EDP panel fixes
- renoir watermark fixes
- SR-IOV fixes
- Arcturus VCN fixes
- GDDR6 training fixes
- freesync fixes
- Pollock support
amdkfd:
- unify more codepath with amdgpu
- use KIQ to setup HIQ rather than MMIO
radeon:
- fix vma fault handler race
- PPC DMA fix
- register check fixes for r100/r200
nouveau:
- mmap_sem vs dma_resv fix
- rewrite the ACR secure boot code for Turing
- TU10x graphics engine support (TU11x pending)
- Page kind mapping for turing
- 10-bit LUT support
- GP10B Tegra fixes
- HD audio regression fix
hisilicon/hibmc:
- use generic fbdev code and helpers
rockchip:
- dsi/px30 support
virtio:
- fb damage support
- static some functions
vc4:
- use dma_resv lock wrappers
msm:
- use dma_resv lock wrappers
- sc7180 display + DSI support
- a618 support
- UBWC support improvements
vmwgfx:
- updates + new logging uapi
exynos:
- enable/disable callback cleanups
etnaviv:
- use dma_resv lock wrappers
atmel-hlcdc:
- clock fixes
mediatek:
- cmdq support
- non-smooth cursor fixes
- ctm property support
sun4i:
- suspend support
- A64 mipi dsi support
rcar-du:
- Color management module support
- LVDS encoder dual-link support
- R8A77980 support
analogic:
- add support for an6345
ast:
- atomic modeset support
- primary plane garbage fix
arcgpu:
- fixes for fourcc handling
tegra:
- minor fixes and improvments
mcde:
- vblank support
meson:
- OSD1 plane AFBC commit
gma500:
- add pageflip support
- reomve global drm_dev
komeda:
- tweak debugfs output
- d32 support
- runtime PM suppotr
udl:
- use generic shmem helpers
- cleanup and fixes"
* tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits)
drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing
drm/nouveau/acr: return error when registering LSF if ACR not supported
drm/nouveau/disp/gv100-: not all channel types support reporting error codes
drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
drm/nouveau: support synchronous pushbuf submission
drm/nouveau: signal pending fences when channel has been killed
drm/nouveau: reject attempts to submit to dead channels
drm/nouveau: zero vma pointer even if we only unreference it rather than free
drm/nouveau: Add HD-audio component notifier support
drm/nouveau: fix build error without CONFIG_IOMMU_API
drm/nouveau/kms/nv04: remove set but not used variable 'width'
drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector'
drm/nouveau/mmu: fix comptag memory leak
drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc
drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping
drm/exynos: Rename Exynos to lowercase
drm/exynos: change callback names
drm/mst: Don't do atomic checks over disabled managers
drm/amdgpu: add the lost mutex_init back
drm/amd/display: skip opp blank or unblank if test pattern enabled
...
VT'd on Broxton and on Braswell require serialisation of GGTT updates.
However, it seems to only be required for insertion, so drop the
complication and heavyweight stop_machine() for clears. The range will
be serialised again before use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130092239.1743672-1-chris@chris-wilson.co.uk
Manual conversion of the printk based logging macros to the new struct
drm_based logging macros in drm/i915/gt/intel_ggtt.c.
Also includes extracting the struct drm_i915_private device from various
intel types to use in the new macros.
Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128071437.9284-3-wambui.karugax@gmail.com
We need to hold the runtime-pm wakeref to update the global PTEs (as
they exist behind a PCI BAR). However, some systems invoke ACPI during
runtime resume and so require allocations, which is verboten inside the
vm->mutex. Ergo, we must not use intel_runtime_pm_get() inside the
mutex, but lift the call outside.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/958
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110144418.1415639-1-chris@chris-wilson.co.uk
In the near future, we will want to start a GPU error capture from a new
context, from inside the softirq region of a forced preemption. To do
so requires us to break up the monolithic error capture to provide new
entry points with finer control; in particular focusing on one
engine/gt, and being able to compose an error state from little pieces
of HW capture.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110123059.1348712-1-chris@chris-wilson.co.uk
Currently we first to try to unbind the VMA (and lazily rebind on next
use) as an optimisation during restore_ggtt_mappings. Ideally, the only
objects in the GGTT upon resume are the pinned kernel objects which
can't be unbound and need to be restored. As the unbind interferes with
the plan to mark those objects as active for error capture, forgo the
optimisation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110110402.1231745-1-chris@chris-wilson.co.uk
Fix build error:
drivers/gpu/drm/i915/gt/intel_ggtt.c: In function ggtt_restore_mappings:
drivers/gpu/drm/i915/gt/intel_ggtt.c:1239:3: error:
implicit declaration of function wbinvd_on_all_cpus; did you mean wrmsr_on_cpus? [-Werror=implicit-function-declaration]
wbinvd_on_all_cpus();
^~~~~~~~~~~~~~~~~~
wrmsr_on_cpus
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200109012303.153001-1-chenzhou10@huawei.com
Attempt to split i915_gem_gtt.[ch] into more manageable chunks.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107134009.3255354-1-chris@chris-wilson.co.uk