Commit Graph

157 Commits

Author SHA1 Message Date
John Harrison ddabf72176 drm/i915/xehp: Extra media engines - Part 3 (reset)
Xe_HP can have a lot of extra media engines. This patch adds the reset
support for them.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723174239.1551352-5-matthew.d.roper@intel.com
2021-07-24 07:17:41 -07:00
Lucas De Marchi c816723b6b drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER
This was done by the following semantic patch:

	@@ expression i915; @@
	- INTEL_GEN(i915)
	+ GRAPHICS_VER(i915)

	@@ expression i915; expression E; @@
	- INTEL_GEN(i915) >= E
	+ GRAPHICS_VER(i915) >= E

	@@ expression dev_priv; expression E; @@
	- !IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) != E

	@@ expression dev_priv; expression E; @@
	- IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) == E

	@@
	expression dev_priv;
	expression from, until;
	@@
	- IS_GEN_RANGE(dev_priv, from, until)
	+ IS_GRAPHICS_VER(dev_priv, from, until)

	@def@
	expression E;
	identifier id =~ "^gen$";
	@@
	- id = GRAPHICS_VER(E)
	+ ver = GRAPHICS_VER(E)

	@@
	identifier def.id;
	@@
	- id
	+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
2021-06-05 15:09:06 -07:00
Aditya Swarup 5b26d57fdb drm/i915: Add Wa_14010733141
The WA requires the following procedure for VDBox SFC reset:

If (MFX-SFC usage is 1) {
	1.Issue a MFX-SFC forced lock
	2.Wait for MFX-SFC forced lock ack
	3.Check the MFX-SFC usage bit
	If (MFX-SFC usage bit is 1)
		Reset VDBOX and SFC
	else
		Reset VDBOX
	Release the force lock MFX-SFC
}
else if(HCP+SFC usage is 1) {
	1.Issue a VE-SFC forced lock
	2.Wait for SFC forced lock ack
	3.Check the VE-SFC usage bit
	If (VE-SFC usage bit is 1)
		Reset VDBOX
	else
		Reset VDBOX and SFC
	Release the force lock VE-SFC.
}
else
	Reset VDBOX

- Restructure: the changes to the original code flow should stay
  relatively minimal; we only need to do an extra HCP check after the
  usual VD-MFX check and, if true, switch the register/bit we're
  performing the lock on.(MattR)

v2:
- Assign unlock mask using paired_engine->mask instead of using
  BIT(paired_vecs->id). (Daniele)

Bspec: 52890, 53509

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210526094852.286424-2-aditya.swarup@intel.com
2021-05-27 11:05:09 -07:00
Chris Wilson c92c36ed8d drm/i915/gt: Move submission_method into intel_gt
Since we setup the submission method for the engines once, it is easy to
assign an enum and use that instead of probing into the backends.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-3-matthew.brost@intel.com
2021-05-25 15:14:24 +02:00
Dave Airlie 41d1d0c51f Merge tag 'drm-intel-gt-next-2021-04-06' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:

- Prepare for local/device memory support on DG1 by starting
  to use it for kernel internal allocations: context, ring
  and engine scratch (Matt A, CQ, Abdiel, Imre)
- Sandybridge fix to avoid hard hang on ring resume (Chris)
- Limit imported dma-buf size to int32 (Matt A)
- Double check heartbeat timeout before resetting (Chris)

- Use new tasklet API for execution list (Emil)
- Fix SPDX checkpats warnings (Chris)
- Fixes for various checkpatch warnings (Chris)
- Selftest improvements (Chris)
- Move the defer_request waiter active assertion to correct spot (Chris)
- Make local-memory probing a GT operation (Matt, Tvrtko)
- Protect against request freeing during cancellation on wedging (Chris)
- Retire unexpected starting state error dumping (Chris)
- Distinction of memory regions in debugging (Zbigniew)
- Always flush the submission queue on checking for idle (Chris)

- Consolidate 2big error check to helper (Matt)
- Decrease number of subplatform bits (Tvrtko)
- Remove unused internal request priority levels (Chris)
- Document the unused internal header bits in buddy allocator (Matt)
- Cleanup the region class/instance encoding (Matt)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YGxksaZGXHnFxlwg@jlahtine-mobl.ger.corp.intel.com
2021-04-08 12:46:12 +10:00
Chris Wilson c10e4a7960 drm/i915: Protect against request freeing during cancellation on wedging
As soon as we mark a request as completed, it may be retired. So when
cancelling a request and marking it complete, make sure we first keep a
reference to the request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210201085715.27435-4-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-03-24 19:30:36 +01:00
Chris Wilson 24f90d6688 drm/i915/gt: SPDX cleanup
Clean up the SPDX licence declarations to comply with checkpatch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-1-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-03-24 19:30:34 +01:00
Maarten Lankhorst 5b0a78ec0b drm/i915: Move gt_revoke() slightly
We get a lockdep splat when the reset mutex is held, because it can be
taken from fence_wait. This conflicts with the mmu notifier we have,
because we recurse between reset mutex and mmap lock -> mmu notifier.

Remove this recursion by calling revoke_mmaps before taking the lock.

The reset code still needs fixing, as taking mmap locks during reset
is not allowed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[danvet: Add FIXME.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-64-maarten.lankhorst@linux.intel.com
2021-03-24 17:59:37 +01:00
Jani Nikula 35bb28ece9 Merge drm/drm-next into drm-intel-next
Sync up with upstream.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-03-11 08:52:53 +02:00
Thomas Zimmermann e322551f47 drm/i915/gt: Remove references to struct drm_device.pdev
Using struct drm_device.pdev is deprecated. Convert i915 to struct
drm_device.dev. No functional changes.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-3-tzimmermann@suse.de
2021-02-02 13:58:43 +02:00
Chris Wilson 163433e5c5 drm/i915: Mark up protected uses of 'i915_request_completed'
When we know that we are inside the timeline mutex, or inside the
submission flow (under active.lock or the holder's rcu lock), we know
that the rq->hwsp is stable and we can use the simpler direct version.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk
2021-01-15 08:00:03 +00:00
Joonas Lahtinen d263dfa7d2 Merge drm/drm-next into drm-intel-gt-next
Backmerging to get a common base for merging topic branches between
drm-intel-next and drm-intel-gt-next.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-01-15 08:49:57 +02:00
Dave Airlie fb5cfcaa2e Merge tag 'drm-intel-gt-next-2021-01-14' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Deprecate I915_PMU_LAST and optimize state tracking (Tvrtko)

  Avoid relying on last item ABI marker in i915_drm.h, add a
  comment to mark as deprecated.

Cross-subsystem Changes:

Core Changes:

Driver Changes:

- Restore clear residuals security mitigations for Ivybridge and
  Baytrail (Chris)
- Close #1858: Allow sysadmin to choose applied GPU security mitigations
  through i915.mitigations=... similar to CPU (Chris)
- Fix for #2024: GPU hangs on HSW GT1 (Chris)
- Fix for #2707: Driver hang when editing UVs in Blender (Chris, Ville)
- Fix for #2797: False positive GuC loading error message (Chris)
- Fix for #2859: Missing GuC firmware for older Cometlakes (Chris)
- Lessen probability of GPU hang due to DMAR faults [reason 7,
  next page table ptr is invalid] on Tigerlake (Chris)
- Fix REVID macros for TGL to fetch correct stepping (Aditya)
- Limit frequency drop to RPe on parking (Chris, Edward)
- Limit W/A 1406941453 to TGL, RKL and DG1 (Swathi)
- Make W/A 22010271021 permanent on DG1 (Lucas)
- Implement W/A 16011163337 to prevent a HS/DS hang on DG1 (Swathi)
- Only disable preemption on gen8 render engines (Chris)
- Disable arbitration around Braswell's PDP updates (Chris)
- Disable arbitration on no-preempt requests (Chris)
- Check for arbitration after writing start seqno before busywaiting (Chris)
- Retain default context state across shrinking (Venkata, CQ)
- Fix mismatch between misplaced vma check and vma insert for 32-bit
  addressing userspaces (Chris, CQ)
- Propagate error for vmap() failure instead kernel NULL deref (Chris)
- Propagate error from cancelled submit due to context closure
  immediately (Chris)
- Fix RCU race on HWSP tracking per request (Chris)
- Clear CMD parser shadow and GPU reloc batches (Matt A)

- Populate logical context during first pin (Maarten)
- Optimistically prune dma-resv from the shrinker (Chris)
- Fix for virtual engine ownership race (Chris)
- Remove timeslice suppression to restore fairness for virtual engines (Chris)
- Rearrange IVB/HSW workarounds properly between GT and engine (Chris)
- Taint the reset mutex with the shrinker (Chris)
- Replace direct submit with direct call to tasklet (Chris)
- Multiple corrections to virtual engine dequeue and breadcrumbs code (Chris)
- Avoid wakeref from potentially hard IRQ context in PMU (Tvrtko)
- Use raw clock for RC6 time estimation in PMU (Tvrtko)
- Differentiate OOM failures from invalid map types (Chris)
- Fix Gen9 to have 64 MOCS entries similar to Gen11 (Chris)
- Ignore repeated attempts to suspend request flow across reset (Chris)
- Remove livelock from "do_idle_maps" VT-d W/A (Chris)
- Cancel the preemption timeout early in case engine reset fails (Chris)
- Code flow optimization in the scheduling code (Chris)
- Clear the execlists timers upon reset (Chris)
- Drain the breadcrumbs just once (Chris, Matt A)
- Track the overall GT awake/busy time (Chris)
- Tweak submission tasklet flushing to avoid starvation (Chris)
- Track timelines created using the HWSP to restore on resume (Chris)
- Use cmpxchg64 for 32b compatilibity for active tracking (Chris)
- Prefer recycling an idle GGTT fence to avoid GPU wait (Chris)

- Restructure GT code organization for clearer split between GuC
  and execlists (Chris, Daniele, John, Matt A)
- Remove GuC code that will remain unused by new interfaces (Matt B)
- Restructure the CS timestamp clocks code to local to GT (Chris)
- Fix error return paths in perf code (Zhang)
- Replace idr_init() by idr_init_base() in perf (Deepak)
- Fix shmem_pin_map error path (Colin)
- Drop redundant free_work worker for GEM contexts (Chris, Mika)
- Increase readability and understandability of intel_workarounds.c (Lucas)
- Defer enabling the breadcrumb interrupt to after submission (Chris)
- Deal with buddy alloc block sizes beyond 4G (Venkata, Chris)
- Encode fence specific waitqueue behaviour into the wait.flags (Chris)
- Don't cancel the breadcrumb interrupt shadow too early (Chris)
- Cancel submitted requests upon context reset (Chris)
- Use correct locks in GuC code (Tvrtko)
- Prevent use of engine->wa_ctx after error (Chris, Matt R)

- Fix build warning on 32-bit (Arnd)
- Avoid memory leak if platform would have more than 16 W/A (Tvrtko)
- Avoid unnecessary #if CONFIG_PM in PMU code (Chris, Tvrtko)
- Improve debugging output (Chris, Tvrtko, Matt R)
- Make file local variables static (Jani)
- Avoid uint*_t types in i915 (Jani)
- Selftest improvements (Chris, Matt A, Dan)
- Documentation fixes (Chris, Jose)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
#	drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
#	drivers/gpu/drm/i915/gt/intel_lrc.c
#	drivers/gpu/drm/i915/gvt/mmio_context.h
#	drivers/gpu/drm/i915/i915_drv.h
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114152232.GA21588@jlahtine-mobl.ger.corp.intel.com
2021-01-15 15:03:36 +10:00
Chris Wilson 9834dfef55 drm/i915/gt: Prune inlines
Remove all the manual inlines from non-critical sections in gt/

add/remove: 2/0 grow/shrink: 0/3 up/down: 762/-1473 (-711)
Function                                     old     new   delta
mi_set_context.isra                            -     602    +602
write_dma_entry                                -     160    +160
__set_pd_entry                               214      69    -145
clear_pd_entry                               190      42    -148
ring_request_alloc                          2021     841   -1180
Total: Before=1605086, After=1604375, chg -0.04%

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113152224.29794-1-chris@chris-wilson.co.uk
2021-01-14 15:40:57 +00:00
Chris Wilson 0a7d355ec6 drm/i915/gt: Allow failed resets without assertion
If the engine reset fails, we will attempt to resume with the current
inflight submissions. When that happens, we cannot assert that the
engine reset cleared the pending submission, so do not.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2878
Fixes: 16f2941ad3 ("drm/i915/gt: Replace direct submit with direct call to tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-3-chris@chris-wilson.co.uk
2021-01-05 09:17:22 +00:00
Chris Wilson cecb2af42c drm/i915/gt: Taint the reset mutex with the shrinker
Declare that, under extreme circumstances, the shrinker may need to wait
upon a request, in which case reset must not itself deadlock in order to
ensure forward progress of the driver. That is since the shrinker may
depend upon a reset, any reset cannot touch the shrinker.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201229141626.4773-1-chris@chris-wilson.co.uk
2020-12-30 13:36:54 +00:00
Chris Wilson 16f2941ad3 drm/i915/gt: Replace direct submit with direct call to tasklet
Rather than having special case code for opportunistically calling
process_csb() and performing a direct submit while holding the engine
spinlock for submitting the request, simply call the tasklet directly.
This allows us to retain the direct submission path, including the CS
draining to allow fast/immediate submissions, without requiring any
duplicated code paths, and most importantly greatly simplifying the
control flow by removing reentrancy. This will enable us to close a few
races in the virtual engines in the next few patches.

The trickiest part here is to ensure that paired operations (such as
schedule_in/schedule_out) remain under consistent locking domains,
e.g. when pulled outside of the engine->active.lock

v2: Use bh kicking, see commit 3c53776e29 ("Mark HI and TASKLET
softirq synchronous").
v3: Update engine-reset to be tasklet aware

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
2020-12-24 15:02:35 +00:00
Chris Wilson cb56a07d2f drm/i915/gt: Include reset failures in the trace
The GT and engine reset failures are completely invisible when looking at
a trace for a bug, but are vital to understanding the incomplete flow.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-3-chris@chris-wilson.co.uk
2020-12-04 15:41:33 +00:00
Dave Airlie 46fe37b98e Merge tag 'drm-intel-next-queued-2020-11-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 features for v5.11:

Highlights:
- Enable big joiner to join two pipes to one port to overcome pipe restrictions
  (Manasi, Ville, Maarten)

Display:
- More DG1 enabling (Lucas, Aditya)
- Fixes to cases without display (Lucas, José, Jani)
- Initial PSR state improvements (José)
- JSL eDP vswing updates (Tejas)
- Handle EDID declared max 16 bpc (Ville)
- Display refactoring (Ville)

Other:
- GVT features
- Backmerge

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87czzzkk1s.fsf@intel.com
2020-12-03 13:01:44 +10:00
Lucas De Marchi e669ad6f1c drm/i915/display: add namespace to intel_finish_reset
Rename intel_finish_reset to intel_display_finish_reset, so it's clear
from gt/ that we are calling out the display code.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-2-lucas.demarchi@intel.com
2020-11-11 11:50:43 -08:00
Lucas De Marchi 87ebfaab7f drm/i915/display: add namespace to intel_prepare_reset
Rename intel_prepare_reset to intel_display_prepare_reset, so it's clear
from gt/ that we are calling out the display code.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-1-lucas.demarchi@intel.com
2020-11-11 11:50:42 -08:00
Tvrtko Ursulin bda3002485 drm/i915: Improve record of hung engines in error state
Between events which trigger engine and GPU resets and capturing the error
state we lose information on which engine triggered the reset. Improve
this by passing in the hung engine mask down to error capture.

Result is that the list of engines in user visible "GPU HANG: ecode
<gen>:<engines>:<ecode>, <process>" is now a list of hanging and not just
active engines. Most importantly the displayed process is now the one
which was actually hung.

v2:
 * Stub prototype. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104134743.916027-1-tvrtko.ursulin@linux.intel.com
2020-11-09 11:59:43 +00:00
Chris Wilson b05734720d drm/i915/gt: Retire cancelled requests on unload
If we manage to hit the intel_gt_set_wedged_on_fini() while active, i.e.
module unload during a stress test, we may cancel the requests but not
clean up. This leads to a very slow module unload as we wait for
something or other to trigger the retirement flushing, or timeout and
unload with a bunch of warnings. Instead if we explicitly cancel then
cleanup on an active unload, it should be instant and quiet.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200930163253.2789-3-chris@chris-wilson.co.uk
2020-09-30 17:59:26 +01:00
Chris Wilson b3786b2937 drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs
On the virtual engines, we only use the intel_breadcrumbs for tracking
signaling of stale breadcrumbs from the irq_workers. They do not have
any associated interrupt handling, active requests are passed to a
physical engine and associated breadcrumb interrupt handler. This causes
issues for us as we need to ensure that we do not actually try and
enable interrupts and the powermanagement required for them on the
virtual engine, as they will never be disabled. Instead, let's
specify the physical engine used for interrupt handler on a particular
breadcrumb.

v2: Drop b->irq_armed = true mocking for no interrupt HW

Fixes: 4fe6abb8f5 ("drm/i915/gt: Ignore irq enabling on the virtual engines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07 14:23:55 +03:00
Daniele Ceraolo Spurio 792592e72a drm/i915: Move the engine mask to intel_gt_info
Since the engines belong to the GT, move the runtime-updated list of
available engines to the intel_gt struct. The original mask has been
renamed to indicate it contains the maximum engine list that can be
found on a matching device.

In preparation for other info being moved to the gt in follow up patches
(sseu), introduce an intel_gt_info structure to group all gt-related
runtime info.

v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
2020-07-08 21:07:11 +01:00
Michał Winiarski 65706203d1 drm/i915: Print caller when tainting for CI
We can add taint from multiple places, printing the caller allows us to
have a better overview of what exactly caused us to do the tainting.

v2: Tweak format and print the device (Chris)
v3: Move things around (Chris)

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200706144107.204821-2-michal@hardline.pl
2020-07-06 19:21:07 +01:00
Michał Winiarski 3f04bdce72 drm/i915: Reboot CI if we get wedged during driver init
Getting wedged device on driver init is pretty much unrecoverable.
Since we're running various scenarios that may potentially hit this in
CI (module reload / selftests / hotunplug), and if it happens, it means
that we can't trust any subsequent CI results, we should just apply the
taint to let the CI know that it should reboot (CI checks taint between
test runs).

v2: Comment that WEDGED_ON_INIT is non-recoverable, distinguish
    WEDGED_ON_INIT from WEDGED_ON_FINI (Chris)
v3: Appease checkpatch, fixup search-replace logic expression mindbomb
    in assert (Chris)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200706144107.204821-1-michal@hardline.pl
2020-07-06 19:21:07 +01:00
Jani Nikula 8a25c4be58 drm/i915/params: switch to device specific parameters
Start using device specific parameters instead of module parameters for
most things. The module parameters become the immutable initial values
for i915 parameters. The device specific parameters in i915->params
start life as a copy of i915_modparams. Any later changes are only
reflected in the debugfs.

The stragglers are:

* i915.force_probe and i915.modeset. Needed before dev_priv is
  available. This is fine because the parameters are read-only and never
  modified.

* i915.verbose_state_checks. Passing dev_priv to I915_STATE_WARN and
  I915_STATE_WARN_ON would result in massive and ugly churn. This is
  handled by not exposing the parameter via debugfs, and leaving the
  parameter writable in sysfs. This may be fixed up in follow-up work.

* i915.inject_probe_failure. Only makes sense in terms of the module,
  not the device. This is handled by not exposing the parameter via
  debugfs.

v2: Fix uc i915 lookup code (Michał Winiarski)

Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200618150402.14022-1-jani.nikula@intel.com
2020-06-22 23:26:40 +03:00
Jani Nikula dc483ba501 drm/i915/gt: prefer struct drm_device based logging
Prefer struct drm_device based logging over struct device based logging.

No functional changes.

Cc: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-16-jani.nikula@intel.com
2020-04-08 13:49:35 +03:00
Chris Wilson 8e37d69913 drm/i915/gt: Cancel a hung context if already closed
Use the restored ability to check if a context is closed to decide
whether or not to immediately ban the context from further execution
after a hang.

Fixes: be90e34483 ("drm/i915/gt: Cancel banned contexts after GT reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-2-chris@chris-wilson.co.uk
2020-03-19 21:28:24 +00:00
Chris Wilson f899f786d1 drm/i915: Move GGTT fence registers under gt/
Since the fence registers control HW detiling through the GGTT
aperture, make them a part of the intel_ggtt under gt/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-1-chris@chris-wilson.co.uk
2020-03-16 20:28:26 +00:00
Chris Wilson be90e34483 drm/i915/gt: Cancel banned contexts after GT reset
As we started marking the ce->gem_context as NULL on closure, we can no
longer use that to carry closure information. Instead, we can look at
whether the context was killed on closure instead.

Fixes: 130a95e909 ("drm/i915/gem: Consolidate ctx->engines[] release")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/1379
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200304165113.2449213-1-chris@chris-wilson.co.uk
2020-03-04 20:28:02 +00:00
Chris Wilson 36e191f064 drm/i915: Apply i915_request_skip() on submission
Trying to use i915_request_skip() prior to i915_request_add() causes us
to try and fill the ring upto request->postfix, which has not yet been
set, and so may cause us to memset() past the end of the ring.

Instead of skipping the request immediately, just flag the error on the
request (only accepting the first fatal error we see) and then clear the
request upon submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200304121849.2448028-1-chris@chris-wilson.co.uk
2020-03-04 14:29:50 +00:00
Chris Wilson 6e17ae7380 drm/i915/gt: Only ignore already reset requests
If a request is being re-run after an innocent reset, it is marked as
-EAGAIN. So only skip an engine reset if the request is marked as -EIO.

Testcase: igt/gem_ctx_exec/basic-nohangcheck
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200207161602.2838218-1-chris@chris-wilson.co.uk
2020-02-07 20:52:41 +00:00
Daniele Ceraolo Spurio faea179283 drm/i915: extract engine WA programming to common resume function
The workarounds are a common "feature" across gens and submission
mechanisms and we already call the other WA related functions from
common engine ones (<setup/cleanup>_common), so it makes sense to
do the same with WA application. Medium-term, This will help us
reduce the duplication once the GuC resume function is added, but short
term it will also allow us to use the workaround lists for pre-gen8
engine workarounds.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131075716.2212299-2-chris@chris-wilson.co.uk
2020-01-31 23:54:12 +00:00
Wambui Karuga f8474622bc drm/i915/reset: conversion to new drm logging macros in gt/intel_reset.c
This converts most instances of the printk based drm logging macros in
i915/gt/intel_resect.c to the new struct drm_based logging macros.

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128071437.9284-4-wambui.karugax@gmail.com
2020-01-30 15:51:32 +02:00
Chris Wilson a28477826a drm/i915/gt: Lift set-wedged engine dumping out of user paths
The user (e.g. gem_eio) can manipulate the driver into wedging itself,
allowing the user to trigger voluminous logging of inconsequential
details. If we lift the dump to direct calls to intel_gt_set_wedged(),
out of the intel_reset failure handling, we keep the detail logging for
what we expect are true HW or test failures without being tricked.

Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200127231540.3302516-6-chris@chris-wilson.co.uk
2020-01-28 13:09:32 +00:00
Chris Wilson 742379c0c4 drm/i915: Start chopping up the GPU error capture
In the near future, we will want to start a GPU error capture from a new
context, from inside the softirq region of a forced preemption. To do
so requires us to break up the monolithic error capture to provide new
entry points with finer control; in particular focusing on one
engine/gt, and being able to compose an error state from little pieces
of HW capture.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110123059.1348712-1-chris@chris-wilson.co.uk
2020-01-10 15:34:33 +00:00
Chris Wilson 3fbbbef4f5 drm/i915/gt: Convert the final GEM_TRACE to GT_TRACE and co
Convert the few remaining GEM_TRACE() used for debugging over to the
appropriate GT_TRACE or RQ_TRACE.

References: 639f2f2489 ("drm/i915: Introduce new macros for tracing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-4-chris@chris-wilson.co.uk
2020-01-06 14:38:56 +00:00
Chris Wilson 45b152f752 drm/i915/gt: Avoid using the GPU before initialisation
Mark the GT as wedged so that we are not tempted to use it prior to
initialisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191229183153.3719869-3-chris@chris-wilson.co.uk
2019-12-30 14:04:57 +00:00
Lucas De Marchi 9eae5e27be drm/i915: prefer 3-letter acronym for ironlake
We are currently using a mix of platform name and acronym to name the
functions. Let's prefer the acronym as it should be clear what platform
it's about and it's shorter, so it doesn't go over 80 columns in a few
cases. This converts ironlake to ilk where appropriate.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191224084012.24241-7-lucas.demarchi@intel.com
2019-12-28 13:38:03 -08:00
Chris Wilson b761a7b47b drm/i915/gt: Ignore incomplete engines after init failure
Do not expose incomplete engines to the user after we fail to setup the
GT.

Fixes: e6ba764802 ("drm/i915: Remove i915->kernel_context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191226191237.2654510-1-chris@chris-wilson.co.uk
2019-12-26 19:51:42 +00:00
Chris Wilson 6a8679c048 drm/i915: Mark the GEM context link as RCU protected
The only protection for intel_context.gem_cotext is granted by RCU, so
annotate it as a rcu protected pointer and carefully dereference it in
the few occasions we need to use it.

Fixes: 9f3ccd40ac ("drm/i915: Drop GEM context as a direct link from i915_request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222233558.2201901-1-chris@chris-wilson.co.uk
2019-12-23 13:08:47 +00:00
Chris Wilson e26b6d4341 drm/i915/gt: Pull GT initialisation under intel_gt_init()
Begin pulling the GT setup underneath a single GT umbrella; let intel_gt
take ownership of its engines! As hinted, the complication is the
lifetime of the probed engine versus the active lifetime of the GT
backends. We need to detect the engine layout early and keep it until
the end so that we can sanitize state on takeover and release.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222120752.1368352-1-chris@chris-wilson.co.uk
2019-12-22 12:51:32 +00:00
Chris Wilson e6ba764802 drm/i915: Remove i915->kernel_context
Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
2019-12-21 16:37:10 +00:00
Chris Wilson 9f3ccd40ac drm/i915: Drop GEM context as a direct link from i915_request
Keep the intel_context as being the primary state for i915_request, with
the GEM context a backpointer from the low level state for the rarer
cases we need client information. Our goal is to remove such references
to clients from the backend, and leave the HW submission agnostic to
client interfaces and self-contained.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
2019-12-20 10:52:21 +00:00
Chris Wilson 54400257ae drm/i915/gt: Remove direct invocation of breadcrumb signaling
Only signal the breadcrumbs from inside the irq_work, simplifying our
interface and calling conventions. The micro-optimisation here is that
by always using the irq_work interface, we know we are always inside an
irq-off critical section for the breadcrumb signaling and can ellide
save/restore of the irq flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217095642.3124521-7-chris@chris-wilson.co.uk
2019-12-18 17:11:28 +00:00
Venkata Sandeep Dhanalakota 639f2f2489 drm/i915: Introduce new macros for tracing
New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and
GT_TRACE() are introduce to tag device name and engine
name with contexts and requests tracing in i915.

Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com
2019-12-13 20:16:23 +00:00
Daniele Ceraolo Spurio 3c9abe886a drm/i915/guc: kill the GuC client
We now only use 1 client without any plan to add more. The client is
also only holding information about the WQ and the process desc, so we
can just move those in the intel_guc structure and always use stage_id
0.

v2: fix comment (John)
v3: fix the comment for real, fix kerneldoc

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205220243.27403-4-daniele.ceraolospurio@intel.com
2019-12-09 13:55:50 -08:00
Abdiel Janulgue cc662126b4 drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature
comes from the value returned by this ioctl which is the offset into the
device fd which userpace uses with mmap(2).

mmap_gtt was our initial mmap_offset implementation, this extends
our CPU mmap support to allow additional fault handlers that depends on
the object's backing pages.

Note that we multiplex mmap_gtt and mmap_offset through the same ioctl,
and use the zero extending behaviour of drm to differentiate between
them, when we inspect the flags.

To support multiple mmap types on an object we need to support multiple
mmap_offsets for an object (each offset in the global device address
space corresponding to a unique instance of the object for a file + mmap
type). As we drop the simplified drm core idea of a single mmap_offset,
we need to provide replacement hooks for the dumb mmap interface as
well.

Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675
Testcase: igt/gem_mmap_offset
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk
2019-12-04 15:11:44 +00:00
Chris Wilson 88cec4973d drm/i915/gt: Declare timeline.lock to be irq-free
Now that we never allow the intel_wakeref callbacks to be invoked from
interrupt context, we do not need the irqsafe spinlock for the timeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120170858.3965380-1-chris@chris-wilson.co.uk
2019-11-20 17:12:11 +00:00
Chris Wilson 07779a76ee drm/i915: Mark up the calling context for intel_wakeref_put()
Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we normally are in process context,
we can typically use a plain mutex, and only defer to a work when we
know we are being called from an interrupt path.

Fixes: 51fbd8de87 ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24fc ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
2019-11-20 15:59:23 +00:00
Chris Wilson e8887bb3eb drm/i915: Cancel context if it hangs after it is closed
If we detect a hang in a closed context, just flush all of its requests
and cancel any remaining execution along the context. Note that after
closing the context, the last reference to the context may be dropped,
leaving it only valid under RCU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-5-chris@chris-wilson.co.uk
2019-11-11 11:46:40 +00:00
Chris Wilson dfd9c1b4ea drm/i915: Show guilty context name on GPU reset
We mention that we are resetting the GPU, and dump the device state for
post mortem debugging. However, while that dump contains the active
processes and the one flagged as causing the error, we do not always
include that information in dmesg. Include the name of the guilty
process in dmesg for reference.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-4-chris@chris-wilson.co.uk
2019-11-11 11:46:40 +00:00
Chris Wilson 058179e72e drm/i915/gt: Replace hangcheck by heartbeats
Replace sampling the engine state every so often with a periodic
heartbeat request to measure the health of an engine. This is coupled
with the forced-preemption to allow long running requests to survive so
long as they do not block other users.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk
2019-10-23 23:52:10 +01:00
Tvrtko Ursulin 5d904e3c5d drm/i915: Pass in intel_gt at some for_each_engine sites
Where the function, or code segment, operates on intel_gt, we need to
start passing it instead of i915 to for_each_engine(_masked).

This is another partial step in migration of i915->engines[] to
gt->engines[].

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191017094500.21831-2-tvrtko.ursulin@linux.intel.com
2019-10-18 00:06:27 +01:00
Tvrtko Ursulin a50134b198 drm/i915: Make for_each_engine_masked work on intel_gt
Medium term goal is to eliminate the i915->engine[] array and to get there
we have recently introduced equivalent array in intel_gt. Now we need to
migrate the code further towards this state.

This next step is to eliminate usage of i915->engines[] from the
for_each_engine_masked iterator.

For this to work we also need to use engine->id as index when populating
the gt->engine[] array and adjust the default engine set indexing to use
engine->legacy_idx instead of assuming gt->engines[] indexing.

v2:
  * Populate gt->engine[] earlier.
  * Check that we don't duplicate engine->legacy_idx

v3:
  * Work around the initialization order issue between default_engines()
    and intel_engines_driver_register() which sets engine->legacy_idx for
    now. It will be fixed properly later.

v4:
  * Merge with forgotten v2.5.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com
2019-10-18 00:06:25 +01:00
Chris Wilson e9d4c9245f drm/i915: Store i915_ggtt as the backpointer on fence registers
Now that i915_ggtt knows everything about its own paths to perform mmio,
we can use that as our primary backpointer for individual fence
registers. This reduces the amount of pointer dancing we have to perform
on the common paths, but more importantly finishes our fence register
encapsulation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016143234.4075-1-chris@chris-wilson.co.uk
2019-10-16 19:41:36 +01:00
Chris Wilson 542a5c66e0 drm/i915/gt: Warn CI about an unrecoverable wedge
If we have a wedged GPU that we need to recover, but fail, add a taint
for CI to pickup and schedule a reboot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191002160034.5121-1-chris@chris-wilson.co.uk
2019-10-10 11:19:32 +01:00
Chris Wilson cd6a851385 drm/i915/gt: Prefer local path to runtime powermanagement
Avoid going to the base i915 device when we already have a path from gt
to the runtime powermanagement interface. The benefit is that it looks a
bit more self-consistent to always be acquiring the gt->uncore->rpm for
use with the gt->uncore.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191007154531.1750-1-chris@chris-wilson.co.uk
2019-10-07 21:44:02 +01:00
Colin Ian King b9dcb97b6c drm/i915: make array hw_engine_mask static, makes object smaller
Don't populate the array hw_engine_mask on the stack but instead make it
static. Makes the object code smaller by 316 bytes.

Before:
   text	   data	    bss	    dec	    hex	filename
  34004	   4388	    320	  38712	   9738	gpu/drm/i915/gt/intel_reset.o

After:
   text	   data	    bss	    dec	    hex	filename
  33528	   4548	    320	  38396	   95fc	gpu/drm/i915/gt/intel_reset.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191007154151.23245-1-colin.king@canonical.com
2019-10-07 21:44:02 +01:00
Chris Wilson cb5eb07278 drm/i915/overlay: Drop struct_mutex guard
The overlay uses the modeset mutex to control itself and only required
the struct_mutex for requests, which is now obsolete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-16-chris@chris-wilson.co.uk
2019-10-04 15:39:39 +01:00
Chris Wilson b1e3177bd1 drm/i915: Coordinate i915_active with its own mutex
Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.

This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).

The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.

v2: Add some comments that barely elucidate anything :(

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
2019-10-04 15:39:12 +01:00
Chris Wilson 1d6f1d16d3 drm/i915/gt: Only unwedge if we can reset first
Unwedging the GPU requires a successful GPU reset before we restore the
default submission, or else we may see residual context switch events
that we were not expecting.

v2: Pull in the special-case reset_clobbers_display, and explain why it
should be safe in the context of unwedging.

v3: Just forget all about resets before unwedging if it will clobber the
display; risk it all.

Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190927160335.10622-1-chris@chris-wilson.co.uk
2019-09-30 17:48:14 +01:00
Chris Wilson 4abc6e7c91 drm/i915/selftests: Provide a mock GPU reset routine
For those mock tests that may wish to pretend triggering a GPU reset and
processing the cleanup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190927211749.2181-3-chris@chris-wilson.co.uk
2019-09-27 23:25:46 +01:00
Chris Wilson 260e6b7127 drm/i915: Pass intel_gt to has-reset?
As we execute GPU resets on a gt/ basis, and use the intel_gt as the
primary for all other reset functions, also use it for the has-reset?
predicates. Gradually simplifying the churn of pointers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190927211749.2181-1-chris@chris-wilson.co.uk
2019-09-27 23:25:14 +01:00
Sebastian Andrzej Siewior e3792238c1 drm/i915: Don't disable interrupts for intel_engine_breadcrumbs_irq()
The function intel_engine_breadcrumbs_irq() is always invoked from an interrupt
handler and for that reason it invokes (as an optimisation) only spin_lock()
for locking assuming that the interrupts are already disabled. The
function intel_engine_signal_breadcrumbs() is provided to disable
interrupts while the former function is invoked so that assumption is
also true for callers from preemptible context.

On PREEMPT_RT local_irq_disable() really disables interrupts and this
forbids to invoke spin_lock() which becomes a sleeping spinlock.

This is also problematic with `threadirqs' in conjunction with
irq_work. With force threading the interrupt handler, the handler is
invoked with disabled BH but with interrupts enabled. This is okay and
the lock itself is never acquired in IRQ context. This changes with
irq_work (signal_irq_work()) which _still_ invokes
intel_engine_breadcrumbs_irq() from IRQ context. Lockdep should see this
and complain.

Acquire the locks in intel_engine_breadcrumbs_irq() with _irqsave()
suffix and let all callers invoke intel_engine_breadcrumbs_irq()
directly instead using intel_engine_signal_breadcrumbs().

Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190926105644.16703-2-bigeasy@linutronix.de
2019-09-26 18:44:35 +01:00
Michał Winiarski 5311f5171e drm/i915: Define explicit wedged on init reset state
We're currently using scratch presence as a way of identifying that we
entered wedged state at driver initialization time.
Let's use a separate flag rather than rely on scratch.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190926133142.2838-1-chris@chris-wilson.co.uk
2019-09-26 18:44:35 +01:00
Chris Wilson cb2377a919 drm/i915: Fixup preempt-to-busy vs reset of a virtual request
Due to the nature of preempt-to-busy the execlists active tracking and
the schedule queue may become temporarily desync'ed (between resubmission
to HW and its ack from HW). This means that we may have unwound a
request and passed it back to the virtual engine, but it is still
inflight on the HW and may even result in a GPU hang. If we detect that
GPU hang and try to reset, the hanging request->engine will no longer
match the current engine, which means that the request is not on the
execlists active list and we should not try to find an older incomplete
request. Given that we have deduced this must be a request on a virtual
engine, it is the single active request in the context and so must be
guilty (as the context is still inflight, it is prevented from being
executed on another engine as we process the reset).

Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-2-chris@chris-wilson.co.uk
2019-09-23 20:44:14 +01:00
Daniele Ceraolo Spurio 0d333ac7eb drm/i915: fix SFC reset flow
Our assumption that the we can ask the HW to lock the SFC even if not
currently in use does not match the HW commitment. The expectation from
the HW is that SW will not try to lock the SFC if the engine is not
using it and if we do that the behavior is undefined; on ICL the HW
ends up to returning the ack and ignoring our lock request, but this is
not guaranteed and we shouldn't expect it going forward.

Also, failing to get the ack while the SFC is in use means that we can't
cleanly reset it, so fail the engine reset in that scenario.

v2: drop rmw change, keep the log as debug and handle failure (Chris),
    improve comments (Tvrtko).

Reported-by: Owen Zhang <owen.zhang@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919015330.15435-1-daniele.ceraolospurio@intel.com
2019-09-19 11:04:55 +01:00
Chris Wilson eebab60f22 drm/i915: Don't mix srcu tag and negative error codes
While srcu may use an integer tag, it does not exclude potential error
codes and so may overlap with our own use of -EINTR. Use a separate
outparam to store the tag, and report the error code separately.

Fixes: 2caffbf117 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912160834.30601-1-chris@chris-wilson.co.uk
2019-09-13 15:07:34 +01:00
Tvrtko Ursulin 61fa60ff6e drm/i915: Move GT init to intel_gt.c
Code in i915_gem_init_hw is all about GT init so move it to intel_gt.c
renaming to intel_gt_init_hw.

Existing intel_gt_init_hw is renamed to intel_gt_init_hw_early since it
is currently called from driver probe.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910143823.10686-2-tvrtko.ursulin@linux.intel.com
2019-09-11 08:11:51 +01:00
Chris Wilson 6dcb85a0ad drm/i915: Hold irq-off for the entire fake lock period
Sadly lockdep records when the irqs are re-enabled and then marks up the
fake lock as being irq-unsafe. Our hand is forced and so we must mark up
the entire fake lock critical section as irq-off.

Hopefully this is the last tweak required!

v2: Not quite, we need to mark the timeline spinlock as irqsafe. That
was a genuine bug being hidden by the earlier lockdep splat.

Fixes: d67739268c ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk
2019-08-23 19:44:21 +01:00
Chris Wilson 338aade97c drm/i915/gt: Convert timeline tracking to spinlock
Convert the active_list manipulation of timelines to use spinlocks so
that we can perform the updates from underneath a quick interrupt
callback, if need be.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-2-chris@chris-wilson.co.uk
2019-08-15 23:21:13 +01:00
Jani Nikula 1d455f8de8 drm/i915: rename intel_drv.h to display/intel_display_types.h
Everything about the file is about display, and mostly about types
related to display. Move under display/ as intel_display_types.h to
reflect the facts.

There's still plenty to clean up, but start off with moving the file
where it logically belongs and naming according to contents.

v2: fix the include guard name in the renamed file

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com
2019-08-07 12:43:50 +03:00
Chris Wilson c29579d2fa drm/i915/gem: Make caps.scheduler static
We do not notify userspace when the scheduler capabilities are changed
(due to wedging the driver) and as such userspace will expect the caps
to be static and unchanging. Make it so, and so we only need to compute
our caps once during driver registration.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-1-chris@chris-wilson.co.uk
2019-08-06 15:00:14 +01:00
Daniele Ceraolo Spurio 702668e606 drm/i915/uc: Unify uC platform check
We have several HAS_* checks for GuC and HuC but we mostly use HAS_GUC
and HAS_HUC, with only 1 exception. Since our HW always has either
both uC or neither of them, just replace all the checks with a unified
HAS_UC.

v2: use HAS_GT_UC (Michal)
v3: fix comment (Michal)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725001813.4740-2-daniele.ceraolospurio@intel.com
2019-07-25 07:30:41 +01:00
Chris Wilson c30d5dc653 drm/i915/gt: Push engine stopping into reset-prepare
Push the engine stop into the back reset_prepare (where it already was!)
This allows us to avoid dangerously setting the RING registers to 0 for
logical contexts. If we clear the register on a live context, those
invalid register values are recorded in the logical context state and
replayed (with hilarious results).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190716124931.5870-2-chris@chris-wilson.co.uk
2019-07-17 18:47:00 +01:00
Daniele Ceraolo Spurio ca7b2c1bbe drm/i915/uc: Move intel functions to intel_uc
All the intel_uc_* can now be moved to work on the intel_uc structure
for better encapsulation of uc-related actions.

Note: I've introduced uc_to_gt instead of uc_to_i915 because the aim is
to move everything to be gt-focused in the medium term, so we would've
had to replace it soon anyway.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-8-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-07-13 20:04:36 +01:00
Daniele Ceraolo Spurio 8b5689d7e3 drm/i915/uc: move GuC/HuC inside intel_gt under a new intel_uc
Being part of the GT HW, it make sense to keep the guc/huc structures
inside the GT structure. To help with the encapsulation work done by the
following patches, both structures are placed inside a new intel_uc
container. Although this results in code with ugly nested dereferences
(i915->gt.uc.guc...), it saves us the extra work required in moving
the structures twice (i915 -> gt -> uc). The following patches will
reduce the number of places where we try to access the guc/huc
structures directly from i915 and reduce the ugliness.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-7-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-07-13 20:00:30 +01:00
Daniele Ceraolo Spurio 0f261b241d drm/i915/uc: move GuC and HuC files under gt/uc/
Both microcontrollers are part of the GT HW and are closely related to
GT operations. To keep all the files cleanly together, they've been
placed in their own subdir inside the gt/ folder

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-6-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-07-13 19:58:23 +01:00
Chris Wilson cb823ed991 drm/i915/gt: Use intel_gt as the primary object for handling resets
Having taken the first step in encapsulating the functionality by moving
the related files under gt/, the next step is to start encapsulating by
passing around the relevant structs rather than the global
drm_i915_private. In this step, we pass intel_gt to intel_reset.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
2019-07-12 21:06:56 +01:00
Chris Wilson 092be382a2 drm/i915: Lift intel_engines_resume() to callers
Since the reset path wants to recover the engines itself, it only wants
to reinitialise the hardware using i915_gem_init_hw(). Pull the call to
intel_engines_resume() to the module init/resume path so we can avoid it
during reset.

Fixes: 79ffac8599 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-3-chris@chris-wilson.co.uk
2019-06-26 18:01:01 +01:00
Chris Wilson 18398904ca drm/i915: Only recover active engines
If we issue a reset to a currently idle engine, leave it idle
afterwards. This is useful to excise a linkage between reset and the
shrinker. When waking the engine, we need to pin the default context
image which we use for overwriting a guilty context -- if the engine is
idle we do not need this pinned image! However, this pinning means that
waking the engine acquires the FS_RECLAIM, and so may trigger the
shrinker. The shrinker itself may need to wait upon the GPU to unbind
and object and so may require services of reset; ergo we should avoid
the engine wake up path.

The danger in skipping the recovery for idle engines is that we leave the
engine with no context defined, which may interfere with the operation of
the power context on some older platforms. In practice, we should only
be resetting an active GPU but it something to look out for on Ironlake
(if memory serves).

Fixes: 79ffac8599 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-2-chris@chris-wilson.co.uk
2019-06-26 18:01:01 +01:00
Chris Wilson 5f22e5b311 drm/i915: Rename intel_wakeref_[is]_active
Our general rule is to use is/has as the verb for boolean functions,
rename intel_wakeref_active to intel_wakeref_is_active so the question
being asked is clear.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-6-chris@chris-wilson.co.uk
2019-06-25 20:17:22 +01:00
Chris Wilson 0c91621cad drm/i915/gt: Pass intel_gt to pm routines
Switch from passing the i915 container to newly named struct intel_gt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-2-chris@chris-wilson.co.uk
2019-06-25 20:17:22 +01:00
Tvrtko Ursulin f0c02c1b91 drm/i915: Rename i915_timeline to intel_timeline and move under gt
Move all timeline code under gt and rename to intel_gt prefix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:53 +01:00
Tvrtko Ursulin eaf522f62b drm/i915: Make i915_check_and_clear_faults take intel_gt
Continuing the conversion and elimination of implicit dev_priv.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-6-tvrtko.ursulin@linux.intel.com
2019-06-21 13:48:20 +01:00
Jani Nikula df0566a641 drm/i915: move modesetting core code under display/
Now that we have a new subdirectory for display code, continue by moving
modesetting core code.

display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this
is, again, a surprisingly clean operation.

v2:
- don't move intel_sideband.[ch] (Ville)
- use tabs for Makefile file lists and sort them

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
2019-06-17 11:48:32 +03:00
Chris Wilson 422d7df4f0 drm/i915: Replace engine->timeline with a plain list
To continue the onslaught of removing the assumption of a global
execution ordering, another casualty is the engine->timeline. Without an
actual timeline to track, it is overkill and we can replace it with a
much less grand plain list. We still need a list of requests inflight,
for the simple purpose of finding inflight requests (for retiring,
resetting, preemption etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
2019-06-14 19:03:40 +01:00
Daniele Ceraolo Spurio c447ff7db3 drm/i915: update with_intel_runtime_pm to use the rpm structure
Matching the underlying get/put functions.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Daniele Ceraolo Spurio d858d5695f drm/i915: update rpm_get/put to use the rpm structure
The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
2019-06-14 15:58:33 +01:00
Chris Wilson 84383d2e8d drm/i915: Refine i915_reset.lock_map
We already use a mutex to serialise i915_reset() and wedging, so all we
need it to link that into i915_request_wait() and we have our lock cycle
detection.

v2.5: Take error mutex for selftests

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614071023.17929-3-chris@chris-wilson.co.uk
2019-06-14 15:17:54 +01:00
Chris Wilson 0cf289bd5d drm/i915: Move fence register tracking from i915->mm to ggtt
As the fence registers only apply to regions inside the GGTT is makes
more sense that we track these as part of the i915_ggtt and not the
general mm. In the next patch, we will then pull the register locking
underneath the i915_ggtt.mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
2019-06-13 09:37:39 +01:00
Chris Wilson 33df8a7697 drm/i915: Prevent lock-cycles between GPU waits and GPU resets
We cannot allow ourselves to wait on the GPU while holding any lock as we
may need to reset the GPU. While there is not an explicit lock between
the two operations, lockdep cannot detect the dependency. So let's tell
lockdep about the wait/reset dependency with an explicit lockmap.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612085246.16374-1-chris@chris-wilson.co.uk
2019-06-12 12:06:11 +01:00
Tvrtko Ursulin 6a8cc66ffe drm/i915: Move i915_check_and_clear_faults to intel_reset.c
The code is logically about reset so it makes sense.

It also enables making i915_clear_error_registers static.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607115932.20271-1-tvrtko.ursulin@linux.intel.com
2019-06-10 09:09:26 +01:00
Tvrtko Ursulin f736ae1b10 drm/i915: Extract engine fault reset to a helper
Just tidying the flow a bit.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-4-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:40 +01:00
Tvrtko Ursulin 77a302e043 drm/i915: Make Gen6/7 RING_FAULT_REG access engine centric
Similar to earlier conversions, eliminate the implicit dev_priv by
introducing some helpers which take the engine parameter (since the
register itself is per engine).

v2:
 * Always use parentheses in macro arguments.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607101535.767-1-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:39 +01:00
Tvrtko Ursulin b61ea001b2 drm/i915: Reset only affected engines when handling error capture
Pass down the engine mask to i915_clear_error_registers so only affected
engines can be reset on the Gen6/7 path.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-1-tvrtko.ursulin@linux.intel.com
2019-06-07 12:47:37 +01:00
Chris Wilson 10be98a77c drm/i915: Move more GEM objects under gem/
Continuing the theme of separating out the GEM clutter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
2019-05-28 12:45:29 +01:00