Commit Graph

62 Commits

Author SHA1 Message Date
Simona Vetter b615b9c36c Linux 6.11-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmbeHCQeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGwfwH/ijnVvDWt0L1mpkE
 oIPmKV+2018CA5ww/Hh+ncToWn/aCmrczHc1SEUOk/SbZnGyXJj/6KiNEK6XpJyu
 Hb90y53D5B9jkEq8WPbSy5RtqCU598gYPeBxkczjj431jer9EsZVzqsKxGRzdAud
 2+Ft/qLiVL8AP5P8IPuU7G8CU6OE0fUL5PyuzMGDtstL3R6lPpG+kf/VrJGV1mp7
 DVZO8hKwIi5Vu+ciaTJv+9PMHzXRnMhLIGabtGIzM8nhmrQx/Kv/PMjiEl/OBkmk
 6PzafEkxVtBKDNK2Qhp+QMTQJATuPccZI8Kn6peZhqoNWYHBqx7d88Q/2iiAGj0z
 skPW5Gs=
 =orf8
 -----END PGP SIGNATURE-----

Merge v6.11-rc7 into drm-next

Thomas needs 5a498d4d06 ("drm/fbdev-dma: Only install deferred I/O
if necessary") in drm-misc, so start the backmerge cascade.

Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
2024-09-11 09:18:15 +02:00
Tvrtko Ursulin 9d824c7fce
drm/v3d: Disable preemption while updating GPU stats
We forgot to disable preemption around the write_seqcount_begin/end() pair
while updating GPU stats:

  [ ] WARNING: CPU: 2 PID: 12 at include/linux/seqlock.h:221 __seqprop_assert.isra.0+0x128/0x150 [v3d]
  [ ] Workqueue: v3d_bin drm_sched_run_job_work [gpu_sched]
 <...snip...>
  [ ] Call trace:
  [ ]  __seqprop_assert.isra.0+0x128/0x150 [v3d]
  [ ]  v3d_job_start_stats.isra.0+0x90/0x218 [v3d]
  [ ]  v3d_bin_job_run+0x23c/0x388 [v3d]
  [ ]  drm_sched_run_job_work+0x520/0x6d0 [gpu_sched]
  [ ]  process_one_work+0x62c/0xb48
  [ ]  worker_thread+0x468/0x5b0
  [ ]  kthread+0x1c4/0x1e0
  [ ]  ret_from_fork+0x10/0x20

Fix it.

Cc: Maíra Canal <mcanal@igalia.com>
Cc: stable@vger.kernel.org # v6.10+
Fixes: 6abe93b621 ("drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler")
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Acked-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240813102505.80512-1-tursulin@igalia.com
2024-08-28 11:36:53 -03:00
Daniel Vetter 4461e9e5c3 Linux 6.11-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmbK2B8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGFwkH/10QpUgzIfbFKbF+
 5hwcvaqS5myxWwJ4PjN0eR1qGE6RzVO0Tb24+TVql+7pxu+iWm1kYgC3+/T5xJsP
 ECAszdmPWSco1xaHrh2y3PyCJjaBiqFbIxdjPp7odjDpG9qarbcty8YpWs44u/gd
 RDXzHUuScEShBhEt0ZhvE1pIDL8jJ8JL3yqOMZ+XaDxtJbjaHw4GHp8efxlBWc8N
 jZKIVJi22q5NWG5T0tGtPWwzCm0ewA/JNMTEvE9leoSoAgO85NZ0ivxMC76q/tbj
 BrYk5KnzfhJs4b/n/KtIwWaLTgLyXKGqHMaMq8sbXtp410aUdgnRJO2cl3fI+1vc
 vxQfAfk=
 =RemI
 -----END PGP SIGNATURE-----

Merge v6.11-rc5 into drm-next

amdgpu pr conconflicts due to patches cherry-picked to -fixes, I might
as well catch up with a backmerge and handle them all. Plus both misc
and intel maintainers asked for a backmerge anyway.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-08-27 14:09:45 +02:00
Maíra Canal 497d370a64
drm/v3d: Fix out-of-bounds read in `v3d_csd_job_run()`
When enabling UBSAN on Raspberry Pi 5, we get the following warning:

[  387.894977] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/v3d/v3d_sched.c:320:3
[  387.903868] index 7 is out of range for type '__u32 [7]'
[  387.909692] CPU: 0 PID: 1207 Comm: kworker/u16:2 Tainted: G        WC         6.10.3-v8-16k-numa #151
[  387.919166] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[  387.925961] Workqueue: v3d_csd drm_sched_run_job_work [gpu_sched]
[  387.932525] Call trace:
[  387.935296]  dump_backtrace+0x170/0x1b8
[  387.939403]  show_stack+0x20/0x38
[  387.942907]  dump_stack_lvl+0x90/0xd0
[  387.946785]  dump_stack+0x18/0x28
[  387.950301]  __ubsan_handle_out_of_bounds+0x98/0xd0
[  387.955383]  v3d_csd_job_run+0x3a8/0x438 [v3d]
[  387.960707]  drm_sched_run_job_work+0x520/0x6d0 [gpu_sched]
[  387.966862]  process_one_work+0x62c/0xb48
[  387.971296]  worker_thread+0x468/0x5b0
[  387.975317]  kthread+0x1c4/0x1e0
[  387.978818]  ret_from_fork+0x10/0x20
[  387.983014] ---[ end trace ]---

This happens because the UAPI provides only seven configuration
registers and we are reading the eighth position of this u32 array.

Therefore, fix the out-of-bounds read in `v3d_csd_job_run()` by
accessing only seven positions on the '__u32 [7]' array. The eighth
register exists indeed on V3D 7.1, but it isn't currently used. That
being so, let's guarantee that it remains unused and add a note that it
could be set in a future patch.

Fixes: 0ad5bc1ce4 ("drm/v3d: fix up register addresses for V3D 7.x")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809152001.668314-1-mcanal@igalia.com
2024-08-12 11:14:21 -03:00
Daniel Vetter 91dae758bd drm-misc-next for v6.12:
UAPI Changes:
 
 virtio:
 - Define DRM capset
 
 Cross-subsystem Changes:
 
 dma-buf:
 - heaps: Clean up documentation
 
 printk:
 - Pass description to kmsg_dump()
 
 Core Changes:
 
 CI:
 - Update IGT tests
 - Point upstream repo to GitLab instance
 
 modesetting:
 - Introduce Power Saving Policy property for connectors
 - Add might_fault() to drm_modeset_lock priming
 - Add dynamic per-crtc vblank configuration support
 
 panic:
 - Avoid build-time interference with framebuffer console
 
 docs:
 - Document Colorspace property
 
 scheduler:
 - Remove full_recover from drm_sched_start
 
 TTM:
 - Make LRU walk restartable after dropping locks
 - Allow direct reclaim to allocate local memory
 
 Driver Changes:
 
 amdgpu:
 - Support Power Saving Policy connector property
 
 ast:
 - astdp: Support AST2600 with VGA; Clean up HPD
 
 bridge:
 - Silence error message on -EPROBE_DEFER
 - analogix: Clean aup
 - bridge-connector: Fix double free
 - lt6505: Disable interrupt when powered off
 - tc358767: Make default DP port preemphasis configurable
 
 gma500:
 - Update i2c terminology
 
 ivpu:
 - Add MODULE_FIRMWARE()
 
 lcdif:
 - Fix pixel clock
 
 loongson:
 - Use GEM refcount over TTM's
 
 mgag200:
 - Improve BMC handling
 - Support VBLANK intterupts
 
 nouveau:
 - Refactor and clean up internals
 - Use GEM refcount over TTM's
 
 panel:
 - Shutdown fixes plus documentation
 - Refactor several drivers for better code sharing
 - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
   DT; Fix porch parameter
 - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
   BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
   CMN N116BCP-EA2, CSW MNB601LS1-4
 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
 - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
 - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
   for code sharing
 
 sti:
 - Fix module owner
 
 stm:
 - Avoid UAF wih managed plane and CRTC helpers
 - Fix module owner
 - Fix error handling in probe
 - Depend on COMMON_CLK
 - ltdc: Fix transparency after disabling plane; Remove unused interrupt
 
 tegra:
 - Call drm_atomic_helper_shutdown()
 
 v3d:
 - Clean up perfmon
 
 vkms:
 - Clean up
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmareygACgkQaA3BHVML
 eiO2vwf9FirbMiq4lfHzgcbNIU1dTUtjRAZjrlwGmqk5cb9lUshAMCMBMOEQBDdg
 XMQQj/RMBvRUuxzsPGk78ObSz5FBaBLgKwFprer0V6uslQaJxj4YRsnkp0l2n+0k
 +ebhfo2rUgZOdgNOkXH326w9UhqiydIa7GaA2aq1vUzXKFDfvGXtSN75BMlEWlKP
 rTft56AiwjwcKu7zYFHGlFUMSNpKAQy7lnV3+dBXAfFNHu4zVNoI/yWGEOdR7eVo
 WhiEcpvismsOh+BfUvMNPP3RKwjXHdwMlJYb+v9XGgH27hqc50lSceWydHtoJTto
 DTXF9WQhJ+/GQR9ZGmBjos9GVbECDA==
 =L/1W
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next for v6.12:

UAPI Changes:

virtio:
- Define DRM capset

Cross-subsystem Changes:

dma-buf:
- heaps: Clean up documentation

printk:
- Pass description to kmsg_dump()

Core Changes:

CI:
- Update IGT tests
- Point upstream repo to GitLab instance

modesetting:
- Introduce Power Saving Policy property for connectors
- Add might_fault() to drm_modeset_lock priming
- Add dynamic per-crtc vblank configuration support

panic:
- Avoid build-time interference with framebuffer console

docs:
- Document Colorspace property

scheduler:
- Remove full_recover from drm_sched_start

TTM:
- Make LRU walk restartable after dropping locks
- Allow direct reclaim to allocate local memory

Driver Changes:

amdgpu:
- Support Power Saving Policy connector property

ast:
- astdp: Support AST2600 with VGA; Clean up HPD

bridge:
- Silence error message on -EPROBE_DEFER
- analogix: Clean aup
- bridge-connector: Fix double free
- lt6505: Disable interrupt when powered off
- tc358767: Make default DP port preemphasis configurable

gma500:
- Update i2c terminology

ivpu:
- Add MODULE_FIRMWARE()

lcdif:
- Fix pixel clock

loongson:
- Use GEM refcount over TTM's

mgag200:
- Improve BMC handling
- Support VBLANK intterupts

nouveau:
- Refactor and clean up internals
- Use GEM refcount over TTM's

panel:
- Shutdown fixes plus documentation
- Refactor several drivers for better code sharing
- boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
  DT; Fix porch parameter
- edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
  BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
  CMN N116BCP-EA2, CSW MNB601LS1-4
- himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
- ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
- jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
  for code sharing

sti:
- Fix module owner

stm:
- Avoid UAF wih managed plane and CRTC helpers
- Fix module owner
- Fix error handling in probe
- Depend on COMMON_CLK
- ltdc: Fix transparency after disabling plane; Remove unused interrupt

tegra:
- Call drm_atomic_helper_shutdown()

v3d:
- Clean up perfmon

vkms:
- Clean up

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box
2024-08-08 18:58:46 +02:00
Maxime Ripard a1ff5a7d78
Merge drm/drm-fixes into drm-misc-fixes
Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-07-30 09:09:23 +02:00
Thomas Zimmermann 0e8655b4e8 Merge drm/drm-next into drm-misc-next
Backmerging to get a late RC of v6.10 before moving into v6.11.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-29 09:35:54 +02:00
Christian König 83b501c179 drm/scheduler: remove full_recover from drm_sched_start
This was basically just another one of amdgpus hacks. The parameter
allowed to restart the scheduler without turning fence signaling on
again.

That this is absolutely not a good idea should be obvious by now since
the fences will then just sit there and never signal.

While at it cleanup the code a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com
2024-07-25 14:05:12 +02:00
Tvrtko Ursulin 32df4abc44 drm/v3d: Fix potential memory leak in the performance extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com
(cherry picked from commit 484de39fa5)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:28 +02:00
Tvrtko Ursulin 0e50fcc20b drm/v3d: Fix potential memory leak in the timestamp extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com
(cherry picked from commit 753ce4fea6)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-07-18 15:49:08 +02:00
Maíra Canal 1fe1c66274
drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and later
`args->cfg[4]` is configured in Indirect Dispatch using the number of
batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the
number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not
subtract 1 from the number of batches.

Implement the fix by checking the V3D tech version and revision.

Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch.

Fixes: 18b8413b25 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-2-mcanal@igalia.com
2024-07-15 12:49:52 -03:00
Tvrtko Ursulin 1be825c5c0
drm/v3d: Do not use intermediate storage when copying performance query results
Removing the intermediate buffer removes the last use of the
V3D_MAX_COUNTERS define, which will enable further driver cleanup.

While at it pull the 32 vs 64 bit copying decision outside the loop in
order to reduce the number of conditional instructions.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-9-tursulin@igalia.com
2024-07-13 11:00:32 -03:00
Tvrtko Ursulin c9d6630f7c
drm/v3d: Size the kperfmon_ids array at runtime
Instead of statically reserving pessimistic space for the kperfmon_ids
array, make the userspace extension code allocate the exactly required
amount of space.

Apart from saving some memory at runtime, this also removes the need for
the V3D_MAX_PERFMONS macro whose removal will benefit further driver
cleanup.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-8-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin 484de39fa5
drm/v3d: Fix potential memory leak in the performance extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: bae7cb5d68 ("drm/v3d: Create a CPU job extension for the reset performance query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Tvrtko Ursulin 753ce4fea6
drm/v3d: Fix potential memory leak in the timestamp extension
If fetching of userspace memory fails during the main loop, all drm sync
objs looked up until that point will be leaked because of the missing
drm_syncobj_put.

Fix it by exporting and using a common cleanup helper.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 9ba0ff3e08 ("drm/v3d: Create a CPU job extension for the timestamp query job")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com
2024-07-13 11:00:31 -03:00
Maíra Canal f5b798bdc9
drm/v3d: Use V3D_MAX_COUNTERS instead of V3D_PERFCNT_NUM
V3D_PERFCNT_NUM represents the maximum number of performance counters
for V3D 4.2, but not for V3D 7.1. This means that, if we use
V3D_PERFCNT_NUM, we might go out-of-bounds on V3D 7.1.

Therefore, use the number of performance counters on V3D 7.1 as the
maximum number of counters. This will allow us to create arrays on the
stack with reasonable size. Note that userspace must use the value
provided by DRM_V3D_PARAM_MAX_PERF_COUNTERS.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-6-mcanal@igalia.com
2024-05-20 16:38:03 -03:00
Maíra Canal 6abe93b621
drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler
In V3D, the conclusion of a job is indicated by a IRQ. When a job
finishes, then we update the local and the global GPU stats of that
queue. But, while the GPU stats are being updated, a user might be
reading the stats from sysfs or fdinfo.

For example, on `gpu_stats_show()`, we could think about a scenario where
`v3d->queue[queue].start_ns != 0`, then an interrupt happens, we update
the value of `v3d->queue[queue].start_ns` to 0, we come back to
`gpu_stats_show()` to calculate `active_runtime` and now,
`active_runtime = timestamp`.

In this simple example, the user would see a spike in the queue usage,
that didn't match reality.

In order to address this issue properly, use a seqcount to protect read
and write sections of the code.

Fixes: 09a93cc4f7 ("drm/v3d: Implement show_fdinfo() callback for GPU usage stats")
Reported-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-7-mcanal@igalia.com
2024-04-23 19:32:49 -03:00
Maíra Canal da483d079b
drm/v3d: Create function to update a set of GPU stats
Given a set of GPU stats, that is, a `struct v3d_stats` related to a
queue in a given context, create a function that can update this set
of GPU stats.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-5-mcanal@igalia.com
2024-04-23 19:32:47 -03:00
Maíra Canal b136b1953f
drm/v3d: Create a struct to store the GPU stats
This will make it easier to instantiate the GPU stats variables and it
will create a structure where we can store all the variables that refer
to GPU stats.

Note that, when we created the struct `v3d_stats`, we renamed
`jobs_sent` to `jobs_completed`. This better express the semantics of
the variable, as we are only accounting jobs that have been completed.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-4-mcanal@igalia.com
2024-04-23 19:32:46 -03:00
Maíra Canal 52ce97765c
drm/v3d: Create two functions to update all GPU stats variables
Currently, we manually perform all operations to update the GPU stats
variables. Apart from the code repetition, this is very prone to errors,
as we can see on commit 35f4f8c9fc ("drm/v3d: Don't increment
`enabled_ns` twice").

Therefore, create two functions to manage updating all GPU stats
variables. Now, the jobs only need to call for `v3d_job_update_stats()`
when the job is done and `v3d_job_start_stats()` when starting the job.

Co-developed-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-3-mcanal@igalia.com
2024-04-23 19:32:45 -03:00
Maíra Canal 209e8d2695
drm/v3d: Create a CPU job extension for the copy performance query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A copy performance query job is a job that copy the complete
or partial result of a query to a buffer. In order to copy the result of
a performance query to a buffer, we need to get the values from the
performance monitors.

So, create a user extension for the CPU job that enables the creation
of a copy performance query job. This user extension will allow the creation
of a CPU job that copy the results of a performance query to a BO with the
possibility to indicate the availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-19-mcanal@igalia.com
2023-12-01 09:47:36 -03:00
Maíra Canal bae7cb5d68
drm/v3d: Create a CPU job extension for the reset performance query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A reset performance query job is a job that resets the
performance queries by resetting the values of the perfmons. Moreover,
we also reset the syncobjs related to the availability of the query.

So, create a user extension for the CPU job that enables the creation
of a reset performance job. This user extension will allow the creation of
a CPU job that resets the perfmons values and resets the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-18-mcanal@igalia.com
2023-12-01 09:47:35 -03:00
Maíra Canal 6745f3e44a
drm/v3d: Create a CPU job extension to copy timestamp query to a buffer
A CPU job is a type of job that performs operations that requires CPU
intervention. A copy timestamp query job is a job that copy the complete
or partial result of a query to a buffer. As V3D doesn't provide any
mechanism to obtain a timestamp from the GPU, it is a job that needs
CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a copy timestamp query job. This user extension will allow the creation
of a CPU job that copy the results of a timestamp query to a BO with the
possibility to indicate the timestamp availability with a availability bit.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-17-mcanal@igalia.com
2023-12-01 09:47:31 -03:00
Maíra Canal 34a101e642
drm/v3d: Create a CPU job extension for the reset timestamp job
A CPU job is a type of job that performs operations that requires CPU
intervention. A reset timestamp job is a job that resets the timestamp
queries based on the value offset of the first query. As V3D doesn't
provide any mechanism to obtain a timestamp from the GPU, it is a job
that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a reset timestamp job. This user extension will allow the creation of
a CPU job that resets the timestamp value in the timestamp BO and resets
the availability syncobj.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-16-mcanal@igalia.com
2023-12-01 09:42:47 -03:00
Maíra Canal 9ba0ff3e08
drm/v3d: Create a CPU job extension for the timestamp query job
A CPU job is a type of job that performs operations that requires CPU
intervention. A timestamp query job is a job that calculates the
query timestamp and updates the query availability by signaling a
syncobj. As V3D doesn't provide any mechanism to obtain a timestamp
from the GPU, it is a job that needs CPU intervention.

So, create a user extension for the CPU job that enables the creation
of a timestamp query job. This user extension will allow the creation of
a CPU job that performs the timestamp query calculation and updates the
timestamp BO with the proper value.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-15-mcanal@igalia.com
2023-12-01 09:41:37 -03:00
Maíra Canal 18b8413b25
drm/v3d: Create a CPU job extension for a indirect CSD job
A CPU job is a type of job that performs operations that requires CPU
intervention. An indirect CSD job is a job that, when executed in the
queue, will map the indirect buffer, read the dispatch parameters, and
submit a regular dispatch. Therefore, it is a job that needs CPU
intervention.

So, create a user extension for the CPU job that enables the creation
of an indirect CSD. This user extension will allow the creation of a CSD
job linked to a CPU job. The CPU job will wait for the indirect CSD job
dependencies and, once they are signaled, it will update the CSD job
parameters.

Co-developed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-14-mcanal@igalia.com
2023-12-01 09:40:15 -03:00
Maíra Canal 1fe0879efc
drm/v3d: Create tracepoints to track the CPU job
Create tracepoints to track the three major events of a CPU job
lifetime:
	1. Submission of a `v3d_submit_cpu` IOCTL
	2. Beginning of the execution of a CPU job
	3. Ending of the execution of a CPU job

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-11-mcanal@igalia.com
2023-12-01 09:37:48 -03:00
Melissa Wen aafc1a2bea
drm/v3d: Add a CPU job submission
Create a new type of job, a CPU job. A CPU job is a type of job that
performs operations that requires CPU intervention. The overall idea is
to use user extensions to enable different types of CPU job, allowing the
CPU job to perform different operations according to the type of user
extension. The user extension ID identify the type of CPU job that must
be dealt.

Having a CPU job is interesting for synchronization purposes as a CPU
job has a queue like any other V3D job and can be synchoronized by the
multisync extension.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Co-developed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-9-mcanal@igalia.com
2023-12-01 09:34:19 -03:00
Maíra Canal 509433d814
drm/v3d: Expose the total GPU usage stats on sysfs
The previous patch exposed the accumulated amount of active time per
client for each V3D queue. But this doesn't provide a global notion of
the GPU usage.

Therefore, provide the accumulated amount of active time for each V3D
queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs
submitted to the queue, independent of the client.

This data is exposed through the sysfs interface, so that if the
interface is queried at two different points of time the usage percentage
of each of the queues can be calculated.

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com
2023-11-06 10:09:29 -03:00
Maíra Canal 09a93cc4f7
drm/v3d: Implement show_fdinfo() callback for GPU usage stats
This patch exposes the accumulated amount of active time per client
through the fdinfo infrastructure. The amount of active time is exposed
for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN.

In order to calculate the amount of active time per client, a CPU clock
is used through the function local_clock(). The point where the jobs has
started is marked and is finally compared with the time that the job had
finished.

Moreover, the number of jobs submitted to each queue is also exposed on
fdinfo through the identifier "v3d-jobs-<queue>".

Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com
2023-11-06 10:09:23 -03:00
Iago Toral Quiroga 0ad5bc1ce4
drm/v3d: fix up register addresses for V3D 7.x
This patch updates a number of register addresses that have
been changed in Raspberry Pi 5 (V3D 7.1) and updates the
code to use the corresponding registers and addresses based
on the actual V3D version.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-3-itoral@igalia.com
2023-11-02 08:54:39 -03:00
Matthew Brost a6149f0393 drm/sched: Convert drm scheduler to use a work queue rather than kthread
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
  - (Rob Clark) Fix msm build
  - Pass in run work queue
v3:
  - (Boris) don't have loop in worker
v4:
  - (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
  - (Boris) default to ordered work queue
v6:
  - (Luben / checkpatch) fix alignment in msm_ringbuffer.c
  - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
  - (Luben) Update comment for drm_sched_wqueue_enqueue
  - (Luben) Positive check for submit_wq in drm_sched_init
  - (Luben) s/alloc_submit_wq/own_submit_wq
v7:
  - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
  - (Luben) Adjust var names / comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
2023-11-01 17:29:21 -04:00
Luben Tuikov 56e449603f drm/sched: Convert the GPU scheduler to variable number of run-queues
The GPU scheduler has now a variable number of run-queues, which are set up at
drm_sched_init() time. This way, each driver announces how many run-queues it
requires (supports) per each GPU scheduler it creates. Note, that run-queues
correspond to scheduler "priorities", thus if the number of run-queues is set
to 1 at drm_sched_init(), then that scheduler supports a single run-queue,
i.e. single "priority". If a driver further sets a single entity per
run-queue, then this creates a 1-to-1 correspondence between a scheduler and
a scheduled entity.

Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com
2023-10-26 12:03:47 -04:00
Melissa Wen 7e302637ba drm/v3d: centralize error handling when init scheduler fails
Remove redundant error message (since now it is very similar to what
we do in drm_sched_init) and centralize all error handling in a
unique place, as we follow the same steps in any case of failure.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228181647.3794298-1-mwen@igalia.com
2022-03-01 22:37:13 -01:00
Jiawei Gu 8ab62eda17 drm/sched: Add device pointer to drm_gpu_scheduler
Add device pointer so scheduler's printing can use
DRM_DEV_ERROR() instead, which makes life easier under multiple GPU
scenario.

v2: amend all calls of drm_sched_init()
v3: fill dev pointer for all drm_sched_init() calls

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com
2022-02-23 10:04:14 +01:00
Daniel Vetter da3208e863 drm/v3d: Use scheduler dependency handling
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

v2: Rebase over renamed function names for adding dependencies.

Reviewed-by: Melissa Wen <mwen@igalia.com> (v1)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-11-daniel.vetter@ffwll.ch
2021-08-30 10:58:47 +02:00
Daniel Vetter 916044fac8 drm/v3d: Move drm_sched_job_init to v3d_job_init
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

v4: Rebase over perfmon patch

Reviewed-by: Melissa Wen <mwen@igalia.com> (v3)
Acked-by: Emma Anholt <emma@anholt.net>
Cc: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-10-daniel.vetter@ffwll.ch
2021-08-30 10:58:40 +02:00
Juan A. Suarez Romero 26a4dc29b7 drm/v3d: Expose performance counters to userspace
The V3D engine has several hardware performance counters that can of
interest for userspace performance analysis tools.

This exposes new ioctls to create and destroy performance monitor
objects, as well as to query the counter values.

Each created performance monitor object has an ID that can be attached
to CL/CSD submissions, so the driver enables the requested counters when
the job is submitted, and updates the performance monitor values when
the job is done.

It is up to the user to ensure all the jobs have been finished before
getting the performance monitor values. It is also up to the user to
properly synchronize BCL jobs when submitting jobs with different
performance monitors attached.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
To: dri-devel@lists.freedesktop.org
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608111541.461991-1-jasuarez@igalia.com
2021-07-21 00:19:59 +01:00
Boris Brezillon 78efe21b6f drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

v5:
* Add a new paragraph to the timedout_job() method

v3:
* New patch

v4:
* Actually use the timeout_wq to queue the timeout work

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
2021-07-01 08:53:25 +02:00
Christian König f2f12eb9c3 drm/scheduler: provide scheduler score externally
Allow multiple schedulers to share the load balancing score.

This is useful when one engine has different hw rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com
2021-02-05 10:47:11 +01:00
Christian König 576a08e008 drm/v3d/v3d_sched: fix scheduler callbacks return status
Looks like this was not correctly adjusted.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: a6a1f036c7 ("drm/scheduler: Job timeout handler returns status (v3)")
Link: https://patchwork.freedesktop.org/patch/msgid/20210201091159.177853-1-christian.koenig@amd.com
2021-02-02 11:10:14 +01:00
Luben Tuikov a6a1f036c7 drm/scheduler: Job timeout handler returns status (v3)
This patch does not change current behaviour.

The driver's job timeout handler now returns
status indicating back to the DRM layer whether
the device (GPU) is no longer available, such as
after it's been unplugged, or whether all is
normal, i.e. current behaviour.

All drivers which make use of the
drm_sched_backend_ops' .timedout_job() callback
have been accordingly renamed and return the
would've-been default value of
DRM_GPU_SCHED_STAT_NOMINAL to restart the task's
timeout timer--this is the old behaviour, and is
preserved by this patch.

v2: Use enum as the status of a driver's job
    timeout callback method.

v3: Return scheduler/device information, rather
    than task information.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Eric Anholt <eric@anholt.net>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/415095/
2021-01-29 11:30:22 +01:00
Lee Jones d49c4b2c07 drm/v3d/v3d_sched: Demote non-conformant kernel-doc header
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 'sched_job' not described in 'v3d_job_dependency'
 drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 's_entity' not described in 'v3d_job_dependency'

Cc: Eric Anholt <eric@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-37-lee.jones@linaro.org
2020-11-18 11:51:27 +01:00
Mauro Carvalho Chehab e9d2871f69 drm: fix some kernel-doc markups
Some identifiers have different names between their prototypes
and the kernel-doc markup.

Others need to be fixed, as kernel-doc markups should use this format:
        identifier - description

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/12d4ca26f6843618200529ce5445063734d38c04.1605521731.git.mchehab+huawei@kernel.org
2020-11-16 20:48:20 +01:00
Daniel Vetter bc662528e2 drm/v3d: Delete v3d_dev->dev
We already have it in v3d_dev->drm.dev with zero additional pointer
chasing. Personally I don't like duplicated pointers like this
because:
- reviewers need to check whether the pointer is for the same or
  different objects if there's multiple
- compilers have an easier time too

But also a bit a bikeshed, so feel free to ignore.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch
2020-04-28 15:15:52 +02:00
Christian König 5918045c4e drm/scheduler: rework job destruction
We now destroy finished jobs from the worker thread to make sure that
we never destroy a job currently in timeout processing.
By this we avoid holding lock around ring mirror list in drm_sched_stop
which should solve a deadlock reported by a user.

v2: Remove unused variable.
v4: Move guilty job free into sched code.
v5:
Move sched->hw_rq_count to drm_sched_start to account for counter
decrement in drm_sched_stop even when we don't call resubmit jobs
if guily job did signal.
v6: remove unused variable

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692

Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com
2019-05-02 15:45:48 -05:00
Eric Anholt dffa9b7a78 drm/v3d: Add missing implicit synchronization.
It is the expectation of existing userspace (X11 + Mesa, in
particular) that jobs submitted to the kernel against a shared BO will
get implicitly synchronized by their submission order.  If we want to
allow clever userspace to disable implicit synchronization, we should
do that under its own submit flag (as amdgpu and lima do).

Note that we currently only implicitly sync for the rendering pass,
not binning -- if you texture-from-pixmap in the binning vertex shader
(vertex coordinate generation), you'll miss out on synchronization.

Fixes flickering when multiple clients are running in parallel,
particularly GL apps and compositors.

v2: Fix a missing refcount on the CSD done fence for L2 cleaning.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:16 -07:00
Eric Anholt d223f98f02 drm/v3d: Add support for compute shader dispatch.
The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run.  However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).

This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).

So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).

v2: Fix excessive allocation for the clean_job (reported by Dan
    Carpenter).  Keep refs on jobs until clean_job is finished, to
    avoid spurious MMU errors if the output BOs are freed by userspace
    before L2 cleaning is finished.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:10 -07:00
Eric Anholt a783a09ee7 drm/v3d: Refactor job management.
The CL submission had two jobs embedded in an exec struct.  When I
added TFU support, I had to replicate some of the exec stuff and some
of the job stuff.  As I went to add CSD, it became clear that actually
what was in exec should just be in the two CL jobs, and it would let
us share a lot more code between the 4 queues.

v2: Fix missing error path in TFU ioctl's bo[] allocation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18 09:54:07 -07:00
Eric Anholt 3f0b646e1a drm/v3d: Rename the fence signaled from IRQs to "irq_fence".
We have another thing called the "done fence" that tracks when the
scheduler considers the job done, and having the shared name was
confusing.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net
Reviewed-by: Dave Emett <david.emett@broadcom.com>
2019-04-01 10:44:34 -07:00