Commit Graph

462 Commits

Author SHA1 Message Date
Linus Torvalds 6863c8385c Updates for the interrupt core and treewide cleanups:
- Rework of the Per Processor Interrupt (PPI) management on ARM[64].
 
     PPI support was built under the assumption that the systems are
     homogenous so that the same CPU local device types are connected to
     them. That's unfortunately wishful thinking and created horrible
     workarounds.
 
     This rework provides affinity management for PPIs so that they can be
     individually configured in the firmware tables and mops up the related
     drivers all over the place.
 
   - Prevent CPUSET/isolation changes to arbitrarily affine interrupt
     threads to random CPUs, which ignores user or driver settings.
 
   - Plug a harmless race in the interrupt affinity proc interface, which
     allows to see a half updated mask
 
   - Adjust the priority of secondary interrupt threads on RT, so that the
     combination of primary and secondary thread emulates the hardware
     interrupt plus thread scenario. Having them at the same priority can
     cause starvation issues in some drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmksv3oTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoe5+D/wNnBaX9LRajuLOF+zaYw5WZxkzp6U7
 X4AP3cLny8xynI1kM5V8M1ym3Fspk0hiqxNX2LLXrSZzBR+3O4uGCyCceBXeHKo2
 vW4auUXG4MB+2sZyudQXaBpNK4A2YBubycTUcRECjkjDkBPAWgN7J+Oz2lXUSUcH
 zlitlHNo48hnZQPAJr4PDpi5q9+rChn+8/s+K1d8NlEf9HOXC98qzyMuMq+jHdJE
 AQ6tKoHkA5lHjHAUY3AbWptoHo1Wp+p5PSqsrFr6nbKuPlhUqRNEPXX0Z8q7aUTj
 NgdkvIHJVJ0C+T40FIWCNzUYOUk4gTQXBSPvptwJSHAmf9ovp+Kg2ltVZBzyL2iI
 R0EZSQAQU8iJcRrqjcAYqI36LkmwwVT6RD1zFa98xJT/AjsMpAt/U1pEMDtkoTKe
 Lv7ZQ/hloc+4wV4xS4zEtoV/ukdUfA9aEdXsh5hNH/07tvatpKO2LgortsiI+lCK
 76vAULcGvbMr5Jr63snjICgstahunpNMRn2HmnGAjmdZf4+g+TDvZR4DI6bswtuO
 jp5G6OM30Z9zKheAr1VioV1XAKr6Y4jDKVjfFy/n1k5pDwYaSJopmZxSD35aas4e
 VqWizAzc5dAVCYRlzr6S1lrMQ2JJRg0RpIn+sMS8dhf9SK7hs5ilGSOsgX1fgVat
 1N3WXvYM8vSW+g==
 =zrA1
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq core updates from Thomas Gleixner:
 "Updates for the interrupt core and treewide cleanups:

   - Rework of the Per Processor Interrupt (PPI) management on ARM[64]

     PPI support was built under the assumption that the systems are
     homogenous so that the same CPU local device types are connected to
     them. That's unfortunately wishful thinking and created horrible
     workarounds.

     This rework provides affinity management for PPIs so that they can
     be individually configured in the firmware tables and mops up the
     related drivers all over the place.

   - Prevent CPUSET/isolation changes to arbitrarily affine interrupt
     threads to random CPUs, which ignores user or driver settings.

   - Plug a harmless race in the interrupt affinity proc interface,
     which allows to see a half updated mask

   - Adjust the priority of secondary interrupt threads on RT, so that
     the combination of primary and secondary thread emulates the
     hardware interrupt plus thread scenario. Having them at the same
     priority can cause starvation issues in some drivers"

* tag 'irq-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  genirq: Remove cpumask availability check on kthread affinity setting
  genirq: Fix interrupt threads affinity vs. cpuset isolated partitions
  genirq: Prevent early spurious wake-ups of interrupt threads
  genirq: Use raw_spinlock_irq() in irq_set_affinity_notifier()
  genirq/manage: Reduce priority of forced secondary interrupt handler
  genirq/proc: Fix race in show_irq_affinity()
  genirq: Fix percpu_devid irq affinity documentation
  perf: arm_pmu: Kill last use of per-CPU cpu_armpmu pointer
  irqdomain: Kill of_node_to_fwnode() helper
  genirq: Kill irq_{g,s}et_percpu_devid_partition()
  irqchip: Kill irq-partition-percpu
  irqchip/apple-aic: Drop support for custom PMU irq partitions
  irqchip/gic-v3: Drop support for custom PPI partitions
  coresight: trbe: Request specific affinities for per CPU interrupts
  perf: arm_spe_pmu: Request specific affinities for per CPU interrupts
  perf: arm_pmu: Request specific affinities for per CPU NMIs/interrupts
  genirq: Add request_percpu_irq_affinity() helper
  genirq: Allow per-cpu interrupt sharing for non-overlapping affinities
  genirq: Update request_percpu_nmi() to take an affinity
  genirq: Add affinity to percpu_devid interrupt requests
  ...
2025-12-02 09:14:26 -08:00
Frederic Weisbecker 3de5e46e50 genirq: Remove cpumask availability check on kthread affinity setting
Failing to allocate the affinity mask of an interrupt descriptor fails the
whole descriptor initialization. It is then guaranteed that the cpumask is
always available whenever the related interrupt objects are alive, such as
the kthread handler.

Therefore remove the superfluous check since it is merely a historical
leftover. Get rid also of the comments above it that are obsolete and
useless.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251121143500.42111-4-frederic@kernel.org
2025-11-22 09:26:18 +01:00
Frederic Weisbecker 801afdfbfc genirq: Fix interrupt threads affinity vs. cpuset isolated partitions
When a cpuset isolated partition is created / updated or destroyed, the
interrupt threads are affined blindly to all the non-isolated CPUs. This
happens without taking into account the interrupt threads initial affinity
that becomes ignored.

For example in a system with 8 CPUs, if an interrupt and its kthread are
initially affine to CPU 5, creating an isolated partition with only CPU 2
inside will eventually end up affining the interrupt kthread to all CPUs
but CPU 2 (that is CPUs 0,1,3-7), losing the kthread preference for CPU 5.

Besides the blind re-affining, this doesn't take care of the actual low
level interrupt which isn't migrated. As of today the only way to isolate
non managed interrupts, along with their kthreads, is to overwrite their
affinity separately, for example through /proc/irq/

To avoid doing that manually, future development should focus on updating
the interrupt's affinity whenever cpuset isolated partitions are updated.

In the meantime, cpuset shouldn't fiddle with interrupt threads directly.
To prevent from that, set the PF_NO_SETAFFINITY flag to them.

This is done through kthread_bind_mask() by affining them initially to all
possible CPUs as at that point the interrupt is not started up which means
the affinity of the hard interrupt is not known. The thread will adjust
that once it reaches the handler, which is guaranteed to happen after the
initial affinity of the hard interrupt is established.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251121143500.42111-3-frederic@kernel.org
2025-11-22 09:26:18 +01:00
Chengkaitao 9d3faec60b genirq: Use raw_spinlock_irq() in irq_set_affinity_notifier()
Since irq_set_affinity_notifier() may sleep, interrupts are enabled. So
raw_spinlock_irqsave() can be replaced with raw_spinlock_irq().

Signed-off-by: Chengkaitao <chengkaitao@kylinos.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251118012754.61805-1-pilgrimtao@gmail.com
2025-11-18 16:19:40 +01:00
Lukas Wunner 51d0656959 genirq/manage: Reduce priority of forced secondary interrupt handler
Crystal reports that the PCIe Advanced Error Reporting driver gets stuck
in an infinite loop on PREEMPT_RT:

Both the primary interrupt handler aer_irq() as well as the secondary
handler aer_isr() are forced into threads with identical priority.
Crystal writes that on the ARM system in question, the primary handler
has to clear an error in the Root Error Status register...

   "before the next error happens, or else the hardware will set the
    Multiple ERR_COR Received bit.  If that bit is set, then aer_isr()
    can't rely on the Error Source Identification register, so it scans
    through all devices looking for errors -- and for some reason, on
    this system, accessing the AER registers (or any Config Space above
    0x400, even though there are capabilities located there) generates
    an Unsupported Request Error (but returns valid data).  Since this
    happens more than once, without aer_irq() preempting, it causes
    another multi error and we get stuck in a loop."

The issue does not show on non-PREEMPT_RT because the primary handler
runs in hardirq context and thus can preempt the threaded secondary
handler, clear the Root Error Status register and prevent the secondary
handler from getting stuck.

Emulate the same behavior on PREEMPT_RT by assigning a lower default
priority to the secondary handler if the primary handler is forced into
a thread.

Reported-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Crystal Wood <crwood@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/f6dcdb41be2694886b8dbf4fe7b3ab89e9d5114c.1761569303.git.lukas@wunner.de
Closes: https://lore.kernel.org/r/20250902224441.368483-1-crwood@redhat.com/
2025-11-01 21:30:02 +01:00
Marc Zyngier bdf4e2ac29 genirq: Allow per-cpu interrupt sharing for non-overlapping affinities
Interrupt sharing for percpu-devid interrupts is forbidden, and for good
reasons. These are interrupts generated *from* a CPU and handled by itself
(timer, for example). Nobody in their right mind would put two devices on
the same pin (and if they have, they get to keep the pieces...).

But this also prevents more benign cases, where devices are connected
to groups of CPUs, and for which the affinities are not overlapping.
Effectively, the only thing they share is the interrupt number, and
nothing else.

Tweak the definition of IRQF_SHARED applied to percpu_devid interrupts to
allow this particular use case. This results in extra validation at the
point of the interrupt being setup and freed, as well as a tiny bit of
extra complexity for interrupts at handling time (to pick the correct
irqaction).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-17-maz@kernel.org
2025-10-27 17:16:35 +01:00
Marc Zyngier b9c6aa9efc genirq: Update request_percpu_nmi() to take an affinity
Continue spreading the notion of affinity to the per CPU interrupt request
code by updating the call sites that use request_percpu_nmi() (all two of
them) to take an affinity pointer. This pointer is firmly NULL for now.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-16-maz@kernel.org
2025-10-27 17:16:35 +01:00
Marc Zyngier 258e7d28a3 genirq: Add affinity to percpu_devid interrupt requests
Add an affinity field to both the irqaction structure and the interrupt
request primitives. Nothing is making use of it yet, and the only value
used it NULL, which is used as a shorthand for cpu_possible_mask.

This will shortly get used with actual affinities.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Link: https://patch.msgid.link/20251020122944.3074811-15-maz@kernel.org
2025-10-27 17:16:34 +01:00
Marc Zyngier 9047a39daa genirq: Factor-in percpu irqaction creation
Move the code creating a per-cpu irqaction into its own helper, so that
future changes to this code can be kept localised.

At the same time, fix the documentation which appears to say the wrong
thing when it comes to interrupts being automatically enabled
(percpu_devid interrupts never are).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251020122944.3074811-14-maz@kernel.org
2025-10-27 17:16:34 +01:00
Charles Keepax ef3330b99c genirq/manage: Add buslock back in to enable_irq()
The locking was changed from a buslock to a plain lock, but the patch
description states there was no functional change. Assuming this was
accidental so reverting to using the buslock.

Fixes: bddd10c554 ("genirq/manage: Rework enable_irq()")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251023154901.1333755-4-ckeepax@opensource.cirrus.com
2025-10-24 11:38:39 +02:00
Charles Keepax 56363e25f7 genirq/manage: Add buslock back in to __disable_irq_nosync()
The locking was changed from a buslock to a plain lock, but the patch
description states there was no functional change. Assuming this was
accidental so reverting to using the buslock.

Fixes: 1b74444467 ("genirq/manage: Rework __disable_irq_nosync()")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251023154901.1333755-3-ckeepax@opensource.cirrus.com
2025-10-24 11:38:39 +02:00
Jon Hunter 58eb5721a4 genirq/manage: Use the correct lock guard in irq_set_irq_wake()
Commit 8589e325ba ("genirq/manage: Rework irq_set_irq_wake()") updated
the irq_set_irq_wake() to use the new guards for locking the interrupt
descriptor.

However, in doing so it inadvertently changed irq_set_irq_wake() such that
the 'chip_bus_lock' is no longer acquired. This has caused system suspend
tests to fail on some Tegra platforms.

Fix this by correcting the guard used in irq_set_irq_wake() to ensure the
'chip_bus_lock' is held.

Fixes: 8589e325ba ("genirq/manage: Rework irq_set_irq_wake()")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250514095041.1109783-1-jonathanh@nvidia.com
2025-05-14 12:05:58 +02:00
Thomas Gleixner 97f4b999e0 genirq: Use scoped_guard() to shut clang up
This code pattern trips clang up:

     if (fail)
     	goto undo;

     guard(lock)(lock);
     do_stuff();
     return 0;

undo:
     ...

as it somehow extends the scope of the guard beyond the return statement.

Replace it with a scoped guard to help it to get its act together.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Closes: https://lore.kernel.org/oe-kbuild-all/202505071809.ajpPxfoZ-lkp@intel.com/
2025-05-07 17:08:44 +02:00
Dr. David Alan Gilbert aefc11550e genirq: Remove unused remove_percpu_irq()
remove_percpu_irq() has been unused since it was added in 2011 by
commit 31d9d9b6d8 ("genirq: Add support for per-cpu dev_id interrupts")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250420164656.112641-1-linux@treblig.org
2025-05-07 11:35:41 +02:00
Thomas Gleixner 193879e28b genirq/manage: Rework irq_set_irqchip_state()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.670808288@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner 782249a997 genirq/manage: Rework irq_get_irqchip_state()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.612184618@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner 5fec6d5cd2 genirq/manage: Rework teardown_percpu_nmi()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.552884529@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner 65dd1f7ca9 genirq/manage: Rework prepare_percpu_nmi()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.494561120@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner 8e3f672b19 genirq/manage: Rework disable_percpu_irq()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.435932527@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner b171f712d6 genirq/manage: Rework irq_percpu_is_enabled()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.376836282@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner 508bd94c3a genirq/manage: Rework enable_percpu_irq()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.315844964@linutronix.de
2025-05-07 09:08:16 +02:00
Thomas Gleixner 90140d08ac genirq/manage: Rework irq_set_parent()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.258216558@linutronix.de
2025-05-07 09:08:15 +02:00
Thomas Gleixner a1ceb83141 genirq/manage: Rework can_request_irq()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

Make the return value boolean to reflect it's meaning.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.187250840@linutronix.de
2025-05-07 09:08:15 +02:00
Thomas Gleixner 8589e325ba genirq/manage: Rework irq_set_irq_wake()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/87ldrhq0hc.ffs@tglx
2025-05-07 09:08:15 +02:00
Thomas Gleixner bddd10c554 genirq/manage: Rework enable_irq()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.071157729@linutronix.de
2025-05-07 09:08:15 +02:00
Thomas Gleixner 1b74444467 genirq/manage: Rework __disable_irq_nosync()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065422.013088277@linutronix.de
2025-05-07 09:08:15 +02:00
Thomas Gleixner 55ac0ad22f genirq/manage: Rework irq_set_vcpu_affinity()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/87ikmlq0fk.ffs@tglx
2025-05-07 09:08:15 +02:00
Thomas Gleixner 7e04e5c6f6 genirq/manage: Rework __irq_apply_affinity_hint()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065421.897188799@linutronix.de
2025-05-07 09:08:15 +02:00
Thomas Gleixner b0561582ea genirq/manage: Rework irq_update_affinity_desc()
Use the new guards to get and lock the interrupt descriptor and tidy up the
code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065421.830357569@linutronix.de
2025-05-07 09:08:15 +02:00
Thomas Gleixner 17c1953567 genirq/manage: Convert to lock guards
Convert lock/unlock pairs to guards.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065421.771476066@linutronix.de
2025-05-07 09:08:14 +02:00
Thomas Gleixner 0c169edf36 genirq/manage: Cleanup kernel doc comments
Get rid of the extra tab to make it consistent.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250429065421.710273122@linutronix.de
2025-05-07 09:08:14 +02:00
Thomas Gleixner 827bafd527 genirq: Make a few functions static
None of these functions are used outside of their source files.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/878qpe2gnx.ffs@tglx
2025-03-10 10:01:20 +01:00
Andy Shevchenko 429f49ad36 genirq: Reuse irq_thread_fn() for forced thread case
rq_forced_thread_fn() uses the same action callback as the non-forced
variant but with different locking decorations.  Reuse irq_thread_fn() here
to make that clear.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241119104339.2112455-3-andriy.shevchenko@linux.intel.com
2024-12-03 11:59:10 +01:00
Andy Shevchenko 6f8b79683d genirq: Move irq_thread_fn() further up in the code
In a preparation to reuse irq_thread_fn() move it further up in the
code. No functional change intended.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241119104339.2112455-2-andriy.shevchenko@linux.intel.com
2024-12-03 11:59:10 +01:00
Marc Zyngier 64b6d1d7a8 genirq: Get rid of global lock in irq_do_set_affinity()
Kunkun Jiang reports that for a workload involving the simultaneous startup
of a large number of VMs (for a total of about 200 vcpus), a lot of CPU
time gets spent on spinning on the tmp_mask_lock that exists as a static
raw spinlock in irq_do_set_affinity(). This lock protects a global cpumask
(tmp_mask) that is used as a temporary variable to compute the resulting
affinity.

While this is triggered by KVM issuing a irq_set_affinity() call each time
a vcpu is about to execute, it is obvious that having a single global
resource is not very scalable.

Since a cpumask can be a fairly large structure on systems with a high core
count, a stack allocation is not really appropriate.  Instead, turn the
global cpumask into a per-CPU variable, removing the need for locking
altogether as the code is executed with preemption and interrupts disabled.

[ tglx: Moved the per CPU variable declaration outside of the function ]

Reported-by: Kunkun Jiang <jiangkunkun@huawei.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kunkun Jiang <jiangkunkun@huawei.com>
Link: https://lore.kernel.org/all/20240826080618.3886694-1-maz@kernel.org
Link: https://lore.kernel.org/all/a7fc58e4-64c2-77fc-c1dc-f5eb78dbbb01@huawei.com
2024-08-27 13:54:15 +02:00
Frederic Weisbecker 68cbd415dd task_work: s/task_work_cancel()/task_work_cancel_func()/
A proper task_work_cancel() API that actually cancels a callback and not
*any* callback pointing to a given function is going to be needed for
perf events event freeing. Do the appropriate rename to prepare for
that.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-2-frederic@kernel.org
2024-07-09 13:26:31 +02:00
Linus Torvalds 6bfd2d442a Updates for the interrupt subsystem:
- Core code:
 
    - Interrupt storm detection for the lockup watchdog:
 
      Lockups which are caused by interrupt storms are not easy to debug
      because there is no information about the events which make the lockup
      detector trigger.
 
      To make this more user friendly, provide an extenstion to interrupt
      statistics which allows to take snapshots and an interface to retrieve
      the delta to the snapshot. Use this new mechanism in the watchdog code
      to do a two stage lockup analysis by taking the snapshot and printing
      the deltas for the topmost active interrupts on the second trigger.
 
      Note: This contains both the interrupt and the watchdog changes as
      the latter depend on the former obviously.
 
   - Avoid summation loops in the /proc/interrupts output and use the global
     counter when possible
 
   - Skip suspended interrupts on CPU hotplug operations to ensure that they
     are not delivered before the system resumes the device drivers when
     coming out of suspend.
 
   - On CPU hot-unplug interrupts which are affine to the outgoing CPU are
     migrated to a different CPU in the affinity mask. This can fail when
     the CPUs have no vectors left. Instead of giving up try to migrate it
     to any online CPU and thereby breaking the affinity setting in order to
     prevent a stale device interrupt which targets an offline CPU
 
   - The usual small cleanups
 
  - Driver code:
 
   - Support for the RISCV AIA MSI controller
 
   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion
 
   - The usual set of cleanups and fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmZBCM0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeZHEACqMLN3K+1HyWflYtcTHJeYCjZLHS77
 2tQeKaaskOA4W6dcGXPxMw5CHqAobHVQQMqgcJxhUdqQiOJnFFnrtCD7JtqM0hWK
 UORNbyeovuhAo+iJ0fTuS8p63H7vm2GIWwBLWJnOuChYv/6Yyx5Cald1skvyvbzL
 zePhiiAf5mkdmJMeT5wJSCqEWSRYOXsVAJ/0YAwFG3bKkJH3bmDo6SDJY02sXT5P
 pjbtD/0hum9wIVT4fNdYleHHQMdBdj9dLlcxXBikHq50mDMw7GxvjKiLcXmoerw3
 rEBfVVJp3qpSofpNJZ3HH0ywcF3yUzq04/LPE9Tk2MoQ8NF0GzP8r9Ahke4B7cUj
 FysWNiAlC2IisEi6th313FZkTLx0zgewdsdEBTLt8eAE9TU0wamRbo99LZ8i/Qr3
 hk7jV8DzL+EDQJLgl4p1iPJgA708eW17tbCxLEa15VKVV6P58miohmhx/IfPO2Gx
 FV1PPehtItsmiK/UoRtUCoFdFsqNQtOE+h8DWLyy8RDmhBqGbn9Ut4euXiQIF+rX
 WJKPFfslCTR39BrBcZnZeNsgOCN7tEfFRstzjzkey1DaeTGWtxmA5UGhpC2vT74y
 YyXluvZlgKr4S64ABmcqQj++hQLho0OQAih3uW5YVxt4VxEUcXYMJOsV1AQGpMjF
 UnewWH5opBQdfw==
 =jFLf
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:
 "Core code:

   - Interrupt storm detection for the lockup watchdog:

     Lockups which are caused by interrupt storms are not easy to debug
     because there is no information about the events which make the
     lockup detector trigger.

     To make this more user friendly, provide an extenstion to interrupt
     statistics which allows to take snapshots and an interface to
     retrieve the delta to the snapshot. Use this new mechanism in the
     watchdog code to do a two stage lockup analysis by taking the
     snapshot and printing the deltas for the topmost active interrupts
     on the second trigger.

     Note: This contains both the interrupt and the watchdog changes as
     the latter depend on the former obviously.

   - Avoid summation loops in the /proc/interrupts output and use the
     global counter when possible

   - Skip suspended interrupts on CPU hotplug operations to ensure that
     they are not delivered before the system resumes the device drivers
     when coming out of suspend.

   - On CPU hot-unplug interrupts which are affine to the outgoing CPU
     are migrated to a different CPU in the affinity mask. This can fail
     when the CPUs have no vectors left. Instead of giving up try to
     migrate it to any online CPU and thereby breaking the affinity
     setting in order to prevent a stale device interrupt which targets
     an offline CPU

   - The usual small cleanups

  Driver code:

   - Support for the RISCV AIA MSI controller

   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion

   - The usual set of cleanups and fixes all over the place"

* tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_alloc
  cpuidle: Avoid explicit cpumask allocation on stack
  irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  cpumask: Introduce cpumask_first_and_and()
  irqchip/irq-brcmstb-l2: Avoid saving mask on shutdown
  genirq: Reuse irq_is_nmi()
  genirq/cpuhotplug: Retry with cpu_online_mask when migration fails
  genirq/cpuhotplug: Skip suspended interrupts when restoring affinity
  arm64: dts: st: Add interrupt parent to pinctrl on stm32mp251
  arm64: dts: st: Add exti1 and exti2 nodes on stm32mp251
  ARM: dts: stm32: List exti parent interrupts on stm32mp131
  ARM: dts: stm32: List exti parent interrupts on stm32mp151
  arm64: Kconfig.platforms: Enable STM32_EXTI for ARCH_STM32
  irqchip/stm32-exti: Mark events reserved with RIF configuration check
  irqchip/stm32-exti: Skip secure events
  irqchip/stm32-exti: Convert driver to standard PM
  ...
2024-05-14 09:47:14 -07:00
Jinjie Ruan 6678ae1918 genirq: Reuse irq_is_nmi()
Move irq_is_nmi() to the internal header file and reuse it all over the
place.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240423024037.3331215-1-ruanjinjie@huawei.com
2024-04-24 20:42:57 +02:00
David Stevens a60dd06af6 genirq/cpuhotplug: Skip suspended interrupts when restoring affinity
irq_restore_affinity_of_irq() restarts managed interrupts unconditionally
when the first CPU in the affinity mask comes online. That's correct during
normal hotplug operations, but not when resuming from S3 because the
drivers are not resumed yet and interrupt delivery is not expected by them.

Skip the startup of suspended interrupts and let resume_device_irqs() deal
with restoring them. This ensures that irqs are not delivered to drivers
during the noirq phase of resuming from S3, after non-boot CPUs are brought
back online.

Signed-off-by: David Stevens <stevensd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240424090341.72236-1-stevensd@chromium.org
2024-04-24 20:42:57 +02:00
Rafael J. Wysocki c2ddeb2961 genirq: Introduce IRQF_COND_ONESHOT and use it in pinctrl-amd
There is a problem when a driver requests a shared interrupt line to run a
threaded handler on it without IRQF_ONESHOT set if that flag has been set
already for the IRQ in question by somebody else.  Namely, the request
fails which usually leads to a probe failure even though the driver might
have worked just fine with IRQF_ONESHOT, but it does not want to use it by
default.  Currently, the only way to handle this is to try to request the
IRQ without IRQF_ONESHOT, but with IRQF_PROBE_SHARED set and if this fails,
try again with IRQF_ONESHOT set.  However, this is a bit cumbersome and not
very clean.

When commit 7a36b901a6 ("ACPI: OSL: Use a threaded interrupt handler for
SCI") switched the ACPI subsystem over to using a threaded interrupt
handler for the SCI, it had to use IRQF_ONESHOT for it because that's
required due to the way the SCI handler works (it needs to walk all of the
enabled GPEs before the interrupt line can be unmasked). The SCI interrupt
line is not shared with other users very often due to the SCI handling
overhead, but on sone systems it is shared and when the other user of it
attempts to install a threaded handler, a flags mismatch related to
IRQF_ONESHOT may occur.

As it turned out, that happened to the pinctrl-amd driver and so commit
4451e8e841 ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
attempted to address the issue by adding IRQF_ONESHOT to the interrupt
flags in that driver, but this is now causing an IRQF_ONESHOT-related
mismatch to occur on another system which cannot boot as a result of it.

Clearly, pinctrl-amd can work with IRQF_ONESHOT if need be, but it should
not set that flag by default, so it needs a way to indicate that to the
interrupt subsystem.

To that end, introdcuce a new interrupt flag, IRQF_COND_ONESHOT, which will
only have effect when the IRQ line is shared and IRQF_ONESHOT has been set
for it already, in which case it will be promoted to the latter.

This is sufficient for drivers sharing the interrupt line with the SCI as
it is requested by the ACPI subsystem before any drivers are probed, so
they will always see IRQF_ONESHOT set for the interrupt in question.

Fixes: 4451e8e841 ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
Reported-by: Francisco Ayala Le Brun <francisco@videowindow.eu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Closes: https://lore.kernel.org/lkml/CAN-StX1HqWqi+YW=t+V52-38Mfp5fAz7YHx4aH-CQjgyNiKx3g@mail.gmail.com/
Link: https://lore.kernel.org/r/12417336.O9o76ZdvQC@kreacher
2024-03-25 23:45:21 +01:00
Crystal Wood c99303a2d2 genirq: Wake interrupt threads immediately when changing affinity
The affinity setting of interrupt threads happens in the context of the
thread when the thread is woken up by an hard interrupt. As this can be an
arbitrary after changing the affinity, the thread can become runnable on an
isolated CPU and cause isolation disruption.

Avoid this by checking the set affinity request in wait_for_interrupt() and
waking the threads immediately when the affinity is modified.

Note that this is of the most benefit on systems where the interrupt
affinity itself does not need to be deferred to the interrupt handler, but
even where that's not the case, the total dirsuption will be less.

Signed-off-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240122235353.15235-1-crwood@redhat.com
2024-02-19 11:43:46 +01:00
Andreas Gruenbacher 6309727ef2 kthread: add kthread_stop_put
Add a kthread_stop_put() helper that stops a thread and puts its task
struct.  Use it to replace the various instances of kthread_stop()
followed by put_task_struct().

Remove the kthread_stop_put() macro in usbip that is similar but doesn't
return the result of kthread_stop().

[agruenba@redhat.com: fix kerneldoc comment]
  Link: https://lkml.kernel.org/r/20230911111730.2565537-1-agruenba@redhat.com
[akpm@linux-foundation.org: document kthread_stop_put()'s argument]
Link: https://lkml.kernel.org/r/20230907234048.2499820-1-agruenba@redhat.com
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:41:57 -07:00
Vincent Whitchurch e2c12739cc genirq: Prevent nested thread vs synchronize_hardirq() deadlock
There is a possibility of deadlock if synchronize_hardirq() is called
when the nested threaded interrupt is active.  The following scenario
was observed on a uniprocessor PREEMPT_NONE system:

 Thread 1                      Thread 2

 handle_nested_thread()
  Set INPROGRESS
  Call ->thread_fn()
   thread_fn goes to sleep

                              free_irq()
                               __synchronize_hardirq()
                                Busy-loop forever waiting for INPROGRESS
                                to be cleared

The INPROGRESS flag is only supposed to be used for hard interrupt
handlers.  Remove the incorrect usage in the nested threaded interrupt
case and instead re-use the threads_active / wait_for_threads mechanism
to wait for nested threaded interrupts to complete.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613-genirq-nested-v3-1-ae58221143eb@axis.com
2023-07-31 17:24:22 +02:00
John Keeping 803235982b genirq: Update affinity of secondary threads
For interrupts with secondary threads, the affinity is applied when the
thread is created but if the interrupts affinity is changed later only
the primary thread is updated.

Update the secondary thread's affinity as well to keep all the interrupts
activity on the assigned CPUs.

Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230406180857.588682-1-john@metanate.com
2023-04-15 10:17:16 +02:00
Manfred Spraul 17549b0f18 genirq: Add might_sleep() to disable_irq()
With the introduction of threaded interrupt handlers, it is virtually
never safe to call disable_irq() from non-premptible context.

Thus: Update the documentation, add an explicit might_sleep() to catch any
offenders. This is more obvious and straight forward than the implicit
might_sleep() check deeper down in the disable_irq() call chain.

Fixes: 3aa551c9b4 ("genirq: add threaded interrupt handler support")
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20221216150441.200533-3-manfred@colorfullife.com
2023-01-11 19:35:13 +01:00
Angus Chen fd19ce7799 genirq: Remove unused argument force of irq_set_affinity_deactivated()
The force parameter in irq_set_affinity_deactivated() is not used,
get rid of it.

Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20221007103236.599-1-angus.chen@jaguarmicro.com
2022-11-17 14:00:55 +01:00
Samuel Holland 610306306a genirq: Drop redundant irq_init_effective_affinity
It does exactly the same thing as irq_data_update_effective_affinity.

Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220701200056.46555-5-samuel@sholland.org
2022-07-07 09:38:04 +01:00
Linus Torvalds fcfde8a7cf Updates for interrupt core and drivers:
Core code:
 
     - Make the managed interrupts more robust by shutting them down in the
       core code when the assigned affinity mask does not contain online
       CPUs.
 
     - Make the irq simulator chip work on RT
 
     - A small set of cpumask and power manageent cleanups
 
   Drivers:
 
     - A set of changes which mark GPIO interrupt chips immutable to prevent
       the GPIO subsystem from modifying it under the hood. This provides
       the necessary infrastructure and converts a set of GPIO and pinctrl
       drivers over.
 
     - A set of changes to make the pseudo-NMI handling for GICv3 more
       robust: a missing barrier and consistent handling of the priority
       mask.
 
     - Another set of GICv3 improvements and fixes, but nothing outstanding
 
     - The usual set of improvements and cleanups all over the place
 
     - No new irqchip drivers and not even a new device tree binding!
       100+ interrupt chips are truly enough.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmKLOEoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoQ4ED/9B1kDwunvkNAPJDmSmr4hFU7EU3ZLb
 SyS2099PWekgU3TaWdD6eILm9hIvsAmmhbU7CJ0EWol6G5VsqbNoYsfOsWliuGTi
 CL3ygZL84hL4b24c3sipqWAF60WCEKLnYV7pb1DgiZM41C87+wxPB49FQbHVjroz
 WDRTF8QYWMqoTRvxGMCflDfkAwydlCrqzQwgyUB5hJj3vbiYX9dVMAkJmHRyM3Uq
 Prwhx1Ipbj/wBSReIbIXlNx4XI/iUDI0UWeh02XkVxLb5Jzg7vPCHiuyVMR1DW2J
 oEjAR+/1sGwVOoRnfRlwdRUmRRItdlbopbL4CuhO/ENrM/r/o/rMvDDMwF4WoMW9
 zXvzFBLllVpLvyFvVHO1LKI6Hx2mdyAmQ1M/TxMFOmHAyfOPtN150AJDPKdCrMk/
 0F0B0y/KPgU9P/Q9yLh2UiXRAkoUBpLpk20xZbAUGHnjXXkys4Z2fE+THIob+Ibe
 pUnXsgCXVVWyqJjdikPF2gqsSsCFUo7iblHRzI0hzOAPe3MTph0qh3hZoFAFNEYP
 IIyAv9+IiT1EvBMgjHNmZ51U0uTbt3qWOSxebEoU3a598wwEVNRRVyutqvREXhl8
 inkzpL2N3uBPX7sA25lYkH4QKRbzVoNkF/s0e/J9WZdYbj3SsxGouoGdYA2xgvtM
 8tiCnFC9hfzepQ==
 =xcXv
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt handling updates from Thomas Gleixner:
 "Core code:

   - Make the managed interrupts more robust by shutting them down in
     the core code when the assigned affinity mask does not contain
     online CPUs.

   - Make the irq simulator chip work on RT

   - A small set of cpumask and power manageent cleanups

  Drivers:

   - A set of changes which mark GPIO interrupt chips immutable to
     prevent the GPIO subsystem from modifying it under the hood. This
     provides the necessary infrastructure and converts a set of GPIO
     and pinctrl drivers over.

   - A set of changes to make the pseudo-NMI handling for GICv3 more
     robust: a missing barrier and consistent handling of the priority
     mask.

   - Another set of GICv3 improvements and fixes, but nothing
     outstanding

   - The usual set of improvements and cleanups all over the place

   - No new irqchip drivers and not even a new device tree binding!
     100+ interrupt chips are truly enough"

* tag 'irq-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  irqchip: Add Kconfig symbols for sunxi drivers
  irqchip/gic-v3: Fix priority mask handling
  irqchip/gic-v3: Refactor ISB + EOIR at ack time
  irqchip/gic-v3: Ensure pseudo-NMIs have an ISB between ack and handling
  genirq/irq_sim: Make the irq_work always run in hard irq context
  irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x
  irqchip/gic: Improved warning about incorrect type
  irqchip/csky: Return true/false (not 1/0) from bool functions
  irqchip/imx-irqsteer: Add runtime PM support
  irqchip/imx-irqsteer: Constify irq_chip struct
  irqchip/armada-370-xp: Enable MSI affinity configuration
  irqchip/aspeed-scu-ic: Fix irq_of_parse_and_map() return value
  irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value
  irqchip/sun6i-r: Use NULL for chip_data
  irqchip/xtensa-mx: Fix initial IRQ affinity in non-SMP setup
  irqchip/exiu: Fix acknowledgment of edge triggered interrupts
  irqchip/gic-v3: Claim iomem resources
  dt-bindings: interrupt-controller: arm,gic-v3: Make the v2 compat requirements explicit
  irqchip/gic-v3: Relax polling of GIC{R,D}_CTLR.RWP
  irqchip/gic-v3: Detect LPI invalidation MMIO registers
  ...
2022-05-23 16:58:49 -07:00
Thomas Pfaff 8707898e22 genirq: Synchronize interrupt thread startup
A kernel hang can be observed when running setserial in a loop on a kernel
with force threaded interrupts. The sequence of events is:

   setserial
     open("/dev/ttyXXX")
       request_irq()
     do_stuff()
      -> serial interrupt
         -> wake(irq_thread)
	      desc->threads_active++;
     close()
       free_irq()
         kthread_stop(irq_thread)
     synchronize_irq() <- hangs because desc->threads_active != 0

The thread is created in request_irq() and woken up, but does not get on a
CPU to reach the actual thread function, which would handle the pending
wake-up. kthread_stop() sets the should stop condition which makes the
thread immediately exit, which in turn leaves the stale threads_active
count around.

This problem was introduced with commit 519cc8652b, which addressed a
interrupt sharing issue in the PCIe code.

Before that commit free_irq() invoked synchronize_irq(), which waits for
the hard interrupt handler and also for associated threads to complete.

To address the PCIe issue synchronize_irq() was replaced with
__synchronize_hardirq(), which only waits for the hard interrupt handler to
complete, but not for threaded handlers.

This was done under the assumption, that the interrupt thread already
reached the thread function and waits for a wake-up, which is guaranteed to
be handled before acting on the stop condition. The problematic case, that
the thread would not reach the thread function, was obviously overlooked.

Make sure that the interrupt thread is really started and reaches
thread_fn() before returning from __setup_irq().

This utilizes the existing wait queue in the interrupt descriptor. The
wait queue is unused for non-shared interrupts. For shared interrupts the
usage might cause a spurious wake-up of a waiter in synchronize_irq() or the
completion of a threaded handler might cause a spurious wake-up of the
waiter for the ready flag. Both are harmless and have no functional impact.

[ tglx: Amended changelog ]

Fixes: 519cc8652b ("genirq: Synchronize only with single thread on free_irq()")
Signed-off-by: Thomas Pfaff <tpfaff@pcs.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/552fe7b4-9224-b183-bb87-a8f36d335690@pcs.com
2022-05-05 11:54:05 +02:00
Marc Zyngier c48c8b829d genirq: Take the proposed affinity at face value if force==true
Although setting the affinity of an interrupt to a set of CPUs that doesn't
have any online CPU is generally frowned apon, there are a few limited
cases where such affinity is set from a CPUHP notifier, setting the
affinity to a CPU that isn't online yet.

The saving grace is that this is always done using the 'force' attribute,
which gives a hint that the affinity setting can be outside of the online
CPU mask and the callsite set this flag with the knowledge that the
underlying interrupt controller knows to handle it.

This restores the expected behaviour on Marek's system.

Fixes: 33de0aa4ba ("genirq: Always limit the affinity to online CPUs")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/4b7fc13c-887b-a664-26e8-45aed13f048a@samsung.com
Link: https://lore.kernel.org/r/20220414140011.541725-1-maz@kernel.org
2022-04-14 16:11:25 +02:00