As part of the securing GAUDI, the F/W will configure the PCI iATU
regions. If the driver identifies a secured PCI ID, it will know to
skip iATU configuration in a very early stage.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As F/ security indication must be available before driver approaches
PCI bus, F/W security should be derived from PCI id rather than be
fetched during boot handshake with F/W.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
DRAM scrubbing can take time hence it adds to latency during allocation.
To minimize latency during initialization, scrubbing is moved to release
call.
In case scrubbing fails it means the device is in a bad state,
hence HARD reset is initiated.
Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to minimize hard coded values between F/W and the driver, we
send msi-x indexes dynamically to the F/W.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Clearing QM errors by the driver will prevent these H/W blocks from
stopping in case they are configured to stop on errors, so perform this
clearing only if this mode is not in use.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case of multiple ECC errors, FW will set the DEVICE_UNUSABLE bit.
On boot-up, the driver will therefore fail inserting the device.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Prefer the use of strscpy when copying the ASIC name into a char array,
to prevent accidentally exceeding the array's length.
In addition, strlcpy is frowned upon so replace it.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The store part was never implemented in the code and never been used
by the userspace applications.
We currently use the related parameters to a different purpose with
a defined union. However, there is no point in that and it is better
to just remove the union and the store parameters.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When trying to debug program, the user often needs to
dump large parts of the device's DRAM, which can reach to tens of GBs.
Because reading from the device's internal memory through the PCI BAR
is extremely slow, the debug can take hours.
Instead, we can provide the user to copy data through one of the DMA
engines. This will make the operation much faster.
Currently, only GAUDI is supported.
In GAUDI, we need to find a PCI DMA engine that is IDLE and set the
DMA as secured to be able to bypass our MMU as we currently don't
map the temporary buffer to the MMU.
Example bash one-line to dump entire HBM to file (~2 minutes):
for (( i=0x0; i < 0x800000000; i+=0x8000000 )); do \
printf '0x%x\n' $i | sudo tee /sys/kernel/debug/habanalabs/hl0/addr ; \
echo 0x8000000 | sudo tee /sys/kernel/debug/habanalabs/hl0/dma_size ; \
sudo cat /sys/kernel/debug/habanalabs/hl0/data_dma >> hbm.txt ; done
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Since we moved the SOB reset flow to workqueue and
not part of the fence release flow, we might reach a
scenario where new context is created while we in the middle
of resetting the SOB.
in such cases the reset may fail due to idle check.
This will mess up the streams sync since the SOB value is invalid.
so we protect this area with a mutex, to delay context creation.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
There is a need to allow to user to send command submissions with
custom timeout as some CS take longer than the max timeout that is
used by default.
Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The new approach is based on the notion that the relative
current power consumption is in relation of proportionality
to device's true utilization.
Utilization info ranges between [0,100]%
Currently, dc_power values are hard-coded.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to use minimum of hard coded values common to LKD and F/W
a dynamic method to work with PLLs is introduced in this patch.
Formerly asic specific PLL numbering is now common for all asics.
To be backward compatible a bit in dev status is defined, if the bit is
not set LKD will keep working with old PLL numbering.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to shorten the time cs lock is being held, we move any
possible work outside of the cs lock.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add a little sleep between page unmappings in case mapping of
large number of host pages failed, in order to
avoid soft lockup bug during the rollback.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update with latest version from the Firmware team.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The device can get into deadlock in case it use indirect mode for MSI
interrupts (multi-msi) and have hard-reset during interrupt storm.
To prevent that, always use direct mode which means single-msi mode.
The F/W will prevent the host from writing to the indirect MSI
registers to prevent any malicious user from causing this scenario.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In case the BMC of the devices' box wants to initiate a reset of
a specific device, it must go through driver.
Once driver will receive the request it will initiate a hard reset
flow.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to have a better debuggability we allow debugfs access
to user mmu mapped host memory. Non-user host memory access will be
rejected.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
fixed the following coccicheck:
./drivers/misc/habanalabs/common/sysfs.c:347:60-61: WARNING opportunity
for kobj_to_dev()
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update to the latest version of the file as supplied by the F/W.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
if reset is due to heartbeat, device CPU is no responsive in which
case no point sending PCI disable message to it.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As there are incorrect assumptions in which some of the
initialization and data path flows cannot sleep, most allocations
are being done using GFP_ATOMIC.
We modify the code to use GFP_ATOMIC only when realy needed, as
sleepable flow should use GFP_KERNEL.
In addition add a fallback to allocate memory using GFP_KERNEL,
once ATOMIC allocation fails.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Update to the latest definition of the firmware
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add driver implementation for reading the current power from the device
CPU F/W.
Signed-off-by: Sagiv Ozeri <sozeri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Improve "vm" debugfs node to print also the virtual addresses which are
currently mapped to HW blocks in the device.
Signed-off-by: Sagiv Ozeri <sozeri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
For simplicity, use a single bringup flag indicating which FW
binaries should loaded to device.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Timeout in wait for interrupt is in 32-bit variable so we need to use
the correct maximum value to compare.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to support command submissions from user space, the driver
need to add support for user interrupt completions. The driver will
allow multiple user threads to wait for an interrupt and perform
a comparison with a given user address once interrupt expires.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to support user interrupts, driver must enable all MSI-X
interrupts for any case user will trigger them. We differentiate
between a valid user interrupt and a non valid one.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As the F/wW is the first to detect out of sync event, a new event is
added to notify the driver on such event. In which case the driver
performs hard reset.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Because our graph contains network operations, we need to account
for delay in the network.
5 seconds timeout per CS is not enough to account for that.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Notify to the user that although he closed the FD, the device is
still in use because there are live CS and/or memory mappings (mmaps).
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
After any reset (soft or hard) the device (the engines/QMANs) should
be idle. If they are not idle, fail the reset. If it is soft-reset,
the driver will try to do hard-reset automatically. If it is hard-reset,
the driver will make the device non-operational.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The device is actually released only after the refcnt of the hpriv
structure is 0, which means all its contexts were closed.
If we reset the device while a context is still open, there are
possibilities for unexpected behavior and crashes. For example, if the
process has a mapping of a register block that is now currently being
reset, and the process writes/reads to that block during the reset,
the device can get stuck.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to support command submissions that are done directly from
user space, the driver must perform soft reset once user closes its FD.
In case the soft reset fails or device is not idle, a hard reset should
be performed.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
currently we support only 2 asids in all asics.
asid 0 for driver, and asic 1 for user.
no need to setup 1024 asids configurations at init phase.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Detailed description for this pull request:
1. Update extcon provider driver
- Add the support of charging interrupt to detect charger connector
for extcon-max8997.c
- Detect OTG when USB_ID pin is connected to ground for extcon-sm5502.c
- Add the support for VBUS detection for extcon-qcom-spmi-misc.c
and replace qcom,pm8941-misc binding document with yaml style.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEsSpuqBtbWtRe4rLGnM3fLN7rz1MFAmBupE8WHGN3MDAuY2hv
aUBzYW1zdW5nLmNvbQAKCRCczd8s3uvPUySWD/9+0DGK4TeGN65gLaSwvfsRTqh9
tehp8kmjhbH0xWeL93wY+ME+wNjFbclSC85/zX3/OwevM+xjwTD5qscTSfamDAWa
I9SErOmlOlR4mKJiTHvT7b2l7veUeOKyjy4GKyddLhCiC7zVb5rjT7pihGsNTnKf
dVJziiY3rb7eB34KqZwGexX6lmzAk68WwHBlxbj66J9CV1uxk6mmvLzX23tJmUjR
id+hHIrrvwEMe1FNNATq8BS/BKniDpllHKE0+3NMmbhUqXSyIaLVpH0PmdIMgg69
yllBc53VUjK/2m1YJICkNGXPLi9GojvFxyWf0sq9Jbcd/Q/P7m2wfe7OV3t7/1L2
kuRL2xNcM+NKqVTdzfnrxIpgRTdHYi7BjVplEOrmHpC+ehZMBIVS861kh2TF1J7j
n7LJ9FzdNcyGC26jeN/r13+0hrezxicN4nfXnG8iKGglHOjhHKD1NTWSD2Rrzt7m
mBQ9yPq2hZGrGNEaslB3c6xxwFOsdMgPl3c9vmsW5BFwmLwzrFcOkXm6Yrotn7yU
MOL9ATQ8mAEGuuKZHq3AfUVRnSapQM7ZgAralVaa+LD5j6vlDu1DRFvHRnilbNab
Ao9zFNA6TNwd6EduSv32GsCPhPlYxlQQ3h2HoaoAvm1UXY3bhsFObW2oJap9WtQe
iYw4FInGNjS/ESIkZg==
=13Nh
-----END PGP SIGNATURE-----
Merge tag 'extcon-next-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next
Chanwoo writes:
Update extcon next for v5.13
Detailed description for this pull request:
1. Update extcon provider driver
- Add the support of charging interrupt to detect charger connector
for extcon-max8997.c
- Detect OTG when USB_ID pin is connected to ground for extcon-sm5502.c
- Add the support for VBUS detection for extcon-qcom-spmi-misc.c
and replace qcom,pm8941-misc binding document with yaml style.
* tag 'extcon-next-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon:
extcon: qcom-spmi: Add support for VBUS detection
bindings: pm8941-misc: Add support for VBUS detection
bindings: pm8941-misc: Convert bindings to YAML
extcon: sm5502: Detect OTG when USB_ID is connected to ground
extcon: max8997: Add CHGINS and CHGRM interrupt handling
FPGA Manager:
- Russ' first change improves port_enable reliability
- Russ' second change adds a new device ID for a DFL device
- Geert's change updates the examples in binding with dt overlay sugar
syntax
All patches have been reviewed on the mailing list, and have been in the
last linux-next releases (as part of my for-next branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRORt0E5Sb/c/mZMgkXxQAtim5VSwUCYG5feQAKCRAXxQAtim5V
S1v3AQDR8dK1Bfoa8h8QHM+Xhb8UNkoC70Fbl/znQrTbRPimtAEAkgmKgElijBdL
zE+zMFa0OUnkBCHW2IQZXfuu+qI9GQw=
=P0ck
-----END PGP SIGNATURE-----
Merge tag 'fpga-late-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga into char-misc-next
Moritz writes:
Second set of FPGA Manager changes for 5.13-rc1
FPGA Manager:
- Russ' first change improves port_enable reliability
- Russ' second change adds a new device ID for a DFL device
- Geert's change updates the examples in binding with dt overlay sugar
syntax
All patches have been reviewed on the mailing list, and have been in the
last linux-next releases (as part of my for-next branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-late-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: dfl: pci: add DID for D5005 PAC cards
dt-bindings: fpga: fpga-region: Convert to sugar syntax
fpga: dfl: afu: harden port enable logic
fpga: Add support for Xilinx DFX AXI Shutdown manager
dt-bindings: fpga: Add compatible value for Xilinx DFX AXI shutdown manager
fpga: xilinx-pr-decoupler: Simplify code by using dev_err_probe()
fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER
VBUS can be detected via a dedicated PMIC pin. Add support
for reporting the VBUS status.
Signed-off-by: Anirudh Ghayal <aghayal@codeaurora.org>
Signed-off-by: Kavya Nunna <knunna@codeaurora.org>
Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Add interrupt support for reporting VBUS detection status that can be
detected via a dedicated PMIC pin.
Signed-off-by: Anirudh Ghayal <aghayal@codeaurora.org>
Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Convert bindings from txt to YAML.
Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
In it's curent state this driver ignores OTG adapters with ID pin
connected to ground. This commit adds a check to set extcon into
host mode when such OTG adapter is connected.
Signed-off-by: Nikita Travkin <nikitos.tr@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
This allows the MAX8997 charger to set the current limit depending on
the detected extcon charger type.
Signed-off-by: Timon Baetz <timon.baetz@protonmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The sparse tool complains as follows:
drivers/hwtracing/coresight/coresight-etm-perf.c:61:25: warning:
symbol 'format_attr_contextid' was not declared. Should it be static?
This symbol is not used outside of coresight-etm-perf.c, so this
commit marks it static.
Link: https://lore.kernel.org/r/20210308123250.2417947-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20210407160007.418053-3-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>