linux

Commit Graph

Author	SHA1	Message	Date
Robert Richter	c6c3187d66	lib/firmware_table: Provide buffer length argument to cdat_table_parse() There exist card implementations with a CDAT table using a fixed size buffer, but with entries filled in that do not fill the whole table length size. Then, the last entry in the CDAT table may not mark the end of the CDAT table buffer specified by the length field in the CDAT header. It can be shorter with trailing unused (zero'ed) data. The actual table length is determined while reading all CDAT entries of the table with DOE. If the table is greater than expected (containing zero'ed trailing data), the CDAT parser fails with: [ 48.691717] Malformed DSMAS table length: (24:0) [ 48.702084] [CDAT:0x00] Invalid zero length [ 48.711460] cxl_port endpoint1: Failed to parse CDAT: -22 In addition, a check of the table buffer length is missing to prevent an out-of-bound access then parsing the CDAT table. Hardening code against device returning borked table. Fix that by providing an optional buffer length argument to acpi_parse_entries_array() that can be used by cdat_table_parse() to propagate the buffer size down to its users to check the buffer length. This also prevents a possible out-of-bound access mentioned. Add a check to warn about a malformed CDAT table length. Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Len Brown <lenb@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/ZdEnopFO0Tl3t2O1@rric.localdomain Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-13 00:03:21 -07:00
Robert Richter	e0c818e004	cxl/pci: Get rid of pointer arithmetic reading CDAT table Reading the CDAT table using DOE requires a Table Access Response Header in addition to the CDAT entry. In current implementation this has caused offsets with sizeof(__le32) to the actual buffers. This led to hardly readable code and even bugs. E.g., see fix of devm_kfree() in read_cdat_data(): commit `c65efe3685` ("cxl/cdat: Free correct buffer on checksum error") Rework code to avoid calculations with sizeof(__le32). Introduce struct cdat_doe_rsp for this which contains the Table Access Response Header and a variable payload size for various data structures afterwards to access the CDAT table and its CDAT Data Structures without recalculating buffer offsets. Cc: Lukas Wunner <lukas@wunner.de> Cc: Fan Ni <nifan.cxl@gmail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240216155844.406996-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 23:52:29 -07:00
Robert Richter	ec8ffff3a9	cxl/pci: Rename DOE mailbox handle to doe_mb Trivial variable rename for the DOE mailbox handle from cdat_doe to doe_mb. The variable name cdat_doe is too ambiguous, use doe_mb that is commonly used for the mailbox. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240216155844.406996-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 23:52:29 -07:00
Dave Jiang	99b52aac2d	cxl: Fix the incorrect assignment of SSLBIS entry pointer initial location The 'entry' pointer in cdat_sslbis_handler() is set to header + sizeof(common header). However, the math missed the addition of the SSLBIS main header. It should be header + sizeof(common header) + sizeof(*sslbis). Use a defined struct for all the SSLBIS parts in order to avoid pointer math errors. The bug causes incorrect parsing of the SSLBIS table and introduces incorrect performance values to the access_coordinates during the CXL access_coordinate calculation path if there are CXL switches present in the topology. The issue was found during testing of new code being added to add additional checks for invalid CDAT values during CXL access_coordinate calculation. The testing was done on qemu with a CXL topology including a CXL switch. Fixes: `80aa780dda` ("cxl: Add callback to parse the SSLBIS subtable from CDAT") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20240301210948.1298075-1-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 23:41:59 -07:00
Ben Cheatham	8039804cfa	cxl/core: Add CXL EINJ debugfs files Export CXL helper functions in einj-cxl.c for getting/injecting available CXL protocol error types to sysfs under kernel/debug/cxl. The kernel/debug/cxl/einj_types file will print the available CXL protocol errors in the same format as the available_error_types file provided by the einj module. The kernel/debug/cxl/$dport_dev/einj_inject file is functionally the same as the error_type and error_inject files provided by the EINJ module, i.e.: writing an error type into $dport_dev/einj_inject will inject said error type into the CXL dport represented by $dport_dev. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Link: https://lore.kernel.org/r/20240311142508.31717-4-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 23:08:30 -07:00
Dave Jiang	debdce20c4	cxl/region: Deal with numa nodes not enumerated by SRAT For the numa nodes that are not created by SRAT, no memory_target is allocated and is not managed by the HMAT_REPORTING code. Therefore hmat_callback() memory hotplug notifier will exit early on those NUMA nodes. The CXL memory hotplug notifier will need to call node_set_perf_attrs() directly in order to setup the access sysfs attributes. In acpi_numa_init(), the last proximity domain (pxm) id created by SRAT is stored. Add a helper function acpi_node_backed_by_real_pxm() in order to check if a NUMA node id is defined by SRAT or created by CFMWS. node_set_perf_attrs() symbol is exported to allow update of perf attribs for a node. The sysfs path of /sys/devices/system/node/nodeX/access0/initiators/* is created by node_set_perf_attrs() for the various attributes where nodeX is matched to the NUMA node of the CXL region. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-13-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 14:54:03 -07:00
Dave Jiang	067353a46d	cxl/region: Add memory hotplug notifier for cxl region When the CXL region is formed, the driver computes the performance data for the region. However this data is not available at the node data collection that has been populated by the HMAT during kernel initialization. Add a memory hotplug notifier to update the access coordinates to the 'struct memory_target' context kept by the HMAT_REPORTING code. Add CXL_CALLBACK_PRI for a memory hotplug callback priority. Set the priority number to be called before HMAT_CALLBACK_PRI. The CXL update must happen before hmat_callback(). A new HMAT_REPORTING helper hmat_update_target_coordinates() is added in order to allow CXL to update the memory_target access coordinates. A new ext_updated member is added to the memory_target to indicate that the access coordinates within the memory_target has been updated by an external agent such as CXL. This prevents data being overwritten by the hmat_update_target_attrs() triggered by hmat_callback(). Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Huang, Ying <ying.huang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-12-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:12 -07:00
Dave Jiang	c20eaf4411	cxl/region: Add sysfs attribute for locality attributes of CXL regions Add read/write latencies and bandwidth sysfs attributes for the enabled CXL region. The bandwidth is the aggregated bandwidth of all devices that contribute to the CXL region. The latency is the worst latency of the device amongst all the devices that contribute to the CXL region. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-11-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dave Jiang	3d9f4a1972	cxl/region: Calculate performance data for a region Calculate and store the performance data for a CXL region. Find the worst read and write latency for all the included ranges from each of the devices that attributes to the region and designate that as the latency data. Sum all the read and write bandwidth data for each of the device region and that is the total bandwidth for the region. The perf list is expected to be constructed before the endpoint decoders are registered and thus there should be no early reading of the entries from the region assemble action. The calling of the region qos calculate function is under the protection of cxl_dpa_rwsem and will ensure that all DPA associated work has completed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-10-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dave Jiang	3d8be8b398	cxl: Set cxlmd->endpoint before adding port device Move setting of cxlmd->endpoint to before calling add_device() on the port device. Otherwise when referencing cxlmd->endpoint in region discovery code that is triggered by the port driver probe function, the endpoint port pointer is not valid. Current code does not hit this issue yet since cxlmd->endpoint is not being referenced during region discovery. However follow on code that does performance calculations will. Tested-by: Wonjae Lee <wj28.lee@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-9-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dave Jiang	6ef83c4e19	cxl: Move QoS class to be calculated from the nearest CPU Retrieve the qos_class (QTG ID) using the access coordinates from the nearest CPU rather than the nearst initiator that may not be a CPU. This may be the more appropriate number that applications care about. For most cases, access0 and access1 have the same values. Link: https://lore.kernel.org/linux-cxl/20240112113023.00006c50@Huawei.com/ Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-8-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dave Jiang	863027d409	cxl: Split out host bridge access coordinates The difference between access class 0 and access class 1 for 'struct access_coordinate', if any, is that class 0 is for the distance from the target to the closest initiator and that class 1 is for the distance from the target to the closest CPU. For CXL memory, the nearest initiator may not necessarily be a CPU node. The performance path from the CXL endpoint to the host bridge should remain the same. However, the numbers extracted and stored from HMAT is the difference for the two access classes. Split out the performance numbers for the host bridge (generic target) from the calculation of the entire path in order to allow calculation of both access classes for a CXL region. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-7-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dave Jiang	032f7b37ad	cxl: Split out combine_coordinates() for common shared usage Refactor the common code of combining coordinates in order to reduce code. Create a new function cxl_cooordinates_combine() it combine two 'struct access_coordinate'. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-6-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dave Jiang	bd98cbbbf8	ACPI: HMAT / cxl: Add retrieval of generic port coordinates for both access classes Update acpi_get_genport_coordinates() to allow retrieval of both access classes of the 'struct access_coordinate' for a generic target. The update will allow CXL code to compute access coordinates for both access class. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240308220055.2172956-5-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-03-12 12:34:11 -07:00
Dan Williams	40de53fd00	Merge branch 'for-6.8/cxl-cper' into for-6.8/cxl Pick up CXL CPER notification removal for v6.8-rc6, to return in a later merge window.	2024-02-20 22:57:35 -08:00
Robert Richter	0cab687205	cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window The Linux CXL subsystem is built on the assumption that HPA == SPA. That is, the host physical address (HPA) the HDM decoder registers are programmed with are system physical addresses (SPA). During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1, 8.1.3.8) are checked if the memory is enabled and the CXL range is in a HPA window that is described in a CFMWS structure of the CXL host bridge (cxl-3.1, 9.18.1.3). Now, if the HPA is not an SPA, the CXL range does not match a CFMWS window and the CXL memory range will be disabled then. The HDM decoder stops working which causes system memory being disabled and further a system hang during HDM decoder initialization, typically when a CXL enabled kernel boots. Prevent a system hang and do not disable the HDM decoder if the decoder's CXL range is not found in a CFMWS window. Note the change only fixes a hardware hang, but does not implement HPA/SPA translation. Support for this can be added in a follow on patch series. Signed-off-by: Robert Richter <rrichter@amd.com> Fixes: `34e37b4c43` ("cxl/port: Enable HDM Capability after validating DVSEC Ranges") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240216160113.407141-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Dave Jiang	cc214417f0	cxl: Fix sysfs export of qos_class for memdev Current implementation exports only to /sys/bus/cxl/devices/.../memN/qos_class. With both ram and pmem exposed, the second registered sysfs attribute is rejected as duplicate. It's not possible to create qos_class under the dev_groups via the driver due to the ram and pmem sysfs sub-directories already created by the device sysfs groups. Move the ram and pmem qos_class to the device sysfs groups and add a call to sysfs_update() after the perf data are validated so the qos_class can be visible. The end results should be /sys/bus/cxl/devices/.../memN/ram/qos_class and /sys/bus/cxl/devices/.../memN/pmem/qos_class. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-4-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Dave Jiang	10cb393d76	cxl: Remove unnecessary type cast in cxl_qos_class_verify() The passed in host bridge parameter for device_for_each_child() has unnecessary void * type cast. Remove the type cast. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-3-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Dave Jiang	00413c1506	cxl: Change 'struct cxl_memdev_state' *_perf_list to single 'struct cxl_dpa_perf' In order to address the issue with being able to expose qos_class sysfs attributes under 'ram' and 'pmem' sub-directories, the attributes must be defined as static attributes rather than under driver->dev_groups. To avoid implementing locking for accessing the 'struct cxl_dpa_perf` lists, convert the list to a single 'struct cxl_dpa_perf' entry in preparation to move the attributes to statically defined. While theoretically a partition may have multiple qos_class via CDAT, this has not been encountered with testing on available hardware. The code is simplified for now to not support the complex case until a use case is needed to support that. Link: https://lore.kernel.org/linux-cxl/65b200ba228f_2d43c29468@dwillia2-mobl3.amr.corp.intel.com.notmuch/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240206190431.1810289-2-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Alison Schofield	cb66b1d60c	cxl/region: Allow out of order assembly of autodiscovered regions Autodiscovered regions can fail to assemble if they are not discovered in HPA decode order. The user will see failure messages like: [] cxl region0: endpoint5: HPA order violation region1 [] cxl region0: endpoint5: failed to allocate region reference The check that is causing the failure helps the CXL driver enforce a CXL spec mandate that decoders be committed in HPA order. The check is needless for autodiscovered regions since their decoders are already programmed. Trying to enforce order in the assembly of these regions is useless because they are assembled once all their member endpoints arrive, and there is no guarantee on the order in which endpoints are discovered during probe. Keep the existing check, but for autodiscovered regions, allow the out of order assembly after a sanity check that the lesser numbered decoder has the lesser HPA starting address. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Wonjae Lee <wj28.lee@samsung.com> Link: https://lore.kernel.org/r/3dec69ee97524ab229a20c6739272c3000b18408.1706736863.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Alison Schofield	453a7fde80	cxl/region: Handle endpoint decoders in cxl_region_find_decoder() In preparation for adding a new caller of cxl_region_find_decoders() teach it to find a decoder from a cxl_endpoint_decoder structure. Combining switch and endpoint decoder lookup in one function prevents code duplication in call sites. Update the existing caller. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Wonjae Lee <wj28.lee@samsung.com> Link: https://lore.kernel.org/r/79ae6d72978ef9f3ceec9722e1cb793820553c8e.1706736863.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-02-16 23:20:34 -08:00
Linus Torvalds	e6f39a90de	EFI fixes for v6.8 #1 - Tighten ELF relocation checks on the RISC-V EFI stub - Give up if the new EFI memory attributes protocol fails spuriously on x86 - Take care not to place the kernel in the lowest 16 MB of DRAM on x86 - Omit special purpose EFI memory from memblock - Some fixes for the CXL CPER reporting code - Make the PE/COFF layout of mixed-mode capable images comply with a strict interpretation of the spec -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZcDtKAAKCRAwbglWLn0t XMDfAP9ttq8Ir4+hp8A0DGE79x6eSgBIkl5ztGmMQGybzEkzdAEAgxfDUieQW4TT GmbyGGUouvSYxfZf4gVTQn8b/bd57AI= =Af8A -----END PGP SIGNATURE----- Merge tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "The only notable change here is the patch that changes the way we deal with spurious errors from the EFI memory attribute protocol. This will be backported to v6.6, and is intended to ensure that we will not paint ourselves into a corner when we tighten this further in order to comply with MS requirements on signed EFI code. Note that this protocol does not currently exist in x86 production systems in the field, only in Microsoft's fork of OVMF, but it will be mandatory for Windows logo certification for x86 PCs in the future. - Tighten ELF relocation checks on the RISC-V EFI stub - Give up if the new EFI memory attributes protocol fails spuriously on x86 - Take care not to place the kernel in the lowest 16 MB of DRAM on x86 - Omit special purpose EFI memory from memblock - Some fixes for the CXL CPER reporting code - Make the PE/COFF layout of mixed-mode capable images comply with a strict interpretation of the spec" * tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section cxl/trace: Remove unnecessary memcpy's cxl/cper: Fix errant CPER prints for CXL events efi: Don't add memblocks for soft-reserved memory efi: runtime: Fix potential overflow of soft-reserved region size efi/libstub: Add one kernel-doc comment x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR x86/efistub: Give up if memory attribute protocol returns an error riscv/efistub: Tighten ELF relocation check riscv/efistub: Ensure GP-relative addressing is not used	2024-02-09 10:40:50 -08:00
Ira Weiny	dbea519d68	cxl/trace: Remove unnecessary memcpy's CPER events don't have UUIDs. Therefore UUIDs were removed from the records passed to trace events and replaced with hard coded values. As pointed out by Jonathan, the new defines for the UUIDs present a more efficient way to assign UUID in trace records.[1] Replace memcpy's with the use of static data. [1] https://lore.kernel.org/all/20240108132325.00000e9c@Huawei.com/ Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-02-03 18:31:17 +01:00
Li Ming	eef5c7b28d	cxl/pci: Skip to handle RAS errors if CXL.mem device is detached The PCI AER model is an awkward fit for CXL error handling. While the expectation is that a PCI device can escalate to link reset to recover from an AER event, the same reset on CXL amounts to a surprise memory hotplug of massive amounts of memory. At present, the CXL error handler attempts some optimistic error handling to unbind the device from the cxl_mem driver after reaping some RAS register values. This results in a "hopeful" attempt to unplug the memory, but there is no guarantee that will succeed. A subsequent AER notification after the memdev unbind event can no longer assume the registers are mapped. Check for memdev bind before reaping status register values to avoid crashes of the form: BUG: unable to handle page fault for address: ffa00000195e9100 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page [...] RIP: 0010:__cxl_handle_ras+0x30/0x110 [cxl_core] [...] Call Trace: <TASK> ? __die+0x24/0x70 ? page_fault_oops+0x82/0x160 ? kernelmode_fixup_or_oops+0x84/0x110 ? exc_page_fault+0x113/0x170 ? asm_exc_page_fault+0x26/0x30 ? __pfx_dpc_reset_link+0x10/0x10 ? __cxl_handle_ras+0x30/0x110 [cxl_core] ? find_cxl_port+0x59/0x80 [cxl_core] cxl_handle_rp_ras+0xbc/0xd0 [cxl_core] cxl_error_detected+0x6c/0xf0 [cxl_core] report_error_detected+0xc7/0x1c0 pci_walk_bus+0x73/0x90 pcie_do_recovery+0x23f/0x330 Longer term, the unbind and PCI_ERS_RESULT_DISCONNECT behavior might need to be replaced with a new PCI_ERS_RESULT_PANIC. Fixes: `6ac07883db` ("cxl/pci: Add RCH downstream port error logging") Cc: stable@vger.kernel.org Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Ming <ming4.li@intel.com> Link: https://lore.kernel.org/r/20240129131856.2458980-1-ming4.li@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-29 12:59:07 -08:00
Quanquan Cao	d76779dd36	cxl/region：Fix overflow issue in alloc_hpa() Creating a region with 16 memory devices caused a problem. The div_u64_rem function, used for dividing an unsigned 64-bit number by a 32-bit one, faced an issue when SZ_256M * p->interleave_ways. The result surpassed the maximum limit of the 32-bit divisor (4G), leading to an overflow and a remainder of 0. note: At this point, p->interleave_ways is 16, meaning 16 * 256M = 4G To fix this issue, I replaced the div_u64_rem function with div64_u64_rem and adjusted the type of the remainder. Signed-off-by: Quanquan Cao <caoqq@fujitsu.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Fixes: `23a22cd1c9` ("cxl/region: Allocate HPA capacity to regions") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-24 21:03:03 -08:00
Shiyang Ruan	73bf93edee	cxl/core: use sysfs_emit() for attr's _show() sprintf() is deprecated for sysfs, use preferred sysfs_emit() instead. Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20240112062709.2490947-1-ruansy.fnst@fujitsu.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-12 14:47:04 -08:00
Dan Williams	3601311593	Merge branch 'for-6.8/cxl-cper' into for-6.8/cxl Pick up the CPER to CXL driver integration work for v6.8. Some additional cleanup of cper_estatus_print() messages is needed, but that is to be handled incrementally.	2024-01-09 19:21:44 -08:00
Ira Weiny	dc97f6344f	cxl/pci: Register for and process CPER events If the firmware has configured CXL event support to be firmware first the OS can process those events through CPER records. The CXL layer has unique DPA to HPA knowledge and standard event trace parsing in place. CPER records contain Bus, Device, Function information which can be used to identify the PCI device which is sending the event. Change the PCI driver registration to include registration of a CXL CPER callback to process events through the trace subsystem. Use new scoped based management to simplify the handling of the PCI device object. Tested-by: Smita-Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Smita-Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-9-1bb8a4ca2c7a@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> [djbw: use new pci_dev guard, flip init order] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-09 15:41:23 -08:00
Ira Weiny	f9c683386f	cxl/events: Create a CXL event union The CXL CPER and event log records share everything but a UUID/GUID in their structures. Define a cxl_event union without the UUID/GUID to be shared between the CPER and event log record formats. Adjust the code to use this union. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-6-1bb8a4ca2c7a@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-09 15:41:22 -08:00
Ira Weiny	6eade11075	cxl/events: Separate UUID from event structures The UEFI CXL CPER structure does not include the UUID. Now that the UUID is passed separately to the trace event there is no need to have the UUID in those structures. Move UUID from the event record header to the raw structures. Adjust cxl-test to Create dummy structures for creating test records. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-5-1bb8a4ca2c7a@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-09 15:39:38 -08:00
Ira Weiny	207a1f8230	cxl/events: Remove passing a UUID to known event traces The UUID data is redundant in the known event trace types. The addition of static defines allows the trace macros to create the UUID data inside the trace thus removing unnecessary code. Have well known trace events use static data to set the uuid field based on the event type. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-4-1bb8a4ca2c7a@intel.com Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-09 15:39:38 -08:00
Ira Weiny	4c115c9c1f	cxl/events: Create common event UUID defines Dan points out in review that the cxl_test code could be made better through the use of UUID's defines rather than being open coded.[1] Create UUID defines and use them rather than open coding them. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: http://lore.kernel.org/r/65738d09e30e2_45e0129451@dwillia2-xfh.jf.intel.com.notmuch [1] Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-3-1bb8a4ca2c7a@intel.com [djbw: clang-format uuid definitions] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-09 15:39:38 -08:00
Dan Williams	e16bf7e015	Merge branch 'for-6.7/cxl' into for-6.8/cxl Pick up a late locking change + fixup that is better as merge window material than rc material.	2024-01-05 19:24:33 -08:00
Dan Williams	80dda9a69a	Merge branch 'for-6.8/cxl-misc' into for-6.8/cxl Pick up some miscellaneous fixups for v6.8.	2024-01-05 19:03:06 -08:00
Dan Williams	d3953c78fc	Merge branch 'for-6.8/cxl-cdat' into for-6.8/cxl Pick up some follow-on fixes for 'cxl_root' reference count leaks.	2024-01-05 18:59:06 -08:00
Dave Jiang	66f11890d3	cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge() Use scope-based resource management __free() macro to drop the open coded put_device() in cxl_find_nvdimm_bridge(). Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170449247353.3779673.5963704495491343135.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-05 14:36:29 -08:00
Dave Jiang	98e7ab3345	cxl: Fix device reference leak in cxl_port_perf_data_calculate() cxl_port_perf_data_calculate() calls find_cxl_root() and does not dereference the 'struct device' in the cxl_root->port. find_cxl_root() calls get_device() and takes a reference on the port 'struct device' member. Use the __free() macro to ensure the dereference happens. Fixes: `7a4f148dd8` ("cxl: Compute the entire CXL path latency and bandwidth data") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170449246681.3779673.2288926019977963333.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-05 14:36:29 -08:00
Dave Jiang	44cd71ef7b	cxl: Convert find_cxl_root() to return a 'struct cxl_root *' Commit `790815902e` ("cxl: Add support for _DSM Function for retrieving QTG ID") introduced 'struct cxl_root', however all usages have been worked indirectly through cxl_port. Refactor code such as find_cxl_root() function to use 'struct cxl_root' directly. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170449246044.3779673.13035770941393418591.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-05 14:36:29 -08:00
Dave Jiang	98856b2ea3	cxl: Introduce put_cxl_root() helper Add a helper function put_cxl_root() to maintain symmetry for find_cxl_root() function instead of relying on open coding of the put_device() in order to dereference the 'struct device' that happens via get_device() in find_cxl_root(). Suggested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/170449245417.3779673.4566146351673989387.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-05 14:36:29 -08:00
Dan Williams	5459e186a5	cxl/port: Fix missing target list lock cxl_port_setup_targets() modifies the ->targets[] array of a switch decoder. target_list_show() expects to be able to emit a coherent snapshot of that array by "holding" ->target_lock for read. The target_lock is held for write during initialization of the ->targets[] array, but it is not held for write during cxl_port_setup_targets(). The ->target_lock() predates the introduction of @cxl_region_rwsem. That semaphore protects changes to host-physical-address (HPA) decode which is precisely what writes to a switch decoder's target list affects. Replace ->target_lock with @cxl_region_rwsem. Now the side-effect of snapshotting a unstable view of a decoder's target list is likely benign so the Fixes: tag is presumptive. Fixes: `27b3f8d138` ("cxl/region: Program target lists") Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-04 13:59:50 -08:00
Huang Ying	d6488fee66	cxl/port: Fix decoder initialization when nr_targets > interleave_ways The decoder_populate_targets() helper walks all of the targets in a port and makes sure they can be looked up in @target_map. Where @target_map is a lookup table from target position to target id (corresponding to a cxl_dport instance). However @target_map is only responsible for conveying the active dport instances as indicated by interleave_ways. When nr_targets > interleave_ways it results in decoder_populate_targets() walking off the end of the valid entries in @target_map. Given target_map is initialized to 0 it results in the dport lookup failing if position 0 is not mapped to a dport with an id of 0: cxl_port port3: Failed to populate active decoder targets cxl_port port3: Failed to add decoder cxl_port port3: Failed to add decoder3.0 cxl_bus_probe: cxl_port port3: probe: -6 This bug also highlights that when the decoder's ->targets[] array is written in cxl_port_setup_targets() it is missing a hold of the targets_lock to synchronize against sysfs readers of the target list. A fix for that is saved for a later patch. Fixes: `a5c2580216` ("cxl/bus: Populate the target list at decoder create") Cc: <stable@vger.kernel.org> Signed-off-by: Huang, Ying <ying.huang@intel.com> [djbw: rewrite the changelog, find the Fixes: tag] Co-developed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-04 13:59:50 -08:00
Jim Harris	c7ad3dc364	cxl/region: fix x9 interleave typo CXL supports x3, x6 and x12 - not x9. Fixes: `80d10a6cee` ("cxl/region: Add interleave geometry attributes") Signed-off-by: Jim Harris <jim.harris@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/169904271254.204936.8580772404462743630.stgit@ubuntu Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-03 17:55:20 -08:00
Ira Weiny	6d0fc416c4	cxl/trace: Pass UUID explicitly to event traces CXL CPER events are identified by the CPER Section Type GUID. The GUID correlates with the CXL UUID for the event record. It turns out that a CXL CPER record is a strict subset of the CXL event record, only the UUID header field is chopped. In order to unify handling between native and CPER flavors of CXL events, prepare the code for the UUID to be passed in rather than inferred from the record itself. Later patches update the passed in record to only refer to the common data between the formats. Pass the UUID explicitly to each trace event to be able to remove the UUID from the event structures. Originally it was desirable to remove the UUID from the well known event because the UUID value was redundant. However, the trace API was already in place.[1] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/all/36f2d12934d64a278f2c0313cbd01abc@huawei.com [1] Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-1-1bb8a4ca2c7a@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-03 14:15:52 -08:00
Dan Williams	11c8393202	Merge branch 'for-6.8/cxl-cdat' into for-6.8/cxl Pick up the CDAT parsing and QOS class infrastructure for v6.8.	2024-01-02 11:03:04 -08:00
Randy Dunlap	58f1e9d3a3	cxl/region: use %pap format to print resource_size_t Use "%pap" to print a resource_size_t (phys_addr_t derived type) to prevent build warnings on 32-bit arches (seen on i386 and riscv-32). ../drivers/cxl/core/region.c: In function 'alloc_hpa': ../drivers/cxl/core/region.c:556:25: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] 556 \| "HPA allocation error (%ld) for size:%#llx in %s %pr\n", Fixes: `7984d22f13` ("cxl/region: Add dev_dbg() detail on failure to allocate HPA space") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Fan Ni <fan.ni@samsung.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <linux-cxl@vger.kernel.org> Link: https://lore.kernel.org/r/20240102173917.19718-1-rdunlap@infradead.org Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2024-01-02 11:02:49 -08:00
Alison Schofield	7984d22f13	cxl/region: Add dev_dbg() detail on failure to allocate HPA space When the region driver fails while allocating HPA space for a new region it can be because the parent resource, the CXL Window, has no more available space. In that case, the debug user sees this message: cxl_core:alloc_hpa:555: cxl region2: failed to allocate HPA: -34 Expand the message like this: cxl_core:alloc_hpa:555: cxl region8: HPA allocation error (-34) for size:0x20000000 in CXL Window 0 [mem 0xf010000000-0xf04fffffff flags 0x200] Now the debug user can examine /proc/iomem and consider actions like removing other allocations in that space or reducing the size of their region request. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20231223004740.1401858-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-24 13:43:48 -08:00
Dave Jiang	185c1a489f	cxl: Check qos_class validity on memdev probe Add a check to make sure the qos_class for the device will match one of the root decoders qos_class. If no match is found, then the qos_class for the device is set to invalid. Also add a check to ensure that the device's host bridge matches to one of the root decoder's downstream targets. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319626313.2212653.9021004640856081917.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 15:46:02 -08:00
Dave Jiang	86557b7edf	cxl: Store QTG IDs and related info to the CXL memory device context Once the QTG ID _DSM is executed successfully, the QTG ID is retrieved from the return package. Create a list of entries in the cxl_memdev context and store the QTG ID as qos_class token and the associated DPA range. This information can be exposed to user space via sysfs in order to help region setup for hot-plugged CXL memory devices. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319625109.2212653.11872111896220384056.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 15:31:53 -08:00
Dave Jiang	7a4f148dd8	cxl: Compute the entire CXL path latency and bandwidth data CXL Memory Device SW Guide [1] rev1.0 2.11.2 provides instruction on how to calculate latency and bandwidth for CXL memory device. Calculate minimum bandwidth and total latency for the path from the CXL device to the root port. The QTG id is retrieved by providing the performance data as input and calling the root port callback ->get_qos_class(). The retrieved id is stored with the cxl_port of the CXL device. For example for a device that is directly attached to a host bus: Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link Latency + Generic Port Latency Min Bandwidth = Min bandwidth for link bandwidth between HB and CXL device, device CDAT bandwidth, and Generic Port Bandwidth For a device that has a switch in between host bus and CXL device: Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency + Switch (CDAT) Latency + Switch to HB Link Latency + Generic Port Latency Min Bandwidth = Min bandwidth for link bandwidth between CXL device to CXL switch, CXL device CDAT bandwidth, CXL switch CDAT bandwidth, CXL switch to HB bandwidth, and Generic Port Bandwidth. [1]: https://cdrdv2-public.intel.com/643805/643805_CXL%20Memory%20Device%20SW%20Guide_Rev1p0.pdf Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319624458.2212653.13252496567443656371.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 15:31:52 -08:00
Dave Jiang	14a6960b3e	cxl: Add helper function that calculate performance data for downstream ports The CDAT information from the switch, Switch Scoped Latency and Bandwidth Information Structure (SSLBIS), is parsed and stored under a cxl_dport based on the correlated downstream port id from the SSLBIS entry. Walk the entire CXL port paths and collect all the performance data. Also pick up the link latency number that's stored under the dports. The entire path PCIe bandwidth can be retrieved using the pcie_bandwidth_available() call. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319623824.2212653.10302079766473698427.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 15:31:52 -08:00
Dave Jiang	4d07a05397	cxl: Calculate and store PCI link latency for the downstream ports The latency is calculated by dividing the flit size over the bandwidth. Add support to retrieve the flit size for the CXL switch device and calculate the latency of the PCIe link. Cache the latency number with cxl_dport. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319621931.2212653.6800240203604822886.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 14:53:49 -08:00
Dave Jiang	790815902e	cxl: Add support for _DSM Function for retrieving QTG ID CXL spec v3.0 9.17.3 CXL Root Device Specific Methods (_DSM) Add support to retrieve QTG ID via ACPI _DSM call. The _DSM call requires an input of an ACPI package with 4 dwords (read latency, write latency, read bandwidth, write bandwidth). The call returns a package with 1 WORD that provides the max supported QTG ID and a package that may contain 0 or more WORDs as the recommended QTG IDs in the recommended order. Create a cxl_root container for the root cxl_port and provide a callback ->get_qos_class() in order to retrieve the QoS class. For the ACPI case, the _DSM helper is used to retrieve the QTG ID and returned. A devm_cxl_add_root() function is added for root port setup and registration of the cxl_root callback operation(s). Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/170319621294.2212653.1649682083061569256.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 14:33:28 -08:00
Dave Jiang	80aa780dda	cxl: Add callback to parse the SSLBIS subtable from CDAT Provide a callback to parse the Switched Scoped Latency and Bandwidth Information Structure (SSLBIS) in the CDAT structures. The SSLBIS contains the bandwidth and latency information that's tied to the CXL switch that the data table has been read from. The extracted values are stored to the cxl_dport correlated by the port_id depending on the SSLBIS entry. Coherent Device Attribute Table 1.03 2.1 Switched Scoped Latency and Bandwidth Information Structure (DSLBIS) Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319620635.2212653.5194389158785365150.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 14:33:28 -08:00
Dave Jiang	63cef81b9d	cxl: Add callback to parse the DSLBIS subtable from CDAT Provide a callback to parse the Device Scoped Latency and Bandwidth Information Structure (DSLBIS) in the CDAT structures. The DSLBIS contains the bandwidth and latency information that's tied to a DSMAS handle. The driver will retrieve the read and write latency and bandwidth associated with the DSMAS which is tied to a DPA range. Coherent Device Attribute Table 1.03 2.1 Device Scoped Latency and Bandwidth Information Structure (DSLBIS) Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319620005.2212653.7475488478229720542.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 14:33:28 -08:00
Dave Jiang	ad6f04c026	cxl: Add callback to parse the DSMAS subtables from CDAT Provide a callback function to the CDAT parser in order to parse the Device Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure contains the DPA range and its associated attributes in each entry. See the CDAT specification for details. The device handle and the DPA range is saved and to be associated with the DSLBIS locality data when the DSLBIS entries are parsed. The xarray is a local variable. When the total path performance data is calculated and storred this xarray can be discarded. Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity Structure (DSMAS) Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170319619355.2212653.2675953129671561293.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-22 14:33:10 -08:00
Dave Jiang	ace196de69	cxl: Fix unregister_region() callback parameter assignment In devm_cxl_add_region(), devm_add_action_or_reset() is called by passing in unregister_region() with data ptr of 'cxlr'. However, in unregister_region(), the passed in parameter is incorrectly assumed to be a 'struct device' rather than the 'cxlr' pointer. The code has been working because 'struct device' is the first member of 'struct cxl_region'. Issue found by inspection. Fix the assignment so that cxlr is pointing directly to the passed in parameter. Not flagged for -stable since there is no functional impact of this fix. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/170258123810.952211.3907381447996426480.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-18 15:35:00 -08:00
Ira Weiny	ef3d5cf9c5	cxl/pmu: Ensure put_device on pmu devices The following kmemleaks were detected when removing the cxl module stack: unreferenced object 0xffff88822616b800 (size 1024): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000448d1afc>] devm_cxl_pmu_add+0x3a/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882260abcc0 (size 16): ... hex dump (first 16 bytes): 70 6d 75 5f 6d 65 6d 30 2e 30 00 26 82 88 ff ff pmu_mem0.0.&.... backtrace: ... [<00000000152b5e98>] dev_set_name+0x43/0x50 [<00000000c228798b>] devm_cxl_pmu_add+0x102/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882272af200 (size 256): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000a14d1813>] device_add+0x4ea/0x890 [<00000000a3f07b47>] devm_cxl_pmu_add+0xbe/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... devm_cxl_pmu_add() correctly registers a device remove function but it only calls device_del() which is only part of device unregistration. Properly call device_unregister() to free up the memory associated with the device. Fixes: `1ad3f701c3` ("cxl/pci: Find and register CXL PMU devices") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231016-pmu-unregister-fix-v1-1-1e2eb2fa3c69@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-14 21:54:45 -08:00
Ira Weiny	c65efe3685	cxl/cdat: Free correct buffer on checksum error The new 6.7-rc1 kernel now checks the checksum on CDAT data. While using a branch of Fan's DCD qemu work (and specifying DCD devices), the following splat was observed. WARNING: CPU: 1 PID: 1384 at drivers/base/devres.c:1064 devm_kfree+0x4f/0x60 ... RIP: 0010:devm_kfree+0x4f/0x60 ... ? devm_kfree+0x4f/0x60 read_cdat_data+0x1a0/0x2a0 [cxl_core] cxl_port_probe+0xdf/0x200 [cxl_port] ... The issue in qemu is still unknown but the spat is a straight forward bug in the CDAT checksum processing code. Use a CDAT buffer variable to ensure the devm_free() works correctly on error. Fixes: `670e4e88f3` ("cxl: Add checksum verification to CDAT from CXL") Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: http://lore.kernel.org/r/20231116-fix-cdat-devm-free-v1-1-b148b40707d7@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-08 16:14:28 -08:00
Dan Williams	6f5c4eca48	cxl/hdm: Fix dpa translation locking The helper, cxl_dpa_resource_start(), snapshots the dpa-address of an endpoint-decoder after acquiring the cxl_dpa_rwsem. However, it is sufficient to assert that cxl_dpa_rwsem is held rather than acquire it in the helper. Otherwise, it triggers multiple lockdep reports: 1/ Tracing callbacks are in an atomic context that can not acquire sleeping locks: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1525 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1288, name: bash preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 [..] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023 Call Trace: <TASK> dump_stack_lvl+0x71/0x90 __might_resched+0x1b2/0x2c0 down_read+0x1a/0x190 cxl_dpa_resource_start+0x15/0x50 [cxl_core] cxl_trace_hpa+0x122/0x300 [cxl_core] trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core] 2/ The rwsem is already held in the inject poison path: WARNING: possible recursive locking detected 6.7.0-rc2+ #12 Tainted: G W OE N -------------------------------------------- bash/1288 is trying to acquire lock: ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_dpa_resource_start+0x15/0x50 [cxl_core] but task is already holding lock: ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_inject_poison+0x7d/0x1e0 [cxl_core] [..] Call Trace: <TASK> dump_stack_lvl+0x71/0x90 __might_resched+0x1b2/0x2c0 down_read+0x1a/0x190 cxl_dpa_resource_start+0x15/0x50 [cxl_core] cxl_trace_hpa+0x122/0x300 [cxl_core] trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core] __traceiter_cxl_poison+0x5c/0x80 [cxl_core] cxl_inject_poison+0x1bc/0x1e0 [cxl_core] This appears to have been an issue since the initial implementation and uncovered by the new cxl-poison.sh test [1]. That test is now passing with these changes. Fixes: `28a3ae4ff6` ("cxl/trace: Add an HPA to cxl_poison trace events") Link: http://lore.kernel.org/r/e4f2716646918135ddbadf4146e92abb659de734.1700615159.git.alison.schofield@intel.com [1] Cc: <stable@vger.kernel.org> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-07 19:14:04 -08:00
Davidlohr Bueso	cb46fca88d	cxl: Add Support for Get Timestamp Add the call to the UAPI such that userspace may corelate the timestamps from the device log with system wall time, if, for example there's any sort of inaccuracy or skew in the device. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230829152014.15452-1-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-12-07 12:44:19 -08:00
Alison Schofield	0e33ac9c3f	cxl/memdev: Hold region_rwsem during inject and clear poison ops Poison inject and clear are supported via debugfs where a privileged user can inject and clear poison to a device physical address. Commit `458ba8189c` ("cxl: Add cxl_decoders_committed() helper") added a lockdep assert that highlighted a gap in poison inject and clear functions where holding the dpa_rwsem does not assure that a a DPA is not added to a region. The impact for inject and clear is that if the DPA address being injected or cleared has been attached to a region, but not yet committed, the dev_dbg() message intended to alert the debug user that they are acting on a mapped address is not emitted. Also, the cxl_poison trace event that serves as a log of the inject and clear activity will not include region info. Close this gap by snapshotting an unchangeable region state during poison inject and clear operations. That means holding both the region_rwsem and the dpa_rwsem during the inject and clear ops. Fixes: `d2fbc48658` ("cxl/memdev: Add support for the Inject Poison mailbox command") Fixes: `9690b07748` ("cxl/memdev: Add support for the Clear Poison mailbox command") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/08721dc1df0a51e4e38fecd02425c3475912dfd5.1701041440.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-11-29 17:23:03 -08:00
Alison Schofield	5558b92e8d	cxl/core: Always hold region_rwsem while reading poison lists A read of a device poison list is triggered via a sysfs attribute and the results are logged as kernel trace events of type cxl_poison. The work is managed by either: a) the region driver when one of more regions map the device, or by b) the memdev driver when no regions map the device. In the case of a) the region driver holds the region_rwsem while reading the poison by committed endpoint decoder mappings and for any unmapped resources. This makes sure that the cxl_poison trace event trace reports valid region info. (Region name, HPA, and UUID). In the case of b) the memdev driver holds the dpa_rwsem preventing new DPA resources from being attached to a region. However, it leaves a gap between region attach and decoder commit actions. If a DPA in the gap is in the poison list, the cxl_poison trace event will omit the region info. Close the gap by holding the region_rwsem and the dpa_rwsem when reading poison per memdev. Since both methods now hold both locks, down_read both from the caller. Doing so also addresses the lockdep assert that found this issue: Commit `458ba8189c` ("cxl: Add cxl_decoders_committed() helper") Fixes: `f0832a5863` ("cxl/region: Provide region info to the cxl_poison trace event") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/08e8e7ec9a3413b91d51de39e385653494b1eed0.1701041440.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-11-29 17:03:53 -08:00
Dave Jiang	36a1c2ee50	cxl/hdm: Fix a benign lockdep splat The new helper "cxl_num_decoders_committed()" added a lockdep assertion to validate that port->commit_end is protected against modification. That assertion fires in init_hdm_decoder() where it is initializing port->commit_end. Given that it is both accessing and writing that property it obstensibly needs the lock. In practice, CXL decoder commit rules (must commit in order) and the in-order discovery of device decoders makes the manipulation of ->commit_end in init_hdm_decoder() safe. However, rather than rely on the subtle rules of CXL hardware, just make the implementation obviously correct from a software perspective. The Fixes: tag is only for cleaning up a lockdep splat, there is no functional issue addressed by this fix. Fixes: `458ba8189c` ("cxl: Add cxl_decoders_committed() helper") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170025232811.2147250.16376901801315194121.stgit@djiang5-mobl3 Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-11-22 16:34:30 -08:00
Terry Bowman	b3741ac86c	cxl/pci: Change CXL AER support check to use native AER Native CXL protocol errors are delivered to the OS through AER reporting. The owner of AER owns CXL Protocol error management with respect to _OSC negotiation.[1] CXL device errors are handled by a separate interrupt with native control gated by _OSC control field 'CXL Memory Error Reporting Control'. The CXL driver incorrectly checks for 'CXL Memory Error Reporting Control' before accessing AER registers and caching RCH downport AER registers. Replace the current check in these 2 cases with native AER checks. [1] CXL 3.0 - 9.17.2 CXL _OSC, Table-9-26, Interpretation of CXL _OSC Support Fields, p.641 Fixes: `f05fd10d13` ("cxl/pci: Add RCH downstream port AER register discovery") Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Link: https://lore.kernel.org/r/20231102155232.1421261-1-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-11-02 14:09:01 -07:00
Dan Williams	5d09c63f11	cxl/hdm: Remove broken error path Dan reports that cxl_decoder_commit() potentially leaks a hold of cxl_dpa_rwsem. The potential error case is a "should not" happen scenario, turn it into a "can not" happen scenario by adding the error check to cxl_port_setup_targets() where other setting validation occurs. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: http://lore.kernel.org/r/63295673-5d63-4919-b851-3b06d48734c0@moroto.mountain Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-31 14:10:04 -07:00
Dan Carpenter	69d56b15a7	cxl/hdm: Fix && vs \|\| bug If "info" is NULL then this code will crash. \|\| was intended instead of &&. Fixes: `8ce520fdea` ("cxl/hdm: Use stored Component Register mappings to map HDM decoder capability") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/60028378-d3d5-4d6d-90fd-f915f061e731@moroto.mountain Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-31 14:09:50 -07:00
Dan Williams	b3cfdbf6a0	Merge branch 'for-6.7/cxl-commited' into cxl/next Add the committed decoder sysfs attribute for v6.7.	2023-10-31 11:00:08 -07:00
Dan Williams	de5512b2a2	Merge branch 'for-6.7/cxl' into cxl/next Pickup some misc. CXL updates for v6.7.	2023-10-31 10:59:44 -07:00
Dan Williams	624eda92ab	Merge branch 'for-6.7/cxl-qtg' into cxl/next Merge some prep-work for CXL QOS class support. This cycle saw large collisions with mm on this topic, so the bulk of this topic needs to wait.	2023-10-31 10:59:26 -07:00
Dan Williams	7f946e6d83	Merge branch 'for-6.7/cxl-rch-eh' into cxl/next Restricted CXL Host (RCH) Error Handling undoes the topology munging of CXL 1.1 to enabled some AER recovery, and lands some base infrastructure for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include this long running series finally for v6.7.	2023-10-31 10:59:00 -07:00
Dave Jiang	8358e8f159	cxl: Add support for reading CXL switch CDAT table Add read_cdat_data() call in cxl_switch_port_probe() to allow reading of CDAT data for CXL switches. read_cdat_data() needs to be adjusted for the retrieving of the PCIe device depending on if the passed in port is endpoint or switch. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169713682855.2205276.6418370379144967443.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:48:02 -07:00
Dave Jiang	670e4e88f3	cxl: Add checksum verification to CDAT from CXL A CDAT table is available from a CXL device. The table is read by the driver and cached in software. With the CXL subsystem needing to parse the CDAT table, the checksum should be verified. Add checksum verification after the CDAT table is read from device. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169713682277.2205276.2687265961314933628.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:48:02 -07:00
Dave Jiang	529c0a4404	cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute Export the QoS Throttling Group ID from the CXL Fixed Memory Window Structure (CFMWS) under the root decoder sysfs attributes as qos_class. CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS) cxl cli will use this id to match with the _DSM retrieved id for a hot-plugged CXL memory device DPA memory range to make sure that the DPA range is under the right CFMWS window. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169713681699.2205276.14475306324720093079.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:48:02 -07:00
Dave Jiang	05e37b2138	cxl: Add decoders_committed sysfs attribute to cxl_port This attribute allows cxl-cli to determine whether there are decoders committed to a memdev. This is only a snapshot of the state, and doesn't offer any protection or serialization against a concurrent disable-region operation. Reviewed-by: Jim Harris <jim.harris@samsung.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169747907439.272156.10261062080830155662.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:29:41 -07:00
Dave Jiang	458ba8189c	cxl: Add cxl_decoders_committed() helper Add a helper to retrieve the number of decoders committed for the port. Replace all the open coding of the calculation with the helper. Link: https://lore.kernel.org/linux-cxl/651c98472dfed_ae7e729495@dwillia2-xfh.jf.intel.com.notmuch/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/169747906849.272156.1729290904857372335.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:29:41 -07:00
Robert Richter	e8db070160	cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm struct cxl_register_map carries a @dev parameter for devm operations. Simplify the function interface to use that instead of a separate @dev argument. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-21-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:39 -07:00
Robert Richter	d3970f006f	cxl/core/regs: Rename phys_addr in cxl_map_component_regs() Trivial change that renames variable phys_addr in cxl_map_component_regs() to shorten its length to keep the 80 char size limit for the line and also for consistency between the different paths. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-20-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:39 -07:00
Terry Bowman	d1a9def33d	cxl/pci: Disable root port interrupts in RCH mode The RCH root port contains root command AER registers that should not be enabled.[1] Disable these to prevent root port interrupts. [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-17-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:38 -07:00
Terry Bowman	6ac07883db	cxl/pci: Add RCH downstream port error logging RCH downstream port error logging is missing in the current CXL driver. The missing AER and RAS error logging is needed for communicating driver error details to userspace. Update the driver to include PCIe AER and CXL RAS error logging. Add RCH downstream port error handling into the existing RCiEP handler. The downstream port error handler is added to the RCiEP error handler because the downstream port is implemented in a RCRB, is not PCI enumerable, and as a result is not directly accessible to the PCI AER root port driver. The AER root port driver calls the RCiEP handler for handling RCD errors and RCH downstream port protocol errors. Update existing RCiEP correctable and uncorrectable handlers to also call the RCH handler. The RCH handler will read the RCH AER registers, check for error severity, and if an error exists will log using an existing kernel AER trace routine. The RCH handler will also log downstream port RAS errors if they exist. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-16-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:38 -07:00
Terry Bowman	6c5f3aacb2	cxl/pci: Map RCH downstream AER registers for logging protocol errors The restricted CXL host (RCH) error handler will log protocol errors using AER and RAS status registers. The AER and RAS registers need to be virtually memory mapped before enabling interrupts. Create the initializer function devm_cxl_setup_parent_dport() for this when the endpoint is connected with the dport. The initialization sets up the RCH RAS and AER mappings. Add 'struct cxl_regs' to 'struct cxl_dport' for saving a pointer to the RCH downstream port's AER and RAS registers. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-15-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:38 -07:00
Terry Bowman	bf6c9fa846	cxl/pci: Update CXL error logging to use RAS register address The CXL error handler currently only logs endpoint RAS status. The CXL topology includes several components providing RAS details to be logged during error handling.[1] Update the current handler's RAS logging to use a RAS register address. Also, update the error handler function names to be consistent with correctable and uncorrectable RAS. This will allow for adding support to log other CXL component's RAS details in the future. [1] CXL3.0 Table 8-22 CXL_Capability_ID Assignment Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-14-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:38 -07:00
Terry Bowman	6777877eb7	PCI/AER: Refactor cper_print_aer() for use by CXL driver module The CXL driver plans to use cper_print_aer() for logging restricted CXL host (RCH) AER errors. cper_print_aer() is not currently exported and therefore not usable by the CXL drivers built as loadable modules. Export the cper_print_aer() function. Use the EXPORT_SYMBOL_NS_GPL() variant to restrict the export to CXL drivers. The CONFIG_ACPI_APEI_PCIEAER kernel config is currently used to enable cper_print_aer(). cper_print_aer() logs the AER registers and is useful in PCIE AER logging outside of APEI. Remove the CONFIG_ACPI_APEI_PCIEAER dependency to enable cper_print_aer(). The cper_print_aer() function name implies CPER specific use but is useful in non-CPER cases as well. Rename cper_print_aer() to pci_print_aer(). Also, update cxl_core to import CXL namespace imports. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com> Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-13-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:38 -07:00
Robert Richter	f05fd10d13	cxl/pci: Add RCH downstream port AER register discovery Restricted CXL host (RCH) downstream port AER information is not currently logged while in the error state. One problem preventing the error logging is the AER and RAS registers are not accessible. The CXL driver requires changes to find RCH downstream port AER and RAS registers for purpose of error logging. RCH downstream ports are not enumerated during a PCI bus scan and are instead discovered using system firmware, ACPI in this case.[1] The downstream port is implemented as a Root Complex Register Block (RCRB). The RCRB is a 4k memory block containing PCIe registers based on the PCIe root port.[2] The RCRB includes AER extended capability registers used for reporting errors. Note, the RCH's AER Capability is located in the RCRB memory space instead of PCI configuration space, thus its register access is different. Existing kernel PCIe AER functions can not be used to manage the downstream port AER capabilities and RAS registers because the port was not enumerated during PCI scan and the registers are not PCI config accessible. Discover RCH downstream port AER extended capability registers. Use MMIO accesses to search for extended AER capability in RCRB register space. [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-12-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:38 -07:00
Robert Richter	a2fcb84a19	cxl/port: Remove Component Register base address from struct cxl_port The Component Register base address @component_reg_phys is no longer used after the rework of the Component Register setup which now uses struct member @reg_map instead. Remove the base address. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-10-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:37 -07:00
Robert Richter	8ce520fdea	cxl/hdm: Use stored Component Register mappings to map HDM decoder capability Now, that the Component Register mappings are stored, use them to enable and map the HDM decoder capabilities. The Component Registers do not need to be probed again for this, remove probing code. The HDM capability applies to Endpoints, USPs and VH Host Bridges. The Endpoint's component register mappings are located in the cxlds and else in the port's structure. Duplicate the cxlds->reg_map in port->reg_map for endpoint ports. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> [rework to drop cxl_port_get_comp_map()] Link: https://lore.kernel.org/r/20231018171713.1883517-8-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:37 -07:00
Robert Richter	2dd1827920	cxl/pci: Store the endpoint's Component Register mappings in struct cxl_dev_state Same as for ports and dports, also store the endpoint's Component Register mappings, use struct cxl_dev_state for that. Keep the Component Register base address @component_reg_phys a bit to not break functionality. It will be removed after the transition in a later patch. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-7-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:37 -07:00
Robert Richter	4d758764e7	cxl/port: Pre-initialize component register mappings The component registers of a component may not exist and cxl_setup_comp_regs() will fail for that reason. In another case, Software may not use and set those registers up. cxl_setup_comp_regs() is then called with a base address of CXL_RESOURCE_NONE. Both are valid cases, but the function returns without initializing the register map. Now, a missing component register block is not necessarily a reason to fail (feature is optional or its existence checked later). Change cxl_setup_comp_regs() to also use components with the component register block missing. Thus, always initialize struct cxl_register_map with valid values, set @dev and make @resource CXL_RESOURCE_NONE. The change is in preparation of follow-on patches. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-6-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:37 -07:00
Robert Richter	d8add49263	cxl/port: Rename @comp_map to @reg_map in struct cxl_register_map Name the field @reg_map, because @reg_map->host will be used for mapping operations beyond component registers (i.e. AER registers). This is valid for all occurrences of @comp_map. Change them all. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-5-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:37 -07:00
Dan Williams	33d9c987bf	cxl/port: Fix @host confusion in cxl_dport_setup_regs() commit `5d2ffbe4b8` ("cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport") ...moved the dport component registers from a raw component_reg_phys passed in at dport instantiation time to a 'struct cxl_register_map' populated with both the component register data and the "host" device for mapping operations. While typical CXL switch dports are mapped by their associated 'struct cxl_port', an RCH host bridge dport registered by cxl_acpi needs to wait until the cxl_mem driver makes the attachment to map the registers. This is because there are no intervening 'struct cxl_port' instances between the root cxl_port and the endpoint port in an RCH topology. For now just mark the host as NULL in the RCH dport case until code that needs to map the dport registers arrives. This patch is not flagged for -stable since nothing in the current driver uses the dport->comp_map. Now, I am slightly uneasy that cxl_setup_comp_regs() sets map->host to a wrong value and then cxl_dport_setup_regs() fixes it up, but the alternatives I came up with are more messy. For example, adding an @logdev to 'struct cxl_register_map' that the dev_printk()s can fall back to when @host is NULL. I settled on "post-fixup+comment" since it is only RCH dports that have this special case where register probing is split between a host-bridge RCRB lookup and when cxl_mem_probe() does the association of the cxl_memdev and endpoint port. [moved rename of @comp_map to @reg_map into next patch] Fixes: `5d2ffbe4b8` ("cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport") Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-4-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:36 -07:00
Robert Richter	dd22581f89	cxl/core/regs: Rename @dev to @host in struct cxl_register_map The primary role of @dev is to host the mappings for devm operations. @dev is too ambiguous as a name. I.e. when does @dev refer to the 'struct device ' instance that the registers belong, and when does @dev refer to the 'struct device ' instance hosting the mapping for devm operations? Clarify the role of @dev in cxl_register_map by renaming it to @host. Also, rename local variables to 'host' where map->host is used. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20231018171713.1883517-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:36 -07:00
Dan Williams	8d2ad999ca	cxl/port: Fix delete_endpoint() vs parent unregistration race The CXL subsystem, at cxl_mem ->probe() time, establishes a lineage of ports (struct cxl_port objects) between an endpoint and the root of a CXL topology. Each port including the endpoint port is attached to the cxl_port driver. Given that setup, it follows that when either any port in that lineage goes through a cxl_port ->remove() event, or the memdev goes through a cxl_mem ->remove() event. The hierarchy below the removed port, or the entire hierarchy if the memdev is removed needs to come down. The delete_endpoint() callback is careful to check whether it is being called to tear down the hierarchy, or if it is only being called to teardown the memdev because an ancestor port is going through ->remove(). That care needs to take the device_lock() of the endpoint's parent. Which requires 2 bugs to be fixed: 1/ A reference on the parent is needed to prevent use-after-free scenarios like this signature: BUG: spinlock bad magic on CPU#0, kworker/u56:0/11 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023 Workqueue: cxl_port detach_memdev [cxl_core] RIP: 0010:spin_bug+0x65/0xa0 Call Trace: do_raw_spin_lock+0x69/0xa0 __mutex_lock+0x695/0xb80 delete_endpoint+0xad/0x150 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 detach_memdev+0x15/0x20 [cxl_core] process_one_work+0x1e3/0x4c0 worker_thread+0x1dd/0x3d0 2/ In the case of RCH topologies, the parent device that needs to be locked is not always @port->dev as returned by cxl_mem_find_port(), use endpoint->dev.parent instead. Fixes: `8dd2bc0f8e` ("cxl/mem: Add the cxl_mem driver") Cc: <stable@vger.kernel.org> Reported-by: Robert Richter <rrichter@amd.com> Closes: http://lore.kernel.org/r/20231018171713.1883517-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 20:13:23 -07:00
Jim Harris	98a04c7ace	cxl/region: Fix x1 root-decoder granularity calculations Root decoder granularity must match value from CFWMS, which may not be the region's granularity for non-interleaved root decoders. So when calculating granularities for host bridge decoders, use the region's granularity instead of the root decoder's granularity to ensure the correct granularities are set for the host bridge decoders and any downstream switch decoders. Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch. Region created with 2048 granularity using following command line: cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \ -g 2048 -s 2048M Use "cxl list -PDE \| grep granularity" to get a view of the granularity set at each level of the topology. Before this patch: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":256, After: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, Fixes: `27b3f8d138` ("cxl/region: Program target lists") Cc: <stable@vger.kernel.org> Signed-off-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in [djbw: fixup the prebuilt cxl_test region] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 13:04:52 -07:00
Li Zhijian	3531b27f1f	cxl/region: Fix cxl_region_rwsem lock held when returning to user space Fix a missed "goto out" to unlock on error to cleanup this splat: WARNING: lock held when returning to user space! 6.6.0-rc3-lizhijian+ #213 Not tainted ------------------------------------------------ cxl/673 is leaving the kernel with locks still held! 1 lock held by cxl/673: #0: ffffffffa013b9d0 (cxl_region_rwsem){++++}-{3:3}, at: commit_store+0x7d/0x3e0 [cxl_core] In terms of user visible impact of this bug for backports: cxl_region_invalidate_memregion() on x86 invokes wbinvd which is a problematic instruction for virtualized environments. So, on virtualized x86, cxl_region_invalidate_memregion() returns an error. This failure case got missed because CXL memory-expander device passthrough is not a production use case, and emulation of CXL devices is typically limited to kernel development builds with CONFIG_CXL_REGION_INVALIDATION_TEST=y, that makes cxl_region_invalidate_memregion() succeed. In other words, the expected exposure of this bug is limited to CXL subsystem development environments using QEMU that neglected CONFIG_CXL_REGION_INVALIDATION_TEST=y. Fixes: `d1257d098a` ("cxl/region: Move cache invalidation before region teardown, and before setup") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231025085450.2514906-1-lizhijian@fujitsu.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 13:04:51 -07:00
Alison Schofield	0cf36a85c1	cxl/region: Use cxl_calc_interleave_pos() for auto-discovery For auto-discovered regions the driver must assign each target to a valid position in the region interleave set based on the decoder topology. The current implementation fails to parse valid decode topologies, as it does not consider the child offset into a parent port. The sort put all targets of one port ahead of another port when an interleave was expected, causing the region assembly to fail. Replace the existing relative sort with cxl_calc_interleave_pos() that finds the exact position in a region interleave for an endpoint based on a walk up the ancestral tree from endpoint to root decoder. cxl_calc_interleave_pos() was introduced in a prior patch, so the work here is to use it in cxl_region_sort_targets(). Remove the obsoleted helper functions from the prior sort. Testing passes on pre-production hardware with BIOS defined regions that natively trigger this autodiscovery path of the region driver. Testing passes a CXL unit test using the dev_dbg() calculation test (see cxl_region_attach()) across an expanded set of region configs: 1, 1, 1+1, 1+1+1, 2, 2+2, 2+2+2, 2+2+2+2, 4, 4+4, where each number represents the count of endpoints per host bridge. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reported-by: Dmytro Adamenko <dmytro.adamenko@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/3946cc55ddc19678733eddc9de2c317749f43f3b.1698263080.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 13:04:51 -07:00
Alison Schofield	a3e00c964f	cxl/region: Calculate a target position in a region interleave Introduce a calculation to find a target's position in a region interleave. Perform a self-test of the calculation on user-defined regions. The region driver uses the kernel sort() function to put region targets in relative order. Positions are assigned based on each target's index in that sorted list. That relative sort doesn't consider the offset of a port into its parent port which causes some auto-discovered regions to fail creation. In one failure case, a 2 + 2 config (2 host bridges each with 2 endpoints), the sort puts all the targets of one port ahead of another port when they were expected to be interleaved. In preparation for repairing the autodiscovery region assembly, introduce a new method for discovering a target position in the region interleave. cxl_calc_interleave_pos() adds a method to find the target position by ascending from an endpoint to a root decoder. The calculation starts with the endpoint's local position and position in the parent port. It traverses towards the root decoder and examines both position and ways in order to allow the position to be refined all the way to the root decoder. This calculation: position = position * parent_ways + parent_pos; applied iteratively yields the correct position. Include a self-test that exercises this new position calculation against every successfully configured user-defined region. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/0ac32c75cf81dd8b86bf07d70ff139d33c2300bc.1698263080.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-27 13:04:48 -07:00
Alison Schofield	1110581412	cxl/region: Prepare the decoder match range helper for reuse match_decoder_by_range() and decoder_match_range() both determine if an HPA range matches a decoder. The first does it for root decoders and the second one operates on switch decoders. Tidy these up with clear naming and make the switch helper more like the root decoder helper in style and functionality. Make it take the actual range, rather than an endpoint decoder from which it extracts the range. Require an exact match on switch decoders, because unlike a root decoder that maps an entire region, Linux only supports 1:1 mapping of switch to endpoint decoders. Note that root-decoders are a super-set of switch-decoders and the range they cover is a super-set of a region, hence the use of range_contains() for that case. Aside from aesthetics and maintainability, this is in preparation for reuse. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/011b1f498e1758bb8df17c5951be00bd8d489e3b.1698263080.git.alison.schofield@intel.com [djbw: fixup root decoder vs switch decoder range checks] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-26 08:46:54 -07:00
Alison Schofield	9214c9d56c	cxl/mbox: Remove useless cast in cxl_mem_create_range_info() DEFINE_RES_MEM() is a wrapper around the DEFINE_RES_NAMED() macro which already has the (struct resource) for the compound literal. The user of the macro should not repeat the cast. Cleans up these sparse warnings: drivers/cxl/core/mbox.c:1184:18: warning: cast to non-scalar drivers/cxl/core/mbox.c:1184:18: warning: cast from non-scalar Fixes: `52c4d11f1d` ("resource: Convert DEFINE_RES_NAMED() to be compound literal") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230815172052.22514-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-24 16:04:10 -07:00
Jim Harris	0718588c7a	cxl/region: Do not try to cleanup after cxl_region_setup_targets() fails Commit `5e42bcbc3f` ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") tried to avoid 'eiw' initialization errors when ->nr_targets exceeded 16, by just decrementing ->nr_targets when cxl_region_setup_targets() failed. Commit `86987c7662` ("cxl/region: Cleanup target list on attach error") extended that cleanup to also clear cxled->pos and p->targets[pos]. The initialization error was incidentally fixed separately by: Commit `8d42854257` ("cxl/region: Fix port setup uninitialized variable warnings") which was merged a few days after `5e42bcbc3f`. But now the original cleanup when cxl_region_setup_targets() fails prevents endpoint and switch decoder resources from being reused: 1) the cleanup does not set the decoder's region to NULL, which results in future dpa_size_store() calls returning -EBUSY 2) the decoder is not properly freed, which results in future commit errors associated with the upstream switch Now that the initialization errors were fixed separately, the proper cleanup for this case is to just return immediately. Then the resources associated with this target get cleanup up as normal when the failed region is deleted. The ->nr_targets decrement in the error case also helped prevent a p->targets[] array overflow, so add a new check to prevent against that overflow. Tested by trying to create an invalid region for a 2 switch * 2 endpoint topology, and then following up with creating a valid region. Fixes: `5e42bcbc3f` ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") Cc: <stable@vger.kernel.org> Signed-off-by: Jim Harris <jim.harris@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-24 14:43:37 -07:00
Dan Williams	88d3917f82	cxl/mem: Fix shutdown order Ira reports that removing cxl_mock_mem causes a crash with the following trace: BUG: kernel NULL pointer dereference, address: 0000000000000044 [..] RIP: 0010:cxl_region_decode_reset+0x7f/0x180 [cxl_core] [..] Call Trace: <TASK> cxl_region_detach+0xe8/0x210 [cxl_core] cxl_decoder_kill_region+0x27/0x40 [cxl_core] cxld_unregister+0x29/0x40 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 bus_remove_device+0xd7/0x150 device_del+0x155/0x3e0 device_unregister+0x13/0x60 devm_release_action+0x4d/0x90 ? __pfx_unregister_port+0x10/0x10 [cxl_core] delete_endpoint+0x121/0x130 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 bus_remove_device+0xd7/0x150 device_del+0x155/0x3e0 ? lock_release+0x142/0x290 cdev_device_del+0x15/0x50 cxl_memdev_unregister+0x54/0x70 [cxl_core] This crash is due to the clearing out the cxl_memdev's driver context (@cxlds) before the subsystem is done with it. This is ultimately due to the region(s), that this memdev is a member, being torn down and expecting to be able to de-reference @cxlds, like here: static int cxl_region_decode_reset(struct cxl_region *cxlr, int count) ... if (cxlds->rcd) goto endpoint_reset; ... Fix it by keeping the driver context valid until memdev-device unregistration, and subsequently the entire stack of related dependencies, unwinds. Fixes: `9cc238c7a5` ("cxl/pci: Introduce cdevm_file_operations") Reported-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-09 11:35:45 -07:00
Dan Williams	3398183808	cxl/memdev: Fix sanitize vs decoder setup locking The sanitize operation is destructive and the expectation is that the device is unmapped while in progress. The current implementation does a lockless check for decoders being active, but then does nothing to prevent decoders from racing to be committed. Introduce state tracking to resolve this race. This incidentally cleans up unpriveleged userspace from triggering mmio read cycles by spinning on reading the 'security/state' attribute. Which at a minimum is a waste since the kernel state machine can cache the completion result. Lastly cxl_mem_sanitize() was mistakenly marked EXPORT_SYMBOL() in the original implementation, but an export was never required. Fixes: `0c36b6ad43` ("cxl/mbox: Add sanitization handling machinery") Cc: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-06 00:12:45 -07:00
Dan Williams	5f2da19714	cxl/pci: Fix sanitize notifier setup Fix a race condition between the mailbox-background command interrupt firing and the security-state sysfs attribute being removed. The race is difficult to see due to the awkward placement of the sanitize-notifier setup code and the multiple places the teardown calls are made, cxl_memdev_security_init() and cxl_memdev_security_shutdown(). Unify setup in one place, cxl_sanitize_setup_notifier(). Arrange for the paired cxl_sanitize_teardown_notifier() to safely quiet the notifier and let the cxl_memdev + irq be unregistered later in the flow. Note: The special wrinkle of the sanitize notifier is that it interacts with interrupts, which are enabled early in the flow, and it interacts with memdev sysfs which is not initialized until late in the flow. Hence why this setup routine takes an @cxlmd argument, and not just @mds. This fix is also needed as a preparation fix for a memdev unregistration crash. Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Closes: http://lore.kernel.org/r/20230929100316.00004546@Huawei.com Cc: Dave Jiang <dave.jiang@intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Fixes: `0c36b6ad43` ("cxl/mbox: Add sanitization handling machinery") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-06 00:12:45 -07:00
Dan Williams	f29a824b0b	cxl/pci: Clarify devm host for memdev relative setup It is all too easy to get confused about @dev usage in the CXL driver stack. Before adding a new cxl_pci_probe() setup operation that has a devm lifetime dependent on @cxlds->dev binding, but also references @cxlmd->dev, and prints messages, rework the devm_cxl_add_memdev() and cxl_memdev_setup_fw_upload() function signatures to make this distinction explicit. I.e. pass in the devm context as an @host argument rather than infer it from other objects. This is in preparation for adding a devm_cxl_sanitize_setup_notifier(). Note the whitespace fixup near the change of the devm_cxl_add_memdev() signature. That uncaught typo originated in the patch that added cxl_memdev_security_init(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-06 00:12:44 -07:00
Dan Williams	2627c995c1	cxl/pci: Remove inconsistent usage of dev_err_probe() If dev_err_probe() is to be used it should at least be used consistently within the same function. It is also worth questioning whether every potential -ENOMEM needs an explicit error message. Remove the cxl_setup_fw_upload() error prints for what are rare / hardware-independent failures. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-10-06 00:12:44 -07:00
Dan Williams	e30a106558	cxl/pci: Cleanup 'sanitize' to always poll In preparation for fixing the init/teardown of the 'sanitize' workqueue and sysfs notification mechanism, arrange for cxl_mbox_sanitize_work() to be the single location where the sysfs attribute is notified. With that change there is no distinction between polled mode and interrupt mode. All the interrupt does is accelerate the polling interval. The change to check for "mds->security.sanitize_node" under the lock is there to ensure that the interrupt, the work routine and the setup/teardown code can all have a consistent view of the registered notifier and the workqueue state. I.e. the expectation is that the interrupt is live past the point that the sanitize sysfs attribute is published, and it may race teardown, so it must be consulted under a lock. Given that new locking requirement, cxl_pci_mbox_irq() is moved from hard to thread irq context. Lastly, some opportunistic replacements of "queue_delayed_work(system_wq, ...)", which is just open coded schedule_delayed_work(), are included. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-09-29 15:43:34 -07:00
Dan Williams	a76b62518e	cxl/port: Fix cxl_test register enumeration regression The cxl_test unit test environment models a CXL topology for sysfs/user-ABI regression testing. It uses interface mocking via the "--wrap=" linker option to redirect cxl_core routines that parse hardware registers with versions that just publish objects, like devm_cxl_enumerate_decoders(). Starting with: Commit `19ab69a60e` ("cxl/port: Store the port's Component Register mappings in struct cxl_port") ...port register enumeration is moved into devm_cxl_add_port(). This conflicts with the "cxl_test avoids emulating registers stance" so either the port code needs to be refactored (too violent), or modified so that register enumeration is skipped on "fake" cxl_test ports (annoying, but straightforward). This conflict has happened previously and the "check for platform device" workaround to avoid instrusive refactoring was deployed in those scenarios. In general, refactoring should only benefit production code, test code needs to remain minimally instrusive to the greatest extent possible. This was missed previously because it may sometimes just cause warning messages to be emitted, but it can also cause test failures. The backport to -stable is only nice to have for clean cxl_test runs. Fixes: `19ab69a60e` ("cxl/port: Store the port's Component Register mappings in struct cxl_port") Cc: stable@vger.kernel.org Reported-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169476525052.1013896.6235102957693675187.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-09-22 14:28:42 -07:00
Dan Williams	7914992b37	cxl/port: Quiet warning messages from the cxl_test environment The cxl_test platform device CXL port hierarchy is useful for testing, but throws warning messages of the form: cxl_mem mem2: at cxl_root_port.1 no parent for dport: platform cxl_mem mem3: at cxl_root_port.2 no parent for dport: platform cxl_mem mem4: at cxl_root_port.3 no parent for dport: platform cxl_mem mem5: at cxl_root_port.0 no parent for dport: platform cxl_mem mem6: at cxl_root_port.1 no parent for dport: platform cxl_mem mem7: at cxl_root_port.2 no parent for dport: platform cxl_mem mem8: at cxl_root_port.3 no parent for dport: platform cxl_mem mem9: at cxl_root_port.4 no parent for dport: platform cxl_mem mem10: at cxl_root_port.4 no parent for dport: platform ...and this message when running testing in QEMU: cxl_region region4: Bypassing cpu_cache_invalidate_memregion() for testing! Noisy cxl_test warnings have caused other regressions to be missed. In the interest of using cxl_test for early detection of dev_err() and dev_warn() messages, silence platform device topology and cache-invalidation messages. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-09-15 21:45:55 -07:00
Alison Schofield	18f35dc931	cxl/region: Refactor granularity select in cxl_port_setup_targets() In cxl_port_setup_targets() the region driver validates the configuration of auto-discovered region decoders, as well as decoders the driver is preparing to program. The existing calculations use the encoded interleave granularity value to create an interleave granularity that properly fans out when routing an x1 interleave to a greater than x1 interleave. That all worked well, until this config came along: Host Bridge: 2 way at 256 granularity Switch Decoder_A: 1 way at 512 Endpoint_X: 2 way at 256 Switch Decoder_B: 1 way at 512 Endpoint_Y: 2 way at 256 When the Host Bridge interleave is greater than 1 and the root decoder interleave is exactly 1, the region driver needs to consider the number of targets in the region when calculating the expected granularity. While examining the existing logic, and trying to cover the case above, a couple of simplifications appeared, hence this proposed refactoring. The first simplification is to apply the logic to the nominal values and use the existing helper function granularity_to_eig() to translate the desired granularity to the encoded form. This means the comment and code regarding setting address bits is discarded. Although that logic is not wrong, it adds a level of complexity that is not required in the granularity selection. The eig and eiw are indeed part of the routing instructions programmed into the decoders. Up-level the discussion to nominal ways and granularity for clearer analysis. The second simplification reduces the logic to a single granularity calculation that works for all cases. The new calculation doesn't care if parent_iw => 1 because parent_iw is used as a multiplier. The refactor cleans up a useless assignment of eiw made after the iw is already calculated. Regression testing included an examination of all of the ways and granularity selections made during a run of the cxl_test unit tests. There were no differences in selections before and after this patch. Fixes: ("27b3f8d13830 cxl/region: Program target lists") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230822180928.117596-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-09-14 20:47:42 -07:00
Alison Schofield	9e4edf1a21	cxl/region: Match auto-discovered region decoders by HPA range Currently, when the region driver attaches a region to a port, it selects the ports next available decoder to program. With the addition of auto-discovered regions, a port decoder has already been programmed so grabbing the next available decoder can be a mismatch when there is more than one region using the port. The failure appears like this with CXL DEBUG enabled: [] cxl_core:alloc_region_ref:754: cxl region0: endpoint9: HPA order violation region0:[mem 0x14780000000-0x1478fffffff flags 0x200] vs [mem 0x880000000-0x185fffffff flags 0x200] [] cxl_core:cxl_port_attach_region:972: cxl region0: endpoint9: failed to allocate region reference When CXL DEBUG is not enabled, there is no failure message. The region just never materializes. Users can suspect this issue if they know their firmware has programmed decoders so that more than one region is using a port. Note that the problem may appear intermittently, ie not on every reboot. Add a matching method for auto-discovered regions that finds a decoder based on an HPA range. The decoder range must exactly match the region resource parameter. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230905211007.256385-1-alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-09-14 20:46:12 -07:00
Ira Weiny	d2f7060588	cxl/mbox: Fix CEL logic for poison and security commands The following debug output was observed while testing CXL cxl_core:cxl_walk_cel:721: cxl_mock_mem cxl_mem.0: Opcode 0x4300 unsupported by driver opcode 0x4300 (Get Poison) is supported by the driver and the mock device supports it. The logic should be checking that the opcode is both not poison and not security. Fix the logic to allow poison and security commands. Fixes: `ad64f5952c` ("cxl/memdev: Only show sanitize sysfs files when supported") Cc: <stable@vger.kernel.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230903-cxl-cel-fix-v1-1-e260c9467be3@intel.com [cleanup cxl_walk_cel() to centralized "enabled" checks] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-09-14 13:48:49 -07:00
Davidlohr Bueso	ad64f5952c	cxl/memdev: Only show sanitize sysfs files when supported If the device does not support Sanitize or Secure Erase commands, hide the respective sysfs interfaces such that the operation can never be attempted. In order to be generic, keep track of the enabled security commands found in the CEL - the driver does not support Security Passthrough. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230726051940.3570-4-dave@stgolabs.net Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>	2023-07-28 13:16:54 -06:00
Yang Li	fe77cc2e5a	cxl: Fix one kernel-doc comment Fix a merge error that updated the argument to cxl_mem_get_fw_info() but not the kernel-doc. drivers/cxl/core/memdev.c:678: warning: Function parameter or member 'mds' not described in 'cxl_mem_get_fw_info' drivers/cxl/core/memdev.c:678: warning: Excess function parameter 'cxlds' description in 'cxl_mem_get_fw_info' Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230629021118.102744-1-yang.lee@linux.alibaba.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-29 16:03:58 -07:00
Dan Williams	0c0df63177	Merge branch 'for-6.5/cxl-rch-eh' into for-6.5/cxl Pick up the first half of the RCH error handling series. The back half needs some fixups for test regressions. Small conflicts with the PMU work around register enumeration and setup helpers.	2023-06-25 18:56:13 -07:00
Dan Williams	d2f9fe6953	Merge branch 'for-6.5/cxl-perf' into for-6.5/cxl Pick up initial support for the CXL 3.0 performance monitoring definition. Small conflicts with the firmware update work as they both placed their init code in the same location.	2023-06-25 17:53:18 -07:00
Dan Williams	e2c18eb50c	Merge branch 'for-6.5/cxl-region-fixes' into for-6.5/cxl Pick up the recent fixes to how CPU caches are managed relative to region setup / teardown, and make sure that all decoders transition successfully before updating the region state from COMMIT => ACTIVE.	2023-06-25 17:23:50 -07:00
Dan Williams	aeaefabc59	Merge branch 'for-6.5/cxl-type-2' into for-6.5/cxl Pick up the driver cleanups identified in preparation for CXL "type-2" (accelerator) device support. The major change here from a conflict generation perspective is the split of 'struct cxl_memdev_state' from the core 'struct cxl_dev_state'. Since an accelerator may not care about all the optional features that are standard on a CXL "type-3" (host-only memory expander) device. A silent conflict also occurs with the move of the endpoint port to be a formal property of a 'struct cxl_memdev' rather than drvdata.	2023-06-25 17:16:51 -07:00
Dan Williams	867eab655d	Merge branch 'for-6.5/cxl-fwupd' into for-6.5/cxl Add the first typical (non-sanitization) consumer of the new background command infrastructure, firmware update. Given both firmware-update and sanitization were developed in parallel from the common background-command baseline, resolve some minor context conflicts.	2023-06-25 16:12:26 -07:00
Dan Williams	dcfb70610d	Merge branch 'for-6.5/cxl-background' into for-6.5/cxl Pick up the sanitization work and the infrastructure for other background commands for 6.5. Sanitization has a different completion path than typical background commands so it was important to have both thought out and implemented before either went upstream.	2023-06-25 16:01:45 -07:00
Vishal Verma	9521875bbe	cxl: add a firmware update mechanism using the sysfs firmware loader The sysfs based firmware loader mechanism was created to easily allow userspace to upload firmware images to FPGA cards. This also happens to be pretty suitable to create a user-initiated but kernel-controlled firmware update mechanism for CXL devices, using the CXL specified mailbox commands. Since firmware update commands can be long-running, and can be processed in the background by the endpoint device, it is desirable to have the ability to chunk the firmware transfer down to smaller pieces, so that one operation does not monopolize the mailbox, locking out any other long running background commands entirely - e.g. security commands like 'sanitize' or poison scanning operations. The firmware loader mechanism allows a natural way to perform this chunking, as after each mailbox command, that is restricted to the maximum mailbox payload size, the cxl memdev driver relinquishes control back to the fw_loader system and awaits the next chunk of data to transfer. This opens opportunities for other background commands to access the mailbox and send their own slices of background commands. Add the necessary helpers and state tracking to be able to perform the 'Get FW Info', 'Transfer FW', and 'Activate FW' mailbox commands as described in the CXL spec. Wire these up to the firmware loader callbacks, and register with that system to create the memX/firmware/ sysfs ABI. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-1-c6265bd7343b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 15:58:40 -07:00
Davidlohr Bueso	180ffd338c	cxl/mem: Support Secure Erase Implement support for the non-pmem exclusive secure erase, per CXL specs. Create a write-only 'security/erase' sysfs file to perform the requested operation. As with the sanitation this requires the device being offline and thus no active HPA-DPA decoding. The expectation is that userspace can use it such as: cxl disable-memdev memX echo 1 > /sys/bus/cxl/devices/memX/security/erase cxl enable-memdev memX Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230612181038.14421-7-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 15:34:41 -07:00
Davidlohr Bueso	48dcdbb16e	cxl/mem: Wire up Sanitization support Implement support for CXL 3.0 8.2.9.8.5.1 Sanitize. This is done by adding a security/sanitize' memdev sysfs file to trigger the operation and extend the status file to make it poll(2)-capable for completion. Unlike all other background commands, this is the only operation that is special and monopolizes the device for long periods of time. In addition to the traditional pmem security requirements, all regions must also be offline in order to perform the operation. This permits avoiding explicit global CPU cache management, relying instead on the implict cache management when a region transitions between CXL_CONFIG_ACTIVE and CXL_CONFIG_COMMIT. The expectation is that userspace can use it such as: cxl disable-memdev memX echo 1 > /sys/bus/cxl/devices/memX/security/sanitize cxl wait-sanitize memX cxl enable-memdev memX Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230612181038.14421-5-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 15:21:16 -07:00
Davidlohr Bueso	0c36b6ad43	cxl/mbox: Add sanitization handling machinery Sanitization is by definition a device-monopolizing operation, and thus the timeslicing rules for other background commands do not apply. As such handle this special case asynchronously and return immediately. Subsequent changes will allow completion to be pollable from userspace via a sysfs file interface. For devices that don't support interrupts for notifying background command completion, self-poll with the caveat that the poller can be out of sync with the ready hardware, and therefore care must be taken to not allow any new commands to go through until the poller sees the hw completion. The poller takes the mbox_mutex to stabilize the flagging, minimizing any runtime overhead in the send path to check for 'sanitize_tmo' for uncommon poll scenarios. The irq case is much simpler as hardware will serialize/error appropriately. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230612181038.14421-4-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:54:51 -07:00
Davidlohr Bueso	9968c9dd56	cxl/mem: Introduce security state sysfs file Add a read-only sysfs file to display the security state of a device (currently only pmem): /sys/bus/cxl/devices/memX/security/state This introduces a cxl_security_state structure that is to be the placeholder for common CXL security features. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230612181038.14421-3-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:54:51 -07:00
Dan Williams	8f0220af58	Revert "cxl/port: Enable the HDM decoder capability for switch ports" commit `eb0764b822` ("cxl/port: Enable the HDM decoder capability for switch ports") ...was added on the observation of CXL memory not being accessible after setting up a region on a "cold-plugged" device. A "cold-plugged" CXL device is one that was not present at boot, so platform-firmware/BIOS has no chance to set it up. While it is true that the debug found the enable bit clear in the host-bridge's instance of the global control register (CXL 3.0 8.2.4.19.2 CXL HDM Decoder Global Control Register), that bit is described as: "This bit is only applicable to CXL.mem devices and shall return 0 on CXL Host Bridges and Upstream Switch Ports." So it is meant to be zero, and further testing confirmed that this "fix" had no effect on the failure. Revert it, and be more vigilant about proposed fixes in the future. Since the original copied stable@, flag this revert for stable@ as well. Cc: <stable@vger.kernel.org> Fixes: `eb0764b822` ("cxl/port: Enable the HDM decoder capability for switch ports") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168685882012.3475336.16733084892658264991.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:32:18 -07:00
Dan Williams	516b300c4c	cxl/memdev: Formalize endpoint port linkage Move the endpoint port that the cxl_mem driver establishes from drvdata to a first class attribute. This is in preparation for device-memory drivers reusing the CXL core for memory region management. Those drivers need a type-safe method to retrieve their CXL port linkage. Leave drvdata for private usage of the cxl_mem driver not external consumers of a 'struct cxl_memdev' object. Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679264292.3436160.3901392135863405807.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:33 -07:00
Dan Williams	8c897b366c	cxl/region: Manage decoder target_type at decoder-attach time Switch-level (mid-level) decoders between the platform root and an endpoint can dynamically switch modes between HDM-H and HDM-D[B] depending on which region they target. Use the region type to fixup each decoder that gets allocated to map the given region. Note that endpoint decoders are meant to determine the region type, so warn if those ever need to be fixed up, but since it is possible to continue do so. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679262543.3436160.13053831955768440312.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:09 -07:00
Dan Williams	cecbb5da92	cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM In preparation for device-memory region creation, arrange for decoders of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their target type. Revisit this if a device ever shows up that wants to offer mixed HDM-H (Host-Only Memory) and HDM-DB support, or an CXL_DEVTYPE_DEVMEM device that supports HDM-H. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261945.3436160.11673393474107374595.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	5aa39a9165	cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTONLYMEM, DEVMEM} In preparation for support for HDM-D and HDM-DB configuration (device-memory, and device-memory with back-invalidate). Rename the current type designators to use HOSTONLYMEM and DEVMEM as a suffix. HDM-DB can be supported by devices that are not accelerators, so DEVMEM is a more generic term for that case. Fixup one location where this type value was open coded. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261369.3436160.7042443847605280593.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	f6b8ab32e3	cxl/memdev: Make mailbox functionality optional In support of the Linux CXL core scaling for a wider set of CXL devices, allow for the creation of memdevs with some memory device capabilities disabled. Specifically, allow for CXL devices outside of those claiming to be compliant with the generic CXL memory device class code, like vendor specific Type-2/3 devices that host CXL.mem. This implies, allow for the creation of memdevs that only support component-registers, not necessarily memory-device-registers (like mailbox registers). A memdev derived from a CXL endpoint that does not support generic class code expectations is tagged "CXL_DEVTYPE_DEVMEM", while a memdev derived from a class-code compliant endpoint is tagged "CXL_DEVTYPE_CLASSMEM". The primary assumption of a CXL_DEVTYPE_DEVMEM memdev is that it optionally may not host a mailbox. Disable the command passthrough ioctl for memdevs that are not CXL_DEVTYPE_CLASSMEM, and return empty strings from memdev attributes associated with data retrieved via the class-device-standard IDENTIFY command. Note that empty strings were chosen over attribute visibility to maintain compatibility with shipping versions of cxl-cli that expect those attributes to always be present. Once cxl-cli has dropped that requirement this workaround can be deprecated. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679260782.3436160.7587293613945445365.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	59f8d15107	cxl/mbox: Move mailbox related driver state to its own data structure 'struct cxl_dev_state' makes too many assumptions about the capabilities of a CXL device. In particular it assumes a CXL device has a mailbox and all of the infrastructure and state that comes along with that. In preparation for supporting accelerator / Type-2 devices that may not have a mailbox and in general maintain a minimal core context structure, make mailbox functionality a super-set of 'struct cxl_dev_state' with 'struct cxl_memdev_state'. With this reorganization it allows for CXL devices that support HDM decoder mapping, but not other general-expander / Type-3 capabilities, to only enable that subset without the rest of the mailbox infrastructure coming along for the ride. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679260240.3436160.15520641540463704524.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	688baac109	cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output The @map parameter to cxl_probe_X_registers() is filled in with the mapping parameters of the register block. The @map parameter to cxl_map_X_registers() only reads that information to perform the mapping. Mark @map const for cxl_map_X_registers() to clarify that it is only an input to those helpers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679258103.3436160.4941603739448763855.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	adfe19738b	cxl/region: Fix state transitions after reset failure Jonathan reports that failed attempts to reset a region (teardown its HDM decoder configuration) mistakenly advance the state of the region to "not committed". Revert to the previous state of the region on reset failure so that the reset can be re-attempted. Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Closes: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696507968.3590522.14484000711718573626.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 13:45:40 -07:00
Dan Williams	2ab47045ac	cxl/region: Flag partially torn down regions as unusable cxl_region_decode_reset() walks all the decoders associated with a given region and disables them. Due to decoder ordering rules it is possible that a switch in the topology notices that a given decoder can not be shutdown before another region with a higher HPA is shutdown first. That can leave the region in a partially committed state. Capture that state in a new CXL_REGION_F_NEEDS_RESET flag and require that a successful cxl_region_decode_reset() attempt must be completed before cxl_region_probe() accepts the region. This is a corollary for the bug that Jonathan identified in "CXL/region : commit reset of out of order region appears to succeed." [1]. Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com [1] Fixes: `176baefb2e` ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696507423.3590522.16254212607926684429.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 13:45:40 -07:00
Dan Williams	d1257d098a	cxl/region: Move cache invalidation before region teardown, and before setup Vikram raised a concern with the theoretical case of a CPU sending MemClnEvict to a device that is not prepared to receive. MemClnEvict is a message that is sent after a CPU has taken ownership of a cacheline from accelerator memory (HDM-DB). In the case of hotplug or HDM decoder reconfiguration it is possible that the CPU is holding old contents for a new device that has taken over the physical address range being cached by the CPU. To avoid this scenario, invalidate caches prior to tearing down an HDM decoder configuration. Now, this poses another problem that it is possible for something to speculate into that space while the decode configuration is still up, so to close that gap also invalidate prior to establish new contents behind a given physical address range. With this change the cache invalidation is now explicit and need not be checked in cxl_region_probe(), and that obviates the need for CXL_REGION_F_INCOHERENT. Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Fixes: `d18bc74ace` ("cxl/region: Manage CPU caches relative to DPA invalidation events") Reported-by: Vikram Sethi <vsethi@nvidia.com> Closes: http://lore.kernel.org/r/BYAPR12MB33364B5EB908BF7239BB996BBD53A@BYAPR12MB3336.namprd12.prod.outlook.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696506886.3590522.4597053660991916591.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 13:45:40 -07:00
Robert Richter	5d2ffbe4b8	cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport Same as for ports, also store the downstream port's Component Register mappings, use struct cxl_dport for that. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-16-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 12:22:53 -07:00
Robert Richter	19ab69a60e	cxl/port: Store the port's Component Register mappings in struct cxl_port CXL capabilities are stored in the Component Registers. To use them, the specific I/O ranges of the capabilities must be determined by probing the registers. For this, the whole Component Register range needs to be mapped temporarily to detect the offset and length of a capability range. In order to use more than one capability of a component (e.g. RAS and HDM) the Component Register are probed and its mappings created multiple times. This also causes overlapping I/O ranges as the whole Component Register range must be mapped again while a capability's I/O range is already mapped. Different capabilities cannot be setup at the same time. E.g. the RAS capability must be made available as soon as the PCI driver is bound, the HDM decoder is setup later during port enumeration. Moreover, during early setup it is still unknown if a certain capability is needed. A central capability setup is therefore not possible, capabilities must be individually enabled once needed during initialization. To avoid a duplicate register probe and overlapping I/O mappings, only probe the Component Registers one time and store the Component Register mapping in struct port. The stored mappings can be used later to iomap the capability register range when enabling the capability, which will be implemented in a follow-on patch. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-15-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 12:22:44 -07:00
Robert Richter	733b57f262	cxl/pci: Early setup RCH dport component registers from RCRB CXL RAS capabilities must be enabled and accessible as soon as the CXL endpoint is detected in the PCI hierarchy and bound to the cxl_pci driver. This needs to be independent of other modules such as cxl_port or cxl_mem. CXL RAS capabilities reside in the Component Registers. For an RCH this is determined by probing RCRB which is implemented very late once the CXL Memory Device is created. Change this by moving the RCRB probe to the cxl_pci driver. Do this by using a new introduced function cxl_pci_find_port() similar to cxl_mem_find_port() to determine the involved dport by the endpoint's PCI handle. Plug this into the existing cxl_pci_setup_regs() function to setup Component Registers. Probe the RCRB in case the Component Registers cannot be located through the CXL Register Locator capability. This unifies code and early sets up the Component Registers at the same time for both, VH and RCH mode. Only the cxl_pci driver is involved for this. This allows an early mapping of the CXL RAS capability registers. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-14-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:57:11 -07:00
Robert Richter	f1d0525eff	cxl/regs: Remove early capability checks in Component Register setup When probing the Component Registers in function cxl_probe_regs() there are also checks for the existence of the HDM and RAS capabilities. The checks may fail for components that do not implement the HDM capability causing the Component Registers setup to fail too. Remove the checks for a generalized use of cxl_probe_regs() and check them directly before mapping the RAS or HDM capabilities. This allows it to setup other Component Registers esp. of an RCH Downstream Port, which will be implemented in a follow-on patch. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-12-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:51:36 -07:00
Robert Richter	d8bffff201	cxl/port: Remove Component Register base address from struct cxl_dport The Component Register base address @component_reg_phys is no longer used after the rework of the Component Register setup which now uses struct member @comp_map instead. Remove the base address. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-11-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:51:17 -07:00
Terry Bowman	d076bb8c4c	cxl/pci: Refactor component register discovery for reuse The endpoint implements component register setup code. Refactor it for reuse with RCRB, downstream port, and upstream port setup. Move PCI specifics from cxl_setup_regs() into cxl_pci_setup_regs(). Move cxl_setup_regs() into cxl/core/regs.c and export it. This also includes supporting static functions cxl_map_registerblock(), cxl_unmap_register_block() and cxl_probe_regs(). Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-8-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:39:38 -07:00
Robert Richter	573408049b	cxl/core/regs: Add @dev to cxl_register_map The corresponding device of a register mapping is used for devm operations and logging. For operations with struct cxl_register_map the device needs to be kept track separately. To simpify the involved function interfaces, add @dev to cxl_register_map. While at it also reorder function arguments of cxl_map_device_regs() and cxl_map_component_regs() to have the object @cxl_register_map first. As a result a bunch of functions are available to be used with a @cxl_register_map object. This patch is in preparation of reworking the component register setup code. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-7-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:37:58 -07:00
Dan Williams	7481653dee	cxl: Rename 'uport' to 'uport_dev' For symmetry with the recent rename of ->dport_dev for a 'struct cxl_dport', add the "_dev" suffix to the ->uport property of a 'struct cxl_port'. These devices represent the downstream-port-device and upstream-port-device respectively in the CXL/PCIe topology. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-6-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:37:54 -07:00
Robert Richter	227db57459	cxl: Rename member @dport of struct cxl_dport to @dport_dev Reading code like dport->dport does not immediately suggest that this points to the corresponding device structure of the dport. Rename struct member @dport to @dport_dev. While at it, also rename @new argument of add_dport() to @dport. This better describes the variable as a dport (e.g. new->dport becomes to dport->dport_dev). Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-5-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:37:49 -07:00
Dan Williams	0619337856	cxl/rch: Prepare for caching the MMIO mapped PCIe AER capability Prepare cxl_probe_rcrb() for retrieving more than just the component register block. The RCH AER handling code wants to get back to the AER capability that happens to be MMIO mapped rather then configuration cycles. Move RCRB specific downstream port data, like the RCRB base and the AER capability offset, into its own data structure ('struct cxl_rcrb_info') for cxl_probe_rcrb() to fill. Extend 'struct cxl_dport' to include a 'struct cxl_rcrb_info' attribute. This centralizes all RCRB scanning in one routine. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-4-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:35:26 -07:00
Robert Richter	eb4663b07e	cxl/acpi: Probe RCRB later during RCH downstream port creation The RCRB is extracted already during ACPI CEDT table parsing while the data of this is needed not earlier than dport creation. This implementation comes with drawbacks: During ACPI table scan there is already MMIO access including mapping and unmapping, but only ACPI data should be collected here. The collected data must be transferred through a couple of interfaces until it is finally consumed when creating the dport. This causes complex data structures and function interfaces. Additionally, RCRB parsing will be extended to also extract AER data, it would be much easier do this at a later point during port and dport creation when the data structures are available to hold that data. To simplify all that, probe the RCRB at a later point during RCH downstream port creation. Change ACPI table parser to only extract the base address of either the component registers or the RCRB. Parse and extract the RCRB in devm_cxl_add_rch_dport(). This is in preparation to centralize all RCRB scanning. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-2-terry.bowman@amd.com Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20230622205523.85375-3-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:35:20 -07:00
Jonathan Cameron	1ad3f701c3	cxl/pci: Find and register CXL PMU devices CXL PMU devices can be found from entries in the Register Locator DVSEC. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230526095824.16336-4-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-30 11:20:35 -07:00
Jonathan Cameron	d717d7f3df	cxl: Add functions to get an instance of / count regblocks of a given type Until the recently release CXL 3.0 specification, there was only ever one instance of any given register block pointed to by the Register Block Locator DVSEC. Now, the specification allows for multiple CXL PMU instances, each with their own register block. To enable this add cxl_find_regblock_instance() that takes an index parameter and use that to implement cxl_count_regblock() and cxl_find_regblock(). Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230526095824.16336-3-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-30 11:20:35 -07:00
Dave Jiang	793a539ac7	cxl: Explicitly initialize resources when media is not ready When media is not ready do not assume that the capacity information from the identify command is valid, i.e. ->total_bytes ->partition_align_bytes ->{volatile,persistent}_only_bytes. Explicitly zero out the capacity resources and exit early. Given zero-init of those fields this patch is functionally equivalent to the prior state, but it improves readability and robustness going forward. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168506118166.3004974.13523455340007852589.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-26 13:34:39 -07:00
Davidlohr Bueso	ccadf1310f	cxl/mbox: Add background cmd handling machinery This adds support for handling background operations, as defined in the CXL 3.0 spec. Commands that can take too long (over ~2 seconds) can run in the background asynchronously (to the hardware). The driver will deal with such commands synchronously, blocking all other incoming commands for a specified period of time, allowing time-slicing the command such that the caller can send incremental requests to avoid monopolizing the driver/device. Any out of sync (timeout) between the driver and hardware is just disregarded as an invalid state until the next successful submission. Such timeouts are considered a rare occurrence, either a real device problem or a driver issue that needs to reduce the size of the background operation to fit the timeout. On devices where mbox interrupts are supported, this will still use a poller that will wakeup in the specified wait intervals. The irq handler will simply awake the blocked cmd, which is also safe vs a task that is either waking (timing out) or already awoken. Similarly any irq setup error during the probing falls back to polling, thus avoids unnecessarily erroring out. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230523170927.20685-5-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-23 12:55:12 -07:00
Robert Richter	a70fc4ed20	cxl/port: Fix NULL pointer access in devm_cxl_add_port() In devm_cxl_add_port() the port creation may fail and its associated pointer does not contain a valid address. During error message generation this invalid port address is used. Fix that wrong address access. Fixes: `f3cd264c4e` ("cxl: Unify debug messages when calling devm_cxl_add_port()") Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230519215436.3394532-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-19 17:47:01 -07:00
Dave Jiang	e764f12208	cxl: Move cxl_await_media_ready() to before capacity info retrieval Move cxl_await_media_ready() to cxl_pci probe before driver starts issuing IDENTIFY and retrieving memory device information to ensure that the device is ready to provide the information. Allow cxl_pci_probe() to succeed even if media is not ready. Cache the media failure in cxlds and don't ask the device for any media information. The rationale for proceeding in the !media_ready case is to allow for mailbox operations to interrogate and/or remediate the device. After media is repaired then rebinding the cxl_pci driver is expected to restart the capacity scan. Suggested-by: Dan Williams <dan.j.williams@intel.com> Fixes: `b39cb1052a` ("cxl/mem: Register CXL memX devices") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168445310026.3251520.8124296540679268206.stgit@djiang5-mobl3 [djbw: fixup cxl_test] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-18 16:43:45 -07:00
Dave Jiang	ce17ad0d54	cxl: Wait Memory_Info_Valid before access memory related info The Memory_Info_Valid bit (CXL 3.0 8.1.3.8.2) indicates that the CXL Range Size High and Size Low registers are valid. The bit must be set within 1 second of reset deassertion to the device. Check valid bit before we check the Memory_Active bit when waiting for cxl_await_media_ready() to ensure that the memory info is valid for consumption. Also ensures both DVSEC ranges 1 and 2 are ready if DVSEC Capability indicates they are both supported. Fixes: `523e594d9c` ("cxl/pci: Implement wait for media active") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168444687469.3134781.11033518965387297327.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-18 16:42:41 -07:00
Dan Williams	eb0764b822	cxl/port: Enable the HDM decoder capability for switch ports Derick noticed, when testing hot plug, that hot-add behaves nominally after a removal. However, if the hot-add is done without a prior removal, CXL.mem accesses fail. It turns out that the original implementation of the port driver and region programming wrongly assumed that platform-firmware always enables the host-bridge HDM decoder capability. Add support turning on switch-level HDM decoders in the case where platform-firmware has not. The implementation is careful to only arrange for the enable to be undone if the current instance of the driver was the one that did the enable. This is to interoperate with platform-firmware that may expect CXL.mem to remain active after the driver is shutdown. This comes at the cost of potentially not shutting down the enable on kexec flows, but it is mitigated by the fact that the related HDM decoders still need to be enabled on an individual basis. Cc: <stable@vger.kernel.org> Reported-by: Derick Marks <derick.w.marks@intel.com> Fixes: `54cdbf845c` ("cxl/port: Add a driver for 'struct cxl_port' objects") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/168437998331.403037.15719879757678389217.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-18 13:18:49 -07:00
Dave Jiang	764d102ef9	cxl: Add missing return to cdat read error path Add a return to the error path when cxl_cdat_read_table() fails. Current code continues with the table pointer points to freed memory. Fixes: `7a877c9239` ("cxl/pci: Simplify CDAT retrieval error path") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/168382793506.3510737.4792518576623749076.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-13 00:20:06 -07:00
Linus Torvalds	7acc137211	cxl for v6.4 - Refactor the DOE infrastructure (Data Object Exchange PCI-config-cycle mailbox) to be a facility of the PCI core rather than the CXL core. This is foundational for upcoming support for PCI device-attestation and PCIe / CXL link encryption. - Add support for retrieving and injecting poison for CXL memory expanders. This enabling uses trace-events to convey CXL media error records to user tooling. It includes translation of device-local addresses (DPA) to system physical addresses (SPA) and their corresponding CXL region. - Fixes for decoder enumeration that missed v6.3-final - Miscellaneous fixups -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZE2nNwAKCRDfioYZHlFs Z5c2AQCTWebok6CD+HN01xnIx+CBWAUQe0QIGR40dT2P6/WGEgEA8wMae0w/FDlc lQDvSoIyPvy1hGO7Ppb0K2AT6jrQAgU= =blcC -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link updates from Dan Williams: "DOE support is promoted from drivers/cxl/ to drivers/pci/ with Bjorn's blessing, and the CXL core continues to mature its media management capabilities with support for listing and injecting media errors. Some late fixes that missed v6.3-final are also included: - Refactor the DOE infrastructure (Data Object Exchange PCI-config-cycle mailbox) to be a facility of the PCI core rather than the CXL core. This is foundational for upcoming support for PCI device-attestation and PCIe / CXL link encryption. - Add support for retrieving and injecting poison for CXL memory expanders. This enabling uses trace-events to convey CXL media error records to user tooling. It includes translation of device-local addresses (DPA) to system physical addresses (SPA) and their corresponding CXL region. - Fixes for decoder enumeration that missed v6.3-final - Miscellaneous fixups" * tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (38 commits) cxl/test: Add mock test for set_timestamp cxl/mbox: Update CMD_RC_TABLE tools/testing/cxl: Require CONFIG_DEBUG_FS tools/testing/cxl: Add a sysfs attr to test poison inject limits tools/testing/cxl: Use injected poison for get poison list tools/testing/cxl: Mock the Clear Poison mailbox command tools/testing/cxl: Mock the Inject Poison mailbox command cxl/mem: Add debugfs attributes for poison inject and clear cxl/memdev: Trace inject and clear poison as cxl_poison events cxl/memdev: Warn of poison inject or clear to a mapped region cxl/memdev: Add support for the Clear Poison mailbox command cxl/memdev: Add support for the Inject Poison mailbox command tools/testing/cxl: Mock support for Get Poison List cxl/trace: Add an HPA to cxl_poison trace events cxl/region: Provide region info to the cxl_poison trace event cxl/memdev: Add trigger_poison_list sysfs attribute cxl/trace: Add TRACE support for CXL media-error records cxl/mbox: Add GET_POISON_LIST mailbox command cxl/mbox: Initialize the poison state cxl/mbox: Restrict poison cmds to debugfs cxl_raw_allow_all ...	2023-04-30 11:51:51 -07:00
Linus Torvalds	556eb8b791	Driver core changes for 6.4-rc1 Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5 LEGadNS38k5fs+73UaxV =7K4B -----END PGP SIGNATURE----- Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems" * tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits) device property: make device_property functions take const device * driver core: update comments in device_rename() driver core: Don't require dynamic_debug for initcall_debug probe timing firmware_loader: rework crypto dependencies firmware_loader: Strip off \n from customized path zram: fix up permission for the hot_add sysfs file cacheinfo: Add use_arch[\|_cache]_info field/function arch_topology: Remove early cacheinfo error message if -ENOENT cacheinfo: Check cache properties are present in DT cacheinfo: Check sib_leaf in cache_leaves_are_shared() cacheinfo: Allow early level detection when DT/ACPI info is missing/broken cacheinfo: Add arm64 early level initializer implementation cacheinfo: Add arch specific early level initializer tty: make tty_class a static const structure driver core: class: remove struct class_interface * from callbacks driver core: class: mark the struct class in struct class_interface constant driver core: class: make class_register() take a const * driver core: class: mark class_release() as taking a const * driver core: remove incorrect comment for device_create* MIPS: vpe-cmp: remove module owner pointer from struct class usage. ...	2023-04-27 11:53:57 -07:00
Dan Williams	ca899f4021	Merge branch 'for-6.3/cxl-autodetect-fixes' into for-6.4/cxl Pick up late v6.3 fixes for v6.4.	2023-04-23 12:10:13 -07:00
Dan Williams	856ef55e7e	Merge branch 'for-6.4/cxl-poison' into for-6.4/cxl Include the poison list and injection infrastructure from Alison for v6.4.	2023-04-23 12:09:56 -07:00
Alison Schofield	98b6926562	cxl/memdev: Trace inject and clear poison as cxl_poison events The cxl_poison trace event allows users to view the history of poison list reads. With the addition of inject and clear poison capabilities, users will expect similar tracing. Add trace types 'Inject' and 'Clear' to the cxl_poison trace_event and trace successful operations only. If the driver finds that the DPA being injected or cleared of poison is mapped in a region, that region info is included in the cxl_poison trace event. Region reconfigurations can make this extra info useless if the debug operations are not carefully managed. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/e20eb7c3029137b480ece671998c183da0477e2e.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 12:08:39 -07:00
Alison Schofield	0a105ab28a	cxl/memdev: Warn of poison inject or clear to a mapped region Inject and clear poison capabilities and intended for debug usage only. In order to be useful in debug environments, the driver needs to allow inject and clear operations on DPAs mapped in regions. dev_warn_once() when either operation occurs. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/f911ca5277c9d0f9757b72d7e6842871bfff4fa2.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:22 -07:00
Alison Schofield	9690b07748	cxl/memdev: Add support for the Clear Poison mailbox command CXL devices optionally support the CLEAR POISON mailbox command. Add memdev driver support for clearing poison. Per the CXL Specification (3.0 8.2.9.8.4.3), after receiving a valid clear poison request, the device removes the address from the device's Poison List and writes 0 (zero) for 64 bytes starting at address. If the device cannot clear poison from the address, it returns a permanent media error and -ENXIO is returned to the user. Additionally, and per the spec also, it is not an error to clear poison of an address that is not poisoned. If the address is not contained in the device's dpa resource, or is not 64 byte aligned, the driver returns -EINVAL without sending the command to the device. Poison clearing is intended for debug only and will be exposed to userspace through debugfs. Restrict compilation to CONFIG_DEBUG_FS. Implementation note: Although the CXL specification defines the clear command to accept 64 bytes of 'write-data', this implementation always uses zeroes as write-data. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/8682c30ec24bd9c45af5feccb04b02be51e58c0a.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:22 -07:00
Alison Schofield	d2fbc48658	cxl/memdev: Add support for the Inject Poison mailbox command CXL devices optionally support the INJECT POISON mailbox command. Add memdev driver support for the mailbox command. Per the CXL Specification (3.0 8.2.9.8.4.2), after receiving a valid inject poison request, the device will return poison when the address is accessed through the CXL.mem driver. Injecting poison adds the address to the device's Poison List and the error source is set to Injected. In addition, the device adds a poison creation event to its internal Informational Event log, updates the Event Status register, and if configured, interrupts the host. Also, per the CXL Specification, it is not an error to inject poison into an address that already has poison present and no error is returned from the device. If the address is not contained in the device's dpa resource, or is not 64 byte aligned, return -EINVAL without issuing the mbox command. Poison injection is intended for debug only and will be exposed to userspace through debugfs. Restrict compilation to CONFIG_DEBUG_FS. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/241c64115e6bd2effed9c7a20b08b3908dd7be8f.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:22 -07:00
Alison Schofield	28a3ae4ff6	cxl/trace: Add an HPA to cxl_poison trace events When a cxl_poison trace event is reported for a region, the poisoned Device Physical Address (DPA) can be translated to a Host Physical Address (HPA) for consumption by user space. Translate and add the resulting HPA to the cxl_poison trace event. Follow the device decode logic as defined in the CXL Spec 3.0 Section 8.2.4.19.13. If no region currently maps the poison, assign ULLONG_MAX to the cxl_poison event hpa field. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/6d3cd726f9042a59902785b0a2cb3ddfb70e0219.1681838292.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:13 -07:00
Alison Schofield	f0832a5863	cxl/region: Provide region info to the cxl_poison trace event User space may need to know which region, if any, maps the poison address(es) logged in a cxl_poison trace event. Since the mapping of DPAs (device physical addresses) to a region can change, the kernel must provide this information at the time the poison list is read. The event informs user space that at event <timestamp> this <region> mapped to this <DPA>, which is poisoned. The cxl_poison trace event is already wired up to log the region name and uuid if it receives param 'struct cxl_region'. In order to provide that cxl_region, add another method for gathering poison - by committed endpoint decoder mappings. This method is only available with CONFIG_CXL_REGION and is only used if a region actually maps the memdev where poison is being read. After the region driver reads the poison list for all the mapped resources, poison is read for any remaining unmapped resources. The default method remains: read the poison by memdev resource. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/438b01ccaa70592539e8eda4eb2b1d617ba03160.1681838292.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:06 -07:00
Alison Schofield	7ff6ad1075	cxl/memdev: Add trigger_poison_list sysfs attribute When a boolean 'true' is written to this attribute the memdev driver retrieves the poison list from the device. The list consists of addresses that are poisoned, or would result in poison if accessed, and the source of the poison. This attribute is only visible for devices supporting the capability. The retrieved errors are logged as kernel events when cxl_poison event tracing is enabled. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/1081cfdc8a349dc754779642d584707e56db26ba.1681838291.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:02 -07:00
Alison Schofield	ddf49d57b8	cxl/trace: Add TRACE support for CXL media-error records CXL devices may support the retrieval of a device poison list. Add a new trace event that the CXL subsystem may use to log the media-error records returned in the poison list. Log each media-error record as a cxl_poison trace event of type 'List'. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/de6196f5269483d886ab1834744f82d27189a666.1681838291.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:45:53 -07:00
Alison Schofield	ed83f7ca39	cxl/mbox: Add GET_POISON_LIST mailbox command CXL devices maintain a list of locations that are poisoned or result in poison if the addresses are accessed by the host. Per the spec, (CXL 3.0 8.2.9.8.4.1), the device returns this Poison list as a set of Media Error Records that include the source of the error, the starting device physical address, and length. The length is the number of adjacent DPAs in the record and is in units of 64 bytes. Retrieve the poison list. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/a1f332e817834ef8e89c0ff32e760308fb903346.1681838291.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:45:49 -07:00
Alison Schofield	d0abf5787a	cxl/mbox: Initialize the poison state Driver reads of the poison list are synchronized to ensure that a reader does not get an incomplete list because their request overlapped (was interrupted or preceded by) another read request of the same DPA range. (CXL Spec 3.0 Section 8.2.9.8.4.1). The driver maintains state information to achieve this goal. To initialize the state, first recognize the poison commands in the CEL (Command Effects Log). If the device supports Get Poison List, allocate a single buffer for the poison list and protect it with a lock. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/9078d180769be28a5087288b38cdfc827cae58bf.1681838291.git.alison.schofield@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:45:26 -07:00
Alison Schofield	dec441d32a	cxl/mbox: Restrict poison cmds to debugfs cxl_raw_allow_all The Get, Inject, and Clear poison commands are not available for direct user access because they require kernel driver controls to perform safely. Further restrict access to these commands by requiring the selection of the debugfs attribute 'cxl_raw_allow_all' to enable in raw mode. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/0e5cb41ffae2bab800957d3b9003eedfd0a2dfd5.1681838291.git.alison.schofield@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:45:09 -07:00
Dan Williams	3db166d6cf	cxl/mbox: Deprecate poison commands The CXL subsystem is adding formal mechanisms for managing device poison. Minimize the maintenance burden going forward, and maximize the investment in common tooling by deprecating direct user access to poison commands outside of CXL_MEM_RAW_COMMANDS debug scenarios. A new cxl_deprecated_commands[] list is created for querying which command ids defined in previous kernels are now deprecated. CXL Media and Poison Management commands, opcodes 0x43XX, defined in CXL 3.0 Spec, Table 8-93 are deprecated with one exception: Get Scan Media Capabilities. Keep Get Scan Media Capabilities as it simply provides information and has no impact on the device state. Effectively all of the commands defined in: commit `87815ee9d0` ("cxl/pci: Add media provisioning required commands") ...were defined prematurely and should have waited until the kernel implementation was decided. To my knowledge there are no shipping devices with poison support and no known tools that would regress with this change. Co-developed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/652197e9bc8885e6448d989405b9e50ee9d6b0a6.1681838291.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-22 14:41:30 -07:00
Dan Williams	267214a231	cxl/port: Fix port to pci device assumptions in read_cdat_data() Not all endpoint CXL ports are associated with PCI devices. The cxl_test infrastructure models 'struct cxl_port' instances hosted by platform devices. Teach read_cdat_data() to be careful about non-pci hosted cxl_memdev instances. Otherwise, cxl_test crashes with this signature: RIP: 0010:xas_start+0x6d/0x290 [..] Call Trace: <TASK> xas_load+0xa/0x50 xas_find+0x25b/0x2f0 xa_find+0x118/0x1d0 pci_find_doe_mailbox+0x51/0xc0 read_cdat_data+0x45/0x190 [cxl_core] cxl_port_probe+0x10a/0x1e0 [cxl_port] cxl_bus_probe+0x17/0x50 [cxl_core] Some other cleanups are included like removing the single-use @uport variable, and removing the indirection through 'struct cxl_dev_state' to lookup the device that registered the memdev and may be a pci device. Fixes: `af0a6c3587` ("cxl/pci: Use CDAT DOE mailbox created by PCI core") Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/168213190748.708404.16215095414060364800.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-22 14:04:43 -07:00
Lukas Wunner	f960e57dca	cxl/pci: Rightsize CDAT response allocation Jonathan notes that cxl_cdat_get_length() and cxl_cdat_read_table() allocate 32 dwords for the DOE response even though it may be smaller. In the case of cxl_cdat_get_length(), only the second dword of the response is of interest (it contains the length). So reduce the allocation to 2 dwords and let DOE discard the remainder. In the case of cxl_cdat_read_table(), a correctly sized allocation for the full CDAT already exists. Let DOE write each table entry directly into that allocation. There's a snag in that the table entry is preceded by a Table Access Response Header (1 dword, CXL 3.0 table 8-14). Save the last dword of the previous table entry, let DOE overwrite it with the header of the next entry and restore it afterwards. The resulting CDAT is preceded by 4 unavoidable useless bytes. Increase the allocation size accordingly. The buffer overflow check in cxl_cdat_read_table() becomes unnecessary because the remaining bytes in the allocation are tracked in "length", which is passed to DOE and limits how many bytes it writes to the allocation. Additionally, cxl_cdat_read_table() bails out if the DOE response is truncated due to insufficient space. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/7a4e1f86958a79a70f29b96a92199522f08f8322.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Dave Jiang	7a877c9239	cxl/pci: Simplify CDAT retrieval error path The cdat.table and cdat.length fields in struct cxl_port are set before CDAT retrieval and must therefore be unset on failure. Simplify by setting only on success. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: https://lore.kernel.org/linux-cxl/20230209113422.00007017@Huawei.com/ Signed-off-by: Dave Jiang <dave.jiang@intel.com> [lukas: rebase and rephrase commit message] Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/7a5c7104fb6a3016dbaec1c5d0ed34619ea11a0c.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	af0a6c3587	cxl/pci: Use CDAT DOE mailbox created by PCI core The PCI core has just been amended to create a pci_doe_mb struct for every DOE instance on device enumeration. Drop creation of a (duplicate) CDAT DOE mailbox on cxl probing in favor of the one already created by the PCI core. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/becaf70e8faf9681d474200117d62d7eaac46cca.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	58709b924e	cxl/pci: Use synchronous API for DOE A synchronous API for DOE has just been introduced. Convert CXL CDAT retrieval over to it. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/c329c0a21c11c3b524ce2336b0bbb3c80a28c415.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Dan Williams	c841ecd827	cxl/hdm: Add more HDM decoder debug messages at startup A recent debug session yielded a couple debug messages that were useful for determining the reason why the driver was or was not falling back to CXL range register emulation, and for identifying decoder setting enumeration problems. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/168149845668.792294.11814353796371419167.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:32:47 -07:00
Dan Williams	7bba261e0a	cxl/port: Scan single-target ports for decoders Do not assume that a single-target port falls back to a passthrough decoder configuration. Scan for decoders and only fallback after probing that the HDM decoder capability is not present. One user visible affect of this bug is the inability to enumerate present CXL regions as the decoder settings for the present decoders are skipped. Fixes: `d17d0540a0` ("cxl/core/hdm: Add CXL standard decoder enumeration to the core") Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: http://lore.kernel.org/r/20230227153128.8164-1-Jonathan.Cameron@huawei.com Cc: <stable@vger.kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/168149845130.792294.3210421233937427962.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:32:47 -07:00
Dan Williams	104087a8aa	cxl/core: Drop unused io-64-nonatomic-lo-hi.h After the discovery of a case where an implementation misbehaves with register reads larger than the definition of the register the other usages of readq() were audited and found to be correct, but some cases where the io-64-nonatomic-lo-hi.h include is not needed were discovered, delete them. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/168149844596.792294.8273108394688012953.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:32:46 -07:00
Dan Williams	1423885c84	cxl/hdm: Use 4-byte reads to retrieve HDM decoder base+limit The CXL specification mandates that 4-byte registers must be accessed with 4-byte access cycles. CXL 3.0 8.2.3 "Component Register Layout and Definition" states that the behavior is undefined if (2) 32-bit registers are accessed as an 8-byte quantity. It turns out that at least one hardware implementation is sensitive to this in practice. The @size variable results in zero with: size = readq(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which)); ...and the correct size with: lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which)); hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(which)); size = (hi << 32) + lo; Fixes: `d17d0540a0` ("cxl/core/hdm: Add CXL standard decoder enumeration to the core") Cc: <stable@vger.kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/168149844056.792294.8224490474529733736.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:32:46 -07:00
Dan Williams	7701c8bef4	cxl/hdm: Fail upon detecting 0-sized decoders Decoders committed with 0-size lead to later crashes on shutdown as __cxl_dpa_release() assumes a 'struct resource' has been established in the in 'cxlds->dpa_res'. Just fail the driver load in this instance since there are deeper problems with the enumeration or the setup when this happens. Fixes: `9c57cde0dc` ("cxl/hdm: Enumerate allocated DPA") Cc: <stable@vger.kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/168149843516.792294.11872242648319572632.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:32:46 -07:00
Dan Williams	ca712e4705	Merge branch 'for-6.3/cxl-doe-fixes' into for-6.3/cxl Pick up the fixes (first 6 patches) from the DOE rework series from Lukas for v6.3-rc. Link: https://lore.kernel.org/all/cover.1678543498.git.lukas@wunner.de/	2023-04-04 15:37:25 -07:00
Dan Williams	24b1819718	cxl/hdm: Extend DVSEC range register emulation for region enumeration One motivation for mapping range registers to decoder objects is to use those settings for region autodiscovery. The need to map a region for devices programmed to use range registers is especially urgent now that the kernel no longer routes "Soft Reserved" ranges in the memory map to device-dax by default. The CXL memory range loses all access mechanisms. Complete the implementation by marking the DPA reservation and setting the endpoint-decoder state to signal autodiscovery. Note that the default settings of ways=1 and granularity=4096 set in cxl_decode_init() do not need to be updated. Fixes: `09d09e04d2` ("cxl/dax: Create dax devices for CXL RAM regions") Tested-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Gregory Price <gregory.price@memverge.com> Link: https://lore.kernel.org/r/168012575521.221280.14177293493678527326.stgit@dwillia2-xfh.jf.intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	52cc48ad2a	cxl/hdm: Limit emulation to the number of range registers Recall that range register emulation seeks to treat the 2 potential range registers as Linux CXL "decoder" objects. The number of range registers can be 1 or 2, while HDM decoder ranges can include more than 2. Be careful not to confuse DVSEC range count with HDM capability decoder count. Commit to range register earlier in devm_cxl_setup_hdm(). Otherwise, a device with more HDM decoders than range registers can set @cxlhdm->decoder_count to an invalid value. Avoid introducing a forward declaration by just moving the definition of should_emulate_decoders() earlier in the file. should_emulate_decoders() is unchanged. Tested-by: Dave Jiang <dave.jiang@intel.com> Fixes: `d7a2153762` ("cxl/hdm: Add emulation when HDM decoders are not committed") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168012574932.221280.15944705098679646436.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	9ff3eec958	cxl/region: Move coherence tracking into cxl_region_attach() Each time the contents of a given HPA are potentially changed in a cache incoherent manner the CXL core sets CXL_REGION_F_INCOHERENT to invalidate CPU caches before the region is used. Successful invocation of attach_target() indicates that DPA has been newly assigned to a given HPA in the dynamic region creation flow. However, attach_target() is also reused in the autodiscovery flow where the region was activated by platform firmware. In that case there is no need to invalidate caches because that region is already in active use and nothing about the autodiscovery flow modifies the HPA-to-DPA relationship. In the autodiscovery case cxl_region_attach() exits early after determining the endpoint decoder is already correctly attached to the region. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002858817.50647.1217607907088920888.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	030f880342	cxl/region: Fix region setup/teardown for RCDs RCDs (CXL memory devices that link train without VH capability and show up as root complex integrated endpoints), hide the presence of the link between the endpoint and the host-bridge. The CXL region setup/teardown paths assume that a link hop is present and go looking for at least one 'struct cxl_port' instance between the CXL root port-object and an endpoint port-object leading to crashes of the form: BUG: kernel NULL pointer dereference, address: 0000000000000008 [..] RIP: 0010:cxl_region_setup_targets+0x3e9/0xae0 [cxl_core] [..] Call Trace: <TASK> cxl_region_attach+0x46c/0x7a0 [cxl_core] cxl_create_region+0x20b/0x270 [cxl_core] cxl_mock_mem_probe+0x641/0x800 [cxl_mock_mem] platform_probe+0x5b/0xb0 Detect RCDs explicitly and skip walking the non-existent port hierarchy between root and endpoint in that case. While this has been a problem since: commit `0a19bfc8de` ("cxl/port: Add RCD endpoint port enumeration") ...it becomes a more reliable crash scenario with the new autodiscovery implementation. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002858268.50647.728091521032131326.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	d35b495ddf	cxl/port: Fix find_cxl_root() for RCDs and simplify it The find_cxl_root() helper is used to lookup root decoders and other CXL platform topology information for a given endpoint. It turns out that for RCDs it has never worked. The result of find_cxl_root(&cxlmd->dev) is always NULL for the RCH topology case because it expects to find a cxl_port at the host-bridge. RCH topologies only have the root cxl_port object with the host-bridge as a dport. While there are no reports of this being a problem to date, by inspection region enumeration should crash as a result of this problem, and it does in a local unit test for this scenario. However, an observation that ever since: commit `f17b558d66` ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") ...all callers of find_cxl_root() occur after the memdev connection to the port topology has been established. That means that find_cxl_root() can be simplified to a walk of the endpoint port topology to the root. Switch to that arrangement which also fixes the RCD bug. Fixes: `a32320b71f` ("cxl/region: Add region autodiscovery") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168002857715.50647.344876437247313909.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	b70c2cf95e	cxl/hdm: Skip emulation when driver manages mem_enable If the driver is allowed to enable memory operation itself then it can also turn on HDM decoder support at will. With this the second call to cxl_setup_hdm_decoder_from_dvsec(), when an HDM decoder is not committed, is not needed. Fixes: `b777e9bec9` ("cxl/hdm: Emulate HDM decoder from DVSEC range registers") Link: http://lore.kernel.org/r/20230220113657.000042e1@huawei.com Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167703068474.185722.664126485486344246.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:34 -07:00
Dan Williams	82f0832af2	cxl/hdm: Fix double allocation of @cxlhdm devm_cxl_setup_emulated_hdm() reallocates an instance of @cxlhdm that was already allocated at the start of devm_cxl_setup_hdm(). Only one is needed and devm_cxl_setup_emulated_hdm() does not do enough to warrant being an explicit helper. Fixes: `4474ce565e` ("cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders") Tested-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167703067936.185722.7908921750127154779.stgit@dwillia2-xfh.jf.intel.com Link: https://lore.kernel.org/r/168012574357.221280.5001364964799725366.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-04 15:34:12 -07:00
Lukas Wunner	4fe2c13d59	cxl/pci: Handle excessive CDAT length If the length in the CDAT header is larger than the concatenation of the header and all table entries, then the CDAT exposed to user space contains trailing null bytes. Not every consumer may be able to handle that. Per Postel's robustness principle, "be liberal in what you accept" and silently reduce the cached length to avoid exposing those null bytes. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Link: https://lore.kernel.org/r/6d98b3c7da5343172bd3ccabfabbc1f31c079d74.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:16:49 -07:00
Lukas Wunner	b56faef231	cxl/pci: Handle truncated CDAT entries If truncated CDAT entries are received from a device, the concatenation of those entries constitutes a corrupt CDAT, yet is happily exposed to user space. Avoid by verifying response lengths and erroring out if truncation is detected. The last CDAT entry may still be truncated despite the checks introduced herein if the length in the CDAT header is too small. However, that is easily detectable by user space because it reaches EOF prematurely. A subsequent commit which rightsizes the CDAT response allocation closes that remaining loophole. The two lines introduced here which exceed 80 chars are shortened to less than 80 chars by a subsequent commit which migrates to a synchronous DOE API and replaces "t.task.rv" by "rc". The existing acpi_cdat_header and acpi_table_cdat struct definitions provided by ACPICA cannot be used because they do not employ __le16 or __le32 types. I believe that cannot be changed because those types are Linux-specific and ACPI is specified for little endian platforms only, hence doesn't care about endianness. So duplicate the structs. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Link: https://lore.kernel.org/r/bce3aebc0e8e18a1173425a7a865b232c3912963.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:16:34 -07:00
Lukas Wunner	34bafc747c	cxl/pci: Handle truncated CDAT header cxl_cdat_get_length() only checks whether the DOE response size is sufficient for the Table Access response header (1 dword), but not the succeeding CDAT header (1 dword length plus other fields). It thus returns whatever uninitialized memory happens to be on the stack if a truncated DOE response with only 1 dword was received. Fix it. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Reported-by: Ming Li <ming4.li@intel.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/000e69cd163461c8b1bc2cf4155b6e25402c29c7.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:16:19 -07:00
Greg Kroah-Hartman	75cff725d9	driver core: bus: mark the struct bus_type for sysfs callbacks as constant struct bus_type should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct bus_type to be moved to read-only memory. Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hu Haowen <src.res@email.cn> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Stuart Yoder <stuyoder@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl Reviewed-by: Alex Shi <alexs@kernel.org> Acked-by: Iwona Winiarska <iwona.winiarska@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-23 13:20:40 +01:00
Lukas Wunner	fbaa38214c	cxl/pci: Fix CDAT retrieval on big endian The CDAT exposed in sysfs differs between little endian and big endian arches: On big endian, every 4 bytes are byte-swapped. PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors such as pci_read_config_dword() implicitly swap bytes on big endian. That way, the macros in include/uapi/linux/pci_regs.h work regardless of the arch's endianness. For an example of implicit byte-swapping, see ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx (Load Word Byte-Reverse Indexed). DOE Read/Write Data Mailbox Registers are unlike other registers in Configuration Space in that they contain or receive a 4 byte portion of an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f). They need to be copied to or from the request/response buffer verbatim. So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit byte-swapping. The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume implicit byte-swapping. Byte-swap requests after constructing them with those macros and byte-swap responses before parsing them. Change the request and response type to __le32 to avoid sparse warnings. Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for consistency. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-03-21 12:27:08 -07:00
Linus Torvalds	7c3dc440b1	cxl for v6.3 - CXL RAM region enumeration: instantiate 'struct cxl_region' objects for platform firmware created memory regions - CXL RAM region provisioning: complement the existing PMEM region creation support with RAM region support - "Soft Reservation" policy change: Online (memory hot-add) soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for setting aside such memory for dedicated access via device-dax. - CXL Events and Interrupts: Takeover CXL event handling from platform-firmware (ACPI calls this CXL Memory Error Reporting) and export CXL Events via Linux Trace Events. - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL subsystem interrogate the result of CXL _OSC negotiation. - Emulate CXL DVSEC Range Registers as "decoders": Allow for first-generation devices that pre-date the definition of the CXL HDM Decoder Capability to translate the CXL DVSEC Range Registers into 'struct cxl_decoder' objects. - Set timestamp: Per spec, set the device timestamp in case of hotplug, or if platform-firwmare failed to set it. - General fixups: linux-next build issues, non-urgent fixes for pre-production hardware, unit test fixes, spelling and debug message improvements. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY/WYcgAKCRDfioYZHlFs Z6m3APkBUtiEEm1o8ikdu5llUS1OTLBwqjJDwGMTyf8X/WDXhgD+J2mLsCgARS7X 5IS0RAtefutrW5sQpUucPM7QiLuraAY= =kOXC -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dan Williams: "To date Linux has been dependent on platform-firmware to map CXL RAM regions and handle events / errors from devices. With this update we can now parse / update the CXL memory layout, and report events / errors from devices. This is a precursor for the CXL subsystem to handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that for DDR-attached-DRAM is handled by the EDAC driver where it maps system physical address events to a field-replaceable-unit (FRU / endpoint device). In general, CXL has the potential to standardize what has historically been a pile of memory-controller-specific error handling logic. Another change of note is the default policy for handling RAM-backed device-dax instances. Previously the default access mode was "device", mmap(2) a device special file to access memory. The new default is "kmem" where the address range is assigned to the core-mm via add_memory_driver_managed(). This saves typical users from wondering why their platform memory is not visible via free(1) and stuck behind a device-file. At the same time it allows expert users to deploy policy to, for example, get dedicated access to high performance memory, or hide low performance memory from general purpose kernel allocations. This affects not only CXL, but also systems with high-bandwidth-memory that platform-firmware tags with the EFI_MEMORY_SP (special purpose) designation. Summary: - CXL RAM region enumeration: instantiate 'struct cxl_region' objects for platform firmware created memory regions - CXL RAM region provisioning: complement the existing PMEM region creation support with RAM region support - "Soft Reservation" policy change: Online (memory hot-add) soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for setting aside such memory for dedicated access via device-dax. - CXL Events and Interrupts: Takeover CXL event handling from platform-firmware (ACPI calls this CXL Memory Error Reporting) and export CXL Events via Linux Trace Events. - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL subsystem interrogate the result of CXL _OSC negotiation. - Emulate CXL DVSEC Range Registers as "decoders": Allow for first-generation devices that pre-date the definition of the CXL HDM Decoder Capability to translate the CXL DVSEC Range Registers into 'struct cxl_decoder' objects. - Set timestamp: Per spec, set the device timestamp in case of hotplug, or if platform-firwmare failed to set it. - General fixups: linux-next build issues, non-urgent fixes for pre-production hardware, unit test fixes, spelling and debug message improvements" * tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits) dax/kmem: Fix leak of memory-hotplug resources cxl/mem: Add kdoc param for event log driver state cxl/trace: Add serial number to trace points cxl/trace: Add host output to trace points cxl/trace: Standardize device information output cxl/pci: Remove locked check for dvsec_range_allowed() cxl/hdm: Add emulation when HDM decoders are not committed cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders cxl/hdm: Emulate HDM decoder from DVSEC range registers cxl/pci: Refactor cxl_hdm_decode_init() cxl/port: Export cxl_dvsec_rr_decode() to cxl_port cxl/pci: Break out range register decoding from cxl_hdm_decode_init() cxl: add RAS status unmasking for CXL cxl: remove unnecessary calling of pci_enable_pcie_error_reporting() dax/hmem: build hmem device support as module if possible dax: cxl: add CXL_REGION dependency cxl: avoid returning uninitialized error code cxl/pmem: Fix nvdimm registration races cxl/mem: Fix UAPI command comment cxl/uapi: Tag commands from cxl_query_cmd() ...	2023-02-25 09:19:23 -08:00
Linus Torvalds	a93e884edf	Driver core changes for 6.3-rc1 Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6 6oeFOjD3JDju3cQsfGgd =Su6W -----END PGP SIGNATURE----- Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems" [ Geert Uytterhoeven points out that that last sentence isn't true, and that there's a pending report that has a fix that is queued up - Linus ] * tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits) debugfs: drop inline constant formatting for ERR_PTR(-ERROR) OPP: fix error checking in opp_migrate_dentry() debugfs: update comment of debugfs_rename() i3c: fix device.h kernel-doc warnings dma-mapping: no need to pass a bus_type into get_arch_dma_ops() driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place Revert "driver core: add error handling for devtmpfs_create_node()" Revert "devtmpfs: add debug info to handle()" Revert "devtmpfs: remove return value of devtmpfs_delete_node()" driver core: cpu: don't hand-override the uevent bus_type callback. devtmpfs: remove return value of devtmpfs_delete_node() devtmpfs: add debug info to handle() driver core: add error handling for devtmpfs_create_node() driver core: bus: update my copyright notice driver core: bus: add bus_get_dev_root() function driver core: bus: constify bus_unregister() driver core: bus: constify some internal functions driver core: bus: constify bus_get_kset() driver core: bus: constify bus_register/unregister_notifier() driver core: remove private pointer from struct bus_type ...	2023-02-24 12:58:55 -08:00
Dan Williams	23c198e3df	Merge branch 'for-6.3/cxl-events' into cxl/next Include some additional fixups for event support for v6.3, namely, rationalize the identifiers in the trace output and fixup a kdoc comment.	2023-02-16 15:27:12 -08:00
Ira Weiny	279676c9aa	cxl/trace: Add serial number to trace points Device serial numbers are useful information for the user. Add device serial numbers to all the trace points. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20230208-cxl-event-names-v2-3-fca130c2c68b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-16 14:29:11 -08:00
Ira Weiny	cd0570172d	cxl/trace: Add host output to trace points The host parameter of where the memdev is connected is useful information. Report host consistently in all trace points. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20230208-cxl-event-names-v2-2-fca130c2c68b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-16 14:29:11 -08:00
Ira Weiny	0c8393dcdb	cxl/trace: Standardize device information output The trace points were written to take a struct device input for the trace. In CXL multiple device objects are associated with each CXL hardware device. Using different device objects in the trace point can lead to confusion for users. The PCIe device is nice to have, but the user space tooling relies on the memory device naming. It is better to have those device names reported. Change all trace points to take struct cxl_memdev as a standard and report that name. Furthermore, standardize on the name 'memdev' in both /sys/kernel/tracing/trace and cxl-cli monitor output. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20230208-cxl-event-names-v2-1-fca130c2c68b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-16 14:29:11 -08:00
Dan Williams	a5fcd228ca	Merge branch 'for-6.3/cxl-rr-emu' into cxl/next Pick up the CXL DVSEC range register emulation for v6.3, and resolve conflicts with the cxl_port_probe() split (from for-6.3/cxl-ram-region) and event handling (from for-6.3/cxl-events).	2023-02-14 16:06:10 -08:00
Dave Jiang	6980daaa3e	cxl/pci: Remove locked check for dvsec_range_allowed() Remove the CXL_DECODER_F_LOCK check to be permissive of platform BIOSes that allow CXL.mem to be remapped. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640370085.935665.13128321011001358077.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-02-14 15:45:21 -08:00

... 2 3 4 5 6 ...

577 Commits