linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming
@ 2021-10-22 18:36 Ben Widawsky
  2021-10-22 18:36 ` [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
                   ` (27 more replies)
  0 siblings, 28 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Because v1 wasn't reviewed, and I quickly asked people to ignore it in favor of
this version, I'm not going to bother listing out the changes. I've only issued minor
edits to the original cover letter. If you already read that, feel free to move
along.

I don't have plans to send further RFCs until I receive feedback, good or bad.

CXL region creation
-------------------
An interleaved set of devices is known in the Compute Express Link [1]
specification as a region. In addition to a region being comprised of devices
that can be configured in a variety of orders there are other properties that
define a region. It is expected that a region may be created as part of
provisioning by the hardware vendor, or interactively via operating system
controls. This patch series implements both the interfaces to create and
configure a region, as well as the algorithm to program the HDM decoders to
enable CXL.mem traffic while obeying those configuration parameters. The series
stops short of storing the new regions in the Label Storage Area of the CXL
devices.

Some version of this functionality has all been posted previously, with the
exception of the actual HDM decoder programming. It's probably wise to forget
those exist, and take my apology in advance for not addressing feedback you may
have already given.

There are two branches I am using as development branches. The branch for
port/mem driver [2] is fairly solid. The branch for region creation [3] is less
baked.

cxl_port
========

The cxl_port driver is implemented within the cxl_port module. While loading of
this module is optional, the other new drivers depend on it. The port driver is
responsible for all activities around HDM decoder enumeration and programming.
Introduced earlier, the concept of a port is an abstraction over CXL components
with an upstream port, every host bridge, switch, and endpoint.

cxl_mem
=======

The cxl_mem driver's main job is to walk up the hierarchy to make the
determination if it is CXL.mem routed, meaning, all components above it in the
hierarchy are participating in the CXL.mem protocol. It is implemented within
the cxl_mem module. As the host bridge ports are added by a platform specific
driver, such as cxl_acpi, the scope of the mem driver can be reduced to scan for
switches and ask cxl_core to work on enumerating them. With this done, the
determination as to whether a device is CXL.mem routed can be done simply by
checking if the struct device has a driver bound to it.

This driver is also a logical place to migrate certain functionality from
cxl_pci. That is saved for later.

cxl_region
==========

Region verification and programming state are owned by the cxl_region driver
(implemented in the cxl_region module). It relies on cxl_mem to determine if
devices are CXL routed, and cxl_port to actually handle the programming of the
HDM decoders. Much of the region driver is an implementation of algorithms
described in the CXL Type 3 Memory Device Software Guide [4].

The patches for the region driver could be squashed. They're broken out to aid
review and because that's the order they were implemented in. My preference is
to keep those as they are.

Why RFC?
--------
While I think most of the architecture is sound, I don't believe anyone but
other developers should use this branch. Where I'd really like most eyes:
- Locking and device lifetimes. I suspect I have many bugs in this area.
- Region configuration. Should values have a default, should they all be
  explicit?
- What should/shouldn't be in core. I like how this ended up, but at this point
  I'm fairly biased (CXL.cache pun).
- What to extend to cxl_test.

What's missing
---------------
- CXL 2.0 switch support
- A full topology lock for programming HDM decoders. I'm looking for feedback on
  the best way to do this.
- Check that HDM decoder programming addresses are correct (must program higher
  addresses only)
- Volatile regions (or BIOS configured persistent regions)
- Connection to libnvdimm/labels. This includes many aspects, not the least of
  which is saving the region into the Label Storage Area so that it can be
  reestablished on reboot.

Here is an example of output when programming a x1 interleave region:
[   23.959814][  T645] cxl_core:cxl_add_region:406: cxl region0.0:0: Added region0.0:0 to decoder0.0
[   23.962972][  T645] cxl_port:cxl_commit_decoder:248: cxl_port port1: decoder1.0
[   23.962972][  T645] 	Base 0x0000004c00000000
[   23.962972][  T645] 	Size 268435456
[   23.962972][  T645] 	IG 256
[   23.962972][  T645] 	IW 1
[   23.962972][  T645] 	TargetList: 0 -1 -1 -1 -1 -1 -1 -1
[   23.965529][  T645] cxl_port:cxl_commit_decoder:248: cxl_port port3: decoder3.0
[   23.965529][  T645] 	Base 0x0000004c00000000
[   23.965529][  T645] 	Size 268435456
[   23.965529][  T645] 	IG 256
[   23.965529][  T645] 	IW 1
[   23.965529][  T645] 	TargetList: -1 -1 -1 -1 -1 -1 -1 -1

If you're wondering how I tested this, I've baked it into my cxlctl app [5] and
lib [6]. Eventually this will get absorbed by ndctl/cxl-cli/libcxl [7]. Region
deletion isn't implemented yet.

To get the detailed errors, trace-cmd can be utilized. Until a region device
exists, the region module will not be loaded, which means the region tracepoints
will not exist. To get around this, modprobe cxl_region before anything.

trace-cmd record -e cxl ./cxlctl create-region -n -a -s $((256<<20)) /sys/bus/cxl/devices/decoder0.0

Note: A minor bugfix is needed in QEMU if testing interleave configs. I've
pushed that to my v4 branch.
---

[1]: https://www.computeexpresslink.org/download-the-specification
[2]: https://gitlab.com/bwidawsk/linux/-/tree/cxl_port-v3
[3]: https://gitlab.com/bwidawsk/linux/-/tree/cxl_regions-v4
[4]: https://cdrdv2.intel.com/v1/dl/getContent/643805?wapkw=CXL%20memory%20device%20sw%20guide
[5]: https://gitlab.com/bwidawsk-cxl/cxlctl
[6]: https://gitlab.com/bwidawsk-cxl/cxl_rs
[7]: https://lore.kernel.org/linux-cxl/CAPcyv4joKOhTdaRBJVeoOtqhRjBvdtt9902TS=c39=zWTZXvuw@mail.gmail.com/
---

Ben Widawsky (28):
  cxl: Rename CXL_MEM to CXL_PCI
  cxl: Move register block enumeration to core
  cxl/acpi: Map component registers for Root Ports
  cxl: Add helper for new drivers
  cxl/core: Convert decoder range to resource
  cxl: Introduce endpoint decoders
  cxl/core: Move target population locking to caller
  cxl/port: Introduce a port driver
  cxl/acpi: Map single port host bridge component registers
  cxl/core: Store global list of root ports
  cxl/acpi: Rescan bus at probe completion
  cxl/core: Store component register base for memdevs
  cxl: Flesh out register names
  cxl: Hide devm host for ports
  cxl/core: Introduce API to scan switch ports
  cxl: Introduce cxl_mem driver
  cxl: Disable switch hierarchies for now
  cxl/region: Add region creation ABI
  cxl/region: Introduce concept of region configuration
  cxl/region: Introduce a cxl_region driver
  cxl/acpi: Handle address space allocation
  cxl/region: Address space allocation
  cxl/region: Implement XHB verification
  cxl/region: HB port config verification
  cxl/region: Record host bridge target list
  cxl/mem: Store the endpoint's uport
  cxl/region: Gather HDM decoder resources
  cxl: Program decoders for regions

 .clang-format                                 |   3 +
 Documentation/ABI/testing/sysfs-bus-cxl       |  63 ++
 .../driver-api/cxl/memory-devices.rst         |  28 +
 drivers/cxl/Kconfig                           |  28 +-
 drivers/cxl/Makefile                          |   8 +-
 drivers/cxl/acpi.c                            | 132 +++-
 drivers/cxl/core/Makefile                     |   2 +
 drivers/cxl/core/bus.c                        | 437 +++++++++++-
 drivers/cxl/core/core.h                       |   2 +
 drivers/cxl/core/memdev.c                     |   7 +-
 drivers/cxl/core/pci.c                        |  99 +++
 drivers/cxl/core/region.c                     | 453 +++++++++++++
 drivers/cxl/core/regs.c                       |  62 +-
 drivers/cxl/cxl.h                             |  85 ++-
 drivers/cxl/cxlmem.h                          |   8 +-
 drivers/cxl/mem.c                             | 162 +++++
 drivers/cxl/pci.c                             |  69 +-
 drivers/cxl/pci.h                             |  48 +-
 drivers/cxl/port.c                            | 491 ++++++++++++++
 drivers/cxl/region.c                          | 629 ++++++++++++++++++
 drivers/cxl/region.h                          |  57 ++
 drivers/cxl/trace.h                           |  75 +++
 tools/testing/cxl/Kbuild                      |   2 +
 tools/testing/cxl/mock_acpi.c                 |   4 +-
 tools/testing/cxl/test/mem.c                  |   3 +-
 25 files changed, 2806 insertions(+), 151 deletions(-)
 create mode 100644 drivers/cxl/core/pci.c
 create mode 100644 drivers/cxl/core/region.c
 create mode 100644 drivers/cxl/mem.c
 create mode 100644 drivers/cxl/port.c
 create mode 100644 drivers/cxl/region.c
 create mode 100644 drivers/cxl/region.h
 create mode 100644 drivers/cxl/trace.h

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 20:15   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 02/28] cxl: Move register block enumeration to core Ben Widawsky
                   ` (26 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Dan Williams, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

With the upcoming introduction of a driver to control the non-PCI
aspects of CXL.mem, such as interleave set creation and configuration,
there will be an opportunity to disconnection control over CXL device
memory and CXL device manageability. CXL device manageability is
implemented by the cxl_pci driver. Doing this rename allows the CXL
memory driver to be enabled by a new config option independently of CXL
device manageability through CXL.io/PCI mechanisms.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/Kconfig  | 13 ++++++-------
 drivers/cxl/Makefile |  2 +-
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index e6de221cc568..23773d0ac896 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -13,14 +13,13 @@ menuconfig CXL_BUS
 
 if CXL_BUS
 
-config CXL_MEM
-	tristate "CXL.mem: Memory Devices"
+config CXL_PCI
+	tristate "PCI manageability"
 	default CXL_BUS
 	help
-	  The CXL.mem protocol allows a device to act as a provider of
-	  "System RAM" and/or "Persistent Memory" that is fully coherent
-	  as if the memory was attached to the typical CPU memory
-	  controller.
+	  The CXL specification defines a set of interfaces which are controlled
+	  through well known PCI configuration mechanisms. Such access is
+	  referred to CXL.io in the specification.
 
 	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
 	  configuration and management primarily via the mailbox interface. See
@@ -31,7 +30,7 @@ config CXL_MEM
 
 config CXL_MEM_RAW_COMMANDS
 	bool "RAW Command Interface for Memory Devices"
-	depends on CXL_MEM
+	depends on CXL_PCI
 	help
 	  Enable CXL RAW command interface.
 
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index d1aaabc940f3..cf07ae6cea17 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += core/
-obj-$(CONFIG_CXL_MEM) += cxl_pci.o
+obj-$(CONFIG_CXL_PCI) += cxl_pci.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 02/28] cxl: Move register block enumeration to core
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
  2021-10-22 18:36 ` [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 20:23   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 03/28] cxl/acpi: Map component registers for Root Ports Ben Widawsky
                   ` (25 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

CXL drivers or cxl_core itself will require the ability to map component
registers in order to map HDM decoder resources amongst other things.
Much of the register mapping code has already moved into cxl_core. The
code to pull the BAR number and block office remained within cxl_pci
because there was no need to move it. Upcoming work will require this
functionality to be available outside of cxl_pci.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |  4 +++
 drivers/cxl/pci.c       | 52 ---------------------------------------
 3 files changed, 58 insertions(+), 52 deletions(-)

diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 41de4a136ecd..40598905c080 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -5,6 +5,7 @@
 #include <linux/slab.h>
 #include <linux/pci.h>
 #include <cxlmem.h>
+#include <pci.h>
 
 /**
  * DOC: cxl registers
@@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 	return 0;
 }
 EXPORT_SYMBOL_GPL(cxl_map_device_regs);
+
+static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
+				struct cxl_register_map *map)
+{
+	map->block_offset =
+		((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
+	map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
+	map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
+}
+
+/**
+ * cxl_find_regblock() - Locate register blocks by type
+ * @pdev: The CXL PCI device to enumerate.
+ * @type: Register Block Indicator id
+ * @map: Enumeration output, clobbered on error
+ *
+ * Return: 0 if register block enumerated, negative error code otherwise
+ *
+ * A CXL DVSEC may additional point one or more register blocks, search
+ * for them by @type.
+ */
+int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
+		      struct cxl_register_map *map)
+{
+	u32 regloc_size, regblocks;
+	int regloc, i;
+
+	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+					   PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
+	if (!regloc)
+		return -ENXIO;
+
+	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
+	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
+
+	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
+	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
+
+	for (i = 0; i < regblocks; i++, regloc += 8) {
+		u32 reg_lo, reg_hi;
+
+		pci_read_config_dword(pdev, regloc, &reg_lo);
+		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
+
+		cxl_decode_regblock(reg_lo, reg_hi, map);
+
+		if (map->reg_type == type)
+			return 0;
+	}
+
+	return -ENODEV;
+}
+EXPORT_SYMBOL_GPL(cxl_find_regblock);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 7cd16ef144dd..f06c596fad71 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -161,6 +161,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 			struct cxl_device_regs *regs,
 			struct cxl_register_map *map);
 
+enum cxl_regloc_type;
+int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
+		      struct cxl_register_map *map);
+
 #define CXL_RESOURCE_NONE ((resource_size_t) -1)
 #define CXL_TARGET_STRLEN 20
 
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index f19a06809079..d1adc759d051 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -400,58 +400,6 @@ static int cxl_map_regs(struct cxl_mem *cxlm, struct cxl_register_map *map)
 	return 0;
 }
 
-static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
-				struct cxl_register_map *map)
-{
-	map->block_offset =
-		((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
-	map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
-	map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
-}
-
-/**
- * cxl_find_regblock() - Locate register blocks by type
- * @pdev: The CXL PCI device to enumerate.
- * @type: Register Block Indicator id
- * @map: Enumeration output, clobbered on error
- *
- * Return: 0 if register block enumerated, negative error code otherwise
- *
- * A CXL DVSEC may additional point one or more register blocks, search
- * for them by @type.
- */
-static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
-			     struct cxl_register_map *map)
-{
-	u32 regloc_size, regblocks;
-	int regloc, i;
-
-	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
-					   PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
-	if (!regloc)
-		return -ENXIO;
-
-	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
-	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
-
-	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
-	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
-
-	for (i = 0; i < regblocks; i++, regloc += 8) {
-		u32 reg_lo, reg_hi;
-
-		pci_read_config_dword(pdev, regloc, &reg_lo);
-		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
-
-		cxl_decode_regblock(reg_lo, reg_hi, map);
-
-		if (map->reg_type == type)
-			return 0;
-	}
-
-	return -ENODEV;
-}
-
 static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 			  struct cxl_register_map *map)
 {
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 03/28] cxl/acpi: Map component registers for Root Ports
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
  2021-10-22 18:36 ` [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
  2021-10-22 18:36 ` [RFC PATCH v2 02/28] cxl: Move register block enumeration to core Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 20:28   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 04/28] cxl: Add helper for new drivers Ben Widawsky
                   ` (24 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

With the addition of cxl_find_register_block() in cxl_core, it becomes
trivial to complete the TODO left for mapping the component registers of
root ports. None of the CXL drivers currently use component registers of
downstream ports (which is what a CXL 2.0 Root Port is). As such, there
should be no functional change.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c | 10 ++++++++--
 drivers/cxl/pci.h  |  4 ++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index af1c6c1875ac..7d13e7f0aefc 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -7,6 +7,7 @@
 #include <linux/acpi.h>
 #include <linux/pci.h>
 #include "cxl.h"
+#include "pci.h"
 
 static struct acpi_table_header *acpi_cedt;
 
@@ -206,11 +207,13 @@ static resource_size_t get_chbcr(struct acpi_cedt_chbs *chbs)
 
 __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 {
+	resource_size_t creg = CXL_RESOURCE_NONE;
 	struct cxl_walk_context *ctx = data;
 	struct pci_bus *root_bus = ctx->root;
 	struct cxl_port *port = ctx->port;
 	int type = pci_pcie_type(pdev);
 	struct device *dev = ctx->dev;
+	struct cxl_register_map map;
 	u32 lnkcap, port_num;
 	int rc;
 
@@ -224,9 +227,12 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 				  &lnkcap) != PCIBIOS_SUCCESSFUL)
 		return 0;
 
-	/* TODO walk DVSEC to find component register base */
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (!rc)
+		creg = cxl_reg_block(pdev, &map);
+
 	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
 	if (rc) {
 		ctx->error = rc;
 		return rc;
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index 7d3e4bf06b45..12fdcb1b14e5 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -31,4 +31,8 @@ enum cxl_regloc_type {
 #define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
 #define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
 
+#define cxl_reg_block(pdev, map)                                               \
+	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
+			   (map)->block_offset))
+
 #endif /* __CXL_PCI_H__ */
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 04/28] cxl: Add helper for new drivers
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (2 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 03/28] cxl/acpi: Map component registers for Root Ports Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 20:30   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource Ben Widawsky
                   ` (23 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Dan Williams, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Many CXL drivers simply want to register and unregister themselves.
module_driver already supported this. A simple wrapper around that
reduces a decent amount of boilerplate in upcoming patches.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/cxl.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f06c596fad71..4483e1a39fc3 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -317,6 +317,9 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
 #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
 void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
+#define module_cxl_driver(__cxl_driver) \
+	module_driver(__cxl_driver, cxl_driver_register, cxl_driver_unregister)
+
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (3 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 04/28] cxl: Add helper for new drivers Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 20:50   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders Ben Widawsky
                   ` (22 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions will use the resource API in order to help manage allocated
space. As regions are children of the decoder, it makes sense that the
parent host the main resource to be suballocated by the region.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c     | 12 ++++--------
 drivers/cxl/core/bus.c |  4 ++--
 drivers/cxl/cxl.h      |  4 ++--
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 7d13e7f0aefc..b972abc9f6ef 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -126,10 +126,9 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 
 		cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
 		cxld->target_type = CXL_DECODER_EXPANDER;
-		cxld->range = (struct range) {
-			.start = cfmws->base_hpa,
-			.end = cfmws->base_hpa + cfmws->window_size - 1,
-		};
+		cxld->res = (struct resource)DEFINE_RES_MEM_NAMED(cfmws->base_hpa,
+								  cfmws->window_size,
+								  "cfmws");
 		cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
 		cxld->interleave_granularity =
 			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
@@ -339,10 +338,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = PAGE_SIZE;
 	cxld->target_type = CXL_DECODER_EXPANDER;
-	cxld->range = (struct range) {
-		.start = 0,
-		.end = -1,
-	};
+	cxld->res = (struct resource)DEFINE_RES_MEM(0, 0);
 
 	device_lock(&port->dev);
 	dport = list_first_entry(&port->dports, typeof(*dport), list);
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index ebd061d03950..454d4d846eb2 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -47,7 +47,7 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 
-	return sysfs_emit(buf, "%#llx\n", cxld->range.start);
+	return sysfs_emit(buf, "%#llx\n", cxld->res.start);
 }
 static DEVICE_ATTR_RO(start);
 
@@ -56,7 +56,7 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 
-	return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
+	return sysfs_emit(buf, "%#llx\n", resource_size(&cxld->res));
 }
 static DEVICE_ATTR_RO(size);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 4483e1a39fc3..7f2e2bdc7883 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -195,7 +195,7 @@ enum cxl_decoder_type {
  * struct cxl_decoder - CXL address range decode configuration
  * @dev: this decoder's device
  * @id: kernel device name id
- * @range: address range considered by this decoder
+ * @res: address space resources considered by this decoder
  * @interleave_ways: number of cxl_dports in this decode
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
@@ -206,7 +206,7 @@ enum cxl_decoder_type {
 struct cxl_decoder {
 	struct device dev;
 	int id;
-	struct range range;
+	struct resource res;
 	int interleave_ways;
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (4 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 21:00   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 07/28] cxl/core: Move target population locking to caller Ben Widawsky
                   ` (21 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Endpoints have decoders too. It is useful to share the same
infrastructure from cxl_core. Endpoints do not have dports (downstream
targets), only the underlying physical medium. As a result, some special
casing is needed.

There is no functional change introduced yet as endpoints don't actually
enumerate decoders yet.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core/bus.c | 40 ++++++++++++++++++++++++++++++++--------
 drivers/cxl/cxl.h      |  3 ++-
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 454d4d846eb2..5564a71773e2 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -175,6 +175,12 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
 	NULL,
 };
 
+static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
+	&cxl_decoder_base_attribute_group,
+	&cxl_base_attribute_group,
+	NULL,
+};
+
 static void cxl_decoder_release(struct device *dev)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
@@ -184,6 +190,12 @@ static void cxl_decoder_release(struct device *dev)
 	kfree(cxld);
 }
 
+static const struct device_type cxl_decoder_endpoint_type = {
+	.name = "cxl_decoder_endpoint",
+	.release = cxl_decoder_release,
+	.groups = cxl_decoder_endpoint_attribute_groups,
+};
+
 static const struct device_type cxl_decoder_switch_type = {
 	.name = "cxl_decoder_switch",
 	.release = cxl_decoder_release,
@@ -196,6 +208,11 @@ static const struct device_type cxl_decoder_root_type = {
 	.groups = cxl_decoder_root_attribute_groups,
 };
 
+static bool is_endpoint_decoder(struct device *dev)
+{
+	return dev->type == &cxl_decoder_endpoint_type;
+}
+
 bool is_root_decoder(struct device *dev)
 {
 	return dev->type == &cxl_decoder_root_type;
@@ -483,7 +500,8 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 	return rc;
 }
 
-struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
+				      unsigned int nr_targets)
 {
 	struct cxl_decoder *cxld, cxld_const_init = {
 		.nr_targets = nr_targets,
@@ -491,7 +509,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 	struct device *dev;
 	int rc = 0;
 
-	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
+	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
 		return ERR_PTR(-EINVAL);
 
 	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
@@ -510,8 +528,11 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
 	dev->parent = &port->dev;
 	dev->bus = &cxl_bus_type;
 
+	/* Endpoints don't have a target list */
+	if (nr_targets == 0)
+		dev->type = &cxl_decoder_endpoint_type;
 	/* root ports do not have a cxl_port_type parent */
-	if (port->dev.parent->type == &cxl_port_type)
+	else if (port->dev.parent->type == &cxl_port_type)
 		dev->type = &cxl_decoder_switch_type;
 	else
 		dev->type = &cxl_decoder_root_type;
@@ -538,12 +559,15 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 	if (cxld->interleave_ways < 1)
 		return -EINVAL;
 
-	port = to_cxl_port(cxld->dev.parent);
-	rc = decoder_populate_targets(cxld, port, target_map);
-	if (rc)
-		return rc;
-
 	dev = &cxld->dev;
+
+	port = to_cxl_port(cxld->dev.parent);
+	if (!is_endpoint_decoder(dev)) {
+		rc = decoder_populate_targets(cxld, port, target_map);
+		if (rc)
+			return rc;
+	}
+
 	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
 		return rc;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 7f2e2bdc7883..91b8fd54bc93 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -293,7 +293,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
-struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
+struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
+				      unsigned int nr_targets);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 07/28] cxl/core: Move target population locking to caller
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (5 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-29 23:03   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 08/28] cxl/port: Introduce a port driver Ben Widawsky
                   ` (20 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

A CXL port driver which would be responsible for adding decoders will
want to do so during probe. Unfortunately, the device lock is held
during probe. Making this change allows for a new lock-free variant of
cxl_decoder_add which will be used by the port driver.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core/bus.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 5564a71773e2..5f2bde47a5c2 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -478,25 +478,18 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 	if (!target_map)
 		return 0;
 
-	device_lock(&port->dev);
-	if (list_empty(&port->dports)) {
-		rc = -EINVAL;
-		goto out_unlock;
-	}
+	device_lock_assert(&port->dev);
+	if (list_empty(&port->dports))
+		return -EINVAL;
 
 	for (i = 0; i < cxld->nr_targets; i++) {
 		struct cxl_dport *dport = find_dport(port, target_map[i]);
 
-		if (!dport) {
-			rc = -ENXIO;
-			goto out_unlock;
-		}
+		if (!dport)
+			return -ENXIO;
 		cxld->target[i] = dport;
 	}
 
-out_unlock:
-	device_unlock(&port->dev);
-
 	return rc;
 }
 
@@ -548,7 +541,7 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 {
 	struct cxl_port *port;
 	struct device *dev;
-	int rc;
+	int rc = 0;
 
 	if (WARN_ON_ONCE(!cxld))
 		return -EINVAL;
@@ -562,11 +555,13 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 	dev = &cxld->dev;
 
 	port = to_cxl_port(cxld->dev.parent);
-	if (!is_endpoint_decoder(dev)) {
+
+	device_lock(&port->dev);
+	if (!is_endpoint_decoder(dev))
 		rc = decoder_populate_targets(cxld, port, target_map);
-		if (rc)
-			return rc;
-	}
+	device_unlock(&port->dev);
+	if (rc)
+		return rc;
 
 	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (6 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 07/28] cxl/core: Move target population locking to caller Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-30  1:37   ` Dan Williams
  2021-11-03 15:18   ` Jonathan Cameron
  2021-10-22 18:36 ` [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers Ben Widawsky
                   ` (19 subsequent siblings)
  27 siblings, 2 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The CXL port driver will be responsible for managing the decoder
resources contained within the port. It will also provide APIs that
other drivers will consume for managing these resources.

Since the port driver is responsible for instantiating new decoders, and
it does so during probe(), a new API is needed to add decoders for
callers which already hold the device lock of the port.

This patch has no functional change because no driver is registering new
ports and the root ports that are already registered should be skipped.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .../driver-api/cxl/memory-devices.rst         |   5 +
 drivers/cxl/Makefile                          |   2 +
 drivers/cxl/acpi.c                            |   1 +
 drivers/cxl/core/bus.c                        |  44 +++-
 drivers/cxl/core/regs.c                       |   6 +-
 drivers/cxl/cxl.h                             |  31 ++-
 drivers/cxl/port.c                            | 242 ++++++++++++++++++
 7 files changed, 315 insertions(+), 16 deletions(-)
 create mode 100644 drivers/cxl/port.c

diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 3b8f41395f6b..fbf0393cdddc 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -28,6 +28,11 @@ CXL Memory Device
 .. kernel-doc:: drivers/cxl/pci.c
    :internal:
 
+CXL Port
+--------
+.. kernel-doc:: drivers/cxl/port.c
+   :doc: cxl port
+
 CXL Core
 --------
 .. kernel-doc:: drivers/cxl/cxl.h
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index cf07ae6cea17..40b386aaedf7 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_CXL_BUS) += core/
+obj-$(CONFIG_CXL_MEM) += cxl_port.o
 obj-$(CONFIG_CXL_PCI) += cxl_pci.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
@@ -7,3 +8,4 @@ obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
+cxl_port-y := port.o
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index b972abc9f6ef..d61397055e9f 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -491,3 +491,4 @@ static struct platform_driver cxl_acpi_driver = {
 module_platform_driver(cxl_acpi_driver);
 MODULE_LICENSE("GPL v2");
 MODULE_IMPORT_NS(CXL);
+MODULE_SOFTDEP("pre: cxl_port");
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 5f2bde47a5c2..dffbd0ac64af 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -266,6 +266,7 @@ struct cxl_port *to_cxl_port(struct device *dev)
 		return NULL;
 	return container_of(dev, struct cxl_port, dev);
 }
+EXPORT_SYMBOL_GPL(to_cxl_port);
 
 static void unregister_port(void *_port)
 {
@@ -537,11 +538,21 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 }
 EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
 
-int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
+/**
+ * cxl_decoder_add_locked() - Adds a decoder with port locked.
+ * @cxld: Decoder to be added
+ * @target_map: Downstream port array of port ids
+ *
+ * The parent port owning the @cxld must have its device mutex locked before
+ * calling this function.
+ *
+ * Returns 0 if successful; otherwise, a negative error code is returned.
+ */
+int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
 {
 	struct cxl_port *port;
 	struct device *dev;
-	int rc = 0;
+	int rc;
 
 	if (WARN_ON_ONCE(!cxld))
 		return -EINVAL;
@@ -556,12 +567,11 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 
 	port = to_cxl_port(cxld->dev.parent);
 
-	device_lock(&port->dev);
-	if (!is_endpoint_decoder(dev))
+	if (!is_endpoint_decoder(dev)) {
 		rc = decoder_populate_targets(cxld, port, target_map);
-	device_unlock(&port->dev);
-	if (rc)
-		return rc;
+		if (rc)
+			return rc;
+	}
 
 	rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
 	if (rc)
@@ -569,6 +579,24 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
 
 	return device_add(dev);
 }
+EXPORT_SYMBOL_GPL(cxl_decoder_add_locked);
+
+int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
+{
+	int rc;
+
+	if (WARN_ON_ONCE(!cxld))
+		return -EINVAL;
+
+	if (WARN_ON_ONCE(IS_ERR(cxld)))
+		return PTR_ERR(cxld);
+
+	device_lock(&to_cxl_port(cxld->dev.parent)->dev);
+	rc = cxl_decoder_add_locked(cxld, target_map);
+	device_unlock(&to_cxl_port(cxld->dev.parent)->dev);
+
+	return rc;
+}
 EXPORT_SYMBOL_GPL(cxl_decoder_add);
 
 static void cxld_unregister(void *dev)
@@ -627,6 +655,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM_BRIDGE;
 	if (dev->type == &cxl_nvdimm_type)
 		return CXL_DEVICE_NVDIMM;
+	if (dev->type == &cxl_port_type)
+		return CXL_DEVICE_PORT;
 	return 0;
 }
 
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 40598905c080..c8ab8880b81b 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -159,9 +159,8 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
 }
 EXPORT_SYMBOL_GPL(cxl_probe_device_regs);
 
-static void __iomem *devm_cxl_iomap_block(struct device *dev,
-					  resource_size_t addr,
-					  resource_size_t length)
+void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
+				   resource_size_t length)
 {
 	void __iomem *ret_val;
 	struct resource *res;
@@ -180,6 +179,7 @@ static void __iomem *devm_cxl_iomap_block(struct device *dev,
 
 	return ret_val;
 }
+EXPORT_SYMBOL_GPL(devm_cxl_iomap_block);
 
 int cxl_map_component_regs(struct pci_dev *pdev,
 			   struct cxl_component_regs *regs,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 91b8fd54bc93..ad22caf9135c 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -17,6 +17,9 @@
  * (port-driver, region-driver, nvdimm object-drivers... etc).
  */
 
+/* CXL 2.0 8.2.4 CXL Component Register Layout and Definition */
+#define CXL_COMPONENT_REG_BLOCK_SIZE SZ_64K
+
 /* CXL 2.0 8.2.5 CXL.cache and CXL.mem Registers*/
 #define CXL_CM_OFFSET 0x1000
 #define CXL_CM_CAP_HDR_OFFSET 0x0
@@ -36,11 +39,22 @@
 #define CXL_HDM_DECODER_CAP_OFFSET 0x0
 #define   CXL_HDM_DECODER_COUNT_MASK GENMASK(3, 0)
 #define   CXL_HDM_DECODER_TARGET_COUNT_MASK GENMASK(7, 4)
-#define CXL_HDM_DECODER0_BASE_LOW_OFFSET 0x10
-#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET 0x14
-#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET 0x18
-#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET 0x1c
-#define CXL_HDM_DECODER0_CTRL_OFFSET 0x20
+#define   CXL_HDM_DECODER_INTERLEAVE_11_8 BIT(8)
+#define   CXL_HDM_DECODER_INTERLEAVE_14_12 BIT(9)
+#define CXL_HDM_DECODER_CTRL_OFFSET 0x4
+#define   CXL_HDM_DECODER_ENABLE BIT(1)
+#define CXL_HDM_DECODER0_BASE_LOW_OFFSET(i) (0x20 * (i) + 0x10)
+#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i) (0x20 * (i) + 0x14)
+#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i) (0x20 * (i) + 0x18)
+#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i) (0x20 * (i) + 0x1c)
+#define CXL_HDM_DECODER0_CTRL_OFFSET(i) (0x20 * (i) + 0x20)
+#define   CXL_HDM_DECODER0_CTRL_IG_MASK GENMASK(3, 0)
+#define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
+#define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
+#define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
+#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
+#define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
+#define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
 
 static inline int cxl_hdm_decoder_count(u32 cap_hdr)
 {
@@ -164,6 +178,8 @@ int cxl_map_device_regs(struct pci_dev *pdev,
 enum cxl_regloc_type;
 int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 		      struct cxl_register_map *map);
+void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
+				   resource_size_t length);
 
 #define CXL_RESOURCE_NONE ((resource_size_t) -1)
 #define CXL_TARGET_STRLEN 20
@@ -178,7 +194,8 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 #define CXL_DECODER_F_TYPE2 BIT(2)
 #define CXL_DECODER_F_TYPE3 BIT(3)
 #define CXL_DECODER_F_LOCK  BIT(4)
-#define CXL_DECODER_F_MASK  GENMASK(4, 0)
+#define CXL_DECODER_F_EN    BIT(5)
+#define CXL_DECODER_F_MASK  GENMASK(5, 0)
 
 enum cxl_decoder_type {
        CXL_DECODER_ACCELERATOR = 2,
@@ -295,6 +312,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
 struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 				      unsigned int nr_targets);
+int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
 
@@ -323,6 +341,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
+#define CXL_DEVICE_PORT			3
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
new file mode 100644
index 000000000000..ebbfb72ae995
--- /dev/null
+++ b/drivers/cxl/port.c
@@ -0,0 +1,242 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include "cxlmem.h"
+
+/**
+ * DOC: cxl port
+ *
+ * The port driver implements the set of functionality needed to allow full
+ * decoder enumeration and routing. A CXL port is an abstraction of a CXL
+ * component that implements some amount of CXL decoding of CXL.mem traffic.
+ * As of the CXL 2.0 spec, this includes:
+ *
+ *	.. list-table:: CXL Components w/ Ports
+ *		:widths: 25 25 50
+ *		:header-rows: 1
+ *
+ *		* - component
+ *		  - upstream
+ *		  - downstream
+ *		* - Hostbridge
+ *		  - ACPI0016
+ *		  - root port
+ *		* - Switch
+ *		  - Switch Upstream Port
+ *		  - Switch Downstream Port
+ *		* - Endpoint (not yet implemented)
+ *		  - Endpoint Port
+ *		  - N/A
+ *
+ * The primary service this driver provides is enumerating HDM decoders and
+ * presenting APIs to other drivers to utilize the decoders.
+ */
+
+struct cxl_port_data {
+	struct cxl_component_regs regs;
+
+	struct port_caps {
+		unsigned int count;
+		unsigned int tc;
+		unsigned int interleave11_8;
+		unsigned int interleave14_12;
+	} caps;
+};
+
+static inline int cxl_hdm_decoder_ig(u32 ctrl)
+{
+	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
+
+	return 8 + val;
+}
+
+static inline int cxl_hdm_decoder_iw(u32 ctrl)
+{
+	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
+
+	return 1 << val;
+}
+
+static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
+{
+	void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
+	struct port_caps *caps = &cpd->caps;
+	u32 hdm_cap;
+
+	hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
+
+	caps->count = cxl_hdm_decoder_count(hdm_cap);
+	caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
+	caps->interleave11_8 =
+		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
+	caps->interleave14_12 =
+		FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
+}
+
+static int map_regs(struct cxl_port *port, void __iomem *crb,
+		    struct cxl_port_data *cpd)
+{
+	struct cxl_register_map map;
+	struct cxl_component_reg_map *comp_map = &map.component_map;
+
+	cxl_probe_component_regs(&port->dev, crb, comp_map);
+	if (!comp_map->hdm_decoder.valid) {
+		dev_err(&port->dev, "HDM decoder registers invalid\n");
+		return -ENXIO;
+	}
+
+	cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
+
+	return 0;
+}
+
+static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
+{
+	u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
+
+	if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
+		return 0;
+
+	return ioread64_hi_lo(hdm_decoder +
+			      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
+}
+
+static bool is_endpoint_port(struct cxl_port *port)
+{
+	if (!port->uport->driver)
+		return false;
+
+	return to_cxl_drv(port->uport->driver)->id ==
+	       CXL_DEVICE_MEMORY_EXPANDER;
+}
+
+static int enumerate_hdm_decoders(struct cxl_port *port,
+				  struct cxl_port_data *portdata)
+{
+	int i = 0;
+
+	for (i = 0; i < portdata->caps.count; i++) {
+		int iw = 1, ig = 0, rc, target_count = portdata->caps.tc;
+		void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
+		enum cxl_decoder_type type = CXL_DECODER_EXPANDER;
+		struct resource res = DEFINE_RES_MEM(0, 0);
+		struct cxl_decoder *cxld;
+		int *target_map = NULL;
+		u64 size;
+
+		if (is_endpoint_port(port))
+			target_count = 0;
+
+		cxld = cxl_decoder_alloc(port, target_count);
+		if (IS_ERR(cxld)) {
+			dev_warn(&port->dev,
+				 "Failed to allocate the decoder\n");
+			return PTR_ERR(cxld);
+		}
+
+		size = get_decoder_size(hdm_decoder, i);
+		if (size != 0) {
+			int temp[CXL_DECODER_MAX_INTERLEAVE];
+			u64 target_list, base;
+			u32 ctrl;
+			int j;
+
+			target_map = temp;
+			ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(i));
+			base = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_BASE_LOW_OFFSET(i));
+			res = (struct resource)DEFINE_RES_MEM(base, size);
+
+			cxld->flags = CXL_DECODER_F_EN;
+			iw = cxl_hdm_decoder_iw(ctrl);
+			ig = cxl_hdm_decoder_ig(ctrl);
+
+			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
+				type = CXL_DECODER_ACCELERATOR;
+
+			target_list = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_TL_LOW(i));
+			for (j = 0; j < iw; j++)
+				target_map[j] = (target_list >> (j * 8)) & 0xff;
+		}
+
+		cxld->target_type = type;
+		cxld->res = res;
+		cxld->interleave_ways = iw;
+		cxld->interleave_granularity = ig;
+
+		rc = cxl_decoder_add_locked(cxld, target_map);
+		if (rc)
+			put_device(&cxld->dev);
+		else
+			rc = cxl_decoder_autoremove(port->uport->parent, cxld);
+		if (rc)
+			dev_err(&port->dev, "Failed to add decoder\n");
+	}
+
+	return 0;
+}
+
+static int cxl_port_probe(struct device *dev)
+{
+	struct cxl_port *port = to_cxl_port(dev);
+	struct cxl_port_data *portdata;
+	void __iomem *crb;
+	u32 ctrl;
+	int rc;
+
+	if (port->component_reg_phys == CXL_RESOURCE_NONE)
+		return 0;
+
+	portdata = devm_kzalloc(dev, sizeof(*portdata), GFP_KERNEL);
+	if (!portdata)
+		return -ENOMEM;
+
+	crb = devm_cxl_iomap_block(&port->dev, port->component_reg_phys,
+				   CXL_COMPONENT_REG_BLOCK_SIZE);
+	if (IS_ERR_OR_NULL(crb)) {
+		dev_err(&port->dev, "No component registers mapped\n");
+		return -ENXIO;
+	}
+
+	rc = map_regs(port, crb, portdata);
+	if (rc)
+		return rc;
+
+	get_caps(port, portdata);
+	if (portdata->caps.count == 0) {
+		dev_err(&port->dev, "Spec violation. Caps invalid\n");
+		return -ENXIO;
+	}
+
+	/*
+	 * Enable HDM decoders for this port.
+	 *
+	 * FIXME: If the component was using DVSEC range registers for decode,
+	 * this will destroy that.
+	 */
+	ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
+	ctrl |= CXL_HDM_DECODER_ENABLE;
+	writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
+
+	rc = enumerate_hdm_decoders(port, portdata);
+	if (rc) {
+		dev_err(&port->dev, "Couldn't enumerate decoders (%d)\n", rc);
+		return rc;
+	}
+
+	dev_set_drvdata(dev, portdata);
+	return 0;
+}
+
+static struct cxl_driver cxl_port_driver = {
+	.name = "cxl_port",
+	.probe = cxl_port_probe,
+	.id = CXL_DEVICE_PORT,
+};
+module_cxl_driver(cxl_port_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_PORT);
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (7 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 08/28] cxl/port: Introduce a port driver Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-31 18:03   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 10/28] cxl/core: Store global list of root ports Ben Widawsky
                   ` (18 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Now that the port driver exists and is able to do proper decoder
enumeration of the component registers, it becomes trivial to use that
for host bridge uports. For reasons out of scope, a functional change
would be visible if the HDM decoder was programmed by BIOS to values
other than the full address range. Similarly if a type2 device was
connected to this root port and programmed by BIOS, that can now be
acted upon accordingly.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index d61397055e9f..8cca0814dfb8 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -280,12 +280,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	struct cxl_port *root_port = arg;
 	struct device *host = root_port->dev.parent;
 	struct acpi_device *bridge = to_cxl_host_bridge(host, match);
+	struct cxl_component_reg_map map;
 	struct acpi_pci_root *pci_root;
 	struct cxl_walk_context ctx;
 	int single_port_map[1], rc;
 	struct cxl_decoder *cxld;
 	struct cxl_dport *dport;
 	struct cxl_port *port;
+	void __iomem *crb;
 
 	if (!bridge)
 		return 0;
@@ -318,10 +320,31 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return -ENODEV;
 	if (ctx.error)
 		return ctx.error;
+	/*
+	 * If the host bridge has more than 1 root port, it must have registers
+	 * controlling the HDM decoders. Those will be enumerated by the port
+	 * driver.
+	 */
 	if (ctx.count > 1)
 		return 0;
 
-	/* TODO: Scan CHBCR for HDM Decoder resources */
+	/*
+	 * If the single ported host bridge has a component register block,
+	 * simply let the port driver handle the decoder enumeration.
+	 *
+	 * Host bridge component registers live in the system's physical address
+	 * space.
+	 */
+	crb = ioremap(dport->component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE);
+	if (crb) {
+		cxl_probe_component_regs(&root_port->dev, crb, &map);
+		iounmap(crb);
+		if (map.hdm_decoder.valid) {
+			dev_dbg(host,
+				"Found single port host bridge with component registers\n");
+			return 0;
+		}
+	}
 
 	/*
 	 * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 10/28] cxl/core: Store global list of root ports
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (8 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-31 18:32   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion Ben Widawsky
                   ` (17 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

CXL root ports (the downstream port to a host bridge) are to be
enumerated by a platform specific driver. In the case of ACPI compliant
systems, this is like the cxl_acpi driver. Root ports are the first
CXL spec defined component that can be "found" by that platform specific
driver.

By storing a list of these root ports components in lower levels of the
topology (switches and endpoints), have a mechanism to walk up their
device hierarchy to find an enumerated root port. This will be necessary
for region programming.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c            |  4 ++--
 drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
 drivers/cxl/cxl.h             |  5 ++++-
 tools/testing/cxl/mock_acpi.c |  4 ++--
 4 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 8cca0814dfb8..625c5d95b83f 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -231,7 +231,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 		creg = cxl_reg_block(pdev, &map);
 
 	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
 	if (rc) {
 		ctx->error = rc;
 		return rc;
@@ -406,7 +406,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
 		dev_dbg(host, "No CHBS found for Host Bridge: %s\n",
 			dev_name(match));
 
-	rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs));
+	rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs), false);
 	if (rc) {
 		dev_err(host, "failed to add downstream port: %s\n",
 			dev_name(match));
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index dffbd0ac64af..03394a3ae75f 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -25,6 +25,8 @@
  */
 
 static DEFINE_IDA(cxl_port_ida);
+static LIST_HEAD(cxl_root_ports);
+static DECLARE_RWSEM(root_port_sem);
 
 static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
 			    char *buf)
@@ -268,12 +270,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(to_cxl_port);
 
+struct cxl_dport *cxl_get_root_dport(struct device *dev)
+{
+	struct cxl_dport *ret = NULL;
+	struct cxl_dport *dport;
+
+	down_read(&root_port_sem);
+	list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
+		if (dport->dport == dev) {
+			ret = dport;
+			break;
+		}
+	}
+
+	up_read(&root_port_sem);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cxl_get_root_dport);
+
 static void unregister_port(void *_port)
 {
 	struct cxl_port *port = _port;
 	struct cxl_dport *dport;
 
 	device_lock(&port->dev);
+	down_read(&root_port_sem);
 	list_for_each_entry(dport, &port->dports, list) {
 		char link_name[CXL_TARGET_STRLEN];
 
@@ -281,7 +302,10 @@ static void unregister_port(void *_port)
 			     dport->port_id) >= CXL_TARGET_STRLEN)
 			continue;
 		sysfs_remove_link(&port->dev.kobj, link_name);
+
+		list_del_init(&dport->root_port_link);
 	}
+	up_read(&root_port_sem);
 	device_unlock(&port->dev);
 	device_unregister(&port->dev);
 }
@@ -431,12 +455,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
  * @dport_dev: firmware or PCI device representing the dport
  * @port_id: identifier for this dport in a decoder's target list
  * @component_reg_phys: optional location of CXL component registers
+ * @root_port: is this a root port (hostbridge downstream)
  *
  * Note that all allocations and links are undone by cxl_port deletion
  * and release.
  */
 int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
-		  resource_size_t component_reg_phys)
+		  resource_size_t component_reg_phys, bool root_port)
 {
 	char link_name[CXL_TARGET_STRLEN];
 	struct cxl_dport *dport;
@@ -451,6 +476,7 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 		return -ENOMEM;
 
 	INIT_LIST_HEAD(&dport->list);
+	INIT_LIST_HEAD(&dport->root_port_link);
 	dport->dport = get_device(dport_dev);
 	dport->port_id = port_id;
 	dport->component_reg_phys = component_reg_phys;
@@ -464,6 +490,12 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 	if (rc)
 		goto err;
 
+	if (root_port) {
+		down_write(&root_port_sem);
+		list_add_tail(&dport->root_port_link, &cxl_root_ports);
+		up_write(&root_port_sem);
+	}
+
 	return 0;
 err:
 	cxl_dport_release(dport);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index ad22caf9135c..d26d97e1098d 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -291,6 +291,7 @@ struct cxl_port {
  * @component_reg_phys: downstream port component registers
  * @port: reference to cxl_port that contains this downstream port
  * @list: node for a cxl_port's list of cxl_dport instances
+ * @root_port_link: node for global list of root ports
  */
 struct cxl_dport {
 	struct device *dport;
@@ -298,6 +299,7 @@ struct cxl_dport {
 	resource_size_t component_reg_phys;
 	struct cxl_port *port;
 	struct list_head list;
+	struct list_head root_port_link;
 };
 
 struct cxl_port *to_cxl_port(struct device *dev);
@@ -306,7 +308,8 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 				   struct cxl_port *parent_port);
 
 int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
-		  resource_size_t component_reg_phys);
+		  resource_size_t component_reg_phys, bool root_port);
+struct cxl_dport *cxl_get_root_dport(struct device *dev);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
diff --git a/tools/testing/cxl/mock_acpi.c b/tools/testing/cxl/mock_acpi.c
index 4c8a493ace56..ddefc4345f36 100644
--- a/tools/testing/cxl/mock_acpi.c
+++ b/tools/testing/cxl/mock_acpi.c
@@ -57,7 +57,7 @@ static int match_add_root_port(struct pci_dev *pdev, void *data)
 
 	/* TODO walk DVSEC to find component register base */
 	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
+	rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE, true);
 	if (rc) {
 		dev_err(dev, "failed to add dport: %s (%d)\n",
 			dev_name(&pdev->dev), rc);
@@ -78,7 +78,7 @@ static int mock_add_root_port(struct platform_device *pdev, void *data)
 	struct device *dev = ctx->dev;
 	int rc;
 
-	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE);
+	rc = cxl_add_dport(port, &pdev->dev, pdev->id, CXL_RESOURCE_NONE, true);
 	if (rc) {
 		dev_err(dev, "failed to add dport: %s (%d)\n",
 			dev_name(&pdev->dev), rc);
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (9 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 10/28] cxl/core: Store global list of root ports Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-31 19:25   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs Ben Widawsky
                   ` (16 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Ensure that devices being probed before cxl_acpi has completed will get
a second chance.

CXL drivers are brought up through two enumerable, asynchronous
mechanism. The leaf nodes in the CXL topology, endpoints, are enumerated
via PCI headers. The root node's enumeration is platform specific. The
current defacto mechanism for enumerating the root node is through the
presence of an ACPI device, ACPI0017.

The primary job of a cxl_mem driver is to determine if CXL.mem traffic
can be routed to/from the PCIe device that it is being probed. A
prerequisite in this determination is that all CXL components in the
path from root to leaf are capable of routing CXL.mem traffic. If the
cxl_mem driver is probed before cxl_acpi is complete the driver will be
unable to make this determination. To address this, cxl_acpi (or in the
future, another platform specific driver) will rescan all devices to
make sure the ordering is correct.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 625c5d95b83f..1cc3a74c16bd 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -451,6 +451,14 @@ static u32 cedt_instance(struct platform_device *pdev)
 	return U32_MAX;
 }
 
+static void bus_rescan(struct work_struct *work)
+{
+	if (bus_rescan_devices(&cxl_bus_type))
+		pr_err("Failed to rescan CXL bus\n");
+}
+
+static DECLARE_WORK(deferred_bus_rescan, bus_rescan);
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
 	int rc;
@@ -484,9 +492,19 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	if (rc)
 		goto out;
 
-	if (IS_ENABLED(CONFIG_CXL_PMEM))
+	if (IS_ENABLED(CONFIG_CXL_PMEM)) {
 		rc = device_for_each_child(&root_port->dev, root_port,
 					   add_root_nvdimm_bridge);
+		if (rc)
+			goto out;
+	}
+
+	/*
+	 * While ACPI is scanning hostbridge ports, switches and memory devices
+	 * may have been probed. Those devices will need to know whether the
+	 * hostbridge is CXL capable.
+	 */
+	schedule_work(&deferred_bus_rescan);
 
 out:
 	acpi_put_table(acpi_cedt);
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (10 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-31 20:13   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 13/28] cxl: Flesh out register names Ben Widawsky
                   ` (15 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Bake component registers into the memdev creation API in order to be
able to use them as part of driver probing.

Component register base addresses are obtained through PCI mechanisms.
As such it makes most sense for the cxl_pci driver to obtain that
address. In order to reuse the port driver for enumerating decoder
resources for an endpoint, it is desirable to be able to add the
endpoint as a port.  Unfortunately, by the time an endpoint driver would
run, it no longer has any concept of the underlying PCI device (this is
done intentionally to provide separation between drivers).

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core/memdev.c    |  5 +++--
 drivers/cxl/cxlmem.h         |  5 ++++-
 drivers/cxl/pci.c            | 17 ++++++++++++++++-
 tools/testing/cxl/test/mem.c |  3 ++-
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index bf1b04d00ff4..15762c16d83f 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -276,8 +276,8 @@ static const struct file_operations cxl_memdev_fops = {
 	.llseek = noop_llseek,
 };
 
-struct cxl_memdev *
-devm_cxl_add_memdev(struct cxl_mem *cxlm)
+struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm,
+				       resource_size_t component_reg_phys)
 {
 	struct cxl_memdev *cxlmd;
 	struct device *dev;
@@ -298,6 +298,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm)
 	 * needed as this is ordered with cdev_add() publishing the device.
 	 */
 	cxlmd->cxlm = cxlm;
+	cxlmd->creg_base = component_reg_phys;
 
 	cdev = &cxlmd->cdev;
 	rc = cdev_device_add(cdev, dev);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index c4f450ad434d..62fe8e2c59e4 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -35,12 +35,14 @@
  * @cdev: char dev core object for ioctl operations
  * @cxlm: pointer to the parent device driver data
  * @id: id number of this memdev instance.
+ * @creg_base: register base of component registers
  */
 struct cxl_memdev {
 	struct device dev;
 	struct cdev cdev;
 	struct cxl_mem *cxlm;
 	int id;
+	resource_size_t creg_base;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
@@ -48,7 +50,8 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
 	return container_of(dev, struct cxl_memdev, dev);
 }
 
-struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm);
+struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm,
+				       resource_size_t component_reg_phys);
 
 /**
  * struct cxl_mbox_cmd - A command to be submitted to hardware.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index d1adc759d051..96a312ed8269 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -421,6 +421,7 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	resource_size_t creg = CXL_RESOURCE_NONE;
 	struct cxl_register_map map;
 	struct cxl_memdev *cxlmd;
 	struct cxl_mem *cxlm;
@@ -465,7 +466,21 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	cxlmd = devm_cxl_add_memdev(cxlm);
+	/*
+	 * If the component registers can't be found, the cxl_pci driver may
+	 * still be useful for management functions so don't return an error.
+	 *
+	 * XXX: Creating the device is going to kick of the cxl_mem probing.
+	 * That probe requires the component registers. Therefore, the register
+	 * block must always be found first.
+	 */
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (rc)
+		dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
+	else
+		creg = cxl_reg_block(pdev, &map);
+
+	cxlmd = devm_cxl_add_memdev(cxlm, creg);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 12a8437a9ca0..471fc7fb5418 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -227,7 +227,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
-	cxlmd = devm_cxl_add_memdev(cxlm);
+	/* TODO: mock component registers, or... */
+	cxlmd = devm_cxl_add_memdev(cxlm, CXL_RESOURCE_NONE);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (11 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-31 20:18   ` Dan Williams
  2021-11-03 15:53   ` Jonathan Cameron
  2021-10-22 18:36 ` [RFC PATCH v2 14/28] cxl: Hide devm host for ports Ben Widawsky
                   ` (14 subsequent siblings)
  27 siblings, 2 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Get a better naming scheme in place for upcoming additions. To solidify
the schema, add all the DVSEC identifiers to start with.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

---
See:
https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
---
 drivers/cxl/core/regs.c | 14 ++++++++------
 drivers/cxl/pci.h       | 38 ++++++++++++++++++++++++++++++--------
 2 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index c8ab8880b81b..b837196fbf39 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -253,9 +253,11 @@ static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
 				struct cxl_register_map *map)
 {
 	map->block_offset =
-		((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
-	map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
-	map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
+		((u64)reg_hi << 32) |
+		(reg_lo & DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK);
+	map->barno = FIELD_GET(DVSEC_REGISTER_LOCATOR_BIR_MASK, reg_lo);
+	map->reg_type =
+		FIELD_GET(DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK, reg_lo);
 }
 
 /**
@@ -276,15 +278,15 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
 	int regloc, i;
 
 	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
-					   PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
+					   CXL_DVSEC_REGISTER_LOCATOR);
 	if (!regloc)
 		return -ENXIO;
 
 	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
 	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
 
-	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
-	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
+	regloc += DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET;
+	regblocks = (regloc_size - DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET) / 8;
 
 	for (i = 0; i < regblocks; i++, regloc += 8) {
 		u32 reg_lo, reg_hi;
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index 12fdcb1b14e5..fe2898b17736 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -7,17 +7,36 @@
 
 /*
  * See section 8.1 Configuration Space Registers in the CXL 2.0
- * Specification
+ * Specification. Names are taken straight from the specification with "CXL" and
+ * "DVSEC" redundancies removed.
  */
 #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
 #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
-#define PCI_DVSEC_ID_CXL		0x0
 
-#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
-#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
+/* 8.1.3: PCIe DVSEC for CXL Device */
+#define CXL_DVSEC_PCIE_DEVICE					0
 
-/* BAR Indicator Register (BIR) */
-#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
+/* 8.1.4: Non-CXL Function Map DVSEC */
+#define CXL_DVSEC_FUNCTION_MAP					2
+
+/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
+#define CXL_DVSEC_PORT_EXTENSIONS				3
+
+/* 8.1.6: GPF DVSEC for CXL Port */
+#define CXL_DVSEC_PORT_GPF					4
+
+/* 8.1.7: GPF DVSEC for CXL Device */
+#define CXL_DVSEC_DEVICE_GPF					5
+
+/* 8.1.8: PCIe DVSEC for Flex Bus Port */
+#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
+
+/* 8.1.9: Register Locator DVSEC */
+#define CXL_DVSEC_REGISTER_LOCATOR				8
+#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET			0xC
+#define     DVSEC_REGISTER_LOCATOR_BIR_MASK			GENMASK(2, 0)
+#define	    DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK	GENMASK(15, 8)
+#define     DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK	GENMASK(31, 16)
 
 /* Register Block Identifier (RBI) */
 enum cxl_regloc_type {
@@ -28,8 +47,11 @@ enum cxl_regloc_type {
 	CXL_REGLOC_RBI_TYPES
 };
 
-#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
-#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
+/* 8.1.10: MLD DVSEC */
+#define CXL_DVSEC_MLD						9
+
+/* 14.16.1 CXL Device Test Capability Advertisement */
+#define CXL_DVSEC_PCIE_TEST_CAPABILITY				10
 
 #define cxl_reg_block(pdev, map)                                               \
 	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 14/28] cxl: Hide devm host for ports
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (12 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 13/28] cxl: Flesh out register names Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-31 21:14   ` Dan Williams
  2021-10-22 18:36 ` [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports Ben Widawsky
                   ` (13 subsequent siblings)
  27 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Generally, a platform will only have a single instance of an  entity
which has a platform specific CXL driver. For the time being, this is
only the ACPI0017 device defined as part of the CXL specification and
ancillary specs (ACPI). With this in mind it becomes easy to drop the
need to pass this around for devm functions which want to know the root
host. That change allows future work with the mem/port driver to avoid
having to determine the root host for port creation for endpoints and
switches.

There's some technical dept leftover with regard to cxl_test.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c     | 15 ++++++++++++---
 drivers/cxl/core/bus.c | 33 +++++++++++++++++++++++++++++----
 drivers/cxl/cxl.h      |  5 ++++-
 3 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 1cc3a74c16bd..f6b78736c933 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -298,8 +298,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 		return -ENODEV;
 	}
 
-	port = devm_cxl_add_port(host, match, dport->component_reg_phys,
-				 root_port);
+	port = devm_cxl_add_port(match, dport->component_reg_phys, root_port);
 	if (IS_ERR(port))
 		return PTR_ERR(port);
 	dev_dbg(host, "%s: add: %s\n", dev_name(match), dev_name(&port->dev));
@@ -467,7 +466,9 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	struct device *host = &pdev->dev;
 	struct acpi_device *adev = ACPI_COMPANION(host);
 
-	root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
+	cxl_register_root(host);
+
+	root_port = devm_cxl_add_port(host, CXL_RESOURCE_NONE, NULL);
 	if (IS_ERR(root_port))
 		return PTR_ERR(root_port);
 	dev_dbg(host, "add: %s\n", dev_name(&root_port->dev));
@@ -513,6 +514,13 @@ static int cxl_acpi_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static int cxl_acpi_remove(struct platform_device *pdev)
+{
+	cxl_unregister_root();
+
+	return 0;
+}
+
 static bool native_acpi0017 = true;
 
 static const struct acpi_device_id cxl_acpi_ids[] = {
@@ -523,6 +531,7 @@ MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids);
 
 static struct platform_driver cxl_acpi_driver = {
 	.probe = cxl_acpi_probe,
+	.remove = cxl_acpi_remove,
 	.driver = {
 		.name = KBUILD_MODNAME,
 		.acpi_match_table = cxl_acpi_ids,
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 03394a3ae75f..c7e1894d503b 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -27,6 +27,32 @@
 static DEFINE_IDA(cxl_port_ida);
 static LIST_HEAD(cxl_root_ports);
 static DECLARE_RWSEM(root_port_sem);
+static DECLARE_RWSEM(root_host_sem);
+
+static struct device *cxl_root_host;
+
+int cxl_register_root(struct device *host)
+{
+	down_write(&root_host_sem);
+	if (cxl_root_host)
+		return -EBUSY;
+
+	get_device(host);
+	cxl_root_host = host;
+	up_write(&root_host_sem);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_register_root);
+
+void cxl_unregister_root(void)
+{
+	down_write(&root_host_sem);
+	put_device(cxl_root_host);
+	cxl_root_host = NULL;
+	up_write(&root_host_sem);
+}
+EXPORT_SYMBOL_GPL(cxl_unregister_root);
 
 static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
 			    char *buf)
@@ -375,12 +401,11 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 
 /**
  * devm_cxl_add_port - register a cxl_port in CXL memory decode hierarchy
- * @host: host device for devm operations
  * @uport: "physical" device implementing this upstream port
  * @component_reg_phys: (optional) for configurable cxl_port instances
  * @parent_port: next hop up in the CXL memory decode hierarchy
  */
-struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
+struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
 				   struct cxl_port *parent_port)
 {
@@ -404,11 +429,11 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 	if (rc)
 		goto err;
 
-	rc = devm_add_action_or_reset(host, unregister_port, port);
+	rc = devm_add_action_or_reset(cxl_root_host, unregister_port, port);
 	if (rc)
 		return ERR_PTR(rc);
 
-	rc = devm_cxl_link_uport(host, port);
+	rc = devm_cxl_link_uport(cxl_root_host, port);
 	if (rc)
 		return ERR_PTR(rc);
 
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d26d97e1098d..419c2e2db6f0 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -184,6 +184,9 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
 #define CXL_RESOURCE_NONE ((resource_size_t) -1)
 #define CXL_TARGET_STRLEN 20
 
+int cxl_register_root(struct device *host);
+void cxl_unregister_root(void);
+
 /*
  * cxl_decoder flags that define the type of memory / devices this
  * decoder supports as well as configuration lock status See "CXL 2.0
@@ -303,7 +306,7 @@ struct cxl_dport {
 };
 
 struct cxl_port *to_cxl_port(struct device *dev);
-struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
+struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
 				   struct cxl_port *parent_port);
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (13 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 14/28] cxl: Hide devm host for ports Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-11-01  5:39   ` Dan Williams
  2021-11-03 16:08   ` Jonathan Cameron
  2021-10-22 18:36 ` [RFC PATCH v2 16/28] cxl: Introduce cxl_mem driver Ben Widawsky
                   ` (12 subsequent siblings)
  27 siblings, 2 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The CXL drivers encapsulate the components that direct memory traffic in
an entity known as a cxl_port. Compute Express Link specifies three such
components: hostbridge (ie. a collection of root ports), switches, and
endpoints. There are currently drivers that create these ports for the
hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
introduced allows callers to initiate a scan down from the hostbridge
and create ports for switches in the CXL topology.

The intended user of this API is for endpoint devices. An endpoint
device will need to determine if it is CXL.mem capable, which requires
all components in the path from hostbridge to the endpoint to be CXL.mem
capable. Once an endpoint device determines it's connected to a CXL
capable root port, it can call this API to fill in all the ports in
between the hostbridge and itself.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .../driver-api/cxl/memory-devices.rst         |   6 +
 drivers/cxl/core/Makefile                     |   1 +
 drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
 drivers/cxl/core/pci.c                        |  99 ++++++++++++
 drivers/cxl/cxl.h                             |   2 +
 drivers/cxl/pci.h                             |   6 +
 drivers/cxl/port.c                            |   2 +-
 tools/testing/cxl/Kbuild                      |   1 +
 8 files changed, 261 insertions(+), 1 deletion(-)
 create mode 100644 drivers/cxl/core/pci.c

diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index fbf0393cdddc..547336c95593 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -47,6 +47,12 @@ CXL Core
 .. kernel-doc:: drivers/cxl/core/bus.c
    :identifiers:
 
+.. kernel-doc:: drivers/cxl/core/pci.c
+   :doc: cxl pci
+
+.. kernel-doc:: drivers/cxl/core/pci.c
+   :identifiers:
+
 .. kernel-doc:: drivers/cxl/core/pmem.c
    :doc: cxl pmem
 
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 07eb8e1fb8a6..9d33d2d5bf09 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -7,3 +7,4 @@ cxl_core-y += pmem.o
 cxl_core-y += regs.o
 cxl_core-y += memdev.o
 cxl_core-y += mbox.o
+cxl_core-y += pci.o
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index c7e1894d503b..f10e7d5b22a4 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -8,6 +8,7 @@
 #include <linux/idr.h>
 #include <cxlmem.h>
 #include <cxl.h>
+#include <pci.h>
 #include "core.h"
 
 /**
@@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
 }
 EXPORT_SYMBOL_GPL(devm_cxl_add_port);
 
+void devm_cxl_remove_port(struct cxl_port *port)
+{
+	down_read(&root_host_sem);
+	if (cxl_root_host) {
+		devm_release_action(cxl_root_host, cxl_unlink_uport, port);
+		devm_release_action(cxl_root_host, unregister_port, port);
+	}
+	up_read(&root_host_sem);
+}
+EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
+
+static int match_port(struct device *dev, const void *data)
+{
+	struct pci_dev *pdev = (struct pci_dev *)data;
+
+	if (dev->type != &cxl_port_type)
+		return 0;
+
+	return to_cxl_port(dev)->uport == &pdev->dev;
+}
+
+static struct cxl_port *find_cxl_port(struct pci_dev *usp)
+{
+	struct device *port_dev;
+
+	if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
+		return NULL;
+
+	port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
+	if (port_dev)
+		return to_cxl_port(port_dev);
+
+	return NULL;
+}
+
+static int add_upstream_port(struct device *host, struct pci_dev *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct cxl_port *parent_port;
+	struct cxl_register_map map;
+	struct cxl_port *port;
+	int rc;
+
+	/*
+	 * Upstream ports must be connected to a downstream port or root port.
+	 * That downstream or root port must have a parent.
+	 */
+	if (!pdev->dev.parent->parent)
+		return -ENXIO;
+
+	/* A port is useless if there are no component registers */
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (rc)
+		return rc;
+
+	parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
+	if (!parent_port)
+		return -ENODEV;
+
+	port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);
+	put_device(&parent_port->dev);
+	if (IS_ERR(port))
+		dev_err(dev, "Failed to add upstream port %ld\n",
+			PTR_ERR(port));
+	else
+		dev_dbg(dev, "Added CXL port\n");
+
+	return rc;
+}
+
+static int add_downstream_port(struct pci_dev *pdev)
+{
+	resource_size_t creg = CXL_RESOURCE_NONE;
+	struct device *dev = &pdev->dev;
+	struct cxl_port *parent_port;
+	struct cxl_register_map map;
+	u32 lnkcap, port_num;
+	int rc;
+
+	/*
+	 * Ports are to be scanned from top down. Therefore, the upstream port
+	 * must already exist.
+	 */
+	parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
+	if (!parent_port)
+		return -ENODEV;
+
+	/*
+	 * The spec mandates component registers are present but the
+	 * driver does not.
+	 */
+	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
+	if (!rc)
+		creg = cxl_reg_block(pdev, &map);
+
+	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
+				  &lnkcap) != PCIBIOS_SUCCESSFUL)
+		return 1;
+	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
+
+	rc = cxl_add_dport(parent_port, dev, port_num, creg, false);
+	put_device(&parent_port->dev);
+	if (rc)
+		dev_err(dev, "Failed to add downstream port to %s\n",
+			dev_name(&parent_port->dev));
+	else
+		dev_dbg(dev, "Added downstream port to %s\n",
+			dev_name(&parent_port->dev));
+
+	return rc;
+}
+
+static int match_add_ports(struct pci_dev *pdev, void *data)
+{
+	struct device *dev = &pdev->dev;
+	struct device *host = data;
+	int rc;
+
+	/* This port has already been added... */
+	if (find_cxl_port(pdev))
+		return 0;
+
+	if (is_cxl_switch_usp((dev)))
+		rc = add_upstream_port(host, pdev);
+
+	if (is_cxl_switch_dsp((dev)))
+		rc = add_downstream_port(pdev);
+
+	return rc;
+}
+
+/**
+ * cxl_scan_ports() - Adds all ports for the subtree beginning with @dport
+ * @dport: Beginning node of the CXL topology
+ */
+void cxl_scan_ports(struct cxl_dport *dport)
+{
+	struct device *d = dport->dport;
+	struct pci_dev *pdev = to_pci_dev(d);
+
+	pci_walk_bus(pdev->bus, match_add_ports, &dport->port->dev);
+}
+EXPORT_SYMBOL_GPL(cxl_scan_ports);
+
 static struct cxl_dport *find_dport(struct cxl_port *port, int id)
 {
 	struct cxl_dport *dport;
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
new file mode 100644
index 000000000000..c0cbe984c778
--- /dev/null
+++ b/drivers/cxl/core/pci.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/pci.h>
+#include <pci.h>
+
+/**
+ * DOC: cxl pci
+ *
+ * Compute Express Link protocols are layered on top of PCIe. CXL core provides
+ * a set of helpers for CXL interactions which occur via PCIe.
+ */
+
+/**
+ * is_cxl_mem_enabled() - Does the device understand CXL.mem protocol
+ * @pdev: The PCI device for which to determine CXL enablement
+ *
+ * This is the most discrete determination as to whether a device supports
+ * CXL.mem protocol. At a minimum, a CXL device must advertise it is capable of
+ * negotiating the CXL.mem protocol while operating in Flex Bus.CXL mode. There
+ * are other determining factors as to whether CXL.mem protocol is supported in
+ * the path from root port to endpoint. Those other factors require a more
+ * comprehensive survey of the CXL topology and would use is_cxl_mem_enabled()
+ * as a cursory check.
+ *
+ * If the PCI device is enabled for CXL.mem protocol return true; otherwise
+ * return false.
+ *
+ * TODO: Is there other architecturally visible state that can be used to infer
+ *       CXL.mem protocol support?
+ */
+bool is_cxl_mem_enabled(struct pci_dev *pdev)
+{
+	int pcie_dvsec;
+	u16 dvsec_ctrl;
+
+	pcie_dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+					       CXL_DVSEC_PCIE_DEVICE);
+	if (!pcie_dvsec) {
+		dev_info(&pdev->dev,
+			 "Unable to determine CXL protocol support");
+		return false;
+	}
+
+	pci_read_config_word(pdev,
+			     pcie_dvsec + DVSEC_PCIE_DEVICE_CONTROL_OFFSET,
+			     &dvsec_ctrl);
+	if (!(dvsec_ctrl & DVSEC_PCIE_DEVICE_MEM_ENABLE)) {
+		dev_info(&pdev->dev, "CXL.mem protocol not enabled on device");
+		return false;
+	}
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(is_cxl_mem_enabled);
+
+/**
+ * is_cxl_switch_usp() - Is the device a CXL.mem enabled switch
+ * @dev: Device to query for switch type
+ *
+ * If the device is a CXL.mem capable upstream switch port return true;
+ * otherwise return false.
+ */
+bool is_cxl_switch_usp(struct device *dev)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(dev))
+		return false;
+
+	pdev = to_pci_dev(dev);
+
+	return pci_is_pcie(pdev) &&
+	       pci_pcie_type(pdev) == PCI_EXP_TYPE_UPSTREAM &&
+	       is_cxl_mem_enabled(pdev);
+}
+EXPORT_SYMBOL_GPL(is_cxl_switch_usp);
+
+/**
+ * is_cxl_switch_dsp() - Is the device a CXL.mem enabled switch
+ * @dev: Device to query for switch type
+ *
+ * If the device is a CXL.mem capable downstream switch port return true;
+ * otherwise return false.
+ */
+bool is_cxl_switch_dsp(struct device *dev)
+{
+	struct pci_dev *pdev;
+
+	if (!dev_is_pci(dev))
+		return false;
+
+	pdev = to_pci_dev(dev);
+
+	return pci_is_pcie(pdev) &&
+	       pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
+	       is_cxl_mem_enabled(pdev);
+}
+EXPORT_SYMBOL_GPL(is_cxl_switch_dsp);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 419c2e2db6f0..03b414462416 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -309,6 +309,8 @@ struct cxl_port *to_cxl_port(struct device *dev);
 struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
 				   struct cxl_port *parent_port);
+void devm_cxl_remove_port(struct cxl_port *port);
+void cxl_scan_ports(struct cxl_dport *root_port);
 
 int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 		  resource_size_t component_reg_phys, bool root_port);
diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
index fe2898b17736..9d6ca77d3e14 100644
--- a/drivers/cxl/pci.h
+++ b/drivers/cxl/pci.h
@@ -15,6 +15,8 @@
 
 /* 8.1.3: PCIe DVSEC for CXL Device */
 #define CXL_DVSEC_PCIE_DEVICE					0
+#define   DVSEC_PCIE_DEVICE_CONTROL_OFFSET			0xC
+#define     DVSEC_PCIE_DEVICE_MEM_ENABLE			BIT(2)
 
 /* 8.1.4: Non-CXL Function Map DVSEC */
 #define CXL_DVSEC_FUNCTION_MAP					2
@@ -57,4 +59,8 @@ enum cxl_regloc_type {
 	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
 			   (map)->block_offset))
 
+bool is_cxl_switch_usp(struct device *dev);
+bool is_cxl_switch_dsp(struct device *dev);
+bool is_cxl_mem_enabled(struct pci_dev *pdev);
+
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index ebbfb72ae995..3ddfd7673a56 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -170,7 +170,7 @@ static int enumerate_hdm_decoders(struct cxl_port *port,
 		if (rc)
 			put_device(&cxld->dev);
 		else
-			rc = cxl_decoder_autoremove(port->uport->parent, cxld);
+			rc = cxl_decoder_autoremove(&port->dev, cxld);
 		if (rc)
 			dev_err(&port->dev, "Failed to add decoder\n");
 	}
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 86deba8308a1..46db4dd345a0 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -31,6 +31,7 @@ cxl_core-y += $(CXL_CORE_SRC)/pmem.o
 cxl_core-y += $(CXL_CORE_SRC)/regs.o
 cxl_core-y += $(CXL_CORE_SRC)/memdev.o
 cxl_core-y += $(CXL_CORE_SRC)/mbox.o
+cxl_core-y += $(CXL_CORE_SRC)/pci.o
 cxl_core-y += config_check.o
 
 cxl_core-y += mock_pmem.o
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 16/28] cxl: Introduce cxl_mem driver
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (14 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-22 18:36 ` [RFC PATCH v2 17/28] cxl: Disable switch hierarchies for now Ben Widawsky
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Add a driver that is capable of determining whether a device is in a
CXL.mem routed part of the topology.

This driver allows a higher level driver - such as one controlling CXL
regions, which is itself a set of CXL devices - to easily determine if
the CXL devices are CXL.mem capable by checking if the driver has bound.
CXL memory device services may also be provided by this driver though
none are needed as of yet. cxl_mem also plays the part of registering
itself as an endpoint port, which is a required step to enumerate the
device's HDM decoder resources.

As part of this patch, find_dport_by_dev() is promoted to the cxl_core's
set of APIs for use by the new driver.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .../driver-api/cxl/memory-devices.rst         |   3 +
 drivers/cxl/Kconfig                           |  15 ++
 drivers/cxl/Makefile                          |   2 +
 drivers/cxl/acpi.c                            |  17 +-
 drivers/cxl/core/bus.c                        |  19 +++
 drivers/cxl/core/core.h                       |   1 +
 drivers/cxl/core/memdev.c                     |   2 +-
 drivers/cxl/cxl.h                             |   3 +
 drivers/cxl/cxlmem.h                          |   2 +
 drivers/cxl/mem.c                             | 158 ++++++++++++++++++
 10 files changed, 205 insertions(+), 17 deletions(-)
 create mode 100644 drivers/cxl/mem.c

diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 547336c95593..d04d07ae7118 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -28,6 +28,9 @@ CXL Memory Device
 .. kernel-doc:: drivers/cxl/pci.c
    :internal:
 
+.. kernel-doc:: drivers/cxl/mem.c
+   :doc: cxl mem
+
 CXL Port
 --------
 .. kernel-doc:: drivers/cxl/port.c
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 23773d0ac896..bce98bdaf12c 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -28,6 +28,21 @@ config CXL_PCI
 
 	  If unsure say 'm'.
 
+config CXL_MEM
+	tristate "CXL.mem: Memory Devices"
+	default CXL_BUS
+        help
+          The CXL.mem protocol allows a device to act as a provider of
+	  "System RAM" and/or "Persistent Memory" that is fully coherent
+	  as if the memory was attached to the typical CPU memory controller.
+	  This is known as HDM "Host-managed Device Memory".
+
+	  Say 'y/m' to enable a driver that will attach to CXL.mem devices for
+	  memory expansion and control of HDM. See Chapter 9.13 in the CXL 2.0
+	  specification for a detailed description of HDM.
+
+	  If unsure say 'm'.
+
 config CXL_MEM_RAW_COMMANDS
 	bool "RAW Command Interface for Memory Devices"
 	depends on CXL_PCI
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index 40b386aaedf7..77499a6b40f2 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -2,9 +2,11 @@
 obj-$(CONFIG_CXL_BUS) += core/
 obj-$(CONFIG_CXL_MEM) += cxl_port.o
 obj-$(CONFIG_CXL_PCI) += cxl_pci.o
+obj-$(CONFIG_CXL_MEM) += cxl_mem.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
 
+cxl_mem-y := mem.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index f6b78736c933..23afb3b8f5cc 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -243,21 +243,6 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
 	return 0;
 }
 
-static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device *dev)
-{
-	struct cxl_dport *dport;
-
-	device_lock(&port->dev);
-	list_for_each_entry(dport, &port->dports, list)
-		if (dport->dport == dev) {
-			device_unlock(&port->dev);
-			return dport;
-		}
-
-	device_unlock(&port->dev);
-	return NULL;
-}
-
 __mock struct acpi_device *to_cxl_host_bridge(struct device *host,
 					      struct device *dev)
 {
@@ -292,7 +277,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
 	if (!bridge)
 		return 0;
 
-	dport = find_dport_by_dev(root_port, match);
+	dport = cxl_find_dport_by_dev(root_port, match);
 	if (!dport) {
 		dev_dbg(host, "host bridge expected and not found\n");
 		return -ENODEV;
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index f10e7d5b22a4..4b4e6033b774 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -673,6 +673,23 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
 }
 EXPORT_SYMBOL_GPL(cxl_add_dport);
 
+struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
+					struct device *dev)
+{
+	struct cxl_dport *dport;
+
+	device_lock(&port->dev);
+	list_for_each_entry(dport, &port->dports, list)
+		if (dport->dport == dev) {
+			device_unlock(&port->dev);
+			return dport;
+		}
+
+	device_unlock(&port->dev);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(cxl_find_dport_by_dev);
+
 static int decoder_populate_targets(struct cxl_decoder *cxld,
 				    struct cxl_port *port, int *target_map)
 {
@@ -859,6 +876,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_NVDIMM;
 	if (dev->type == &cxl_port_type)
 		return CXL_DEVICE_PORT;
+	if (dev->type == &cxl_memdev_type)
+		return CXL_DEVICE_MEMORY_EXPANDER;
 	return 0;
 }
 
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index e0c9aacc4e9c..dea246cb7c58 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -6,6 +6,7 @@
 
 extern const struct device_type cxl_nvdimm_bridge_type;
 extern const struct device_type cxl_nvdimm_type;
+extern const struct device_type cxl_memdev_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 15762c16d83f..4a7a6e48cba2 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -127,7 +127,7 @@ static const struct attribute_group *cxl_memdev_attribute_groups[] = {
 	NULL,
 };
 
-static const struct device_type cxl_memdev_type = {
+const struct device_type cxl_memdev_type = {
 	.name = "cxl_memdev",
 	.release = cxl_memdev_release,
 	.devnode = cxl_memdev_devnode,
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 03b414462416..5823c267a4de 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -315,6 +315,8 @@ void cxl_scan_ports(struct cxl_dport *root_port);
 int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 		  resource_size_t component_reg_phys, bool root_port);
 struct cxl_dport *cxl_get_root_dport(struct device *dev);
+struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
+					struct device *dev);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
@@ -350,6 +352,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 #define CXL_DEVICE_NVDIMM_BRIDGE	1
 #define CXL_DEVICE_NVDIMM		2
 #define CXL_DEVICE_PORT			3
+#define CXL_DEVICE_MEMORY_EXPANDER	4
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 62fe8e2c59e4..cc5844150ce0 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -36,6 +36,7 @@
  * @cxlm: pointer to the parent device driver data
  * @id: id number of this memdev instance.
  * @creg_base: register base of component registers
+ * @root_port: Hostbridge's root port connected to this endpoint
  */
 struct cxl_memdev {
 	struct device dev;
@@ -43,6 +44,7 @@ struct cxl_memdev {
 	struct cxl_mem *cxlm;
 	int id;
 	resource_size_t creg_base;
+	struct cxl_dport *root_port;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
new file mode 100644
index 000000000000..97092b9b748d
--- /dev/null
+++ b/drivers/cxl/mem.c
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+
+#include "cxlmem.h"
+#include "pci.h"
+
+/**
+ * DOC: cxl mem
+ *
+ * CXL memory endpoint devices and switches are CXL capable devices that are
+ * participating in CXL.mem protocol. Their functionality builds on top of the
+ * CXL.io protocol that allows enumerating and configuring components via
+ * standard PCI mechanisms.
+ *
+ * The cxl_mem driver implements enumeration and control over these CXL
+ * components.
+ */
+
+struct walk_ctx {
+	struct cxl_dport *root_port;
+	bool has_switch;
+};
+
+/**
+ * walk_to_root_port() - Walk up to root port
+ * @dev: Device to walk up from
+ * @ctx: Information to populate while walking
+ *
+ * A platform specific driver such as cxl_acpi is responsible for scanning CXL
+ * topologies in a top-down fashion. If the CXL memory device is directly
+ * connected to the top level hostbridge, nothing else needs to be done. If
+ * however there are CXL components (ie. a CXL switch) in between an endpoint
+ * and a hostbridge the platform specific driver must be notified after all the
+ * components are enumerated.
+ */
+static void walk_to_root_port(struct device *dev, struct walk_ctx *ctx)
+{
+	struct cxl_dport *root_port;
+
+	if (!dev->parent)
+		return;
+
+	root_port = cxl_get_root_dport(dev);
+	if (root_port)
+		ctx->root_port = root_port;
+
+	if (is_cxl_switch_usp(dev))
+		ctx->has_switch = true;
+
+	walk_to_root_port(dev->parent, ctx);
+}
+
+static void remove_endpoint(void *_cxlmd)
+{
+	struct cxl_memdev *cxlmd = _cxlmd;
+	struct cxl_port *endpoint;
+
+	if (cxlmd->root_port)
+		sysfs_remove_link(&cxlmd->dev.kobj, "root_port");
+
+	endpoint = dev_get_drvdata(&cxlmd->dev);
+
+	devm_cxl_remove_port(endpoint);
+}
+
+static int create_endpoint(struct device *dev, struct cxl_port *parent,
+			   struct cxl_dport *dport)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_port *endpoint;
+	int rc;
+
+	endpoint = devm_cxl_add_port(dev, cxlmd->creg_base, parent);
+	if (IS_ERR(endpoint))
+		return PTR_ERR(endpoint);
+
+	rc = sysfs_create_link(&cxlmd->dev.kobj, &dport->dport->kobj,
+			       "root_port");
+	if (rc) {
+		device_del(&endpoint->dev);
+		return rc;
+	}
+	dev_set_drvdata(dev, endpoint);
+	dev_dbg(dev, "add: %s\n", dev_name(&endpoint->dev));
+
+	return devm_add_action_or_reset(dev, remove_endpoint, cxlmd);
+}
+
+static int cxl_mem_probe(struct device *dev)
+{
+	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
+	struct cxl_port *hostbridge, *parent_port;
+	struct walk_ctx ctx = { NULL, false };
+	int rc;
+
+	walk_to_root_port(dev, &ctx);
+
+	/*
+	 * Couldn't find a CXL capable root port. This may happen if cxl_acpi
+	 * hasn't completed in which case cxl_acpi will rescan the bus.
+	 */
+	if (!ctx.root_port)
+		return -ENODEV;
+
+	/* FIXME: This lock is racy, and does it even need to be here? */
+	hostbridge = ctx.root_port->port;
+	device_lock(&hostbridge->dev);
+
+	/* hostbridge has no port driver, the topology isn't enabled yet */
+	if (!hostbridge->dev.driver) {
+		device_unlock(&hostbridge->dev);
+		return -ENODEV;
+	}
+
+	/* No switch + found root port means we're done */
+	if (!ctx.has_switch) {
+		parent_port = to_cxl_port(&hostbridge->dev);
+		goto out;
+	}
+
+	/* Walk down from the root port and add all switches */
+	cxl_scan_ports(ctx.root_port);
+
+	/* If parent is a dport the endpoint is good to go. */
+	parent_port = to_cxl_port(dev->parent->parent);
+	if (!cxl_find_dport_by_dev(parent_port, dev->parent)) {
+		rc = -ENODEV;
+		goto err_out;
+	}
+
+out:
+
+	rc = create_endpoint(dev, parent_port, ctx.root_port);
+	if (rc)
+		goto err_out;
+
+	cxlmd->root_port = ctx.root_port;
+
+err_out:
+	device_unlock(&hostbridge->dev);
+	return rc;
+}
+
+static struct cxl_driver cxl_mem_driver = {
+	.name = "cxl_mem",
+	.probe = cxl_mem_probe,
+	.id = CXL_DEVICE_MEMORY_EXPANDER,
+};
+
+module_cxl_driver(cxl_mem_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER);
+MODULE_SOFTDEP("pre: cxl_port");
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 17/28] cxl: Disable switch hierarchies for now
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (15 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 16/28] cxl: Introduce cxl_mem driver Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-22 18:36 ` [RFC PATCH v2 18/28] cxl/region: Add region creation ABI Ben Widawsky
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Switches aren't supported by the region driver yet. If a device finds
itself under a switch it will not bind a driver so that it cannot be
used later for region creation/configuration.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/mem.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 97092b9b748d..0b0272576e55 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -121,6 +121,11 @@ static int cxl_mem_probe(struct device *dev)
 		goto out;
 	}
 
+	/* FIXME: Add true switch support */
+	dev_err(dev, "Devices behind switches are currently unsupported\n");
+	rc = -ENODEV;
+	goto err_out;
+
 	/* Walk down from the root port and add all switches */
 	cxl_scan_ports(ctx.root_port);
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 18/28] cxl/region: Add region creation ABI
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (16 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 17/28] cxl: Disable switch hierarchies for now Ben Widawsky
@ 2021-10-22 18:36 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 19/28] cxl/region: Introduce concept of region configuration Ben Widawsky
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:36 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions are created as a child of the decoder that encompasses an
address space with constraints. Regions have a number of attributes that
must be configured before the region can be activated.

The ABI is not meant to be secure, but is meant to avoid accidental
races. As a result, a buggy process may create a region by name that was
allocated by a different process. However, multiple processes which are
trying not to race with each other shouldn't need special
synchronization to do so.

// Allocate a new region name
region = $(cat /sys/bus/cxl/devices/decoder0.0/create_region)

// Create a new region by name
echo $region > /sys/bus/cxl/devices/decoder0.0/create_region

// Region now exists in sysfs
stat -t /sys/bus/cxl/devices/decoder0.0/$region

// Delete the region, and name
echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region

After creating a region, it will show up in sysfs

/sys/bus/cxl/devices/
├── decoder0.0 -> ../../../devices/platform/ACPI0017:00/root0/decoder0.0
├── decoder1.0 -> ../../../devices/platform/ACPI0017:00/root0/port1/decoder1.0
├── decoder2.0 -> ../../../devices/platform/ACPI0017:00/root0/port1/port2/decoder2.0
├── mem0 -> ../../../devices/pci0000:34/0000:34:00.0/0000:35:00.0/mem0
├── pmem0 -> ../../../devices/pci0000:34/0000:34:00.0/0000:35:00.0/mem0/pmem0
├── port1 -> ../../../devices/platform/ACPI0017:00/root0/port1
├── port2 -> ../../../devices/platform/ACPI0017:00/root0/port1/port2
└── root0 -> ../../../devices/platform/ACPI0017:00/root0

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl       |  23 +++
 .../driver-api/cxl/memory-devices.rst         |  11 ++
 drivers/cxl/core/Makefile                     |   1 +
 drivers/cxl/core/bus.c                        |  84 +++++++++++
 drivers/cxl/core/region.c                     | 139 ++++++++++++++++++
 drivers/cxl/cxl.h                             |   9 ++
 drivers/cxl/region.h                          |  37 +++++
 tools/testing/cxl/Kbuild                      |   1 +
 8 files changed, 305 insertions(+)
 create mode 100644 drivers/cxl/core/region.c
 create mode 100644 drivers/cxl/region.h

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 0b6a2e6e8fbb..5c48d2ed108f 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -127,3 +127,26 @@ Description:
 		memory (type-3). The 'target_type' attribute indicates the
 		current setting which may dynamically change based on what
 		memory regions are activated in this decode hierarchy.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/create_region
+Date:		June, 2021
+KernelVersion:	v5.16
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		Creates a new CXL region. Writing a value of the form
+		"regionX.Y:Z" will create a new uninitialized region that will
+		be mapped by the CXL decoderX.Y. Reading from this node will
+		return a newly allocated region name. In order to create a
+		region (writing) you must use a value returned from reading the
+		node. Regions must be subsequently configured and bound to a
+		region driver before they can be used.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/delete_region
+Date:		June, 2021
+KernelVersion:	v5.16
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		Deletes the named region. A region must be unbound from the
+		region driver before being deleted. The attributes expects a
+		region in the form "regionX.Y:Z". The region's name, allocated
+		by reading create_region, will also be released.
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index d04d07ae7118..0ed00906acdd 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -65,6 +65,17 @@ CXL Core
 .. kernel-doc:: drivers/cxl/core/mbox.c
    :doc: cxl mbox
 
+CXL Regions
+-----------
+.. kernel-doc:: drivers/cxl/region.h
+   :identifiers:
+
+.. kernel-doc:: drivers/cxl/core/region.c
+   :doc: cxl core region
+
+.. kernel-doc:: drivers/cxl/core/region.c
+   :identifiers:
+
 External Interfaces
 ===================
 
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 9d33d2d5bf09..6a3b3ecd7a25 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_CXL_BUS) += cxl_core.o
 ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(srctree)/drivers/cxl
 cxl_core-y := bus.o
 cxl_core-y += pmem.o
+cxl_core-y += region.o
 cxl_core-y += regs.o
 cxl_core-y += memdev.o
 cxl_core-y += mbox.o
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 4b4e6033b774..7a51a290ed25 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -157,6 +157,74 @@ static ssize_t target_list_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(target_list);
 
+static ssize_t create_region_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct cxl_port *port = to_cxl_port(dev->parent);
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int rc;
+
+	if (dev_WARN_ONCE(dev, !is_root_decoder(dev),
+			  "Invalid decoder selected for region.")) {
+		return -ENODEV;
+	}
+
+	rc = ida_alloc(&cxld->region_ida, GFP_KERNEL);
+	if (rc < 0) {
+		dev_err(&cxld->dev, "Couldn't get a new id\n");
+		return rc;
+	}
+
+	return sysfs_emit(buf, "region%d.%d:%d\n", port->id, cxld->id, rc);
+}
+
+static ssize_t create_region_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int decoder, port, region_id;
+	struct cxl_region *region;
+	ssize_t rc;
+
+	if (sscanf(buf, "region%d.%d:%d", &decoder, &port, &region_id) != 3)
+		return -EINVAL;
+
+	if (decoder != cxld->id)
+		return -EINVAL;
+
+	if (cxld->id != port)
+		return -EINVAL;
+
+	region = cxl_alloc_region(cxld, region_id);
+	if (IS_ERR(region))
+		return PTR_ERR(region);
+
+	rc = cxl_add_region(cxld, region);
+	if (rc) {
+		kfree(region);
+		return rc;
+	}
+
+	return len;
+}
+static DEVICE_ATTR_RW(create_region);
+
+static ssize_t delete_region_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	int rc;
+
+	rc = cxl_delete_region(cxld, buf);
+	if (rc)
+		return rc;
+
+	return len;
+}
+static DEVICE_ATTR_WO(delete_region);
+
 static struct attribute *cxl_decoder_base_attrs[] = {
 	&dev_attr_start.attr,
 	&dev_attr_size.attr,
@@ -170,6 +238,8 @@ static struct attribute_group cxl_decoder_base_attribute_group = {
 };
 
 static struct attribute *cxl_decoder_root_attrs[] = {
+	&dev_attr_create_region.attr,
+	&dev_attr_delete_region.attr,
 	&dev_attr_cap_pmem.attr,
 	&dev_attr_cap_ram.attr,
 	&dev_attr_cap_type2.attr,
@@ -210,11 +280,23 @@ static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
 	NULL,
 };
 
+static int delete_region(struct device *dev, void *arg)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
+
+	return cxl_delete_region(cxld, dev_name(dev));
+}
+
 static void cxl_decoder_release(struct device *dev)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
 	struct cxl_port *port = to_cxl_port(dev->parent);
 
+	device_for_each_child(&cxld->dev, cxld, delete_region);
+
+	dev_WARN_ONCE(dev, !ida_is_empty(&cxld->region_ida),
+		      "Lost track of a region");
+
 	ida_free(&port->decoder_ida, cxld->id);
 	kfree(cxld);
 }
@@ -750,6 +832,8 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	else
 		dev->type = &cxl_decoder_root_type;
 
+	ida_init(&cxld->region_ida);
+
 	return cxld;
 err:
 	kfree(cxld);
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
new file mode 100644
index 000000000000..588f0ca65bb2
--- /dev/null
+++ b/drivers/cxl/core/region.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/idr.h>
+#include <region.h>
+#include <cxl.h>
+
+/**
+ * DOC: cxl core region
+ *
+ * Regions are managed through the Linux device model. Each region instance is a
+ * unique struct device. CXL core provides functionality to create, destroy, and
+ * configure regions. This is all implemented here. Binding a region
+ * (programming the hardware) is handled by a separate region driver.
+ */
+
+static void cxl_region_release(struct device *dev);
+
+static const struct device_type cxl_region_type = {
+	.name = "cxl_region",
+	.release = cxl_region_release,
+};
+
+struct cxl_region *to_cxl_region(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type,
+			  "not a cxl_region device\n"))
+		return NULL;
+
+	return container_of(dev, struct cxl_region, dev);
+}
+EXPORT_SYMBOL_GPL(to_cxl_region);
+
+static void cxl_region_release(struct device *dev)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
+	struct cxl_region *region = to_cxl_region(dev);
+
+	ida_free(&cxld->region_ida, region->id);
+	kfree(region);
+}
+
+struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld, int id)
+{
+	struct cxl_region *region;
+
+	region = kzalloc(sizeof(*region), GFP_KERNEL);
+	if (!region)
+		return ERR_PTR(-ENOMEM);
+
+	region->id = id;
+
+	return region;
+}
+
+/**
+ * cxl_add_region - Adds a region to a decoder
+ * @cxld: Parent decoder.
+ * @region: Region to be added to the decoder.
+ *
+ * This is the second step of region initialization. Regions exist within an
+ * address space which is mapped by a @cxld. That @cxld must be a root decoder,
+ * and it enforces constraints upon the region as it is configured.
+ *
+ * Return: 0 if the region was added to the @cxld, else returns negative error
+ * code. The region will be named "regionX.Y.Z" where X is the port, Y is the
+ * decoder id, and Z is the region number.
+ */
+int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region)
+{
+	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+	struct device *dev = &region->dev;
+	int rc;
+
+	device_initialize(dev);
+	dev->parent = &cxld->dev;
+	device_set_pm_not_required(dev);
+	dev->bus = &cxl_bus_type;
+	dev->type = &cxl_region_type;
+	rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id,
+			  region->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(dev);
+	if (rc)
+		goto err;
+
+	dev_dbg(dev, "Added %s to %s\n", dev_name(dev), dev_name(&cxld->dev));
+
+	return 0;
+
+err:
+	put_device(dev);
+	return rc;
+}
+
+static struct cxl_region *cxl_find_region_by_name(struct cxl_decoder *cxld,
+						  const char *name)
+{
+	struct device *region_dev;
+
+	region_dev = device_find_child_by_name(&cxld->dev, name);
+	if (!region_dev)
+		return ERR_PTR(-ENOENT);
+
+	return to_cxl_region(region_dev);
+}
+
+/**
+ * cxl_delete_region - Deletes a region
+ * @cxld: Parent decoder
+ * @region_name: Named region, ie. regionX.Y:Z
+ */
+int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name)
+{
+	struct cxl_region *region;
+
+	device_lock(&cxld->dev);
+
+	region = cxl_find_region_by_name(cxld, region_name);
+	if (IS_ERR(region)) {
+		device_unlock(&cxld->dev);
+		return PTR_ERR(region);
+	}
+
+	dev_dbg(&cxld->dev, "Requested removal of %s from %s\n",
+		dev_name(&region->dev), dev_name(&cxld->dev));
+
+	device_unregister(&region->dev);
+	device_unlock(&cxld->dev);
+
+	put_device(&region->dev);
+
+	return 0;
+}
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5823c267a4de..bcef2cb4e55b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -220,6 +220,7 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
+ * @region_ida: allocator for region ids.
  * @nr_targets: number of elements in @target
  * @target: active ordered target list in current decoder configuration
  */
@@ -231,6 +232,7 @@ struct cxl_decoder {
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
+	struct ida region_ida;
 	const int nr_targets;
 	struct cxl_dport *target[];
 };
@@ -305,6 +307,13 @@ struct cxl_dport {
 	struct list_head root_port_link;
 };
 
+bool is_cxl_region(struct device *dev);
+struct cxl_region *to_cxl_region(struct device *dev);
+struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld,
+				    int interleave_ways);
+int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *region);
+int cxl_delete_region(struct cxl_decoder *cxld, const char *region);
+
 struct cxl_port *to_cxl_port(struct device *dev);
 struct cxl_port *devm_cxl_add_port(struct device *uport,
 				   resource_size_t component_reg_phys,
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
new file mode 100644
index 000000000000..d032df438832
--- /dev/null
+++ b/drivers/cxl/region.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright(c) 2021 Intel Corporation. */
+#ifndef __CXL_REGION_H__
+#define __CXL_REGION_H__
+
+#include <linux/uuid.h>
+
+#include "cxl.h"
+
+/**
+ * struct cxl_region - CXL region
+ * @dev: This region's device.
+ * @id: This regions id. Id is globally unique across all regions.
+ * @list: Node in decoder's region list.
+ * @res: Address space consumed by this region.
+ * @size: Size of the region determined from LSA or userspace.
+ * @uuid: The UUID for this region.
+ * @eniw: Number of interleave ways this region is configured for.
+ * @ig: Interleave granularity of region
+ * @targets: The memory devices comprising the region.
+ */
+struct cxl_region {
+	struct device dev;
+	int id;
+	struct list_head list;
+
+	struct {
+		struct resource *res;
+		u64 size;
+		uuid_t uuid;
+		int eniw;
+		int ig;
+		struct cxl_memdev *targets[CXL_DECODER_MAX_INTERLEAVE];
+	};
+};
+
+#endif
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 46db4dd345a0..3e5a1c4dce14 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -32,6 +32,7 @@ cxl_core-y += $(CXL_CORE_SRC)/regs.o
 cxl_core-y += $(CXL_CORE_SRC)/memdev.o
 cxl_core-y += $(CXL_CORE_SRC)/mbox.o
 cxl_core-y += $(CXL_CORE_SRC)/pci.o
+cxl_core-y += $(CXL_CORE_SRC)/region.o
 cxl_core-y += config_check.o
 
 cxl_core-y += mock_pmem.o
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 19/28] cxl/region: Introduce concept of region configuration
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (17 preceding siblings ...)
  2021-10-22 18:36 ` [RFC PATCH v2 18/28] cxl/region: Add region creation ABI Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 20/28] cxl/region: Introduce a cxl_region driver Ben Widawsky
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The region creation APIs create a vacant region. Configuring the region
works in the same way as similar subsystems such as devdax. Sysfs attrs
will be provided to allow userspace to configure the region.  Finally
once all configuration is complete, userspace may activate the region.

Introduced here are the most basic attributes needed to configure a
region. Details of these attribute are described in the ABI
Documentation.

A example is provided below:

/sys/bus/cxl/devices/region0.0:0
├── interleave_granularity
├── interleave_ways
├── offset
├── size
├── subsystem -> ../../../../../../bus/cxl
├── target0
├── uevent
└── uuid

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 Documentation/ABI/testing/sysfs-bus-cxl |  40 ++++
 drivers/cxl/core/region.c               | 295 ++++++++++++++++++++++++
 2 files changed, 335 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 5c48d2ed108f..ef9b10d05c59 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -150,3 +150,43 @@ Description:
 		region driver before being deleted. The attributes expects a
 		region in the form "regionX.Y:Z". The region's name, allocated
 		by reading create_region, will also be released.
+
+What:		/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/offset
+Date:		June, 2021
+KernelVersion:	v5.16
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) A region resides within an address space that is claimed by
+		a decoder. Region space allocation is handled by the driver, but
+		the offset may be read by userspace tooling in order to
+		determine fragmentation, and available size for new regions.
+
+What:
+/sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/{interleave,size,uuid,target[0-15]}
+Date:		June, 2021
+KernelVersion:	v5.16
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RW) Configuring regions requires a minimal set of parameters in
+		order for the subsequent bind operation to succeed. The
+		following parameters are defined:
+
+		==	========================================================
+		interleave_granularity Mandatory. Number of consecutive bytes
+			each device in the interleave set will claim. The
+			possible interleave granularity values are determined by
+			the CXL spec and the participating devices.
+		interleave_ways Mandatory. Number of devices participating in the
+			region. Each device will provide 1/interleave of storage
+			for the region.
+		size	Manadatory. Phsyical address space the region will
+			consume.
+		target  Mandatory. Memory devices are the backing storage for a
+			region. There will be N targets based on the number of
+			interleave ways that the top level decoder is configured
+			for. Each target must be set with a memdev device ie.
+			'mem1'. This attribute only becomes available after
+			setting the 'interleave' attribute.
+		uuid	Optional. A unique identifier for the region. If none is
+			selected, the kernel will create one.
+		==	========================================================
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 588f0ca65bb2..3b0d74d4dd6c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3,9 +3,12 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/module.h>
+#include <linux/sizes.h>
 #include <linux/slab.h>
+#include <linux/uuid.h>
 #include <linux/idr.h>
 #include <region.h>
+#include <cxlmem.h>
 #include <cxl.h>
 
 /**
@@ -17,11 +20,300 @@
  * (programming the hardware) is handled by a separate region driver.
  */
 
+struct cxl_region *to_cxl_region(struct device *dev);
+static const struct attribute_group region_interleave_group;
+
+static bool is_region_active(struct cxl_region *region)
+{
+	/* TODO: Regions can't be activated yet. */
+	return false;
+}
+
+static void remove_target(struct cxl_region *region, int target)
+{
+	struct cxl_memdev *cxlmd;
+
+	cxlmd = region->targets[target];
+	if (cxlmd)
+		put_device(&cxlmd->dev);
+	region->targets[target] = NULL;
+}
+
+static ssize_t interleave_ways_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+
+	return sysfs_emit(buf, "%d\n", region->eniw);
+}
+
+static ssize_t interleave_ways_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t len)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	int ret, prev_eniw;
+	int val;
+
+	prev_eniw = region->eniw;
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+	if (ret < 0 || ret > CXL_DECODER_MAX_INTERLEAVE)
+		return -EINVAL;
+
+	region->eniw = val;
+
+	ret = sysfs_update_group(&dev->kobj, &region_interleave_group);
+	if (ret < 0)
+		goto err;
+
+	sysfs_notify(&dev->kobj, NULL, "target_interleave");
+
+	while (prev_eniw > region->eniw)
+		remove_target(region, --prev_eniw);
+
+	return len;
+
+err:
+	region->eniw = prev_eniw;
+	return ret;
+}
+static DEVICE_ATTR_RW(interleave_ways);
+
+static ssize_t interleave_granularity_show(struct device *dev,
+					   struct device_attribute *attr,
+					   char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+
+	return sysfs_emit(buf, "%d\n", region->ig);
+}
+
+static ssize_t interleave_granularity_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t len)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	int val, ret;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+	region->ig = val;
+
+	return len;
+}
+static DEVICE_ATTR_RW(interleave_granularity);
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
+			   char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
+
+	if (!region->res)
+		return sysfs_emit(buf, "\n");
+
+	return sysfs_emit(buf, "%#llx\n", cxld->res.start - region->res->start);
+}
+static DEVICE_ATTR_RO(offset);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+
+	return sysfs_emit(buf, "%llu\n", region->size);
+}
+
+static ssize_t size_store(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t len)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	unsigned long long val;
+	ssize_t rc;
+
+	rc = kstrtoull(buf, 0, &val);
+	if (rc)
+		return rc;
+
+	device_lock(&region->dev);
+	if (is_region_active(region))
+		rc = -EBUSY;
+	else
+		region->size = val;
+	device_unlock(&region->dev);
+
+	return rc ? rc : len;
+}
+static DEVICE_ATTR_RW(size);
+
+static ssize_t uuid_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+
+	return sysfs_emit(buf, "%pUb\n", &region->uuid);
+}
+
+static ssize_t uuid_store(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t len)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	ssize_t rc;
+
+	if (len != UUID_STRING_LEN + 1)
+		return -EINVAL;
+
+	device_lock(&region->dev);
+	if (is_region_active(region))
+		rc = -EBUSY;
+	else
+		rc = uuid_parse(buf, &region->uuid);
+	device_unlock(&region->dev);
+
+	return rc ? rc : len;
+}
+static DEVICE_ATTR_RW(uuid);
+
+static struct attribute *region_attrs[] = {
+	&dev_attr_interleave_ways.attr,
+	&dev_attr_interleave_granularity.attr,
+	&dev_attr_offset.attr,
+	&dev_attr_size.attr,
+	&dev_attr_uuid.attr,
+	NULL,
+};
+
+static const struct attribute_group region_group = {
+	.attrs = region_attrs,
+};
+
+static size_t show_targetN(struct cxl_region *region, char *buf, int n)
+{
+	int ret;
+
+	device_lock(&region->dev);
+	if (!region->targets[n])
+		ret = sysfs_emit(buf, "\n");
+	else
+		ret = sysfs_emit(buf, "%s\n",
+				 dev_name(&region->targets[n]->dev));
+	device_unlock(&region->dev);
+
+	return ret;
+}
+
+static size_t set_targetN(struct cxl_region *region, const char *buf, int n,
+			  size_t len)
+{
+	struct device *memdev_dev;
+	struct cxl_memdev *cxlmd;
+
+	device_lock(&region->dev);
+
+	if (len == 1 || region->targets[n])
+		remove_target(region, n);
+
+	/* Remove target special case */
+	if (len == 1) {
+		device_unlock(&region->dev);
+		return len;
+	}
+
+	memdev_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf);
+	if (!memdev_dev)
+		return -ENOENT;
+
+	/* reference to memdev held until target is unset or region goes away */
+
+	cxlmd = to_cxl_memdev(memdev_dev);
+	region->targets[n] = cxlmd;
+
+	device_unlock(&region->dev);
+
+	return len;
+}
+
+#define TARGET_ATTR_RW(n)                                                      \
+	static ssize_t target##n##_show(                                       \
+		struct device *dev, struct device_attribute *attr, char *buf)  \
+	{                                                                      \
+		return show_targetN(to_cxl_region(dev), buf, (n));             \
+	}                                                                      \
+	static ssize_t target##n##_store(struct device *dev,                   \
+					 struct device_attribute *attr,        \
+					 const char *buf, size_t len)          \
+	{                                                                      \
+		return set_targetN(to_cxl_region(dev), buf, (n), len);         \
+	}                                                                      \
+	static DEVICE_ATTR_RW(target##n)
+
+TARGET_ATTR_RW(0);
+TARGET_ATTR_RW(1);
+TARGET_ATTR_RW(2);
+TARGET_ATTR_RW(3);
+TARGET_ATTR_RW(4);
+TARGET_ATTR_RW(5);
+TARGET_ATTR_RW(6);
+TARGET_ATTR_RW(7);
+TARGET_ATTR_RW(8);
+TARGET_ATTR_RW(9);
+TARGET_ATTR_RW(10);
+TARGET_ATTR_RW(11);
+TARGET_ATTR_RW(12);
+TARGET_ATTR_RW(13);
+TARGET_ATTR_RW(14);
+TARGET_ATTR_RW(15);
+
+static struct attribute *interleave_attrs[] = {
+	&dev_attr_target0.attr,
+	&dev_attr_target1.attr,
+	&dev_attr_target2.attr,
+	&dev_attr_target3.attr,
+	&dev_attr_target4.attr,
+	&dev_attr_target5.attr,
+	&dev_attr_target6.attr,
+	&dev_attr_target7.attr,
+	&dev_attr_target8.attr,
+	&dev_attr_target9.attr,
+	&dev_attr_target10.attr,
+	&dev_attr_target11.attr,
+	&dev_attr_target12.attr,
+	&dev_attr_target13.attr,
+	&dev_attr_target14.attr,
+	&dev_attr_target15.attr,
+	NULL,
+};
+
+static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct cxl_region *region = to_cxl_region(dev);
+
+	if (n < region->eniw)
+		return a->mode;
+	return 0;
+}
+
+static const struct attribute_group region_interleave_group = {
+	.attrs = interleave_attrs,
+	.is_visible = visible_targets,
+};
+
+static const struct attribute_group *region_groups[] = {
+	&region_group,
+	&region_interleave_group,
+	NULL,
+};
+
 static void cxl_region_release(struct device *dev);
 
 static const struct device_type cxl_region_type = {
 	.name = "cxl_region",
 	.release = cxl_region_release,
+	.groups = region_groups
 };
 
 struct cxl_region *to_cxl_region(struct device *dev)
@@ -38,8 +330,11 @@ static void cxl_region_release(struct device *dev)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev->parent);
 	struct cxl_region *region = to_cxl_region(dev);
+	int i;
 
 	ida_free(&cxld->region_ida, region->id);
+	for (i = 0; i < region->eniw; i++)
+		remove_target(region, i);
 	kfree(region);
 }
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 20/28] cxl/region: Introduce a cxl_region driver
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (18 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 19/28] cxl/region: Introduce concept of region configuration Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 21/28] cxl/acpi: Handle address space allocation Ben Widawsky
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

The cxl_region driver is responsible for managing the HDM decoder
programming in the CXL topology. Once a region is created it must be
configured and bound to the driver in order to activate it.

The following is a sample of how such controls might work:

region=$(cat /sys/bus/cxl/devices/decoder0.0/create_region)
echo $region > /sys/bus/cxl/devices/decoder0.0/create_region
echo 2 > /sys/bus/cxl/devices/decoder0.0/region0.0:0/interleave
echo $((256<<20)) > /sys/bus/cxl/devices/decoder0.0/region0.0:0/size
echo mem0 > /sys/bus/cxl/devices/decoder0.0/region0.0:0/target0
echo region0.0:0 > /sys/bus/cxl/drivers/cxl_region/bind

In order to handle the eventual rise in failure modes of binding a
region, a new trace event is created to help track these failures for
debug and reconfiguration paths in userspace.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .clang-format                                 |   1 +
 .../driver-api/cxl/memory-devices.rst         |   3 +
 drivers/cxl/Makefile                          |   2 +
 drivers/cxl/core/bus.c                        |  19 +-
 drivers/cxl/core/core.h                       |   1 +
 drivers/cxl/core/region.c                     |  25 +-
 drivers/cxl/cxl.h                             |   5 +
 drivers/cxl/region.c                          | 324 ++++++++++++++++++
 drivers/cxl/region.h                          |   4 +
 drivers/cxl/trace.h                           |  45 +++
 10 files changed, 425 insertions(+), 4 deletions(-)
 create mode 100644 drivers/cxl/region.c
 create mode 100644 drivers/cxl/trace.h

diff --git a/.clang-format b/.clang-format
index 15d4eaabc6b5..cb7c46371465 100644
--- a/.clang-format
+++ b/.clang-format
@@ -169,6 +169,7 @@ ForEachMacros:
   - 'for_each_cpu_and'
   - 'for_each_cpu_not'
   - 'for_each_cpu_wrap'
+  - 'for_each_cxl_endpoint'
   - 'for_each_dapm_widgets'
   - 'for_each_dev_addr'
   - 'for_each_dev_scope'
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
index 0ed00906acdd..fb5ed0c914b6 100644
--- a/Documentation/driver-api/cxl/memory-devices.rst
+++ b/Documentation/driver-api/cxl/memory-devices.rst
@@ -67,6 +67,9 @@ CXL Core
 
 CXL Regions
 -----------
+.. kernel-doc:: drivers/cxl/region.c
+   :doc: cxl region
+
 .. kernel-doc:: drivers/cxl/region.h
    :identifiers:
 
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
index 77499a6b40f2..9936568711cd 100644
--- a/drivers/cxl/Makefile
+++ b/drivers/cxl/Makefile
@@ -5,9 +5,11 @@ obj-$(CONFIG_CXL_PCI) += cxl_pci.o
 obj-$(CONFIG_CXL_MEM) += cxl_mem.o
 obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
 obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
+obj-$(CONFIG_CXL_MEM) += cxl_region.o
 
 cxl_mem-y := mem.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
 cxl_pmem-y := pmem.o
 cxl_port-y := port.o
+cxl_region-y := region.o
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 7a51a290ed25..3042e6e6f5b3 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -7,6 +7,7 @@
 #include <linux/slab.h>
 #include <linux/idr.h>
 #include <cxlmem.h>
+#include <region.h>
 #include <cxl.h>
 #include <pci.h>
 #include "core.h"
@@ -32,6 +33,8 @@ static DECLARE_RWSEM(root_host_sem);
 
 static struct device *cxl_root_host;
 
+struct cxl_region *to_cxl_region(struct device *dev);
+
 int cxl_register_root(struct device *host)
 {
 	down_write(&root_host_sem);
@@ -962,6 +965,8 @@ static int cxl_device_id(struct device *dev)
 		return CXL_DEVICE_PORT;
 	if (dev->type == &cxl_memdev_type)
 		return CXL_DEVICE_MEMORY_EXPANDER;
+	if (dev->type == &cxl_region_type)
+		return CXL_DEVICE_REGION;
 	return 0;
 }
 
@@ -978,7 +983,19 @@ static int cxl_bus_match(struct device *dev, struct device_driver *drv)
 
 static int cxl_bus_probe(struct device *dev)
 {
-	return to_cxl_drv(dev->driver)->probe(dev);
+	int id = cxl_device_id(dev);
+
+	if (id == CXL_DEVICE_REGION) {
+		/* Regions cannot bind until parameters are set */
+		struct cxl_region *region = to_cxl_region(dev);
+
+		if (is_cxl_region_configured(region))
+			return to_cxl_drv(dev->driver)->probe(dev);
+	} else {
+		return to_cxl_drv(dev->driver)->probe(dev);
+	}
+
+	return -ENODEV;
 }
 
 static void cxl_bus_remove(struct device *dev)
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index dea246cb7c58..5bff806a8742 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -7,6 +7,7 @@
 extern const struct device_type cxl_nvdimm_bridge_type;
 extern const struct device_type cxl_nvdimm_type;
 extern const struct device_type cxl_memdev_type;
+extern const struct device_type cxl_region_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 3b0d74d4dd6c..f227d339f092 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -11,6 +11,8 @@
 #include <cxlmem.h>
 #include <cxl.h>
 
+#include "core.h"
+
 /**
  * DOC: cxl core region
  *
@@ -25,10 +27,27 @@ static const struct attribute_group region_interleave_group;
 
 static bool is_region_active(struct cxl_region *region)
 {
-	/* TODO: Regions can't be activated yet. */
-	return false;
+	return region->active;
 }
 
+/*
+ * Most sanity checking is left up to region binding. This does the most basic
+ * check to determine whether or not the core should try probing the driver.
+ */
+bool is_cxl_region_configured(const struct cxl_region *region)
+{
+	/* zero sized regions aren't a thing. */
+	if (region->size <= 0)
+		return false;
+
+	/* all regions have at least 1 target */
+	if (!region->targets[0])
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(is_cxl_region_configured);
+
 static void remove_target(struct cxl_region *region, int target)
 {
 	struct cxl_memdev *cxlmd;
@@ -310,7 +329,7 @@ static const struct attribute_group *region_groups[] = {
 
 static void cxl_region_release(struct device *dev);
 
-static const struct device_type cxl_region_type = {
+const struct device_type cxl_region_type = {
 	.name = "cxl_region",
 	.release = cxl_region_release,
 	.groups = region_groups
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index bcef2cb4e55b..430de5ab805a 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -200,6 +200,10 @@ void cxl_unregister_root(void);
 #define CXL_DECODER_F_EN    BIT(5)
 #define CXL_DECODER_F_MASK  GENMASK(5, 0)
 
+#define cxl_is_pmem_t3(flags)                                                  \
+	(((flags) & (CXL_DECODER_F_TYPE3 | CXL_DECODER_F_PMEM)) ==             \
+	 (CXL_DECODER_F_TYPE3 | CXL_DECODER_F_PMEM))
+
 enum cxl_decoder_type {
        CXL_DECODER_ACCELERATOR = 2,
        CXL_DECODER_EXPANDER = 3,
@@ -362,6 +366,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 #define CXL_DEVICE_NVDIMM		2
 #define CXL_DEVICE_PORT			3
 #define CXL_DEVICE_MEMORY_EXPANDER	4
+#define CXL_DEVICE_REGION		5
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
new file mode 100644
index 000000000000..da4f91b48743
--- /dev/null
+++ b/drivers/cxl/region.c
@@ -0,0 +1,324 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
+#include <linux/platform_device.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include "cxlmem.h"
+#include "region.h"
+
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
+/**
+ * DOC: cxl region
+ *
+ * This module implements a region driver that is capable of programming CXL
+ * hardware to setup regions.
+ *
+ * A CXL region encompasses a chunk of host physical address space that may be
+ * consumed by a single device (x1 interleave aka linear) or across multiple
+ * devices (xN interleaved). A region is a child device of a &struct
+ * cxl_decoder. There may be multiple active regions under a single &struct
+ * cxl_decoder. The common case for multiple regions would be several linear,
+ * contiguous regions under a single decoder. Generally, there will be a 1:1
+ * relationship between decoder and region when the region is interleaved.
+ */
+
+#define for_each_cxl_endpoint(ep, region, idx)                                 \
+	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
+	     idx++, ep = (region)->targets[idx])
+
+#define region_ways(region) ((region)->eniw)
+#define region_ig(region) (ilog2((region)->ig))
+
+#define cxld_from_region(r) to_cxl_decoder(region->dev.parent)
+
+struct cfmws_context {
+	const struct cxl_region *region;
+	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
+	int count;
+};
+
+static struct cxl_port *get_hostbridge(const struct cxl_memdev *endpoint)
+{
+	return endpoint->root_port->port;
+}
+
+static struct cxl_port *get_root_decoder(const struct cxl_memdev *endpoint)
+{
+	struct cxl_port *hostbridge = get_hostbridge(endpoint);
+
+	if (hostbridge)
+		return to_cxl_port(hostbridge->dev.parent);
+
+	return NULL;
+}
+
+/**
+ * sanitize_region() - Check is region is reasonably configured
+ * @region: The region to check
+ *
+ * Determination as to whether or not a region can possibly be configured is
+ * described in CXL Memory Device SW Guide. In order to implement the algorithms
+ * described there, certain more basic configuration parameters must first need
+ * to be validated. That is accomplished by this function.
+ *
+ * Returns 0 if the region is reasonably configured, else returns a negative
+ * error code.
+ */
+static int sanitize_region(const struct cxl_region *region)
+{
+	int i;
+
+	if (dev_WARN_ONCE(&region->dev, !is_cxl_region_configured(region),
+			  "unconfigured regions can't be probed (race?)\n")) {
+		return -ENXIO;
+	}
+
+	if (region->size % (SZ_256M * region_ways(region))) {
+		trace_sanitize_failed(region,
+				      "Invalid size. Must be multiple of NIW");
+		return -ENXIO;
+	}
+
+	for (i = 0; i < region_ways(region); i++) {
+		if (!region->targets[i]) {
+			trace_sanitize_failed(region,
+					      "Missing memory device target");
+			return -ENXIO;
+		}
+		if (!region->targets[i]->dev.driver) {
+			trace_sanitize_failed(region,
+					      "Target isn't CXL.mem capable");
+			return -ENODEV;
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * allocate_address_space() - Gets address space for the region.
+ * @region: The region that will consume the address space
+ */
+static int allocate_address_space(struct cxl_region *region)
+{
+	/* TODO */
+	return 0;
+}
+
+/**
+ * find_cdat_dsmas() - Find a valid DSMAS for the region
+ * @region: The region
+ */
+static bool find_cdat_dsmas(const struct cxl_region *region)
+{
+	return true;
+}
+
+/**
+ * qtg_match() - Does this CFMWS have desirable QTG for the endpoint
+ * @cfmws: The CFMWS for the region
+ * @endpoint: Endpoint whose QTG is being compared
+ *
+ * Prior to calling this function, the caller should verify that all endpoints
+ * in the region have the same QTG ID.
+ *
+ * Returns true if the QTG ID of the CFMWS matches the endpoint
+ */
+static bool qtg_match(const struct cxl_decoder *cfmws,
+		      const struct cxl_memdev *endpoint)
+{
+	/* TODO: */
+	return true;
+}
+
+/**
+ * region_xhb_config_valid() - determine cross host bridge validity
+ * @cfmws: The CFMWS to check against
+ * @region: The region being programmed
+ *
+ * The algorithm is outlined in 2.13.14 "Verify XHB configuration sequence" of
+ * the CXL Memory Device SW Guide (Rev1p0).
+ *
+ * Returns true if the configuration is valid.
+ */
+static bool region_xhb_config_valid(const struct cxl_region *region,
+				    const struct cxl_decoder *cfmws)
+{
+	/* TODO: */
+	return true;
+}
+
+/**
+ * region_hb_rp_config_valid() - determine root port ordering is correct
+ * @cfmws: CFMWS decoder for this @region
+ * @region: Region to validate
+ *
+ * The algorithm is outlined in 2.13.15 "Verify HB root port configuration
+ * sequence" of the CXL Memory Device SW Guide (Rev1p0).
+ *
+ * Returns true if the configuration is valid.
+ */
+static bool region_hb_rp_config_valid(const struct cxl_region *region,
+				      const struct cxl_decoder *cfmws)
+{
+	/* TODO: */
+	return true;
+}
+
+/**
+ * cfmws_contains() - determine if this region can exist in the cfmws
+ * @cfmws: CFMWS that potentially decodes to this region
+ * @region: region to be routed by the @cfmws
+ */
+static bool cfmws_contains(const struct cxl_region *region,
+			   const struct cxl_decoder *cfmws)
+{
+	/* TODO: */
+	return true;
+}
+
+static bool cfmws_valid(const struct cxl_region *region,
+			const struct cxl_decoder *cfmws)
+{
+	const struct cxl_memdev *endpoint = region->targets[0];
+
+	if (!qtg_match(cfmws, endpoint))
+		return false;
+
+	if (!cxl_is_pmem_t3(cfmws->flags))
+		return false;
+
+	if (!region_xhb_config_valid(region, cfmws))
+		return false;
+
+	if (!region_hb_rp_config_valid(region, cfmws))
+		return false;
+
+	if (!cfmws_contains(region, cfmws))
+		return false;
+
+	return true;
+}
+
+static int cfmws_match(struct device *dev, void *data)
+{
+	struct cfmws_context *ctx = (struct cfmws_context *)data;
+	const struct cxl_region *region = ctx->region;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	return !!cfmws_valid(region, to_cxl_decoder(dev));
+}
+
+/*
+ * This is a roughly equivalent implementation to "Figure 45 - High-level
+ * sequence: Finding CFMWS for region" from the CXL Memory Device SW Guide
+ * Rev1p0.
+ */
+static struct cxl_decoder *find_cfmws(const struct cxl_region *region,
+				      const struct cxl_port *root)
+{
+	struct cfmws_context ctx;
+	struct device *ret;
+
+	ctx.region = region;
+
+	ret = device_find_child((struct device *)&root->dev, &ctx, cfmws_match);
+	if (ret)
+		return to_cxl_decoder(ret);
+
+	return NULL;
+}
+
+/**
+ * gather_hdm_decoders() - Amass all HDM decoders in the hierarchy
+ * @region: The region to be programmed
+ *
+ * Programming the hardware such that the correct set of devices receive the
+ * correct memory traffic requires all connected components in the hierarchy to
+ * have HDM decoders programmed.
+ *
+ * Returns 0 if an HDM decoder was obtained for each component, else returns a
+ * negative error code.
+ */
+static int gather_hdm_decoders(const struct cxl_region *region)
+{
+	/* TODO: */
+	return 0;
+}
+
+static int bind_region(const struct cxl_region *region)
+{
+	/* TODO: */
+	return 0;
+}
+
+static int cxl_region_probe(struct device *dev)
+{
+	struct cxl_region *region = to_cxl_region(dev);
+	struct cxl_port *root_decoder;
+	struct cxl_decoder *cfmws, *ours;
+	int ret;
+
+	device_lock_assert(&region->dev);
+
+	if (uuid_is_null(&region->uuid))
+		uuid_gen(&region->uuid);
+
+	if (region->active)
+		return 0;
+
+	/* TODO: What about volatile, and LSA generated regions? */
+
+	ret = sanitize_region(region);
+	if (ret)
+		return ret;
+
+	ret = allocate_address_space(region);
+	if (ret)
+		return ret;
+
+	if (!find_cdat_dsmas(region))
+		return -ENXIO;
+
+	cfmws = cxld_from_region(region);
+	if (!cfmws_valid(region, cfmws)) {
+		dev_err(dev, "Picked invalid cfmws\n");
+		return -ENXIO;
+	}
+
+	root_decoder = get_root_decoder(region->targets[0]);
+	ours = find_cfmws(region, root_decoder);
+	if (ours != cfmws)
+		dev_warn(dev, "Picked different cfmws %s %s\n",
+			 dev_name(&cfmws->dev), dev_name(&ours->dev));
+	if (ours)
+		put_device(&ours->dev);
+
+	ret = gather_hdm_decoders(region);
+	if (ret)
+		return ret;
+
+	ret = bind_region(region);
+	if (!ret) {
+		region->active = true;
+		trace_region_activated(region, "");
+	}
+
+	return ret;
+}
+
+static struct cxl_driver cxl_region_driver = {
+	.name = "cxl_region",
+	.probe = cxl_region_probe,
+	.id = CXL_DEVICE_REGION,
+};
+module_cxl_driver(cxl_region_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_REGION);
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
index d032df438832..5df417324cab 100644
--- a/drivers/cxl/region.h
+++ b/drivers/cxl/region.h
@@ -12,6 +12,7 @@
  * @dev: This region's device.
  * @id: This regions id. Id is globally unique across all regions.
  * @list: Node in decoder's region list.
+ * @active: If the region has been activated.
  * @res: Address space consumed by this region.
  * @size: Size of the region determined from LSA or userspace.
  * @uuid: The UUID for this region.
@@ -23,6 +24,7 @@ struct cxl_region {
 	struct device dev;
 	int id;
 	struct list_head list;
+	bool active;
 
 	struct {
 		struct resource *res;
@@ -34,4 +36,6 @@ struct cxl_region {
 	};
 };
 
+bool is_cxl_region_configured(const struct cxl_region *region);
+
 #endif
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
new file mode 100644
index 000000000000..8f7f471e15b8
--- /dev/null
+++ b/drivers/cxl/trace.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cxl
+
+#if !defined(__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __CXL_TRACE_H__
+
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(cxl_region_template,
+
+	TP_PROTO(const struct cxl_region *region, char *status),
+
+	TP_ARGS(region, status),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(&region->dev))
+		__field(const struct cxl_region *, region)
+		__string(status, status)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(&region->dev));
+		__entry->region = (const struct cxl_region *)region;
+		__assign_str(status, status);
+	),
+
+	TP_printk("%s: (%s)\n", __get_str(dev_name), __get_str(status))
+);
+
+DEFINE_EVENT(cxl_region_template, region_activated,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
+DEFINE_EVENT(cxl_region_template, sanitize_failed,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
+
+#endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH ../../../drivers/cxl
+#define TRACE_INCLUDE_FILE trace
+#include <trace/define_trace.h>
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 21/28] cxl/acpi: Handle address space allocation
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (19 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 20/28] cxl/region: Introduce a cxl_region driver Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 22/28] cxl/region: Address " Ben Widawsky
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Regions are carved out of an addresses space which is claimed by top
level decoders, and subsequently their children decoders. Regions are
created with a size and therefore must fit, with proper alignment, in
that address space. The support for doing this fitting is handled by the
driver automatically.

As an example, a platform might configure a top level decoder to claim
1TB of address space @ 0x800000000 -> 0x10800000000; it would be
possible to create M regions with appropriate alignment to occupy that
address space. Each of those regions would have a host physical address
somewhere in the range between 32G and 1.3TB, and the location will be
determined by the logic added here.

The request_region() usage is not strictly mandatory at this point as
the actual handling of the address space is done with genpools. It is
highly likely however that the resource/region APIs will become useful
in the not too distant future.

All decoders manage a host physical address space while active. Only the
root decoder has constraints on location and size. As a result, it makes
most sense for the root decoder to be responsible for managing the
entire address space, and mid-level decoders and endpoints can ask the
root decoder for suballocations.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/acpi.c | 30 ++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h  |  2 ++
 2 files changed, 32 insertions(+)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 23afb3b8f5cc..65219b18f3bc 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
 #include <linux/platform_device.h>
+#include <linux/genalloc.h>
 #include <linux/module.h>
 #include <linux/device.h>
 #include <linux/kernel.h>
@@ -75,6 +76,27 @@ static int cxl_acpi_cfmws_verify(struct device *dev,
 	return 0;
 }
 
+/*
+ * Every decoder while active has an address space that it is decoding. However,
+ * only the root level decoders have fixed host physical address space ranges.
+ */
+static int cxl_create_cfmws_address_space(struct cxl_decoder *cxld,
+					  struct acpi_cedt_cfmws *cfmws)
+{
+	const int order = ilog2(SZ_256M * cxld->interleave_ways);
+	struct device *dev = &cxld->dev;
+	struct gen_pool *pool;
+
+	pool = devm_gen_pool_create(dev, order, NUMA_NO_NODE, dev_name(dev));
+	if (IS_ERR(pool))
+		return PTR_ERR(pool);
+
+	cxld->address_space = pool;
+
+	return gen_pool_add(cxld->address_space, cfmws->base_hpa,
+			    cfmws->window_size, NUMA_NO_NODE);
+}
+
 static void cxl_add_cfmws_decoders(struct device *dev,
 				   struct cxl_port *root_port)
 {
@@ -133,6 +155,14 @@ static void cxl_add_cfmws_decoders(struct device *dev,
 		cxld->interleave_granularity =
 			CFMWS_INTERLEAVE_GRANULARITY(cfmws);
 
+		rc = cxl_create_cfmws_address_space(cxld, cfmws);
+		if (rc) {
+			dev_err(dev,
+				"Failed to create CFMWS address space for decoder\n");
+			put_device(&cxld->dev);
+			goto next;
+		}
+
 		rc = cxl_decoder_add(cxld, target_map);
 		if (rc)
 			put_device(&cxld->dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 430de5ab805a..509b20375e0e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -225,6 +225,7 @@ enum cxl_decoder_type {
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
  * @region_ida: allocator for region ids.
+ * @address_space: Used/free address space for regions.
  * @nr_targets: number of elements in @target
  * @target: active ordered target list in current decoder configuration
  */
@@ -237,6 +238,7 @@ struct cxl_decoder {
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
 	struct ida region_ida;
+	struct gen_pool *address_space;
 	const int nr_targets;
 	struct cxl_dport *target[];
 };
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 22/28] cxl/region: Address space allocation
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (20 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 21/28] cxl/acpi: Handle address space allocation Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 23/28] cxl/region: Implement XHB verification Ben Widawsky
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

When a region is not assigned a host physical address, one is picked by
the driver. As the address will determine which CFMWS contains the
region, it's usually a better idea to let the driver make this
determination.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/region.c | 35 +++++++++++++++++++++++++++++++++--
 drivers/cxl/trace.h  |  3 +++
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index da4f91b48743..d127c9c69eef 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
 #include <linux/platform_device.h>
+#include <linux/genalloc.h>
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/pci.h>
@@ -55,6 +56,17 @@ static struct cxl_port *get_root_decoder(const struct cxl_memdev *endpoint)
 	return NULL;
 }
 
+static void release_cxl_region(void *r)
+{
+	struct cxl_region *region = (struct cxl_region *)r;
+	struct cxl_decoder *cxld = cxld_from_region(region);
+	struct resource *parent_res = &cxld->res;
+	const struct resource *res = region->res;
+
+	gen_pool_free(cxld->address_space, res->start, resource_size(res));
+	__release_region(parent_res, res->start, resource_size(res));
+}
+
 /**
  * sanitize_region() - Check is region is reasonably configured
  * @region: The region to check
@@ -104,8 +116,27 @@ static int sanitize_region(const struct cxl_region *region)
  */
 static int allocate_address_space(struct cxl_region *region)
 {
-	/* TODO */
-	return 0;
+	struct cxl_decoder *cxld = cxld_from_region(region);
+	unsigned long start;
+
+	start = gen_pool_alloc(cxld->address_space, region->size);
+	if (!start) {
+		trace_allocation_failed(region,
+					"Couldn't allocate address space");
+		return -ENOMEM;
+	}
+	region->res =
+		__request_region(&cxld->res, start, region->size,
+				 dev_name(&region->dev), IORESOURCE_EXCLUSIVE);
+
+	if (IS_ERR(region->res)) {
+		trace_allocation_failed(region, "Couldn't obtain region");
+		gen_pool_free(cxld->address_space, start, region->size);
+		return PTR_ERR(region->res);
+	}
+
+	return devm_add_action_or_reset(&region->dev, release_cxl_region,
+					region);
 }
 
 /**
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
index 8f7f471e15b8..a53f00ba5d0e 100644
--- a/drivers/cxl/trace.h
+++ b/drivers/cxl/trace.h
@@ -35,6 +35,9 @@ DEFINE_EVENT(cxl_region_template, region_activated,
 DEFINE_EVENT(cxl_region_template, sanitize_failed,
 	     TP_PROTO(const struct cxl_region *region, char *status),
 	     TP_ARGS(region, status));
+DEFINE_EVENT(cxl_region_template, allocation_failed,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
 
 #endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 23/28] cxl/region: Implement XHB verification
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (21 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 22/28] cxl/region: Address " Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 24/28] cxl/region: HB port config verification Ben Widawsky
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Cross host bridge verification primarily determines if the requested
interleave ordering can be achieved by the root decoder, which isn't as
programmable as other decoders.

The algorithm implemented here is based on the CXL Type 3 Memory Device
Software Guide, chapter 2.13.14

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .clang-format        |  1 +
 drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/trace.h  |  3 ++
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/.clang-format b/.clang-format
index cb7c46371465..55f628f21722 100644
--- a/.clang-format
+++ b/.clang-format
@@ -169,6 +169,7 @@ ForEachMacros:
   - 'for_each_cpu_and'
   - 'for_each_cpu_not'
   - 'for_each_cpu_wrap'
+  - 'for_each_cxl_decoder_target'
   - 'for_each_cxl_endpoint'
   - 'for_each_dapm_widgets'
   - 'for_each_dev_addr'
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index d127c9c69eef..53442de33d11 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -30,6 +30,11 @@
 	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
 	     idx++, ep = (region)->targets[idx])
 
+#define for_each_cxl_decoder_target(target, decoder, idx)                      \
+	for (idx = 0, target = (decoder)->target[idx];                         \
+	     idx < (decoder)->nr_targets;                                      \
+	     idx++, target = (decoder)->target[idx])
+
 #define region_ways(region) ((region)->eniw)
 #define region_ig(region) (ilog2((region)->ig))
 
@@ -165,6 +170,28 @@ static bool qtg_match(const struct cxl_decoder *cfmws,
 	return true;
 }
 
+static int get_unique_hostbridges(const struct cxl_region *region,
+				  struct cxl_port **hbs)
+{
+	struct cxl_memdev *ep;
+	int i, hb_count = 0;
+
+	for_each_cxl_endpoint(ep, region, i) {
+		struct cxl_port *hb = get_hostbridge(ep);
+		bool found = false;
+		int j;
+
+		for (j = 0; j < hb_count; j++) {
+			if (hbs[j] == hb)
+				found = true;
+		}
+		if (!found)
+			hbs[hb_count++] = hb;
+	}
+
+	return hb_count;
+}
+
 /**
  * region_xhb_config_valid() - determine cross host bridge validity
  * @cfmws: The CFMWS to check against
@@ -178,7 +205,59 @@ static bool qtg_match(const struct cxl_decoder *cfmws,
 static bool region_xhb_config_valid(const struct cxl_region *region,
 				    const struct cxl_decoder *cfmws)
 {
-	/* TODO: */
+	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
+	int cfmws_ig, i;
+	struct cxl_dport *target;
+
+	/* Are all devices in this region on the same CXL host bridge */
+	if (get_unique_hostbridges(region, hbs) == 1)
+		return true;
+
+	cfmws_ig = cfmws->interleave_granularity;
+
+	/* CFMWS.HBIG >= Device.Label.IG */
+	if (cfmws_ig < region_ig(region)) {
+		trace_xhb_valid(region,
+				"granularity does not support the region interleave granularity\n");
+		return false;
+	}
+
+	/* ((2^(CFMWS.HBIG - Device.RLabel.IG) * (2^CFMWS.ENIW)) > Device.RLabel.NLabel) */
+	if (1 << (cfmws_ig - region_ig(region)) * (1 << cfmws->interleave_ways) >
+	    region_ways(region)) {
+		trace_xhb_valid(region,
+				"granularity to device granularity ratio requires a larger number of devices than currently configured");
+		return false;
+	}
+
+	/* Check that endpoints are hooked up in the correct order */
+	for_each_cxl_decoder_target(target, cfmws, i) {
+		struct cxl_memdev *endpoint = region->targets[i];
+
+		if (get_hostbridge(endpoint) != target->port) {
+			trace_xhb_valid(region, "device ordering bad\n");
+			return false;
+		}
+	}
+
+	/*
+	 * CFMWS.InterleaveTargetList[n] must contain all devices, x where:
+	 *	(Device[x],RegionLabel.Position >> (CFMWS.HBIG -
+	 *	Device[x].RegionLabel.InterleaveGranularity)) &
+	 *	((2^CFMWS.ENIW) - 1) = n
+	 *
+	 * Linux notes: All devices are known to have the same interleave
+	 * granularity at this point.
+	 */
+	for_each_cxl_decoder_target(target, cfmws, i) {
+		if (((i >> (cfmws_ig - region_ig(region)))) &
+		    (((1 << cfmws->interleave_ways) - 1) != target->port_id)) {
+			trace_xhb_valid(region,
+					"One or more devices are not connected to the correct hostbridge.");
+			return false;
+		}
+	}
+
 	return true;
 }
 
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
index a53f00ba5d0e..4de47d1111ac 100644
--- a/drivers/cxl/trace.h
+++ b/drivers/cxl/trace.h
@@ -38,6 +38,9 @@ DEFINE_EVENT(cxl_region_template, sanitize_failed,
 DEFINE_EVENT(cxl_region_template, allocation_failed,
 	     TP_PROTO(const struct cxl_region *region, char *status),
 	     TP_ARGS(region, status));
+DEFINE_EVENT(cxl_region_template, xhb_valid,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
 
 #endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 24/28] cxl/region: HB port config verification
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (22 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 23/28] cxl/region: Implement XHB verification Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 25/28] cxl/region: Record host bridge target list Ben Widawsky
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Host bridge root port verification determines if the device ordering in
an interleave set can be programmed through the host bridges and
switches.

The algorithm implemented here is based on the CXL Type 3 Memory Device
Software Guide, chapter 2.13.15

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .clang-format        |  1 +
 drivers/cxl/cxl.h    |  3 ++
 drivers/cxl/region.c | 90 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/trace.h  |  3 ++
 4 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/.clang-format b/.clang-format
index 55f628f21722..96c282b63e7b 100644
--- a/.clang-format
+++ b/.clang-format
@@ -171,6 +171,7 @@ ForEachMacros:
   - 'for_each_cpu_wrap'
   - 'for_each_cxl_decoder_target'
   - 'for_each_cxl_endpoint'
+  - 'for_each_cxl_endpoint_hb'
   - 'for_each_dapm_widgets'
   - 'for_each_dev_addr'
   - 'for_each_dev_scope'
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 509b20375e0e..63554c9cebf0 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -303,6 +303,7 @@ struct cxl_port {
  * @port: reference to cxl_port that contains this downstream port
  * @list: node for a cxl_port's list of cxl_dport instances
  * @root_port_link: node for global list of root ports
+ * @verify_link: node used for hb root port verification
  */
 struct cxl_dport {
 	struct device *dport;
@@ -311,6 +312,8 @@ struct cxl_dport {
 	struct cxl_port *port;
 	struct list_head list;
 	struct list_head root_port_link;
+
+	struct list_head verify_link;
 };
 
 bool is_cxl_region(struct device *dev);
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index 53442de33d11..ecf453fd3257 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -4,6 +4,7 @@
 #include <linux/genalloc.h>
 #include <linux/device.h>
 #include <linux/module.h>
+#include <linux/sort.h>
 #include <linux/pci.h>
 #include "cxlmem.h"
 #include "region.h"
@@ -30,6 +31,11 @@
 	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
 	     idx++, ep = (region)->targets[idx])
 
+#define for_each_cxl_endpoint_hb(ep, region, hb, idx)                          \
+	for (idx = 0, (ep) = (region)->targets[idx];                           \
+	     idx < region_ways(region); idx++, (ep) = (region)->targets[idx])  \
+		if (get_hostbridge(ep) == (hb))
+
 #define for_each_cxl_decoder_target(target, decoder, idx)                      \
 	for (idx = 0, target = (decoder)->target[idx];                         \
 	     idx < (decoder)->nr_targets;                                      \
@@ -261,6 +267,29 @@ static bool region_xhb_config_valid(const struct cxl_region *region,
 	return true;
 }
 
+static int get_num_root_ports(const struct cxl_region *region)
+{
+	struct cxl_memdev *endpoint;
+	struct cxl_dport *dport, *tmp;
+	int num_root_ports = 0;
+	LIST_HEAD(root_ports);
+	int idx;
+
+	for_each_cxl_endpoint(endpoint, region, idx) {
+		struct cxl_dport *root_port = endpoint->root_port;
+
+		if (list_empty(&root_port->verify_link)) {
+			list_add_tail(&root_port->verify_link, &root_ports);
+			num_root_ports++;
+		}
+	}
+
+	list_for_each_entry_safe(dport, tmp, &root_ports, verify_link)
+		list_del_init(&dport->verify_link);
+
+	return num_root_ports;
+}
+
 /**
  * region_hb_rp_config_valid() - determine root port ordering is correct
  * @cfmws: CFMWS decoder for this @region
@@ -274,7 +303,66 @@ static bool region_xhb_config_valid(const struct cxl_region *region,
 static bool region_hb_rp_config_valid(const struct cxl_region *region,
 				      const struct cxl_decoder *cfmws)
 {
-	/* TODO: */
+	const int num_root_ports = get_num_root_ports(region);
+	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
+	int hb_count, i;
+
+	hb_count = get_unique_hostbridges(region, hbs);
+
+	/*
+	 * Are all devices in this region on the same CXL Host Bridge
+	 * Root Port?
+	 */
+	if (num_root_ports == 1)
+		return true;
+
+	for (i = 0; i < hb_count; i++) {
+		struct cxl_port *hb = hbs[i];
+		struct cxl_dport *rp;
+		int position_mask;
+		int idx;
+
+		/*
+		 * Calculate the position mask: NumRootPorts = 2^PositionMask
+		 * for this region.
+		 *
+		 * XXX: pos_mask is actually (1 << PositionMask)  - 1
+		 */
+		position_mask = (1 << (ilog2(num_root_ports))) - 1;
+
+		/*
+		 * Calculate the PortGrouping for each device on this CXL Host
+		 * Bridge Root Port:
+		 * PortGrouping = RegionLabel.Position & PositionMask
+		 */
+		list_for_each_entry(rp, &hb->dports, list) {
+			struct cxl_memdev *ep;
+			int port_grouping = -1;
+
+			for_each_cxl_endpoint_hb(ep, region, hb, idx) {
+				if (ep->root_port != rp)
+					continue;
+
+				if (port_grouping == -1) {
+					port_grouping = idx & position_mask;
+					continue;
+				}
+
+				/*
+				 * Do all devices in the region connected to this CXL
+				 * Host Bridge Root Port have the same PortGrouping?
+				 */
+				if ((idx & position_mask) != port_grouping) {
+					trace_hb_rp_valid(region,
+							  "One or more devices are not connected to the correct Host Bridge Root Port\n");
+					return false;
+				}
+			}
+
+			/* TODO: Check switch programming */
+		}
+	}
+
 	return true;
 }
 
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
index 4de47d1111ac..57fe9342817c 100644
--- a/drivers/cxl/trace.h
+++ b/drivers/cxl/trace.h
@@ -41,6 +41,9 @@ DEFINE_EVENT(cxl_region_template, allocation_failed,
 DEFINE_EVENT(cxl_region_template, xhb_valid,
 	     TP_PROTO(const struct cxl_region *region, char *status),
 	     TP_ARGS(region, status));
+DEFINE_EVENT(cxl_region_template, hb_rp_valid,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
 
 #endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 25/28] cxl/region: Record host bridge target list
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (23 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 24/28] cxl/region: HB port config verification Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 26/28] cxl/mem: Store the endpoint's uport Ben Widawsky
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Part of host bridge verification in the CXL Type 3 Memory Device
Software Guide calculates the host bridge interleave target list (6th
step in the flow chart). With host bridge verification already done, it
is trivial to store away the configuration information.

TODO: Needs support for switches (7th step in the flow chart).

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/region.c | 41 ++++++++++++++++++++++++++++++-----------
 drivers/cxl/region.h | 12 ++++++++++++
 2 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index ecf453fd3257..3276c7243c2a 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -294,14 +294,17 @@ static int get_num_root_ports(const struct cxl_region *region)
  * region_hb_rp_config_valid() - determine root port ordering is correct
  * @cfmws: CFMWS decoder for this @region
  * @region: Region to validate
+ * @p: HDM decoder programming state. Populated if non-NULL.
  *
  * The algorithm is outlined in 2.13.15 "Verify HB root port configuration
  * sequence" of the CXL Memory Device SW Guide (Rev1p0).
  *
- * Returns true if the configuration is valid.
+ * Returns true if the configuration is valid, the configuration state is
+ * updated for later programming.
  */
 static bool region_hb_rp_config_valid(const struct cxl_region *region,
-				      const struct cxl_decoder *cfmws)
+				      const struct cxl_decoder *cfmws,
+				      struct decoder_programming *p)
 {
 	const int num_root_ports = get_num_root_ports(region);
 	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
@@ -309,18 +312,29 @@ static bool region_hb_rp_config_valid(const struct cxl_region *region,
 
 	hb_count = get_unique_hostbridges(region, hbs);
 
+	if (p)
+		p->hb_count = hb_count;
+
 	/*
 	 * Are all devices in this region on the same CXL Host Bridge
 	 * Root Port?
 	 */
-	if (num_root_ports == 1)
+	if (num_root_ports == 1) {
+		if (p) {
+			p->hbs[0].rp_target_list[0] = region->targets[0]->root_port;
+			p->hbs[0].rp_count = 1;
+		}
 		return true;
+	}
 
 	for (i = 0; i < hb_count; i++) {
+		struct cxl_dport *rp, **targets;
 		struct cxl_port *hb = hbs[i];
-		struct cxl_dport *rp;
 		int position_mask;
-		int idx;
+		int idx, *rp_count;
+
+		targets = &p->hbs[i].rp_target_list[0];
+		rp_count = &p->hbs[i].rp_count;
 
 		/*
 		 * Calculate the position mask: NumRootPorts = 2^PositionMask
@@ -343,9 +357,12 @@ static bool region_hb_rp_config_valid(const struct cxl_region *region,
 				if (ep->root_port != rp)
 					continue;
 
-				if (port_grouping == -1) {
+				if (port_grouping == -1)
 					port_grouping = idx & position_mask;
-					continue;
+
+				if (p) {
+					(*rp_count)++;
+					targets[port_grouping] = ep->root_port;
 				}
 
 				/*
@@ -379,7 +396,8 @@ static bool cfmws_contains(const struct cxl_region *region,
 }
 
 static bool cfmws_valid(const struct cxl_region *region,
-			const struct cxl_decoder *cfmws)
+			const struct cxl_decoder *cfmws,
+			struct decoder_programming *p)
 {
 	const struct cxl_memdev *endpoint = region->targets[0];
 
@@ -392,7 +410,7 @@ static bool cfmws_valid(const struct cxl_region *region,
 	if (!region_xhb_config_valid(region, cfmws))
 		return false;
 
-	if (!region_hb_rp_config_valid(region, cfmws))
+	if (!region_hb_rp_config_valid(region, cfmws, p))
 		return false;
 
 	if (!cfmws_contains(region, cfmws))
@@ -409,7 +427,7 @@ static int cfmws_match(struct device *dev, void *data)
 	if (!is_root_decoder(dev))
 		return 0;
 
-	return !!cfmws_valid(region, to_cxl_decoder(dev));
+	return !!cfmws_valid(region, to_cxl_decoder(dev), NULL);
 }
 
 /*
@@ -484,7 +502,8 @@ static int cxl_region_probe(struct device *dev)
 		return -ENXIO;
 
 	cfmws = cxld_from_region(region);
-	if (!cfmws_valid(region, cfmws)) {
+	if (!cfmws_valid(region, cfmws,
+			 (struct decoder_programming *)&region->state)) {
 		dev_err(dev, "Picked invalid cfmws\n");
 		return -ENXIO;
 	}
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
index 5df417324cab..51f442636364 100644
--- a/drivers/cxl/region.h
+++ b/drivers/cxl/region.h
@@ -19,6 +19,10 @@
  * @eniw: Number of interleave ways this region is configured for.
  * @ig: Interleave granularity of region
  * @targets: The memory devices comprising the region.
+ * @state: Configuration state for host bridges, switches, and endpoints.
+ * @state.hbs: Host bridge state. One per hostbridge in the interleave set.
+ * @state.hbs.rp_count: Count of root ports for this region
+ * @state.hbs.rp_target_list: Ordered list of downstream root ports.
  */
 struct cxl_region {
 	struct device dev;
@@ -34,6 +38,14 @@ struct cxl_region {
 		int ig;
 		struct cxl_memdev *targets[CXL_DECODER_MAX_INTERLEAVE];
 	};
+
+	struct decoder_programming {
+		int hb_count;
+		struct {
+			int rp_count;
+			struct cxl_dport *rp_target_list[CXL_DECODER_MAX_INTERLEAVE];
+		} hbs[CXL_DECODER_MAX_INTERLEAVE];
+	} state;
 };
 
 bool is_cxl_region_configured(const struct cxl_region *region);
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 26/28] cxl/mem: Store the endpoint's uport
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (24 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 25/28] cxl/region: Record host bridge target list Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 27/28] cxl/region: Gather HDM decoder resources Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 28/28] cxl: Program decoders for regions Ben Widawsky
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Keeping track of an endpoint's port is important when attempting to
assemble all relevant components in the CXL.mem route from root to
endpoint. Without this change, the endpoint's port can be obtained
through walks of the hierarchy.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/cxlmem.h | 1 +
 drivers/cxl/mem.c    | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index cc5844150ce0..02efb270acac 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -45,6 +45,7 @@ struct cxl_memdev {
 	int id;
 	resource_size_t creg_base;
 	struct cxl_dport *root_port;
+	struct cxl_port *uport;
 };
 
 static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 0b0272576e55..d348056f320f 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -61,8 +61,7 @@ static void remove_endpoint(void *_cxlmd)
 	if (cxlmd->root_port)
 		sysfs_remove_link(&cxlmd->dev.kobj, "root_port");
 
-	endpoint = dev_get_drvdata(&cxlmd->dev);
-
+	endpoint = cxlmd->uport;
 	devm_cxl_remove_port(endpoint);
 }
 
@@ -83,7 +82,7 @@ static int create_endpoint(struct device *dev, struct cxl_port *parent,
 		device_del(&endpoint->dev);
 		return rc;
 	}
-	dev_set_drvdata(dev, endpoint);
+	cxlmd->uport = endpoint;
 	dev_dbg(dev, "add: %s\n", dev_name(&endpoint->dev));
 
 	return devm_add_action_or_reset(dev, remove_endpoint, cxlmd);
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 27/28] cxl/region: Gather HDM decoder resources
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (25 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 26/28] cxl/mem: Store the endpoint's uport Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  2021-10-22 18:37 ` [RFC PATCH v2 28/28] cxl: Program decoders for regions Ben Widawsky
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Prepare for HDM decoder programming by iterating through all components
and obtaining a cxl_decoder for each.

Programming a CXL region to accept memory transactions over a set of
devices requires programming HDM decoders for every component that is
part of the hierarchy enabling the region (host bridges, switches, and
endpoints). For this to be possible, each of these components must have
an available HDM decoder, which is a limited hardware resource.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/core/bus.c |  8 ++++++
 drivers/cxl/cxl.h      |  3 ++
 drivers/cxl/port.c     | 53 ++++++++++++++++++++++++++++++++++
 drivers/cxl/region.c   | 65 +++++++++++++++++++++++++++++++++++++++---
 drivers/cxl/region.h   |  4 +++
 drivers/cxl/trace.h    | 21 ++++++++++++++
 6 files changed, 150 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 3042e6e6f5b3..a75fb2c7e094 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -327,6 +327,14 @@ static bool is_endpoint_decoder(struct device *dev)
 	return dev->type == &cxl_decoder_endpoint_type;
 }
 
+bool is_cxl_decoder(struct device *dev)
+{
+	return dev->type == &cxl_decoder_switch_type ||
+	       dev->type == &cxl_decoder_endpoint_type ||
+	       dev->type == &cxl_decoder_root_type;
+}
+EXPORT_SYMBOL_GPL(is_cxl_decoder);
+
 bool is_root_decoder(struct device *dev)
 {
 	return dev->type == &cxl_decoder_root_type;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 63554c9cebf0..79d22992fddf 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -335,11 +335,14 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
 struct cxl_dport *cxl_get_root_dport(struct device *dev);
 struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
 					struct device *dev);
+struct cxl_decoder *cxl_pop_decoder(struct cxl_port *port);
+void cxl_push_decoder(struct cxl_decoder *cxld);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
 struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 				      unsigned int nr_targets);
+bool is_cxl_decoder(struct device *dev);
 int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 3ddfd7673a56..5d35ccf2407f 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -35,6 +35,59 @@
  * presenting APIs to other drivers to utilize the decoders.
  */
 
+static int unused_decoder(struct device *dev, void *data)
+{
+	struct cxl_decoder *cxld;
+
+	if (!is_cxl_decoder(dev))
+		return 0;
+
+	if (dev_WARN_ONCE(dev, is_root_decoder(dev),
+			  "Root decoders can't be present"))
+		return 0;
+
+	cxld = to_cxl_decoder(dev);
+	if (cxld->flags & CXL_DECODER_F_EN)
+		return 0;
+
+	/*
+	 * Mark this decoder as enabled to prevent other entities from thinking
+	 * it's available.
+	 */
+	cxld->flags |= CXL_DECODER_F_EN;
+
+	return 1;
+}
+
+/**
+ * cxl_pop_decoder() - Obtains an available decoder resource
+ * @port: Owner of the decoder resource
+ */
+struct cxl_decoder *cxl_pop_decoder(struct cxl_port *port)
+{
+	struct device *cxldd;
+
+	cxldd = device_find_child(&port->dev, NULL, unused_decoder);
+	if (!cxldd)
+		return ERR_PTR(-EBUSY);
+
+	/* Keep the reference */
+
+	return to_cxl_decoder(cxldd);
+}
+EXPORT_SYMBOL_GPL(cxl_pop_decoder);
+
+/**
+ * cxl_push_decoder() - Restores decoder resource to the port
+ * @cxld: the decoder resource to replace
+ */
+void cxl_push_decoder(struct cxl_decoder *cxld)
+{
+	cxld->flags &= ~CXL_DECODER_F_EN;
+	put_device(&cxld->dev);
+}
+EXPORT_SYMBOL_GPL(cxl_push_decoder);
+
 struct cxl_port_data {
 	struct cxl_component_regs regs;
 
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index 3276c7243c2a..f4d190ede3ee 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -450,9 +450,32 @@ static struct cxl_decoder *find_cfmws(const struct cxl_region *region,
 	return NULL;
 }
 
+static void put_all_decoders(struct decoder_programming *p)
+{
+	int i;
+
+	for (i = 0; i < CXL_DECODER_MAX_INTERLEAVE; i++) {
+		if (p->hbs[i].cxld) {
+			cxl_push_decoder(p->hbs[i].cxld);
+			p->hbs[i].cxld = NULL;
+		}
+
+		if (p->ep_cxld[i]) {
+			cxl_push_decoder(p->ep_cxld[i]);
+			p->ep_cxld[i] = NULL;
+		}
+	}
+}
+
+static void release_decoders(void *p)
+{
+	put_all_decoders(p);
+}
+
 /**
  * gather_hdm_decoders() - Amass all HDM decoders in the hierarchy
  * @region: The region to be programmed
+ * @p: Programming state that will gather decoders
  *
  * Programming the hardware such that the correct set of devices receive the
  * correct memory traffic requires all connected components in the hierarchy to
@@ -461,10 +484,44 @@ static struct cxl_decoder *find_cfmws(const struct cxl_region *region,
  * Returns 0 if an HDM decoder was obtained for each component, else returns a
  * negative error code.
  */
-static int gather_hdm_decoders(const struct cxl_region *region)
+static int gather_hdm_decoders(const struct cxl_region *region, struct decoder_programming *p)
 {
-	/* TODO: */
-	return 0;
+	struct cxl_memdev *ep;
+	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
+	int i, hb_count = get_unique_hostbridges(region, hbs);
+
+	for_each_cxl_endpoint(ep, region, i) {
+		struct cxl_port *port = ep->uport;
+
+		p->ep_cxld[i] = cxl_pop_decoder(port);
+		if (IS_ERR(p->ep_cxld[i])) {
+			int err = PTR_ERR(p->ep_cxld[i]);
+
+			trace_cxl_region_decoder(region, port);
+			p->ep_cxld[i] = NULL;
+			put_all_decoders(p);
+			return err;
+		}
+	}
+
+	/* TODO: Switches */
+
+	for (i = 0; i < hb_count; i++) {
+		struct cxl_port *hb = hbs[i];
+
+		p->hbs[i].cxld = cxl_pop_decoder(hb);
+		if (IS_ERR(p->hbs[i].cxld)) {
+			int err = PTR_ERR(p->hbs[i].cxld);
+
+			trace_cxl_region_decoder(region, hb);
+			p->hbs[i].cxld = NULL;
+			put_all_decoders(p);
+			return err;
+		}
+	}
+
+	return devm_add_action_or_reset((struct device *)&region->dev,
+					release_decoders, p);
 }
 
 static int bind_region(const struct cxl_region *region)
@@ -516,7 +573,7 @@ static int cxl_region_probe(struct device *dev)
 	if (ours)
 		put_device(&ours->dev);
 
-	ret = gather_hdm_decoders(region);
+	ret = gather_hdm_decoders(region, &region->state);
 	if (ret)
 		return ret;
 
diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h
index 51f442636364..4c0b94c3c001 100644
--- a/drivers/cxl/region.h
+++ b/drivers/cxl/region.h
@@ -23,6 +23,8 @@
  * @state.hbs: Host bridge state. One per hostbridge in the interleave set.
  * @state.hbs.rp_count: Count of root ports for this region
  * @state.hbs.rp_target_list: Ordered list of downstream root ports.
+ * @state.hbs.cxld: an available decoder to set up the programming.
+ * @state.ep_cxld: available decoders for endpoint programming.
  */
 struct cxl_region {
 	struct device dev;
@@ -44,7 +46,9 @@ struct cxl_region {
 		struct {
 			int rp_count;
 			struct cxl_dport *rp_target_list[CXL_DECODER_MAX_INTERLEAVE];
+			struct cxl_decoder *cxld;
 		} hbs[CXL_DECODER_MAX_INTERLEAVE];
+		struct cxl_decoder *ep_cxld[CXL_DECODER_MAX_INTERLEAVE];
 	} state;
 };
 
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
index 57fe9342817c..516a7aed8a27 100644
--- a/drivers/cxl/trace.h
+++ b/drivers/cxl/trace.h
@@ -45,6 +45,27 @@ DEFINE_EVENT(cxl_region_template, hb_rp_valid,
 	     TP_PROTO(const struct cxl_region *region, char *status),
 	     TP_ARGS(region, status));
 
+TRACE_EVENT(cxl_region_decoder,
+	TP_PROTO(const struct cxl_region *region, struct cxl_port *port),
+
+	TP_ARGS(region, port),
+
+	TP_STRUCT__entry(
+		__field(const struct cxl_region *, region)
+		__field(struct cxl_port *, port)
+		__string(rdev_name, dev_name(&region->dev))
+		__string(pdev_name, dev_name(&port->dev))
+	),
+
+	TP_fast_assign(
+		__assign_str(rdev_name, dev_name(&region->dev));
+		__assign_str(pdev_name, dev_name(&port->dev));
+	),
+
+	TP_printk("%s: HDM decoder error for %s", __get_str(rdev_name), __get_str(pdev_name))
+);
+
+
 #endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
 
 /* This part must be outside protection */
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* [RFC PATCH v2 28/28] cxl: Program decoders for regions
  2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
                   ` (26 preceding siblings ...)
  2021-10-22 18:37 ` [RFC PATCH v2 27/28] cxl/region: Gather HDM decoder resources Ben Widawsky
@ 2021-10-22 18:37 ` Ben Widawsky
  27 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-22 18:37 UTC (permalink / raw)
  To: linux-cxl, Chet Douglas
  Cc: Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny,
	Jonathan Cameron, Vishal Verma

Do the HDM decoder programming for all endpoints and host bridges in a
region. Switches are currently unimplemented.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 drivers/cxl/cxl.h    |   3 +
 drivers/cxl/port.c   | 196 +++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/region.c |  33 +++++++-
 3 files changed, 231 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 79d22992fddf..f2ee26ac2398 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -52,6 +52,7 @@
 #define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
 #define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
 #define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
+#define   CXL_HDM_DECODER0_CTRL_COMMIT_ERROR BIT(11)
 #define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
 #define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
 #define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
@@ -337,6 +338,8 @@ struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
 					struct device *dev);
 struct cxl_decoder *cxl_pop_decoder(struct cxl_port *port);
 void cxl_push_decoder(struct cxl_decoder *cxld);
+int cxl_commit_decoder(struct cxl_decoder *cxld);
+void cxl_disable_decoder(struct cxl_decoder *cxld);
 
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
index 5d35ccf2407f..67143c7f59bb 100644
--- a/drivers/cxl/port.c
+++ b/drivers/cxl/port.c
@@ -99,6 +99,202 @@ struct cxl_port_data {
 	} caps;
 };
 
+#define COMMIT_TIMEOUT_MS 10
+static int wait_for_commit(struct cxl_decoder *cxld)
+{
+	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+	const unsigned long start = jiffies;
+	struct cxl_port_data *cpd;
+	void __iomem *hdm_decoder;
+	unsigned long end = start;
+	u32 ctrl;
+
+	cpd = dev_get_drvdata(&port->dev);
+	hdm_decoder = cpd->regs.hdm_decoder;
+
+	do {
+		end = jiffies;
+		ctrl = readl(hdm_decoder +
+			     CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
+			break;
+
+		if (time_after(end, start + COMMIT_TIMEOUT_MS)) {
+			dev_err(&cxld->dev, "HDM decoder commit timeout %x\n", ctrl);
+			return -ETIMEDOUT;
+		}
+		if ((ctrl & CXL_HDM_DECODER0_CTRL_COMMIT_ERROR) != 0) {
+			dev_err(&cxld->dev, "HDM decoder commit error %x\n", ctrl);
+			return -ENXIO;
+		}
+	} while (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl));
+
+	return 0;
+}
+
+/**
+ * cxl_commit_decoder() - Program a configured cxl_decoder
+ * @cxld: The preconfigured cxl decoder.
+ *
+ * A cxl decoder that is to be committed should have been earmarked as enabled.
+ * This mechanism acts as a soft reservation on the decoder.
+ *
+ * Returns 0 if commit was successful, negative error code otherwise.
+ */
+int cxl_commit_decoder(struct cxl_decoder *cxld)
+{
+	u32 ctrl, tl_lo, tl_hi, base_lo, base_hi, size_lo, size_hi;
+	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+	struct cxl_port_data *cpd;
+	void __iomem *hdm_decoder;
+	int rc;
+
+	/*
+	 * Decoder flags are entirely software controlled and therefore this
+	 * case is purely a driver bug.
+	 */
+	if (dev_WARN_ONCE(&port->dev, (cxld->flags & CXL_DECODER_F_EN) == 0,
+			  "Invalid %s enable state\n", dev_name(&cxld->dev)))
+		return -ENXIO;
+
+	cpd = dev_get_drvdata(&port->dev);
+	hdm_decoder = cpd->regs.hdm_decoder;
+	ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+
+	/*
+	 * A decoder that's currently active cannot be changed without the
+	 * system being quiesced. While the driver should prevent against this,
+	 * for a variety of reasons the hardware might not be in sync with the
+	 * hardware and so, do not splat on error.
+	 */
+	size_hi = readl(hdm_decoder +
+			CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(cxld->id));
+	size_lo =
+		readl(hdm_decoder + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(cxld->id));
+	if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl) &&
+	    (size_lo + size_hi)) {
+		dev_err(&port->dev, "Tried to change an active decoder (%s)\n",
+			dev_name(&cxld->dev));
+		return -EBUSY;
+	}
+
+	u32p_replace_bits(&ctrl, 8 - ilog2(cxld->interleave_granularity),
+			  CXL_HDM_DECODER0_CTRL_IG_MASK);
+	u32p_replace_bits(&ctrl, ilog2(cxld->interleave_ways),
+			  CXL_HDM_DECODER0_CTRL_IW_MASK);
+	u32p_replace_bits(&ctrl, 1, CXL_HDM_DECODER0_CTRL_COMMIT);
+
+	/* TODO: set based on type */
+	u32p_replace_bits(&ctrl, 1, CXL_HDM_DECODER0_CTRL_TYPE);
+
+	base_lo = FIELD_PREP(GENMASK(31, 28),
+			     (u32)(cxld->res.start & 0xffffffff));
+	base_hi = FIELD_PREP(~0, (u32)(cxld->res.start >> 32));
+
+	size_lo = (u32)(resource_size(&cxld->res)) & GENMASK(31, 28);
+	size_hi = (u32)((resource_size(&cxld->res) >> 32));
+
+	if (cxld->nr_targets > 0) {
+		tl_lo |= FIELD_PREP(GENMASK(7, 0), cxld->target[0]->port_id);
+		if (cxld->interleave_ways > 1)
+			tl_lo |= FIELD_PREP(GENMASK(15, 8),
+					    cxld->target[1]->port_id);
+		if (cxld->interleave_ways > 2)
+			tl_lo |= FIELD_PREP(GENMASK(23, 16),
+					    cxld->target[2]->port_id);
+		if (cxld->interleave_ways > 3)
+			tl_lo |= FIELD_PREP(GENMASK(31, 24),
+					    cxld->target[3]->port_id);
+		if (cxld->interleave_ways > 4)
+			tl_hi |= FIELD_PREP(GENMASK(7, 0),
+					    cxld->target[4]->port_id);
+		if (cxld->interleave_ways > 5)
+			tl_hi |= FIELD_PREP(GENMASK(15, 8),
+					    cxld->target[5]->port_id);
+		if (cxld->interleave_ways > 6)
+			tl_hi |= FIELD_PREP(GENMASK(23, 16),
+					    cxld->target[6]->port_id);
+		if (cxld->interleave_ways > 7)
+			tl_hi |= FIELD_PREP(GENMASK(31, 24),
+					    cxld->target[7]->port_id);
+
+		writel(tl_hi, hdm_decoder + CXL_HDM_DECODER0_TL_HIGH(cxld->id));
+		writel(tl_lo, hdm_decoder + CXL_HDM_DECODER0_TL_LOW(cxld->id));
+	}
+
+	writel(size_hi,
+	       hdm_decoder + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(cxld->id));
+	writel(size_lo,
+	       hdm_decoder + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(cxld->id));
+	writel(base_hi,
+	       hdm_decoder + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(cxld->id));
+	writel(base_lo,
+	       hdm_decoder + CXL_HDM_DECODER0_BASE_LOW_OFFSET(cxld->id));
+	writel(ctrl, hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+
+	rc = wait_for_commit(cxld);
+	if (rc)
+		return rc;
+
+#define DPORT_TL_STR "%d %d %d %d %d %d %d %d"
+#define DPORT(i)                                                               \
+	(cxld->nr_targets && cxld->interleave_ways > (i)) ?                    \
+		      cxld->target[(i)]->port_id :                                   \
+		      -1
+#define DPORT_TL                                                               \
+	DPORT(0), DPORT(1), DPORT(2), DPORT(3), DPORT(4), DPORT(5), DPORT(6),  \
+		DPORT(7)
+
+	dev_dbg(&port->dev,
+		"%s\n\tBase %pa\n\tSize %llu\n\tIG %u\n\tIW %u\n\tTargetList: " DPORT_TL_STR,
+		dev_name(&cxld->dev), &cxld->res.start,
+		resource_size(&cxld->res), cxld->interleave_granularity,
+		cxld->interleave_ways, DPORT_TL);
+#undef DPORT_TL
+#undef DPORT
+#undef DPORT_TL_STR
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_commit_decoder);
+
+/**
+ * cxl_disable_decoder() - Disables a decoder
+ * @cxld: The active cxl decoder.
+ *
+ * CXL decoders (as of 2.0 spec) have no way to deactivate them other than to
+ * set the size of the HDM to 0. This function will clear all registers, and if
+ * the decoder is active, commit the 0'd out registers.
+ */
+void cxl_disable_decoder(struct cxl_decoder *cxld)
+{
+	struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+	struct cxl_port_data *cpd;
+	void __iomem *hdm_decoder;
+	u32 ctrl;
+
+	cpd = dev_get_drvdata(&port->dev);
+	hdm_decoder = cpd->regs.hdm_decoder;
+	ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+
+	if (dev_WARN_ONCE(&port->dev, (cxld->flags & CXL_DECODER_F_EN) == 0,
+			  "Invalid decoder enable state\n"))
+		return;
+
+	/* There's no way to "uncommit" a committed decoder, only 0 size it */
+	writel(0, hdm_decoder + CXL_HDM_DECODER0_TL_HIGH(cxld->id));
+	writel(0, hdm_decoder + CXL_HDM_DECODER0_TL_LOW(cxld->id));
+	writel(0, hdm_decoder + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(cxld->id));
+	writel(0, hdm_decoder + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(cxld->id));
+	writel(0, hdm_decoder + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(cxld->id));
+	writel(0, hdm_decoder + CXL_HDM_DECODER0_BASE_LOW_OFFSET(cxld->id));
+
+	/* If the device isn't actually active, just zero out all the fields */
+	if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
+		writel(CXL_HDM_DECODER0_CTRL_COMMIT,
+		       hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+}
+EXPORT_SYMBOL_GPL(cxl_disable_decoder);
+
 static inline int cxl_hdm_decoder_ig(u32 ctrl)
 {
 	int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index f4d190ede3ee..05e7ca3d0900 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -526,7 +526,38 @@ static int gather_hdm_decoders(const struct cxl_region *region, struct decoder_p
 
 static int bind_region(const struct cxl_region *region)
 {
-	/* TODO: */
+	const struct decoder_programming *p = &region->state;
+	int i, rc;
+
+	for (i = 0; i < p->hb_count; i++) {
+		struct cxl_decoder *cxld = p->hbs[i].cxld;
+		int j;
+
+		cxld->res = (struct resource)DEFINE_RES_MEM(region->res->start,
+							    region->size);
+		cxld->interleave_granularity = region->ig;
+		cxld->interleave_ways = p->hbs[i].rp_count;
+		for (j = 0; j < p->hbs[i].rp_count; j++)
+			cxld->target[j] = p->hbs[i].rp_target_list[j];
+
+		rc = cxl_commit_decoder(cxld);
+		if (rc)
+			return rc;
+	}
+
+	for (i = 0; i < region_ways(region); i++) {
+		struct cxl_decoder *cxld = p->ep_cxld[i];
+
+		cxld->res = (struct resource)DEFINE_RES_MEM(region->res->start,
+							    region->size);
+		cxld->interleave_granularity = region->ig;
+		cxld->interleave_ways = region_ways(region);
+
+		rc = cxl_commit_decoder(cxld);
+		if (rc)
+			return rc;
+	}
+
 	return 0;
 }
 
-- 
2.33.1


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI
  2021-10-22 18:36 ` [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
@ 2021-10-29 20:15   ` Dan Williams
  2021-10-29 21:20     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-29 20:15 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> With the upcoming introduction of a driver to control the non-PCI
> aspects of CXL.mem, such as interleave set creation and configuration,
> there will be an opportunity to disconnection control over CXL device

s/disconnection/disconnect/

> memory and CXL device manageability. CXL device manageability is
> implemented by the cxl_pci driver. Doing this rename allows the CXL
> memory driver to be enabled by a new config option independently of CXL
> device manageability through CXL.io/PCI mechanisms.

That comes across a bit hard to parse to me, how about:

"The cxl_mem module was renamed cxl_pci in commit 21e9f76733a8 ("cxl:
Rename mem to pci"). In preparation for adding an ancillary driver for
cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead
and rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Tree up the CXL_MEM name
for that new driver to manage generic CXL.mem endpoint operations."

> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/Kconfig  | 13 ++++++-------
>  drivers/cxl/Makefile |  2 +-
>  2 files changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index e6de221cc568..23773d0ac896 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -13,14 +13,13 @@ menuconfig CXL_BUS
>
>  if CXL_BUS
>
> -config CXL_MEM
> -       tristate "CXL.mem: Memory Devices"
> +config CXL_PCI
> +       tristate "PCI manageability"

s/PCI manageability/CXL Memory Device: PCI Operations/

...as I don't think an end user reading "PCI Manageability" would know
that it supports basic memory expander enumeration and mailbox
operations.

>         default CXL_BUS
>         help
> -         The CXL.mem protocol allows a device to act as a provider of
> -         "System RAM" and/or "Persistent Memory" that is fully coherent
> -         as if the memory was attached to the typical CPU memory
> -         controller.
> +         The CXL specification defines a set of interfaces which are controlled
> +         through well known PCI configuration mechanisms. Such access is
> +         referred to CXL.io in the specification.

The CXL specification defines a "CXL memory device" sub-class in the
PCI "memory controller" base class of devices. Device's identified by
this class code provide support for volatile and / or persistent
memory to be mapped into the system address map (Host-managed Device
Memory (HDM)).

>
>           Say 'y/m' to enable a driver that will attach to CXL.mem devices for

This would need updating too...

s/CXL.mem devices /generic CXL memory expanders identified by the
memory device class code/

...to reduce confusion about this driver for generic type-3 vs vendor
specific type-2 devices that also support CXL.mem

>           configuration and management primarily via the mailbox interface. See
> @@ -31,7 +30,7 @@ config CXL_MEM
>
>  config CXL_MEM_RAW_COMMANDS
>         bool "RAW Command Interface for Memory Devices"
> -       depends on CXL_MEM
> +       depends on CXL_PCI
>         help
>           Enable CXL RAW command interface.
>
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index d1aaabc940f3..cf07ae6cea17 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CXL_BUS) += core/
> -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> +obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 02/28] cxl: Move register block enumeration to core
  2021-10-22 18:36 ` [RFC PATCH v2 02/28] cxl: Move register block enumeration to core Ben Widawsky
@ 2021-10-29 20:23   ` Dan Williams
  2021-10-29 21:23     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-29 20:23 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> CXL drivers or cxl_core itself will require the ability to map component
> registers in order to map HDM decoder resources amongst other things.

The PCI bits of this are only needed for switch support, right?
Everything else already knows how to find the component_reg_phys base
before devm_cxl_add_port().

It would help if this first sentence was a specific example where the
PCI helpers are needed outside of cxl_pci.

> Much of the register mapping code has already moved into cxl_core. The
> code to pull the BAR number and block office remained within cxl_pci
> because there was no need to move it. Upcoming work will require this
> functionality to be available outside of cxl_pci.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h       |  4 +++
>  drivers/cxl/pci.c       | 52 ---------------------------------------
>  3 files changed, 58 insertions(+), 52 deletions(-)
>
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index 41de4a136ecd..40598905c080 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -5,6 +5,7 @@
>  #include <linux/slab.h>
>  #include <linux/pci.h>
>  #include <cxlmem.h>
> +#include <pci.h>
>
>  /**
>   * DOC: cxl registers
> @@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(cxl_map_device_regs);
> +
> +static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> +                               struct cxl_register_map *map)
> +{
> +       map->block_offset =
> +               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> +       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> +       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> +}
> +
> +/**
> + * cxl_find_regblock() - Locate register blocks by type
> + * @pdev: The CXL PCI device to enumerate.
> + * @type: Register Block Indicator id
> + * @map: Enumeration output, clobbered on error
> + *
> + * Return: 0 if register block enumerated, negative error code otherwise
> + *
> + * A CXL DVSEC may additional point one or more register blocks, search
> + * for them by @type.
> + */
> +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> +                     struct cxl_register_map *map)
> +{
> +       u32 regloc_size, regblocks;
> +       int regloc, i;
> +
> +       regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> +                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> +       if (!regloc)
> +               return -ENXIO;
> +
> +       pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> +       regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> +
> +       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> +       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> +
> +       for (i = 0; i < regblocks; i++, regloc += 8) {
> +               u32 reg_lo, reg_hi;
> +
> +               pci_read_config_dword(pdev, regloc, &reg_lo);
> +               pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> +
> +               cxl_decode_regblock(reg_lo, reg_hi, map);
> +
> +               if (map->reg_type == type)
> +                       return 0;
> +       }
> +
> +       return -ENODEV;
> +}
> +EXPORT_SYMBOL_GPL(cxl_find_regblock);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 7cd16ef144dd..f06c596fad71 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -161,6 +161,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>                         struct cxl_device_regs *regs,
>                         struct cxl_register_map *map);
>
> +enum cxl_regloc_type;
> +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> +                     struct cxl_register_map *map);
> +
>  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
>  #define CXL_TARGET_STRLEN 20
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index f19a06809079..d1adc759d051 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -400,58 +400,6 @@ static int cxl_map_regs(struct cxl_mem *cxlm, struct cxl_register_map *map)
>         return 0;
>  }
>
> -static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> -                               struct cxl_register_map *map)
> -{
> -       map->block_offset =
> -               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> -       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> -       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> -}
> -
> -/**
> - * cxl_find_regblock() - Locate register blocks by type
> - * @pdev: The CXL PCI device to enumerate.
> - * @type: Register Block Indicator id
> - * @map: Enumeration output, clobbered on error
> - *
> - * Return: 0 if register block enumerated, negative error code otherwise
> - *
> - * A CXL DVSEC may additional point one or more register blocks, search
> - * for them by @type.
> - */
> -static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> -                            struct cxl_register_map *map)
> -{
> -       u32 regloc_size, regblocks;
> -       int regloc, i;
> -
> -       regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> -       if (!regloc)
> -               return -ENXIO;
> -
> -       pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> -       regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> -
> -       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> -       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> -
> -       for (i = 0; i < regblocks; i++, regloc += 8) {
> -               u32 reg_lo, reg_hi;
> -
> -               pci_read_config_dword(pdev, regloc, &reg_lo);
> -               pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> -
> -               cxl_decode_regblock(reg_lo, reg_hi, map);
> -
> -               if (map->reg_type == type)
> -                       return 0;
> -       }
> -
> -       return -ENODEV;
> -}
> -
>  static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>                           struct cxl_register_map *map)
>  {
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 03/28] cxl/acpi: Map component registers for Root Ports
  2021-10-22 18:36 ` [RFC PATCH v2 03/28] cxl/acpi: Map component registers for Root Ports Ben Widawsky
@ 2021-10-29 20:28   ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-29 20:28 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> With the addition of cxl_find_register_block() in cxl_core, it becomes
> trivial to complete the TODO left for mapping the component registers of
> root ports. None of the CXL drivers currently use component registers of
> downstream ports (which is what a CXL 2.0 Root Port is). As such, there
> should be no functional change.

Oh, I forgot that cxl_acpi had this todo, I think it would be best to
fold this with the previous patch so that it makes a complete story.

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/acpi.c | 10 ++++++++--
>  drivers/cxl/pci.h  |  4 ++++
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index af1c6c1875ac..7d13e7f0aefc 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -7,6 +7,7 @@
>  #include <linux/acpi.h>
>  #include <linux/pci.h>
>  #include "cxl.h"
> +#include "pci.h"
>
>  static struct acpi_table_header *acpi_cedt;
>
> @@ -206,11 +207,13 @@ static resource_size_t get_chbcr(struct acpi_cedt_chbs *chbs)
>
>  __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>  {
> +       resource_size_t creg = CXL_RESOURCE_NONE;
>         struct cxl_walk_context *ctx = data;
>         struct pci_bus *root_bus = ctx->root;
>         struct cxl_port *port = ctx->port;
>         int type = pci_pcie_type(pdev);
>         struct device *dev = ctx->dev;
> +       struct cxl_register_map map;
>         u32 lnkcap, port_num;
>         int rc;
>
> @@ -224,9 +227,12 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>                                   &lnkcap) != PCIBIOS_SUCCESSFUL)
>                 return 0;
>
> -       /* TODO walk DVSEC to find component register base */
> +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> +       if (!rc)
> +               creg = cxl_reg_block(pdev, &map);
> +
>         port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> -       rc = cxl_add_dport(port, &pdev->dev, port_num, CXL_RESOURCE_NONE);
> +       rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
>         if (rc) {
>                 ctx->error = rc;
>                 return rc;
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 7d3e4bf06b45..12fdcb1b14e5 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -31,4 +31,8 @@ enum cxl_regloc_type {
>  #define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
>  #define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
>
> +#define cxl_reg_block(pdev, map)                                               \
> +       ((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> +                          (map)->block_offset))
> +
>  #endif /* __CXL_PCI_H__ */
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 04/28] cxl: Add helper for new drivers
  2021-10-22 18:36 ` [RFC PATCH v2 04/28] cxl: Add helper for new drivers Ben Widawsky
@ 2021-10-29 20:30   ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-29 20:30 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

s/cxl: Add helper for new drivers/cxl: Introduce module_cxl_driver/

...as "Add helper for new drivers" could be any kind of helper
anywhere in the stack.

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Many CXL drivers simply want to register and unregister themselves.
> module_driver already supported this. A simple wrapper around that
> reduces a decent amount of boilerplate in upcoming patches.
>

Other than the subject clarification, looks good to me.

> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/cxl.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f06c596fad71..4483e1a39fc3 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -317,6 +317,9 @@ int __cxl_driver_register(struct cxl_driver *cxl_drv, struct module *owner,
>  #define cxl_driver_register(x) __cxl_driver_register(x, THIS_MODULE, KBUILD_MODNAME)
>  void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>
> +#define module_cxl_driver(__cxl_driver) \
> +       module_driver(__cxl_driver, cxl_driver_register, cxl_driver_unregister)
> +
>  #define CXL_DEVICE_NVDIMM_BRIDGE       1
>  #define CXL_DEVICE_NVDIMM              2
>
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource
  2021-10-22 18:36 ` [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource Ben Widawsky
@ 2021-10-29 20:50   ` Dan Williams
  2021-10-29 21:26     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-29 20:50 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Regions will use the resource API in order to help manage allocated
> space. As regions are children of the decoder, it makes sense that the
> parent host the main resource to be suballocated by the region.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/acpi.c     | 12 ++++--------
>  drivers/cxl/core/bus.c |  4 ++--
>  drivers/cxl/cxl.h      |  4 ++--
>  3 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 7d13e7f0aefc..b972abc9f6ef 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -126,10 +126,9 @@ static void cxl_add_cfmws_decoders(struct device *dev,
>
>                 cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
>                 cxld->target_type = CXL_DECODER_EXPANDER;
> -               cxld->range = (struct range) {
> -                       .start = cfmws->base_hpa,
> -                       .end = cfmws->base_hpa + cfmws->window_size - 1,
> -               };
> +               cxld->res = (struct resource)DEFINE_RES_MEM_NAMED(cfmws->base_hpa,
> +                                                                 cfmws->window_size,
> +                                                                 "cfmws");

I think this should just be DEFINE_RES_MEM(), and then set the name of
it inside cxl_decoder_add() to the dev_name() of the decoder device.
That way a dump of the resource tree hierarchy makes sense compared to
the device hierarchy with actual device names not a series of
repeating "cfmws" entries.

>                 cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
>                 cxld->interleave_granularity =
>                         CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> @@ -339,10 +338,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>         cxld->interleave_ways = 1;
>         cxld->interleave_granularity = PAGE_SIZE;
>         cxld->target_type = CXL_DECODER_EXPANDER;
> -       cxld->range = (struct range) {
> -               .start = 0,
> -               .end = -1,
> -       };
> +       cxld->res = (struct resource)DEFINE_RES_MEM(0, 0);
>
>         device_lock(&port->dev);
>         dport = list_first_entry(&port->dports, typeof(*dport), list);
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index ebd061d03950..454d4d846eb2 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -47,7 +47,7 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
>  {
>         struct cxl_decoder *cxld = to_cxl_decoder(dev);
>
> -       return sysfs_emit(buf, "%#llx\n", cxld->range.start);
> +       return sysfs_emit(buf, "%#llx\n", cxld->res.start);
>  }
>  static DEVICE_ATTR_RO(start);
>
> @@ -56,7 +56,7 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
>  {
>         struct cxl_decoder *cxld = to_cxl_decoder(dev);
>
> -       return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
> +       return sysfs_emit(buf, "%#llx\n", resource_size(&cxld->res));

It's not clear to me that anything other than root decoders will host
a resource tree, every decoder downstream of the root would reference
a subset of the root decoder range. Something like:

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5e2e93451928..00bf742396e0 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -200,7 +200,8 @@ enum cxl_decoder_type {
  * struct cxl_decoder - CXL address range decode configuration
  * @dev: this decoder's device
  * @id: kernel device name id
- * @range: address range considered by this decoder
+ * @res: platform address range hosted by a root decoder
+ * @range: address range programmed in the (non-root) decoder
  * @interleave_ways: number of cxl_dports in this decode
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
@@ -211,7 +212,10 @@ enum cxl_decoder_type {
 struct cxl_decoder {
        struct device dev;
        int id;
-       struct range range;
+       union {
+               struct resource res;
+               struct range range;
+       };
        int interleave_ways;
        int interleave_granularity;
        enum cxl_decoder_type target_type;

...because regions will __request_region(&cxld->res, ...) from the
root decoder, and all the intervening decoders in that stack will be
updated to make that mapping happen.

Note, this is just how I had it roughly mapped out in my head, I defer
making it a hard recommendation until I get deeper into this set to
see if we diverge.

>  }
>  static DEVICE_ATTR_RO(size);
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 4483e1a39fc3..7f2e2bdc7883 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -195,7 +195,7 @@ enum cxl_decoder_type {
>   * struct cxl_decoder - CXL address range decode configuration
>   * @dev: this decoder's device
>   * @id: kernel device name id
> - * @range: address range considered by this decoder
> + * @res: address space resources considered by this decoder
>   * @interleave_ways: number of cxl_dports in this decode
>   * @interleave_granularity: data stride per dport
>   * @target_type: accelerator vs expander (type2 vs type3) selector
> @@ -206,7 +206,7 @@ enum cxl_decoder_type {
>  struct cxl_decoder {
>         struct device dev;
>         int id;
> -       struct range range;
> +       struct resource res;
>         int interleave_ways;
>         int interleave_granularity;
>         enum cxl_decoder_type target_type;
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders
  2021-10-22 18:36 ` [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders Ben Widawsky
@ 2021-10-29 21:00   ` Dan Williams
  2021-10-29 22:02     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-29 21:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Endpoints have decoders too. It is useful to share the same
> infrastructure from cxl_core. Endpoints do not have dports (downstream
> targets), only the underlying physical medium. As a result, some special
> casing is needed.
>
> There is no functional change introduced yet as endpoints don't actually
> enumerate decoders yet.

The nr_targets type change either needs to be commented on here, or
moved to its own patch.

Otherwise, this looks good.

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/core/bus.c | 40 ++++++++++++++++++++++++++++++++--------
>  drivers/cxl/cxl.h      |  3 ++-
>  2 files changed, 34 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 454d4d846eb2..5564a71773e2 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -175,6 +175,12 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
>         NULL,
>  };
>
> +static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
> +       &cxl_decoder_base_attribute_group,
> +       &cxl_base_attribute_group,
> +       NULL,
> +};
> +
>  static void cxl_decoder_release(struct device *dev)
>  {
>         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> @@ -184,6 +190,12 @@ static void cxl_decoder_release(struct device *dev)
>         kfree(cxld);
>  }
>
> +static const struct device_type cxl_decoder_endpoint_type = {
> +       .name = "cxl_decoder_endpoint",
> +       .release = cxl_decoder_release,
> +       .groups = cxl_decoder_endpoint_attribute_groups,
> +};
> +
>  static const struct device_type cxl_decoder_switch_type = {
>         .name = "cxl_decoder_switch",
>         .release = cxl_decoder_release,
> @@ -196,6 +208,11 @@ static const struct device_type cxl_decoder_root_type = {
>         .groups = cxl_decoder_root_attribute_groups,
>  };
>
> +static bool is_endpoint_decoder(struct device *dev)
> +{
> +       return dev->type == &cxl_decoder_endpoint_type;
> +}
> +
>  bool is_root_decoder(struct device *dev)
>  {
>         return dev->type == &cxl_decoder_root_type;
> @@ -483,7 +500,8 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
>         return rc;
>  }
>
> -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> +                                     unsigned int nr_targets)
>  {
>         struct cxl_decoder *cxld, cxld_const_init = {
>                 .nr_targets = nr_targets,
> @@ -491,7 +509,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>         struct device *dev;
>         int rc = 0;
>
> -       if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> +       if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
>                 return ERR_PTR(-EINVAL);
>
>         cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> @@ -510,8 +528,11 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
>         dev->parent = &port->dev;
>         dev->bus = &cxl_bus_type;
>
> +       /* Endpoints don't have a target list */
> +       if (nr_targets == 0)
> +               dev->type = &cxl_decoder_endpoint_type;
>         /* root ports do not have a cxl_port_type parent */
> -       if (port->dev.parent->type == &cxl_port_type)
> +       else if (port->dev.parent->type == &cxl_port_type)
>                 dev->type = &cxl_decoder_switch_type;
>         else
>                 dev->type = &cxl_decoder_root_type;
> @@ -538,12 +559,15 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>         if (cxld->interleave_ways < 1)
>                 return -EINVAL;
>
> -       port = to_cxl_port(cxld->dev.parent);
> -       rc = decoder_populate_targets(cxld, port, target_map);
> -       if (rc)
> -               return rc;
> -
>         dev = &cxld->dev;
> +
> +       port = to_cxl_port(cxld->dev.parent);
> +       if (!is_endpoint_decoder(dev)) {
> +               rc = decoder_populate_targets(cxld, port, target_map);
> +               if (rc)
> +                       return rc;
> +       }
> +
>         rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
>         if (rc)
>                 return rc;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 7f2e2bdc7883..91b8fd54bc93 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -293,7 +293,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> +                                     unsigned int nr_targets);
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
>
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI
  2021-10-29 20:15   ` Dan Williams
@ 2021-10-29 21:20     ` Ben Widawsky
  2021-10-29 21:39       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-29 21:20 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 13:15:46, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > With the upcoming introduction of a driver to control the non-PCI
> > aspects of CXL.mem, such as interleave set creation and configuration,
> > there will be an opportunity to disconnection control over CXL device
> 
> s/disconnection/disconnect/
> 
> > memory and CXL device manageability. CXL device manageability is
> > implemented by the cxl_pci driver. Doing this rename allows the CXL
> > memory driver to be enabled by a new config option independently of CXL
> > device manageability through CXL.io/PCI mechanisms.
> 
> That comes across a bit hard to parse to me, how about:
> 
> "The cxl_mem module was renamed cxl_pci in commit 21e9f76733a8 ("cxl:
> Rename mem to pci"). In preparation for adding an ancillary driver for
> cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead
> and rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Tree up the CXL_MEM name

I assume you meant s/tree/tee - right?

> for that new driver to manage generic CXL.mem endpoint operations."
> 
> > Suggested-by: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/Kconfig  | 13 ++++++-------
> >  drivers/cxl/Makefile |  2 +-
> >  2 files changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index e6de221cc568..23773d0ac896 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -13,14 +13,13 @@ menuconfig CXL_BUS
> >
> >  if CXL_BUS
> >
> > -config CXL_MEM
> > -       tristate "CXL.mem: Memory Devices"
> > +config CXL_PCI
> > +       tristate "PCI manageability"
> 
> s/PCI manageability/CXL Memory Device: PCI Operations/
> 
> ...as I don't think an end user reading "PCI Manageability" would know
> that it supports basic memory expander enumeration and mailbox
> operations.
> 
> >         default CXL_BUS
> >         help
> > -         The CXL.mem protocol allows a device to act as a provider of
> > -         "System RAM" and/or "Persistent Memory" that is fully coherent
> > -         as if the memory was attached to the typical CPU memory
> > -         controller.
> > +         The CXL specification defines a set of interfaces which are controlled
> > +         through well known PCI configuration mechanisms. Such access is
> > +         referred to CXL.io in the specification.
> 
> The CXL specification defines a "CXL memory device" sub-class in the
> PCI "memory controller" base class of devices. Device's identified by
> this class code provide support for volatile and / or persistent
> memory to be mapped into the system address map (Host-managed Device
> Memory (HDM)).
> 
> >
> >           Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> 
> This would need updating too...
> 
> s/CXL.mem devices /generic CXL memory expanders identified by the
> memory device class code/
> 
> ...to reduce confusion about this driver for generic type-3 vs vendor
> specific type-2 devices that also support CXL.mem
> 

Overall seems like an improvement to me. I'm not too fond of the word "generic"
though, both here and in the commit message. I think they work equally well just
deleting that word.

> >           configuration and management primarily via the mailbox interface. See
> > @@ -31,7 +30,7 @@ config CXL_MEM
> >
> >  config CXL_MEM_RAW_COMMANDS
> >         bool "RAW Command Interface for Memory Devices"
> > -       depends on CXL_MEM
> > +       depends on CXL_PCI
> >         help
> >           Enable CXL RAW command interface.
> >
> > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > index d1aaabc940f3..cf07ae6cea17 100644
> > --- a/drivers/cxl/Makefile
> > +++ b/drivers/cxl/Makefile
> > @@ -1,6 +1,6 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >  obj-$(CONFIG_CXL_BUS) += core/
> > -obj-$(CONFIG_CXL_MEM) += cxl_pci.o
> > +obj-$(CONFIG_CXL_PCI) += cxl_pci.o
> >  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> >  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> >
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 02/28] cxl: Move register block enumeration to core
  2021-10-29 20:23   ` Dan Williams
@ 2021-10-29 21:23     ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-10-29 21:23 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 13:23:55, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > CXL drivers or cxl_core itself will require the ability to map component
> > registers in order to map HDM decoder resources amongst other things.
> 
> The PCI bits of this are only needed for switch support, right?
> Everything else already knows how to find the component_reg_phys base
> before devm_cxl_add_port().
> 
> It would help if this first sentence was a specific example where the
> PCI helpers are needed outside of cxl_pci.
> 

You've already spotted the other use in the next commit. I can change "CXL
drivers" to the specific drivers and related components.

> > Much of the register mapping code has already moved into cxl_core. The
> > code to pull the BAR number and block office remained within cxl_pci
> > because there was no need to move it. Upcoming work will require this
> > functionality to be available outside of cxl_pci.
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/core/regs.c | 54 +++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h       |  4 +++
> >  drivers/cxl/pci.c       | 52 ---------------------------------------
> >  3 files changed, 58 insertions(+), 52 deletions(-)
> >
> > diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> > index 41de4a136ecd..40598905c080 100644
> > --- a/drivers/cxl/core/regs.c
> > +++ b/drivers/cxl/core/regs.c
> > @@ -5,6 +5,7 @@
> >  #include <linux/slab.h>
> >  #include <linux/pci.h>
> >  #include <cxlmem.h>
> > +#include <pci.h>
> >
> >  /**
> >   * DOC: cxl registers
> > @@ -247,3 +248,56 @@ int cxl_map_device_regs(struct pci_dev *pdev,
> >         return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(cxl_map_device_regs);
> > +
> > +static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> > +                               struct cxl_register_map *map)
> > +{
> > +       map->block_offset =
> > +               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> > +       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> > +       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> > +}
> > +
> > +/**
> > + * cxl_find_regblock() - Locate register blocks by type
> > + * @pdev: The CXL PCI device to enumerate.
> > + * @type: Register Block Indicator id
> > + * @map: Enumeration output, clobbered on error
> > + *
> > + * Return: 0 if register block enumerated, negative error code otherwise
> > + *
> > + * A CXL DVSEC may additional point one or more register blocks, search
> > + * for them by @type.
> > + */
> > +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> > +                     struct cxl_register_map *map)
> > +{
> > +       u32 regloc_size, regblocks;
> > +       int regloc, i;
> > +
> > +       regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > +                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> > +       if (!regloc)
> > +               return -ENXIO;
> > +
> > +       pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> > +       regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > +
> > +       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> > +       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> > +
> > +       for (i = 0; i < regblocks; i++, regloc += 8) {
> > +               u32 reg_lo, reg_hi;
> > +
> > +               pci_read_config_dword(pdev, regloc, &reg_lo);
> > +               pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> > +
> > +               cxl_decode_regblock(reg_lo, reg_hi, map);
> > +
> > +               if (map->reg_type == type)
> > +                       return 0;
> > +       }
> > +
> > +       return -ENODEV;
> > +}
> > +EXPORT_SYMBOL_GPL(cxl_find_regblock);
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 7cd16ef144dd..f06c596fad71 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -161,6 +161,10 @@ int cxl_map_device_regs(struct pci_dev *pdev,
> >                         struct cxl_device_regs *regs,
> >                         struct cxl_register_map *map);
> >
> > +enum cxl_regloc_type;
> > +int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> > +                     struct cxl_register_map *map);
> > +
> >  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
> >  #define CXL_TARGET_STRLEN 20
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index f19a06809079..d1adc759d051 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -400,58 +400,6 @@ static int cxl_map_regs(struct cxl_mem *cxlm, struct cxl_register_map *map)
> >         return 0;
> >  }
> >
> > -static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> > -                               struct cxl_register_map *map)
> > -{
> > -       map->block_offset =
> > -               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> > -       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> > -       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> > -}
> > -
> > -/**
> > - * cxl_find_regblock() - Locate register blocks by type
> > - * @pdev: The CXL PCI device to enumerate.
> > - * @type: Register Block Indicator id
> > - * @map: Enumeration output, clobbered on error
> > - *
> > - * Return: 0 if register block enumerated, negative error code otherwise
> > - *
> > - * A CXL DVSEC may additional point one or more register blocks, search
> > - * for them by @type.
> > - */
> > -static int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> > -                            struct cxl_register_map *map)
> > -{
> > -       u32 regloc_size, regblocks;
> > -       int regloc, i;
> > -
> > -       regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > -                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> > -       if (!regloc)
> > -               return -ENXIO;
> > -
> > -       pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> > -       regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > -
> > -       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> > -       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> > -
> > -       for (i = 0; i < regblocks; i++, regloc += 8) {
> > -               u32 reg_lo, reg_hi;
> > -
> > -               pci_read_config_dword(pdev, regloc, &reg_lo);
> > -               pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> > -
> > -               cxl_decode_regblock(reg_lo, reg_hi, map);
> > -
> > -               if (map->reg_type == type)
> > -                       return 0;
> > -       }
> > -
> > -       return -ENODEV;
> > -}
> > -
> >  static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
> >                           struct cxl_register_map *map)
> >  {
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource
  2021-10-29 20:50   ` Dan Williams
@ 2021-10-29 21:26     ` Ben Widawsky
  2021-10-29 22:22       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-29 21:26 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 13:50:29, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > Regions will use the resource API in order to help manage allocated
> > space. As regions are children of the decoder, it makes sense that the
> > parent host the main resource to be suballocated by the region.
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/acpi.c     | 12 ++++--------
> >  drivers/cxl/core/bus.c |  4 ++--
> >  drivers/cxl/cxl.h      |  4 ++--
> >  3 files changed, 8 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 7d13e7f0aefc..b972abc9f6ef 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -126,10 +126,9 @@ static void cxl_add_cfmws_decoders(struct device *dev,
> >
> >                 cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
> >                 cxld->target_type = CXL_DECODER_EXPANDER;
> > -               cxld->range = (struct range) {
> > -                       .start = cfmws->base_hpa,
> > -                       .end = cfmws->base_hpa + cfmws->window_size - 1,
> > -               };
> > +               cxld->res = (struct resource)DEFINE_RES_MEM_NAMED(cfmws->base_hpa,
> > +                                                                 cfmws->window_size,
> > +                                                                 "cfmws");
> 
> I think this should just be DEFINE_RES_MEM(), and then set the name of
> it inside cxl_decoder_add() to the dev_name() of the decoder device.
> That way a dump of the resource tree hierarchy makes sense compared to
> the device hierarchy with actual device names not a series of
> repeating "cfmws" entries.
> 
> >                 cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
> >                 cxld->interleave_granularity =
> >                         CFMWS_INTERLEAVE_GRANULARITY(cfmws);
> > @@ -339,10 +338,7 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >         cxld->interleave_ways = 1;
> >         cxld->interleave_granularity = PAGE_SIZE;
> >         cxld->target_type = CXL_DECODER_EXPANDER;
> > -       cxld->range = (struct range) {
> > -               .start = 0,
> > -               .end = -1,
> > -       };
> > +       cxld->res = (struct resource)DEFINE_RES_MEM(0, 0);
> >
> >         device_lock(&port->dev);
> >         dport = list_first_entry(&port->dports, typeof(*dport), list);
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index ebd061d03950..454d4d846eb2 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -47,7 +47,7 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
> >  {
> >         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> >
> > -       return sysfs_emit(buf, "%#llx\n", cxld->range.start);
> > +       return sysfs_emit(buf, "%#llx\n", cxld->res.start);
> >  }
> >  static DEVICE_ATTR_RO(start);
> >
> > @@ -56,7 +56,7 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
> >  {
> >         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> >
> > -       return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range));
> > +       return sysfs_emit(buf, "%#llx\n", resource_size(&cxld->res));
> 
> It's not clear to me that anything other than root decoders will host
> a resource tree, every decoder downstream of the root would reference
> a subset of the root decoder range. Something like:
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5e2e93451928..00bf742396e0 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -200,7 +200,8 @@ enum cxl_decoder_type {
>   * struct cxl_decoder - CXL address range decode configuration
>   * @dev: this decoder's device
>   * @id: kernel device name id
> - * @range: address range considered by this decoder
> + * @res: platform address range hosted by a root decoder
> + * @range: address range programmed in the (non-root) decoder
>   * @interleave_ways: number of cxl_dports in this decode
>   * @interleave_granularity: data stride per dport
>   * @target_type: accelerator vs expander (type2 vs type3) selector
> @@ -211,7 +212,10 @@ enum cxl_decoder_type {
>  struct cxl_decoder {
>         struct device dev;
>         int id;
> -       struct range range;
> +       union {
> +               struct resource res;
> +               struct range range;
> +       };
>         int interleave_ways;
>         int interleave_granularity;
>         enum cxl_decoder_type target_type;
> 
> ...because regions will __request_region(&cxld->res, ...) from the
> root decoder, and all the intervening decoders in that stack will be
> updated to make that mapping happen.
> 
> Note, this is just how I had it roughly mapped out in my head, I defer
> making it a hard recommendation until I get deeper into this set to
> see if we diverge.
> 

You're correct.

I had changed this already for my v3. I ended up with this:
+       union {
+               struct resource cfmws_res;
+               struct resource *decoder_res;
+       };

Not sure if you prefer one way or another.

> >  }
> >  static DEVICE_ATTR_RO(size);
> >
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 4483e1a39fc3..7f2e2bdc7883 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -195,7 +195,7 @@ enum cxl_decoder_type {
> >   * struct cxl_decoder - CXL address range decode configuration
> >   * @dev: this decoder's device
> >   * @id: kernel device name id
> > - * @range: address range considered by this decoder
> > + * @res: address space resources considered by this decoder
> >   * @interleave_ways: number of cxl_dports in this decode
> >   * @interleave_granularity: data stride per dport
> >   * @target_type: accelerator vs expander (type2 vs type3) selector
> > @@ -206,7 +206,7 @@ enum cxl_decoder_type {
> >  struct cxl_decoder {
> >         struct device dev;
> >         int id;
> > -       struct range range;
> > +       struct resource res;
> >         int interleave_ways;
> >         int interleave_granularity;
> >         enum cxl_decoder_type target_type;
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI
  2021-10-29 21:20     ` Ben Widawsky
@ 2021-10-29 21:39       ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-29 21:39 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 29, 2021 at 2:20 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-29 13:15:46, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > With the upcoming introduction of a driver to control the non-PCI
> > > aspects of CXL.mem, such as interleave set creation and configuration,
> > > there will be an opportunity to disconnection control over CXL device
> >
> > s/disconnection/disconnect/
> >
> > > memory and CXL device manageability. CXL device manageability is
> > > implemented by the cxl_pci driver. Doing this rename allows the CXL
> > > memory driver to be enabled by a new config option independently of CXL
> > > device manageability through CXL.io/PCI mechanisms.
> >
> > That comes across a bit hard to parse to me, how about:
> >
> > "The cxl_mem module was renamed cxl_pci in commit 21e9f76733a8 ("cxl:
> > Rename mem to pci"). In preparation for adding an ancillary driver for
> > cxl_memdev devices (registered on the cxl bus by cxl_pci), go ahead
> > and rename CONFIG_CXL_MEM to CONFIG_CXL_PCI. Tree up the CXL_MEM name
>
> I assume you meant s/tree/tee - right?

Sorry, that should have been "Free"

>
> > for that new driver to manage generic CXL.mem endpoint operations."
> >
> > > Suggested-by: Dan Williams <dan.j.williams@intel.com>
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  drivers/cxl/Kconfig  | 13 ++++++-------
> > >  drivers/cxl/Makefile |  2 +-
> > >  2 files changed, 7 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > index e6de221cc568..23773d0ac896 100644
> > > --- a/drivers/cxl/Kconfig
> > > +++ b/drivers/cxl/Kconfig
> > > @@ -13,14 +13,13 @@ menuconfig CXL_BUS
> > >
> > >  if CXL_BUS
> > >
> > > -config CXL_MEM
> > > -       tristate "CXL.mem: Memory Devices"
> > > +config CXL_PCI
> > > +       tristate "PCI manageability"
> >
> > s/PCI manageability/CXL Memory Device: PCI Operations/
> >
> > ...as I don't think an end user reading "PCI Manageability" would know
> > that it supports basic memory expander enumeration and mailbox
> > operations.
> >
> > >         default CXL_BUS
> > >         help
> > > -         The CXL.mem protocol allows a device to act as a provider of
> > > -         "System RAM" and/or "Persistent Memory" that is fully coherent
> > > -         as if the memory was attached to the typical CPU memory
> > > -         controller.
> > > +         The CXL specification defines a set of interfaces which are controlled
> > > +         through well known PCI configuration mechanisms. Such access is
> > > +         referred to CXL.io in the specification.
> >
> > The CXL specification defines a "CXL memory device" sub-class in the
> > PCI "memory controller" base class of devices. Device's identified by
> > this class code provide support for volatile and / or persistent
> > memory to be mapped into the system address map (Host-managed Device
> > Memory (HDM)).
> >
> > >
> > >           Say 'y/m' to enable a driver that will attach to CXL.mem devices for
> >
> > This would need updating too...
> >
> > s/CXL.mem devices /generic CXL memory expanders identified by the
> > memory device class code/
> >
> > ...to reduce confusion about this driver for generic type-3 vs vendor
> > specific type-2 devices that also support CXL.mem
> >
>
> Overall seems like an improvement to me. I'm not too fond of the word "generic"
> though, both here and in the commit message. I think they work equally well just
> deleting that word.

Sure, and it's redundant with the mention of "class code".

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders
  2021-10-29 21:00   ` Dan Williams
@ 2021-10-29 22:02     ` Ben Widawsky
  2021-10-29 22:25       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-29 22:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 14:00:21, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > Endpoints have decoders too. It is useful to share the same
> > infrastructure from cxl_core. Endpoints do not have dports (downstream
> > targets), only the underlying physical medium. As a result, some special
> > casing is needed.
> >
> > There is no functional change introduced yet as endpoints don't actually
> > enumerate decoders yet.
> 
> The nr_targets type change either needs to be commented on here, or
> moved to its own patch.

Do you have a preference?

> 
> Otherwise, this looks good.
> 
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/core/bus.c | 40 ++++++++++++++++++++++++++++++++--------
> >  drivers/cxl/cxl.h      |  3 ++-
> >  2 files changed, 34 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 454d4d846eb2..5564a71773e2 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -175,6 +175,12 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
> >         NULL,
> >  };
> >
> > +static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = {
> > +       &cxl_decoder_base_attribute_group,
> > +       &cxl_base_attribute_group,
> > +       NULL,
> > +};
> > +
> >  static void cxl_decoder_release(struct device *dev)
> >  {
> >         struct cxl_decoder *cxld = to_cxl_decoder(dev);
> > @@ -184,6 +190,12 @@ static void cxl_decoder_release(struct device *dev)
> >         kfree(cxld);
> >  }
> >
> > +static const struct device_type cxl_decoder_endpoint_type = {
> > +       .name = "cxl_decoder_endpoint",
> > +       .release = cxl_decoder_release,
> > +       .groups = cxl_decoder_endpoint_attribute_groups,
> > +};
> > +
> >  static const struct device_type cxl_decoder_switch_type = {
> >         .name = "cxl_decoder_switch",
> >         .release = cxl_decoder_release,
> > @@ -196,6 +208,11 @@ static const struct device_type cxl_decoder_root_type = {
> >         .groups = cxl_decoder_root_attribute_groups,
> >  };
> >
> > +static bool is_endpoint_decoder(struct device *dev)
> > +{
> > +       return dev->type == &cxl_decoder_endpoint_type;
> > +}
> > +
> >  bool is_root_decoder(struct device *dev)
> >  {
> >         return dev->type == &cxl_decoder_root_type;
> > @@ -483,7 +500,8 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
> >         return rc;
> >  }
> >
> > -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> > +                                     unsigned int nr_targets)
> >  {
> >         struct cxl_decoder *cxld, cxld_const_init = {
> >                 .nr_targets = nr_targets,
> > @@ -491,7 +509,7 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> >         struct device *dev;
> >         int rc = 0;
> >
> > -       if (nr_targets > CXL_DECODER_MAX_INTERLEAVE || nr_targets < 1)
> > +       if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
> >                 return ERR_PTR(-EINVAL);
> >
> >         cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
> > @@ -510,8 +528,11 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets)
> >         dev->parent = &port->dev;
> >         dev->bus = &cxl_bus_type;
> >
> > +       /* Endpoints don't have a target list */
> > +       if (nr_targets == 0)
> > +               dev->type = &cxl_decoder_endpoint_type;
> >         /* root ports do not have a cxl_port_type parent */
> > -       if (port->dev.parent->type == &cxl_port_type)
> > +       else if (port->dev.parent->type == &cxl_port_type)
> >                 dev->type = &cxl_decoder_switch_type;
> >         else
> >                 dev->type = &cxl_decoder_root_type;
> > @@ -538,12 +559,15 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> >         if (cxld->interleave_ways < 1)
> >                 return -EINVAL;
> >
> > -       port = to_cxl_port(cxld->dev.parent);
> > -       rc = decoder_populate_targets(cxld, port, target_map);
> > -       if (rc)
> > -               return rc;
> > -
> >         dev = &cxld->dev;
> > +
> > +       port = to_cxl_port(cxld->dev.parent);
> > +       if (!is_endpoint_decoder(dev)) {
> > +               rc = decoder_populate_targets(cxld, port, target_map);
> > +               if (rc)
> > +                       return rc;
> > +       }
> > +
> >         rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
> >         if (rc)
> >                 return rc;
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 7f2e2bdc7883..91b8fd54bc93 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -293,7 +293,8 @@ int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
> >
> >  struct cxl_decoder *to_cxl_decoder(struct device *dev);
> >  bool is_root_decoder(struct device *dev);
> > -struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, int nr_targets);
> > +struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
> > +                                     unsigned int nr_targets);
> >  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> >  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> >
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource
  2021-10-29 21:26     ` Ben Widawsky
@ 2021-10-29 22:22       ` Dan Williams
  2021-10-29 22:37         ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-29 22:22 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 29, 2021 at 2:27 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> > Note, this is just how I had it roughly mapped out in my head, I defer
> > making it a hard recommendation until I get deeper into this set to
> > see if we diverge.
> >
>
> You're correct.
>
> I had changed this already for my v3. I ended up with this:
> +       union {
> +               struct resource cfmws_res;

"cfmws" is something that will only appear on ACPI based CXL systems,
so if it needs a prefix it should be something that does not preclude
CXL on other platform firmware architectures. So, "platform_" or
"root_"?

> +               struct resource *decoder_res;

So the region device is going to be doing the __request_region() from
@cfmws_res, not decoders, so why is this a resource and not a range?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders
  2021-10-29 22:02     ` Ben Widawsky
@ 2021-10-29 22:25       ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-29 22:25 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 29, 2021 at 3:03 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-29 14:00:21, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > Endpoints have decoders too. It is useful to share the same
> > > infrastructure from cxl_core. Endpoints do not have dports (downstream
> > > targets), only the underlying physical medium. As a result, some special
> > > casing is needed.
> > >
> > > There is no functional change introduced yet as endpoints don't actually
> > > enumerate decoders yet.
> >
> > The nr_targets type change either needs to be commented on here, or
> > moved to its own patch.
>
> Do you have a preference?

Slight preference for separate patch.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource
  2021-10-29 22:22       ` Dan Williams
@ 2021-10-29 22:37         ` Ben Widawsky
  2021-11-01 14:33           ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-10-29 22:37 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 15:22:19, Dan Williams wrote:
> On Fri, Oct 29, 2021 at 2:27 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> [..]
> > > Note, this is just how I had it roughly mapped out in my head, I defer
> > > making it a hard recommendation until I get deeper into this set to
> > > see if we diverge.
> > >
> >
> > You're correct.
> >
> > I had changed this already for my v3. I ended up with this:
> > +       union {
> > +               struct resource cfmws_res;
> 
> "cfmws" is something that will only appear on ACPI based CXL systems,
> so if it needs a prefix it should be something that does not preclude
> CXL on other platform firmware architectures. So, "platform_" or
> "root_"?

Sure. Platform_ is better.

> 
> > +               struct resource *decoder_res;
> 
> So the region device is going to be doing the __request_region() from
> @cfmws_res, not decoders, so why is this a resource and not a range?

Well, sort of. Most of v3 was to ultimately go with the decoders doing the
__request_region, although thinking about this further that will not work since
multiple decoders might want the same ranges of the parent resource.

So I guess it has to be region. I can go back to renaming it res in that case.
You're correct that the region device should own doing the __request_region.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 07/28] cxl/core: Move target population locking to caller
  2021-10-22 18:36 ` [RFC PATCH v2 07/28] cxl/core: Move target population locking to caller Ben Widawsky
@ 2021-10-29 23:03   ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-29 23:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> A CXL port driver which would be responsible for adding decoders will
> want to do so during probe.

"In preparation for a port driver that enumerates a descendant port +
decoder hierarchy, arrange for an unlocked version of
cxl_decoder_add(). Otherwise a port-driver that adds a child decoder
will deadlock on the device_lock() in ->probe()."

...although when I write it that way I would expect to see
cxl_decoder_add_locked() in this patch.

The port driver patch is big enough already, so pulling the API
addition into this patch would be my preference.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-22 18:36 ` [RFC PATCH v2 08/28] cxl/port: Introduce a port driver Ben Widawsky
@ 2021-10-30  1:37   ` Dan Williams
  2021-10-31 17:53     ` Dan Williams
                       ` (2 more replies)
  2021-11-03 15:18   ` Jonathan Cameron
  1 sibling, 3 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-30  1:37 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> The CXL port driver will be responsible for managing the decoder

Small quibble to use present tense, the future is now with this patch!

> resources contained within the port. It will also provide APIs that
> other drivers will consume for managing these resources.
>
> Since the port driver is responsible for instantiating new decoders, and
> it does so during probe(), a new API is needed to add decoders for
> callers which already hold the device lock of the port.
>
> This patch has no functional change because no driver is registering new
> ports and the root ports that are already registered should be skipped.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  .../driver-api/cxl/memory-devices.rst         |   5 +
>  drivers/cxl/Makefile                          |   2 +
>  drivers/cxl/acpi.c                            |   1 +
>  drivers/cxl/core/bus.c                        |  44 +++-
>  drivers/cxl/core/regs.c                       |   6 +-
>  drivers/cxl/cxl.h                             |  31 ++-
>  drivers/cxl/port.c                            | 242 ++++++++++++++++++
>  7 files changed, 315 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/cxl/port.c
>
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index 3b8f41395f6b..fbf0393cdddc 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -28,6 +28,11 @@ CXL Memory Device
>  .. kernel-doc:: drivers/cxl/pci.c
>     :internal:
>
> +CXL Port
> +--------
> +.. kernel-doc:: drivers/cxl/port.c
> +   :doc: cxl port
> +
>  CXL Core
>  --------
>  .. kernel-doc:: drivers/cxl/cxl.h
> diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> index cf07ae6cea17..40b386aaedf7 100644
> --- a/drivers/cxl/Makefile
> +++ b/drivers/cxl/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_CXL_BUS) += core/
> +obj-$(CONFIG_CXL_MEM) += cxl_port.o

It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?

>  obj-$(CONFIG_CXL_PCI) += cxl_pci.o
>  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
>  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> @@ -7,3 +8,4 @@ obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
>  cxl_pci-y := pci.o
>  cxl_acpi-y := acpi.o
>  cxl_pmem-y := pmem.o
> +cxl_port-y := port.o
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index b972abc9f6ef..d61397055e9f 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -491,3 +491,4 @@ static struct platform_driver cxl_acpi_driver = {
>  module_platform_driver(cxl_acpi_driver);
>  MODULE_LICENSE("GPL v2");
>  MODULE_IMPORT_NS(CXL);
> +MODULE_SOFTDEP("pre: cxl_port");

Why does cxl_acpi depend on cxl_port being loaded at this point? I
think this wants to wait for a future patch where cxl_acpi uses port
driver services.

> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 5f2bde47a5c2..dffbd0ac64af 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -266,6 +266,7 @@ struct cxl_port *to_cxl_port(struct device *dev)
>                 return NULL;
>         return container_of(dev, struct cxl_port, dev);
>  }
> +EXPORT_SYMBOL_GPL(to_cxl_port);
>
>  static void unregister_port(void *_port)
>  {
> @@ -537,11 +538,21 @@ struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>  }
>  EXPORT_SYMBOL_GPL(cxl_decoder_alloc);
>
> -int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> +/**
> + * cxl_decoder_add_locked() - Adds a decoder with port locked.
> + * @cxld: Decoder to be added
> + * @target_map: Downstream port array of port ids
> + *
> + * The parent port owning the @cxld must have its device mutex locked before
> + * calling this function.
> + *
> + * Returns 0 if successful; otherwise, a negative error code is returned.
> + */
> +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
>  {
>         struct cxl_port *port;
>         struct device *dev;
> -       int rc = 0;
> +       int rc;
>
>         if (WARN_ON_ONCE(!cxld))
>                 return -EINVAL;
> @@ -556,12 +567,11 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>
>         port = to_cxl_port(cxld->dev.parent);
>
> -       device_lock(&port->dev);
> -       if (!is_endpoint_decoder(dev))
> +       if (!is_endpoint_decoder(dev)) {
>                 rc = decoder_populate_targets(cxld, port, target_map);
> -       device_unlock(&port->dev);
> -       if (rc)
> -               return rc;
> +               if (rc)
> +                       return rc;
> +       }

Yeah, this hunk is proof that the api change belongs in the previous patch.

>
>         rc = dev_set_name(dev, "decoder%d.%d", port->id, cxld->id);
>         if (rc)
> @@ -569,6 +579,24 @@ int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
>
>         return device_add(dev);
>  }
> +EXPORT_SYMBOL_GPL(cxl_decoder_add_locked);
> +
> +int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map)
> +{
> +       int rc;
> +
> +       if (WARN_ON_ONCE(!cxld))
> +               return -EINVAL;
> +
> +       if (WARN_ON_ONCE(IS_ERR(cxld)))
> +               return PTR_ERR(cxld);
> +
> +       device_lock(&to_cxl_port(cxld->dev.parent)->dev);
> +       rc = cxl_decoder_add_locked(cxld, target_map);
> +       device_unlock(&to_cxl_port(cxld->dev.parent)->dev);

I don't like that this converts the parent device twice, especially
since to_cxl_port() does extra work for type validation, so please add
a local @port variable.

> +
> +       return rc;
> +}
>  EXPORT_SYMBOL_GPL(cxl_decoder_add);
>
>  static void cxld_unregister(void *dev)
> @@ -627,6 +655,8 @@ static int cxl_device_id(struct device *dev)
>                 return CXL_DEVICE_NVDIMM_BRIDGE;
>         if (dev->type == &cxl_nvdimm_type)
>                 return CXL_DEVICE_NVDIMM;
> +       if (dev->type == &cxl_port_type)
> +               return CXL_DEVICE_PORT;
>         return 0;
>  }
>
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index 40598905c080..c8ab8880b81b 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -159,9 +159,8 @@ void cxl_probe_device_regs(struct device *dev, void __iomem *base,
>  }
>  EXPORT_SYMBOL_GPL(cxl_probe_device_regs);
>
> -static void __iomem *devm_cxl_iomap_block(struct device *dev,
> -                                         resource_size_t addr,
> -                                         resource_size_t length)
> +void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> +                                  resource_size_t length)
>  {
>         void __iomem *ret_val;
>         struct resource *res;
> @@ -180,6 +179,7 @@ static void __iomem *devm_cxl_iomap_block(struct device *dev,
>
>         return ret_val;
>  }
> +EXPORT_SYMBOL_GPL(devm_cxl_iomap_block);

I think it's well past time to convert all the EXPORT_SYMBOL_GPL in
the core to EXPORT_SYMBOL_NS_GPL($symbol, CXL).

>
>  int cxl_map_component_regs(struct pci_dev *pdev,
>                            struct cxl_component_regs *regs,
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 91b8fd54bc93..ad22caf9135c 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -17,6 +17,9 @@
>   * (port-driver, region-driver, nvdimm object-drivers... etc).
>   */
>
> +/* CXL 2.0 8.2.4 CXL Component Register Layout and Definition */
> +#define CXL_COMPONENT_REG_BLOCK_SIZE SZ_64K
> +
>  /* CXL 2.0 8.2.5 CXL.cache and CXL.mem Registers*/
>  #define CXL_CM_OFFSET 0x1000
>  #define CXL_CM_CAP_HDR_OFFSET 0x0
> @@ -36,11 +39,22 @@
>  #define CXL_HDM_DECODER_CAP_OFFSET 0x0
>  #define   CXL_HDM_DECODER_COUNT_MASK GENMASK(3, 0)
>  #define   CXL_HDM_DECODER_TARGET_COUNT_MASK GENMASK(7, 4)
> -#define CXL_HDM_DECODER0_BASE_LOW_OFFSET 0x10
> -#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET 0x14
> -#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET 0x18
> -#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET 0x1c
> -#define CXL_HDM_DECODER0_CTRL_OFFSET 0x20
> +#define   CXL_HDM_DECODER_INTERLEAVE_11_8 BIT(8)
> +#define   CXL_HDM_DECODER_INTERLEAVE_14_12 BIT(9)
> +#define CXL_HDM_DECODER_CTRL_OFFSET 0x4
> +#define   CXL_HDM_DECODER_ENABLE BIT(1)
> +#define CXL_HDM_DECODER0_BASE_LOW_OFFSET(i) (0x20 * (i) + 0x10)
> +#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i) (0x20 * (i) + 0x14)
> +#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i) (0x20 * (i) + 0x18)
> +#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i) (0x20 * (i) + 0x1c)
> +#define CXL_HDM_DECODER0_CTRL_OFFSET(i) (0x20 * (i) + 0x20)
> +#define   CXL_HDM_DECODER0_CTRL_IG_MASK GENMASK(3, 0)
> +#define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
> +#define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
> +#define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
> +#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
> +#define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
> +#define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)
>
>  static inline int cxl_hdm_decoder_count(u32 cap_hdr)
>  {
> @@ -164,6 +178,8 @@ int cxl_map_device_regs(struct pci_dev *pdev,
>  enum cxl_regloc_type;
>  int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>                       struct cxl_register_map *map);
> +void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr,
> +                                  resource_size_t length);
>
>  #define CXL_RESOURCE_NONE ((resource_size_t) -1)
>  #define CXL_TARGET_STRLEN 20
> @@ -178,7 +194,8 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  #define CXL_DECODER_F_TYPE2 BIT(2)
>  #define CXL_DECODER_F_TYPE3 BIT(3)
>  #define CXL_DECODER_F_LOCK  BIT(4)
> -#define CXL_DECODER_F_MASK  GENMASK(4, 0)
> +#define CXL_DECODER_F_EN    BIT(5)

Spell out ENABLE.

> +#define CXL_DECODER_F_MASK  GENMASK(5, 0)
>
>  enum cxl_decoder_type {
>         CXL_DECODER_ACCELERATOR = 2,
> @@ -295,6 +312,7 @@ struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
>  struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
>                                       unsigned int nr_targets);
> +int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
>
> @@ -323,6 +341,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>
>  #define CXL_DEVICE_NVDIMM_BRIDGE       1
>  #define CXL_DEVICE_NVDIMM              2
> +#define CXL_DEVICE_PORT                        3
>
>  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
>  #define CXL_MODALIAS_FMT "cxl:t%d"
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> new file mode 100644
> index 000000000000..ebbfb72ae995
> --- /dev/null
> +++ b/drivers/cxl/port.c
> @@ -0,0 +1,242 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +
> +#include "cxlmem.h"
> +
> +/**
> + * DOC: cxl port
> + *
> + * The port driver implements the set of functionality needed to allow full
> + * decoder enumeration and routing. A CXL port is an abstraction of a CXL
> + * component that implements some amount of CXL decoding of CXL.mem traffic.
> + * As of the CXL 2.0 spec, this includes:
> + *
> + *     .. list-table:: CXL Components w/ Ports
> + *             :widths: 25 25 50
> + *             :header-rows: 1
> + *
> + *             * - component
> + *               - upstream
> + *               - downstream
> + *             * - Hostbridge
> + *               - ACPI0016
> + *               - root port
> + *             * - Switch
> + *               - Switch Upstream Port
> + *               - Switch Downstream Port
> + *             * - Endpoint (not yet implemented)
> + *               - Endpoint Port
> + *               - N/A
> + *
> + * The primary service this driver provides is enumerating HDM decoders and
> + * presenting APIs to other drivers to utilize the decoders.
> + */
> +
> +struct cxl_port_data {
> +       struct cxl_component_regs regs;
> +
> +       struct port_caps {
> +               unsigned int count;
> +               unsigned int tc;
> +               unsigned int interleave11_8;
> +               unsigned int interleave14_12;
> +       } caps;
> +};
> +
> +static inline int cxl_hdm_decoder_ig(u32 ctrl)

No need for plain inline in C files.

It's not clear why this simple helper needs a "cxl_hdm_decoder"
namespace prefix?

> +{
> +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> +
> +       return 8 + val;
> +}

Why is this return a power of 2 value...

> +
> +static inline int cxl_hdm_decoder_iw(u32 ctrl)
> +{
> +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> +
> +       return 1 << val;

...while this one is converted to absolute values.

These could just be:

unsigned int to_interleave_granularity(u32 ctrl)
unsigned int to_interleave_ways(u32 ctrl)

...and return units in bytes.

> +}
> +
> +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> +{
> +       void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> +       struct port_caps *caps = &cpd->caps;
> +       u32 hdm_cap;
> +
> +       hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> +
> +       caps->count = cxl_hdm_decoder_count(hdm_cap);
> +       caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> +       caps->interleave11_8 =
> +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> +       caps->interleave14_12 =
> +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> +}
> +
> +static int map_regs(struct cxl_port *port, void __iomem *crb,
> +                   struct cxl_port_data *cpd)
> +{
> +       struct cxl_register_map map;
> +       struct cxl_component_reg_map *comp_map = &map.component_map;
> +
> +       cxl_probe_component_regs(&port->dev, crb, comp_map);
> +       if (!comp_map->hdm_decoder.valid) {
> +               dev_err(&port->dev, "HDM decoder registers invalid\n");
> +               return -ENXIO;
> +       }

Perhaps promote cxl_probe_regs() from the cxl_pci to the core and make
it take a dev instead of a pdev, then you can do:

cxl_probe_regs(&port_dev->dev, CXL_REGLOC_RBI_COMPONENT)

...instead of open coding it again?

> +
> +       cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> +
> +       return 0;
> +}
> +
> +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> +{
> +       u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> +
> +       if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
> +               return 0;
> +
> +       return ioread64_hi_lo(hdm_decoder +
> +                             CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> +}
> +
> +static bool is_endpoint_port(struct cxl_port *port)
> +{
> +       if (!port->uport->driver)
> +               return false;
> +
> +       return to_cxl_drv(port->uport->driver)->id ==
> +              CXL_DEVICE_MEMORY_EXPANDER;

Why does endpoint port device type determination need to reach through
and read the driver type?

> +}
> +
> +static int enumerate_hdm_decoders(struct cxl_port *port,
> +                                 struct cxl_port_data *portdata)
> +{
> +       int i = 0;
> +
> +       for (i = 0; i < portdata->caps.count; i++) {
> +               int iw = 1, ig = 0, rc, target_count = portdata->caps.tc;
> +               void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
> +               enum cxl_decoder_type type = CXL_DECODER_EXPANDER;
> +               struct resource res = DEFINE_RES_MEM(0, 0);
> +               struct cxl_decoder *cxld;
> +               int *target_map = NULL;
> +               u64 size;
> +
> +               if (is_endpoint_port(port))
> +                       target_count = 0;
> +
> +               cxld = cxl_decoder_alloc(port, target_count);
> +               if (IS_ERR(cxld)) {
> +                       dev_warn(&port->dev,
> +                                "Failed to allocate the decoder\n");
> +                       return PTR_ERR(cxld);
> +               }
> +
> +               size = get_decoder_size(hdm_decoder, i);
> +               if (size != 0) {
> +                       int temp[CXL_DECODER_MAX_INTERLEAVE];
> +                       u64 target_list, base;
> +                       u32 ctrl;
> +                       int j;
> +
> +                       target_map = temp;
> +                       ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(i));
> +                       base = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_BASE_LOW_OFFSET(i));
> +                       res = (struct resource)DEFINE_RES_MEM(base, size);
> +
> +                       cxld->flags = CXL_DECODER_F_EN;
> +                       iw = cxl_hdm_decoder_iw(ctrl);
> +                       ig = cxl_hdm_decoder_ig(ctrl);
> +
> +                       if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
> +                               type = CXL_DECODER_ACCELERATOR;
> +
> +                       target_list = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_TL_LOW(i));
> +                       for (j = 0; j < iw; j++)
> +                               target_map[j] = (target_list >> (j * 8)) & 0xff;

Perhaps this can be cleaned up with:

union {
    u64 value;
    char target_id[8];
} target_list;

target_list.value = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_TL_LOW(i));
for (j = 0; j < interleave_ways; j++)
    target_map[j] = target_list.target_id[j];

> +               }
> +
> +               cxld->target_type = type;
> +               cxld->res = res;
> +               cxld->interleave_ways = iw;
> +               cxld->interleave_granularity = ig;
> +
> +               rc = cxl_decoder_add_locked(cxld, target_map);
> +               if (rc)
> +                       put_device(&cxld->dev);
> +               else
> +                       rc = cxl_decoder_autoremove(port->uport->parent, cxld);

This looks broken to me.

With the assumption that the CXL subsystem wants the property that the
entire port object hierarchy is removed when cxl_acpi is removed, and
for a subset of the hierarchy to be removed when an individual port is
disabled (detached from the cxl_port driver) then I believe we want
something like the patch at the bottom of this mail... only compile
tested:

> +               if (rc)
> +                       dev_err(&port->dev, "Failed to add decoder\n");
> +       }
> +
> +       return 0;
> +}
> +
> +static int cxl_port_probe(struct device *dev)
> +{
> +       struct cxl_port *port = to_cxl_port(dev);
> +       struct cxl_port_data *portdata;
> +       void __iomem *crb;
> +       u32 ctrl;
> +       int rc;
> +
> +       if (port->component_reg_phys == CXL_RESOURCE_NONE)
> +               return 0;
> +
> +       portdata = devm_kzalloc(dev, sizeof(*portdata), GFP_KERNEL);
> +       if (!portdata)
> +               return -ENOMEM;
> +
> +       crb = devm_cxl_iomap_block(&port->dev, port->component_reg_phys,
> +                                  CXL_COMPONENT_REG_BLOCK_SIZE);
> +       if (IS_ERR_OR_NULL(crb)) {

devm_cxl_iomap_block() does not return an ERR_PTR so this should just
be a plain NULL check.

> +               dev_err(&port->dev, "No component registers mapped\n");
> +               return -ENXIO;
> +       }
> +
> +       rc = map_regs(port, crb, portdata);
> +       if (rc)
> +               return rc;
> +
> +       get_caps(port, portdata);
> +       if (portdata->caps.count == 0) {
> +               dev_err(&port->dev, "Spec violation. Caps invalid\n");
> +               return -ENXIO;
> +       }
> +
> +       /*
> +        * Enable HDM decoders for this port.
> +        *
> +        * FIXME: If the component was using DVSEC range registers for decode,
> +        * this will destroy that.

Yeah, definitely need to check that before this patch can move
forward. Perhaps a port should not even be registered if DVSEC
Memory_Size && Mem_Enable are non zero, that device is explicitly
opting out of being a part of the CXL 2.0 subsystem hierarchy.
However, we might still need to track it and potentially reserve it
out of CFMWS capacity to make sure nothing else collides with it. I'll
also note that "ECN: Devices operating in CXL 1.1 mode with no RCRB"
was recently published that reads on what the driver should do here.

> +        */
> +       ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> +       ctrl |= CXL_HDM_DECODER_ENABLE;
> +       writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);

I feel like that if the driver finds it enabled it should leave it
enabled at ->remove() time, as you have it here, as BIOS might not be
expecting the OS to disable a decoder it set up. However, if the
driver actually does the enable, then it should pair it with a disable
at the end of time, so not a blind enable, but one that conditionally
arranges for the unwind.

> +
> +       rc = enumerate_hdm_decoders(port, portdata);
> +       if (rc) {
> +               dev_err(&port->dev, "Couldn't enumerate decoders (%d)\n", rc);
> +               return rc;
> +       }
> +
> +       dev_set_drvdata(dev, portdata);
> +       return 0;
> +}
> +
> +static struct cxl_driver cxl_port_driver = {
> +       .name = "cxl_port",
> +       .probe = cxl_port_probe,
> +       .id = CXL_DEVICE_PORT,
> +};
> +module_cxl_driver(cxl_port_driver);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_IMPORT_NS(CXL);
> +MODULE_ALIAS_CXL(CXL_DEVICE_PORT);
> --
> 2.33.1

Proposed infrastructure for a new cxl_port_decoder_autoremove():

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 3163167ecc3a..78f8313a1069 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -374,6 +374,11 @@ static int add_root_nvdimm_bridge(struct device
*match, void *data)
        return 1;
 }

+static void clear_cxl_topology_host(void *data)
+{
+       set_cxl_topology_host(NULL);
+}
+
 static int cxl_acpi_probe(struct platform_device *pdev)
 {
        int rc;
@@ -382,6 +387,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
        struct acpi_device *adev = ACPI_COMPANION(host);
        struct cxl_cfmws_context ctx;

+       set_cxl_topology_host(host);
+       rc = devm_add_action_or_reset(host, clear_cxl_topology_host, host);
+       if (rc)
+               return rc;
+
        root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
        if (IS_ERR(root_port))
                return PTR_ERR(root_port);
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
index 17a4fff029f8..3146b6aa0a2f 100644
--- a/drivers/cxl/core/bus.c
+++ b/drivers/cxl/core/bus.c
@@ -3,6 +3,7 @@
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/module.h>
+#include <linux/rwsem.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/idr.h>
@@ -563,6 +564,84 @@ int cxl_decoder_autoremove(struct device *host,
struct cxl_decoder *cxld)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, CXL);

+void trigger_decoder_autoremove(void *data)
+{
+       struct cxl_decoder *cxld = data;
+       struct device *host = get_cxl_topology_host();
+
+       /* The topology host driver beat us to the punch */
+       if (!host)
+               return;
+
+       devm_remove_action(host, cxld_unregister, &cxld->dev);
+       put_cxl_topology_host(host);
+}
+
+/**
+ * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
+ * @cxld: decoder to remove
+ *
+ * CXL.mem requires an intact port / decoder topology from root level
+ * platform decoders to endpoint decoders. Arrange for decoders
+ * enumerated by the port driver to be removed when either the root is
+ * removed (in which the entire hierarchy is removed), or when an
+ * individual port is disabled, in which case only that port's
+ * sub-section of the hierarchy is removed.
+ */
+int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
+{
+       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+       struct device *host = get_cxl_topology_host();
+       int rc;
+
+       if (!host)
+               return -ENODEV;
+
+       if (!port->dev.driver) {
+               rc = -ENXIO;
+               goto out;
+       }
+
+       rc = cxl_decoder_autoremove(host, cxld);
+       if (rc)
+               goto out;
+
+       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
+                                       cxld);
+out:
+       put_cxl_topology_host(host);
+       return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);
+
+static DECLARE_RWSEM(cxl_topology_rwsem);
+static struct device *cxl_topology_host;
+
+void set_cxl_topology_host(struct device *dev)
+{
+       down_write(&cxl_topology_rwsem);
+       cxl_topology_host = dev;
+       up_write(&cxl_topology_rwsem);
+}
+EXPORT_SYMBOL_NS_GPL(set_cxl_topology_host, CXL);
+
+struct device *get_cxl_topology_host(void)
+{
+       down_read(&cxl_topology_rwsem);
+       if (cxl_topology_host)
+               return cxl_topology_host;
+       up_read(&cxl_topology_rwsem);
+       return NULL;
+}
+EXPORT_SYMBOL_NS_GPL(get_cxl_topology_host, CXL);
+
+void put_cxl_topology_host(struct device *dev)
+{
+       WARN_ON(dev != cxl_topology_host);
+       up_read(&cxl_topology_rwsem);
+}
+EXPORT_SYMBOL_NS_GPL(put_cxl_topology_host, CXL);
+
 /**
  * __cxl_driver_register - register a driver for the cxl bus
  * @cxl_drv: cxl driver structure to attach
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5e2e93451928..bd7b008207a4 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -302,6 +302,10 @@ struct cxl_decoder *cxl_decoder_alloc(struct
cxl_port *port, int nr_targets);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);

+void set_cxl_topology_host(struct device *dev);
+struct device *get_cxl_topology_host(void);
+void put_cxl_topology_host(struct device *dev);
+
 extern struct bus_type cxl_bus_type;

 struct cxl_driver {

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-30  1:37   ` Dan Williams
@ 2021-10-31 17:53     ` Dan Williams
  2021-10-31 18:10       ` Dan Williams
  2021-11-01 17:53     ` Ben Widawsky
  2021-11-04 16:37     ` Ben Widawsky
  2 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 17:53 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 29, 2021 at 6:37 PM Dan Williams <dan.j.williams@intel.com> wrote:
[..]
>
> Proposed infrastructure for a new cxl_port_decoder_autoremove():
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 3163167ecc3a..78f8313a1069 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -374,6 +374,11 @@ static int add_root_nvdimm_bridge(struct device
> *match, void *data)
>         return 1;
>  }
>
> +static void clear_cxl_topology_host(void *data)
> +{
> +       set_cxl_topology_host(NULL);
> +}
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>         int rc;
> @@ -382,6 +387,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>         struct acpi_device *adev = ACPI_COMPANION(host);
>         struct cxl_cfmws_context ctx;
>
> +       set_cxl_topology_host(host);
> +       rc = devm_add_action_or_reset(host, clear_cxl_topology_host, host);
> +       if (rc)
> +               return rc;
> +
>         root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
>         if (IS_ERR(root_port))
>                 return PTR_ERR(root_port);
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 17a4fff029f8..3146b6aa0a2f 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -3,6 +3,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
> +#include <linux/rwsem.h>
>  #include <linux/pci.h>
>  #include <linux/slab.h>
>  #include <linux/idr.h>
> @@ -563,6 +564,84 @@ int cxl_decoder_autoremove(struct device *host,
> struct cxl_decoder *cxld)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, CXL);
>
> +void trigger_decoder_autoremove(void *data)
> +{
> +       struct cxl_decoder *cxld = data;
> +       struct device *host = get_cxl_topology_host();
> +
> +       /* The topology host driver beat us to the punch */
> +       if (!host)
> +               return;

This can be dropped or turned into a WARN_ON() because something is
wrong if the port removal is being triggered after the topology host
has already went away.

> +
> +       devm_remove_action(host, cxld_unregister, &cxld->dev);

Should be devm_release_action().

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-10-22 18:36 ` [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers Ben Widawsky
@ 2021-10-31 18:03   ` Dan Williams
  2021-11-01 17:07     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 18:03 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Now that the port driver exists and is able to do proper decoder
> enumeration of the component registers, it becomes trivial to use that

This is the second occurrence of "becomes trivial" in this series,
it's distracting because it immediately sets off alarm bells that the
changelog is not in:

Background
Problem
Solution

...format.

> for host bridge uports. For reasons out of scope, a functional change
> would be visible if the HDM decoder was programmed by BIOS to values
> other than the full address range. Similarly if a type2 device was
> connected to this root port and programmed by BIOS, that can now be
> acted upon accordingly.

I would reserve discussion of "no functional change" for patches that
are pure cleanup. In this case this patch will cause the kernel to
behave differently when other conditions are met.

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/acpi.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index d61397055e9f..8cca0814dfb8 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -280,12 +280,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>         struct cxl_port *root_port = arg;
>         struct device *host = root_port->dev.parent;
>         struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> +       struct cxl_component_reg_map map;
>         struct acpi_pci_root *pci_root;
>         struct cxl_walk_context ctx;
>         int single_port_map[1], rc;
>         struct cxl_decoder *cxld;
>         struct cxl_dport *dport;
>         struct cxl_port *port;
> +       void __iomem *crb;
>
>         if (!bridge)
>                 return 0;
> @@ -318,10 +320,31 @@ static int add_host_bridge_uport(struct device *match, void *arg)
>                 return -ENODEV;
>         if (ctx.error)
>                 return ctx.error;
> +       /*
> +        * If the host bridge has more than 1 root port, it must have registers
> +        * controlling the HDM decoders. Those will be enumerated by the port
> +        * driver.
> +        */
>         if (ctx.count > 1)
>                 return 0;
>
> -       /* TODO: Scan CHBCR for HDM Decoder resources */
> +       /*
> +        * If the single ported host bridge has a component register block,
> +        * simply let the port driver handle the decoder enumeration.
> +        *
> +        * Host bridge component registers live in the system's physical address
> +        * space.
> +        */
> +       crb = ioremap(dport->component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE);

This looks broken, the driver for the port should be mapping dport
component registers to offer services to upper layers.

There's also a missing iounmap.

> +       if (crb) {
> +               cxl_probe_component_regs(&root_port->dev, crb, &map);
> +               iounmap(crb);
> +               if (map.hdm_decoder.valid) {
> +                       dev_dbg(host,
> +                               "Found single port host bridge with component registers\n");
> +                       return 0;
> +               }
> +       }
>
>         /*
>          * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-31 17:53     ` Dan Williams
@ 2021-10-31 18:10       ` Dan Williams
  2021-11-01 17:36         ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 18:10 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Sun, Oct 31, 2021 at 10:53 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Fri, Oct 29, 2021 at 6:37 PM Dan Williams <dan.j.williams@intel.com> wrote:
> [..]
> >
> > Proposed infrastructure for a new cxl_port_decoder_autoremove():
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 3163167ecc3a..78f8313a1069 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -374,6 +374,11 @@ static int add_root_nvdimm_bridge(struct device
> > *match, void *data)
> >         return 1;
> >  }
> >
> > +static void clear_cxl_topology_host(void *data)
> > +{
> > +       set_cxl_topology_host(NULL);
> > +}
> > +
> >  static int cxl_acpi_probe(struct platform_device *pdev)
> >  {
> >         int rc;
> > @@ -382,6 +387,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> >         struct acpi_device *adev = ACPI_COMPANION(host);
> >         struct cxl_cfmws_context ctx;
> >
> > +       set_cxl_topology_host(host);
> > +       rc = devm_add_action_or_reset(host, clear_cxl_topology_host, host);
> > +       if (rc)
> > +               return rc;
> > +
> >         root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
> >         if (IS_ERR(root_port))
> >                 return PTR_ERR(root_port);
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index 17a4fff029f8..3146b6aa0a2f 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -3,6 +3,7 @@
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/device.h>
> >  #include <linux/module.h>
> > +#include <linux/rwsem.h>
> >  #include <linux/pci.h>
> >  #include <linux/slab.h>
> >  #include <linux/idr.h>
> > @@ -563,6 +564,84 @@ int cxl_decoder_autoremove(struct device *host,
> > struct cxl_decoder *cxld)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, CXL);
> >
> > +void trigger_decoder_autoremove(void *data)
> > +{
> > +       struct cxl_decoder *cxld = data;
> > +       struct device *host = get_cxl_topology_host();
> > +
> > +       /* The topology host driver beat us to the punch */
> > +       if (!host)
> > +               return;
>
> This can be dropped or turned into a WARN_ON() because something is
> wrong if the port removal is being triggered after the topology host
> has already went away.

It further occurs to me that if all ports are guaranteed to be torn
down by topology host removal, then we might not even need to play
these games at all. cxl_acpi is already unregistering all of its child
ports, we just need the property that if a port adds a child port it
remove it. Then I don't think we even need cxl_topology_host games.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 10/28] cxl/core: Store global list of root ports
  2021-10-22 18:36 ` [RFC PATCH v2 10/28] cxl/core: Store global list of root ports Ben Widawsky
@ 2021-10-31 18:32   ` Dan Williams
  2021-11-01 18:43     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 18:32 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> CXL root ports (the downstream port to a host bridge) are to be
> enumerated by a platform specific driver. In the case of ACPI compliant
> systems, this is like the cxl_acpi driver. Root ports are the first
> CXL spec defined component that can be "found" by that platform specific
> driver.
>
> By storing a list of these root ports components in lower levels of the
> topology (switches and endpoints), have a mechanism to walk up their
> device hierarchy to find an enumerated root port. This will be necessary
> for region programming.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/acpi.c            |  4 ++--
>  drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
>  drivers/cxl/cxl.h             |  5 ++++-
>  tools/testing/cxl/mock_acpi.c |  4 ++--
>  4 files changed, 41 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 8cca0814dfb8..625c5d95b83f 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -231,7 +231,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>                 creg = cxl_reg_block(pdev, &map);
>
>         port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> -       rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
> +       rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
>         if (rc) {
>                 ctx->error = rc;
>                 return rc;
> @@ -406,7 +406,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
>                 dev_dbg(host, "No CHBS found for Host Bridge: %s\n",
>                         dev_name(match));
>
> -       rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs));
> +       rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs), false);
>         if (rc) {
>                 dev_err(host, "failed to add downstream port: %s\n",
>                         dev_name(match));
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index dffbd0ac64af..03394a3ae75f 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -25,6 +25,8 @@
>   */
>
>  static DEFINE_IDA(cxl_port_ida);
> +static LIST_HEAD(cxl_root_ports);
> +static DECLARE_RWSEM(root_port_sem);

I don't see a need for this new list and lock...

>
>  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
>                             char *buf)
> @@ -268,12 +270,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(to_cxl_port);
>
> +struct cxl_dport *cxl_get_root_dport(struct device *dev)
> +{
> +       struct cxl_dport *ret = NULL;
> +       struct cxl_dport *dport;
> +
> +       down_read(&root_port_sem);
> +       list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
> +               if (dport->dport == dev) {
> +                       ret = dport;
> +                       break;
> +               }
> +       }
> +
> +       up_read(&root_port_sem);
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(cxl_get_root_dport);

This can be done by walking the existing topology:

struct cxl_dport *cxl_get_root_dport(struct device *dev)
{
        struct device *host = get_cxl_topology_host();
        struct cxl_dport *dport, *found = NULL;
        struct cxl_port *port;
        struct device *root;

        if (!host)
                return NULL;

        root = device_find_child(host, &root, match_cxl_root_port);
        if (!root)
                goto out;
        device_lock(root);
        port = to_cxl_port(root);
        list_for_each_entry (dport, &port->dports, list)
                if (dport->dport == dev) {
                        found = dport;
                        break;
                }
        device_unlock(root);
        put_device(root);

out:
        put_cxl_topology_host(host);
        return found;
}


It occurs to me after writing this that device_lock() for iterating
dports can be offloaded to the topology rwsem.

> +
>  static void unregister_port(void *_port)
>  {
>         struct cxl_port *port = _port;
>         struct cxl_dport *dport;
>
>         device_lock(&port->dev);
> +       down_read(&root_port_sem);
>         list_for_each_entry(dport, &port->dports, list) {
>                 char link_name[CXL_TARGET_STRLEN];
>
> @@ -281,7 +302,10 @@ static void unregister_port(void *_port)
>                              dport->port_id) >= CXL_TARGET_STRLEN)
>                         continue;
>                 sysfs_remove_link(&port->dev.kobj, link_name);
> +
> +               list_del_init(&dport->root_port_link);
>         }
> +       up_read(&root_port_sem);
>         device_unlock(&port->dev);
>         device_unregister(&port->dev);
>  }
> @@ -431,12 +455,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
>   * @dport_dev: firmware or PCI device representing the dport
>   * @port_id: identifier for this dport in a decoder's target list
>   * @component_reg_phys: optional location of CXL component registers
> + * @root_port: is this a root port (hostbridge downstream)
>   *
>   * Note that all allocations and links are undone by cxl_port deletion
>   * and release.
>   */
>  int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> -                 resource_size_t component_reg_phys)
> +                 resource_size_t component_reg_phys, bool root_port)

With the above implementation there's no need to add this last parameter.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion
  2021-10-22 18:36 ` [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion Ben Widawsky
@ 2021-10-31 19:25   ` Dan Williams
  2021-11-01 18:56     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 19:25 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Ensure that devices being probed before cxl_acpi has completed will get
> a second chance.

I think this is at the wrong level... more below.
>
> CXL drivers are brought up through two enumerable, asynchronous
> mechanism. The leaf nodes in the CXL topology, endpoints, are enumerated
> via PCI headers. The root node's enumeration is platform specific. The
> current defacto mechanism for enumerating the root node is through the
> presence of an ACPI device, ACPI0017.
>
> The primary job of a cxl_mem driver is to determine if CXL.mem traffic
> can be routed to/from the PCIe device that it is being probed. A
> prerequisite in this determination is that all CXL components in the
> path from root to leaf are capable of routing CXL.mem traffic. If the
> cxl_mem driver is probed before cxl_acpi is complete the driver will be
> unable to make this determination. To address this, cxl_acpi (or in the
> future, another platform specific driver) will rescan all devices to
> make sure the ordering is correct.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/acpi.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 625c5d95b83f..1cc3a74c16bd 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -451,6 +451,14 @@ static u32 cedt_instance(struct platform_device *pdev)
>         return U32_MAX;
>  }
>
> +static void bus_rescan(struct work_struct *work)
> +{
> +       if (bus_rescan_devices(&cxl_bus_type))
> +               pr_err("Failed to rescan CXL bus\n");
> +}
> +
> +static DECLARE_WORK(deferred_bus_rescan, bus_rescan);
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>         int rc;
> @@ -484,9 +492,19 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>         if (rc)
>                 goto out;
>
> -       if (IS_ENABLED(CONFIG_CXL_PMEM))
> +       if (IS_ENABLED(CONFIG_CXL_PMEM)) {
>                 rc = device_for_each_child(&root_port->dev, root_port,
>                                            add_root_nvdimm_bridge);
> +               if (rc)
> +                       goto out;
> +       }
> +
> +       /*
> +        * While ACPI is scanning hostbridge ports, switches and memory devices
> +        * may have been probed. Those devices will need to know whether the
> +        * hostbridge is CXL capable.
> +        */
> +       schedule_work(&deferred_bus_rescan);

I think this belongs in port driver similar to the one in
drivers/cxl/pmem.c, because any port online event might mean that a
downstream port can now attach to the port driver. I prefer a local
ordered workqueue via queue_work() rather than the global unordered
workqueue primarily because it can be flushed without getting
entangled with other random work in the system, but also because the
ordered property is useful for comprehended hierarchical topology
events.

In any event this is missing flushing to prevent races between .text
removal at module_exit() and pending work.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs
  2021-10-22 18:36 ` [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs Ben Widawsky
@ 2021-10-31 20:13   ` Dan Williams
  2021-11-01 21:50     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 20:13 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Bake component registers into the memdev creation API in order to be
> able to use them as part of driver probing.
>
> Component register base addresses are obtained through PCI mechanisms.
> As such it makes most sense for the cxl_pci driver to obtain that
> address. In order to reuse the port driver for enumerating decoder
> resources for an endpoint, it is desirable to be able to add the
> endpoint as a port.  Unfortunately, by the time an endpoint driver would
> run, it no longer has any concept of the underlying PCI device (this is
> done intentionally to provide separation between drivers).

Changelog looks good.

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/core/memdev.c    |  5 +++--
>  drivers/cxl/cxlmem.h         |  5 ++++-
>  drivers/cxl/pci.c            | 17 ++++++++++++++++-
>  tools/testing/cxl/test/mem.c |  3 ++-
>  4 files changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index bf1b04d00ff4..15762c16d83f 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -276,8 +276,8 @@ static const struct file_operations cxl_memdev_fops = {
>         .llseek = noop_llseek,
>  };
>
> -struct cxl_memdev *
> -devm_cxl_add_memdev(struct cxl_mem *cxlm)
> +struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm,
> +                                      resource_size_t component_reg_phys)
>  {
>         struct cxl_memdev *cxlmd;
>         struct device *dev;
> @@ -298,6 +298,7 @@ devm_cxl_add_memdev(struct cxl_mem *cxlm)
>          * needed as this is ordered with cdev_add() publishing the device.
>          */
>         cxlmd->cxlm = cxlm;
> +       cxlmd->creg_base = component_reg_phys;

Let's pick one style for both, either rename component_reg_phys to
creg_base, or creg_base to component_reg_phys. My preference is
component_reg_phys, but I won't put up much of a fight against
creg_base.


>
>         cdev = &cxlmd->cdev;
>         rc = cdev_device_add(cdev, dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index c4f450ad434d..62fe8e2c59e4 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -35,12 +35,14 @@
>   * @cdev: char dev core object for ioctl operations
>   * @cxlm: pointer to the parent device driver data
>   * @id: id number of this memdev instance.
> + * @creg_base: register base of component registers
>   */
>  struct cxl_memdev {
>         struct device dev;
>         struct cdev cdev;
>         struct cxl_mem *cxlm;
>         int id;
> +       resource_size_t creg_base;
>  };
>
>  static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
> @@ -48,7 +50,8 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
>         return container_of(dev, struct cxl_memdev, dev);
>  }
>
> -struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm);
> +struct cxl_memdev *devm_cxl_add_memdev(struct cxl_mem *cxlm,
> +                                      resource_size_t component_reg_phys);
>
>  /**
>   * struct cxl_mbox_cmd - A command to be submitted to hardware.
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index d1adc759d051..96a312ed8269 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -421,6 +421,7 @@ static int cxl_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
> +       resource_size_t creg = CXL_RESOURCE_NONE;
>         struct cxl_register_map map;
>         struct cxl_memdev *cxlmd;
>         struct cxl_mem *cxlm;
> @@ -465,7 +466,21 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>         if (rc)
>                 return rc;
>
> -       cxlmd = devm_cxl_add_memdev(cxlm);
> +       /*
> +        * If the component registers can't be found, the cxl_pci driver may
> +        * still be useful for management functions so don't return an error.

This comment makes sense...

> +        *
> +        * XXX: Creating the device is going to kick of the cxl_mem probing.
> +        * That probe requires the component registers. Therefore, the register
> +        * block must always be found first.
> +        */

..., but I don't understand the point of this comment. Given that
devm_cxl_add_memdev() takes the base address as an argument it's
already clear that the component registers need to be found before
devm_cxl_add_memdev().

> +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> +       if (rc)
> +               dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
> +       else
> +               creg = cxl_reg_block(pdev, &map);
> +
> +       cxlmd = devm_cxl_add_memdev(cxlm, creg);
>         if (IS_ERR(cxlmd))
>                 return PTR_ERR(cxlmd);
>
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index 12a8437a9ca0..471fc7fb5418 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -227,7 +227,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>         if (rc)
>                 return rc;
>
> -       cxlmd = devm_cxl_add_memdev(cxlm);
> +       /* TODO: mock component registers, or... */

How about:

/* cxl_test does not emulate registers, any memdev operations that
imply component register access will be mocked at the memdev
operations interface */

> +       cxlmd = devm_cxl_add_memdev(cxlm, CXL_RESOURCE_NONE);
>         if (IS_ERR(cxlmd))
>                 return PTR_ERR(cxlmd);
>
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-10-22 18:36 ` [RFC PATCH v2 13/28] cxl: Flesh out register names Ben Widawsky
@ 2021-10-31 20:18   ` Dan Williams
  2021-11-01 22:00     ` Ben Widawsky
  2021-11-03 15:53   ` Jonathan Cameron
  1 sibling, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-10-31 20:18 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Get a better naming scheme in place for upcoming additions. To solidify
> the schema, add all the DVSEC identifiers to start with.

The title and this changelog don't give anything of substance to
review the patch.

This also looks like a rename and addition of more definitions. The
rename has one rationale, the additional definitions have a different
one, so split those into 2 patches, or fold the additions into the
patch that uses them.

>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>
> ---
> See:
> https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/

Perhaps summarize this above, it's not clear what's relevant from that
thread to this patch.

> ---
>  drivers/cxl/core/regs.c | 14 ++++++++------
>  drivers/cxl/pci.h       | 38 ++++++++++++++++++++++++++++++--------
>  2 files changed, 38 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index c8ab8880b81b..b837196fbf39 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -253,9 +253,11 @@ static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
>                                 struct cxl_register_map *map)
>  {
>         map->block_offset =
> -               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> -       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> -       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> +               ((u64)reg_hi << 32) |
> +               (reg_lo & DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK);
> +       map->barno = FIELD_GET(DVSEC_REGISTER_LOCATOR_BIR_MASK, reg_lo);
> +       map->reg_type =
> +               FIELD_GET(DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK, reg_lo);
>  }
>
>  /**
> @@ -276,15 +278,15 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>         int regloc, i;
>
>         regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> +                                          CXL_DVSEC_REGISTER_LOCATOR);
>         if (!regloc)
>                 return -ENXIO;
>
>         pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
>         regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
>
> -       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> -       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> +       regloc += DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET;
> +       regblocks = (regloc_size - DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET) / 8;
>
>         for (i = 0; i < regblocks; i++, regloc += 8) {
>                 u32 reg_lo, reg_hi;
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 12fdcb1b14e5..fe2898b17736 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -7,17 +7,36 @@
>
>  /*
>   * See section 8.1 Configuration Space Registers in the CXL 2.0
> - * Specification
> + * Specification. Names are taken straight from the specification with "CXL" and
> + * "DVSEC" redundancies removed.
>   */
>  #define PCI_DVSEC_HEADER1_LENGTH_MASK  GENMASK(31, 20)
>  #define PCI_DVSEC_VENDOR_ID_CXL                0x1E98
> -#define PCI_DVSEC_ID_CXL               0x0
>
> -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID       0x8
> -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET  0xC
> +/* 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE                                  0
>
> -/* BAR Indicator Register (BIR) */
> -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> +/* 8.1.4: Non-CXL Function Map DVSEC */
> +#define CXL_DVSEC_FUNCTION_MAP                                 2
> +
> +/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> +#define CXL_DVSEC_PORT_EXTENSIONS                              3
> +
> +/* 8.1.6: GPF DVSEC for CXL Port */
> +#define CXL_DVSEC_PORT_GPF                                     4
> +
> +/* 8.1.7: GPF DVSEC for CXL Device */
> +#define CXL_DVSEC_DEVICE_GPF                                   5
> +
> +/* 8.1.8: PCIe DVSEC for Flex Bus Port */
> +#define CXL_DVSEC_PCIE_FLEXBUS_PORT                            7
> +
> +/* 8.1.9: Register Locator DVSEC */
> +#define CXL_DVSEC_REGISTER_LOCATOR                             8
> +#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET                 0xC
> +#define     DVSEC_REGISTER_LOCATOR_BIR_MASK                    GENMASK(2, 0)
> +#define            DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK        GENMASK(15, 8)
> +#define     DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK       GENMASK(31, 16)
>
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
> @@ -28,8 +47,11 @@ enum cxl_regloc_type {
>         CXL_REGLOC_RBI_TYPES
>  };
>
> -#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
> -#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
> +/* 8.1.10: MLD DVSEC */
> +#define CXL_DVSEC_MLD                                          9
> +
> +/* 14.16.1 CXL Device Test Capability Advertisement */
> +#define CXL_DVSEC_PCIE_TEST_CAPABILITY                         10
>
>  #define cxl_reg_block(pdev, map)                                               \
>         ((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 14/28] cxl: Hide devm host for ports
  2021-10-22 18:36 ` [RFC PATCH v2 14/28] cxl: Hide devm host for ports Ben Widawsky
@ 2021-10-31 21:14   ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-10-31 21:14 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> Generally, a platform will only have a single instance of an  entity
> which has a platform specific CXL driver. For the time being, this is
> only the ACPI0017 device defined as part of the CXL specification and
> ancillary specs (ACPI). With this in mind it becomes easy to drop the
> need to pass this around for devm functions which want to know the root
> host. That change allows future work with the mem/port driver to avoid
> having to determine the root host for port creation for endpoints and
> switches.

Given the revelation earlier that we don't need to devm registration
underneath the root object to coordinate descendant device teardown,
this change can be dropped.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-10-22 18:36 ` [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports Ben Widawsky
@ 2021-11-01  5:39   ` Dan Williams
  2021-11-01 22:56     ` Ben Widawsky
  2021-11-03 16:08   ` Jonathan Cameron
  1 sibling, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-01  5:39 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> The CXL drivers encapsulate the components that direct memory traffic in
> an entity known as a cxl_port. Compute Express Link specifies three such
> components: hostbridge (ie. a collection of root ports), switches, and
> endpoints. There are currently drivers that create these ports for the
> hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> introduced allows callers to initiate a scan down from the hostbridge
> and create ports for switches in the CXL topology.
>
> The intended user of this API is for endpoint devices. An endpoint
> device will need to determine if it is CXL.mem capable, which requires
> all components in the path from hostbridge to the endpoint to be CXL.mem
> capable. Once an endpoint device determines it's connected to a CXL
> capable root port, it can call this API to fill in all the ports in
> between the hostbridge and itself.
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  .../driver-api/cxl/memory-devices.rst         |   6 +
>  drivers/cxl/core/Makefile                     |   1 +
>  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
>  drivers/cxl/core/pci.c                        |  99 ++++++++++++
>  drivers/cxl/cxl.h                             |   2 +
>  drivers/cxl/pci.h                             |   6 +
>  drivers/cxl/port.c                            |   2 +-
>  tools/testing/cxl/Kbuild                      |   1 +
>  8 files changed, 261 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/cxl/core/pci.c
>
> diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> index fbf0393cdddc..547336c95593 100644
> --- a/Documentation/driver-api/cxl/memory-devices.rst
> +++ b/Documentation/driver-api/cxl/memory-devices.rst
> @@ -47,6 +47,12 @@ CXL Core
>  .. kernel-doc:: drivers/cxl/core/bus.c
>     :identifiers:
>
> +.. kernel-doc:: drivers/cxl/core/pci.c
> +   :doc: cxl pci
> +
> +.. kernel-doc:: drivers/cxl/core/pci.c
> +   :identifiers:
> +
>  .. kernel-doc:: drivers/cxl/core/pmem.c
>     :doc: cxl pmem
>
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 07eb8e1fb8a6..9d33d2d5bf09 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
>  cxl_core-y += regs.o
>  cxl_core-y += memdev.o
>  cxl_core-y += mbox.o
> +cxl_core-y += pci.o
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index c7e1894d503b..f10e7d5b22a4 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -8,6 +8,7 @@
>  #include <linux/idr.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
> +#include <pci.h>
>  #include "core.h"
>
>  /**
> @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
>  }
>  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
>
> +void devm_cxl_remove_port(struct cxl_port *port)
> +{
> +       down_read(&root_host_sem);
> +       if (cxl_root_host) {
> +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> +               devm_release_action(cxl_root_host, unregister_port, port);
> +       }
> +       up_read(&root_host_sem);
> +}
> +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);

If the scan establishes the property that all child ports are devm
allocated with their cxl_port-parent, and only if the cxl_port-parent
is bound to its driver then I think we don't need to play
devm_release_action games().

> +
> +static int match_port(struct device *dev, const void *data)
> +{
> +       struct pci_dev *pdev = (struct pci_dev *)data;
> +
> +       if (dev->type != &cxl_port_type)
> +               return 0;
> +
> +       return to_cxl_port(dev)->uport == &pdev->dev;
> +}
> +
> +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> +{
> +       struct device *port_dev;
> +
> +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> +               return NULL;
> +
> +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> +       if (port_dev)
> +               return to_cxl_port(port_dev);
> +
> +       return NULL;
> +}
> +
> +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct cxl_port *parent_port;
> +       struct cxl_register_map map;
> +       struct cxl_port *port;
> +       int rc;
> +
> +       /*
> +        * Upstream ports must be connected to a downstream port or root port.
> +        * That downstream or root port must have a parent.
> +        */
> +       if (!pdev->dev.parent->parent)
> +               return -ENXIO;
> +
> +       /* A port is useless if there are no component registers */
> +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> +       if (rc)
> +               return rc;
> +
> +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));

This deref chain is unreadable. It wants a helper if it stays, but I
can't immediately think of a reason to ever need to look at a
grandparent in the hierarchy.

> +       if (!parent_port)
> +               return -ENODEV;
> +
> +       port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);

This is broken because the pci device being used here does not have a
driver that knows about CXL bus events.

> +       put_device(&parent_port->dev);
> +       if (IS_ERR(port))
> +               dev_err(dev, "Failed to add upstream port %ld\n",
> +                       PTR_ERR(port));
> +       else
> +               dev_dbg(dev, "Added CXL port\n");
> +
> +       return rc;
> +}
> +
> +static int add_downstream_port(struct pci_dev *pdev)
> +{
> +       resource_size_t creg = CXL_RESOURCE_NONE;
> +       struct device *dev = &pdev->dev;
> +       struct cxl_port *parent_port;
> +       struct cxl_register_map map;
> +       u32 lnkcap, port_num;
> +       int rc;
> +
> +       /*
> +        * Ports are to be scanned from top down. Therefore, the upstream port
> +        * must already exist.
> +        */
> +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
> +       if (!parent_port)
> +               return -ENODEV;
> +
> +       /*
> +        * The spec mandates component registers are present but the
> +        * driver does not.

What is this trying to convey?

> +        */
> +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> +       if (!rc)
> +               creg = cxl_reg_block(pdev, &map);
> +
> +       if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> +                                 &lnkcap) != PCIBIOS_SUCCESSFUL)
> +               return 1;
> +       port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> +
> +       rc = cxl_add_dport(parent_port, dev, port_num, creg, false);
> +       put_device(&parent_port->dev);

What get_device() is this paired with?

> +       if (rc)
> +               dev_err(dev, "Failed to add downstream port to %s\n",
> +                       dev_name(&parent_port->dev));
> +       else
> +               dev_dbg(dev, "Added downstream port to %s\n",
> +                       dev_name(&parent_port->dev));
> +
> +       return rc;
> +}
> +
> +static int match_add_ports(struct pci_dev *pdev, void *data)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct device *host = data;
> +       int rc;
> +
> +       /* This port has already been added... */
> +       if (find_cxl_port(pdev))
> +               return 0;
> +
> +       if (is_cxl_switch_usp((dev)))
> +               rc = add_upstream_port(host, pdev);
> +
> +       if (is_cxl_switch_dsp((dev)))
> +               rc = add_downstream_port(pdev);
> +
> +       return rc;
> +}
> +
> +/**
> + * cxl_scan_ports() - Adds all ports for the subtree beginning with @dport
> + * @dport: Beginning node of the CXL topology
> + */
> +void cxl_scan_ports(struct cxl_dport *dport)
> +{
> +       struct device *d = dport->dport;
> +       struct pci_dev *pdev = to_pci_dev(d);
> +
> +       pci_walk_bus(pdev->bus, match_add_ports, &dport->port->dev);
> +}
> +EXPORT_SYMBOL_GPL(cxl_scan_ports);
> +
>  static struct cxl_dport *find_dport(struct cxl_port *port, int id)
>  {
>         struct cxl_dport *dport;
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> new file mode 100644
> index 000000000000..c0cbe984c778
> --- /dev/null
> +++ b/drivers/cxl/core/pci.c
> @@ -0,0 +1,99 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> +#include <linux/device.h>
> +#include <linux/pci.h>
> +#include <pci.h>
> +
> +/**
> + * DOC: cxl pci
> + *
> + * Compute Express Link protocols are layered on top of PCIe. CXL core provides
> + * a set of helpers for CXL interactions which occur via PCIe.
> + */
> +
> +/**
> + * is_cxl_mem_enabled() - Does the device understand CXL.mem protocol
> + * @pdev: The PCI device for which to determine CXL enablement
> + *
> + * This is the most discrete determination as to whether a device supports
> + * CXL.mem protocol. At a minimum, a CXL device must advertise it is capable of
> + * negotiating the CXL.mem protocol while operating in Flex Bus.CXL mode. There
> + * are other determining factors as to whether CXL.mem protocol is supported in
> + * the path from root port to endpoint. Those other factors require a more
> + * comprehensive survey of the CXL topology and would use is_cxl_mem_enabled()
> + * as a cursory check.
> + *
> + * If the PCI device is enabled for CXL.mem protocol return true; otherwise
> + * return false.
> + *
> + * TODO: Is there other architecturally visible state that can be used to infer
> + *       CXL.mem protocol support?

I was expecting that cxl_port->dev.driver != NULL is the CXL subsystem
source of truth.

> + */
> +bool is_cxl_mem_enabled(struct pci_dev *pdev)
> +{
> +       int pcie_dvsec;
> +       u16 dvsec_ctrl;
> +
> +       pcie_dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> +                                              CXL_DVSEC_PCIE_DEVICE);
> +       if (!pcie_dvsec) {
> +               dev_info(&pdev->dev,
> +                        "Unable to determine CXL protocol support");
> +               return false;
> +       }
> +
> +       pci_read_config_word(pdev,
> +                            pcie_dvsec + DVSEC_PCIE_DEVICE_CONTROL_OFFSET,
> +                            &dvsec_ctrl);
> +       if (!(dvsec_ctrl & DVSEC_PCIE_DEVICE_MEM_ENABLE)) {
> +               dev_info(&pdev->dev, "CXL.mem protocol not enabled on device");
> +               return false;
> +       }

Per above, doesn't the port driver already check this?

> +
> +       return true;
> +}
> +EXPORT_SYMBOL_GPL(is_cxl_mem_enabled);

Also per above, can't this stay internal to the port driver?

> +
> +/**
> + * is_cxl_switch_usp() - Is the device a CXL.mem enabled switch
> + * @dev: Device to query for switch type
> + *
> + * If the device is a CXL.mem capable upstream switch port return true;
> + * otherwise return false.
> + */
> +bool is_cxl_switch_usp(struct device *dev)
> +{
> +       struct pci_dev *pdev;
> +
> +       if (!dev_is_pci(dev))
> +               return false;
> +
> +       pdev = to_pci_dev(dev);
> +
> +       return pci_is_pcie(pdev) &&
> +              pci_pcie_type(pdev) == PCI_EXP_TYPE_UPSTREAM &&
> +              is_cxl_mem_enabled(pdev);
> +}
> +EXPORT_SYMBOL_GPL(is_cxl_switch_usp);

Who uses this api outside of the core?

> +
> +/**
> + * is_cxl_switch_dsp() - Is the device a CXL.mem enabled switch
> + * @dev: Device to query for switch type
> + *
> + * If the device is a CXL.mem capable downstream switch port return true;
> + * otherwise return false.
> + */
> +bool is_cxl_switch_dsp(struct device *dev)
> +{
> +       struct pci_dev *pdev;
> +
> +       if (!dev_is_pci(dev))
> +               return false;
> +
> +       pdev = to_pci_dev(dev);
> +
> +       return pci_is_pcie(pdev) &&
> +              pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
> +              is_cxl_mem_enabled(pdev);
> +}
> +EXPORT_SYMBOL_GPL(is_cxl_switch_dsp);

Same question... at a minimum wait to add exports until there is a user.

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 419c2e2db6f0..03b414462416 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -309,6 +309,8 @@ struct cxl_port *to_cxl_port(struct device *dev);
>  struct cxl_port *devm_cxl_add_port(struct device *uport,
>                                    resource_size_t component_reg_phys,
>                                    struct cxl_port *parent_port);
> +void devm_cxl_remove_port(struct cxl_port *port);
> +void cxl_scan_ports(struct cxl_dport *root_port);
>
>  int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>                   resource_size_t component_reg_phys, bool root_port);
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index fe2898b17736..9d6ca77d3e14 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -15,6 +15,8 @@
>
>  /* 8.1.3: PCIe DVSEC for CXL Device */
>  #define CXL_DVSEC_PCIE_DEVICE                                  0
> +#define   DVSEC_PCIE_DEVICE_CONTROL_OFFSET                     0xC
> +#define     DVSEC_PCIE_DEVICE_MEM_ENABLE                       BIT(2)
>
>  /* 8.1.4: Non-CXL Function Map DVSEC */
>  #define CXL_DVSEC_FUNCTION_MAP                                 2
> @@ -57,4 +59,8 @@ enum cxl_regloc_type {
>         ((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
>                            (map)->block_offset))
>
> +bool is_cxl_switch_usp(struct device *dev);
> +bool is_cxl_switch_dsp(struct device *dev);
> +bool is_cxl_mem_enabled(struct pci_dev *pdev);
> +
>  #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> index ebbfb72ae995..3ddfd7673a56 100644
> --- a/drivers/cxl/port.c
> +++ b/drivers/cxl/port.c
> @@ -170,7 +170,7 @@ static int enumerate_hdm_decoders(struct cxl_port *port,
>                 if (rc)
>                         put_device(&cxld->dev);
>                 else
> -                       rc = cxl_decoder_autoremove(port->uport->parent, cxld);
> +                       rc = cxl_decoder_autoremove(&port->dev, cxld);

Looks like a hunk that got added to the wrong patch.

>                 if (rc)
>                         dev_err(&port->dev, "Failed to add decoder\n");
>         }
> diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
> index 86deba8308a1..46db4dd345a0 100644
> --- a/tools/testing/cxl/Kbuild
> +++ b/tools/testing/cxl/Kbuild
> @@ -31,6 +31,7 @@ cxl_core-y += $(CXL_CORE_SRC)/pmem.o
>  cxl_core-y += $(CXL_CORE_SRC)/regs.o
>  cxl_core-y += $(CXL_CORE_SRC)/memdev.o
>  cxl_core-y += $(CXL_CORE_SRC)/mbox.o
> +cxl_core-y += $(CXL_CORE_SRC)/pci.o
>  cxl_core-y += config_check.o
>
>  cxl_core-y += mock_pmem.o
> --
> 2.33.1
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource
  2021-10-29 22:37         ` Ben Widawsky
@ 2021-11-01 14:33           ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 14:33 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 15:37:51, Ben Widawsky wrote:
> On 21-10-29 15:22:19, Dan Williams wrote:
> > On Fri, Oct 29, 2021 at 2:27 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > [..]
> > > > Note, this is just how I had it roughly mapped out in my head, I defer
> > > > making it a hard recommendation until I get deeper into this set to
> > > > see if we diverge.
> > > >
> > >
> > > You're correct.
> > >
> > > I had changed this already for my v3. I ended up with this:
> > > +       union {
> > > +               struct resource cfmws_res;
> > 
> > "cfmws" is something that will only appear on ACPI based CXL systems,
> > so if it needs a prefix it should be something that does not preclude
> > CXL on other platform firmware architectures. So, "platform_" or
> > "root_"?
> 
> Sure. Platform_ is better.
> 

Coming back to this. While "CFMWS" is indeed defined as part of ACPI, I do think
it could also be a generic name for what it represents. In other words, I don't
see any issue with using CFMWS, but, I'll start using "platform" since it seems
it's your preference (and I prefer that to "root").

> > 
> > > +               struct resource *decoder_res;
> > 
> > So the region device is going to be doing the __request_region() from
> > @cfmws_res, not decoders, so why is this a resource and not a range?
> 
> Well, sort of. Most of v3 was to ultimately go with the decoders doing the
> __request_region, although thinking about this further that will not work since
> multiple decoders might want the same ranges of the parent resource.
> 
> So I guess it has to be region. I can go back to renaming it res in that case.
> You're correct that the region device should own doing the __request_region.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-10-31 18:03   ` Dan Williams
@ 2021-11-01 17:07     ` Ben Widawsky
  2021-11-02  2:15       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 17:07 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 11:03:37, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > Now that the port driver exists and is able to do proper decoder
> > enumeration of the component registers, it becomes trivial to use that
> 
> This is the second occurrence of "becomes trivial" in this series,
> it's distracting because it immediately sets off alarm bells that the
> changelog is not in:
> 
> Background
> Problem
> Solution
> 
> ...format.

Let me make sure I understand.

Background: Now that the port driver exists...
Problem: host bridge uports can use the port driver [but don't yet]
Solution: This patch (the description is indeed missing).

Is your point that I didn't document a solution, or something else?


> 
> > for host bridge uports. For reasons out of scope, a functional change
> > would be visible if the HDM decoder was programmed by BIOS to values
> > other than the full address range. Similarly if a type2 device was
> > connected to this root port and programmed by BIOS, that can now be
> > acted upon accordingly.
> 
> I would reserve discussion of "no functional change" for patches that
> are pure cleanup. In this case this patch will cause the kernel to
> behave differently when other conditions are met.

True. I will remove that.

> 
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/acpi.c | 25 ++++++++++++++++++++++++-
> >  1 file changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index d61397055e9f..8cca0814dfb8 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -280,12 +280,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >         struct cxl_port *root_port = arg;
> >         struct device *host = root_port->dev.parent;
> >         struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> > +       struct cxl_component_reg_map map;
> >         struct acpi_pci_root *pci_root;
> >         struct cxl_walk_context ctx;
> >         int single_port_map[1], rc;
> >         struct cxl_decoder *cxld;
> >         struct cxl_dport *dport;
> >         struct cxl_port *port;
> > +       void __iomem *crb;
> >
> >         if (!bridge)
> >                 return 0;
> > @@ -318,10 +320,31 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> >                 return -ENODEV;
> >         if (ctx.error)
> >                 return ctx.error;
> > +       /*
> > +        * If the host bridge has more than 1 root port, it must have registers
> > +        * controlling the HDM decoders. Those will be enumerated by the port
> > +        * driver.
> > +        */
> >         if (ctx.count > 1)
> >                 return 0;
> >
> > -       /* TODO: Scan CHBCR for HDM Decoder resources */
> > +       /*
> > +        * If the single ported host bridge has a component register block,
> > +        * simply let the port driver handle the decoder enumeration.
> > +        *
> > +        * Host bridge component registers live in the system's physical address
> > +        * space.
> > +        */
> > +       crb = ioremap(dport->component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE);
> 
> This looks broken, the driver for the port should be mapping dport
> component registers to offer services to upper layers.
> 
> There's also a missing iounmap.
> 

Where's the missing iounmap? The port driver does what you say, but I need a way
to shortcircuit the case where the root port doesn't have component registers
which you've previously documented as allowed by spec. How would you recommend
doing that?

> > +       if (crb) {
> > +               cxl_probe_component_regs(&root_port->dev, crb, &map);
> > +               iounmap(crb);
> > +               if (map.hdm_decoder.valid) {
> > +                       dev_dbg(host,
> > +                               "Found single port host bridge with component registers\n");
> > +                       return 0;
> > +               }
> > +       }
> >
> >         /*
> >          * Per the CXL specification (8.2.5.12 CXL HDM Decoder Capability
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-31 18:10       ` Dan Williams
@ 2021-11-01 17:36         ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 17:36 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 11:10:23, Dan Williams wrote:
> On Sun, Oct 31, 2021 at 10:53 AM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > On Fri, Oct 29, 2021 at 6:37 PM Dan Williams <dan.j.williams@intel.com> wrote:
> > [..]
> > >
> > > Proposed infrastructure for a new cxl_port_decoder_autoremove():
> > >
> > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > index 3163167ecc3a..78f8313a1069 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -374,6 +374,11 @@ static int add_root_nvdimm_bridge(struct device
> > > *match, void *data)
> > >         return 1;
> > >  }
> > >
> > > +static void clear_cxl_topology_host(void *data)
> > > +{
> > > +       set_cxl_topology_host(NULL);
> > > +}
> > > +
> > >  static int cxl_acpi_probe(struct platform_device *pdev)
> > >  {
> > >         int rc;
> > > @@ -382,6 +387,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> > >         struct acpi_device *adev = ACPI_COMPANION(host);
> > >         struct cxl_cfmws_context ctx;
> > >
> > > +       set_cxl_topology_host(host);
> > > +       rc = devm_add_action_or_reset(host, clear_cxl_topology_host, host);
> > > +       if (rc)
> > > +               return rc;
> > > +
> > >         root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
> > >         if (IS_ERR(root_port))
> > >                 return PTR_ERR(root_port);
> > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > index 17a4fff029f8..3146b6aa0a2f 100644
> > > --- a/drivers/cxl/core/bus.c
> > > +++ b/drivers/cxl/core/bus.c
> > > @@ -3,6 +3,7 @@
> > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > >  #include <linux/device.h>
> > >  #include <linux/module.h>
> > > +#include <linux/rwsem.h>
> > >  #include <linux/pci.h>
> > >  #include <linux/slab.h>
> > >  #include <linux/idr.h>
> > > @@ -563,6 +564,84 @@ int cxl_decoder_autoremove(struct device *host,
> > > struct cxl_decoder *cxld)
> > >  }
> > >  EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, CXL);
> > >
> > > +void trigger_decoder_autoremove(void *data)
> > > +{
> > > +       struct cxl_decoder *cxld = data;
> > > +       struct device *host = get_cxl_topology_host();
> > > +
> > > +       /* The topology host driver beat us to the punch */
> > > +       if (!host)
> > > +               return;
> >
> > This can be dropped or turned into a WARN_ON() because something is
> > wrong if the port removal is being triggered after the topology host
> > has already went away.
> 
> It further occurs to me that if all ports are guaranteed to be torn
> down by topology host removal, then we might not even need to play
> these games at all. cxl_acpi is already unregistering all of its child
> ports, we just need the property that if a port adds a child port it
> remove it. Then I don't think we even need cxl_topology_host games.

I think having a handle to the "topology_host" is still valuable since I used it
to do the CFMWS lookup. Perhaps you have a different solution for that and I'll
see it later.

Is the suggestion here to have unregister_port() be responsible for doing an
unregister_dev for all children that are ports, or do you mean something else?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-30  1:37   ` Dan Williams
  2021-10-31 17:53     ` Dan Williams
@ 2021-11-01 17:53     ` Ben Widawsky
  2021-11-01 17:54       ` Ben Widawsky
  2021-11-02  3:31       ` Dan Williams
  2021-11-04 16:37     ` Ben Widawsky
  2 siblings, 2 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 17:53 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 18:37:36, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >

[snip]

> > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > index cf07ae6cea17..40b386aaedf7 100644
> > --- a/drivers/cxl/Makefile
> > +++ b/drivers/cxl/Makefile
> > @@ -1,5 +1,6 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >  obj-$(CONFIG_CXL_BUS) += core/
> > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> 
> It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> 

Can you help me understand when CONFIG_CXL_MEM is useful when
#CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
together.

> >  obj-$(CONFIG_CXL_PCI) += cxl_pci.o
> >  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> >  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> > @@ -7,3 +8,4 @@ obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> >  cxl_pci-y := pci.o
> >  cxl_acpi-y := acpi.o
> >  cxl_pmem-y := pmem.o
> > +cxl_port-y := port.o
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index b972abc9f6ef..d61397055e9f 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -491,3 +491,4 @@ static struct platform_driver cxl_acpi_driver = {
> >  module_platform_driver(cxl_acpi_driver);
> >  MODULE_LICENSE("GPL v2");
> >  MODULE_IMPORT_NS(CXL);
> > +MODULE_SOFTDEP("pre: cxl_port");
> 
> Why does cxl_acpi depend on cxl_port being loaded at this point? I
> think this wants to wait for a future patch where cxl_acpi uses port
> driver services.

As of this patch it does use port driver services. cxl/acpi: Map single port
host bridge component registers made that the case.

[snip]

> > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> > new file mode 100644
> > index 000000000000..ebbfb72ae995
> > --- /dev/null
> > +++ b/drivers/cxl/port.c
> > @@ -0,0 +1,242 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > +#include <linux/device.h>
> > +#include <linux/module.h>
> > +#include <linux/slab.h>
> > +
> > +#include "cxlmem.h"
> > +
> > +/**
> > + * DOC: cxl port
> > + *
> > + * The port driver implements the set of functionality needed to allow full
> > + * decoder enumeration and routing. A CXL port is an abstraction of a CXL
> > + * component that implements some amount of CXL decoding of CXL.mem traffic.
> > + * As of the CXL 2.0 spec, this includes:
> > + *
> > + *     .. list-table:: CXL Components w/ Ports
> > + *             :widths: 25 25 50
> > + *             :header-rows: 1
> > + *
> > + *             * - component
> > + *               - upstream
> > + *               - downstream
> > + *             * - Hostbridge
> > + *               - ACPI0016
> > + *               - root port
> > + *             * - Switch
> > + *               - Switch Upstream Port
> > + *               - Switch Downstream Port
> > + *             * - Endpoint (not yet implemented)
> > + *               - Endpoint Port
> > + *               - N/A
> > + *
> > + * The primary service this driver provides is enumerating HDM decoders and
> > + * presenting APIs to other drivers to utilize the decoders.
> > + */
> > +
> > +struct cxl_port_data {
> > +       struct cxl_component_regs regs;
> > +
> > +       struct port_caps {
> > +               unsigned int count;
> > +               unsigned int tc;
> > +               unsigned int interleave11_8;
> > +               unsigned int interleave14_12;
> > +       } caps;
> > +};
> > +
> > +static inline int cxl_hdm_decoder_ig(u32 ctrl)
> 
> No need for plain inline in C files.
> 
> It's not clear why this simple helper needs a "cxl_hdm_decoder"
> namespace prefix?

I had a patch to share this with acpi driver at one point, but I dropped it. Do
you care if I merge those two decoders, or just rename?

> 
> > +{
> > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> > +
> > +       return 8 + val;
> > +}
> 
> Why is this return a power of 2 value...

I don't understand this comment.

> 
> > +
> > +static inline int cxl_hdm_decoder_iw(u32 ctrl)
> > +{
> > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> > +
> > +       return 1 << val;
> 
> ...while this one is converted to absolute values.
> 
> These could just be:
> 
> unsigned int to_interleave_granularity(u32 ctrl)
> unsigned int to_interleave_ways(u32 ctrl)
> 
> ...and return units in bytes.
> 
> > +}
> > +
> > +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> > +{
> > +       void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> > +       struct port_caps *caps = &cpd->caps;
> > +       u32 hdm_cap;
> > +
> > +       hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> > +
> > +       caps->count = cxl_hdm_decoder_count(hdm_cap);
> > +       caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> > +       caps->interleave11_8 =
> > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> > +       caps->interleave14_12 =
> > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> > +}
> > +
> > +static int map_regs(struct cxl_port *port, void __iomem *crb,
> > +                   struct cxl_port_data *cpd)
> > +{
> > +       struct cxl_register_map map;
> > +       struct cxl_component_reg_map *comp_map = &map.component_map;
> > +
> > +       cxl_probe_component_regs(&port->dev, crb, comp_map);
> > +       if (!comp_map->hdm_decoder.valid) {
> > +               dev_err(&port->dev, "HDM decoder registers invalid\n");
> > +               return -ENXIO;
> > +       }
> 
> Perhaps promote cxl_probe_regs() from the cxl_pci to the core and make
> it take a dev instead of a pdev, then you can do:
> 
> cxl_probe_regs(&port_dev->dev, CXL_REGLOC_RBI_COMPONENT)
> 
> ...instead of open coding it again?
> 
> > +
> > +       cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> > +
> > +       return 0;
> > +}
> > +
> > +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> > +{
> > +       u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> > +
> > +       if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
> > +               return 0;
> > +
> > +       return ioread64_hi_lo(hdm_decoder +
> > +                             CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> > +}
> > +
> > +static bool is_endpoint_port(struct cxl_port *port)
> > +{
> > +       if (!port->uport->driver)
> > +               return false;
> > +
> > +       return to_cxl_drv(port->uport->driver)->id ==
> > +              CXL_DEVICE_MEMORY_EXPANDER;
> 
> Why does endpoint port device type determination need to reach through
> and read the driver type?
> 

I couldn't figure out a better way at this point in enumeration. I'm open to
suggestions.

[snip]

> > +
> > +       /*
> > +        * Enable HDM decoders for this port.
> > +        *
> > +        * FIXME: If the component was using DVSEC range registers for decode,
> > +        * this will destroy that.
> 
> Yeah, definitely need to check that before this patch can move
> forward. Perhaps a port should not even be registered if DVSEC
> Memory_Size && Mem_Enable are non zero, that device is explicitly
> opting out of being a part of the CXL 2.0 subsystem hierarchy.
> However, we might still need to track it and potentially reserve it
> out of CFMWS capacity to make sure nothing else collides with it. I'll
> also note that "ECN: Devices operating in CXL 1.1 mode with no RCRB"
> was recently published that reads on what the driver should do here.
> 

I believe we want to create the port since we might decide to reset and want
control back over it and as you said, safety check other things. The reason it
was left as a FIXME is because this belongs in the PCI driver which I didn't
really want to touch at this time. I will go back and add that for the next
version.

> > +        */
> > +       ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > +       ctrl |= CXL_HDM_DECODER_ENABLE;
> > +       writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> 
> I feel like that if the driver finds it enabled it should leave it
> enabled at ->remove() time, as you have it here, as BIOS might not be
> expecting the OS to disable a decoder it set up. However, if the
> driver actually does the enable, then it should pair it with a disable
> at the end of time, so not a blind enable, but one that conditionally
> arranges for the unwind.

My thought was that once we enumerate a port, all of it's architectural state
belongs to the OS. For us that means blanket enable/disable. I don't feel
strongly about this.

> 
> > +
> > +       rc = enumerate_hdm_decoders(port, portdata);
> > +       if (rc) {
> > +               dev_err(&port->dev, "Couldn't enumerate decoders (%d)\n", rc);
> > +               return rc;
> > +       }
> > +
> > +       dev_set_drvdata(dev, portdata);
> > +       return 0;
> > +}
> > +
> > +static struct cxl_driver cxl_port_driver = {
> > +       .name = "cxl_port",
> > +       .probe = cxl_port_probe,
> > +       .id = CXL_DEVICE_PORT,
> > +};
> > +module_cxl_driver(cxl_port_driver);
> > +
> > +MODULE_LICENSE("GPL v2");
> > +MODULE_IMPORT_NS(CXL);
> > +MODULE_ALIAS_CXL(CXL_DEVICE_PORT);
> > --
> > 2.33.1
> 
> Proposed infrastructure for a new cxl_port_decoder_autoremove():
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 3163167ecc3a..78f8313a1069 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -374,6 +374,11 @@ static int add_root_nvdimm_bridge(struct device
> *match, void *data)
>         return 1;
>  }
> 
> +static void clear_cxl_topology_host(void *data)
> +{
> +       set_cxl_topology_host(NULL);
> +}
> +
>  static int cxl_acpi_probe(struct platform_device *pdev)
>  {
>         int rc;
> @@ -382,6 +387,11 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>         struct acpi_device *adev = ACPI_COMPANION(host);
>         struct cxl_cfmws_context ctx;
> 
> +       set_cxl_topology_host(host);
> +       rc = devm_add_action_or_reset(host, clear_cxl_topology_host, host);
> +       if (rc)
> +               return rc;
> +
>         root_port = devm_cxl_add_port(host, host, CXL_RESOURCE_NONE, NULL);
>         if (IS_ERR(root_port))
>                 return PTR_ERR(root_port);
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 17a4fff029f8..3146b6aa0a2f 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -3,6 +3,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/device.h>
>  #include <linux/module.h>
> +#include <linux/rwsem.h>
>  #include <linux/pci.h>
>  #include <linux/slab.h>
>  #include <linux/idr.h>
> @@ -563,6 +564,84 @@ int cxl_decoder_autoremove(struct device *host,
> struct cxl_decoder *cxld)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_decoder_autoremove, CXL);
> 
> +void trigger_decoder_autoremove(void *data)
> +{
> +       struct cxl_decoder *cxld = data;
> +       struct device *host = get_cxl_topology_host();
> +
> +       /* The topology host driver beat us to the punch */
> +       if (!host)
> +               return;
> +
> +       devm_remove_action(host, cxld_unregister, &cxld->dev);
> +       put_cxl_topology_host(host);
> +}
> +
> +/**
> + * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
> + * @cxld: decoder to remove
> + *
> + * CXL.mem requires an intact port / decoder topology from root level
> + * platform decoders to endpoint decoders. Arrange for decoders
> + * enumerated by the port driver to be removed when either the root is
> + * removed (in which the entire hierarchy is removed), or when an
> + * individual port is disabled, in which case only that port's
> + * sub-section of the hierarchy is removed.
> + */
> +int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
> +{
> +       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> +       struct device *host = get_cxl_topology_host();
> +       int rc;
> +
> +       if (!host)
> +               return -ENODEV;
> +
> +       if (!port->dev.driver) {
> +               rc = -ENXIO;
> +               goto out;
> +       }
> +
> +       rc = cxl_decoder_autoremove(host, cxld);
> +       if (rc)
> +               goto out;
> +
> +       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
> +                                       cxld);
> +out:
> +       put_cxl_topology_host(host);
> +       return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);
> +
> +static DECLARE_RWSEM(cxl_topology_rwsem);
> +static struct device *cxl_topology_host;
> +
> +void set_cxl_topology_host(struct device *dev)
> +{
> +       down_write(&cxl_topology_rwsem);
> +       cxl_topology_host = dev;
> +       up_write(&cxl_topology_rwsem);
> +}
> +EXPORT_SYMBOL_NS_GPL(set_cxl_topology_host, CXL);
> +
> +struct device *get_cxl_topology_host(void)
> +{
> +       down_read(&cxl_topology_rwsem);
> +       if (cxl_topology_host)
> +               return cxl_topology_host;
> +       up_read(&cxl_topology_rwsem);
> +       return NULL;
> +}
> +EXPORT_SYMBOL_NS_GPL(get_cxl_topology_host, CXL);
> +
> +void put_cxl_topology_host(struct device *dev)
> +{
> +       WARN_ON(dev != cxl_topology_host);
> +       up_read(&cxl_topology_rwsem);
> +}
> +EXPORT_SYMBOL_NS_GPL(put_cxl_topology_host, CXL);
> +
>  /**
>   * __cxl_driver_register - register a driver for the cxl bus
>   * @cxl_drv: cxl driver structure to attach
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5e2e93451928..bd7b008207a4 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -302,6 +302,10 @@ struct cxl_decoder *cxl_decoder_alloc(struct
> cxl_port *port, int nr_targets);
>  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
>  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> 
> +void set_cxl_topology_host(struct device *dev);
> +struct device *get_cxl_topology_host(void);
> +void put_cxl_topology_host(struct device *dev);
> +
>  extern struct bus_type cxl_bus_type;
> 
>  struct cxl_driver {

As you've seen by now I do implement something like that you have below in a
later patch. I believe it will still be nice to have, but I haven't read through
all of your feedback yet. So I might change my mind.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-01 17:53     ` Ben Widawsky
@ 2021-11-01 17:54       ` Ben Widawsky
  2021-11-02  3:31       ` Dan Williams
  1 sibling, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 17:54 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 10:53:15, Ben Widawsky wrote:
> On 21-10-29 18:37:36, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > >  obj-$(CONFIG_CXL_PCI) += cxl_pci.o
> > >  obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o
> > >  obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> > > @@ -7,3 +8,4 @@ obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
> > >  cxl_pci-y := pci.o
> > >  cxl_acpi-y := acpi.o
> > >  cxl_pmem-y := pmem.o
> > > +cxl_port-y := port.o
> > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > index b972abc9f6ef..d61397055e9f 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -491,3 +491,4 @@ static struct platform_driver cxl_acpi_driver = {
> > >  module_platform_driver(cxl_acpi_driver);
> > >  MODULE_LICENSE("GPL v2");
> > >  MODULE_IMPORT_NS(CXL);
> > > +MODULE_SOFTDEP("pre: cxl_port");
> > 
> > Why does cxl_acpi depend on cxl_port being loaded at this point? I
> > think this wants to wait for a future patch where cxl_acpi uses port
> > driver services.
> 
> As of this patch it does use port driver services. cxl/acpi: Map single port
> host bridge component registers made that the case.
> 
> [snip]
> 

My mistake... I had the order changed locally. Ignore this comment.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 10/28] cxl/core: Store global list of root ports
  2021-10-31 18:32   ` Dan Williams
@ 2021-11-01 18:43     ` Ben Widawsky
  2021-11-02  2:04       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 18:43 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 11:32:45, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > CXL root ports (the downstream port to a host bridge) are to be
> > enumerated by a platform specific driver. In the case of ACPI compliant
> > systems, this is like the cxl_acpi driver. Root ports are the first
> > CXL spec defined component that can be "found" by that platform specific
> > driver.
> >
> > By storing a list of these root ports components in lower levels of the
> > topology (switches and endpoints), have a mechanism to walk up their
> > device hierarchy to find an enumerated root port. This will be necessary
> > for region programming.
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/acpi.c            |  4 ++--
> >  drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
> >  drivers/cxl/cxl.h             |  5 ++++-
> >  tools/testing/cxl/mock_acpi.c |  4 ++--
> >  4 files changed, 41 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 8cca0814dfb8..625c5d95b83f 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -231,7 +231,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
> >                 creg = cxl_reg_block(pdev, &map);
> >
> >         port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > -       rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
> > +       rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
> >         if (rc) {
> >                 ctx->error = rc;
> >                 return rc;
> > @@ -406,7 +406,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
> >                 dev_dbg(host, "No CHBS found for Host Bridge: %s\n",
> >                         dev_name(match));
> >
> > -       rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs));
> > +       rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs), false);
> >         if (rc) {
> >                 dev_err(host, "failed to add downstream port: %s\n",
> >                         dev_name(match));
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index dffbd0ac64af..03394a3ae75f 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -25,6 +25,8 @@
> >   */
> >
> >  static DEFINE_IDA(cxl_port_ida);
> > +static LIST_HEAD(cxl_root_ports);
> > +static DECLARE_RWSEM(root_port_sem);
> 
> I don't see a need for this new list and lock...
> 
> >
> >  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
> >                             char *buf)
> > @@ -268,12 +270,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(to_cxl_port);
> >
> > +struct cxl_dport *cxl_get_root_dport(struct device *dev)
> > +{
> > +       struct cxl_dport *ret = NULL;
> > +       struct cxl_dport *dport;
> > +
> > +       down_read(&root_port_sem);
> > +       list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
> > +               if (dport->dport == dev) {
> > +                       ret = dport;
> > +                       break;
> > +               }
> > +       }
> > +
> > +       up_read(&root_port_sem);
> > +       return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(cxl_get_root_dport);
> 
> This can be done by walking the existing topology:
> 
> struct cxl_dport *cxl_get_root_dport(struct device *dev)
> {
>         struct device *host = get_cxl_topology_host();
>         struct cxl_dport *dport, *found = NULL;
>         struct cxl_port *port;
>         struct device *root;
> 
>         if (!host)
>                 return NULL;
> 
>         root = device_find_child(host, &root, match_cxl_root_port);
>         if (!root)
>                 goto out;
>         device_lock(root);
>         port = to_cxl_port(root);
>         list_for_each_entry (dport, &port->dports, list)
>                 if (dport->dport == dev) {
>                         found = dport;
>                         break;
>                 }
>         device_unlock(root);
>         put_device(root);
> 
> out:
>         put_cxl_topology_host(host);
>         return found;
> }
> 
> 
> It occurs to me after writing this that device_lock() for iterating
> dports can be offloaded to the topology rwsem.
> 

When I originally wrote this function, I didn't have a pointer to the root host.
Also, you said previously we'd remove the root host. What would you like me to
do?

> > +
> >  static void unregister_port(void *_port)
> >  {
> >         struct cxl_port *port = _port;
> >         struct cxl_dport *dport;
> >
> >         device_lock(&port->dev);
> > +       down_read(&root_port_sem);
> >         list_for_each_entry(dport, &port->dports, list) {
> >                 char link_name[CXL_TARGET_STRLEN];
> >
> > @@ -281,7 +302,10 @@ static void unregister_port(void *_port)
> >                              dport->port_id) >= CXL_TARGET_STRLEN)
> >                         continue;
> >                 sysfs_remove_link(&port->dev.kobj, link_name);
> > +
> > +               list_del_init(&dport->root_port_link);
> >         }
> > +       up_read(&root_port_sem);
> >         device_unlock(&port->dev);
> >         device_unregister(&port->dev);
> >  }
> > @@ -431,12 +455,13 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
> >   * @dport_dev: firmware or PCI device representing the dport
> >   * @port_id: identifier for this dport in a decoder's target list
> >   * @component_reg_phys: optional location of CXL component registers
> > + * @root_port: is this a root port (hostbridge downstream)
> >   *
> >   * Note that all allocations and links are undone by cxl_port deletion
> >   * and release.
> >   */
> >  int cxl_add_dport(struct cxl_port *port, struct device *dport_dev, int port_id,
> > -                 resource_size_t component_reg_phys)
> > +                 resource_size_t component_reg_phys, bool root_port)
> 
> With the above implementation there's no need to add this last parameter.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion
  2021-10-31 19:25   ` Dan Williams
@ 2021-11-01 18:56     ` Ben Widawsky
  2021-11-01 21:45       ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 18:56 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 12:25:32, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > Ensure that devices being probed before cxl_acpi has completed will get
> > a second chance.
> 
> I think this is at the wrong level... more below.
> >
> > CXL drivers are brought up through two enumerable, asynchronous
> > mechanism. The leaf nodes in the CXL topology, endpoints, are enumerated
> > via PCI headers. The root node's enumeration is platform specific. The
> > current defacto mechanism for enumerating the root node is through the
> > presence of an ACPI device, ACPI0017.
> >
> > The primary job of a cxl_mem driver is to determine if CXL.mem traffic
> > can be routed to/from the PCIe device that it is being probed. A
> > prerequisite in this determination is that all CXL components in the
> > path from root to leaf are capable of routing CXL.mem traffic. If the
> > cxl_mem driver is probed before cxl_acpi is complete the driver will be
> > unable to make this determination. To address this, cxl_acpi (or in the
> > future, another platform specific driver) will rescan all devices to
> > make sure the ordering is correct.
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/acpi.c | 20 +++++++++++++++++++-
> >  1 file changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > index 625c5d95b83f..1cc3a74c16bd 100644
> > --- a/drivers/cxl/acpi.c
> > +++ b/drivers/cxl/acpi.c
> > @@ -451,6 +451,14 @@ static u32 cedt_instance(struct platform_device *pdev)
> >         return U32_MAX;
> >  }
> >
> > +static void bus_rescan(struct work_struct *work)
> > +{
> > +       if (bus_rescan_devices(&cxl_bus_type))
> > +               pr_err("Failed to rescan CXL bus\n");
> > +}
> > +
> > +static DECLARE_WORK(deferred_bus_rescan, bus_rescan);
> > +
> >  static int cxl_acpi_probe(struct platform_device *pdev)
> >  {
> >         int rc;
> > @@ -484,9 +492,19 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> >         if (rc)
> >                 goto out;
> >
> > -       if (IS_ENABLED(CONFIG_CXL_PMEM))
> > +       if (IS_ENABLED(CONFIG_CXL_PMEM)) {
> >                 rc = device_for_each_child(&root_port->dev, root_port,
> >                                            add_root_nvdimm_bridge);
> > +               if (rc)
> > +                       goto out;
> > +       }
> > +
> > +       /*
> > +        * While ACPI is scanning hostbridge ports, switches and memory devices
> > +        * may have been probed. Those devices will need to know whether the
> > +        * hostbridge is CXL capable.
> > +        */
> > +       schedule_work(&deferred_bus_rescan);
> 
> I think this belongs in port driver similar to the one in
> drivers/cxl/pmem.c, because any port online event might mean that a
> downstream port can now attach to the port driver. I prefer a local
> ordered workqueue via queue_work() rather than the global unordered
> workqueue primarily because it can be flushed without getting
> entangled with other random work in the system, but also because the
> ordered property is useful for comprehended hierarchical topology
> events.

Makes sense. I was trying to limit the number of rescans that took place and I
believe I convinced myself that the current patch does work properly. However,
architecturally it makes more sense for the port driver to own this and
realistically I assume many rescans will coalesce.

> 
> In any event this is missing flushing to prevent races between .text
> removal at module_exit() and pending work.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion
  2021-11-01 18:56     ` Ben Widawsky
@ 2021-11-01 21:45       ` Ben Widawsky
  2021-11-02  1:56         ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 21:45 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 11:56:02, Ben Widawsky wrote:
> On 21-10-31 12:25:32, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > Ensure that devices being probed before cxl_acpi has completed will get
> > > a second chance.
> > 
> > I think this is at the wrong level... more below.
> > >
> > > CXL drivers are brought up through two enumerable, asynchronous
> > > mechanism. The leaf nodes in the CXL topology, endpoints, are enumerated
> > > via PCI headers. The root node's enumeration is platform specific. The
> > > current defacto mechanism for enumerating the root node is through the
> > > presence of an ACPI device, ACPI0017.
> > >
> > > The primary job of a cxl_mem driver is to determine if CXL.mem traffic
> > > can be routed to/from the PCIe device that it is being probed. A
> > > prerequisite in this determination is that all CXL components in the
> > > path from root to leaf are capable of routing CXL.mem traffic. If the
> > > cxl_mem driver is probed before cxl_acpi is complete the driver will be
> > > unable to make this determination. To address this, cxl_acpi (or in the
> > > future, another platform specific driver) will rescan all devices to
> > > make sure the ordering is correct.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  drivers/cxl/acpi.c | 20 +++++++++++++++++++-
> > >  1 file changed, 19 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > index 625c5d95b83f..1cc3a74c16bd 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -451,6 +451,14 @@ static u32 cedt_instance(struct platform_device *pdev)
> > >         return U32_MAX;
> > >  }
> > >
> > > +static void bus_rescan(struct work_struct *work)
> > > +{
> > > +       if (bus_rescan_devices(&cxl_bus_type))
> > > +               pr_err("Failed to rescan CXL bus\n");
> > > +}
> > > +
> > > +static DECLARE_WORK(deferred_bus_rescan, bus_rescan);
> > > +
> > >  static int cxl_acpi_probe(struct platform_device *pdev)
> > >  {
> > >         int rc;
> > > @@ -484,9 +492,19 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> > >         if (rc)
> > >                 goto out;
> > >
> > > -       if (IS_ENABLED(CONFIG_CXL_PMEM))
> > > +       if (IS_ENABLED(CONFIG_CXL_PMEM)) {
> > >                 rc = device_for_each_child(&root_port->dev, root_port,
> > >                                            add_root_nvdimm_bridge);
> > > +               if (rc)
> > > +                       goto out;
> > > +       }
> > > +
> > > +       /*
> > > +        * While ACPI is scanning hostbridge ports, switches and memory devices
> > > +        * may have been probed. Those devices will need to know whether the
> > > +        * hostbridge is CXL capable.
> > > +        */
> > > +       schedule_work(&deferred_bus_rescan);
> > 
> > I think this belongs in port driver similar to the one in
> > drivers/cxl/pmem.c, because any port online event might mean that a
> > downstream port can now attach to the port driver. I prefer a local
> > ordered workqueue via queue_work() rather than the global unordered
> > workqueue primarily because it can be flushed without getting
> > entangled with other random work in the system, but also because the
> > ordered property is useful for comprehended hierarchical topology
> > events.
> 
> Makes sense. I was trying to limit the number of rescans that took place and I
> believe I convinced myself that the current patch does work properly. However,
> architecturally it makes more sense for the port driver to own this and
> realistically I assume many rescans will coalesce.
> 

Oh. I assume you want this rolled into (cxl/port: Introduce a port driver),
correct?

> > 
> > In any event this is missing flushing to prevent races between .text
> > removal at module_exit() and pending work.
> 

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs
  2021-10-31 20:13   ` Dan Williams
@ 2021-11-01 21:50     ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 21:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 13:13:32, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >

[snip]

> > -       cxlmd = devm_cxl_add_memdev(cxlm);
> > +       /*
> > +        * If the component registers can't be found, the cxl_pci driver may
> > +        * still be useful for management functions so don't return an error.
> 
> This comment makes sense...
> 
> > +        *
> > +        * XXX: Creating the device is going to kick of the cxl_mem probing.
> > +        * That probe requires the component registers. Therefore, the register
> > +        * block must always be found first.
> > +        */
> 
> ..., but I don't understand the point of this comment. Given that
> devm_cxl_add_memdev() takes the base address as an argument it's
> already clear that the component registers need to be found before
> devm_cxl_add_memdev().
> 

That comment was created before devm_cxl_add_memdev() took the base. Indeed the
comment no longer makes sense.

> > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > +       if (rc)
> > +               dev_warn(&cxlmd->dev, "No component registers (%d)\n", rc);
> > +       else
> > +               creg = cxl_reg_block(pdev, &map);
> > +
> > +       cxlmd = devm_cxl_add_memdev(cxlm, creg);
> >         if (IS_ERR(cxlmd))
> >                 return PTR_ERR(cxlmd);
> >
> > diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> > index 12a8437a9ca0..471fc7fb5418 100644
> > --- a/tools/testing/cxl/test/mem.c
> > +++ b/tools/testing/cxl/test/mem.c
> > @@ -227,7 +227,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >         if (rc)
> >                 return rc;
> >
> > -       cxlmd = devm_cxl_add_memdev(cxlm);
> > +       /* TODO: mock component registers, or... */
> 
> How about:
> 
> /* cxl_test does not emulate registers, any memdev operations that
> imply component register access will be mocked at the memdev
> operations interface */
> 

Sounds good to me.

> > +       cxlmd = devm_cxl_add_memdev(cxlm, CXL_RESOURCE_NONE);
> >         if (IS_ERR(cxlmd))
> >                 return PTR_ERR(cxlmd);
> >
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-10-31 20:18   ` Dan Williams
@ 2021-11-01 22:00     ` Ben Widawsky
  2021-11-02  1:53       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 22:00 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 13:18:36, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > Get a better naming scheme in place for upcoming additions. To solidify
> > the schema, add all the DVSEC identifiers to start with.
> 
> The title and this changelog don't give anything of substance to
> review the patch.
> 
> This also looks like a rename and addition of more definitions. The
> rename has one rationale, the additional definitions have a different
> one, so split those into 2 patches, or fold the additions into the
> patch that uses them.
> 

I added more than necessary as a means to codify the naming scheme [as stated].
Many of them are not used. I can split this patch into two, though I personally
don't find it offensive to do it as one. Before I do that, I'd like to know
though if you're going to reject the patch if I'm not actually using all of the
defines later on.

> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> >
> > ---
> > See:
> > https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
> 
> Perhaps summarize this above, it's not clear what's relevant from that
> thread to this patch.
> 
> > ---
> >  drivers/cxl/core/regs.c | 14 ++++++++------
> >  drivers/cxl/pci.h       | 38 ++++++++++++++++++++++++++++++--------
> >  2 files changed, 38 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> > index c8ab8880b81b..b837196fbf39 100644
> > --- a/drivers/cxl/core/regs.c
> > +++ b/drivers/cxl/core/regs.c
> > @@ -253,9 +253,11 @@ static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
> >                                 struct cxl_register_map *map)
> >  {
> >         map->block_offset =
> > -               ((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> > -       map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> > -       map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> > +               ((u64)reg_hi << 32) |
> > +               (reg_lo & DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK);
> > +       map->barno = FIELD_GET(DVSEC_REGISTER_LOCATOR_BIR_MASK, reg_lo);
> > +       map->reg_type =
> > +               FIELD_GET(DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK, reg_lo);
> >  }
> >
> >  /**
> > @@ -276,15 +278,15 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
> >         int regloc, i;
> >
> >         regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > -                                          PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> > +                                          CXL_DVSEC_REGISTER_LOCATOR);
> >         if (!regloc)
> >                 return -ENXIO;
> >
> >         pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> >         regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> >
> > -       regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> > -       regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> > +       regloc += DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET;
> > +       regblocks = (regloc_size - DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET) / 8;
> >
> >         for (i = 0; i < regblocks; i++, regloc += 8) {
> >                 u32 reg_lo, reg_hi;
> > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > index 12fdcb1b14e5..fe2898b17736 100644
> > --- a/drivers/cxl/pci.h
> > +++ b/drivers/cxl/pci.h
> > @@ -7,17 +7,36 @@
> >
> >  /*
> >   * See section 8.1 Configuration Space Registers in the CXL 2.0
> > - * Specification
> > + * Specification. Names are taken straight from the specification with "CXL" and
> > + * "DVSEC" redundancies removed.
> >   */
> >  #define PCI_DVSEC_HEADER1_LENGTH_MASK  GENMASK(31, 20)
> >  #define PCI_DVSEC_VENDOR_ID_CXL                0x1E98
> > -#define PCI_DVSEC_ID_CXL               0x0
> >
> > -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID       0x8
> > -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET  0xC
> > +/* 8.1.3: PCIe DVSEC for CXL Device */
> > +#define CXL_DVSEC_PCIE_DEVICE                                  0
> >
> > -/* BAR Indicator Register (BIR) */
> > -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> > +/* 8.1.4: Non-CXL Function Map DVSEC */
> > +#define CXL_DVSEC_FUNCTION_MAP                                 2
> > +
> > +/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> > +#define CXL_DVSEC_PORT_EXTENSIONS                              3
> > +
> > +/* 8.1.6: GPF DVSEC for CXL Port */
> > +#define CXL_DVSEC_PORT_GPF                                     4
> > +
> > +/* 8.1.7: GPF DVSEC for CXL Device */
> > +#define CXL_DVSEC_DEVICE_GPF                                   5
> > +
> > +/* 8.1.8: PCIe DVSEC for Flex Bus Port */
> > +#define CXL_DVSEC_PCIE_FLEXBUS_PORT                            7
> > +
> > +/* 8.1.9: Register Locator DVSEC */
> > +#define CXL_DVSEC_REGISTER_LOCATOR                             8
> > +#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET                 0xC
> > +#define     DVSEC_REGISTER_LOCATOR_BIR_MASK                    GENMASK(2, 0)
> > +#define            DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK        GENMASK(15, 8)
> > +#define     DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK       GENMASK(31, 16)
> >
> >  /* Register Block Identifier (RBI) */
> >  enum cxl_regloc_type {
> > @@ -28,8 +47,11 @@ enum cxl_regloc_type {
> >         CXL_REGLOC_RBI_TYPES
> >  };
> >
> > -#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
> > -#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
> > +/* 8.1.10: MLD DVSEC */
> > +#define CXL_DVSEC_MLD                                          9
> > +
> > +/* 14.16.1 CXL Device Test Capability Advertisement */
> > +#define CXL_DVSEC_PCIE_TEST_CAPABILITY                         10
> >
> >  #define cxl_reg_block(pdev, map)                                               \
> >         ((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-01  5:39   ` Dan Williams
@ 2021-11-01 22:56     ` Ben Widawsky
  2021-11-02  1:45       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-01 22:56 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-31 22:39:43, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > The CXL drivers encapsulate the components that direct memory traffic in
> > an entity known as a cxl_port. Compute Express Link specifies three such
> > components: hostbridge (ie. a collection of root ports), switches, and
> > endpoints. There are currently drivers that create these ports for the
> > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > introduced allows callers to initiate a scan down from the hostbridge
> > and create ports for switches in the CXL topology.
> >
> > The intended user of this API is for endpoint devices. An endpoint
> > device will need to determine if it is CXL.mem capable, which requires
> > all components in the path from hostbridge to the endpoint to be CXL.mem
> > capable. Once an endpoint device determines it's connected to a CXL
> > capable root port, it can call this API to fill in all the ports in
> > between the hostbridge and itself.
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  .../driver-api/cxl/memory-devices.rst         |   6 +
> >  drivers/cxl/core/Makefile                     |   1 +
> >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> >  drivers/cxl/cxl.h                             |   2 +
> >  drivers/cxl/pci.h                             |   6 +
> >  drivers/cxl/port.c                            |   2 +-
> >  tools/testing/cxl/Kbuild                      |   1 +
> >  8 files changed, 261 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/cxl/core/pci.c
> >
> > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > index fbf0393cdddc..547336c95593 100644
> > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > @@ -47,6 +47,12 @@ CXL Core
> >  .. kernel-doc:: drivers/cxl/core/bus.c
> >     :identifiers:
> >
> > +.. kernel-doc:: drivers/cxl/core/pci.c
> > +   :doc: cxl pci
> > +
> > +.. kernel-doc:: drivers/cxl/core/pci.c
> > +   :identifiers:
> > +
> >  .. kernel-doc:: drivers/cxl/core/pmem.c
> >     :doc: cxl pmem
> >
> > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > --- a/drivers/cxl/core/Makefile
> > +++ b/drivers/cxl/core/Makefile
> > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> >  cxl_core-y += regs.o
> >  cxl_core-y += memdev.o
> >  cxl_core-y += mbox.o
> > +cxl_core-y += pci.o
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index c7e1894d503b..f10e7d5b22a4 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/idr.h>
> >  #include <cxlmem.h>
> >  #include <cxl.h>
> > +#include <pci.h>
> >  #include "core.h"
> >
> >  /**
> > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> >  }
> >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> >
> > +void devm_cxl_remove_port(struct cxl_port *port)
> > +{
> > +       down_read(&root_host_sem);
> > +       if (cxl_root_host) {
> > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > +               devm_release_action(cxl_root_host, unregister_port, port);
> > +       }
> > +       up_read(&root_host_sem);
> > +}
> > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> 
> If the scan establishes the property that all child ports are devm
> allocated with their cxl_port-parent, and only if the cxl_port-parent
> is bound to its driver then I think we don't need to play
> devm_release_action games().
> 

We had discussed this previously. I was running into an issue when unloading
cxl_mem. I needed a way to remove the endpoint port and this was your
recommendation. Are you suggesting if the chain is set up correctly, I don't
need to do anything?

I don't remember exactly what was blowing up but I can try again after things
are properly parented.

> > +
> > +static int match_port(struct device *dev, const void *data)
> > +{
> > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > +
> > +       if (dev->type != &cxl_port_type)
> > +               return 0;
> > +
> > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > +}
> > +
> > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > +{
> > +       struct device *port_dev;
> > +
> > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > +               return NULL;
> > +
> > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > +       if (port_dev)
> > +               return to_cxl_port(port_dev);
> > +
> > +       return NULL;
> > +}
> > +
> > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > +{
> > +       struct device *dev = &pdev->dev;
> > +       struct cxl_port *parent_port;
> > +       struct cxl_register_map map;
> > +       struct cxl_port *port;
> > +       int rc;
> > +
> > +       /*
> > +        * Upstream ports must be connected to a downstream port or root port.
> > +        * That downstream or root port must have a parent.
> > +        */
> > +       if (!pdev->dev.parent->parent)
> > +               return -ENXIO;
> > +
> > +       /* A port is useless if there are no component registers */
> > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > +       if (rc)
> > +               return rc;
> > +
> > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> 
> This deref chain is unreadable. It wants a helper if it stays, but I
> can't immediately think of a reason to ever need to look at a
> grandparent in the hierarchy.

The goal is to be able to find the next PCIe port up in the chain.

My understanding was:
pdev = PCIe upstream switch
pdev->dev.parent = PCIe downstream switch connected to pdev.
pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent

I was unable to find an idiomatic way to do that. I'm open to suggestions.

> 
> > +       if (!parent_port)
> > +               return -ENODEV;
> > +
> > +       port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);
> 
> This is broken because the pci device being used here does not have a
> driver that knows about CXL bus events.

I don't understand this, but I'd like to. Doesn't this make a port device which
gets probed by the port driver? Why does the PCI device matter?

(I'll mention again, switch code is not tested).

> 
> > +       put_device(&parent_port->dev);
> > +       if (IS_ERR(port))
> > +               dev_err(dev, "Failed to add upstream port %ld\n",
> > +                       PTR_ERR(port));
> > +       else
> > +               dev_dbg(dev, "Added CXL port\n");
> > +
> > +       return rc;
> > +}
> > +
> > +static int add_downstream_port(struct pci_dev *pdev)
> > +{
> > +       resource_size_t creg = CXL_RESOURCE_NONE;
> > +       struct device *dev = &pdev->dev;
> > +       struct cxl_port *parent_port;
> > +       struct cxl_register_map map;
> > +       u32 lnkcap, port_num;
> > +       int rc;
> > +
> > +       /*
> > +        * Ports are to be scanned from top down. Therefore, the upstream port
> > +        * must already exist.
> > +        */
> > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
> > +       if (!parent_port)
> > +               return -ENODEV;
> > +
> > +       /*
> > +        * The spec mandates component registers are present but the
> > +        * driver does not.
> 
> What is this trying to convey?
> 

That I'm not validating the hardware, and even though component registers are
mandatory, the driver will move on even if they're not found. This functionality
may need to change in the future and so I left the comment there.

> > +        */
> > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > +       if (!rc)
> > +               creg = cxl_reg_block(pdev, &map);
> > +
> > +       if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> > +                                 &lnkcap) != PCIBIOS_SUCCESSFUL)
> > +               return 1;
> > +       port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > +
> > +       rc = cxl_add_dport(parent_port, dev, port_num, creg, false);
> > +       put_device(&parent_port->dev);
> 
> What get_device() is this paired with?
> 

find_cxl_port() holds a reference. I had a comment somewhere to that effect but it
seems to have disappeared. Since it sounds like this code might be changing, I
won't fix that yet.

> > +       if (rc)
> > +               dev_err(dev, "Failed to add downstream port to %s\n",
> > +                       dev_name(&parent_port->dev));
> > +       else
> > +               dev_dbg(dev, "Added downstream port to %s\n",
> > +                       dev_name(&parent_port->dev));
> > +
> > +       return rc;
> > +}
> > +
> > +static int match_add_ports(struct pci_dev *pdev, void *data)
> > +{
> > +       struct device *dev = &pdev->dev;
> > +       struct device *host = data;
> > +       int rc;
> > +
> > +       /* This port has already been added... */
> > +       if (find_cxl_port(pdev))
> > +               return 0;
> > +
> > +       if (is_cxl_switch_usp((dev)))
> > +               rc = add_upstream_port(host, pdev);
> > +
> > +       if (is_cxl_switch_dsp((dev)))
> > +               rc = add_downstream_port(pdev);
> > +
> > +       return rc;
> > +}
> > +
> > +/**
> > + * cxl_scan_ports() - Adds all ports for the subtree beginning with @dport
> > + * @dport: Beginning node of the CXL topology
> > + */
> > +void cxl_scan_ports(struct cxl_dport *dport)
> > +{
> > +       struct device *d = dport->dport;
> > +       struct pci_dev *pdev = to_pci_dev(d);
> > +
> > +       pci_walk_bus(pdev->bus, match_add_ports, &dport->port->dev);
> > +}
> > +EXPORT_SYMBOL_GPL(cxl_scan_ports);
> > +
> >  static struct cxl_dport *find_dport(struct cxl_port *port, int id)
> >  {
> >         struct cxl_dport *dport;
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > new file mode 100644
> > index 000000000000..c0cbe984c778
> > --- /dev/null
> > +++ b/drivers/cxl/core/pci.c
> > @@ -0,0 +1,99 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */
> > +#include <linux/device.h>
> > +#include <linux/pci.h>
> > +#include <pci.h>
> > +
> > +/**
> > + * DOC: cxl pci
> > + *
> > + * Compute Express Link protocols are layered on top of PCIe. CXL core provides
> > + * a set of helpers for CXL interactions which occur via PCIe.
> > + */
> > +
> > +/**
> > + * is_cxl_mem_enabled() - Does the device understand CXL.mem protocol
> > + * @pdev: The PCI device for which to determine CXL enablement
> > + *
> > + * This is the most discrete determination as to whether a device supports
> > + * CXL.mem protocol. At a minimum, a CXL device must advertise it is capable of
> > + * negotiating the CXL.mem protocol while operating in Flex Bus.CXL mode. There
> > + * are other determining factors as to whether CXL.mem protocol is supported in
> > + * the path from root port to endpoint. Those other factors require a more
> > + * comprehensive survey of the CXL topology and would use is_cxl_mem_enabled()
> > + * as a cursory check.
> > + *
> > + * If the PCI device is enabled for CXL.mem protocol return true; otherwise
> > + * return false.
> > + *
> > + * TODO: Is there other architecturally visible state that can be used to infer
> > + *       CXL.mem protocol support?
> 
> I was expecting that cxl_port->dev.driver != NULL is the CXL subsystem
> source of truth.
> 
> > + */
> > +bool is_cxl_mem_enabled(struct pci_dev *pdev)
> > +{
> > +       int pcie_dvsec;
> > +       u16 dvsec_ctrl;
> > +
> > +       pcie_dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> > +                                              CXL_DVSEC_PCIE_DEVICE);
> > +       if (!pcie_dvsec) {
> > +               dev_info(&pdev->dev,
> > +                        "Unable to determine CXL protocol support");
> > +               return false;
> > +       }
> > +
> > +       pci_read_config_word(pdev,
> > +                            pcie_dvsec + DVSEC_PCIE_DEVICE_CONTROL_OFFSET,
> > +                            &dvsec_ctrl);
> > +       if (!(dvsec_ctrl & DVSEC_PCIE_DEVICE_MEM_ENABLE)) {
> > +               dev_info(&pdev->dev, "CXL.mem protocol not enabled on device");
> > +               return false;
> > +       }
> 
> Per above, doesn't the port driver already check this?
> 
> > +
> > +       return true;
> > +}
> > +EXPORT_SYMBOL_GPL(is_cxl_mem_enabled);
> 
> Also per above, can't this stay internal to the port driver?
> 

At one point I think there was another consumer, so it got put here. I can be
internal to the port driver.

> > +
> > +/**
> > + * is_cxl_switch_usp() - Is the device a CXL.mem enabled switch
> > + * @dev: Device to query for switch type
> > + *
> > + * If the device is a CXL.mem capable upstream switch port return true;
> > + * otherwise return false.
> > + */
> > +bool is_cxl_switch_usp(struct device *dev)
> > +{
> > +       struct pci_dev *pdev;
> > +
> > +       if (!dev_is_pci(dev))
> > +               return false;
> > +
> > +       pdev = to_pci_dev(dev);
> > +
> > +       return pci_is_pcie(pdev) &&
> > +              pci_pcie_type(pdev) == PCI_EXP_TYPE_UPSTREAM &&
> > +              is_cxl_mem_enabled(pdev);
> > +}
> > +EXPORT_SYMBOL_GPL(is_cxl_switch_usp);
> 
> Who uses this api outside of the core?
> 
> > +
> > +/**
> > + * is_cxl_switch_dsp() - Is the device a CXL.mem enabled switch
> > + * @dev: Device to query for switch type
> > + *
> > + * If the device is a CXL.mem capable downstream switch port return true;
> > + * otherwise return false.
> > + */
> > +bool is_cxl_switch_dsp(struct device *dev)
> > +{
> > +       struct pci_dev *pdev;
> > +
> > +       if (!dev_is_pci(dev))
> > +               return false;
> > +
> > +       pdev = to_pci_dev(dev);
> > +
> > +       return pci_is_pcie(pdev) &&
> > +              pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM &&
> > +              is_cxl_mem_enabled(pdev);
> > +}
> > +EXPORT_SYMBOL_GPL(is_cxl_switch_dsp);
> 
> Same question... at a minimum wait to add exports until there is a user.
> 

Okay. It gets used by the mem driver, but I can remove the export here.

> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 419c2e2db6f0..03b414462416 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -309,6 +309,8 @@ struct cxl_port *to_cxl_port(struct device *dev);
> >  struct cxl_port *devm_cxl_add_port(struct device *uport,
> >                                    resource_size_t component_reg_phys,
> >                                    struct cxl_port *parent_port);
> > +void devm_cxl_remove_port(struct cxl_port *port);
> > +void cxl_scan_ports(struct cxl_dport *root_port);
> >
> >  int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
> >                   resource_size_t component_reg_phys, bool root_port);
> > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > index fe2898b17736..9d6ca77d3e14 100644
> > --- a/drivers/cxl/pci.h
> > +++ b/drivers/cxl/pci.h
> > @@ -15,6 +15,8 @@
> >
> >  /* 8.1.3: PCIe DVSEC for CXL Device */
> >  #define CXL_DVSEC_PCIE_DEVICE                                  0
> > +#define   DVSEC_PCIE_DEVICE_CONTROL_OFFSET                     0xC
> > +#define     DVSEC_PCIE_DEVICE_MEM_ENABLE                       BIT(2)
> >
> >  /* 8.1.4: Non-CXL Function Map DVSEC */
> >  #define CXL_DVSEC_FUNCTION_MAP                                 2
> > @@ -57,4 +59,8 @@ enum cxl_regloc_type {
> >         ((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \
> >                            (map)->block_offset))
> >
> > +bool is_cxl_switch_usp(struct device *dev);
> > +bool is_cxl_switch_dsp(struct device *dev);
> > +bool is_cxl_mem_enabled(struct pci_dev *pdev);
> > +
> >  #endif /* __CXL_PCI_H__ */
> > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> > index ebbfb72ae995..3ddfd7673a56 100644
> > --- a/drivers/cxl/port.c
> > +++ b/drivers/cxl/port.c
> > @@ -170,7 +170,7 @@ static int enumerate_hdm_decoders(struct cxl_port *port,
> >                 if (rc)
> >                         put_device(&cxld->dev);
> >                 else
> > -                       rc = cxl_decoder_autoremove(port->uport->parent, cxld);
> > +                       rc = cxl_decoder_autoremove(&port->dev, cxld);
> 
> Looks like a hunk that got added to the wrong patch.

Yep

> 
> >                 if (rc)
> >                         dev_err(&port->dev, "Failed to add decoder\n");
> >         }
> > diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
> > index 86deba8308a1..46db4dd345a0 100644
> > --- a/tools/testing/cxl/Kbuild
> > +++ b/tools/testing/cxl/Kbuild
> > @@ -31,6 +31,7 @@ cxl_core-y += $(CXL_CORE_SRC)/pmem.o
> >  cxl_core-y += $(CXL_CORE_SRC)/regs.o
> >  cxl_core-y += $(CXL_CORE_SRC)/memdev.o
> >  cxl_core-y += $(CXL_CORE_SRC)/mbox.o
> > +cxl_core-y += $(CXL_CORE_SRC)/pci.o
> >  cxl_core-y += config_check.o
> >
> >  cxl_core-y += mock_pmem.o
> > --
> > 2.33.1
> >

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-01 22:56     ` Ben Widawsky
@ 2021-11-02  1:45       ` Dan Williams
  2021-11-02 16:39         ` Ben Widawsky
  2021-11-16 16:50         ` Ben Widawsky
  0 siblings, 2 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02  1:45 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 1, 2021 at 3:56 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-31 22:39:43, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > The CXL drivers encapsulate the components that direct memory traffic in
> > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > components: hostbridge (ie. a collection of root ports), switches, and
> > > endpoints. There are currently drivers that create these ports for the
> > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > introduced allows callers to initiate a scan down from the hostbridge
> > > and create ports for switches in the CXL topology.
> > >
> > > The intended user of this API is for endpoint devices. An endpoint
> > > device will need to determine if it is CXL.mem capable, which requires
> > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > capable. Once an endpoint device determines it's connected to a CXL
> > > capable root port, it can call this API to fill in all the ports in
> > > between the hostbridge and itself.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > >  drivers/cxl/core/Makefile                     |   1 +
> > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > >  drivers/cxl/cxl.h                             |   2 +
> > >  drivers/cxl/pci.h                             |   6 +
> > >  drivers/cxl/port.c                            |   2 +-
> > >  tools/testing/cxl/Kbuild                      |   1 +
> > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > >  create mode 100644 drivers/cxl/core/pci.c
> > >
> > > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > > index fbf0393cdddc..547336c95593 100644
> > > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > > @@ -47,6 +47,12 @@ CXL Core
> > >  .. kernel-doc:: drivers/cxl/core/bus.c
> > >     :identifiers:
> > >
> > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > +   :doc: cxl pci
> > > +
> > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > +   :identifiers:
> > > +
> > >  .. kernel-doc:: drivers/cxl/core/pmem.c
> > >     :doc: cxl pmem
> > >
> > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > > --- a/drivers/cxl/core/Makefile
> > > +++ b/drivers/cxl/core/Makefile
> > > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> > >  cxl_core-y += regs.o
> > >  cxl_core-y += memdev.o
> > >  cxl_core-y += mbox.o
> > > +cxl_core-y += pci.o
> > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > index c7e1894d503b..f10e7d5b22a4 100644
> > > --- a/drivers/cxl/core/bus.c
> > > +++ b/drivers/cxl/core/bus.c
> > > @@ -8,6 +8,7 @@
> > >  #include <linux/idr.h>
> > >  #include <cxlmem.h>
> > >  #include <cxl.h>
> > > +#include <pci.h>
> > >  #include "core.h"
> > >
> > >  /**
> > > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> > >  }
> > >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> > >
> > > +void devm_cxl_remove_port(struct cxl_port *port)
> > > +{
> > > +       down_read(&root_host_sem);
> > > +       if (cxl_root_host) {
> > > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > > +               devm_release_action(cxl_root_host, unregister_port, port);
> > > +       }
> > > +       up_read(&root_host_sem);
> > > +}
> > > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> >
> > If the scan establishes the property that all child ports are devm
> > allocated with their cxl_port-parent, and only if the cxl_port-parent
> > is bound to its driver then I think we don't need to play
> > devm_release_action games().
> >
>
> We had discussed this previously. I was running into an issue when unloading
> cxl_mem. I needed a way to remove the endpoint port and this was your
> recommendation. Are you suggesting if the chain is set up correctly, I don't
> need to do anything?

I think if the chain is set up correctly then you don't need to do
anything special. The endpoint port would be devm registered by the
cxl_memdev driver to its parent cxl_port provided that port is
actively attached to its driver.

> I don't remember exactly what was blowing up but I can try again after things
> are properly parented.

Cool.

>
> > > +
> > > +static int match_port(struct device *dev, const void *data)
> > > +{
> > > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > > +
> > > +       if (dev->type != &cxl_port_type)
> > > +               return 0;
> > > +
> > > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > > +}
> > > +
> > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > +{
> > > +       struct device *port_dev;
> > > +
> > > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > +               return NULL;
> > > +
> > > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > +       if (port_dev)
> > > +               return to_cxl_port(port_dev);
> > > +
> > > +       return NULL;
> > > +}
> > > +
> > > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > > +{
> > > +       struct device *dev = &pdev->dev;
> > > +       struct cxl_port *parent_port;
> > > +       struct cxl_register_map map;
> > > +       struct cxl_port *port;
> > > +       int rc;
> > > +
> > > +       /*
> > > +        * Upstream ports must be connected to a downstream port or root port.
> > > +        * That downstream or root port must have a parent.
> > > +        */
> > > +       if (!pdev->dev.parent->parent)
> > > +               return -ENXIO;
> > > +
> > > +       /* A port is useless if there are no component registers */
> > > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > > +       if (rc)
> > > +               return rc;
> > > +
> > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> >
> > This deref chain is unreadable. It wants a helper if it stays, but I
> > can't immediately think of a reason to ever need to look at a
> > grandparent in the hierarchy.
>
> The goal is to be able to find the next PCIe port up in the chain.
>
> My understanding was:
> pdev = PCIe upstream switch
> pdev->dev.parent = PCIe downstream switch connected to pdev.
> pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent
>
> I was unable to find an idiomatic way to do that. I'm open to suggestions.

Oh ok, I see it now, but I think this can be done in pure CXL terms
and generic devices with the assumption that the parent device of a
cxl_memdev must be a dport. Then this works whether the parent port is
a platform device like ACPI or cxl_test, or a PCIe device.

static int port_has_dport(struct device *dev, const void *dport_dev)
{
        int found = 0;
        struct cxl_port *port;
        struct cxl_dport *dport;

        if (dev->type != &cxl_port_type)
                return 0;
        port = to_cxl_port(dev);

        device_lock(&port->dev);
        list_for_each_entry (dport, &port->dports, list)
                if (dport->dport == dport_dev) {
                        found = 1;
                        break;
                }
        device_unlock(&port->dev);

        return found;
}

struct cxl_port *find_parent_cxl_port(struct cxl_memdev *cxlmd)
{
        return bus_find_device(&cxl_bus_type, NULL, cxlmd->dev.parent,
                               port_has_dport);
}

> >
> > > +       if (!parent_port)
> > > +               return -ENODEV;
> > > +
> > > +       port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);
> >
> > This is broken because the pci device being used here does not have a
> > driver that knows about CXL bus events.
>
> I don't understand this, but I'd like to. Doesn't this make a port device which
> gets probed by the port driver? Why does the PCI device matter?

I am reacting to the first argument of this call being @dev that came
from the pci_dev that was passed in to be the "host" for the devm
operation. The devm release action triggers at driver unbind of that
host device, but that doesn't make sense because the driver for a
switch has nothing to do with CXL operation.

>
> (I'll mention again, switch code is not tested).
>
> >
> > > +       put_device(&parent_port->dev);
> > > +       if (IS_ERR(port))
> > > +               dev_err(dev, "Failed to add upstream port %ld\n",
> > > +                       PTR_ERR(port));
> > > +       else
> > > +               dev_dbg(dev, "Added CXL port\n");
> > > +
> > > +       return rc;
> > > +}
> > > +
> > > +static int add_downstream_port(struct pci_dev *pdev)
> > > +{
> > > +       resource_size_t creg = CXL_RESOURCE_NONE;
> > > +       struct device *dev = &pdev->dev;
> > > +       struct cxl_port *parent_port;
> > > +       struct cxl_register_map map;
> > > +       u32 lnkcap, port_num;
> > > +       int rc;
> > > +
> > > +       /*
> > > +        * Ports are to be scanned from top down. Therefore, the upstream port
> > > +        * must already exist.
> > > +        */
> > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
> > > +       if (!parent_port)
> > > +               return -ENODEV;
> > > +
> > > +       /*
> > > +        * The spec mandates component registers are present but the
> > > +        * driver does not.
> >
> > What is this trying to convey?
> >
>
> That I'm not validating the hardware, and even though component registers are
> mandatory, the driver will move on even if they're not found. This functionality
> may need to change in the future and so I left the comment there.

I think that could be conveyed without comment with something like:

        if (rc)
                creg = CXL_RESOURCE_NONE;
        else
                creg = cxl_reg_block(pdev, &map);

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-11-01 22:00     ` Ben Widawsky
@ 2021-11-02  1:53       ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02  1:53 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 1, 2021 at 3:00 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-31 13:18:36, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > Get a better naming scheme in place for upcoming additions. To solidify
> > > the schema, add all the DVSEC identifiers to start with.
> >
> > The title and this changelog don't give anything of substance to
> > review the patch.
> >
> > This also looks like a rename and addition of more definitions. The
> > rename has one rationale, the additional definitions have a different
> > one, so split those into 2 patches, or fold the additions into the
> > patch that uses them.
> >
>
> I added more than necessary as a means to codify the naming scheme [as stated].
> Many of them are not used. I can split this patch into two, though I personally
> don't find it offensive to do it as one. Before I do that, I'd like to know
> though if you're going to reject the patch if I'm not actually using all of the
> defines later on.
>

They don't all need to be used, but a rationale for adding them is the request.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion
  2021-11-01 21:45       ` Ben Widawsky
@ 2021-11-02  1:56         ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02  1:56 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 1, 2021 at 2:45 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-01 11:56:02, Ben Widawsky wrote:
> > On 21-10-31 12:25:32, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > Ensure that devices being probed before cxl_acpi has completed will get
> > > > a second chance.
> > >
> > > I think this is at the wrong level... more below.
> > > >
> > > > CXL drivers are brought up through two enumerable, asynchronous
> > > > mechanism. The leaf nodes in the CXL topology, endpoints, are enumerated
> > > > via PCI headers. The root node's enumeration is platform specific. The
> > > > current defacto mechanism for enumerating the root node is through the
> > > > presence of an ACPI device, ACPI0017.
> > > >
> > > > The primary job of a cxl_mem driver is to determine if CXL.mem traffic
> > > > can be routed to/from the PCIe device that it is being probed. A
> > > > prerequisite in this determination is that all CXL components in the
> > > > path from root to leaf are capable of routing CXL.mem traffic. If the
> > > > cxl_mem driver is probed before cxl_acpi is complete the driver will be
> > > > unable to make this determination. To address this, cxl_acpi (or in the
> > > > future, another platform specific driver) will rescan all devices to
> > > > make sure the ordering is correct.
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > >  drivers/cxl/acpi.c | 20 +++++++++++++++++++-
> > > >  1 file changed, 19 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > > index 625c5d95b83f..1cc3a74c16bd 100644
> > > > --- a/drivers/cxl/acpi.c
> > > > +++ b/drivers/cxl/acpi.c
> > > > @@ -451,6 +451,14 @@ static u32 cedt_instance(struct platform_device *pdev)
> > > >         return U32_MAX;
> > > >  }
> > > >
> > > > +static void bus_rescan(struct work_struct *work)
> > > > +{
> > > > +       if (bus_rescan_devices(&cxl_bus_type))
> > > > +               pr_err("Failed to rescan CXL bus\n");
> > > > +}
> > > > +
> > > > +static DECLARE_WORK(deferred_bus_rescan, bus_rescan);
> > > > +
> > > >  static int cxl_acpi_probe(struct platform_device *pdev)
> > > >  {
> > > >         int rc;
> > > > @@ -484,9 +492,19 @@ static int cxl_acpi_probe(struct platform_device *pdev)
> > > >         if (rc)
> > > >                 goto out;
> > > >
> > > > -       if (IS_ENABLED(CONFIG_CXL_PMEM))
> > > > +       if (IS_ENABLED(CONFIG_CXL_PMEM)) {
> > > >                 rc = device_for_each_child(&root_port->dev, root_port,
> > > >                                            add_root_nvdimm_bridge);
> > > > +               if (rc)
> > > > +                       goto out;
> > > > +       }
> > > > +
> > > > +       /*
> > > > +        * While ACPI is scanning hostbridge ports, switches and memory devices
> > > > +        * may have been probed. Those devices will need to know whether the
> > > > +        * hostbridge is CXL capable.
> > > > +        */
> > > > +       schedule_work(&deferred_bus_rescan);
> > >
> > > I think this belongs in port driver similar to the one in
> > > drivers/cxl/pmem.c, because any port online event might mean that a
> > > downstream port can now attach to the port driver. I prefer a local
> > > ordered workqueue via queue_work() rather than the global unordered
> > > workqueue primarily because it can be flushed without getting
> > > entangled with other random work in the system, but also because the
> > > ordered property is useful for comprehended hierarchical topology
> > > events.
> >
> > Makes sense. I was trying to limit the number of rescans that took place and I
> > believe I convinced myself that the current patch does work properly. However,
> > architecturally it makes more sense for the port driver to own this and
> > realistically I assume many rescans will coalesce.
> >
>
> Oh. I assume you want this rolled into (cxl/port: Introduce a port driver),
> correct?

Yeah, otherwise the port driver is broken, right?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 10/28] cxl/core: Store global list of root ports
  2021-11-01 18:43     ` Ben Widawsky
@ 2021-11-02  2:04       ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02  2:04 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 1, 2021 at 11:43 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-31 11:32:45, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > CXL root ports (the downstream port to a host bridge) are to be
> > > enumerated by a platform specific driver. In the case of ACPI compliant
> > > systems, this is like the cxl_acpi driver. Root ports are the first
> > > CXL spec defined component that can be "found" by that platform specific
> > > driver.
> > >
> > > By storing a list of these root ports components in lower levels of the
> > > topology (switches and endpoints), have a mechanism to walk up their
> > > device hierarchy to find an enumerated root port. This will be necessary
> > > for region programming.
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  drivers/cxl/acpi.c            |  4 ++--
> > >  drivers/cxl/core/bus.c        | 34 +++++++++++++++++++++++++++++++++-
> > >  drivers/cxl/cxl.h             |  5 ++++-
> > >  tools/testing/cxl/mock_acpi.c |  4 ++--
> > >  4 files changed, 41 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > index 8cca0814dfb8..625c5d95b83f 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -231,7 +231,7 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
> > >                 creg = cxl_reg_block(pdev, &map);
> > >
> > >         port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > > -       rc = cxl_add_dport(port, &pdev->dev, port_num, creg);
> > > +       rc = cxl_add_dport(port, &pdev->dev, port_num, creg, true);
> > >         if (rc) {
> > >                 ctx->error = rc;
> > >                 return rc;
> > > @@ -406,7 +406,7 @@ static int add_host_bridge_dport(struct device *match, void *arg)
> > >                 dev_dbg(host, "No CHBS found for Host Bridge: %s\n",
> > >                         dev_name(match));
> > >
> > > -       rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs));
> > > +       rc = cxl_add_dport(root_port, match, uid, get_chbcr(chbs), false);
> > >         if (rc) {
> > >                 dev_err(host, "failed to add downstream port: %s\n",
> > >                         dev_name(match));
> > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > index dffbd0ac64af..03394a3ae75f 100644
> > > --- a/drivers/cxl/core/bus.c
> > > +++ b/drivers/cxl/core/bus.c
> > > @@ -25,6 +25,8 @@
> > >   */
> > >
> > >  static DEFINE_IDA(cxl_port_ida);
> > > +static LIST_HEAD(cxl_root_ports);
> > > +static DECLARE_RWSEM(root_port_sem);
> >
> > I don't see a need for this new list and lock...
> >
> > >
> > >  static ssize_t devtype_show(struct device *dev, struct device_attribute *attr,
> > >                             char *buf)
> > > @@ -268,12 +270,31 @@ struct cxl_port *to_cxl_port(struct device *dev)
> > >  }
> > >  EXPORT_SYMBOL_GPL(to_cxl_port);
> > >
> > > +struct cxl_dport *cxl_get_root_dport(struct device *dev)
> > > +{
> > > +       struct cxl_dport *ret = NULL;
> > > +       struct cxl_dport *dport;
> > > +
> > > +       down_read(&root_port_sem);
> > > +       list_for_each_entry(dport, &cxl_root_ports, root_port_link) {
> > > +               if (dport->dport == dev) {
> > > +                       ret = dport;
> > > +                       break;
> > > +               }
> > > +       }
> > > +
> > > +       up_read(&root_port_sem);
> > > +       return ret;
> > > +}
> > > +EXPORT_SYMBOL_GPL(cxl_get_root_dport);
> >
> > This can be done by walking the existing topology:
> >
> > struct cxl_dport *cxl_get_root_dport(struct device *dev)
> > {
> >         struct device *host = get_cxl_topology_host();
> >         struct cxl_dport *dport, *found = NULL;
> >         struct cxl_port *port;
> >         struct device *root;
> >
> >         if (!host)
> >                 return NULL;
> >
> >         root = device_find_child(host, &root, match_cxl_root_port);
> >         if (!root)
> >                 goto out;
> >         device_lock(root);
> >         port = to_cxl_port(root);
> >         list_for_each_entry (dport, &port->dports, list)
> >                 if (dport->dport == dev) {
> >                         found = dport;
> >                         break;
> >                 }
> >         device_unlock(root);
> >         put_device(root);
> >
> > out:
> >         put_cxl_topology_host(host);
> >         return found;
> > }
> >
> >
> > It occurs to me after writing this that device_lock() for iterating
> > dports can be offloaded to the topology rwsem.
> >
>
> When I originally wrote this function, I didn't have a pointer to the root host.
> Also, you said previously we'd remove the root host. What would you like me to
> do?

Root device retrieval is useful for these scans. However, the role of
overall "devm host" can be removed with careful port-child devm
registration.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-01 17:07     ` Ben Widawsky
@ 2021-11-02  2:15       ` Dan Williams
  2021-11-02 16:31         ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-02  2:15 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 1, 2021 at 10:08 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-31 11:03:37, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > Now that the port driver exists and is able to do proper decoder
> > > enumeration of the component registers, it becomes trivial to use that
> >
> > This is the second occurrence of "becomes trivial" in this series,
> > it's distracting because it immediately sets off alarm bells that the
> > changelog is not in:
> >
> > Background
> > Problem
> > Solution
> >
> > ...format.
>
> Let me make sure I understand.
>
> Background: Now that the port driver exists...
> Problem: host bridge uports can use the port driver [but don't yet]
> Solution: This patch (the description is indeed missing).
>
> Is your point that I didn't document a solution, or something else?

Why do single port host bridge registers need to be mapped?

Why does the cxl_acpi driver need to do it and not the just introduced
port driver?

This patch seems to be saying, map these because they can be mapped.

>
>
> >
> > > for host bridge uports. For reasons out of scope, a functional change
> > > would be visible if the HDM decoder was programmed by BIOS to values
> > > other than the full address range. Similarly if a type2 device was
> > > connected to this root port and programmed by BIOS, that can now be
> > > acted upon accordingly.
> >
> > I would reserve discussion of "no functional change" for patches that
> > are pure cleanup. In this case this patch will cause the kernel to
> > behave differently when other conditions are met.
>
> True. I will remove that.
>
> >
> > >
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  drivers/cxl/acpi.c | 25 ++++++++++++++++++++++++-
> > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > index d61397055e9f..8cca0814dfb8 100644
> > > --- a/drivers/cxl/acpi.c
> > > +++ b/drivers/cxl/acpi.c
> > > @@ -280,12 +280,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > >         struct cxl_port *root_port = arg;
> > >         struct device *host = root_port->dev.parent;
> > >         struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> > > +       struct cxl_component_reg_map map;
> > >         struct acpi_pci_root *pci_root;
> > >         struct cxl_walk_context ctx;
> > >         int single_port_map[1], rc;
> > >         struct cxl_decoder *cxld;
> > >         struct cxl_dport *dport;
> > >         struct cxl_port *port;
> > > +       void __iomem *crb;
> > >
> > >         if (!bridge)
> > >                 return 0;
> > > @@ -318,10 +320,31 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > >                 return -ENODEV;
> > >         if (ctx.error)
> > >                 return ctx.error;
> > > +       /*
> > > +        * If the host bridge has more than 1 root port, it must have registers
> > > +        * controlling the HDM decoders. Those will be enumerated by the port
> > > +        * driver.
> > > +        */
> > >         if (ctx.count > 1)
> > >                 return 0;
> > >
> > > -       /* TODO: Scan CHBCR for HDM Decoder resources */
> > > +       /*
> > > +        * If the single ported host bridge has a component register block,
> > > +        * simply let the port driver handle the decoder enumeration.
> > > +        *
> > > +        * Host bridge component registers live in the system's physical address
> > > +        * space.
> > > +        */
> > > +       crb = ioremap(dport->component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE);
> >
> > This looks broken, the driver for the port should be mapping dport
> > component registers to offer services to upper layers.
> >
> > There's also a missing iounmap.
> >
>
> Where's the missing iounmap?

Oops, sorry, don't know why I missed that. Forgive the noise.

> The port driver does what you say, but I need a way
> to shortcircuit the case where the root port doesn't have component registers
> which you've previously documented as allowed by spec. How would you recommend
> doing that?

Perhaps cxl_port_probe() needs to special case the single dport case
and just say, "no registers needed, single port == HDM passthrough".

Until there's a need to look at non-HDM registers I'd hold off on
enabling this case. Probably that comes soon when considering IDE
support, but no need to pre-enable that yet.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-01 17:53     ` Ben Widawsky
  2021-11-01 17:54       ` Ben Widawsky
@ 2021-11-02  3:31       ` Dan Williams
  2021-11-02 16:27         ` Ben Widawsky
  2021-11-02 16:58         ` Ben Widawsky
  1 sibling, 2 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02  3:31 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-29 18:37:36, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
>
> [snip]
>
> > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > index cf07ae6cea17..40b386aaedf7 100644
> > > --- a/drivers/cxl/Makefile
> > > +++ b/drivers/cxl/Makefile
> > > @@ -1,5 +1,6 @@
> > >  # SPDX-License-Identifier: GPL-2.0
> > >  obj-$(CONFIG_CXL_BUS) += core/
> > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> >
> > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> >
>
> Can you help me understand when CONFIG_CXL_MEM is useful when
> #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> together.

With a 'select' dependency it's impossible to have the
CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
symbol is for idiomatic (one config-symbol per module .ko) reasons to
reflect the module dependency in the Kconfig.

[..]
> > > +static inline int cxl_hdm_decoder_ig(u32 ctrl)
> >
> > No need for plain inline in C files.
> >
> > It's not clear why this simple helper needs a "cxl_hdm_decoder"
> > namespace prefix?
>
> I had a patch to share this with acpi driver at one point, but I dropped it. Do
> you care if I merge those two decoders, or just rename?

Not sure what you mean by "merge"?

>
> >
> > > +{
> > > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> > > +
> > > +       return 8 + val;
> > > +}
> >
> > Why is this return a power of 2 value...
>
> I don't understand this comment.

Isn't this returning an IG as a bit-shift value? The "iw" is returning
bytes and I'm proposing they both be bytes.

>
> >
> > > +
> > > +static inline int cxl_hdm_decoder_iw(u32 ctrl)
> > > +{
> > > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> > > +
> > > +       return 1 << val;
> >
> > ...while this one is converted to absolute values.
> >
> > These could just be:
> >
> > unsigned int to_interleave_granularity(u32 ctrl)
> > unsigned int to_interleave_ways(u32 ctrl)
> >
> > ...and return units in bytes.
> >
> > > +}
> > > +
> > > +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> > > +{
> > > +       void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> > > +       struct port_caps *caps = &cpd->caps;
> > > +       u32 hdm_cap;
> > > +
> > > +       hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> > > +
> > > +       caps->count = cxl_hdm_decoder_count(hdm_cap);
> > > +       caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> > > +       caps->interleave11_8 =
> > > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> > > +       caps->interleave14_12 =
> > > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> > > +}
> > > +
> > > +static int map_regs(struct cxl_port *port, void __iomem *crb,
> > > +                   struct cxl_port_data *cpd)
> > > +{
> > > +       struct cxl_register_map map;
> > > +       struct cxl_component_reg_map *comp_map = &map.component_map;
> > > +
> > > +       cxl_probe_component_regs(&port->dev, crb, comp_map);
> > > +       if (!comp_map->hdm_decoder.valid) {
> > > +               dev_err(&port->dev, "HDM decoder registers invalid\n");
> > > +               return -ENXIO;
> > > +       }
> >
> > Perhaps promote cxl_probe_regs() from the cxl_pci to the core and make
> > it take a dev instead of a pdev, then you can do:
> >
> > cxl_probe_regs(&port_dev->dev, CXL_REGLOC_RBI_COMPONENT)
> >
> > ...instead of open coding it again?
> >
> > > +
> > > +       cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> > > +{
> > > +       u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> > > +
> > > +       if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
> > > +               return 0;
> > > +
> > > +       return ioread64_hi_lo(hdm_decoder +
> > > +                             CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> > > +}
> > > +
> > > +static bool is_endpoint_port(struct cxl_port *port)
> > > +{
> > > +       if (!port->uport->driver)
> > > +               return false;
> > > +
> > > +       return to_cxl_drv(port->uport->driver)->id ==
> > > +              CXL_DEVICE_MEMORY_EXPANDER;
> >
> > Why does endpoint port device type determination need to reach through
> > and read the driver type?
> >
>
> I couldn't figure out a better way at this point in enumeration. I'm open to
> suggestions.

list_empty(&port->dports)?


>
> [snip]
>
> > > +
> > > +       /*
> > > +        * Enable HDM decoders for this port.
> > > +        *
> > > +        * FIXME: If the component was using DVSEC range registers for decode,
> > > +        * this will destroy that.
> >
> > Yeah, definitely need to check that before this patch can move
> > forward. Perhaps a port should not even be registered if DVSEC
> > Memory_Size && Mem_Enable are non zero, that device is explicitly
> > opting out of being a part of the CXL 2.0 subsystem hierarchy.
> > However, we might still need to track it and potentially reserve it
> > out of CFMWS capacity to make sure nothing else collides with it. I'll
> > also note that "ECN: Devices operating in CXL 1.1 mode with no RCRB"
> > was recently published that reads on what the driver should do here.
> >
>
> I believe we want to create the port since we might decide to reset and want
> control back over it and as you said, safety check other things. The reason it
> was left as a FIXME is because this belongs in the PCI driver which I didn't
> really want to touch at this time. I will go back and add that for the next
> version.

Ok.

>
> > > +        */
> > > +       ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > +       ctrl |= CXL_HDM_DECODER_ENABLE;
> > > +       writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> >
> > I feel like that if the driver finds it enabled it should leave it
> > enabled at ->remove() time, as you have it here, as BIOS might not be
> > expecting the OS to disable a decoder it set up. However, if the
> > driver actually does the enable, then it should pair it with a disable
> > at the end of time, so not a blind enable, but one that conditionally
> > arranges for the unwind.
>
> My thought was that once we enumerate a port, all of it's architectural state
> belongs to the OS. For us that means blanket enable/disable. I don't feel
> strongly about this.

I think the driver needs to tread carefully when it comes to potential
BIOS interactions. The minimum BIOS collision would be to not toggle
the enable off if the OS cannot assert that it set it. A more precise
policy would be checking if the device contributes to any EFI memory
map ranges, or any locked CFMWS entries.

[..]
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 5e2e93451928..bd7b008207a4 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -302,6 +302,10 @@ struct cxl_decoder *cxl_decoder_alloc(struct
> > cxl_port *port, int nr_targets);
> >  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> >  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> >
> > +void set_cxl_topology_host(struct device *dev);
> > +struct device *get_cxl_topology_host(void);
> > +void put_cxl_topology_host(struct device *dev);
> > +
> >  extern struct bus_type cxl_bus_type;
> >
> >  struct cxl_driver {
>
> As you've seen by now I do implement something like that you have below in a
> later patch. I believe it will still be nice to have, but I haven't read through
> all of your feedback yet. So I might change my mind.

The main difference I proposed is a {get,put}_cxl_topology_host() to
wrap topology operations in a rwsem.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-02  3:31       ` Dan Williams
@ 2021-11-02 16:27         ` Ben Widawsky
  2021-11-02 17:21           ` Dan Williams
  2021-11-02 16:58         ` Ben Widawsky
  1 sibling, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 16:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 20:31:03, Dan Williams wrote:
> On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-10-29 18:37:36, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> >
> > [snip]
> >
> > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > --- a/drivers/cxl/Makefile
> > > > +++ b/drivers/cxl/Makefile
> > > > @@ -1,5 +1,6 @@
> > > >  # SPDX-License-Identifier: GPL-2.0
> > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > >
> > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > >
> >
> > Can you help me understand when CONFIG_CXL_MEM is useful when
> > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > together.
> 
> With a 'select' dependency it's impossible to have the
> CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> symbol is for idiomatic (one config-symbol per module .ko) reasons to
> reflect the module dependency in the Kconfig.

Can't argue with idiomisms ;-)

> 
> [..]
> > > > +static inline int cxl_hdm_decoder_ig(u32 ctrl)
> > >
> > > No need for plain inline in C files.
> > >
> > > It's not clear why this simple helper needs a "cxl_hdm_decoder"
> > > namespace prefix?
> >
> > I had a patch to share this with acpi driver at one point, but I dropped it. Do
> > you care if I merge those two decoders, or just rename?
> 
> Not sure what you mean by "merge"?
> 

To have a unified function for both the cxl_acpi driver and the cxl_port driver.

> >
> > >
> > > > +{
> > > > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> > > > +
> > > > +       return 8 + val;
> > > > +}
> > >
> > > Why is this return a power of 2 value...
> >
> > I don't understand this comment.
> 
> Isn't this returning an IG as a bit-shift value? The "iw" is returning
> bytes and I'm proposing they both be bytes.
> 

IG is definitely wrong since it's documented in struct cxl_decoder as data
stride. I will fix that. I don't follow what you mean by "iw" is returning
bytes. It's returning the number of ways.

> >
> > >
> > > > +
> > > > +static inline int cxl_hdm_decoder_iw(u32 ctrl)
> > > > +{
> > > > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> > > > +
> > > > +       return 1 << val;
> > >
> > > ...while this one is converted to absolute values.
> > >
> > > These could just be:
> > >
> > > unsigned int to_interleave_granularity(u32 ctrl)
> > > unsigned int to_interleave_ways(u32 ctrl)
> > >
> > > ...and return units in bytes.
> > >
> > > > +}
> > > > +
> > > > +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> > > > +{
> > > > +       void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> > > > +       struct port_caps *caps = &cpd->caps;
> > > > +       u32 hdm_cap;
> > > > +
> > > > +       hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> > > > +
> > > > +       caps->count = cxl_hdm_decoder_count(hdm_cap);
> > > > +       caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> > > > +       caps->interleave11_8 =
> > > > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> > > > +       caps->interleave14_12 =
> > > > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> > > > +}
> > > > +
> > > > +static int map_regs(struct cxl_port *port, void __iomem *crb,
> > > > +                   struct cxl_port_data *cpd)
> > > > +{
> > > > +       struct cxl_register_map map;
> > > > +       struct cxl_component_reg_map *comp_map = &map.component_map;
> > > > +
> > > > +       cxl_probe_component_regs(&port->dev, crb, comp_map);
> > > > +       if (!comp_map->hdm_decoder.valid) {
> > > > +               dev_err(&port->dev, "HDM decoder registers invalid\n");
> > > > +               return -ENXIO;
> > > > +       }
> > >
> > > Perhaps promote cxl_probe_regs() from the cxl_pci to the core and make
> > > it take a dev instead of a pdev, then you can do:
> > >
> > > cxl_probe_regs(&port_dev->dev, CXL_REGLOC_RBI_COMPONENT)
> > >
> > > ...instead of open coding it again?
> > >
> > > > +
> > > > +       cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> > > > +{
> > > > +       u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> > > > +
> > > > +       if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
> > > > +               return 0;
> > > > +
> > > > +       return ioread64_hi_lo(hdm_decoder +
> > > > +                             CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> > > > +}
> > > > +
> > > > +static bool is_endpoint_port(struct cxl_port *port)
> > > > +{
> > > > +       if (!port->uport->driver)
> > > > +               return false;
> > > > +
> > > > +       return to_cxl_drv(port->uport->driver)->id ==
> > > > +              CXL_DEVICE_MEMORY_EXPANDER;
> > >
> > > Why does endpoint port device type determination need to reach through
> > > and read the driver type?
> > >
> >
> > I couldn't figure out a better way at this point in enumeration. I'm open to
> > suggestions.
> 
> list_empty(&port->dports)?
> 

That seems obvious enough. In the v2 series it's possible due to failures for a
switch port to also have list_empty(&port->dports). Should ports not get added
if they have 0 dports (unless they're endpoints)?

> 
> >
> > [snip]
> >
> > > > +
> > > > +       /*
> > > > +        * Enable HDM decoders for this port.
> > > > +        *
> > > > +        * FIXME: If the component was using DVSEC range registers for decode,
> > > > +        * this will destroy that.
> > >
> > > Yeah, definitely need to check that before this patch can move
> > > forward. Perhaps a port should not even be registered if DVSEC
> > > Memory_Size && Mem_Enable are non zero, that device is explicitly
> > > opting out of being a part of the CXL 2.0 subsystem hierarchy.
> > > However, we might still need to track it and potentially reserve it
> > > out of CFMWS capacity to make sure nothing else collides with it. I'll
> > > also note that "ECN: Devices operating in CXL 1.1 mode with no RCRB"
> > > was recently published that reads on what the driver should do here.
> > >
> >
> > I believe we want to create the port since we might decide to reset and want
> > control back over it and as you said, safety check other things. The reason it
> > was left as a FIXME is because this belongs in the PCI driver which I didn't
> > really want to touch at this time. I will go back and add that for the next
> > version.
> 
> Ok.
> 
> >
> > > > +        */
> > > > +       ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > > +       ctrl |= CXL_HDM_DECODER_ENABLE;
> > > > +       writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > >
> > > I feel like that if the driver finds it enabled it should leave it
> > > enabled at ->remove() time, as you have it here, as BIOS might not be
> > > expecting the OS to disable a decoder it set up. However, if the
> > > driver actually does the enable, then it should pair it with a disable
> > > at the end of time, so not a blind enable, but one that conditionally
> > > arranges for the unwind.
> >
> > My thought was that once we enumerate a port, all of it's architectural state
> > belongs to the OS. For us that means blanket enable/disable. I don't feel
> > strongly about this.
> 
> I think the driver needs to tread carefully when it comes to potential
> BIOS interactions. The minimum BIOS collision would be to not toggle
> the enable off if the OS cannot assert that it set it. A more precise
> policy would be checking if the device contributes to any EFI memory
> map ranges, or any locked CFMWS entries.

Locked == bit 4, fixed, right? The more precise policy makes sense to me, though
I still wonder what happens if we reset a device for that case?

I'll work on the more precise version.

> 
> [..]
> > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > index 5e2e93451928..bd7b008207a4 100644
> > > --- a/drivers/cxl/cxl.h
> > > +++ b/drivers/cxl/cxl.h
> > > @@ -302,6 +302,10 @@ struct cxl_decoder *cxl_decoder_alloc(struct
> > > cxl_port *port, int nr_targets);
> > >  int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
> > >  int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
> > >
> > > +void set_cxl_topology_host(struct device *dev);
> > > +struct device *get_cxl_topology_host(void);
> > > +void put_cxl_topology_host(struct device *dev);
> > > +
> > >  extern struct bus_type cxl_bus_type;
> > >
> > >  struct cxl_driver {
> >
> > As you've seen by now I do implement something like that you have below in a
> > later patch. I believe it will still be nice to have, but I haven't read through
> > all of your feedback yet. So I might change my mind.
> 
> The main difference I proposed is a {get,put}_cxl_topology_host() to
> wrap topology operations in a rwsem.

Makes sense. I needed a topology lock eventually anyway.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02  2:15       ` Dan Williams
@ 2021-11-02 16:31         ` Ben Widawsky
  2021-11-02 17:46           ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 16:31 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 19:15:33, Dan Williams wrote:
> On Mon, Nov 1, 2021 at 10:08 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-10-31 11:03:37, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > Now that the port driver exists and is able to do proper decoder
> > > > enumeration of the component registers, it becomes trivial to use that
> > >
> > > This is the second occurrence of "becomes trivial" in this series,
> > > it's distracting because it immediately sets off alarm bells that the
> > > changelog is not in:
> > >
> > > Background
> > > Problem
> > > Solution
> > >
> > > ...format.
> >
> > Let me make sure I understand.
> >
> > Background: Now that the port driver exists...
> > Problem: host bridge uports can use the port driver [but don't yet]
> > Solution: This patch (the description is indeed missing).
> >
> > Is your point that I didn't document a solution, or something else?
> 
> Why do single port host bridge registers need to be mapped?
> 
> Why does the cxl_acpi driver need to do it and not the just introduced
> port driver?
> 
> This patch seems to be saying, map these because they can be mapped.
> 
> >
> >
> > >
> > > > for host bridge uports. For reasons out of scope, a functional change
> > > > would be visible if the HDM decoder was programmed by BIOS to values
> > > > other than the full address range. Similarly if a type2 device was
> > > > connected to this root port and programmed by BIOS, that can now be
> > > > acted upon accordingly.
> > >
> > > I would reserve discussion of "no functional change" for patches that
> > > are pure cleanup. In this case this patch will cause the kernel to
> > > behave differently when other conditions are met.
> >
> > True. I will remove that.
> >
> > >
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > >  drivers/cxl/acpi.c | 25 ++++++++++++++++++++++++-
> > > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> > > > index d61397055e9f..8cca0814dfb8 100644
> > > > --- a/drivers/cxl/acpi.c
> > > > +++ b/drivers/cxl/acpi.c
> > > > @@ -280,12 +280,14 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > > >         struct cxl_port *root_port = arg;
> > > >         struct device *host = root_port->dev.parent;
> > > >         struct acpi_device *bridge = to_cxl_host_bridge(host, match);
> > > > +       struct cxl_component_reg_map map;
> > > >         struct acpi_pci_root *pci_root;
> > > >         struct cxl_walk_context ctx;
> > > >         int single_port_map[1], rc;
> > > >         struct cxl_decoder *cxld;
> > > >         struct cxl_dport *dport;
> > > >         struct cxl_port *port;
> > > > +       void __iomem *crb;
> > > >
> > > >         if (!bridge)
> > > >                 return 0;
> > > > @@ -318,10 +320,31 @@ static int add_host_bridge_uport(struct device *match, void *arg)
> > > >                 return -ENODEV;
> > > >         if (ctx.error)
> > > >                 return ctx.error;
> > > > +       /*
> > > > +        * If the host bridge has more than 1 root port, it must have registers
> > > > +        * controlling the HDM decoders. Those will be enumerated by the port
> > > > +        * driver.
> > > > +        */
> > > >         if (ctx.count > 1)
> > > >                 return 0;
> > > >
> > > > -       /* TODO: Scan CHBCR for HDM Decoder resources */
> > > > +       /*
> > > > +        * If the single ported host bridge has a component register block,
> > > > +        * simply let the port driver handle the decoder enumeration.
> > > > +        *
> > > > +        * Host bridge component registers live in the system's physical address
> > > > +        * space.
> > > > +        */
> > > > +       crb = ioremap(dport->component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE);
> > >
> > > This looks broken, the driver for the port should be mapping dport
> > > component registers to offer services to upper layers.
> > >
> > > There's also a missing iounmap.
> > >
> >
> > Where's the missing iounmap?
> 
> Oops, sorry, don't know why I missed that. Forgive the noise.
> 
> > The port driver does what you say, but I need a way
> > to shortcircuit the case where the root port doesn't have component registers
> > which you've previously documented as allowed by spec. How would you recommend
> > doing that?
> 
> Perhaps cxl_port_probe() needs to special case the single dport case
> and just say, "no registers needed, single port == HDM passthrough".
> 
> Until there's a need to look at non-HDM registers I'd hold off on
> enabling this case. Probably that comes soon when considering IDE
> support, but no need to pre-enable that yet.

It seemed like a very simple thing to support given the port driver's existence
so it was added to remove a TODO. However, I will drop it as you request.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-02  1:45       ` Dan Williams
@ 2021-11-02 16:39         ` Ben Widawsky
  2021-11-02 20:00           ` Dan Williams
  2021-11-16 16:50         ` Ben Widawsky
  1 sibling, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 16:39 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 18:45:57, Dan Williams wrote:
> On Mon, Nov 1, 2021 at 3:56 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-10-31 22:39:43, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > The CXL drivers encapsulate the components that direct memory traffic in
> > > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > > components: hostbridge (ie. a collection of root ports), switches, and
> > > > endpoints. There are currently drivers that create these ports for the
> > > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > > introduced allows callers to initiate a scan down from the hostbridge
> > > > and create ports for switches in the CXL topology.
> > > >
> > > > The intended user of this API is for endpoint devices. An endpoint
> > > > device will need to determine if it is CXL.mem capable, which requires
> > > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > > capable. Once an endpoint device determines it's connected to a CXL
> > > > capable root port, it can call this API to fill in all the ports in
> > > > between the hostbridge and itself.
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > > >  drivers/cxl/core/Makefile                     |   1 +
> > > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > > >  drivers/cxl/cxl.h                             |   2 +
> > > >  drivers/cxl/pci.h                             |   6 +
> > > >  drivers/cxl/port.c                            |   2 +-
> > > >  tools/testing/cxl/Kbuild                      |   1 +
> > > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > > >  create mode 100644 drivers/cxl/core/pci.c
> > > >
> > > > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > > > index fbf0393cdddc..547336c95593 100644
> > > > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > > > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > > > @@ -47,6 +47,12 @@ CXL Core
> > > >  .. kernel-doc:: drivers/cxl/core/bus.c
> > > >     :identifiers:
> > > >
> > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > +   :doc: cxl pci
> > > > +
> > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > +   :identifiers:
> > > > +
> > > >  .. kernel-doc:: drivers/cxl/core/pmem.c
> > > >     :doc: cxl pmem
> > > >
> > > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > > > --- a/drivers/cxl/core/Makefile
> > > > +++ b/drivers/cxl/core/Makefile
> > > > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> > > >  cxl_core-y += regs.o
> > > >  cxl_core-y += memdev.o
> > > >  cxl_core-y += mbox.o
> > > > +cxl_core-y += pci.o
> > > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > > index c7e1894d503b..f10e7d5b22a4 100644
> > > > --- a/drivers/cxl/core/bus.c
> > > > +++ b/drivers/cxl/core/bus.c
> > > > @@ -8,6 +8,7 @@
> > > >  #include <linux/idr.h>
> > > >  #include <cxlmem.h>
> > > >  #include <cxl.h>
> > > > +#include <pci.h>
> > > >  #include "core.h"
> > > >
> > > >  /**
> > > > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> > > >
> > > > +void devm_cxl_remove_port(struct cxl_port *port)
> > > > +{
> > > > +       down_read(&root_host_sem);
> > > > +       if (cxl_root_host) {
> > > > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > > > +               devm_release_action(cxl_root_host, unregister_port, port);
> > > > +       }
> > > > +       up_read(&root_host_sem);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> > >
> > > If the scan establishes the property that all child ports are devm
> > > allocated with their cxl_port-parent, and only if the cxl_port-parent
> > > is bound to its driver then I think we don't need to play
> > > devm_release_action games().
> > >
> >
> > We had discussed this previously. I was running into an issue when unloading
> > cxl_mem. I needed a way to remove the endpoint port and this was your
> > recommendation. Are you suggesting if the chain is set up correctly, I don't
> > need to do anything?
> 
> I think if the chain is set up correctly then you don't need to do
> anything special. The endpoint port would be devm registered by the
> cxl_memdev driver to its parent cxl_port provided that port is
> actively attached to its driver.
> 
> > I don't remember exactly what was blowing up but I can try again after things
> > are properly parented.
> 
> Cool.
> 
> >
> > > > +
> > > > +static int match_port(struct device *dev, const void *data)
> > > > +{
> > > > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > > > +
> > > > +       if (dev->type != &cxl_port_type)
> > > > +               return 0;
> > > > +
> > > > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > > > +}
> > > > +
> > > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > > +{
> > > > +       struct device *port_dev;
> > > > +
> > > > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > > +               return NULL;
> > > > +
> > > > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > > +       if (port_dev)
> > > > +               return to_cxl_port(port_dev);
> > > > +
> > > > +       return NULL;
> > > > +}
> > > > +
> > > > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > > > +{
> > > > +       struct device *dev = &pdev->dev;
> > > > +       struct cxl_port *parent_port;
> > > > +       struct cxl_register_map map;
> > > > +       struct cxl_port *port;
> > > > +       int rc;
> > > > +
> > > > +       /*
> > > > +        * Upstream ports must be connected to a downstream port or root port.
> > > > +        * That downstream or root port must have a parent.
> > > > +        */
> > > > +       if (!pdev->dev.parent->parent)
> > > > +               return -ENXIO;
> > > > +
> > > > +       /* A port is useless if there are no component registers */
> > > > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > > > +       if (rc)
> > > > +               return rc;
> > > > +
> > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> > >
> > > This deref chain is unreadable. It wants a helper if it stays, but I
> > > can't immediately think of a reason to ever need to look at a
> > > grandparent in the hierarchy.
> >
> > The goal is to be able to find the next PCIe port up in the chain.
> >
> > My understanding was:
> > pdev = PCIe upstream switch
> > pdev->dev.parent = PCIe downstream switch connected to pdev.
> > pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent
> >
> > I was unable to find an idiomatic way to do that. I'm open to suggestions.
> 
> Oh ok, I see it now, but I think this can be done in pure CXL terms
> and generic devices with the assumption that the parent device of a
> cxl_memdev must be a dport. Then this works whether the parent port is
> a platform device like ACPI or cxl_test, or a PCIe device.
> 
> static int port_has_dport(struct device *dev, const void *dport_dev)
> {
>         int found = 0;
>         struct cxl_port *port;
>         struct cxl_dport *dport;
> 
>         if (dev->type != &cxl_port_type)
>                 return 0;
>         port = to_cxl_port(dev);
> 
>         device_lock(&port->dev);
>         list_for_each_entry (dport, &port->dports, list)
>                 if (dport->dport == dport_dev) {
>                         found = 1;
>                         break;
>                 }
>         device_unlock(&port->dev);
> 
>         return found;
> }
> 
> struct cxl_port *find_parent_cxl_port(struct cxl_memdev *cxlmd)
> {
>         return bus_find_device(&cxl_bus_type, NULL, cxlmd->dev.parent,
>                                port_has_dport);
> }
> 

Ah, I remember now that I had something like this originally. I thought it may
be more desirable to do the grandparent route since you should be able to assert
there is a grandparent. I'll change it.

> > >
> > > > +       if (!parent_port)
> > > > +               return -ENODEV;
> > > > +
> > > > +       port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);
> > >
> > > This is broken because the pci device being used here does not have a
> > > driver that knows about CXL bus events.
> >
> > I don't understand this, but I'd like to. Doesn't this make a port device which
> > gets probed by the port driver? Why does the PCI device matter?
> 
> I am reacting to the first argument of this call being @dev that came
> from the pci_dev that was passed in to be the "host" for the devm
> operation. The devm release action triggers at driver unbind of that
> host device, but that doesn't make sense because the driver for a
> switch has nothing to do with CXL operation.
> 

What's the correct host, parent_port->dev?

> >
> > (I'll mention again, switch code is not tested).
> >
> > >
> > > > +       put_device(&parent_port->dev);
> > > > +       if (IS_ERR(port))
> > > > +               dev_err(dev, "Failed to add upstream port %ld\n",
> > > > +                       PTR_ERR(port));
> > > > +       else
> > > > +               dev_dbg(dev, "Added CXL port\n");
> > > > +
> > > > +       return rc;
> > > > +}
> > > > +
> > > > +static int add_downstream_port(struct pci_dev *pdev)
> > > > +{
> > > > +       resource_size_t creg = CXL_RESOURCE_NONE;
> > > > +       struct device *dev = &pdev->dev;
> > > > +       struct cxl_port *parent_port;
> > > > +       struct cxl_register_map map;
> > > > +       u32 lnkcap, port_num;
> > > > +       int rc;
> > > > +
> > > > +       /*
> > > > +        * Ports are to be scanned from top down. Therefore, the upstream port
> > > > +        * must already exist.
> > > > +        */
> > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
> > > > +       if (!parent_port)
> > > > +               return -ENODEV;
> > > > +
> > > > +       /*
> > > > +        * The spec mandates component registers are present but the
> > > > +        * driver does not.
> > >
> > > What is this trying to convey?
> > >
> >
> > That I'm not validating the hardware, and even though component registers are
> > mandatory, the driver will move on even if they're not found. This functionality
> > may need to change in the future and so I left the comment there.
> 
> I think that could be conveyed without comment with something like:
> 
>         if (rc)
>                 creg = CXL_RESOURCE_NONE;
>         else
>                 creg = cxl_reg_block(pdev, &map);

As you like.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-02  3:31       ` Dan Williams
  2021-11-02 16:27         ` Ben Widawsky
@ 2021-11-02 16:58         ` Ben Widawsky
  2021-11-04 19:10           ` Dan Williams
  1 sibling, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 16:58 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 20:31:03, Dan Williams wrote:
> On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-10-29 18:37:36, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> >
> > [snip]
> >
> > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > --- a/drivers/cxl/Makefile
> > > > +++ b/drivers/cxl/Makefile
> > > > @@ -1,5 +1,6 @@
> > > >  # SPDX-License-Identifier: GPL-2.0
> > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > >
> > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > >
> >
> > Can you help me understand when CONFIG_CXL_MEM is useful when
> > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > together.
> 
> With a 'select' dependency it's impossible to have the
> CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> symbol is for idiomatic (one config-symbol per module .ko) reasons to
> reflect the module dependency in the Kconfig.
> 

Should CXL_MEM also select CXL_PCI? I don't see a way CXL_MEM can work without
CXL_PCI.

[snip]


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-02 16:27         ` Ben Widawsky
@ 2021-11-02 17:21           ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02 17:21 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 9:27 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-01 20:31:03, Dan Williams wrote:
> > On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > >
> > > [snip]
> > >
> > > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > > --- a/drivers/cxl/Makefile
> > > > > +++ b/drivers/cxl/Makefile
> > > > > @@ -1,5 +1,6 @@
> > > > >  # SPDX-License-Identifier: GPL-2.0
> > > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > > >
> > > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > > >
> > >
> > > Can you help me understand when CONFIG_CXL_MEM is useful when
> > > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > > together.
> >
> > With a 'select' dependency it's impossible to have the
> > CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> > symbol is for idiomatic (one config-symbol per module .ko) reasons to
> > reflect the module dependency in the Kconfig.
>
> Can't argue with idiomisms ;-)

It's idiomatic because it's extensible. I expect type-2 drivers to
select port services as well.

>
> >
> > [..]
> > > > > +static inline int cxl_hdm_decoder_ig(u32 ctrl)
> > > >
> > > > No need for plain inline in C files.
> > > >
> > > > It's not clear why this simple helper needs a "cxl_hdm_decoder"
> > > > namespace prefix?
> > >
> > > I had a patch to share this with acpi driver at one point, but I dropped it. Do
> > > you care if I merge those two decoders, or just rename?
> >
> > Not sure what you mean by "merge"?
> >
>
> To have a unified function for both the cxl_acpi driver and the cxl_port driver.

Ah, yeah, CFMWS_INTERLEAVE_WAYS and CFMWS_INTERLEAVE_GRANULARITY are
doing the same work.

>
> > >
> > > >
> > > > > +{
> > > > > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl);
> > > > > +
> > > > > +       return 8 + val;
> > > > > +}
> > > >
> > > > Why is this return a power of 2 value...
> > >
> > > I don't understand this comment.
> >
> > Isn't this returning an IG as a bit-shift value? The "iw" is returning
> > bytes and I'm proposing they both be bytes.
> >
>
> IG is definitely wrong since it's documented in struct cxl_decoder as data
> stride. I will fix that. I don't follow what you mean by "iw" is returning
> bytes. It's returning the number of ways.

Sorry, some noise from me there. I should have said integer instead of
encoded value.

CFMWS_INTERLEAVE_GRANULARITY has the same problem (another indicator
that a unit test for all user facing values is needed to catch simple
bugs like this).

>
> > >
> > > >
> > > > > +
> > > > > +static inline int cxl_hdm_decoder_iw(u32 ctrl)
> > > > > +{
> > > > > +       int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl);
> > > > > +
> > > > > +       return 1 << val;
> > > >
> > > > ...while this one is converted to absolute values.
> > > >
> > > > These could just be:
> > > >
> > > > unsigned int to_interleave_granularity(u32 ctrl)
> > > > unsigned int to_interleave_ways(u32 ctrl)
> > > >
> > > > ...and return units in bytes.
> > > >
> > > > > +}
> > > > > +
> > > > > +static void get_caps(struct cxl_port *port, struct cxl_port_data *cpd)
> > > > > +{
> > > > > +       void __iomem *hdm_decoder = cpd->regs.hdm_decoder;
> > > > > +       struct port_caps *caps = &cpd->caps;
> > > > > +       u32 hdm_cap;
> > > > > +
> > > > > +       hdm_cap = readl(hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> > > > > +
> > > > > +       caps->count = cxl_hdm_decoder_count(hdm_cap);
> > > > > +       caps->tc = FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
> > > > > +       caps->interleave11_8 =
> > > > > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap);
> > > > > +       caps->interleave14_12 =
> > > > > +               FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_14_12, hdm_cap);
> > > > > +}
> > > > > +
> > > > > +static int map_regs(struct cxl_port *port, void __iomem *crb,
> > > > > +                   struct cxl_port_data *cpd)
> > > > > +{
> > > > > +       struct cxl_register_map map;
> > > > > +       struct cxl_component_reg_map *comp_map = &map.component_map;
> > > > > +
> > > > > +       cxl_probe_component_regs(&port->dev, crb, comp_map);
> > > > > +       if (!comp_map->hdm_decoder.valid) {
> > > > > +               dev_err(&port->dev, "HDM decoder registers invalid\n");
> > > > > +               return -ENXIO;
> > > > > +       }
> > > >
> > > > Perhaps promote cxl_probe_regs() from the cxl_pci to the core and make
> > > > it take a dev instead of a pdev, then you can do:
> > > >
> > > > cxl_probe_regs(&port_dev->dev, CXL_REGLOC_RBI_COMPONENT)
> > > >
> > > > ...instead of open coding it again?
> > > >
> > > > > +
> > > > > +       cpd->regs.hdm_decoder = crb + comp_map->hdm_decoder.offset;
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> > > > > +{
> > > > > +       u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> > > > > +
> > > > > +       if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
> > > > > +               return 0;
> > > > > +
> > > > > +       return ioread64_hi_lo(hdm_decoder +
> > > > > +                             CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> > > > > +}
> > > > > +
> > > > > +static bool is_endpoint_port(struct cxl_port *port)
> > > > > +{
> > > > > +       if (!port->uport->driver)
> > > > > +               return false;
> > > > > +
> > > > > +       return to_cxl_drv(port->uport->driver)->id ==
> > > > > +              CXL_DEVICE_MEMORY_EXPANDER;
> > > >
> > > > Why does endpoint port device type determination need to reach through
> > > > and read the driver type?
> > > >
> > >
> > > I couldn't figure out a better way at this point in enumeration. I'm open to
> > > suggestions.
> >
> > list_empty(&port->dports)?
> >
>
> That seems obvious enough. In the v2 series it's possible due to failures for a
> switch port to also have list_empty(&port->dports). Should ports not get added
> if they have 0 dports (unless they're endpoints)?

Sounds reasonable, I can't imagine a usable non-endpoint port that has
no dports.

>
> >
> > >
> > > [snip]
> > >
> > > > > +
> > > > > +       /*
> > > > > +        * Enable HDM decoders for this port.
> > > > > +        *
> > > > > +        * FIXME: If the component was using DVSEC range registers for decode,
> > > > > +        * this will destroy that.
> > > >
> > > > Yeah, definitely need to check that before this patch can move
> > > > forward. Perhaps a port should not even be registered if DVSEC
> > > > Memory_Size && Mem_Enable are non zero, that device is explicitly
> > > > opting out of being a part of the CXL 2.0 subsystem hierarchy.
> > > > However, we might still need to track it and potentially reserve it
> > > > out of CFMWS capacity to make sure nothing else collides with it. I'll
> > > > also note that "ECN: Devices operating in CXL 1.1 mode with no RCRB"
> > > > was recently published that reads on what the driver should do here.
> > > >
> > >
> > > I believe we want to create the port since we might decide to reset and want
> > > control back over it and as you said, safety check other things. The reason it
> > > was left as a FIXME is because this belongs in the PCI driver which I didn't
> > > really want to touch at this time. I will go back and add that for the next
> > > version.
> >
> > Ok.
> >
> > >
> > > > > +        */
> > > > > +       ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > > > +       ctrl |= CXL_HDM_DECODER_ENABLE;
> > > > > +       writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> > > >
> > > > I feel like that if the driver finds it enabled it should leave it
> > > > enabled at ->remove() time, as you have it here, as BIOS might not be
> > > > expecting the OS to disable a decoder it set up. However, if the
> > > > driver actually does the enable, then it should pair it with a disable
> > > > at the end of time, so not a blind enable, but one that conditionally
> > > > arranges for the unwind.
> > >
> > > My thought was that once we enumerate a port, all of it's architectural state
> > > belongs to the OS. For us that means blanket enable/disable. I don't feel
> > > strongly about this.
> >
> > I think the driver needs to tread carefully when it comes to potential
> > BIOS interactions. The minimum BIOS collision would be to not toggle
> > the enable off if the OS cannot assert that it set it. A more precise
> > policy would be checking if the device contributes to any EFI memory
> > map ranges, or any locked CFMWS entries.
>
> Locked == bit 4, fixed, right? The more precise policy makes sense to me, though
> I still wonder what happens if we reset a device for that case?

I expect it will be painful and likely crash the system if that device
is contributing to active System RAM. There is definitely some more
exclusion needed to coordinate with secondary bus reset on the PCI
side. I can imagine someone trying to passthrough a CXL device to a VM
unaware that it will trigger a reset and crash the system.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 16:31         ` Ben Widawsky
@ 2021-11-02 17:46           ` Dan Williams
  2021-11-02 17:57             ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-02 17:46 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
[..]
> It seemed like a very simple thing to support given the port driver's existence
> so it was added to remove a TODO. However, I will drop it as you request.

I'm honestly asking the question why this is needed and more
specifically why this is needed in this location?

In other words, how is this case different than typical component
register probing that is done in the port driver itself? I would
expect the TODO just gets deleted by the port driver addition.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 17:46           ` Dan Williams
@ 2021-11-02 17:57             ` Ben Widawsky
  2021-11-02 18:10               ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 17:57 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-02 10:46:41, Dan Williams wrote:
> On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> [..]
> > It seemed like a very simple thing to support given the port driver's existence
> > so it was added to remove a TODO. However, I will drop it as you request.
> 
> I'm honestly asking the question why this is needed and more
> specifically why this is needed in this location?
> 
> In other words, how is this case different than typical component
> register probing that is done in the port driver itself? I would
> expect the TODO just gets deleted by the port driver addition.

Without this code, a decoder is added for the full address range regardless if
there is an existing programmed HDM decoder. With this code when the port driver
probes this host bridge HDM component registers, everything will be enumerated
by the normal flow.

This seemed easier than trying to have the port driver determine what cxl_acpi
driver did and unwind that if there is a programmed decoder.

But now you're making me think I've missed something big...


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 17:57             ` Ben Widawsky
@ 2021-11-02 18:10               ` Dan Williams
  2021-11-02 18:27                 ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-02 18:10 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 10:57 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-02 10:46:41, Dan Williams wrote:
> > On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > [..]
> > > It seemed like a very simple thing to support given the port driver's existence
> > > so it was added to remove a TODO. However, I will drop it as you request.
> >
> > I'm honestly asking the question why this is needed and more
> > specifically why this is needed in this location?
> >
> > In other words, how is this case different than typical component
> > register probing that is done in the port driver itself? I would
> > expect the TODO just gets deleted by the port driver addition.
>
> Without this code, a decoder is added for the full address range regardless if
> there is an existing programmed HDM decoder. With this code when the port driver
> probes this host bridge HDM component registers, everything will be enumerated
> by the normal flow.
>
> This seemed easier than trying to have the port driver determine what cxl_acpi
> driver did and unwind that if there is a programmed decoder.

I think this is more an argument to unify all decoder creation in the
port driver. When the port driver sees a single dport with no HDM
decoder registers it can create the passthrough decoder, if it sees
single dport with HDM decoder registers then it can create a
programmable decoder.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 18:10               ` Dan Williams
@ 2021-11-02 18:27                 ` Ben Widawsky
  2021-11-02 18:49                   ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 18:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-02 11:10:05, Dan Williams wrote:
> On Tue, Nov 2, 2021 at 10:57 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-02 10:46:41, Dan Williams wrote:
> > > On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > [..]
> > > > It seemed like a very simple thing to support given the port driver's existence
> > > > so it was added to remove a TODO. However, I will drop it as you request.
> > >
> > > I'm honestly asking the question why this is needed and more
> > > specifically why this is needed in this location?
> > >
> > > In other words, how is this case different than typical component
> > > register probing that is done in the port driver itself? I would
> > > expect the TODO just gets deleted by the port driver addition.
> >
> > Without this code, a decoder is added for the full address range regardless if
> > there is an existing programmed HDM decoder. With this code when the port driver
> > probes this host bridge HDM component registers, everything will be enumerated
> > by the normal flow.
> >
> > This seemed easier than trying to have the port driver determine what cxl_acpi
> > driver did and unwind that if there is a programmed decoder.
> 
> I think this is more an argument to unify all decoder creation in the
> port driver. When the port driver sees a single dport with no HDM
> decoder registers it can create the passthrough decoder, if it sees
> single dport with HDM decoder registers then it can create a
> programmable decoder.

That too, though I'm not sure how useful that will be for us. I think the more
likely case is BIOS is trying to hide some device (or switch) address space.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 18:27                 ` Ben Widawsky
@ 2021-11-02 18:49                   ` Dan Williams
  2021-11-02 21:15                     ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-02 18:49 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 11:27 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-02 11:10:05, Dan Williams wrote:
> > On Tue, Nov 2, 2021 at 10:57 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-11-02 10:46:41, Dan Williams wrote:
> > > > On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > [..]
> > > > > It seemed like a very simple thing to support given the port driver's existence
> > > > > so it was added to remove a TODO. However, I will drop it as you request.
> > > >
> > > > I'm honestly asking the question why this is needed and more
> > > > specifically why this is needed in this location?
> > > >
> > > > In other words, how is this case different than typical component
> > > > register probing that is done in the port driver itself? I would
> > > > expect the TODO just gets deleted by the port driver addition.
> > >
> > > Without this code, a decoder is added for the full address range regardless if
> > > there is an existing programmed HDM decoder. With this code when the port driver
> > > probes this host bridge HDM component registers, everything will be enumerated
> > > by the normal flow.
> > >
> > > This seemed easier than trying to have the port driver determine what cxl_acpi
> > > driver did and unwind that if there is a programmed decoder.
> >
> > I think this is more an argument to unify all decoder creation in the
> > port driver. When the port driver sees a single dport with no HDM
> > decoder registers it can create the passthrough decoder, if it sees
> > single dport with HDM decoder registers then it can create a
> > programmable decoder.
>
> That too, though I'm not sure how useful that will be for us.

?

This is about the proper place to enumerate decoders. The proposal in
this patch is to do some part of the enumeration in cxl_acpi and some
part in cxl_port, I'm saying push it all to cxl_port. I.e. it was a
layering violation in retrospect to create a default passthrough
decoder from cxl_acpi.

> I think the more
> likely case is BIOS is trying to hide some device (or switch) address space.

Not sure why that is relevant for the decision about where to place
component register probing and passthrough decoder creation?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-02 16:39         ` Ben Widawsky
@ 2021-11-02 20:00           ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-02 20:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 9:39 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-01 18:45:57, Dan Williams wrote:
> > On Mon, Nov 1, 2021 at 3:56 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-10-31 22:39:43, Dan Williams wrote:
> > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > > > > The CXL drivers encapsulate the components that direct memory traffic in
> > > > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > > > components: hostbridge (ie. a collection of root ports), switches, and
> > > > > endpoints. There are currently drivers that create these ports for the
> > > > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > > > introduced allows callers to initiate a scan down from the hostbridge
> > > > > and create ports for switches in the CXL topology.
> > > > >
> > > > > The intended user of this API is for endpoint devices. An endpoint
> > > > > device will need to determine if it is CXL.mem capable, which requires
> > > > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > > > capable. Once an endpoint device determines it's connected to a CXL
> > > > > capable root port, it can call this API to fill in all the ports in
> > > > > between the hostbridge and itself.
> > > > >
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > ---
> > > > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > > > >  drivers/cxl/core/Makefile                     |   1 +
> > > > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > > > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > > > >  drivers/cxl/cxl.h                             |   2 +
> > > > >  drivers/cxl/pci.h                             |   6 +
> > > > >  drivers/cxl/port.c                            |   2 +-
> > > > >  tools/testing/cxl/Kbuild                      |   1 +
> > > > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > > > >  create mode 100644 drivers/cxl/core/pci.c
> > > > >
> > > > > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > > > > index fbf0393cdddc..547336c95593 100644
> > > > > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > > > > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > > > > @@ -47,6 +47,12 @@ CXL Core
> > > > >  .. kernel-doc:: drivers/cxl/core/bus.c
> > > > >     :identifiers:
> > > > >
> > > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > > +   :doc: cxl pci
> > > > > +
> > > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > > +   :identifiers:
> > > > > +
> > > > >  .. kernel-doc:: drivers/cxl/core/pmem.c
> > > > >     :doc: cxl pmem
> > > > >
> > > > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > > > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > > > > --- a/drivers/cxl/core/Makefile
> > > > > +++ b/drivers/cxl/core/Makefile
> > > > > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> > > > >  cxl_core-y += regs.o
> > > > >  cxl_core-y += memdev.o
> > > > >  cxl_core-y += mbox.o
> > > > > +cxl_core-y += pci.o
> > > > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > > > index c7e1894d503b..f10e7d5b22a4 100644
> > > > > --- a/drivers/cxl/core/bus.c
> > > > > +++ b/drivers/cxl/core/bus.c
> > > > > @@ -8,6 +8,7 @@
> > > > >  #include <linux/idr.h>
> > > > >  #include <cxlmem.h>
> > > > >  #include <cxl.h>
> > > > > +#include <pci.h>
> > > > >  #include "core.h"
> > > > >
> > > > >  /**
> > > > > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> > > > >
> > > > > +void devm_cxl_remove_port(struct cxl_port *port)
> > > > > +{
> > > > > +       down_read(&root_host_sem);
> > > > > +       if (cxl_root_host) {
> > > > > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > > > > +               devm_release_action(cxl_root_host, unregister_port, port);
> > > > > +       }
> > > > > +       up_read(&root_host_sem);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> > > >
> > > > If the scan establishes the property that all child ports are devm
> > > > allocated with their cxl_port-parent, and only if the cxl_port-parent
> > > > is bound to its driver then I think we don't need to play
> > > > devm_release_action games().
> > > >
> > >
> > > We had discussed this previously. I was running into an issue when unloading
> > > cxl_mem. I needed a way to remove the endpoint port and this was your
> > > recommendation. Are you suggesting if the chain is set up correctly, I don't
> > > need to do anything?
> >
> > I think if the chain is set up correctly then you don't need to do
> > anything special. The endpoint port would be devm registered by the
> > cxl_memdev driver to its parent cxl_port provided that port is
> > actively attached to its driver.
> >
> > > I don't remember exactly what was blowing up but I can try again after things
> > > are properly parented.
> >
> > Cool.
> >
> > >
> > > > > +
> > > > > +static int match_port(struct device *dev, const void *data)
> > > > > +{
> > > > > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > > > > +
> > > > > +       if (dev->type != &cxl_port_type)
> > > > > +               return 0;
> > > > > +
> > > > > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > > > > +}
> > > > > +
> > > > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > > > +{
> > > > > +       struct device *port_dev;
> > > > > +
> > > > > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > > > +               return NULL;
> > > > > +
> > > > > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > > > +       if (port_dev)
> > > > > +               return to_cxl_port(port_dev);
> > > > > +
> > > > > +       return NULL;
> > > > > +}
> > > > > +
> > > > > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > > > > +{
> > > > > +       struct device *dev = &pdev->dev;
> > > > > +       struct cxl_port *parent_port;
> > > > > +       struct cxl_register_map map;
> > > > > +       struct cxl_port *port;
> > > > > +       int rc;
> > > > > +
> > > > > +       /*
> > > > > +        * Upstream ports must be connected to a downstream port or root port.
> > > > > +        * That downstream or root port must have a parent.
> > > > > +        */
> > > > > +       if (!pdev->dev.parent->parent)
> > > > > +               return -ENXIO;
> > > > > +
> > > > > +       /* A port is useless if there are no component registers */
> > > > > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > > > > +       if (rc)
> > > > > +               return rc;
> > > > > +
> > > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> > > >
> > > > This deref chain is unreadable. It wants a helper if it stays, but I
> > > > can't immediately think of a reason to ever need to look at a
> > > > grandparent in the hierarchy.
> > >
> > > The goal is to be able to find the next PCIe port up in the chain.
> > >
> > > My understanding was:
> > > pdev = PCIe upstream switch
> > > pdev->dev.parent = PCIe downstream switch connected to pdev.
> > > pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent
> > >
> > > I was unable to find an idiomatic way to do that. I'm open to suggestions.
> >
> > Oh ok, I see it now, but I think this can be done in pure CXL terms
> > and generic devices with the assumption that the parent device of a
> > cxl_memdev must be a dport. Then this works whether the parent port is
> > a platform device like ACPI or cxl_test, or a PCIe device.
> >
> > static int port_has_dport(struct device *dev, const void *dport_dev)
> > {
> >         int found = 0;
> >         struct cxl_port *port;
> >         struct cxl_dport *dport;
> >
> >         if (dev->type != &cxl_port_type)
> >                 return 0;
> >         port = to_cxl_port(dev);
> >
> >         device_lock(&port->dev);
> >         list_for_each_entry (dport, &port->dports, list)
> >                 if (dport->dport == dport_dev) {
> >                         found = 1;
> >                         break;
> >                 }
> >         device_unlock(&port->dev);
> >
> >         return found;> > }
> >
> > struct cxl_port *find_parent_cxl_port(struct cxl_memdev *cxlmd)
> > {
> >         return bus_find_device(&cxl_bus_type, NULL, cxlmd->dev.parent,
> >                                port_has_dport);

Hmm this still needs a grandparent because cxlmd->dev.parent, is the
device that registered the memdev and that device's parent is the
dport. I need to see if cxl_test supports this property.

> > }
> >
>
> Ah, I remember now that I had something like this originally. I thought it may
> be more desirable to do the grandparent route since you should be able to assert
> there is a grandparent. I'll change it.
>
> > > >
> > > > > +       if (!parent_port)
> > > > > +               return -ENODEV;
> > > > > +
> > > > > +       port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);
> > > >
> > > > This is broken because the pci device being used here does not have a
> > > > driver that knows about CXL bus events.
> > >
> > > I don't understand this, but I'd like to. Doesn't this make a port device which
> > > gets probed by the port driver? Why does the PCI device matter?
> >
> > I am reacting to the first argument of this call being @dev that came
> > from the pci_dev that was passed in to be the "host" for the devm
> > operation. The devm release action triggers at driver unbind of that
> > host device, but that doesn't make sense because the driver for a
> > switch has nothing to do with CXL operation.
> >
>
> What's the correct host, parent_port->dev?

Yeah, provided you have the parent locked and that port is attached to
the port driver. However, I'm concerned about lock ordering... I
"think" the port driver just needs to make sure that it never takes
the parent lock in the remove() path since that's the one that will
recurse from an upper layer port ->remove() event.

>
> > >
> > > (I'll mention again, switch code is not tested).
> > >
> > > >
> > > > > +       put_device(&parent_port->dev);
> > > > > +       if (IS_ERR(port))
> > > > > +               dev_err(dev, "Failed to add upstream port %ld\n",
> > > > > +                       PTR_ERR(port));
> > > > > +       else
> > > > > +               dev_dbg(dev, "Added CXL port\n");
> > > > > +
> > > > > +       return rc;
> > > > > +}
> > > > > +
> > > > > +static int add_downstream_port(struct pci_dev *pdev)
> > > > > +{
> > > > > +       resource_size_t creg = CXL_RESOURCE_NONE;
> > > > > +       struct device *dev = &pdev->dev;
> > > > > +       struct cxl_port *parent_port;
> > > > > +       struct cxl_register_map map;
> > > > > +       u32 lnkcap, port_num;
> > > > > +       int rc;
> > > > > +
> > > > > +       /*
> > > > > +        * Ports are to be scanned from top down. Therefore, the upstream port
> > > > > +        * must already exist.
> > > > > +        */
> > > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
> > > > > +       if (!parent_port)
> > > > > +               return -ENODEV;
> > > > > +
> > > > > +       /*
> > > > > +        * The spec mandates component registers are present but the
> > > > > +        * driver does not.
> > > >
> > > > What is this trying to convey?
> > > >
> > >
> > > That I'm not validating the hardware, and even though component registers are
> > > mandatory, the driver will move on even if they're not found. This functionality
> > > may need to change in the future and so I left the comment there.
> >
> > I think that could be conveyed without comment with something like:
> >
> >         if (rc)
> >                 creg = CXL_RESOURCE_NONE;
> >         else
> >                 creg = cxl_reg_block(pdev, &map);
>
> As you like.

/me tries to resist urge for Princess Bridge reference...

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 18:49                   ` Dan Williams
@ 2021-11-02 21:15                     ` Ben Widawsky
  2021-11-02 21:34                       ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 21:15 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-02 11:49:14, Dan Williams wrote:
> On Tue, Nov 2, 2021 at 11:27 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-02 11:10:05, Dan Williams wrote:
> > > On Tue, Nov 2, 2021 at 10:57 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-11-02 10:46:41, Dan Williams wrote:
> > > > > On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > [..]
> > > > > > It seemed like a very simple thing to support given the port driver's existence
> > > > > > so it was added to remove a TODO. However, I will drop it as you request.
> > > > >
> > > > > I'm honestly asking the question why this is needed and more
> > > > > specifically why this is needed in this location?
> > > > >
> > > > > In other words, how is this case different than typical component
> > > > > register probing that is done in the port driver itself? I would
> > > > > expect the TODO just gets deleted by the port driver addition.
> > > >
> > > > Without this code, a decoder is added for the full address range regardless if
> > > > there is an existing programmed HDM decoder. With this code when the port driver
> > > > probes this host bridge HDM component registers, everything will be enumerated
> > > > by the normal flow.
> > > >
> > > > This seemed easier than trying to have the port driver determine what cxl_acpi
> > > > driver did and unwind that if there is a programmed decoder.
> > >
> > > I think this is more an argument to unify all decoder creation in the
> > > port driver. When the port driver sees a single dport with no HDM
> > > decoder registers it can create the passthrough decoder, if it sees
> > > single dport with HDM decoder registers then it can create a
> > > programmable decoder.
> >
> > That too, though I'm not sure how useful that will be for us.
> 
> ?
> 
> This is about the proper place to enumerate decoders. The proposal in
> this patch is to do some part of the enumeration in cxl_acpi and some
> part in cxl_port, I'm saying push it all to cxl_port. I.e. it was a
> layering violation in retrospect to create a default passthrough
> decoder from cxl_acpi.

I think it makes sense to have cxl_decoder_add() not called from anything but
the port driver. So long as there's a way to determine it's a CXL 2.0 Root Port,
which, it seems we'll have, it should be straight forward.

> 
> > I think the more
> > likely case is BIOS is trying to hide some device (or switch) address space.
> 
> Not sure why that is relevant for the decision about where to place
> component register probing and passthrough decoder creation?

The comment wasn't about where, it was just a response as to how there's more
motivation than just having a programmable decoder, you might find a locked
decoder that has unexpected constraints.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 21:15                     ` Ben Widawsky
@ 2021-11-02 21:34                       ` Dan Williams
  2021-11-02 21:47                         ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-02 21:34 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 2:16 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-02 11:49:14, Dan Williams wrote:
> > On Tue, Nov 2, 2021 at 11:27 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-11-02 11:10:05, Dan Williams wrote:
> > > > On Tue, Nov 2, 2021 at 10:57 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > > > > On 21-11-02 10:46:41, Dan Williams wrote:
> > > > > > On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > > [..]
> > > > > > > It seemed like a very simple thing to support given the port driver's existence
> > > > > > > so it was added to remove a TODO. However, I will drop it as you request.
> > > > > >
> > > > > > I'm honestly asking the question why this is needed and more
> > > > > > specifically why this is needed in this location?
> > > > > >
> > > > > > In other words, how is this case different than typical component
> > > > > > register probing that is done in the port driver itself? I would
> > > > > > expect the TODO just gets deleted by the port driver addition.
> > > > >
> > > > > Without this code, a decoder is added for the full address range regardless if
> > > > > there is an existing programmed HDM decoder. With this code when the port driver
> > > > > probes this host bridge HDM component registers, everything will be enumerated
> > > > > by the normal flow.
> > > > >
> > > > > This seemed easier than trying to have the port driver determine what cxl_acpi
> > > > > driver did and unwind that if there is a programmed decoder.
> > > >
> > > > I think this is more an argument to unify all decoder creation in the
> > > > port driver. When the port driver sees a single dport with no HDM
> > > > decoder registers it can create the passthrough decoder, if it sees
> > > > single dport with HDM decoder registers then it can create a
> > > > programmable decoder.
> > >
> > > That too, though I'm not sure how useful that will be for us.
> >
> > ?
> >
> > This is about the proper place to enumerate decoders. The proposal in
> > this patch is to do some part of the enumeration in cxl_acpi and some
> > part in cxl_port, I'm saying push it all to cxl_port. I.e. it was a
> > layering violation in retrospect to create a default passthrough
> > decoder from cxl_acpi.
>
> I think it makes sense to have cxl_decoder_add() not called from anything but
> the port driver. So long as there's a way to determine it's a CXL 2.0 Root Port,
> which, it seems we'll have, it should be straight forward.

Does it need to be root port aware? I expect any single ported switch
is treated the same, not that I expect single-ported switches to be a
real thing, but I don't expect our port driver would care.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers
  2021-11-02 21:34                       ` Dan Williams
@ 2021-11-02 21:47                         ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-02 21:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-02 14:34:19, Dan Williams wrote:
> On Tue, Nov 2, 2021 at 2:16 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-02 11:49:14, Dan Williams wrote:
> > > On Tue, Nov 2, 2021 at 11:27 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-11-02 11:10:05, Dan Williams wrote:
> > > > > On Tue, Nov 2, 2021 at 10:57 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > >
> > > > > > On 21-11-02 10:46:41, Dan Williams wrote:
> > > > > > > On Tue, Nov 2, 2021 at 9:31 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > > > [..]
> > > > > > > > It seemed like a very simple thing to support given the port driver's existence
> > > > > > > > so it was added to remove a TODO. However, I will drop it as you request.
> > > > > > >
> > > > > > > I'm honestly asking the question why this is needed and more
> > > > > > > specifically why this is needed in this location?
> > > > > > >
> > > > > > > In other words, how is this case different than typical component
> > > > > > > register probing that is done in the port driver itself? I would
> > > > > > > expect the TODO just gets deleted by the port driver addition.
> > > > > >
> > > > > > Without this code, a decoder is added for the full address range regardless if
> > > > > > there is an existing programmed HDM decoder. With this code when the port driver
> > > > > > probes this host bridge HDM component registers, everything will be enumerated
> > > > > > by the normal flow.
> > > > > >
> > > > > > This seemed easier than trying to have the port driver determine what cxl_acpi
> > > > > > driver did and unwind that if there is a programmed decoder.
> > > > >
> > > > > I think this is more an argument to unify all decoder creation in the
> > > > > port driver. When the port driver sees a single dport with no HDM
> > > > > decoder registers it can create the passthrough decoder, if it sees
> > > > > single dport with HDM decoder registers then it can create a
> > > > > programmable decoder.
> > > >
> > > > That too, though I'm not sure how useful that will be for us.
> > >
> > > ?
> > >
> > > This is about the proper place to enumerate decoders. The proposal in
> > > this patch is to do some part of the enumeration in cxl_acpi and some
> > > part in cxl_port, I'm saying push it all to cxl_port. I.e. it was a
> > > layering violation in retrospect to create a default passthrough
> > > decoder from cxl_acpi.
> >
> > I think it makes sense to have cxl_decoder_add() not called from anything but
> > the port driver. So long as there's a way to determine it's a CXL 2.0 Root Port,
> > which, it seems we'll have, it should be straight forward.
> 
> Does it need to be root port aware? I expect any single ported switch
> is treated the same, not that I expect single-ported switches to be a
> real thing, but I don't expect our port driver would care.

You're correct... I thought switch ports required HDM decoder registers even if
single port, but I just checked and they are the same as the host bridge. So
indeed the enumeration should only be special cased for the endpoint.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-22 18:36 ` [RFC PATCH v2 08/28] cxl/port: Introduce a port driver Ben Widawsky
  2021-10-30  1:37   ` Dan Williams
@ 2021-11-03 15:18   ` Jonathan Cameron
  1 sibling, 0 replies; 106+ messages in thread
From: Jonathan Cameron @ 2021-11-03 15:18 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On Fri, 22 Oct 2021 11:36:49 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The CXL port driver will be responsible for managing the decoder
> resources contained within the port. It will also provide APIs that
> other drivers will consume for managing these resources.
> 
> Since the port driver is responsible for instantiating new decoders, and
> it does so during probe(), a new API is needed to add decoders for
> callers which already hold the device lock of the port.
> 
> This patch has no functional change because no driver is registering new
> ports and the root ports that are already registered should be skipped.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Hi Ben, A few fairly generic comments on stuff noticed whilst
catching up with discussion around this series.

Jonathan

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 91b8fd54bc93..ad22caf9135c 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -17,6 +17,9 @@
>   * (port-driver, region-driver, nvdimm object-drivers... etc).
>   */
>  
> +/* CXL 2.0 8.2.4 CXL Component Register Layout and Definition */
> +#define CXL_COMPONENT_REG_BLOCK_SIZE SZ_64K
> +
>  /* CXL 2.0 8.2.5 CXL.cache and CXL.mem Registers*/
>  #define CXL_CM_OFFSET 0x1000
>  #define CXL_CM_CAP_HDR_OFFSET 0x0
> @@ -36,11 +39,22 @@
>  #define CXL_HDM_DECODER_CAP_OFFSET 0x0
>  #define   CXL_HDM_DECODER_COUNT_MASK GENMASK(3, 0)
>  #define   CXL_HDM_DECODER_TARGET_COUNT_MASK GENMASK(7, 4)
> -#define CXL_HDM_DECODER0_BASE_LOW_OFFSET 0x10
> -#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET 0x14
> -#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET 0x18
> -#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET 0x1c
> -#define CXL_HDM_DECODER0_CTRL_OFFSET 0x20
> +#define   CXL_HDM_DECODER_INTERLEAVE_11_8 BIT(8)
> +#define   CXL_HDM_DECODER_INTERLEAVE_14_12 BIT(9)
> +#define CXL_HDM_DECODER_CTRL_OFFSET 0x4
> +#define   CXL_HDM_DECODER_ENABLE BIT(1)
> +#define CXL_HDM_DECODER0_BASE_LOW_OFFSET(i) (0x20 * (i) + 0x10)
> +#define CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i) (0x20 * (i) + 0x14)
> +#define CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i) (0x20 * (i) + 0x18)
> +#define CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i) (0x20 * (i) + 0x1c)
> +#define CXL_HDM_DECODER0_CTRL_OFFSET(i) (0x20 * (i) + 0x20)
> +#define   CXL_HDM_DECODER0_CTRL_IG_MASK GENMASK(3, 0)
> +#define   CXL_HDM_DECODER0_CTRL_IW_MASK GENMASK(7, 4)
> +#define   CXL_HDM_DECODER0_CTRL_COMMIT BIT(9)
> +#define   CXL_HDM_DECODER0_CTRL_COMMITTED BIT(10)
> +#define   CXL_HDM_DECODER0_CTRL_TYPE BIT(12)
> +#define CXL_HDM_DECODER0_TL_LOW(i) (0x20 * (i) + 0x24)
> +#define CXL_HDM_DECODER0_TL_HIGH(i) (0x20 * (i) + 0x28)

CXL_HDM_DECODERX_TL_HIGH etc perhaps as no longer just decoder 0?


> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
> new file mode 100644
> index 000000000000..ebbfb72ae995
> --- /dev/null
> +++ b/drivers/cxl/port.c

...

> +
> +static u64 get_decoder_size(void __iomem *hdm_decoder, int n)
> +{
> +	u32 ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(n));
> +
> +	if (!!FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))

Drop the !! as doesn't do anything useful.

> +		return 0;
> +
> +	return ioread64_hi_lo(hdm_decoder +
> +			      CXL_HDM_DECODER0_SIZE_LOW_OFFSET(n));
> +}
> +
> +static bool is_endpoint_port(struct cxl_port *port)
> +{
> +	if (!port->uport->driver)
> +		return false;
> +
> +	return to_cxl_drv(port->uport->driver)->id ==
> +	       CXL_DEVICE_MEMORY_EXPANDER;
> +}
> +
> +static int enumerate_hdm_decoders(struct cxl_port *port,
> +				  struct cxl_port_data *portdata)
> +{
> +	int i = 0;

Don't init.

> +
> +	for (i = 0; i < portdata->caps.count; i++) {
> +		int iw = 1, ig = 0, rc, target_count = portdata->caps.tc;

Use some more lines for this - it'll be easier to read!

> +		void __iomem *hdm_decoder = portdata->regs.hdm_decoder;
> +		enum cxl_decoder_type type = CXL_DECODER_EXPANDER;
> +		struct resource res = DEFINE_RES_MEM(0, 0);
> +		struct cxl_decoder *cxld;
> +		int *target_map = NULL;
> +		u64 size;
> +
> +		if (is_endpoint_port(port))
> +			target_count = 0;

I'd rather just see a bool for this instead of using the value not allowed for
other cases.  Slightly more code, but no need for the comment in the previous
patch.

> +
> +		cxld = cxl_decoder_alloc(port, target_count);
> +		if (IS_ERR(cxld)) {
> +			dev_warn(&port->dev,
> +				 "Failed to allocate the decoder\n");
> +			return PTR_ERR(cxld);
> +		}
> +
> +		size = get_decoder_size(hdm_decoder, i);
> +		if (size != 0) {
> +			int temp[CXL_DECODER_MAX_INTERLEAVE];
> +			u64 target_list, base;
> +			u32 ctrl;
> +			int j;
> +
> +			target_map = temp;
> +			ctrl = readl(hdm_decoder + CXL_HDM_DECODER0_CTRL_OFFSET(i));
> +			base = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_BASE_LOW_OFFSET(i));
> +			res = (struct resource)DEFINE_RES_MEM(base, size);
> +
> +			cxld->flags = CXL_DECODER_F_EN;
> +			iw = cxl_hdm_decoder_iw(ctrl);
> +			ig = cxl_hdm_decoder_ig(ctrl);
> +
> +			if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl) == 0)
> +				type = CXL_DECODER_ACCELERATOR;
> +
> +			target_list = ioread64_hi_lo(hdm_decoder + CXL_HDM_DECODER0_TL_LOW(i));
> +			for (j = 0; j < iw; j++)
> +				target_map[j] = (target_list >> (j * 8)) & 0xff;

temp just went out of scope, and target_map is still in it...

> +		}
> +
> +		cxld->target_type = type;
> +		cxld->res = res;
> +		cxld->interleave_ways = iw;
> +		cxld->interleave_granularity = ig;
> +
> +		rc = cxl_decoder_add_locked(cxld, target_map);
> +		if (rc)
> +			put_device(&cxld->dev);
> +		else
> +			rc = cxl_decoder_autoremove(port->uport->parent, cxld);
> +		if (rc)
> +			dev_err(&port->dev, "Failed to add decoder\n");

why not return an error?

> +	}
> +
> +	return 0;
> +}
> +
> +static int cxl_port_probe(struct device *dev)
> +{
> +	struct cxl_port *port = to_cxl_port(dev);
> +	struct cxl_port_data *portdata;
> +	void __iomem *crb;
> +	u32 ctrl;
> +	int rc;
> +
> +	if (port->component_reg_phys == CXL_RESOURCE_NONE)
> +		return 0;
> +
> +	portdata = devm_kzalloc(dev, sizeof(*portdata), GFP_KERNEL);
> +	if (!portdata)
> +		return -ENOMEM;
> +
> +	crb = devm_cxl_iomap_block(&port->dev, port->component_reg_phys,
> +				   CXL_COMPONENT_REG_BLOCK_SIZE);
> +	if (IS_ERR_OR_NULL(crb)) {
> +		dev_err(&port->dev, "No component registers mapped\n");
> +		return -ENXIO;
> +	}
> +
> +	rc = map_regs(port, crb, portdata);
> +	if (rc)
> +		return rc;
> +
> +	get_caps(port, portdata);
> +	if (portdata->caps.count == 0) {
> +		dev_err(&port->dev, "Spec violation. Caps invalid\n");
> +		return -ENXIO;
> +	}
> +
> +	/*
> +	 * Enable HDM decoders for this port.
> +	 *
> +	 * FIXME: If the component was using DVSEC range registers for decode,
> +	 * this will destroy that.
> +	 */
> +	ctrl = readl(portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> +	ctrl |= CXL_HDM_DECODER_ENABLE;
> +	writel(ctrl, portdata->regs.hdm_decoder + CXL_HDM_DECODER_CTRL_OFFSET);
> +
> +	rc = enumerate_hdm_decoders(port, portdata);
> +	if (rc) {
> +		dev_err(&port->dev, "Couldn't enumerate decoders (%d)\n", rc);
> +		return rc;
> +	}
> +
> +	dev_set_drvdata(dev, portdata);
> +	return 0;
> +}
> +
> +static struct cxl_driver cxl_port_driver = {
> +	.name = "cxl_port",
> +	.probe = cxl_port_probe,
> +	.id = CXL_DEVICE_PORT,
> +};
> +module_cxl_driver(cxl_port_driver);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_IMPORT_NS(CXL);
> +MODULE_ALIAS_CXL(CXL_DEVICE_PORT);


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-10-22 18:36 ` [RFC PATCH v2 13/28] cxl: Flesh out register names Ben Widawsky
  2021-10-31 20:18   ` Dan Williams
@ 2021-11-03 15:53   ` Jonathan Cameron
  2021-11-03 16:03     ` Ben Widawsky
  1 sibling, 1 reply; 106+ messages in thread
From: Jonathan Cameron @ 2021-11-03 15:53 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On Fri, 22 Oct 2021 11:36:54 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Get a better naming scheme in place for upcoming additions. To solidify
> the schema, add all the DVSEC identifiers to start with.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> ---
> See:
> https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
> ---
>  drivers/cxl/core/regs.c | 14 ++++++++------
>  drivers/cxl/pci.h       | 38 ++++++++++++++++++++++++++++++--------
>  2 files changed, 38 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index c8ab8880b81b..b837196fbf39 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -253,9 +253,11 @@ static void cxl_decode_regblock(u32 reg_lo, u32 reg_hi,
>  				struct cxl_register_map *map)
>  {
>  	map->block_offset =
> -		((u64)reg_hi << 32) | (reg_lo & CXL_REGLOC_ADDR_MASK);
> -	map->barno = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> -	map->reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> +		((u64)reg_hi << 32) |
> +		(reg_lo & DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK);
> +	map->barno = FIELD_GET(DVSEC_REGISTER_LOCATOR_BIR_MASK, reg_lo);
> +	map->reg_type =
> +		FIELD_GET(DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK, reg_lo);
>  }
>  
>  /**
> @@ -276,15 +278,15 @@ int cxl_find_regblock(struct pci_dev *pdev, enum cxl_regloc_type type,
>  	int regloc, i;
>  
>  	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> -					   PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> +					   CXL_DVSEC_REGISTER_LOCATOR);
>  	if (!regloc)
>  		return -ENXIO;
>  
>  	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
>  	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
>  
> -	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> -	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> +	regloc += DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET;
> +	regblocks = (regloc_size - DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET) / 8;
>  
>  	for (i = 0; i < regblocks; i++, regloc += 8) {
>  		u32 reg_lo, reg_hi;
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 12fdcb1b14e5..fe2898b17736 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -7,17 +7,36 @@
>  
>  /*
>   * See section 8.1 Configuration Space Registers in the CXL 2.0
> - * Specification
> + * Specification. Names are taken straight from the specification with "CXL" and
> + * "DVSEC" redundancies removed.
>   */
>  #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
>  #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
> -#define PCI_DVSEC_ID_CXL		0x0
>  
> -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
> -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
> +/* 8.1.3: PCIe DVSEC for CXL Device */
> +#define CXL_DVSEC_PCIE_DEVICE					0
>  
> -/* BAR Indicator Register (BIR) */
> -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> +/* 8.1.4: Non-CXL Function Map DVSEC */
> +#define CXL_DVSEC_FUNCTION_MAP					2
> +
> +/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> +#define CXL_DVSEC_PORT_EXTENSIONS				3
> +
> +/* 8.1.6: GPF DVSEC for CXL Port */
> +#define CXL_DVSEC_PORT_GPF					4
> +
> +/* 8.1.7: GPF DVSEC for CXL Device */
> +#define CXL_DVSEC_DEVICE_GPF					5
> +
> +/* 8.1.8: PCIe DVSEC for Flex Bus Port */
> +#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> +
> +/* 8.1.9: Register Locator DVSEC */
> +#define CXL_DVSEC_REGISTER_LOCATOR				8
> +#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET			0xC

Hmm. I'm not particularly keen on not having CXL in the naming for the offset
and fields.  They get used in places where, to the casual reader, they may look
like generic PCI_DVSEC_ defintiions....  the do get very long though, but I'm not
sure that's the right place to abbreviate. 

Agreed with your comment in the thread that it's really hard to
get a good consistent model for what are near enough free form names
in a spec (or seem like it when read long after they were defined).

> +#define     DVSEC_REGISTER_LOCATOR_BIR_MASK			GENMASK(2, 0)
> +#define	    DVSEC_REGISTER_LOCATOR_BLOCK_IDENTIFIER_MASK	GENMASK(15, 8)
> +#define     DVSEC_REGISTER_LOCATOR_BLOCK_OFFSET_LOW_MASK	GENMASK(31, 16) 
>  
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
> @@ -28,8 +47,11 @@ enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_TYPES
>  };
>  
> -#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
> -#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
> +/* 8.1.10: MLD DVSEC */
> +#define CXL_DVSEC_MLD						9
> +
> +/* 14.16.1 CXL Device Test Capability Advertisement */
> +#define CXL_DVSEC_PCIE_TEST_CAPABILITY				10
>  
>  #define cxl_reg_block(pdev, map)                                               \
>  	((resource_size_t)(pci_resource_start(pdev, (map)->barno) +            \


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-11-03 15:53   ` Jonathan Cameron
@ 2021-11-03 16:03     ` Ben Widawsky
  2021-11-03 16:42       ` Jonathan Cameron
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-03 16:03 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On 21-11-03 15:53:59, Jonathan Cameron wrote:
> On Fri, 22 Oct 2021 11:36:54 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 

[snip]

> > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > index 12fdcb1b14e5..fe2898b17736 100644
> > --- a/drivers/cxl/pci.h
> > +++ b/drivers/cxl/pci.h
> > @@ -7,17 +7,36 @@
> >  
> >  /*
> >   * See section 8.1 Configuration Space Registers in the CXL 2.0
> > - * Specification
> > + * Specification. Names are taken straight from the specification with "CXL" and
> > + * "DVSEC" redundancies removed.
> >   */
> >  #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
> >  #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
> > -#define PCI_DVSEC_ID_CXL		0x0
> >  
> > -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
> > -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
> > +/* 8.1.3: PCIe DVSEC for CXL Device */
> > +#define CXL_DVSEC_PCIE_DEVICE					0
> >  
> > -/* BAR Indicator Register (BIR) */
> > -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> > +/* 8.1.4: Non-CXL Function Map DVSEC */
> > +#define CXL_DVSEC_FUNCTION_MAP					2
> > +
> > +/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> > +#define CXL_DVSEC_PORT_EXTENSIONS				3
> > +
> > +/* 8.1.6: GPF DVSEC for CXL Port */
> > +#define CXL_DVSEC_PORT_GPF					4
> > +
> > +/* 8.1.7: GPF DVSEC for CXL Device */
> > +#define CXL_DVSEC_DEVICE_GPF					5
> > +
> > +/* 8.1.8: PCIe DVSEC for Flex Bus Port */
> > +#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> > +
> > +/* 8.1.9: Register Locator DVSEC */
> > +#define CXL_DVSEC_REGISTER_LOCATOR				8
> > +#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET			0xC
> 
> Hmm. I'm not particularly keen on not having CXL in the naming for the offset
> and fields.  They get used in places where, to the casual reader, they may look
> like generic PCI_DVSEC_ defintiions....  the do get very long though, but I'm not
> sure that's the right place to abbreviate. 
> 
> Agreed with your comment in the thread that it's really hard to
> get a good consistent model for what are near enough free form names
> in a spec (or seem like it when read long after they were defined).
> 

Hi Jonathan, thanks for looking.

Well for these in particular, I'd expect the actual register/DVSEC offset to
have been read a few lines above, so I'm not too worried about these ones
specifically. However, it's a good point as this grows it may become
problematic. Do you have a suggestion? My ideal would be for the CXL consortium
to publish header files for these things and then we can just grumble about how
silly they are and move along, but I'm not sure that will happen.

Anyway... I really don't care what we choose as long as it's consistent. What
color would you like?

[snip]


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-10-22 18:36 ` [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports Ben Widawsky
  2021-11-01  5:39   ` Dan Williams
@ 2021-11-03 16:08   ` Jonathan Cameron
  2021-11-10 17:49     ` Ben Widawsky
  1 sibling, 1 reply; 106+ messages in thread
From: Jonathan Cameron @ 2021-11-03 16:08 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On Fri, 22 Oct 2021 11:36:56 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The CXL drivers encapsulate the components that direct memory traffic in
> an entity known as a cxl_port. Compute Express Link specifies three such
> components: hostbridge (ie. a collection of root ports), switches, and
> endpoints. There are currently drivers that create these ports for the
> hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> introduced allows callers to initiate a scan down from the hostbridge
> and create ports for switches in the CXL topology.
> 
> The intended user of this API is for endpoint devices. An endpoint
> device will need to determine if it is CXL.mem capable, which requires
> all components in the path from hostbridge to the endpoint to be CXL.mem
> capable. Once an endpoint device determines it's connected to a CXL
> capable root port, it can call this API to fill in all the ports in
> between the hostbridge and itself.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

This is an unusual enough thing to be doing on PCI that I'd suggest
making sure to cc linux-pci + Bjorn for next version of this...
Shall we say, this makes me nervous and more eyes might be good :)

One trivial inline.
> ---
>  .../driver-api/cxl/memory-devices.rst         |   6 +
>  drivers/cxl/core/Makefile                     |   1 +
>  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
>  drivers/cxl/core/pci.c                        |  99 ++++++++++++
>  drivers/cxl/cxl.h                             |   2 +
>  drivers/cxl/pci.h                             |   6 +
>  drivers/cxl/port.c                            |   2 +-
>  tools/testing/cxl/Kbuild                      |   1 +
>  8 files changed, 261 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/cxl/core/pci.c
> 
...

> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index c7e1894d503b..f10e7d5b22a4 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
...
> +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> +{
> +	struct device *port_dev;
> +
> +	if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> +		return NULL;
> +
> +	port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> +	if (port_dev)
> +		return to_cxl_port(port_dev);

Flip this logic to make it more readable.
	if (!port_dev)
		return NULL;

> +
> +	return NULL;
> +}
> +

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-11-03 16:03     ` Ben Widawsky
@ 2021-11-03 16:42       ` Jonathan Cameron
  2021-11-03 17:05         ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Jonathan Cameron @ 2021-11-03 16:42 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On Wed, 3 Nov 2021 09:03:36 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-11-03 15:53:59, Jonathan Cameron wrote:
> > On Fri, 22 Oct 2021 11:36:54 -0700
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> 
> [snip]
> 
> > > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > > index 12fdcb1b14e5..fe2898b17736 100644
> > > --- a/drivers/cxl/pci.h
> > > +++ b/drivers/cxl/pci.h
> > > @@ -7,17 +7,36 @@
> > >  
> > >  /*
> > >   * See section 8.1 Configuration Space Registers in the CXL 2.0
> > > - * Specification
> > > + * Specification. Names are taken straight from the specification with "CXL" and
> > > + * "DVSEC" redundancies removed.
> > >   */
> > >  #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
> > >  #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
> > > -#define PCI_DVSEC_ID_CXL		0x0
> > >  
> > > -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
> > > -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
> > > +/* 8.1.3: PCIe DVSEC for CXL Device */
> > > +#define CXL_DVSEC_PCIE_DEVICE					0
> > >  
> > > -/* BAR Indicator Register (BIR) */
> > > -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> > > +/* 8.1.4: Non-CXL Function Map DVSEC */
> > > +#define CXL_DVSEC_FUNCTION_MAP					2
> > > +
> > > +/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> > > +#define CXL_DVSEC_PORT_EXTENSIONS				3
> > > +
> > > +/* 8.1.6: GPF DVSEC for CXL Port */
> > > +#define CXL_DVSEC_PORT_GPF					4
> > > +
> > > +/* 8.1.7: GPF DVSEC for CXL Device */
> > > +#define CXL_DVSEC_DEVICE_GPF					5
> > > +
> > > +/* 8.1.8: PCIe DVSEC for Flex Bus Port */
> > > +#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> > > +
> > > +/* 8.1.9: Register Locator DVSEC */
> > > +#define CXL_DVSEC_REGISTER_LOCATOR				8
> > > +#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET			0xC  
> > 
> > Hmm. I'm not particularly keen on not having CXL in the naming for the offset
> > and fields.  They get used in places where, to the casual reader, they may look
> > like generic PCI_DVSEC_ defintiions....  the do get very long though, but I'm not
> > sure that's the right place to abbreviate. 
> > 
> > Agreed with your comment in the thread that it's really hard to
> > get a good consistent model for what are near enough free form names
> > in a spec (or seem like it when read long after they were defined).
> >   
> 
> Hi Jonathan, thanks for looking.
> 
> Well for these in particular, I'd expect the actual register/DVSEC offset to
> have been read a few lines above, so I'm not too worried about these ones
> specifically. However, it's a good point as this grows it may become
> problematic. Do you have a suggestion? My ideal would be for the CXL consortium
> to publish header files for these things and then we can just grumble about how
> silly they are and move along, but I'm not sure that will happen.

:)  You can propose it... - Can you imagine how long those calls would go on
for??????

> 
> Anyway... I really don't care what we choose as long as it's consistent. What
> color would you like?

Blue, no yellow.... Ahhhhhhh!

I think a bit of flexibility on abbreviations is needed when they get insanely
long...  Meh, we can always rename them all later when we decide the choice was
wrong.

> 
> [snip]
> 


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 13/28] cxl: Flesh out register names
  2021-11-03 16:42       ` Jonathan Cameron
@ 2021-11-03 17:05         ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-03 17:05 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On 21-11-03 16:42:32, Jonathan Cameron wrote:
> On Wed, 3 Nov 2021 09:03:36 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > On 21-11-03 15:53:59, Jonathan Cameron wrote:
> > > On Fri, 22 Oct 2021 11:36:54 -0700
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >   
> > 
> > [snip]
> > 
> > > > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > > > index 12fdcb1b14e5..fe2898b17736 100644
> > > > --- a/drivers/cxl/pci.h
> > > > +++ b/drivers/cxl/pci.h
> > > > @@ -7,17 +7,36 @@
> > > >  
> > > >  /*
> > > >   * See section 8.1 Configuration Space Registers in the CXL 2.0
> > > > - * Specification
> > > > + * Specification. Names are taken straight from the specification with "CXL" and
> > > > + * "DVSEC" redundancies removed.
> > > >   */
> > > >  #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
> > > >  #define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
> > > > -#define PCI_DVSEC_ID_CXL		0x0
> > > >  
> > > > -#define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID	0x8
> > > > -#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
> > > > +/* 8.1.3: PCIe DVSEC for CXL Device */
> > > > +#define CXL_DVSEC_PCIE_DEVICE					0
> > > >  
> > > > -/* BAR Indicator Register (BIR) */
> > > > -#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> > > > +/* 8.1.4: Non-CXL Function Map DVSEC */
> > > > +#define CXL_DVSEC_FUNCTION_MAP					2
> > > > +
> > > > +/* 8.1.5: CXL 2.0 Extensions DVSEC for Ports */
> > > > +#define CXL_DVSEC_PORT_EXTENSIONS				3
> > > > +
> > > > +/* 8.1.6: GPF DVSEC for CXL Port */
> > > > +#define CXL_DVSEC_PORT_GPF					4
> > > > +
> > > > +/* 8.1.7: GPF DVSEC for CXL Device */
> > > > +#define CXL_DVSEC_DEVICE_GPF					5
> > > > +
> > > > +/* 8.1.8: PCIe DVSEC for Flex Bus Port */
> > > > +#define CXL_DVSEC_PCIE_FLEXBUS_PORT				7
> > > > +
> > > > +/* 8.1.9: Register Locator DVSEC */
> > > > +#define CXL_DVSEC_REGISTER_LOCATOR				8
> > > > +#define   DVSEC_REGISTER_LOCATOR_BLOCK1_OFFSET			0xC  
> > > 
> > > Hmm. I'm not particularly keen on not having CXL in the naming for the offset
> > > and fields.  They get used in places where, to the casual reader, they may look
> > > like generic PCI_DVSEC_ defintiions....  the do get very long though, but I'm not
> > > sure that's the right place to abbreviate. 
> > > 
> > > Agreed with your comment in the thread that it's really hard to
> > > get a good consistent model for what are near enough free form names
> > > in a spec (or seem like it when read long after they were defined).
> > >   
> > 
> > Hi Jonathan, thanks for looking.
> > 
> > Well for these in particular, I'd expect the actual register/DVSEC offset to
> > have been read a few lines above, so I'm not too worried about these ones
> > specifically. However, it's a good point as this grows it may become
> > problematic. Do you have a suggestion? My ideal would be for the CXL consortium
> > to publish header files for these things and then we can just grumble about how
> > silly they are and move along, but I'm not sure that will happen.
> 
> :)  You can propose it... - Can you imagine how long those calls would go on
> for??????
> 
> > 
> > Anyway... I really don't care what we choose as long as it's consistent. What
> > color would you like?
> 
> Blue, no yellow.... Ahhhhhhh!
> 
> I think a bit of flexibility on abbreviations is needed when they get insanely
> long...  Meh, we can always rename them all later when we decide the choice was
> wrong.

Two very good choices. Ack. I'll respin. Let me spell out though in case it
wasn't clear, my goal was to make something grepable, which abbreviations take
away from. This is because as the 3.0 spec goes through I'm noticing our chapter
references are no longer 100% accurate (annoying!).

> 
> > 
> > [snip]
> > 
> 

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-10-30  1:37   ` Dan Williams
  2021-10-31 17:53     ` Dan Williams
  2021-11-01 17:53     ` Ben Widawsky
@ 2021-11-04 16:37     ` Ben Widawsky
  2021-11-04 19:17       ` Dan Williams
  2 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-04 16:37 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-10-29 18:37:36, Dan Williams wrote:
> On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >

[snip]

> 
> +void trigger_decoder_autoremove(void *data)
> +{
> +       struct cxl_decoder *cxld = data;
> +       struct device *host = get_cxl_topology_host();
> +
> +       /* The topology host driver beat us to the punch */
> +       if (!host)
> +               return;
> +
> +       devm_remove_action(host, cxld_unregister, &cxld->dev);
> +       put_cxl_topology_host(host);
> +}
> +
> +/**
> + * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
> + * @cxld: decoder to remove
> + *
> + * CXL.mem requires an intact port / decoder topology from root level
> + * platform decoders to endpoint decoders. Arrange for decoders
> + * enumerated by the port driver to be removed when either the root is
> + * removed (in which the entire hierarchy is removed), or when an
> + * individual port is disabled, in which case only that port's
> + * sub-section of the hierarchy is removed.
> + */
> +int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
> +{
> +       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> +       struct device *host = get_cxl_topology_host();
> +       int rc;
> +
> +       if (!host)
> +               return -ENODEV;
> +
> +       if (!port->dev.driver) {
> +               rc = -ENXIO;
> +               goto out;
> +       }
> +
> +       rc = cxl_decoder_autoremove(host, cxld);
> +       if (rc)
> +               goto out;
> +
> +       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
> +                                       cxld);
> +out:
> +       put_cxl_topology_host(host);
> +       return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);

The only port which has decoders that aren't discovered by the port driver would
be the "root port" because it's platform specific. However, that port still is
treated similarly to the other ports. Therefore I don't see why you need
cxl_decoder_autoremove() anymore. Could you please explain? I think the safety
check in trigger_decoder_autoremove() makes this work for all cases.

[snip]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-02 16:58         ` Ben Widawsky
@ 2021-11-04 19:10           ` Dan Williams
  2021-11-04 19:49             ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-04 19:10 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 2, 2021 at 9:58 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-01 20:31:03, Dan Williams wrote:
> > On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > >
> > > [snip]
> > >
> > > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > > --- a/drivers/cxl/Makefile
> > > > > +++ b/drivers/cxl/Makefile
> > > > > @@ -1,5 +1,6 @@
> > > > >  # SPDX-License-Identifier: GPL-2.0
> > > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > > >
> > > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > > >
> > >
> > > Can you help me understand when CONFIG_CXL_MEM is useful when
> > > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > > together.
> >
> > With a 'select' dependency it's impossible to have the
> > CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> > symbol is for idiomatic (one config-symbol per module .ko) reasons to
> > reflect the module dependency in the Kconfig.
> >
>
> Should CXL_MEM also select CXL_PCI? I don't see a way CXL_MEM can work without
> CXL_PCI.

The "select" dependency should be an "I use services from module X"
relationship. CXL_MEM uses port services and can't operate unless that
module is present. CXL_MEM does not strictly need CXL_PCI especially
since drivers/cxl/core/memdev.c abandoned all PCI refernces.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 16:37     ` Ben Widawsky
@ 2021-11-04 19:17       ` Dan Williams
  2021-11-04 19:46         ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-04 19:17 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Thu, Nov 4, 2021 at 9:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-10-29 18:37:36, Dan Williams wrote:
> > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
>
> [snip]
>
> >
> > +void trigger_decoder_autoremove(void *data)
> > +{
> > +       struct cxl_decoder *cxld = data;
> > +       struct device *host = get_cxl_topology_host();
> > +
> > +       /* The topology host driver beat us to the punch */
> > +       if (!host)
> > +               return;
> > +
> > +       devm_remove_action(host, cxld_unregister, &cxld->dev);
> > +       put_cxl_topology_host(host);
> > +}
> > +
> > +/**
> > + * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
> > + * @cxld: decoder to remove
> > + *
> > + * CXL.mem requires an intact port / decoder topology from root level
> > + * platform decoders to endpoint decoders. Arrange for decoders
> > + * enumerated by the port driver to be removed when either the root is
> > + * removed (in which the entire hierarchy is removed), or when an
> > + * individual port is disabled, in which case only that port's
> > + * sub-section of the hierarchy is removed.
> > + */
> > +int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
> > +{
> > +       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > +       struct device *host = get_cxl_topology_host();
> > +       int rc;
> > +
> > +       if (!host)
> > +               return -ENODEV;
> > +
> > +       if (!port->dev.driver) {
> > +               rc = -ENXIO;
> > +               goto out;
> > +       }
> > +
> > +       rc = cxl_decoder_autoremove(host, cxld);
> > +       if (rc)
> > +               goto out;
> > +
> > +       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
> > +                                       cxld);
> > +out:
> > +       put_cxl_topology_host(host);
> > +       return rc;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);
>
> The only port which has decoders that aren't discovered by the port driver would
> be the "root port" because it's platform specific. However, that port still is
> treated similarly to the other ports. Therefore I don't see why you need
> cxl_decoder_autoremove() anymore. Could you please explain? I think the safety
> check in trigger_decoder_autoremove() makes this work for all cases.

I don't think it's worth the games to explain why CXL sees fit to
register (here comes your favorite argument...) non-idiomatic devm
actions on devices that are not associated with the running device +
driver. So as long as the port driver auto-unloads its child devices
then we're golden. Is your concern that you want to have the CFMWS
decoders registers from cxl_port and not cxl_acpi?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 19:17       ` Dan Williams
@ 2021-11-04 19:46         ` Ben Widawsky
  2021-11-04 20:00           ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-04 19:46 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-04 12:17:48, Dan Williams wrote:
> On Thu, Nov 4, 2021 at 9:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-10-29 18:37:36, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> >
> > [snip]
> >
> > >
> > > +void trigger_decoder_autoremove(void *data)
> > > +{
> > > +       struct cxl_decoder *cxld = data;
> > > +       struct device *host = get_cxl_topology_host();
> > > +
> > > +       /* The topology host driver beat us to the punch */
> > > +       if (!host)
> > > +               return;
> > > +
> > > +       devm_remove_action(host, cxld_unregister, &cxld->dev);
> > > +       put_cxl_topology_host(host);
> > > +}
> > > +
> > > +/**
> > > + * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
> > > + * @cxld: decoder to remove
> > > + *
> > > + * CXL.mem requires an intact port / decoder topology from root level
> > > + * platform decoders to endpoint decoders. Arrange for decoders
> > > + * enumerated by the port driver to be removed when either the root is
> > > + * removed (in which the entire hierarchy is removed), or when an
> > > + * individual port is disabled, in which case only that port's
> > > + * sub-section of the hierarchy is removed.
> > > + */
> > > +int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
> > > +{
> > > +       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > > +       struct device *host = get_cxl_topology_host();
> > > +       int rc;
> > > +
> > > +       if (!host)
> > > +               return -ENODEV;
> > > +
> > > +       if (!port->dev.driver) {
> > > +               rc = -ENXIO;
> > > +               goto out;
> > > +       }
> > > +
> > > +       rc = cxl_decoder_autoremove(host, cxld);
> > > +       if (rc)
> > > +               goto out;
> > > +
> > > +       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
> > > +                                       cxld);
> > > +out:
> > > +       put_cxl_topology_host(host);
> > > +       return rc;
> > > +}
> > > +EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);
> >
> > The only port which has decoders that aren't discovered by the port driver would
> > be the "root port" because it's platform specific. However, that port still is
> > treated similarly to the other ports. Therefore I don't see why you need
> > cxl_decoder_autoremove() anymore. Could you please explain? I think the safety
> > check in trigger_decoder_autoremove() makes this work for all cases.
> 
> I don't think it's worth the games to explain why CXL sees fit to
> register (here comes your favorite argument...) non-idiomatic devm
> actions on devices that are not associated with the running device +
> driver. So as long as the port driver auto-unloads its child devices
> then we're golden. Is your concern that you want to have the CFMWS
> decoders registers from cxl_port and not cxl_acpi?

Hey! I'm all for idioms when it makes sense. To me, one of the coolest things
about working on a new subsystem is you get to define some of your own idioms,
but anyway...

First, no I think the platform driver (cxl_acpi) should enumerate the platform
decoders. I confused devm_remove_action() with devm_release_action(). I wonder
though, if you made trigger_decoder_autoremove() call devm_release_action, would
that be sufficient and then remove cxl_decoder_autoremove()?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 19:10           ` Dan Williams
@ 2021-11-04 19:49             ` Ben Widawsky
  2021-11-04 20:04               ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-04 19:49 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-04 12:10:09, Dan Williams wrote:
> On Tue, Nov 2, 2021 at 9:58 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-01 20:31:03, Dan Williams wrote:
> > > On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > >
> > > >
> > > > [snip]
> > > >
> > > > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > > > --- a/drivers/cxl/Makefile
> > > > > > +++ b/drivers/cxl/Makefile
> > > > > > @@ -1,5 +1,6 @@
> > > > > >  # SPDX-License-Identifier: GPL-2.0
> > > > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > > > >
> > > > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > > > >
> > > >
> > > > Can you help me understand when CONFIG_CXL_MEM is useful when
> > > > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > > > together.
> > >
> > > With a 'select' dependency it's impossible to have the
> > > CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> > > symbol is for idiomatic (one config-symbol per module .ko) reasons to
> > > reflect the module dependency in the Kconfig.
> > >
> >
> > Should CXL_MEM also select CXL_PCI? I don't see a way CXL_MEM can work without
> > CXL_PCI.
> 
> The "select" dependency should be an "I use services from module X"
> relationship. CXL_MEM uses port services and can't operate unless that
> module is present. CXL_MEM does not strictly need CXL_PCI especially
> since drivers/cxl/core/memdev.c abandoned all PCI refernces.

The problem I see is cxl_mem devices have a soft dependency on cxl_pci. For
example what we discussed regard DVSEC range registers. My plan was if the
endpoint driver can't determine range registers aren't in use, it won't bind. It
can't make this determination without cxl_pci. DOE has a similar implication
though not fatal.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 19:46         ` Ben Widawsky
@ 2021-11-04 20:00           ` Dan Williams
  2021-11-04 21:26             ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-04 20:00 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Thu, Nov 4, 2021 at 12:46 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-04 12:17:48, Dan Williams wrote:
> > On Thu, Nov 4, 2021 at 9:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > >
> > > [snip]
> > >
> > > >
> > > > +void trigger_decoder_autoremove(void *data)
> > > > +{
> > > > +       struct cxl_decoder *cxld = data;
> > > > +       struct device *host = get_cxl_topology_host();
> > > > +
> > > > +       /* The topology host driver beat us to the punch */
> > > > +       if (!host)
> > > > +               return;
> > > > +
> > > > +       devm_remove_action(host, cxld_unregister, &cxld->dev);
> > > > +       put_cxl_topology_host(host);
> > > > +}
> > > > +
> > > > +/**
> > > > + * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
> > > > + * @cxld: decoder to remove
> > > > + *
> > > > + * CXL.mem requires an intact port / decoder topology from root level
> > > > + * platform decoders to endpoint decoders. Arrange for decoders
> > > > + * enumerated by the port driver to be removed when either the root is
> > > > + * removed (in which the entire hierarchy is removed), or when an
> > > > + * individual port is disabled, in which case only that port's
> > > > + * sub-section of the hierarchy is removed.
> > > > + */
> > > > +int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
> > > > +{
> > > > +       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > > > +       struct device *host = get_cxl_topology_host();
> > > > +       int rc;
> > > > +
> > > > +       if (!host)
> > > > +               return -ENODEV;
> > > > +
> > > > +       if (!port->dev.driver) {
> > > > +               rc = -ENXIO;
> > > > +               goto out;
> > > > +       }
> > > > +
> > > > +       rc = cxl_decoder_autoremove(host, cxld);
> > > > +       if (rc)
> > > > +               goto out;
> > > > +
> > > > +       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
> > > > +                                       cxld);
> > > > +out:
> > > > +       put_cxl_topology_host(host);
> > > > +       return rc;
> > > > +}
> > > > +EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);
> > >
> > > The only port which has decoders that aren't discovered by the port driver would
> > > be the "root port" because it's platform specific. However, that port still is
> > > treated similarly to the other ports. Therefore I don't see why you need
> > > cxl_decoder_autoremove() anymore. Could you please explain? I think the safety
> > > check in trigger_decoder_autoremove() makes this work for all cases.
> >
> > I don't think it's worth the games to explain why CXL sees fit to
> > register (here comes your favorite argument...) non-idiomatic devm
> > actions on devices that are not associated with the running device +
> > driver. So as long as the port driver auto-unloads its child devices
> > then we're golden. Is your concern that you want to have the CFMWS
> > decoders registers from cxl_port and not cxl_acpi?
>
> Hey! I'm all for idioms when it makes sense. To me, one of the coolest things
> about working on a new subsystem is you get to define some of your own idioms,
> but anyway...
>
> First, no I think the platform driver (cxl_acpi) should enumerate the platform
> decoders. I confused devm_remove_action() with devm_release_action(). I wonder
> though, if you made trigger_decoder_autoremove() call devm_release_action, would
> that be sufficient and then remove cxl_decoder_autoremove()?

devm_release_action() is not needed if the devm is allowed to fire
naturally which was the realization I came to here:

https://lore.kernel.org/r/CAPcyv4gpA=DH0SQvRdmF6dY01mZ1S-gGEWTSDbb+0ajYtyNv0A@mail.gmail.com

I.e. I regret opening that pandora's box of registering "remote" devm
actions vs typical local devm actions.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 19:49             ` Ben Widawsky
@ 2021-11-04 20:04               ` Dan Williams
  2021-11-04 21:25                 ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-04 20:04 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Thu, Nov 4, 2021 at 12:49 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-04 12:10:09, Dan Williams wrote:
> > On Tue, Nov 2, 2021 at 9:58 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-11-01 20:31:03, Dan Williams wrote:
> > > > On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > > > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > > >
> > > > >
> > > > > [snip]
> > > > >
> > > > > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > > > > --- a/drivers/cxl/Makefile
> > > > > > > +++ b/drivers/cxl/Makefile
> > > > > > > @@ -1,5 +1,6 @@
> > > > > > >  # SPDX-License-Identifier: GPL-2.0
> > > > > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > > > > >
> > > > > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > > > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > > > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > > > > >
> > > > >
> > > > > Can you help me understand when CONFIG_CXL_MEM is useful when
> > > > > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > > > > together.
> > > >
> > > > With a 'select' dependency it's impossible to have the
> > > > CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> > > > symbol is for idiomatic (one config-symbol per module .ko) reasons to
> > > > reflect the module dependency in the Kconfig.
> > > >
> > >
> > > Should CXL_MEM also select CXL_PCI? I don't see a way CXL_MEM can work without
> > > CXL_PCI.
> >
> > The "select" dependency should be an "I use services from module X"
> > relationship. CXL_MEM uses port services and can't operate unless that
> > module is present. CXL_MEM does not strictly need CXL_PCI especially
> > since drivers/cxl/core/memdev.c abandoned all PCI refernces.
>
> The problem I see is cxl_mem devices have a soft dependency on cxl_pci. For
> example what we discussed regard DVSEC range registers. My plan was if the
> endpoint driver can't determine range registers aren't in use, it won't bind. It
> can't make this determination without cxl_pci. DOE has a similar implication
> though not fatal.

I think this gets back to my original preference of just making the
port driver a part of the core rather than its own independent module
for no real reason. In fact it makes the validation job harder because
it expands the testing surface for different combinations of module
loading and device binding.

For the specific issue of CXL_MEM dependencies on CXL_PCI, isn't
CXL_PCI passing in cacheed data for CXL_MEM? CXL_MEM has no idea that
CXL_PCI exists, but it certainly knows that CXL_PORT exists,

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 20:04               ` Dan Williams
@ 2021-11-04 21:25                 ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-04 21:25 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-04 13:04:13, Dan Williams wrote:
> On Thu, Nov 4, 2021 at 12:49 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-04 12:10:09, Dan Williams wrote:
> > > On Tue, Nov 2, 2021 at 9:58 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-11-01 20:31:03, Dan Williams wrote:
> > > > > On Mon, Nov 1, 2021 at 10:53 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > >
> > > > > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > > > >
> > > > > >
> > > > > > [snip]
> > > > > >
> > > > > > > > diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile
> > > > > > > > index cf07ae6cea17..40b386aaedf7 100644
> > > > > > > > --- a/drivers/cxl/Makefile
> > > > > > > > +++ b/drivers/cxl/Makefile
> > > > > > > > @@ -1,5 +1,6 @@
> > > > > > > >  # SPDX-License-Identifier: GPL-2.0
> > > > > > > >  obj-$(CONFIG_CXL_BUS) += core/
> > > > > > > > +obj-$(CONFIG_CXL_MEM) += cxl_port.o
> > > > > > >
> > > > > > > It feel odd that CONFIG_CXL_MEM builds cxl_port, why not have a
> > > > > > > CONFIG_CXL_PORT that is simply selected by CONFIG_CXL_MEM, or a
> > > > > > > CONFIG_CXL_PORT that defaults to the value of CONFIG_CXL_BUS?
> > > > > > >
> > > > > >
> > > > > > Can you help me understand when CONFIG_CXL_MEM is useful when
> > > > > > #CONFIG_CXL_PORT=n? I was unable to figure out such a case and so I tied the two
> > > > > > together.
> > > > >
> > > > > With a 'select' dependency it's impossible to have the
> > > > > CONFIG_CXL_PORT=n and CONFIG_CXL_MEM=m combination. The extra config
> > > > > symbol is for idiomatic (one config-symbol per module .ko) reasons to
> > > > > reflect the module dependency in the Kconfig.
> > > > >
> > > >
> > > > Should CXL_MEM also select CXL_PCI? I don't see a way CXL_MEM can work without
> > > > CXL_PCI.
> > >
> > > The "select" dependency should be an "I use services from module X"
> > > relationship. CXL_MEM uses port services and can't operate unless that
> > > module is present. CXL_MEM does not strictly need CXL_PCI especially
> > > since drivers/cxl/core/memdev.c abandoned all PCI refernces.
> >
> > The problem I see is cxl_mem devices have a soft dependency on cxl_pci. For
> > example what we discussed regard DVSEC range registers. My plan was if the
> > endpoint driver can't determine range registers aren't in use, it won't bind. It
> > can't make this determination without cxl_pci. DOE has a similar implication
> > though not fatal.
> 
> I think this gets back to my original preference of just making the
> port driver a part of the core rather than its own independent module
> for no real reason. In fact it makes the validation job harder because
> it expands the testing surface for different combinations of module
> loading and device binding.
> 
> For the specific issue of CXL_MEM dependencies on CXL_PCI, isn't
> CXL_PCI passing in cacheed data for CXL_MEM? CXL_MEM has no idea that
> CXL_PCI exists, but it certainly knows that CXL_PORT exists,

Thinking about it further, the current bar for entry in cxl_mem is there must as
least be component registers. If CXL_PCI doesn't do that, nothing will proceed,
so I think this is fine.

Since you brought it up, further down the road, I'm not opposed toward moving
the port driver into core, but for now I like it as the separate module. Some of
the topology host games we're playing now might have been avoidable, although I
think they're not a bad idea anyway.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 08/28] cxl/port: Introduce a port driver
  2021-11-04 20:00           ` Dan Williams
@ 2021-11-04 21:26             ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-04 21:26 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-04 13:00:35, Dan Williams wrote:
> On Thu, Nov 4, 2021 at 12:46 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-04 12:17:48, Dan Williams wrote:
> > > On Thu, Nov 4, 2021 at 9:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-10-29 18:37:36, Dan Williams wrote:
> > > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > >
> > > >
> > > > [snip]
> > > >
> > > > >
> > > > > +void trigger_decoder_autoremove(void *data)
> > > > > +{
> > > > > +       struct cxl_decoder *cxld = data;
> > > > > +       struct device *host = get_cxl_topology_host();
> > > > > +
> > > > > +       /* The topology host driver beat us to the punch */
> > > > > +       if (!host)
> > > > > +               return;
> > > > > +
> > > > > +       devm_remove_action(host, cxld_unregister, &cxld->dev);
> > > > > +       put_cxl_topology_host(host);
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * cxl_port_decoder_autoremove - remove decoders discovered by the port driver
> > > > > + * @cxld: decoder to remove
> > > > > + *
> > > > > + * CXL.mem requires an intact port / decoder topology from root level
> > > > > + * platform decoders to endpoint decoders. Arrange for decoders
> > > > > + * enumerated by the port driver to be removed when either the root is
> > > > > + * removed (in which the entire hierarchy is removed), or when an
> > > > > + * individual port is disabled, in which case only that port's
> > > > > + * sub-section of the hierarchy is removed.
> > > > > + */
> > > > > +int cxl_port_decoder_autoremove(struct cxl_decoder *cxld)
> > > > > +{
> > > > > +       struct cxl_port *port = to_cxl_port(cxld->dev.parent);
> > > > > +       struct device *host = get_cxl_topology_host();
> > > > > +       int rc;
> > > > > +
> > > > > +       if (!host)
> > > > > +               return -ENODEV;
> > > > > +
> > > > > +       if (!port->dev.driver) {
> > > > > +               rc = -ENXIO;
> > > > > +               goto out;
> > > > > +       }
> > > > > +
> > > > > +       rc = cxl_decoder_autoremove(host, cxld);
> > > > > +       if (rc)
> > > > > +               goto out;
> > > > > +
> > > > > +       rc = devm_add_action_or_reset(&port->dev, trigger_decoder_autoremove,
> > > > > +                                       cxld);
> > > > > +out:
> > > > > +       put_cxl_topology_host(host);
> > > > > +       return rc;
> > > > > +}
> > > > > +EXPORT_SYMBOL_NS_GPL(cxl_port_decoder_autoremove, CXL);
> > > >
> > > > The only port which has decoders that aren't discovered by the port driver would
> > > > be the "root port" because it's platform specific. However, that port still is
> > > > treated similarly to the other ports. Therefore I don't see why you need
> > > > cxl_decoder_autoremove() anymore. Could you please explain? I think the safety
> > > > check in trigger_decoder_autoremove() makes this work for all cases.
> > >
> > > I don't think it's worth the games to explain why CXL sees fit to
> > > register (here comes your favorite argument...) non-idiomatic devm
> > > actions on devices that are not associated with the running device +
> > > driver. So as long as the port driver auto-unloads its child devices
> > > then we're golden. Is your concern that you want to have the CFMWS
> > > decoders registers from cxl_port and not cxl_acpi?
> >
> > Hey! I'm all for idioms when it makes sense. To me, one of the coolest things
> > about working on a new subsystem is you get to define some of your own idioms,
> > but anyway...
> >
> > First, no I think the platform driver (cxl_acpi) should enumerate the platform
> > decoders. I confused devm_remove_action() with devm_release_action(). I wonder
> > though, if you made trigger_decoder_autoremove() call devm_release_action, would
> > that be sufficient and then remove cxl_decoder_autoremove()?
> 
> devm_release_action() is not needed if the devm is allowed to fire
> naturally which was the realization I came to here:
> 
> https://lore.kernel.org/r/CAPcyv4gpA=DH0SQvRdmF6dY01mZ1S-gGEWTSDbb+0ajYtyNv0A@mail.gmail.com
> 
> I.e. I regret opening that pandora's box of registering "remote" devm
> actions vs typical local devm actions.

I wasn't suggesting release() fixes anything, but it did seem like we could get
rid of cxl_decoder_autoremove() if we switched it to release(). Must of this
gets too complex for me too quickly though, so if you think it won't work I'm
happy to leave it at that.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-03 16:08   ` Jonathan Cameron
@ 2021-11-10 17:49     ` Ben Widawsky
  2021-11-10 18:10       ` Jonathan Cameron
  0 siblings, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-10 17:49 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On 21-11-03 16:08:21, Jonathan Cameron wrote:
> On Fri, 22 Oct 2021 11:36:56 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The CXL drivers encapsulate the components that direct memory traffic in
> > an entity known as a cxl_port. Compute Express Link specifies three such
> > components: hostbridge (ie. a collection of root ports), switches, and
> > endpoints. There are currently drivers that create these ports for the
> > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > introduced allows callers to initiate a scan down from the hostbridge
> > and create ports for switches in the CXL topology.
> > 
> > The intended user of this API is for endpoint devices. An endpoint
> > device will need to determine if it is CXL.mem capable, which requires
> > all components in the path from hostbridge to the endpoint to be CXL.mem
> > capable. Once an endpoint device determines it's connected to a CXL
> > capable root port, it can call this API to fill in all the ports in
> > between the hostbridge and itself.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> This is an unusual enough thing to be doing on PCI that I'd suggest
> making sure to cc linux-pci + Bjorn for next version of this...
> Shall we say, this makes me nervous and more eyes might be good :)
> 
> One trivial inline.

Makes sense, just this patch or the whole series?

> > ---
> >  .../driver-api/cxl/memory-devices.rst         |   6 +
> >  drivers/cxl/core/Makefile                     |   1 +
> >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> >  drivers/cxl/cxl.h                             |   2 +
> >  drivers/cxl/pci.h                             |   6 +
> >  drivers/cxl/port.c                            |   2 +-
> >  tools/testing/cxl/Kbuild                      |   1 +
> >  8 files changed, 261 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/cxl/core/pci.c
> > 
> ...
> 
> > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > index c7e1894d503b..f10e7d5b22a4 100644
> > --- a/drivers/cxl/core/bus.c
> > +++ b/drivers/cxl/core/bus.c
> ...
> > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > +{
> > +	struct device *port_dev;
> > +
> > +	if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > +		return NULL;
> > +
> > +	port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > +	if (port_dev)
> > +		return to_cxl_port(port_dev);
> 
> Flip this logic to make it more readable.
> 	if (!port_dev)
> 		return NULL;
> 
> > +
> > +	return NULL;
> > +}
> > +

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-10 17:49     ` Ben Widawsky
@ 2021-11-10 18:10       ` Jonathan Cameron
  2021-11-10 21:03         ` Dan Williams
  0 siblings, 1 reply; 106+ messages in thread
From: Jonathan Cameron @ 2021-11-10 18:10 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Dan Williams,
	Ira Weiny, Vishal Verma

On Wed, 10 Nov 2021 09:49:23 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-11-03 16:08:21, Jonathan Cameron wrote:
> > On Fri, 22 Oct 2021 11:36:56 -0700
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > The CXL drivers encapsulate the components that direct memory traffic in
> > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > components: hostbridge (ie. a collection of root ports), switches, and
> > > endpoints. There are currently drivers that create these ports for the
> > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > introduced allows callers to initiate a scan down from the hostbridge
> > > and create ports for switches in the CXL topology.
> > > 
> > > The intended user of this API is for endpoint devices. An endpoint
> > > device will need to determine if it is CXL.mem capable, which requires
> > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > capable. Once an endpoint device determines it's connected to a CXL
> > > capable root port, it can call this API to fill in all the ports in
> > > between the hostbridge and itself.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > 
> > This is an unusual enough thing to be doing on PCI that I'd suggest
> > making sure to cc linux-pci + Bjorn for next version of this...
> > Shall we say, this makes me nervous and more eyes might be good :)
> > 
> > One trivial inline.  
> 
> Makes sense, just this patch or the whole series?

The nervousness was just this patch IIRC.

Jonathan

> 
> > > ---
> > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > >  drivers/cxl/core/Makefile                     |   1 +
> > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > >  drivers/cxl/cxl.h                             |   2 +
> > >  drivers/cxl/pci.h                             |   6 +
> > >  drivers/cxl/port.c                            |   2 +-
> > >  tools/testing/cxl/Kbuild                      |   1 +
> > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > >  create mode 100644 drivers/cxl/core/pci.c
> > >   
> > ...
> >   
> > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > index c7e1894d503b..f10e7d5b22a4 100644
> > > --- a/drivers/cxl/core/bus.c
> > > +++ b/drivers/cxl/core/bus.c  
> > ...  
> > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > +{
> > > +	struct device *port_dev;
> > > +
> > > +	if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > +		return NULL;
> > > +
> > > +	port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > +	if (port_dev)
> > > +		return to_cxl_port(port_dev);  
> > 
> > Flip this logic to make it more readable.
> > 	if (!port_dev)
> > 		return NULL;
> >   
> > > +
> > > +	return NULL;
> > > +}
> > > +  


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-10 18:10       ` Jonathan Cameron
@ 2021-11-10 21:03         ` Dan Williams
  0 siblings, 0 replies; 106+ messages in thread
From: Dan Williams @ 2021-11-10 21:03 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Ben Widawsky, linux-cxl, Chet Douglas, Alison Schofield,
	Ira Weiny, Vishal Verma

On Wed, Nov 10, 2021 at 10:11 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Wed, 10 Nov 2021 09:49:23 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > On 21-11-03 16:08:21, Jonathan Cameron wrote:
> > > On Fri, 22 Oct 2021 11:36:56 -0700
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > > The CXL drivers encapsulate the components that direct memory traffic in
> > > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > > components: hostbridge (ie. a collection of root ports), switches, and
> > > > endpoints. There are currently drivers that create these ports for the
> > > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > > introduced allows callers to initiate a scan down from the hostbridge
> > > > and create ports for switches in the CXL topology.
> > > >
> > > > The intended user of this API is for endpoint devices. An endpoint
> > > > device will need to determine if it is CXL.mem capable, which requires
> > > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > > capable. Once an endpoint device determines it's connected to a CXL
> > > > capable root port, it can call this API to fill in all the ports in
> > > > between the hostbridge and itself.
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > >
> > > This is an unusual enough thing to be doing on PCI that I'd suggest
> > > making sure to cc linux-pci + Bjorn for next version of this...
> > > Shall we say, this makes me nervous and more eyes might be good :)
> > >
> > > One trivial inline.
> >
> > Makes sense, just this patch or the whole series?
>
> The nervousness was just this patch IIRC.

I would copy linux-pci on the whole thing. The parallel universe that
drivers/cxl/ needs and is attaching to PCI objects in the device tree
should at least be something PCI developers are aware of, especially
when we come back to talk about Native PCI hotplug interactions with
CXL.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-02  1:45       ` Dan Williams
  2021-11-02 16:39         ` Ben Widawsky
@ 2021-11-16 16:50         ` Ben Widawsky
  2021-11-16 17:51           ` Dan Williams
  1 sibling, 1 reply; 106+ messages in thread
From: Ben Widawsky @ 2021-11-16 16:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-01 18:45:57, Dan Williams wrote:
> On Mon, Nov 1, 2021 at 3:56 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-10-31 22:39:43, Dan Williams wrote:
> > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > The CXL drivers encapsulate the components that direct memory traffic in
> > > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > > components: hostbridge (ie. a collection of root ports), switches, and
> > > > endpoints. There are currently drivers that create these ports for the
> > > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > > introduced allows callers to initiate a scan down from the hostbridge
> > > > and create ports for switches in the CXL topology.
> > > >
> > > > The intended user of this API is for endpoint devices. An endpoint
> > > > device will need to determine if it is CXL.mem capable, which requires
> > > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > > capable. Once an endpoint device determines it's connected to a CXL
> > > > capable root port, it can call this API to fill in all the ports in
> > > > between the hostbridge and itself.
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > > >  drivers/cxl/core/Makefile                     |   1 +
> > > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > > >  drivers/cxl/cxl.h                             |   2 +
> > > >  drivers/cxl/pci.h                             |   6 +
> > > >  drivers/cxl/port.c                            |   2 +-
> > > >  tools/testing/cxl/Kbuild                      |   1 +
> > > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > > >  create mode 100644 drivers/cxl/core/pci.c
> > > >
> > > > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > > > index fbf0393cdddc..547336c95593 100644
> > > > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > > > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > > > @@ -47,6 +47,12 @@ CXL Core
> > > >  .. kernel-doc:: drivers/cxl/core/bus.c
> > > >     :identifiers:
> > > >
> > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > +   :doc: cxl pci
> > > > +
> > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > +   :identifiers:
> > > > +
> > > >  .. kernel-doc:: drivers/cxl/core/pmem.c
> > > >     :doc: cxl pmem
> > > >
> > > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > > > --- a/drivers/cxl/core/Makefile
> > > > +++ b/drivers/cxl/core/Makefile
> > > > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> > > >  cxl_core-y += regs.o
> > > >  cxl_core-y += memdev.o
> > > >  cxl_core-y += mbox.o
> > > > +cxl_core-y += pci.o
> > > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > > index c7e1894d503b..f10e7d5b22a4 100644
> > > > --- a/drivers/cxl/core/bus.c
> > > > +++ b/drivers/cxl/core/bus.c
> > > > @@ -8,6 +8,7 @@
> > > >  #include <linux/idr.h>
> > > >  #include <cxlmem.h>
> > > >  #include <cxl.h>
> > > > +#include <pci.h>
> > > >  #include "core.h"
> > > >
> > > >  /**
> > > > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> > > >
> > > > +void devm_cxl_remove_port(struct cxl_port *port)
> > > > +{
> > > > +       down_read(&root_host_sem);
> > > > +       if (cxl_root_host) {
> > > > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > > > +               devm_release_action(cxl_root_host, unregister_port, port);
> > > > +       }
> > > > +       up_read(&root_host_sem);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> > >
> > > If the scan establishes the property that all child ports are devm
> > > allocated with their cxl_port-parent, and only if the cxl_port-parent
> > > is bound to its driver then I think we don't need to play
> > > devm_release_action games().
> > >
> >
> > We had discussed this previously. I was running into an issue when unloading
> > cxl_mem. I needed a way to remove the endpoint port and this was your
> > recommendation. Are you suggesting if the chain is set up correctly, I don't
> > need to do anything?
> 
> I think if the chain is set up correctly then you don't need to do
> anything special. The endpoint port would be devm registered by the
> cxl_memdev driver to its parent cxl_port provided that port is
> actively attached to its driver.
> 
> > I don't remember exactly what was blowing up but I can try again after things
> > are properly parented.
> 
> Cool.
> 
> >
> > > > +
> > > > +static int match_port(struct device *dev, const void *data)
> > > > +{
> > > > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > > > +
> > > > +       if (dev->type != &cxl_port_type)
> > > > +               return 0;
> > > > +
> > > > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > > > +}
> > > > +
> > > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > > +{
> > > > +       struct device *port_dev;
> > > > +
> > > > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > > +               return NULL;
> > > > +
> > > > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > > +       if (port_dev)
> > > > +               return to_cxl_port(port_dev);
> > > > +
> > > > +       return NULL;
> > > > +}
> > > > +
> > > > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > > > +{
> > > > +       struct device *dev = &pdev->dev;
> > > > +       struct cxl_port *parent_port;
> > > > +       struct cxl_register_map map;
> > > > +       struct cxl_port *port;
> > > > +       int rc;
> > > > +
> > > > +       /*
> > > > +        * Upstream ports must be connected to a downstream port or root port.
> > > > +        * That downstream or root port must have a parent.
> > > > +        */
> > > > +       if (!pdev->dev.parent->parent)
> > > > +               return -ENXIO;
> > > > +
> > > > +       /* A port is useless if there are no component registers */
> > > > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > > > +       if (rc)
> > > > +               return rc;
> > > > +
> > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> > >
> > > This deref chain is unreadable. It wants a helper if it stays, but I
> > > can't immediately think of a reason to ever need to look at a
> > > grandparent in the hierarchy.
> >
> > The goal is to be able to find the next PCIe port up in the chain.
> >
> > My understanding was:
> > pdev = PCIe upstream switch
> > pdev->dev.parent = PCIe downstream switch connected to pdev.
> > pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent
> >
> > I was unable to find an idiomatic way to do that. I'm open to suggestions.
> 
> Oh ok, I see it now, but I think this can be done in pure CXL terms
> and generic devices with the assumption that the parent device of a
> cxl_memdev must be a dport. Then this works whether the parent port is
> a platform device like ACPI or cxl_test, or a PCIe device.
> 
> static int port_has_dport(struct device *dev, const void *dport_dev)
> {
>         int found = 0;
>         struct cxl_port *port;
>         struct cxl_dport *dport;
> 
>         if (dev->type != &cxl_port_type)
>                 return 0;
>         port = to_cxl_port(dev);
> 
>         device_lock(&port->dev);
>         list_for_each_entry (dport, &port->dports, list)
>                 if (dport->dport == dport_dev) {
>                         found = 1;
>                         break;
>                 }
>         device_unlock(&port->dev);
> 
>         return found;
> }
> 
> struct cxl_port *find_parent_cxl_port(struct cxl_memdev *cxlmd)
> {
>         return bus_find_device(&cxl_bus_type, NULL, cxlmd->dev.parent,
>                                port_has_dport);
> }
> 

I'm squinting at this and wondering what I will have to change to make it work.

This logic you're asking me to change is the code which is adding the switches
to the port hierarchy (before the memdev adds itself as a port). I first need to
build that before I can check if the dport matches cxlmd->dev.parent, right?

Since I foolishly added this patch without the consumer, I'll outline here the
order.

1. memdev driver walks up hierarchy to find root port & check for switches
2. if parent.driver goto 6 (parent is CXL 2.0 capable root port)
3. starting from the root port this endpoint is connected to, walk down the
   topology and add all upstream and downstream ports
4. if parent.driver goto 6 (parent is CXL 2.0 DSP)
5. ERROR - don't bind
6. Add self as port

So I'm not seeing how this can work without something like I already have.
Please advise. For now, I'll keep my existing logic but call the function
find_parent_cxl_port() so we can replace it with something more palatable.

> > >
> > > > +       if (!parent_port)
> > > > +               return -ENODEV;
> > > > +
> > > > +       port = devm_cxl_add_port(dev, cxl_reg_block(pdev, &map), parent_port);
> > >
> > > This is broken because the pci device being used here does not have a
> > > driver that knows about CXL bus events.
> >
> > I don't understand this, but I'd like to. Doesn't this make a port device which
> > gets probed by the port driver? Why does the PCI device matter?
> 
> I am reacting to the first argument of this call being @dev that came
> from the pci_dev that was passed in to be the "host" for the devm
> operation. The devm release action triggers at driver unbind of that
> host device, but that doesn't make sense because the driver for a
> switch has nothing to do with CXL operation.
> 
> >
> > (I'll mention again, switch code is not tested).
> >
> > >
> > > > +       put_device(&parent_port->dev);
> > > > +       if (IS_ERR(port))
> > > > +               dev_err(dev, "Failed to add upstream port %ld\n",
> > > > +                       PTR_ERR(port));
> > > > +       else
> > > > +               dev_dbg(dev, "Added CXL port\n");
> > > > +
> > > > +       return rc;
> > > > +}
> > > > +
> > > > +static int add_downstream_port(struct pci_dev *pdev)
> > > > +{
> > > > +       resource_size_t creg = CXL_RESOURCE_NONE;
> > > > +       struct device *dev = &pdev->dev;
> > > > +       struct cxl_port *parent_port;
> > > > +       struct cxl_register_map map;
> > > > +       u32 lnkcap, port_num;
> > > > +       int rc;
> > > > +
> > > > +       /*
> > > > +        * Ports are to be scanned from top down. Therefore, the upstream port
> > > > +        * must already exist.
> > > > +        */
> > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent));
> > > > +       if (!parent_port)
> > > > +               return -ENODEV;
> > > > +
> > > > +       /*
> > > > +        * The spec mandates component registers are present but the
> > > > +        * driver does not.
> > >
> > > What is this trying to convey?
> > >
> >
> > That I'm not validating the hardware, and even though component registers are
> > mandatory, the driver will move on even if they're not found. This functionality
> > may need to change in the future and so I left the comment there.
> 
> I think that could be conveyed without comment with something like:
> 
>         if (rc)
>                 creg = CXL_RESOURCE_NONE;
>         else
>                 creg = cxl_reg_block(pdev, &map);

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-16 16:50         ` Ben Widawsky
@ 2021-11-16 17:51           ` Dan Williams
  2021-11-16 18:02             ` Ben Widawsky
  0 siblings, 1 reply; 106+ messages in thread
From: Dan Williams @ 2021-11-16 17:51 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On Tue, Nov 16, 2021 at 8:51 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-01 18:45:57, Dan Williams wrote:
> > On Mon, Nov 1, 2021 at 3:56 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > On 21-10-31 22:39:43, Dan Williams wrote:
> > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > > > > The CXL drivers encapsulate the components that direct memory traffic in
> > > > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > > > components: hostbridge (ie. a collection of root ports), switches, and
> > > > > endpoints. There are currently drivers that create these ports for the
> > > > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > > > introduced allows callers to initiate a scan down from the hostbridge
> > > > > and create ports for switches in the CXL topology.
> > > > >
> > > > > The intended user of this API is for endpoint devices. An endpoint
> > > > > device will need to determine if it is CXL.mem capable, which requires
> > > > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > > > capable. Once an endpoint device determines it's connected to a CXL
> > > > > capable root port, it can call this API to fill in all the ports in
> > > > > between the hostbridge and itself.
> > > > >
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > ---
> > > > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > > > >  drivers/cxl/core/Makefile                     |   1 +
> > > > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > > > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > > > >  drivers/cxl/cxl.h                             |   2 +
> > > > >  drivers/cxl/pci.h                             |   6 +
> > > > >  drivers/cxl/port.c                            |   2 +-
> > > > >  tools/testing/cxl/Kbuild                      |   1 +
> > > > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > > > >  create mode 100644 drivers/cxl/core/pci.c
> > > > >
> > > > > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > > > > index fbf0393cdddc..547336c95593 100644
> > > > > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > > > > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > > > > @@ -47,6 +47,12 @@ CXL Core
> > > > >  .. kernel-doc:: drivers/cxl/core/bus.c
> > > > >     :identifiers:
> > > > >
> > > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > > +   :doc: cxl pci
> > > > > +
> > > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > > +   :identifiers:
> > > > > +
> > > > >  .. kernel-doc:: drivers/cxl/core/pmem.c
> > > > >     :doc: cxl pmem
> > > > >
> > > > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > > > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > > > > --- a/drivers/cxl/core/Makefile
> > > > > +++ b/drivers/cxl/core/Makefile
> > > > > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> > > > >  cxl_core-y += regs.o
> > > > >  cxl_core-y += memdev.o
> > > > >  cxl_core-y += mbox.o
> > > > > +cxl_core-y += pci.o
> > > > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > > > index c7e1894d503b..f10e7d5b22a4 100644
> > > > > --- a/drivers/cxl/core/bus.c
> > > > > +++ b/drivers/cxl/core/bus.c
> > > > > @@ -8,6 +8,7 @@
> > > > >  #include <linux/idr.h>
> > > > >  #include <cxlmem.h>
> > > > >  #include <cxl.h>
> > > > > +#include <pci.h>
> > > > >  #include "core.h"
> > > > >
> > > > >  /**
> > > > > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> > > > >
> > > > > +void devm_cxl_remove_port(struct cxl_port *port)
> > > > > +{
> > > > > +       down_read(&root_host_sem);
> > > > > +       if (cxl_root_host) {
> > > > > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > > > > +               devm_release_action(cxl_root_host, unregister_port, port);
> > > > > +       }
> > > > > +       up_read(&root_host_sem);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> > > >
> > > > If the scan establishes the property that all child ports are devm
> > > > allocated with their cxl_port-parent, and only if the cxl_port-parent
> > > > is bound to its driver then I think we don't need to play
> > > > devm_release_action games().
> > > >
> > >
> > > We had discussed this previously. I was running into an issue when unloading
> > > cxl_mem. I needed a way to remove the endpoint port and this was your
> > > recommendation. Are you suggesting if the chain is set up correctly, I don't
> > > need to do anything?
> >
> > I think if the chain is set up correctly then you don't need to do
> > anything special. The endpoint port would be devm registered by the
> > cxl_memdev driver to its parent cxl_port provided that port is
> > actively attached to its driver.
> >
> > > I don't remember exactly what was blowing up but I can try again after things
> > > are properly parented.
> >
> > Cool.
> >
> > >
> > > > > +
> > > > > +static int match_port(struct device *dev, const void *data)
> > > > > +{
> > > > > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > > > > +
> > > > > +       if (dev->type != &cxl_port_type)
> > > > > +               return 0;
> > > > > +
> > > > > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > > > > +}
> > > > > +
> > > > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > > > +{
> > > > > +       struct device *port_dev;
> > > > > +
> > > > > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > > > +               return NULL;
> > > > > +
> > > > > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > > > +       if (port_dev)
> > > > > +               return to_cxl_port(port_dev);
> > > > > +
> > > > > +       return NULL;
> > > > > +}
> > > > > +
> > > > > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > > > > +{
> > > > > +       struct device *dev = &pdev->dev;
> > > > > +       struct cxl_port *parent_port;
> > > > > +       struct cxl_register_map map;
> > > > > +       struct cxl_port *port;
> > > > > +       int rc;
> > > > > +
> > > > > +       /*
> > > > > +        * Upstream ports must be connected to a downstream port or root port.
> > > > > +        * That downstream or root port must have a parent.
> > > > > +        */
> > > > > +       if (!pdev->dev.parent->parent)
> > > > > +               return -ENXIO;
> > > > > +
> > > > > +       /* A port is useless if there are no component registers */
> > > > > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > > > > +       if (rc)
> > > > > +               return rc;
> > > > > +
> > > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> > > >
> > > > This deref chain is unreadable. It wants a helper if it stays, but I
> > > > can't immediately think of a reason to ever need to look at a
> > > > grandparent in the hierarchy.
> > >
> > > The goal is to be able to find the next PCIe port up in the chain.
> > >
> > > My understanding was:
> > > pdev = PCIe upstream switch
> > > pdev->dev.parent = PCIe downstream switch connected to pdev.
> > > pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent
> > >
> > > I was unable to find an idiomatic way to do that. I'm open to suggestions.
> >
> > Oh ok, I see it now, but I think this can be done in pure CXL terms
> > and generic devices with the assumption that the parent device of a
> > cxl_memdev must be a dport. Then this works whether the parent port is
> > a platform device like ACPI or cxl_test, or a PCIe device.
> >
> > static int port_has_dport(struct device *dev, const void *dport_dev)
> > {
> >         int found = 0;
> >         struct cxl_port *port;
> >         struct cxl_dport *dport;
> >
> >         if (dev->type != &cxl_port_type)
> >                 return 0;
> >         port = to_cxl_port(dev);
> >
> >         device_lock(&port->dev);
> >         list_for_each_entry (dport, &port->dports, list)
> >                 if (dport->dport == dport_dev) {
> >                         found = 1;
> >                         break;
> >                 }
> >         device_unlock(&port->dev);
> >
> >         return found;
> > }
> >
> > struct cxl_port *find_parent_cxl_port(struct cxl_memdev *cxlmd)
> > {
> >         return bus_find_device(&cxl_bus_type, NULL, cxlmd->dev.parent,
> >                                port_has_dport);
> > }
> >
>
> I'm squinting at this and wondering what I will have to change to make it work.
>
> This logic you're asking me to change is the code which is adding the switches
> to the port hierarchy (before the memdev adds itself as a port). I first need to
> build that before I can check if the dport matches cxlmd->dev.parent, right?
>
> Since I foolishly added this patch without the consumer, I'll outline here the
> order.
>
> 1. memdev driver walks up hierarchy to find root port & check for switches
> 2. if parent.driver goto 6 (parent is CXL 2.0 capable root port)
> 3. starting from the root port this endpoint is connected to, walk down the
>    topology and add all upstream and downstream ports
> 4. if parent.driver goto 6 (parent is CXL 2.0 DSP)
> 5. ERROR - don't bind
> 6. Add self as port
>
> So I'm not seeing how this can work without something like I already have.
> Please advise. For now, I'll keep my existing logic but call the function
> find_parent_cxl_port() so we can replace it with something more palatable.

I am assuming that the grandparent of a memdev must be a dport, and
the grandparent of a dport must be another dport (until the root is
reached). With that assumption the topology can be built up
incrementally without needing to explicitly verify PCIE port types.
Something like the following where @dev is the endpoint memdev.

        while (dport = grandparent(dev)) {
                if (!is_cxl_dport(dport))
                        if (is_cxl_dport(grandparent(dport)))
                                /* add a new cxl_port */;
                        else {
                                /* walk up and retry */;
                                continue;
                        }
                port = dport->port;
                if (port->dev.driver)
                        /* good to go, CXL.mem is enabled all the way
to @dport */
                else
                        /* CXL.mem is disabled from @dport down */
        }

...where that needs repeating until it either finds a disabled
cxl_port, or until it stops adding new cxl_ports. Does that match your
proposal?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports
  2021-11-16 17:51           ` Dan Williams
@ 2021-11-16 18:02             ` Ben Widawsky
  0 siblings, 0 replies; 106+ messages in thread
From: Ben Widawsky @ 2021-11-16 18:02 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, Chet Douglas, Alison Schofield, Ira Weiny,
	Jonathan Cameron, Vishal Verma

On 21-11-16 09:51:36, Dan Williams wrote:
> On Tue, Nov 16, 2021 at 8:51 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > On 21-11-01 18:45:57, Dan Williams wrote:
> > > On Mon, Nov 1, 2021 at 3:56 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > On 21-10-31 22:39:43, Dan Williams wrote:
> > > > > On Fri, Oct 22, 2021 at 11:37 AM Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > > >
> > > > > > The CXL drivers encapsulate the components that direct memory traffic in
> > > > > > an entity known as a cxl_port. Compute Express Link specifies three such
> > > > > > components: hostbridge (ie. a collection of root ports), switches, and
> > > > > > endpoints. There are currently drivers that create these ports for the
> > > > > > hostbridges and the endpoints (cxl_acpi and cxl_mem). The new API
> > > > > > introduced allows callers to initiate a scan down from the hostbridge
> > > > > > and create ports for switches in the CXL topology.
> > > > > >
> > > > > > The intended user of this API is for endpoint devices. An endpoint
> > > > > > device will need to determine if it is CXL.mem capable, which requires
> > > > > > all components in the path from hostbridge to the endpoint to be CXL.mem
> > > > > > capable. Once an endpoint device determines it's connected to a CXL
> > > > > > capable root port, it can call this API to fill in all the ports in
> > > > > > between the hostbridge and itself.
> > > > > >
> > > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > > ---
> > > > > >  .../driver-api/cxl/memory-devices.rst         |   6 +
> > > > > >  drivers/cxl/core/Makefile                     |   1 +
> > > > > >  drivers/cxl/core/bus.c                        | 145 ++++++++++++++++++
> > > > > >  drivers/cxl/core/pci.c                        |  99 ++++++++++++
> > > > > >  drivers/cxl/cxl.h                             |   2 +
> > > > > >  drivers/cxl/pci.h                             |   6 +
> > > > > >  drivers/cxl/port.c                            |   2 +-
> > > > > >  tools/testing/cxl/Kbuild                      |   1 +
> > > > > >  8 files changed, 261 insertions(+), 1 deletion(-)
> > > > > >  create mode 100644 drivers/cxl/core/pci.c
> > > > > >
> > > > > > diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst
> > > > > > index fbf0393cdddc..547336c95593 100644
> > > > > > --- a/Documentation/driver-api/cxl/memory-devices.rst
> > > > > > +++ b/Documentation/driver-api/cxl/memory-devices.rst
> > > > > > @@ -47,6 +47,12 @@ CXL Core
> > > > > >  .. kernel-doc:: drivers/cxl/core/bus.c
> > > > > >     :identifiers:
> > > > > >
> > > > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > > > +   :doc: cxl pci
> > > > > > +
> > > > > > +.. kernel-doc:: drivers/cxl/core/pci.c
> > > > > > +   :identifiers:
> > > > > > +
> > > > > >  .. kernel-doc:: drivers/cxl/core/pmem.c
> > > > > >     :doc: cxl pmem
> > > > > >
> > > > > > diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> > > > > > index 07eb8e1fb8a6..9d33d2d5bf09 100644
> > > > > > --- a/drivers/cxl/core/Makefile
> > > > > > +++ b/drivers/cxl/core/Makefile
> > > > > > @@ -7,3 +7,4 @@ cxl_core-y += pmem.o
> > > > > >  cxl_core-y += regs.o
> > > > > >  cxl_core-y += memdev.o
> > > > > >  cxl_core-y += mbox.o
> > > > > > +cxl_core-y += pci.o
> > > > > > diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> > > > > > index c7e1894d503b..f10e7d5b22a4 100644
> > > > > > --- a/drivers/cxl/core/bus.c
> > > > > > +++ b/drivers/cxl/core/bus.c
> > > > > > @@ -8,6 +8,7 @@
> > > > > >  #include <linux/idr.h>
> > > > > >  #include <cxlmem.h>
> > > > > >  #include <cxl.h>
> > > > > > +#include <pci.h>
> > > > > >  #include "core.h"
> > > > > >
> > > > > >  /**
> > > > > > @@ -445,6 +446,150 @@ struct cxl_port *devm_cxl_add_port(struct device *uport,
> > > > > >  }
> > > > > >  EXPORT_SYMBOL_GPL(devm_cxl_add_port);
> > > > > >
> > > > > > +void devm_cxl_remove_port(struct cxl_port *port)
> > > > > > +{
> > > > > > +       down_read(&root_host_sem);
> > > > > > +       if (cxl_root_host) {
> > > > > > +               devm_release_action(cxl_root_host, cxl_unlink_uport, port);
> > > > > > +               devm_release_action(cxl_root_host, unregister_port, port);
> > > > > > +       }
> > > > > > +       up_read(&root_host_sem);
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(devm_cxl_remove_port);
> > > > >
> > > > > If the scan establishes the property that all child ports are devm
> > > > > allocated with their cxl_port-parent, and only if the cxl_port-parent
> > > > > is bound to its driver then I think we don't need to play
> > > > > devm_release_action games().
> > > > >
> > > >
> > > > We had discussed this previously. I was running into an issue when unloading
> > > > cxl_mem. I needed a way to remove the endpoint port and this was your
> > > > recommendation. Are you suggesting if the chain is set up correctly, I don't
> > > > need to do anything?
> > >
> > > I think if the chain is set up correctly then you don't need to do
> > > anything special. The endpoint port would be devm registered by the
> > > cxl_memdev driver to its parent cxl_port provided that port is
> > > actively attached to its driver.
> > >
> > > > I don't remember exactly what was blowing up but I can try again after things
> > > > are properly parented.
> > >
> > > Cool.
> > >
> > > >
> > > > > > +
> > > > > > +static int match_port(struct device *dev, const void *data)
> > > > > > +{
> > > > > > +       struct pci_dev *pdev = (struct pci_dev *)data;
> > > > > > +
> > > > > > +       if (dev->type != &cxl_port_type)
> > > > > > +               return 0;
> > > > > > +
> > > > > > +       return to_cxl_port(dev)->uport == &pdev->dev;
> > > > > > +}
> > > > > > +
> > > > > > +static struct cxl_port *find_cxl_port(struct pci_dev *usp)
> > > > > > +{
> > > > > > +       struct device *port_dev;
> > > > > > +
> > > > > > +       if (!pci_is_pcie(usp) || pci_pcie_type(usp) != PCI_EXP_TYPE_UPSTREAM)
> > > > > > +               return NULL;
> > > > > > +
> > > > > > +       port_dev = bus_find_device(&cxl_bus_type, NULL, usp, match_port);
> > > > > > +       if (port_dev)
> > > > > > +               return to_cxl_port(port_dev);
> > > > > > +
> > > > > > +       return NULL;
> > > > > > +}
> > > > > > +
> > > > > > +static int add_upstream_port(struct device *host, struct pci_dev *pdev)
> > > > > > +{
> > > > > > +       struct device *dev = &pdev->dev;
> > > > > > +       struct cxl_port *parent_port;
> > > > > > +       struct cxl_register_map map;
> > > > > > +       struct cxl_port *port;
> > > > > > +       int rc;
> > > > > > +
> > > > > > +       /*
> > > > > > +        * Upstream ports must be connected to a downstream port or root port.
> > > > > > +        * That downstream or root port must have a parent.
> > > > > > +        */
> > > > > > +       if (!pdev->dev.parent->parent)
> > > > > > +               return -ENXIO;
> > > > > > +
> > > > > > +       /* A port is useless if there are no component registers */
> > > > > > +       rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
> > > > > > +       if (rc)
> > > > > > +               return rc;
> > > > > > +
> > > > > > +       parent_port = find_cxl_port(to_pci_dev(pdev->dev.parent->parent));
> > > > >
> > > > > This deref chain is unreadable. It wants a helper if it stays, but I
> > > > > can't immediately think of a reason to ever need to look at a
> > > > > grandparent in the hierarchy.
> > > >
> > > > The goal is to be able to find the next PCIe port up in the chain.
> > > >
> > > > My understanding was:
> > > > pdev = PCIe upstream switch
> > > > pdev->dev.parent = PCIe downstream switch connected to pdev.
> > > > pdev->dev.parent->parent = PCIe upstream switch connected to pdev->dev.parent
> > > >
> > > > I was unable to find an idiomatic way to do that. I'm open to suggestions.
> > >
> > > Oh ok, I see it now, but I think this can be done in pure CXL terms
> > > and generic devices with the assumption that the parent device of a
> > > cxl_memdev must be a dport. Then this works whether the parent port is
> > > a platform device like ACPI or cxl_test, or a PCIe device.
> > >
> > > static int port_has_dport(struct device *dev, const void *dport_dev)
> > > {
> > >         int found = 0;
> > >         struct cxl_port *port;
> > >         struct cxl_dport *dport;
> > >
> > >         if (dev->type != &cxl_port_type)
> > >                 return 0;
> > >         port = to_cxl_port(dev);
> > >
> > >         device_lock(&port->dev);
> > >         list_for_each_entry (dport, &port->dports, list)
> > >                 if (dport->dport == dport_dev) {
> > >                         found = 1;
> > >                         break;
> > >                 }
> > >         device_unlock(&port->dev);
> > >
> > >         return found;
> > > }
> > >
> > > struct cxl_port *find_parent_cxl_port(struct cxl_memdev *cxlmd)
> > > {
> > >         return bus_find_device(&cxl_bus_type, NULL, cxlmd->dev.parent,
> > >                                port_has_dport);
> > > }
> > >
> >
> > I'm squinting at this and wondering what I will have to change to make it work.
> >
> > This logic you're asking me to change is the code which is adding the switches
> > to the port hierarchy (before the memdev adds itself as a port). I first need to
> > build that before I can check if the dport matches cxlmd->dev.parent, right?
> >
> > Since I foolishly added this patch without the consumer, I'll outline here the
> > order.
> >
> > 1. memdev driver walks up hierarchy to find root port & check for switches
> > 2. if parent.driver goto 6 (parent is CXL 2.0 capable root port)
> > 3. starting from the root port this endpoint is connected to, walk down the
> >    topology and add all upstream and downstream ports
> > 4. if parent.driver goto 6 (parent is CXL 2.0 DSP)
> > 5. ERROR - don't bind
> > 6. Add self as port
> >
> > So I'm not seeing how this can work without something like I already have.
> > Please advise. For now, I'll keep my existing logic but call the function
> > find_parent_cxl_port() so we can replace it with something more palatable.
> 
> I am assuming that the grandparent of a memdev must be a dport, and
> the grandparent of a dport must be another dport (until the root is
> reached). With that assumption the topology can be built up
> incrementally without needing to explicitly verify PCIE port types.
> Something like the following where @dev is the endpoint memdev.
> 
>         while (dport = grandparent(dev)) {
>                 if (!is_cxl_dport(dport))
>                         if (is_cxl_dport(grandparent(dport)))
>                                 /* add a new cxl_port */;
>                         else {
>                                 /* walk up and retry */;
>                                 continue;
>                         }
>                 port = dport->port;
>                 if (port->dev.driver)
>                         /* good to go, CXL.mem is enabled all the way
> to @dport */
>                 else
>                         /* CXL.mem is disabled from @dport down */
>         }
> 
> ...where that needs repeating until it either finds a disabled
> cxl_port, or until it stops adding new cxl_ports. Does that match your
> proposal?

I think I'm not conveying something properly, or I'm not fully understanding
you. The dports can only be added as you enumerate switches (assuming the
endpoint isn't directly connected to the root port). I'm specifically referring
to the enumeration part of the code where the switch port/dports are added. This
comes after the memdev driver walks up to find there is a switch in the middle.

I've done some cleanup, let me resend and we can further discuss on that. I
think it will be clearer with the last two patches here combined.


^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2021-11-16 18:05 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-22 18:36 [RFC PATCH v2 00/28] CXL Region Creation / HDM decoder programming Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 01/28] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
2021-10-29 20:15   ` Dan Williams
2021-10-29 21:20     ` Ben Widawsky
2021-10-29 21:39       ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 02/28] cxl: Move register block enumeration to core Ben Widawsky
2021-10-29 20:23   ` Dan Williams
2021-10-29 21:23     ` Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 03/28] cxl/acpi: Map component registers for Root Ports Ben Widawsky
2021-10-29 20:28   ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 04/28] cxl: Add helper for new drivers Ben Widawsky
2021-10-29 20:30   ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 05/28] cxl/core: Convert decoder range to resource Ben Widawsky
2021-10-29 20:50   ` Dan Williams
2021-10-29 21:26     ` Ben Widawsky
2021-10-29 22:22       ` Dan Williams
2021-10-29 22:37         ` Ben Widawsky
2021-11-01 14:33           ` Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 06/28] cxl: Introduce endpoint decoders Ben Widawsky
2021-10-29 21:00   ` Dan Williams
2021-10-29 22:02     ` Ben Widawsky
2021-10-29 22:25       ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 07/28] cxl/core: Move target population locking to caller Ben Widawsky
2021-10-29 23:03   ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 08/28] cxl/port: Introduce a port driver Ben Widawsky
2021-10-30  1:37   ` Dan Williams
2021-10-31 17:53     ` Dan Williams
2021-10-31 18:10       ` Dan Williams
2021-11-01 17:36         ` Ben Widawsky
2021-11-01 17:53     ` Ben Widawsky
2021-11-01 17:54       ` Ben Widawsky
2021-11-02  3:31       ` Dan Williams
2021-11-02 16:27         ` Ben Widawsky
2021-11-02 17:21           ` Dan Williams
2021-11-02 16:58         ` Ben Widawsky
2021-11-04 19:10           ` Dan Williams
2021-11-04 19:49             ` Ben Widawsky
2021-11-04 20:04               ` Dan Williams
2021-11-04 21:25                 ` Ben Widawsky
2021-11-04 16:37     ` Ben Widawsky
2021-11-04 19:17       ` Dan Williams
2021-11-04 19:46         ` Ben Widawsky
2021-11-04 20:00           ` Dan Williams
2021-11-04 21:26             ` Ben Widawsky
2021-11-03 15:18   ` Jonathan Cameron
2021-10-22 18:36 ` [RFC PATCH v2 09/28] cxl/acpi: Map single port host bridge component registers Ben Widawsky
2021-10-31 18:03   ` Dan Williams
2021-11-01 17:07     ` Ben Widawsky
2021-11-02  2:15       ` Dan Williams
2021-11-02 16:31         ` Ben Widawsky
2021-11-02 17:46           ` Dan Williams
2021-11-02 17:57             ` Ben Widawsky
2021-11-02 18:10               ` Dan Williams
2021-11-02 18:27                 ` Ben Widawsky
2021-11-02 18:49                   ` Dan Williams
2021-11-02 21:15                     ` Ben Widawsky
2021-11-02 21:34                       ` Dan Williams
2021-11-02 21:47                         ` Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 10/28] cxl/core: Store global list of root ports Ben Widawsky
2021-10-31 18:32   ` Dan Williams
2021-11-01 18:43     ` Ben Widawsky
2021-11-02  2:04       ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 11/28] cxl/acpi: Rescan bus at probe completion Ben Widawsky
2021-10-31 19:25   ` Dan Williams
2021-11-01 18:56     ` Ben Widawsky
2021-11-01 21:45       ` Ben Widawsky
2021-11-02  1:56         ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 12/28] cxl/core: Store component register base for memdevs Ben Widawsky
2021-10-31 20:13   ` Dan Williams
2021-11-01 21:50     ` Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 13/28] cxl: Flesh out register names Ben Widawsky
2021-10-31 20:18   ` Dan Williams
2021-11-01 22:00     ` Ben Widawsky
2021-11-02  1:53       ` Dan Williams
2021-11-03 15:53   ` Jonathan Cameron
2021-11-03 16:03     ` Ben Widawsky
2021-11-03 16:42       ` Jonathan Cameron
2021-11-03 17:05         ` Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 14/28] cxl: Hide devm host for ports Ben Widawsky
2021-10-31 21:14   ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 15/28] cxl/core: Introduce API to scan switch ports Ben Widawsky
2021-11-01  5:39   ` Dan Williams
2021-11-01 22:56     ` Ben Widawsky
2021-11-02  1:45       ` Dan Williams
2021-11-02 16:39         ` Ben Widawsky
2021-11-02 20:00           ` Dan Williams
2021-11-16 16:50         ` Ben Widawsky
2021-11-16 17:51           ` Dan Williams
2021-11-16 18:02             ` Ben Widawsky
2021-11-03 16:08   ` Jonathan Cameron
2021-11-10 17:49     ` Ben Widawsky
2021-11-10 18:10       ` Jonathan Cameron
2021-11-10 21:03         ` Dan Williams
2021-10-22 18:36 ` [RFC PATCH v2 16/28] cxl: Introduce cxl_mem driver Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 17/28] cxl: Disable switch hierarchies for now Ben Widawsky
2021-10-22 18:36 ` [RFC PATCH v2 18/28] cxl/region: Add region creation ABI Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 19/28] cxl/region: Introduce concept of region configuration Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 20/28] cxl/region: Introduce a cxl_region driver Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 21/28] cxl/acpi: Handle address space allocation Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 22/28] cxl/region: Address " Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 23/28] cxl/region: Implement XHB verification Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 24/28] cxl/region: HB port config verification Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 25/28] cxl/region: Record host bridge target list Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 26/28] cxl/mem: Store the endpoint's uport Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 27/28] cxl/region: Gather HDM decoder resources Ben Widawsky
2021-10-22 18:37 ` [RFC PATCH v2 28/28] cxl: Program decoders for regions Ben Widawsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).