linux-snps-arc.lists.infradead.org archive mirror
* [PATCH v1 0/4] mm/memory_hotplug: full support for
@ 2021-09-27 15:05 David Hildenbrand
  2021-09-27 15:05 ` [PATCH v1 1/4] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() David Hildenbrand
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: David Hildenbrand @ 2021-09-27 15:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Michal Hocko,
	Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

Architectures that require CONFIG_ARCH_KEEP_MEMBLOCK=y, such as arm64,
don't cleanly support add_memory_driver_managed() yet. Most prominently,
kexec_file can still end up placing images on such driver-managed memory,
resulting in undesired behavior.

Teaching kexec to not place images on driver-managed memory is especially
relevant for virtio-mem. Details can be found in commit 7b7b27214bba
("mm/memory_hotplug: introduce add_memory_driver_managed()").

Extend memblock with a new flag and set it from memory hotplug code
when applicable. This is required to fully support virtio-mem on
arm64, so that kexec_file also behaves as it does on x86-64.

Alternative A: Extend kexec_walk_memblock() to check the kernel resource
tree for IORESOURCE_SYSRAM_DRIVER_MANAGED. This feels wrong, because the
goal was to rely on memblock and not the resource tree.

Alternative B: Reuse MEMBLOCK_HOTPLUG. MEMBLOCK_HOTPLUG serves a different
purpose, though.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Jianyong Wu <Jianyong.Wu@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: kexec@lists.infradead.org

David Hildenbrand (4):
  mm/memory_hotplug: handle memblock_add_node() failures in
    add_memory_resource()
  memblock: allow to specify flags with memblock_add_node()
  memblock: add MEMBLOCK_DRIVER_MANAGED to mimic
    IORESOURCE_SYSRAM_DRIVER_MANAGED
  mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with
    IORESOURCE_SYSRAM_DRIVER_MANAGED

 arch/arc/mm/init.c               |  4 ++--
 arch/ia64/mm/contig.c            |  2 +-
 arch/ia64/mm/init.c              |  2 +-
 arch/m68k/mm/mcfmmu.c            |  3 ++-
 arch/m68k/mm/motorola.c          |  6 ++++--
 arch/mips/loongson64/init.c      |  4 +++-
 arch/mips/sgi-ip27/ip27-memory.c |  3 ++-
 arch/s390/kernel/setup.c         |  3 ++-
 include/linux/memblock.h         | 19 ++++++++++++++++---
 include/linux/mm.h               |  2 +-
 kernel/kexec_file.c              |  5 +++++
 mm/memblock.c                    | 13 +++++++++----
 mm/memory_hotplug.c              | 11 +++++++++--
 13 files changed, 57 insertions(+), 20 deletions(-)


base-commit: 5816b3e6577eaa676ceb00a848f0fd65fe2adc29
-- 
2.31.1


_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 1/4] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource()
  2021-09-27 15:05 [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
@ 2021-09-27 15:05 ` David Hildenbrand
  2021-09-27 15:05 ` [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node() David Hildenbrand
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2021-09-27 15:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Michal Hocko,
	Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

If memblock_add_node() fails, we're most probably running out of memory.
Although this is unlikely, it can happen, and memory added without a
memblock can be problematic for architectures that use memblock to
detect valid memory. Let's fail gracefully instead of silently ignoring
the error.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9fd0be32a281..917b3528636d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1384,8 +1384,11 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 
 	mem_hotplug_begin();
 
-	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
-		memblock_add_node(start, size, nid);
+	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
+		ret = memblock_add_node(start, size, nid);
+		if (ret)
+			goto error_mem_hotplug_end;
+	}
 
 	ret = __try_online_node(nid, false);
 	if (ret < 0)
@@ -1458,6 +1461,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		rollback_node_hotadd(nid);
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
 		memblock_remove(start, size);
+error_mem_hotplug_end:
 	mem_hotplug_done();
 	return ret;
 }
-- 
2.31.1
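
The error handling in this patch can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not kernel code:
struct hotplug_model and every model_* function are hypothetical stand-ins
for mem_hotplug_begin()/mem_hotplug_done(), memblock_add_node(),
memblock_remove() and the later hotplug steps. It shows why the new
error_mem_hotplug_end label sits after the memblock_remove() call: when the
add itself failed, there is nothing to remove.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel state touched by add_memory_resource(). */
struct hotplug_model {
	bool fail_memblock_add;	/* simulate memblock_add_node() failing */
	bool fail_later_step;	/* simulate a failure after the memblock add */
	int memblock_regions;	/* regions currently registered in the model */
	int lock_depth;		/* models mem_hotplug_begin()/mem_hotplug_done() */
};

static int model_memblock_add(struct hotplug_model *m)
{
	if (m->fail_memblock_add)
		return -ENOMEM;
	m->memblock_regions++;
	return 0;
}

static void model_memblock_remove(struct hotplug_model *m)
{
	/* Removing a region that was never added would be a bug in the model. */
	assert(m->memblock_regions > 0);
	m->memblock_regions--;
}

static int model_add_memory(struct hotplug_model *m)
{
	int ret;

	m->lock_depth++;			/* models mem_hotplug_begin() */

	ret = model_memblock_add(m);
	if (ret)
		goto error_mem_hotplug_end;	/* nothing added, skip the removal */

	if (m->fail_later_step) {
		ret = -EINVAL;			/* arbitrary error for the model */
		goto error;
	}

	m->lock_depth--;			/* models mem_hotplug_done() */
	return 0;

error:
	model_memblock_remove(m);		/* undo the successful add */
error_mem_hotplug_end:
	m->lock_depth--;			/* models mem_hotplug_done() */
	return ret;
}
```

In this model, a failing add returns -ENOMEM with no region registered and
the lock released, and a later failure removes the region again before
returning.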



* [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node()
  2021-09-27 15:05 [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
  2021-09-27 15:05 ` [PATCH v1 1/4] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() David Hildenbrand
@ 2021-09-27 15:05 ` David Hildenbrand
  2021-09-27 15:19   ` Geert Uytterhoeven
                     ` (2 more replies)
  2021-09-27 15:05 ` [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 15+ messages in thread
From: David Hildenbrand @ 2021-09-27 15:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Michal Hocko,
	Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

We want to specify flags when hotplugging memory. Let's prepare to pass
flags to memblock_add_node() by adjusting all existing users.

Note that when hotplugging memory the system is already up and running
and we don't want to add the memory first and apply flags later: it
should happen within one memblock call.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/arc/mm/init.c               | 4 ++--
 arch/ia64/mm/contig.c            | 2 +-
 arch/ia64/mm/init.c              | 2 +-
 arch/m68k/mm/mcfmmu.c            | 3 ++-
 arch/m68k/mm/motorola.c          | 6 ++++--
 arch/mips/loongson64/init.c      | 4 +++-
 arch/mips/sgi-ip27/ip27-memory.c | 3 ++-
 arch/s390/kernel/setup.c         | 3 ++-
 include/linux/memblock.h         | 3 ++-
 include/linux/mm.h               | 2 +-
 mm/memblock.c                    | 9 +++++----
 mm/memory_hotplug.c              | 2 +-
 12 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
index 699ecf119641..110eb69e9bee 100644
--- a/arch/arc/mm/init.c
+++ b/arch/arc/mm/init.c
@@ -59,13 +59,13 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 
 		low_mem_sz = size;
 		in_use = 1;
-		memblock_add_node(base, size, 0);
+		memblock_add_node(base, size, 0, MEMBLOCK_NONE);
 	} else {
 #ifdef CONFIG_HIGHMEM
 		high_mem_start = base;
 		high_mem_sz = size;
 		in_use = 1;
-		memblock_add_node(base, size, 1);
+		memblock_add_node(base, size, 1, MEMBLOCK_NONE);
 		memblock_reserve(base, size);
 #endif
 	}
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 42e025cfbd08..24901d809301 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -153,7 +153,7 @@ find_memory (void)
 	efi_memmap_walk(find_max_min_low_pfn, NULL);
 	max_pfn = max_low_pfn;
 
-	memblock_add_node(0, PFN_PHYS(max_low_pfn), 0);
+	memblock_add_node(0, PFN_PHYS(max_low_pfn), 0, MEMBLOCK_NONE);
 
 	find_initrd();
 
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 5c6da8d83c1a..5d165607bf35 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -378,7 +378,7 @@ int __init register_active_ranges(u64 start, u64 len, int nid)
 #endif
 
 	if (start < end)
-		memblock_add_node(__pa(start), end - start, nid);
+		memblock_add_node(__pa(start), end - start, nid, MEMBLOCK_NONE);
 	return 0;
 }
 
diff --git a/arch/m68k/mm/mcfmmu.c b/arch/m68k/mm/mcfmmu.c
index eac9dde65193..6f1f25125294 100644
--- a/arch/m68k/mm/mcfmmu.c
+++ b/arch/m68k/mm/mcfmmu.c
@@ -174,7 +174,8 @@ void __init cf_bootmem_alloc(void)
 	m68k_memory[0].addr = _rambase;
 	m68k_memory[0].size = _ramend - _rambase;
 
-	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
+	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0,
+			  MEMBLOCK_NONE);
 
 	/* compute total pages in system */
 	num_pages = PFN_DOWN(_ramend - _rambase);
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index 3a653f0a4188..e80c5d7e6728 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -410,7 +410,8 @@ void __init paging_init(void)
 
 	min_addr = m68k_memory[0].addr;
 	max_addr = min_addr + m68k_memory[0].size;
-	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
+	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0,
+			  MEMBLOCK_NONE);
 	for (i = 1; i < m68k_num_memory;) {
 		if (m68k_memory[i].addr < min_addr) {
 			printk("Ignoring memory chunk at 0x%lx:0x%lx before the first chunk\n",
@@ -421,7 +422,8 @@ void __init paging_init(void)
 				(m68k_num_memory - i) * sizeof(struct m68k_mem_info));
 			continue;
 		}
-		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i);
+		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i,
+				  MEMBLOCK_NONE);
 		addr = m68k_memory[i].addr + m68k_memory[i].size;
 		if (addr > max_addr)
 			max_addr = addr;
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index 76e0a9636a0e..4ac5ba80bbf6 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -77,7 +77,9 @@ void __init szmem(unsigned int node)
 				(u32)node_id, mem_type, mem_start, mem_size);
 			pr_info("       start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
 				start_pfn, end_pfn, num_physpages);
-			memblock_add_node(PFN_PHYS(start_pfn), PFN_PHYS(node_psize), node);
+			memblock_add_node(PFN_PHYS(start_pfn),
+					  PFN_PHYS(node_psize), node,
+					  MEMBLOCK_NONE);
 			break;
 		case SYSTEM_RAM_RESERVED:
 			pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
diff --git a/arch/mips/sgi-ip27/ip27-memory.c b/arch/mips/sgi-ip27/ip27-memory.c
index 6173684b5aaa..adc2faeecf7c 100644
--- a/arch/mips/sgi-ip27/ip27-memory.c
+++ b/arch/mips/sgi-ip27/ip27-memory.c
@@ -341,7 +341,8 @@ static void __init szmem(void)
 				continue;
 			}
 			memblock_add_node(PFN_PHYS(slot_getbasepfn(node, slot)),
-					  PFN_PHYS(slot_psize), node);
+					  PFN_PHYS(slot_psize), node,
+					  MEMBLOCK_NONE);
 		}
 	}
 }
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 67e5fff96ee0..f3943f15af6e 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -593,7 +593,8 @@ static void __init setup_resources(void)
 	 * part of the System RAM resource.
 	 */
 	if (crashk_res.end) {
-		memblock_add_node(crashk_res.start, resource_size(&crashk_res), 0);
+		memblock_add_node(crashk_res.start, resource_size(&crashk_res),
+				  0, MEMBLOCK_NONE);
 		memblock_reserve(crashk_res.start, resource_size(&crashk_res));
 		insert_resource(&iomem_resource, &crashk_res);
 	}
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 34de69b3b8ba..b49a58f621bc 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -100,7 +100,8 @@ static inline void memblock_discard(void) {}
 #endif
 
 void memblock_allow_resize(void);
-int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
+int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid,
+		      enum memblock_flags flags);
 int memblock_add(phys_addr_t base, phys_addr_t size);
 int memblock_remove(phys_addr_t base, phys_addr_t size);
 int memblock_free(phys_addr_t base, phys_addr_t size);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 73a52aba448f..0117cb35b212 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2447,7 +2447,7 @@ static inline unsigned long get_num_physpages(void)
  * unsigned long max_zone_pfns[MAX_NR_ZONES] = {max_dma, max_normal_pfn,
  * 							 max_highmem_pfn};
  * for_each_valid_physical_page_range()
- * 	memblock_add_node(base, size, nid)
+ *	memblock_add_node(base, size, nid, MEMBLOCK_NONE)
  * free_area_init(max_zone_pfns);
  */
 void free_area_init(unsigned long *max_zone_pfn);
diff --git a/mm/memblock.c b/mm/memblock.c
index 184dcd2e5d99..47a56b223141 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -655,6 +655,7 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
  * @base: base address of the new region
  * @size: size of the new region
  * @nid: nid of the new region
+ * @flags: flags of the new region
  *
  * Add new memblock region [@base, @base + @size) to the "memory"
  * type. See memblock_add_range() description for mode details
@@ -663,14 +664,14 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
  * 0 on success, -errno on failure.
  */
 int __init_memblock memblock_add_node(phys_addr_t base, phys_addr_t size,
-				       int nid)
+				      int nid, enum memblock_flags flags)
 {
 	phys_addr_t end = base + size - 1;
 
-	memblock_dbg("%s: [%pa-%pa] nid=%d %pS\n", __func__,
-		     &base, &end, nid, (void *)_RET_IP_);
+	memblock_dbg("%s: [%pa-%pa] nid=%d flags=%x %pS\n", __func__,
+		     &base, &end, nid, flags, (void *)_RET_IP_);
 
-	return memblock_add_range(&memblock.memory, base, size, nid, 0);
+	return memblock_add_range(&memblock.memory, base, size, nid, flags);
 }
 
 /**
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 917b3528636d..5f873e7f5b29 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1385,7 +1385,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	mem_hotplug_begin();
 
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
-		ret = memblock_add_node(start, size, nid);
+		ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE);
 		if (ret)
 			goto error_mem_hotplug_end;
 	}
-- 
2.31.1



* [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-09-27 15:05 [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
  2021-09-27 15:05 ` [PATCH v1 1/4] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() David Hildenbrand
  2021-09-27 15:05 ` [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node() David Hildenbrand
@ 2021-09-27 15:05 ` David Hildenbrand
  2021-09-29 16:39   ` Mike Rapoport
  2021-09-27 15:05 ` [PATCH v1 4/4] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
  2021-09-27 15:07 ` [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
  4 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2021-09-27 15:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Michal Hocko,
	Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED.
Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
like ordinary MEMBLOCK_NONE memory -- for example, when selecting memory
regions to add to the vmcore for dumping in the crashkernel via
for_each_mem_range().

However, kexec_file in particular is not supposed to select such memblocks
via for_each_free_mem_range() / for_each_free_mem_range_reverse() to place
kexec images, similar to how we handle IORESOURCE_SYSRAM_DRIVER_MANAGED
without CONFIG_ARCH_KEEP_MEMBLOCK.

Let's document why kexec_walk_memblock() won't try placing images on
areas marked MEMBLOCK_DRIVER_MANAGED -- similar to
IORESOURCE_SYSRAM_DRIVER_MANAGED handling in locate_mem_hole_callback()
via kexec_walk_resources().

We'll make sure that memory hotplug code sets the flag where applicable
(IORESOURCE_SYSRAM_DRIVER_MANAGED) next. This prepares architectures
that need CONFIG_ARCH_KEEP_MEMBLOCK, such as arm64, for virtio-mem
support.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memblock.h | 16 ++++++++++++++--
 kernel/kexec_file.c      |  5 +++++
 mm/memblock.c            |  4 ++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index b49a58f621bc..7d8d656d5082 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -33,12 +33,17 @@ extern unsigned long long max_possible_pfn;
  * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
  * reserved in the memory map; refer to memblock_mark_nomap() description
  * for further details
+ * @MEMBLOCK_DRIVER_MANAGED: memory region that is always detected via a driver,
+ * corresponding to IORESOURCE_SYSRAM_DRIVER_MANAGED in the kernel resource
+ * tree. Especially kexec should never use this memory for placing images and
+ * shouldn't expose this memory to the second kernel.
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
 	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
+	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
 };
 
 /**
@@ -209,7 +214,8 @@ static inline void __next_physmem_range(u64 *idx, struct memblock_type *type,
  */
 #define for_each_mem_range(i, p_start, p_end) \
 	__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,	\
-			     MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
+			     MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED, \
+			     p_start, p_end, NULL)
 
 /**
  * for_each_mem_range_rev - reverse iterate through memblock areas from
@@ -220,7 +226,8 @@ static inline void __next_physmem_range(u64 *idx, struct memblock_type *type,
  */
 #define for_each_mem_range_rev(i, p_start, p_end)			\
 	__for_each_mem_range_rev(i, &memblock.memory, NULL, NUMA_NO_NODE, \
-				 MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
+				 MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED,\
+				 p_start, p_end, NULL)
 
 /**
  * for_each_reserved_mem_range - iterate over all reserved memblock areas
@@ -250,6 +257,11 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }
 
+static inline bool memblock_is_driver_managed(struct memblock_region *m)
+{
+	return m->flags & MEMBLOCK_DRIVER_MANAGED;
+}
+
 int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 33400ff051a8..8347fc158d2b 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -556,6 +556,11 @@ static int kexec_walk_memblock(struct kexec_buf *kbuf,
 	if (kbuf->image->type == KEXEC_TYPE_CRASH)
 		return func(&crashk_res, kbuf);
 
+	/*
+	 * Using MEMBLOCK_NONE will properly skip MEMBLOCK_DRIVER_MANAGED. See
+	 * IORESOURCE_SYSRAM_DRIVER_MANAGED handling in
+	 * locate_mem_hole_callback().
+	 */
 	if (kbuf->top_down) {
 		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, MEMBLOCK_NONE,
 						&mstart, &mend, NULL) {
diff --git a/mm/memblock.c b/mm/memblock.c
index 47a56b223141..540a35317fb0 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -979,6 +979,10 @@ static bool should_skip_region(struct memblock_type *type,
 	if (!(flags & MEMBLOCK_NOMAP) && memblock_is_nomap(m))
 		return true;
 
+	/* skip driver-managed memory unless we were asked for it explicitly */
+	if (!(flags & MEMBLOCK_DRIVER_MANAGED) && memblock_is_driver_managed(m))
+		return true;
+
 	return false;
 }
 
-- 
2.31.1
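
The filtering rule added to should_skip_region() can be tried out in
userspace with a simplified model. This is a sketch under assumptions:
struct model_region, model_should_skip() and model_count_visited() are
hypothetical names, and only the two checks visible in the hunk above
(NOMAP and DRIVER_MANAGED) are modelled; the real function performs further
checks that are left out here. The flag values are copied from the enum in
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined by the enum in the patch. */
enum memblock_flags {
	MEMBLOCK_NONE		= 0x0,
	MEMBLOCK_HOTPLUG	= 0x1,
	MEMBLOCK_MIRROR		= 0x2,
	MEMBLOCK_NOMAP		= 0x4,
	MEMBLOCK_DRIVER_MANAGED	= 0x8,
};

/* Hypothetical, simplified region; the kernel structure has more fields. */
struct model_region {
	unsigned long long base;
	unsigned long long size;
	unsigned int flags;
};

/* Models only the NOMAP and DRIVER_MANAGED checks from the hunk above. */
static bool model_should_skip(const struct model_region *r, unsigned int flags)
{
	if (!(flags & MEMBLOCK_NOMAP) && (r->flags & MEMBLOCK_NOMAP))
		return true;

	/* skip driver-managed memory unless it was asked for explicitly */
	if (!(flags & MEMBLOCK_DRIVER_MANAGED) &&
	    (r->flags & MEMBLOCK_DRIVER_MANAGED))
		return true;

	return false;
}

/* Count the regions a walker using the given flags would visit. */
static size_t model_count_visited(const struct model_region *regions,
				  size_t n, unsigned int flags)
{
	size_t i, visited = 0;

	assert(regions != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!model_should_skip(&regions[i], flags))
			visited++;
	return visited;
}
```

With this model, a walker that passes MEMBLOCK_NONE (as kexec_walk_memblock()
does) never sees a driver-managed region, while a walker that passes
MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED (as for_each_mem_range() does
after this patch) still does.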



* [PATCH v1 4/4] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-09-27 15:05 [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
                   ` (2 preceding siblings ...)
  2021-09-27 15:05 ` [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
@ 2021-09-27 15:05 ` David Hildenbrand
  2021-09-27 15:07 ` [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
  4 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2021-09-27 15:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Andrew Morton, Mike Rapoport, Michal Hocko,
	Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

Let's communicate driver-managed regions to memblock, to properly
teach kexec_file with CONFIG_ARCH_KEEP_MEMBLOCK to not place images on
these memory regions.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5f873e7f5b29..6d90818d4ce8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1357,6 +1357,7 @@ bool mhp_supports_memmap_on_memory(unsigned long size)
 int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 {
 	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
+	enum memblock_flags memblock_flags = MEMBLOCK_NONE;
 	struct vmem_altmap mhp_altmap = {};
 	struct memory_group *group = NULL;
 	u64 start, size;
@@ -1385,7 +1386,9 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	mem_hotplug_begin();
 
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
-		ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE);
+		if (res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED)
+			memblock_flags = MEMBLOCK_DRIVER_MANAGED;
+		ret = memblock_add_node(start, size, nid, memblock_flags);
 		if (ret)
 			goto error_mem_hotplug_end;
 	}
-- 
2.31.1
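
The flag translation in this patch amounts to the following, shown here as a
userspace sketch. Everything prefixed with model_ or MODEL_ is hypothetical:
the resource flag value is a stand-in chosen for illustration (the real
definition lives in include/linux/ioport.h), and model_memblock_add_node()
merely records what it was given. The point is that the memblock flag is
derived from the resource flags first and then handed over in the same call
that adds the region.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in value for illustration only; see include/linux/ioport.h. */
#define MODEL_IORESOURCE_SYSRAM_DRIVER_MANAGED	0x02000000UL

#define MODEL_MEMBLOCK_NONE			0x0u
#define MODEL_MEMBLOCK_DRIVER_MANAGED		0x8u

/* Hypothetical recorder standing in for the memblock "memory" type. */
struct model_memblock {
	unsigned int last_flags;	/* flags of the most recently added region */
	int nr_regions;			/* number of regions added so far */
};

/* Models memblock_add_node(): the flags arrive together with the region. */
static int model_memblock_add_node(struct model_memblock *mb, unsigned int flags)
{
	assert(mb != NULL);
	mb->last_flags = flags;
	mb->nr_regions++;
	return 0;
}

/* Models the flag selection added to add_memory_resource(). */
static int model_add_memory_resource(struct model_memblock *mb,
				     unsigned long res_flags)
{
	unsigned int memblock_flags = MODEL_MEMBLOCK_NONE;

	if (res_flags & MODEL_IORESOURCE_SYSRAM_DRIVER_MANAGED)
		memblock_flags = MODEL_MEMBLOCK_DRIVER_MANAGED;

	return model_memblock_add_node(mb, memblock_flags);
}
```

Because the flag travels with the add, the model never holds a driver-managed
region that is temporarily recorded without its flag.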



* Re: [PATCH v1 0/4] mm/memory_hotplug: full support for
  2021-09-27 15:05 [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
                   ` (3 preceding siblings ...)
  2021-09-27 15:05 ` [PATCH v1 4/4] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
@ 2021-09-27 15:07 ` David Hildenbrand
  4 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2021-09-27 15:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Mike Rapoport, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

Intended subject was "[PATCH v1 0/4] mm/memory_hotplug: full support for 
add_memory_driver_managed() with CONFIG_ARCH_KEEP_MEMBLOCK"

-- 
Thanks,

David / dhildenb



* Re: [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node()
  2021-09-27 15:05 ` [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node() David Hildenbrand
@ 2021-09-27 15:19   ` Geert Uytterhoeven
  2021-09-28  9:38   ` Heiko Carstens
  2021-09-29 16:25   ` Mike Rapoport
  2 siblings, 0 replies; 15+ messages in thread
From: Geert Uytterhoeven @ 2021-09-27 15:19 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Linux Kernel Mailing List, Andrew Morton, Mike Rapoport,
	Michal Hocko, Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V,
	Vineet Gupta, Huacai Chen, Jiaxun Yang, Thomas Bogendoerfer,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	Eric Biederman, Arnd Bergmann, arcml, linux-ia64, linux-m68k,
	open list:BROADCOM NVRAM DRIVER, linux-s390, Linux MM, kexec

On Mon, Sep 27, 2021 at 5:05 PM David Hildenbrand <david@redhat.com> wrote:
> We want to specify flags when hotplugging memory. Let's prepare to pass
> flags to memblock_add_node() by adjusting all existing users.
>
> Note that when hotplugging memory the system is already up and running
> and we don't want to add the memory first and apply flags later: it
> should happen within one memblock call.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>

>  arch/m68k/mm/mcfmmu.c            | 3 ++-
>  arch/m68k/mm/motorola.c          | 6 ++++--

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node()
  2021-09-27 15:05 ` [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node() David Hildenbrand
  2021-09-27 15:19   ` Geert Uytterhoeven
@ 2021-09-28  9:38   ` Heiko Carstens
  2021-09-29 16:25   ` Mike Rapoport
  2 siblings, 0 replies; 15+ messages in thread
From: Heiko Carstens @ 2021-09-28  9:38 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Andrew Morton, Mike Rapoport, Michal Hocko,
	Oscar Salvador, Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Vasily Gorbik, Christian Borntraeger,
	Eric Biederman, Arnd Bergmann, linux-snps-arc, linux-ia64,
	linux-m68k, linux-mips, linux-s390, linux-mm, kexec

On Mon, Sep 27, 2021 at 05:05:16PM +0200, David Hildenbrand wrote:
> We want to specify flags when hotplugging memory. Let's prepare to pass
> flags to memblock_add_node() by adjusting all existing users.
> 
> Note that when hotplugging memory the system is already up and running
> and we don't want to add the memory first and apply flags later: it
> should happen within one memblock call.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
...
>  arch/s390/kernel/setup.c         | 3 ++-

For s390
Acked-by: Heiko Carstens <hca@linux.ibm.com>


* Re: [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node()
  2021-09-27 15:05 ` [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node() David Hildenbrand
  2021-09-27 15:19   ` Geert Uytterhoeven
  2021-09-28  9:38   ` Heiko Carstens
@ 2021-09-29 16:25   ` Mike Rapoport
  2021-09-29 16:30     ` David Hildenbrand
  2 siblings, 1 reply; 15+ messages in thread
From: Mike Rapoport @ 2021-09-29 16:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

On Mon, Sep 27, 2021 at 05:05:16PM +0200, David Hildenbrand wrote:
> We want to specify flags when hotplugging memory. Let's prepare to pass
> flags to memblock_add_node() by adjusting all existing users.
> 
> Note that when hotplugging memory the system is already up and running
> and we don't want to add the memory first and apply flags later: it
> should happen within one memblock call.

Why is it important that the system is up, and why should it happen in a
single call?
I don't mind adding flags parameter to memblock_add_node() but this
changelog does not really explain the reasons to do it.
 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/arc/mm/init.c               | 4 ++--
>  arch/ia64/mm/contig.c            | 2 +-
>  arch/ia64/mm/init.c              | 2 +-
>  arch/m68k/mm/mcfmmu.c            | 3 ++-
>  arch/m68k/mm/motorola.c          | 6 ++++--
>  arch/mips/loongson64/init.c      | 4 +++-
>  arch/mips/sgi-ip27/ip27-memory.c | 3 ++-
>  arch/s390/kernel/setup.c         | 3 ++-
>  include/linux/memblock.h         | 3 ++-
>  include/linux/mm.h               | 2 +-
>  mm/memblock.c                    | 9 +++++----
>  mm/memory_hotplug.c              | 2 +-
>  12 files changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
> index 699ecf119641..110eb69e9bee 100644
> --- a/arch/arc/mm/init.c
> +++ b/arch/arc/mm/init.c
> @@ -59,13 +59,13 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
>  
>  		low_mem_sz = size;
>  		in_use = 1;
> -		memblock_add_node(base, size, 0);
> +		memblock_add_node(base, size, 0, MEMBLOCK_NONE);
>  	} else {
>  #ifdef CONFIG_HIGHMEM
>  		high_mem_start = base;
>  		high_mem_sz = size;
>  		in_use = 1;
> -		memblock_add_node(base, size, 1);
> +		memblock_add_node(base, size, 1, MEMBLOCK_NONE);
>  		memblock_reserve(base, size);
>  #endif
>  	}
> diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
> index 42e025cfbd08..24901d809301 100644
> --- a/arch/ia64/mm/contig.c
> +++ b/arch/ia64/mm/contig.c
> @@ -153,7 +153,7 @@ find_memory (void)
>  	efi_memmap_walk(find_max_min_low_pfn, NULL);
>  	max_pfn = max_low_pfn;
>  
> -	memblock_add_node(0, PFN_PHYS(max_low_pfn), 0);
> +	memblock_add_node(0, PFN_PHYS(max_low_pfn), 0, MEMBLOCK_NONE);
>  
>  	find_initrd();
>  
> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> index 5c6da8d83c1a..5d165607bf35 100644
> --- a/arch/ia64/mm/init.c
> +++ b/arch/ia64/mm/init.c
> @@ -378,7 +378,7 @@ int __init register_active_ranges(u64 start, u64 len, int nid)
>  #endif
>  
>  	if (start < end)
> -		memblock_add_node(__pa(start), end - start, nid);
> +		memblock_add_node(__pa(start), end - start, nid, MEMBLOCK_NONE);
>  	return 0;
>  }
>  
> diff --git a/arch/m68k/mm/mcfmmu.c b/arch/m68k/mm/mcfmmu.c
> index eac9dde65193..6f1f25125294 100644
> --- a/arch/m68k/mm/mcfmmu.c
> +++ b/arch/m68k/mm/mcfmmu.c
> @@ -174,7 +174,8 @@ void __init cf_bootmem_alloc(void)
>  	m68k_memory[0].addr = _rambase;
>  	m68k_memory[0].size = _ramend - _rambase;
>  
> -	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
> +	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0,
> +			  MEMBLOCK_NONE);
>  
>  	/* compute total pages in system */
>  	num_pages = PFN_DOWN(_ramend - _rambase);
> diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
> index 3a653f0a4188..e80c5d7e6728 100644
> --- a/arch/m68k/mm/motorola.c
> +++ b/arch/m68k/mm/motorola.c
> @@ -410,7 +410,8 @@ void __init paging_init(void)
>  
>  	min_addr = m68k_memory[0].addr;
>  	max_addr = min_addr + m68k_memory[0].size;
> -	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
> +	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0,
> +			  MEMBLOCK_NONE);
>  	for (i = 1; i < m68k_num_memory;) {
>  		if (m68k_memory[i].addr < min_addr) {
>  			printk("Ignoring memory chunk at 0x%lx:0x%lx before the first chunk\n",
> @@ -421,7 +422,8 @@ void __init paging_init(void)
>  				(m68k_num_memory - i) * sizeof(struct m68k_mem_info));
>  			continue;
>  		}
> -		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i);
> +		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i,
> +				  MEMBLOCK_NONE);
>  		addr = m68k_memory[i].addr + m68k_memory[i].size;
>  		if (addr > max_addr)
>  			max_addr = addr;
> diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> index 76e0a9636a0e..4ac5ba80bbf6 100644
> --- a/arch/mips/loongson64/init.c
> +++ b/arch/mips/loongson64/init.c
> @@ -77,7 +77,9 @@ void __init szmem(unsigned int node)
>  				(u32)node_id, mem_type, mem_start, mem_size);
>  			pr_info("       start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
>  				start_pfn, end_pfn, num_physpages);
> -			memblock_add_node(PFN_PHYS(start_pfn), PFN_PHYS(node_psize), node);
> +			memblock_add_node(PFN_PHYS(start_pfn),
> +					  PFN_PHYS(node_psize), node,
> +					  MEMBLOCK_NONE);
>  			break;
>  		case SYSTEM_RAM_RESERVED:
>  			pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> diff --git a/arch/mips/sgi-ip27/ip27-memory.c b/arch/mips/sgi-ip27/ip27-memory.c
> index 6173684b5aaa..adc2faeecf7c 100644
> --- a/arch/mips/sgi-ip27/ip27-memory.c
> +++ b/arch/mips/sgi-ip27/ip27-memory.c
> @@ -341,7 +341,8 @@ static void __init szmem(void)
>  				continue;
>  			}
>  			memblock_add_node(PFN_PHYS(slot_getbasepfn(node, slot)),
> -					  PFN_PHYS(slot_psize), node);
> +					  PFN_PHYS(slot_psize), node,
> +					  MEMBLOCK_NONE);
>  		}
>  	}
>  }
> diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
> index 67e5fff96ee0..f3943f15af6e 100644
> --- a/arch/s390/kernel/setup.c
> +++ b/arch/s390/kernel/setup.c
> @@ -593,7 +593,8 @@ static void __init setup_resources(void)
>  	 * part of the System RAM resource.
>  	 */
>  	if (crashk_res.end) {
> -		memblock_add_node(crashk_res.start, resource_size(&crashk_res), 0);
> +		memblock_add_node(crashk_res.start, resource_size(&crashk_res),
> +				  0, MEMBLOCK_NONE);
>  		memblock_reserve(crashk_res.start, resource_size(&crashk_res));
>  		insert_resource(&iomem_resource, &crashk_res);
>  	}
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 34de69b3b8ba..b49a58f621bc 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -100,7 +100,8 @@ static inline void memblock_discard(void) {}
>  #endif
>  
>  void memblock_allow_resize(void);
> -int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
> +int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid,
> +		      enum memblock_flags flags);
>  int memblock_add(phys_addr_t base, phys_addr_t size);
>  int memblock_remove(phys_addr_t base, phys_addr_t size);
>  int memblock_free(phys_addr_t base, phys_addr_t size);
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 73a52aba448f..0117cb35b212 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2447,7 +2447,7 @@ static inline unsigned long get_num_physpages(void)
>   * unsigned long max_zone_pfns[MAX_NR_ZONES] = {max_dma, max_normal_pfn,
>   * 							 max_highmem_pfn};
>   * for_each_valid_physical_page_range()
> - * 	memblock_add_node(base, size, nid)
> + *	memblock_add_node(base, size, nid, MEMBLOCK_NONE)
>   * free_area_init(max_zone_pfns);
>   */
>  void free_area_init(unsigned long *max_zone_pfn);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 184dcd2e5d99..47a56b223141 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -655,6 +655,7 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
>   * @base: base address of the new region
>   * @size: size of the new region
>   * @nid: nid of the new region
> + * @flags: flags of the new region
>   *
>   * Add new memblock region [@base, @base + @size) to the "memory"
>   * type. See memblock_add_range() description for mode details
> @@ -663,14 +664,14 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
>   * 0 on success, -errno on failure.
>   */
>  int __init_memblock memblock_add_node(phys_addr_t base, phys_addr_t size,
> -				       int nid)
> +				      int nid, enum memblock_flags flags)
>  {
>  	phys_addr_t end = base + size - 1;
>  
> -	memblock_dbg("%s: [%pa-%pa] nid=%d %pS\n", __func__,
> -		     &base, &end, nid, (void *)_RET_IP_);
> +	memblock_dbg("%s: [%pa-%pa] nid=%d flags=%x %pS\n", __func__,
> +		     &base, &end, nid, flags, (void *)_RET_IP_);
>  
> -	return memblock_add_range(&memblock.memory, base, size, nid, 0);
> +	return memblock_add_range(&memblock.memory, base, size, nid, flags);
>  }
>  
>  /**
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 917b3528636d..5f873e7f5b29 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1385,7 +1385,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
>  	mem_hotplug_begin();
>  
>  	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
> -		ret = memblock_add_node(start, size, nid);
> +		ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE);
>  		if (ret)
>  			goto error_mem_hotplug_end;
>  	}
> -- 
> 2.31.1
> 

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node()
  2021-09-29 16:25   ` Mike Rapoport
@ 2021-09-29 16:30     ` David Hildenbrand
  0 siblings, 0 replies; 15+ messages in thread
From: David Hildenbrand @ 2021-09-29 16:30 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

On 29.09.21 18:25, Mike Rapoport wrote:
> On Mon, Sep 27, 2021 at 05:05:16PM +0200, David Hildenbrand wrote:
>> We want to specify flags when hotplugging memory. Let's prepare to pass
>> flags to memblock_add_node() by adjusting all existing users.
>>
>> Note that when hotplugging memory the system is already up and running
>> and we don't want to add the memory first and apply flags later: it
>> should happen within one memblock call.
> 
> Why is it important that the system is up and why it should happen in a
> single call?
> I don't mind adding flags parameter to memblock_add_node() but this
> changelog does not really explain the reasons to do it.

"After memblock_add_node(), we could race with anybody performing a 
search for MEMBLOCK_NONE, like kexec_file -- and that only happens once 
the system is already up and running. So we want both steps to happen 
atomically."

I can add that to the patch description.

(I think it still won't be completely atomic because memblock isn't 
properly implementing locking yet, but that's a different story)
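
A rough userspace sketch of that window, in case it helps the changelog. 
It assumes a toy region array in place of memblock: toy_memblock, 
toy_add_node(), toy_count_visible() and toy_two_step_window() are invented 
for illustration, only the flag values come from this series, and region 
merging and locking are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as in enum memblock_flags from patch 3/4. */
enum memblock_flags {
	MEMBLOCK_NONE		= 0x0,
	MEMBLOCK_HOTPLUG	= 0x1,
	MEMBLOCK_MIRROR		= 0x2,
	MEMBLOCK_NOMAP		= 0x4,
	MEMBLOCK_DRIVER_MANAGED	= 0x8,
};

/* Hypothetical, simplified stand-in for struct memblock_region. */
struct toy_region {
	uint64_t base;
	uint64_t size;
	int nid;
	unsigned int flags;
};

#define TOY_MAX_REGIONS 8

struct toy_memblock {
	struct toy_region regions[TOY_MAX_REGIONS];
	size_t cnt;
};

/* One call: the region becomes visible with its final flags already set. */
static int toy_add_node(struct toy_memblock *mb, uint64_t base, uint64_t size,
			int nid, unsigned int flags)
{
	if (mb->cnt == TOY_MAX_REGIONS)
		return -1;
	mb->regions[mb->cnt++] = (struct toy_region){ base, size, nid, flags };
	return 0;
}

/* Regions a MEMBLOCK_NONE walker (think kexec_file) would consider. */
static size_t toy_count_visible(const struct toy_memblock *mb)
{
	size_t i, n = 0;

	for (i = 0; i < mb->cnt; i++)
		if (!(mb->regions[i].flags & MEMBLOCK_DRIVER_MANAGED))
			n++;
	return n;
}

/*
 * Two-step variant: add first, flag later. Returns what a walker running
 * between the two steps would have counted.
 */
static size_t toy_two_step_window(struct toy_memblock *mb, uint64_t base,
				  uint64_t size, int nid)
{
	size_t seen;

	if (toy_add_node(mb, base, size, nid, MEMBLOCK_NONE))
		return toy_count_visible(mb);
	seen = toy_count_visible(mb);	/* a concurrent walk could run here */
	mb->regions[mb->cnt - 1].flags |= MEMBLOCK_DRIVER_MANAGED;
	return seen;
}
```

In the two-step variant the new region is briefly counted as ordinary 
memory; passing the flag to the add call never exposes it that way.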

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-09-27 15:05 ` [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
@ 2021-09-29 16:39   ` Mike Rapoport
  2021-09-29 16:54     ` David Hildenbrand
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Rapoport @ 2021-09-29 16:39 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

Hi,

On Mon, Sep 27, 2021 at 05:05:17PM +0200, David Hildenbrand wrote:
> Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED.
> Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
> like ordinary MEMBLOCK_NONE memory -- for example, when selecting memory
> regions to add to the vmcore for dumping in the crashkernel via
> for_each_mem_range().
 
Can you please elaborate on the difference in semantics of MEMBLOCK_HOTPLUG
and MEMBLOCK_DRIVER_MANAGED?
Unless I'm missing something they both mark memory that can be unplugged
anytime and so it should not be used in certain cases. Why is there a need
for a new flag?

> However, especially kexec_file is not supposed to select such memblocks via
> for_each_free_mem_range() / for_each_free_mem_range_reverse() to place
> kexec images, similar to how we handle IORESOURCE_SYSRAM_DRIVER_MANAGED
> without CONFIG_ARCH_KEEP_MEMBLOCK.
> 
> Let's document why kexec_walk_memblock() won't try placing images on
> areas marked MEMBLOCK_DRIVER_MANAGED -- similar to
> IORESOURCE_SYSRAM_DRIVER_MANAGED handling in locate_mem_hole_callback()
> via kexec_walk_resources().
> 
> We'll make sure that memory hotplug code sets the flag where applicable
> (IORESOURCE_SYSRAM_DRIVER_MANAGED) next. This prepares architectures
> that need CONFIG_ARCH_KEEP_MEMBLOCK, such as arm64, for virtio-mem
> support.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  include/linux/memblock.h | 16 ++++++++++++++--
>  kernel/kexec_file.c      |  5 +++++
>  mm/memblock.c            |  4 ++++
>  3 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index b49a58f621bc..7d8d656d5082 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -33,12 +33,17 @@ extern unsigned long long max_possible_pfn;
>   * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
>   * reserved in the memory map; refer to memblock_mark_nomap() description
>   * for further details
> + * @MEMBLOCK_DRIVER_MANAGED: memory region that is always detected via a driver,
> + * corresponding to IORESOURCE_SYSRAM_DRIVER_MANAGED in the kernel resource
> + * tree. Especially kexec should never use this memory for placing images and
> + * shouldn't expose this memory to the second kernel.
>   */
>  enum memblock_flags {
>  	MEMBLOCK_NONE		= 0x0,	/* No special request */
>  	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
>  	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
>  	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
> +	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
>  };
>  
>  /**
> @@ -209,7 +214,8 @@ static inline void __next_physmem_range(u64 *idx, struct memblock_type *type,
>   */
>  #define for_each_mem_range(i, p_start, p_end) \
>  	__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,	\
> -			     MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
> +			     MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED, \
> +			     p_start, p_end, NULL)
>  
>  /**
>   * for_each_mem_range_rev - reverse iterate through memblock areas from
> @@ -220,7 +226,8 @@ static inline void __next_physmem_range(u64 *idx, struct memblock_type *type,
>   */
>  #define for_each_mem_range_rev(i, p_start, p_end)			\
>  	__for_each_mem_range_rev(i, &memblock.memory, NULL, NUMA_NO_NODE, \
> -				 MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
> +				 MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED,\
> +				 p_start, p_end, NULL)
>  
>  /**
>   * for_each_reserved_mem_range - iterate over all reserved memblock areas
> @@ -250,6 +257,11 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
>  	return m->flags & MEMBLOCK_NOMAP;
>  }
>  
> +static inline bool memblock_is_driver_managed(struct memblock_region *m)
> +{
> +	return m->flags & MEMBLOCK_DRIVER_MANAGED;
> +}
> +
>  int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
>  			    unsigned long  *end_pfn);
>  void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 33400ff051a8..8347fc158d2b 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -556,6 +556,11 @@ static int kexec_walk_memblock(struct kexec_buf *kbuf,
>  	if (kbuf->image->type == KEXEC_TYPE_CRASH)
>  		return func(&crashk_res, kbuf);
>  
> +	/*
> +	 * Using MEMBLOCK_NONE will properly skip MEMBLOCK_DRIVER_MANAGED. See
> +	 * IORESOURCE_SYSRAM_DRIVER_MANAGED handling in
> +	 * locate_mem_hole_callback().
> +	 */
>  	if (kbuf->top_down) {
>  		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, MEMBLOCK_NONE,
>  						&mstart, &mend, NULL) {
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 47a56b223141..540a35317fb0 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -979,6 +979,10 @@ static bool should_skip_region(struct memblock_type *type,
>  	if (!(flags & MEMBLOCK_NOMAP) && memblock_is_nomap(m))
>  		return true;
>  
> +	/* skip driver-managed memory unless we were asked for it explicitly */
> +	if (!(flags & MEMBLOCK_DRIVER_MANAGED) && memblock_is_driver_managed(m))
> +		return true;
> +
>  	return false;
>  }
>  
> -- 
> 2.31.1
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-09-29 16:39   ` Mike Rapoport
@ 2021-09-29 16:54     ` David Hildenbrand
  2021-09-30 21:21       ` Mike Rapoport
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2021-09-29 16:54 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

On 29.09.21 18:39, Mike Rapoport wrote:
> Hi,
> 
> On Mon, Sep 27, 2021 at 05:05:17PM +0200, David Hildenbrand wrote:
>> Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED.
>> Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
>> like ordinary MEMBLOCK_NONE memory -- for example, when selecting memory
>> regions to add to the vmcore for dumping in the crashkernel via
>> for_each_mem_range().
>   
> Can you please elaborate on the difference in semantics of MEMBLOCK_HOTPLUG
> and MEMBLOCK_DRIVER_MANAGED?
> Unless I'm missing something they both mark memory that can be unplugged
> anytime and so it should not be used in certain cases. Why is there a need
> for a new flag?

In the cover letter I have "Alternative B: Reuse MEMBLOCK_HOTPLUG. 
MEMBLOCK_HOTPLUG serves a different purpose, though.", but looking into 
the details it won't work as is.

MEMBLOCK_HOTPLUG is used to mark memory early during boot that can later 
get hotunplugged again and should be placed into ZONE_MOVABLE if the 
"movable_node" kernel parameter is set.

The confusing part is that we talk about "hotpluggable" but really mean 
"hotunpluggable": the reason is that HW flags DIMM slots that can later 
be hotplugged as "hotpluggable" even though there is already something 
hotplugged.

For example, ranges in the ACPI SRAT that are marked as 
ACPI_SRAT_MEM_HOT_PLUGGABLE will be marked MEMBLOCK_HOTPLUG early during 
boot (drivers/acpi/numa/srat.c:acpi_numa_memory_affinity_init()). Later, 
we use that information to size ZONE_MOVABLE 
(mm/page_alloc.c:find_zone_movable_pfns_for_nodes()). This will make 
sure that these "hotpluggable" DIMMs can later get hotunplugged.

Also, see should_skip_region() for how this relates to the "movable_node" 
kernel parameter:

	/* skip hotpluggable memory regions if needed */
	if (movable_node_is_enabled() && memblock_is_hotpluggable(m) &&
	    !(flags & MEMBLOCK_HOTPLUG))
		return true;

Long story short: MEMBLOCK_HOTPLUG has different semantics and is a 
special case for "movable_node".
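
To make the difference concrete, here is a small userspace model of the 
two skip checks. It is only a sketch: toy_should_skip() and its parameters 
are made-up names, the flag values are copied from the enum in patch 3/4, 
and the mirror and nomap checks of the real should_skip_region() are left 
out. Both checks follow the pattern patch 3/4 uses for 
MEMBLOCK_DRIVER_MANAGED: a region is skipped unless the walker asked for 
that flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as in enum memblock_flags from patch 3/4. */
enum memblock_flags {
	MEMBLOCK_NONE		= 0x0,
	MEMBLOCK_HOTPLUG	= 0x1,
	MEMBLOCK_MIRROR		= 0x2,
	MEMBLOCK_NOMAP		= 0x4,
	MEMBLOCK_DRIVER_MANAGED	= 0x8,
};

/*
 * Toy model of the skip decision for one region.
 * region_flags: flags stored on the region
 * walk_flags:   flags the iterator was called with
 * movable_node: whether "movable_node" was given on the command line
 */
static bool toy_should_skip(unsigned int region_flags, unsigned int walk_flags,
			    bool movable_node)
{
	/* hotpluggable regions only matter when movable_node is set */
	if (movable_node && (region_flags & MEMBLOCK_HOTPLUG) &&
	    !(walk_flags & MEMBLOCK_HOTPLUG))
		return true;

	/* driver-managed regions are skipped independent of movable_node */
	if ((region_flags & MEMBLOCK_DRIVER_MANAGED) &&
	    !(walk_flags & MEMBLOCK_DRIVER_MANAGED))
		return true;

	return false;
}
```

With movable_node off, a MEMBLOCK_NONE walk still sees MEMBLOCK_HOTPLUG 
regions, while MEMBLOCK_DRIVER_MANAGED regions are skipped regardless of 
the parameter.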

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-09-29 16:54     ` David Hildenbrand
@ 2021-09-30 21:21       ` Mike Rapoport
  2021-10-01  8:04         ` David Hildenbrand
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Rapoport @ 2021-09-30 21:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

On Wed, Sep 29, 2021 at 06:54:01PM +0200, David Hildenbrand wrote:
> On 29.09.21 18:39, Mike Rapoport wrote:
> > Hi,
> > 
> > On Mon, Sep 27, 2021 at 05:05:17PM +0200, David Hildenbrand wrote:
> > > Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED.
> > > Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
> > > like ordinary MEMBLOCK_NONE memory -- for example, when selecting memory
> > > regions to add to the vmcore for dumping in the crashkernel via
> > > for_each_mem_range().
> > Can you please elaborate on the difference in semantics of MEMBLOCK_HOTPLUG
> > and MEMBLOCK_DRIVER_MANAGED?
> > Unless I'm missing something they both mark memory that can be unplugged
> > anytime and so it should not be used in certain cases. Why is there a need
> > for a new flag?
> 
> In the cover letter I have "Alternative B: Reuse MEMBLOCK_HOTPLUG.
> MEMBLOCK_HOTPLUG serves a different purpose, though.", but looking into the
> details it won't work as is.
> 
> MEMBLOCK_HOTPLUG is used to mark memory early during boot that can later get
> hotunplugged again and should be placed into ZONE_MOVABLE if the
> "movable_node" kernel parameter is set.
> 
> The confusing part is that we talk about "hotpluggable" but really mean
> "hotunpluggable": the reason is that HW flags DIMM slots that can later be
> hotplugged as "hotpluggable" even though there is already something
> hotplugged.

The MEMBLOCK_HOTPLUG name is indeed somewhat confusing, but its core
meaning is still "this memory may be removed", which does not differ from
what IORESOURCE_SYSRAM_DRIVER_MANAGED means.

MEMBLOCK_HOTPLUG regions are indeed placed into ZONE_MOVABLE, but more
importantly, they are avoided when we allocate memory from memblock.

So, in my view, both flags mean that the memory may be removed and it
should not be used for certain types of allocations.
 
> For example, ranges in the ACPI SRAT that are marked as
> ACPI_SRAT_MEM_HOT_PLUGGABLE will be marked MEMBLOCK_HOTPLUG early during
> boot (drivers/acpi/numa/srat.c:acpi_numa_memory_affinity_init()). Later, we
> use that information to size ZONE_MOVABLE
> (mm/page_alloc.c:find_zone_movable_pfns_for_nodes()). This will make sure
> that these "hotpluggable" DIMMs can later get hotunplugged.
> 
> Also, see should_skip_region() how this relates to the "movable_node" kernel
> parameter:
> 
> 	/* skip hotpluggable memory regions if needed */
> 	if (movable_node_is_enabled() && memblock_is_hotpluggable(m) &&
> 	    !(flags & MEMBLOCK_HOTPLUG))
> 		return true;

Hmm, I think that the movable_node_is_enabled() check here is excessive,
but I suspect we cannot simply remove it without breaking anything.

I'll take a deeper look at the potential consequences.

BTW, is there anything that prevents placing kexec images on hot-unpluggable
memory that was cold-plugged on boot?

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-09-30 21:21       ` Mike Rapoport
@ 2021-10-01  8:04         ` David Hildenbrand
  2021-10-01 14:03           ` Mike Rapoport
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2021-10-01  8:04 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

On 30.09.21 23:21, Mike Rapoport wrote:
> On Wed, Sep 29, 2021 at 06:54:01PM +0200, David Hildenbrand wrote:
>> On 29.09.21 18:39, Mike Rapoport wrote:
>>> Hi,
>>>
>>> On Mon, Sep 27, 2021 at 05:05:17PM +0200, David Hildenbrand wrote:
>>>> Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED.
>>>> Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
>>>> like ordinary MEMBLOCK_NONE memory -- for example, when selecting memory
>>>> regions to add to the vmcore for dumping in the crashkernel via
>>>> for_each_mem_range().
>>> Can you please elaborate on the difference in semantics of MEMBLOCK_HOTPLUG
>>> and MEMBLOCK_DRIVER_MANAGED?
>>> Unless I'm missing something they both mark memory that can be unplugged
>>> anytime and so it should not be used in certain cases. Why is there a need
>>> for a new flag?
>>
>> In the cover letter I have "Alternative B: Reuse MEMBLOCK_HOTPLUG.
>> MEMBLOCK_HOTPLUG serves a different purpose, though.", but looking into the
>> details it won't work as is.
>>
>> MEMBLOCK_HOTPLUG is used to mark memory early during boot that can later get
>> hotunplugged again and should be placed into ZONE_MOVABLE if the
>> "movable_node" kernel parameter is set.
>>
>> The confusing part is that we talk about "hotpluggable" but really mean
>> "hotunpluggable": the reason is that HW flags DIMM slots that can later be
>> hotplugged as "hotpluggable" even though there is already something
>> hotplugged.
> 
> MEMBLOCK_HOTPLUG name is indeed somewhat confusing, but still it's core
> meaning "this memory may be removed" which does not differ from what
> IORESOURCE_SYSRAM_DRIVER_MANAGED means.
> 
> MEMBLOCK_HOTPLUG regions are indeed placed into ZONE_MOVABLE, but more
> importantly, they are avoided when we allocate memory from memblock.
> 
> So, in my view, both flags mean that the memory may be removed and it
> should not be used for certain types of allocations.

The semantics are different:

MEMBLOCK_HOTPLUG: memory is indicated as "System RAM" in the 
firmware-provided memory map and added to the system early during boot; 
we want this memory to be managed by ZONE_MOVABLE with "movable_node" 
set on the kernel command line, because only then we want it to be 
hotunpluggable again. kexec *has to* indicate this memory to the second 
kernel and can place kexec-images on this memory. After memory 
hotunplug, kexec has to be re-armed.

MEMBLOCK_DRIVER_MANAGED: memory is not indicated as "System RAM" in the 
firmware-provided memory map; this memory is always detected and added 
to the system by a driver; memory might not actually be physically 
hotunpluggable and the ZONE selection does not depend on "movable_node". 
kexec *must not* indicate this memory to the second kernel and *must 
not* place kexec-images on this memory.
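
As a compact way to state the two rules, here is a hypothetical helper 
(struct toy_kexec_policy and toy_kexec_policy_for() do not exist in the 
kernel, and the sketch deliberately ignores the "movable_node" case):

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as in enum memblock_flags from patch 3/4. */
enum memblock_flags {
	MEMBLOCK_NONE		= 0x0,
	MEMBLOCK_HOTPLUG	= 0x1,
	MEMBLOCK_MIRROR		= 0x2,
	MEMBLOCK_NOMAP		= 0x4,
	MEMBLOCK_DRIVER_MANAGED	= 0x8,
};

/* What kexec is expected to do with a region (illustrative only). */
struct toy_kexec_policy {
	bool expose_to_second_kernel;	/* indicate it in the new memory map */
	bool may_hold_image;		/* allowed as a target for kexec images */
};

static struct toy_kexec_policy toy_kexec_policy_for(unsigned int region_flags)
{
	/* ordinary and MEMBLOCK_HOTPLUG memory: exposed, usable for images */
	struct toy_kexec_policy p = {
		.expose_to_second_kernel = true,
		.may_hold_image = true,
	};

	/* driver-managed memory: neither exposed nor used for images */
	if (region_flags & MEMBLOCK_DRIVER_MANAGED) {
		p.expose_to_second_kernel = false;
		p.may_hold_image = false;
	}
	return p;
}
```

MEMBLOCK_HOTPLUG alone changes nothing in this table, which is why 
reusing it for driver-managed memory would lose information.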


I would really advise against mixing concepts here.


What we could do is indicate *all* hotplugged memory (not just 
IORESOURCE_SYSRAM_DRIVER_MANAGED memory) as MEMBLOCK_HOTPLUG and make 
MEMBLOCK_HOTPLUG less dependent on "movable_node".

MEMBLOCK_HOTPLUG for early boot memory: with "movable_node", place it in 
ZONE_MOVABLE. Even without "movable_node", don't place early kernel 
allocations on this memory.
MEMBLOCK_HOTPLUG for all memory: don't place kexec-images on this 
memory, independent of "movable_node".


memblock would then not contain the information "contained in 
firmware-provided memory map" vs. "not contained in firmware-provided 
memory map"; but I think right now it's not strictly required to have 
that information if we'd go down that path.

>   
>> For example, ranges in the ACPI SRAT that are marked as
>> ACPI_SRAT_MEM_HOT_PLUGGABLE will be marked MEMBLOCK_HOTPLUG early during
>> boot (drivers/acpi/numa/srat.c:acpi_numa_memory_affinity_init()). Later, we
>> use that information to size ZONE_MOVABLE
>> (mm/page_alloc.c:find_zone_movable_pfns_for_nodes()). This will make sure
>> that these "hotpluggable" DIMMs can later get hotunplugged.
>>
>> Also, see should_skip_region() how this relates to the "movable_node" kernel
>> parameter:
>>
>> 	/* skip hotpluggable memory regions if needed */
>> 	if (movable_node_is_enabled() && memblock_is_hotpluggable(m) &&
>> 	    !(flags & MEMBLOCK_HOTPLUG))
>> 		return true;
> 
> Hmm, I think that the movable_node_is_enabled() check here is excessive,
> but I suspect we cannot simply remove it without breaking anything.

The reasoning is: without "movable_node" we don't want this memory to be 
hotunpluggable; consequently, we don't care if we place kexec-images on 
this memory. MEMBLOCK_HOTPLUG is currently only active with "movable_node".

If we remove that check, we will never place early kernel 
allocations on that memory, even if we don't care about ZONE_MOVABLE.

> 
> I'll take a deeper look on the potential consequences.
> 
> BTW, is there anything that prevents putting kexec to hot-unplugable memory
> that was cold-plugged on boot?

I think it depends on how the platform handles hotunpluggable DIMMs or 
hotunpluggable NUMA nodes. If the platform ends up indicating such memory 
via MEMBLOCK_HOTPLUG, and "movable_node" is set, memory would be put 
into ZONE_MOVABLE and kexec would not place kexec-images on that memory.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-10-01  8:04         ` David Hildenbrand
@ 2021-10-01 14:03           ` Mike Rapoport
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Rapoport @ 2021-10-01 14:03 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Andrew Morton, Michal Hocko, Oscar Salvador,
	Jianyong Wu, Aneesh Kumar K . V, Vineet Gupta,
	Geert Uytterhoeven, Huacai Chen, Jiaxun Yang,
	Thomas Bogendoerfer, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Eric Biederman, Arnd Bergmann,
	linux-snps-arc, linux-ia64, linux-m68k, linux-mips, linux-s390,
	linux-mm, kexec

On Fri, Oct 01, 2021 at 10:04:24AM +0200, David Hildenbrand wrote:
> On 30.09.21 23:21, Mike Rapoport wrote:
> > On Wed, Sep 29, 2021 at 06:54:01PM +0200, David Hildenbrand wrote:
> > > On 29.09.21 18:39, Mike Rapoport wrote:
> > > > Hi,
> > > > 
> > > > On Mon, Sep 27, 2021 at 05:05:17PM +0200, David Hildenbrand wrote:
> > > > > Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED.
> > > > > Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
> > > > > like ordinary MEMBLOCK_NONE memory -- for example, when selecting memory
> > > > > regions to add to the vmcore for dumping in the crashkernel via
> > > > > for_each_mem_range().
> > > > Can you please elaborate on the difference in semantics of MEMBLOCK_HOTPLUG
> > > > and MEMBLOCK_DRIVER_MANAGED?
> > > > Unless I'm missing something they both mark memory that can be unplugged
> > > > anytime and so it should not be used in certain cases. Why is there a need
> > > > for a new flag?
> > > 
> > > In the cover letter I have "Alternative B: Reuse MEMBLOCK_HOTPLUG.
> > > MEMBLOCK_HOTPLUG serves a different purpose, though.", but looking into the
> > > details it won't work as is.
> > > 
> > > MEMBLOCK_HOTPLUG is used to mark memory early during boot that can later get
> > > hotunplugged again and should be placed into ZONE_MOVABLE if the
> > > "movable_node" kernel parameter is set.
> > > 
> > > The confusing part is that we talk about "hotpluggable" but really mean
> > > "hotunpluggable": the reason is that HW flags DIMM slots that can later be
> > > hotplugged as "hotpluggable" even though there is already something
> > > hotplugged.
> > 
> > MEMBLOCK_HOTPLUG name is indeed somewhat confusing, but still it's core
> > meaning "this memory may be removed" which does not differ from what
> > IORESOURCE_SYSRAM_DRIVER_MANAGED means.
> > 
> > MEMBLOCK_HOTPLUG regions are indeed placed into ZONE_MOVABLE, but more
> > importantly, they are avoided when we allocate memory from memblock.
> > 
> > So, in my view, both flags mean that the memory may be removed and it
> > should not be used for certain types of allocations.
> 
> The semantics are different:
> 
> MEMBLOCK_HOTPLUG: memory is indicated as "System RAM" in the
> firmware-provided memory map and added to the system early during boot; we
> want this memory to be managed by ZONE_MOVABLE with "movable_node" set on
> the kernel command line, because only then we want it to be hotunpluggable
> again. kexec *has to* indicate this memory to the second kernel and can
> place kexec-images on this memory. After memory hotunplug, kexec has to be
> re-armed.
> 
> MEMBLOCK_DRIVER_MANAGED: memory is not indicated as "System RAM" in the
> firmware-provided memory map; this memory is always detected and added to
> the system by a driver; memory might not actually be physically
> hotunpluggable and the ZONE selection does not depend on "movable_node".
> kexec *must not* indicate this memory to the second kernel and *must not*
> place kexec-images on this memory.

Ok, this clarifies.
This explanation should be part of the changelog. The sentences about the
zone selection could probably be skipped, because they are less important
for this case. E.g., something like:

MEMBLOCK_HOTPLUG: memory is indicated as "System RAM" in the
firmware-provided memory map and added to the system early during boot;
kexec *has to* indicate this memory to the second kernel and can place
kexec-images on this memory. After memory hotunplug, kexec has to be
re-armed.

MEMBLOCK_DRIVER_MANAGED: memory is not indicated as "System RAM" in the
firmware-provided memory map; this memory is always detected and added to
the system by a driver; memory might not actually be physically
hotunpluggable.  kexec *must not* indicate this memory to the second kernel
and *must not* place kexec-images on this memory.
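
To make the kexec side of the two definitions above concrete, here is a
minimal userspace sketch of the policy they describe. It is only a model
under stated assumptions: the MODEL_* flag values, struct model_region and
the helper names are hypothetical and are not taken from memblock or kexec
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; not the kernel's definitions. */
enum model_flags {
	MODEL_NONE           = 0x0,
	MODEL_HOTPLUG        = 0x1, /* "System RAM" in the firmware memory map */
	MODEL_DRIVER_MANAGED = 0x2, /* detected and added by a driver */
};

/* Hypothetical stand-in for a memory region with flags. */
struct model_region {
	unsigned long long base;
	unsigned long long size;
	unsigned int flags;
};

/*
 * kexec has to indicate HOTPLUG memory to the second kernel, but must not
 * indicate DRIVER_MANAGED memory.
 */
static bool kexec_may_indicate(const struct model_region *r)
{
	return !(r->flags & MODEL_DRIVER_MANAGED);
}

/*
 * kexec can place images on HOTPLUG memory, but must not place them on
 * DRIVER_MANAGED memory.
 */
static bool kexec_may_place_image(const struct model_region *r)
{
	return !(r->flags & MODEL_DRIVER_MANAGED);
}

/* Count the regions that remain candidates for kexec image placement. */
static size_t count_kexec_candidates(const struct model_region *regions,
				     size_t n)
{
	size_t i, count = 0;

	assert(regions != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (kexec_may_place_image(&regions[i]))
			count++;
	return count;
}
```

In this model the two flags differ only in how kexec treats them: a region
marked MODEL_HOTPLUG stays a candidate like ordinary memory, while a region
marked MODEL_DRIVER_MANAGED is skipped both for indication and for image
placement.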

-- 
Sincerely yours,
Mike.


Thread overview: 15+ messages
2021-09-27 15:05 [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
2021-09-27 15:05 ` [PATCH v1 1/4] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() David Hildenbrand
2021-09-27 15:05 ` [PATCH v1 2/4] memblock: allow to specify flags with memblock_add_node() David Hildenbrand
2021-09-27 15:19   ` Geert Uytterhoeven
2021-09-28  9:38   ` Heiko Carstens
2021-09-29 16:25   ` Mike Rapoport
2021-09-29 16:30     ` David Hildenbrand
2021-09-27 15:05 ` [PATCH v1 3/4] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
2021-09-29 16:39   ` Mike Rapoport
2021-09-29 16:54     ` David Hildenbrand
2021-09-30 21:21       ` Mike Rapoport
2021-10-01  8:04         ` David Hildenbrand
2021-10-01 14:03           ` Mike Rapoport
2021-09-27 15:05 ` [PATCH v1 4/4] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED David Hildenbrand
2021-09-27 15:07 ` [PATCH v1 0/4] mm/memory_hotplug: full support for David Hildenbrand
