Merge branch 'memory-hotplug'
Sumanth Korikkar says:

====================
Provide a new interface for dynamic configuration and deconfiguration of
hotplug memory on s390, with or without memmap_on_memory support. It is a
follow-up to the discussion with David when memmap_on_memory support was
introduced for s390, and adds dynamic (de)configuration of memory:

https://lore.kernel.org/all/ee492da8-74b4-4a97-8b24-73e07257f01d@redhat.com/
https://lore.kernel.org/all/20241202082732.3959803-1-sumanthk@linux.ibm.com/

The original motivation for introducing memmap_on_memory on s390 was to
avoid using online memory to store the struct pages metadata, particularly
for standby memory blocks. This became critical in cases where there was an
imbalance between standby and online memory, potentially leading to boot
failures due to insufficient memory for metadata allocation.

To address this, memmap_on_memory was utilized on s390. However, in its
current form, it adds the struct pages metadata at the start of each
(standby-only) memory block at the time the block is added, and this
configuration is static: it cannot be changed at runtime, even when the
user needs contiguous physical memory.

In order to give the user more flexibility and overcome the above
limitation, add an option to dynamically configure and deconfigure a
hotpluggable memory block with or without memmap_on_memory.

With the new interface, s390 no longer adds all possible hotplug memory in
advance, like before, to make it visible in sysfs for online/offline
actions. Instead, before a memory block can be set online, it has to be
configured via the new interface in /sys/firmware/memory/memoryX/config,
which makes s390 similar to other architectures: adding hotpluggable
memory is controlled by the user instead of happening at boot time.

s390 kernel sysfs interface to configure/deconfigure memory with
memmap_on_memory (with upcoming lsmem changes):

* Initial memory layout:

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x7fffffff    2G online  0-15  yes        no
  0x80000000-0xffffffff    2G offline 16-31 no         yes

* Configure memory:

  echo 1 > /sys/firmware/memory/memory16/config

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x7fffffff    2G online  0-15  yes        no
  0x80000000-0x87ffffff  128M offline 16    yes        yes
  0x88000000-0xffffffff  1.9G offline 17-31 no         yes

* Deconfigure memory:

  echo 0 > /sys/firmware/memory/memory16/config

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x7fffffff    2G online  0-15  yes        no
  0x80000000-0xffffffff    2G offline 16-31 no         yes

* Enable memmap_on_memory and online it (deconfigure first):

  echo 0 > /sys/devices/system/memory/memory5/online
  echo 0 > /sys/firmware/memory/memory5/config

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x27ffffff  640M online  0-4   yes        no
  0x28000000-0x2fffffff  128M offline 5     no         no
  0x30000000-0x7fffffff  1.3G online  6-15  yes        no
  0x80000000-0xffffffff    2G offline 16-31 no         yes

  (Enable memmap_on_memory and online it)

  echo 1 > /sys/firmware/memory/memory5/memmap_on_memory
  echo 1 > /sys/firmware/memory/memory5/config
  echo 1 > /sys/devices/system/memory/memory5/online

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x27ffffff  640M online  0-4   yes        no
  0x28000000-0x2fffffff  128M online  5     yes        yes
  0x30000000-0x7fffffff  1.3G online  6-15  yes        no
  0x80000000-0xffffffff    2G offline 16-31 no         yes

* Disable memmap_on_memory and online it (deconfigure first):

  echo 0 > /sys/devices/system/memory/memory5/online
  echo 0 > /sys/firmware/memory/memory5/config

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x27ffffff  640M online  0-4   yes        no
  0x28000000-0x2fffffff  128M offline 5     no         yes
  0x30000000-0x7fffffff  1.3G online  6-15  yes        no
  0x80000000-0xffffffff    2G offline 16-31 no         yes

  (Disable memmap_on_memory and online it)

  echo 0 > /sys/firmware/memory/memory5/memmap_on_memory
  echo 1 > /sys/firmware/memory/memory5/config
  echo 1 > /sys/devices/system/memory/memory5/online

  lsmem -o RANGE,SIZE,STATE,BLOCK,CONFIGURED,MEMMAP_ON_MEMORY
  RANGE                  SIZE STATE   BLOCK CONFIGURED MEMMAP_ON_MEMORY
  0x00000000-0x7fffffff    2G online  0-15  yes        no
  0x80000000-0xffffffff    2G offline 16-31 no         yes

* Userspace changes: the lsmem/chmem tools are also changed to use the new
  interface. I will send the changes to util-linux soon.

Patch 1 adds support for removal of boot-allocated memory blocks.

Patch 2 provides the option to dynamically configure and deconfigure
memory with or without memmap_on_memory.

Patch 3 removes MHP_OFFLINE_INACCESSIBLE from s390. The mhp flag was used
to mark memory as inaccessible until the memory hotplug online phase
begins. With patch 2 it is no longer essential: memory can be brought into
an accessible state before it is added, since memory is now added at
runtime instead of at boot time.

Patch 4 removes the MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers. They
are no longer needed for the same reason: with runtime (de)configuration,
memory can be made accessible before it is added.

Note: The patches apply to the linux-next branch.

v3: Thanks David
 * Avoid goto label in create_standby_sclp_mems().
 * Use unsigned long instead of u64.
 * Add Acked-by.

v2: Thanks David
 * Rename struct mblock/mblock_arg to struct sclp_mem/sclp_mem_arg.
 * Rename all mblocks/mblock references (structures, functions) to
   sclp_mems/sclp_mem.
 * Rename create_online_mblock() to create_configured_sclp_mem().
 * Rename config_mblock_show()/config_mblock_store() to
   config_sclp_mem_show()/config_sclp_mem_store().
 * Remove contains_standby_increment() and sclp_mem_notifier. The sclp
   memory state change is performed when adding/removing memory; the sclp
   memory notifier is no longer needed with this patchset.
 * Recover the sclp memory state when add_memory() fails.
 * Refactor and add function init_sclp_mem().
 * Use unsigned long instead of unsigned long long.
 * Simplify and correct kobject handling. Thanks Heiko.
====================

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
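The same flow can be scripted. What follows is a minimal illustrative
sketch, not part of this series: the script and its variable names are
invented here, while the sysfs paths and the ordering constraints (toggle
memmap_on_memory only while deconfigured, offline before deconfigure)
follow the examples and patch descriptions above.

  #!/bin/sh
  # Walk s390 memory block $1 through a full configure -> online ->
  # offline -> deconfigure cycle. Assumes the block starts deconfigured.
  blk="memory${1:?usage: $0 <block-id>}"
  fw="/sys/firmware/memory/$blk"         # s390 firmware interface (this series)
  mm="/sys/devices/system/memory/$blk"   # generic memory sysfs

  echo 1 > "$fw/memmap_on_memory"  # allowed only while deconfigured (-EBUSY otherwise)
  echo 1 > "$fw/config"            # assign storage and add the memory block
  echo 1 > "$mm/online"            # online it; the memmap sits inside the block

  echo 0 > "$mm/online"            # block must be offline before deconfiguring
  echo 0 > "$fw/config"            # remove the block and unassign the storage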
commit ec9b3b85ea
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
@@ -164,6 +164,8 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
 {
 	struct ptdesc *ptdesc = virt_to_ptdesc(table);
 
+	if (pagetable_is_reserved(ptdesc))
+		return free_reserved_ptdesc(ptdesc);
 	pagetable_dtor_free(ptdesc);
 }
 
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/memory_hotplug.h>
+#include <linux/bootmem_info.h>
 #include <linux/cpufeature.h>
 #include <linux/memblock.h>
 #include <linux/pfn.h>
@@ -39,15 +40,21 @@ static void __ref *vmem_alloc_pages(unsigned int order)
 
 static void vmem_free_pages(unsigned long addr, int order, struct vmem_altmap *altmap)
 {
+	unsigned int nr_pages = 1 << order;
+	struct page *page;
+
 	if (altmap) {
 		vmem_altmap_free(altmap, 1 << order);
 		return;
 	}
-	/* We don't expect boot memory to be removed ever. */
-	if (!slab_is_available() ||
-	    WARN_ON_ONCE(PageReserved(virt_to_page((void *)addr))))
-		return;
-	free_pages(addr, order);
+	page = virt_to_page((void *)addr);
+	if (PageReserved(page)) {
+		/* allocated from memblock */
+		while (nr_pages--)
+			free_bootmem_page(page++);
+	} else {
+		free_pages(addr, order);
+	}
 }
 
 void *vmem_crst_alloc(unsigned long val)
@@ -79,10 +86,6 @@ pte_t __ref *vmem_pte_alloc(void)
 
 static void vmem_pte_free(unsigned long *table)
 {
-	/* We don't expect boot memory to be removed ever. */
-	if (!slab_is_available() ||
-	    WARN_ON_ONCE(PageReserved(virt_to_page(table))))
-		return;
 	page_table_free(&init_mm, table);
 }
 
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
@@ -226,7 +226,6 @@ static int memory_block_online(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
-	struct memory_notify arg;
 	struct zone *zone;
 	int ret;
 
@@ -246,19 +245,9 @@ static int memory_block_online(struct memory_block *mem)
 	if (mem->altmap)
 		nr_vmemmap_pages = mem->altmap->free;
 
-	arg.altmap_start_pfn = start_pfn;
-	arg.altmap_nr_pages = nr_vmemmap_pages;
-	arg.start_pfn = start_pfn + nr_vmemmap_pages;
-	arg.nr_pages = nr_pages - nr_vmemmap_pages;
 	mem_hotplug_begin();
-	ret = memory_notify(MEM_PREPARE_ONLINE, &arg);
-	ret = notifier_to_errno(ret);
-	if (ret)
-		goto out_notifier;
-
 	if (nr_vmemmap_pages) {
-		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages,
-						zone, mem->altmap->inaccessible);
+		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
 		if (ret)
 			goto out;
 	}
@@ -280,11 +269,7 @@ static int memory_block_online(struct memory_block *mem)
 					  nr_vmemmap_pages);
 
 	mem->zone = zone;
-	mem_hotplug_done();
-	return ret;
 out:
-	memory_notify(MEM_FINISH_OFFLINE, &arg);
-out_notifier:
 	mem_hotplug_done();
 	return ret;
 }
@@ -297,7 +282,6 @@ static int memory_block_offline(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
-	struct memory_notify arg;
 	int ret;
 
 	if (!mem->zone)
@@ -329,11 +313,6 @@ static int memory_block_offline(struct memory_block *mem)
 		mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
 
 	mem->zone = NULL;
-	arg.altmap_start_pfn = start_pfn;
-	arg.altmap_nr_pages = nr_vmemmap_pages;
-	arg.start_pfn = start_pfn + nr_vmemmap_pages;
-	arg.nr_pages = nr_pages - nr_vmemmap_pages;
-	memory_notify(MEM_FINISH_OFFLINE, &arg);
 out:
 	mem_hotplug_done();
 	return ret;
diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
@@ -9,9 +9,12 @@
 #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
 
 #include <linux/cpufeature.h>
+#include <linux/container_of.h>
 #include <linux/err.h>
 #include <linux/errno.h>
 #include <linux/init.h>
+#include <linux/kobject.h>
+#include <linux/kstrtox.h>
 #include <linux/memory.h>
 #include <linux/memory_hotplug.h>
 #include <linux/mm.h>
@@ -27,7 +30,6 @@
 #define SCLP_CMDW_ASSIGN_STORAGE	0x000d0001
 #define SCLP_CMDW_UNASSIGN_STORAGE	0x000c0001
 
-static DEFINE_MUTEX(sclp_mem_mutex);
 static LIST_HEAD(sclp_mem_list);
 static u8 sclp_max_storage_id;
 static DECLARE_BITMAP(sclp_storage_ids, 256);
@@ -38,6 +40,18 @@ struct memory_increment {
 	int standby;
 };
 
+struct sclp_mem {
+	struct kobject kobj;
+	unsigned int id;
+	unsigned int memmap_on_memory;
+	unsigned int config;
+};
+
+struct sclp_mem_arg {
+	struct sclp_mem *sclp_mems;
+	struct kset *kset;
+};
+
 struct assign_storage_sccb {
 	struct sccb_header header;
 	u16 rn;
@@ -163,92 +177,166 @@ static int sclp_mem_change_state(unsigned long start, unsigned long size,
 	return rc ? -EIO : 0;
 }
 
-static bool contains_standby_increment(unsigned long start, unsigned long end)
+static ssize_t sclp_config_mem_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 {
-	struct memory_increment *incr;
-	unsigned long istart;
+	struct sclp_mem *sclp_mem = container_of(kobj, struct sclp_mem, kobj);
 
-	list_for_each_entry(incr, &sclp_mem_list, list) {
-		istart = rn2addr(incr->rn);
-		if (end - 1 < istart)
-			continue;
-		if (start > istart + sclp.rzm - 1)
-			continue;
-		if (incr->standby)
-			return true;
-	}
-	return false;
+	return sysfs_emit(buf, "%u\n", READ_ONCE(sclp_mem->config));
 }
 
-static int sclp_mem_notifier(struct notifier_block *nb,
-			     unsigned long action, void *data)
+static ssize_t sclp_config_mem_store(struct kobject *kobj, struct kobj_attribute *attr,
+				     const char *buf, size_t count)
 {
-	unsigned long start, size;
-	struct memory_notify *arg;
+	unsigned long addr, block_size;
+	struct sclp_mem *sclp_mem;
+	struct memory_block *mem;
 	unsigned char id;
-	int rc = 0;
+	bool value;
+	int rc;
 
-	arg = data;
-	start = arg->start_pfn << PAGE_SHIFT;
-	size = arg->nr_pages << PAGE_SHIFT;
-	mutex_lock(&sclp_mem_mutex);
+	rc = kstrtobool(buf, &value);
+	if (rc)
+		return rc;
+	sclp_mem = container_of(kobj, struct sclp_mem, kobj);
+	block_size = memory_block_size_bytes();
+	addr = sclp_mem->id * block_size;
+	/*
+	 * Hold device_hotplug_lock when adding/removing memory blocks.
+	 * Additionally, also protect calls to find_memory_block() and
+	 * sclp_attach_storage().
+	 */
+	rc = lock_device_hotplug_sysfs();
+	if (rc)
+		goto out;
 	for_each_clear_bit(id, sclp_storage_ids, sclp_max_storage_id + 1)
 		sclp_attach_storage(id);
-	switch (action) {
-	case MEM_GOING_OFFLINE:
+	if (value) {
+		if (sclp_mem->config)
+			goto out_unlock;
+		rc = sclp_mem_change_state(addr, block_size, 1);
+		if (rc)
+			goto out_unlock;
 		/*
-		 * Do not allow to set memory blocks offline that contain
-		 * standby memory. This is done to simplify the "memory online"
-		 * case.
+		 * Set entire memory block CMMA state to nodat. Later, when
+		 * page tables pages are allocated via __add_memory(), those
+		 * regions are marked __arch_set_page_dat().
 		 */
-		if (contains_standby_increment(start, start + size))
-			rc = -EPERM;
-		break;
-	case MEM_PREPARE_ONLINE:
-		/*
-		 * Access the altmap_start_pfn and altmap_nr_pages fields
-		 * within the struct memory_notify specifically when dealing
-		 * with only MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers.
-		 *
-		 * When altmap is in use, take the specified memory range
-		 * online, which includes the altmap.
-		 */
-		if (arg->altmap_nr_pages) {
-			start = PFN_PHYS(arg->altmap_start_pfn);
-			size += PFN_PHYS(arg->altmap_nr_pages);
+		__arch_set_page_nodat((void *)__va(addr), block_size >> PAGE_SHIFT);
+		rc = __add_memory(0, addr, block_size,
+				  sclp_mem->memmap_on_memory ?
+				  MHP_MEMMAP_ON_MEMORY : MHP_NONE);
+		if (rc) {
+			sclp_mem_change_state(addr, block_size, 0);
+			goto out_unlock;
 		}
-		rc = sclp_mem_change_state(start, size, 1);
-		if (rc || !arg->altmap_nr_pages)
-			break;
-		/*
-		 * Set CMMA state to nodat here, since the struct page memory
-		 * at the beginning of the memory block will not go through the
-		 * buddy allocator later.
-		 */
-		__arch_set_page_nodat((void *)__va(start), arg->altmap_nr_pages);
-		break;
-	case MEM_FINISH_OFFLINE:
-		/*
-		 * When altmap is in use, take the specified memory range
-		 * offline, which includes the altmap.
-		 */
-		if (arg->altmap_nr_pages) {
-			start = PFN_PHYS(arg->altmap_start_pfn);
-			size += PFN_PHYS(arg->altmap_nr_pages);
+		mem = find_memory_block(pfn_to_section_nr(PFN_DOWN(addr)));
+		put_device(&mem->dev);
+		WRITE_ONCE(sclp_mem->config, 1);
+	} else {
+		if (!sclp_mem->config)
+			goto out_unlock;
+		mem = find_memory_block(pfn_to_section_nr(PFN_DOWN(addr)));
+		if (mem->state != MEM_OFFLINE) {
+			put_device(&mem->dev);
+			rc = -EBUSY;
+			goto out_unlock;
 		}
-		sclp_mem_change_state(start, size, 0);
-		break;
-	default:
-		break;
+		/* drop the ref just got via find_memory_block() */
+		put_device(&mem->dev);
+		sclp_mem_change_state(addr, block_size, 0);
+		__remove_memory(addr, block_size);
+		WRITE_ONCE(sclp_mem->config, 0);
 	}
-	mutex_unlock(&sclp_mem_mutex);
-	return rc ? NOTIFY_BAD : NOTIFY_OK;
+out_unlock:
+	unlock_device_hotplug();
+out:
+	return rc ? rc : count;
 }
 
-static struct notifier_block sclp_mem_nb = {
-	.notifier_call = sclp_mem_notifier,
+static struct kobj_attribute sclp_config_mem_attr =
+	__ATTR(config, 0644, sclp_config_mem_show, sclp_config_mem_store);
+
+static ssize_t sclp_memmap_on_memory_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	struct sclp_mem *sclp_mem = container_of(kobj, struct sclp_mem, kobj);
+
+	return sysfs_emit(buf, "%u\n", READ_ONCE(sclp_mem->memmap_on_memory));
+}
+
+static ssize_t sclp_memmap_on_memory_store(struct kobject *kobj, struct kobj_attribute *attr,
+					   const char *buf, size_t count)
+{
+	struct sclp_mem *sclp_mem;
+	unsigned long block_size;
+	struct memory_block *mem;
+	bool value;
+	int rc;
+
+	rc = kstrtobool(buf, &value);
+	if (rc)
+		return rc;
+	rc = lock_device_hotplug_sysfs();
+	if (rc)
+		return rc;
+	block_size = memory_block_size_bytes();
+	sclp_mem = container_of(kobj, struct sclp_mem, kobj);
+	mem = find_memory_block(pfn_to_section_nr(PFN_DOWN(sclp_mem->id * block_size)));
+	if (!mem) {
+		WRITE_ONCE(sclp_mem->memmap_on_memory, value);
+	} else {
+		put_device(&mem->dev);
+		rc = -EBUSY;
+	}
+	unlock_device_hotplug();
+	return rc ? rc : count;
+}
+
+static const struct kobj_type ktype = {
+	.sysfs_ops = &kobj_sysfs_ops,
 };
+
+static struct kobj_attribute sclp_memmap_attr =
+	__ATTR(memmap_on_memory, 0644, sclp_memmap_on_memory_show, sclp_memmap_on_memory_store);
+
+static struct attribute *sclp_mem_attrs[] = {
+	&sclp_config_mem_attr.attr,
+	&sclp_memmap_attr.attr,
+	NULL,
+};
+
+static struct attribute_group sclp_mem_attr_group = {
+	.attrs = sclp_mem_attrs,
+};
+
+static int sclp_create_mem(struct sclp_mem *sclp_mem, struct kset *kset,
+			   unsigned int id, bool config, bool memmap_on_memory)
+{
+	int rc;
+
+	sclp_mem->memmap_on_memory = memmap_on_memory;
+	sclp_mem->config = config;
+	sclp_mem->id = id;
+	kobject_init(&sclp_mem->kobj, &ktype);
+	rc = kobject_add(&sclp_mem->kobj, &kset->kobj, "memory%d", id);
+	if (rc)
+		return rc;
+	return sysfs_create_group(&sclp_mem->kobj, &sclp_mem_attr_group);
+}
+
+static int sclp_create_configured_mem(struct memory_block *mem, void *argument)
+{
+	struct sclp_mem *sclp_mems;
+	struct sclp_mem_arg *arg;
+	struct kset *kset;
+	unsigned int id;
+
+	id = mem->dev.id;
+	arg = (struct sclp_mem_arg *)argument;
+	sclp_mems = arg->sclp_mems;
+	kset = arg->kset;
+	return sclp_create_mem(&sclp_mems[id], kset, id, true, false);
+}
 
 static void __init align_to_block_size(unsigned long *start,
 				       unsigned long *size,
 				       unsigned long alignment)
@@ -264,14 +352,17 @@ static void __init align_to_block_size(unsigned long *start,
 	*size = size_align;
 }
 
-static void __init add_memory_merged(u16 rn)
+static int __init sclp_create_standby_mems_merged(struct sclp_mem *sclp_mems,
+						  struct kset *kset, u16 rn)
 {
 	unsigned long start, size, addr, block_size;
 	static u16 first_rn, num;
+	unsigned int id;
+	int rc = 0;
 
 	if (rn && first_rn && (first_rn + num == rn)) {
 		num++;
-		return;
+		return rc;
 	}
 	if (!first_rn)
 		goto skip_add;
@@ -286,24 +377,57 @@
 	if (!size)
 		goto skip_add;
 	for (addr = start; addr < start + size; addr += block_size) {
-		add_memory(0, addr, block_size,
-			   cpu_has_edat1() ?
-			   MHP_MEMMAP_ON_MEMORY | MHP_OFFLINE_INACCESSIBLE : MHP_NONE);
+		id = addr / block_size;
+		rc = sclp_create_mem(&sclp_mems[id], kset, id, false,
+				     mhp_supports_memmap_on_memory());
+		if (rc)
+			break;
 	}
 skip_add:
 	first_rn = rn;
 	num = 1;
+	return rc;
 }
 
-static void __init sclp_add_standby_memory(void)
+static int __init sclp_create_standby_mems(struct sclp_mem *sclp_mems, struct kset *kset)
 {
 	struct memory_increment *incr;
+	int rc = 0;
 
 	list_for_each_entry(incr, &sclp_mem_list, list) {
 		if (incr->standby)
-			add_memory_merged(incr->rn);
+			rc = sclp_create_standby_mems_merged(sclp_mems, kset, incr->rn);
+		if (rc)
+			return rc;
 	}
-	add_memory_merged(0);
+	return sclp_create_standby_mems_merged(sclp_mems, kset, 0);
+}
+
+static int __init sclp_init_mem(void)
+{
+	const unsigned long block_size = memory_block_size_bytes();
+	unsigned int max_sclp_mems;
+	struct sclp_mem *sclp_mems;
+	struct sclp_mem_arg arg;
+	struct kset *kset;
+	int rc;
+
+	max_sclp_mems = roundup(sclp.rnmax * sclp.rzm, block_size) / block_size;
+	/* Allocate memory for all blocks ahead of time. */
+	sclp_mems = kcalloc(max_sclp_mems, sizeof(struct sclp_mem), GFP_KERNEL);
+	if (!sclp_mems)
+		return -ENOMEM;
+	kset = kset_create_and_add("memory", NULL, firmware_kobj);
+	if (!kset)
+		return -ENOMEM;
+	/* Initial memory is in the "configured" state already. */
+	arg.sclp_mems = sclp_mems;
+	arg.kset = kset;
+	rc = for_each_memory_block(&arg, sclp_create_configured_mem);
+	if (rc)
+		return rc;
+	/* Standby memory is "deconfigured". */
+	return sclp_create_standby_mems(sclp_mems, kset);
 }
 
 static void __init insert_increment(u16 rn, int standby, int assigned)
@@ -336,7 +460,7 @@ static void __init insert_increment(u16 rn, int standby, int assigned)
 	list_add(&new_incr->list, prev);
 }
 
-static int __init sclp_detect_standby_memory(void)
+static int __init sclp_setup_memory(void)
 {
 	struct read_storage_sccb *sccb;
 	int i, id, assigned, rc;
@@ -388,12 +512,9 @@
 		goto out;
 	for (i = 1; i <= sclp.rnmax - assigned; i++)
 		insert_increment(0, 1, 0);
-	rc = register_memory_notifier(&sclp_mem_nb);
-	if (rc)
-		goto out;
-	sclp_add_standby_memory();
+	rc = sclp_init_mem();
 out:
 	free_page((unsigned long)sccb);
 	return rc;
 }
-__initcall(sclp_detect_standby_memory);
+__initcall(sclp_setup_memory);
diff --git a/include/linux/memory.h b/include/linux/memory.h
@@ -96,17 +96,8 @@ int set_memory_block_size_order(unsigned int order);
 #define	MEM_GOING_ONLINE	(1<<3)
 #define	MEM_CANCEL_ONLINE	(1<<4)
 #define	MEM_CANCEL_OFFLINE	(1<<5)
-#define	MEM_PREPARE_ONLINE	(1<<6)
-#define	MEM_FINISH_OFFLINE	(1<<7)
 
 struct memory_notify {
-	/*
-	 * The altmap_start_pfn and altmap_nr_pages fields are designated for
-	 * specifying the altmap range and are exclusively intended for use in
-	 * MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers.
-	 */
-	unsigned long altmap_start_pfn;
-	unsigned long altmap_nr_pages;
 	unsigned long start_pfn;
 	unsigned long nr_pages;
 };
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
@@ -58,22 +58,6 @@ typedef int __bitwise mhp_t;
  * implies the node id (nid).
  */
 #define MHP_NID_IS_MGID		((__force mhp_t)BIT(2))
-/*
- * The hotplugged memory is completely inaccessible while the memory is
- * offline. The memory provider will handle MEM_PREPARE_ONLINE /
- * MEM_FINISH_OFFLINE notifications and make the memory accessible.
- *
- * This flag is only relevant when used along with MHP_MEMMAP_ON_MEMORY,
- * because the altmap cannot be written (e.g., poisoned) when adding
- * memory -- before it is set online.
- *
- * This allows for adding memory with an altmap that is not currently
- * made available by a hypervisor. When onlining that memory, the
- * hypervisor can be instructed to make that memory available, and
- * the onlining phase will not require any memory allocations, which is
- * helpful in low-memory situations.
- */
-#define MHP_OFFLINE_INACCESSIBLE	((__force mhp_t)BIT(3))
 
 /*
  * Extended parameters for memory hotplug:
@@ -123,7 +107,7 @@ extern void adjust_present_page_count(struct page *page,
 					      long nr_pages);
 /* VM interface that may be used by firmware interface */
 extern int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
-				     struct zone *zone, bool mhp_off_inaccessible);
+				     struct zone *zone);
 extern void mhp_deinit_memmap_on_memory(unsigned long pfn, unsigned long nr_pages);
 extern int online_pages(unsigned long pfn, unsigned long nr_pages,
 			struct zone *zone, struct memory_group *group);
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
@@ -25,7 +25,6 @@ struct vmem_altmap {
 	unsigned long free;
 	unsigned long align;
 	unsigned long alloc;
-	bool inaccessible;
 };
 
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
@@ -1088,7 +1088,7 @@ void adjust_present_page_count(struct page *page, struct memory_group *group,
 }
 
 int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
-			      struct zone *zone, bool mhp_off_inaccessible)
+			      struct zone *zone)
 {
 	unsigned long end_pfn = pfn + nr_pages;
 	int ret, i;
@@ -1097,15 +1097,6 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 	if (ret)
 		return ret;
 
-	/*
-	 * Memory block is accessible at this stage and hence poison the struct
-	 * pages now. If the memory block is accessible during memory hotplug
-	 * addition phase, then page poisining is already performed in
-	 * sparse_add_section().
-	 */
-	if (mhp_off_inaccessible)
-		page_init_poison(pfn_to_page(pfn), sizeof(struct page) * nr_pages);
-
 	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE,
 			       false);
 
@@ -1444,7 +1435,7 @@ static void remove_memory_blocks_and_altmaps(u64 start, u64 size)
 }
 
 static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
-					    u64 start, u64 size, mhp_t mhp_flags)
+					    u64 start, u64 size)
 {
 	unsigned long memblock_size = memory_block_size_bytes();
 	u64 cur_start;
@@ -1460,8 +1451,6 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
 		};
 
 		mhp_altmap.free = memory_block_memmap_on_memory_pages();
-		if (mhp_flags & MHP_OFFLINE_INACCESSIBLE)
-			mhp_altmap.inaccessible = true;
 		params.altmap = kmemdup(&mhp_altmap, sizeof(struct vmem_altmap),
 					GFP_KERNEL);
 		if (!params.altmap) {
@@ -1555,7 +1544,7 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	 */
 	if ((mhp_flags & MHP_MEMMAP_ON_MEMORY) &&
 	    mhp_supports_memmap_on_memory()) {
-		ret = create_altmaps_and_memory_blocks(nid, group, start, size, mhp_flags);
+		ret = create_altmaps_and_memory_blocks(nid, group, start, size);
 		if (ret)
 			goto error;
 	} else {
diff --git a/mm/sparse.c b/mm/sparse.c
@@ -951,8 +951,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
 	 */
-	if (!altmap || !altmap->inaccessible)
-		page_init_poison(memmap, sizeof(struct page) * nr_pages);
+	page_init_poison(memmap, sizeof(struct page) * nr_pages);
 
 	ms = __nr_to_section(section_nr);
 	set_section_nid(section_nr, nid);