Accepting request 1033139 from home:colyli:branches:Base:System

Update mdadm package to latest mdadm since mdadm-4.2 (jsc#PED-1009)

OBS-URL: https://build.opensuse.org/request/show/1033139
OBS-URL: https://build.opensuse.org/package/show/Base:System/mdadm?expand=0&rev=210
This commit is contained in:
Neil Brown 2022-11-03 23:28:28 +00:00 committed by Git OBS Bridge
parent 9f888e87f6
commit a98bc811ce
173 changed files with 6818 additions and 10696 deletions

View File

@ -1,43 +0,0 @@
From 0833f9c3dbaaee202b92ea956f9e2decc7b9593a Mon Sep 17 00:00:00 2001
From: Gioh Kim <gi-oh.kim@profitbricks.com>
Date: Tue, 6 Nov 2018 15:27:42 +0100
Subject: [PATCH] Assemble: keep MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY flag
Git-commit: 0833f9c3dbaaee202b92ea956f9e2decc7b9593a
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Before updating superblock of slave disks, desired_state value
is set for the target state of the slave disks. But it forgets
to check MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY flags. Then
start_arrays() calls ADD_NEW_DISK ioctl-call and pass the state
without MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY.
Currenlty it does not generate any problem because kernel does not
care MD_DISK_FAILFAST or MD_DISK_WRITEMOSTLY flags.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/Assemble.c b/Assemble.c
index a79466c..f39c9e1 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1704,6 +1704,9 @@ try_again:
else
desired_state = (1<<MD_DISK_ACTIVE) | (1<<MD_DISK_SYNC);
+ desired_state |= devices[j].i.disk.state & ((1<<MD_DISK_FAILFAST) |
+ (1<<MD_DISK_WRITEMOSTLY));
+
if (!devices[j].uptodate)
continue;
--
2.25.0

View File

@ -0,0 +1,50 @@
From f1cc8ab9ab6a92c3cd94ab7590b46285e214681e Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Tue, 15 Mar 2022 09:30:30 +0100
Subject: [PATCH 01/61] Unify error message.
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Provide the same error message for the same error that can occur in Grow.c and super-intel.c.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 4 ++--
super-intel.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/Grow.c b/Grow.c
index 9c6fc95..9a94720 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1001,8 +1001,8 @@ int remove_disks_for_takeover(struct supertype *st,
rv = 1;
sysfs_free(arrays);
if (rv) {
- pr_err("Error. Cannot perform operation on /dev/%s\n", st->devnm);
- pr_err("For this operation it MUST be single array in container\n");
+ pr_err("Error. Cannot perform operation on %s- for this operation "
+ "it MUST be single array in container\n", st->devnm);
return rv;
}
}
diff --git a/super-intel.c b/super-intel.c
index d5fad10..5ffa763 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -11683,8 +11683,8 @@ enum imsm_reshape_type imsm_analyze_change(struct supertype *st,
struct imsm_super *mpb = super->anchor;
if (mpb->num_raid_devs > 1) {
- pr_err("Error. Cannot perform operation on %s- for this operation it MUST be single array in container\n",
- geo->dev_name);
+ pr_err("Error. Cannot perform operation on %s- for this operation "
+ "it MUST be single array in container\n", geo->dev_name);
change = -1;
}
}
--
2.35.3

View File

@ -1,82 +0,0 @@
From 6b6112842030309c297a521918d1a2e982426fa3 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Fri, 9 Nov 2018 17:12:33 +1100
Subject: [PATCH 1/5] Document PART-POLICY lines
Git-commit: 6b6112842030309c297a521918d1a2e982426fa3
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
PART-POLICY has been accepted in mdadm.conf since the same
time that POLICY was accepted, but it was never documented.
So add the missing documentation.
Also fix a bug which would have stopped it from working if
anyone had ever tried to use it.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdadm.conf.5 | 24 +++++++++++++++++++++++-
policy.c | 2 +-
2 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/mdadm.conf.5 b/mdadm.conf.5
index 18512cb0ec7e..47c962ab2071 100644
--- a/mdadm.conf.5
+++ b/mdadm.conf.5
@@ -501,7 +501,7 @@ To update hot plug configuration it is necessary to execute
.B mdadm \-\-udev\-rules
command after changing the config file
-Key words used in the
+Keywords used in the
.I POLICY
line and supported values are:
@@ -565,6 +565,28 @@ be automatically added to that array (or it's container)
as above and the disk will become a spare in remaining cases
.RE
+.TP
+.B PART-POLICY
+This is similar to
+.B POLICY
+and accepts the same keyword assignments. It allows a consistent set
+of policies to applied to each of the partitions of a device.
+
+A
+.B PART-POLICY
+line should set
+.I type=disk
+and identify the path to one or more disk devices. Each partition on
+these disks will be treated according to the
+.I action=
+setting from this line. If a
+.I domain
+is set in the line, then the domain associated with each patition will
+be based on the domain, but with
+.RB \(dq -part N\(dq
+appended, when N is the partition number for the partition that was
+found.
+
.SH EXAMPLE
DEVICE /dev/sd[bcdjkl]1
.br
diff --git a/policy.c b/policy.c
index c0d18a7eca5b..258f39317951 100644
--- a/policy.c
+++ b/policy.c
@@ -300,7 +300,7 @@ static int path_has_part(char *path, char **part)
l--;
if (l < 5 || strncmp(path+l-5, "-part", 5) != 0)
return 0;
- *part = path+l-4;
+ *part = path+l-5;
return 1;
}
--
2.14.0.rc0.dirty

View File

@ -0,0 +1,36 @@
From 5ce5a15f0bf007e850e15259bba4f53736605fb2 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 25 Mar 2022 12:48:59 +0100
Subject: [PATCH 02/61] mdadm: Fix double free
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
If there was a size mismatch after creation it would get fixed on grow
in imsm_fix_size_mismatch(), but due to double free "double free or corruption (fasttop)"
error occurs and grow cannot proceed.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 5ffa763..6ff336e 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -11783,9 +11783,8 @@ static int imsm_fix_size_mismatch(struct supertype *st, int subarray_index)
st->update_tail = &st->updates;
} else {
imsm_sync_metadata(st);
+ free(update);
}
-
- free(update);
}
ret_val = 0;
exit:
--
2.35.3

View File

@ -0,0 +1,86 @@
From fea026b4849182fc8413014c81456e7215af28d9 Mon Sep 17 00:00:00 2001
From: Mateusz Kusiak <mateusz.kusiak@intel.com>
Date: Wed, 23 Mar 2022 15:05:19 +0100
Subject: [PATCH 03/61] Grow_reshape: Add r0 grow size error message and update
man
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Grow size on r0 is not supported for imsm and native metadata.
Add proper error message.
Update man for proper use of --size.
Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 6 ++++++
mdadm.8.in | 19 ++++++++++++-------
2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/Grow.c b/Grow.c
index 9a94720..aa72490 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1998,6 +1998,12 @@ int Grow_reshape(char *devname, int fd,
goto release;
}
+ if (array.level == 0) {
+ pr_err("Component size change is not supported for RAID0\n");
+ rv = 1;
+ goto release;
+ }
+
if (reshape_super(st, s->size, UnSet, UnSet, 0, 0, UnSet, NULL,
devname, APPLY_METADATA_CHANGES,
c->verbose > 0)) {
diff --git a/mdadm.8.in b/mdadm.8.in
index be902db..e2a4242 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -459,7 +459,8 @@ number of spare devices.
.TP
.BR \-z ", " \-\-size=
-Amount (in Kilobytes) of space to use from each drive in RAID levels 1/4/5/6.
+Amount (in Kilobytes) of space to use from each drive in RAID levels 1/4/5/6/10
+and for RAID 0 on external metadata.
This must be a multiple of the chunk size, and must leave about 128Kb
of space at the end of the drive for the RAID superblock.
If this is not specified
@@ -478,10 +479,19 @@ To guard against this it can be useful to set the initial size
slightly smaller than the smaller device with the aim that it will
still be larger than any replacement.
+This option can be used with
+.B \-\-create
+for determining initial size of an array. For external metadata,
+it can be used on a volume, but not on a container itself.
+Setting initial size of
+.B RAID 0
+array is only valid for external metadata.
+
This value can be set with
.B \-\-grow
-for RAID level 1/4/5/6 though
+for RAID level 1/4/5/6/10 though
DDF arrays may not be able to support this.
+RAID 0 array size cannot be changed.
If the array was created with a size smaller than the currently
active drives, the extra space can be accessed using
.BR \-\-grow .
@@ -501,11 +511,6 @@ problems the array can be made bigger again with no loss with another
.B "\-\-grow \-\-size="
command.
-This value cannot be used when creating a
-.B CONTAINER
-such as with DDF and IMSM metadata, though it perfectly valid when
-creating an array inside a container.
-
.TP
.BR \-Z ", " \-\-array\-size=
This is only meaningful with
--
2.35.3

View File

@ -1,339 +0,0 @@
From cd72f9d114da206baa01fd56ff2d8ffcc08f3239 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Fri, 9 Nov 2018 17:12:33 +1100
Subject: [PATCH 2/5] policy: support devices with multiple paths.
Git-commit: cd72f9d114da206baa01fd56ff2d8ffcc08f3239
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
As new releases of Linux some time change the name of
a path, some distros keep "legacy" names as well. This
is useful, but confuses mdadm which assumes each device has
precisely one path.
So change this assumption: allow a disk to have several
paths, and allow any to match when looking for a policy
which matches a disk.
Reported-and-tested-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Incremental.c | 5 +-
mdadm.h | 2 +-
policy.c | 163 ++++++++++++++++++++++++++++++++--------------------------
3 files changed, 95 insertions(+), 75 deletions(-)
diff --git a/Incremental.c b/Incremental.c
index a4ff7d4bd22d..d4d3c353560d 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -1080,6 +1080,7 @@ static int partition_try_spare(char *devname, int *dfdp, struct dev_policy *pol,
struct supertype *st2 = NULL;
char *devname = NULL;
unsigned long long devsectors;
+ char *pathlist[2];
if (de->d_ino == 0 || de->d_name[0] == '.' ||
(de->d_type != DT_LNK && de->d_type != DT_UNKNOWN))
@@ -1094,7 +1095,9 @@ static int partition_try_spare(char *devname, int *dfdp, struct dev_policy *pol,
/* This is a partition - skip it */
goto next;
- pol2 = path_policy(de->d_name, type_disk);
+ pathlist[0] = de->d_name;
+ pathlist[1] = NULL;
+ pol2 = path_policy(pathlist, type_disk);
domain_merge(&domlist, pol2, st ? st->ss->name : NULL);
if (domain_test(domlist, pol, st ? st->ss->name : NULL) != 1)
diff --git a/mdadm.h b/mdadm.h
index 387e681aa4ed..705bd9b53b3a 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1247,7 +1247,7 @@ extern void policyline(char *line, char *type);
extern void policy_add(char *type, ...);
extern void policy_free(void);
-extern struct dev_policy *path_policy(char *path, char *type);
+extern struct dev_policy *path_policy(char **paths, char *type);
extern struct dev_policy *disk_policy(struct mdinfo *disk);
extern struct dev_policy *devid_policy(int devid);
extern void dev_policy_free(struct dev_policy *p);
diff --git a/policy.c b/policy.c
index 258f39317951..fa67d5594c04 100644
--- a/policy.c
+++ b/policy.c
@@ -189,15 +189,17 @@ struct dev_policy *pol_find(struct dev_policy *pol, char *name)
return pol;
}
-static char *disk_path(struct mdinfo *disk)
+static char **disk_paths(struct mdinfo *disk)
{
struct stat stb;
int prefix_len;
DIR *by_path;
char symlink[PATH_MAX] = "/dev/disk/by-path/";
- char nm[PATH_MAX];
+ char **paths;
+ int cnt = 0;
struct dirent *ent;
- int rv;
+
+ paths = xmalloc(sizeof(*paths) * (cnt+1));
by_path = opendir(symlink);
if (by_path) {
@@ -214,22 +216,13 @@ static char *disk_path(struct mdinfo *disk)
continue;
if (stb.st_rdev != makedev(disk->disk.major, disk->disk.minor))
continue;
- closedir(by_path);
- return xstrdup(ent->d_name);
+ paths[cnt++] = xstrdup(ent->d_name);
+ paths = xrealloc(paths, sizeof(*paths) * (cnt+1));
}
closedir(by_path);
}
- /* A NULL path isn't really acceptable - use the devname.. */
- sprintf(symlink, "/sys/dev/block/%d:%d", disk->disk.major, disk->disk.minor);
- rv = readlink(symlink, nm, sizeof(nm)-1);
- if (rv > 0) {
- char *dname;
- nm[rv] = 0;
- dname = strrchr(nm, '/');
- if (dname)
- return xstrdup(dname + 1);
- }
- return xstrdup("unknown");
+ paths[cnt] = NULL;
+ return paths;
}
char type_part[] = "part";
@@ -246,18 +239,53 @@ static char *disk_type(struct mdinfo *disk)
return type_disk;
}
-static int pol_match(struct rule *rule, char *path, char *type)
+static int path_has_part(char *path, char **part)
+{
+ /* check if path ends with "-partNN" and
+ * if it does, place a pointer to "-pathNN"
+ * in 'part'.
+ */
+ int l;
+ if (!path)
+ return 0;
+ l = strlen(path);
+ while (l > 1 && isdigit(path[l-1]))
+ l--;
+ if (l < 5 || strncmp(path+l-5, "-part", 5) != 0)
+ return 0;
+ *part = path+l-5;
+ return 1;
+}
+
+static int pol_match(struct rule *rule, char **paths, char *type, char **part)
{
- /* check if this rule matches on path and type */
+ /* Check if this rule matches on any path and type.
+ * If 'part' is not NULL, then 'path' must end in -partN, which
+ * we ignore for matching, and return in *part on success.
+ */
int pathok = 0; /* 0 == no path, 1 == match, -1 == no match yet */
int typeok = 0;
- while (rule) {
+ for (; rule; rule = rule->next) {
if (rule->name == rule_path) {
+ char *p;
+ int i;
if (pathok == 0)
pathok = -1;
- if (path && fnmatch(rule->value, path, 0) == 0)
- pathok = 1;
+ if (!paths)
+ continue;
+ for (i = 0; paths[i]; i++) {
+ if (part) {
+ if (!path_has_part(paths[i], &p))
+ continue;
+ *p = '\0';
+ *part = p+1;
+ }
+ if (fnmatch(rule->value, paths[i], 0) == 0)
+ pathok = 1;
+ if (part)
+ *p = '-';
+ }
}
if (rule->name == rule_type) {
if (typeok == 0)
@@ -265,7 +293,6 @@ static int pol_match(struct rule *rule, char *path, char *type)
if (type && strcmp(rule->value, type) == 0)
typeok = 1;
}
- rule = rule->next;
}
return pathok >= 0 && typeok >= 0;
}
@@ -286,24 +313,6 @@ static void pol_merge(struct dev_policy **pol, struct rule *rule)
pol_new(pol, r->name, r->value, metadata);
}
-static int path_has_part(char *path, char **part)
-{
- /* check if path ends with "-partNN" and
- * if it does, place a pointer to "-pathNN"
- * in 'part'.
- */
- int l;
- if (!path)
- return 0;
- l = strlen(path);
- while (l > 1 && isdigit(path[l-1]))
- l--;
- if (l < 5 || strncmp(path+l-5, "-part", 5) != 0)
- return 0;
- *part = path+l-5;
- return 1;
-}
-
static void pol_merge_part(struct dev_policy **pol, struct rule *rule, char *part)
{
/* copy any name assignments from rule into pol, appending
@@ -352,7 +361,7 @@ static int config_rules_has_path = 0;
* path_policy() gathers policy information for the
* disk described in the given a 'path' and a 'type'.
*/
-struct dev_policy *path_policy(char *path, char *type)
+struct dev_policy *path_policy(char **paths, char *type)
{
struct pol_rule *rules;
struct dev_policy *pol = NULL;
@@ -361,27 +370,24 @@ struct dev_policy *path_policy(char *path, char *type)
rules = config_rules;
while (rules) {
- char *part;
+ char *part = NULL;
if (rules->type == rule_policy)
- if (pol_match(rules->rule, path, type))
+ if (pol_match(rules->rule, paths, type, NULL))
pol_merge(&pol, rules->rule);
if (rules->type == rule_part && strcmp(type, type_part) == 0)
- if (path_has_part(path, &part)) {
- *part = 0;
- if (pol_match(rules->rule, path, type_disk))
- pol_merge_part(&pol, rules->rule, part+1);
- *part = '-';
- }
+ if (pol_match(rules->rule, paths, type_disk, &part))
+ pol_merge_part(&pol, rules->rule, part);
rules = rules->next;
}
/* Now add any metadata-specific internal knowledge
* about this path
*/
- for (i=0; path && superlist[i]; i++)
+ for (i=0; paths[0] && superlist[i]; i++)
if (superlist[i]->get_disk_controller_domain) {
const char *d =
- superlist[i]->get_disk_controller_domain(path);
+ superlist[i]->get_disk_controller_domain(
+ paths[0]);
if (d)
pol_new(&pol, pol_domain, d, superlist[i]->name);
}
@@ -400,22 +406,34 @@ void pol_add(struct dev_policy **pol,
pol_dedup(*pol);
}
+static void free_paths(char **paths)
+{
+ int i;
+
+ if (!paths)
+ return;
+
+ for (i = 0; paths[i]; i++)
+ free(paths[i]);
+ free(paths);
+}
+
/*
* disk_policy() gathers policy information for the
* disk described in the given mdinfo (disk.{major,minor}).
*/
struct dev_policy *disk_policy(struct mdinfo *disk)
{
- char *path = NULL;
+ char **paths = NULL;
char *type = disk_type(disk);
struct dev_policy *pol = NULL;
if (config_rules_has_path)
- path = disk_path(disk);
+ paths = disk_paths(disk);
- pol = path_policy(path, type);
+ pol = path_policy(paths, type);
- free(path);
+ free_paths(paths);
return pol;
}
@@ -756,27 +774,26 @@ int policy_check_path(struct mdinfo *disk, struct map_ent *array)
{
char path[PATH_MAX];
FILE *f = NULL;
- char *id_path = disk_path(disk);
- int rv;
+ char **id_paths = disk_paths(disk);
+ int i;
+ int rv = 0;
- if (!id_path)
- return 0;
+ for (i = 0; id_paths[i]; i++) {
+ snprintf(path, PATH_MAX, FAILED_SLOTS_DIR "/%s", id_paths[i]);
+ f = fopen(path, "r");
+ if (!f)
+ continue;
- snprintf(path, PATH_MAX, FAILED_SLOTS_DIR "/%s", id_path);
- f = fopen(path, "r");
- if (!f) {
- free(id_path);
- return 0;
+ rv = fscanf(f, " %s %x:%x:%x:%x\n",
+ array->metadata,
+ array->uuid,
+ array->uuid+1,
+ array->uuid+2,
+ array->uuid+3);
+ fclose(f);
+ break;
}
-
- rv = fscanf(f, " %s %x:%x:%x:%x\n",
- array->metadata,
- array->uuid,
- array->uuid+1,
- array->uuid+2,
- array->uuid+3);
- fclose(f);
- free(id_path);
+ free_paths(id_paths);
return rv == 5;
}
--
2.14.0.rc0.dirty

View File

@ -1,140 +0,0 @@
From 4199d3c629c14866505923d19fa50017ee92d2e1 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Wed, 5 Dec 2018 16:35:00 +1100
Subject: [PATCH 3/5] mdcheck: add systemd unit files to run mdcheck.
Git-commit: 4199d3c629c14866505923d19fa50017ee92d2e1
Patch-mainline: mdadm-4.1+
References: bsc#1115407
Having the mdcheck script is not use if is never run.
This patch adds systemd unit files so that it can easily
be run on the first Sunday of each month for 6 hours,
then on every subsequent morning until the check is
finished.
The units still need to be enabled with
systemctl enable mdcheck_start.timer
The timer will only actually be started when an array
which might need it becomes active.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Makefile | 5 ++++-
systemd/mdcheck_continue.service | 18 ++++++++++++++++++
systemd/mdcheck_continue.timer | 13 +++++++++++++
systemd/mdcheck_start.service | 17 +++++++++++++++++
systemd/mdcheck_start.timer | 15 +++++++++++++++
5 files changed, 67 insertions(+), 1 deletion(-)
create mode 100644 systemd/mdcheck_continue.service
create mode 100644 systemd/mdcheck_continue.timer
create mode 100644 systemd/mdcheck_start.service
create mode 100644 systemd/mdcheck_start.timer
diff --git a/Makefile b/Makefile
index 2767ac68396d..afb62cc6e3a6 100644
--- a/Makefile
+++ b/Makefile
@@ -276,7 +276,10 @@ install-udev: udev-md-raid-arrays.rules udev-md-raid-assembly.rules udev-md-raid
install-systemd: systemd/mdmon@.service
@for file in mdmon@.service mdmonitor.service mdadm-last-resort@.timer \
- mdadm-last-resort@.service mdadm-grow-continue@.service; \
+ mdadm-last-resort@.service mdadm-grow-continue@.service \
+ mdcheck_start.timer mdcheck_start.service \
+ mdcheck_continue.timer mdcheck_continue.service \
+ ; \
do sed -e 's,BINDIR,$(BINDIR),g' systemd/$$file > .install.tmp.2 && \
$(ECHO) $(INSTALL) -D -m 644 systemd/$$file $(DESTDIR)$(SYSTEMD_DIR)/$$file ; \
$(INSTALL) -D -m 644 .install.tmp.2 $(DESTDIR)$(SYSTEMD_DIR)/$$file ; \
diff --git a/systemd/mdcheck_continue.service b/systemd/mdcheck_continue.service
new file mode 100644
index 000000000000..592c60798f82
--- /dev/null
+++ b/systemd/mdcheck_continue.service
@@ -0,0 +1,18 @@
+# This file is part of mdadm.
+#
+# mdadm is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+
+[Unit]
+Description=MD array scrubbing - continuation
+ConditionPathExistsGlob = /var/lib/mdcheck/MD_UUID_*
+
+[Service]
+Type=oneshot
+Environment= MDADM_CHECK_DURATION='"6 hours"'
+EnvironmentFile=-/run/sysconfig/mdadm
+ExecStartPre=-/usr/lib/mdadm/mdadm_env.sh
+ExecStart=/usr/share/mdadm/mdcheck --continue --duration $MDADM_CHECK_DURATION
+
diff --git a/systemd/mdcheck_continue.timer b/systemd/mdcheck_continue.timer
new file mode 100644
index 000000000000..3ccfd7858a3f
--- /dev/null
+++ b/systemd/mdcheck_continue.timer
@@ -0,0 +1,13 @@
+# This file is part of mdadm.
+#
+# mdadm is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+
+[Unit]
+Description=MD array scrubbing - continuation
+
+[Timer]
+OnCalendar= 1:05:00
+
diff --git a/systemd/mdcheck_start.service b/systemd/mdcheck_start.service
new file mode 100644
index 000000000000..812141bb5c9a
--- /dev/null
+++ b/systemd/mdcheck_start.service
@@ -0,0 +1,17 @@
+# This file is part of mdadm.
+#
+# mdadm is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+
+[Unit]
+Description=MD array scrubbing
+Wants=mdcheck_continue.timer
+
+[Service]
+Type=oneshot
+Environment= MDADM_CHECK_DURATION='"6 hours"'
+EnvironmentFile=-/run/sysconfig/mdadm
+ExecStartPre=-/usr/lib/mdadm/mdadm_env.sh
+ExecStart=/usr/share/mdadm/mdcheck --duration $MDADM_CHECK_DURATION
diff --git a/systemd/mdcheck_start.timer b/systemd/mdcheck_start.timer
new file mode 100644
index 000000000000..64807362d649
--- /dev/null
+++ b/systemd/mdcheck_start.timer
@@ -0,0 +1,15 @@
+# This file is part of mdadm.
+#
+# mdadm is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+
+[Unit]
+Description=MD array scrubbing
+
+[Timer]
+OnCalendar=Sun *-*-1..7 1:00:00
+
+[Install]
+WantedBy= mdmonitor.service
--
2.14.0.rc0.dirty

View File

@ -0,0 +1,70 @@
From cf9a109209aad285372b67306d54118af6fc522b Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Fri, 14 Jan 2022 16:44:33 +0100
Subject: [PATCH 04/61] udev: adapt rules to systemd v247
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
New events have been added in kernel 4.14 ("bind" and "unbind").
Systemd maintainer suggests to modify "add|change" branches.
This patches implements their suggestions. There is no issue yet because
new event types are not used in md.
Please see systemd announcement for details[1].
[1] https://lists.freedesktop.org/archives/systemd-devel/2020-November/045646.html
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
udev-md-raid-arrays.rules | 2 +-
udev-md-raid-assembly.rules | 5 +++--
udev-md-raid-safe-timeouts.rules | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/udev-md-raid-arrays.rules b/udev-md-raid-arrays.rules
index 13c9076..2967ace 100644
--- a/udev-md-raid-arrays.rules
+++ b/udev-md-raid-arrays.rules
@@ -3,7 +3,7 @@
SUBSYSTEM!="block", GOTO="md_end"
# handle md arrays
-ACTION!="add|change", GOTO="md_end"
+ACTION=="remove", GOTO="md_end"
KERNEL!="md*", GOTO="md_end"
# partitions have no md/{array_state,metadata_version}, but should not
diff --git a/udev-md-raid-assembly.rules b/udev-md-raid-assembly.rules
index d668cdd..39b4344 100644
--- a/udev-md-raid-assembly.rules
+++ b/udev-md-raid-assembly.rules
@@ -30,8 +30,9 @@ LABEL="md_inc"
# remember you can limit what gets auto/incrementally assembled by
# mdadm.conf(5)'s 'AUTO' and selectively whitelist using 'ARRAY'
-ACTION=="add|change", IMPORT{program}="BINDIR/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}"
-ACTION=="add|change", ENV{MD_STARTED}=="*unsafe*", ENV{MD_FOREIGN}=="no", ENV{SYSTEMD_WANTS}+="mdadm-last-resort@$env{MD_DEVICE}.timer"
+ACTION!="remove", IMPORT{program}="BINDIR/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}"
+ACTION!="remove", ENV{MD_STARTED}=="*unsafe*", ENV{MD_FOREIGN}=="no", ENV{SYSTEMD_WANTS}+="mdadm-last-resort@$env{MD_DEVICE}.timer"
+
ACTION=="remove", ENV{ID_PATH}=="?*", RUN+="BINDIR/mdadm -If $name --path $env{ID_PATH}"
ACTION=="remove", ENV{ID_PATH}!="?*", RUN+="BINDIR/mdadm -If $name"
diff --git a/udev-md-raid-safe-timeouts.rules b/udev-md-raid-safe-timeouts.rules
index 12bdcaa..2e185ce 100644
--- a/udev-md-raid-safe-timeouts.rules
+++ b/udev-md-raid-safe-timeouts.rules
@@ -50,7 +50,7 @@ ENV{DEVTYPE}!="partition", GOTO="md_timeouts_end"
IMPORT{program}="/sbin/mdadm --examine --export $devnode"
-ACTION=="add|change", \
+ACTION!="remove", \
ENV{ID_FS_TYPE}=="linux_raid_member", \
ENV{MD_LEVEL}=="raid[1-9]*", \
TEST=="/sys/block/$parent/device/timeout", \
--
2.35.3

View File

@ -1,85 +0,0 @@
From 7cd7e91ab3de5aa75dc963cb08b0618c1885cf0d Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Wed, 5 Dec 2018 16:35:00 +1100
Subject: [PATCH 4/5] Monitor: add system timer to run --oneshot periodically
Git-commit: 7cd7e91ab3de5aa75dc963cb08b0618c1885cf0d
Patch-mainline: mdadm-4.1+
References: bsc#1115407
"mdadm --monitor --oneshot" can be used to get a warning
if there are any degraded arrays. It can be helpful to get
this warning periodically while the condition persists.
This patch add a systemd service and timer which can
be enabled with
systemctl enable mdmonitor-oneshot.service
and will then provide daily warnings.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Makefile | 1 +
systemd/mdmonitor-oneshot.service | 15 +++++++++++++++
systemd/mdmonitor-oneshot.timer | 15 +++++++++++++++
3 files changed, 31 insertions(+)
create mode 100644 systemd/mdmonitor-oneshot.service
create mode 100644 systemd/mdmonitor-oneshot.timer
diff --git a/Makefile b/Makefile
index afb62cc6e3a6..dfe00b0a0be8 100644
--- a/Makefile
+++ b/Makefile
@@ -279,6 +279,7 @@ install-systemd: systemd/mdmon@.service
mdadm-last-resort@.service mdadm-grow-continue@.service \
mdcheck_start.timer mdcheck_start.service \
mdcheck_continue.timer mdcheck_continue.service \
+ mdmonitor-oneshot.timer mdmonitor-oneshot.service \
; \
do sed -e 's,BINDIR,$(BINDIR),g' systemd/$$file > .install.tmp.2 && \
$(ECHO) $(INSTALL) -D -m 644 systemd/$$file $(DESTDIR)$(SYSTEMD_DIR)/$$file ; \
diff --git a/systemd/mdmonitor-oneshot.service b/systemd/mdmonitor-oneshot.service
new file mode 100644
index 000000000000..fd469b12cc78
--- /dev/null
+++ b/systemd/mdmonitor-oneshot.service
@@ -0,0 +1,15 @@
+# This file is part of mdadm.
+#
+# mdadm is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+
+[Unit]
+Description=Reminder for degraded MD arrays
+
+[Service]
+Environment= MDADM_MONITOR_ARGS=--scan
+EnvironmentFile=-/run/sysconfig/mdadm
+ExecStartPre=-/usr/lib/mdadm/mdadm_env.sh
+ExecStart=BINDIR/mdadm --monitor --oneshot $MDADM_MONITOR_ARGS
diff --git a/systemd/mdmonitor-oneshot.timer b/systemd/mdmonitor-oneshot.timer
new file mode 100644
index 000000000000..cb54bdaa8897
--- /dev/null
+++ b/systemd/mdmonitor-oneshot.timer
@@ -0,0 +1,15 @@
+# This file is part of mdadm.
+#
+# mdadm is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+
+[Unit]
+Description=Reminder for degraded MD arrays
+
+[Timer]
+OnCalendar= 2:00:00
+
+[Install]
+WantedBy= mdmonitor.service
--
2.14.0.rc0.dirty

View File

@ -0,0 +1,255 @@
From 83a379cfbd283b387919fe05d44eb4c49e155ad6 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Mon, 21 Feb 2022 13:05:20 +0100
Subject: [PATCH 05/61] Replace error prone signal() with sigaction()
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Up to this date signal() was used which implementation could vary [1].
Sigaction() call is preferred. This commit introduces replacement
from signal() to sigaction() by the use of signal_s() wrapper.
Also remove redundant signal.h header includes.
[1] https://man7.org/linux/man-pages/man2/signal.2.html
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 4 ++--
Monitor.c | 5 +++--
managemon.c | 1 -
mdadm.h | 22 ++++++++++++++++++++++
mdmon.c | 1 -
monitor.c | 1 -
probe_roms.c | 6 +++---
raid6check.c | 25 +++++++++++++++----------
util.c | 1 -
9 files changed, 45 insertions(+), 21 deletions(-)
diff --git a/Grow.c b/Grow.c
index aa72490..18c5719 100644
--- a/Grow.c
+++ b/Grow.c
@@ -26,7 +26,6 @@
#include <sys/mman.h>
#include <stddef.h>
#include <stdint.h>
-#include <signal.h>
#include <sys/wait.h>
#if ! defined(__BIG_ENDIAN) && ! defined(__LITTLE_ENDIAN)
@@ -3566,7 +3565,8 @@ started:
fd = -1;
mlockall(MCL_FUTURE);
- signal(SIGTERM, catch_term);
+ if (signal_s(SIGTERM, catch_term) == SIG_ERR)
+ goto release;
if (st->ss->external) {
/* metadata handler takes it from here */
diff --git a/Monitor.c b/Monitor.c
index 30c031a..c0ab541 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -26,7 +26,6 @@
#include "md_p.h"
#include "md_u.h"
#include <sys/wait.h>
-#include <signal.h>
#include <limits.h>
#include <syslog.h>
#ifndef NO_LIBUDEV
@@ -435,8 +434,10 @@ static void alert(char *event, char *dev, char *disc, struct alert_info *info)
if (mp) {
FILE *mdstat;
char hname[256];
+
gethostname(hname, sizeof(hname));
- signal(SIGPIPE, SIG_IGN);
+ signal_s(SIGPIPE, SIG_IGN);
+
if (info->mailfrom)
fprintf(mp, "From: %s\n", info->mailfrom);
else
diff --git a/managemon.c b/managemon.c
index bb7334c..0e9bdf0 100644
--- a/managemon.c
+++ b/managemon.c
@@ -106,7 +106,6 @@
#include "mdmon.h"
#include <sys/syscall.h>
#include <sys/socket.h>
-#include <signal.h>
static void close_aa(struct active_array *aa)
{
diff --git a/mdadm.h b/mdadm.h
index c7268a7..26e7e5c 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -46,6 +46,7 @@ extern __off64_t lseek64 __P ((int __fd, __off64_t __offset, int __whence));
#include <string.h>
#include <syslog.h>
#include <stdbool.h>
+#include <signal.h>
/* Newer glibc requires sys/sysmacros.h directly for makedev() */
#include <sys/sysmacros.h>
#ifdef __dietlibc__
@@ -1729,6 +1730,27 @@ static inline char *to_subarray(struct mdstat_ent *ent, char *container)
return &ent->metadata_version[10+strlen(container)+1];
}
+/**
+ * signal_s() - Wrapper for sigaction() with signal()-like interface.
+ * @sig: The signal to set the signal handler to.
+ * @handler: The signal handler.
+ *
+ * Return: previous handler or SIG_ERR on failure.
+ */
+static inline sighandler_t signal_s(int sig, sighandler_t handler)
+{
+ struct sigaction new_act;
+ struct sigaction old_act;
+
+ new_act.sa_handler = handler;
+ new_act.sa_flags = 0;
+
+ if (sigaction(sig, &new_act, &old_act) == 0)
+ return old_act.sa_handler;
+
+ return SIG_ERR;
+}
+
#ifdef DEBUG
#define dprintf(fmt, arg...) \
fprintf(stderr, "%s: %s: "fmt, Name, __func__, ##arg)
diff --git a/mdmon.c b/mdmon.c
index c71e62c..5570574 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -56,7 +56,6 @@
#include <errno.h>
#include <string.h>
#include <fcntl.h>
-#include <signal.h>
#include <dirent.h>
#ifdef USE_PTHREADS
#include <pthread.h>
diff --git a/monitor.c b/monitor.c
index e0d3be6..b877e59 100644
--- a/monitor.c
+++ b/monitor.c
@@ -22,7 +22,6 @@
#include "mdmon.h"
#include <sys/syscall.h>
#include <sys/select.h>
-#include <signal.h>
static char *array_states[] = {
"clear", "inactive", "suspended", "readonly", "read-auto",
diff --git a/probe_roms.c b/probe_roms.c
index 7ea04c7..94c80c2 100644
--- a/probe_roms.c
+++ b/probe_roms.c
@@ -22,7 +22,6 @@
#include "probe_roms.h"
#include "mdadm.h"
#include <unistd.h>
-#include <signal.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
@@ -69,7 +68,8 @@ static int probe_address16(const __u16 *ptr, __u16 *val)
void probe_roms_exit(void)
{
- signal(SIGBUS, SIG_DFL);
+ signal_s(SIGBUS, SIG_DFL);
+
if (rom_fd >= 0) {
close(rom_fd);
rom_fd = -1;
@@ -98,7 +98,7 @@ int probe_roms_init(unsigned long align)
if (roms_init())
return -1;
- if (signal(SIGBUS, sigbus) == SIG_ERR)
+ if (signal_s(SIGBUS, sigbus) == SIG_ERR)
rc = -1;
if (rc == 0) {
fd = open("/dev/mem", O_RDONLY);
diff --git a/raid6check.c b/raid6check.c
index a8e6005..9947776 100644
--- a/raid6check.c
+++ b/raid6check.c
@@ -24,7 +24,6 @@
#include "mdadm.h"
#include <stdint.h>
-#include <signal.h>
#include <sys/mman.h>
#define CHECK_PAGE_BITS (12)
@@ -130,30 +129,36 @@ void raid6_stats(int *disk, int *results, int raid_disks, int chunk_size)
}
int lock_stripe(struct mdinfo *info, unsigned long long start,
- int chunk_size, int data_disks, sighandler_t *sig) {
+ int chunk_size, int data_disks, sighandler_t *sig)
+{
int rv;
+
+ sig[0] = signal_s(SIGTERM, SIG_IGN);
+ sig[1] = signal_s(SIGINT, SIG_IGN);
+ sig[2] = signal_s(SIGQUIT, SIG_IGN);
+
+ if (sig[0] == SIG_ERR || sig[1] == SIG_ERR || sig[2] == SIG_ERR)
+ return 1;
+
if(mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
return 2;
}
- sig[0] = signal(SIGTERM, SIG_IGN);
- sig[1] = signal(SIGINT, SIG_IGN);
- sig[2] = signal(SIGQUIT, SIG_IGN);
-
rv = sysfs_set_num(info, NULL, "suspend_lo", start * chunk_size * data_disks);
rv |= sysfs_set_num(info, NULL, "suspend_hi", (start + 1) * chunk_size * data_disks);
return rv * 256;
}
-int unlock_all_stripes(struct mdinfo *info, sighandler_t *sig) {
+int unlock_all_stripes(struct mdinfo *info, sighandler_t *sig)
+{
int rv;
rv = sysfs_set_num(info, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
rv |= sysfs_set_num(info, NULL, "suspend_hi", 0);
rv |= sysfs_set_num(info, NULL, "suspend_lo", 0);
- signal(SIGQUIT, sig[2]);
- signal(SIGINT, sig[1]);
- signal(SIGTERM, sig[0]);
+ signal_s(SIGQUIT, sig[2]);
+ signal_s(SIGINT, sig[1]);
+ signal_s(SIGTERM, sig[0]);
if(munlockall() != 0)
return 3;
diff --git a/util.c b/util.c
index 3d05d07..cc94f96 100644
--- a/util.c
+++ b/util.c
@@ -35,7 +35,6 @@
#include <poll.h>
#include <ctype.h>
#include <dirent.h>
-#include <signal.h>
#include <dlfcn.h>
--
2.35.3

View File

@ -1,88 +0,0 @@
From d7a1fda2769ba272d89de6caeab35d52b73a9c3c Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Wed, 17 Oct 2018 12:11:41 +0200
Subject: [PATCH 5/5] imsm: update metadata correctly while raid10 double
degradation
Git-commit: d7a1fda2769ba272d89de6caeab35d52b73a9c3c
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Mdmon calls end_migration() when map state changes from normal to
degraded. It is not valid because in raid 10 double degradation case
mdmon breaks checkpointing but array is still rebuilding.
In this case mdmon has to mark map as degraded and continues marking
recovery checkpoint in metadata. Migration can be finished only if newly
failed device is a rebuilding device.
Add catching double degraded to degraded transition. Migration is
finished but map state doesn't change, array is still degraded.
Update failed_disk_num correctly. If double degradation
happens rebuild will start on the lowest slot, but this variable points
to the first failed slot. If second fail happens while rebuild this
variable shouldn't be updated until rebuild is not finished.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 6438987b778c..d2035ccd8270 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8136,7 +8136,8 @@ static int mark_failure(struct intel_super *super,
set_imsm_ord_tbl_ent(map2, slot2,
idx | IMSM_ORD_REBUILD);
}
- if (map->failed_disk_num == 0xff)
+ if (map->failed_disk_num == 0xff ||
+ (!is_rebuilding(dev) && map->failed_disk_num > slot))
map->failed_disk_num = slot;
clear_disk_badblocks(super->bbm_log, ord_to_idx(ord));
@@ -8558,13 +8559,25 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
break;
}
if (is_rebuilding(dev)) {
- dprintf_cont("while rebuilding.");
+ dprintf_cont("while rebuilding ");
if (map->map_state != map_state) {
- dprintf_cont(" Map state change");
- end_migration(dev, super, map_state);
+ dprintf_cont("map state change ");
+ if (n == map->failed_disk_num) {
+ dprintf_cont("end migration");
+ end_migration(dev, super, map_state);
+ } else {
+ dprintf_cont("raid10 double degradation, map state change");
+ map->map_state = map_state;
+ }
super->updates_pending++;
- } else if (!rebuild_done) {
+ } else if (!rebuild_done)
break;
+ else if (n == map->failed_disk_num) {
+ /* r10 double degraded to degraded transition */
+ dprintf_cont("raid10 double degradation end migration");
+ end_migration(dev, super, map_state);
+ a->last_checkpoint = 0;
+ super->updates_pending++;
}
/* check if recovery is really finished */
@@ -8575,7 +8588,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
if (recovery_not_finished) {
dprintf_cont("\n");
- dprintf("Rebuild has not finished yet, state not changed");
+ dprintf_cont("Rebuild has not finished yet, map state changes only if raid10 double degradation happens");
if (a->last_checkpoint < mdi->recovery_start) {
a->last_checkpoint =
mdi->recovery_start;
--
2.14.0.rc0.dirty

View File

@ -0,0 +1,126 @@
From e9dd5644843e2013a7dd1a8a5da2b9fa35837416 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 18 Mar 2022 09:26:04 +0100
Subject: [PATCH 06/61] mdadm: Respect config file location in man
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Default config file location could differ depending on OS (e.g. Debian family).
This patch takes default config file into consideration when creating mdadm.man
file as well as mdadm.conf.man.
Rename mdadm.conf.5 to mdadm.conf.5.in. Now mdadm.conf.5 is generated automatically.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
.gitignore | 1 +
Makefile | 7 ++++++-
mdadm.8.in | 16 ++++++++--------
mdadm.conf.5 => mdadm.conf.5.in | 2 +-
4 files changed, 16 insertions(+), 10 deletions(-)
rename mdadm.conf.5 => mdadm.conf.5.in (99%)
diff --git a/.gitignore b/.gitignore
index 217fe76..8d791c6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@
/*-stamp
/mdadm
/mdadm.8
+/mdadm.conf.5
/mdadm.udeb
/mdassemble
/mdmon
diff --git a/Makefile b/Makefile
index 2a51d81..bf12603 100644
--- a/Makefile
+++ b/Makefile
@@ -227,7 +227,12 @@ raid6check : raid6check.o mdadm.h $(CHECK_OBJS)
mdadm.8 : mdadm.8.in
sed -e 's/{DEFAULT_METADATA}/$(DEFAULT_METADATA)/g' \
- -e 's,{MAP_PATH},$(MAP_PATH),g' mdadm.8.in > mdadm.8
+ -e 's,{MAP_PATH},$(MAP_PATH),g' -e 's,{CONFFILE},$(CONFFILE),g' \
+ -e 's,{CONFFILE2},$(CONFFILE2),g' mdadm.8.in > mdadm.8
+
+mdadm.conf.5 : mdadm.conf.5.in
+ sed -e 's,{CONFFILE},$(CONFFILE),g' \
+ -e 's,{CONFFILE2},$(CONFFILE2),g' mdadm.conf.5.in > mdadm.conf.5
mdadm.man : mdadm.8
man -l mdadm.8 > mdadm.man
diff --git a/mdadm.8.in b/mdadm.8.in
index e2a4242..8b21ffd 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -267,13 +267,13 @@ the exact meaning of this option in different contexts.
.TP
.BR \-c ", " \-\-config=
Specify the config file or directory. Default is to use
-.B /etc/mdadm.conf
+.B {CONFFILE}
and
-.BR /etc/mdadm.conf.d ,
+.BR {CONFFILE}.d ,
or if those are missing then
-.B /etc/mdadm/mdadm.conf
+.B {CONFFILE2}
and
-.BR /etc/mdadm/mdadm.conf.d .
+.BR {CONFFILE2}.d .
If the config file given is
.B "partitions"
then nothing will be read, but
@@ -2014,9 +2014,9 @@ The config file is only used if explicitly named with
or requested with (a possibly implicit)
.BR \-\-scan .
In the later case,
-.B /etc/mdadm.conf
+.B {CONFFILE}
or
-.B /etc/mdadm/mdadm.conf
+.B {CONFFILE2}
is used.
If
@@ -3344,7 +3344,7 @@ uses this to find arrays when
is given in Misc mode, and to monitor array reconstruction
on Monitor mode.
-.SS /etc/mdadm.conf
+.SS {CONFFILE} (or {CONFFILE2})
The config file lists which devices may be scanned to see if
they contain MD super block, and gives identifying information
@@ -3352,7 +3352,7 @@ they contain MD super block, and gives identifying information
.BR mdadm.conf (5)
for more details.
-.SS /etc/mdadm.conf.d
+.SS {CONFFILE}.d (or {CONFFILE2}.d)
A directory containing configuration files which are read in lexical
order.
diff --git a/mdadm.conf.5 b/mdadm.conf.5.in
similarity index 99%
rename from mdadm.conf.5
rename to mdadm.conf.5.in
index 74a21c5..83edd00 100644
--- a/mdadm.conf.5
+++ b/mdadm.conf.5.in
@@ -8,7 +8,7 @@
.SH NAME
mdadm.conf \- configuration for management of Software RAID with mdadm
.SH SYNOPSIS
-/etc/mdadm.conf
+{CONFFILE}
.SH DESCRIPTION
.PP
.I mdadm
--
2.35.3

View File

@ -1,48 +0,0 @@
From 563ac108659980b3d1e226fe416254a86656235f Mon Sep 17 00:00:00 2001
From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Date: Tue, 6 Nov 2018 16:20:17 +0100
Subject: [PATCH] Assemble: mask FAILFAST and WRITEMOSTLY flags when finding
the most recent device
Git-commit: 563ac108659980b3d1e226fe416254a86656235f
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
If devices[].i.disk.state has MD_DISK_FAILFAST or MD_DISK_WRITEMOSTLY
flag, it cannot be the most recent device. Both flags should be masked
before checking the state.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/Assemble.c b/Assemble.c
index f39c9e1..9f75c68 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -578,6 +578,7 @@ static int load_devices(struct devs *devices, char *devmap,
struct supertype *tst;
int i;
int dfd;
+ int disk_state;
if (tmpdev->used != 1)
continue;
@@ -711,7 +712,9 @@ static int load_devices(struct devs *devices, char *devmap,
devices[devcnt].i.disk.major = major(stb.st_rdev);
devices[devcnt].i.disk.minor = minor(stb.st_rdev);
- if (devices[devcnt].i.disk.state == 6) {
+ disk_state = devices[devcnt].i.disk.state & ~((1<<MD_DISK_FAILFAST) |
+ (1<<MD_DISK_WRITEMOSTLY));
+ if (disk_state == ((1<<MD_DISK_ACTIVE) | (1<<MD_DISK_SYNC))) {
if (most_recent < 0 ||
devices[devcnt].i.events
> devices[most_recent].i.events) {
--
2.25.0

View File

@ -0,0 +1,50 @@
From c23400377bb3d8e98e810cd92dba478dac1dff82 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 18 Mar 2022 09:26:05 +0100
Subject: [PATCH 07/61] mdadm: Update ReadMe
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Instead of hardcoded config file path give reference to config manual.
Add missing monitordelay and homecluster parameters.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
ReadMe.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/ReadMe.c b/ReadMe.c
index 8139976..8f873c4 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -613,7 +613,6 @@ char Help_incr[] =
;
char Help_config[] =
-"The /etc/mdadm.conf config file:\n\n"
" The config file contains, apart from blank lines and comment lines that\n"
" start with a hash(#), array lines, device lines, and various\n"
" configuration lines.\n"
@@ -636,10 +635,12 @@ char Help_config[] =
" than a device must match all of them to be considered.\n"
"\n"
" Other configuration lines include:\n"
-" mailaddr, mailfrom, program used for --monitor mode\n"
-" create, auto used when creating device names in /dev\n"
-" homehost, policy, part-policy used to guide policy in various\n"
-" situations\n"
+" mailaddr, mailfrom, program, monitordelay used for --monitor mode\n"
+" create, auto used when creating device names in /dev\n"
+" homehost, homecluster, policy, part-policy used to guide policy in various\n"
+" situations\n"
+"\n"
+"For more details see mdadm.conf(5).\n"
"\n"
;
--
2.35.3

View File

@ -1,39 +0,0 @@
From 085df42259cba7863cd6ebe5cd0d8492ac5b869e Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Thu, 6 Dec 2018 10:35:41 +1100
Subject: [PATCH] Grow: avoid overflow in compute_backup_blocks()
Git-commit: 085df42259cba7863cd6ebe5cd0d8492ac5b869e
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
With a chunk size of 16Meg and data drive count of 8,
this calculate can easily overflow the 'int' type that
is used for the multiplications.
So force it to use "long" instead.
Reported-and-tested-by: Ed Spiridonov <edo.rus@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/Grow.c b/Grow.c
index 4436a4d6bd4c..76f82c075e38 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1196,7 +1196,8 @@ unsigned long compute_backup_blocks(int nchunk, int ochunk,
/* Find GCD */
a = GCD(a, b);
/* LCM == product / GCD */
- blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
+ blocks = (unsigned long)(ochunk/512) * (unsigned long)(nchunk/512) *
+ odata * ndata / a;
return blocks;
}
--
2.14.0.rc0.dirty

View File

@ -0,0 +1,205 @@
From 24e075c659d0a8718aabefe5af4c97195a188af7 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 18 Mar 2022 09:26:06 +0100
Subject: [PATCH 08/61] mdadm: Update config man regarding default files and
multi-keyword behavior
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Simplify default and alternative config file and directory location references
from mdadm(8) as references to mdadm.conf(5). Add FILE section in config man
and explain order and conditions in which default and alternative config files
and directories are used.
Update config man behavior regarding parsing order when multiple keywords/config
files are involved.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
mdadm.8.in | 30 +++++++++--------------
mdadm.conf.5.in | 65 ++++++++++++++++++++++++++++++++++++++++++++-----
2 files changed, 71 insertions(+), 24 deletions(-)
diff --git a/mdadm.8.in b/mdadm.8.in
index 8b21ffd..0be02e4 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -266,14 +266,11 @@ the exact meaning of this option in different contexts.
.TP
.BR \-c ", " \-\-config=
-Specify the config file or directory. Default is to use
-.B {CONFFILE}
-and
-.BR {CONFFILE}.d ,
-or if those are missing then
-.B {CONFFILE2}
-and
-.BR {CONFFILE2}.d .
+Specify the config file or directory. If not specified, default config file
+and default conf.d directory will be used. See
+.BR mdadm.conf (5)
+for more details.
+
If the config file given is
.B "partitions"
then nothing will be read, but
@@ -2013,11 +2010,9 @@ The config file is only used if explicitly named with
.B \-\-config
or requested with (a possibly implicit)
.BR \-\-scan .
-In the later case,
-.B {CONFFILE}
-or
-.B {CONFFILE2}
-is used.
+In the later case, default config file is used. See
+.BR mdadm.conf (5)
+for more details.
If
.B \-\-scan
@@ -3346,16 +3341,15 @@ on Monitor mode.
.SS {CONFFILE} (or {CONFFILE2})
-The config file lists which devices may be scanned to see if
-they contain MD super block, and gives identifying information
-(e.g. UUID) about known MD arrays. See
+Default config file. See
.BR mdadm.conf (5)
for more details.
.SS {CONFFILE}.d (or {CONFFILE2}.d)
-A directory containing configuration files which are read in lexical
-order.
+Default directory containing configuration files. See
+.BR mdadm.conf (5)
+for more details.
.SS {MAP_PATH}
When
diff --git a/mdadm.conf.5.in b/mdadm.conf.5.in
index 83edd00..dd331a6 100644
--- a/mdadm.conf.5.in
+++ b/mdadm.conf.5.in
@@ -88,7 +88,8 @@ but only the major and minor device numbers. It scans
.I /dev
to find the name that matches the numbers.
-If no DEVICE line is present, then "DEVICE partitions containers" is assumed.
+If no DEVICE line is present in any config file,
+then "DEVICE partitions containers" is assumed.
For example:
.IP
@@ -272,6 +273,10 @@ catenated with spaces to form the address.
Note that this value cannot be set via the
.I mdadm
commandline. It is only settable via the config file.
+There should only be one
+.B MAILADDR
+line and it should have only one address. Any subsequent addresses
+are silently ignored.
.TP
.B PROGRAM
@@ -286,7 +291,8 @@ device.
There should only be one
.B program
-line and it should be give only one program.
+line and it should be given only one program. Any subsequent programs
+are silently ignored.
.TP
@@ -295,7 +301,14 @@ The
.B create
line gives default values to be used when creating arrays, new members
of arrays, and device entries for arrays.
-These include:
+
+There should only be one
+.B create
+line. Any subsequent lines will override the previous settings.
+
+Keywords used in the
+.I CREATE
+line and supported values are:
.RS 4
.TP
@@ -475,8 +488,8 @@ The known metadata types are
.B AUTO
should be given at most once. Subsequent lines are silently ignored.
-Thus an earlier config file in a config directory will over-ride
-the setting in a later config file.
+Thus a later config file in a config directory will not overwrite
+the setting in an earlier config file.
.TP
.B POLICY
@@ -594,6 +607,7 @@ The
line lists custom values of MD device's sysfs attributes which will be
stored in sysfs after the array is assembled. Multiple lines are allowed and each
line has to contain the uuid or the name of the device to which it relates.
+Lines are applied in reverse order.
.RS 4
.TP
.B uuid=
@@ -621,7 +635,46 @@ is running in
.B \-\-monitor
mode.
.B \-d/\-\-delay
-command line argument takes precedence over the config file
+command line argument takes precedence over the config file.
+
+If multiple
+.B MINITORDELAY
+lines are provided, only first non-zero value is considered.
+
+.SH FILES
+
+.SS {CONFFILE}
+
+The default config file location, used when
+.I mdadm
+is running without --config option.
+
+.SS {CONFFILE}.d
+
+The default directory with config files. Used when
+.I mdadm
+is running without --config option, after successful reading of the
+.B {CONFFILE}
+default config file. Files in that directory
+are read in lexical order.
+
+
+.SS {CONFFILE2}
+
+Alternative config file that is read, when
+.I mdadm
+is running without --config option and the
+.B {CONFFILE}
+default config file was not opened successfully.
+
+.SS {CONFFILE2}.d
+
+The alternative directory with config files. Used when
+.I mdadm
+is runninng without --config option, after reading the
+.B {CONFFILE2}
+alternative config file whether it was successful or not. Files in
+that directory are read in lexical order.
.SH EXAMPLE
DEVICE /dev/sd[bcdjkl]1
--
2.35.3

View File

@ -1,35 +0,0 @@
From 76d505dec6c9f92564553596fc8350324be82463 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.com>
Date: Thu, 6 Dec 2018 10:36:28 +1100
Subject: [PATCH] Grow: report correct new chunk size.
Git-commit: 76d505dec6c9f92564553596fc8350324be82463
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
When using "--grow --chunk=" to change chunk
size, the old chunksize is reported instead of the new.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Grow.c b/Grow.c
index 76f82c075e38..363b209d14a3 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3286,7 +3286,7 @@ static int reshape_array(char *container, int fd, char *devname,
goto release;
} else if (verbose >= 0)
printf("chunk size for %s set to %d\n",
- devname, array.chunk_size);
+ devname, info->new_chunk);
}
unfreeze(st);
return 0;
--
2.14.0.rc0.dirty

View File

@ -0,0 +1,47 @@
From c33bbda5b0e127bb161fd4ad44bcfaa2a5daf153 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 18 Mar 2022 09:26:07 +0100
Subject: [PATCH 09/61] mdadm: Update config manual
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Add missing HOMECLUSTER keyword description.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
mdadm.conf.5.in | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/mdadm.conf.5.in b/mdadm.conf.5.in
index dd331a6..cd4e6a9 100644
--- a/mdadm.conf.5.in
+++ b/mdadm.conf.5.in
@@ -439,6 +439,23 @@ from any possible local name. e.g.
.B /dev/md/1_1
or
.BR /dev/md/home_0 .
+
+.TP
+.B HOMECLUSTER
+The
+.B homcluster
+line gives a default value for the
+.B \-\-homecluster=
+option to mdadm. It specifies the cluster name for the md device.
+The md device can be assembled only on the cluster which matches
+the name specified. If
+.B homcluster
+is not provided, mdadm tries to detect the cluster name automatically.
+
+There should only be one
+.B homecluster
+line. Any subsequent lines will be silently ignored.
+
.TP
.B AUTO
A list of names of metadata format can be given, each preceded by a
--
2.35.3

View File

@ -0,0 +1,156 @@
From 913f07d1db4a0078acc26d6ccabe1c315cf9273c Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Thu, 20 Jan 2022 13:18:32 +0100
Subject: [PATCH 10/61] Create, Build: use default_layout()
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
This code is duplicated for Build mode so make default_layout() extern
and use it. Simplify the function structure.
It introduced change for Build mode, now for raid0 RAID0_ORIG_LAYOUT
will be returned same as for Create.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Build.c | 23 +------------------
Create.c | 67 ++++++++++++++++++++++++++++++++++----------------------
mdadm.h | 1 +
3 files changed, 43 insertions(+), 48 deletions(-)
diff --git a/Build.c b/Build.c
index 962c2e3..8d6f6f5 100644
--- a/Build.c
+++ b/Build.c
@@ -71,28 +71,7 @@ int Build(char *mddev, struct mddev_dev *devlist,
}
if (s->layout == UnSet)
- switch(s->level) {
- default: /* no layout */
- s->layout = 0;
- break;
- case 10:
- s->layout = 0x102; /* near=2, far=1 */
- if (c->verbose > 0)
- pr_err("layout defaults to n1\n");
- break;
- case 5:
- case 6:
- s->layout = map_name(r5layout, "default");
- if (c->verbose > 0)
- pr_err("layout defaults to %s\n", map_num(r5layout, s->layout));
- break;
- case LEVEL_FAULTY:
- s->layout = map_name(faultylayout, "default");
-
- if (c->verbose > 0)
- pr_err("layout defaults to %s\n", map_num(faultylayout, s->layout));
- break;
- }
+ s->layout = default_layout(NULL, s->level, c->verbose);
/* We need to create the device. It can have no name. */
map_lock(&map);
diff --git a/Create.c b/Create.c
index 0ff1922..9ea19de 100644
--- a/Create.c
+++ b/Create.c
@@ -39,39 +39,54 @@ static int round_size_and_verify(unsigned long long *size, int chunk)
return 0;
}
-static int default_layout(struct supertype *st, int level, int verbose)
+/**
+ * default_layout() - Get default layout for level.
+ * @st: metadata requested, could be NULL.
+ * @level: raid level requested.
+ * @verbose: verbose level.
+ *
+ * Try to ask metadata handler first, otherwise use global defaults.
+ *
+ * Return: Layout or &UnSet, return value meaning depends of level used.
+ */
+int default_layout(struct supertype *st, int level, int verbose)
{
int layout = UnSet;
+ mapping_t *layout_map = NULL;
+ char *layout_name = NULL;
if (st && st->ss->default_geometry)
st->ss->default_geometry(st, &level, &layout, NULL);
- if (layout == UnSet)
- switch(level) {
- default: /* no layout */
- layout = 0;
- break;
- case 0:
- layout = RAID0_ORIG_LAYOUT;
- break;
- case 10:
- layout = 0x102; /* near=2, far=1 */
- if (verbose > 0)
- pr_err("layout defaults to n2\n");
- break;
- case 5:
- case 6:
- layout = map_name(r5layout, "default");
- if (verbose > 0)
- pr_err("layout defaults to %s\n", map_num(r5layout, layout));
- break;
- case LEVEL_FAULTY:
- layout = map_name(faultylayout, "default");
+ if (layout != UnSet)
+ return layout;
- if (verbose > 0)
- pr_err("layout defaults to %s\n", map_num(faultylayout, layout));
- break;
- }
+ switch (level) {
+ default: /* no layout */
+ layout = 0;
+ break;
+ case 0:
+ layout = RAID0_ORIG_LAYOUT;
+ break;
+ case 10:
+ layout = 0x102; /* near=2, far=1 */
+ layout_name = "n2";
+ break;
+ case 5:
+ case 6:
+ layout_map = r5layout;
+ break;
+ case LEVEL_FAULTY:
+ layout_map = faultylayout;
+ break;
+ }
+
+ if (layout_map) {
+ layout = map_name(layout_map, "default");
+ layout_name = map_num(layout_map, layout);
+ }
+ if (layout_name && verbose > 0)
+ pr_err("layout defaults to %s\n", layout_name);
return layout;
}
diff --git a/mdadm.h b/mdadm.h
index 26e7e5c..cd72e71 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1512,6 +1512,7 @@ extern int get_linux_version(void);
extern int mdadm_version(char *version);
extern unsigned long long parse_size(char *size);
extern int parse_uuid(char *str, int uuid[4]);
+int default_layout(struct supertype *st, int level, int verbose);
extern int is_near_layout_10(int layout);
extern int parse_layout_10(char *layout);
extern int parse_layout_faulty(char *layout);
--
2.35.3

View File

@ -1,34 +0,0 @@
From 467e6a1b4ece8e552ee638dab7f44a4d235ece1a Mon Sep 17 00:00:00 2001
From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Date: Fri, 7 Dec 2018 12:04:44 +0100
Subject: [PATCH] policy.c: prevent NULL pointer referencing
Git-commit: 467e6a1b4ece8e552ee638dab7f44a4d235ece1a
Patch-mainline: mdadm-4.1
References: bsc#1106078
paths could be NULL and paths[0] should be followed by NULL pointer
checking.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
policy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/policy.c b/policy.c
index fa67d55..e3a0671 100644
--- a/policy.c
+++ b/policy.c
@@ -383,7 +383,7 @@ struct dev_policy *path_policy(char **paths, char *type)
/* Now add any metadata-specific internal knowledge
* about this path
*/
- for (i=0; paths[0] && superlist[i]; i++)
+ for (i=0; paths && paths[0] && superlist[i]; i++)
if (superlist[i]->get_disk_controller_domain) {
const char *d =
superlist[i]->get_disk_controller_domain(
--
2.25.0

View File

@ -0,0 +1,385 @@
From 5f21d67472ad08c1e96b4385254adba79aa1c467 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Thu, 20 Jan 2022 13:18:33 +0100
Subject: [PATCH 11/61] mdadm: add map_num_s()
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
map_num() returns NULL if key is not defined. This patch adds
alternative, non NULL version for cases where NULL is not expected.
There are many printf() calls where map_num() is called on variable
without NULL verification. It works, even if NULL is passed because
gcc is able to ignore NULL argument quietly but the behavior is
undefined. For safety reasons such usages will use map_num_s() now.
It is a potential point of regression.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 6 ++----
Create.c | 2 +-
Detail.c | 4 ++--
Grow.c | 16 ++++++++--------
Query.c | 4 ++--
maps.c | 24 ++++++++++++++++++++++++
mdadm.c | 20 ++++++++++----------
mdadm.h | 2 +-
super-ddf.c | 6 +++---
super-intel.c | 2 +-
super0.c | 2 +-
super1.c | 2 +-
sysfs.c | 9 +++++----
13 files changed, 61 insertions(+), 38 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 704b829..9eac9ce 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -63,7 +63,7 @@ static void set_array_assembly_status(struct context *c,
struct assembly_array_info *arr)
{
int raid_disks = arr->preexist_cnt + arr->new_cnt;
- char *status_msg = map_num(assemble_statuses, status);
+ char *status_msg = map_num_s(assemble_statuses, status);
if (c->export && result)
*result |= status;
@@ -77,9 +77,7 @@ static void set_array_assembly_status(struct context *c,
fprintf(stderr, " (%d new)", arr->new_cnt);
if (arr->exp_cnt)
fprintf(stderr, " ( + %d for expansion)", arr->exp_cnt);
- if (status_msg)
- fprintf(stderr, " %s", status_msg);
- fprintf(stderr, ".\n");
+ fprintf(stderr, " %s.\n", status_msg);
}
static int name_matches(char *found, char *required, char *homehost, int require_homehost)
diff --git a/Create.c b/Create.c
index 9ea19de..c84c1ac 100644
--- a/Create.c
+++ b/Create.c
@@ -83,7 +83,7 @@ int default_layout(struct supertype *st, int level, int verbose)
if (layout_map) {
layout = map_name(layout_map, "default");
- layout_name = map_num(layout_map, layout);
+ layout_name = map_num_s(layout_map, layout);
}
if (layout_name && verbose > 0)
pr_err("layout defaults to %s\n", layout_name);
diff --git a/Detail.c b/Detail.c
index 95d4cc7..ce7a844 100644
--- a/Detail.c
+++ b/Detail.c
@@ -495,8 +495,8 @@ int Detail(char *dev, struct context *c)
if (array.state & (1 << MD_SB_CLEAN)) {
if ((array.level == 0) ||
(array.level == LEVEL_LINEAR))
- arrayst = map_num(sysfs_array_states,
- sra->array_state);
+ arrayst = map_num_s(sysfs_array_states,
+ sra->array_state);
else
arrayst = "clean";
} else {
diff --git a/Grow.c b/Grow.c
index 18c5719..8a242b0 100644
--- a/Grow.c
+++ b/Grow.c
@@ -547,7 +547,7 @@ int Grow_consistency_policy(char *devname, int fd, struct context *c, struct sha
if (s->consistency_policy != CONSISTENCY_POLICY_RESYNC &&
s->consistency_policy != CONSISTENCY_POLICY_PPL) {
pr_err("Operation not supported for consistency policy %s\n",
- map_num(consistency_policies, s->consistency_policy));
+ map_num_s(consistency_policies, s->consistency_policy));
return 1;
}
@@ -578,14 +578,14 @@ int Grow_consistency_policy(char *devname, int fd, struct context *c, struct sha
if (sra->consistency_policy == (unsigned)s->consistency_policy) {
pr_err("Consistency policy is already %s\n",
- map_num(consistency_policies, s->consistency_policy));
+ map_num_s(consistency_policies, s->consistency_policy));
ret = 1;
goto free_info;
} else if (sra->consistency_policy != CONSISTENCY_POLICY_RESYNC &&
sra->consistency_policy != CONSISTENCY_POLICY_PPL) {
pr_err("Current consistency policy is %s, cannot change to %s\n",
- map_num(consistency_policies, sra->consistency_policy),
- map_num(consistency_policies, s->consistency_policy));
+ map_num_s(consistency_policies, sra->consistency_policy),
+ map_num_s(consistency_policies, s->consistency_policy));
ret = 1;
goto free_info;
}
@@ -704,8 +704,8 @@ int Grow_consistency_policy(char *devname, int fd, struct context *c, struct sha
}
ret = sysfs_set_str(sra, NULL, "consistency_policy",
- map_num(consistency_policies,
- s->consistency_policy));
+ map_num_s(consistency_policies,
+ s->consistency_policy));
if (ret)
pr_err("Failed to change array consistency policy\n");
@@ -2241,7 +2241,7 @@ size_change_error:
info.new_layout = UnSet;
if (info.array.level == 6 && info.new_level == UnSet) {
char l[40], *h;
- strcpy(l, map_num(r6layout, info.array.layout));
+ strcpy(l, map_num_s(r6layout, info.array.layout));
h = strrchr(l, '-');
if (h && strcmp(h, "-6") == 0) {
*h = 0;
@@ -2266,7 +2266,7 @@ size_change_error:
info.new_layout = info.array.layout;
else if (info.array.level == 5 && info.new_level == 6) {
char l[40];
- strcpy(l, map_num(r5layout, info.array.layout));
+ strcpy(l, map_num_s(r5layout, info.array.layout));
strcat(l, "-6");
info.new_layout = map_name(r6layout, l);
} else {
diff --git a/Query.c b/Query.c
index 23fbf8a..adcd231 100644
--- a/Query.c
+++ b/Query.c
@@ -93,7 +93,7 @@ int Query(char *dev)
else {
printf("%s: %s %s %d devices, %d spare%s. Use mdadm --detail for more detail.\n",
dev, human_size_brief(larray_size,IEC),
- map_num(pers, level), raid_disks,
+ map_num_s(pers, level), raid_disks,
spare_disks, spare_disks == 1 ? "" : "s");
}
st = guess_super(fd);
@@ -131,7 +131,7 @@ int Query(char *dev)
dev,
info.disk.number, info.array.raid_disks,
activity,
- map_num(pers, info.array.level),
+ map_num_s(pers, info.array.level),
mddev);
if (st->ss == &super0)
put_md_name(mddev);
diff --git a/maps.c b/maps.c
index a4fd279..20fcf71 100644
--- a/maps.c
+++ b/maps.c
@@ -166,6 +166,30 @@ mapping_t sysfs_array_states[] = {
{ NULL, ARRAY_UNKNOWN_STATE }
};
+/**
+ * map_num_s() - Safer alternative of map_num() function.
+ * @map: map to search.
+ * @num: key to match.
+ *
+ * Shall be used only if key existence is quaranted.
+ *
+ * Return: Pointer to name of the element.
+ */
+char *map_num_s(mapping_t *map, int num)
+{
+ char *ret = map_num(map, num);
+
+ assert(ret);
+ return ret;
+}
+
+/**
+ * map_num() - get element name by key.
+ * @map: map to search.
+ * @num: key to match.
+ *
+ * Return: Pointer to name of the element or NULL.
+ */
char *map_num(mapping_t *map, int num)
{
while (map->name) {
diff --git a/mdadm.c b/mdadm.c
index 26299b2..be40686 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -280,8 +280,8 @@ int main(int argc, char *argv[])
else
fprintf(stderr, "-%c", opt);
fprintf(stderr, " would set mdadm mode to \"%s\", but it is already set to \"%s\".\n",
- map_num(modes, newmode),
- map_num(modes, mode));
+ map_num_s(modes, newmode),
+ map_num_s(modes, mode));
exit(2);
} else if (!mode && newmode) {
mode = newmode;
@@ -544,7 +544,7 @@ int main(int argc, char *argv[])
switch(s.level) {
default:
pr_err("layout not meaningful for %s arrays.\n",
- map_num(pers, s.level));
+ map_num_s(pers, s.level));
exit(2);
case UnSet:
pr_err("raid level must be given before layout.\n");
@@ -1248,10 +1248,10 @@ int main(int argc, char *argv[])
if (option_index > 0)
pr_err(":option --%s not valid in %s mode\n",
long_options[option_index].name,
- map_num(modes, mode));
+ map_num_s(modes, mode));
else
pr_err("option -%c not valid in %s mode\n",
- opt, map_num(modes, mode));
+ opt, map_num_s(modes, mode));
exit(2);
}
@@ -1276,7 +1276,7 @@ int main(int argc, char *argv[])
if (s.consistency_policy != CONSISTENCY_POLICY_UNKNOWN &&
s.consistency_policy != CONSISTENCY_POLICY_JOURNAL) {
pr_err("--write-journal is not supported with consistency policy: %s\n",
- map_num(consistency_policies, s.consistency_policy));
+ map_num_s(consistency_policies, s.consistency_policy));
exit(2);
}
}
@@ -1285,12 +1285,12 @@ int main(int argc, char *argv[])
s.consistency_policy != CONSISTENCY_POLICY_UNKNOWN) {
if (s.level <= 0) {
pr_err("--consistency-policy not meaningful with level %s.\n",
- map_num(pers, s.level));
+ map_num_s(pers, s.level));
exit(2);
} else if (s.consistency_policy == CONSISTENCY_POLICY_JOURNAL &&
!s.journaldisks) {
pr_err("--write-journal is required for consistency policy: %s\n",
- map_num(consistency_policies, s.consistency_policy));
+ map_num_s(consistency_policies, s.consistency_policy));
exit(2);
} else if (s.consistency_policy == CONSISTENCY_POLICY_PPL &&
s.level != 5) {
@@ -1300,14 +1300,14 @@ int main(int argc, char *argv[])
(!s.bitmap_file ||
strcmp(s.bitmap_file, "none") == 0)) {
pr_err("--bitmap is required for consistency policy: %s\n",
- map_num(consistency_policies, s.consistency_policy));
+ map_num_s(consistency_policies, s.consistency_policy));
exit(2);
} else if (s.bitmap_file &&
strcmp(s.bitmap_file, "none") != 0 &&
s.consistency_policy != CONSISTENCY_POLICY_BITMAP &&
s.consistency_policy != CONSISTENCY_POLICY_JOURNAL) {
pr_err("--bitmap is not compatible with consistency policy: %s\n",
- map_num(consistency_policies, s.consistency_policy));
+ map_num_s(consistency_policies, s.consistency_policy));
exit(2);
}
}
diff --git a/mdadm.h b/mdadm.h
index cd72e71..09915a0 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -770,7 +770,7 @@ extern int restore_stripes(int *dest, unsigned long long *offsets,
#endif
#define SYSLOG_FACILITY LOG_DAEMON
-
+extern char *map_num_s(mapping_t *map, int num);
extern char *map_num(mapping_t *map, int num);
extern int map_name(mapping_t *map, char *name);
extern mapping_t r0layout[], r5layout[], r6layout[],
diff --git a/super-ddf.c b/super-ddf.c
index 3f304cd..8cda23a 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -1477,13 +1477,13 @@ static void examine_vds(struct ddf_super *sb)
printf("\n");
printf(" unit[%d] : %d\n", i, be16_to_cpu(ve->unit));
printf(" state[%d] : %s, %s%s\n", i,
- map_num(ddf_state, ve->state & 7),
+ map_num_s(ddf_state, ve->state & 7),
(ve->state & DDF_state_morphing) ? "Morphing, ": "",
(ve->state & DDF_state_inconsistent)? "Not Consistent" : "Consistent");
printf(" init state[%d] : %s\n", i,
- map_num(ddf_init_state, ve->init_state&DDF_initstate_mask));
+ map_num_s(ddf_init_state, ve->init_state & DDF_initstate_mask));
printf(" access[%d] : %s\n", i,
- map_num(ddf_access, (ve->init_state & DDF_access_mask) >> 6));
+ map_num_s(ddf_access, (ve->init_state & DDF_access_mask) >> 6));
printf(" Name[%d] : %.16s\n", i, ve->name);
examine_vd(i, sb, ve->guid);
}
diff --git a/super-intel.c b/super-intel.c
index 6ff336e..ba3bd41 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5625,7 +5625,7 @@ static int init_super_imsm_volume(struct supertype *st, mdu_array_info_t *info,
free(dev);
free(dv);
pr_err("imsm does not support consistency policy %s\n",
- map_num(consistency_policies, s->consistency_policy));
+ map_num_s(consistency_policies, s->consistency_policy));
return 0;
}
diff --git a/super0.c b/super0.c
index b79b97a..61c9ec1 100644
--- a/super0.c
+++ b/super0.c
@@ -288,7 +288,7 @@ static void export_examine_super0(struct supertype *st)
{
mdp_super_t *sb = st->sb;
- printf("MD_LEVEL=%s\n", map_num(pers, sb->level));
+ printf("MD_LEVEL=%s\n", map_num_s(pers, sb->level));
printf("MD_DEVICES=%d\n", sb->raid_disks);
if (sb->minor_version >= 90)
printf("MD_UUID=%08x:%08x:%08x:%08x\n",
diff --git a/super1.c b/super1.c
index a12a5bc..e3e2f95 100644
--- a/super1.c
+++ b/super1.c
@@ -671,7 +671,7 @@ static void export_examine_super1(struct supertype *st)
int len = 32;
int layout;
- printf("MD_LEVEL=%s\n", map_num(pers, __le32_to_cpu(sb->level)));
+ printf("MD_LEVEL=%s\n", map_num_s(pers, __le32_to_cpu(sb->level)));
printf("MD_DEVICES=%d\n", __le32_to_cpu(sb->raid_disks));
for (i = 0; i < 32; i++)
if (sb->set_name[i] == '\n' || sb->set_name[i] == '\0') {
diff --git a/sysfs.c b/sysfs.c
index 2995713..0d98a65 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -689,7 +689,7 @@ int sysfs_set_array(struct mdinfo *info, int vers)
if (info->array.level < 0)
return 0; /* FIXME */
rv |= sysfs_set_str(info, NULL, "level",
- map_num(pers, info->array.level));
+ map_num_s(pers, info->array.level));
if (info->reshape_active && info->delta_disks != UnSet)
raid_disks -= info->delta_disks;
rv |= sysfs_set_num(info, NULL, "raid_disks", raid_disks);
@@ -724,9 +724,10 @@ int sysfs_set_array(struct mdinfo *info, int vers)
}
if (info->consistency_policy == CONSISTENCY_POLICY_PPL) {
- if (sysfs_set_str(info, NULL, "consistency_policy",
- map_num(consistency_policies,
- info->consistency_policy))) {
+ char *policy = map_num_s(consistency_policies,
+ info->consistency_policy);
+
+ if (sysfs_set_str(info, NULL, "consistency_policy", policy)) {
pr_err("This kernel does not support PPL. Falling back to consistency-policy=resync.\n");
info->consistency_policy = CONSISTENCY_POLICY_RESYNC;
}
--
2.35.3

View File

@ -0,0 +1,125 @@
From 1066ab83dbe9a4cc20f7db44a40aa2cbb9d5eed6 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 13 May 2022 09:19:42 +0200
Subject: [PATCH 13/61] mdmon: Stop parsing duplicate options
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Introduce new function is_duplicate_opt() to check if given option
was already used and prevent setting it again along with an error
message.
Move parsing above in_initrd() check to be able to detect --offroot
option duplicates.
Now help option is executed after parsing to prevent executing commands
like: 'mdmon --help --ndlksnlksajndfjksndafasj'.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdmon.c | 44 +++++++++++++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 9 deletions(-)
diff --git a/mdmon.c b/mdmon.c
index 5570574..c057da6 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -288,6 +288,15 @@ void usage(void)
exit(2);
}
+static bool is_duplicate_opt(const int opt, const int set_val, const char *long_name)
+{
+ if (opt == set_val) {
+ pr_err("--%s option duplicated!\n", long_name);
+ return true;
+ }
+ return false;
+}
+
static int mdmon(char *devnm, int must_fork, int takeover);
int main(int argc, char *argv[])
@@ -299,6 +308,7 @@ int main(int argc, char *argv[])
int all = 0;
int takeover = 0;
int dofork = 1;
+ bool help = false;
static struct option options[] = {
{"all", 0, NULL, 'a'},
{"takeover", 0, NULL, 't'},
@@ -308,37 +318,50 @@ int main(int argc, char *argv[])
{NULL, 0, NULL, 0}
};
- if (in_initrd()) {
- /*
- * set first char of argv[0] to @. This is used by
- * systemd to signal that the task was launched from
- * initrd/initramfs and should be preserved during shutdown
- */
- argv[0][0] = '@';
- }
-
while ((opt = getopt_long(argc, argv, "thaF", options, NULL)) != -1) {
switch (opt) {
case 'a':
+ if (is_duplicate_opt(all, 1, "all"))
+ exit(1);
container_name = argv[optind-1];
all = 1;
break;
case 't':
+ if (is_duplicate_opt(takeover, 1, "takeover"))
+ exit(1);
takeover = 1;
break;
case 'F':
+ if (is_duplicate_opt(dofork, 0, "foreground"))
+ exit(1);
dofork = 0;
break;
case OffRootOpt:
+ if (is_duplicate_opt(argv[0][0], '@', "offroot"))
+ exit(1);
argv[0][0] = '@';
break;
case 'h':
+ if (is_duplicate_opt(help, true, "help"))
+ exit(1);
+ help = true;
+ break;
default:
usage();
break;
}
}
+
+ if (in_initrd()) {
+ /*
+ * set first char of argv[0] to @. This is used by
+ * systemd to signal that the task was launched from
+ * initrd/initramfs and should be preserved during shutdown
+ */
+ argv[0][0] = '@';
+ }
+
if (all == 0 && container_name == NULL) {
if (argv[optind])
container_name = argv[optind];
@@ -353,6 +376,9 @@ int main(int argc, char *argv[])
if (strcmp(container_name, "/proc/mdstat") == 0)
all = 1;
+ if (help)
+ usage();
+
if (all) {
struct mdstat_ent *mdstat, *e;
int container_len = strlen(container_name);
--
2.35.3

View File

@ -1,38 +0,0 @@
From 757e55435997e355ee9b03e5d913b5496a3c39a8 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Tue, 11 Dec 2018 15:04:07 +0100
Subject: [PATCH] policy.c: Fix for compiler error
Git-commit: 757e55435997e355ee9b03e5d913b5496a3c39a8
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
After cd72f9d(policy: support devices with multiple paths.) compilation
on old compilers fails because "p may be used uninitialized
in this function".
Initialize it with NULL to prevent this.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
policy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/policy.c b/policy.c
index e3a0671..3c53bd3 100644
--- a/policy.c
+++ b/policy.c
@@ -268,7 +268,7 @@ static int pol_match(struct rule *rule, char **paths, char *type, char **part)
for (; rule; rule = rule->next) {
if (rule->name == rule_path) {
- char *p;
+ char *p = NULL;
int i;
if (pathok == 0)
pathok = -1;
--
2.25.0

View File

@ -0,0 +1,44 @@
From 20e114e334ed6ed3280c37a9a08fb95578393d1a Mon Sep 17 00:00:00 2001
From: Mateusz Kusiak <mateusz.kusiak@intel.com>
Date: Thu, 19 May 2022 09:16:08 +0200
Subject: [PATCH 14/61] Grow: block -n on external volumes.
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Performing --raid-devices on external metadata volume should be blocked
as it causes unwanted behaviour.
Eg. Performing
mdadm -G /dev/md/volume -l10 -n4
on r0_d2 inside 4 disk container, returns
mdadm: Need 2 spares to avoid degraded array, only have 0.
Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/Grow.c b/Grow.c
index 8a242b0..f6efbc4 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1892,6 +1892,14 @@ int Grow_reshape(char *devname, int fd,
if (retval) {
pr_err("Cannot read superblock for %s\n", devname);
+ close(cfd);
+ free(subarray);
+ return 1;
+ }
+
+ if (s->raiddisks && subarray) {
+ pr_err("--raid-devices operation can be performed on a container only\n");
+ close(cfd);
free(subarray);
return 1;
}
--
2.35.3

View File

@ -1,98 +0,0 @@
From a4e96fd8f3f0b5416783237c1cb6ee87e7eff23d Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Fri, 8 Feb 2019 11:07:10 +0100
Subject: [PATCH] imsm: finish recovery when drive with rebuild fails
Git-commit: a4e96fd8f3f0b5416783237c1cb6ee87e7eff23d
Patch-mainline: mdadm-4.1-12
References: bsc#1126975
Commit d7a1fda2769b ("imsm: update metadata correctly while raid10 double
degradation") resolves main Imsm double degradation problems but it
omits one case. Now metadata hangs in the rebuilding state if the drive
under rebuild is removed during recovery from double degradation.
The root cause of this problem is comparing new map_state with current
and if they both are degraded assuming that nothing new happens.
Don't rely on map states, just check if device is failed. If the drive
under rebuild fails then finish migration, in other cases update map
state only (second fail means that destination map state can't be normal).
To avoid problems with reassembling move end_migration (called after
double degradation successful recovery) after check if recovery really
finished, for details see (7ce057018 "imsm: fix: rebuild does not
continue after reboot").
Remove redundant code responsible for finishing rebuild process. Function
end_migration do exactly the same. Set last_checkpoint to 0, to prepare
it for the next rebuild.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index d2035cc..38a1b6c 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8560,26 +8560,22 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
if (is_rebuilding(dev)) {
dprintf_cont("while rebuilding ");
- if (map->map_state != map_state) {
- dprintf_cont("map state change ");
+ if (state & DS_FAULTY) {
+ dprintf_cont("removing failed drive ");
if (n == map->failed_disk_num) {
dprintf_cont("end migration");
end_migration(dev, super, map_state);
+ a->last_checkpoint = 0;
} else {
- dprintf_cont("raid10 double degradation, map state change");
+ dprintf_cont("fail detected during rebuild, changing map state");
map->map_state = map_state;
}
super->updates_pending++;
- } else if (!rebuild_done)
- break;
- else if (n == map->failed_disk_num) {
- /* r10 double degraded to degraded transition */
- dprintf_cont("raid10 double degradation end migration");
- end_migration(dev, super, map_state);
- a->last_checkpoint = 0;
- super->updates_pending++;
}
+ if (!rebuild_done)
+ break;
+
/* check if recovery is really finished */
for (mdi = a->info.devs; mdi ; mdi = mdi->next)
if (mdi->recovery_start != MaxSector) {
@@ -8588,7 +8584,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
if (recovery_not_finished) {
dprintf_cont("\n");
- dprintf_cont("Rebuild has not finished yet, map state changes only if raid10 double degradation happens");
+ dprintf_cont("Rebuild has not finished yet");
if (a->last_checkpoint < mdi->recovery_start) {
a->last_checkpoint =
mdi->recovery_start;
@@ -8598,9 +8594,9 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
dprintf_cont(" Rebuild done, still degraded");
- dev->vol.migr_state = 0;
- set_migr_type(dev, 0);
- dev->vol.curr_migr_unit = 0;
+ end_migration(dev, super, map_state);
+ a->last_checkpoint = 0;
+ super->updates_pending++;
for (i = 0; i < map->num_members; i++) {
int idx = get_imsm_ord_tbl_ent(dev, i, MAP_0);
--
2.16.4

View File

@ -0,0 +1,93 @@
From de064c93e3819d72720e4fba6575265ba10e1553 Mon Sep 17 00:00:00 2001
From: Mateusz Grzonka <mateusz.grzonka@intel.com>
Date: Mon, 13 Jun 2022 12:11:25 +0200
Subject: [PATCH 15/61] Incremental: Fix possible memory and resource leaks
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
map allocated through map_by_uuid() is not freed if mdfd is invalid.
In addition mdfd is not closed, and mdinfo list is not freed too.
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Change-Id: I25e726f0e2502cf7e8ce80c2bd7944b3b1e2b9dc
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Incremental.c | 32 +++++++++++++++++++++++---------
1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/Incremental.c b/Incremental.c
index a57fc32..4d0cd9d 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -1499,7 +1499,7 @@ static int Incremental_container(struct supertype *st, char *devname,
return 0;
}
for (ra = list ; ra ; ra = ra->next) {
- int mdfd;
+ int mdfd = -1;
char chosen_name[1024];
struct map_ent *mp;
struct mddev_ident *match = NULL;
@@ -1514,6 +1514,12 @@ static int Incremental_container(struct supertype *st, char *devname,
if (mp) {
mdfd = open_dev(mp->devnm);
+ if (!is_fd_valid(mdfd)) {
+ pr_err("failed to open %s: %s.\n",
+ mp->devnm, strerror(errno));
+ rv = 2;
+ goto release;
+ }
if (mp->path)
strcpy(chosen_name, mp->path);
else
@@ -1573,21 +1579,25 @@ static int Incremental_container(struct supertype *st, char *devname,
c->autof,
trustworthy,
chosen_name, 0);
+
+ if (!is_fd_valid(mdfd)) {
+ pr_err("create_mddev failed with chosen name %s: %s.\n",
+ chosen_name, strerror(errno));
+ rv = 2;
+ goto release;
+ }
}
- if (only && (!mp || strcmp(mp->devnm, only) != 0))
- continue;
- if (mdfd < 0) {
- pr_err("failed to open %s: %s.\n",
- chosen_name, strerror(errno));
- return 2;
+ if (only && (!mp || strcmp(mp->devnm, only) != 0)) {
+ close_fd(&mdfd);
+ continue;
}
assemble_container_content(st, mdfd, ra, c,
chosen_name, &result);
map_free(map);
map = NULL;
- close(mdfd);
+ close_fd(&mdfd);
}
if (c->export && result) {
char sep = '=';
@@ -1610,7 +1620,11 @@ static int Incremental_container(struct supertype *st, char *devname,
}
printf("\n");
}
- return 0;
+
+release:
+ map_free(map);
+ sysfs_free(list);
+ return rv;
}
static void run_udisks(char *arg1, char *arg2)
--
2.35.3

View File

@ -1,327 +0,0 @@
From 9f4218274cd4a1e1f356a1617f9a1d09960cf255 Mon Sep 17 00:00:00 2001
From: Pawel Baldysiak <pawel.baldysiak@intel.com>
Date: Mon, 28 Jan 2019 17:10:41 +0100
Subject: [PATCH] imsm: fix reshape for >2TB drives
Git-commit: 9f4218274cd4a1e1f356a1617f9a1d09960cf255
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
If reshape is performed on drives larger then 2 TB,
migration checkpoint area that is calculated exeeds 32-bit value.
This checkpoint area is a reserved space threated as backup
during reshape - at the end of the drive, right before metadata.
As a result - wrong space is used and the data that may exists there
is overwritten.
Adding additional field to migration record to track high order 32-bits
of pba of this area. Three other fields that may exceed 32-bit value
for large drives are added as well.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 149 ++++++++++++++++++++++++++++++++++++--------------
1 file changed, 107 insertions(+), 42 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 38a1b6c..1cc7d5f 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -296,7 +296,7 @@ struct migr_record {
__u32 rec_status; /* Status used to determine how to restart
* migration in case it aborts
* in some fashion */
- __u32 curr_migr_unit; /* 0..numMigrUnits-1 */
+ __u32 curr_migr_unit_lo; /* 0..numMigrUnits-1 */
__u32 family_num; /* Family number of MPB
* containing the RaidDev
* that is migrating */
@@ -306,16 +306,23 @@ struct migr_record {
__u32 dest_depth_per_unit; /* Num member blocks each destMap
* member disk
* advances per unit-of-operation */
- __u32 ckpt_area_pba; /* Pba of first block of ckpt copy area */
- __u32 dest_1st_member_lba; /* First member lba on first
- * stripe of destination */
- __u32 num_migr_units; /* Total num migration units-of-op */
+ __u32 ckpt_area_pba_lo; /* Pba of first block of ckpt copy area */
+ __u32 dest_1st_member_lba_lo; /* First member lba on first
+ * stripe of destination */
+ __u32 num_migr_units_lo; /* Total num migration units-of-op */
__u32 post_migr_vol_cap; /* Size of volume after
* migration completes */
__u32 post_migr_vol_cap_hi; /* Expansion space for LBA64 */
__u32 ckpt_read_disk_num; /* Which member disk in destSubMap[0] the
* migration ckpt record was read from
* (for recovered migrations) */
+ __u32 curr_migr_unit_hi; /* 0..numMigrUnits-1 high order 32 bits */
+ __u32 ckpt_area_pba_hi; /* Pba of first block of ckpt copy area
+ * high order 32 bits */
+ __u32 dest_1st_member_lba_hi; /* First member lba on first stripe of
+ * destination - high order 32 bits */
+ __u32 num_migr_units_hi; /* Total num migration units-of-op
+ * high order 32 bits */
} __attribute__ ((__packed__));
struct md_list {
@@ -1208,6 +1215,38 @@ static unsigned long long imsm_dev_size(struct imsm_dev *dev)
return join_u32(dev->size_low, dev->size_high);
}
+static unsigned long long migr_chkp_area_pba(struct migr_record *migr_rec)
+{
+ if (migr_rec == NULL)
+ return 0;
+ return join_u32(migr_rec->ckpt_area_pba_lo,
+ migr_rec->ckpt_area_pba_hi);
+}
+
+static unsigned long long current_migr_unit(struct migr_record *migr_rec)
+{
+ if (migr_rec == NULL)
+ return 0;
+ return join_u32(migr_rec->curr_migr_unit_lo,
+ migr_rec->curr_migr_unit_hi);
+}
+
+static unsigned long long migr_dest_1st_member_lba(struct migr_record *migr_rec)
+{
+ if (migr_rec == NULL)
+ return 0;
+ return join_u32(migr_rec->dest_1st_member_lba_lo,
+ migr_rec->dest_1st_member_lba_hi);
+}
+
+static unsigned long long get_num_migr_units(struct migr_record *migr_rec)
+{
+ if (migr_rec == NULL)
+ return 0;
+ return join_u32(migr_rec->num_migr_units_lo,
+ migr_rec->num_migr_units_hi);
+}
+
static void set_total_blocks(struct imsm_disk *disk, unsigned long long n)
{
split_ull(n, &disk->total_blocks_lo, &disk->total_blocks_hi);
@@ -1233,6 +1272,33 @@ static void set_imsm_dev_size(struct imsm_dev *dev, unsigned long long n)
split_ull(n, &dev->size_low, &dev->size_high);
}
+static void set_migr_chkp_area_pba(struct migr_record *migr_rec,
+ unsigned long long n)
+{
+ split_ull(n, &migr_rec->ckpt_area_pba_lo, &migr_rec->ckpt_area_pba_hi);
+}
+
+static void set_current_migr_unit(struct migr_record *migr_rec,
+ unsigned long long n)
+{
+ split_ull(n, &migr_rec->curr_migr_unit_lo,
+ &migr_rec->curr_migr_unit_hi);
+}
+
+static void set_migr_dest_1st_member_lba(struct migr_record *migr_rec,
+ unsigned long long n)
+{
+ split_ull(n, &migr_rec->dest_1st_member_lba_lo,
+ &migr_rec->dest_1st_member_lba_hi);
+}
+
+static void set_num_migr_units(struct migr_record *migr_rec,
+ unsigned long long n)
+{
+ split_ull(n, &migr_rec->num_migr_units_lo,
+ &migr_rec->num_migr_units_hi);
+}
+
static unsigned long long per_dev_array_size(struct imsm_map *map)
{
unsigned long long array_size = 0;
@@ -1629,12 +1695,14 @@ void convert_to_4k_imsm_migr_rec(struct intel_super *super)
struct migr_record *migr_rec = super->migr_rec;
migr_rec->blocks_per_unit /= IMSM_4K_DIV;
- migr_rec->ckpt_area_pba /= IMSM_4K_DIV;
- migr_rec->dest_1st_member_lba /= IMSM_4K_DIV;
migr_rec->dest_depth_per_unit /= IMSM_4K_DIV;
split_ull((join_u32(migr_rec->post_migr_vol_cap,
migr_rec->post_migr_vol_cap_hi) / IMSM_4K_DIV),
&migr_rec->post_migr_vol_cap, &migr_rec->post_migr_vol_cap_hi);
+ set_migr_chkp_area_pba(migr_rec,
+ migr_chkp_area_pba(migr_rec) / IMSM_4K_DIV);
+ set_migr_dest_1st_member_lba(migr_rec,
+ migr_dest_1st_member_lba(migr_rec) / IMSM_4K_DIV);
}
void convert_to_4k_imsm_disk(struct imsm_disk *disk)
@@ -1727,8 +1795,8 @@ void examine_migr_rec_imsm(struct intel_super *super)
printf("Normal\n");
else
printf("Contains Data\n");
- printf(" Current Unit : %u\n",
- __le32_to_cpu(migr_rec->curr_migr_unit));
+ printf(" Current Unit : %llu\n",
+ current_migr_unit(migr_rec));
printf(" Family : %u\n",
__le32_to_cpu(migr_rec->family_num));
printf(" Ascending : %u\n",
@@ -1737,16 +1805,15 @@ void examine_migr_rec_imsm(struct intel_super *super)
__le32_to_cpu(migr_rec->blocks_per_unit));
printf(" Dest. Depth Per Unit : %u\n",
__le32_to_cpu(migr_rec->dest_depth_per_unit));
- printf(" Checkpoint Area pba : %u\n",
- __le32_to_cpu(migr_rec->ckpt_area_pba));
- printf(" First member lba : %u\n",
- __le32_to_cpu(migr_rec->dest_1st_member_lba));
- printf(" Total Number of Units : %u\n",
- __le32_to_cpu(migr_rec->num_migr_units));
- printf(" Size of volume : %u\n",
- __le32_to_cpu(migr_rec->post_migr_vol_cap));
- printf(" Expansion space for LBA64 : %u\n",
- __le32_to_cpu(migr_rec->post_migr_vol_cap_hi));
+ printf(" Checkpoint Area pba : %llu\n",
+ migr_chkp_area_pba(migr_rec));
+ printf(" First member lba : %llu\n",
+ migr_dest_1st_member_lba(migr_rec));
+ printf(" Total Number of Units : %llu\n",
+ get_num_migr_units(migr_rec));
+ printf(" Size of volume : %llu\n",
+ join_u32(migr_rec->post_migr_vol_cap,
+ migr_rec->post_migr_vol_cap_hi));
printf(" Record was read from : %u\n",
__le32_to_cpu(migr_rec->ckpt_read_disk_num));
@@ -1759,13 +1826,15 @@ void convert_from_4k_imsm_migr_rec(struct intel_super *super)
struct migr_record *migr_rec = super->migr_rec;
migr_rec->blocks_per_unit *= IMSM_4K_DIV;
- migr_rec->ckpt_area_pba *= IMSM_4K_DIV;
- migr_rec->dest_1st_member_lba *= IMSM_4K_DIV;
migr_rec->dest_depth_per_unit *= IMSM_4K_DIV;
split_ull((join_u32(migr_rec->post_migr_vol_cap,
migr_rec->post_migr_vol_cap_hi) * IMSM_4K_DIV),
&migr_rec->post_migr_vol_cap,
&migr_rec->post_migr_vol_cap_hi);
+ set_migr_chkp_area_pba(migr_rec,
+ migr_chkp_area_pba(migr_rec) * IMSM_4K_DIV);
+ set_migr_dest_1st_member_lba(migr_rec,
+ migr_dest_1st_member_lba(migr_rec) * IMSM_4K_DIV);
}
void convert_from_4k(struct intel_super *super)
@@ -3096,7 +3165,7 @@ static int imsm_create_metadata_checkpoint_update(
return 0;
}
(*u)->type = update_general_migration_checkpoint;
- (*u)->curr_migr_unit = __le32_to_cpu(super->migr_rec->curr_migr_unit);
+ (*u)->curr_migr_unit = current_migr_unit(super->migr_rec);
dprintf("prepared for %u\n", (*u)->curr_migr_unit);
return update_memory_size;
@@ -3397,13 +3466,13 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info,
case MIGR_GEN_MIGR: {
__u64 blocks_per_unit = blocks_per_migr_unit(super,
dev);
- __u64 units = __le32_to_cpu(migr_rec->curr_migr_unit);
+ __u64 units = current_migr_unit(migr_rec);
unsigned long long array_blocks;
int used_disks;
if (__le32_to_cpu(migr_rec->ascending_migr) &&
(units <
- (__le32_to_cpu(migr_rec->num_migr_units)-1)) &&
+ (get_num_migr_units(migr_rec)-1)) &&
(super->migr_rec->rec_status ==
__cpu_to_le32(UNIT_SRC_IN_CP_AREA)))
units++;
@@ -10697,7 +10766,7 @@ void init_migr_record_imsm(struct supertype *st, struct imsm_dev *dev,
if (array_blocks % __le32_to_cpu(migr_rec->blocks_per_unit))
num_migr_units++;
- migr_rec->num_migr_units = __cpu_to_le32(num_migr_units);
+ set_num_migr_units(migr_rec, num_migr_units);
migr_rec->post_migr_vol_cap = dev->size_low;
migr_rec->post_migr_vol_cap_hi = dev->size_high;
@@ -10714,7 +10783,7 @@ void init_migr_record_imsm(struct supertype *st, struct imsm_dev *dev,
min_dev_sectors = dev_sectors;
close(fd);
}
- migr_rec->ckpt_area_pba = __cpu_to_le32(min_dev_sectors -
+ set_migr_chkp_area_pba(migr_rec, min_dev_sectors -
RAID_DISK_RESERVED_BLOCKS_IMSM_HI);
write_imsm_migr_rec(st);
@@ -10765,8 +10834,7 @@ int save_backup_imsm(struct supertype *st,
start = info->reshape_progress * 512;
for (i = 0; i < new_disks; i++) {
- target_offsets[i] = (unsigned long long)
- __le32_to_cpu(super->migr_rec->ckpt_area_pba) * 512;
+ target_offsets[i] = migr_chkp_area_pba(super->migr_rec) * 512;
/* move back copy area adderss, it will be moved forward
* in restore_stripes() using start input variable
*/
@@ -10845,12 +10913,11 @@ int save_checkpoint_imsm(struct supertype *st, struct mdinfo *info, int state)
if (info->reshape_progress % blocks_per_unit)
curr_migr_unit++;
- super->migr_rec->curr_migr_unit =
- __cpu_to_le32(curr_migr_unit);
+ set_current_migr_unit(super->migr_rec, curr_migr_unit);
super->migr_rec->rec_status = __cpu_to_le32(state);
- super->migr_rec->dest_1st_member_lba =
- __cpu_to_le32(curr_migr_unit *
- __le32_to_cpu(super->migr_rec->dest_depth_per_unit));
+ set_migr_dest_1st_member_lba(super->migr_rec,
+ super->migr_rec->dest_depth_per_unit * curr_migr_unit);
+
if (write_imsm_migr_rec(st) < 0) {
dprintf("imsm: Cannot write migration record outside backup area\n");
return 1;
@@ -10884,8 +10951,8 @@ int recover_backup_imsm(struct supertype *st, struct mdinfo *info)
char *buf = NULL;
int retval = 1;
unsigned int sector_size = super->sector_size;
- unsigned long curr_migr_unit = __le32_to_cpu(migr_rec->curr_migr_unit);
- unsigned long num_migr_units = __le32_to_cpu(migr_rec->num_migr_units);
+ unsigned long curr_migr_unit = current_migr_unit(migr_rec);
+ unsigned long num_migr_units = get_num_migr_units(migr_rec);
char buffer[20];
int skipped_disks = 0;
@@ -10912,11 +10979,9 @@ int recover_backup_imsm(struct supertype *st, struct mdinfo *info)
map_dest = get_imsm_map(id->dev, MAP_0);
new_disks = map_dest->num_members;
- read_offset = (unsigned long long)
- __le32_to_cpu(migr_rec->ckpt_area_pba) * 512;
+ read_offset = migr_chkp_area_pba(migr_rec) * 512;
- write_offset = ((unsigned long long)
- __le32_to_cpu(migr_rec->dest_1st_member_lba) +
+ write_offset = (migr_dest_1st_member_lba(migr_rec) +
pba_of_lba0(map_dest)) * 512;
unit_len = __le32_to_cpu(migr_rec->dest_depth_per_unit) * 512;
@@ -12019,12 +12084,12 @@ static int imsm_manage_reshape(
max_position = sra->component_size * ndata;
source_layout = imsm_level_to_layout(map_src->raid_level);
- while (__le32_to_cpu(migr_rec->curr_migr_unit) <
- __le32_to_cpu(migr_rec->num_migr_units)) {
+ while (current_migr_unit(migr_rec) <
+ get_num_migr_units(migr_rec)) {
/* current reshape position [blocks] */
unsigned long long current_position =
__le32_to_cpu(migr_rec->blocks_per_unit)
- * __le32_to_cpu(migr_rec->curr_migr_unit);
+ * current_migr_unit(migr_rec);
unsigned long long border;
/* Check that array hasn't become failed.
--
2.25.0

View File

@ -1,106 +0,0 @@
From ebf3be9931f31df54df52b1821479e6a80a4d9c6 Mon Sep 17 00:00:00 2001
From: Dimitri John Ledkov <xnox@ubuntu.com>
Date: Tue, 15 Jan 2019 19:08:37 +0000
Subject: [PATCH] Fix spelling typos.
Git-commit: ebf3be9931f31df54df52b1821479e6a80a4d9c6
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 2 +-
Create.c | 2 +-
Grow.c | 6 +++---
super-ddf.c | 2 +-
super-intel.c | 2 +-
5 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 9f75c68..9f050c1 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -879,7 +879,7 @@ static int force_array(struct mdinfo *content,
current_events = devices[chosen_drive].i.events;
add_another:
if (c->verbose >= 0)
- pr_err("forcing event count in %s(%d) from %d upto %d\n",
+ pr_err("forcing event count in %s(%d) from %d up to %d\n",
devices[chosen_drive].devname,
devices[chosen_drive].i.disk.raid_disk,
(int)(devices[chosen_drive].i.events),
diff --git a/Create.c b/Create.c
index 04b1dfc..6f1b228 100644
--- a/Create.c
+++ b/Create.c
@@ -823,7 +823,7 @@ int Create(struct supertype *st, char *mddev,
}
bitmap_fd = open(s->bitmap_file, O_RDWR);
if (bitmap_fd < 0) {
- pr_err("weird: %s cannot be openned\n",
+ pr_err("weird: %s cannot be opened\n",
s->bitmap_file);
goto abort_locked;
}
diff --git a/Grow.c b/Grow.c
index 363b209..6d32661 100644
--- a/Grow.c
+++ b/Grow.c
@@ -446,7 +446,7 @@ int Grow_addbitmap(char *devname, int fd, struct context *c, struct shape *s)
if (offset_setable) {
st->ss->getinfo_super(st, mdi, NULL);
if (sysfs_init(mdi, fd, NULL)) {
- pr_err("failed to intialize sysfs.\n");
+ pr_err("failed to initialize sysfs.\n");
free(mdi);
}
rv = sysfs_set_num_signed(mdi, NULL, "bitmap/location",
@@ -2178,7 +2178,7 @@ size_change_error:
memset(&info, 0, sizeof(info));
info.array = array;
if (sysfs_init(&info, fd, NULL)) {
- pr_err("failed to intialize sysfs.\n");
+ pr_err("failed to initialize sysfs.\n");
rv = 1;
goto release;
}
@@ -2903,7 +2903,7 @@ static int impose_level(int fd, int level, char *devname, int verbose)
struct mdinfo info;
if (sysfs_init(&info, fd, NULL)) {
- pr_err("failed to intialize sysfs.\n");
+ pr_err("failed to initialize sysfs.\n");
return 1;
}
diff --git a/super-ddf.c b/super-ddf.c
index 618542c..c095e8a 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -1900,7 +1900,7 @@ static struct vd_config *find_vdcr(struct ddf_super *ddf, unsigned int inst,
return conf;
}
bad:
- pr_err("Could't find disk %d in array %u\n", n, inst);
+ pr_err("Couldn't find disk %d in array %u\n", n, inst);
return NULL;
}
diff --git a/super-intel.c b/super-intel.c
index 1cc7d5f..c399433 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -10034,7 +10034,7 @@ static void imsm_process_update(struct supertype *st,
break;
}
default:
- pr_err("error: unsuported process update type:(type: %d)\n", type);
+ pr_err("error: unsupported process update type:(type: %d)\n", type);
}
}
--
2.25.0

View File

@ -0,0 +1,101 @@
From e702f392959d1c2ad2089e595b52235ed97b4e18 Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Mon, 6 Jun 2022 12:32:12 +0200
Subject: [PATCH 16/61] Mdmonitor: Fix segfault
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Mdadm with "--monitor" parameter requires md device
as an argument to be monitored. If given argument is
not a md device, error shall be returned. Previously
it was not checked and invalid argument caused
segmentation fault. This commit adds checking
that devices passed to mdmonitor are md devices.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Monitor.c | 10 +++++++++-
mdadm.h | 1 +
mdopen.c | 17 +++++++++++++++++
3 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/Monitor.c b/Monitor.c
index c0ab541..4e5802b 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -182,6 +182,7 @@ int Monitor(struct mddev_dev *devlist,
continue;
if (strcasecmp(mdlist->devname, "<ignore>") == 0)
continue;
+
st = xcalloc(1, sizeof *st);
if (mdlist->devname[0] == '/')
st->devname = xstrdup(mdlist->devname);
@@ -190,6 +191,8 @@ int Monitor(struct mddev_dev *devlist,
strcpy(strcpy(st->devname, "/dev/md/"),
mdlist->devname);
}
+ if (!is_mddev(mdlist->devname))
+ return 1;
st->next = statelist;
st->devnm[0] = 0;
st->percent = RESYNC_UNKNOWN;
@@ -203,7 +206,12 @@ int Monitor(struct mddev_dev *devlist,
struct mddev_dev *dv;
for (dv = devlist; dv; dv = dv->next) {
- struct state *st = xcalloc(1, sizeof *st);
+ struct state *st;
+
+ if (!is_mddev(dv->devname))
+ return 1;
+
+ st = xcalloc(1, sizeof *st);
mdlist = conf_get_ident(dv->devname);
st->devname = xstrdup(dv->devname);
st->next = statelist;
diff --git a/mdadm.h b/mdadm.h
index 09915a0..d53df16 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1636,6 +1636,7 @@ extern int create_mddev(char *dev, char *name, int autof, int trustworthy,
#define FOREIGN 2
#define METADATA 3
extern int open_mddev(char *dev, int report_errors);
+extern int is_mddev(char *dev);
extern int open_container(int fd);
extern int metadata_container_matches(char *metadata, char *devnm);
extern int metadata_subdev_matches(char *metadata, char *devnm);
diff --git a/mdopen.c b/mdopen.c
index 245be53..d18c931 100644
--- a/mdopen.c
+++ b/mdopen.c
@@ -475,6 +475,23 @@ int open_mddev(char *dev, int report_errors)
return mdfd;
}
+/**
+ * is_mddev() - check that file name passed is an md device.
+ * @dev: file name that has to be checked.
+ * Return: 1 if file passed is an md device, 0 if not.
+ */
+int is_mddev(char *dev)
+{
+ int fd = open_mddev(dev, 1);
+
+ if (fd >= 0) {
+ close(fd);
+ return 1;
+ }
+
+ return 0;
+}
+
char *find_free_devnm(int use_partitions)
{
static char devnm[32];
--
2.35.3

View File

@ -1,49 +0,0 @@
From e3615ecb5b6ad8eb408296878aad5628e0e27166 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli@suse.de>
Date: Tue, 12 Feb 2019 12:53:18 +0800
Subject: [PATCH] Detail.c: do not skip first character when calling xstrdup in
Detail()
Git-commit: e3615ecb5b6ad8eb408296878aad5628e0e27166
Patch-mainline: mdadm-4.1+
References: bsc#1123814
'Commit b9c9bd9bacaa ("Detail: ensure --export names are acceptable as
shell variables")' duplicates mdi->sys_name to sysdev string by,
char *sysdev = xstrdup(mdi->sys_name + 1);
which skips the first character of mdi->sys_name. Then when running
mdadm --detail <md device> --export, the output looks like,
MD_DEVICE_ev_sda2_ROLE=1
MD_DEVICE_ev_sda2_DEV=/dev/sda2
The first character of md device (between MD_DEVICE and _ROLE/_DEV)
is dropped. The expected output should be,
MD_DEVICE_dev_sda2_ROLE=1
MD_DEVICE_dev_sda2_DEV=/dev/sda2
This patch removes the '+ 1' from calling xstrdup() in Detail(), which
gets the dropped first character back.
Reported-by: Arvin Schnell <aschnell@suse.com>
Fixes: b9c9bd9bacaa ("Detail: ensure --export names are acceptable as 4 shell variables")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Detail.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Detail.c b/Detail.c
index b3e857a..20ea03a 100644
--- a/Detail.c
+++ b/Detail.c
@@ -284,7 +284,7 @@ int Detail(char *dev, struct context *c)
struct mdinfo *mdi;
for (mdi = sra->devs; mdi; mdi = mdi->next) {
char *path;
- char *sysdev = xstrdup(mdi->sys_name + 1);
+ char *sysdev = xstrdup(mdi->sys_name);
char *cp;
path = map_dev(mdi->disk.major,
--
2.25.0

View File

@ -0,0 +1,64 @@
From f5ff2988761625b43eb15555993f2797af29f166 Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Mon, 6 Jun 2022 12:32:13 +0200
Subject: [PATCH 17/61] Mdmonitor: Improve logging method
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Change logging, and as a result, mdmonitor in verbose
mode will report its configuration.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Oleksandr Shchirskyi <oleksandr.shchirskyi@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Monitor.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/Monitor.c b/Monitor.c
index 4e5802b..6ca1ebe 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -136,24 +136,27 @@ int Monitor(struct mddev_dev *devlist,
struct mddev_ident *mdlist;
int delay_for_event = c->delay;
- if (!mailaddr) {
+ if (!mailaddr)
mailaddr = conf_get_mailaddr();
- if (mailaddr && ! c->scan)
- pr_err("Monitor using email address \"%s\" from config file\n",
- mailaddr);
- }
- mailfrom = conf_get_mailfrom();
- if (!alert_cmd) {
+ if (!alert_cmd)
alert_cmd = conf_get_program();
- if (alert_cmd && !c->scan)
- pr_err("Monitor using program \"%s\" from config file\n",
- alert_cmd);
- }
+
+ mailfrom = conf_get_mailfrom();
+
if (c->scan && !mailaddr && !alert_cmd && !dosyslog) {
pr_err("No mail address or alert command - not monitoring.\n");
return 1;
}
+
+ if (c->verbose) {
+ pr_err("Monitor is started with delay %ds\n", c->delay);
+ if (mailaddr)
+ pr_err("Monitor using email address %s\n", mailaddr);
+ if (alert_cmd)
+ pr_err("Monitor using program %s\n", alert_cmd);
+ }
+
info.alert_cmd = alert_cmd;
info.mailaddr = mailaddr;
info.mailfrom = mailfrom;
--
2.35.3

View File

@ -0,0 +1,76 @@
From 626bc45396c4959f2c4685c2faa7c4f553f4efdf Mon Sep 17 00:00:00 2001
From: Mateusz Grzonka <mateusz.grzonka@intel.com>
Date: Mon, 13 Jun 2022 11:59:34 +0200
Subject: [PATCH 18/61] Fix possible NULL ptr dereferences and memory leaks
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
In Assemble there was a NULL check for sra variable,
which effectively didn't stop the execution in every case.
That might have resulted in a NULL pointer dereference.
Also in super-ddf, mu variable was set to NULL for some condition,
and then immidiately dereferenced.
Additionally some memory wasn't freed as well.
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 7 ++++++-
super-ddf.c | 9 +++++++--
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 9eac9ce..4b21356 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1982,7 +1982,12 @@ int assemble_container_content(struct supertype *st, int mdfd,
}
sra = sysfs_read(mdfd, NULL, GET_VERSION|GET_DEVS);
- if (sra == NULL || strcmp(sra->text_version, content->text_version) != 0) {
+ if (sra == NULL) {
+ pr_err("Failed to read sysfs parameters\n");
+ return 1;
+ }
+
+ if (strcmp(sra->text_version, content->text_version) != 0) {
if (content->array.major_version == -1 &&
content->array.minor_version == -2 &&
c->readonly &&
diff --git a/super-ddf.c b/super-ddf.c
index 8cda23a..abbc8b0 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -5125,13 +5125,16 @@ static struct mdinfo *ddf_activate_spare(struct active_array *a,
*/
vc = find_vdcr(ddf, a->info.container_member, rv->disk.raid_disk,
&n_bvd, &vcl);
- if (vc == NULL)
+ if (vc == NULL) {
+ free(rv);
return NULL;
+ }
mu = xmalloc(sizeof(*mu));
if (posix_memalign(&mu->space, 512, sizeof(struct vcl)) != 0) {
free(mu);
- mu = NULL;
+ free(rv);
+ return NULL;
}
mu->len = ddf->conf_rec_len * 512 * vcl->conf.sec_elmnt_count;
@@ -5161,6 +5164,8 @@ static struct mdinfo *ddf_activate_spare(struct active_array *a,
pr_err("BUG: can't find disk %d (%d/%d)\n",
di->disk.raid_disk,
di->disk.major, di->disk.minor);
+ free(mu);
+ free(rv);
return NULL;
}
vc->phys_refnum[i_prim] = ddf->phys->entries[dl->pdnum].refnum;
--
2.35.3

View File

@ -1,75 +0,0 @@
From cab114c5ca870e5f1b57fb2602cd9a038271c2e0 Mon Sep 17 00:00:00 2001
From: Corey Hickey <bugfood-c@fatooh.org>
Date: Mon, 11 Feb 2019 17:18:38 -0800
Subject: [PATCH] Fix reshape for decreasing data offset
Git-commit: cab114c5ca870e5f1b57fb2602cd9a038271c2e0
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
...when not changing the number of disks.
This patch needs context to explain. These are the relevant parts of
the original code (condensed and annotated):
if (dir > 0) {
/* Increase data offset (reshape backwards) */
if (data_offset < sd->data_offset + min) {
pr_err("--data-offset too small on %s\n",
dn);
goto release;
}
} else {
/* Decrease data offset (reshape forwards) */
if (data_offset < sd->data_offset - min) {
pr_err("--data-offset too small on %s\n",
dn);
goto release;
}
}
When this code is reached, mdadm has already decided on a reshape
direction. When increasing the data offset, the reshape runs backwards
(dir==1); when decreasing the data offset, the reshape runs forwards
(dir==-1).
The conditional within the backwards reshape is correct: the requested
offset must be larger than the old offset plus a minimum delta; thus the
reshape has room to work.
For the forwards reshape, the requested offset needs to be smaller than
the old offset minus a minimum delta; to do this correctly, the
comparison must be reversed.
Also update the error message.
Note: I have tested this change on a RAID 5 on Linux 4.18.0 and verified
that there were no errors from the kernel and that the device data
remained intact. I do not know if there are considerations for different
RAID levels.
Signed-off-by: Corey Hickey <bugfood-c@fatooh.org>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Grow.c b/Grow.c
index 6d32661..764374f 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2613,8 +2613,8 @@ static int set_new_data_offset(struct mdinfo *sra, struct supertype *st,
goto release;
}
if (data_offset != INVALID_SECTORS &&
- data_offset < sd->data_offset - min) {
- pr_err("--data-offset too small on %s\n",
+ data_offset > sd->data_offset - min) {
+ pr_err("--data-offset too large on %s\n",
dn);
goto release;
}
--
2.25.0

View File

@ -0,0 +1,304 @@
From 756a15f32338fdf0c562678694bc8991ad6afb90 Mon Sep 17 00:00:00 2001
From: Mateusz Grzonka <mateusz.grzonka@intel.com>
Date: Mon, 13 Jun 2022 12:00:09 +0200
Subject: [PATCH 19/61] imsm: Remove possibility for get_imsm_dev to return
NULL
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Returning NULL from get_imsm_dev or __get_imsm_dev will cause segfault.
Guarantee that it never happens.
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 153 +++++++++++++++++++++++++-------------------------
1 file changed, 78 insertions(+), 75 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index ba3bd41..3788feb 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -851,6 +851,21 @@ static struct disk_info *get_disk_info(struct imsm_update_create_array *update)
return inf;
}
+/**
+ * __get_imsm_dev() - Get device with index from imsm_super.
+ * @mpb: &imsm_super pointer, not NULL.
+ * @index: Device index.
+ *
+ * Function works as non-NULL, aborting in such a case,
+ * when NULL would be returned.
+ *
+ * Device index should be in range 0 up to num_raid_devs.
+ * Function assumes the index was already verified.
+ * Index must be valid, otherwise abort() is called.
+ *
+ * Return: Pointer to corresponding imsm_dev.
+ *
+ */
static struct imsm_dev *__get_imsm_dev(struct imsm_super *mpb, __u8 index)
{
int offset;
@@ -858,30 +873,47 @@ static struct imsm_dev *__get_imsm_dev(struct imsm_super *mpb, __u8 index)
void *_mpb = mpb;
if (index >= mpb->num_raid_devs)
- return NULL;
+ goto error;
/* devices start after all disks */
offset = ((void *) &mpb->disk[mpb->num_disks]) - _mpb;
- for (i = 0; i <= index; i++)
+ for (i = 0; i <= index; i++, offset += sizeof_imsm_dev(_mpb + offset, 0))
if (i == index)
return _mpb + offset;
- else
- offset += sizeof_imsm_dev(_mpb + offset, 0);
-
- return NULL;
+error:
+ pr_err("cannot find imsm_dev with index %u in imsm_super\n", index);
+ abort();
}
+/**
+ * get_imsm_dev() - Get device with index from intel_super.
+ * @super: &intel_super pointer, not NULL.
+ * @index: Device index.
+ *
+ * Function works as non-NULL, aborting in such a case,
+ * when NULL would be returned.
+ *
+ * Device index should be in range 0 up to num_raid_devs.
+ * Function assumes the index was already verified.
+ * Index must be valid, otherwise abort() is called.
+ *
+ * Return: Pointer to corresponding imsm_dev.
+ *
+ */
static struct imsm_dev *get_imsm_dev(struct intel_super *super, __u8 index)
{
struct intel_dev *dv;
if (index >= super->anchor->num_raid_devs)
- return NULL;
+ goto error;
+
for (dv = super->devlist; dv; dv = dv->next)
if (dv->index == index)
return dv->dev;
- return NULL;
+error:
+ pr_err("cannot find imsm_dev with index %u in intel_super\n", index);
+ abort();
}
static inline unsigned long long __le48_to_cpu(const struct bbm_log_block_addr
@@ -4364,8 +4396,7 @@ int check_mpb_migr_compatibility(struct intel_super *super)
for (i = 0; i < super->anchor->num_raid_devs; i++) {
struct imsm_dev *dev_iter = __get_imsm_dev(super->anchor, i);
- if (dev_iter &&
- dev_iter->vol.migr_state == 1 &&
+ if (dev_iter->vol.migr_state == 1 &&
dev_iter->vol.migr_type == MIGR_GEN_MIGR) {
/* This device is migrating */
map0 = get_imsm_map(dev_iter, MAP_0);
@@ -4514,8 +4545,6 @@ static void clear_hi(struct intel_super *super)
}
for (i = 0; i < mpb->num_raid_devs; ++i) {
struct imsm_dev *dev = get_imsm_dev(super, i);
- if (!dev)
- return;
for (n = 0; n < 2; ++n) {
struct imsm_map *map = get_imsm_map(dev, n);
if (!map)
@@ -5836,7 +5865,7 @@ static int add_to_super_imsm_volume(struct supertype *st, mdu_disk_info_t *dk,
struct imsm_dev *_dev = __get_imsm_dev(mpb, 0);
_disk = __get_imsm_disk(mpb, dl->index);
- if (!_dev || !_disk) {
+ if (!_disk) {
pr_err("BUG mpb setup error\n");
return 1;
}
@@ -6171,10 +6200,10 @@ static int write_super_imsm(struct supertype *st, int doclose)
for (i = 0; i < mpb->num_raid_devs; i++) {
struct imsm_dev *dev = __get_imsm_dev(mpb, i);
struct imsm_dev *dev2 = get_imsm_dev(super, i);
- if (dev && dev2) {
- imsm_copy_dev(dev, dev2);
- mpb_size += sizeof_imsm_dev(dev, 0);
- }
+
+ imsm_copy_dev(dev, dev2);
+ mpb_size += sizeof_imsm_dev(dev, 0);
+
if (is_gen_migration(dev2))
clear_migration_record = 0;
}
@@ -9033,29 +9062,26 @@ static int imsm_rebuild_allowed(struct supertype *cont, int dev_idx, int failed)
__u8 state;
dev2 = get_imsm_dev(cont->sb, dev_idx);
- if (dev2) {
- state = imsm_check_degraded(cont->sb, dev2, failed, MAP_0);
- if (state == IMSM_T_STATE_FAILED) {
- map = get_imsm_map(dev2, MAP_0);
- if (!map)
- return 1;
- for (slot = 0; slot < map->num_members; slot++) {
- /*
- * Check if failed disks are deleted from intel
- * disk list or are marked to be deleted
- */
- idx = get_imsm_disk_idx(dev2, slot, MAP_X);
- idisk = get_imsm_dl_disk(cont->sb, idx);
- /*
- * Do not rebuild the array if failed disks
- * from failed sub-array are not removed from
- * container.
- */
- if (idisk &&
- is_failed(&idisk->disk) &&
- (idisk->action != DISK_REMOVE))
- return 0;
- }
+
+ state = imsm_check_degraded(cont->sb, dev2, failed, MAP_0);
+ if (state == IMSM_T_STATE_FAILED) {
+ map = get_imsm_map(dev2, MAP_0);
+ for (slot = 0; slot < map->num_members; slot++) {
+ /*
+ * Check if failed disks are deleted from intel
+ * disk list or are marked to be deleted
+ */
+ idx = get_imsm_disk_idx(dev2, slot, MAP_X);
+ idisk = get_imsm_dl_disk(cont->sb, idx);
+ /*
+ * Do not rebuild the array if failed disks
+ * from failed sub-array are not removed from
+ * container.
+ */
+ if (idisk &&
+ is_failed(&idisk->disk) &&
+ (idisk->action != DISK_REMOVE))
+ return 0;
}
}
return 1;
@@ -10089,7 +10115,6 @@ static void imsm_process_update(struct supertype *st,
int victim = u->dev_idx;
struct active_array *a;
struct intel_dev **dp;
- struct imsm_dev *dev;
/* sanity check that we are not affecting the uuid of
* active arrays, or deleting an active array
@@ -10105,8 +10130,7 @@ static void imsm_process_update(struct supertype *st,
* is active in the container, so checking
* mpb->num_raid_devs is just extra paranoia
*/
- dev = get_imsm_dev(super, victim);
- if (a || !dev || mpb->num_raid_devs == 1) {
+ if (a || mpb->num_raid_devs == 1 || victim >= super->anchor->num_raid_devs) {
dprintf("failed to delete subarray-%d\n", victim);
break;
}
@@ -10140,7 +10164,7 @@ static void imsm_process_update(struct supertype *st,
if (a->info.container_member == target)
break;
dev = get_imsm_dev(super, u->dev_idx);
- if (a || !dev || !check_name(super, name, 1)) {
+ if (a || !check_name(super, name, 1)) {
dprintf("failed to rename subarray-%d\n", target);
break;
}
@@ -10169,10 +10193,6 @@ static void imsm_process_update(struct supertype *st,
struct imsm_update_rwh_policy *u = (void *)update->buf;
int target = u->dev_idx;
struct imsm_dev *dev = get_imsm_dev(super, target);
- if (!dev) {
- dprintf("could not find subarray-%d\n", target);
- break;
- }
if (dev->rwh_policy != u->new_policy) {
dev->rwh_policy = u->new_policy;
@@ -11397,8 +11417,10 @@ static int imsm_create_metadata_update_for_migration(
{
struct intel_super *super = st->sb;
int update_memory_size;
+ int current_chunk_size;
struct imsm_update_reshape_migration *u;
- struct imsm_dev *dev;
+ struct imsm_dev *dev = get_imsm_dev(super, super->current_vol);
+ struct imsm_map *map = get_imsm_map(dev, MAP_0);
int previous_level = -1;
dprintf("(enter) New Level = %i\n", geo->level);
@@ -11415,23 +11437,15 @@ static int imsm_create_metadata_update_for_migration(
u->new_disks[0] = -1;
u->new_chunksize = -1;
- dev = get_imsm_dev(super, u->subdev);
- if (dev) {
- struct imsm_map *map;
+ current_chunk_size = __le16_to_cpu(map->blocks_per_strip) / 2;
- map = get_imsm_map(dev, MAP_0);
- if (map) {
- int current_chunk_size =
- __le16_to_cpu(map->blocks_per_strip) / 2;
-
- if (geo->chunksize != current_chunk_size) {
- u->new_chunksize = geo->chunksize / 1024;
- dprintf("imsm: chunk size change from %i to %i\n",
- current_chunk_size, u->new_chunksize);
- }
- previous_level = map->raid_level;
- }
+ if (geo->chunksize != current_chunk_size) {
+ u->new_chunksize = geo->chunksize / 1024;
+ dprintf("imsm: chunk size change from %i to %i\n",
+ current_chunk_size, u->new_chunksize);
}
+ previous_level = map->raid_level;
+
if (geo->level == 5 && previous_level == 0) {
struct mdinfo *spares = NULL;
@@ -12519,9 +12533,6 @@ static int validate_internal_bitmap_imsm(struct supertype *st)
unsigned long long offset;
struct dl *d;
- if (!dev)
- return -1;
-
if (dev->rwh_policy != RWH_BITMAP)
return 0;
@@ -12567,16 +12578,8 @@ static int add_internal_bitmap_imsm(struct supertype *st, int *chunkp,
return -1;
dev = get_imsm_dev(super, vol_idx);
-
- if (!dev) {
- dprintf("cannot find the device for volume index %d\n",
- vol_idx);
- return -1;
- }
dev->rwh_policy = RWH_BITMAP;
-
*chunkp = calculate_bitmap_chunksize(st, dev);
-
return 0;
}
--
2.35.3

View File

@ -0,0 +1,87 @@
From 190dc029b141c423e724566cbed5d5afbb10b05a Mon Sep 17 00:00:00 2001
From: Nigel Croxon <ncroxon@redhat.com>
Date: Mon, 18 Apr 2022 13:44:23 -0400
Subject: [PATCH 20/61] Revert "mdadm: fix coredump of mdadm --monitor -r"
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
This reverts commit 546047688e1c64638f462147c755b58119cabdc8.
The change from commit mdadm: fix coredump of mdadm
--monitor -r broke the printing of the return message when
passing -r to mdadm --manage, the removal of a device from
an array.
If the current code reverts this commit, both issues are
still fixed.
The original problem reported that the fix tried to address
was: The --monitor -r option requires a parameter,
otherwise a null pointer will be manipulated when
converting to integer data, and a core dump will appear.
The original problem was really fixed with:
60815698c0a Refactor parse_num and use it to parse optarg.
Which added a check for NULL in 'optarg' before moving it
to the 'increments' variable.
New issue: When trying to remove a device using the short
argument -r, instead of the long argument --remove, the
output is empty. The problem started when commit
546047688e1c was added.
Steps to Reproduce:
1. create/assemble /dev/md0 device
2. mdadm --manage /dev/md0 -r /dev/vdxx
Actual results:
Nothing, empty output, nothing happens, the device is still
connected to the array.
The output should have stated "mdadm: hot remove failed
for /dev/vdxx: Device or resource busy", if the device was
still active. Or it should remove the device and print
a message:
mdadm: set /dev/vdd faulty in /dev/md0
mdadm: hot removed /dev/vdd from /dev/md0
The following commit should be reverted as it breaks
mdadm --manage -r.
commit 546047688e1c64638f462147c755b58119cabdc8
Author: Wu Guanghao <wuguanghao3@huawei.com>
Date: Mon Aug 16 15:24:51 2021 +0800
mdadm: fix coredump of mdadm --monitor -r
-Nigel
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
ReadMe.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/ReadMe.c b/ReadMe.c
index 8f873c4..bec1be9 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -81,11 +81,11 @@ char Version[] = "mdadm - v" VERSION " - " VERS_DATE EXTRAVERSION "\n";
* found, it is started.
*/
-char short_options[]="-ABCDEFGIQhVXYWZ:vqbc:i:l:p:m:r:n:x:u:c:d:z:U:N:safRSow1tye:k";
+char short_options[]="-ABCDEFGIQhVXYWZ:vqbc:i:l:p:m:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
char short_bitmap_options[]=
- "-ABCDEFGIQhVXYWZ:vqb:c:i:l:p:m:r:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
+ "-ABCDEFGIQhVXYWZ:vqb:c:i:l:p:m:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
char short_bitmap_auto_options[]=
- "-ABCDEFGIQhVXYWZ:vqb:c:i:l:p:m:r:n:x:u:c:d:z:U:N:sa:rfRSow1tye:k:";
+ "-ABCDEFGIQhVXYWZ:vqb:c:i:l:p:m:n:x:u:c:d:z:U:N:sa:rfRSow1tye:k:";
struct option long_options[] = {
{"manage", 0, 0, ManageOpt},
--
2.35.3

View File

@ -1,104 +0,0 @@
From 76b906d2406cdf136f64de77e881eb2d180108d9 Mon Sep 17 00:00:00 2001
From: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Date: Fri, 7 Dec 2018 14:30:09 +0100
Subject: [PATCH] mdadm/tests: add one test case for failfast of raid1
Git-commit: 76b906d2406cdf136f64de77e881eb2d180108d9
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
This creates raid1 device with the failfast option and check all
slaves have the failfast flag. And it does assembling and growing
the raid1 device and check the failfast works fine.
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
tests/05r1-failfast | 74 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
create mode 100644 tests/05r1-failfast
diff --git a/tests/05r1-failfast b/tests/05r1-failfast
new file mode 100644
index 0000000..823dd6f
--- /dev/null
+++ b/tests/05r1-failfast
@@ -0,0 +1,74 @@
+
+# create a simple mirror and check failfast flag works
+mdadm -CR $md0 -e1.2 --level=raid1 --failfast -n2 $dev0 $dev1
+check raid1
+if grep -v failfast /sys/block/md0/md/rd*/state > /dev/null
+then
+ die "failfast missing"
+fi
+
+# Removing works with the failfast flag
+mdadm $md0 -f $dev0
+mdadm $md0 -r $dev0
+if grep -v failfast /sys/block/md0/md/rd1/state > /dev/null
+then
+ die "failfast missing"
+fi
+
+# Adding works with the failfast flag
+mdadm $md0 -a --failfast $dev0
+check wait
+if grep -v failfast /sys/block/md0/md/rd0/state > /dev/null
+then
+ die "failfast missing"
+fi
+
+mdadm -S $md0
+
+# Assembling works with the failfast flag
+mdadm -A $md0 $dev0 $dev1
+check raid1
+if grep -v failfast /sys/block/md0/md/rd*/state > /dev/null
+then
+ die "failfast missing"
+fi
+
+# Adding works with the nofailfast flag
+mdadm $md0 -f $dev0
+mdadm $md0 -r $dev0
+mdadm $md0 -a --nofailfast $dev0
+check wait
+if grep failfast /sys/block/md0/md/rd0/state > /dev/null
+then
+ die "failfast should be missing"
+fi
+
+# Assembling with one faulty slave works with the failfast flag
+mdadm $md0 -f $dev0
+mdadm $md0 -r $dev0
+mdadm -S $md0
+mdadm -A $md0 $dev0 $dev1
+check raid1
+mdadm -S $md0
+
+# Spare works with the failfast flag
+mdadm -CR $md0 -e1.2 --level=raid1 --failfast -n2 $dev0 $dev1
+check raid1
+mdadm $md0 -a --failfast $dev2
+check wait
+check spares 1
+if grep -v failfast /sys/block/md0/md/rd*/state > /dev/null
+then
+ die "failfast missing"
+fi
+
+# Grow works with the failfast flag
+mdadm -G $md0 --raid-devices=3
+check wait
+if grep -v failfast /sys/block/md0/md/rd*/state > /dev/null
+then
+ die "failfast missing"
+fi
+mdadm -S $md0
+
+exit 0
--
2.25.0

View File

@ -1,53 +0,0 @@
From 69d084784de196acec8ab703cd1b379af211d624 Mon Sep 17 00:00:00 2001
From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Date: Fri, 22 Feb 2019 10:15:45 +0100
Subject: [PATCH] mdmon: don't attempt to manage new arrays when terminating
Git-commit: 69d084784de196acec8ab703cd1b379af211d624
Patch-mainline: mdadm-4.1-12
References: bsc#1127526
When mdmon gets a SIGTERM, it stops managing arrays that are clean. If
there is more that one array in the container and one of them is dirty
and the clean one is still present in mdstat, mdmon will treat it as a
new array and start managing it again. This leads to a cycle of
remove_old() / manage_new() calls for the clean array, until the other
one also becomes clean.
Prevent this by not calling manage_new() if sigterm is set. Also, remove
a check for sigterm in manage_new() because the condition will never be
true.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
managemon.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/managemon.c b/managemon.c
index 101231c..29b91ba 100644
--- a/managemon.c
+++ b/managemon.c
@@ -727,9 +727,7 @@ static void manage_new(struct mdstat_ent *mdstat,
dprintf("inst: %s action: %d state: %d\n", inst,
new->action_fd, new->info.state_fd);
- if (sigterm)
- new->info.safe_mode_delay = 1;
- else if (mdi->safe_mode_delay >= 50)
+ if (mdi->safe_mode_delay >= 50)
/* Normal start, mdadm set this. */
new->info.safe_mode_delay = mdi->safe_mode_delay;
else
@@ -803,7 +801,7 @@ void manage(struct mdstat_ent *mdstat, struct supertype *container)
break;
}
}
- if (a == NULL || !a->container)
+ if ((a == NULL || !a->container) && !sigterm)
manage_new(mdstat, container, a);
}
}
--
2.16.4

View File

@ -0,0 +1,33 @@
From 953cc7e5a485a91ddec7312c7a5d7779749fad5f Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Tue, 21 Jun 2022 00:10:39 +0800
Subject: [PATCH 21/61] util: replace ioctl use with function
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Replace using of ioctl calling to get md array info with
special function prepared to it.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
util.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/util.c b/util.c
index cc94f96..38f0420 100644
--- a/util.c
+++ b/util.c
@@ -267,7 +267,7 @@ int md_array_active(int fd)
* GET_ARRAY_INFO doesn't provide access to the proper state
* information, so fallback to a basic check for raid_disks != 0
*/
- ret = ioctl(fd, GET_ARRAY_INFO, &array);
+ ret = md_get_array_info(fd, &array);
}
return !ret;
--
2.35.3

View File

@ -1,12 +1,10 @@
From 3a84e55858171f96321fa4c775fe7e4e851c6b85 Mon Sep 17 00:00:00 2001
From 63902857b98c37c8ac4b837bb01d006b327a4532 Mon Sep 17 00:00:00 2001
From: Heming Zhao <heming.zhao@suse.com>
Date: Thu, 31 Mar 2022 23:30:51 +0800
Subject: [PATCH] mdadm/super1: restore commit 45a87c2f31335 to fix clustered
slot issue
To: linux-raid@vger.kernel.org,
jes@trained-monkey.org
Patch-mainline: N/A, maintainer didn't respond this patch.
References: bsc#1197158, bsc#1197571
Date: Tue, 21 Jun 2022 00:10:40 +0800
Subject: [PATCH 22/61] mdadm/super1: restore commit 45a87c2f31335 to fix
clustered slot issue
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Commit 9d67f6496c71 ("mdadm:check the nodes when operate clustered
array") modified assignment logic for st->nodes in write_bitmap1(),
@ -81,12 +79,13 @@ kernel: md0: detected capacity change from 0 to 612352
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
super1.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/super1.c b/super1.c
index a12a5bc847b9..f08d4f831319 100644
index e3e2f95..3a0c69f 100644
--- a/super1.c
+++ b/super1.c
@@ -2674,7 +2674,17 @@ static int write_bitmap1(struct supertype *st, int fd, enum bitmap_update update
@ -109,5 +108,5 @@ index a12a5bc847b9..f08d4f831319 100644
* Since the nodes num is not increased, no
* need to check the space enough or not,
--
2.33.0
2.35.3

View File

@ -1,62 +0,0 @@
From d2e11da4b7fd0453e942f43e4196dc63b3dbd708 Mon Sep 17 00:00:00 2001
From: Pawel Baldysiak <pawel.baldysiak@intel.com>
Date: Fri, 22 Feb 2019 13:30:27 +0100
Subject: [PATCH] mdmon: wait for previous mdmon to exit during takeover
Git-commit: d2e11da4b7fd0453e942f43e4196dc63b3dbd708
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Since the patch c76242c5("mdmon: get safe mode delay file descriptor
early"), safe_mode_dalay is set properly by initrd mdmon. But in some
cases with filesystem traffic since the very start of the system, it
might take a while to transit to clean state. Due to fact that new
mdmon does not wait for the old one to exit - it might happen that the
new one switches safe_mode_delay back to seconds, before old one exits.
As the result two mdmons are running concurrently on same array.
Wait for the old mdmon to exit by pinging it with SIGUSR1 signal, just
in case it is sleeping.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdmon.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/mdmon.c b/mdmon.c
index 0955fcc..ff985d2 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -171,6 +171,7 @@ static void try_kill_monitor(pid_t pid, char *devname, int sock)
int fd;
int n;
long fl;
+ int rv;
/* first rule of survival... don't off yourself */
if (pid == getpid())
@@ -201,9 +202,16 @@ static void try_kill_monitor(pid_t pid, char *devname, int sock)
fl &= ~O_NONBLOCK;
fcntl(sock, F_SETFL, fl);
n = read(sock, buf, 100);
- /* Ignore result, it is just the wait that
- * matters
- */
+
+ /* If there is I/O going on it might took some time to get to
+ * clean state. Wait for monitor to exit fully to avoid races.
+ * Ping it with SIGUSR1 in case that it is sleeping */
+ for (n = 0; n < 25; n++) {
+ rv = kill(pid, SIGUSR1);
+ if (rv < 0)
+ break;
+ usleep(200000);
+ }
}
void remove_pidfile(char *devname)
--
2.25.0

View File

@ -1,56 +0,0 @@
From 2b57e4fe041d52ae29866c93a878a11c07223cff Mon Sep 17 00:00:00 2001
From: Pawel Baldysiak <pawel.baldysiak@intel.com>
Date: Fri, 22 Feb 2019 12:56:27 +0100
Subject: [PATCH] Assemble: Fix starting array with initial reshape checkpoint
Git-commit: 2b57e4fe041d52ae29866c93a878a11c07223cff
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
If array was stopped during reshape initialization,
there might be a "0" checkpoint recorded in metadata.
If array with such condition (reshape with position 0)
is passed to kernel - it will refuse to start such array.
Treat such array as normal during assemble, Grow_continue() will
reinitialize and start the reshape.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 9f050c1..420c7b3 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -2061,8 +2061,22 @@ int assemble_container_content(struct supertype *st, int mdfd,
spare, &c->backup_file, c->verbose) == 1)
return 1;
- err = sysfs_set_str(content, NULL,
- "array_state", "readonly");
+ if (content->reshape_progress == 0) {
+ /* If reshape progress is 0 - we are assembling the
+ * array that was stopped, before reshape has started.
+ * Array needs to be started as active, Grow_continue()
+ * will start the reshape.
+ */
+ sysfs_set_num(content, NULL, "reshape_position",
+ MaxSector);
+ err = sysfs_set_str(content, NULL,
+ "array_state", "active");
+ sysfs_set_num(content, NULL, "reshape_position", 0);
+ } else {
+ err = sysfs_set_str(content, NULL,
+ "array_state", "readonly");
+ }
+
if (err)
return 1;
--
2.25.0

View File

@ -0,0 +1,124 @@
From 76c152ca9851e9fcdf52e8f6e7e6c09b936bdd14 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Tue, 21 Jun 2022 00:10:41 +0800
Subject: [PATCH 23/61] imsm: introduce get_disk_slot_in_dev()
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
The routine was added to remove unnecessary get_imsm_dev() and
get_imsm_map() calls, used only to determine disk slot.
Additionally, enum for IMSM return statues was added for further usage.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
super-intel.c | 47 ++++++++++++++++++++++++++++++++++++-----------
1 file changed, 36 insertions(+), 11 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 3788feb..cd1f1e3 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -366,6 +366,18 @@ struct migr_record {
};
ASSERT_SIZE(migr_record, 128)
+/**
+ * enum imsm_status - internal IMSM return values representation.
+ * @STATUS_OK: function succeeded.
+ * @STATUS_ERROR: General error ocurred (not specified).
+ *
+ * Typedefed to imsm_status_t.
+ */
+typedef enum imsm_status {
+ IMSM_STATUS_ERROR = -1,
+ IMSM_STATUS_OK = 0,
+} imsm_status_t;
+
struct md_list {
/* usage marker:
* 1: load metadata
@@ -1183,7 +1195,7 @@ static void set_imsm_ord_tbl_ent(struct imsm_map *map, int slot, __u32 ord)
map->disk_ord_tbl[slot] = __cpu_to_le32(ord);
}
-static int get_imsm_disk_slot(struct imsm_map *map, unsigned idx)
+static int get_imsm_disk_slot(struct imsm_map *map, const unsigned int idx)
{
int slot;
__u32 ord;
@@ -1194,7 +1206,7 @@ static int get_imsm_disk_slot(struct imsm_map *map, unsigned idx)
return slot;
}
- return -1;
+ return IMSM_STATUS_ERROR;
}
static int get_imsm_raid_level(struct imsm_map *map)
@@ -1209,6 +1221,23 @@ static int get_imsm_raid_level(struct imsm_map *map)
return map->raid_level;
}
+/**
+ * get_disk_slot_in_dev() - retrieve disk slot from &imsm_dev.
+ * @super: &intel_super pointer, not NULL.
+ * @dev_idx: imsm device index.
+ * @idx: disk index.
+ *
+ * Return: Slot on success, IMSM_STATUS_ERROR otherwise.
+ */
+static int get_disk_slot_in_dev(struct intel_super *super, const __u8 dev_idx,
+ const unsigned int idx)
+{
+ struct imsm_dev *dev = get_imsm_dev(super, dev_idx);
+ struct imsm_map *map = get_imsm_map(dev, MAP_0);
+
+ return get_imsm_disk_slot(map, idx);
+}
+
static int cmp_extent(const void *av, const void *bv)
{
const struct extent *a = av;
@@ -1225,13 +1254,9 @@ static int count_memberships(struct dl *dl, struct intel_super *super)
int memberships = 0;
int i;
- for (i = 0; i < super->anchor->num_raid_devs; i++) {
- struct imsm_dev *dev = get_imsm_dev(super, i);
- struct imsm_map *map = get_imsm_map(dev, MAP_0);
-
- if (get_imsm_disk_slot(map, dl->index) >= 0)
+ for (i = 0; i < super->anchor->num_raid_devs; i++)
+ if (get_disk_slot_in_dev(super, i, dl->index) >= 0)
memberships++;
- }
return memberships;
}
@@ -1941,6 +1966,7 @@ void examine_migr_rec_imsm(struct intel_super *super)
/* first map under migration */
map = get_imsm_map(dev, MAP_0);
+
if (map)
slot = get_imsm_disk_slot(map, super->disks->index);
if (map == NULL || slot > 1 || slot < 0) {
@@ -9655,10 +9681,9 @@ static int apply_update_activate_spare(struct imsm_update_activate_spare *u,
/* count arrays using the victim in the metadata */
found = 0;
for (a = active_array; a ; a = a->next) {
- dev = get_imsm_dev(super, a->info.container_member);
- map = get_imsm_map(dev, MAP_0);
+ int dev_idx = a->info.container_member;
- if (get_imsm_disk_slot(map, victim) >= 0)
+ if (get_disk_slot_in_dev(super, dev_idx, victim) >= 0)
found++;
}
--
2.35.3

View File

@ -1,64 +0,0 @@
From 227aeaa872d4898273cf87a4253898823d556c43 Mon Sep 17 00:00:00 2001
From: Corey Hickey <bugfood-c@fatooh.org>
Date: Mon, 11 Feb 2019 17:42:27 -0800
Subject: [PATCH] add missing units to --examine
Git-commit: 227aeaa872d4898273cf87a4253898823d556c43
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Within the output of "mdadm --examine", there are three sizes reported
on adjacent lines. For example:
$ sudo mdadm --examine /dev/md3
[...]
Avail Dev Size : 17580545024 (8383.06 GiB 9001.24 GB)
Array Size : 17580417024 (16765.99 GiB 18002.35 GB)
Used Dev Size : 11720278016 (5588.66 GiB 6000.78 GB)
[...]
This can be confusing, since the first and third line are in 512-byte
sectors, and the second is in KiB.
Add units to avoid ambiguity.
(I don't particularly like the "KiB" notation, but it is at least
unambiguous.)
Signed-off-by: Corey Hickey <bugfood-c@fatooh.org>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super1.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/super1.c b/super1.c
index 636a286..b85dc20 100644
--- a/super1.c
+++ b/super1.c
@@ -360,7 +360,7 @@ static void examine_super1(struct supertype *st, char *homehost)
printf(" Raid Level : %s\n", c?c:"-unknown-");
printf(" Raid Devices : %d\n", __le32_to_cpu(sb->raid_disks));
printf("\n");
- printf(" Avail Dev Size : %llu%s\n",
+ printf(" Avail Dev Size : %llu sectors%s\n",
(unsigned long long)__le64_to_cpu(sb->data_size),
human_size(__le64_to_cpu(sb->data_size)<<9));
if (__le32_to_cpu(sb->level) > 0) {
@@ -378,11 +378,11 @@ static void examine_super1(struct supertype *st, char *homehost)
if (ddsks) {
long long asize = __le64_to_cpu(sb->size);
asize = (asize << 9) * ddsks / ddsks_denom;
- printf(" Array Size : %llu%s\n",
+ printf(" Array Size : %llu KiB%s\n",
asize >> 10, human_size(asize));
}
if (sb->size != sb->data_size)
- printf(" Used Dev Size : %llu%s\n",
+ printf(" Used Dev Size : %llu sectors%s\n",
(unsigned long long)__le64_to_cpu(sb->size),
human_size(__le64_to_cpu(sb->size)<<9));
}
--
2.25.0

View File

@ -0,0 +1,254 @@
From 6d4d9ab295de165e57b5c30e044028dbffb8f297 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Tue, 21 Jun 2022 00:10:42 +0800
Subject: [PATCH 24/61] imsm: use same slot across container
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Autolayout relies on drives order on super->disks list, but
it is not quaranted by readdir() in sysfs_read(). As a result
drive could be put in different slot in second volume.
Make it consistent by reffering to first volume, if exists.
Use enum imsm_status to unify error handling.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
super-intel.c | 169 ++++++++++++++++++++++++++++++++------------------
1 file changed, 108 insertions(+), 61 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index cd1f1e3..deef7c8 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -7522,11 +7522,27 @@ static int validate_geometry_imsm_volume(struct supertype *st, int level,
return 1;
}
-static int imsm_get_free_size(struct supertype *st, int raiddisks,
- unsigned long long size, int chunk,
- unsigned long long *freesize)
+/**
+ * imsm_get_free_size() - get the biggest, common free space from members.
+ * @super: &intel_super pointer, not NULL.
+ * @raiddisks: number of raid disks.
+ * @size: requested size, could be 0 (means max size).
+ * @chunk: requested chunk.
+ * @freesize: pointer for returned size value.
+ *
+ * Return: &IMSM_STATUS_OK or &IMSM_STATUS_ERROR.
+ *
+ * @freesize is set to meaningful value, this can be @size, or calculated
+ * max free size.
+ * super->create_offset value is modified and set appropriately in
+ * merge_extends() for further creation.
+ */
+static imsm_status_t imsm_get_free_size(struct intel_super *super,
+ const int raiddisks,
+ unsigned long long size,
+ const int chunk,
+ unsigned long long *freesize)
{
- struct intel_super *super = st->sb;
struct imsm_super *mpb = super->anchor;
struct dl *dl;
int i;
@@ -7570,12 +7586,10 @@ static int imsm_get_free_size(struct supertype *st, int raiddisks,
/* chunk is in K */
minsize = chunk * 2;
- if (cnt < raiddisks ||
- (super->orom && used && used != raiddisks) ||
- maxsize < minsize ||
- maxsize == 0) {
+ if (cnt < raiddisks || (super->orom && used && used != raiddisks) ||
+ maxsize < minsize || maxsize == 0) {
pr_err("not enough devices with space to create array.\n");
- return 0; /* No enough free spaces large enough */
+ return IMSM_STATUS_ERROR;
}
if (size == 0) {
@@ -7588,37 +7602,69 @@ static int imsm_get_free_size(struct supertype *st, int raiddisks,
}
if (mpb->num_raid_devs > 0 && size && size != maxsize)
pr_err("attempting to create a second volume with size less then remaining space.\n");
- cnt = 0;
- for (dl = super->disks; dl; dl = dl->next)
- if (dl->e)
- dl->raiddisk = cnt++;
-
*freesize = size;
dprintf("imsm: imsm_get_free_size() returns : %llu\n", size);
- return 1;
+ return IMSM_STATUS_OK;
}
-static int reserve_space(struct supertype *st, int raiddisks,
- unsigned long long size, int chunk,
- unsigned long long *freesize)
+/**
+ * autolayout_imsm() - automatically layout a new volume.
+ * @super: &intel_super pointer, not NULL.
+ * @raiddisks: number of raid disks.
+ * @size: requested size, could be 0 (means max size).
+ * @chunk: requested chunk.
+ * @freesize: pointer for returned size value.
+ *
+ * We are being asked to automatically layout a new volume based on the current
+ * contents of the container. If the parameters can be satisfied autolayout_imsm
+ * will record the disks, start offset, and will return size of the volume to
+ * be created. See imsm_get_free_size() for details.
+ * add_to_super() and getinfo_super() detect when autolayout is in progress.
+ * If first volume exists, slots are set consistently to it.
+ *
+ * Return: &IMSM_STATUS_OK on success, &IMSM_STATUS_ERROR otherwise.
+ *
+ * Disks are marked for creation via dl->raiddisk.
+ */
+static imsm_status_t autolayout_imsm(struct intel_super *super,
+ const int raiddisks,
+ unsigned long long size, const int chunk,
+ unsigned long long *freesize)
{
- struct intel_super *super = st->sb;
- struct dl *dl;
- int cnt;
- int rv = 0;
+ int curr_slot = 0;
+ struct dl *disk;
+ int vol_cnt = super->anchor->num_raid_devs;
+ imsm_status_t rv;
- rv = imsm_get_free_size(st, raiddisks, size, chunk, freesize);
- if (rv) {
- cnt = 0;
- for (dl = super->disks; dl; dl = dl->next)
- if (dl->e)
- dl->raiddisk = cnt++;
- rv = 1;
+ rv = imsm_get_free_size(super, raiddisks, size, chunk, freesize);
+ if (rv != IMSM_STATUS_OK)
+ return IMSM_STATUS_ERROR;
+
+ for (disk = super->disks; disk; disk = disk->next) {
+ if (!disk->e)
+ continue;
+
+ if (curr_slot == raiddisks)
+ break;
+
+ if (vol_cnt == 0) {
+ disk->raiddisk = curr_slot;
+ } else {
+ int _slot = get_disk_slot_in_dev(super, 0, disk->index);
+
+ if (_slot == -1) {
+ pr_err("Disk %s is not used in first volume, aborting\n",
+ disk->devname);
+ return IMSM_STATUS_ERROR;
+ }
+ disk->raiddisk = _slot;
+ }
+ curr_slot++;
}
- return rv;
+ return IMSM_STATUS_OK;
}
static int validate_geometry_imsm(struct supertype *st, int level, int layout,
@@ -7654,35 +7700,35 @@ static int validate_geometry_imsm(struct supertype *st, int level, int layout,
}
if (!dev) {
- if (st->sb) {
- struct intel_super *super = st->sb;
- if (!validate_geometry_imsm_orom(st->sb, level, layout,
- raiddisks, chunk, size,
- verbose))
+ struct intel_super *super = st->sb;
+
+ /*
+ * Autolayout mode, st->sb and freesize must be set.
+ */
+ if (!super || !freesize) {
+ pr_vrb("freesize and superblock must be set for autolayout, aborting\n");
+ return 1;
+ }
+
+ if (!validate_geometry_imsm_orom(st->sb, level, layout,
+ raiddisks, chunk, size,
+ verbose))
+ return 0;
+
+ if (super->orom) {
+ imsm_status_t rv;
+ int count = count_volumes(super->hba, super->orom->dpa,
+ verbose);
+ if (super->orom->vphba <= count) {
+ pr_vrb("platform does not support more than %d raid volumes.\n",
+ super->orom->vphba);
return 0;
- /* we are being asked to automatically layout a
- * new volume based on the current contents of
- * the container. If the the parameters can be
- * satisfied reserve_space will record the disks,
- * start offset, and size of the volume to be
- * created. add_to_super and getinfo_super
- * detect when autolayout is in progress.
- */
- /* assuming that freesize is always given when array is
- created */
- if (super->orom && freesize) {
- int count;
- count = count_volumes(super->hba,
- super->orom->dpa, verbose);
- if (super->orom->vphba <= count) {
- pr_vrb("platform does not support more than %d raid volumes.\n",
- super->orom->vphba);
- return 0;
- }
}
- if (freesize)
- return reserve_space(st, raiddisks, size,
- *chunk, freesize);
+
+ rv = autolayout_imsm(super, raiddisks, size, *chunk,
+ freesize);
+ if (rv != IMSM_STATUS_OK)
+ return 0;
}
return 1;
}
@@ -11538,7 +11584,7 @@ enum imsm_reshape_type imsm_analyze_change(struct supertype *st,
unsigned long long current_size;
unsigned long long free_size;
unsigned long long max_size;
- int rv;
+ imsm_status_t rv;
getinfo_super_imsm_volume(st, &info, NULL);
if (geo->level != info.array.level && geo->level >= 0 &&
@@ -11657,9 +11703,10 @@ enum imsm_reshape_type imsm_analyze_change(struct supertype *st,
}
/* check the maximum available size
*/
- rv = imsm_get_free_size(st, dev->vol.map->num_members,
- 0, chunk, &free_size);
- if (rv == 0)
+ rv = imsm_get_free_size(super, dev->vol.map->num_members,
+ 0, chunk, &free_size);
+
+ if (rv != IMSM_STATUS_OK)
/* Cannot find maximum available space
*/
max_size = 0;
--
2.35.3

View File

@ -0,0 +1,124 @@
From 9a7df595bbe360132cb37c8b39aa1fd9ac24b43f Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Tue, 21 Jun 2022 00:10:43 +0800
Subject: [PATCH 25/61] imsm: block changing slots during creation
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
If user specifies drives for array creation, then slot order across
volumes is not preserved.
Ideally, it should be checked in validate_geometry() but it is not
possible in current implementation (order is determined later).
Add verification in add_to_super_imsm_volume() and throw error if
mismatch is detected.
IMSM allows to use only same members within container.
This is not hardware dependency but metadata limitation.
Therefore, 09-imsm-overlap test is removed. Testing it is pointless.
After this patch, creation in this scenario is blocked. Offset
verification is covered in other tests.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
super-intel.c | 33 ++++++++++++++++++++++-----------
tests/09imsm-overlap | 28 ----------------------------
2 files changed, 22 insertions(+), 39 deletions(-)
delete mode 100644 tests/09imsm-overlap
diff --git a/super-intel.c b/super-intel.c
index deef7c8..8ffe485 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5789,6 +5789,10 @@ static int add_to_super_imsm_volume(struct supertype *st, mdu_disk_info_t *dk,
struct imsm_map *map;
struct dl *dl, *df;
int slot;
+ int autolayout = 0;
+
+ if (!is_fd_valid(fd))
+ autolayout = 1;
dev = get_imsm_dev(super, super->current_vol);
map = get_imsm_map(dev, MAP_0);
@@ -5799,25 +5803,32 @@ static int add_to_super_imsm_volume(struct supertype *st, mdu_disk_info_t *dk,
return 1;
}
- if (!is_fd_valid(fd)) {
- /* we're doing autolayout so grab the pre-marked (in
- * validate_geometry) raid_disk
- */
- for (dl = super->disks; dl; dl = dl->next)
+ for (dl = super->disks; dl ; dl = dl->next) {
+ if (autolayout) {
if (dl->raiddisk == dk->raid_disk)
break;
- } else {
- for (dl = super->disks; dl ; dl = dl->next)
- if (dl->major == dk->major &&
- dl->minor == dk->minor)
- break;
+ } else if (dl->major == dk->major && dl->minor == dk->minor)
+ break;
}
if (!dl) {
- pr_err("%s is not a member of the same container\n", devname);
+ if (!autolayout)
+ pr_err("%s is not a member of the same container.\n",
+ devname);
return 1;
}
+ if (!autolayout && super->current_vol > 0) {
+ int _slot = get_disk_slot_in_dev(super, 0, dl->index);
+
+ if (_slot != dk->raid_disk) {
+ pr_err("Member %s is in %d slot for the first volume, but is in %d slot for a new volume.\n",
+ dl->devname, _slot, dk->raid_disk);
+ pr_err("Raid members are in different order than for the first volume, aborting.\n");
+ return 1;
+ }
+ }
+
if (mpb->num_disks == 0)
if (!get_dev_sector_size(dl->fd, dl->devname,
&super->sector_size))
diff --git a/tests/09imsm-overlap b/tests/09imsm-overlap
deleted file mode 100644
index ff5d209..0000000
--- a/tests/09imsm-overlap
+++ /dev/null
@@ -1,28 +0,0 @@
-
-. tests/env-imsm-template
-
-# create raid arrays with varying degress of overlap
-mdadm -CR $container -e imsm -n 6 $dev0 $dev1 $dev2 $dev3 $dev4 $dev5
-imsm_check container 6
-
-size=1024
-level=1
-num_disks=2
-mdadm -CR $member0 $dev0 $dev1 -n $num_disks -l $level -z $size
-mdadm -CR $member1 $dev1 $dev2 -n $num_disks -l $level -z $size
-mdadm -CR $member2 $dev2 $dev3 -n $num_disks -l $level -z $size
-mdadm -CR $member3 $dev3 $dev4 -n $num_disks -l $level -z $size
-mdadm -CR $member4 $dev4 $dev5 -n $num_disks -l $level -z $size
-
-udevadm settle
-
-offset=0
-imsm_check member $member0 $num_disks $level $size 1024 $offset
-offset=$((offset+size+4096))
-imsm_check member $member1 $num_disks $level $size 1024 $offset
-offset=$((offset+size+4096))
-imsm_check member $member2 $num_disks $level $size 1024 $offset
-offset=$((offset+size+4096))
-imsm_check member $member3 $num_disks $level $size 1024 $offset
-offset=$((offset+size+4096))
-imsm_check member $member4 $num_disks $level $size 1024 $offset
--
2.35.3

View File

@ -1,121 +0,0 @@
From 05501181f18cdccdb0b3cec1d8cf59f0995504d7 Mon Sep 17 00:00:00 2001
From: Pawel Baldysiak <pawel.baldysiak@intel.com>
Date: Fri, 8 Mar 2019 12:19:11 +0100
Subject: [PATCH] imsm: fix spare activation for old matrix arrays
Git-commit: 05501181f18cdccdb0b3cec1d8cf59f0995504d7
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
During spare activation get_extents() calculates metadata reserved space based
on smallest active RAID member or it will take the defaults. Since patch
611d9529("imsm: change reserved space to 4MB") default is extended. If array
was created prior that patch, reserved space is smaller. In case of matrix
RAID - spare is activated in each array one-by-one, so it is spare for first
activation, but treated as "active" during second one.
In case of adding spare drive to old matrix RAID with the size the same as
already existing member drive the routine will take the defaults during second
run and mdmon will refuse to rebuild second volume, claiming that the drive
does not have enough free space.
Add parameter to get_extents(), so the during spare activation reserved space
is always based on smallest active drive - even if given drive is already
active in some other array of matrix RAID.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index c399433..5a7c9f8 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -1313,7 +1313,8 @@ static unsigned long long per_dev_array_size(struct imsm_map *map)
return array_size;
}
-static struct extent *get_extents(struct intel_super *super, struct dl *dl)
+static struct extent *get_extents(struct intel_super *super, struct dl *dl,
+ int get_minimal_reservation)
{
/* find a list of used extents on the given physical device */
struct extent *rv, *e;
@@ -1325,7 +1326,7 @@ static struct extent *get_extents(struct intel_super *super, struct dl *dl)
* regardless of whether the OROM has assigned sectors from the
* IMSM_RESERVED_SECTORS region
*/
- if (dl->index == -1)
+ if (dl->index == -1 || get_minimal_reservation)
reservation = imsm_min_reserved_sectors(super);
else
reservation = MPB_SECTOR_CNT + IMSM_RESERVED_SECTORS;
@@ -1386,7 +1387,7 @@ static __u32 imsm_reserved_sectors(struct intel_super *super, struct dl *dl)
if (dl->index == -1)
return MPB_SECTOR_CNT;
- e = get_extents(super, dl);
+ e = get_extents(super, dl, 0);
if (!e)
return MPB_SECTOR_CNT + IMSM_RESERVED_SECTORS;
@@ -1478,7 +1479,7 @@ static __u32 imsm_min_reserved_sectors(struct intel_super *super)
return rv;
/* find last lba used by subarrays on the smallest active disk */
- e = get_extents(super, dl_min);
+ e = get_extents(super, dl_min, 0);
if (!e)
return rv;
for (i = 0; e[i].size; i++)
@@ -1519,7 +1520,7 @@ int get_spare_criteria_imsm(struct supertype *st, struct spare_criteria *c)
if (!dl)
return -EINVAL;
/* find last lba used by subarrays */
- e = get_extents(super, dl);
+ e = get_extents(super, dl, 0);
if (!e)
return -EINVAL;
for (i = 0; e[i].size; i++)
@@ -7203,7 +7204,7 @@ static int validate_geometry_imsm_volume(struct supertype *st, int level,
pos = 0;
i = 0;
- e = get_extents(super, dl);
+ e = get_extents(super, dl, 0);
if (!e) continue;
do {
unsigned long long esize;
@@ -7261,7 +7262,7 @@ static int validate_geometry_imsm_volume(struct supertype *st, int level,
}
/* retrieve the largest free space block */
- e = get_extents(super, dl);
+ e = get_extents(super, dl, 0);
maxsize = 0;
i = 0;
if (e) {
@@ -7359,7 +7360,7 @@ static int imsm_get_free_size(struct supertype *st, int raiddisks,
if (super->orom && dl->index < 0 && mpb->num_raid_devs)
continue;
- e = get_extents(super, dl);
+ e = get_extents(super, dl, 0);
if (!e)
continue;
for (i = 1; e[i-1].size; i++)
@@ -8846,7 +8847,7 @@ static struct dl *imsm_add_spare(struct intel_super *super, int slot,
/* Does this unused device have the requisite free space?
* It needs to be able to cover all member volumes
*/
- ex = get_extents(super, dl);
+ ex = get_extents(super, dl, 1);
if (!ex) {
dprintf("cannot get extents\n");
continue;
--
2.25.0

View File

@ -1,99 +0,0 @@
From 22dc741f63e6403d59c2c14f56fd4791265f9bbb Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Mon, 1 Apr 2019 16:53:41 +0200
Subject: [PATCH] Create: Block rounding size to max
Git-commit: 22dc741f63e6403d59c2c14f56fd4791265f9bbb
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
When passed size is smaller than chunk, mdadm rounds it to 0 but 0 there
means max available space.
Block it for every metadata. Remove the same check from imsm routine.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Create.c | 23 ++++++++++++++++++++---
super-intel.c | 5 ++---
2 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/Create.c b/Create.c
index 6f1b228..292f92a 100644
--- a/Create.c
+++ b/Create.c
@@ -27,6 +27,18 @@
#include "md_p.h"
#include <ctype.h>
+static int round_size_and_verify(unsigned long long *size, int chunk)
+{
+ if (*size == 0)
+ return 0;
+ *size &= ~(unsigned long long)(chunk - 1);
+ if (*size == 0) {
+ pr_err("Size cannot be smaller than chunk.\n");
+ return 1;
+ }
+ return 0;
+}
+
static int default_layout(struct supertype *st, int level, int verbose)
{
int layout = UnSet;
@@ -248,11 +260,14 @@ int Create(struct supertype *st, char *mddev,
pr_err("unknown level %d\n", s->level);
return 1;
}
+
if (s->size == MAX_SIZE)
/* use '0' to mean 'max' now... */
s->size = 0;
if (s->size && s->chunk && s->chunk != UnSet)
- s->size &= ~(unsigned long long)(s->chunk - 1);
+ if (round_size_and_verify(&s->size, s->chunk))
+ return 1;
+
newsize = s->size * 2;
if (st && ! st->ss->validate_geometry(st, s->level, s->layout, s->raiddisks,
&s->chunk, s->size*2,
@@ -267,7 +282,8 @@ int Create(struct supertype *st, char *mddev,
/* default chunk was just set */
if (c->verbose > 0)
pr_err("chunk size defaults to %dK\n", s->chunk);
- s->size &= ~(unsigned long long)(s->chunk - 1);
+ if (round_size_and_verify(&s->size, s->chunk))
+ return 1;
do_default_chunk = 0;
}
}
@@ -413,7 +429,8 @@ int Create(struct supertype *st, char *mddev,
/* default chunk was just set */
if (c->verbose > 0)
pr_err("chunk size defaults to %dK\n", s->chunk);
- s->size &= ~(unsigned long long)(s->chunk - 1);
+ if (round_size_and_verify(&s->size, s->chunk))
+ return 1;
do_default_chunk = 0;
}
}
diff --git a/super-intel.c b/super-intel.c
index 5a7c9f8..2ba045a 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -7455,9 +7455,8 @@ static int validate_geometry_imsm(struct supertype *st, int level, int layout,
verbose);
}
- if (size && ((size < 1024) || (*chunk != UnSet &&
- size < (unsigned long long) *chunk))) {
- pr_err("Given size must be greater than 1M and chunk size.\n");
+ if (size && (size < 1024)) {
+ pr_err("Given size must be greater than 1M.\n");
/* Depends on algorithm in Create.c :
* if container was given (dev == NULL) return -1,
* if block device was given ( dev != NULL) return 0.
--
2.25.0

View File

@ -0,0 +1,180 @@
From 70f1ff4291b0388adca1f4c91918ce1175e8b360 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Wed, 15 Jun 2022 14:28:39 +0200
Subject: [PATCH 26/61] mdadm: block update=ppl for non raid456 levels
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Option ppl should be used only for raid levels 4, 5 and 6. Cancel update
for other levels.
Applied globally for imsm and ddf format.
Additionally introduce is_level456() helper function.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 11 +++++------
Grow.c | 2 +-
Manage.c | 14 ++++++++++++--
mdadm.h | 11 +++++++++++
super0.c | 2 +-
super1.c | 3 +--
6 files changed, 31 insertions(+), 12 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 4b21356..6df6bfb 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -906,8 +906,7 @@ static int force_array(struct mdinfo *content,
* devices in RAID4 or last devices in RAID4/5/6.
*/
delta = devices[j].i.delta_disks;
- if (devices[j].i.array.level >= 4 &&
- devices[j].i.array.level <= 6 &&
+ if (is_level456(devices[j].i.array.level) &&
i/2 >= content->array.raid_disks - delta)
/* OK */;
else if (devices[j].i.array.level == 4 &&
@@ -1226,8 +1225,7 @@ static int start_array(int mdfd,
fprintf(stderr, ".\n");
}
if (content->reshape_active &&
- content->array.level >= 4 &&
- content->array.level <= 6) {
+ is_level456(content->array.level)) {
/* might need to increase the size
* of the stripe cache - default is 256
*/
@@ -1974,7 +1972,8 @@ int assemble_container_content(struct supertype *st, int mdfd,
int start_reshape;
char *avail;
int err;
- int is_raid456, is_clean, all_disks;
+ int is_clean, all_disks;
+ bool is_raid456;
if (sysfs_init(content, mdfd, NULL)) {
pr_err("Unable to initialize sysfs\n");
@@ -2107,7 +2106,7 @@ int assemble_container_content(struct supertype *st, int mdfd,
content->array.state |= 1;
}
- is_raid456 = (content->array.level >= 4 && content->array.level <= 6);
+ is_raid456 = is_level456(content->array.level);
is_clean = content->array.state & 1;
if (enough(content->array.level, content->array.raid_disks,
diff --git a/Grow.c b/Grow.c
index f6efbc4..8c520d4 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2944,7 +2944,7 @@ static int impose_level(int fd, int level, char *devname, int verbose)
}
md_get_array_info(fd, &array);
- if (level == 0 && (array.level >= 4 && array.level <= 6)) {
+ if (level == 0 && is_level456(array.level)) {
/* To convert to RAID0 we need to fail and
* remove any non-data devices. */
int found = 0;
diff --git a/Manage.c b/Manage.c
index f789e0c..e5e6abe 100644
--- a/Manage.c
+++ b/Manage.c
@@ -307,7 +307,7 @@ int Manage_stop(char *devname, int fd, int verbose, int will_retry)
* - unfreeze reshape
* - wait on 'sync_completed' for that point to be reached.
*/
- if (mdi && (mdi->array.level >= 4 && mdi->array.level <= 6) &&
+ if (mdi && is_level456(mdi->array.level) &&
sysfs_attribute_available(mdi, NULL, "sync_action") &&
sysfs_attribute_available(mdi, NULL, "reshape_direction") &&
sysfs_get_str(mdi, NULL, "sync_action", buf, 20) > 0 &&
@@ -1679,6 +1679,7 @@ int Update_subarray(char *dev, char *subarray, char *update, struct mddev_ident
{
struct supertype supertype, *st = &supertype;
int fd, rv = 2;
+ struct mdinfo *info = NULL;
memset(st, 0, sizeof(*st));
@@ -1696,6 +1697,13 @@ int Update_subarray(char *dev, char *subarray, char *update, struct mddev_ident
if (mdmon_running(st->devnm))
st->update_tail = &st->updates;
+ info = st->ss->container_content(st, subarray);
+
+ if (strncmp(update, "ppl", 3) == 0 && !is_level456(info->array.level)) {
+ pr_err("RWH policy ppl is supported only for raid4, raid5 and raid6.\n");
+ goto free_super;
+ }
+
rv = st->ss->update_subarray(st, subarray, update, ident);
if (rv) {
@@ -1711,7 +1719,9 @@ int Update_subarray(char *dev, char *subarray, char *update, struct mddev_ident
pr_err("Updated subarray-%s name from %s, UUIDs may have changed\n",
subarray, dev);
- free_super:
+free_super:
+ if (info)
+ free(info);
st->ss->free_super(st);
close(fd);
diff --git a/mdadm.h b/mdadm.h
index d53df16..974415b 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -796,6 +796,17 @@ static inline int is_fd_valid(int fd)
return (fd > -1);
}
+/**
+ * is_level456() - check whether given level is between inclusive 4 and 6.
+ * @level: level to check.
+ *
+ * Return: true if condition is met, false otherwise
+ */
+static inline bool is_level456(int level)
+{
+ return (level >= 4 && level <= 6);
+}
+
/**
* close_fd() - verify, close and unset file descriptor.
* @fd: pointer to file descriptor.
diff --git a/super0.c b/super0.c
index 61c9ec1..37f595e 100644
--- a/super0.c
+++ b/super0.c
@@ -683,7 +683,7 @@ static int update_super0(struct supertype *st, struct mdinfo *info,
int parity = sb->level == 6 ? 2 : 1;
rv = 0;
- if (sb->level >= 4 && sb->level <= 6 &&
+ if (is_level456(sb->level) &&
sb->reshape_position % (
sb->new_chunk/512 *
(sb->raid_disks - sb->delta_disks - parity))) {
diff --git a/super1.c b/super1.c
index 3a0c69f..71af860 100644
--- a/super1.c
+++ b/super1.c
@@ -1530,8 +1530,7 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
* So we reject a revert-reshape unless the
* alignment is good.
*/
- if (__le32_to_cpu(sb->level) >= 4 &&
- __le32_to_cpu(sb->level) <= 6) {
+ if (is_level456(__le32_to_cpu(sb->level))) {
reshape_sectors =
__le64_to_cpu(sb->reshape_position);
reshape_chunk = __le32_to_cpu(sb->new_chunk);
--
2.35.3

View File

@ -0,0 +1,33 @@
From 42e02e613fb0b4a2c0c0d984b9e6e2933875bb44 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 22 Jul 2022 08:43:47 +0200
Subject: [PATCH 27/61] mdadm: Fix array size mismatch after grow
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
imsm_fix_size_mismatch() is invoked to fix the problem, but it couldn't
proceed due to migration check. This patch allows for intended behavior.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/super-intel.c b/super-intel.c
index 8ffe485..76b947f 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -11854,7 +11854,7 @@ static int imsm_fix_size_mismatch(struct supertype *st, int subarray_index)
unsigned long long d_size = imsm_dev_size(dev);
int u_size;
- if (calc_size == d_size || dev->vol.migr_type == MIGR_GEN_MIGR)
+ if (calc_size == d_size)
continue;
/* There is a difference, confirm that imsm_dev_size is
--
2.35.3

View File

@ -1,35 +0,0 @@
From 3c9b46cf9ae15a9be98fc47e2080bd9494496246 Mon Sep 17 00:00:00 2001
From: Liwei Song <liwei.song@windriver.com>
Date: Tue, 19 Mar 2019 23:51:05 -0400
Subject: [PATCH] udev: Add udev rules to create by-partuuid for md device
Git-commit: 3c9b46cf9ae15a9be98fc47e2080bd9494496246
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
This rules will create link under /dev/disk/by-partuuid/ for
MD devices partition, with which will support specify
root=PARTUUID=XXX to boot rootfs.
Signed-off-by: Liwei Song <liwei.song@windriver.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
udev-md-raid-arrays.rules | 1 +
1 file changed, 1 insertion(+)
diff --git a/udev-md-raid-arrays.rules b/udev-md-raid-arrays.rules
index c95ec7b..5b99d58 100644
--- a/udev-md-raid-arrays.rules
+++ b/udev-md-raid-arrays.rules
@@ -30,6 +30,7 @@ IMPORT{builtin}="blkid"
OPTIONS+="link_priority=100"
OPTIONS+="watch"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
+ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_PART_ENTRY_UUID}=="?*", SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"
ENV{MD_LEVEL}=="raid[1-9]*", ENV{SYSTEMD_WANTS}+="mdmonitor.service"
--
2.25.0

View File

@ -0,0 +1,37 @@
From 751757620afb25a4c02746bf8368a7b5f22352ec Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Fri, 22 Jul 2022 08:43:48 +0200
Subject: [PATCH 28/61] mdadm: Remove dead code in imsm_fix_size_mismatch
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
imsm_create_metadata_update_for_size_change() that returns u_size value
could return 0 in the past. As its behavior changed, and returned value
is always the size of imsm_update_size_change structure, check for
u_size is no longer needed.
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 76b947f..4ddfcf9 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -11869,10 +11869,6 @@ static int imsm_fix_size_mismatch(struct supertype *st, int subarray_index)
geo.size = d_size;
u_size = imsm_create_metadata_update_for_size_change(st, &geo,
&update);
- if (u_size < 1) {
- dprintf("imsm: Cannot prepare size change update\n");
- goto exit;
- }
imsm_update_metadata_locally(st, update, u_size);
if (st->update_tail) {
append_metadata_update(st, update, u_size);
--
2.35.3

View File

@ -1,114 +0,0 @@
From ae7d61e35ec2ab6361c3e509a8db00698ef3396f Mon Sep 17 00:00:00 2001
From: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Date: Tue, 7 May 2019 16:08:47 +0200
Subject: [PATCH] mdmon: fix wrong array state when disk fails during mdmon
startup
Git-commit: ae7d61e35ec2ab6361c3e509a8db00698ef3396f
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
If a member drive disappears and is set faulty by the kernel during
mdmon startup, after ss->load_container() but before manage_new(), mdmon
will try to readd the faulty drive to the array and start rebuilding.
Metadata on the active drive is updated, but the faulty drive is not
removed from the array and is left in a "blocked" state and any write
request to the array will block. If the faulty drive reappears in the
system e.g. after a reboot, the array will not assemble because metadata
on the drives will be incompatible (at least on imsm).
Fix this by adding a new option for sysfs_read(): "GET_DEVS_ALL". This
is an extension for the "GET_DEVS" option and causes all member devices
to be returned, even if the associated block device has been removed.
Use this option in manage_new() to include the faulty device on the
active_array's devices list. Mdmon will then properly remove the faulty
device from the array and update the metadata to reflect the degraded
state.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
managemon.c | 2 +-
mdadm.h | 1 +
super-intel.c | 2 +-
sysfs.c | 23 ++++++++++++++---------
4 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/managemon.c b/managemon.c
index 29b91ba..200cf83 100644
--- a/managemon.c
+++ b/managemon.c
@@ -678,7 +678,7 @@ static void manage_new(struct mdstat_ent *mdstat,
mdi = sysfs_read(-1, mdstat->devnm,
GET_LEVEL|GET_CHUNK|GET_DISKS|GET_COMPONENT|
GET_SAFEMODE|GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
- GET_LAYOUT);
+ GET_LAYOUT|GET_DEVS_ALL);
if (!mdi)
return;
diff --git a/mdadm.h b/mdadm.h
index 705bd9b..427cc52 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -647,6 +647,7 @@ enum sysfs_read_flags {
GET_ERROR = (1 << 24),
GET_ARRAY_STATE = (1 << 25),
GET_CONSISTENCY_POLICY = (1 << 26),
+ GET_DEVS_ALL = (1 << 27),
};
/* If fd >= 0, get the array it is open on,
diff --git a/super-intel.c b/super-intel.c
index 2ba045a..4fd5e84 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8560,7 +8560,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
disk = get_imsm_disk(super, ord_to_idx(ord));
/* check for new failures */
- if (state & DS_FAULTY) {
+ if (disk && (state & DS_FAULTY)) {
if (mark_failure(super, dev, disk, ord_to_idx(ord)))
super->updates_pending++;
}
diff --git a/sysfs.c b/sysfs.c
index df6fdda..2dd9ab6 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -313,17 +313,22 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
/* assume this is a stale reference to a hot
* removed device
*/
- free(dev);
- continue;
+ if (!(options & GET_DEVS_ALL)) {
+ free(dev);
+ continue;
+ }
+ } else {
+ sscanf(buf, "%d:%d", &dev->disk.major, &dev->disk.minor);
}
- sscanf(buf, "%d:%d", &dev->disk.major, &dev->disk.minor);
- /* special case check for block devices that can go 'offline' */
- strcpy(dbase, "block/device/state");
- if (load_sys(fname, buf, sizeof(buf)) == 0 &&
- strncmp(buf, "offline", 7) == 0) {
- free(dev);
- continue;
+ if (!(options & GET_DEVS_ALL)) {
+ /* special case check for block devices that can go 'offline' */
+ strcpy(dbase, "block/device/state");
+ if (load_sys(fname, buf, sizeof(buf)) == 0 &&
+ strncmp(buf, "offline", 7) == 0) {
+ free(dev);
+ continue;
+ }
}
/* finally add this disk to the array */
--
2.25.0

View File

@ -1,216 +0,0 @@
From 4ec389e3f0c1233f5aa2d5b4e63d96e33d2a37f0 Mon Sep 17 00:00:00 2001
From: Roman Sobanski <roman.sobanski@intel.com>
Date: Tue, 2 Jul 2019 13:29:27 +0200
Subject: [PATCH] Enable probe_roms to scan more than 6 roms.
Git-commit: 4ec389e3f0c1233f5aa2d5b4e63d96e33d2a37f0
Patch-mainline: mdadm-4.1+
References: bsc#1156040
In some cases if more than 6 oroms exist, resource for particular
controller may not be found. Change method for storing
adapter_rom_resources from array to list.
Signed-off-by: Roman Sobanski <roman.sobanski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Acked-by: Coly Li <colyli@suse.de>
---
probe_roms.c | 98 ++++++++++++++++++++++++++++++++++--------------------------
1 file changed, 56 insertions(+), 42 deletions(-)
diff --git a/probe_roms.c b/probe_roms.c
index b0b0883..7ea04c7 100644
--- a/probe_roms.c
+++ b/probe_roms.c
@@ -35,6 +35,9 @@ static const int rom_len = 0xf0000 - 0xc0000; /* option-rom memory region */
static int _sigbus;
static unsigned long rom_align;
+static void roms_deinit(void);
+static int roms_init(void);
+
static void sigbus(int sig)
{
_sigbus = 1;
@@ -75,6 +78,7 @@ void probe_roms_exit(void)
munmap(rom_mem, rom_len);
rom_mem = MAP_FAILED;
}
+ roms_deinit();
}
int probe_roms_init(unsigned long align)
@@ -91,6 +95,9 @@ int probe_roms_init(unsigned long align)
else
return -1;
+ if (roms_init())
+ return -1;
+
if (signal(SIGBUS, sigbus) == SIG_ERR)
rc = -1;
if (rc == 0) {
@@ -131,6 +138,7 @@ struct resource {
unsigned long end;
unsigned long data;
const char *name;
+ struct resource *next;
};
static struct resource system_rom_resource = {
@@ -147,37 +155,7 @@ static struct resource extension_rom_resource = {
.end = 0xeffff,
};
-static struct resource adapter_rom_resources[] = { {
- .name = "Adapter ROM",
- .start = 0xc8000,
- .data = 0,
- .end = 0,
-}, {
- .name = "Adapter ROM",
- .start = 0,
- .data = 0,
- .end = 0,
-}, {
- .name = "Adapter ROM",
- .start = 0,
- .data = 0,
- .end = 0,
-}, {
- .name = "Adapter ROM",
- .start = 0,
- .data = 0,
- .end = 0,
-}, {
- .name = "Adapter ROM",
- .start = 0,
- .data = 0,
- .end = 0,
-}, {
- .name = "Adapter ROM",
- .start = 0,
- .data = 0,
- .end = 0,
-} };
+static struct resource *adapter_rom_resources;
static struct resource video_rom_resource = {
.name = "Video ROM",
@@ -186,8 +164,35 @@ static struct resource video_rom_resource = {
.end = 0xc7fff,
};
+static int roms_init(void)
+{
+ adapter_rom_resources = malloc(sizeof(struct resource));
+ if (adapter_rom_resources == NULL)
+ return 1;
+ adapter_rom_resources->name = "Adapter ROM";
+ adapter_rom_resources->start = 0xc8000;
+ adapter_rom_resources->data = 0;
+ adapter_rom_resources->end = 0;
+ adapter_rom_resources->next = NULL;
+ return 0;
+}
+
+static void roms_deinit(void)
+{
+ struct resource *res;
+
+ res = adapter_rom_resources;
+ while (res) {
+ struct resource *tmp = res;
+
+ res = res->next;
+ free(tmp);
+ }
+}
+
#define ROMSIGNATURE 0xaa55
+
static int romsignature(const unsigned char *rom)
{
const unsigned short * const ptr = (const unsigned short *)rom;
@@ -208,16 +213,14 @@ static int romchecksum(const unsigned char *rom, unsigned long length)
int scan_adapter_roms(scan_fn fn)
{
/* let scan_fn examing each of the adapter roms found by probe_roms */
- unsigned int i;
+ struct resource *res = adapter_rom_resources;
int found;
if (rom_fd < 0)
return 0;
found = 0;
- for (i = 0; i < ARRAY_SIZE(adapter_rom_resources); i++) {
- struct resource *res = &adapter_rom_resources[i];
-
+ while (res) {
if (res->start) {
found = fn(isa_bus_to_virt(res->start),
isa_bus_to_virt(res->end),
@@ -226,6 +229,7 @@ int scan_adapter_roms(scan_fn fn)
break;
} else
break;
+ res = res->next;
}
return found;
@@ -241,14 +245,14 @@ void probe_roms(void)
const void *rom;
unsigned long start, length, upper;
unsigned char c;
- unsigned int i;
+ struct resource *res = adapter_rom_resources;
__u16 val=0;
if (rom_fd < 0)
return;
/* video rom */
- upper = adapter_rom_resources[0].start;
+ upper = res->start;
for (start = video_rom_resource.start; start < upper; start += rom_align) {
rom = isa_bus_to_virt(start);
if (!romsignature(rom))
@@ -283,8 +287,9 @@ void probe_roms(void)
upper = extension_rom_resource.start;
}
+ struct resource *prev_res = res;
/* check for adapter roms on 2k boundaries */
- for (i = 0; i < ARRAY_SIZE(adapter_rom_resources) && start < upper; start += rom_align) {
+ for (; start < upper; start += rom_align) {
rom = isa_bus_to_virt(start);
if (!romsignature(rom))
continue;
@@ -308,10 +313,19 @@ void probe_roms(void)
if (!length || start + length > upper || !romchecksum(rom, length))
continue;
- adapter_rom_resources[i].start = start;
- adapter_rom_resources[i].data = start + (unsigned long) val;
- adapter_rom_resources[i].end = start + length - 1;
+ if (res == NULL) {
+ res = calloc(1, sizeof(struct resource));
+ if (res == NULL)
+ return;
+ prev_res->next = res;
+ }
+
+ res->start = start;
+ res->data = start + (unsigned long)val;
+ res->end = start + length - 1;
- start = adapter_rom_resources[i++].end & ~(rom_align - 1);
+ start = res->end & ~(rom_align - 1);
+ prev_res = res;
+ res = res->next;
}
}
--
2.16.4

View File

@ -0,0 +1,43 @@
From c8d1c398505b62d9129a4e711f17e4469f4327ff Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Thu, 14 Jul 2022 09:02:10 +0200
Subject: [PATCH 29/61] Monitor: use devname as char array instead of pointer
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Device name wasn't filled properly due to incorrect use of strcpy.
Strcpy was used twice. Firstly to fill devname with "/dev/md/"
and then to add chosen name. First strcpy result was overwritten by
second one (as a result <device_name> instead of "/dev/md/<device_name>"
was assigned). This commit changes this implementation to use snprintf
and devname with fixed size.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Monitor.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/Monitor.c b/Monitor.c
index 6ca1ebe..a5b11ae 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -190,9 +190,11 @@ int Monitor(struct mddev_dev *devlist,
if (mdlist->devname[0] == '/')
st->devname = xstrdup(mdlist->devname);
else {
- st->devname = xmalloc(8+strlen(mdlist->devname)+1);
- strcpy(strcpy(st->devname, "/dev/md/"),
- mdlist->devname);
+ /* length of "/dev/md/" + device name + terminating byte */
+ size_t _len = sizeof("/dev/md/") + strnlen(mdlist->devname, PATH_MAX);
+
+ st->devname = xcalloc(_len, sizeof(char));
+ snprintf(st->devname, _len, "/dev/md/%s", mdlist->devname);
}
if (!is_mddev(mdlist->devname))
return 1;
--
2.35.3

View File

@ -0,0 +1,136 @@
From 84d969be8f6d8a345b75f558fad26e4f62a558f6 Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Thu, 14 Jul 2022 09:02:11 +0200
Subject: [PATCH 30/61] Monitor: use snprintf to fill device name
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Safe string functions are propagated in Monitor.c.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Monitor.c | 37 ++++++++++++++-----------------------
1 file changed, 14 insertions(+), 23 deletions(-)
diff --git a/Monitor.c b/Monitor.c
index a5b11ae..93f36ac 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -33,8 +33,8 @@
#endif
struct state {
- char *devname;
- char devnm[32]; /* to sync with mdstat info */
+ char devname[MD_NAME_MAX + sizeof("/dev/md/")]; /* length of "/dev/md/" + device name + terminating byte*/
+ char devnm[MD_NAME_MAX]; /* to sync with mdstat info */
unsigned int utime;
int err;
char *spare_group;
@@ -45,9 +45,9 @@ struct state {
int devstate[MAX_DISKS];
dev_t devid[MAX_DISKS];
int percent;
- char parent_devnm[32]; /* For subarray, devnm of parent.
- * For others, ""
- */
+ char parent_devnm[MD_NAME_MAX]; /* For subarray, devnm of parent.
+ * For others, ""
+ */
struct supertype *metadata;
struct state *subarray;/* for a container it is a link to first subarray
* for a subarray it is a link to next subarray
@@ -187,15 +187,8 @@ int Monitor(struct mddev_dev *devlist,
continue;
st = xcalloc(1, sizeof *st);
- if (mdlist->devname[0] == '/')
- st->devname = xstrdup(mdlist->devname);
- else {
- /* length of "/dev/md/" + device name + terminating byte */
- size_t _len = sizeof("/dev/md/") + strnlen(mdlist->devname, PATH_MAX);
-
- st->devname = xcalloc(_len, sizeof(char));
- snprintf(st->devname, _len, "/dev/md/%s", mdlist->devname);
- }
+ snprintf(st->devname, MD_NAME_MAX + sizeof("/dev/md/"),
+ "/dev/md/%s", basename(mdlist->devname));
if (!is_mddev(mdlist->devname))
return 1;
st->next = statelist;
@@ -218,7 +211,7 @@ int Monitor(struct mddev_dev *devlist,
st = xcalloc(1, sizeof *st);
mdlist = conf_get_ident(dv->devname);
- st->devname = xstrdup(dv->devname);
+ snprintf(st->devname, MD_NAME_MAX + sizeof("/dev/md/"), "%s", dv->devname);
st->next = statelist;
st->devnm[0] = 0;
st->percent = RESYNC_UNKNOWN;
@@ -301,7 +294,6 @@ int Monitor(struct mddev_dev *devlist,
for (stp = &statelist; (st = *stp) != NULL; ) {
if (st->from_auto && st->err > 5) {
*stp = st->next;
- free(st->devname);
free(st->spare_group);
free(st);
} else
@@ -554,7 +546,7 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,
goto disappeared;
if (st->devnm[0] == 0)
- strcpy(st->devnm, fd2devnm(fd));
+ snprintf(st->devnm, MD_NAME_MAX, "%s", fd2devnm(fd));
for (mse2 = mdstat; mse2; mse2 = mse2->next)
if (strcmp(mse2->devnm, st->devnm) == 0) {
@@ -684,7 +676,7 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,
strncmp(mse->metadata_version, "external:", 9) == 0 &&
is_subarray(mse->metadata_version+9)) {
char *sl;
- strcpy(st->parent_devnm, mse->metadata_version + 10);
+ snprintf(st->parent_devnm, MD_NAME_MAX, "%s", mse->metadata_version + 10);
sl = strchr(st->parent_devnm, '/');
if (sl)
*sl = 0;
@@ -772,14 +764,13 @@ static int add_new_arrays(struct mdstat_ent *mdstat, struct state **statelist,
continue;
}
- st->devname = xstrdup(name);
+ snprintf(st->devname, MD_NAME_MAX + sizeof("/dev/md/"), "%s", name);
if ((fd = open(st->devname, O_RDONLY)) < 0 ||
md_get_array_info(fd, &array) < 0) {
/* no such array */
if (fd >= 0)
close(fd);
put_md_name(st->devname);
- free(st->devname);
if (st->metadata) {
st->metadata->ss->free_super(st->metadata);
free(st->metadata);
@@ -791,7 +782,7 @@ static int add_new_arrays(struct mdstat_ent *mdstat, struct state **statelist,
st->next = *statelist;
st->err = 1;
st->from_auto = 1;
- strcpy(st->devnm, mse->devnm);
+ snprintf(st->devnm, MD_NAME_MAX, "%s", mse->devnm);
st->percent = RESYNC_UNKNOWN;
st->expected_spares = -1;
if (mse->metadata_version &&
@@ -799,8 +790,8 @@ static int add_new_arrays(struct mdstat_ent *mdstat, struct state **statelist,
"external:", 9) == 0 &&
is_subarray(mse->metadata_version+9)) {
char *sl;
- strcpy(st->parent_devnm,
- mse->metadata_version+10);
+ snprintf(st->parent_devnm, MD_NAME_MAX,
+ "%s", mse->metadata_version + 10);
sl = strchr(st->parent_devnm, '/');
*sl = 0;
} else
--
2.35.3

View File

@ -1,43 +0,0 @@
From a4f7290c20c2ff78328c9db0b18029165cfb05b2 Mon Sep 17 00:00:00 2001
From: Jes Sorensen <jsorensen@fb.com>
Date: Tue, 9 Jul 2019 13:26:08 -0400
Subject: [PATCH] super-intel: Fix issue with abs() being irrelevant
Git-commit: a4f7290c20c2ff78328c9db0b18029165cfb05b2
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
gcc9 complains about subtracting unsigned from unsigned and code
assuming the result can be negative.
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 4fd5e84..230e164 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -2875,7 +2875,7 @@ static unsigned long long calc_component_size(struct imsm_map *map,
{
unsigned long long component_size;
unsigned long long dev_size = imsm_dev_size(dev);
- unsigned long long calc_dev_size = 0;
+ long long calc_dev_size = 0;
unsigned int member_disks = imsm_num_data_members(map);
if (member_disks == 0)
@@ -2889,7 +2889,7 @@ static unsigned long long calc_component_size(struct imsm_map *map,
* 2048 blocks per each device. If the difference is higher it means
* that array size was expanded and num_data_stripes was not updated.
*/
- if ((unsigned int)abs(calc_dev_size - dev_size) >
+ if (llabs(calc_dev_size - (long long)dev_size) >
(1 << SECT_PER_MB_SHIFT) * member_disks) {
component_size = dev_size / member_disks;
dprintf("Invalid num_data_stripes in metadata; expected=%llu, found=%llu\n",
--
2.25.0

View File

@ -0,0 +1,45 @@
From 14ae4c37bce9a53da08d59d6c2d7e0946e9c9f47 Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:06 -0600
Subject: [PATCH 31/61] Makefile: Don't build static build with everything and
everything-test
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Running the test suite requires building everything, but it seems to be
difficult to build the static version of mdadm now seeing there
is no readily available static udev library.
The test suite doesn't need the static binary so just don't build it
with the everything or everything-test targets.
Leave the mdadm.static and install-static targets in place in case
someone still has a use case for the static binary.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index bf12603..ec1f99e 100644
--- a/Makefile
+++ b/Makefile
@@ -182,9 +182,9 @@ check_rundir:
echo "***** or set CHECK_RUN_DIR=0"; exit 1; \
fi
-everything: all mdadm.static swap_super test_stripe raid6check \
+everything: all swap_super test_stripe raid6check \
mdadm.Os mdadm.O2 man
-everything-test: all mdadm.static swap_super test_stripe \
+everything-test: all swap_super test_stripe \
mdadm.Os mdadm.O2 man
# mdadm.uclibc doesn't work on x86-64
# mdadm.tcc doesn't work..
--
2.35.3

View File

@ -1,61 +0,0 @@
From 7039d1f8200b9599b23db5953934fdb43b0442e0 Mon Sep 17 00:00:00 2001
From: Jes Sorensen <jsorensen@fb.com>
Date: Tue, 9 Jul 2019 14:15:38 -0400
Subject: [PATCH] mdadm.h: Introduced unaligned {get,put}_unaligned{16,32}()
Git-commit: 7039d1f8200b9599b23db5953934fdb43b0442e0
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
We need these to avoid gcc9 going all crazy on us.
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdadm.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/mdadm.h b/mdadm.h
index 427cc52..0fa9e1b 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -191,6 +191,36 @@ struct dlm_lksb {
#endif
#endif /* __KLIBC__ */
+/*
+ * Partially stolen from include/linux/unaligned/packed_struct.h
+ */
+struct __una_u16 { __u16 x; } __attribute__ ((packed));
+struct __una_u32 { __u32 x; } __attribute__ ((packed));
+
+static inline __u16 __get_unaligned16(const void *p)
+{
+ const struct __una_u16 *ptr = (const struct __una_u16 *)p;
+ return ptr->x;
+}
+
+static inline __u32 __get_unaligned32(const void *p)
+{
+ const struct __una_u32 *ptr = (const struct __una_u32 *)p;
+ return ptr->x;
+}
+
+static inline void __put_unaligned16(__u16 val, void *p)
+{
+ struct __una_u16 *ptr = (struct __una_u16 *)p;
+ ptr->x = val;
+}
+
+static inline void __put_unaligned32(__u32 val, void *p)
+{
+ struct __una_u32 *ptr = (struct __una_u32 *)p;
+ ptr->x = val;
+}
+
/*
* Check at compile time that something is of a particular type.
* Always evaluates to 1 so you may use it easily in comparisons.
--
2.25.0

View File

@ -0,0 +1,144 @@
From 679bd9508a30b2a0a1baecc9a21dd6c7d8d8d7dc Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:07 -0600
Subject: [PATCH 32/61] DDF: Cleanup validate_geometry_ddf_container()
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Move the function up so that the function declaration is not necessary
and remove the unused arguments to the function.
No functional changes are intended but will help with a bug fix in the
next patch.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-ddf.c | 88 ++++++++++++++++++++++++-----------------------------
1 file changed, 39 insertions(+), 49 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index abbc8b0..9d867f6 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -503,13 +503,6 @@ struct ddf_super {
static int load_super_ddf_all(struct supertype *st, int fd,
void **sbp, char *devname);
static int get_svd_state(const struct ddf_super *, const struct vcl *);
-static int
-validate_geometry_ddf_container(struct supertype *st,
- int level, int layout, int raiddisks,
- int chunk, unsigned long long size,
- unsigned long long data_offset,
- char *dev, unsigned long long *freesize,
- int verbose);
static int validate_geometry_ddf_bvd(struct supertype *st,
int level, int layout, int raiddisks,
@@ -3322,6 +3315,42 @@ static int reserve_space(struct supertype *st, int raiddisks,
return 1;
}
+static int
+validate_geometry_ddf_container(struct supertype *st,
+ int level, int raiddisks,
+ unsigned long long data_offset,
+ char *dev, unsigned long long *freesize,
+ int verbose)
+{
+ int fd;
+ unsigned long long ldsize;
+
+ if (level != LEVEL_CONTAINER)
+ return 0;
+ if (!dev)
+ return 1;
+
+ fd = dev_open(dev, O_RDONLY|O_EXCL);
+ if (fd < 0) {
+ if (verbose)
+ pr_err("ddf: Cannot open %s: %s\n",
+ dev, strerror(errno));
+ return 0;
+ }
+ if (!get_dev_size(fd, dev, &ldsize)) {
+ close(fd);
+ return 0;
+ }
+ close(fd);
+ if (freesize) {
+ *freesize = avail_size_ddf(st, ldsize >> 9, INVALID_SECTORS);
+ if (*freesize == 0)
+ return 0;
+ }
+
+ return 1;
+}
+
static int validate_geometry_ddf(struct supertype *st,
int level, int layout, int raiddisks,
int *chunk, unsigned long long size,
@@ -3347,11 +3376,9 @@ static int validate_geometry_ddf(struct supertype *st,
level = LEVEL_CONTAINER;
if (level == LEVEL_CONTAINER) {
/* Must be a fresh device to add to a container */
- return validate_geometry_ddf_container(st, level, layout,
- raiddisks, *chunk,
- size, data_offset, dev,
- freesize,
- verbose);
+ return validate_geometry_ddf_container(st, level, raiddisks,
+ data_offset, dev,
+ freesize, verbose);
}
if (!dev) {
@@ -3449,43 +3476,6 @@ static int validate_geometry_ddf(struct supertype *st,
return 1;
}
-static int
-validate_geometry_ddf_container(struct supertype *st,
- int level, int layout, int raiddisks,
- int chunk, unsigned long long size,
- unsigned long long data_offset,
- char *dev, unsigned long long *freesize,
- int verbose)
-{
- int fd;
- unsigned long long ldsize;
-
- if (level != LEVEL_CONTAINER)
- return 0;
- if (!dev)
- return 1;
-
- fd = dev_open(dev, O_RDONLY|O_EXCL);
- if (fd < 0) {
- if (verbose)
- pr_err("ddf: Cannot open %s: %s\n",
- dev, strerror(errno));
- return 0;
- }
- if (!get_dev_size(fd, dev, &ldsize)) {
- close(fd);
- return 0;
- }
- close(fd);
- if (freesize) {
- *freesize = avail_size_ddf(st, ldsize >> 9, INVALID_SECTORS);
- if (*freesize == 0)
- return 0;
- }
-
- return 1;
-}
-
static int validate_geometry_ddf_bvd(struct supertype *st,
int level, int layout, int raiddisks,
int *chunk, unsigned long long size,
--
2.35.3

View File

@ -1,43 +0,0 @@
From 486720e0c2418e7e2e0a16221f7c42a308622254 Mon Sep 17 00:00:00 2001
From: Jes Sorensen <jsorensen@fb.com>
Date: Tue, 9 Jul 2019 14:49:22 -0400
Subject: [PATCH] super-intel: Use put_unaligned in split_ull
Git-commit: 486720e0c2418e7e2e0a16221f7c42a308622254
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Shut up some gcc9 errors by using put_unaligned() accessors. Not pretty,
but better than it was.
Also correct to the correct swap macros.
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 230e164..d7e8a65 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -1165,12 +1165,12 @@ static int count_memberships(struct dl *dl, struct intel_super *super)
static __u32 imsm_min_reserved_sectors(struct intel_super *super);
-static int split_ull(unsigned long long n, __u32 *lo, __u32 *hi)
+static int split_ull(unsigned long long n, void *lo, void *hi)
{
if (lo == 0 || hi == 0)
return 1;
- *lo = __le32_to_cpu((unsigned)n);
- *hi = __le32_to_cpu((unsigned)(n >> 32));
+ __put_unaligned32(__cpu_to_le32((__u32)n), lo);
+ __put_unaligned32(__cpu_to_le32((n >> 32)), hi);
return 0;
}
--
2.25.0

View File

@ -0,0 +1,52 @@
From 2b93288a5650bb811932836f67f30d63c5ddcfbd Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:08 -0600
Subject: [PATCH 33/61] DDF: Fix NULL pointer dereference in
validate_geometry_ddf()
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
A relatively recent patch added a call to validate_geometry() in
Manage_add() that has level=LEVEL_CONTAINER and chunk=NULL.
This causes some ddf tests to segfault which aborts the test suite.
To fix this, avoid dereferencing chunk when the level is
LEVEL_CONTAINER or LEVEL_NONE.
Fixes: 1f5d54a06df0 ("Manage: Call validate_geometry when adding drive to external container")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-ddf.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 9d867f6..949e7d1 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3369,9 +3369,6 @@ static int validate_geometry_ddf(struct supertype *st,
* If given BVDs, we make an SVD, changing all the GUIDs in the process.
*/
- if (*chunk == UnSet)
- *chunk = DEFAULT_CHUNK;
-
if (level == LEVEL_NONE)
level = LEVEL_CONTAINER;
if (level == LEVEL_CONTAINER) {
@@ -3381,6 +3378,9 @@ static int validate_geometry_ddf(struct supertype *st,
freesize, verbose);
}
+ if (*chunk == UnSet)
+ *chunk = DEFAULT_CHUNK;
+
if (!dev) {
mdu_array_info_t array = {
.level = level,
--
2.35.3

View File

@ -1,349 +0,0 @@
From b06815989179e0f153e44e4336290e655edce9a1 Mon Sep 17 00:00:00 2001
From: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Date: Wed, 10 Jul 2019 13:38:53 +0200
Subject: [PATCH] mdadm: load default sysfs attributes after assemblation
Git-commit: b06815989179e0f153e44e4336290e655edce9a1
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Added new type of line to mdadm.conf which allows to specify values of
sysfs attributes for MD devices that should be loaded after the array is
assembled. Each line is interpreted as list of structures containing
sysname of MD device (md126 etc.) and list of sysfs attributes and their
values.
Signed-off-by: Mariusz Dabrowski <mariusz.dabrowski@intel.com>
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 12 +++-
Incremental.c | 1 +
config.c | 7 ++-
mdadm.conf.5 | 25 ++++++++
mdadm.h | 3 +
sysfs.c | 158 ++++++++++++++++++++++++++++++++++++++++++++++++++
6 files changed, 202 insertions(+), 4 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 420c7b3..b2e6914 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1063,9 +1063,12 @@ static int start_array(int mdfd,
mddev, okcnt + sparecnt + journalcnt,
okcnt + sparecnt + journalcnt == 1 ? "" : "s");
if (okcnt < (unsigned)content->array.raid_disks)
- fprintf(stderr, " (out of %d)",
+ fprintf(stderr, " (out of %d)\n",
content->array.raid_disks);
- fprintf(stderr, "\n");
+ else {
+ fprintf(stderr, "\n");
+ sysfs_rules_apply(mddev, content);
+ }
}
if (st->ss->validate_container) {
@@ -1139,6 +1142,7 @@ static int start_array(int mdfd,
rv = ioctl(mdfd, RUN_ARRAY, NULL);
reopen_mddev(mdfd); /* drop O_EXCL */
if (rv == 0) {
+ sysfs_rules_apply(mddev, content);
if (c->verbose >= 0) {
pr_err("%s has been started with %d drive%s",
mddev, okcnt, okcnt==1?"":"s");
@@ -2130,10 +2134,12 @@ int assemble_container_content(struct supertype *st, int mdfd,
pr_err("array %s now has %d device%s",
chosen_name, working + preexist,
working + preexist == 1 ? "":"s");
- else
+ else {
+ sysfs_rules_apply(chosen_name, content);
pr_err("Started %s with %d device%s",
chosen_name, working + preexist,
working + preexist == 1 ? "":"s");
+ }
if (preexist)
fprintf(stderr, " (%d new)", working);
if (expansion)
diff --git a/Incremental.c b/Incremental.c
index d4d3c35..98dbcd9 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -480,6 +480,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
pr_err("container %s now has %d device%s\n",
chosen_name, info.array.working_disks,
info.array.working_disks == 1?"":"s");
+ sysfs_rules_apply(chosen_name, &info);
wait_for(chosen_name, mdfd);
if (st->ss->external)
strcpy(devnm, fd2devnm(mdfd));
diff --git a/config.c b/config.c
index e14eae0..7592b2d 100644
--- a/config.c
+++ b/config.c
@@ -80,7 +80,8 @@ char DefaultAltConfFile[] = CONFFILE2;
char DefaultAltConfDir[] = CONFFILE2 ".d";
enum linetype { Devices, Array, Mailaddr, Mailfrom, Program, CreateDev,
- Homehost, HomeCluster, AutoMode, Policy, PartPolicy, LTEnd };
+ Homehost, HomeCluster, AutoMode, Policy, PartPolicy, Sysfs,
+ LTEnd };
char *keywords[] = {
[Devices] = "devices",
[Array] = "array",
@@ -93,6 +94,7 @@ char *keywords[] = {
[AutoMode] = "auto",
[Policy] = "policy",
[PartPolicy]="part-policy",
+ [Sysfs] = "sysfs",
[LTEnd] = NULL
};
@@ -764,6 +766,9 @@ void conf_file(FILE *f)
case PartPolicy:
policyline(line, rule_part);
break;
+ case Sysfs:
+ sysfsline(line);
+ break;
default:
pr_err("Unknown keyword %s\n", line);
}
diff --git a/mdadm.conf.5 b/mdadm.conf.5
index 47c962a..27dbab1 100644
--- a/mdadm.conf.5
+++ b/mdadm.conf.5
@@ -587,6 +587,26 @@ be based on the domain, but with
appended, when N is the partition number for the partition that was
found.
+.TP
+.B SYSFS
+The SYSFS line lists custom values of MD device's sysfs attributes which will be
+stored in sysfs after the array is assembled. Multiple lines are allowed and each
+line has to contain the uuid or the name of the device to which it relates.
+.RS 4
+.TP
+.B uuid=
+hexadecimal identifier of MD device. This has to match the uuid stored in the
+superblock.
+.TP
+.B name=
+name of the MD device as was given to
+.I mdadm
+when the array was created. It will be ignored if
+.B uuid
+is not empty.
+.TP
+.RS 7
+
.SH EXAMPLE
DEVICE /dev/sd[bcdjkl]1
.br
@@ -657,6 +677,11 @@ CREATE group=system mode=0640 auto=part\-8
HOMEHOST <system>
.br
AUTO +1.x homehost \-all
+.br
+SYSFS name=/dev/md/raid5 group_thread_cnt=4 sync_speed_max=1000000
+.br
+SYSFS uuid=bead5eb6:31c17a27:da120ba2:7dfda40d group_thread_cnt=4
+sync_speed_max=1000000
.SH SEE ALSO
.BR mdadm (8),
diff --git a/mdadm.h b/mdadm.h
index 0fa9e1b..c36d7fd 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1322,6 +1322,9 @@ void domain_add(struct domainlist **domp, char *domain);
extern void policy_save_path(char *id_path, struct map_ent *array);
extern int policy_check_path(struct mdinfo *disk, struct map_ent *array);
+extern void sysfs_rules_apply(char *devnm, struct mdinfo *dev);
+extern void sysfsline(char *line);
+
#if __GNUC__ < 3
struct stat64;
#endif
diff --git a/sysfs.c b/sysfs.c
index 2dd9ab6..c313781 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -26,9 +26,22 @@
#include "mdadm.h"
#include <dirent.h>
#include <ctype.h>
+#include "dlink.h"
#define MAX_SYSFS_PATH_LEN 120
+struct dev_sysfs_rule {
+ struct dev_sysfs_rule *next;
+ char *devname;
+ int uuid[4];
+ int uuid_set;
+ struct sysfs_entry {
+ struct sysfs_entry *next;
+ char *name;
+ char *value;
+ } *entry;
+};
+
int load_sys(char *path, char *buf, int len)
{
int fd = open(path, O_RDONLY);
@@ -999,3 +1012,148 @@ int sysfs_wait(int fd, int *msec)
}
return n;
}
+
+int sysfs_rules_apply_check(const struct mdinfo *sra,
+ const struct sysfs_entry *ent)
+{
+ /* Check whether parameter is regular file,
+ * exists and is under specified directory.
+ */
+ char fname[MAX_SYSFS_PATH_LEN];
+ char dname[MAX_SYSFS_PATH_LEN];
+ char resolved_path[PATH_MAX];
+ char resolved_dir[PATH_MAX];
+
+ if (sra == NULL || ent == NULL)
+ return -1;
+
+ snprintf(dname, MAX_SYSFS_PATH_LEN, "/sys/block/%s/md/", sra->sys_name);
+ snprintf(fname, MAX_SYSFS_PATH_LEN, "%s/%s", dname, ent->name);
+
+ if (realpath(fname, resolved_path) == NULL ||
+ realpath(dname, resolved_dir) == NULL)
+ return -1;
+
+ if (strncmp(resolved_dir, resolved_path,
+ strnlen(resolved_dir, PATH_MAX)) != 0)
+ return -1;
+
+ return 0;
+}
+
+static struct dev_sysfs_rule *sysfs_rules;
+
+void sysfs_rules_apply(char *devnm, struct mdinfo *dev)
+{
+ struct dev_sysfs_rule *rules = sysfs_rules;
+
+ while (rules) {
+ struct sysfs_entry *ent = rules->entry;
+ int match = 0;
+
+ if (!rules->uuid_set) {
+ if (rules->devname)
+ match = strcmp(devnm, rules->devname) == 0;
+ } else {
+ match = memcmp(dev->uuid, rules->uuid,
+ sizeof(int[4])) == 0;
+ }
+
+ while (match && ent) {
+ if (sysfs_rules_apply_check(dev, ent) < 0)
+ pr_err("SYSFS: failed to write '%s' to '%s'\n",
+ ent->value, ent->name);
+ else
+ sysfs_set_str(dev, NULL, ent->name, ent->value);
+ ent = ent->next;
+ }
+ rules = rules->next;
+ }
+}
+
+static void sysfs_rule_free(struct dev_sysfs_rule *rule)
+{
+ struct sysfs_entry *entry;
+
+ while (rule) {
+ struct dev_sysfs_rule *tmp = rule->next;
+
+ entry = rule->entry;
+ while (entry) {
+ struct sysfs_entry *tmp = entry->next;
+
+ free(entry->name);
+ free(entry->value);
+ free(entry);
+ entry = tmp;
+ }
+
+ if (rule->devname)
+ free(rule->devname);
+ free(rule);
+ rule = tmp;
+ }
+}
+
+void sysfsline(char *line)
+{
+ struct dev_sysfs_rule *sr;
+ char *w;
+
+ sr = xcalloc(1, sizeof(*sr));
+ for (w = dl_next(line); w != line ; w = dl_next(w)) {
+ if (strncasecmp(w, "name=", 5) == 0) {
+ char *devname = w + 5;
+
+ if (strncmp(devname, "/dev/md/", 8) == 0) {
+ if (sr->devname)
+ pr_err("Only give one device per SYSFS line: %s\n",
+ devname);
+ else
+ sr->devname = xstrdup(devname);
+ } else {
+ pr_err("%s is an invalid name for an md device - ignored.\n",
+ devname);
+ }
+ } else if (strncasecmp(w, "uuid=", 5) == 0) {
+ char *uuid = w + 5;
+
+ if (sr->uuid_set) {
+ pr_err("Only give one uuid per SYSFS line: %s\n",
+ uuid);
+ } else {
+ if (parse_uuid(w + 5, sr->uuid) &&
+ memcmp(sr->uuid, uuid_zero,
+ sizeof(int[4])) != 0)
+ sr->uuid_set = 1;
+ else
+ pr_err("Invalid uuid: %s\n", uuid);
+ }
+ } else {
+ struct sysfs_entry *prop;
+
+ char *sep = strchr(w, '=');
+
+ if (sep == NULL || *(sep + 1) == 0) {
+ pr_err("Cannot parse \"%s\" - ignoring.\n", w);
+ continue;
+ }
+
+ prop = xmalloc(sizeof(*prop));
+ prop->value = xstrdup(sep + 1);
+ *sep = 0;
+ prop->name = xstrdup(w);
+ prop->next = sr->entry;
+ sr->entry = prop;
+ }
+ }
+
+ if (!sr->devname && !sr->uuid_set) {
+ pr_err("Device name not found in sysfs config entry - ignoring.\n");
+ sysfs_rule_free(sr);
+ return;
+ }
+
+ sr->next = sysfs_rules;
+ sysfs_rules = sr;
+}
--
2.25.0

View File

@ -0,0 +1,88 @@
From 548e9b916f86c06e2cdb50d8f49633f9bec66c7e Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:09 -0600
Subject: [PATCH 34/61] mdadm/Grow: Fix use after close bug by closing after
fork
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
The test 07reshape-grow fails most of the time. But it succeeds around
1 in 5 times. When it does succeed, it causes the tests to die because
mdadm has segfaulted.
The segfault was caused by mdadm attempting to repoen a file
descriptor that was already closed. The backtrace of the segfault
was:
#0 __strncmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:101
#1 0x000056146e31d44b in devnm2devid (devnm=0x0) at util.c:956
#2 0x000056146e31dab4 in open_dev_flags (devnm=0x0, flags=0)
at util.c:1072
#3 0x000056146e31db22 in open_dev (devnm=0x0) at util.c:1079
#4 0x000056146e3202e8 in reopen_mddev (mdfd=4) at util.c:2244
#5 0x000056146e329f36 in start_array (mdfd=4,
mddev=0x7ffc55342450 "/dev/md0", content=0x7ffc55342860,
st=0x56146fc78660, ident=0x7ffc55342f70, best=0x56146fc6f5d0,
bestcnt=10, chosen_drive=0, devices=0x56146fc706b0, okcnt=5,
sparecnt=0, rebuilding_cnt=0, journalcnt=0, c=0x7ffc55342e90,
clean=1, avail=0x56146fc78720 "\001\001\001\001\001",
start_partial_ok=0, err_ok=0, was_forced=0)
at Assemble.c:1206
#6 0x000056146e32c36e in Assemble (st=0x56146fc78660,
mddev=0x7ffc55342450 "/dev/md0", ident=0x7ffc55342f70,
devlist=0x56146fc6e2d0, c=0x7ffc55342e90)
at Assemble.c:1914
#7 0x000056146e312ac9 in main (argc=11, argv=0x7ffc55343238)
at mdadm.c:1510
The file descriptor was closed early in Grow_continue(). The noted commit
moved the close() call to close the fd above the fork which caused the
parent process to return with a closed fd.
This meant reshape_array() and Grow_continue() would return in the parent
with the fd forked. The fd would eventually be passed to reopen_mddev()
which returned an unhandled NULL from fd2devnm() which would then be
dereferenced in devnm2devid.
Fix this by moving the close() call below the fork. This appears to
fix the 07revert-grow test. While we're at it, switch to using
close_fd() to invalidate the file descriptor.
Fixes: 77b72fa82813 ("mdadm/Grow: prevent md's fd from being occupied during delayed time")
Cc: Alex Wu <alexwu@synology.com>
Cc: BingJing Chang <bingjingc@synology.com>
Cc: Danny Shih <dannyshih@synology.com>
Cc: ChangSyun Peng <allenpeng@synology.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
Grow.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/Grow.c b/Grow.c
index 8c520d4..97f22c7 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3514,7 +3514,6 @@ started:
return 0;
}
- close(fd);
/* Now we just need to kick off the reshape and watch, while
* handling backups of the data...
* This is all done by a forked background process.
@@ -3535,6 +3534,9 @@ started:
break;
}
+ /* Close unused file descriptor in the forked process */
+ close_fd(&fd);
+
/* If another array on the same devices is busy, the
* reshape will wait for them. This would mean that
* the first section that we suspend will stay suspended
--
2.35.3

View File

@ -1,39 +0,0 @@
From 452dc4d13a012cdcb05088c0dbc699959c4d6c73 Mon Sep 17 00:00:00 2001
From: Baruch Siach <baruch@tkos.co.il>
Date: Tue, 6 Aug 2019 16:05:23 +0300
Subject: [PATCH] mdadm.h: include sysmacros.h unconditionally
Git-commit: 452dc4d13a012cdcb05088c0dbc699959c4d6c73
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
musl libc now also requires sys/sysmacros.h for the major/minor macros.
All supported libc implementations carry sys/sysmacros.h, including
diet-libc, klibc, and uclibc-ng.
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdadm.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/mdadm.h b/mdadm.h
index c36d7fd..d61a9ca 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -45,10 +45,8 @@ extern __off64_t lseek64 __P ((int __fd, __off64_t __offset, int __whence));
#include <errno.h>
#include <string.h>
#include <syslog.h>
-#ifdef __GLIBC__
/* Newer glibc requires sys/sysmacros.h directly for makedev() */
#include <sys/sysmacros.h>
-#endif
#ifdef __dietlibc__
#include <strings.h>
/* dietlibc has deprecated random and srandom!! */
--
2.25.0

View File

@ -1,161 +0,0 @@
From d11abe4bd5cad39803726ddff1888674e417bda5 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli@suse.de>
Date: Wed, 31 Jul 2019 13:29:29 +0800
Subject: [PATCH 1/2] mdadm: add --no-devices to avoid component devices detail
information
Git-commit: d11abe4bd5cad39803726ddff1888674e417bda5
Patch-mainline: mdadm-4.1+
References: bsc#1139709
When people assemble a md raid device with a large number of
component deivces (e.g. 1500 DASD disks), the raid device detail
information generated by 'mdadm --detail --export $devnode' is very
large. It is because the detail information contains information of
all the component disks (even the missing/failed ones).
In such condition, when udev-md-raid-arrays.rules is triggered and
internally calls "mdadm --detail --no-devices --export $devnode",
user may observe systemd error message ""invalid message length". It
is because the following on-stack raw message buffer in systemd code
is not big enough,
systemd/src/libudev/libudev-monitor.c
_public_ struct udev_device *udev_monito ...
struct ucred *cred;
union {
struct udev_monitor_netlink_header nlh;
char raw[8192];
} buf;
Even change size of raw[] from 8KB to larger size, it may still be not
enough for detail message of a md raid device with much larger number of
component devices.
To fix this problem, an extra option '--no-devices' is added (the
original idea is proposed by Neil Brown). When printing detailed
information of a md raid device, if '--no-devices' is specified, then
all component devices information will not be printed, then the output
message size can be restricted to a small number, even with the systemd
only has 8KB on-disk raw buffer, the md raid array udev rules can work
correctly without failure message.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Detail.c | 24 ++++++++++++++++--------
ReadMe.c | 1 +
mdadm.c | 4 ++++
mdadm.h | 2 ++
4 files changed, 23 insertions(+), 8 deletions(-)
Index: mdadm-4.1/Detail.c
===================================================================
--- mdadm-4.1.orig/Detail.c
+++ mdadm-4.1/Detail.c
@@ -56,7 +56,7 @@ int Detail(char *dev, struct context *c)
*/
int fd = open(dev, O_RDONLY);
mdu_array_info_t array;
- mdu_disk_info_t *disks;
+ mdu_disk_info_t *disks = NULL;
int next;
int d;
time_t atime;
@@ -280,7 +280,7 @@ int Detail(char *dev, struct context *c)
}
map_free(map);
}
- if (sra) {
+ if (!c->no_devices && sra) {
struct mdinfo *mdi;
for (mdi = sra->devs; mdi; mdi = mdi->next) {
char *path;
@@ -655,12 +655,17 @@ This is pretty boring
printf("\n\n");
}
- if (array.raid_disks)
- printf(" Number Major Minor RaidDevice State\n");
- else
- printf(" Number Major Minor RaidDevice\n");
+ if (!c->no_devices) {
+ if (array.raid_disks)
+ printf(" Number Major Minor RaidDevice State\n");
+ else
+ printf(" Number Major Minor RaidDevice\n");
+ }
}
- free(info);
+
+ /* if --no_devices specified, not print component devices info */
+ if (c->no_devices)
+ goto skip_devices_state;
for (d = 0; d < max_disks * 2; d++) {
char *dv;
@@ -747,6 +752,8 @@ This is pretty boring
if (!c->brief)
printf("\n");
}
+
+skip_devices_state:
if (spares && c->brief && array.raid_disks)
printf(" spares=%d", spares);
if (c->brief && st && st->sb)
@@ -766,8 +773,9 @@ This is pretty boring
!enough(array.level, array.raid_disks, array.layout, 1, avail))
rv = 2;
- free(disks);
out:
+ free(info);
+ free(disks);
close(fd);
free(subarray);
free(avail);
Index: mdadm-4.1/ReadMe.c
===================================================================
--- mdadm-4.1.orig/ReadMe.c
+++ mdadm-4.1/ReadMe.c
@@ -181,6 +181,7 @@ struct option long_options[] = {
/* For Detail/Examine */
{"brief", 0, 0, Brief},
+ {"no-devices",0, 0, NoDevices},
{"export", 0, 0, 'Y'},
{"sparc2.2", 0, 0, Sparc22},
{"test", 0, 0, 't'},
Index: mdadm-4.1/mdadm.c
===================================================================
--- mdadm-4.1.orig/mdadm.c
+++ mdadm-4.1/mdadm.c
@@ -159,6 +159,10 @@ int main(int argc, char *argv[])
c.brief = 1;
continue;
+ case NoDevices:
+ c.no_devices = 1;
+ continue;
+
case 'Y': c.export++;
continue;
Index: mdadm-4.1/mdadm.h
===================================================================
--- mdadm-4.1.orig/mdadm.h
+++ mdadm-4.1/mdadm.h
@@ -412,6 +412,7 @@ enum special_options {
NoSharing,
HelpOptions,
Brief,
+ NoDevices,
ManageOpt,
Add,
AddSpare,
@@ -522,6 +523,7 @@ struct context {
int runstop;
int verbose;
int brief;
+ int no_devices;
int force;
char *homehost;
int require_homehost;

View File

@ -0,0 +1,39 @@
From 9ae62977b51dab0f4bb46b1c8ea5ebd1705b2f4d Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:10 -0600
Subject: [PATCH 35/61] monitor: Avoid segfault when calling NULL
get_bad_blocks
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Not all struct superswitch implement a get_bad_blocks() function,
yet mdmon seems to call it without checking for NULL and thus
occasionally segfaults in the test 10ddf-geometry.
Fix this by checking for NULL before calling it.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
monitor.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/monitor.c b/monitor.c
index b877e59..820a93d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -311,6 +311,9 @@ static int check_for_cleared_bb(struct active_array *a, struct mdinfo *mdi)
struct md_bb *bb;
int i;
+ if (!ss->get_bad_blocks)
+ return -1;
+
/*
* Get a list of bad blocks for an array, then read list of
* acknowledged bad blocks from kernel and compare it against metadata
--
2.35.3

View File

@ -0,0 +1,81 @@
From 6c9d9260633f2c8491985b0782cf0fbd7e51651b Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:11 -0600
Subject: [PATCH 36/61] mdadm: Fix mdadm -r remove option regression
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
The commit noted below globally adds a parameter to the -r option but missed
the fact that -r is used for another purpose: --remove.
After that commit, a command such as:
mdadm /dev/md0 -r /dev/loop0
will do nothing seeing the device parameter will be consumed as a
argument to the -r option; thus, there will only be one device
seen one the command line, devs_found will only be 1 and nothing will
happen.
This caused the 01r5integ and 01raid6integ tests to hang indefinitely
as mdadm did not remove the failed device. With the device not removed,
it would not be readded. Then the loop waiting for the array status to
change would loop forever.
This commit was recently reverted, but the legitimate fix for the
monitor operations was still not fixed. So add specific monitor
short ops to re-fix the --monitor -r option.
Fixes: 546047688e1c ("mdadm: fix coredump of mdadm --monitor -r")
Fixes: 190dc029b141 ("Revert "mdadm: fix coredump of mdadm --monitor -r"")
Cc: Wu Guanghao <wuguanghao3@huawei.com>
Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
ReadMe.c | 1 +
mdadm.c | 1 +
mdadm.h | 1 +
3 files changed, 3 insertions(+)
diff --git a/ReadMe.c b/ReadMe.c
index bec1be9..7518a32 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -82,6 +82,7 @@ char Version[] = "mdadm - v" VERSION " - " VERS_DATE EXTRAVERSION "\n";
*/
char short_options[]="-ABCDEFGIQhVXYWZ:vqbc:i:l:p:m:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
+char short_monitor_options[]="-ABCDEFGIQhVXYWZ:vqbc:i:l:p:m:r:n:x:u:c:d:z:U:N:safRSow1tye:k:";
char short_bitmap_options[]=
"-ABCDEFGIQhVXYWZ:vqb:c:i:l:p:m:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
char short_bitmap_auto_options[]=
diff --git a/mdadm.c b/mdadm.c
index be40686..d0c5e6d 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -227,6 +227,7 @@ int main(int argc, char *argv[])
shortopt = short_bitmap_auto_options;
break;
case 'F': newmode = MONITOR;
+ shortopt = short_monitor_options;
break;
case 'G': newmode = GROW;
shortopt = short_bitmap_options;
diff --git a/mdadm.h b/mdadm.h
index 974415b..163f4a4 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -419,6 +419,7 @@ enum mode {
};
extern char short_options[];
+extern char short_monitor_options[];
extern char short_bitmap_options[];
extern char short_bitmap_auto_options[];
extern struct option long_options[];
--
2.35.3

View File

@ -1,45 +0,0 @@
From 1a52f1fc0266d438c996789d4addbfac999a6139 Mon Sep 17 00:00:00 2001
From: Coly Li <colyli@suse.de>
Date: Wed, 31 Jul 2019 13:29:30 +0800
Subject: [PATCH 2/2] udev: add --no-devices option for calling 'mdadm
--detail'
Git-commit: 1a52f1fc0266d438c996789d4addbfac999a6139
Patch-mainline: mdadm-4.1+
References: bsc#1139709
When creating symlink of a md raid device, the detailed information of
component disks are unnecessary for rule udev-md-raid-arrays.rules. For
md raid devices with huge number of component disks (e.g. 1500 DASD
disks), the detail information of component devices can be very large
and exceed udev monitor's on-stack message buffer.
This patch adds '--no-devices' option when calling mdadm by,
IMPORT{program}="BINDIR/mdadm --detail --no-devices --export $devnode"
Now the detailed output won't include component disks information,
and the error message "invalid message length" reported by systemd can
be removed.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
udev-md-raid-arrays.rules | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/udev-md-raid-arrays.rules b/udev-md-raid-arrays.rules
index 5b99d58..d391665 100644
--- a/udev-md-raid-arrays.rules
+++ b/udev-md-raid-arrays.rules
@@ -17,7 +17,7 @@ TEST!="md/array_state", ENV{SYSTEMD_READY}="0", GOTO="md_end"
ATTR{md/array_state}=="|clear|inactive", ENV{SYSTEMD_READY}="0", GOTO="md_end"
LABEL="md_ignore_state"
-IMPORT{program}="BINDIR/mdadm --detail --export $devnode"
+IMPORT{program}="BINDIR/mdadm --detail --no-devices --export $devnode"
ENV{DEVTYPE}=="disk", ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}", OPTIONS+="string_escape=replace"
ENV{DEVTYPE}=="disk", ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"
ENV{DEVTYPE}=="disk", ENV{MD_DEVNAME}=="?*", SYMLINK+="md/$env{MD_DEVNAME}"
--
2.16.4

View File

@ -1,49 +0,0 @@
From 91c97c5432028875db5f8abeddb5cb5f31902001 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Mon, 15 Jul 2019 09:25:35 +0200
Subject: [PATCH] imsm: close removed drive fd.
Git-commit: 91c97c5432028875db5f8abeddb5cb5f31902001
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
When member drive fails, managemon prepares metadata update and adds
the drive to disk_mgmt_list with DISK_REMOVE flag. It fills only
minor and major. It is enough to recognize the device later.
Monitor thread while processing this update will remove the drive from
super only if it is a spare. It never removes failed member from
disks list. As a result, it still keeps opened descriptor to
non-existing device.
If removed drive is not a spare fill fd in disk_cfg structure
(prepared by managemon), monitor will close fd during freeing it.
Also set this drive fd to -1 in super to avoid double closing because
monitor will close the fd (if needed) while replacing removed drive
in array.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/super-intel.c b/super-intel.c
index d7e8a65..a103a3f 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -9200,6 +9200,9 @@ static int add_remove_disk_update(struct intel_super *super)
remove_disk_super(super,
disk_cfg->major,
disk_cfg->minor);
+ } else {
+ disk_cfg->fd = disk->fd;
+ disk->fd = -1;
}
}
/* release allocate disk structure */
--
2.25.0

View File

@ -0,0 +1,45 @@
From 41edf6f45895193f4a523cb0a08d639c9ff9ccc9 Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:12 -0600
Subject: [PATCH 37/61] mdadm: Fix optional --write-behind parameter
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
The commit noted below changed the behaviour of --write-behind to
require an argument. This broke the 06wrmostly test with the error:
mdadm: Invalid value for maximum outstanding write-behind writes: (null).
Must be between 0 and 16383.
To fix this, check if optarg is NULL before parising it, as the origial
code did.
Fixes: 60815698c0ac ("Refactor parse_num and use it to parse optarg.")
Cc: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdadm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mdadm.c b/mdadm.c
index d0c5e6d..56722ed 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -1201,8 +1201,9 @@ int main(int argc, char *argv[])
case O(BUILD, WriteBehind):
case O(CREATE, WriteBehind):
s.write_behind = DEFAULT_MAX_WRITE_BEHIND;
- if (parse_num(&s.write_behind, optarg) != 0 ||
- s.write_behind < 0 || s.write_behind > 16383) {
+ if (optarg &&
+ (parse_num(&s.write_behind, optarg) != 0 ||
+ s.write_behind < 0 || s.write_behind > 16383)) {
pr_err("Invalid value for maximum outstanding write-behind writes: %s.\n\tMust be between 0 and 16383.\n",
optarg);
exit(2);
--
2.35.3

View File

@ -0,0 +1,319 @@
From 239b3cc0b5da87e966746533b1873c439db54b16 Mon Sep 17 00:00:00 2001
From: Mateusz Grzonka <mateusz.grzonka@intel.com>
Date: Fri, 12 Aug 2022 16:36:02 +0200
Subject: [PATCH 45/61] mdadm: Replace obsolete usleep with nanosleep
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
According to POSIX.1-2001, usleep is considered obsolete.
Replace it with a wrapper that uses nanosleep, as recommended in man.
Add handy macros for conversions between msec, usec and nsec.
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 2 +-
Grow.c | 4 ++--
Manage.c | 10 +++++-----
managemon.c | 8 ++++----
mdadm.h | 4 ++++
mdmon.c | 4 ++--
super-intel.c | 6 +++---
util.c | 42 +++++++++++++++++++++++++++++++++---------
8 files changed, 54 insertions(+), 26 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 6df6bfb..be2160b 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1947,7 +1947,7 @@ out:
break;
close(mdfd);
}
- usleep(usecs);
+ sleep_for(0, USEC_TO_NSEC(usecs), true);
usecs <<= 1;
}
}
diff --git a/Grow.c b/Grow.c
index 97f22c7..5780635 100644
--- a/Grow.c
+++ b/Grow.c
@@ -954,7 +954,7 @@ int start_reshape(struct mdinfo *sra, int already_running,
err = sysfs_set_str(sra, NULL, "sync_action",
"reshape");
if (err)
- sleep(1);
+ sleep_for(1, 0, true);
} while (err && errno == EBUSY && cnt-- > 0);
}
return err;
@@ -5058,7 +5058,7 @@ int Grow_continue_command(char *devname, int fd,
}
st->ss->getinfo_super(st, content, NULL);
if (!content->reshape_active)
- sleep(3);
+ sleep_for(3, 0, true);
else
break;
} while (cnt-- > 0);
diff --git a/Manage.c b/Manage.c
index e5e6abe..a142f8b 100644
--- a/Manage.c
+++ b/Manage.c
@@ -244,7 +244,7 @@ int Manage_stop(char *devname, int fd, int verbose, int will_retry)
"array_state",
"inactive")) < 0 &&
errno == EBUSY) {
- usleep(200000);
+ sleep_for(0, MSEC_TO_NSEC(200), true);
count--;
}
if (err) {
@@ -328,7 +328,7 @@ int Manage_stop(char *devname, int fd, int verbose, int will_retry)
sysfs_get_ll(mdi, NULL, "sync_max", &old_sync_max) == 0) {
/* must be in the critical section - wait a bit */
delay -= 1;
- usleep(100000);
+ sleep_for(0, MSEC_TO_NSEC(100), true);
}
if (sysfs_set_str(mdi, NULL, "sync_action", "frozen") != 0)
@@ -405,7 +405,7 @@ int Manage_stop(char *devname, int fd, int verbose, int will_retry)
* quite started yet. Wait a bit and
* check 'sync_action' to see.
*/
- usleep(10000);
+ sleep_for(0, MSEC_TO_NSEC(10), true);
sysfs_get_str(mdi, NULL, "sync_action", buf, sizeof(buf));
if (strncmp(buf, "reshape", 7) != 0)
break;
@@ -447,7 +447,7 @@ done:
count = 25; err = 0;
while (count && fd >= 0 &&
(err = ioctl(fd, STOP_ARRAY, NULL)) < 0 && errno == EBUSY) {
- usleep(200000);
+ sleep_for(0, MSEC_TO_NSEC(200), true);
count --;
}
if (fd >= 0 && err) {
@@ -1105,7 +1105,7 @@ int Manage_remove(struct supertype *tst, int fd, struct mddev_dev *dv,
ret = sysfs_unique_holder(devnm, rdev);
if (ret < 2)
break;
- usleep(100 * 1000); /* 100ms */
+ sleep_for(0, MSEC_TO_NSEC(100), true);
} while (--count > 0);
if (ret == 0) {
diff --git a/managemon.c b/managemon.c
index 0e9bdf0..a7bfa8f 100644
--- a/managemon.c
+++ b/managemon.c
@@ -207,7 +207,7 @@ static void replace_array(struct supertype *container,
remove_old();
while (pending_discard) {
while (discard_this == NULL)
- sleep(1);
+ sleep_for(1, 0, true);
remove_old();
}
pending_discard = old;
@@ -568,7 +568,7 @@ static void manage_member(struct mdstat_ent *mdstat,
updates = NULL;
while (update_queue_pending || update_queue) {
check_update_queue(container);
- usleep(15*1000);
+ sleep_for(0, MSEC_TO_NSEC(15), true);
}
replace_array(container, a, newa);
if (sysfs_set_str(&a->info, NULL,
@@ -822,7 +822,7 @@ static void handle_message(struct supertype *container, struct metadata_update *
if (msg->len <= 0)
while (update_queue_pending || update_queue) {
check_update_queue(container);
- usleep(15*1000);
+ sleep_for(0, MSEC_TO_NSEC(15), true);
}
if (msg->len == 0) { /* ping_monitor */
@@ -836,7 +836,7 @@ static void handle_message(struct supertype *container, struct metadata_update *
wakeup_monitor();
while (monitor_loop_cnt - cnt < 0)
- usleep(10 * 1000);
+ sleep_for(0, MSEC_TO_NSEC(10), true);
} else if (msg->len == -1) { /* ping_manager */
struct mdstat_ent *mdstat = mdstat_read(1, 0);
diff --git a/mdadm.h b/mdadm.h
index 163f4a4..add9c0b 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1720,6 +1720,10 @@ extern int cluster_get_dlmlock(void);
extern int cluster_release_dlmlock(void);
extern void set_dlm_hooks(void);
+#define MSEC_TO_NSEC(msec) ((msec) * 1000000)
+#define USEC_TO_NSEC(usec) ((usec) * 1000)
+extern void sleep_for(unsigned int sec, long nsec, bool wake_after_interrupt);
+
#define _ROUND_UP(val, base) (((val) + (base) - 1) & ~(base - 1))
#define ROUND_UP(val, base) _ROUND_UP(val, (typeof(val))(base))
#define ROUND_UP_PTR(ptr, base) ((typeof(ptr)) \
diff --git a/mdmon.c b/mdmon.c
index c057da6..e9d035e 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -99,7 +99,7 @@ static int clone_monitor(struct supertype *container)
if (rc)
return rc;
while (mon_tid == -1)
- usleep(10);
+ sleep_for(0, USEC_TO_NSEC(10), true);
pthread_attr_destroy(&attr);
mgr_tid = syscall(SYS_gettid);
@@ -209,7 +209,7 @@ static void try_kill_monitor(pid_t pid, char *devname, int sock)
rv = kill(pid, SIGUSR1);
if (rv < 0)
break;
- usleep(200000);
+ sleep_for(0, MSEC_TO_NSEC(200), true);
}
}
diff --git a/super-intel.c b/super-intel.c
index 4ddfcf9..4d82af3 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5275,7 +5275,7 @@ static int get_super_block(struct intel_super **super_list, char *devnm, char *d
/* retry the load if we might have raced against mdmon */
if (err == 3 && devnm && mdmon_running(devnm))
for (retry = 0; retry < 3; retry++) {
- usleep(3000);
+ sleep_for(0, MSEC_TO_NSEC(3), true);
err = load_and_parse_mpb(dfd, s, NULL, keep_fd);
if (err != 3)
break;
@@ -5377,7 +5377,7 @@ static int load_super_imsm(struct supertype *st, int fd, char *devname)
if (mdstat && mdmon_running(mdstat->devnm) && getpid() != mdmon_pid(mdstat->devnm)) {
for (retry = 0; retry < 3; retry++) {
- usleep(3000);
+ sleep_for(0, MSEC_TO_NSEC(3), true);
rv = load_and_parse_mpb(fd, super, devname, 0);
if (rv != 3)
break;
@@ -12084,7 +12084,7 @@ int wait_for_reshape_imsm(struct mdinfo *sra, int ndata)
close(fd);
return 1;
}
- usleep(30000);
+ sleep_for(0, MSEC_TO_NSEC(30), true);
} else
break;
} while (retry--);
diff --git a/util.c b/util.c
index 38f0420..ca48d97 100644
--- a/util.c
+++ b/util.c
@@ -166,7 +166,7 @@ retry:
pr_err("error %d when get PW mode on lock %s\n", errno, str);
/* let's try several times if EAGAIN happened */
if (dlm_lock_res->lksb.sb_status == EAGAIN && retry_count < 10) {
- sleep(10);
+ sleep_for(10, 0, true);
retry_count++;
goto retry;
}
@@ -1085,7 +1085,7 @@ int open_dev_excl(char *devnm)
int i;
int flags = O_RDWR;
dev_t devid = devnm2devid(devnm);
- long delay = 1000;
+ unsigned int delay = 1; // miliseconds
sprintf(buf, "%d:%d", major(devid), minor(devid));
for (i = 0; i < 25; i++) {
@@ -1098,8 +1098,8 @@ int open_dev_excl(char *devnm)
}
if (errno != EBUSY)
return fd;
- usleep(delay);
- if (delay < 200000)
+ sleep_for(0, MSEC_TO_NSEC(delay), true);
+ if (delay < 200)
delay *= 2;
}
return -1;
@@ -1123,7 +1123,7 @@ void wait_for(char *dev, int fd)
{
int i;
struct stat stb_want;
- long delay = 1000;
+ unsigned int delay = 1; // miliseconds
if (fstat(fd, &stb_want) != 0 ||
(stb_want.st_mode & S_IFMT) != S_IFBLK)
@@ -1135,8 +1135,8 @@ void wait_for(char *dev, int fd)
(stb.st_mode & S_IFMT) == S_IFBLK &&
(stb.st_rdev == stb_want.st_rdev))
return;
- usleep(delay);
- if (delay < 200000)
+ sleep_for(0, MSEC_TO_NSEC(delay), true);
+ if (delay < 200)
delay *= 2;
}
if (i == 25)
@@ -1821,7 +1821,7 @@ int hot_remove_disk(int mdfd, unsigned long dev, int force)
while ((ret = ioctl(mdfd, HOT_REMOVE_DISK, dev)) == -1 &&
errno == EBUSY &&
cnt-- > 0)
- usleep(10000);
+ sleep_for(0, MSEC_TO_NSEC(10), true);
return ret;
}
@@ -1834,7 +1834,7 @@ int sys_hot_remove_disk(int statefd, int force)
while ((ret = write(statefd, "remove", 6)) == -1 &&
errno == EBUSY &&
cnt-- > 0)
- usleep(10000);
+ sleep_for(0, MSEC_TO_NSEC(10), true);
return ret == 6 ? 0 : -1;
}
@@ -2375,3 +2375,27 @@ out:
close(fd_zero);
return ret;
}
+
+/**
+ * sleep_for() - Sleeps for specified time.
+ * @sec: Seconds to sleep for.
+ * @nsec: Nanoseconds to sleep for, has to be less than one second.
+ * @wake_after_interrupt: If set, wake up if interrupted.
+ *
+ * Function immediately returns if error different than EINTR occurs.
+ */
+void sleep_for(unsigned int sec, long nsec, bool wake_after_interrupt)
+{
+ struct timespec delay = {.tv_sec = sec, .tv_nsec = nsec};
+
+ assert(nsec < MSEC_TO_NSEC(1000));
+
+ do {
+ errno = 0;
+ nanosleep(&delay, &delay);
+ if (errno != 0 && errno != EINTR) {
+ pr_err("Error sleeping for %us %ldns: %s\n", sec, nsec, strerror(errno));
+ return;
+ }
+ } while (!wake_after_interrupt && errno == EINTR);
+}
--
2.35.3

View File

@ -1,50 +0,0 @@
From fd5b09c9a9107f0393ce194c4aac6e7b8f163e85 Mon Sep 17 00:00:00 2001
From: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Date: Fri, 16 Aug 2019 11:06:17 +0200
Subject: [PATCH] mdadm: check value returned by snprintf against errors
Git-commit: fd5b09c9a9107f0393ce194c4aac6e7b8f163e85
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
GCC 8 checks possible truncation during snprintf more strictly
than GCC 7 which result in compilation errors. To fix this
problem checking result of snprintf against errors has been added.
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
sysfs.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/sysfs.c b/sysfs.c
index c313781..2995713 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -1023,12 +1023,20 @@ int sysfs_rules_apply_check(const struct mdinfo *sra,
char dname[MAX_SYSFS_PATH_LEN];
char resolved_path[PATH_MAX];
char resolved_dir[PATH_MAX];
+ int result;
if (sra == NULL || ent == NULL)
return -1;
- snprintf(dname, MAX_SYSFS_PATH_LEN, "/sys/block/%s/md/", sra->sys_name);
- snprintf(fname, MAX_SYSFS_PATH_LEN, "%s/%s", dname, ent->name);
+ result = snprintf(dname, MAX_SYSFS_PATH_LEN,
+ "/sys/block/%s/md/", sra->sys_name);
+ if (result < 0 || result >= MAX_SYSFS_PATH_LEN)
+ return -1;
+
+ result = snprintf(fname, MAX_SYSFS_PATH_LEN,
+ "%s/%s", dname, ent->name);
+ if (result < 0 || result >= MAX_SYSFS_PATH_LEN)
+ return -1;
if (realpath(fname, resolved_path) == NULL ||
realpath(dname, resolved_dir) == NULL)
--
2.25.0

View File

@ -1,167 +0,0 @@
From 43ebc9105e9dafe5145b3e801c05da4736bf6e02 Mon Sep 17 00:00:00 2001
From: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
Date: Tue, 3 Sep 2019 16:49:01 -0300
Subject: [PATCH] mdadm: Introduce new array state 'broken' for raid0/linear
Git-commit: 43ebc9105e9dafe5145b3e801c05da4736bf6e02
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Currently if a md raid0/linear array gets one or more members removed while
being mounted, kernel keeps showing state 'clean' in the 'array_state'
sysfs attribute. Despite udev signaling the member device is gone, 'mdadm'
cannot issue the STOP_ARRAY ioctl successfully, given the array is mounted.
Nothing else hints that something is wrong (except that the removed devices
don't show properly in the output of mdadm 'detail' command). There is no
other property to be checked, and if user is not performing reads/writes
to the array, even kernel log is quiet and doesn't give a clue about the
missing member.
This patch is the mdadm counterpart of kernel new array state 'broken'.
The 'broken' state mimics the state 'clean' in every aspect, being useful
only to distinguish if an array has some member missing. All necessary
paths in mdadm were changed to deal with 'broken' state, and in case the
tool runs in a kernel that is not updated, it'll work normally, i.e., it
doesn't require the 'broken' state in order to work.
Also, this patch changes the way the array state is showed in the 'detail'
command (for raid0/linear only) - now it takes the 'array_state' sysfs
attribute into account instead of only rely in the MD_SB_CLEAN flag.
Cc: Jes Sorensen <jes.sorensen@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Detail.c | 14 ++++++++++++--
Monitor.c | 8 ++++++--
maps.c | 1 +
mdadm.h | 1 +
mdmon.h | 2 +-
monitor.c | 4 ++--
6 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/Detail.c b/Detail.c
index ad60434..3e61e37 100644
--- a/Detail.c
+++ b/Detail.c
@@ -81,6 +81,7 @@ int Detail(char *dev, struct context *c)
int external;
int inactive;
int is_container = 0;
+ char *arrayst;
if (fd < 0) {
pr_err("cannot open %s: %s\n",
@@ -485,9 +486,18 @@ int Detail(char *dev, struct context *c)
else
st = ", degraded";
+ if (array.state & (1 << MD_SB_CLEAN)) {
+ if ((array.level == 0) ||
+ (array.level == LEVEL_LINEAR))
+ arrayst = map_num(sysfs_array_states,
+ sra->array_state);
+ else
+ arrayst = "clean";
+ } else
+ arrayst = "active";
+
printf(" State : %s%s%s%s%s%s \n",
- (array.state & (1 << MD_SB_CLEAN)) ?
- "clean" : "active", st,
+ arrayst, st,
(!e || (e->percent < 0 &&
e->percent != RESYNC_PENDING &&
e->percent != RESYNC_DELAYED)) ?
diff --git a/Monitor.c b/Monitor.c
index 036103f..b527165 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -1055,8 +1055,11 @@ int Wait(char *dev)
}
}
+/* The state "broken" is used only for RAID0/LINEAR - it's the same as
+ * "clean", but used in case the array has one or more members missing.
+ */
static char *clean_states[] = {
- "clear", "inactive", "readonly", "read-auto", "clean", NULL };
+ "clear", "inactive", "readonly", "read-auto", "clean", "broken", NULL };
int WaitClean(char *dev, int verbose)
{
@@ -1116,7 +1119,8 @@ int WaitClean(char *dev, int verbose)
rv = read(state_fd, buf, sizeof(buf));
if (rv < 0)
break;
- if (sysfs_match_word(buf, clean_states) <= 4)
+ if (sysfs_match_word(buf, clean_states) <
+ (int)ARRAY_SIZE(clean_states) - 1)
break;
rv = sysfs_wait(state_fd, &delay);
if (rv < 0 && errno != EINTR)
diff --git a/maps.c b/maps.c
index 02a0474..49b7f2c 100644
--- a/maps.c
+++ b/maps.c
@@ -150,6 +150,7 @@ mapping_t sysfs_array_states[] = {
{ "read-auto", ARRAY_READ_AUTO },
{ "clean", ARRAY_CLEAN },
{ "write-pending", ARRAY_WRITE_PENDING },
+ { "broken", ARRAY_BROKEN },
{ NULL, ARRAY_UNKNOWN_STATE }
};
diff --git a/mdadm.h b/mdadm.h
index 43b07d5..c88ceab 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -373,6 +373,7 @@ struct mdinfo {
ARRAY_ACTIVE,
ARRAY_WRITE_PENDING,
ARRAY_ACTIVE_IDLE,
+ ARRAY_BROKEN,
ARRAY_UNKNOWN_STATE,
} array_state;
struct md_bb bb;
diff --git a/mdmon.h b/mdmon.h
index 818367c..b3d72ac 100644
--- a/mdmon.h
+++ b/mdmon.h
@@ -21,7 +21,7 @@
extern const char Name[];
enum array_state { clear, inactive, suspended, readonly, read_auto,
- clean, active, write_pending, active_idle, bad_word};
+ clean, active, write_pending, active_idle, broken, bad_word};
enum sync_action { idle, reshape, resync, recover, check, repair, bad_action };
diff --git a/monitor.c b/monitor.c
index 81537ed..e0d3be6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -26,7 +26,7 @@
static char *array_states[] = {
"clear", "inactive", "suspended", "readonly", "read-auto",
- "clean", "active", "write-pending", "active-idle", NULL };
+ "clean", "active", "write-pending", "active-idle", "broken", NULL };
static char *sync_actions[] = {
"idle", "reshape", "resync", "recover", "check", "repair", NULL
};
@@ -476,7 +476,7 @@ static int read_and_act(struct active_array *a, fd_set *fds)
a->next_state = clean;
ret |= ARRAY_DIRTY;
}
- if (a->curr_state == clean) {
+ if ((a->curr_state == clean) || (a->curr_state == broken)) {
a->container->ss->set_array_state(a, 1);
}
if (a->curr_state == active ||
--
2.25.0

View File

@ -0,0 +1,179 @@
From e4a030a0d3a953b8e74c118200e58dc83c2fc608 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Tue, 19 Jul 2022 14:48:22 +0200
Subject: [PATCH 48/61] mdadm: remove symlink option
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
The option is not used. Remove it from code.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
ReadMe.c | 1 -
config.c | 7 +------
mdadm.8.in | 9 ---------
mdadm.c | 20 --------------------
mdadm.conf.5.in | 15 ---------------
mdadm.h | 2 --
6 files changed, 1 insertion(+), 53 deletions(-)
diff --git a/ReadMe.c b/ReadMe.c
index 7518a32..7f94847 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -147,7 +147,6 @@ struct option long_options[] = {
{"nofailfast",0, 0, NoFailFast},
{"re-add", 0, 0, ReAdd},
{"homehost", 1, 0, HomeHost},
- {"symlinks", 1, 0, Symlinks},
{"data-offset",1, 0, DataOffset},
{"nodes",1, 0, Nodes}, /* also for --assemble */
{"home-cluster",1, 0, ClusterName},
diff --git a/config.c b/config.c
index 9c72545..dc1620c 100644
--- a/config.c
+++ b/config.c
@@ -194,7 +194,6 @@ struct mddev_dev *load_containers(void)
struct createinfo createinfo = {
.autof = 2, /* by default, create devices with standard names */
- .symlinks = 1,
.names = 0, /* By default, stick with numbered md devices. */
.bblist = 1, /* Use a bad block list by default */
#ifdef DEBIAN
@@ -310,11 +309,7 @@ static void createline(char *line)
if (!createinfo.supertype)
pr_err("metadata format %s unknown, ignoring\n",
w+9);
- } else if (strncasecmp(w, "symlinks=yes", 12) == 0)
- createinfo.symlinks = 1;
- else if (strncasecmp(w, "symlinks=no", 11) == 0)
- createinfo.symlinks = 0;
- else if (strncasecmp(w, "names=yes", 12) == 0)
+ } else if (strncasecmp(w, "names=yes", 12) == 0)
createinfo.names = 1;
else if (strncasecmp(w, "names=no", 11) == 0)
createinfo.names = 0;
diff --git a/mdadm.8.in b/mdadm.8.in
index 0be02e4..f273622 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -1048,11 +1048,6 @@ simultaneously. If not specified, this defaults to 4.
Specify journal device for the RAID-4/5/6 array. The journal device
should be a SSD with reasonable lifetime.
-.TP
-.BR \-\-symlinks
-Auto creation of symlinks in /dev to /dev/md, option --symlinks must
-be 'no' or 'yes' and work with --create and --build.
-
.TP
.BR \-k ", " \-\-consistency\-policy=
Specify how the array maintains consistency in case of unexpected shutdown.
@@ -1405,10 +1400,6 @@ Reshape can be continued later using the
.B \-\-continue
option for the grow command.
-.TP
-.BR \-\-symlinks
-See this option under Create and Build options.
-
.SH For Manage mode:
.TP
diff --git a/mdadm.c b/mdadm.c
index 56722ed..180f7a9 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -59,7 +59,6 @@ int main(int argc, char *argv[])
struct mddev_dev *dv;
mdu_array_info_t array;
int devs_found = 0;
- char *symlinks = NULL;
int grow_continue = 0;
/* autof indicates whether and how to create device node.
* bottom 3 bits are style. Rest (when shifted) are number of parts
@@ -663,13 +662,6 @@ int main(int argc, char *argv[])
case O(ASSEMBLE,Auto): /* auto-creation of device node */
c.autof = parse_auto(optarg, "--auto flag", 0);
continue;
-
- case O(CREATE,Symlinks):
- case O(BUILD,Symlinks):
- case O(ASSEMBLE,Symlinks): /* auto creation of symlinks in /dev to /dev/md */
- symlinks = optarg;
- continue;
-
case O(BUILD,'f'): /* force honouring '-n 1' */
case O(BUILD,Force): /* force honouring '-n 1' */
case O(GROW,'f'): /* ditto */
@@ -1325,18 +1317,6 @@ int main(int argc, char *argv[])
exit(2);
}
- if (symlinks) {
- struct createinfo *ci = conf_get_create_info();
-
- if (strcasecmp(symlinks, "yes") == 0)
- ci->symlinks = 1;
- else if (strcasecmp(symlinks, "no") == 0)
- ci->symlinks = 0;
- else {
- pr_err("option --symlinks must be 'no' or 'yes'\n");
- exit(2);
- }
- }
/* Ok, got the option parsing out of the way
* hopefully it's mostly right but there might be some stuff
* missing
diff --git a/mdadm.conf.5.in b/mdadm.conf.5.in
index cd4e6a9..bc2295c 100644
--- a/mdadm.conf.5.in
+++ b/mdadm.conf.5.in
@@ -338,21 +338,6 @@ missing device entries should be created.
The name of the metadata format to use if none is explicitly given.
This can be useful to impose a system-wide default of version-1 superblocks.
-.TP
-.B symlinks=no
-Normally when creating devices in
-.B /dev/md/
-.I mdadm
-will create a matching symlink from
-.B /dev/
-with a name starting
-.B md
-or
-.BR md_ .
-Give
-.B symlinks=no
-to suppress this symlink creation.
-
.TP
.B names=yes
Since Linux 2.6.29 it has been possible to create
diff --git a/mdadm.h b/mdadm.h
index add9c0b..93e7278 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -394,7 +394,6 @@ struct createinfo {
int gid;
int autof;
int mode;
- int symlinks;
int names;
int bblist;
struct supertype *supertype;
@@ -442,7 +441,6 @@ enum special_options {
BackupFile,
HomeHost,
AutoHomeHost,
- Symlinks,
AutoDetect,
Waitclean,
DetailPlatform,
--
2.35.3

View File

@ -1,45 +0,0 @@
From 2c2d9c48d2daf0d78d20494c3779c0f6dc4bfa75 Mon Sep 17 00:00:00 2001
From: Nigel Croxon <ncroxon@redhat.com>
Date: Tue, 24 Sep 2019 11:39:24 -0400
Subject: [PATCH] mdadm: force a uuid swap on big endian
Git-commit: 2c2d9c48d2daf0d78d20494c3779c0f6dc4bfa75
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
The code path for metadata 0.90 calls a common routine
fname_from_uuid that uses metadata 1.2. The code expects member
swapuuid to be setup and usable. But it is only setup when using
metadata 1.2. Since the metadata 0.90 did not create swapuuid
and set it. The test (st->ss == &super1) ? 1 : st->ss->swapuuid
fails. The swapuuid is set at compile time based on byte order.
Any call based on metadata 0.90 and on big endian processors,
the --export uuid will be incorrect.
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
util.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/util.c b/util.c
index c26cf5f..64dd409 100644
--- a/util.c
+++ b/util.c
@@ -685,8 +685,12 @@ char *fname_from_uuid(struct supertype *st, struct mdinfo *info,
// work, but can't have it set if we want this printout to match
// all the other uuid printouts in super1.c, so we force swapuuid
// to 1 to make our printout match the rest of super1
+#if __BYTE_ORDER == BIG_ENDIAN
+ return __fname_from_uuid(info->uuid, 1, buf, sep);
+#else
return __fname_from_uuid(info->uuid, (st->ss == &super1) ? 1 :
st->ss->swapuuid, buf, sep);
+#endif
}
int check_ext2(int fd, char *name)
--
2.25.0

View File

@ -0,0 +1,235 @@
From ae5dfc56b7a96805d5a0b50eaf93b9fec8604298 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Tue, 19 Jul 2022 14:48:23 +0200
Subject: [PATCH 49/61] mdadm: move data_offset to struct shape
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Data offset is a shape property so move it there to remove additional
parameter from some functions.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Create.c | 16 ++++++++--------
Grow.c | 7 +++----
mdadm.c | 20 +++++++++-----------
mdadm.h | 5 ++---
4 files changed, 22 insertions(+), 26 deletions(-)
diff --git a/Create.c b/Create.c
index c84c1ac..e06ec2a 100644
--- a/Create.c
+++ b/Create.c
@@ -95,7 +95,7 @@ int Create(struct supertype *st, char *mddev,
char *name, int *uuid,
int subdevs, struct mddev_dev *devlist,
struct shape *s,
- struct context *c, unsigned long long data_offset)
+ struct context *c)
{
/*
* Create a new raid array.
@@ -288,7 +288,7 @@ int Create(struct supertype *st, char *mddev,
newsize = s->size * 2;
if (st && ! st->ss->validate_geometry(st, s->level, s->layout, s->raiddisks,
&s->chunk, s->size*2,
- data_offset, NULL,
+ s->data_offset, NULL,
&newsize, s->consistency_policy,
c->verbose >= 0))
return 1;
@@ -323,10 +323,10 @@ int Create(struct supertype *st, char *mddev,
info.array.working_disks = 0;
dnum = 0;
for (dv = devlist; dv; dv = dv->next)
- if (data_offset == VARIABLE_OFFSET)
+ if (s->data_offset == VARIABLE_OFFSET)
dv->data_offset = INVALID_SECTORS;
else
- dv->data_offset = data_offset;
+ dv->data_offset = s->data_offset;
for (dv=devlist; dv && !have_container; dv=dv->next, dnum++) {
char *dname = dv->devname;
@@ -342,7 +342,7 @@ int Create(struct supertype *st, char *mddev,
missing_disks ++;
continue;
}
- if (data_offset == VARIABLE_OFFSET) {
+ if (s->data_offset == VARIABLE_OFFSET) {
doff = strchr(dname, ':');
if (doff) {
*doff++ = 0;
@@ -350,7 +350,7 @@ int Create(struct supertype *st, char *mddev,
} else
dv->data_offset = INVALID_SECTORS;
} else
- dv->data_offset = data_offset;
+ dv->data_offset = s->data_offset;
dfd = open(dname, O_RDONLY);
if (dfd < 0) {
@@ -535,7 +535,7 @@ int Create(struct supertype *st, char *mddev,
if (!st->ss->validate_geometry(st, s->level, s->layout,
s->raiddisks,
&s->chunk, minsize*2,
- data_offset,
+ s->data_offset,
NULL, NULL,
s->consistency_policy, 0)) {
pr_err("devices too large for RAID level %d\n", s->level);
@@ -754,7 +754,7 @@ int Create(struct supertype *st, char *mddev,
}
}
if (!st->ss->init_super(st, &info.array, s, name, c->homehost, uuid,
- data_offset))
+ s->data_offset))
goto abort_locked;
total_slots = info.array.nr_disks;
diff --git a/Grow.c b/Grow.c
index 5780635..868bdc3 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1775,7 +1775,6 @@ static int reshape_container(char *container, char *devname,
int Grow_reshape(char *devname, int fd,
struct mddev_dev *devlist,
- unsigned long long data_offset,
struct context *c, struct shape *s)
{
/* Make some changes in the shape of an array.
@@ -1821,7 +1820,7 @@ int Grow_reshape(char *devname, int fd,
return 1;
}
- if (data_offset != INVALID_SECTORS && array.level != 10 &&
+ if (s->data_offset != INVALID_SECTORS && array.level != 10 &&
(array.level < 4 || array.level > 6)) {
pr_err("--grow --data-offset not yet supported\n");
return 1;
@@ -2179,7 +2178,7 @@ size_change_error:
if ((s->level == UnSet || s->level == array.level) &&
(s->layout_str == NULL) &&
(s->chunk == 0 || s->chunk == array.chunk_size) &&
- data_offset == INVALID_SECTORS &&
+ s->data_offset == INVALID_SECTORS &&
(s->raiddisks == 0 || s->raiddisks == array.raid_disks)) {
/* Nothing more to do */
if (!changed && c->verbose >= 0)
@@ -2379,7 +2378,7 @@ size_change_error:
}
sync_metadata(st);
rv = reshape_array(container, fd, devname, st, &info, c->force,
- devlist, data_offset, c->backup_file,
+ devlist, s->data_offset, c->backup_file,
c->verbose, 0, 0, 0);
frozen = 0;
}
diff --git a/mdadm.c b/mdadm.c
index 180f7a9..845e446 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -49,7 +49,6 @@ int main(int argc, char *argv[])
int i;
unsigned long long array_size = 0;
- unsigned long long data_offset = INVALID_SECTORS;
struct mddev_ident ident;
char *configfile = NULL;
int devmode = 0;
@@ -79,6 +78,7 @@ int main(int argc, char *argv[])
.layout = UnSet,
.bitmap_chunk = UnSet,
.consistency_policy = CONSISTENCY_POLICY_UNKNOWN,
+ .data_offset = INVALID_SECTORS,
};
char sys_hostname[256];
@@ -479,15 +479,15 @@ int main(int argc, char *argv[])
case O(CREATE,DataOffset):
case O(GROW,DataOffset):
- if (data_offset != INVALID_SECTORS) {
+ if (s.data_offset != INVALID_SECTORS) {
pr_err("data-offset may only be specified one. Second value is %s.\n", optarg);
exit(2);
}
if (mode == CREATE && strcmp(optarg, "variable") == 0)
- data_offset = VARIABLE_OFFSET;
+ s.data_offset = VARIABLE_OFFSET;
else
- data_offset = parse_size(optarg);
- if (data_offset == INVALID_SECTORS) {
+ s.data_offset = parse_size(optarg);
+ if (s.data_offset == INVALID_SECTORS) {
pr_err("invalid data-offset: %s\n",
optarg);
exit(2);
@@ -1416,7 +1416,7 @@ int main(int argc, char *argv[])
exit(1);
}
- if (c.backup_file && data_offset != INVALID_SECTORS) {
+ if (c.backup_file && s.data_offset != INVALID_SECTORS) {
pr_err("--backup-file and --data-offset are incompatible\n");
exit(2);
}
@@ -1587,8 +1587,7 @@ int main(int argc, char *argv[])
rv = Create(ss, devlist->devname,
ident.name, ident.uuid_set ? ident.uuid : NULL,
- devs_found-1, devlist->next,
- &s, &c, data_offset);
+ devs_found - 1, devlist->next, &s, &c);
break;
case MISC:
if (devmode == 'E') {
@@ -1706,10 +1705,9 @@ int main(int argc, char *argv[])
c.verbose);
else if (s.size > 0 || s.raiddisks || s.layout_str ||
s.chunk != 0 || s.level != UnSet ||
- data_offset != INVALID_SECTORS) {
+ s.data_offset != INVALID_SECTORS) {
rv = Grow_reshape(devlist->devname, mdfd,
- devlist->next,
- data_offset, &c, &s);
+ devlist->next, &c, &s);
} else if (s.consistency_policy != CONSISTENCY_POLICY_UNKNOWN) {
rv = Grow_consistency_policy(devlist->devname, mdfd, &c, &s);
} else if (array_size == 0)
diff --git a/mdadm.h b/mdadm.h
index 93e7278..adb7cda 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -595,6 +595,7 @@ struct shape {
int assume_clean;
int write_behind;
unsigned long long size;
+ unsigned long long data_offset;
int consistency_policy;
};
@@ -1431,7 +1432,6 @@ extern int Grow_addbitmap(char *devname, int fd,
struct context *c, struct shape *s);
extern int Grow_reshape(char *devname, int fd,
struct mddev_dev *devlist,
- unsigned long long data_offset,
struct context *c, struct shape *s);
extern int Grow_restart(struct supertype *st, struct mdinfo *info,
int *fdlist, int cnt, char *backup_file, int verbose);
@@ -1462,8 +1462,7 @@ extern int Create(struct supertype *st, char *mddev,
char *name, int *uuid,
int subdevs, struct mddev_dev *devlist,
struct shape *s,
- struct context *c,
- unsigned long long data_offset);
+ struct context *c);
extern int Detail(char *dev, struct context *c);
extern int Detail_Platform(struct superswitch *ss, int scan, int verbose, int export, char *controller_path);
--
2.35.3

View File

@ -0,0 +1,165 @@
From 27ad4900501c615b7c6b266bf23948e5606dba53 Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 27 Jul 2022 15:52:46 -0600
Subject: [PATCH 50/61] mdadm: Don't open md device for CREATE and ASSEMBLE
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
The mdadm command tries to open the md device for most modes, first
thing, no matter what. When running to create or assemble an array,
in most cases, the md device will not exist, the open call will fail
and everything will proceed correctly.
However, when running tests, a create or assembly command may be run
shortly after stopping an array and the old md device file may still
be around. Then, if create_on_open is set in the kernel, a new md
device will be created when mdadm does its initial open.
When mdadm gets around to creating the new device with the new_array
parameter it issues this error:
mdadm: Fail to create md0 when using
/sys/module/md_mod/parameters/new_array, fallback to creation via node
This is because an mddev was already created by the kernel with the
earlier open() call and thus the new one being created will fail with
EEXIST. The mdadm command will still successfully be created due to
falling back to the node creation method. However, the error message
itself will fail any test that's running it.
This issue is a race condition that is very rare, but a recent change
in the kernel caused this to happen more frequently: about 1 in 50
times.
To fix this, don't bother trying to open the md device for CREATE,
ASSEMBLE and BUILD commands, as the file descriptor will never be used
anyway even if it is successfully openned. The mdfd has not been used
for these commands since:
7f91af49ad09 ("Delay creation of array devices for assemble/build/create")
The checks that were done on the open device can be changed to being
done with stat.
Side note: it would be nice to disable create_on_open as well to help
solve this, but it seems the work for this was never finished. By default,
mdadm will create using the old node interface when a name is specified
unless the user specifically puts names=yes in a config file, which
doesn't seem to be common or desirable to require this..
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
lib.c | 12 ++++++++++++
mdadm.c | 40 ++++++++++++++++++++--------------------
mdadm.h | 1 +
3 files changed, 33 insertions(+), 20 deletions(-)
diff --git a/lib.c b/lib.c
index 7e3e3d4..e395b28 100644
--- a/lib.c
+++ b/lib.c
@@ -164,6 +164,18 @@ char *stat2devnm(struct stat *st)
return devid2devnm(st->st_rdev);
}
+bool stat_is_md_dev(struct stat *st)
+{
+ if ((S_IFMT & st->st_mode) != S_IFBLK)
+ return false;
+ if (major(st->st_rdev) == MD_MAJOR)
+ return true;
+ if (major(st->st_rdev) == (unsigned)get_mdp_major())
+ return true;
+
+ return false;
+}
+
char *fd2devnm(int fd)
{
struct stat stb;
diff --git a/mdadm.c b/mdadm.c
index 845e446..972adb5 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -1329,6 +1329,9 @@ int main(int argc, char *argv[])
if (mode == MANAGE || mode == BUILD || mode == CREATE ||
mode == GROW || (mode == ASSEMBLE && ! c.scan)) {
+ struct stat stb;
+ int ret;
+
if (devs_found < 1) {
pr_err("an md device must be given in this mode\n");
exit(2);
@@ -1341,6 +1344,12 @@ int main(int argc, char *argv[])
mdfd = open_mddev(devlist->devname, 1);
if (mdfd < 0)
exit(1);
+
+ ret = fstat(mdfd, &stb);
+ if (ret) {
+ pr_err("fstat failed on %s.\n", devlist->devname);
+ exit(1);
+ }
} else {
char *bname = basename(devlist->devname);
@@ -1348,30 +1357,21 @@ int main(int argc, char *argv[])
pr_err("Name %s is too long.\n", devlist->devname);
exit(1);
}
- /* non-existent device is OK */
- mdfd = open_mddev(devlist->devname, 0);
- }
- if (mdfd == -2) {
- pr_err("device %s exists but is not an md array.\n", devlist->devname);
- exit(1);
- }
- if ((int)ident.super_minor == -2) {
- struct stat stb;
- if (mdfd < 0) {
+
+ ret = stat(devlist->devname, &stb);
+ if (ident.super_minor == -2 && ret != 0) {
pr_err("--super-minor=dev given, and listed device %s doesn't exist.\n",
- devlist->devname);
+ devlist->devname);
+ exit(1);
+ }
+
+ if (!ret && !stat_is_md_dev(&stb)) {
+ pr_err("device %s exists but is not an md array.\n", devlist->devname);
exit(1);
}
- fstat(mdfd, &stb);
- ident.super_minor = minor(stb.st_rdev);
- }
- if (mdfd >= 0 && mode != MANAGE && mode != GROW) {
- /* We don't really want this open yet, we just might
- * have wanted to check some things
- */
- close(mdfd);
- mdfd = -1;
}
+ if (ident.super_minor == -2)
+ ident.super_minor = minor(stb.st_rdev);
}
if (s.raiddisks) {
diff --git a/mdadm.h b/mdadm.h
index adb7cda..8208b81 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1672,6 +1672,7 @@ void *super1_make_v0(struct supertype *st, struct mdinfo *info, mdp_super_t *sb0
extern char *stat2kname(struct stat *st);
extern char *fd2kname(int fd);
extern char *stat2devnm(struct stat *st);
+bool stat_is_md_dev(struct stat *st);
extern char *fd2devnm(int fd);
extern void udev_block(char *devnm);
extern void udev_unblock(void);
--
2.35.3

View File

@ -1,103 +0,0 @@
From e53cb968691d9e40d83caf5570da3bb7b83c64e1 Mon Sep 17 00:00:00 2001
From: Guoqing Jiang <gqjiang@suse.com>
Date: Fri, 31 May 2019 10:10:00 +0800
Subject: [PATCH] mdadm/md.4: add the descriptions for bitmap sysfs nodes
Git-commit: e53cb968691d9e40d83caf5570da3bb7b83c64e1
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
The sysfs nodes under bitmap are not recorded in md.4,
add them based on md.rst and kernel source code.
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
md.4 | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/md.4 b/md.4
index 3a1d677..e86707a 100644
--- a/md.4
+++ b/md.4
@@ -1101,6 +1101,75 @@ stripe that requires some "prereading". For fairness this defaults to
maximizes sequential-write throughput at the cost of fairness to threads
doing small or random writes.
+.TP
+.B md/bitmap/backlog
+The value stored in the file only has any effect on RAID1 when write-mostly
+devices are active, and write requests to those devices are proceed in the
+background.
+
+This variable sets a limit on the number of concurrent background writes,
+the valid values are 0 to 16383, 0 means that write-behind is not allowed,
+while any other number means it can happen. If there are more write requests
+than the number, new writes will by synchronous.
+
+.TP
+.B md/bitmap/can_clear
+This is for externally managed bitmaps, where the kernel writes the bitmap
+itself, but metadata describing the bitmap is managed by mdmon or similar.
+
+When the array is degraded, bits mustn't be cleared. When the array becomes
+optimal again, bit can be cleared, but first the metadata needs to record
+the current event count. So md sets this to 'false' and notifies mdmon,
+then mdmon updates the metadata and writes 'true'.
+
+There is no code in mdmon to actually do this, so maybe it doesn't even
+work.
+
+.TP
+.B md/bitmap/chunksize
+The bitmap chunksize can only be changed when no bitmap is active, and
+the value should be power of 2 and at least 512.
+
+.TP
+.B md/bitmap/location
+This indicates where the write-intent bitmap for the array is stored.
+It can be "none" or "file" or a signed offset from the array metadata
+- measured in sectors. You cannot set a file by writing here - that can
+only be done with the SET_BITMAP_FILE ioctl.
+
+Write 'none' to 'bitmap/location' will clear bitmap, and the previous
+location value must be write to it to restore bitmap.
+
+.TP
+.B md/bitmap/max_backlog_used
+This keeps track of the maximum number of concurrent write-behind requests
+for an md array, writing any value to this file will clear it.
+
+.TP
+.B md/bitmap/metadata
+This can be 'internal' or 'clustered' or 'external'. 'internal' is set
+by default, which means the metadata for bitmap is stored in the first 256
+bytes of the bitmap space. 'clustered' means separate bitmap metadata are
+used for each cluster node. 'external' means that bitmap metadata is managed
+externally to the kernel.
+
+.TP
+.B md/bitmap/space
+This shows the space (in sectors) which is available at md/bitmap/location,
+and allows the kernel to know when it is safe to resize the bitmap to match
+a resized array. It should big enough to contain the total bytes in the bitmap.
+
+For 1.0 metadata, assume we can use up to the superblock if before, else
+to 4K beyond superblock. For other metadata versions, assume no change is
+possible.
+
+.TP
+.B md/bitmap/time_base
+This shows the time (in seconds) between disk flushes, and is used to looking
+for bits in the bitmap to be cleared.
+
+The default value is 5 seconds, and it should be an unsigned long value.
+
.SS KERNEL PARAMETERS
The md driver recognised several different kernel parameters.
--
2.25.0

View File

@ -0,0 +1,233 @@
From 7211116c295ba1f9e1fcbdc2dd2d3762855062e1 Mon Sep 17 00:00:00 2001
From: Mateusz Kusiak <mateusz.kusiak@intel.com>
Date: Thu, 28 Jul 2022 20:20:53 +0800
Subject: [PATCH 51/61] Grow: Split Grow_reshape into helper function
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Grow_reshape should be split into helper functions given its size.
- Add helper function for preparing reshape on external metadata.
- Close cfd file descriptor.
Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Grow.c | 125 ++++++++++++++++++++++++++++++--------------------------
mdadm.h | 1 +
util.c | 14 +++++++
3 files changed, 81 insertions(+), 59 deletions(-)
diff --git a/Grow.c b/Grow.c
index 868bdc3..0f07a89 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1773,6 +1773,65 @@ static int reshape_container(char *container, char *devname,
char *backup_file, int verbose,
int forked, int restart, int freeze_reshape);
+/**
+ * prepare_external_reshape() - prepares update on external metadata if supported.
+ * @devname: Device name.
+ * @subarray: Subarray.
+ * @st: Supertype.
+ * @container: Container.
+ * @cfd: Container file descriptor.
+ *
+ * Function checks that the requested reshape is supported on external metadata,
+ * and performs an initial check that the container holds the pre-requisite
+ * spare devices (mdmon owns final validation).
+ *
+ * Return: 0 on success, else 1
+ */
+static int prepare_external_reshape(char *devname, char *subarray,
+ struct supertype *st, char *container,
+ const int cfd)
+{
+ struct mdinfo *cc = NULL;
+ struct mdinfo *content = NULL;
+
+ if (st->ss->load_container(st, cfd, NULL)) {
+ pr_err("Cannot read superblock for %s\n", devname);
+ return 1;
+ }
+
+ if (!st->ss->container_content)
+ return 1;
+
+ cc = st->ss->container_content(st, subarray);
+ for (content = cc; content ; content = content->next) {
+ /*
+ * check if reshape is allowed based on metadata
+ * indications stored in content.array.status
+ */
+ if (is_bit_set(&content->array.state, MD_SB_BLOCK_VOLUME) ||
+ is_bit_set(&content->array.state, MD_SB_BLOCK_CONTAINER_RESHAPE)) {
+ pr_err("Cannot reshape arrays in container with unsupported metadata: %s(%s)\n",
+ devname, container);
+ goto error;
+ }
+ if (content->consistency_policy == CONSISTENCY_POLICY_PPL) {
+ pr_err("Operation not supported when ppl consistency policy is enabled\n");
+ goto error;
+ }
+ if (content->consistency_policy == CONSISTENCY_POLICY_BITMAP) {
+ pr_err("Operation not supported when write-intent bitmap consistency policy is enabled\n");
+ goto error;
+ }
+ }
+ sysfs_free(cc);
+ if (mdmon_running(container))
+ st->update_tail = &st->updates;
+ return 0;
+error:
+ sysfs_free(cc);
+ return 1;
+}
+
int Grow_reshape(char *devname, int fd,
struct mddev_dev *devlist,
struct context *c, struct shape *s)
@@ -1799,7 +1858,7 @@ int Grow_reshape(char *devname, int fd,
struct supertype *st;
char *subarray = NULL;
- int frozen;
+ int frozen = 0;
int changed = 0;
char *container = NULL;
int cfd = -1;
@@ -1808,7 +1867,7 @@ int Grow_reshape(char *devname, int fd,
int added_disks;
struct mdinfo info;
- struct mdinfo *sra;
+ struct mdinfo *sra = NULL;
if (md_get_array_info(fd, &array) < 0) {
pr_err("%s is not an active md array - aborting\n",
@@ -1865,13 +1924,7 @@ int Grow_reshape(char *devname, int fd,
}
}
- /* in the external case we need to check that the requested reshape is
- * supported, and perform an initial check that the container holds the
- * pre-requisite spare devices (mdmon owns final validation)
- */
if (st->ss->external) {
- int retval;
-
if (subarray) {
container = st->container_devnm;
cfd = open_dev_excl(st->container_devnm);
@@ -1887,13 +1940,12 @@ int Grow_reshape(char *devname, int fd,
return 1;
}
- retval = st->ss->load_container(st, cfd, NULL);
-
- if (retval) {
- pr_err("Cannot read superblock for %s\n", devname);
- close(cfd);
+ rv = prepare_external_reshape(devname, subarray, st,
+ container, cfd);
+ if (rv > 0) {
free(subarray);
- return 1;
+ close(cfd);
+ goto release;
}
if (s->raiddisks && subarray) {
@@ -1902,51 +1954,6 @@ int Grow_reshape(char *devname, int fd,
free(subarray);
return 1;
}
-
- /* check if operation is supported for metadata handler */
- if (st->ss->container_content) {
- struct mdinfo *cc = NULL;
- struct mdinfo *content = NULL;
-
- cc = st->ss->container_content(st, subarray);
- for (content = cc; content ; content = content->next) {
- int allow_reshape = 1;
-
- /* check if reshape is allowed based on metadata
- * indications stored in content.array.status
- */
- if (content->array.state &
- (1 << MD_SB_BLOCK_VOLUME))
- allow_reshape = 0;
- if (content->array.state &
- (1 << MD_SB_BLOCK_CONTAINER_RESHAPE))
- allow_reshape = 0;
- if (!allow_reshape) {
- pr_err("cannot reshape arrays in container with unsupported metadata: %s(%s)\n",
- devname, container);
- sysfs_free(cc);
- free(subarray);
- return 1;
- }
- if (content->consistency_policy ==
- CONSISTENCY_POLICY_PPL) {
- pr_err("Operation not supported when ppl consistency policy is enabled\n");
- sysfs_free(cc);
- free(subarray);
- return 1;
- }
- if (content->consistency_policy ==
- CONSISTENCY_POLICY_BITMAP) {
- pr_err("Operation not supported when write-intent bitmap is enabled\n");
- sysfs_free(cc);
- free(subarray);
- return 1;
- }
- }
- sysfs_free(cc);
- }
- if (mdmon_running(container))
- st->update_tail = &st->updates;
}
added_disks = 0;
diff --git a/mdadm.h b/mdadm.h
index 8208b81..941a5f3 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1539,6 +1539,7 @@ extern int stat_is_blkdev(char *devname, dev_t *rdev);
extern bool is_dev_alive(char *path);
extern int get_mdp_major(void);
extern int get_maj_min(char *dev, int *major, int *minor);
+extern bool is_bit_set(int *val, unsigned char index);
extern int dev_open(char *dev, int flags);
extern int open_dev(char *devnm);
extern void reopen_mddev(int mdfd);
diff --git a/util.c b/util.c
index ca48d97..26ffdce 100644
--- a/util.c
+++ b/util.c
@@ -1027,6 +1027,20 @@ int get_maj_min(char *dev, int *major, int *minor)
*e == 0);
}
+/**
+ * is_bit_set() - get bit value by index.
+ * @val: value.
+ * @index: index of the bit (LSB numbering).
+ *
+ * Return: bit value.
+ */
+bool is_bit_set(int *val, unsigned char index)
+{
+ if ((*val) & (1 << index))
+ return true;
+ return false;
+}
+
int dev_open(char *dev, int flags)
{
/* like 'open', but if 'dev' matches %d:%d, create a temp
--
2.35.3

View File

@ -1,40 +0,0 @@
From 8063fd0f9e8abd718bd65928c19bc607cee5acd8 Mon Sep 17 00:00:00 2001
From: Xiao Ni <xni@redhat.com>
Date: Mon, 30 Sep 2019 19:47:59 +0800
Subject: [PATCH] Init devlist as an array
Git-commit: 8063fd0f9e8abd718bd65928c19bc607cee5acd8
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
devlist is an string. It will change to an array if there is disk that
is sbd disk. If one device is sbd, it runs devlist=().
This line code changes devlist from a string to an array. If there is
no sbd device, it can't run this line code. So it will still be a string.
The later codes need an array, rather than an string. So init devlist
as an array to fix this problem.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
clustermd_tests/func.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/clustermd_tests/func.sh b/clustermd_tests/func.sh
index 642cc96..801d604 100644
--- a/clustermd_tests/func.sh
+++ b/clustermd_tests/func.sh
@@ -39,6 +39,9 @@ fetch_devlist()
devlist=($(ls /dev/disk/by-path/*$ISCSI_ID*))
fi
# sbd disk cannot use in testing
+ # Init devlist as an array
+ i=''
+ devlist=(${devlist[@]#$i})
for i in ${devlist[@]}
do
sbd -d $i dump &> /dev/null
--
2.25.0

View File

@ -0,0 +1,39 @@
From 5c3c3df646dd3b7e8df81152f08e9ac4ddccc671 Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Fri, 19 Aug 2022 02:55:46 +0200
Subject: [PATCH 52/61] Assemble: check if device is container before
scheduling force-clean update
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Up to now using assemble with force flag making each array as clean.
Force-clean should not be done for the container. This commit add
check if device is different than container before cleaning.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Assemble.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index be2160b..1dd82a8 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1809,10 +1809,9 @@ try_again:
}
#endif
}
- if (c->force && !clean &&
+ if (c->force && !clean && content->array.level != LEVEL_CONTAINER &&
!enough(content->array.level, content->array.raid_disks,
- content->array.layout, clean,
- avail)) {
+ content->array.layout, clean, avail)) {
change += st->ss->update_super(st, content, "force-array",
devices[chosen_drive].devname, c->verbose,
0, NULL);
--
2.35.3

View File

@ -1,36 +0,0 @@
From 611093148574164fcf4f24f8c076d09473f655d7 Mon Sep 17 00:00:00 2001
From: Xiao Ni <xni@redhat.com>
Date: Mon, 30 Sep 2019 19:48:00 +0800
Subject: [PATCH] Don't need to check recovery after re-add when no I/O writes
to raid
Git-commit: 611093148574164fcf4f24f8c076d09473f655d7
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
If there is no write I/O between removing member disk and re-add it, there is no
recovery after re-adding member disk.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
clustermd_tests/02r1_Manage_re-add | 2 --
1 file changed, 2 deletions(-)
diff --git a/clustermd_tests/02r1_Manage_re-add b/clustermd_tests/02r1_Manage_re-add
index dd9c416..d0d13e5 100644
--- a/clustermd_tests/02r1_Manage_re-add
+++ b/clustermd_tests/02r1_Manage_re-add
@@ -9,8 +9,6 @@ check all state UU
check all dmesg
mdadm --manage $md0 --fail $dev0 --remove $dev0
mdadm --manage $md0 --re-add $dev0
-check $NODE1 recovery
-check all wait
check all state UU
check all dmesg
stop_md all $md0
--
2.25.0

View File

@ -0,0 +1,115 @@
From 171e9743881edf2dfb163ddff483566fbf913ccd Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Fri, 26 Aug 2022 08:55:56 +1000
Subject: [PATCH 53/61] super1: report truncated device
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
When the metadata is at the start of the device, it is possible that it
describes a device large than the one it is actually stored on. When
this happens, report it loudly in --examine.
....
Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL
State : clean TRUNCATED DEVICE
....
Also report in --assemble so that the failure which the kernel will
report will be explained.
mdadm: Device /dev/sdb is not large enough for data described in superblock
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sdb has no superblock - assembly aborted
Scenario can be demonstrated as follows:
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/test started.
mdadm: stopped /dev/md/test
Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL
State : clean TRUNCATED DEVICE
Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL
State : clean TRUNCATED DEVICE
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super1.c | 35 ++++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
diff --git a/super1.c b/super1.c
index 71af860..58345e6 100644
--- a/super1.c
+++ b/super1.c
@@ -406,12 +406,18 @@ static void examine_super1(struct supertype *st, char *homehost)
st->ss->getinfo_super(st, &info, NULL);
if (info.space_after != 1 &&
- !(__le32_to_cpu(sb->feature_map) & MD_FEATURE_NEW_OFFSET))
- printf(" Unused Space : before=%llu sectors, after=%llu sectors\n",
- info.space_before, info.space_after);
-
- printf(" State : %s\n",
- (__le64_to_cpu(sb->resync_offset)+1)? "active":"clean");
+ !(__le32_to_cpu(sb->feature_map) & MD_FEATURE_NEW_OFFSET)) {
+ printf(" Unused Space : before=%llu sectors, ",
+ info.space_before);
+ if (info.space_after < INT64_MAX)
+ printf("after=%llu sectors\n", info.space_after);
+ else
+ printf("after=-%llu sectors DEVICE TOO SMALL\n",
+ UINT64_MAX - info.space_after);
+ }
+ printf(" State : %s%s\n",
+ (__le64_to_cpu(sb->resync_offset)+1) ? "active":"clean",
+ (info.space_after > INT64_MAX) ? " TRUNCATED DEVICE" : "");
printf(" Device UUID : ");
for (i=0; i<16; i++) {
if ((i&3)==0 && i != 0)
@@ -2206,6 +2212,7 @@ static int load_super1(struct supertype *st, int fd, char *devname)
tst.ss = &super1;
for (tst.minor_version = 0; tst.minor_version <= 2;
tst.minor_version++) {
+ tst.ignore_hw_compat = st->ignore_hw_compat;
switch(load_super1(&tst, fd, devname)) {
case 0: super = tst.sb;
if (bestvers == -1 ||
@@ -2312,7 +2319,6 @@ static int load_super1(struct supertype *st, int fd, char *devname)
free(super);
return 2;
}
- st->sb = super;
bsb = (struct bitmap_super_s *)(((char*)super)+MAX_SB_SIZE);
@@ -2322,6 +2328,21 @@ static int load_super1(struct supertype *st, int fd, char *devname)
if (st->data_offset == INVALID_SECTORS)
st->data_offset = __le64_to_cpu(super->data_offset);
+ if (st->minor_version >= 1 &&
+ st->ignore_hw_compat == 0 &&
+ (dsize < (__le64_to_cpu(super->data_offset) +
+ __le64_to_cpu(super->size))
+ ||
+ dsize < (__le64_to_cpu(super->data_offset) +
+ __le64_to_cpu(super->data_size)))) {
+ if (devname)
+ pr_err("Device %s is not large enough for data described in superblock\n",
+ devname);
+ free(super);
+ return 2;
+ }
+ st->sb = super;
+
/* Now check on the bitmap superblock */
if ((__le32_to_cpu(super->feature_map)&MD_FEATURE_BITMAP_OFFSET) == 0)
return 0;
--
2.35.3

View File

@ -1,52 +0,0 @@
From 7bd59e7926c6921121087eb067befaa896c900a4 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 18 Sep 2019 15:12:55 +1000
Subject: [PATCH] udev: allow for udev attribute reading bug.
Git-commit: 7bd59e7926c6921121087eb067befaa896c900a4
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
There is a bug in udev (which will hopefully get fixed, but
we should allow for it anways).
When reading a sysfs attribute, it first reads the whole
value of the attribute, then reads again expecting to get
a read of 0 bytes, like you would with an ordinary file.
If the sysfs attribute changed between these two reads, it can
get a mixture of two values.
In particular, if it reads when 'array_state' is changing from
'clear' to 'inactive', it can find the value as "clear\nve".
This causes the test for "|clear|active" to fail, so systemd is allowed
to think that the array is ready - when it isn't.
So change the pattern to allow for this but adding a wildcard at
the end.
Also don't allow for an empty string - reading array_state will
never return an empty string - if it exists at all, it will be
non-empty.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
udev-md-raid-arrays.rules | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/udev-md-raid-arrays.rules b/udev-md-raid-arrays.rules
index d391665..c8fa8e8 100644
--- a/udev-md-raid-arrays.rules
+++ b/udev-md-raid-arrays.rules
@@ -14,7 +14,7 @@ ENV{DEVTYPE}=="partition", GOTO="md_ignore_state"
# never leave state 'inactive'
ATTR{md/metadata_version}=="external:[A-Za-z]*", ATTR{md/array_state}=="inactive", GOTO="md_ignore_state"
TEST!="md/array_state", ENV{SYSTEMD_READY}="0", GOTO="md_end"
-ATTR{md/array_state}=="|clear|inactive", ENV{SYSTEMD_READY}="0", GOTO="md_end"
+ATTR{md/array_state}=="clear*|inactive", ENV{SYSTEMD_READY}="0", GOTO="md_end"
LABEL="md_ignore_state"
IMPORT{program}="BINDIR/mdadm --detail --no-devices --export $devnode"
--
2.25.0

View File

@ -1,45 +0,0 @@
From b6180160f78f0182b296bdceed6419b26a6fccc7 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Fri, 4 Oct 2019 12:07:28 +0200
Subject: [PATCH] imsm: save current_vol number
Git-commit: b6180160f78f0182b296bdceed6419b26a6fccc7
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
The imsm container_content routine will set curr_volume index in super
for getting volume information. This flag has never been restored to
original value, later other function may rely on it.
Restore this flag to original value.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/super-intel.c b/super-intel.c
index a103a3f..e02bbd7 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -7826,6 +7826,7 @@ static struct mdinfo *container_content_imsm(struct supertype *st, char *subarra
int sb_errors = 0;
struct dl *d;
int spare_disks = 0;
+ int current_vol = super->current_vol;
/* do not assemble arrays when not all attributes are supported */
if (imsm_check_attributes(mpb->attributes) == 0) {
@@ -7993,6 +7994,7 @@ static struct mdinfo *container_content_imsm(struct supertype *st, char *subarra
rest = this;
}
+ super->current_vol = current_vol;
return rest;
}
--
2.25.0

View File

@ -0,0 +1,619 @@
From 1a386f804d8392b849b3362da6b0157b0db83091 Mon Sep 17 00:00:00 2001
From: Mateusz Grzonka <mateusz.grzonka@intel.com>
Date: Fri, 12 Aug 2022 16:52:12 +0200
Subject: [PATCH 54/61] mdadm: Correct typos, punctuation and grammar in man
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
Reviewed-by: Wol <anthony@youngman.org.uk>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
mdadm.8.in | 178 ++++++++++++++++++++++++++---------------------------
1 file changed, 88 insertions(+), 90 deletions(-)
diff --git a/mdadm.8.in b/mdadm.8.in
index f273622..70c79d1 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -158,7 +158,7 @@ adding new spares and removing faulty devices.
.B Misc
This is an 'everything else' mode that supports operations on active
arrays, operations on component devices such as erasing old superblocks, and
-information gathering operations.
+information-gathering operations.
.\"This mode allows operations on independent devices such as examine MD
.\"superblocks, erasing old superblocks and stopping active arrays.
@@ -231,12 +231,12 @@ mode to be assumed.
.TP
.BR \-h ", " \-\-help
-Display general help message or, after one of the above options, a
+Display a general help message or, after one of the above options, a
mode-specific help message.
.TP
.B \-\-help\-options
-Display more detailed help about command line parsing and some commonly
+Display more detailed help about command-line parsing and some commonly
used options.
.TP
@@ -266,7 +266,7 @@ the exact meaning of this option in different contexts.
.TP
.BR \-c ", " \-\-config=
-Specify the config file or directory. If not specified, default config file
+Specify the config file or directory. If not specified, the default config file
and default conf.d directory will be used. See
.BR mdadm.conf (5)
for more details.
@@ -379,7 +379,7 @@ When creating an array, the
.B homehost
will be recorded in the metadata. For version-1 superblocks, it will
be prefixed to the array name. For version-0.90 superblocks, part of
-the SHA1 hash of the hostname will be stored in the later half of the
+the SHA1 hash of the hostname will be stored in the latter half of the
UUID.
When reporting information about an array, any array which is tagged
@@ -388,7 +388,7 @@ for the given homehost will be reported as such.
When using Auto-Assemble, only arrays tagged for the given homehost
will be allowed to use 'local' names (i.e. not ending in '_' followed
by a digit string). See below under
-.BR "Auto Assembly" .
+.BR "Auto-Assembly" .
The special name "\fBany\fP" can be used as a wild card. If an array
is created with
@@ -403,7 +403,7 @@ When
.I mdadm
needs to print the name for a device it normally finds the name in
.B /dev
-which refers to the device and is shortest. When a path component is
+which refers to the device and is the shortest. When a path component is
given with
.B \-\-prefer
.I mdadm
@@ -478,9 +478,9 @@ still be larger than any replacement.
This option can be used with
.B \-\-create
-for determining initial size of an array. For external metadata,
+for determining the initial size of an array. For external metadata,
it can be used on a volume, but not on a container itself.
-Setting initial size of
+Setting the initial size of
.B RAID 0
array is only valid for external metadata.
@@ -545,20 +545,20 @@ Clustered arrays do not support this parameter yet.
.TP
.BR \-c ", " \-\-chunk=
-Specify chunk size of kilobytes. The default when creating an
+Specify chunk size in kilobytes. The default when creating an
array is 512KB. To ensure compatibility with earlier versions, the
default when building an array with no persistent metadata is 64KB.
This is only meaningful for RAID0, RAID4, RAID5, RAID6, and RAID10.
RAID4, RAID5, RAID6, and RAID10 require the chunk size to be a power
-of 2. In any case it must be a multiple of 4KB.
+of 2, with minimal chunk size being 4KB.
A suffix of 'K', 'M', 'G' or 'T' can be given to indicate Kilobytes,
Megabytes, Gigabytes or Terabytes respectively.
.TP
.BR \-\-rounding=
-Specify rounding factor for a Linear array. The size of each
+Specify the rounding factor for a Linear array. The size of each
component will be rounded down to a multiple of this size.
This is a synonym for
.B \-\-chunk
@@ -655,7 +655,8 @@ option to set subsequent failure modes.
and "flush" will clear any persistent faults.
The layout options for RAID10 are one of 'n', 'o' or 'f' followed
-by a small number. The default is 'n2'. The supported options are:
+by a small number signifying the number of copies of each datablock.
+The default is 'n2'. The supported options are:
.I 'n'
signals 'near' copies. Multiple copies of one data block are at
@@ -673,7 +674,7 @@ signals 'far' copies
(multiple copies have very different offsets).
See md(4) for more detail about 'near', 'offset', and 'far'.
-The number is the number of copies of each datablock. 2 is normal, 3
+As for the number of copies of each data block, 2 is normal, 3
can be useful. This number can be at most equal to the number of
devices in the array. It does not need to divide evenly into that
number (e.g. it is perfectly legal to have an 'n2' layout for an array
@@ -684,7 +685,7 @@ A bug introduced in Linux 3.14 means that RAID0 arrays
started using a different layout. This could lead to
data corruption. Since Linux 5.4 (and various stable releases that received
backports), the kernel will not accept such an array unless
-a layout is explictly set. It can be set to
+a layout is explicitly set. It can be set to
.RB ' original '
or
.RB ' alternate '.
@@ -760,13 +761,13 @@ or by selecting a different consistency policy with
.TP
.BR \-\-bitmap\-chunk=
-Set the chunksize of the bitmap. Each bit corresponds to that many
+Set the chunk size of the bitmap. Each bit corresponds to that many
Kilobytes of storage.
-When using a file based bitmap, the default is to use the smallest
-size that is at-least 4 and requires no more than 2^21 chunks.
+When using a file-based bitmap, the default is to use the smallest
+size that is at least 4 and requires no more than 2^21 chunks.
When using an
.B internal
-bitmap, the chunksize defaults to 64Meg, or larger if necessary to
+bitmap, the chunk size defaults to 64Meg, or larger if necessary to
fit the bitmap into the available space.
A suffix of 'K', 'M', 'G' or 'T' can be given to indicate Kilobytes,
@@ -840,7 +841,7 @@ can be used with that command to avoid the automatic resync.
.BR \-\-backup\-file=
This is needed when
.B \-\-grow
-is used to increase the number of raid-devices in a RAID5 or RAID6 if
+is used to increase the number of raid devices in a RAID5 or RAID6 if
there are no spare devices available, or to shrink, change RAID level
or layout. See the GROW MODE section below on RAID\-DEVICES CHANGES.
The file must be stored on a separate device, not on the RAID array
@@ -879,7 +880,7 @@ When creating an array,
.B \-\-data\-offset
can be specified as
.BR variable .
-In the case each member device is expected to have a offset appended
+In the case each member device is expected to have an offset appended
to the name, separated by a colon. This makes it possible to recreate
exactly an array which has varying data offsets (as can happen when
different versions of
@@ -943,7 +944,7 @@ Insist that
.I mdadm
accept the geometry and layout specified without question. Normally
.I mdadm
-will not allow creation of an array with only one device, and will try
+will not allow the creation of an array with only one device, and will try
to create a RAID5 array with one missing drive (as this makes the
initial resync work faster). With
.BR \-\-force ,
@@ -1004,7 +1005,7 @@ number added, e.g.
If the md device name is in a 'standard' format as described in DEVICE
NAMES, then it will be created, if necessary, with the appropriate
device number based on that name. If the device name is not in one of these
-formats, then a unused device number will be allocated. The device
+formats, then an unused device number will be allocated. The device
number will be considered unused if there is no active array for that
number, and there is no entry in /dev for that number and with a
non-standard name. Names that are not in 'standard' format are only
@@ -1032,25 +1033,25 @@ then
.B \-\-add
can be used to add some extra devices to be included in the array.
In most cases this is not needed as the extra devices can be added as
-spares first, and then the number of raid-disks can be changed.
-However for RAID0, it is not possible to add spares. So to increase
+spares first, and then the number of raid disks can be changed.
+However, for RAID0 it is not possible to add spares. So to increase
the number of devices in a RAID0, it is necessary to set the new
number of devices, and to add the new devices, in the same command.
.TP
.BR \-\-nodes
-Only works when the array is for clustered environment. It specifies
+Only works when the array is created for a clustered environment. It specifies
the maximum number of nodes in the cluster that will use this device
simultaneously. If not specified, this defaults to 4.
.TP
.BR \-\-write-journal
Specify journal device for the RAID-4/5/6 array. The journal device
-should be a SSD with reasonable lifetime.
+should be an SSD with a reasonable lifetime.
.TP
.BR \-k ", " \-\-consistency\-policy=
-Specify how the array maintains consistency in case of unexpected shutdown.
+Specify how the array maintains consistency in the case of an unexpected shutdown.
Only relevant for RAID levels with redundancy.
Currently supported options are:
.RS
@@ -1058,7 +1059,7 @@ Currently supported options are:
.TP
.B resync
Full resync is performed and all redundancy is regenerated when the array is
-started after unclean shutdown.
+started after an unclean shutdown.
.TP
.B bitmap
@@ -1067,8 +1068,8 @@ Resync assisted by a write-intent bitmap. Implicitly selected when using
.TP
.B journal
-For RAID levels 4/5/6, journal device is used to log transactions and replay
-after unclean shutdown. Implicitly selected when using
+For RAID levels 4/5/6, the journal device is used to log transactions and replay
+after an unclean shutdown. Implicitly selected when using
.BR \-\-write\-journal .
.TP
@@ -1237,7 +1238,7 @@ This can be useful if
reports a different "Preferred Minor" to
.BR \-\-detail .
In some cases this update will be performed automatically
-by the kernel driver. In particular the update happens automatically
+by the kernel driver. In particular, the update happens automatically
at the first write to an array with redundancy (RAID level 1 or
greater) on a 2.6 (or later) kernel.
@@ -1277,7 +1278,7 @@ For version-1 superblocks, this involves updating the name.
The
.B home\-cluster
option will change the cluster name as recorded in the superblock and
-bitmap. This option only works for clustered environment.
+bitmap. This option only works for a clustered environment.
The
.B resync
@@ -1390,10 +1391,10 @@ This option should be used with great caution.
.TP
.BR \-\-freeze\-reshape
-Option is intended to be used in start-up scripts during initrd boot phase.
-When array under reshape is assembled during initrd phase, this option
-stops reshape after reshape critical section is being restored. This happens
-before file system pivot operation and avoids loss of file system context.
+This option is intended to be used in start-up scripts during the initrd boot phase.
+When the array under reshape is assembled during the initrd phase, this option
+stops the reshape after the reshape-critical section has been restored. This happens
+before the file system pivot operation and avoids loss of filesystem context.
Losing file system context would cause reshape to be broken.
Reshape can be continued later using the
@@ -1437,9 +1438,9 @@ re\-add a device that was previously removed from an array.
If the metadata on the device reports that it is a member of the
array, and the slot that it used is still vacant, then the device will
be added back to the array in the same position. This will normally
-cause the data for that device to be recovered. However based on the
+cause the data for that device to be recovered. However, based on the
event count on the device, the recovery may only require sections that
-are flagged a write-intent bitmap to be recovered or may not require
+are flagged by a write-intent bitmap to be recovered or may not require
any recovery at all.
When used on an array that has no metadata (i.e. it was built with
@@ -1447,13 +1448,12 @@ When used on an array that has no metadata (i.e. it was built with
it will be assumed that bitmap-based recovery is enough to make the
device fully consistent with the array.
-When used with v1.x metadata,
.B \-\-re\-add
-can be accompanied by
+can also be accompanied by
.BR \-\-update=devicesize ,
.BR \-\-update=bbl ", or"
.BR \-\-update=no\-bbl .
-See the description of these option when used in Assemble mode for an
+See descriptions of these options when used in Assemble mode for an
explanation of their use.
If the device name given is
@@ -1480,7 +1480,7 @@ Add a device as a spare. This is similar to
except that it does not attempt
.B \-\-re\-add
first. The device will be added as a spare even if it looks like it
-could be an recent member of the array.
+could be a recent member of the array.
.TP
.BR \-r ", " \-\-remove
@@ -1497,12 +1497,12 @@ and names like
.B set-A
can be given to
.BR \-\-remove .
-The first causes all failed device to be removed. The second causes
+The first causes all failed devices to be removed. The second causes
any device which is no longer connected to the system (i.e an 'open'
returns
.BR ENXIO )
to be removed.
-The third will remove a set as describe below under
+The third will remove a set as described below under
.BR \-\-fail .
.TP
@@ -1519,7 +1519,7 @@ For RAID10 arrays where the number of copies evenly divides the number
of devices, the devices can be conceptually divided into sets where
each set contains a single complete copy of the data on the array.
Sometimes a RAID10 array will be configured so that these sets are on
-separate controllers. In this case all the devices in one set can be
+separate controllers. In this case, all the devices in one set can be
failed by giving a name like
.B set\-A
or
@@ -1549,9 +1549,9 @@ This can follow a list of
.B \-\-replace
devices. The devices listed after
.B \-\-with
-will be preferentially used to replace the devices listed after
+will preferentially be used to replace the devices listed after
.BR \-\-replace .
-These device must already be spare devices in the array.
+These devices must already be spare devices in the array.
.TP
.BR \-\-write\-mostly
@@ -1574,8 +1574,8 @@ the device is found or <slot>:missing in case the device is not found.
.TP
.BR \-\-add-journal
-Add journal to an existing array, or recreate journal for RAID-4/5/6 array
-that lost a journal device. To avoid interrupting on-going write opertions,
+Add a journal to an existing array, or recreate journal for a RAID-4/5/6 array
+that lost a journal device. To avoid interrupting ongoing write operations,
.B \-\-add-journal
only works for array in Read-Only state.
@@ -1631,9 +1631,9 @@ Print details of one or more md devices.
.TP
.BR \-\-detail\-platform
Print details of the platform's RAID capabilities (firmware / hardware
-topology) for a given metadata format. If used without argument, mdadm
+topology) for a given metadata format. If used without an argument, mdadm
will scan all controllers looking for their capabilities. Otherwise, mdadm
-will only look at the controller specified by the argument in form of an
+will only look at the controller specified by the argument in the form of an
absolute filepath or a link, e.g.
.IR /sys/devices/pci0000:00/0000:00:1f.2 .
@@ -1742,8 +1742,8 @@ the block where the superblock would be is overwritten even if it
doesn't appear to be valid.
.B Note:
-Be careful to call \-\-zero\-superblock with clustered raid, make sure
-array isn't used or assembled in other cluster node before execute it.
+Be careful when calling \-\-zero\-superblock with clustered raid. Make sure
+the array isn't used or assembled in another cluster node before executing it.
.TP
.B \-\-kill\-subarray=
@@ -1790,7 +1790,7 @@ For each md device given, or each device in /proc/mdstat if
is given, arrange for the array to be marked clean as soon as possible.
.I mdadm
will return with success if the array uses external metadata and we
-successfully waited. For native arrays this returns immediately as the
+successfully waited. For native arrays, this returns immediately as the
kernel handles dirty-clean transitions at shutdown. No action is taken
if safe-mode handling is disabled.
@@ -1830,7 +1830,7 @@ uses to help track which arrays are currently being assembled.
.TP
.BR \-\-run ", " \-R
-Run any array assembled as soon as a minimal number of devices are
+Run any array assembled as soon as a minimal number of devices is
available, rather than waiting until all expected devices are present.
.TP
@@ -1860,7 +1860,7 @@ Only used with \-\-fail. The 'path' given will be recorded so that if
a new device appears at the same location it can be automatically
added to the same array. This allows the failed device to be
automatically replaced by a new device without metadata if it appears
-at specified path. This option is normally only set by a
+at specified path. This option is normally only set by an
.I udev
script.
@@ -1961,7 +1961,7 @@ Usage:
.PP
This usage assembles one or more RAID arrays from pre-existing components.
For each array, mdadm needs to know the md device, the identity of the
-array, and a number of component-devices. These can be found in a number of ways.
+array, and the number of component devices. These can be found in a number of ways.
In the first usage example (without the
.BR \-\-scan )
@@ -2001,7 +2001,7 @@ The config file is only used if explicitly named with
.B \-\-config
or requested with (a possibly implicit)
.BR \-\-scan .
-In the later case, default config file is used. See
+In the latter case, the default config file is used. See
.BR mdadm.conf (5)
for more details.
@@ -2039,14 +2039,14 @@ detects that udev is not configured, it will create the devices in
.B /dev
itself.
-In Linux kernels prior to version 2.6.28 there were two distinctly
-different types of md devices that could be created: one that could be
+In Linux kernels prior to version 2.6.28 there were two distinct
+types of md devices that could be created: one that could be
partitioned using standard partitioning tools and one that could not.
-Since 2.6.28 that distinction is no longer relevant as both type of
+Since 2.6.28 that distinction is no longer relevant as both types of
devices can be partitioned.
.I mdadm
will normally create the type that originally could not be partitioned
-as it has a well defined major number (9).
+as it has a well-defined major number (9).
Prior to 2.6.28, it is important that mdadm chooses the correct type
of array device to use. This can be controlled with the
@@ -2066,7 +2066,7 @@ can also be given in the configuration file as a word starting
.B auto=
on the ARRAY line for the relevant array.
-.SS Auto Assembly
+.SS Auto-Assembly
When
.B \-\-assemble
is used with
@@ -2122,11 +2122,11 @@ See
.IR mdadm.conf (5)
for further details.
-Note: Auto assembly cannot be used for assembling and activating some
+Note: Auto-assembly cannot be used for assembling and activating some
arrays which are undergoing reshape. In particular as the
.B backup\-file
-cannot be given, any reshape which requires a backup-file to continue
-cannot be started by auto assembly. An array which is growing to more
+cannot be given, any reshape which requires a backup file to continue
+cannot be started by auto-assembly. An array which is growing to more
devices and has passed the critical section can be assembled using
auto-assembly.
@@ -2233,7 +2233,7 @@ When creating a partition based array, using
.I mdadm
with version-1.x metadata, the partition type should be set to
.B 0xDA
-(non fs-data). This type selection allows for greater precision since
+(non fs-data). This type of selection allows for greater precision since
using any other [RAID auto-detect (0xFD) or a GNU/Linux partition (0x83)],
might create problems in the event of array recovery through a live cdrom.
@@ -2249,7 +2249,7 @@ when creating a v0.90 array will silently override any
setting.
.\"If the
.\".B \-\-size
-.\"option is given, it is not necessary to list any component-devices in this command.
+.\"option is given, it is not necessary to list any component devices in this command.
.\"They can be added later, before a
.\".B \-\-run.
.\"If no
@@ -2263,7 +2263,7 @@ requested with the
.B \-\-bitmap
option or a different consistency policy is selected with the
.B \-\-consistency\-policy
-option. In any case space for a bitmap will be reserved so that one
+option. In any case, space for a bitmap will be reserved so that one
can be added later with
.BR "\-\-grow \-\-bitmap=internal" .
@@ -2313,7 +2313,7 @@ will firstly mark
as faulty in
.B /dev/md0
and will then remove it from the array and finally add it back
-in as a spare. However only one md array can be affected by a single
+in as a spare. However, only one md array can be affected by a single
command.
When a device is added to an active array, mdadm checks to see if it
@@ -2458,14 +2458,14 @@ config file to be examined.
If the device contains RAID metadata, a file will be created in the
.I directory
and the metadata will be written to it. The file will be the same
-size as the device and have the metadata written in the file at the
-same locate that it exists in the device. However the file will be "sparse" so
+size as the device and will have the metadata written at the
+same location as it exists in the device. However, the file will be "sparse" so
that only those blocks containing metadata will be allocated. The
total space used will be small.
-The file name used in the
+The filename used in the
.I directory
-will be the base name of the device. Further if any links appear in
+will be the base name of the device. Further, if any links appear in
.I /dev/disk/by-id
which point to the device, then hard links to the file will be created
in
@@ -2567,7 +2567,7 @@ and if the destination array has a failed drive but no spares.
If any devices are listed on the command line,
.I mdadm
-will only monitor those devices. Otherwise all arrays listed in the
+will only monitor those devices, otherwise, all arrays listed in the
configuration file will be monitored. Further, if
.B \-\-scan
is given, then any other md devices that appear in
@@ -2624,10 +2624,10 @@ check, repair). (syslog priority: Warning)
.BI Rebuild NN
Where
.I NN
-is a two-digit number (ie. 05, 48). This indicates that rebuild
-has passed that many percent of the total. The events are generated
-with fixed increment since 0. Increment size may be specified with
-a commandline option (default is 20). (syslog priority: Warning)
+is a two-digit number (eg. 05, 48). This indicates that the rebuild
+has reached that percentage of the total. The events are generated
+at a fixed increment from 0. The increment size may be specified with
+a command-line option (the default is 20). (syslog priority: Warning)
.TP
.B RebuildFinished
@@ -2735,8 +2735,8 @@ When
detects that an array in a spare group has fewer active
devices than necessary for the complete array, and has no spare
devices, it will look for another array in the same spare group that
-has a full complement of working drive and a spare. It will then
-attempt to remove the spare from the second drive and add it to the
+has a full complement of working drives and a spare. It will then
+attempt to remove the spare from the second array and add it to the
first.
If the removal succeeds but the adding fails, then it is added back to
the original array.
@@ -2750,10 +2750,8 @@ and then follow similar steps as above if a matching spare is found.
.SH GROW MODE
The GROW mode is used for changing the size or shape of an active
array.
-For this to work, the kernel must support the necessary change.
-Various types of growth are being added during 2.6 development.
-Currently the supported changes include
+During the kernel 2.6 era the following changes were added:
.IP \(bu 4
change the "size" attribute for RAID1, RAID4, RAID5 and RAID6.
.IP \(bu 4
@@ -2796,8 +2794,8 @@ use more than half of a spare device for backup space.
.SS SIZE CHANGES
Normally when an array is built the "size" is taken from the smallest
-of the drives. If all the small drives in an arrays are, one at a
-time, removed and replaced with larger drives, then you could have an
+of the drives. If all the small drives in an arrays are, over time,
+removed and replaced with larger drives, then you could have an
array of large drives with only a small amount used. In this
situation, changing the "size" with "GROW" mode will allow the extra
space to start being used. If the size is increased in this way, a
@@ -2812,7 +2810,7 @@ after growing, or to reduce its size
.B prior
to shrinking the array.
-Also the size of an array cannot be changed while it has an active
+Also, the size of an array cannot be changed while it has an active
bitmap. If an array has a bitmap, it must be removed before the size
can be changed. Once the change is complete a new bitmap can be created.
@@ -2892,7 +2890,7 @@ long time. A
is required. If the array is not simultaneously being grown or
shrunk, so that the array size will remain the same - for example,
reshaping a 3-drive RAID5 into a 4-drive RAID6 - the backup file will
-be used not just for a "cricital section" but throughout the reshape
+be used not just for a "critical section" but throughout the reshape
operation, as described below under LAYOUT CHANGES.
.SS CHUNK-SIZE AND LAYOUT CHANGES
@@ -2910,7 +2908,7 @@ slowly.
If the reshape is interrupted for any reason, this backup file must be
made available to
.B "mdadm --assemble"
-so the array can be reassembled. Consequently the file cannot be
+so the array can be reassembled. Consequently, the file cannot be
stored on the device being reshaped.
--
2.35.3

View File

@ -0,0 +1,94 @@
From fc6fd4063769f4194c3fb8f77b32b2819e140fb9 Mon Sep 17 00:00:00 2001
From: Mateusz Kusiak <mateusz.kusiak@intel.com>
Date: Thu, 18 Aug 2022 11:47:21 +0200
Subject: [PATCH 55/61] Manage: Block unsafe member failing
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Kernel may or may not block mdadm from removing member device if it
will cause arrays failed state. It depends on raid personality
implementation in kernel.
Add verification on requested removal path (#mdadm --set-faulty
command).
Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Manage.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 52 insertions(+), 1 deletion(-)
diff --git a/Manage.c b/Manage.c
index a142f8b..b1d0e63 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1285,6 +1285,50 @@ int Manage_with(struct supertype *tst, int fd, struct mddev_dev *dv,
return -1;
}
+/**
+ * is_remove_safe() - Check if remove is safe.
+ * @array: Array info.
+ * @fd: Array file descriptor.
+ * @devname: Name of device to remove.
+ * @verbose: Verbose.
+ *
+ * The function determines if array will be operational
+ * after removing &devname.
+ *
+ * Return: True if array will be operational, false otherwise.
+ */
+bool is_remove_safe(mdu_array_info_t *array, const int fd, char *devname, const int verbose)
+{
+ dev_t devid = devnm2devid(devname + 5);
+ struct mdinfo *mdi = sysfs_read(fd, NULL, GET_DEVS | GET_DISKS | GET_STATE);
+
+ if (!mdi) {
+ if (verbose)
+ pr_err("Failed to read sysfs attributes for %s\n", devname);
+ return false;
+ }
+
+ char *avail = xcalloc(array->raid_disks, sizeof(char));
+
+ for (mdi = mdi->devs; mdi; mdi = mdi->next) {
+ if (mdi->disk.raid_disk < 0)
+ continue;
+ if (!(mdi->disk.state & (1 << MD_DISK_SYNC)))
+ continue;
+ if (makedev(mdi->disk.major, mdi->disk.minor) == devid)
+ continue;
+ avail[mdi->disk.raid_disk] = 1;
+ }
+ sysfs_free(mdi);
+
+ bool is_enough = enough(array->level, array->raid_disks,
+ array->layout, (array->state & 1),
+ avail);
+
+ free(avail);
+ return is_enough;
+}
+
int Manage_subdevs(char *devname, int fd,
struct mddev_dev *devlist, int verbose, int test,
char *update, int force)
@@ -1598,7 +1642,14 @@ int Manage_subdevs(char *devname, int fd,
break;
case 'f': /* set faulty */
- /* FIXME check current member */
+ if (!is_remove_safe(&array, fd, dv->devname, verbose)) {
+ pr_err("Cannot remove %s from %s, array will be failed.\n",
+ dv->devname, devname);
+ if (sysfd >= 0)
+ close(sysfd);
+ goto abort;
+ }
+
if ((sysfd >= 0 && write(sysfd, "faulty", 6) != 6) ||
(sysfd < 0 && ioctl(fd, SET_DISK_FAULTY,
rdev))) {
--
2.35.3

View File

@ -1,55 +0,0 @@
From 1a1ced1e2e64a6b4b349a3fb559f6b39e4cf7103 Mon Sep 17 00:00:00 2001
From: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Date: Fri, 8 Nov 2019 11:59:11 +0100
Subject: [PATCH] imsm: allow to specify second volume size
Git-commit: 1a1ced1e2e64a6b4b349a3fb559f6b39e4cf7103
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Removed checks which limited second volume size only to max value (the
largest size that fits on all current drives). It is now permitted
to create second volume with size lower then maximum possible.
Signed-off-by: Krzysztof Smolinski <krzysztof.smolinski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index e02bbd7..713058c 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -7298,11 +7298,8 @@ static int validate_geometry_imsm_volume(struct supertype *st, int level,
maxsize = merge_extents(super, i);
- if (!check_env("IMSM_NO_PLATFORM") &&
- mpb->num_raid_devs > 0 && size && size != maxsize) {
- pr_err("attempting to create a second volume with size less then remaining space. Aborting...\n");
- return 0;
- }
+ if (mpb->num_raid_devs > 0 && size && size != maxsize)
+ pr_err("attempting to create a second volume with size less then remaining space.\n");
if (maxsize < size || maxsize == 0) {
if (verbose) {
@@ -7393,11 +7390,8 @@ static int imsm_get_free_size(struct supertype *st, int raiddisks,
}
maxsize = size;
}
- if (!check_env("IMSM_NO_PLATFORM") &&
- mpb->num_raid_devs > 0 && size && size != maxsize) {
- pr_err("attempting to create a second volume with size less then remaining space. Aborting...\n");
- return 0;
- }
+ if (mpb->num_raid_devs > 0 && size && size != maxsize)
+ pr_err("attempting to create a second volume with size less then remaining space.\n");
cnt = 0;
for (dl = super->disks; dl; dl = dl->next)
if (dl->e)
--
2.25.0

View File

@ -0,0 +1,115 @@
From 55c10e4de13abe3e6934895e1fff7d2d20d0b2c2 Mon Sep 17 00:00:00 2001
From: Pawel Baldysiak <pawel.baldysiak@intel.com>
Date: Thu, 1 Sep 2022 11:20:31 +0200
Subject: [PATCH 56/61] Monitor: Fix statelist memory leaks
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Free statelist in error path in Monitor initialization.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Monitor.c | 40 +++++++++++++++++++++++++++++++---------
1 file changed, 31 insertions(+), 9 deletions(-)
diff --git a/Monitor.c b/Monitor.c
index 93f36ac..b4e954c 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -74,6 +74,7 @@ static int add_new_arrays(struct mdstat_ent *mdstat, struct state **statelist,
int test, struct alert_info *info);
static void try_spare_migration(struct state *statelist, struct alert_info *info);
static void link_containers_with_subarrays(struct state *list);
+static void free_statelist(struct state *statelist);
#ifndef NO_LIBUDEV
static int check_udev_activity(void);
#endif
@@ -128,7 +129,6 @@ int Monitor(struct mddev_dev *devlist,
*/
struct state *statelist = NULL;
- struct state *st2;
int finished = 0;
struct mdstat_ent *mdstat = NULL;
char *mailfrom;
@@ -185,12 +185,14 @@ int Monitor(struct mddev_dev *devlist,
continue;
if (strcasecmp(mdlist->devname, "<ignore>") == 0)
continue;
+ if (!is_mddev(mdlist->devname)) {
+ free_statelist(statelist);
+ return 1;
+ }
st = xcalloc(1, sizeof *st);
snprintf(st->devname, MD_NAME_MAX + sizeof("/dev/md/"),
"/dev/md/%s", basename(mdlist->devname));
- if (!is_mddev(mdlist->devname))
- return 1;
st->next = statelist;
st->devnm[0] = 0;
st->percent = RESYNC_UNKNOWN;
@@ -206,8 +208,10 @@ int Monitor(struct mddev_dev *devlist,
for (dv = devlist; dv; dv = dv->next) {
struct state *st;
- if (!is_mddev(dv->devname))
+ if (!is_mddev(dv->devname)) {
+ free_statelist(statelist);
return 1;
+ }
st = xcalloc(1, sizeof *st);
mdlist = conf_get_ident(dv->devname);
@@ -294,16 +298,16 @@ int Monitor(struct mddev_dev *devlist,
for (stp = &statelist; (st = *stp) != NULL; ) {
if (st->from_auto && st->err > 5) {
*stp = st->next;
- free(st->spare_group);
+ if (st->spare_group)
+ free(st->spare_group);
+
free(st);
} else
stp = &st->next;
}
}
- for (st2 = statelist; st2; st2 = statelist) {
- statelist = st2->next;
- free(st2);
- }
+
+ free_statelist(statelist);
if (pidfile)
unlink(pidfile);
@@ -1056,6 +1060,24 @@ static void link_containers_with_subarrays(struct state *list)
}
}
+/**
+ * free_statelist() - Frees statelist.
+ * @statelist: statelist to free
+ */
+static void free_statelist(struct state *statelist)
+{
+ struct state *tmp = NULL;
+
+ while (statelist) {
+ if (statelist->spare_group)
+ free(statelist->spare_group);
+
+ tmp = statelist;
+ statelist = statelist->next;
+ free(tmp);
+ }
+}
+
#ifndef NO_LIBUDEV
/* function: check_udev_activity
* Description: Function waits for udev to finish
--
2.35.3

View File

@ -1,47 +0,0 @@
From 6636788aaf4ec0cacaefb6e77592e4a68e70a957 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Fri, 18 Oct 2019 11:10:34 +1100
Subject: [PATCH] mdcheck: when mdcheck_start is enabled, enable
mdcheck_continue too.
Git-commit: 6636788aaf4ec0cacaefb6e77592e4a68e70a957
Patch-mainline: mdadm-4.1+
References: bsc#1153258
mdcheck_continue continues a regular array scan that was started by
mdcheck_start.
mdcheck_start will ensure that mdcheck_continue is active.
Howver if you reboot after a check has started, but before it finishes,
then mdcheck_continue won't cause it to continue, because nothing
starts it on boot.
So add an install option for mdcheck_contine, and make sure it
gets enabled when mdcheck_start is enabled.
Signed-off-by: NeilBrown <neilb@suse.de>
---
systemd/mdcheck_continue.timer | 2 ++
systemd/mdcheck_start.timer | 1 +
2 files changed, 3 insertions(+)
diff --git a/systemd/mdcheck_continue.timer b/systemd/mdcheck_continue.timer
index 3ccfd7858a3f..dba1074c1f44 100644
--- a/systemd/mdcheck_continue.timer
+++ b/systemd/mdcheck_continue.timer
@@ -11,3 +11,5 @@ Description=MD array scrubbing - continuation
[Timer]
OnCalendar= 1:05:00
+[Install]
+WantedBy= mdmonitor.service
diff --git a/systemd/mdcheck_start.timer b/systemd/mdcheck_start.timer
index 64807362d649..9e7e02ab7333 100644
--- a/systemd/mdcheck_start.timer
+++ b/systemd/mdcheck_start.timer
@@ -13,3 +13,4 @@ OnCalendar=Sun *-*-1..7 1:00:00
[Install]
WantedBy= mdmonitor.service
+Also= mdcheck_continue.timer
--
2.23.0

View File

@ -0,0 +1,64 @@
From ea7a02a3294aae223e1329aed5da7f4aa3ac05c5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Old=C5=99ich=20Jedli=C4=8Dka?= <oldium.pro@gmail.com>
Date: Wed, 31 Aug 2022 19:57:29 +0200
Subject: [PATCH 57/61] mdadm: added support for Intel Alderlake RST on VMD
platform
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Alderlake RST on VMD uses RstVmdV UEFI variable name, so detect it.
Signed-off-by: Oldřich Jedlička <oldium.pro@gmail.com>
Reviewed-by: Kinga Tanska <kinga.tanska@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
platform-intel.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/platform-intel.c b/platform-intel.c
index 5a8729e..757f0b1 100644
--- a/platform-intel.c
+++ b/platform-intel.c
@@ -512,7 +512,8 @@ static const struct imsm_orom *find_imsm_hba_orom(struct sys_dev *hba)
#define AHCI_PROP "RstSataV"
#define AHCI_SSATA_PROP "RstsSatV"
#define AHCI_TSATA_PROP "RsttSatV"
-#define VMD_PROP "RstUefiV"
+#define VROC_VMD_PROP "RstUefiV"
+#define RST_VMD_PROP "RstVmdV"
#define VENDOR_GUID \
EFI_GUID(0x193dfefa, 0xa445, 0x4302, 0x99, 0xd8, 0xef, 0x3a, 0xad, 0x1a, 0x04, 0xc6)
@@ -605,6 +606,7 @@ const struct imsm_orom *find_imsm_efi(struct sys_dev *hba)
struct orom_entry *ret;
static const char * const sata_efivars[] = {AHCI_PROP, AHCI_SSATA_PROP,
AHCI_TSATA_PROP};
+ static const char * const vmd_efivars[] = {VROC_VMD_PROP, RST_VMD_PROP};
unsigned long i;
if (check_env("IMSM_TEST_AHCI_EFI") || check_env("IMSM_TEST_SCU_EFI"))
@@ -636,10 +638,16 @@ const struct imsm_orom *find_imsm_efi(struct sys_dev *hba)
break;
case SYS_DEV_VMD:
- if (!read_efi_variable(&orom, sizeof(orom), VMD_PROP,
- VENDOR_GUID))
- break;
- return NULL;
+ for (i = 0; i < ARRAY_SIZE(vmd_efivars); i++) {
+ if (!read_efi_variable(&orom, sizeof(orom),
+ vmd_efivars[i], VENDOR_GUID))
+ break;
+ }
+
+ if (i == ARRAY_SIZE(vmd_efivars))
+ return NULL;
+
+ break;
default:
return NULL;
}
--
2.35.3

View File

@ -0,0 +1,113 @@
From ea109700563d93704ebdc540c7770d874369f667 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Fri, 9 Sep 2022 15:50:33 +0200
Subject: [PATCH 58/61] mdadm: Add Documentation entries to systemd services
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Add documentation section.
Copied from Debian.
Cc: Felix Lechner <felix.lechner@lease-up.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
systemd/mdadm-grow-continue@.service | 1 +
systemd/mdadm-last-resort@.service | 1 +
systemd/mdcheck_continue.service | 3 ++-
systemd/mdcheck_start.service | 1 +
systemd/mdmon@.service | 1 +
systemd/mdmonitor-oneshot.service | 1 +
systemd/mdmonitor.service | 1 +
7 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/systemd/mdadm-grow-continue@.service b/systemd/mdadm-grow-continue@.service
index 9fdc8ec..64b8254 100644
--- a/systemd/mdadm-grow-continue@.service
+++ b/systemd/mdadm-grow-continue@.service
@@ -8,6 +8,7 @@
[Unit]
Description=Manage MD Reshape on /dev/%I
DefaultDependencies=no
+Documentation=man:mdadm(8)
[Service]
ExecStart=BINDIR/mdadm --grow --continue /dev/%I
diff --git a/systemd/mdadm-last-resort@.service b/systemd/mdadm-last-resort@.service
index efeb3f6..e938112 100644
--- a/systemd/mdadm-last-resort@.service
+++ b/systemd/mdadm-last-resort@.service
@@ -2,6 +2,7 @@
Description=Activate md array %I even though degraded
DefaultDependencies=no
ConditionPathExists=!/sys/devices/virtual/block/%i/md/sync_action
+Documentation=man:mdadm(8)
[Service]
Type=oneshot
diff --git a/systemd/mdcheck_continue.service b/systemd/mdcheck_continue.service
index 854317f..f532490 100644
--- a/systemd/mdcheck_continue.service
+++ b/systemd/mdcheck_continue.service
@@ -7,7 +7,8 @@
[Unit]
Description=MD array scrubbing - continuation
-ConditionPathExistsGlob = /var/lib/mdcheck/MD_UUID_*
+ConditionPathExistsGlob=/var/lib/mdcheck/MD_UUID_*
+Documentation=man:mdadm(8)
[Service]
Type=oneshot
diff --git a/systemd/mdcheck_start.service b/systemd/mdcheck_start.service
index 3bb3d13..703a658 100644
--- a/systemd/mdcheck_start.service
+++ b/systemd/mdcheck_start.service
@@ -8,6 +8,7 @@
[Unit]
Description=MD array scrubbing
Wants=mdcheck_continue.timer
+Documentation=man:mdadm(8)
[Service]
Type=oneshot
diff --git a/systemd/mdmon@.service b/systemd/mdmon@.service
index 7753395..97a1acd 100644
--- a/systemd/mdmon@.service
+++ b/systemd/mdmon@.service
@@ -9,6 +9,7 @@
Description=MD Metadata Monitor on /dev/%I
DefaultDependencies=no
Before=initrd-switch-root.target
+Documentation=man:mdmon(8)
[Service]
# mdmon should never complain due to lack of a platform,
diff --git a/systemd/mdmonitor-oneshot.service b/systemd/mdmonitor-oneshot.service
index 373955a..ba86b44 100644
--- a/systemd/mdmonitor-oneshot.service
+++ b/systemd/mdmonitor-oneshot.service
@@ -7,6 +7,7 @@
[Unit]
Description=Reminder for degraded MD arrays
+Documentation=man:mdadm(8)
[Service]
Environment=MDADM_MONITOR_ARGS=--scan
diff --git a/systemd/mdmonitor.service b/systemd/mdmonitor.service
index 46f7b88..9c36478 100644
--- a/systemd/mdmonitor.service
+++ b/systemd/mdmonitor.service
@@ -8,6 +8,7 @@
[Unit]
Description=MD array monitor
DefaultDependencies=no
+Documentation=man:mdadm(8)
[Service]
Environment= MDADM_MONITOR_ARGS=--scan
--
2.35.3

View File

@ -0,0 +1,34 @@
From f7cbd810b639eb946ba1b3bddb1faefb9696de42 Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Date: Fri, 9 Sep 2022 15:50:34 +0200
Subject: [PATCH 59/61] ReadMe: fix command-line help
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Make command-line help consistent with manual page.
Copied from Debian.
Cc: Felix Lechner <felix.lechner@lease-up.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
ReadMe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/ReadMe.c b/ReadMe.c
index 7f94847..50a5e36 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -477,7 +477,7 @@ char Help_assemble[] =
;
char Help_manage[] =
-"Usage: mdadm arraydevice options component devices...\n"
+"Usage: mdadm [mode] arraydevice [options] <component devices...>\n"
"\n"
"This usage is for managing the component devices within an array.\n"
"The --manage option is not needed and is assumed if the first argument\n"
--
2.35.3

View File

@ -0,0 +1,259 @@
From 6f2af6a48c541f207cb727a31fb86de2cd04fc21 Mon Sep 17 00:00:00 2001
From: Kinga Tanska <kinga.tanska@intel.com>
Date: Fri, 2 Sep 2022 08:49:23 +0200
Subject: [PATCH 60/61] mdadm: replace container level checking with inline
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
To unify all containers checks in code, is_container() function is
added and propagated.
Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Assemble.c | 7 +++----
Create.c | 6 +++---
Grow.c | 6 +++---
Incremental.c | 4 ++--
mdadm.h | 14 ++++++++++++++
super-ddf.c | 6 +++---
super-intel.c | 4 ++--
super0.c | 2 +-
super1.c | 2 +-
sysfs.c | 2 +-
10 files changed, 33 insertions(+), 20 deletions(-)
diff --git a/Assemble.c b/Assemble.c
index 1dd82a8..8b0af0c 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1120,7 +1120,7 @@ static int start_array(int mdfd,
i/2, mddev);
}
- if (content->array.level == LEVEL_CONTAINER) {
+ if (is_container(content->array.level)) {
sysfs_rules_apply(mddev, content);
if (c->verbose >= 0) {
pr_err("Container %s has been assembled with %d drive%s",
@@ -1549,8 +1549,7 @@ try_again:
*/
trustworthy = LOCAL;
- if (name[0] == 0 &&
- content->array.level == LEVEL_CONTAINER) {
+ if (!name[0] && is_container(content->array.level)) {
name = content->text_version;
trustworthy = METADATA;
}
@@ -1809,7 +1808,7 @@ try_again:
}
#endif
}
- if (c->force && !clean && content->array.level != LEVEL_CONTAINER &&
+ if (c->force && !clean && !is_container(content->array.level) &&
!enough(content->array.level, content->array.raid_disks,
content->array.layout, clean, avail)) {
change += st->ss->update_super(st, content, "force-array",
diff --git a/Create.c b/Create.c
index e06ec2a..953e737 100644
--- a/Create.c
+++ b/Create.c
@@ -487,7 +487,7 @@ int Create(struct supertype *st, char *mddev,
st->minor_version >= 1)
/* metadata at front */
warn |= check_partitions(fd, dname, 0, 0);
- else if (s->level == 1 || s->level == LEVEL_CONTAINER ||
+ else if (s->level == 1 || is_container(s->level) ||
(s->level == 0 && s->raiddisks == 1))
/* partitions could be meaningful */
warn |= check_partitions(fd, dname, freesize*2, s->size*2);
@@ -997,7 +997,7 @@ int Create(struct supertype *st, char *mddev,
* again returns container info.
*/
st->ss->getinfo_super(st, &info_new, NULL);
- if (st->ss->external && s->level != LEVEL_CONTAINER &&
+ if (st->ss->external && !is_container(s->level) &&
!same_uuid(info_new.uuid, info.uuid, 0)) {
map_update(&map, fd2devnm(mdfd),
info_new.text_version,
@@ -1040,7 +1040,7 @@ int Create(struct supertype *st, char *mddev,
map_unlock(&map);
free(infos);
- if (s->level == LEVEL_CONTAINER) {
+ if (is_container(s->level)) {
/* No need to start. But we should signal udev to
* create links */
sysfs_uevent(&info, "change");
diff --git a/Grow.c b/Grow.c
index 0f07a89..e362403 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2175,7 +2175,7 @@ size_change_error:
devname, s->size);
}
changed = 1;
- } else if (array.level != LEVEL_CONTAINER) {
+ } else if (!is_container(array.level)) {
s->size = get_component_size(fd)/2;
if (s->size == 0)
s->size = array.size;
@@ -2231,7 +2231,7 @@ size_change_error:
info.component_size = s->size*2;
info.new_level = s->level;
info.new_chunk = s->chunk * 1024;
- if (info.array.level == LEVEL_CONTAINER) {
+ if (is_container(info.array.level)) {
info.delta_disks = UnSet;
info.array.raid_disks = s->raiddisks;
} else if (s->raiddisks)
@@ -2344,7 +2344,7 @@ size_change_error:
printf("layout for %s set to %d\n",
devname, array.layout);
}
- } else if (array.level == LEVEL_CONTAINER) {
+ } else if (is_container(array.level)) {
/* This change is to be applied to every array in the
* container. This is only needed when the metadata imposes
* restraints of the various arrays in the container.
diff --git a/Incremental.c b/Incremental.c
index 4d0cd9d..5a5f4c4 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -244,7 +244,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
c->autof = ci->autof;
name_to_use = info.name;
- if (name_to_use[0] == 0 && info.array.level == LEVEL_CONTAINER) {
+ if (name_to_use[0] == 0 && is_container(info.array.level)) {
name_to_use = info.text_version;
trustworthy = METADATA;
}
@@ -472,7 +472,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
/* 7/ Is there enough devices to possibly start the array? */
/* 7a/ if not, finish with success. */
- if (info.array.level == LEVEL_CONTAINER) {
+ if (is_container(info.array.level)) {
char devnm[32];
/* Try to assemble within the container */
sysfs_uevent(sra, "change");
diff --git a/mdadm.h b/mdadm.h
index 941a5f3..3673494 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1924,3 +1924,17 @@ enum r0layout {
* This is true for native and DDF, IMSM allows 16.
*/
#define MD_NAME_MAX 32
+
+/**
+ * is_container() - check if @level is &LEVEL_CONTAINER
+ * @level: level value
+ *
+ * return:
+ * 1 if level is equal to &LEVEL_CONTAINER, 0 otherwise.
+ */
+static inline int is_container(const int level)
+{
+ if (level == LEVEL_CONTAINER)
+ return 1;
+ return 0;
+}
diff --git a/super-ddf.c b/super-ddf.c
index 949e7d1..9d1e3b9 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3325,7 +3325,7 @@ validate_geometry_ddf_container(struct supertype *st,
int fd;
unsigned long long ldsize;
- if (level != LEVEL_CONTAINER)
+ if (!is_container(level))
return 0;
if (!dev)
return 1;
@@ -3371,7 +3371,7 @@ static int validate_geometry_ddf(struct supertype *st,
if (level == LEVEL_NONE)
level = LEVEL_CONTAINER;
- if (level == LEVEL_CONTAINER) {
+ if (is_container(level)) {
/* Must be a fresh device to add to a container */
return validate_geometry_ddf_container(st, level, raiddisks,
data_offset, dev,
@@ -3488,7 +3488,7 @@ static int validate_geometry_ddf_bvd(struct supertype *st,
struct dl *dl;
unsigned long long maxsize;
/* ddf/bvd supports lots of things, but not containers */
- if (level == LEVEL_CONTAINER) {
+ if (is_container(level)) {
if (verbose)
pr_err("DDF cannot create a container within an container\n");
return 0;
diff --git a/super-intel.c b/super-intel.c
index 4d82af3..b056561 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -6727,7 +6727,7 @@ static int validate_geometry_imsm_container(struct supertype *st, int level,
struct intel_super *super = NULL;
int rv = 0;
- if (level != LEVEL_CONTAINER)
+ if (!is_container(level))
return 0;
if (!dev)
return 1;
@@ -7692,7 +7692,7 @@ static int validate_geometry_imsm(struct supertype *st, int level, int layout,
* if given unused devices create a container
* if given given devices in a container create a member volume
*/
- if (level == LEVEL_CONTAINER)
+ if (is_container(level))
/* Must be a fresh device to add to a container */
return validate_geometry_imsm_container(st, level, raiddisks,
data_offset, dev,
diff --git a/super0.c b/super0.c
index 37f595e..93876e2 100644
--- a/super0.c
+++ b/super0.c
@@ -1273,7 +1273,7 @@ static int validate_geometry0(struct supertype *st, int level,
if (get_linux_version() < 3001000)
tbmax = 2;
- if (level == LEVEL_CONTAINER) {
+ if (is_container(level)) {
if (verbose)
pr_err("0.90 metadata does not support containers\n");
return 0;
diff --git a/super1.c b/super1.c
index 58345e6..0b505a7 100644
--- a/super1.c
+++ b/super1.c
@@ -2830,7 +2830,7 @@ static int validate_geometry1(struct supertype *st, int level,
unsigned long long overhead;
int fd;
- if (level == LEVEL_CONTAINER) {
+ if (is_container(level)) {
if (verbose)
pr_err("1.x metadata does not support containers\n");
return 0;
diff --git a/sysfs.c b/sysfs.c
index 0d98a65..ca1d888 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -763,7 +763,7 @@ int sysfs_add_disk(struct mdinfo *sra, struct mdinfo *sd, int resume)
rv = sysfs_set_num(sra, sd, "offset", sd->data_offset);
rv |= sysfs_set_num(sra, sd, "size", (sd->component_size+1) / 2);
- if (sra->array.level != LEVEL_CONTAINER) {
+ if (!is_container(sra->array.level)) {
if (sra->consistency_policy == CONSISTENCY_POLICY_PPL) {
rv |= sysfs_set_num(sra, sd, "ppl_sector", sd->ppl_sector);
rv |= sysfs_set_num(sra, sd, "ppl_size", sd->ppl_size);
--
2.35.3

View File

@ -1,54 +0,0 @@
From 4ca799c581703d4d0ad840833c037c2fff088ca7 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 30 Oct 2019 10:32:41 +1100
Subject: [PATCH] mdcheck: use ${} to pass variable to mdcheck
Git-commit: 4ca799c581703d4d0ad840833c037c2fff088ca7
Patch-mainline: mdadm-4.1+
References: bsc#1153258
$MDADM_CHECK_DURATION allows the value to be split on spaces.
${MDADM_CHECK_DURATION} avoids such splitting.
Making this change removes the need for double quoting when setting
the default Environment, and means that double quoting isn't needed
in the EnvironmentFile.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
systemd/mdcheck_continue.service | 5 ++---
systemd/mdcheck_start.service | 4 ++--
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/systemd/mdcheck_continue.service b/systemd/mdcheck_continue.service
index 592c607..deac695 100644
--- a/systemd/mdcheck_continue.service
+++ b/systemd/mdcheck_continue.service
@@ -11,8 +11,7 @@ ConditionPathExistsGlob = /var/lib/mdcheck/MD_UUID_*
[Service]
Type=oneshot
-Environment= MDADM_CHECK_DURATION='"6 hours"'
+Environment= MDADM_CHECK_DURATION="6 hours"
EnvironmentFile=-/run/sysconfig/mdadm
ExecStartPre=-/usr/lib/mdadm/mdadm_env.sh
-ExecStart=/usr/share/mdadm/mdcheck --continue --duration $MDADM_CHECK_DURATION
-
+ExecStart=/usr/share/mdadm/mdcheck --continue --duration ${MDADM_CHECK_DURATION}
diff --git a/systemd/mdcheck_start.service b/systemd/mdcheck_start.service
index 812141b..f17f1aa 100644
--- a/systemd/mdcheck_start.service
+++ b/systemd/mdcheck_start.service
@@ -11,7 +11,7 @@ Wants=mdcheck_continue.timer
[Service]
Type=oneshot
-Environment= MDADM_CHECK_DURATION='"6 hours"'
+Environment= MDADM_CHECK_DURATION="6 hours"
EnvironmentFile=-/run/sysconfig/mdadm
ExecStartPre=-/usr/lib/mdadm/mdadm_env.sh
-ExecStart=/usr/share/mdadm/mdcheck --duration $MDADM_CHECK_DURATION
+ExecStart=/usr/share/mdadm/mdcheck --duration ${MDADM_CHECK_DURATION}
--
2.25.0

View File

@ -0,0 +1,61 @@
From 8b668d4aa3305af5963162b7499b128bd71f8f29 Mon Sep 17 00:00:00 2001
From: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Date: Thu, 22 Sep 2022 08:29:50 +0200
Subject: [PATCH 61/61] Mdmonitor: Omit non-md devices
Patch-mainline: mdadm-4.2+
References: jsc#PED-1009
Fix segfault commit [1] introduced check whether given device is
mddevice, but it happend to terminate Mdmonitor if at least one of given
devices didn't fulfill that condition. In result Mdmonitor service was
no longer started on boot (with --scan option) when config contained some
non-existent array entry.
This commit introduces ommiting non-md devices so scan option can still
be used when config is wrong and allow Mdmonitor service to run on boot.
Giving a list of devices to monitor containing non-existing or
non-md devices will result in monitoring only confirmed mddevices.
[1] https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=e702f392959d1c2ad2089e595b52235ed97b4e18
Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Monitor.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/Monitor.c b/Monitor.c
index b4e954c..7d7dc4d 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -185,10 +185,8 @@ int Monitor(struct mddev_dev *devlist,
continue;
if (strcasecmp(mdlist->devname, "<ignore>") == 0)
continue;
- if (!is_mddev(mdlist->devname)) {
- free_statelist(statelist);
- return 1;
- }
+ if (!is_mddev(mdlist->devname))
+ continue;
st = xcalloc(1, sizeof *st);
snprintf(st->devname, MD_NAME_MAX + sizeof("/dev/md/"),
@@ -208,10 +206,8 @@ int Monitor(struct mddev_dev *devlist,
for (dv = devlist; dv; dv = dv->next) {
struct state *st;
- if (!is_mddev(dv->devname)) {
- free_statelist(statelist);
- return 1;
- }
+ if (!is_mddev(dv->devname))
+ continue;
st = xcalloc(1, sizeof *st);
mdlist = conf_get_ident(dv->devname);
--
2.35.3

View File

@ -1,31 +0,0 @@
From 85b83a7920bca5b93d2458f093f2c640a130614c Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 30 Oct 2019 10:32:41 +1100
Subject: [PATCH] SUSE-mdadm_env.sh: handle MDADM_CHECK_DURATION
Git-commit: 85b83a7920bca5b93d2458f093f2c640a130614c
Patch-mainline: mdadm-4.1+
References: bsc#1153258
The suse sysconfig/mdadm allows MDADM_CHECK_DURATION
to be set, but it is currently ignored.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
systemd/SUSE-mdadm_env.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/systemd/SUSE-mdadm_env.sh b/systemd/SUSE-mdadm_env.sh
index 10b2e74..c13b48a 100644
--- a/systemd/SUSE-mdadm_env.sh
+++ b/systemd/SUSE-mdadm_env.sh
@@ -43,3 +43,6 @@ fi
mkdir -p /run/sysconfig
echo "MDADM_MONITOR_ARGS=$MDADM_RAIDDEVICES $MDADM_DELAY $MDADM_MAIL $MDADM_PROGRAM $MDADM_SCAN $MDADM_SEND_MAIL $MDADM_CONFIG" > /run/sysconfig/mdadm
+if [ -n "$MDADM_CHECK_DURATION" ]; then
+ echo "MDADM_CHECK_DURATION=$MDADM_CHECK_DURATION" >> /run/sysconfig/mdadm
+fi
--
2.25.0

View File

@ -1,126 +0,0 @@
From 761e3bd9f5e3aafa95ad3ae50a637dc67c8774f0 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Thu, 31 Oct 2019 15:15:38 +1100
Subject: [PATCH] super-intel: don't mark structs 'packed' unnecessarily
Git-commit: 761e3bd9f5e3aafa95ad3ae50a637dc67c8774f0
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
super-intel marks a number of structures 'packed', but this
doesn't change the layout - they are already well organized.
This is a problem a gcc warns when code takes the address
of a field in a packet struct - as super-intel sometimes does.
So remove the marking where isn't needed.
Do ensure this does introduce a regression, add a compile-time
assertion that the size of the structure is exactly the value
it had before the 'packed' notation was removed.
Note that a couple of structure do need to be packed.
As the address of fields is never taken, that is safe.
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 32 ++++++++++++++++++++++++++------
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 713058c..a7fbed4 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -96,6 +96,19 @@
* mutliple PPL area
*/
+/*
+ * This macro let's us ensure that no-one accidentally
+ * changes the size of a struct
+ */
+#define ASSERT_SIZE(_struct, size) \
+static inline void __assert_size_##_struct(void) \
+{ \
+ switch (0) { \
+ case 0: break; \
+ case (sizeof(struct _struct) == size): break; \
+ } \
+}
+
/* Disk configuration info. */
#define IMSM_MAX_DEVICES 255
struct imsm_disk {
@@ -112,6 +125,7 @@ struct imsm_disk {
#define IMSM_DISK_FILLERS 3
__u32 filler[IMSM_DISK_FILLERS]; /* 0xF5 - 0x107 MPB_DISK_FILLERS for future expansion */
};
+ASSERT_SIZE(imsm_disk, 48)
/* map selector for map managment
*/
@@ -146,7 +160,8 @@ struct imsm_map {
__u32 disk_ord_tbl[1]; /* disk_ord_tbl[num_members],
* top byte contains some flags
*/
-} __attribute__ ((packed));
+};
+ASSERT_SIZE(imsm_map, 52)
struct imsm_vol {
__u32 curr_migr_unit;
@@ -169,7 +184,8 @@ struct imsm_vol {
__u32 filler[4];
struct imsm_map map[1];
/* here comes another one if migr_state */
-} __attribute__ ((packed));
+};
+ASSERT_SIZE(imsm_vol, 84)
struct imsm_dev {
__u8 volume[MAX_RAID_SERIAL_LEN];
@@ -220,7 +236,8 @@ struct imsm_dev {
#define IMSM_DEV_FILLERS 3
__u32 filler[IMSM_DEV_FILLERS];
struct imsm_vol vol;
-} __attribute__ ((packed));
+};
+ASSERT_SIZE(imsm_dev, 164)
struct imsm_super {
__u8 sig[MAX_SIGNATURE_LENGTH]; /* 0x00 - 0x1F */
@@ -248,7 +265,8 @@ struct imsm_super {
struct imsm_disk disk[1]; /* 0xD8 diskTbl[numDisks] */
/* here comes imsm_dev[num_raid_devs] */
/* here comes BBM logs */
-} __attribute__ ((packed));
+};
+ASSERT_SIZE(imsm_super, 264)
#define BBM_LOG_MAX_ENTRIES 254
#define BBM_LOG_MAX_LBA_ENTRY_VAL 256 /* Represents 256 LBAs */
@@ -269,7 +287,8 @@ struct bbm_log {
__u32 signature; /* 0xABADB10C */
__u32 entry_count;
struct bbm_log_entry marked_block_entries[BBM_LOG_MAX_ENTRIES];
-} __attribute__ ((__packed__));
+};
+ASSERT_SIZE(bbm_log, 2040)
static char *map_state_str[] = { "normal", "uninitialized", "degraded", "failed" };
@@ -323,7 +342,8 @@ struct migr_record {
* destination - high order 32 bits */
__u32 num_migr_units_hi; /* Total num migration units-of-op
* high order 32 bits */
-} __attribute__ ((__packed__));
+};
+ASSERT_SIZE(migr_record, 64)
struct md_list {
/* usage marker:
--
2.25.0

View File

@ -1,57 +0,0 @@
From 1cc3965d48deb0fb3e0657159c608ffb124643c1 Mon Sep 17 00:00:00 2001
From: Xiao Yang <ice_yangxiao@163.com>
Date: Wed, 27 Nov 2019 11:59:24 +0800
Subject: [PATCH] Manage: Remove the legacy code for md driver prior to 0.90.03
Git-commit: 1cc3965d48deb0fb3e0657159c608ffb124643c1
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
Previous re-add operation only calls ioctl(HOT_ADD_DISK) for array without
metadata(e.g. mdadm -B/--build) when md driver is less than 0.90.02, but
commit 091e8e6 breaks the logic and current re-add operation can call
ioctl(HOT_ADD_DISK) even if md driver is 0.90.03.
This issue is reproduced by 05r1-re-add-nosuper:
Signed-off-by: Coly Li <colyli@suse.de>
------------------------------------------------
++ die 'resync or recovery is happening!'
++ echo -e '\n\tERROR: resync or recovery is happening! \n'
ERROR: resync or recovery is happening!
------------------------------------------------
Fixes: 091e8e6("Manage: Remove all references to md_get_version()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Xiao Yang <ice_yangxiao@163.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
---
Manage.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/Manage.c b/Manage.c
index 21536f5..ffe55f8 100644
--- a/Manage.c
+++ b/Manage.c
@@ -741,18 +741,6 @@ int Manage_add(int fd, int tfd, struct mddev_dev *dv,
" Adding anyway as --force was given.\n",
dv->devname, devname);
}
- if (!tst->ss->external && array->major_version == 0) {
- if (ioctl(fd, HOT_ADD_DISK, rdev)==0) {
- if (verbose >= 0)
- pr_err("hot added %s\n",
- dv->devname);
- return 1;
- }
-
- pr_err("hot add failed for %s: %s\n",
- dv->devname, strerror(errno));
- return -1;
- }
if (array->not_persistent == 0 || tst->ss->external) {
--
2.25.0

View File

@ -1,46 +0,0 @@
From 02af379337c73e751ad97c0fed9123121f8b4289 Mon Sep 17 00:00:00 2001
From: Jes Sorensen <jsorensen@fb.com>
Date: Wed, 27 Nov 2019 10:19:54 -0500
Subject: [PATCH] Remove last traces of HOT_ADD_DISK
Git-commit: 02af379337c73e751ad97c0fed9123121f8b4289
Patch-mainline: mdadm-4.1+
References: jsc#SLE-10078, jsc#SLE-9348
This ioctl is no longer used, so remove all references to it.
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
Manage.c | 2 --
md_u.h | 1 -
2 files changed, 3 deletions(-)
diff --git a/Manage.c b/Manage.c
index ffe55f8..deeba2b 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1289,8 +1289,6 @@ int Manage_subdevs(char *devname, int fd,
/* Do something to each dev.
* devmode can be
* 'a' - add the device
- * try HOT_ADD_DISK
- * If that fails EINVAL, try ADD_NEW_DISK
* 'S' - add the device as a spare - don't try re-add
* 'j' - add the device as a journal device
* 'A' - re-add the device
diff --git a/md_u.h b/md_u.h
index 2d66d52..b30893c 100644
--- a/md_u.h
+++ b/md_u.h
@@ -28,7 +28,6 @@
#define ADD_NEW_DISK _IOW (MD_MAJOR, 0x21, mdu_disk_info_t)
#define HOT_REMOVE_DISK _IO (MD_MAJOR, 0x22)
#define SET_ARRAY_INFO _IOW (MD_MAJOR, 0x23, mdu_array_info_t)
-#define HOT_ADD_DISK _IO (MD_MAJOR, 0x28)
#define SET_DISK_FAULTY _IO (MD_MAJOR, 0x29)
#define SET_BITMAP_FILE _IOW (MD_MAJOR, 0x2b, int)
--
2.25.0

Some files were not shown because too many files have changed in this diff Show More