mdadm/0013-imsm-finish-recovery-when-drive-with-rebuild-fails.patch
Neil Brown 2e35d7583b Accepting request 781064 from home:colyli:branches:Base:System
- Update for latest mdadm-4.1+ patches, this is required by
  jsc#SLE-10078 and jsc#SLE-9348. Mostly the purpose is for
  latest Intel IMSM raid support.
  The following patches also include previous patches with
  new re-ordered prefix numbers.
- Makefile: install mdadm_env.sh to /usr/lib/mdadm (bsc#1111960)
  0000-Makefile-install-mdadm_env.sh-to-usr-lib-mdadm.patch
- Assemble: keep MD_DISK_FAILFAST and MD_DISK_WRITEMOSTLY flag
  (jsc#SLE-10078, jsc#SLE-9348)
  0001-Assemble-keep-MD_DISK_FAILFAST-and-MD_DISK_WRITEMOST.patch
- Document PART-POLICY lines (jsc#SLE-10078, jsc#SLE-9348)
  0002-Document-PART-POLICY-lines.patc
- policy: support devices with multiple paths.
  (jsc#SLE-10078, jsc#SLE-9348)
  0003-policy-support-devices-with-multiple-paths.patch
- mdcheck: add systemd unit files to run mdcheck. (bsc#1115407)
  0004-mdcheck-add-systemd-unit-files-to-run-mdcheck.patch
- Monitor: add system timer to run --oneshot periodically (bsc#1115407)
  0005-Monitor-add-system-timer-to-run-oneshot-periodically.patch
- imsm: update metadata correctly while raid10 double
  (jsc#SLE-10078, jsc#SLE-9348)
  0006-imsm-update-metadata-correctly-while-raid10-double-d.patch
- Assemble: mask FAILFAST and WRITEMOSTLY flags when finding
  (jsc#SLE-10078, jsc#SLE-9348)
  0007-Assemble-mask-FAILFAST-and-WRITEMOSTLY-flags-when-fi.patch
- Grow: avoid overflow in compute_backup_blocks()
  (jsc#SLE-10078, jsc#SLE-9348)
  0008-Grow-avoid-overflow-in-compute_backup_blocks.patch
- Grow: report correct new chunk size. (jsc#SLE-10078, jsc#SLE-9348)
  0009-Grow-report-correct-new-chunk-size.patch

OBS-URL: https://build.opensuse.org/request/show/781064
OBS-URL: https://build.opensuse.org/package/show/Base:System/mdadm?expand=0&rev=181
2020-03-04 04:49:18 +00:00

99 lines
3.7 KiB
Diff

From a4e96fd8f3f0b5416783237c1cb6ee87e7eff23d Mon Sep 17 00:00:00 2001
From: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Date: Fri, 8 Feb 2019 11:07:10 +0100
Subject: [PATCH] imsm: finish recovery when drive with rebuild fails
Git-commit: a4e96fd8f3f0b5416783237c1cb6ee87e7eff23d
Patch-mainline: mdadm-4.1-12
References: bsc#1126975
Commit d7a1fda2769b ("imsm: update metadata correctly while raid10 double
degradation") resolves main Imsm double degradation problems but it
omits one case. Now metadata hangs in the rebuilding state if the drive
under rebuild is removed during recovery from double degradation.
The root cause of this problem is comparing new map_state with current
and if they both are degraded assuming that nothing new happens.
Don't rely on map states, just check if device is failed. If the drive
under rebuild fails then finish migration, in other cases update map
state only (second fail means that destination map state can't be normal).
To avoid problems with reassembling move end_migration (called after
double degradation successful recovery) after check if recovery really
finished, for details see (7ce057018 "imsm: fix: rebuild does not
continue after reboot").
Remove redundant code responsible for finishing rebuild process. Function
end_migration do exactly the same. Set last_checkpoint to 0, to prepare
it for the next rebuild.
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
super-intel.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index d2035cc..38a1b6c 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8560,26 +8560,22 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
if (is_rebuilding(dev)) {
dprintf_cont("while rebuilding ");
- if (map->map_state != map_state) {
- dprintf_cont("map state change ");
+ if (state & DS_FAULTY) {
+ dprintf_cont("removing failed drive ");
if (n == map->failed_disk_num) {
dprintf_cont("end migration");
end_migration(dev, super, map_state);
+ a->last_checkpoint = 0;
} else {
- dprintf_cont("raid10 double degradation, map state change");
+ dprintf_cont("fail detected during rebuild, changing map state");
map->map_state = map_state;
}
super->updates_pending++;
- } else if (!rebuild_done)
- break;
- else if (n == map->failed_disk_num) {
- /* r10 double degraded to degraded transition */
- dprintf_cont("raid10 double degradation end migration");
- end_migration(dev, super, map_state);
- a->last_checkpoint = 0;
- super->updates_pending++;
}
+ if (!rebuild_done)
+ break;
+
/* check if recovery is really finished */
for (mdi = a->info.devs; mdi ; mdi = mdi->next)
if (mdi->recovery_start != MaxSector) {
@@ -8588,7 +8584,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
if (recovery_not_finished) {
dprintf_cont("\n");
- dprintf_cont("Rebuild has not finished yet, map state changes only if raid10 double degradation happens");
+ dprintf_cont("Rebuild has not finished yet");
if (a->last_checkpoint < mdi->recovery_start) {
a->last_checkpoint =
mdi->recovery_start;
@@ -8598,9 +8594,9 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
}
dprintf_cont(" Rebuild done, still degraded");
- dev->vol.migr_state = 0;
- set_migr_type(dev, 0);
- dev->vol.curr_migr_unit = 0;
+ end_migration(dev, super, map_state);
+ a->last_checkpoint = 0;
+ super->updates_pending++;
for (i = 0; i < map->num_members; i++) {
int idx = get_imsm_ord_tbl_ent(dev, i, MAP_0);
--
2.16.4