Accepting request 570449 from network:ha-clustering:Unstable
- clvmd: try to refresh device cache on the first failure
  (bsc#978055, bsc#1076042)
  + bug-978055_clvmd-try-to-refresh-device-cache-on-the-first-failu.patch

OBS-URL: https://build.opensuse.org/request/show/570449
OBS-URL: https://build.opensuse.org/package/show/Base:System/lvm2?expand=0&rev=216
parent 1a6b6dce97
commit f1f0f1622d
@@ -0,0 +1,92 @@
+From 4f0681b1a296d88ac1dbdb26e46afed3285ad1bf Mon Sep 17 00:00:00 2001
+From: Eric Ren <zren@suse.com>
+Date: Tue, 23 May 2017 15:09:46 +0800
+Subject: [PATCH 09/10] clvmd: try to refresh device cache on the first failure
+
+1. The original problem
+   $ sudo lvchange -ay testvg/testlv
+   Error locking on node 1302cf30: Volume group for uuid not found:
+   qBKu65bSxfRq7gUf91NZuH4epLza4ifDieQJFd2to2WruVi5Brn7DxxsEgi5Zodw
+
+2. This problem can be easily replicated
+   a. Run clvmd in a cluster environment;
+   b. Assume you have created LV 'testlv' in local VG 'testvg' on
+      an MD device 'md0';
+   c. Make sure 'md0' is stopped and not in the device cache by
+      executing 'clvmd -R' or 'pvscan';
+   d. Assemble 'md0' by issuing 'mdadm --assemble --scan --name md0';
+   e. Try to activate 'testlv'; you will see the 'Error locking' problem.
+
+3. Analysis
+   a. After step 2.d, 'pvscan --cache ...' is triggered by udev rules to
+      announce that 'md0' is ready. However, pvscan exits very early
+      because lvmetad is not in use, so it never goes through the lock
+      manager. Therefore, clvmd is not aware of this udev event, and its
+      device cache does not contain 'md0'.
+
+   b. In step 2.e, the client command 'lvchange -ay testvg/testlv' finds
+      'testlv' correctly in its own metadata, because its device list
+      is gathered by the call chain:
+        lvm_run_command()->init_filters()->persistent_filter_load()->dev_cache_scan().
+      It then asks clvmd for "Locking VG V_testvg CR", which only drops
+      the metadata in clvmd via do_lock_vg()->lvmcache_drop_metadata();
+      the device cache is *not* refreshed.
+
+   c. Finally, clvmd fails to find the lvid in the activation path:
+        do_lock_lv()->do_activate_lv()->lv_info_by_lvid()
+
+Apparently, the metadata DB is not complete without a complete device
+cache in clvmd. However, upstream says the pvscan tool is intended to
+be used only with lvmetad and advised against changing it there. So we
+had better fix this issue within the clvmd code.
+
+Sometimes, the device cache in clvmd can be out of date.
+"clvmd -R" was introduced for this issue. However, running
+"clvmd -R" manually is not convenient, because it is hard
+to predict when a device change will happen.
+
+This patch retries once after refreshing the device
+cache. In the normal case, it has no side effects. In
+the case described above, it is worth a retry.
+
+Signed-off-by: Eric Ren <zren@suse.com>
+---
+ daemons/clvmd/lvm-functions.c | 11 ++++++++++-
+ 1 file changed, 10 insertions(+), 1 deletion(-)
+
+diff --git a/daemons/clvmd/lvm-functions.c b/daemons/clvmd/lvm-functions.c
+index 2446fd1..dcd3f9b 100644
+--- a/daemons/clvmd/lvm-functions.c
++++ b/daemons/clvmd/lvm-functions.c
+@@ -509,11 +509,14 @@ const char *do_lock_query(char *resource)
+ int do_lock_lv(unsigned char command, unsigned char lock_flags, char *resource)
+ {
+ 	int status = 0;
++	int do_refresh = 0;
+
+ 	DEBUGLOG("do_lock_lv: resource '%s', cmd = %s, flags = %s, critical_section = %d\n",
+ 		 resource, decode_locking_cmd(command), decode_flags(lock_flags), critical_section());
+
+-	if (!cmd->initialized.config || config_files_changed(cmd)) {
++again:
++	if (!cmd->initialized.config || config_files_changed(cmd)
++	    || do_refresh) {
+ 		/* Reinitialise various settings inc. logging, filters */
+ 		if (do_refresh_cache()) {
+ 			log_error("Updated config file invalid. Aborting.");
+@@ -579,6 +582,12 @@ int do_lock_lv(unsigned char command, unsigned char lock_flags, char *resource)
+ 	init_test(0);
+ 	pthread_mutex_unlock(&lvm_lock);
+
++	/* Try again in case device cache is stale */
++	if (status == EIO && !do_refresh) {
++		do_refresh = 1;
++		goto again;
++	}
++
+ 	DEBUGLOG("Command return is %d, critical_section is %d\n", status, critical_section());
+ 	return status;
+ }
+--
+2.10.2
+
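The retry-once logic that the patch adds to do_lock_lv() can be studied in isolation. The following is a minimal sketch, not lvm2 code: every name in it (system_devs, cached_devs, device_appears, refresh_device_cache, activate_on, lock_lv_with_retry) is hypothetical, and the device cache is modeled as a plain array of names. It assumes, as the patch does, that a stale cache surfaces as EIO and that a single refresh followed by one more attempt is enough.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for daemon state; none of these names exist in lvm2. */
#define MAX_DEVS 8

static const char *system_devs[MAX_DEVS]; /* devices currently present on the host */
static int system_dev_count;
static const char *cached_devs[MAX_DEVS]; /* the daemon's possibly stale device cache */
static int cached_dev_count;
static int refresh_calls;                 /* how many rescans were performed */

/* Model a device showing up after the last scan (e.g. 'md0' being assembled). */
static void device_appears(const char *name)
{
	assert(system_dev_count < MAX_DEVS);
	system_devs[system_dev_count++] = name;
}

/* Model a full rescan: copy the live device list into the cache. */
static void refresh_device_cache(void)
{
	memcpy(cached_devs, system_devs, sizeof(system_devs));
	cached_dev_count = system_dev_count;
	refresh_calls++;
}

/* Model activation: it succeeds only if the backing device is in the cache. */
static int activate_on(const char *dev)
{
	for (int i = 0; i < cached_dev_count; i++)
		if (strcmp(cached_devs[i], dev) == 0)
			return 0;
	return EIO;
}

/* Retry once after a refresh, mirroring the do_refresh/goto shape of the patch. */
static int lock_lv_with_retry(const char *dev)
{
	bool do_refresh = false;
	int status;

again:
	if (do_refresh)
		refresh_device_cache();

	status = activate_on(dev);

	/* Try again in case the device cache is stale, but only once. */
	if (status == EIO && !do_refresh) {
		do_refresh = true;
		goto again;
	}
	return status;
}
```

In this model, calling device_appears("md0") and then lock_lv_with_retry("md0") succeeds after exactly one refresh, and later calls succeed without refreshing again. A device that is genuinely missing still returns EIO after a single extra rescan, so the retry cannot loop.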
@@ -1,3 +1,10 @@
+-------------------------------------------------------------------
+Tue Jan 16 11:53:36 UTC 2018 - zren@suse.com
+
+- clvmd: try to refresh device cache on the first failure
+  (bsc#978055, bsc#1076042)
+  + bug-978055_clvmd-try-to-refresh-device-cache-on-the-first-failu.patch
+
 -------------------------------------------------------------------
 Wed Jan 10 10:41:45 UTC 2018 - zren@suse.com
 
@@ -61,6 +61,9 @@ Patch1004: bug-935623_dmeventd-fix-dso-name-wrong-compare.patch
 Patch2001: bug-1012973_simplify-special-case-for-md-in-69-dm-lvm-metadata.patch
 ### COMMON-PATCH-END ###
 
+# Patches for clvmd and cmirrord
+Patch3001: bug-978055_clvmd-try-to-refresh-device-cache-on-the-first-failu.patch
+
 %description
 A daemon for using LVM2 Logival Volumes in a clustered environment.
 
@@ -76,6 +79,8 @@ A daemon for using LVM2 Logival Volumes in a clustered environment.
 %patch2001 -p1
 ### COMMON-PREP-END ###
 
+%patch3001 -p1
+
 %build
 extra_opts="
     --enable-applib