Accepting request 495131 from openSUSE:Factory:zSystems

Bug fix from Beta testing.

OBS-URL: https://build.opensuse.org/request/show/495131
OBS-URL: https://build.opensuse.org/package/show/Base:System/s390-tools?expand=0&rev=8
Mark Post 2017-05-16 01:32:01 +00:00 committed by Git OBS Bridge
parent 9130ff8ac6
commit a416fd0dcd
3 changed files with 152 additions and 0 deletions

@@ -0,0 +1,144 @@
Subject: [PATCH] [BZ 149058] ziomon: no blktrace kill which can corrupt kernel blktrace state
From: Steffen Maier <maier@linux.vnet.ibm.com>
Description: ziomon: no blktrace kill which can corrupt kernel blktrace state
Symptom: Ziomon terminates with the following error messages:
$ ziomon -d <duration> -o <logfile> <device>...
...
ziomon: Failed to stop trace on /dev/sdX
...
ziomon: Failed to stop trace on /dev/sdY
ziomon: blktrace has leftovers, manual cleanup required!
Subsequent ziomon runs are likely to fail until the next reboot:
$ ziomon -d <duration> -o <logfile> <device>...
...
Collecting data...BLKTRACESETUP(2) /dev/sdX failed: \
2/No such file or directory
BLKTRACESETUP(2) /dev/sdY failed: 2/No such file or directory
...
Thread Z failed open /sys/kernel/debug/block/(null)/traceZ: \
2/No such file or directory
FAILED to start thread on CPU 0: 1/Operation not permitted
...
Error: blktrace has errors, aborting
...
ziomon: Failed to stop trace on /dev/sdX
...
ziomon: Failed to stop trace on /dev/sdY
ziomon: blktrace has leftovers, manual cleanup required!
Problem: Although we call blktrace with the stopwatch option
'-w $WRP_DURATION' so that it terminates on its own after the
(mandatory) sampling duration, blktrace might still be running,
or might not have completed its cleanup, when we reach the
blktrace leftover checks in shutdown(). This can happen in two
ways: our own timed blktrace checking loop between "Collecting
data..." and "done" can race with the stopwatch, or any of
SIGHUP SIGTERM SIGINT SIGQUIT from the user can prematurely
trigger emergency_shutdown().
In that case, ziomon uses the kill option of blktrace (-k), which
has since been hidden upstream and which can corrupt kernel state
if blktrace user space has not yet completed its regular cleanup.
Solution: The correct way to make the blktrace process terminate and
clean up properly, including any dependent kernel state, before
blktrace's stopwatch makes it terminate and clean up
automatically, is to send it only SIGINT.
http://git.kernel.org/cgit/linux/kernel/git/axboe/blktrace.git
86596c7579c6 ("blktrace: remove -k from manpage synopsis")
fb7f86674a51 ("blktrace: disable kill option - take 2")
Additional explanation for blktrace -k:
http://marc.info/?l=linux-btrace&m=131245973528087&w=2i
linux-btrace ("Re: Recent changes")
The following kernel commits do not change above termination
statement:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
39cbb602b543 ("Remove double removal of blktrace directory")
fd51d251e4cd ("blktrace: remove debugfs entries on bad path")
f48fc4d32e24
("block: get rid of the manual directory counting in blktrace")
As of this writing, the (multithreaded) blktrace also wires
SIGHUP SIGTERM SIGALRM to the same signal handler,
handle_sigint(), but to adhere to its documentation we move from
SIGTERM to SIGINT anyway.
We must not use the kill option of blktrace, which has since been
hidden, because it corrupts kernel state if blktrace user space has
not yet completed its regular cleanup.
In order to avoid zombies, we wait for all children. After all, we
want everything to be completed before ziomon terminates, so that
the user can safely run another instance of ziomon. This also
lets users detect and report issues with children that do not
finish in a timely manner.
Since we do not know how to clean up manually, drop the check for
blktrace leftovers in shutdown().
Reproduction: Run ziomon with many devices on the command line.
Other load on the system possibly increases the probability.
Upstream-ID: -
Problem-ID: 149058
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
---
ziomon/ziomon | 17 ++++++-----------
1 file changed, 6 insertions(+), 11 deletions(-)
--- a/ziomon/ziomon
+++ b/ziomon/ziomon
@@ -5,7 +5,7 @@
#
# Wrapper script to start all processes
#
-# Copyright IBM Corp. 2008
+# Copyright IBM Corp. 2008, 2016
#
# Author(s): Stefan Raspl <raspl@linux.vnet.ibm.com>
#
@@ -84,7 +84,7 @@ function check_for_int() {
function print_version() {
echo "$WRP_TOOLNAME: I/O data collection utility, version %S390_TOOLS_VERSION%";
- echo "Copyright IBM Corp. 2008";
+ echo "Copyright IBM Corp. 2008, 2016";
}
@@ -356,8 +356,6 @@ function start_trace() {
function shutdown() {
- local failed=0;
-
echo "Shutting down";
# one more second to write final result
sleep 2;
@@ -365,7 +363,7 @@ function shutdown() {
[ -d /proc/$WRP_ZIOMON_UTIL_PID ] && echo "Shutting down utilization process" && kill -s SIGTERM $WRP_ZIOMON_UTIL_PID;
fi
if [ "$WRP_BLKTRACE_PID" != "" ]; then
- [ -d /proc/$WRP_BLKTRACE_PID ] && echo "Shutting down blktrace process" && kill -s SIGTERM $WRP_BLKTRACE_PID;
+ [ -d /proc/$WRP_BLKTRACE_PID ] && echo "Shutting down blktrace process" && kill -s SIGINT $WRP_BLKTRACE_PID;
fi
if [ "$WRP_BLKIOMON_PID" != "" ]; then
[ -d /proc/$WRP_BLKIOMON_PID ] && echo "Shutting down blkiomon process" && kill -s SIGTERM $WRP_BLKIOMON_PID;
@@ -381,12 +379,9 @@ function shutdown() {
kill -s SIGTERM $WRP_ZIOMON_MGR_PID;
fi
fi
- # kill running traces on devices
- for dev in ${WRP_DEVICES[@]}; do
- [ -d /sys/kernel/debug/block/${dev##*/} ] && debug "Trace on $dev exists, attempt to kill..." && blktrace -k $dev >/dev/null 2>&1;
- [ -d /sys/kernel/debug/block/${dev##*/} ] && echo "$WRP_TOOLNAME: Failed to stop trace on $dev" && (( failed++ ));
- done
- [ $failed -ne 0 ] && echo "$WRP_TOOLNAME: blktrace has leftovers, manual cleanup required!";
+ # synchronize with all children to avoid zombies
+ # and to prepare for a clean subsequent re-run of ziomon
+ wait
if [ -e $WRP_MSG_Q_PATH ]; then
if [ $WRP_DEBUG -gt 1 ]; then
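
The bare `wait` added to shutdown() above is what synchronizes ziomon with its children. As a minimal sketch of that behaviour only, the following script uses hypothetical stand-in children (plain subshells, not blktrace or the other ziomon helpers) and a hypothetical temporary directory to show that a `wait` without arguments returns only after every background child has exited:

```shell
#!/bin/sh
# Sketch only: the subshells below are hypothetical stand-ins for the
# background helpers that ziomon starts; they are not the real tools.
workdir=$(mktemp -d)

# Start three children; each leaves a marker file behind once it is done.
for n in 1 2 3; do
    ( sleep 1; : > "$workdir/child$n.done" ) &
done

# A bare "wait" blocks until all background children of this shell have
# exited, so none of them can still be cleaning up (or linger as a zombie)
# once the script continues.
wait

# Count the marker files: every child must have finished by now.
finished=$(ls "$workdir" | wc -l | tr -d ' ')
echo "children finished: $finished"
rm -rf "$workdir"
```

Because `wait` is called without PIDs, it covers children whose PIDs were never recorded, which may be why the patch uses that form rather than waiting on each WRP_*_PID individually; the source does not state this.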

@@ -1,3 +1,9 @@
-------------------------------------------------------------------
Tue May 16 00:57:05 UTC 2017 - mpost@suse.com
- Added s390-tools-sles12sp3-ziomon-no-blktrace-kill-which-can-corrupt-kernel-blk.patch
(bsc#1038861)
-------------------------------------------------------------------
Tue Apr 18 22:25:10 UTC 2017 - mpost@suse.com

@@ -132,6 +132,7 @@ Patch29: s390-tools-sles12sp3-dasdfmt-10-Add-expand-format-mode.patch
Patch30: s390-tools-sles12sp3-util_proc-fix-memory-allocation-error-messages.patch
Patch31: s390-tools-sles12sp3-mon_fsstatd-fix-double-free-in-error-path-and-skip-virtual-fs.patch
Patch32: s390-tools-sles12sp3-dbginfo-Collect-docker-debug-data.patch
Patch33: s390-tools-sles12sp3-ziomon-no-blktrace-kill-which-can-corrupt-kernel-blk.patch
BuildRoot: %{_tmppath}/%{name}-%{version}-build
ExclusiveArch: s390 s390x
@@ -220,6 +221,7 @@ to list files and directories.
%patch30 -p1
%patch31 -p1
%patch32 -p1
%patch33 -p1
cp -vi %{S:22} CAUTION