Accepting request 668314 from home:ptesarik:branches:Kernel:kdump

- Update to 1.6.4
  * 5-level paging support on x86_64
  * --mem-usage support for arm64
  * Support kernels up to 4.17.0
- Drop upstreamed patches:
  * makedumpfile-always-use-bigger-SECTION_MAP_MASK.patch
  * makedumpfile-sadump-fix-PTI-enabled-kernels.patch
  * makedumpfile-do-not-print-ETA-if-progress-is-0.patch
  * makedumpfile-is_cache_page-helper.patch
  * makedumpfile-check-PG_swapbacked.patch

OBS-URL: https://build.opensuse.org/request/show/668314
OBS-URL: https://build.opensuse.org/package/show/Kernel:kdump/makedumpfile?expand=0&rev=128
Petr Tesařík 2019-01-24 13:32:07 +00:00 committed by Git OBS Bridge
parent 20ac94dc19
commit ce5792c70c
9 changed files with 29 additions and 390 deletions

@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cb1afe2cf24147eac983694bfbcf8c1b149eeeb92289562d4d25fbe3b100b125
size 186492

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7e06f72d5f291fcab9e92975f405a76e37d4f7fc8fa4172f199636398ae812b1
size 191786

@@ -1,41 +0,0 @@
From: Petr Tesarik <ptesarik@suse.com>
Date: Mon, 29 Jan 2018 14:59:28 +0200
Subject: Always use bigger SECTION_MAP_MASK
References: bsc#1066811, bsc#1067703
Upstream: not yet
Since kernel commit 2d070eab2e82 merely reused a previously unused bit, it
is safe to mask it off for all kernel versions, because it had always been
zero (even in kernels < 4.13).
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
makedumpfile.c | 5 +----
makedumpfile.h | 1 -
2 files changed, 1 insertion(+), 5 deletions(-)
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3337,10 +3337,7 @@ section_mem_map_addr(unsigned long addr)
return NOT_KV_ADDR;
}
map = ULONG(mem_section + OFFSET(mem_section.section_mem_map));
- if (info->kernel_version < KERNEL_VERSION(4, 13, 0))
- map &= SECTION_MAP_MASK_4_12;
- else
- map &= SECTION_MAP_MASK;
+ map &= SECTION_MAP_MASK;
free(mem_section);
return map;
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -186,7 +186,6 @@ isAnon(unsigned long mapping)
#define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT())
#define SECTION_IS_ONLINE (1UL<<2)
#define SECTION_MAP_LAST_BIT (1UL<<3)
-#define SECTION_MAP_MASK_4_12 (~(SECTION_IS_ONLINE-1))
#define SECTION_MAP_MASK (~(SECTION_MAP_LAST_BIT-1))
#define NR_SECTION_ROOTS() divideup(num_section, SECTIONS_PER_ROOT())
#define SECTION_NR_TO_PFN(sec) ((sec) << PFN_SECTION_SHIFT())
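
The reasoning in the patch above can be illustrated with a small sketch. The mask definitions are copied from the makedumpfile.h lines shown in the diff; decode_section_map() and masks_agree() are hypothetical helpers introduced only for this illustration and are not part of makedumpfile. The sketch assumes, as the patch description states, that bit 2 of the encoded section map was always zero before kernel 4.13.

```c
/* Flag bits kept in the low bits of mem_section.section_mem_map. */
#define SECTION_IS_ONLINE      (1UL << 2)
#define SECTION_MAP_LAST_BIT   (1UL << 3)

/* Narrow mask formerly used for kernels before 4.13 (clears bits 0-1). */
#define SECTION_MAP_MASK_4_12  (~(SECTION_IS_ONLINE - 1))
/* Wide mask now used for all kernels (clears bits 0-2). */
#define SECTION_MAP_MASK       (~(SECTION_MAP_LAST_BIT - 1))

/* Hypothetical helper: strip the flag bits from an encoded section map. */
static unsigned long
decode_section_map(unsigned long encoded)
{
	return encoded & SECTION_MAP_MASK;
}

/*
 * Hypothetical helper: returns 1 if both masks decode the value to the
 * same address. This holds whenever bit 2 is clear in the encoded value.
 */
static int
masks_agree(unsigned long encoded)
{
	return (encoded & SECTION_MAP_MASK_4_12) == (encoded & SECTION_MAP_MASK);
}
```

For an encoded value such as 0x1003 (bit 2 clear), both masks yield 0x1000, which is why the version check could be dropped.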

@@ -1,123 +0,0 @@
From: Petr Tesarik <ptesarik@suse.com>
Date: Fri, 13 Apr 2018 17:35:55 +0200
Subject: Check PG_swapbacked for swap cache pages
References: bsc#1088354
Upstream: posted 2018-04-13
When page cache is filtered out (dump level bitmap includes 2 or 4),
makedumpfile checks the PG_swapcache bit, but since kernel commit
6326fec1122cde256bd2a8c63f2606e08e44ce1d (v4.10-rc1~7) this bit is
an alias for PG_owner_priv_1, which is also used by filesystem
code (PG_checked) and Xen (PG_pinned and PG_foreign).
With these kernels, the PG_swapcache flag is valid only if
PG_swapbacked is set. A Linux kernel patch has already been
submitted to export the value of PG_swapbacked in VMCOREINFO.
Since there are released kernels in the wild which do not export the
value, a fallback is implemented. I considered these three situations:
1. Kernels before v2.6.28-rc1~244:
PG_swapbacked does not exist, so it must not be checked.
Instead, check PG_swapcache, which is never overloaded for
another purpose.
2. Kernels between v2.6.28-rc1~244 and v4.10-rc1~7:
It is sufficient to check only PG_swapcache, but PG_swapbacked
may also be checked (it is always set if PG_swapcache is set).
3. Kernels since v4.10-rc1~7:
PG_swapbacked must be checked.
If PG_swapbacked value is known (exported or read from debuginfo),
it is always safe to use it (case 2 or 3). If PG_swapbacked is not
known, it is safe to ignore it for cases 1 and 2, but not 3.
Thankfully, the new value of PG_swapcache (since v4.10-rc1~7) is
less than PG_private (which is known), whereas the old value had
always been greater than PG_private. Moreover, the flags between
PG_private and PG_swapbacked haven't changed since v4.10-rc1~7, so
PG_swapbacked can fall back to PG_private + 6 if unknown.
Without this patch, all Xen dumps are unusable, because PG_pinned is
set for all page table pages.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
makedumpfile.c | 19 ++++++++++++++++++-
makedumpfile.h | 2 ++
2 files changed, 20 insertions(+), 1 deletion(-)
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -252,7 +252,18 @@ isHugetlb(int dtor)
static int
is_cache_page(unsigned long flags)
{
- return isLRU(flags) || isSwapCache(flags);
+ if (isLRU(flags))
+ return TRUE;
+
+ /* PG_swapcache is valid only if:
+ * a. PG_swapbacked bit is set, or
+ * b. PG_swapbacked is unknown (only possible on kernels before 4.10-rc1).
+ */
+ if ((NUMBER(PG_swapbacked) == NOT_FOUND_NUMBER || isSwapBacked(flags))
+ && isSwapCache(flags))
+ return TRUE;
+
+ return FALSE;
}
static inline unsigned long
@@ -1735,6 +1746,7 @@ get_structure_info(void)
ENUM_NUMBER_INIT(PG_lru, "PG_lru");
ENUM_NUMBER_INIT(PG_private, "PG_private");
ENUM_NUMBER_INIT(PG_swapcache, "PG_swapcache");
+ ENUM_NUMBER_INIT(PG_swapbacked, "PG_swapbacked");
ENUM_NUMBER_INIT(PG_buddy, "PG_buddy");
ENUM_NUMBER_INIT(PG_slab, "PG_slab");
ENUM_NUMBER_INIT(PG_hwpoison, "PG_hwpoison");
@@ -1988,6 +2000,9 @@ get_value_for_old_linux(void)
NUMBER(PG_private) = PG_private_ORIGINAL;
if (NUMBER(PG_swapcache) == NOT_FOUND_NUMBER)
NUMBER(PG_swapcache) = PG_swapcache_ORIGINAL;
+ if (NUMBER(PG_swapbacked) == NOT_FOUND_NUMBER
+ && NUMBER(PG_swapcache) < NUMBER(PG_private))
+ NUMBER(PG_swapbacked) = NUMBER(PG_private) + 6;
if (NUMBER(PG_slab) == NOT_FOUND_NUMBER)
NUMBER(PG_slab) = PG_slab_ORIGINAL;
if (NUMBER(PG_head_mask) == NOT_FOUND_NUMBER)
@@ -2264,6 +2279,7 @@ write_vmcoreinfo_data(void)
WRITE_NUMBER("PG_private", PG_private);
WRITE_NUMBER("PG_head_mask", PG_head_mask);
WRITE_NUMBER("PG_swapcache", PG_swapcache);
+ WRITE_NUMBER("PG_swapbacked", PG_swapbacked);
WRITE_NUMBER("PG_buddy", PG_buddy);
WRITE_NUMBER("PG_slab", PG_slab);
WRITE_NUMBER("PG_hwpoison", PG_hwpoison);
@@ -2658,6 +2674,7 @@ read_vmcoreinfo(void)
READ_NUMBER("PG_private", PG_private);
READ_NUMBER("PG_head_mask", PG_head_mask);
READ_NUMBER("PG_swapcache", PG_swapcache);
+ READ_NUMBER("PG_swapbacked", PG_swapbacked);
READ_NUMBER("PG_slab", PG_slab);
READ_NUMBER("PG_buddy", PG_buddy);
READ_NUMBER("PG_hwpoison", PG_hwpoison);
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -155,6 +155,7 @@ test_bit(int nr, unsigned long addr)
#define isPrivate(flags) test_bit(NUMBER(PG_private), flags)
#define isCompoundHead(flags) (!!((flags) & NUMBER(PG_head_mask)))
#define isSwapCache(flags) test_bit(NUMBER(PG_swapcache), flags)
+#define isSwapBacked(flags) test_bit(NUMBER(PG_swapbacked), flags)
#define isHWPOISON(flags) (test_bit(NUMBER(PG_hwpoison), flags) \
&& (NUMBER(PG_hwpoison) != NOT_FOUND_NUMBER))
@@ -1869,6 +1870,7 @@ struct number_table {
long PG_head;
long PG_head_mask;
long PG_swapcache;
+ long PG_swapbacked;
long PG_buddy;
long PG_slab;
long PG_hwpoison;
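
The fallback described in the patch above can be sketched in isolation. This is a minimal illustration, not the makedumpfile code: struct page_flag_bits, resolve_swapbacked() and swapcache_is_valid() are hypothetical names, and NOT_FOUND_NUMBER is given a stand-in value here that may differ from the real definition.

```c
/* Stand-in for makedumpfile's "value unknown" marker (assumed value). */
#define NOT_FOUND_NUMBER (-1L)

/* Hypothetical container for the page flag bit numbers used below. */
struct page_flag_bits {
	long pg_private;
	long pg_swapcache;
	long pg_swapbacked;	/* NOT_FOUND_NUMBER if not exported */
};

/*
 * Resolve PG_swapbacked. If it is unknown and PG_swapcache is numbered
 * below PG_private (the layout since v4.10-rc1~7), fall back to
 * PG_private + 6. Otherwise leave it unknown.
 */
static long
resolve_swapbacked(const struct page_flag_bits *b)
{
	if (b->pg_swapbacked != NOT_FOUND_NUMBER)
		return b->pg_swapbacked;
	if (b->pg_swapcache < b->pg_private)
		return b->pg_private + 6;
	return NOT_FOUND_NUMBER;
}

/*
 * PG_swapcache counts only if PG_swapbacked is also set, or if
 * PG_swapbacked is still unknown after the fallback (older kernels).
 */
static int
swapcache_is_valid(unsigned long flags, const struct page_flag_bits *b)
{
	long swapbacked = resolve_swapbacked(b);

	if (!(flags & (1UL << b->pg_swapcache)))
		return 0;
	if (swapbacked == NOT_FOUND_NUMBER)
		return 1;
	return (flags & (1UL << swapbacked)) != 0;
}
```

With made-up bit numbers PG_private = 11 and PG_swapcache = 9, the fallback resolves PG_swapbacked to 17, so a page with only bit 9 set is not treated as swap cache.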

@@ -1,90 +0,0 @@
From: Petr Tesarik <ptesarik@suse.com>
Date: Mon, 9 Apr 2018 09:59:46 +0200
Subject: Do not print ETA value if current progress is 0
References: bsc#1084936
Upstream: submitted 2018-04-09
Essentially, the estimated remaining time is calculated as:
elapsed * (100 - progress) / progress
However, print_progress() is also called when progress is 0. The
result of a floating point division by zero is either NaN (if
elapsed is zero), or infinity (if the system clock happens to cross
a second's boundary since reading the start timestamp).
The C standard defines only conversion of floating point values
within the range of the destination integer variable. This means
that conversion of NaN or infinity to an integer is undefined
behaviour. Yes, it happens to produce INT_MIN with GCC on major
platforms...
This bug has gone unnoticed, because the very first call to
print_progress() does not specify a start timestamp (so it cannot
trigger the bug), and all subsequent calls are rate-limited to one
per second. As a result, the bug is triggered very rarely.
Before commit e5f96e79d69a1d295f19130da00ec6514d28a8ae, the bug also
caused a buffer overflow. The buffer overflow is mitigated thanks to
using snprintf() instead of sprintf(), but the program may still
invoke undefined behaviour.
Note that all other changes in the above-mentioned commit were
ineffective. They merely reduced the precision of the calculation:
Why would you add delta.tv_usec as a fraction if the fractional part
is immediately truncated by a conversion to int64_t?
Additionally, when the original bug is hit, the output is still
incorrect, e.g. on my system I get:
Copying data : [ 0.0 %] / eta: -9223372036854775808s
For that reason, let me revert the changes from commit
e5f96e79d69a1d295f19130da00ec6514d28a8ae and fix the bug properly,
i.e. do not calculate ETA if progress is 0.
Last but not least, part of the issue was probably caused by the
wrong assumption that integers < 100 can be represented with at most
3 ASCII characters, which is not true for signed integers. To make
eta_to_human_short() a bit safer, use an unsigned integer type.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
print_info.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/print_info.c
+++ b/print_info.c
@@ -352,18 +352,18 @@ static void calc_delta(struct timeval *t
}
/* produce less than 12 bytes on msg */
-static int eta_to_human_short (int secs, char* msg)
+static int eta_to_human_short (unsigned secs, char* msg)
{
strcpy(msg, "eta: ");
msg += strlen("eta: ");
if (secs < 100)
- sprintf(msg, "%ds", secs);
+ sprintf(msg, "%us", secs);
else if (secs < 100 * 60)
- sprintf(msg, "%dm%ds", secs / 60, secs % 60);
+ sprintf(msg, "%um%us", secs / 60, secs % 60);
else if (secs < 48 * 3600)
- sprintf(msg, "%dh%dm", secs / 3600, (secs / 60) % 60);
+ sprintf(msg, "%uh%um", secs / 3600, (secs / 60) % 60);
else if (secs < 100 * 86400)
- sprintf(msg, "%dd%dh", secs / 86400, (secs / 3600) % 24);
+ sprintf(msg, "%ud%uh", secs / 86400, (secs / 3600) % 24);
else
sprintf(msg, ">2day");
return 0;
@@ -391,7 +391,7 @@ print_progress(const char *msg, unsigned
} else
progress = 100;
- if (start != NULL) {
+ if (start != NULL && current != 0) {
calc_delta(start, &delta);
eta = delta.tv_sec + delta.tv_usec / 1e6;
eta = (100 - progress) * eta / progress;
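
The guard described in the patch above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in print_info.c: estimate_eta_secs() and the ETA_UNKNOWN sentinel are hypothetical names introduced here.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no estimate available". */
#define ETA_UNKNOWN UINT64_MAX

/*
 * Estimate the remaining time as elapsed * (100 - progress) / progress.
 * The division is skipped when progress is 0, so no NaN or infinity is
 * ever converted to an integer (which would be undefined behaviour).
 */
static uint64_t
estimate_eta_secs(double elapsed_secs, double progress)
{
	double eta;

	if (progress <= 0.0 || elapsed_secs < 0.0)
		return ETA_UNKNOWN;
	if (progress >= 100.0)
		return 0;

	eta = elapsed_secs * (100.0 - progress) / progress;

	/* Reject values outside the range of uint64_t (and NaN). */
	if (!(eta < 1.8e19))
		return ETA_UNKNOWN;
	return (uint64_t)eta;
}
```

For example, 30 seconds elapsed at 25 % progress gives an estimate of 90 seconds, while any call with progress 0 returns the sentinel instead of dividing.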

@@ -1,47 +0,0 @@
From: Petr Tesarik <ptesarik@suse.com>
Date: Fri, 13 Apr 2018 15:58:45 +0200
Subject: Add is_cache_page() helper to check if a page belongs to the cache
References: bsc#1088354
Upstream: posted 2018-04-13
No functional change, but clarify the purpose of checking isLRU()
and isSwapCache(), and move the check to a single place.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
---
makedumpfile.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -249,6 +249,12 @@ isHugetlb(int dtor)
&& (SYMBOL(free_huge_page) == dtor));
}
+static int
+is_cache_page(unsigned long flags)
+{
+ return isLRU(flags) || isSwapCache(flags);
+}
+
static inline unsigned long
calculate_len_buf_out(long page_size)
{
@@ -5850,7 +5856,7 @@ __exclude_unnecessary_pages(unsigned lon
* Exclude the non-private cache page.
*/
else if ((info->dump_level & DL_EXCLUDE_CACHE)
- && (isLRU(flags) || isSwapCache(flags))
+ && is_cache_page(flags)
&& !isPrivate(flags) && !isAnon(mapping)) {
pfn_counter = &pfn_cache;
}
@@ -5858,7 +5864,7 @@ __exclude_unnecessary_pages(unsigned lon
* Exclude the cache page whether private or non-private.
*/
else if ((info->dump_level & DL_EXCLUDE_CACHE_PRI)
- && (isLRU(flags) || isSwapCache(flags))
+ && is_cache_page(flags)
&& !isAnon(mapping)) {
if (isPrivate(flags))
pfn_counter = &pfn_cache_private;

@@ -1,71 +0,0 @@
From: Takao Indoh <indou.takao@jp.fujitsu.com>
Date: Fri, 26 Jan 2018 09:22:26 +0900
Subject: sadump: Fix a problem of PTI enabled kernel
References: bsc#1085826
Upstream: submitted
Message-ID: <1516926146-20347-1-git-send-email-indou.takao@jp.fujitsu.com>
This patch fixes a problem where a dump file produced by sadump cannot be
handled by makedumpfile when Page Table Isolation (PTI) is enabled.
When PTI is enabled, bit 12 of the CR3 register selects between the user-space
and kernel-space page tables, and bits 11:0 hold the Process Context Identifier
(PCID). To open a sadump dump file, the value of CR3 is used to calculate the
KASLR offset and phys_base, so this patch masks the CR3 register value correctly
for PTI-enabled kernels.
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Petr Tesarik <ptesarik@suse.com>
---
makedumpfile.c | 2 ++
makedumpfile.h | 2 ++
sadump_info.c | 9 ++++++++-
3 files changed, 12 insertions(+), 1 deletion(-)
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1572,6 +1572,8 @@ get_symbol_info(void)
SYMBOL_INIT(divide_error, "divide_error");
SYMBOL_INIT(idt_table, "idt_table");
SYMBOL_INIT(saved_command_line, "saved_command_line");
+ SYMBOL_INIT(pti_init, "pti_init");
+ SYMBOL_INIT(kaiser_init, "kaiser_init");
return TRUE;
}
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1606,6 +1606,8 @@ struct symbol_table {
unsigned long long divide_error;
unsigned long long idt_table;
unsigned long long saved_command_line;
+ unsigned long long pti_init;
+ unsigned long long kaiser_init;
/*
* symbols on ppc64 arch
--- a/sadump_info.c
+++ b/sadump_info.c
@@ -1362,6 +1362,9 @@ finish:
* kernel. Retrieve vmcoreinfo from address of "elfcorehdr=" and
* get kaslr_offset and phys_base from vmcoreinfo.
*/
+#define PTI_USER_PGTABLE_BIT (info->page_shift)
+#define PTI_USER_PGTABLE_MASK (1 << PTI_USER_PGTABLE_BIT)
+#define CR3_PCID_MASK 0xFFFull
int
calc_kaslr_offset(void)
{
@@ -1389,7 +1392,11 @@ calc_kaslr_offset(void)
}
idtr = ((uint64_t)smram.IdtUpper)<<32 | (uint64_t)smram.IdtLower;
- cr3 = smram.Cr3;
+ if ((SYMBOL(pti_init) != NOT_FOUND_SYMBOL) ||
+ (SYMBOL(kaiser_init) != NOT_FOUND_SYMBOL))
+ cr3 = smram.Cr3 & ~(CR3_PCID_MASK|PTI_USER_PGTABLE_MASK);
+ else
+ cr3 = smram.Cr3 & ~CR3_PCID_MASK;
/* Convert virtual address of IDT table to physical address */
if ((idtr_paddr = vtop4_x86_64_pagetable(idtr, cr3)) == NOT_PADDR)
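
The CR3 masking in the patch above can be sketched as a pure function. This is a minimal illustration: cr3_to_pgd_base() is a hypothetical helper, and its page_shift and pti_enabled parameters stand in for info->page_shift and the pti_init/kaiser_init symbol checks used in sadump_info.c.

```c
#include <stdint.h>

/* Bits 11:0 of CR3 hold the PCID. */
#define CR3_PCID_MASK 0xFFFull

/*
 * Hypothetical helper: derive the page table base from a raw CR3 value.
 * The PCID bits are always cleared. On a PTI-enabled kernel the bit at
 * position page_shift (bit 12 with 4 KiB pages) is cleared as well, so
 * the result always refers to the kernel page table.
 */
static uint64_t
cr3_to_pgd_base(uint64_t cr3, int page_shift, int pti_enabled)
{
	uint64_t mask = CR3_PCID_MASK;

	if (pti_enabled)
		mask |= 1ull << page_shift;
	return cr3 & ~mask;
}
```

For a raw value of 0x123457801 and a page shift of 12, the result is 0x123457000 without PTI and 0x123456000 with PTI.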

@@ -1,3 +1,17 @@
-------------------------------------------------------------------
Thu Jan 24 12:27:37 UTC 2019 - ptesarik@suse.com
- Update to 1.6.4
* 5-level paging support on x86_64
* --mem-usage support for arm64
* Support kernels up to 4.17.0
- Drop upstreamed patches:
* makedumpfile-always-use-bigger-SECTION_MAP_MASK.patch
* makedumpfile-sadump-fix-PTI-enabled-kernels.patch
* makedumpfile-do-not-print-ETA-if-progress-is-0.patch
* makedumpfile-is_cache_page-helper.patch
* makedumpfile-check-PG_swapbacked.patch
-------------------------------------------------------------------
Fri Aug 24 13:06:51 UTC 2018 - ptesarik@suse.com

@@ -1,7 +1,7 @@
#
# spec file for package makedumpfile
#
# Copyright (c) 2018 SUSE LINUX GmbH, Nuernberg, Germany.
# Copyright (c) 2019 SUSE LINUX GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -24,12 +24,15 @@
%endif
%endif
# Compatibility cruft
# there is no separate -ltinfo until openSUSE 13.1
%if 0%{?suse_version} < 1310 && 0%{?sles_version} < 12
%define ncurses_make_opts TINFOLIB=-lncurses
%endif
# End of compatibility cruft
Name: makedumpfile
Version: 1.6.3
Version: 1.6.4
Release: 0
Summary: Partial kernel dump
License: GPL-2.0-only
@@ -39,11 +42,6 @@ Source: https://sourceforge.net/projects/makedumpfile/files/makedumpfile
Source99: %{name}-rpmlintrc
Patch0: %{name}-coptflags.diff
Patch1: %{name}-override-libtinfo.patch
Patch2: %{name}-always-use-bigger-SECTION_MAP_MASK.patch
Patch3: %{name}-sadump-fix-PTI-enabled-kernels.patch
Patch4: %{name}-do-not-print-ETA-if-progress-is-0.patch
Patch5: %{name}-is_cache_page-helper.patch
Patch6: %{name}-check-PG_swapbacked.patch
BuildRequires: libdw-devel
BuildRequires: libebl-devel
BuildRequires: libelf-devel
@@ -73,11 +71,6 @@ via gdb or crash utility.
%setup -q
%patch0 -p1
%patch1 -p1
%patch2 -p1
%patch3 -p1
%patch4 -p1
%patch5 -p1
%patch6 -p1
%build
%if %{have_snappy}
@@ -97,13 +90,17 @@ install -D -m 0755 eppic_makedumpfile.so %{buildroot}%{_libdir}/%{name}-%{versio
install -d -m 0755 %{buildroot}%{_datadir}/%{name}-%{version}/eppic_scripts
install -m 0644 -t %{buildroot}%{_datadir}/%{name}-%{version}/eppic_scripts/ eppic_scripts/*
%if 0%{?_defaultlicensedir:1}
# Compatibility cruft
# there is no %license prior to SLE12
%if %{undefined _defaultlicensedir}
%define license %doc
%else
# filesystem before SLE12 SP3 lacks /usr/share/licenses
%if 0%(test ! -d %{_defaultlicensedir} && echo 1)
%define _defaultlicensedir %_defaultdocdir
%endif
%else
%define license %doc
%endif
# End of compatibility cruft
%files
%defattr(-,root,root)