# HG changeset patch
# User Andre Przywara <osp@andrep.de>
# Date 1355913729 -3600
# Node ID 5fb0b8b838dab0b331abfa675fd2b2214ac90760
# Parent b04de677de31f26ba4b8f2f382ca4dfffcff9a79
x86, amd: Disable way access filter on Piledriver CPUs

The Way Access Filter in recent AMD CPUs may hurt the performance of
some workloads, caused by aliasing issues in the L1 cache.
This patch disables it on the affected CPUs.

The issue is similar to that one of last year:
http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html
This new patch does not replace the old one, we just need another
quirk for newer CPUs.

The performance penalty without the patch depends on the
circumstances, but is a bit less than the last year's 3%.

The workloads affected would be those that access code from the same
physical page under different virtual addresses, so different
processes using the same libraries with ASLR or multiple instances of
PIE-binaries. The code needs to be accessed simultaneously from both
cores of the same compute unit.

More details can be found here:
http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf

CPUs affected are anything with the core known as Piledriver.
That includes the new parts of the AMD A-Series (aka Trinity) and the
just released new CPUs of the FX-Series (aka Vishera).
The model numbering is a bit odd here: FX CPUs have model 2,
A-Series has model 10h, with possible extensions to 1Fh. Hence the
range of model ids.

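The model range described above can be sketched as a small standalone
C predicate. This is only an illustration: the helper name
is_piledriver_model is hypothetical and not part of the patch, and the
bounds simply mirror the family/model comparison the patch adds to
init_amd().

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not in the patch): true when the family/model
 * pair falls inside the range the patch treats as Piledriver, i.e.
 * family 15h with model 02h up to and including 1Fh.
 */
static bool is_piledriver_model(unsigned int family, unsigned int model)
{
    return family == 0x15 && model >= 0x02 && model < 0x20;
}
```

Under this sketch, model 02h (FX) and models 10h to 1Fh (A-Series)
match, while models 00h, 01h and 20h onwards do not.
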
Signed-off-by: Andre Przywara <osp@andrep.de>

Add and use MSR_AMD64_IC_CFG. Update the value whenever it is found to
not have all bits set, rather than just when it's zero.

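The update rule described above can be sketched as a pure function on
the MSR value, without touching hardware. The names IC_CFG_MASK and
ic_cfg_needs_update are hypothetical; the patch itself uses the literal
0x1e together with rdmsr_safe()/wrmsr_safe(), which are privileged and
not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the literal 0x1e used in the patch. */
#define IC_CFG_MASK 0x1eULL

/*
 * Hypothetical helper (not in the patch): returns true and stores the
 * value to write back when at least one of the masked bits is clear.
 * Returns false, leaving *new_value untouched, when all of them are
 * already set, so no write would be needed.
 */
static bool ic_cfg_needs_update(uint64_t value, uint64_t *new_value)
{
    if ((value & IC_CFG_MASK) == IC_CFG_MASK)
        return false;

    *new_value = value | IC_CFG_MASK;
    return true;
}
```

A partially set value such as 0x02 therefore still triggers a write of
0x1e, which a check for "masked bits are all zero" would have skipped.
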
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -493,6 +493,14 @@ static void __devinit init_amd(struct cp
 		}
 	}
 
+	/*
+	 * The way access filter has a performance penalty on some workloads.
+	 * Disable it on the affected CPUs.
+	 */
+	if (c->x86 == 0x15 && c->x86_model >= 0x02 && c->x86_model < 0x20 &&
+	    !rdmsr_safe(MSR_AMD64_IC_CFG, value) && (value & 0x1e) != 0x1e)
+		wrmsr_safe(MSR_AMD64_IC_CFG, value | 0x1e);
+
 	amd_get_topology(c);
 
 	/* Pointless to use MWAIT on Family10 as it does not deep sleep. */
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -206,6 +206,7 @@
 
 /* AMD64 MSRs */
 #define MSR_AMD64_NB_CFG 0xc001001f
+#define MSR_AMD64_IC_CFG 0xc0011021
 #define MSR_AMD64_DC_CFG 0xc0011022
 #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT 46
 