xen/21757-x86-mce-avoid-BUG_ON.patch

# HG changeset patch
# User Keir Fraser <keir.fraser@citrix.com>
# Date 1278674491 -3600
# Node ID 50cf787b70eb74adfe501a2484a0dffe7d15e567
# Parent a7a680442b738928eb963b31e22a3e428ac111a0
mce: Replace BUG() with a console warning in the MCE handler.

If the hardware reports corrected errors that we didn't see through
the status MSRs, complain on the console but don't BUG() the machine.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>

--- a/xen/arch/x86/cpu/mcheck/amd_nonfatal.c
+++ b/xen/arch/x86/cpu/mcheck/amd_nonfatal.c
@@ -152,14 +152,19 @@ static void mce_amd_work_fn(void *data)
/* HW does not count *all* kinds of correctable errors.
* Thus it is possible, that the polling routine finds an
- * correctable error even if the HW reports nothing.
- * However, the other way around is not possible (= BUG).
- */
+ * correctable error even if the HW reports nothing. */
if (counter > 0) {
/* HW reported correctable errors,
* the polling routine did not find...
*/
- BUG_ON(adjust == 0);
+ if (adjust == 0) {
+ printk("CPU counter reports %"PRIu32
+ " correctable hardware error%s that %s"
+ " not reported by the status MSRs\n",
+ counter,
+ (counter == 1 ? "" : "s"),
+ (counter == 1 ? "was" : "were"));
+ }
/* subtract 1 to not double count the error
* from the polling service routine */
adjust += (counter - 1);
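
The behavior after this change can be illustrated outside the hypervisor with a minimal standalone sketch. It is not Xen code: the helper name account_corrected_errors, the unsigned type chosen for adjust, and the use of fprintf in place of printk are all assumptions made for illustration. Only the control flow and the message text follow the hunk above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of Xen). It mirrors the patched logic:
 * "counter" is the number of correctable errors reported by the hardware
 * counter, and "adjust" is the number already found by the polling routine.
 * The type of "adjust" is an assumption. Returns the updated adjust value.
 */
unsigned int account_corrected_errors(uint32_t counter, unsigned int adjust)
{
    if (counter > 0) {
        /* The hardware counted errors that polling did not see.
         * Warn instead of halting the machine. */
        if (adjust == 0) {
            fprintf(stderr,
                    "CPU counter reports %" PRIu32
                    " correctable hardware error%s that %s"
                    " not reported by the status MSRs\n",
                    counter,
                    (counter == 1 ? "" : "s"),
                    (counter == 1 ? "was" : "were"));
        }
        /* Subtract 1 to avoid double counting the error
         * from the polling service routine. */
        adjust += (counter - 1);
    }
    return adjust;
}
```

In this sketch, a call such as account_corrected_errors(3, 0) prints the warning and returns 2, where the pre-patch BUG_ON(adjust == 0) would have stopped the machine at the same point.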