bug on aarch64. [bnc#1093797] OBS-URL: https://build.opensuse.org/package/show/devel:gcc/gcc48?expand=0&rev=222
130 lines
5.8 KiB
Diff
130 lines
5.8 KiB
Diff
This is for bsc#1093797
|
|
|
|
A miscompile in kernels btrfs module in reada.c. Preprocessed
|
|
testcase and receipt for reproduction in bug report. In short,
|
|
we're starting with an asm with a =Q constraint:
|
|
|
|
(insn 186 185 188 18 (parallel [
|
|
(set (mem/j:HI (plus:DI (reg/v/f:DI 162 [ fs_info ])
|
|
(const_int 4416 [0x1140])) [0 MEM[(struct arch_spinlock_t *)_222].owner+0 S2 A32])
|
|
(asm_operands/v:HI (".if 1 == 1 ....") ("=Q") ...))))
|
|
|
|
pseudo 162 (fs_info) didn't get a hardreg, constraint "Q" requires a memory
|
|
address in a single register. The constant 4416 will turn out to be invalid
|
|
for an operand in any add instruction.
|
|
|
|
reload will register these reloads:
|
|
Reload 0: reload_in (DI) = (reg/v/f:DI 162 [ fs_info ])
|
|
GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0)
|
|
reload_in_reg: (reg/v/f:DI 162 [ fs_info ])
|
|
Reload 1: reload_in (DI) = (plus:DI (reg/v/f:DI 162 [ fs_info ])
|
|
(const_int 4416 [0x1140]))
|
|
GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 0), inc by 2
|
|
reload_in_reg: (plus:DI (reg/v/f:DI 162 [ fs_info ])
|
|
(const_int 4416 [0x1140]))
|
|
|
|
Without changes these two reloads don't conflict (one is RELOAD_FOR_INPUT
|
|
the other RELOAD_FOR_INPUT_ADDRESS), and hence are getting allocated the same
|
|
reload reg (x4 in this case). The first reload generates this insn:
|
|
|
|
(insn 836 185 838 18 (set (reg:DI 4 x4)
|
|
(mem/c:DI (plus:DI (reg/f:DI 29 x29)
|
|
(const_int 128 [0x80])) [0 %sfp+32 S8 A64])) spinlock.h:158 33 {*movdi_aarch64}
|
|
(nil))
|
|
|
|
The second reload wants to generate this insn (in gen_reload):
|
|
(insn 839 838 186 18 (set (reg:DI 4 x4)
|
|
(plus:DI (reg:DI 4 x4) (const_int 4416 [0x1140]))))
|
|
which would be fine but emit_insn_if_valid_for_reload() finds out
|
|
correctly that this insn is invalid (because constant doesn't
|
|
match aarch64_plus_operand). The code in gen_reload handles this situation
|
|
(see comments therein) by moving one operand into the reload register
|
|
(here it chooses the constant, if it chose the other operand (the register)
|
|
nothing would be solved, the constant would still be too large).
|
|
_BUT_ the reload register is x4, which already contains an input to this
|
|
insn. So what is generated is:
|
|
|
|
(insn 838 836 839 18 (set (reg:DI 4 x4)
|
|
(const_int 4416 [0x1140])) {*movdi_aarch64}
|
|
(nil))
|
|
(insn 839 838 186 18 (set (reg:DI 4 x4)
|
|
(plus:DI (reg:DI 4 x4)
|
|
(reg:DI 4 x4))) {*adddi3_aarch64}
|
|
(expr_list:REG_EQUIV (plus:DI (reg:DI 4 x4)
|
|
(const_int 4416 [0x1140]))
|
|
(nil)))
|
|
|
|
hence x4 overwritten. There's commentary about this specific situation,
|
|
and how to avoid it. The situation is: two reloads, one feeding the other
|
|
and the other not being able to be generated as one instruction.
|
|
The function gen_reload_chain_without_interm_reg_p specifically detects
|
|
this situation (and in this case indeed says that an interim register
|
|
is needed, i.e. that the reload_reg can't be the same for these two
|
|
reloads). But that function is only used in a very specific case, namely
|
|
when looking for conflicts between two RELOAD_FOR_OPERAND_ADDRESS reloads.
|
|
|
|
In particular it's claimed (in reloads_conflict, reload_reg_free_p and
|
|
reload_reg_free_for_value_p) that a RELOAD_FOR_INPUT and a
|
|
RELOAD_FOR_INPUT_ADDRESS for the same operand never conflict. Obviously
|
|
that's not true in this case on aarch64. In fact I don't see how this
|
|
situation is avoided generally on other architectures. Possibly address
|
|
forms with out-of-range offsets must be rejected such that the invalid
|
|
offset itself generates a reload (rejecting just the whole address is
|
|
equivalent to the above).
|
|
|
|
The patch below does something like this: it detects the situation
|
|
in the backends hinter for reloading addresses. When it sees an
|
|
address consisting of a sum of a pseudo that didn't get a hardreg and a
|
|
constant (or generally an operand) that's invalid for a single
|
|
instruction then it reloads both operands of the PLUS (the whole plus
|
|
is reloaded by the caller).
|
|
|
|
This should be relatively safe, but it's unclear if it deals with
|
|
all situation where this can occur as it doesn't touch reload itself.
|
|
|
|
We do add a safety assert in gen_reload, though. That always trigger
|
|
when we're about to miscompile for the above reasons. This assert
|
|
doesn't trigger in the GCC testsuite even without patch.
|
|
|
|
Index: gcc/config/aarch64/aarch64.c
|
|
===================================================================
|
|
--- gcc/config/aarch64/aarch64.c (revision 238821)
|
|
+++ gcc/config/aarch64/aarch64.c (working copy)
|
|
@@ -3747,6 +3758,22 @@ aarch64_legitimize_reload_address (rtx *
|
|
BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
|
|
opnum, (enum reload_type) type);
|
|
return x;
|
|
+ }
|
|
+
|
|
+ if (GET_CODE (x) == PLUS
|
|
+ && 1
|
|
+ && REG_P (XEXP (x, 0))
|
|
+ && !HARD_REGISTER_P (XEXP (x, 0))
|
|
+ && reg_renumber[REGNO (XEXP (x, 0))] < 0
|
|
+ && !aarch64_plus_operand (XEXP (x, 1), DImode))
|
|
+ {
|
|
+ push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
|
|
+ BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
|
|
+ opnum, (enum reload_type) type);
|
|
+ push_reload (XEXP (x, 1), NULL_RTX, &XEXP (x, 1), NULL,
|
|
+ BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
|
|
+ opnum, (enum reload_type) type);
|
|
+ return x;
|
|
}
|
|
|
|
/* We wish to handle large displacements off a base register by splitting
|
|
Index: gcc/reload1.c
|
|
===================================================================
|
|
--- gcc/reload1.c (revision 238821)
|
|
+++ gcc/reload1.c (working copy)
|
|
@@ -8613,6 +8613,9 @@ gen_reload (rtx out, rtx in, int opnum,
|
|
&& !insn_operand_matches (code, 2, op1)))
|
|
tem = op0, op0 = op1, op1 = tem;
|
|
|
|
+ /* We must not clobber op1! */
|
|
+ gcc_assert (!reg_overlap_mentioned_p (out, op1));
|
|
+
|
|
gen_reload (out, op0, opnum, type);
|
|
|
|
/* If OP0 and OP1 are the same, we can use OUT for OP1.
|