gcc7-aarch64-sls-miti-3.patch to backport aarch64 Straight Line Speculation mitigation [bsc#1172798, CVE-2020-13844] OBS-URL: https://build.opensuse.org/package/show/devel:gcc/gcc7?expand=0&rev=199
579 lines
22 KiB
Diff
579 lines
22 KiB
Diff
Backport of below commit for bsc#1172798
|
|
|
|
commit 2155170525f93093b90a1a065e7ed71a925566e9
|
|
Author: Matthew Malcomson <matthew.malcomson@arm.com>
|
|
Date: Thu Jul 9 09:11:59 2020 +0100
|
|
|
|
aarch64: Mitigate SLS for BLR instruction
|
|
|
|
This patch introduces the mitigation for Straight Line Speculation past
|
|
the BLR instruction.
|
|
|
|
This mitigation replaces BLR instructions with a BL to a stub which uses
|
|
a BR to jump to the original value. These function stubs are then
|
|
appended with a speculation barrier to ensure no straight line
|
|
speculation happens after these jumps.
|
|
|
|
When optimising for speed we use a set of stubs for each function since
|
|
this should help the branch predictor make more accurate predictions
|
|
about where a stub should branch.
|
|
|
|
When optimising for size we use one set of stubs for all functions.
|
|
This set of stubs can have human readable names, and we are using
|
|
`__call_indirect_x<N>` for register x<N>.
|
|
|
|
When BTI branch protection is enabled the BLR instruction can jump to a
|
|
`BTI c` instruction using any register, while the BR instruction can
|
|
only jump to a `BTI c` instruction using the x16 or x17 registers.
|
|
Hence, in order to ensure this transformation is safe we mov the value
|
|
of the original register into x16 and use x16 for the BR.
|
|
|
|
As an example when optimising for size:
|
|
a
|
|
BLR x0
|
|
instruction would get transformed to something like
|
|
BL __call_indirect_x0
|
|
where __call_indirect_x0 labels a thunk that contains
|
|
__call_indirect_x0:
|
|
MOV X16, X0
|
|
BR X16
|
|
<speculation barrier>
|
|
|
|
The first version of this patch used local symbols specific to a
|
|
compilation unit to try and avoid relocations.
|
|
This was mistaken since functions coming from the same compilation unit
|
|
can still be in different sections, and the assembler will insert
|
|
relocations at jumps between sections.
|
|
|
|
On any relocation the linker is permitted to emit a veneer to handle
|
|
jumps between symbols that are very far apart. The registers x16 and
|
|
x17 may be clobbered by these veneers.
|
|
Hence the function stubs cannot rely on the values of x16 and x17 being
|
|
the same as just before the function stub is called.
|
|
|
|
Similar can be said for the hot/cold partitioning of single functions,
|
|
so function-local stubs have the same restriction.
|
|
|
|
This updated version of the patch never emits function stubs for x16 and
|
|
x17, and instead forces other registers to be used.
|
|
|
|
Given the above, there is now no benefit to local symbols (since they
|
|
are not enough to avoid dealing with linker intricacies). This patch
|
|
now uses global symbols with hidden visibility each stored in their own
|
|
COMDAT section. This means stubs can be shared between compilation
|
|
units while still avoiding the PLT indirection.
|
|
|
|
This patch also removes the `__call_indirect_x30` stub (and
|
|
function-local equivalent) which would simply jump back to the original
|
|
location.
|
|
|
|
The function-local stubs are emitted to the assembly output file in one
|
|
chunk, which means we need not add the speculation barrier directly
|
|
after each one.
|
|
This is because we know for certain that the instructions directly after
|
|
the BR in all but the last function stub will be from another one of
|
|
these stubs and hence will not contain a speculation gadget.
|
|
Instead we add a speculation barrier at the end of the sequence of
|
|
stubs.
|
|
|
|
The global stubs are emitted in COMDAT/.linkonce sections by
|
|
themselves so that the linker can remove duplicates from multiple object
|
|
files. This means they are not emitted in one chunk, and each one must
|
|
include the speculation barrier.
|
|
|
|
Another difference is that since the global stubs are shared across
|
|
compilation units we do not know that all functions will be targeting an
|
|
architecture supporting the SB instruction.
|
|
Rather than provide multiple stubs for each architecture, we provide a
|
|
stub that will work for all architectures -- using the DSB+ISB barrier.
|
|
|
|
This mitigation does not apply for BLR instructions in the following
|
|
places:
|
|
- Some accesses to thread-local variables use a code sequence with a BLR
|
|
instruction. This code sequence is part of the binary interface between
|
|
compiler and linker. If this BLR instruction needs to be mitigated, it'd
|
|
probably be best to do so in the linker. It seems that the code sequence
|
|
for thread-local variable access is unlikely to lead to a Spectre Revalation
|
|
Gadget.
|
|
- PLT stubs are produced by the linker and each contain a BLR instruction.
|
|
It seems that at most only after the last PLT stub a Spectre Revalation
|
|
Gadget might appear.
|
|
|
|
Testing:
|
|
Bootstrap and regtest on AArch64
|
|
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
|
|
Used a temporary hack(1) in gcc-dg.exp to use these options on every
|
|
test in the testsuite, a slight modification to emit the speculation
|
|
barrier after every function stub, and a script to check that the
|
|
output never emitted a BLR, or unmitigated BR or RET instruction.
|
|
Similar on an aarch64-none-elf cross-compiler.
|
|
|
|
1) Temporary hack emitted a speculation barrier at the end of every stub
|
|
function, and used a script to ensure that:
|
|
a) Every RET or BR is immediately followed by a speculation barrier.
|
|
b) No BLR instruction is emitted by compiler.
|
|
|
|
(cherry picked from 96b7f495f9269d5448822e4fc28882edb35a58d7)
|
|
|
|
gcc/ChangeLog:
|
|
|
|
* config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm):
|
|
New declaration.
|
|
* config/aarch64/aarch64.c (aarch64_regno_regclass): Handle new
|
|
stub registers class.
|
|
(aarch64_class_max_nregs): Likewise.
|
|
(aarch64_register_move_cost): Likewise.
|
|
(aarch64_sls_shared_thunks): Global array to store stub labels.
|
|
(aarch64_sls_emit_function_stub): New.
|
|
(aarch64_create_blr_label): New.
|
|
(aarch64_sls_emit_blr_function_thunks): New.
|
|
(aarch64_sls_emit_shared_blr_thunks): New.
|
|
(aarch64_asm_file_end): New.
|
|
(aarch64_indirect_call_asm): New.
|
|
(TARGET_ASM_FILE_END): Use aarch64_asm_file_end.
|
|
(TARGET_ASM_FUNCTION_EPILOGUE): Use
|
|
aarch64_sls_emit_blr_function_thunks.
|
|
* config/aarch64/aarch64.h (STB_REGNUM_P): New.
|
|
(enum reg_class): Add STUB_REGS class.
|
|
(machine_function): Introduce `call_via` array for
|
|
function-local stub labels.
|
|
* config/aarch64/aarch64.md (*call_insn, *call_value_insn): Use
|
|
aarch64_indirect_call_asm to emit code when hardening BLR
|
|
instructions.
|
|
* config/aarch64/constraints.md (Ucr): New constraint
|
|
representing registers for indirect calls. Is GENERAL_REGS
|
|
usually, and STUB_REGS when hardening BLR instruction against
|
|
SLS.
|
|
* config/aarch64/predicates.md (aarch64_general_reg): STUB_REGS class
|
|
is also a general register.
|
|
|
|
gcc/testsuite/ChangeLog:
|
|
|
|
* gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c: New test.
|
|
* gcc.target/aarch64/sls-mitigation/sls-miti-blr.c: New test.
|
|
|
|
Index: gcc-7.5.0+r278197/gcc/config/aarch64/aarch64-protos.h
|
|
===================================================================
|
|
--- gcc-7.5.0+r278197.orig/gcc/config/aarch64/aarch64-protos.h
|
|
+++ gcc-7.5.0+r278197/gcc/config/aarch64/aarch64-protos.h
|
|
@@ -486,6 +486,7 @@ extern const atomic_ool_names aarch64_oo
|
|
extern const atomic_ool_names aarch64_ool_ldeor_names;
|
|
|
|
const char *aarch64_sls_barrier (int);
|
|
+const char *aarch64_indirect_call_asm (rtx);
|
|
extern bool aarch64_harden_sls_retbr_p (void);
|
|
extern bool aarch64_harden_sls_blr_p (void);
|
|
|
|
Index: gcc-7.5.0+r278197/gcc/config/aarch64/aarch64.c
|
|
===================================================================
|
|
--- gcc-7.5.0+r278197.orig/gcc/config/aarch64/aarch64.c
|
|
+++ gcc-7.5.0+r278197/gcc/config/aarch64/aarch64.c
|
|
@@ -5471,6 +5471,9 @@ aarch64_label_mentioned_p (rtx x)
|
|
enum reg_class
|
|
aarch64_regno_regclass (unsigned regno)
|
|
{
|
|
+ if (STUB_REGNUM_P (regno))
|
|
+ return STUB_REGS;
|
|
+
|
|
if (GP_REGNUM_P (regno))
|
|
return GENERAL_REGS;
|
|
|
|
@@ -5799,6 +5802,7 @@ aarch64_class_max_nregs (reg_class_t reg
|
|
{
|
|
switch (regclass)
|
|
{
|
|
+ case STUB_REGS:
|
|
case TAILCALL_ADDR_REGS:
|
|
case POINTER_REGS:
|
|
case GENERAL_REGS:
|
|
@@ -7880,10 +7884,12 @@ aarch64_register_move_cost (machine_mode
|
|
= aarch64_tune_params.regmove_cost;
|
|
|
|
/* Caller save and pointer regs are equivalent to GENERAL_REGS. */
|
|
- if (to == TAILCALL_ADDR_REGS || to == POINTER_REGS)
|
|
+ if (to == TAILCALL_ADDR_REGS || to == POINTER_REGS
|
|
+ || to == STUB_REGS)
|
|
to = GENERAL_REGS;
|
|
|
|
- if (from == TAILCALL_ADDR_REGS || from == POINTER_REGS)
|
|
+ if (from == TAILCALL_ADDR_REGS || from == POINTER_REGS
|
|
+ || from == STUB_REGS)
|
|
from = GENERAL_REGS;
|
|
|
|
/* Moving between GPR and stack cost is the same as GP2GP. */
|
|
@@ -14699,6 +14705,215 @@ aarch64_sls_barrier (int mitigation_requ
|
|
: "";
|
|
}
|
|
|
|
+static GTY (()) tree aarch64_sls_shared_thunks[30];
|
|
+static GTY (()) bool aarch64_sls_shared_thunks_needed = false;
|
|
+const char *indirect_symbol_names[30] = {
|
|
+ "__call_indirect_x0",
|
|
+ "__call_indirect_x1",
|
|
+ "__call_indirect_x2",
|
|
+ "__call_indirect_x3",
|
|
+ "__call_indirect_x4",
|
|
+ "__call_indirect_x5",
|
|
+ "__call_indirect_x6",
|
|
+ "__call_indirect_x7",
|
|
+ "__call_indirect_x8",
|
|
+ "__call_indirect_x9",
|
|
+ "__call_indirect_x10",
|
|
+ "__call_indirect_x11",
|
|
+ "__call_indirect_x12",
|
|
+ "__call_indirect_x13",
|
|
+ "__call_indirect_x14",
|
|
+ "__call_indirect_x15",
|
|
+ "", /* "__call_indirect_x16", */
|
|
+ "", /* "__call_indirect_x17", */
|
|
+ "__call_indirect_x18",
|
|
+ "__call_indirect_x19",
|
|
+ "__call_indirect_x20",
|
|
+ "__call_indirect_x21",
|
|
+ "__call_indirect_x22",
|
|
+ "__call_indirect_x23",
|
|
+ "__call_indirect_x24",
|
|
+ "__call_indirect_x25",
|
|
+ "__call_indirect_x26",
|
|
+ "__call_indirect_x27",
|
|
+ "__call_indirect_x28",
|
|
+ "__call_indirect_x29",
|
|
+};
|
|
+
|
|
+/* Function to create a BLR thunk. This thunk is used to mitigate straight
|
|
+ line speculation. Instead of a simple BLR that can be speculated past,
|
|
+ we emit a BL to this thunk, and this thunk contains a BR to the relevant
|
|
+ register. These thunks have the relevant speculation barries put after
|
|
+ their indirect branch so that speculation is blocked.
|
|
+
|
|
+ We use such a thunk so the speculation barriers are kept off the
|
|
+ architecturally executed path in order to reduce the performance overhead.
|
|
+
|
|
+ When optimizing for size we use stubs shared by the linked object.
|
|
+ When optimizing for performance we emit stubs for each function in the hope
|
|
+ that the branch predictor can better train on jumps specific for a given
|
|
+ function. */
|
|
+rtx
|
|
+aarch64_sls_create_blr_label (int regnum)
|
|
+{
|
|
+ gcc_assert (STUB_REGNUM_P (regnum));
|
|
+ if (optimize_function_for_size_p (cfun))
|
|
+ {
|
|
+ /* For the thunks shared between different functions in this compilation
|
|
+ unit we use a named symbol -- this is just for users to more easily
|
|
+ understand the generated assembly. */
|
|
+ aarch64_sls_shared_thunks_needed = true;
|
|
+ const char *thunk_name = indirect_symbol_names[regnum];
|
|
+ if (aarch64_sls_shared_thunks[regnum] == NULL)
|
|
+ {
|
|
+ /* Build a decl representing this function stub and record it for
|
|
+ later. We build a decl here so we can use the GCC machinery for
|
|
+ handling sections automatically (through `get_named_section` and
|
|
+ `make_decl_one_only`). That saves us a lot of trouble handling
|
|
+ the specifics of different output file formats. */
|
|
+ tree decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL,
|
|
+ get_identifier (thunk_name),
|
|
+ build_function_type_list (void_type_node,
|
|
+ NULL_TREE));
|
|
+ DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL,
|
|
+ NULL_TREE, void_type_node);
|
|
+ TREE_PUBLIC (decl) = 1;
|
|
+ TREE_STATIC (decl) = 1;
|
|
+ DECL_IGNORED_P (decl) = 1;
|
|
+ DECL_ARTIFICIAL (decl) = 1;
|
|
+ make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
|
|
+ resolve_unique_section (decl, 0, false);
|
|
+ aarch64_sls_shared_thunks[regnum] = decl;
|
|
+ }
|
|
+
|
|
+ return gen_rtx_SYMBOL_REF (Pmode, thunk_name);
|
|
+ }
|
|
+
|
|
+ if (cfun->machine->call_via[regnum] == NULL)
|
|
+ cfun->machine->call_via[regnum]
|
|
+ = gen_rtx_LABEL_REF (Pmode, gen_label_rtx ());
|
|
+ return cfun->machine->call_via[regnum];
|
|
+}
|
|
+
|
|
+/* Helper function for aarch64_sls_emit_blr_function_thunks and
|
|
+ aarch64_sls_emit_shared_blr_thunks below. */
|
|
+static void
|
|
+aarch64_sls_emit_function_stub (FILE *out_file, int regnum)
|
|
+{
|
|
+ /* Save in x16 and branch to that function so this transformation does
|
|
+ not prevent jumping to `BTI c` instructions. */
|
|
+ asm_fprintf (out_file, "\tmov\tx16, x%d\n", regnum);
|
|
+ asm_fprintf (out_file, "\tbr\tx16\n");
|
|
+}
|
|
+
|
|
+/* Emit all BLR stubs for this particular function.
|
|
+ Here we emit all the BLR stubs needed for the current function. Since we
|
|
+ emit these stubs in a consecutive block we know there will be no speculation
|
|
+ gadgets between each stub, and hence we only emit a speculation barrier at
|
|
+ the end of the stub sequences.
|
|
+
|
|
+ This is called in the TARGET_ASM_FUNCTION_EPILOGUE hook. */
|
|
+void
|
|
+aarch64_sls_emit_blr_function_thunks (FILE *out_file, HOST_WIDE_INT)
|
|
+{
|
|
+ if (! aarch64_harden_sls_blr_p ())
|
|
+ return;
|
|
+
|
|
+ bool any_functions_emitted = false;
|
|
+ /* We must save and restore the current function section since this assembly
|
|
+ is emitted at the end of the function. This means it can be emitted *just
|
|
+ after* the cold section of a function. That cold part would be emitted in
|
|
+ a different section. That switch would trigger a `.cfi_endproc` directive
|
|
+ to be emitted in the original section and a `.cfi_startproc` directive to
|
|
+ be emitted in the new section. Switching to the original section without
|
|
+ restoring would mean that the `.cfi_endproc` emitted as a function ends
|
|
+ would happen in a different section -- leaving an unmatched
|
|
+ `.cfi_startproc` in the cold text section and an unmatched `.cfi_endproc`
|
|
+ in the standard text section. */
|
|
+ section *save_text_section = in_section;
|
|
+ switch_to_section (function_section (current_function_decl));
|
|
+ for (int regnum = 0; regnum < 30; ++regnum)
|
|
+ {
|
|
+ rtx specu_label = cfun->machine->call_via[regnum];
|
|
+ if (specu_label == NULL)
|
|
+ continue;
|
|
+
|
|
+ targetm.asm_out.print_operand (out_file, specu_label, 0);
|
|
+ asm_fprintf (out_file, ":\n");
|
|
+ aarch64_sls_emit_function_stub (out_file, regnum);
|
|
+ any_functions_emitted = true;
|
|
+ }
|
|
+ if (any_functions_emitted)
|
|
+ /* Can use the SB if needs be here, since this stub will only be used
|
|
+ by the current function, and hence for the current target. */
|
|
+ asm_fprintf (out_file, "\t%s\n", aarch64_sls_barrier (true));
|
|
+ switch_to_section (save_text_section);
|
|
+}
|
|
+
|
|
+/* Emit shared BLR stubs for the current compilation unit.
|
|
+ Over the course of compiling this unit we may have converted some BLR
|
|
+ instructions to a BL to a shared stub function. This is where we emit those
|
|
+ stub functions.
|
|
+ This function is for the stubs shared between different functions in this
|
|
+ compilation unit. We share when optimizing for size instead of speed.
|
|
+
|
|
+ This function is called through the TARGET_ASM_FILE_END hook. */
|
|
+void
|
|
+aarch64_sls_emit_shared_blr_thunks (FILE *out_file)
|
|
+{
|
|
+ if (! aarch64_sls_shared_thunks_needed)
|
|
+ return;
|
|
+
|
|
+ for (int regnum = 0; regnum < 30; ++regnum)
|
|
+ {
|
|
+ tree decl = aarch64_sls_shared_thunks[regnum];
|
|
+ if (!decl)
|
|
+ continue;
|
|
+
|
|
+ const char *name = indirect_symbol_names[regnum];
|
|
+ switch_to_section (get_named_section (decl, NULL, 0));
|
|
+ ASM_OUTPUT_ALIGN (out_file, 2);
|
|
+ targetm.asm_out.globalize_label (out_file, name);
|
|
+ /* Only emits if the compiler is configured for an assembler that can
|
|
+ handle visibility directives. */
|
|
+ targetm.asm_out.assemble_visibility (decl, VISIBILITY_HIDDEN);
|
|
+ ASM_OUTPUT_TYPE_DIRECTIVE (out_file, name, "function");
|
|
+ ASM_OUTPUT_LABEL (out_file, name);
|
|
+ aarch64_sls_emit_function_stub (out_file, regnum);
|
|
+ /* Use the most conservative target to ensure it can always be used by any
|
|
+ function in the translation unit. */
|
|
+ asm_fprintf (out_file, "\tdsb\tsy\n\tisb\n");
|
|
+ ASM_DECLARE_FUNCTION_SIZE (out_file, name, decl);
|
|
+ }
|
|
+}
|
|
+
|
|
+/* Implement TARGET_ASM_FILE_END. */
|
|
+void
|
|
+aarch64_asm_file_end ()
|
|
+{
|
|
+ aarch64_sls_emit_shared_blr_thunks (asm_out_file);
|
|
+ /* Since this function will be called for the ASM_FILE_END hook, we ensure
|
|
+ that what would be called otherwise (e.g. `file_end_indicate_exec_stack`
|
|
+ for FreeBSD) still gets called. */
|
|
+#ifdef TARGET_ASM_FILE_END
|
|
+ TARGET_ASM_FILE_END ();
|
|
+#endif
|
|
+}
|
|
+
|
|
+const char *
|
|
+aarch64_indirect_call_asm (rtx addr)
|
|
+{
|
|
+ gcc_assert (REG_P (addr));
|
|
+ if (aarch64_harden_sls_blr_p ())
|
|
+ {
|
|
+ rtx stub_label = aarch64_sls_create_blr_label (REGNO (addr));
|
|
+ output_asm_insn ("bl\t%0", &stub_label);
|
|
+ }
|
|
+ else
|
|
+ output_asm_insn ("blr\t%0", &addr);
|
|
+ return "";
|
|
+}
|
|
+
|
|
/* Target-specific selftests. */
|
|
|
|
#if CHECKING_P
|
|
@@ -15132,6 +15347,12 @@ aarch64_libgcc_floating_mode_supported_p
|
|
#define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests
|
|
#endif /* #if CHECKING_P */
|
|
|
|
+#undef TARGET_ASM_FILE_END
|
|
+#define TARGET_ASM_FILE_END aarch64_asm_file_end
|
|
+
|
|
+#undef TARGET_ASM_FUNCTION_EPILOGUE
|
|
+#define TARGET_ASM_FUNCTION_EPILOGUE aarch64_sls_emit_blr_function_thunks
|
|
+
|
|
struct gcc_target targetm = TARGET_INITIALIZER;
|
|
|
|
#include "gt-aarch64.h"
|
|
Index: gcc-7.5.0+r278197/gcc/config/aarch64/aarch64.h
|
|
===================================================================
|
|
--- gcc-7.5.0+r278197.orig/gcc/config/aarch64/aarch64.h
|
|
+++ gcc-7.5.0+r278197/gcc/config/aarch64/aarch64.h
|
|
@@ -435,6 +435,16 @@ extern unsigned aarch64_architecture_ver
|
|
#define GP_REGNUM_P(REGNO) \
|
|
(((unsigned) (REGNO - R0_REGNUM)) <= (R30_REGNUM - R0_REGNUM))
|
|
|
|
+/* Registers known to be preserved over a BL instruction. This consists of the
|
|
+ GENERAL_REGS without x16, x17, and x30. The x30 register is changed by the
|
|
+ BL instruction itself, while the x16 and x17 registers may be used by
|
|
+ veneers which can be inserted by the linker. */
|
|
+#define STUB_REGNUM_P(REGNO) \
|
|
+ (GP_REGNUM_P (REGNO) \
|
|
+ && (REGNO) != R16_REGNUM \
|
|
+ && (REGNO) != R17_REGNUM \
|
|
+ && (REGNO) != R30_REGNUM) \
|
|
+
|
|
#define FP_REGNUM_P(REGNO) \
|
|
(((unsigned) (REGNO - V0_REGNUM)) <= (V31_REGNUM - V0_REGNUM))
|
|
|
|
@@ -448,6 +458,7 @@ enum reg_class
|
|
{
|
|
NO_REGS,
|
|
TAILCALL_ADDR_REGS,
|
|
+ STUB_REGS,
|
|
GENERAL_REGS,
|
|
STACK_REG,
|
|
POINTER_REGS,
|
|
@@ -463,6 +474,7 @@ enum reg_class
|
|
{ \
|
|
"NO_REGS", \
|
|
"TAILCALL_ADDR_REGS", \
|
|
+ "STUB_REGS", \
|
|
"GENERAL_REGS", \
|
|
"STACK_REG", \
|
|
"POINTER_REGS", \
|
|
@@ -475,6 +487,7 @@ enum reg_class
|
|
{ \
|
|
{ 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \
|
|
{ 0x0004ffff, 0x00000000, 0x00000000 }, /* TAILCALL_ADDR_REGS */\
|
|
+ { 0x3ffcffff, 0x00000000, 0x00000000 }, /* STUB_REGS */ \
|
|
{ 0x7fffffff, 0x00000000, 0x00000003 }, /* GENERAL_REGS */ \
|
|
{ 0x80000000, 0x00000000, 0x00000000 }, /* STACK_REG */ \
|
|
{ 0xffffffff, 0x00000000, 0x00000003 }, /* POINTER_REGS */ \
|
|
@@ -609,6 +622,8 @@ typedef struct GTY (()) machine_function
|
|
struct aarch64_frame frame;
|
|
/* One entry for each hard register. */
|
|
bool reg_is_wrapped_separately[LAST_SAVED_REGNUM];
|
|
+ /* One entry for each general purpose register. */
|
|
+ rtx call_via[SP_REGNUM];
|
|
} machine_function;
|
|
#endif
|
|
|
|
Index: gcc-7.5.0+r278197/gcc/config/aarch64/aarch64.md
|
|
===================================================================
|
|
--- gcc-7.5.0+r278197.orig/gcc/config/aarch64/aarch64.md
|
|
+++ gcc-7.5.0+r278197/gcc/config/aarch64/aarch64.md
|
|
@@ -778,12 +778,12 @@
|
|
)
|
|
|
|
(define_insn "*call_reg"
|
|
- [(call (mem:DI (match_operand:DI 0 "register_operand" "r"))
|
|
+ [(call (mem:DI (match_operand:DI 0 "register_operand" "Ucr"))
|
|
(match_operand 1 "" ""))
|
|
(use (match_operand 2 "" ""))
|
|
(clobber (reg:DI LR_REGNUM))]
|
|
""
|
|
- "blr\\t%0"
|
|
+ "* return aarch64_indirect_call_asm (operands[0]);"
|
|
[(set_attr "type" "call")]
|
|
)
|
|
|
|
@@ -840,12 +840,12 @@
|
|
|
|
(define_insn "*call_value_reg"
|
|
[(set (match_operand 0 "" "")
|
|
- (call (mem:DI (match_operand:DI 1 "register_operand" "r"))
|
|
+ (call (mem:DI (match_operand:DI 1 "register_operand" "Ucr"))
|
|
(match_operand 2 "" "")))
|
|
(use (match_operand 3 "" ""))
|
|
(clobber (reg:DI LR_REGNUM))]
|
|
""
|
|
- "blr\\t%1"
|
|
+ "* return aarch64_indirect_call_asm (operands[1]);"
|
|
[(set_attr "type" "call")]
|
|
|
|
)
|
|
Index: gcc-7.5.0+r278197/gcc/config/aarch64/constraints.md
|
|
===================================================================
|
|
--- gcc-7.5.0+r278197.orig/gcc/config/aarch64/constraints.md
|
|
+++ gcc-7.5.0+r278197/gcc/config/aarch64/constraints.md
|
|
@@ -24,6 +24,15 @@
|
|
(define_register_constraint "Ucs" "TAILCALL_ADDR_REGS"
|
|
"@internal Registers suitable for an indirect tail call")
|
|
|
|
+(define_register_constraint "Ucr"
|
|
+ "aarch64_harden_sls_blr_p () ? STUB_REGS : GENERAL_REGS"
|
|
+ "@internal Registers to be used for an indirect call.
|
|
+ This is usually the general registers, but when we are hardening against
|
|
+ Straight Line Speculation we disallow x16, x17, and x30 so we can use
|
|
+ indirection stubs. These indirection stubs cannot use the above registers
|
|
+ since they will be reached by a BL that may have to go through a linker
|
|
+ veneer.")
|
|
+
|
|
(define_register_constraint "w" "FP_REGS"
|
|
"Floating point and SIMD vector registers.")
|
|
|
|
Index: gcc-7.5.0+r278197/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr.c
|
|
===================================================================
|
|
--- /dev/null
|
|
+++ gcc-7.5.0+r278197/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr.c
|
|
@@ -0,0 +1,33 @@
|
|
+/* { dg-additional-options "-mharden-sls=blr -save-temps" } */
|
|
+/* Ensure that the SLS hardening of BLR leaves no BLR instructions.
|
|
+ We only test that all BLR instructions have been removed, not that the
|
|
+ resulting code makes sense. */
|
|
+typedef int (foo) (int, int);
|
|
+typedef void (bar) (int, int);
|
|
+struct sls_testclass {
|
|
+ foo *x;
|
|
+ bar *y;
|
|
+ int left;
|
|
+ int right;
|
|
+};
|
|
+
|
|
+/* We test both RTL patterns for a call which returns a value and a call which
|
|
+ does not. */
|
|
+int blr_call_value (struct sls_testclass x)
|
|
+{
|
|
+ int retval = x.x(x.left, x.right);
|
|
+ if (retval % 10)
|
|
+ return 100;
|
|
+ return 9;
|
|
+}
|
|
+
|
|
+int blr_call (struct sls_testclass x)
|
|
+{
|
|
+ x.y(x.left, x.right);
|
|
+ if (x.left % 10)
|
|
+ return 100;
|
|
+ return 9;
|
|
+}
|
|
+
|
|
+/* { dg-final { scan-assembler-not {\tblr\t} } } */
|
|
+/* { dg-final { scan-assembler {\tbr\tx[0-9][0-9]?} } } */
|