In the recent refactoring we missed a few places which should be
calling finalize_memop_asimd() for ASIMD loads and stores but
instead are just calling finalize_memop(); fix these.
For the disas_ldst_single_struct() and disas_ldst_multiple_struct()
cases, this is not a behaviour change because there the size
is never MO_128 and the two finalize functions do the same thing.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
In disas_ldst_reg_imm9() we missed one place where a call to
a gen_mte_check* function should now be passed the memop we
have created rather than just being passed the size. Fix this.
Fixes: 0a9091424d ("target/arm: Pass memop to gen_mte_check1*")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The LDG instruction loads the tag from a memory address (identified
by [Xn + offset]), and then merges that tag into the destination
register Xt. We implemented this correctly for the case when
allocation tags are enabled, but didn't get it right when ATA=0:
instead of merging the tag bits into Xt, we merged them into the
memory address [Xn + offset] and then set Xt to that.
Merge the tag bits into the old Xt value, as they should be.
Cc: qemu-stable@nongnu.org
Fixes: c15294c1e3 ("target/arm: Implement LDG, STG, ST2G instructions")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The atomic memory operations are supposed to return the old memory
data value in the destination register. This value is not
sign-extended, even if the operation is the signed minimum or
maximum. (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)
We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.
Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230602155223.2040685-2-peter.maydell@linaro.org
Move most includes from *translate*.c to translate.h, ensuring
that we get the ordering correct. Ensure cpu.h is first.
Use disas/disas.h instead of exec/log.h.
Drop otherwise unused includes.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
New wrapper around gen_io_start which takes care of the USE_ICOUNT
check, as well as marking the DisasContext to end the TB.
Remove exec/gen-icount.h.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This had been included via tcg-op-common.h via tcg-op.h,
but that is going away.
It is needed for inlines within translator.h, so we might as well
do it there and not individually in each translator c file.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Convert the exception-return insns ERET, ERETA and ERETB to
decodetree. These were the last insns left in the legacy
decoder function disas_uncond_reg_b(), which allows us to
remove it.
The old decoder explicitly decoded the DRPS instruction,
only in order to call unallocated_encoding() on it, exactly
as would have happened if it hadn't decoded it. This is
because this insn always UNDEFs unless the CPU is in
halting-debug state, which we don't emulate. So we list
the pattern in a comment in a64.decode, but don't actively
decode it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-21-peter.maydell@linaro.org
Convert the last four BR-with-pointer-auth insns to decodetree.
The remaining cases in the outer switch in disas_uncond_b_reg()
all return early rather than leaving the case statement, so we
can delete the now-unused code at the end of that function.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-20-peter.maydell@linaro.org
The SVE and SME decode is already done by decodetree. Pull the calls
to these decoders out of the legacy decoder. This doesn't change
behaviour because all the patterns in sve.decode and sme.decode
already require the bits that the legacy decoder is decoding to have
the correct values.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-4-peter.maydell@linaro.org
The A64 translator uses a hand-written decoder for everything except
SVE or SME. It's fairly well structured, but it's becoming obvious
that it's still more painful to add instructions to than the A32
translator, because putting a new instruction into the right place in
a hand-written decoder is much harder than adding new instruction
patterns to a decodetree file.
As the first step in conversion to decodetree, create the skeleton of
the decodetree decoder; where it does not handle instructions we will
fall back to the legacy decoder (which will be for everything at the
moment, since there are no patterns in a64.decode).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230512144106.3608981-3-peter.maydell@linaro.org
Here it is not trivial to notice first initialization, so explicitly
zero the temps. Use an array for the output, rather than separate
tcg_rd/tcg_rd_hi variables.
Fixes a bug by adding a missing clear_vec_high.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It is easy enough to use mov instead of or-with-zero
and relying on the optimizer to fold away the or.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It is easy enough to use mov instead of or-with-zero and relying
on the optimizer to fold away the or. Use an array for the output,
rather than separate tcg_res{l,h} variables.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out common subroutines for handing rounding mode
changes during translation. Use tcg_constant_i32 and
tcg_temp_new_i32 instead of tcg_const_i32.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for extracting new helpers, ensure that
the rounding mode is represented as ARMFPRounding and
not FloatRoundMode.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Initialize rmode to -1 instead of keeping two variables.
This is already used elsewhere in translate-a64.c.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Only the use within cpu_reg requires a writable temp,
so inline new_tmp_a64_zero there. All other uses are
fine with a constant temp, so use tcg_constant_i64(0).
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>