In commit cfdb2c0c95 ("target/arm: Vectorize SABA/UABA") we
replaced the old handling of SABA/UABA with a vectorized implementation
which returns early rather than falling into the loop-ever-elements
code. We forgot to delete the part of the old looping code that
did the accumulate step, and Coverity correctly warns (CID 1428955)
that this code is now dead. Delete it.
Fixes: cfdb2c0c95
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200619171547.29780-1-peter.maydell@linaro.org
The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200616170844.13318-10-peter.maydell@linaro.org
With this conversion, we will be able to use the same helpers
with sve. In particular, pass 3 vector parameters for the
3-operand operations; for advsimd the destination register
is also an input.
This also fixes a bug in which we failed to clear the high bits
of the SVE register after an AdvSIMD operation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200514212831.31248-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-4-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200513163245.17915-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity raised a shed-load of errors cascading from inferring
that clz32(immh) might yield 32, from immh might be 0.
While immh cannot be 0 from encoding, it is not obvious even to
a human how we've checked that: via the filtering provided by
data_proc_simd[].
Reported-by: Coverity (CID 1421923, and more)
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200320160622.8040-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We must include the tag in the FAR_ELx register when raising
an addressing exception. Which means that we should not clear
out the tag during translation.
We cannot at present comply with this for user mode, so we
retain the clean_data_tbi function for the moment, though it
no longer does what it says on the tin for system mode. This
function is to be replaced with MTE, so don't worry about the
slight misnaming.
Buglink: https://bugs.launchpad.net/qemu/+bug/1867072
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200308012946.16303-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The v8.4-RCPC extension implements some new instructions:
* LDAPUR, LDAPURB, LDAPURH, LDAPRSB, LDAPRSH, LDAPRSW
* STLUR, STLURB, STLURH
These are all in a new subgroup of encodings that sits below the
top-level "Loads and Stores" group in the Arm ARM.
The STLUR* instructions have standard store-release semantics; the
LDAPUR* have Load-AcquirePC semantics, but (as with LDAPR*) we choose
to implement them as the slightly stronger Load-Acquire.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200224172846.13053-4-peter.maydell@linaro.org
The v8.3-RCPC extension implements three new load instructions
which provide slightly weaker consistency guarantees than the
existing load-acquire operations. For QEMU we choose to simply
implement them with a full LDAQ barrier.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200224172846.13053-3-peter.maydell@linaro.org
We still need two different helpers, since NEON and SVE2 get the
inputs from different locations within the source vector. However,
we can convert both to the same internal form for computation.
The sve2 helper is not used yet, but adding it with this patch
helps illustrate why the neon changes are helpful.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift. This
requires several masks and selects in addition to the actual
shifts to form the complete answer.
That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200216214232.4230-2-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>