Handle 32/64-bit elements via gvec expansion and the 8/16 bits via
ool helpers.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Similar to VECTOR ADD COMPUTE CARRY, however 128-bit handling only.
Courtesy of Richard H.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Only slightly ugly, perform two additions. At least it is only supported
for 128 bit elements.
Introduce gen_gvec128_4_i64() similar to gen_gvec128_3_i64().
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Introduce two types of fancy new helpers that will be reused a couple of
times
1. gen_gvec_fn_3: Call an existing tcg_gen_gvec_X function with 3
parameters, simplifying parameter passing
2. gen_gvec128_3_i64: Call a function that performs 128 bit calculations
using two 64 bit values per vector.
Luckily, for VECTOR ADD we already have everything we need.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Add CPUClass::tlb_fill.
Improve tlb_vaddr_to_host for use by ARM SVE no-fault loads.
# gpg: Signature made Fri 10 May 2019 19:48:37 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20190510: (27 commits)
tcg: Use tlb_fill probe from tlb_vaddr_to_host
tcg: Remove CPUClass::handle_mmu_fault
tcg: Use CPUClass::tlb_fill in cputlb.c
target/xtensa: Convert to CPUClass::tlb_fill
target/unicore32: Convert to CPUClass::tlb_fill
target/tricore: Convert to CPUClass::tlb_fill
target/tilegx: Convert to CPUClass::tlb_fill
target/sparc: Convert to CPUClass::tlb_fill
target/sh4: Convert to CPUClass::tlb_fill
target/s390x: Convert to CPUClass::tlb_fill
target/riscv: Convert to CPUClass::tlb_fill
target/ppc: Convert to CPUClass::tlb_fill
target/openrisc: Convert to CPUClass::tlb_fill
target/nios2: Convert to CPUClass::tlb_fill
target/moxie: Convert to CPUClass::tlb_fill
target/mips: Convert to CPUClass::tlb_fill
target/mips: Tidy control flow in mips_cpu_handle_mmu_fault
target/mips: Pass a valid error to raise_mmu_exception for user-only
target/microblaze: Convert to CPUClass::tlb_fill
target/m68k: Convert to CPUClass::tlb_fill
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can now use the CPUClass hook instead of a named function.
Create a static tlb_fill function to avoid other changes within
cputlb.c. This also isolates the asserts within. Remove the
named tlb_fill function from all of the targets.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add tcg_gen_extract2_*.
Deal with overflow of TranslationBlocks.
Respect access_type in io_readx.
# gpg: Signature made Fri 26 Apr 2019 18:17:01 BST
# gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg: issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20190426:
cputlb: Fix io_readx() to respect the access_type
tcg/arm: Restrict constant pool displacement to 12 bits
tcg/ppc: Allow the constant pool to overflow at 32k
tcg: Restart TB generation after out-of-line ldst overflow
tcg: Restart TB generation after constant pool overflow
tcg: Restart TB generation after relocation overflow
tcg: Restart after TB code generation overflow
tcg: Hoist max_insns computation to tb_gen_code
tcg/aarch64: Support INDEX_op_extract2_{i32,i64}
tcg/arm: Support INDEX_op_extract2_i32
tcg/i386: Support INDEX_op_extract2_{i32,i64}
tcg: Use extract2 in tcg_gen_deposit_{i32,i64}
tcg: Use deposit and extract2 in tcg_gen_shifti_i64
tcg: Add INDEX_op_extract2_{i32,i64}
tcg: Implement tcg_gen_extract2_{i32,i64}
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Right now we configure the pagesize quite early, when initializing KVM.
This is long before system memory is actually allocated via
memory_region_allocate_system_memory(), and therefore memory backends
marked as mapped.
Instead, let's configure the maximum page size after initializing
memory in s390_memory_init(). cap_hpage_1m is still properly
configured before creating any CPUs, and therefore before configuring
the CPU model and eventually enabling CMMA.
This is not a fix but rather a preparation for the future, when initial
memory might reside on memory backends (not the case for s390x right now)
We will replace qemu_getrampagesize() soon by a function that will always
return the maximum page size (not the minimum page size, which only
works by pure luck so far, as there are no memory backends).
Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190417113143.5551-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In order to handle TB's that translate to too much code, we
need to place the control of the length of the translation
in the hands of the code gen master loop.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
CPUClass method dump_statistics() takes an fprintf()-like callback and
a FILE * to pass to it. Most callers pass fprintf() and stderr.
log_cpu_state() passes fprintf() and qemu_log_file.
hmp_info_registers() passes monitor_fprintf() and the current monitor
cast to FILE *. monitor_fprintf() casts it right back, and is
otherwise identical to monitor_printf().
The callback gets passed around a lot, which is tiresome. The
type-punning around monitor_fprintf() is ugly.
Drop the callback, and call qemu_fprintf() instead. Also gets rid of
the type-punning, since qemu_fprintf() takes NULL instead of the
current monitor cast to FILE *.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-15-armbru@redhat.com>
The various TARGET_cpu_list() take an fprintf()-like callback and a
FILE * to pass to it. Their callers (vl.c's main() via list_cpus(),
bsd-user/main.c's main(), linux-user/main.c's main()) all pass
fprintf() and stdout. Thus, the flexibility provided by the (rather
tiresome) indirection isn't actually used.
Drop the callback, and call qemu_printf() instead.
Calling printf() would also work, but would make the code unsuitable
for monitor context without making it simpler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190417191805.28198-10-armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
kvm_s390_mem_op() can fail in two ways: when !cap_mem_op, it returns
-ENOSYS, and when kvm_vcpu_ioctl() fails, it returns -errno set by
ioctl(). Its caller s390_cpu_virt_mem_rw() recovers from both
failures.
kvm_s390_mem_op() prints "KVM_S390_MEM_OP failed" with error_printf()
in the latter failure mode. Since this is obviously a warning, use
warn_report().
Perhaps the reporting should be left to the caller. It could warn on
failure other than -ENOSYS.
Cc: Thomas Huth <thuth@redhat.com>
Cc: qemu-s390x@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190417190641.26814-9-armbru@redhat.com>
Combine all variant in a single handler. As source and destination
have different element sizes, we can't use gvec expansion. Expand
manually. Also watch out for overlapping source and destination
registers. Use a safe evaluation order depending on the operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-33-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Instead of checking e.g. the first access on every touched page, we should
check the actual access, otherwise we might get false positives when Low
Address Protection (LAP) is active. As probe_write() can only deal with
accesses to one page, we have to loop.
Use i64 for the length, although not needed - easier to reuse
TCG temps we already have in the translation functions where this will
be used. Also allow it to be used from other helpers.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-28-david@redhat.com>
[CH: add missing page_check_range()]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a big one. Luckily we only have a limited set of such nasty
instructions.
We'll implement all variants with helpers, except when sources and
the destination don't overlap for VECTOR PACK. Provide different helpers
when the cc is to be modified. We'll return the cc then via env->cc_op.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-20-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We cannot use gvec expansion as source and destination elements are
have different element numbers. So we'll expand using a fancy loop.
Also, we have to take care of overlapping source and destination
registers, therefore use a safe evaluation irder depending on the
operation.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-19-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Very similar to LOAD COUNT TO BLOCK BOUNDARY, but instead of only
calculating, the actual vector is loaded. Use a temporary vector to
not modify the real vector on exceptions. Initialize that one to zero,
to not leak any data. Provide a fast path if we're loading a full
vector.
As we don't have gvec ool handlers for single vectors, just calculate
the vector address manually.
We can reuse the helper later on for VECTOR LOAD WITH LENGTH. In fact,
we are going to name it "vll" right from the beginning, because that's
a better match.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Try to load the last element first. Access to the first element will
be checked afterwards. This way, we can guarantee that the vector is
not modified before we checked for all possible exceptions. (16 vectors
cannot cross more than two pages)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
To avoid an helper, we have to do the actual calculation of the element
address (offset in cpu_env + cpu_env) manually. Factor that out into
get_vec_element_ptr_i64(). The same logic will be reused for "VECTOR
LOAD VR ELEMENT FROM GR".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When loading from memory, load both elements into temps first before
modifying the target vector
Loading with strange alingment from the end of the address space will
not properly wrap, we can ignore that for now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's start with a more involved one, but it is the first in the list
of vector support instructions (introduced with the vector facility).
Good thing is, we need a lot of basic infrastructure for this. Reading
and writing vector elements as well as checking element validity.
All vector instruction related translation functions will reside in
translate_vx.inc.c, to be included in translate.c - similar to how
other architectures handle it.
While at it, directly add some documentation (which contains parts about
things added in follow-up patches, but splitting this up does not make
too much sense). Also add ES_* defines heavily used later.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Check them at a central point. We'll use a new instruction flag to
flag all vector instructions (IF_VEC) and handle it very similar to
AFP, whereby we use another unused position in the PSW mask to store
the state of vector register enablement per translation block.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
These are the new instruction formats related to vector instructions as
up to the z14 (a.k.a. latest PoP).
As v2 appeares (like x2 in VRX) with d2/b2 in VRV, we have to assign it a
higher field number to avoid collisions.
Properly take care of the MSB (to be able to address 32 registers) for
each vector register field stored in the RXB field (Bit 36 - 30 for all
vector instructions). As we have 32 bit vector registers and the
"v" fields are only 4 bit in size, the 5th bit is stored in the RXB.
We use a new type to indicate that the MSB has to be fetched from the
RXB.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190307121539.12842-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>