commit 28ffffe90896cbd655466b870b74d8304736a316
Author: Nicolas Morey <nmorey@suse.com>
Date:   Wed Jun 26 17:36:58 2024 +0200

    openucx s390x support

    Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

diff --git config/m4/ucm.m4 config/m4/ucm.m4
index e5e66266d695..ef7e4ede93ce 100644
--- config/m4/ucm.m4
+++ config/m4/ucm.m4
@@ -80,9 +80,20 @@ AC_CHECK_DECLS([SYS_ipc],
                [ipc_hooks_happy=no],
                [#include <sys/syscall.h>])
 
+
+SAVE_CFLAGS=$CFLAGS
+CFLAGS="$CFLAGS -Isrc/"
+bistro_arch_happy=yes
+AC_CHECK_DECLS([ucm_bistro_patch],
+               [],
+               [bistro_arch_happy=no],
+               [#include <ucm/bistro/bistro.h>])
+CFLAGS=$SAVE_CFLAGS
+
 AS_IF([test "x$mmap_hooks_happy" = "xyes"],
       AS_IF([test "x$ipc_hooks_happy" = "xyes" -o "x$shm_hooks_happy" = "xyes"],
-            [bistro_hooks_happy=yes]))
+            AS_IF([test "x$bistro_arch_happy" = "xyes"],
+                  [bistro_hooks_happy=yes])))
 
 AS_IF([test "x$bistro_hooks_happy" = "xyes"],
       [AC_DEFINE([UCM_BISTRO_HOOKS], [1], [Enable BISTRO hooks])],
diff --git src/ucm/Makefile.am src/ucm/Makefile.am
index fa7a722f2d31..e6df414a4ecb 100644
--- src/ucm/Makefile.am
+++ src/ucm/Makefile.am
@@ -34,6 +34,7 @@ noinst_HEADERS = \
 	bistro/bistro_aarch64.h \
 	bistro/bistro_ppc64.h \
-	bistro/bistro_rv64.h
+	bistro/bistro_rv64.h \
+	bistro/bistro_s390x.h
 
 libucm_la_SOURCES = \
 	event/event.c \
diff --git src/ucm/bistro/bistro.h src/ucm/bistro/bistro.h
index 8d0b90751676..a0b9d3f064c3 100644
--- src/ucm/bistro/bistro.h
+++ src/ucm/bistro/bistro.h
@@ -23,6 +23,8 @@ typedef struct ucm_bistro_restore_point ucm_bistro_restore_point_t;
 # include "bistro_x86_64.h"
 #elif defined(__riscv)
 # include "bistro_rv64.h"
+#elif defined(__s390x__)
+# include "bistro_s390x.h"
 #else
 # error "Unsupported architecture"
 #endif
diff --git src/ucm/bistro/bistro_s390x.h src/ucm/bistro/bistro_s390x.h
new file mode 100644
index 000000000000..2beb5de54fab
--- /dev/null
+++ src/ucm/bistro/bistro_s390x.h
@@ -0,0 +1,27 @@
+#ifndef UCM_BISTRO_BISTRO_S390X_H_
+#define UCM_BISTRO_BISTRO_S390X_H_
+
+#include <stdint.h>
+
+#include <ucs/type/status.h>
+#include <ucs/sys/compiler_def.h>
+
+#define UCM_BISTRO_PROLOGUE
+#define UCM_BISTRO_EPILOGUE
+
+typedef struct ucm_bistro_patch {
+} UCS_S_PACKED ucm_bistro_patch_t;
+typedef struct {
+} UCS_S_PACKED ucm_bistro_lock_t;
+
+static inline ucs_status_t ucm_bistro_patch(void *func_ptr, void *hook, const char *symbol,
+                                            void **orig_func_p,
+                                            ucm_bistro_restore_point_t **rp){
+    return UCS_ERR_UNSUPPORTED;
+}
+
+static inline void ucm_bistro_patch_lock(void * UCS_V_UNUSED dst)
+{
+}
+
+#endif
diff --git src/ucs/Makefile.am src/ucs/Makefile.am
index 4a05f47b6369..c1cd2fb2cb57 100644
--- src/ucs/Makefile.am
+++ src/ucs/Makefile.am
@@ -24,6 +24,7 @@ nobase_dist_libucs_la_HEADERS = \
 	arch/aarch64/bitops.h \
 	arch/ppc64/bitops.h \
 	arch/rv64/bitops.h \
+	arch/s390x/bitops.h \
 	arch/x86_64/bitops.h \
 	arch/bitops.h \
 	algorithm/crc.h \
@@ -87,6 +88,7 @@ nobase_dist_libucs_la_HEADERS = \
 	arch/generic/atomic.h \
 	arch/ppc64/global_opts.h \
 	arch/rv64/global_opts.h \
+	arch/s390x/global_opts.h \
 	arch/global_opts.h
 
 noinst_HEADERS = \
@@ -94,6 +96,7 @@ noinst_HEADERS = \
 	arch/generic/cpu.h \
 	arch/ppc64/cpu.h \
 	arch/rv64/cpu.h \
+	arch/s390x/cpu.h \
 	arch/x86_64/cpu.h \
 	arch/cpu.h \
 	config/ucm_opts.h \
@@ -149,6 +152,7 @@ libucs_la_SOURCES = \
 	algorithm/string_distance.c \
 	arch/aarch64/cpu.c \
 	arch/aarch64/global_opts.c \
+	arch/s390x/global_opts.c \
 	arch/ppc64/timebase.c \
 	arch/ppc64/global_opts.c \
 	arch/rv64/cpu.c \
diff --git src/ucs/arch/atomic.h src/ucs/arch/atomic.h
index 849647902fab..a328c37e2020 100644
--- src/ucs/arch/atomic.h
+++ src/ucs/arch/atomic.h
@@ -18,6 +18,8 @@
 # include "generic/atomic.h"
 #elif defined(__riscv)
 # include "generic/atomic.h"
+#elif defined(__s390x__)
+# include "generic/atomic.h"
 #else
 # error "Unsupported architecture"
 #endif
diff --git src/ucs/arch/bitops.h src/ucs/arch/bitops.h
index 3e0e530f1336..f887e03ebac0 100644
--- src/ucs/arch/bitops.h
+++ src/ucs/arch/bitops.h
@@ -23,6 +23,8 @@ BEGIN_C_DECLS
 # include "aarch64/bitops.h"
 #elif defined(__riscv)
 # include "rv64/bitops.h"
+#elif defined(__s390x__)
+# include "s390x/bitops.h"
 #else
 # error "Unsupported architecture"
 #endif
diff --git src/ucs/arch/cpu.c src/ucs/arch/cpu.c
index 307fb61bfc4a..4356fff36f8b 100644
--- src/ucs/arch/cpu.c
+++ src/ucs/arch/cpu.c
@@ -64,6 +64,10 @@ const ucs_cpu_builtin_memcpy_t ucs_cpu_builtin_memcpy[UCS_CPU_VENDOR_LAST] = {
         .min = UCS_MEMUNITS_INF,
         .max = UCS_MEMUNITS_INF
     },
+    [UCS_CPU_VENDOR_GENERIC_IBM] = {
+        .min = UCS_MEMUNITS_INF,
+        .max = UCS_MEMUNITS_INF
+    },
     [UCS_CPU_VENDOR_FUJITSU_ARM] = {
         .min = UCS_MEMUNITS_INF,
         .max = UCS_MEMUNITS_INF
@@ -89,6 +93,7 @@ const size_t ucs_cpu_est_bcopy_bw[UCS_CPU_VENDOR_LAST] = {
     [UCS_CPU_VENDOR_GENERIC_ARM] = UCS_CPU_EST_BCOPY_BW_DEFAULT,
     [UCS_CPU_VENDOR_GENERIC_PPC] = UCS_CPU_EST_BCOPY_BW_DEFAULT,
     [UCS_CPU_VENDOR_GENERIC_RV64G] = UCS_CPU_EST_BCOPY_BW_DEFAULT,
+    [UCS_CPU_VENDOR_GENERIC_IBM] = UCS_CPU_EST_BCOPY_BW_DEFAULT,
     [UCS_CPU_VENDOR_FUJITSU_ARM] = UCS_CPU_EST_BCOPY_BW_FUJITSU_ARM,
     [UCS_CPU_VENDOR_ZHAOXIN] = UCS_CPU_EST_BCOPY_BW_DEFAULT,
     [UCS_CPU_VENDOR_NVIDIA] = UCS_CPU_EST_BCOPY_BW_DEFAULT
@@ -183,6 +188,7 @@ const char *ucs_cpu_vendor_name()
     [UCS_CPU_VENDOR_GENERIC_ARM] = "Generic ARM",
     [UCS_CPU_VENDOR_GENERIC_PPC] = "Generic PPC",
     [UCS_CPU_VENDOR_GENERIC_RV64G] = "Generic RV64G",
+    [UCS_CPU_VENDOR_GENERIC_IBM] = "Generic IBM",
     [UCS_CPU_VENDOR_FUJITSU_ARM] = "Fujitsu ARM",
     [UCS_CPU_VENDOR_ZHAOXIN] = "Zhaoxin",
     [UCS_CPU_VENDOR_NVIDIA] = "Nvidia"
@@ -212,6 +218,7 @@ const char *ucs_cpu_model_name()
     [UCS_CPU_MODEL_ZHAOXIN_WUDAOKOU] = "Wudaokou",
     [UCS_CPU_MODEL_ZHAOXIN_LUJIAZUI] = "Lujiazui",
     [UCS_CPU_MODEL_RV64G] = "RV64G",
+    [UCS_CPU_MODEL_S390X] = "S390x",
     [UCS_CPU_MODEL_NVIDIA_GRACE] = "Grace"
 };
 
diff --git src/ucs/arch/cpu.h src/ucs/arch/cpu.h
index ca25e714d141..e97405c30d52 100644
--- src/ucs/arch/cpu.h
+++ src/ucs/arch/cpu.h
@@ -39,6 +39,7 @@ typedef enum ucs_cpu_model {
     UCS_CPU_MODEL_ZHAOXIN_WUDAOKOU,
     UCS_CPU_MODEL_ZHAOXIN_LUJIAZUI,
     UCS_CPU_MODEL_RV64G,
+    UCS_CPU_MODEL_S390X,
     UCS_CPU_MODEL_NVIDIA_GRACE,
     UCS_CPU_MODEL_LAST
 } ucs_cpu_model_t;
@@ -68,6 +69,7 @@ typedef enum ucs_cpu_vendor {
     UCS_CPU_VENDOR_AMD,
     UCS_CPU_VENDOR_GENERIC_ARM,
     UCS_CPU_VENDOR_GENERIC_PPC,
+    UCS_CPU_VENDOR_GENERIC_IBM,
     UCS_CPU_VENDOR_FUJITSU_ARM,
     UCS_CPU_VENDOR_ZHAOXIN,
     UCS_CPU_VENDOR_GENERIC_RV64G,
@@ -107,6 +109,8 @@ typedef struct ucs_cpu_builtin_memcpy {
 # include "aarch64/cpu.h"
 #elif defined(__riscv)
 # include "rv64/cpu.h"
+#elif defined(__s390x__)
+# include "s390x/cpu.h"
 #else
 # error "Unsupported architecture"
 #endif
diff --git src/ucs/arch/global_opts.h src/ucs/arch/global_opts.h
index 550d22b8b751..d8e4a7cca694 100644
--- src/ucs/arch/global_opts.h
+++ src/ucs/arch/global_opts.h
@@ -18,6 +18,8 @@
 # include "aarch64/global_opts.h"
 #elif defined(__riscv)
 # include "rv64/global_opts.h"
+#elif defined(__s390x__)
+# include "s390x/global_opts.h"
 #else
 # error "Unsupported architecture"
 #endif
diff --git src/ucs/arch/s390x/bitops.h src/ucs/arch/s390x/bitops.h
new file mode 100644
index 000000000000..ce48ff1ff451
--- /dev/null
+++ src/ucs/arch/s390x/bitops.h
@@ -0,0 +1,37 @@
+/**
+* Copyright (C) Mellanox Technologies Ltd. 2001-2015. ALL RIGHTS RESERVED.
+*
+* See file LICENSE for terms.
+*/
+
+#ifndef UCS_S390X_BITOPS_H_
+#define UCS_S390X_BITOPS_H_
+
+#include <stdint.h>
+
+
+static inline unsigned __ucs_ilog2_u32(uint32_t n)
+{
+    if (!n)
+        return 0;
+    return 31 - __builtin_clz(n);
+}
+
+static inline unsigned __ucs_ilog2_u64(uint64_t n)
+{
+    if (!n)
+        return 0;
+    return 63 - __builtin_clzll(n);
+}
+
+static UCS_F_ALWAYS_INLINE unsigned ucs_ffs32(uint32_t n)
+{
+    return __ucs_ilog2_u32(n & -n);
+}
+
+static inline unsigned ucs_ffs64(uint64_t n)
+{
+    return __ucs_ilog2_u64(n & -n);
+}
+
+#endif
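The bit helpers in this header reduce to count-leading-zeros builtins. Below is a minimal standalone sketch of the same technique, assuming GCC or Clang builtins; the `demo_` names are hypothetical and not part of UCX. The 64-bit variant uses `__builtin_clzll`, because `__builtin_clz` takes an `unsigned int` and would truncate a 64-bit argument.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the UCS helpers; not the UCX API. */

/* floor(log2(n)) for a 32-bit value; returns 0 for n == 0. */
static inline unsigned demo_ilog2_u32(uint32_t n)
{
    if (!n)
        return 0;
    return 31 - (unsigned)__builtin_clz(n);
}

/* floor(log2(n)) for a 64-bit value; needs the "long long" builtin. */
static inline unsigned demo_ilog2_u64(uint64_t n)
{
    if (!n)
        return 0;
    return 63 - (unsigned)__builtin_clzll(n);
}

/* Index of the lowest set bit: n & -n isolates that bit. */
static inline unsigned demo_ffs32(uint32_t n)
{
    return demo_ilog2_u32(n & -n);
}

static inline unsigned demo_ffs64(uint64_t n)
{
    return demo_ilog2_u64(n & -n);
}
```

With these definitions, `demo_ilog2_u64(1ULL << 40)` evaluates to 40 and `demo_ffs32(0x18)` to 3. A zero argument yields 0 in every helper, so callers that must distinguish "no bit set" from "bit 0 set" have to test for zero themselves.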
diff --git src/ucs/arch/s390x/cpu.h src/ucs/arch/s390x/cpu.h
new file mode 100644
index 000000000000..033f58f7c047
--- /dev/null
+++ src/ucs/arch/s390x/cpu.h
@@ -0,0 +1,84 @@
+/**
+* Copyright (C) Mellanox Technologies Ltd. 2001-2013. ALL RIGHTS RESERVED.
+* Copyright (C) ARM Ltd. 2016-2017. ALL RIGHTS RESERVED.
+*
+* See file LICENSE for terms.
+*/
+
+
+#ifndef UCS_S390X_CPU_H_
+#define UCS_S390X_CPU_H_
+
+#include <ucs/sys/compiler.h>
+#include <ucs/arch/generic/cpu.h>
+#include <stdint.h>
+#include <string.h>
+#include <ucs/type/status.h>
+
+
+#define UCS_ARCH_CACHE_LINE_SIZE 256
+
+BEGIN_C_DECLS
+
+/* Compiler-only barrier: no hardware fence instruction is emitted */
+#define ucs_memory_bus_fence() asm volatile (""::: "memory")
+#define ucs_memory_bus_store_fence() ucs_memory_bus_fence()
+#define ucs_memory_bus_load_fence() ucs_memory_bus_fence()
+#define ucs_memory_bus_wc_flush() ucs_memory_bus_fence()
+#define ucs_memory_cpu_fence() ucs_memory_bus_fence()
+#define ucs_memory_cpu_store_fence() ucs_memory_bus_fence()
+#define ucs_memory_cpu_load_fence() ucs_memory_bus_fence()
+#define ucs_memory_cpu_wc_fence() ucs_memory_bus_fence()
+
+
+static inline uint64_t ucs_arch_read_hres_clock()
+{
+    unsigned long clk;
+    asm volatile("stck %0" : "=Q" (clk) : : "cc");
+    return clk >> 2;
+}
+#define ucs_arch_get_clocks_per_sec ucs_arch_generic_get_clocks_per_sec
+
+
+static inline ucs_cpu_model_t ucs_arch_get_cpu_model()
+{
+    return UCS_CPU_MODEL_S390X;
+}
+
+static inline ucs_cpu_vendor_t ucs_arch_get_cpu_vendor()
+{
+    return UCS_CPU_VENDOR_GENERIC_IBM;
+}
+
+static inline int ucs_arch_get_cpu_flag()
+{
+    return UCS_CPU_FLAG_UNKNOWN;
+}
+
+double ucs_arch_get_clocks_per_sec();
+
+#define ucs_arch_wait_mem ucs_arch_generic_wait_mem
+
+static inline void ucs_cpu_init()
+{
+}
+
+static inline void *ucs_memcpy_relaxed(void *dst, const void *src, size_t len)
+{
+    return memcpy(dst, src, len);
+}
+
+static UCS_F_ALWAYS_INLINE void
+ucs_memcpy_nontemporal(void *dst, const void *src, size_t len)
+{
+    memcpy(dst, src, len);
+}
+
+static inline ucs_status_t ucs_arch_get_cache_size(size_t *cache_sizes)
+{
+    return UCS_ERR_UNSUPPORTED;
+}
+
+END_C_DECLS
+
+#endif
diff --git src/ucs/arch/s390x/global_opts.c src/ucs/arch/s390x/global_opts.c
new file mode 100644
index 000000000000..4fa0c74034a7
--- /dev/null
+++ src/ucs/arch/s390x/global_opts.c
@@ -0,0 +1,24 @@
+/**
+* Copyright (C) Mellanox Technologies Ltd. 2019. ALL RIGHTS RESERVED.
+*
+* See file LICENSE for terms.
+*/
+
+#if defined(__s390x__)
+
+#ifdef HAVE_CONFIG_H
+# include "config.h"
+#endif
+
+#include <ucs/arch/global_opts.h>
+#include <ucs/config/parser.h>
+
+ucs_config_field_t ucs_arch_global_opts_table[] = {
+    {NULL}
+};
+
+void ucs_arch_print_memcpy_limits(ucs_arch_global_opts_t *config)
+{
+}
+
+#endif
diff --git src/ucs/arch/s390x/global_opts.h src/ucs/arch/s390x/global_opts.h
new file mode 100644
index 000000000000..225e4e5e896a
--- /dev/null
+++ src/ucs/arch/s390x/global_opts.h
@@ -0,0 +1,25 @@
+/**
+* Copyright (C) Mellanox Technologies Ltd. 2019. ALL RIGHTS RESERVED.
+*
+* See file LICENSE for terms.
+*/
+
+
+#ifndef UCS_S390X_GLOBAL_OPTS_H_
+#define UCS_S390X_GLOBAL_OPTS_H_
+
+#include <ucs/sys/compiler_def.h>
+
+BEGIN_C_DECLS
+
+#define UCS_ARCH_GLOBAL_OPTS_INITALIZER {}
+
+/* built-in memcpy config */
+typedef struct ucs_arch_global_opts {
+    char dummy;
+} ucs_arch_global_opts_t;
+
+END_C_DECLS
+
+#endif
+
diff --git src/ucs/sys/sys.c src/ucs/sys/sys.c
index 42ff75f64af5..b22418e3f4b0 100644
--- src/ucs/sys/sys.c
+++ src/ucs/sys/sys.c
@@ -1258,8 +1258,19 @@ void *ucs_sys_realloc(void *old_ptr, size_t old_length, size_t new_length)
     if (old_ptr == NULL) {
         /* Note: Must pass the 0 offset as "long", otherwise it will be
          * partially undefined when converted to syscall arguments */
+#if defined(__s390x__)
+        long int _args[6] = {
+            (long int) NULL,
+            (long int) new_length,
+            (long int) PROT_READ|PROT_WRITE,
+            (long int) MAP_PRIVATE|MAP_ANONYMOUS,
+            (long int) -1,
+            (long int) 0ul};
+        ptr = (void*)syscall(__NR_mmap, _args);
+#else
         ptr = (void*)syscall(__NR_mmap, NULL, new_length, PROT_READ|PROT_WRITE,
                              MAP_PRIVATE|MAP_ANONYMOUS, -1, 0ul);
+#endif
         if (ptr == MAP_FAILED) {
             ucs_log_fatal_error("mmap(NULL, %zu, READ|WRITE, PRIVATE|ANON) failed: %m",
                                 new_length);
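The sys.c hunk relies on the s390x convention in which `__NR_mmap` receives a single pointer to a six-word argument block instead of six separate arguments. Below is a minimal sketch of the same pattern outside UCX, assuming Linux with glibc; `demo_raw_anon_mmap` is a hypothetical name, and only the non-s390x branch is exercised on other architectures.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper, not part of UCX: map anonymous private memory through
 * the raw syscall, bypassing any mmap() wrapper in libc. */
static void *demo_raw_anon_mmap(size_t length)
{
#if defined(__s390x__)
    /* s390x: all six arguments are read from one block in memory */
    long args[6] = {
        0,                                   /* addr: let the kernel choose */
        (long)length,
        (long)(PROT_READ | PROT_WRITE),
        (long)(MAP_PRIVATE | MAP_ANONYMOUS),
        -1,                                  /* fd: unused for anonymous maps */
        0                                    /* offset */
    };
    return (void*)syscall(__NR_mmap, args);
#else
    /* The offset is passed as long so every argument word is fully defined */
    return (void*)syscall(__NR_mmap, NULL, length, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0L);
#endif
}
```

The result can be compared against `MAP_FAILED` exactly as with `mmap()`, because `syscall()` returns -1 and sets `errno` on failure.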