81 Commits

Author SHA256 Message Date
44641d0495 Accepting request 1298351 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1298351
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=37
2025-08-09 17:58:51 +00:00
daadbbb50e Accepting request 1285180 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1285180
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=36
2025-06-13 16:42:51 +00:00
e2fae99408 - Update to ucx 1.18.1
  - CUDA
    - Added config keys to update cuda_copy bandwidth for coherent platforms
    - Improved cache invalidation of memory allocated using CUDA memory pool
  - AZP
    - Added Ubuntu 24.04 to build and release pipeline
  - UCP
    - Fixed assertion failure when maximum lane fragment is smaller than AM header
    - Fixed potential active message user header use after free with protocol reconfiguration
  - CUDA
    - Fixed registration of CUDA Fabric memory allocated by UCT
    - Fixed VA recycling check of memory allocated using VMM and CUDA memory pool
  - RDMA CORE (IB, ROCE, etc.)
    - Do not use ConnectX-8 SMI subdevices for communication
    - Fixed remote access error by disabling ODP when the device supports DDP
    - Fixed configuration logic by disabling DDP when AR is disabled
  - UCM
    - Fixed crash with bistro hooks for CUDA 12.9 on amd64

OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=80
2025-06-12 14:32:39 +00:00
89e2c1bb0f Accepting request 1277496 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1277496
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=35
2025-05-23 12:29:12 +00:00
99815d77b5 add patches to fix gcc-15 compile errors (boo#1241939)
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=78
2025-05-14 20:27:27 +00:00
aa486005dd Accepting request 1266178 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1266178
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=34
2025-04-02 15:09:07 +00:00
46d315ac9e - Add UCT-IB-UD-Use-GRH-to-detect-address-family-on-non-Mellanox-hardware.patch
  to fix a UD init issue on non-Mellanox RDMA HW (bsc#1240204).

OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=76
2025-04-01 13:23:59 +00:00
7905fb8b39 Accepting request 1247274 from science:HPC
- Update to ucx 1.18.0
  - UCP
    - Enabled using CUDA staging buffers for pipeline protocols by default
    - Added endpoint reconfiguration support for non-reused p2p scenarios
    - Enabled non-cacheable memory domains, activated for gdr_copy
    - Added user_data parameter to ucp_ep_query
    - Added support for host memory pipeline through CUDA buffers for rendezvous protocol
    - Added global VA infrastructure and memory region in absence of error handling
    - Made protocol performance node names more informative
    - Enforced always running on the same thread in single thread mode
    - Multiple improvements in protocols selection infrastructure
    - Added UCP_MEM_MAP_LOCK API flag to enforce locked memory mapping
    - Allowed up to 64 endpoint lanes for systems with many transports or devices
    - Added usage tracker to worker
    - Improved various logging messages
    - Fixed stack overflow in exported rkey unpack
    - Removed extra remote-cpu overhead from protocol estimation for zcopy
    - Fixed performance estimation for rndv pipeline protocols
    - Fixed ATP sending by picking the correct lane
    - Fixed missing reg_id on memh creation
    - Fixed repeated invalidations by retaining existing access flags
    - Fixed abort reason propagation for rendezvous RTR mtype
    - Do not check transport availability if it is disabled by UCX_TLS environment variable
    - Fixed wrong flag being used for checking BCOPY capability
    - Fixed sending too many ATPs for small messages
    - Enforced 16-bit size for Active Message identifiers
    - Fixed unnecessary status check for emulated AMO
    - Fixed more than one fragment sending in rendezvous pipeline
    - Fixed crash by using biggest max frag across all lanes
    - Fixed missing memory handle flags by copying from parent to child

OBS-URL: https://build.opensuse.org/request/show/1247274
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=33
2025-02-20 15:28:03 +00:00
145da08ae6 - Refresh openucx-s390x-support.patch due to API changes
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=74
2025-02-20 06:38:45 +00:00
222004fc02 - Update to ucx 1.18.0
  - UCP
    - Enabled using CUDA staging buffers for pipeline protocols by default
    - Added endpoint reconfiguration support for non-reused p2p scenarios
    - Enabled non-cacheable memory domains, activated for gdr_copy
    - Added user_data parameter to ucp_ep_query
    - Added support for host memory pipeline through CUDA buffers for rendezvous protocol
    - Added global VA infrastructure and memory region in absence of error handling
    - Made protocol performance node names more informative
    - Enforced always running on the same thread in single thread mode
    - Multiple improvements in protocols selection infrastructure
    - Added UCP_MEM_MAP_LOCK API flag to enforce locked memory mapping
    - Allowed up to 64 endpoint lanes for systems with many transports or devices
    - Added usage tracker to worker
    - Improved various logging messages
    - Fixed stack overflow in exported rkey unpack
    - Removed extra remote-cpu overhead from protocol estimation for zcopy
    - Fixed performance estimation for rndv pipeline protocols
    - Fixed ATP sending by picking the correct lane
    - Fixed missing reg_id on memh creation
    - Fixed repeated invalidations by retaining existing access flags
    - Fixed abort reason propagation for rendezvous RTR mtype
    - Do not check transport availability if it is disabled by UCX_TLS environment variable
    - Fixed wrong flag being used for checking BCOPY capability
    - Fixed sending too many ATPs for small messages
    - Enforced 16-bit size for Active Message identifiers
    - Fixed unnecessary status check for emulated AMO
    - Fixed more than one fragment sending in rendezvous pipeline
    - Fixed crash by using biggest max frag across all lanes
    - Fixed missing memory handle flags by copying from parent to child

OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=73
2025-02-19 20:35:36 +00:00
50926fe318 Accepting request 1199376 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1199376
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=32
2024-09-09 12:43:20 +00:00
9f2cde7a87 - Refresh openucx-s390x-support.patch to fix compilation on s390x
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=71
2024-09-07 14:26:13 +00:00
1aaa6114cd Accepting request 1184228 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1184228
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=31
2024-07-03 18:26:35 +00:00
8094d4b34d - Enable build on riscv64
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=69
2024-07-01 08:27:55 +00:00
2b4398a74a Accepting request 1183479 from science:HPC
- Update to 1.17.0
  - See NEWS for the complete CHANGELOG
- Refresh openucx-s390x-support.patch against the latest sources
- Add upstream fix UCS-TIME-Add-math.h-to-provide-INFINITY.patch
  to fix compilation on ppc64

OBS-URL: https://build.opensuse.org/request/show/1183479
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=30
2024-06-29 13:16:13 +00:00
cfaa4352a9 - Update to 1.17.0
  - See NEWS for the complete CHANGELOG
- Refresh openucx-s390x-support.patch against the latest sources
- Add upstream fix UCS-TIME-Add-math.h-to-provide-INFINITY.patch
  to fix compilation on ppc64

OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=67
2024-06-26 17:49:24 +00:00
d7ff57612d Accepting request 1151438 from science:HPC
Prepare for RPM 4.20 (forwarded request 1151423 from dimstar)

OBS-URL: https://build.opensuse.org/request/show/1151438
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=29
2024-02-27 21:44:20 +00:00
0835e04bcc Accepting request 1151423 from home:dimstar:rpm4.20:o
Prepare for RPM 4.20

OBS-URL: https://build.opensuse.org/request/show/1151423
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=65
2024-02-26 12:52:59 +00:00
0a42199aad Accepting request 1116008 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1116008
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=28
2023-10-08 10:17:06 +00:00
2a1a111b03 Accepting request 1115979 from home:NMorey:branches:science:HPC
- Update to 1.15.0
  - UCP
    - Added 2-stage pipeline protocol in the new protocol infrastructure
    - Added reset and abort functionality of rendezvous protocols in the
      new infrastructure
    - Added zero-copy rendezvous data send protocol in the new infrastructure
    - Added support for user memory handle in the new protocol infrastructure
    - Added option to force ODP registration for certain memory types
    - Enabled lock free memory region deregistration
    - Updated allow/deny transport list feature to control auxiliary transport selection
    - Multiple performance improvements of the new protocol infrastructure
    - Multiple improvements in error and debug messages
    - Fixed assertion when sending from non-contiguous GPU buffer to managed buffer
    - Fixed the race condition on endpoint configurations
    - Fixed endpoint reconfiguration issues due to asymmetrical selection
    - Fixed endpoint reconfiguration error due to wrong locality detection
    - Fixed crash during connection manager cleanup
    - Fixed rkey index calculation for rendezvous protocol
    - Fixed rcache dump function
    - Removed logging from rkey unpack in release mode
    - Fixed double free of rkey in rendezvous protocol
    - Fixed rendezvous pipeline protocol error flow
    - Fixed error handling in rendezvous get zcopy protocol
    - Replay pending requests of wireup EP CM during connection establishment
      to prevent potential ordering issues and wrong configuration
    - Pass user-provided memory type to the function that checks whether the buffer
      can be sent inline or not
    - Avoid memory registration during UCP context initialization
    - Fixed CPU/device atomics selection in the new protocol infrastructure
    - Multiple fixes in the new protocol infrastructure information output

OBS-URL: https://build.opensuse.org/request/show/1115979
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=63
2023-10-06 09:59:22 +00:00
ba3eec4113 Accepting request 1100646 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1100646
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=27
2023-07-26 11:22:10 +00:00
7d6841ca26 Accepting request 1100640 from home:NMorey:branches:science:HPC
- Update to v1.14.1
  - Fixed ROCm to prevent the locking of host pinned memory
  - Added CUDA 12 based UCX builds to the release flow
  - Increased the maximal number of endpoint configurations
  - Fixed filter for slow lanes in selection logic
  - Fixed TCP transport bandwidth calculation
  - Fixed device detection for ROCM
  - Fixed compatibility with CUDA 12
  - Fixed rendezvous threshold for multi-path configurations
  - Fixed error message in case of static link
  - Fixed BlueField-3 detection
  - Multiple fixes for Azure CI pipeline

OBS-URL: https://build.opensuse.org/request/show/1100640
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=61
2023-07-25 13:54:29 +00:00
8a8941ab4f Accepting request 1075600 from science:HPC
- Update to v1.14.0
  - UCP
    - Added API for querying transport and device names on endpoint
    - Added API for querying datatype object
    - Added API for exporting and importing memory keys (no implementation yet)
    - Added support for non-persistent active message header
    - Added infrastructure to print protocols v2 performance
    - Multiple performance improvements for protocols v2
    - Added support for non-contiguous datatypes for rendezvous protocols v2
    - Added support for reset and abort request in protocols v2
    - Added support for user memory handles in RMA API
    - Added multi-rail support for RMA API in protocols v2
    - Added support for up to 16 different lanes per endpoint
    - Added support for dmabuf memory registration in protocols v2
    - Added strong fence mode for ucp_worker_fence() API
  - UCT
    - Added new uct_md_mem_attach() API to support exported memory handles
    - Added remote completion mode for endpoint flush (via new flag)
    - Added support for dmabuf registration
    - Added new uct_ep_connect_to_ep_v2() API
    - Added new uct_mem_reg_v2() API
    - Added new uct_md_query_v2() API
    - Added support for IPv6 loopback address in TCP transport
  - RDMA CORE (IB, ROCE, etc.)
    - Added ECE (enhanced connection establishment) support for RC and DC transports
    - Added support for hardware DCS in DC transport
    - Added UD interface and endpoint resource information to VFS
    - Added CQ creation via DEVX API
    - Removed support for accelerated IB transports over legacy experimental verbs
  - UCS
    - Added support for auto-correction of user environment variables
  - UCM
    - Implemented CUDA bistro hooks for aarch64 (to enable memory cache on this platform)
    - Added support for CUDA virtual/stream-ordered memory with cudaMallocAsync
  - Documentation
    - Added FAQ for using pkg-config tool to build applications with UCX
  - Tools
    - Added runtime library version to the 'ucx_info -v' output
    - Added support for memory types in ucx_info
  - Many bugfixes. See NEWS.
- Drop patches merged upstream:
  - UCS-DEBUG-replace-PTR-with-void.patch
  - gcc13-fix.patch
- Refresh openucx-s390x-support.patch

OBS-URL: https://build.opensuse.org/request/show/1075600
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=26
2023-04-01 21:26:51 +00:00
a42d04ee36 Remove remaining gcc13 patch
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=59
2023-03-30 16:49:05 +00:00
b714ee86f4 - Add gcc13-fix.patch for GCC13 support
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=58
2023-03-29 10:50:52 +00:00
6a412379a9 Accepting request 1075167 from home:NMorey:branches:science:HPC
- Update to v1.14.0
  - UCP
    - Added API for querying transport and device names on endpoint
    - Added API for querying datatype object
    - Added API for exporting and importing memory keys (no implementation yet)
    - Added support for non-persistent active message header
    - Added infrastructure to print protocols v2 performance
    - Multiple performance improvements for protocols v2
    - Added support for non-contiguous datatypes for rendezvous protocols v2
    - Added support for reset and abort request in protocols v2
    - Added support for user memory handles in RMA API
    - Added multi-rail support for RMA API in protocols v2
    - Added support for up to 16 different lanes per endpoint
    - Added support for dmabuf memory registration in protocols v2
    - Added strong fence mode for ucp_worker_fence() API
  - UCT
    - Added new uct_md_mem_attach() API to support exported memory handles
    - Added remote completion mode for endpoint flush (via new flag)
    - Added support for dmabuf registration
    - Added new uct_ep_connect_to_ep_v2() API
    - Added new uct_mem_reg_v2() API
    - Added new uct_md_query_v2() API
    - Added support for IPv6 loopback address in TCP transport
  - RDMA CORE (IB, ROCE, etc.)
    - Added ECE (enhanced connection establishment) support for RC and DC transports
    - Added support for hardware DCS in DC transport
    - Added UD interface and endpoint resource information to VFS
    - Added CQ creation via DEVX API
    - Removed support for accelerated IB transports over legacy experimental verbs
  - UCS
    - Added support for auto-correction of user environment variables
  - UCM
    - Implemented CUDA bistro hooks for aarch64 (to enable memory cache on this platform)
    - Added support for CUDA virtual/stream-ordered memory with cudaMallocAsync
  - Documentation
    - Added FAQ for using pkg-config tool to build applications with UCX
  - Tools
    - Added runtime library version to the 'ucx_info -v' output
    - Added support for memory types in ucx_info
  - Many bugfixes. See NEWS.
- Drop patches merged upstream:
  - UCS-DEBUG-replace-PTR-with-void.patch
  - gcc13-fix.patch
- Refresh openucx-s390x-support.patch

OBS-URL: https://build.opensuse.org/request/show/1075167
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=57
2023-03-29 08:50:48 +00:00
61b71445ce Accepting request 1069629 from science:HPC
- Add upstream fix gcc13-fix.patch. (forwarded request 1069627 from marxin)

OBS-URL: https://build.opensuse.org/request/show/1069629
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=25
2023-03-07 15:48:49 +00:00
1c9eb00a8e Accepting request 1069627 from home:marxin:branches:science:HPC
- Add upstream fix gcc13-fix.patch.

OBS-URL: https://build.opensuse.org/request/show/1069627
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=55
2023-03-06 12:24:21 +00:00
1c024f5f2d Accepting request 1058681 from science:HPC
- openucx-s390x-support.patch: fix use of clz builtin for 64-bit value (forwarded request 1058654 from Andreas_Schwab)

OBS-URL: https://build.opensuse.org/request/show/1058681
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=24
2023-01-17 16:34:47 +00:00
dfc3070ec1 Accepting request 1058654 from home:Andreas_Schwab:Factory
- openucx-s390x-support.patch: fix use of clz builtin for 64-bit value

OBS-URL: https://build.opensuse.org/request/show/1058654
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=53
2023-01-16 11:22:10 +00:00
ec8c3382db Accepting request 1008219 from science:HPC
- Update openucx-s390x-support.patch to add missing ucs_ffs32 on s390x
- Drop baselibs.conf as openucx only works on 64-bit systems

OBS-URL: https://build.opensuse.org/request/show/1008219
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=23
2022-10-10 16:44:15 +00:00
Nicolas Morey-Chaisemartin
8322be19fe Accepting request 1008118 from home:NMoreyChaisemartin:branches:science:HPC
- Drop baselibs.conf as openucx only works on 64-bit systems

OBS-URL: https://build.opensuse.org/request/show/1008118
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=51
2022-10-05 07:28:51 +00:00
Nicolas Morey-Chaisemartin
d485735431 Accepting request 1008115 from home:NMoreyChaisemartin:branches:science:HPC
- Update openucx-s390x-support.patch to add missing ucs_ffs32 on s390x

OBS-URL: https://build.opensuse.org/request/show/1008115
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=50
2022-10-05 07:13:29 +00:00
54dbb80402 Accepting request 1007003 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1007003
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=22
2022-10-03 11:44:06 +00:00
Nicolas Morey-Chaisemartin
878438d42d Accepting request 1006486 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.13.1 (jsc#PED-912)
  - Core
    - Added new objects to VFS: local and remote address of endpoint,
      statistics of ucp_ep_create success/failure, failed/destroyed endpoints
    - Added support for UCX static libraries
    - Added profiling for rkey management routines
    - PCIe relaxed order enabled by default for AMD CPUs
    - Fixed not deallocating memory from ucp_mem_unmap if no rcache
    - Fixed versioning infrastructure
    - Multiple code improvements: refactoring, debug prints and assertions, etc.
    - Multiple improvements in build, test and docs infrastructure
    - Added new objects to VFS (md, component, log_level, etc.)
    - Added configuration variable to specify which loadable modules are allowed
    - Added build-time configuration to disable sigaction overriding
  - UCP
    - Added API to pass pre-registered memory handle to UCP operations
    - Added implementation of AM rendezvous protocol
    - Added 2-stage pipeline rendezvous protocol for GPU
    - Added support for fragment mem_type for v1 pipeline proto, disabled by default
    - Added active message support for proto v2
    - Added UCP memory registration cache
    - Improved adaptive progress - deactivate iface when all p2p lanes are destroyed
    - Added support for user memh in proto_v1
    - Added support for selecting local address when creating a client endpoint
    - Added option to limit GPUDirectRDMA size in rendezvous protocol, UCX_RNDV_MEMTYPE_DIRECT_SIZE
    - Deprecated UCX_SOCKADDR_AUX_TLS configuration parameter
    - Resolving remote EP ID when creating local EP disabled by default
    - Added client_id to ucp_worker_create() and ucp_conn_request_query() APIs
    - Added ucp_worker_address_query() API
    - Updated ucp_ep_query() API for getting local and remote addresses
    - Added address versioning to correctly preserve wire compatibility starting from version 1.11.0
    - Added new client/server connection establishment packet header format
    - Enabled rendezvous and tag sync protocols when error handling is enabled on the endpoint
    - Added iov zcopy support to RMA operations
    - Reduced memory usage of unexpected messages by fitting receive buffer size to packet size
    - Added support for modifying UCT and UCS configs by ucp_config_modify() API
    - Optimized unpacked rkeys memory consumption
    - Added request flag to influence latency vs. bandwidth protocol
    - Reduced memory management overhead with new protocols
    - Improved performance calculations for new protocols
    - Added AMO support with GPU memory target using new protocols
    - Added put_zcopy, get_zcopy and pipeline based rendezvous in new protocols
    - Added support for user-defined alignment in Active Messages
    - Added support for offload tag sync in new protocols
    - Updated ucp_atomic_post() to use NBX flow
  - UCT
    - Introduced API uct_md_mkey_pack_v2
    - Introduced UCT iface features API
    - Introduced max_inflight_eps parameter in perf_attr API
    - Introduced UCT_SEND_FLAG_PEER_CHECK flag that forces checking connectivity to a peer
    - Introduced UCX_RCACHE_PURGE_ON_FORK to enable/disable cleaning regions when application is forking
    - Disabled PEER_FAILURE capability for XPMEM
    - Added API - uct_iface_is_reachable_v2()
    - Added IPv6 address support in TCP
    - Added latency estimation to uct_iface_estimate_perf()
    - Adjusted knem and cma overhead cost
    - Increased built-in TCP keep-alive interval to 2 seconds
  - RDMA CORE (IB, ROCE, etc.)
    - Introduced NDR autorecognition
    - Introduced CQE zipping support
    - Set the default MAX_RD_ATOMIC to maximum value supported by the hardware
    - Disabled mlx5 ifaces on verbs MD
    - Added detection of IB NDR devices
    - Added check for CQ overrun in assert mode
    - Added bitmap usage for releasing detached DCIs
    - Added configuration for requests ack frequency with DevX
    - Added remote QP info to tx error CQE traces
  - ROCM
    - Increased maximum number of HSA agents
  - UCS
    - Added topo module infrastructure
    - Added memtrack and rcache information to VFS
    - Added API for a per-process aggregate-sum statistics report
    - Added memory pool set data structure
    - Added new ptr_array API for bulk allocation
    - Added ucs_string_buffer_append_flags() for string buffer
    - Added ucs_ffs32()
    - Added ucs_vsnprintf_safe() which always adds '\0'
    - Added thread-safe put to ptr_map
    - Improved accuracy of the topology distance estimation
    - Added prints of leaked callbacks from the callback queue
    - Removed a diagnostic message when fuse thread is stopped
    - Added configurable limit for the memory consumed by rcache
    - Added configuration for VFS(FUSE) thread affinity
    - Added memory limit support to memtrack
  - Packaging
    - Added cmake config files for better integration with external cmake based projects
  - Tools
    - Added loop-back transport support in ucx_perftest
    - Split ucx_perftest into separate modules
    - Added process placement option for ucx_info
    - Extended parameters correctness check in ucx_perftest
- Backported UCS-DEBUG-replace-PTR-with-void.patch
  from upstream to fix compilation

OBS-URL: https://build.opensuse.org/request/show/1006486
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=48
2022-09-29 15:27:45 +00:00
98063d874c Accepting request 946105 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/946105
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=21
2022-01-14 22:12:37 +00:00
Nicolas Morey-Chaisemartin
6e22959692 Accepting request 946104 from home:NMoreyChaisemartin:branches:science:HPC
- Fix UCM bistro support on non-s390x archs
- Add ucm-fix-UCX_MEM_MALLOC_RELOC.patch to disable malloc relocations by default (bsc#1194369)

OBS-URL: https://build.opensuse.org/request/show/946104
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=47
2022-01-13 11:45:07 +00:00
21f5083b95 Accepting request 921703 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/921703
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=20
2021-09-30 21:42:59 +00:00
Nicolas Morey-Chaisemartin
643404b991 Accepting request 921702 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.11.1 (jsc#SLE-19260)

  - Core:
    - Added support for UCX monitoring using virtual file system (VFS)/FUSE
    - Added support for applications with static CUDA runtime linking
    - Added support for a configuration file
    - Updated clang format configuration
  - UCP
    - Added rendezvous API for active messages
    - Added user-defined name to context, worker, and endpoint objects
    - Added flag to silence request leak check
    - Added API for endpoint performance evaluation
    - Added API - ucp_request_query
    - Added API - ucp_lib_query
    - Added bandwidth optimizations for new protocols multi-lane
    - Added support for multi-rail over lanes with BW ratio >= 1/4
    - Added support for tracking outstanding requests and aborting those in case of connection failure
    - Refactored keep-alive protocol
    - Added device id to wireup protocol
    - Added support for up to 128 transport layer resources in UCP context
    - Added support for CUDA memory allocations with ucp_mem_map
    - Increased UCP_WORKER_MAX_EP_CONFIG to 64
    - Adjusted memory type zcopy threshold when UCX_ZCOPY_THRESH set
    - Refactored wireup protocols, rendezvous, get, zcopy protocols
    - Added put zcopy multi-rail
    - Improved logging for new protocols
    - Added system topology information
    - Added new protocols for eager offload protocols
  - UCT
    - Extended connection establishment API

OBS-URL: https://build.opensuse.org/request/show/921702
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=46
2021-09-27 09:00:18 +00:00
Richard Brown
b01e11bc13 Accepting request 874910 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/874910
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=19
2021-03-02 11:25:29 +00:00
Nicolas Morey-Chaisemartin
cc6c36d10f Accepting request 874909 from home:NMoreyChaisemartin:branches:science:HPC
- Update openucx-s390x-support.patch to fix mmap syscall on s390x (bsc#1182691)

OBS-URL: https://build.opensuse.org/request/show/874909
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=44
2021-02-24 17:24:21 +00:00
3b5acc2b06 Accepting request 840387 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/840387
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=18
2020-10-11 18:15:04 +00:00
Nicolas Morey-Chaisemartin
f10927b874 Accepting request 840386 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.9.0 (jsc#SLE-15163)
  - Features:
    - Added a new class of communication APIs '*_nbx' that enables API extensibility while
      preserving ABI backward compatibility
    - Added asynchronous event support to UCT/IB/DEVX
    - Added support for latest CUDA library version
    - Added NAK-based reliability protocol for UCT/IB/UD to optimize resends
    - Added new tests for ROCm
    - Added new configuration parameters for protocol selection
    - Added performance optimization for Fujitsu A64FX with InfiniBand
    - Added performance optimization for clear cache code on aarch64
    - Added support for relaxed-order PCIe access in IB RDMA transports
    - Added new TCP connection manager
    - Added support for UCT/IB PKey with partial membership in IB transports
    - Added support for RoCE LAG
    - Added support for ROCm 3.7 and above
    - Added flow control for RDMA read operations
    - Improved endpoint flush implementation for UCT/IB
    - Improved UD timer to avoid interrupting the main thread when not in use
    - Improved latency estimation for network path with CUDA
    - Improved error reporting messages
    - Improved performance in active message flow (removed malloc call)
    - Improved performance in ptr_array flow
    - Improved performance in UCT/SM progress engine flow
    - Improved I/O demo code
    - Improved rendezvous protocol for CUDA
    - Updated examples code
  - Bugfixes:
    - Fixes for most recent versions of GCC, CLANG, ARMCLANG, PGI
    - Fixes in UCT/IB for strict order keys

OBS-URL: https://build.opensuse.org/request/show/840386
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=42
2020-10-09 06:50:44 +00:00
2da20e0a3c Accepting request 822283 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/822283
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=17
2020-07-26 14:15:03 +00:00
Nicolas Morey-Chaisemartin
b4e3d46395 Accepting request 822282 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.8.1
  - Features:
    - Added binary release pipeline in Azure CI
  - Bugfixes:
    - Multiple fixes in testing environment
    - Fixes in InfiniBand DEVX transport
    - Fixes in memory management for CUDA IPC transport
    - Fixes for binutils 2.34+
    - Fixes for AMD ROCM build environment

OBS-URL: https://build.opensuse.org/request/show/822282
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=40
2020-07-22 15:44:37 +00:00
46c68d5620 Accepting request 811726 from science:HPC
- Update to v1.8.0

OBS-URL: https://build.opensuse.org/request/show/811726
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=16
2020-06-09 22:33:43 +00:00
b3b5e27527 - Trim bias and filler wording from descriptions.
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=38
2020-06-05 10:06:01 +00:00
9033dd246f Accepting request 811684 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.8.0
  - Features:
    - Improved detection for DEVX support
    - Improved TCP scalability
    - Added support for ROCM to perftest
    - Added support for different source and target memory types to perftest
    - Added optimized memcpy for ROCM devices
    - Added hardware tag-matching for CUDA buffers
    - Added support for CUDA and ROCM managed memories
    - Added support for client/server disconnect protocol over rdma connection manager
    - Added support for striding receive queue for hardware tag-matching
    - Added XPMEM-based rendezvous protocol for shared memory
    - Added support for shared memory communication between containers on the same machine
    - Added support for multi-threaded RDMA memory registration for large regions
    - Added new test cases to Azure CI
    - Added support for multiple listening transports
    - Added UCT socket-based connection manager transport
    - Updated API for UCT component management
    - Added API to retrieve the listening port
    - Added UCP active message API
    - Removed deprecated API for querying UCT memory domains
    - Refactored server/client examples
    - Added support for dlopen interception in UCM
    - Added support for PCIe atomics
    - Updated Java API: added support for most of UCP layer operations
    - Updated support for Mellanox DevX API
    - Added multiple UCT/TCP transport performance optimizations
    - Optimized memcpy() for Intel platforms
    - Added protection from non-UCX socket based app connections
    - Improved search time for PKEY object
    - Enabled gtest over IPv6 interfaces
    - Updated Mellanox and Bull device IDs
    - Added support for CUDA_VISIBLE_DEVICES
    - Increased limits for CUDA IPC registration
  - Bugfixes:
    - Multiple fixes in JUCX
    - Fixes in UCP thread safety
    - Fixes for most recent versions of GCC, PGI, and ICC
    - Fixes for CPU affinity on Azure instances
    - Fixes in XPMEM support on PPC64
    - Performance fixes in CUDA IPC
    - Fixes in RDMA CM flows
    - Multiple fixes in TCP transport
    - Multiple fixes in documentation
    - Fixes in transport lane selection logic
    - Fixes in Java jar build
    - Fixes in socket connection manager for Nvidia DGX-2 platform
    - Multiple fixes in UCP, UCT, UCM libraries
    - Multiple fixes for BSD and Mac OS systems
    - Fixes for Clang compiler
    - Fix CPU optimization configuration options
    - Fix JUCX build on GPU nodes
    - Fix in Azure release pipeline flow
    - Fix in CUDA memory hooks management
    - Fix in GPU memory peer direct gtest
    - Fix in TCP connection establishment flow
    - Fix in GPU IPC check
    - Fix in CUDA Jenkins test flow
    - Multiple fixes in CUDA IPC flow
    - Fix adding missing header files
    - Fix to prevent failures in presence of VPN enabled Ethernet interfaces
- Refresh openucx-s390x-support.patch against new version

OBS-URL: https://build.opensuse.org/request/show/811684
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=37
2020-06-05 08:02:58 +00:00
455518e131 Accepting request 734936 from science:HPC
- Ensure /usr/lib/ucx is owned at all times.

OBS-URL: https://build.opensuse.org/request/show/734936
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=15
2019-10-09 13:17:32 +00:00
f5ac91c2bc - Ensure /usr/lib/ucx is owned at all times.
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=35
2019-10-04 08:22:04 +00:00
6488ec11a4 Accepting request 733611 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/733611
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=14
2019-10-02 09:55:36 +00:00
Nicolas Morey-Chaisemartin
de6138b03e Accepting request 733589 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.6.0
  - Features:
    - Modular architecture for UCT transports
    - ROCm transport re-design: support for managed memory, direct copy, ROCm GDR
    - Random scheduling policy for DC transport
    - Optimized out-of-box settings for multi-rail
    - Added support for OmniPath (using Verbs)
    - Support for PCI atomics with IB transports
    - Reduced UCP address size for homogeneous environments
  - Bugfixes:
    - Multiple stability and performance improvements in TCP transport
    - Multiple stability fixes in Verbs and MLX5 transports
    - Multiple stability fixes in UCM memory hooks
    - Multiple stability fixes in UGNI transport
    - RPM Spec file cleanup
    - Fixing compilation issues with most recent clang and gcc compilers
    - Fixing the wrong name of aliases
    - Fix data race in UCP wireup
    - Fix segfault when libuct.so is reloaded - issue #3558
    - Include Java sources in distribution
    - Handle EADDRNOTAVAIL in rdma_cm connection manager
    - Disable ibcm on RHEL7+ by default
    - Fix data race in UCP proxy endpoint
    - Static checker fixes
    - Fallback to ibv_create_cq() if ibv_create_cq_ex() returns ENOSYS
    - Fix malloc hooks test
    - Fix checking return status in ucp_client_server example
    - Fix gdrcopy libdir config value
    - Fix printing atomic capabilities in ucx_info
    - Fix perftest warmup iterations to be non-zero

OBS-URL: https://build.opensuse.org/request/show/733589
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=33
2019-09-27 08:19:55 +00:00
c6d47e9fb8 Accepting request 703079 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/703079
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=13
2019-05-25 11:14:07 +00:00
Nicolas Morey-Chaisemartin
47949112e3 Accepting request 703055 from home:NMoreyChaisemartin:branches:science:HPC
- Disable Werror to handle boo#1121267

OBS-URL: https://build.opensuse.org/request/show/703055
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=31
2019-05-15 06:01:04 +00:00
d2263e3b21 Accepting request 690257 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/690257
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=12
2019-04-04 09:59:51 +00:00
Nicolas Morey-Chaisemartin
ca246a454a Accepting request 690254 from home:NMoreyChaisemartin:branches:science:HPC
- Update openucx-s390x-support.patch to fix support of 1.5.0 on s390x (bsc#1121267)

OBS-URL: https://build.opensuse.org/request/show/690254
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=29
2019-04-01 06:03:14 +00:00
Stephan Kulow
85725747e0 Accepting request 678967 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/678967
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=11
2019-03-01 19:27:44 +00:00
Nicolas Morey-Chaisemartin
fd1e5380fe Accepting request 678966 from home:NMoreyChaisemartin:branches:science:HPC
- Update openucx-s390x-support.patch to fix support of 1.5.0 on s390x
- Add baselibs.conf for ppc

- Update to v1.5.0 (bsc#1121267)
  - Features:
    - New emulation mode enabling full UCX functionality (Atomic, Put, Get)
      over TCP and RDMA-CORE interconnects which don't implement full RDMA semantics
    - Non-blocking API for all one-sided operations. All blocking communication APIs marked
      as deprecated
    - New client/server connection establishment API, which allows connected handover between workers
    - Support for rdma-core direct-verbs (DEVX) and DC with mlx5 transports
    - GPU - Support for stream API and receive side pipelining
    - Malloc hooks using binary instrumentation instead of symbol override
    - Statistics for UCT tag API
    - GPU-to-Infiniband HCA affinity support based on locality/distance (PCIe)
  - Bugfixes:
    - Fix overflow in RC/DC flush operations
    - Update description in SPEC file and README
    - Fix RoCE source port for dc_mlx5 flow control
    - Improve ucx_info help message
    - Fix segfault in UCP, due to int truncation in count_one_bits()
    - Multiple other bugfixes (full list on github)
  - Tested configurations:
    - InfiniBand: MLNX_OFED 4.4-4.5, distribution inbox drivers, rdma-core
    - CUDA: gdrcopy 1.2, cuda 9.1.85
    - XPMEM: 2.6.2
    - KNEM: 1.1.2

OBS-URL: https://build.opensuse.org/request/show/678966
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=27
2019-02-25 16:53:29 +00:00
50735531ff Accepting request 646644 from science:HPC
- Update to v1.4.0 (bsc#1103494)

OBS-URL: https://build.opensuse.org/request/show/646644
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=10
2018-11-12 08:50:19 +00:00
56befa2187 Stick to established changelog syntax
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=25
2018-11-06 12:02:30 +00:00
Nicolas Morey-Chaisemartin
4774502643 Accepting request 646571 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.4.0 (bsc#1103494)
  - Features:
    - Improved support for installation with latest ROCm
    - Improved support for latest rdma-core
    - Adding support for CUDA IPC for intra-node GPU
    - Added support for CUDA memory allocation cache for mem-type detection
    - Added support for latest Mellanox devices
    - Added support for Nvidia GPU managed memory
    - Added support for multiple connections between the same pair of workers
    - Added support for large worker address for client/server connection establishment
      and INADDR_ANY
    - Added support for bitwise atomics operations
  - Bugfixes:
    - Performance fixes for rendezvous protocol
    - Memory hook fixes
    - Clang support fixes
    - Self tl multi-rail fix
    - Thread safety fixes in IB/RDMA transport
    - Compilation fixes with upstream rdma-core
    - Multiple minor bugfixes (full list on github)
    - Segfault fix for a code generated by armclang compiler
    - UCP memory-domain index fix for zero-copy active messages

- Update to v1.3.1 (fate#325996)

OBS-URL: https://build.opensuse.org/request/show/646571
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=24
2018-11-06 07:56:17 +00:00
Nicolas Morey-Chaisemartin
6cb716aaee Accepting request 644613 from home:NMoreyChaisemartin:branches:sp1-staging
- Update to v1.3.1 (bsc#325996)
  - Prevent potential out-of-order sending in shared memory active messages
  - CUDA: Include cudamem.h in source tarball, pass cudaFree memory size
  - Registration cache: fix large range lookup, handle shmat(REMAP)/mmap(FIXED)
  - Limit IB CQE size for specific ARM boards

OBS-URL: https://build.opensuse.org/request/show/644613
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=23
2018-10-25 10:50:06 +00:00
6ff0a2a930 Accepting request 628374 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/628374
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=9
2018-08-17 21:57:19 +00:00
Nicolas Morey-Chaisemartin
6c87d0bee6 Accepting request 628372 from home:NMoreyChaisemartin:branches:science:HPC
- Update to v1.3.0 (bsc#1104159)
  - Added stream-based communication API to UCP
  - Added support for GPU platforms: Nvidia CUDA and AMD ROCM software stacks
  - Added API for client/server based connection establishment
  - Added support for TCP transport
  - Support for InfiniBand tag-matching offload for DC and accelerated transports
  - Multi-rail support for eager and rendezvous protocols
  - Added support for tag-matching communications with CUDA buffers
  - Added ucp_rkey_ptr() to obtain pointer for shared memory region
  - Avoid progress overhead on unused transports
  - Improved scalability of software tag-matching by using a hash table
  - Added transparent huge-pages allocator
  - Added non-blocking flush and disconnect for UCP
  - Support fixed-address memory allocation via ucp_mem_map()
  - Added ucp_tag_send_nbr() API to avoid send request allocation
  - Support global addressing in all IB transports
  - Add support for external epoll fd and edge-triggered events
  - Added registration cache for knem
  - Initial support for Java bindings
  - Multiple bugfixes (full list on github)
- Drop UCT-UD-fixed-compilation-by-gcc8.patch as it was fixed upstream
- Refresh openucx-s390x-support.patch against latest sources

OBS-URL: https://build.opensuse.org/request/show/628372
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=21
2018-08-09 10:25:09 +00:00
28b9a25066 Accepting request 618650 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/618650
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=8
2018-06-28 13:09:33 +00:00
1bb8a7934f Accepting request 618096 from home:NMoreyChaisemartin:branches:science:HPC
- Remove libnuma-devel on s390x for older releases

OBS-URL: https://build.opensuse.org/request/show/618096
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=19
2018-06-23 08:34:02 +00:00
61591a8cc8 Accepting request 591614 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/591614
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=7
2018-03-30 09:58:56 +00:00
Nicolas Morey-Chaisemartin
37de8011ef Accepting request 591499 from home:NMoreyChaisemartin:branches:science:HPC
- Add UCT-UD-fixed-compilation-by-gcc8.patch to fix compilation
  with GCC8 (bsc#1084635)

OBS-URL: https://build.opensuse.org/request/show/591499
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=17
2018-03-27 13:06:52 +00:00
c5d83e6e59 Accepting request 567915 from science:HPC
- Use right documentation path.
- Update to 1.2.2

OBS-URL: https://build.opensuse.org/request/show/567915
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=6
2018-01-25 11:34:54 +00:00
936151ea1e - Use right documentation path.
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=15
2018-01-20 15:40:56 +00:00
Nicolas Morey-Chaisemartin
33fb347489 Fix docdir
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=14
2018-01-20 11:37:26 +00:00
bbb8bc7682 Accepting request 567622 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.2.2
  - Support including UCX API headers from C++ code
  - UD transport to handle unicast flood on RoCE fabric
  - Compilation fixes for gcc 7.1.1, clang 3.6, clang 5
  - When UD transport is used with RoCE, packets intended for other peers may
    arrive on different adapters (as a result of unicast flooding).
  - This change adds packet filtering based on destination GIDs. Now the packet
    is silently dropped, if its destination GID does not match the local GID.
  - Added a new device ID for InfiniBand HCA

OBS-URL: https://build.opensuse.org/request/show/567622
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=13
2018-01-19 16:08:27 +00:00
d0cd7dd0be Accepting request 555777 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/555777
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=5
2017-12-13 10:55:39 +00:00
57691e3478 Accepting request 555398 from home:dimstar:Factory
- Drop doxygen BuildRequires: The documentation was already not
  built with this enabled. Removing the BR causes no regression in
  the package but eliminates a build cycle
  boost -> curl -> doxygen -> openucx -> boost

OBS-URL: https://build.opensuse.org/request/show/555398
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=11
2017-12-10 22:25:22 +00:00
c648941319 Accepting request 527339 from science:HPC
- Rediff openucx-s390x-support.patch as p1 to be in line with
  potential git-generated patches.

- Switch to version 1.2.1 (Fate#324050)
  Previous 1.3+ version was based on a development branch.
  Supported platforms
    - Shared memory: KNEM, CMA, XPMEM, SYSV, Posix
    - VERBs over InfiniBand and RoCE.
      VERBS over other RDMA interconnects (iWarp, OmniPath, etc.) is available
      for community evaluation and has not been tested in context of this release
    - Cray Gemini and Aries
    - Architectures: x86_64, ARMv8 (64bit), Power64
  Features:
    - Added support for InfiniBand DC and UD transports, including accelerated verbs for Mellanox devices
    - Full support for PGAS/SHMEM interfaces, blocking and non-blocking APIs
    - Support for MPI tag matching, both in software and offload mode
    - Zero copy protocols and rendezvous, registration cache
    - Handling transport errors
    - Flow control for DC/RC
    - Datatypes support: contiguous, IOV, generic
    - Multi-threading support
    - Support for ARMv8 64bit architecture
    - A new API for efficient memory polling
    - Support for malloc-hooks and memory registration caching

OBS-URL: https://build.opensuse.org/request/show/527339
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=4
2017-09-22 19:29:55 +00:00
2b8d3bdf06 Switch to "proper" 1.2.1 tarball.
Rediff openucx-s390x-support.patch as p1 to be in line with potential git-generated patches.

OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=9
2017-09-19 13:53:14 +00:00
Nicolas Morey-Chaisemartin
8c6efa2743 Add missing fate ID
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=8
2017-09-19 13:28:56 +00:00
Nicolas Morey-Chaisemartin
6a966f5112 Accepting request 527297 from home:NMoreyChaisemartin:branches:science:HPC
- Switch to version 1.2.1
  Previous 1.3+ version was based on a development branch.
  Supported platforms
    - Shared memory: KNEM, CMA, XPMEM, SYSV, Posix
    - VERBs over InfiniBand and RoCE.
      VERBS over other RDMA interconnects (iWarp, OmniPath, etc.) is available
      for community evaluation and has not been tested in context of this release
    - Cray Gemini and Aries
    - Architectures: x86_64, ARMv8 (64bit), Power64
  Features:
    - Added support for InfiniBand DC and UD transports, including accelerated verbs for Mellanox devices
    - Full support for PGAS/SHMEM interfaces, blocking and non-blocking APIs
    - Support for MPI tag matching, both in software and offload mode
    - Zero copy protocols and rendezvous, registration cache
    - Handling transport errors
    - Flow control for DC/RC
    - Datatypes support: contiguous, IOV, generic
    - Multi-threading support
    - Support for ARMv8 64bit architecture
    - A new API for efficient memory polling
    - Support for malloc-hooks and memory registration caching

OBS-URL: https://build.opensuse.org/request/show/527297
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=7
2017-09-19 13:27:22 +00:00
42718c30e4 Accepting request 507873 from science:HPC
- Disable avx at configure level

- Add openucx-s390x-support.patch to fix compilation on s390x
- Compile openucx on s390x

- Fix compilation on ppc

- Update to snapshot 1.3+git44
  * No changelog was found
- Add -Wno-error and disable AVX/SSE as it is not guaranteed
  to exist.

OBS-URL: https://build.opensuse.org/request/show/507873
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=3
2017-07-12 17:33:54 +00:00
ec0b537606 Accepting request 403317 from OFED:Factory
- Update to snapshot 0~git1727

OBS-URL: https://build.opensuse.org/request/show/403317
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=2
2016-06-19 08:50:43 +00:00
Stephan Kulow
c69bff694e Accepting request 330811 from OFED:Factory
OBS-URL: https://build.opensuse.org/request/show/330811
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=1
2015-10-08 06:24:03 +00:00
8 changed files with 0 additions and 2006 deletions

@@ -1,18 +0,0 @@
commit c49bd7a5d183a57f41c801c7f5c9691bcd7d23da
Author: Thomas Vegas <tvegas@nvidia.com>
Date: Mon Jun 24 16:52:04 2024 +0300
UCS/TIME: Add math.h to provide INFINITY
diff --git src/ucs/time/time.h src/ucs/time/time.h
index cff9810cdad8..c51362273f8d 100644
--- src/ucs/time/time.h
+++ src/ucs/time/time.h
@@ -11,6 +11,7 @@
#include <ucs/time/time_def.h>
#include <sys/time.h>
#include <limits.h>
+#include <math.h>
BEGIN_C_DECLS

@@ -1,224 +0,0 @@
commit d437b65a6df080416048067141b1c206a52bdc78
Author: Nathan Hjelm <hjelmn@google.com>
Date: Wed Oct 16 20:32:48 2024 +0000
UCT/IB/UD: Use GRH to detect address family on non-Mellanox hardware
Setting the service level in the work completion is a Mellanox-specific feature,
so it can not be relied on to detect IPv4 vs IPv6. This commit fixes the
detection logic for non-Mellanox providers by detecting the address class from
the grh instead. This is done by detecting either 0x6a (IPv6) at offset 0 or
0x45 (IPv4) at offset 20 of the receive buffer. Since the first 20B of IPv4
packets are undefined ud_verbs sets the first byte of each posted receive to a
known value (0xff) since the provider is unlikely to touch these bytes. This
commit makes no changes to the mlx5 code which continues to rely on the CQE data
to determine if a packet is IPv4 or IPv6. It can be updated to use the non-mlx5
logic but since the IP version is present in the CQE there is no need.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
diff --git src/uct/ib/mlx5/ib_mlx5.h src/uct/ib/mlx5/ib_mlx5.h
index 3183ea460a8a..3ec48b7197d8 100644
--- src/uct/ib/mlx5/ib_mlx5.h
+++ src/uct/ib/mlx5/ib_mlx5.h
@@ -1,6 +1,7 @@
/**
* Copyright (c) NVIDIA CORPORATION & AFFILIATES, 2001-2014. ALL RIGHTS RESERVED.
* Copyright (C) ARM Ltd. 2016. ALL RIGHTS RESERVED.
+* Copyright (c) Google, LLC, 2024. ALL RIGHTS RESERVED.
*
* See file LICENSE for terms.
*/
@@ -66,6 +67,9 @@
#define UCT_IB_MLX5_ATOMIC_MODE_EXT 3
#define UCT_IB_MLX5_CQE_FLAG_L3_IN_DATA UCS_BIT(28) /* GRH/IP in the receive buffer */
#define UCT_IB_MLX5_CQE_FLAG_L3_IN_CQE UCS_BIT(29) /* GRH/IP in the CQE */
+/* Bits 24-26 of flags_rqpn indicate the packet type */
+#define UCT_IB_MLX5_RQPN_ROCE_FLAG_IPV6 UCS_BIT(24)
+#define UCT_IB_MLX5_RQPN_ROCE_FLAG_IPV4 UCS_BIT(25)
#define UCT_IB_MLX5_CQE_FORMAT_MASK 0xc
#define UCT_IB_MLX5_MINICQE_ARR_MAX_SIZE 7
#define UCT_IB_MLX5_MP_RQ_BYTE_CNT_MASK 0x0000FFFF /* Byte count mask for multi-packet RQs */
diff --git src/uct/ib/mlx5/ib_mlx5.inl src/uct/ib/mlx5/ib_mlx5.inl
index 6602143c8bf5..2aa58455d5cd 100644
--- src/uct/ib/mlx5/ib_mlx5.inl
+++ src/uct/ib/mlx5/ib_mlx5.inl
@@ -1,5 +1,6 @@
/**
* Copyright (c) NVIDIA CORPORATION & AFFILIATES, 2001-2016. ALL RIGHTS RESERVED.
+ * Copyright (c) Google, LLC, 2024. ALL RIGHTS RESERVED.
*
* See file LICENSE for terms.
*/
@@ -88,6 +89,35 @@ uct_ib_mlx5_cqe_is_grh_present(struct mlx5_cqe64* cqe)
UCT_IB_MLX5_CQE_FLAG_L3_IN_CQE);
}
+static UCS_F_ALWAYS_INLINE size_t
+uct_ib_mlx5_cqe_roce_gid_len(struct mlx5_cqe64* cqe)
+{
+ /*
+ * Take the packet type from CQE, because:
+ * 1. According to Annex17_RoCEv2 (A17.4.5.1):
+ * For UD, the Completion Queue Entry (CQE) includes remote address
+ * information (InfiniBand Specification Vol. 1 Rev 1.2.1 Section 11.4.2.1).
+ * For RoCEv2, the remote address information comprises the source L2
+ * Address and a flag that indicates if the received frame is an IPv4,
+ * IPv6 or RoCE packet.
+ *
+ * 2. According to PRM, for responder UD/DC over RoCE sl represents RoCE
+ * packet type as:
+ * bit 3 : when set R-RoCE frame contains an UDP header otherwise not
+ * Bits[2:0]: L3_Header_Type, as defined below
+ * - 0x0 : GRH - (RoCE v1.0)
+ * - 0x1 : IPv6 - (RoCE v1.5/v2.0)
+ * - 0x2 : IPv4 - (RoCE v1.5/v2.0)
+ *
+ * The service level is the most significant byte of cqe->flags_rqpn.
+ *
+ * Alternatively, this could be detected by examining the packet contents
+ * as is done for non-mlx5 transports.
+ */
+ return (cqe->flags_rqpn & htonl(UCT_IB_MLX5_RQPN_ROCE_FLAG_IPV4)) ?
+ UCS_IPV4_ADDR_LEN : UCS_IPV6_ADDR_LEN;
+}
+
static UCS_F_ALWAYS_INLINE void*
uct_ib_mlx5_gid_from_cqe(struct mlx5_cqe64* cqe)
{
diff --git src/uct/ib/mlx5/ud/ud_mlx5.c src/uct/ib/mlx5/ud/ud_mlx5.c
index 58f4ae6446a3..27a96b1b615b 100644
--- src/uct/ib/mlx5/ud/ud_mlx5.c
+++ src/uct/ib/mlx5/ud/ud_mlx5.c
@@ -2,6 +2,7 @@
* Copyright (c) NVIDIA CORPORATION & AFFILIATES, 2001-2019. ALL RIGHTS RESERVED.
* Copyright (C) ARM Ltd. 2017. ALL RIGHTS RESERVED.
* Copyright (C) Advanced Micro Devices, Inc. 2024. ALL RIGHTS RESERVED.
+* Copyright (c) Google, LLC, 2024. ALL RIGHTS RESERVED.
*
* See file LICENSE for terms.
*/
@@ -521,7 +522,7 @@ uct_ud_mlx5_iface_poll_rx(uct_ud_mlx5_iface_t *iface, int is_async)
if (!uct_ud_iface_check_grh(&iface->super, packet,
uct_ib_mlx5_cqe_is_grh_present(cqe),
- cqe->flags_rqpn & 0xFF)) {
+ uct_ib_mlx5_cqe_roce_gid_len(cqe))) {
ucs_mpool_put_inline(desc);
goto out_polled;
}
diff --git src/uct/ib/ud/base/ud_iface.h src/uct/ib/ud/base/ud_iface.h
index 1efecd291d98..89fa7e3810fc 100644
--- src/uct/ib/ud/base/ud_iface.h
+++ src/uct/ib/ud/base/ud_iface.h
@@ -1,5 +1,6 @@
/**
* Copyright (c) NVIDIA CORPORATION & AFFILIATES, 2001-2020. ALL RIGHTS RESERVED.
+* Copyright (c) Google, LLC, 2024. ALL RIGHTS RESERVED.
*
* See file LICENSE for terms.
*/
@@ -395,10 +396,9 @@ static UCS_F_ALWAYS_INLINE void uct_ud_leave(uct_ud_iface_t *iface)
static UCS_F_ALWAYS_INLINE int
uct_ud_iface_check_grh(uct_ud_iface_t *iface, void *packet, int is_grh_present,
- uint8_t roce_pkt_type)
+ size_t gid_len)
{
struct ibv_grh *grh = (struct ibv_grh *)packet;
- size_t gid_len;
union ibv_gid *gid;
khiter_t khiter;
char gid_str[128] UCS_V_UNUSED;
@@ -412,25 +412,6 @@ uct_ud_iface_check_grh(uct_ud_iface_t *iface, void *packet, int is_grh_present,
return 1;
}
- /*
- * Take the packet type from CQE, because:
- * 1. According to Annex17_RoCEv2 (A17.4.5.1):
- * For UD, the Completion Queue Entry (CQE) includes remote address
- * information (InfiniBand Specification Vol. 1 Rev 1.2.1 Section 11.4.2.1).
- * For RoCEv2, the remote address information comprises the source L2
- * Address and a flag that indicates if the received frame is an IPv4,
- * IPv6 or RoCE packet.
- * 2. According to PRM, for responder UD/DC over RoCE sl represents RoCE
- * packet type as:
- * bit 3 : when set R-RoCE frame contains an UDP header otherwise not
- * Bits[2:0]: L3_Header_Type, as defined below
- * - 0x0 : GRH - (RoCE v1.0)
- * - 0x1 : IPv6 - (RoCE v1.5/v2.0)
- * - 0x2 : IPv4 - (RoCE v1.5/v2.0)
- */
- gid_len = ((roce_pkt_type & UCT_IB_CQE_SL_PKTYPE_MASK) == 0x2) ?
- UCS_IPV4_ADDR_LEN : UCS_IPV6_ADDR_LEN;
-
if (ucs_likely((gid_len == iface->gid_table.last_len) &&
uct_ud_gid_equal(&grh->dgid, &iface->gid_table.last,
gid_len))) {
diff --git src/uct/ib/ud/verbs/ud_verbs.c src/uct/ib/ud/verbs/ud_verbs.c
index 989bdb59d08f..848dc4e5cd66 100644
--- src/uct/ib/ud/verbs/ud_verbs.c
+++ src/uct/ib/ud/verbs/ud_verbs.c
@@ -1,5 +1,6 @@
/**
* Copyright (c) NVIDIA CORPORATION & AFFILIATES, 2001-2019. ALL RIGHTS RESERVED.
+* Copyright (c) Google, LLC, 2024. ALL RIGHTS RESERVED.
*
* See file LICENSE for terms.
*/
@@ -393,6 +394,20 @@ uct_ud_verbs_iface_poll_tx(uct_ud_verbs_iface_t *iface, int is_async)
return 1;
}
+static UCS_F_ALWAYS_INLINE size_t uct_ud_verbs_iface_get_gid_len(void *packet)
+{
+ /* The GRH will contain either an IPv4 or IPv6 header. If the former is
+ * present the header will start at offset 20 in the buffer otherwise it
+ * will start at offset 0. Since the two headers are of fixed size (20 or
+ * 40 bytes) this means we will either see 0x6? at offset 0 (IPv6) or 0x45
+ * at offset 20. The detection is a little tricky for IPv6 given that the
+ * first 20B are undefined for IPv4. To overcome this the first byte of
+ * the posted receive buffer is set to 0xff.
+ */
+ return ((((uint8_t*)packet)[0] & 0xf0) == 0x60) ? UCS_IPV6_ADDR_LEN :
+ UCS_IPV4_ADDR_LEN;
+}
+
static UCS_F_ALWAYS_INLINE unsigned
uct_ud_verbs_iface_poll_rx(uct_ud_verbs_iface_t *iface, int is_async)
{
@@ -413,7 +428,8 @@ uct_ud_verbs_iface_poll_rx(uct_ud_verbs_iface_t *iface, int is_async)
UCT_IB_IFACE_VERBS_FOREACH_RXWQE(&iface->super.super, i, packet, wc, num_wcs) {
if (!uct_ud_iface_check_grh(&iface->super, packet,
- wc[i].wc_flags & IBV_WC_GRH, wc[i].sl)) {
+ wc[i].wc_flags & IBV_WC_GRH,
+ uct_ud_verbs_iface_get_gid_len(packet))) {
ucs_mpool_put_inline((void*)wc[i].wr_id);
continue;
}
@@ -696,7 +712,7 @@ uct_ud_verbs_iface_post_recv_always(uct_ud_verbs_iface_t *iface, int max)
struct ibv_recv_wr *bad_wr;
uct_ib_recv_wr_t *wrs;
unsigned count;
- int ret;
+ int ret, i;
wrs = ucs_alloca(sizeof *wrs * max);
@@ -706,6 +722,14 @@ uct_ud_verbs_iface_post_recv_always(uct_ud_verbs_iface_t *iface, int max)
return;
}
+ /* Set the first byte in the receive buffer grh to a known value not equal to
+ * 0x6?. This should aid in the detection of IPv6 vs IPv4 because the first
+ * byte is undefined in the latter and 0x6? in the former. It is unlikely
+ * this byte is touched with IPv4. */
+ for (i = 0; i < count; ++i) {
+ ((uint8_t*)wrs[i].sg.addr)[0] = 0xff;
+ }
+
ret = ibv_post_recv(iface->super.qp, &wrs[0].ibwr, &bad_wr);
if (ret != 0) {
ucs_fatal("ibv_post_recv() returned %d: %m", ret);

@@ -1,41 +0,0 @@
github.com/openucx/ucx/issues/10663
github.com/openucx/ucx/pull/10664
github.com/openucx/ucx/commit/2e6f69d
adapted to opensuse's source tree:
- change to -p0 as used by %autopatch in opensuse's openucx.spec
From 2e6f69db88da2c38c89c688a932817b6b4912920 Mon Sep 17 00:00:00 2001
From: Thomas Vegas <tvegas@nvidia.com>
Date: Tue, 29 Apr 2025 05:22:28 +0000
Subject: [PATCH] TOOLS/PERF: Include omp.h outside of extern C declarations
--- src/tools/perf/lib/libperf_int.h
+++ src/tools/perf/lib/libperf_int.h
@@ -11,6 +11,12 @@
#include <tools/perf/api/libperf.h>
+
+#if _OPENMP
+#include <omp.h>
+#endif
+
+
BEGIN_C_DECLS
/** @file libperf_int.h */
@@ -20,11 +26,6 @@ BEGIN_C_DECLS
#include <ucs/sys/math.h>
-#if _OPENMP
-#include <omp.h>
-#endif
-
-
#define TIMING_QUEUE_SIZE 2048
#define UCT_PERF_TEST_AM_ID 5
#define ADDR_BUF_SIZE 4096
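
The header ordering this patch establishes can be illustrated with a small standalone file. This is a sketch under assumptions, not the libperf code: the `DEMO_*` macros are hypothetical equivalents of BEGIN_C_DECLS / END_C_DECLS, and the motivation (a newer omp.h likely carrying C++-only declarations that cannot sit inside an extern "C" block) is inferred from the patch subject rather than stated in it.

```c
#include <assert.h>

/* Hypothetical equivalents of BEGIN_C_DECLS / END_C_DECLS: they open and
 * close an extern "C" block for C++ consumers and expand to nothing in C. */
#ifdef __cplusplus
#define DEMO_BEGIN_C_DECLS extern "C" {
#define DEMO_END_C_DECLS   }
#else
#define DEMO_BEGIN_C_DECLS
#define DEMO_END_C_DECLS
#endif

/* System headers are included before the linkage block is opened, so any
 * C++-only content they carry is never forced into C linkage. */
#if defined(_OPENMP)
#include <omp.h>
#endif

DEMO_BEGIN_C_DECLS

/* Only the project's own declarations live inside the block. */
int demo_max_threads(void);

DEMO_END_C_DECLS

int demo_max_threads(void)
{
#if defined(_OPENMP)
    return omp_get_max_threads();
#else
    return 1; /* built without OpenMP: a single thread */
#endif
}

#ifdef DEMO_MAIN
int main(void)
{
    assert(demo_max_threads() >= 1);
    return 0;
}
#endif
```

The file compiles as C or C++ and with or without OpenMP enabled; build with -DDEMO_MAIN for a standalone check.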

@@ -1,26 +0,0 @@
state of opensuse's source tree requires additional fixes for build with
strict header checking as e.g. w/ gcc-15
--- src/uct/ib/mlx5/gga/gga_mlx5.c 2025-04-23 16:48:44.303006307 +0200
+++ src/uct/ib/mlx5/gga/gga_mlx5.c 2025-04-23 16:50:50.294440975 +0200
@@ -616,7 +616,7 @@
.ep_invalidate = uct_rc_mlx5_base_ep_invalidate,
.ep_connect_to_ep_v2 = uct_gga_mlx5_ep_connect_to_ep_v2,
.iface_is_reachable_v2 = uct_gga_mlx5_iface_is_reachable_v2,
- .ep_is_connected = ucs_empty_function_do_assert
+ .ep_is_connected = (uct_ep_is_connected_func_t)ucs_empty_function_do_assert
},
.create_cq = uct_rc_mlx5_iface_common_create_cq,
.destroy_cq = uct_rc_mlx5_iface_common_destroy_cq,
--- src/uct/tcp/tcp_iface.c 2025-04-23 16:32:27.249029997 +0200
+++ src/uct/tcp/tcp_iface.c 2025-04-23 16:46:58.288518124 +0200
@@ -621,7 +621,7 @@
.ep_invalidate = (uct_ep_invalidate_func_t)ucs_empty_function_return_unsupported,
.ep_connect_to_ep_v2 = uct_tcp_ep_connect_to_ep_v2,
.iface_is_reachable_v2 = uct_tcp_iface_is_reachable_v2,
- .ep_is_connected = uct_tcp_ep_is_connected
+ .ep_is_connected = (uct_ep_is_connected_func_t)uct_tcp_ep_is_connected
};
static UCS_CLASS_INIT_FUNC(uct_tcp_iface_t, uct_md_h md, uct_worker_h worker,
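
The casts added by this patch follow a common pattern: a function whose declared type differs from the type of an ops-table slot is stored through an explicit function-pointer cast, which stricter compilers require. A minimal sketch of that pattern follows. All `demo_*` names and signatures are hypothetical and are not the UCT types. Note that ISO C only guarantees that a function pointer survives conversion to another function-pointer type and back, so the sketch converts back to the original type before calling the stub.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_ep demo_ep_t; /* opaque, hypothetical endpoint type */

/* Type of the ops-table slot. */
typedef int (*demo_ep_is_connected_func_t)(const demo_ep_t *ep,
                                           const void *params);

typedef struct {
    demo_ep_is_connected_func_t ep_is_connected;
} demo_iface_ops_t;

/* Generic stub whose signature does not match the slot. */
int demo_empty_function_return_zero(void)
{
    return 0;
}

/* Handler whose signature matches the slot: no cast needed. */
int demo_ep_is_connected(const demo_ep_t *ep, const void *params)
{
    (void)params;
    return ep != NULL;
}

/* Without the explicit cast, a strict compiler rejects this initializer
 * because the two function-pointer types are incompatible. */
const demo_iface_ops_t demo_stub_ops = {
    .ep_is_connected =
        (demo_ep_is_connected_func_t)demo_empty_function_return_zero
};

const demo_iface_ops_t demo_real_ops = {
    .ep_is_connected = demo_ep_is_connected
};

/* Convert the stored pointer back to the stub's real type before calling. */
int demo_call_stub(const demo_iface_ops_t *ops)
{
    int (*stub)(void) = (int (*)(void))ops->ep_is_connected;
    return stub();
}

#ifdef DEMO_MAIN
int main(void)
{
    assert(demo_call_stub(&demo_stub_ops) == 0);
    assert(demo_real_ops.ep_is_connected(NULL, NULL) == 0);
    return 0;
}
#endif
```

Build with -DDEMO_MAIN for a standalone check. The cast only satisfies the type checker; it does not make the two signatures compatible at call time.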

File diff suppressed because it is too large

BIN
ucx-1.17.0.tar.gz (Stored with Git LFS)

Binary file not shown.

BIN
ucx-1.18.0.tar.gz (Stored with Git LFS)

Binary file not shown.

@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8018dd75f11b5e8d6e57dcdb5b798d2c1f000982c353efde1f3170025c6c3b4c
size 3313043