* devel/main:
Update to v1.19.1
Add patches to fix a badly initialized value in settings
Fix a badly initialized value in settings
Minor fixes to openucx-s390x-support.patch
Add Gitea build results
- Update to ucx 1.19.0
- UCP
- Enabled multi-GPU support within a single process
- Added dynamic selection between strong and weak fences in RMA flush operations
- Improved endpoint reconfiguration capabilities
- Added All2All lane selection for multi-NIC-GPU systems
- Improved rkey debug info when config cache limit is reached
- Improved UCP protocol selection based on available memory types
- Removed dummy memory key from irrelevant transports (TCP, CMA and CUDA)
- Improved RNDV performance with device-local staging buffers
- Enabled error handling for RMA get_offload protocols
- Made UCX_TLS=^ib disable all transports including auxiliary
- Fixed send request status handling
- Fixed performance degradation in RNDV by optimizing md cache updates
- Fixed protocol selection when first lane is filtered out by fragment size
- Fixed rkey selection by using memory registration flag
- UCT
- Defined uct_rkey_unpack_v2 API to support passing sys-dev
- RDMA CORE (IB, ROCE, etc.)
- Added SRD transport support in EFA with reordering, AM, and control operations
- Removed XGVMI BF2 support (umem)
- Removed device memory indirect key
- Fixed VFS objects for DCIs and pools
- Added routing table cache to the reachability check
- Fixed strict order usage in IB auxiliary rkeys
- Improved various init logging messages
- Improved reliability of DC transport by adding DCI validation and separating connection logic
- Fixed segfault in DC fence operation
- UCS
- Removed compilation warnings
- Update to ucx 1.18.1
- CUDA
- Added config keys to update cuda_copy bandwidth for coherent platforms
- Improved cache invalidation of memory allocated using CUDA memory pool
- AZP
- Added Ubuntu 24.04 to build and release pipeline
- UCP
- Fixed assertion failure when maximum lane fragment is smaller than AM header
- Fixed potential active message user header use after free with protocol reconfiguration
- CUDA
- Fixed registration of CUDA Fabric memory allocated by UCT
- Fixed VA recycling check of memory allocated using VMM and CUDA memory pool
- RDMA CORE (IB, ROCE, etc.)
- Do not use ConnectX-8 SMI subdevices for communication
- Fixed remote access error by disabling ODP when the device supports DDP
- Fixed configuration logic by disabling DDP when AR is disabled
- UCM
- Fixed crash with bistro hooks for CUDA 12.9 on amd64
Add patches to fix gcc-15 compile errors (boo#1241939)
- Add UCT-IB-UD-Use-GRH-to-detect-address-family-on-non-Mellanox-hardware.patch to fix a UD init issue on non-Mellanox RDMA HW (bsc#1240204).
Accepting request 1247273 from home:NMorey:branches:science:HPC
Accepting request 1247161 from home:NMorey:branches:science:HPC
Accepting request 1199375 from home:NMorey:branches:science:HPC
Accepting request 1184022 from openSUSE:Factory:RISCV
Accepting request 1183477 from home:NMorey:branches:science:HPC
Signed-off-by: Nicolas Morey <nmorey@suse.com>
- Update to ucx 1.18.0
- UCP
- Enabled using CUDA staging buffers for pipeline protocols by default
- Added endpoint reconfiguration support for non-reused p2p scenarios
- Enabled non-cacheable memory domains, activated for gdr_copy
- Added user_data parameter to ucp_ep_query
- Added support for host memory pipeline through CUDA buffers for rendezvous protocol
- Added global VA infrastructure and memory region in absence of error handling
- Made protocol performance node names more informative
- Enforced always running on the same thread in single thread mode
- Multiple improvements in protocol selection infrastructure
- Added UCP_MEM_MAP_LOCK API flag to enforce locked memory mapping
- Allowed up to 64 endpoint lanes for systems with many transports or devices
- Added usage tracker to worker
- Improved various logging messages
- Fixed stack overflow in exported rkey unpack
- Removed extra remote-cpu overhead from protocol estimation for zcopy
- Fixed performance estimation for rndv pipeline protocols
- Fixed ATP sending by picking the correct lane
- Fixed missing reg_id on memh creation
- Fixed repeated invalidations by retaining existing access flags
- Fixed abort reason propagation for rendezvous RTR mtype
- Do not check transport availability if it is disabled by the UCX_TLS environment variable
- Fixed wrong flag being used for checking BCOPY capability
- Fixed sending too many ATPs for small messages
- Enforced 16-bit size for Active Message identifiers
- Fixed unnecessary status check for emulated AMO
- Fixed more than one fragment sending in rendezvous pipeline
- Fixed crash by using biggest max frag across all lanes
- Fixed missing memory handle flags by copying from parent to child
OBS-URL: https://build.opensuse.org/request/show/1247274
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openucx?expand=0&rev=33
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=73