-------------------------------------------------------------------
Thu Sep 23 07:35:57 UTC 2021 - Nicolas Morey-Chaisemartin

- Update to v1.11.1 (jsc#SLE-19260)

-------------------------------------------------------------------
Wed Feb 24 16:34:54 UTC 2021 - Nicolas Morey-Chaisemartin

- Update openucx-s390x-support.patch to fix mmap syscall on s390x
  (bsc#1182691)
- Core:
  - Added support for UCX monitoring using virtual file system (VFS)/FUSE
  - Added support for applications with static CUDA runtime linking
  - Added support for a configuration file
  - Updated clang format configuration
- UCP
  - Added rendezvous API for active messages
  - Added user-defined name to context, worker, and endpoint objects
  - Added flag to silence request leak check
  - Added API for endpoint performance evaluation
  - Added API - ucp_request_query
  - Added API - ucp_lib_query
  - Added bandwidth optimizations for new protocols multi-lane
  - Added support for multi-rail over lanes with BW ratio >= 1/4
  - Added support for tracking outstanding requests and aborting those
    in case of connection failure
  - Refactored keep-alive protocol
  - Added device id to wireup protocol
  - Added support for up to 128 transport layer resources in UCP context
  - Added support for CUDA memory allocations with ucp_mem_map
  - Increased UCP_WORKER_MAX_EP_CONFIG to 64
  - Adjusted memory type zcopy threshold when UCX_ZCOPY_THRESH is set
  - Refactored wireup protocols, rendezvous, get, zcopy protocols
  - Added put zcopy multi-rail
  - Improved logging for new protocols
  - Added system topology information
  - Added new protocols for eager offload protocols
- UCT
  - Extended connection establishment API
  - Added active message AM alignment in iface params
  - Added active message short IOV API
  - Added support for interface query by operation and memory type
  - Added API to get allocation base address and length
  - Added md_dereg_v2 API
- UCS
  - Added log filter by source file name
  - Added checking for last element in fraglist queue
  - Added a method to get IP address from sockaddr
  - Added memory usage limits to registration cache
- RDMA CORE (IB, ROCE, etc.)
  - Added report of QP info in case of completion with error
  - Refactored FC send operations
  - Added support for DevX unique QPN allocation
  - Optimized endpoint lookup for DCI
  - Added support for RDMA sub-function (SF)
  - Added support for DCI via DEVX
  - Added DCI pool per LAG port
  - Added support for RoCE IP reachability check using a subnet mask
  - Added active message short IOV for UD/DC/RC mlx, UD/RC verbs
  - Added endpoint keep alive check for UD
  - Suppressed warning if device can't be opened
  - Added support for multiple flush cancel without completion
  - Added ignore for devices with invalid GID
  - Added support for SRQ linked list reordering
  - Added flush by flow control on old devices
  - Added support for configurable rdma_resolve_addr/route timeout
- Shared memory
  - Added active message short IOV support for posix, sysv, and self
    transports
- TCP
  - Added support for peer failure in case of CONNECT_TO_EP
  - Added support for active message short IOV
- See NEWS for a complete changelog and bug fixes
- Refresh openucx-s390x-support against latest sources

-------------------------------------------------------------------
Mon Oct 5 13:21:34 UTC 2020 - Nicolas Morey-Chaisemartin

- Update to v1.9.0 (jsc#SLE-15163)
  - Features:
    - Added a new class of communication APIs '*_nbx' that enable API
      extendability while preserving ABI backward compatibility
    - Added asynchronous event support to UCT/IB/DEVX
    - Added support for latest CUDA library version
    - Added NAK-based reliability protocol for UCT/IB/UD to optimize
      resends
    - Added new tests for ROCm
    - Added new configuration parameters for protocol selection
    - Added performance optimization for Fujitsu A64FX with InfiniBand
    - Added performance optimization for clear cache code aarch64
    - Added support for relaxed-order PCIe access in IB RDMA
      transports
    - Added new TCP connection manager
    - Added support for UCT/IB PKey with partial membership in IB
      transports
    - Added support for RoCE LAG
    - Added support for ROCm 3.7 and above
    - Added flow control for RDMA read operations
    - Improved endpoint flush implementation for UCT/IB
    - Improved UD timer to avoid interrupting the main thread when not
      in use
    - Improved latency estimation for network path with CUDA
    - Improved error reporting messages
    - Improved performance in active message flow (removed malloc call)
    - Improved performance in ptr_array flow
    - Improved performance in UCT/SM progress engine flow
    - Improved I/O demo code
    - Improved rendezvous protocol for CUDA
    - Updated examples code
  - Bugfixes:
    - Fixes for most recent versions of GCC, CLANG, ARMCLANG, PGI
    - Fixes in UCT/IB for strict order keys
    - Fixes in memory barrier code for aarch64
    - Fixes in UCT/IB/DEVX for fork system call
    - Fixes in UCT/IB for rand() call in rdma-core
    - Fixes in group rescheduling for UCT/IB/DC
    - Fixes in UCT/CUDA bandwidth reporting
    - Fixes in rkey_ptr protocol
    - Fixes in lane selection for rendezvous protocol based on
      get-zero-copy flow
    - Fixes for ROCm build
    - Fixes for XPMEM transport
    - Fixes in closing endpoint code
    - Fixes in RDMACM code
    - Fixes in memcpy selection for AMD
    - Fixes in UCT/UD endpoint flush functionality
    - Fixes in XPMEM detection
    - Fixes in rendezvous staging protocol
    - Fixes in ROCEv1 mlx5 UDP source port configuration
    - Multiple fixes in RPM spec file
    - Multiple fixes in UCP documentation
    - Multiple fixes in socket connection manager
    - Multiple fixes in gtest
    - Multiple fixes in JAVA API implementation
- Refresh openucx-s390x-support.patch against new version

-------------------------------------------------------------------
Mon Jul 13 08:19:45 UTC 2020 - Nicolas Morey-Chaisemartin

- Update to v1.8.1
  - Features:
    - Added binary release pipeline in Azure CI
  - Bugfixes:
    - Multiple fixes in testing environment
    - Fixes in InfiniBand DEVX transport
    - Fixes in
      memory management for CUDA IPC transport
    - Fixes for binutils 2.34+
    - Fixes for AMD ROCM build environment

-------------------------------------------------------------------
Fri Jun 5 09:38:40 UTC 2020 - Jan Engelhardt

- Trim bias and filler wording from descriptions.

-------------------------------------------------------------------
Thu Jun 4 08:18:26 UTC 2020 - Nicolas Morey-Chaisemartin

- Update to v1.8.0
  - Features:
    - Improved detection for DEVX support
    - Improved TCP scalability
    - Added support for ROCM to perftest
    - Added support for different source and target memory types to
      perftest
    - Added optimized memcpy for ROCM devices
    - Added hardware tag-matching for CUDA buffers
    - Added support for CUDA and ROCM managed memories
    - Added support for client/server disconnect protocol over rdma
      connection manager
    - Added support for striding receive queue for hardware tag-matching
    - Added XPMEM-based rendezvous protocol for shared memory
    - Added support for shared memory communication between containers
      on the same machine
    - Added support for multi-threaded RDMA memory registration for
      large regions
    - Added new test cases to Azure CI
    - Added support for multiple listening transports
    - Added UCT socket-based connection manager transport
    - Updated API for UCT component management
    - Added API to retrieve the listening port
    - Added UCP active message API
    - Removed deprecated API for querying UCT memory domains
    - Refactored server/client examples
    - Added support for dlopen interception in UCM
    - Added support for PCIe atomics
    - Updated Java API: added support for most of UCP layer operations
    - Updated support for Mellanox DevX API
    - Added multiple UCT/TCP transport performance optimizations
    - Optimized memcpy() for Intel platforms
    - Added protection from non-UCX socket based app connections
    - Improved search time for PKEY object
    - Enabled gtest over IPv6 interfaces
    - Updated Mellanox and Bull device IDs
    - Added support for CUDA_VISIBLE_DEVICES
    - Increased limits for CUDA IPC
      registration
  - Bugfixes:
    - Multiple fixes in JUCX
    - Fixes in UCP thread safety
    - Fixes for most recent versions of GCC, PGI, and ICC
    - Fixes for CPU affinity on Azure instances
    - Fixes in XPMEM support on PPC64
    - Performance fixes in CUDA IPC
    - Fixes in RDMA CM flows
    - Multiple fixes in TCP transport
    - Multiple fixes in documentation
    - Fixes in transport lane selection logic
    - Fixes in Java jar build
    - Fixes in socket connection manager for Nvidia DGX-2 platform
    - Multiple fixes in UCP, UCT, UCM libraries
    - Multiple fixes for BSD and Mac OS systems
    - Fixes for Clang compiler
    - Fix CPU optimization configuration options
    - Fix JUCX build on GPU nodes
    - Fix in Azure release pipeline flow
    - Fix in CUDA memory hooks management
    - Fix in GPU memory peer direct gtest
    - Fix in TCP connection establishment flow
    - Fix in GPU IPC check
    - Fix in CUDA Jenkins test flow
    - Multiple fixes in CUDA IPC flow
    - Fix adding missing header files
    - Fix to prevent failures in presence of VPN enabled Ethernet
      interfaces
- Refresh openucx-s390x-support.patch against new version

-------------------------------------------------------------------
Fri Oct 4 08:11:49 UTC 2019 - Jan Engelhardt

- Ensure /usr/lib/ucx is owned at all times.
-------------------------------------------------------------------
Wed Sep 18 10:16:05 UTC 2019 - Nicolas Morey-Chaisemartin

- Update to v1.6.0
  - Features:
    - Modular architecture for UCT transports
    - ROCm transport re-design: support for managed memory, direct copy,
      ROCm GDR
    - Random scheduling policy for DC transport
    - Optimized out-of-box settings for multi-rail
    - Added support for OmniPath (using Verbs)
    - Support for PCI atomics with IB transports
    - Reduced UCP address size for homogeneous environments
  - Bugfixes:
    - Multiple stability and performance improvements in TCP transport
    - Multiple stability fixes in Verbs and MLX5 transports
    - Multiple stability fixes in UCM memory hooks
    - Multiple stability fixes in UGNI transport
    - RPM Spec file cleanup
    - Fixing compilation issues with most recent clang and gcc compilers
    - Fixing the wrong name of aliases
    - Fix data race in UCP wireup
    - Fix segfault when libuct.so is reloaded - issue #3558
    - Include Java sources in distribution
    - Handle EADDRNOTAVAIL in rdma_cm connection manager
    - Disable ibcm on RHEL7+ by default
    - Fix data race in UCP proxy endpoint
    - Static checker fixes
    - Fallback to ibv_create_cq() if ibv_create_cq_ex() returns ENOSYS
    - Fix malloc hooks test
    - Fix checking return status in ucp_client_server example
    - Fix gdrcopy libdir config value
    - Fix printing atomic capabilities in ucx_info
    - Fix perftest warmup iterations to be non-zero
    - Fixing default values for configure logic
    - Fix race condition updating fired_events from multiple threads
    - Fix madvise() hook
- Refresh openucx-s390x-support.patch against new version

-------------------------------------------------------------------
Wed May 15 05:52:55 UTC 2019 - Nicolas Morey-Chaisemartin

- Disable Werror to handle boo#1121267

-------------------------------------------------------------------
Mon Feb 25 07:56:39 UTC 2019 - nmorey

- Update openucx-s390x-support.patch to fix support of 1.5.0 on s390x
  (bsc#1121267)
- Add baselibs.conf for ppc
-------------------------------------------------------------------
Fri Feb 22 12:11:57 UTC 2019 - Martin Liška

- Update to v1.5.0 (bsc#1121267)
  * Features:
    * New emulation mode enabling full UCX functionality (Atomic, Put,
      Get) over TCP and RDMA-CORE interconnects which don't implement
      full RDMA semantics
    * Non-blocking API for all one-sided operations. All blocking
      communication APIs marked as deprecated
    * New client/server connection establishment API, which allows
      connected handover between workers
    * Support for rdma-core direct-verbs (DEVX) and DC with mlx5
      transports
    * GPU - Support for stream API and receive side pipelining
    * Malloc hooks using binary instrumentation instead of symbol
      override
    * Statistics for UCT tag API
    * GPU-to-Infiniband HCA affinity support based on locality/distance
      (PCIe)
  * Bugfixes:
    * Fix overflow in RC/DC flush operations
    * Update description in SPEC file and README
    * Fix RoCE source port for dc_mlx5 flow control
    * Improve ucx_info help message
    * Fix segfault in UCP, due to int truncation in count_one_bits()
    * Multiple other bugfixes (full list on github)
  * Tested configurations:
    * InfiniBand: MLNX_OFED 4.4-4.5, distribution inbox drivers,
      rdma-core
    * CUDA: gdrcopy 1.2, cuda 9.1.85
    * XPMEM: 2.6.2
    * KNEM: 1.1.2

-------------------------------------------------------------------
Tue Nov 6 07:18:34 UTC 2018 - nmoreychaisemartin@suse.com

- Update to v1.4.0 (bsc#1103494)
  * Features:
    * Improved support for installation with latest ROCm
    * Improved support for latest rdma-core
    * Added support for CUDA IPC for intra-node GPU, CUDA memory
      allocation cache for mem-type detection, latest Mellanox devices,
      Nvidia GPU managed memory, multiple connections between the same
      pair of workers, large worker address for client/server
      connection establishment and INADDR_ANY, and for bitwise atomics
      operations.
  * Bugfixes:
    * Performance fixes for rendezvous protocol
    * Memory hook fixes
    * Clang support fixes
    * Self tl multi-rail fix
    * Thread safety fixes in IB/RDMA transport
    * Compilation fixes with upstream rdma-core
    * Multiple minor bugfixes (full list on github)
    * Segfault fix for a code generated by armclang compiler
    * UCP memory-domain index fix for zero-copy active messages

-------------------------------------------------------------------
Mon Oct 15 07:51:12 UTC 2018 - nmoreychaisemartin@suse.com

- Update to v1.3.1 (fate#325996)
  - Prevent potential out-of-order sending in shared memory active
    messages
  - CUDA: Include cudamem.h in source tarball, pass cudaFree memory size
  - Registration cache: fix large range lookup, handle
    shmat(REMAP)/mmap(FIXED)
  - Limit IB CQE size for specific ARM boards

-------------------------------------------------------------------
Thu Aug 9 05:57:24 UTC 2018 - nmoreychaisemartin@suse.com

- Update to v1.3.0 (bsc#1104159)
  - Added stream-based communication API to UCP
  - Added support for GPU platforms: Nvidia CUDA and AMD ROCM software
    stacks
  - Added API for client/server based connection establishment
  - Added support for TCP transport
  - Support for InfiniBand tag-matching offload for DC and accelerated
    transports
  - Multi-rail support for eager and rendezvous protocols
  - Added support for tag-matching communications with CUDA buffers
  - Added ucp_rkey_ptr() to obtain pointer for shared memory region
  - Avoid progress overhead on unused transports
  - Improved scalability of software tag-matching by using a hash table
  - Added transparent huge-pages allocator
  - Added non-blocking flush and disconnect for UCP
  - Support fixed-address memory allocation via ucp_mem_map()
  - Added ucp_tag_send_nbr() API to avoid send request allocation
  - Support global addressing in all IB transports
  - Add support for external epoll fd and edge-triggered events
  - Added registration cache for knem
  - Initial support for Java bindings
  - Multiple bugfixes (full list on
    github)
- Drop UCT-UD-fixed-compilation-by-gcc8.patch as it was fixed upstream
- Refresh openucx-s390x-support.patch against latest sources

-------------------------------------------------------------------
Wed Jun 13 12:45:34 UTC 2018 - nmoreychaisemartin@suse.com

- Remove libnuma-devel on s390x for older releases

-------------------------------------------------------------------
Tue Mar 27 07:12:37 UTC 2018 - nmoreychaisemartin@suse.com

- Add UCT-UD-fixed-compilation-by-gcc8.patch to fix compilation with
  GCC8 (bsc#1084635)

-------------------------------------------------------------------
Sat Jan 20 15:40:43 UTC 2018 - jengelh@inai.de

- Use right documentation path.

-------------------------------------------------------------------
Fri Jan 19 10:12:04 UTC 2018 - nmoreychaisemartin@suse.com

- Update to 1.2.2
  - Support including UCX API headers from C++ code
  - UD transport to handle unicast flood on RoCE fabric
  - Compilation fixes for gcc 7.1.1, clang 3.6, clang 5
  - When UD transport is used with RoCE, packets intended for other
    peers may arrive on different adapters (as a result of unicast
    flooding).
  - This change adds packet filtering based on destination GIDs. Now
    the packet is silently dropped, if its destination GID does not
    match the local GID.
  - Added a new device ID for InfiniBand HCA

-------------------------------------------------------------------
Fri Dec 8 21:19:11 UTC 2017 - dimstar@opensuse.org

- Drop doxygen BuildRequires: The documentation was already not built
  with this enabled. Removing the BR causes no regression in the
  package but eliminates a build cycle
  boost -> curl -> doxygen -> openucx -> boost

-------------------------------------------------------------------
Tue Sep 19 13:52:13 UTC 2017 - jengelh@inai.de

- Rediff openucx-s390x-support.patch as p1 to be in line with
  potential git-generated patches.
-------------------------------------------------------------------
Tue Sep 19 09:26:07 UTC 2017 - nmoreychaisemartin@suse.com

- Switch to version 1.2.1 (Fate#324050)
  Previous 1.3+ version was based on a development branch.
  Supported platforms:
  - Shared memory: KNEM, CMA, XPMEM, SYSV, Posix
  - VERBS over InfiniBand and RoCE. VERBS over other RDMA interconnects
    (iWarp, OmniPath, etc.) is available for community evaluation and
    has not been tested in context of this release
  - Cray Gemini and Aries
  - Architectures: x86_64, ARMv8 (64bit), Power64
  Features:
  - Added support for InfiniBand DC and UD transports, including
    accelerated verbs for Mellanox devices
  - Full support for PGAS/SHMEM interfaces, blocking and non-blocking
    APIs
  - Support for MPI tag matching, both in software and offload mode
  - Zero copy protocols and rendezvous, registration cache
  - Handling transport errors
  - Flow control for DC/RC
  - Datatypes support: contiguous, IOV, generic
  - Multi-threading support
  - Support for ARMv8 64bit architecture
  - A new API for efficient memory polling
  - Support for malloc-hooks and memory registration caching

-------------------------------------------------------------------
Fri Jun 30 09:30:58 UTC 2017 - nmoreychaisemartin@suse.com

- Disable avx at configure level

-------------------------------------------------------------------
Wed Jun 28 16:46:31 UTC 2017 - nmoreychaisemartin@suse.com

- Add openucx-s390x-support.patch to fix compilation on s390x
- Compile openucx on s390x

-------------------------------------------------------------------
Thu Jun 8 12:12:59 UTC 2017 - nmoreychaisemartin@suse.com

- Fix compilation on ppc

-------------------------------------------------------------------
Fri May 26 08:29:51 UTC 2017 - jengelh@inai.de

- Update to snapshot 1.3+git44
  * No changelog was found
- Add -Wno-error and disable AVX/SSE as it is not guaranteed to exist.
-------------------------------------------------------------------
Sat Jun 18 07:36:59 UTC 2016 - jengelh@inai.de

- Update to snapshot 0~git1727
  * New: libucm. libucm is a standalone non-unloadable library which
    installs hooks for virtual memory changes in the current process.

-------------------------------------------------------------------
Sun Sep 13 18:35:15 UTC 2015 - jengelh@inai.de

- Update to snapshot 0~git862
  * License clarification on upstream's behalf

-------------------------------------------------------------------
Mon Jul 27 18:32:48 UTC 2015 - jengelh@inai.de

- Initial package for build.opensuse.org (version 0~git713)