-------------------------------------------------------------------
Tue Oct 4 16:39:30 UTC 2022 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update openucx-s390x-support.patch to add missing ucs_ffs32 on s390x
- Drop baselibs.conf as openucx only works on 64-bit systems

-------------------------------------------------------------------
Tue Sep 27 15:55:19 UTC 2022 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update to v1.13.1 (jsc#PED-912)
  - Core
    - Added new objects to VFS: local and remote address of endpoint,
      statistics of ucp_ep_create success/failure, failed/destroyed endpoints
    - Added support for UCX static libraries
    - Added profiling for rkey management routines
    - PCIe relaxed order enabled by default for AMD CPUs
    - Fixed not deallocating memory from ucp_mem_unmap if no rcache
    - Fixed versioning infrastructure
    - Multiple code improvements: refactoring, debug prints and assertions, etc.
    - Multiple improvements in build, test and docs infrastructure
    - Added new objects to VFS (md, component, log_level, etc.)
    - Added configuration variable to specify which loadable modules are allowed
    - Added build-time configuration to disable sigaction overriding
  - UCP
    - Added API to pass pre-registered memory handle to UCP operations
    - Added implementation of AM rendezvous protocol
    - Added 2-stage pipeline rendezvous protocol for GPU
    - Added support for fragment mem_type for v1 pipeline proto, disabled by default
    - Added active message support for proto v2
    - Added UCP memory registration cache
    - Improved adaptive progress - deactivate iface when all p2p lanes are destroyed
    - Added support for user memh in proto_v1
    - Added support for selecting local address when creating a client endpoint
    - Added option to limit GPUDirectRDMA size in rendezvous protocol, UCX_RNDV_MEMTYPE_DIRECT_SIZE
    - Deprecated UCX_SOCKADDR_AUX_TLS configuration parameter
    - Resolving remote EP ID when creating local EP disabled by default
    - Added client_id to ucp_worker_create() and ucp_conn_request_query() APIs
    - Added ucp_worker_address_query() API
    - Updated ucp_ep_query() API for getting local and remote addresses
    - Added address versioning to correctly preserve wire compatibility starting from version 1.11.0
    - Added new client/server connection establishment packet header format
    - Enabled rendezvous and tag sync protocols when error handling is enabled on the endpoint
    - Added iov zcopy support to RMA operations
    - Reduced memory usage of unexpected messages by fitting receive buffer size to packet size
    - Added support for modifying UCT and UCS configs by ucp_config_modify() API
    - Optimized unpacked rkeys memory consumption
    - Added request flag to influence latency vs. bandwidth protocol
    - Reduced memory management overhead with new protocols
    - Improved performance calculations for new protocols
    - Added AMO support with GPU memory target using new protocols
    - Added put_zcopy, get_zcopy and pipeline based rendezvous in new protocols
    - Added support for user-defined alignment in Active Messages
    - Added support for offload tag sync in new protocols
    - Updated ucp_atomic_post() to use NBX flow
  - UCT
    - Introduced API uct_md_mkey_pack_v2
    - Introduced UCT iface features API
    - Introduced max_inflight_eps parameter in perf_attr API
    - Introduced UCT_SEND_FLAG_PEER_CHECK flag that forces checking connectivity to a peer
    - Introduced UCX_RCACHE_PURGE_ON_FORK to enable/disable cleaning regions when application is forking
    - Disabled PEER_FAILURE capability for XPMEM
    - Added API - uct_iface_is_reachable_v2()
    - Added IPv6 address support in TCP
    - Added latency estimation to uct_iface_estimate_perf()
    - Adjusted knem and cma overhead cost
    - Increased built-in TCP keep-alive interval to 2 seconds
  - RDMA CORE (IB, ROCE, etc.)
    - Introduced NDR autorecognition
    - Introduced CQE zipping support
    - Set the default MAX_RD_ATOMIC to maximum value supported by the hardware
    - Disabled mlx5 ifaces on verbs MD
    - Added detection of IB NDR devices
    - Added check for CQ overrun in assert mode
    - Added bitmap usage for releasing detached DCIs
    - Added configuration for requests ack frequency with DevX
    - Added remote QP info to tx error CQE traces
  - ROCM
    - Increased maximum number of HSA agents
  - UCS
    - Added topo module infrastructure
    - Added memtrack and rcache information to VFS
    - Added API for a per-process aggregate-sum statistics report
    - Added memory pool set data structure
    - Added new ptr_array API for bulk allocation
    - Added ucs_string_buffer_append_flags() for string buffer
    - Added ucs_ffs32()
    - Added ucs_vsnprintf_safe() which always adds '\0'
    - Added thread-safe put to ptr_map
    - Improved accuracy of the topology distance estimation
    - Added prints of leaked callbacks from the callback queue
    - Removed a diagnostic message when fuse thread is stopped
    - Added configurable limit for the memory consumed by rcache
    - Added configuration for VFS(FUSE) thread affinity
    - Added memory limit support to memtrack
  - Packaging
    - Added cmake config files for better integration with external cmake based projects
  - Tools
    - Added loop-back transport support in ucx_perftest
    - Split ucx_perftest into separate modules
    - Added process placement option for ucx_info
    - Extended parameters correctness check in ucx_perftest
- Backported UCS-DEBUG-replace-PTR-with-void.patch
  from upstream to fix compilation

-------------------------------------------------------------------
Thu Jan 13 08:42:05 UTC 2022 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Fix UCM bistro support on non-s390x archs
- Add ucm-fix-UCX_MEM_MALLOC_RELOC.patch to disable malloc relocations by default (bsc#1194369)

-------------------------------------------------------------------
Thu Sep 23 07:35:57 UTC 2021 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update to v1.11.1 (jsc#SLE-19260)
  - Core:
    - Added support for UCX monitoring using virtual file system (VFS)/FUSE
    - Added support for applications with static CUDA runtime linking
    - Added support for a configuration file
    - Updated clang format configuration
  - UCP
    - Added rendezvous API for active messages
    - Added user-defined name to context, worker, and endpoint objects
    - Added flag to silence request leak check
    - Added API for endpoint performance evaluation
    - Added API - ucp_request_query
    - Added API - ucp_lib_query
    - Added bandwidth optimizations for new protocols multi-lane
    - Added support for multi-rail over lanes with BW ratio >= 1/4
    - Added support for tracking outstanding requests and aborting those in case of connection failure
    - Refactored keep-alive protocol
    - Added device id to wireup protocol
    - Added support for up to 128 transport layer resources in UCP context
    - Added support for CUDA memory allocations with ucp_mem_map
    - Increased UCP_WORKER_MAX_EP_CONFIG to 64
    - Adjusted memory type zcopy threshold when UCX_ZCOPY_THRESH set
    - Refactored wireup protocols, rendezvous, get, zcopy protocols
    - Added put zcopy multi-rail
    - Improved logging for new protocols
    - Added system topology information
    - Added new protocols for eager offload protocols
  - UCT
    - Extended connection establishment API
    - Added active message AM alignment in iface params
    - Added active message short IOV API
    - Added support for interface query by operation and memory type
    - Added API to get allocation base address and length
    - Added md_dereg_v2 API
  - UCS
    - Added log filter by source file name
    - Added checking for last element in fraglist queue
    - Added a method to get IP address from sockaddr
    - Added memory usage limits to registration cache
  - RDMA CORE (IB, ROCE, etc.)
    - Added report of QP info in case of completion with error
    - Refactored FC send operations
    - Added support for DevX unique QPN allocation
    - Optimized endpoint lookup for DCI
    - Added support for RDMA sub-function (SF)
    - Added support for DCI via DEVX
    - Added DCI pool per LAG port
    - Added support for RoCE IP reachability check using a subnet mask
    - Added active message short IOV for UD/DC/RC mlx, UD/RC verbs
    - Added endpoint keep alive check for UD
    - Suppressed warning if device can't be opened
    - Added support for multiple flush cancel without completion
    - Added ignore for devices with invalid GID
    - Added support for SRQ linked list reordering
    - Added flush by flow control on old devices
    - Added support for configurable rdma_resolve_addr/route timeout
  - Shared memory
    - Added active message short IOV support for posix, sysv, and self transports
  - TCP
    - Added support for peer failure in case of CONNECT_TO_EP
    - Added support for active message short IOV
  - See NEWS for a complete changelog and bug fixes
- Refresh openucx-s390x-support against latest sources

-------------------------------------------------------------------
Wed Feb 24 16:34:54 UTC 2021 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update openucx-s390x-support.patch to fix mmap syscall on s390x (bsc#1182691)

-------------------------------------------------------------------
Mon Oct 5 13:21:34 UTC 2020 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update to v1.9.0 (jsc#SLE-15163)
  - Features:
    - Added a new class of communication APIs '*_nbx' that enable API extendability while
      preserving ABI backward compatibility
    - Added asynchronous event support to UCT/IB/DEVX
    - Added support for latest CUDA library version
    - Added NAK-based reliability protocol for UCT/IB/UD to optimize resends
    - Added new tests for ROCm
    - Added new configuration parameters for protocol selection
    - Added performance optimization for Fujitsu A64FX with InfiniBand
    - Added performance optimization for clear cache code on aarch64
    - Added support for relaxed-order PCIe access in IB RDMA transports
    - Added new TCP connection manager
    - Added support for UCT/IB PKey with partial membership in IB transports
    - Added support for RoCE LAG
    - Added support for ROCm 3.7 and above
    - Added flow control for RDMA read operations
    - Improved endpoint flush implementation for UCT/IB
    - Improved UD timer to avoid interrupting the main thread when not in use
    - Improved latency estimation for network path with CUDA
    - Improved error reporting messages
    - Improved performance in active message flow (removed malloc call)
    - Improved performance in ptr_array flow
    - Improved performance in UCT/SM progress engine flow
    - Improved I/O demo code
    - Improved rendezvous protocol for CUDA
    - Updated examples code
  - Bugfixes:
    - Fixes for most recent versions of GCC, CLANG, ARMCLANG, PGI
    - Fixes in UCT/IB for strict order keys
    - Fixes in memory barrier code for aarch64
    - Fixes in UCT/IB/DEVX for fork system call
    - Fixes in UCT/IB for rand() call in rdma-core
    - Fixes in group rescheduling for UCT/IB/DC
    - Fixes in UCT/CUDA bandwidth reporting
    - Fixes in rkey_ptr protocol
    - Fixes in lane selection for rendezvous protocol based on get-zero-copy flow
    - Fixes for ROCm build
    - Fixes for XPMEM transport
    - Fixes in closing endpoint code
    - Fixes in RDMACM code
    - Fixes in memcpy selection for AMD
    - Fixes in UCT/UD endpoint flush functionality
    - Fixes in XPMEM detection
    - Fixes in rendezvous staging protocol
    - Fixes in ROCEv1 mlx5 UDP source port configuration
    - Multiple fixes in RPM spec file
    - Multiple fixes in UCP documentation
    - Multiple fixes in socket connection manager
    - Multiple fixes in gtest
    - Multiple fixes in JAVA API implementation
- Refresh openucx-s390x-support.patch against new version

-------------------------------------------------------------------
Mon Jul 13 08:19:45 UTC 2020 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update to v1.8.1
  - Features:
    - Added binary release pipeline in Azure CI
  - Bugfixes:
    - Multiple fixes in testing environment
    - Fixes in InfiniBand DEVX transport
    - Fixes in memory management for CUDA IPC transport
    - Fixes for binutils 2.34+
    - Fixes for AMD ROCM build environment

-------------------------------------------------------------------
Fri Jun 5 09:38:40 UTC 2020 - Jan Engelhardt <jengelh@inai.de>

- Trim bias and filler wording from descriptions.

-------------------------------------------------------------------
Thu Jun 4 08:18:26 UTC 2020 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update to v1.8.0
  - Features:
    - Improved detection for DEVX support
    - Improved TCP scalability
    - Added support for ROCM to perftest
    - Added support for different source and target memory types to perftest
    - Added optimized memcpy for ROCM devices
    - Added hardware tag-matching for CUDA buffers
    - Added support for CUDA and ROCM managed memories
    - Added support for client/server disconnect protocol over rdma connection manager
    - Added support for striding receive queue for hardware tag-matching
    - Added XPMEM-based rendezvous protocol for shared memory
    - Added support for shared memory communication between containers on the same machine
    - Added support for multi-threaded RDMA memory registration for large regions
    - Added new test cases to Azure CI
    - Added support for multiple listening transports
    - Added UCT socket-based connection manager transport
    - Updated API for UCT component management
    - Added API to retrieve the listening port
    - Added UCP active message API
    - Removed deprecated API for querying UCT memory domains
    - Refactored server/client examples
    - Added support for dlopen interception in UCM
    - Added support for PCIe atomics
    - Updated Java API: added support for most of UCP layer operations
    - Updated support for Mellanox DevX API
    - Added multiple UCT/TCP transport performance optimizations
    - Optimized memcpy() for Intel platforms
    - Added protection from non-UCX socket based app connections
    - Improved search time for PKEY object
    - Enabled gtest over IPv6 interfaces
    - Updated Mellanox and Bull device IDs
    - Added support for CUDA_VISIBLE_DEVICES
    - Increased limits for CUDA IPC registration
  - Bugfixes:
    - Multiple fixes in JUCX
    - Fixes in UCP thread safety
    - Fixes for most recent versions of GCC, PGI, and ICC
    - Fixes for CPU affinity on Azure instances
    - Fixes in XPMEM support on PPC64
    - Performance fixes in CUDA IPC
    - Fixes in RDMA CM flows
    - Multiple fixes in TCP transport
    - Multiple fixes in documentation
    - Fixes in transport lane selection logic
    - Fixes in Java jar build
    - Fixes in socket connection manager for Nvidia DGX-2 platform
    - Multiple fixes in UCP, UCT, UCM libraries
    - Multiple fixes for BSD and Mac OS systems
    - Fixes for Clang compiler
    - Fix CPU optimization configuration options
    - Fix JUCX build on GPU nodes
    - Fix in Azure release pipeline flow
    - Fix in CUDA memory hooks management
    - Fix in GPU memory peer direct gtest
    - Fix in TCP connection establishment flow
    - Fix in GPU IPC check
    - Fix in CUDA Jenkins test flow
    - Multiple fixes in CUDA IPC flow
    - Fix adding missing header files
    - Fix to prevent failures in presence of VPN enabled Ethernet interfaces
- Refresh openucx-s390x-support.patch against new version

-------------------------------------------------------------------
Fri Oct 4 08:11:49 UTC 2019 - Jan Engelhardt <jengelh@inai.de>

- Ensure /usr/lib/ucx is owned at all times.

-------------------------------------------------------------------
Wed Sep 18 10:16:05 UTC 2019 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Update to v1.6.0
  - Features:
    - Modular architecture for UCT transports
    - ROCm transport re-design: support for managed memory, direct copy, ROCm GDR
    - Random scheduling policy for DC transport
    - Optimized out-of-box settings for multi-rail
    - Added support for OmniPath (using Verbs)
    - Support for PCI atomics with IB transports
    - Reduced UCP address size for homogeneous environments
  - Bugfixes:
    - Multiple stability and performance improvements in TCP transport
    - Multiple stability fixes in Verbs and MLX5 transports
    - Multiple stability fixes in UCM memory hooks
    - Multiple stability fixes in UGNI transport
    - RPM Spec file cleanup
    - Fixing compilation issues with most recent clang and gcc compilers
    - Fixing the wrong name of aliases
    - Fix data race in UCP wireup
    - Fix segfault when libuct.so is reloaded - issue #3558
    - Include Java sources in distribution
    - Handle EADDRNOTAVAIL in rdma_cm connection manager
    - Disable ibcm on RHEL7+ by default
    - Fix data race in UCP proxy endpoint
    - Static checker fixes
    - Fallback to ibv_create_cq() if ibv_create_cq_ex() returns ENOSYS
    - Fix malloc hooks test
    - Fix checking return status in ucp_client_server example
    - Fix gdrcopy libdir config value
    - Fix printing atomic capabilities in ucx_info
    - Fix perftest warmup iterations to be non-zero
    - Fixing default values for configure logic
    - Fix race condition updating fired_events from multiple threads
    - Fix madvise() hook
- Refresh openucx-s390x-support.patch against new version

-------------------------------------------------------------------
Wed May 15 05:52:55 UTC 2019 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>

- Disable Werror to handle boo#1121267

-------------------------------------------------------------------
Mon Feb 25 07:56:39 UTC 2019 - nmorey <nmoreychaisemartin@suse.com>

- Update openucx-s390x-support.patch to fix support of 1.5.0 on s390x (bsc#1121267)
- Add baselibs.conf for ppc

-------------------------------------------------------------------
Fri Feb 22 12:11:57 UTC 2019 - Martin Liška <mliska@suse.cz>

- Update to v1.5.0 (bsc#1121267)
  * Features:

    * New emulation mode enabling full UCX functionality (Atomic, Put, Get)
      over TCP and RDMA-CORE interconnects which don't implement full RDMA semantics
    * Non-blocking API for all one-sided operations. All blocking communication APIs marked
      as deprecated
    * New client/server connection establishment API, which allows connected handover between workers
    * Support for rdma-core direct-verbs (DEVX) and DC with mlx5 transports
    * GPU - Support for stream API and receive side pipelining
    * Malloc hooks using binary instrumentation instead of symbol override
    * Statistics for UCT tag API
    * GPU-to-Infiniband HCA affinity support based on locality/distance (PCIe)
  * Bugfixes:

    * Fix overflow in RC/DC flush operations
    * Update description in SPEC file and README
    * Fix RoCE source port for dc_mlx5 flow control
    * Improve ucx_info help message
    * Fix segfault in UCP, due to int truncation in count_one_bits()
    * Multiple other bugfixes (full list on github)
  * Tested configurations:

    * InfiniBand: MLNX_OFED 4.4-4.5, distribution inbox drivers, rdma-core
    * CUDA: gdrcopy 1.2, cuda 9.1.85
    * XPMEM: 2.6.2
    * KNEM: 1.1.2

-------------------------------------------------------------------
Tue Nov 6 07:18:34 UTC 2018 - nmoreychaisemartin@suse.com

- Update to v1.4.0 (bsc#1103494)
  * Features:
    * Improved support for installation with latest ROCm
    * Improved support for latest rdma-core
    * Added support for CUDA IPC for intra-node GPU, CUDA memory
      allocation cache for mem-type detection, latest Mellanox
      devices, Nvidia GPU managed memory, multiple connections
      between the same pair of workers, large worker address for
      client/server connection establishment and INADDR_ANY, and
      for bitwise atomics operations.
  * Bugfixes:
    * Performance fixes for rendezvous protocol
    * Memory hook fixes
    * Clang support fixes
    * Self tl multi-rail fix
    * Thread safety fixes in IB/RDMA transport
    * Compilation fixes with upstream rdma-core
    * Multiple minor bugfixes (full list on github)
    * Segfault fix for a code generated by armclang compiler
    * UCP memory-domain index fix for zero-copy active messages

-------------------------------------------------------------------
Mon Oct 15 07:51:12 UTC 2018 - nmoreychaisemartin@suse.com

- Update to v1.3.1 (fate#325996)
  - Prevent potential out-of-order sending in shared memory active messages
  - CUDA: Include cudamem.h in source tarball, pass cudaFree memory size
  - Registration cache: fix large range lookup, handle shmat(REMAP)/mmap(FIXED)
  - Limit IB CQE size for specific ARM boards

-------------------------------------------------------------------
Thu Aug 9 05:57:24 UTC 2018 - nmoreychaisemartin@suse.com

- Update to v1.3.0 (bsc#1104159)
  - Added stream-based communication API to UCP
  - Added support for GPU platforms: Nvidia CUDA and AMD ROCM software stacks
  - Added API for client/server based connection establishment
  - Added support for TCP transport
  - Support for InfiniBand tag-matching offload for DC and accelerated transports
  - Multi-rail support for eager and rendezvous protocols
  - Added support for tag-matching communications with CUDA buffers
  - Added ucp_rkey_ptr() to obtain pointer for shared memory region
  - Avoid progress overhead on unused transports
  - Improved scalability of software tag-matching by using a hash table
  - Added transparent huge-pages allocator
  - Added non-blocking flush and disconnect for UCP
  - Support fixed-address memory allocation via ucp_mem_map()
  - Added ucp_tag_send_nbr() API to avoid send request allocation
  - Support global addressing in all IB transports
  - Add support for external epoll fd and edge-triggered events
  - Added registration cache for knem
  - Initial support for Java bindings
  - Multiple bugfixes (full list on github)
- Drop UCT-UD-fixed-compilation-by-gcc8.patch as it was fixed upstream
- Refresh openucx-s390x-support.patch against latest sources

-------------------------------------------------------------------
Wed Jun 13 12:45:34 UTC 2018 - nmoreychaisemartin@suse.com

- Remove libnuma-devel on s390x for older releases

-------------------------------------------------------------------
Tue Mar 27 07:12:37 UTC 2018 - nmoreychaisemartin@suse.com

- Add UCT-UD-fixed-compilation-by-gcc8.patch to fix compilation
  with GCC8 (bsc#1084635)

-------------------------------------------------------------------
Sat Jan 20 15:40:43 UTC 2018 - jengelh@inai.de

- Use right documentation path.

2018-01-19 17:08:27 +01:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Fri Jan 19 10:12:04 UTC 2018 - nmoreychaisemartin@suse.com
|
|
|
|
|
|
|
|
- Update to 1.2.2
|
|
|
|
- Support including UCX API headers from C++ code
|
|
|
|
- UD transport to handle unicast flood on RoCE fabric
|
|
|
|
- Compilation fixes for gcc 7.1.1, clang 3.6, clang 5
|
|
|
|
- When UD transport is used with RoCE, packets intended for other peers may
|
|
|
|
arrive on different adapters (as a result of unicast flooding).
|
|
|
|
- This change adds packet filtering based on destination GIDs. Now the packet
|
|
|
|
is silently dropped if its destination GID does not match the local GID.
|
|
|
|
- Added a new device ID for InfiniBand HCA
|
|
|
|
|
2017-12-10 23:25:22 +01:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Fri Dec 8 21:19:11 UTC 2017 - dimstar@opensuse.org
|
|
|
|
|
|
|
|
- Drop doxygen BuildRequires: The documentation was already not
|
|
|
|
built with this enabled. Removing the BR causes no regression in
|
|
|
|
the package but eliminates a build cycle
|
|
|
|
boost -> curl -> doxygen -> openucx -> boost
|
|
|
|
|
2017-09-19 15:53:14 +02:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Tue Sep 19 13:52:13 UTC 2017 - jengelh@inai.de
|
|
|
|
|
|
|
|
- Rediff openucx-s390x-support.patch as p1 to be in line with
|
|
|
|
potential git-generated patches.
|
|
|
|
|
Accepting request 527297 from home:NMoreyChaisemartin:branches:science:HPC
- Switch to version 1.2.1
Previous 1.3+ version was based on a development branch.
Supported platforms
- Shared memory: KNEM, CMA, XPMEM, SYSV, Posix
- VERBs over InfiniBand and RoCE.
VERBS over other RDMA interconnects (iWarp, OmniPath, etc.) is available
for community evaluation and has not been tested in the context of this release
- Cray Gemini and Aries
- Architectures: x86_64, ARMv8 (64bit), Power64
Features:
- Added support for InfiniBand DC and UD transports, including accelerated verbs for Mellanox devices
- Full support for PGAS/SHMEM interfaces, blocking and non-blocking APIs
- Support for MPI tag matching, both in software and offload mode
- Zero copy protocols and rendezvous, registration cache
- Handling transport errors
- Flow control for DC/RC
- Datatypes support: contiguous, IOV, generic
- Multi-threading support
- Support for ARMv8 64bit architecture
- A new API for efficient memory polling
- Support for malloc-hooks and memory registration caching
OBS-URL: https://build.opensuse.org/request/show/527297
OBS-URL: https://build.opensuse.org/package/show/science:HPC/openucx?expand=0&rev=7
2017-09-19 15:27:22 +02:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Tue Sep 19 09:26:07 UTC 2017 - nmoreychaisemartin@suse.com
|
|
|
|
|
2017-09-19 15:28:56 +02:00
|
|
|
- Switch to version 1.2.1 (Fate#324050)
|
|
|
|
Previous 1.3+ version was based on a development branch.
|
|
|
|
|
|
|
|
Supported platforms
|
|
|
|
- Shared memory: KNEM, CMA, XPMEM, SYSV, Posix
|
|
|
|
- VERBs over InfiniBand and RoCE.
|
|
|
|
VERBS over other RDMA interconnects (iWarp, OmniPath, etc.) is available
|
|
|
|
for community evaluation and has not been tested in the context of this release
|
|
|
|
- Cray Gemini and Aries
|
|
|
|
- Architectures: x86_64, ARMv8 (64bit), Power64
|
|
|
|
Features:
|
|
|
|
- Added support for InfiniBand DC and UD transports, including accelerated verbs for Mellanox devices
|
|
|
|
- Full support for PGAS/SHMEM interfaces, blocking and non-blocking APIs
|
|
|
|
- Support for MPI tag matching, both in software and offload mode
|
|
|
|
- Zero copy protocols and rendezvous, registration cache
|
|
|
|
- Handling transport errors
|
|
|
|
- Flow control for DC/RC
|
|
|
|
- Datatypes support: contiguous, IOV, generic
|
|
|
|
- Multi-threading support
|
|
|
|
- Support for ARMv8 64bit architecture
|
|
|
|
- A new API for efficient memory polling
|
|
|
|
- Support for malloc-hooks and memory registration caching
|
|
|
|
|
2017-07-12 19:33:54 +02:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Fri Jun 30 09:30:58 UTC 2017 - nmoreychaisemartin@suse.com
|
|
|
|
|
|
|
|
- Disable avx at configure level
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
Wed Jun 28 16:46:31 UTC 2017 - nmoreychaisemartin@suse.com
|
|
|
|
|
|
|
|
- Add openucx-s390x-support.patch to fix compilation on s390x
|
|
|
|
- Compile openucx on s390x
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
Thu Jun 8 12:12:59 UTC 2017 - nmoreychaisemartin@suse.com
|
|
|
|
|
|
|
|
- Fix compilation on ppc
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
Fri May 26 08:29:51 UTC 2017 - jengelh@inai.de
|
|
|
|
|
|
|
|
- Update to snapshot 1.3+git44
|
|
|
|
* No changelog was found
|
|
|
|
- Add -Wno-error and disable AVX/SSE as they are not guaranteed
|
|
|
|
to exist.
|
|
|
|
|
2016-06-19 10:50:43 +02:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Sat Jun 18 07:36:59 UTC 2016 - jengelh@inai.de
|
|
|
|
|
|
|
|
- Update to snapshot 0~git1727
|
|
|
|
* New: libucm. libucm is a standalone non-unloadable library which
|
|
|
|
installs hooks for virtual memory changes in the current process.
|
|
|
|
|
2015-10-08 08:24:03 +02:00
|
|
|
-------------------------------------------------------------------
|
|
|
|
Sun Sep 13 18:35:15 UTC 2015 - jengelh@inai.de
|
|
|
|
|
|
|
|
- Update to snapshot 0~git862
|
|
|
|
* License clarification on upstream's behalf
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
Mon Jul 27 18:32:48 UTC 2015 - jengelh@inai.de
|
|
|
|
|
|
|
|
- Initial package for build.opensuse.org (version 0~git713)
|