From add54a60b7ec13ca0289f15e5095e40fb225a10a1846ae4a979e11259d12d460 Mon Sep 17 00:00:00 2001
From: Nicolas Morey
Date: Tue, 18 Apr 2023 20:47:57 +0000
Subject: [PATCH] Accepting request 1080188 from home:NMorey:branches:science:HPC

- Update to 1.18.0
  - Core
    - rocr: fix offset calculation
    - rocr: use ofi spinlock functions
    - rocr: minor fixes
    - neuron: convert warn to info for nrt_get_dmabuf_fd not found
    - neuron: check existance of neuron devices during initialization
    - neuron: Add support for neuron dma-buf
    - ze: update ZE to support new driver index specification
    - List variables read from config file
    - Add switch to prefer system-config over environment
    - Add basic system-config support for setting library variables
    - Move peer provider defines into new header
    - rocr: Support asynchronous memory copies
    - rocr: Add support for ROCR IPC
    - rocr: rename rocr data-structures
    - synpaseai: return 0 for host_register and host_deregister
    - fabric: Improve log level of provider mismatch
    - cuda: Allow CUDA IPC when P2P disabled
    - ze: add ZE command list pool to reuse command lists
    - cuda: implement cuda_get_xfer_setting for non cuda build
    - cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
    - cuda.c: Add const to param to remove warning
    - Add IFF_RUNNING check to indicate iface is up and running
    - io_uring support enhancements
  - EFA
    - Implement CUDA support on instance types that do not support GPUDirect RDMA
    - Implement fi_write using device's RDMA write capability
    - Enrich error messages with debug and connection info
    - Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt

OBS-URL: https://build.opensuse.org/request/show/1080188
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=85
---
 _service                                |   2 +-
 fabtests.changes                        | 214 ++++++++++++++++++++++++
 fabtests.spec                           |   4 +-
 libfabric-1.17.1.0.1528ac2d6a1b.tar.bz2 |   3 -
 libfabric-1.18.0.0.8f3a881e08f5.tar.bz2 |   3 +
 libfabric.changes                       | 214 ++++++++++++++++++++++++
 libfabric.spec                          |   5 +-
 7 files changed, 437 insertions(+), 8 deletions(-)
 delete mode 100644 libfabric-1.17.1.0.1528ac2d6a1b.tar.bz2
 create mode 100644 libfabric-1.18.0.0.8f3a881e08f5.tar.bz2

diff --git a/_service b/_service
index 243d8ad..072e961 100644
--- a/_service
+++ b/_service
@@ -8,7 +8,7 @@
 @PARENT_TAG@.@TAG_OFFSET@.%h
 v(.*)
 \1
- 1528ac2d6a1b94d51a677ca7e2422683551c24dc
+ 8f3a881e08f56ee685d416436fc87fb6b00af332
 libfabric*.tar
diff --git a/fabtests.changes b/fabtests.changes
index 2e5b9de..b13cdf7 100644
--- a/fabtests.changes
+++ b/fabtests.changes
@@ -1,3 +1,217 @@
+-------------------------------------------------------------------
+Tue Apr 18 17:25:02 UTC 2023 - Nicolas Morey
+
+- Update to 1.18.0
+  - Core
+    - rocr: fix offset calculation
+    - rocr: use ofi spinlock functions
+    - rocr: minor fixes
+    - neuron: convert warn to info for nrt_get_dmabuf_fd not found
+    - neuron: check existance of neuron devices during initialization
+    - neuron: Add support for neuron dma-buf
+    - ze: update ZE to support new driver index specification
+    - List variables read from config file
+    - Add switch to prefer system-config over environment
+    - Add basic system-config support for setting library variables
+    - Move peer provider defines into new header
+    - rocr: Support asynchronous memory copies
+    - rocr: Add support for ROCR IPC
+    - rocr: rename rocr data-structures
+    - synpaseai: return 0 for host_register and host_deregister
+    - fabric: Improve log level of provider mismatch
+    - cuda: Allow CUDA IPC when P2P disabled
+    - ze: add ZE command list pool to reuse command lists
+    - cuda: implement cuda_get_xfer_setting for non cuda build
+    - cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
+    - cuda.c: Add const to param to remove warning
+    - Add IFF_RUNNING check to indicate iface is up and running
+    - io_uring support enhancements
+  - EFA
+    - Implement CUDA support on instance types that do not support GPUDirect RDMA
+    - Implement fi_write using device's RDMA write capability
+    - Enrich error messages with debug and connection info
+    - Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt
+    - Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt
+    - Add support for neuron dma-buf
+    - Use gdrcopy to improve the intra-node CUDA communication performance for small messages
+    - Use shm provider's FI_AV_USER_ID support
+    - Fix bugs in efa provider’s shm info initialization procedure
+  - Hooks
+    - dmabuf_peer_mem: Handle IPC handle caching in L0
+    - trace: Add trace log for CM operation APIs
+    - trace: Change tag in trace log to hex format
+    - trace: Enhance trace log for data transfer API calls
+    - trace: Add trace log for API fi_cq_readerr()
+    - trace: Add trace log for CQ operation APIs
+    - Add tracing hook provider
+  - Net
+    - Net provider optimizations have been integrated into the tcp provider.
+    - Net provider has been removed as a reported provider.
+  - OPX
+    - Fixes for Coverity scan issues
+    - Enhanced tag matching
+    - Tune expected recv for unaligned buffers
+    - Add finer logging granularity
+    - Reduce RTS immediate data and fix packet estimate for odd TID lengths
+    - Add additional sources for FI_OPX_UUID
+    - Exclude opx from build if missing needed defines
+    - Move some logs to optimized builds
+    - Fix build warnings for unused return code from posix_memalign
+    - Add reliability sanity check to detect when send buffer is illegally altered
+    - SDMA Completion workaround for driver cache invalidation race condition
+    - Fix replay payload pointer increment
+    - Handle completion counter across multiple writes in SDMA
+    - Cleanup pointers after free()
+    - Modify domain creation to handle soft cache errors
+    - Two biband performance improvements
+    - Fixes based on Coverity Scan related to auto progress patch
+    - Changed poll many argument to rx_caps instead of caps
+    - Resync with server configured for Multi-Engines (DAOS CART Self Tests)
+    - Remove import_monitor as ENOSYS case
+    - Address memory leaks reported on OFIWG issues page
+    - General code cleanup
+    - Add replays over SDMA
+    - Implement basic TID Cache
+    - Revert work_pending check change
+    - Fix use_immediate_blocks
+    - Restore state after replay packet is NULL
+    - Fix memory leak from early arrival packets
+    - Fix segfault in SHM operations from uninitialized value in atomic path
+    - Prevent SDMA work entries from being reused with outstanding replays
+    - Set runtime as default for OPX_AV
+    - Fix RTS replay immediate data
+    - Fix errors caught by the upstream libfabric Coverity Scan
+    - fi_getInfo - Support multiple HFI devices
+    - Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury
+    - Add fi_opx_tid.h to Makefile.include
+    - Fix progress checks and default domain
+    - Revert is_intranode simplification.
+    - Don't inline handle_ud_ping function
+    - Allow atomic fetch ops to use SDMA for sufficiently large counts
+    - Cleaned up FI_LOG_LEVEL=warn output
+    - Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM
+    - Reset default progress to FI_PROGRESS_MANUAL
+    - Fixed GCC 10 build error with Auto Progress
+    - Add support for FI_PROGRESS_AUTO
+    - Use max allowed packet size in SDMA path when expected TID is off
+    - Expected receive (TID) rendezvous
+    - RMA Read/Write operations over SDMA
+    - Remove origin_rs from cts and dput packet header
+    - Fix for hang in DAOS CART tests
+    - Use single IOV for bounce buffer in SDMA requests.
+    - Check for FI_MULTI_RECV with bitwise OR instead of AND
+    - Fix for intermittent intra-node deadlock hang (DAOS CART tests)
+    - Fix to RPC transport error failure (DAOS CART tests)
+    - Fix for context->buf set to NULL
+    - Fix bad asserts
+    - Ensure atomicity of atomic ops
+    - fi_opx_cq_poll_inline count and head check fix
+    - Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests)
+  - PSM3
+    - Update provider to sync with IEFS 11.4.1.1.2
+    - Fix warnings from build
+    - Add oneapi ZE support to OFI configure
+  - RXD
+    - Ignore error path in av_close return
+  - RXM
+    - Handle NULL av in rxm_freeall_conns()
+    - Implement the FI_OPT_CUDA_API_PERMITTED option
+    - Write "len" field for remote write
+    - Ignore error path domain_close return
+    - Free coll_pool on ep close
+    - Update rxm to use util_cq FI_PEER support functions
+    - Fix incorrect CQ completion field
+    - Rename srx to msg_srx
+    - Disable FI_SOURCE if not requested
+    - Memory leaks removed
+    - Set offload_coll_mask based on actual configuration
+    - Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY
+    - Fabric setups collective offload fabric
+    - Create eq for collective offload provider
+    - Close collective providers ep when rxm_ep is closed
+    - Fix incorrect use of OFI_UNUSED()
+    - Rework collective support to use collective provider(s)
+  - SHM
+    - Fix potential deadlock in smr_generic_rma()
+    - smr_generic_rma() wwrite error completion with positive errno
+    - Update SHM to use ROCR
+    - Fix incorrect discard call when cleaning up unexpected queues
+    - Separate smr_generic_msg into msg and tagged recv
+    - Fix start_msg call
+    - Implement the FI_OPT_CUDA_API_PERMITTED option
+    - Assert not valid atomic op
+    - Fix a bug in smr_av_insert
+    - Optimize locking on the SAR path
+    - Remove unneeded sar_cnt
+    - Optimize locking
+    - Enable multiple GPU/interface support
+    - Remove HMEM specific calls from atomic path
+    - Use util_cq FI_PEER support
+    - Import shm as device host memory
+    - Add HMEM flag to smr region
+    - Fix user_id support
+    - Write tx err comp to correct cq
+    - Fix index when setting FI_ADDR_USER_ID
+  - TCP
+    - Provider source has been replaced by net provider source
+    - Removed incorrect reporting of support for FI_ATOMIC
+    - Do not save unmatched messages until we have the peer's fi_addr
+    - Use internal flag for FI_CLAIM messages, versus a reserved tag bit
+    - Fix updating error counter when discarding saved messages
+    - Allow saved messages to be received after the underlying ep has been closed
+    - Enhanced debug logging in connection path
+    - Force CM progress on unconnected ep's when posting data transfers
+    - Support connect and accept calls with io_uring
+    - Fix segfault accessing an invalid fi_addr
+    - Add io_uring support for CM message exchange
+    - Move CM progress from fabric to EQ to improve multi-threaded performance
+    - Fix small memory leak destroying an EQ
+    - Fix race where same rx entry could be freed twice
+    - Handle NULL av in rdm ep cleanup
+    - Reduce stack use for epoll event array
+  - UCX
+    - New provider targeting Nvidia fabrics that layers over libucp
+  - Util
+    - Fix the behavior of cq_read for FI_PEER
+    - rocr: Fix compilation issue
+    - cuda: Use correct debug string calls
+    - Free cq->peer_cq on close
+    - Remove extra new line from av insert log
+    - Check for count = 0 in ofi_ip_av_insert
+    - rocr: Add support for ROCR IPC
+    - Add FI_PEER support to util_cq
+    - Disable FI_SOURCE if not requested
+    - Remove FID events from the EQ when closing endpoint
+    - Rework collective support to be a peer collective provider(s)
+    - Allow FI_PEER to pass CQ, EQ and AV attr checking
+    - Remove annoying WARNING message for FI_AFFINITY
+    - Add utility collective provider
+  - Verbs
+    - Implement the FI_OPT_CUDA_API_PERMITTED option
+    - Add support for ROCR IPC
+  - Fabtests
+    - Add fi_setopt_test unit test
+    - Update ze device registration calls
+    - fi-rdmabw-xe: Always use host buffer for synchronization
+    - Fix bug in posting RMA operation
+    - fi_cq_data: Extend test to fi_writedata
+    - fi_cq_data: Extend validation of completion data
+    - Rename fi_msg_inject tests to fi_inject_test to reflect its use
+    - fi_rdm_stress: Add count option to json key/pair options
+    - Add and fix OOB option handling in several tests
+    - fi_eq_test: Fix incorrect return value
+    - fi_rdm_multi_client: Increase the size of ep name buffer
+    - Add FI_MR_RAW to default mr_mode
+    - Support larger control messages needed by newer providers
+    - fi-rdmabw-xe: Update to work with the ucx provider
+    - fi_ubertest: Cleanup allocations in failure cases
+    - Change ft_reg_mr to not assume hmem iface & device
+    - fi_multinode: Bugfix multinode test for ze + verbs
+    - fi_multinode: Remove unused validation print
+    - fi_multinode: Skip tests for unsupported collective operations
+    - fi_ubertest: Fix data validation with device memory
+    - fi_peek_tagged: Restructure and expand test
+
 -------------------------------------------------------------------
 Mon Mar 20 09:03:29 UTC 2023 - Nicolas Morey
diff --git a/fabtests.spec b/fabtests.spec
index 69bea5f..7340279 100644
--- a/fabtests.spec
+++ b/fabtests.spec
@@ -16,10 +16,10 @@
 #
-%define git_ver .0.1528ac2d6a1b
+%define git_ver .0.8f3a881e08f5
 Name: fabtests
-Version: 1.17.1
+Version: 1.18.0
 Release: 0
 Summary: Test suite for libfabric API
 License: BSD-2-Clause OR GPL-2.0-only
diff --git a/libfabric-1.17.1.0.1528ac2d6a1b.tar.bz2 b/libfabric-1.17.1.0.1528ac2d6a1b.tar.bz2
deleted file mode 100644
index 11951e6..0000000
--- a/libfabric-1.17.1.0.1528ac2d6a1b.tar.bz2
+++ /dev/null
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:170fcbbf7075ab6d167ae1b3da115cb19029dfa962d4609782ea40f7ce5a9fd1
-size 3036923
diff --git a/libfabric-1.18.0.0.8f3a881e08f5.tar.bz2 b/libfabric-1.18.0.0.8f3a881e08f5.tar.bz2
new file mode 100644
index 0000000..1726c56
--- /dev/null
+++ b/libfabric-1.18.0.0.8f3a881e08f5.tar.bz2
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6b162231a3e29105f7a4ae61e2764dc2d4c7fbf0bc3057690a089a2b08b872c7
+size 3244939
diff --git a/libfabric.changes b/libfabric.changes
index 2e5b9de..b13cdf7 100644
--- a/libfabric.changes
+++ b/libfabric.changes
@@ -1,3 +1,217 @@
+-------------------------------------------------------------------
+Tue Apr 18 17:25:02 UTC 2023 - Nicolas Morey
+
+- Update to 1.18.0
+  - Core
+    - rocr: fix offset calculation
+    - rocr: use ofi spinlock functions
+    - rocr: minor fixes
+    - neuron: convert warn to info for nrt_get_dmabuf_fd not found
+    - neuron: check existance of neuron devices during initialization
+    - neuron: Add support for neuron dma-buf
+    - ze: update ZE to support new driver index specification
+    - List variables read from config file
+    - Add switch to prefer system-config over environment
+    - Add basic system-config support for setting library variables
+    - Move peer provider defines into new header
+    - rocr: Support asynchronous memory copies
+    - rocr: Add support for ROCR IPC
+    - rocr: rename rocr data-structures
+    - synpaseai: return 0 for host_register and host_deregister
+    - fabric: Improve log level of provider mismatch
+    - cuda: Allow CUDA IPC when P2P disabled
+    - ze: add ZE command list pool to reuse command lists
+    - cuda: implement cuda_get_xfer_setting for non cuda build
+    - cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
+    - cuda.c: Add const to param to remove warning
+    - Add IFF_RUNNING check to indicate iface is up and running
+    - io_uring support enhancements
+  - EFA
+    - Implement CUDA support on instance types that do not support GPUDirect RDMA
+    - Implement fi_write using device's RDMA write capability
+    - Enrich error messages with debug and connection info
+    - Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt
+    - Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt
+    - Add support for neuron dma-buf
+    - Use gdrcopy to improve the intra-node CUDA communication performance for small messages
+    - Use shm provider's FI_AV_USER_ID support
+    - Fix bugs in efa provider’s shm info initialization procedure
+  - Hooks
+    - dmabuf_peer_mem: Handle IPC handle caching in L0
+    - trace: Add trace log for CM operation APIs
+    - trace: Change tag in trace log to hex format
+    - trace: Enhance trace log for data transfer API calls
+    - trace: Add trace log for API fi_cq_readerr()
+    - trace: Add trace log for CQ operation APIs
+    - Add tracing hook provider
+  - Net
+    - Net provider optimizations have been integrated into the tcp provider.
+    - Net provider has been removed as a reported provider.
+  - OPX
+    - Fixes for Coverity scan issues
+    - Enhanced tag matching
+    - Tune expected recv for unaligned buffers
+    - Add finer logging granularity
+    - Reduce RTS immediate data and fix packet estimate for odd TID lengths
+    - Add additional sources for FI_OPX_UUID
+    - Exclude opx from build if missing needed defines
+    - Move some logs to optimized builds
+    - Fix build warnings for unused return code from posix_memalign
+    - Add reliability sanity check to detect when send buffer is illegally altered
+    - SDMA Completion workaround for driver cache invalidation race condition
+    - Fix replay payload pointer increment
+    - Handle completion counter across multiple writes in SDMA
+    - Cleanup pointers after free()
+    - Modify domain creation to handle soft cache errors
+    - Two biband performance improvements
+    - Fixes based on Coverity Scan related to auto progress patch
+    - Changed poll many argument to rx_caps instead of caps
+    - Resync with server configured for Multi-Engines (DAOS CART Self Tests)
+    - Remove import_monitor as ENOSYS case
+    - Address memory leaks reported on OFIWG issues page
+    - General code cleanup
+    - Add replays over SDMA
+    - Implement basic TID Cache
+    - Revert work_pending check change
+    - Fix use_immediate_blocks
+    - Restore state after replay packet is NULL
+    - Fix memory leak from early arrival packets
+    - Fix segfault in SHM operations from uninitialized value in atomic path
+    - Prevent SDMA work entries from being reused with outstanding replays
+    - Set runtime as default for OPX_AV
+    - Fix RTS replay immediate data
+    - Fix errors caught by the upstream libfabric Coverity Scan
+    - fi_getInfo - Support multiple HFI devices
+    - Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury
+    - Add fi_opx_tid.h to Makefile.include
+    - Fix progress checks and default domain
+    - Revert is_intranode simplification.
+    - Don't inline handle_ud_ping function
+    - Allow atomic fetch ops to use SDMA for sufficiently large counts
+    - Cleaned up FI_LOG_LEVEL=warn output
+    - Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM
+    - Reset default progress to FI_PROGRESS_MANUAL
+    - Fixed GCC 10 build error with Auto Progress
+    - Add support for FI_PROGRESS_AUTO
+    - Use max allowed packet size in SDMA path when expected TID is off
+    - Expected receive (TID) rendezvous
+    - RMA Read/Write operations over SDMA
+    - Remove origin_rs from cts and dput packet header
+    - Fix for hang in DAOS CART tests
+    - Use single IOV for bounce buffer in SDMA requests.
+    - Check for FI_MULTI_RECV with bitwise OR instead of AND
+    - Fix for intermittent intra-node deadlock hang (DAOS CART tests)
+    - Fix to RPC transport error failure (DAOS CART tests)
+    - Fix for context->buf set to NULL
+    - Fix bad asserts
+    - Ensure atomicity of atomic ops
+    - fi_opx_cq_poll_inline count and head check fix
+    - Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests)
+  - PSM3
+    - Update provider to sync with IEFS 11.4.1.1.2
+    - Fix warnings from build
+    - Add oneapi ZE support to OFI configure
+  - RXD
+    - Ignore error path in av_close return
+  - RXM
+    - Handle NULL av in rxm_freeall_conns()
+    - Implement the FI_OPT_CUDA_API_PERMITTED option
+    - Write "len" field for remote write
+    - Ignore error path domain_close return
+    - Free coll_pool on ep close
+    - Update rxm to use util_cq FI_PEER support functions
+    - Fix incorrect CQ completion field
+    - Rename srx to msg_srx
+    - Disable FI_SOURCE if not requested
+    - Memory leaks removed
+    - Set offload_coll_mask based on actual configuration
+    - Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY
+    - Fabric setups collective offload fabric
+    - Create eq for collective offload provider
+    - Close collective providers ep when rxm_ep is closed
+    - Fix incorrect use of OFI_UNUSED()
+    - Rework collective support to use collective provider(s)
+  - SHM
+    - Fix potential deadlock in smr_generic_rma()
+    - smr_generic_rma() wwrite error completion with positive errno
+    - Update SHM to use ROCR
+    - Fix incorrect discard call when cleaning up unexpected queues
+    - Separate smr_generic_msg into msg and tagged recv
+    - Fix start_msg call
+    - Implement the FI_OPT_CUDA_API_PERMITTED option
+    - Assert not valid atomic op
+    - Fix a bug in smr_av_insert
+    - Optimize locking on the SAR path
+    - Remove unneeded sar_cnt
+    - Optimize locking
+    - Enable multiple GPU/interface support
+    - Remove HMEM specific calls from atomic path
+    - Use util_cq FI_PEER support
+    - Import shm as device host memory
+    - Add HMEM flag to smr region
+    - Fix user_id support
+    - Write tx err comp to correct cq
+    - Fix index when setting FI_ADDR_USER_ID
+  - TCP
+    - Provider source has been replaced by net provider source
+    - Removed incorrect reporting of support for FI_ATOMIC
+    - Do not save unmatched messages until we have the peer's fi_addr
+    - Use internal flag for FI_CLAIM messages, versus a reserved tag bit
+    - Fix updating error counter when discarding saved messages
+    - Allow saved messages to be received after the underlying ep has been closed
+    - Enhanced debug logging in connection path
+    - Force CM progress on unconnected ep's when posting data transfers
+    - Support connect and accept calls with io_uring
+    - Fix segfault accessing an invalid fi_addr
+    - Add io_uring support for CM message exchange
+    - Move CM progress from fabric to EQ to improve multi-threaded performance
+    - Fix small memory leak destroying an EQ
+    - Fix race where same rx entry could be freed twice
+    - Handle NULL av in rdm ep cleanup
+    - Reduce stack use for epoll event array
+  - UCX
+    - New provider targeting Nvidia fabrics that layers over libucp
+  - Util
+    - Fix the behavior of cq_read for FI_PEER
+    - rocr: Fix compilation issue
+    - cuda: Use correct debug string calls
+    - Free cq->peer_cq on close
+    - Remove extra new line from av insert log
+    - Check for count = 0 in ofi_ip_av_insert
+    - rocr: Add support for ROCR IPC
+    - Add FI_PEER support to util_cq
+    - Disable FI_SOURCE if not requested
+    - Remove FID events from the EQ when closing endpoint
+    - Rework collective support to be a peer collective provider(s)
+    - Allow FI_PEER to pass CQ, EQ and AV attr checking
+    - Remove annoying WARNING message for FI_AFFINITY
+    - Add utility collective provider
+  - Verbs
+    - Implement the FI_OPT_CUDA_API_PERMITTED option
+    - Add support for ROCR IPC
+  - Fabtests
+    - Add fi_setopt_test unit test
+    - Update ze device registration calls
+    - fi-rdmabw-xe: Always use host buffer for synchronization
+    - Fix bug in posting RMA operation
+    - fi_cq_data: Extend test to fi_writedata
+    - fi_cq_data: Extend validation of completion data
+    - Rename fi_msg_inject tests to fi_inject_test to reflect its use
+    - fi_rdm_stress: Add count option to json key/pair options
+    - Add and fix OOB option handling in several tests
+    - fi_eq_test: Fix incorrect return value
+    - fi_rdm_multi_client: Increase the size of ep name buffer
+    - Add FI_MR_RAW to default mr_mode
+    - Support larger control messages needed by newer providers
+    - fi-rdmabw-xe: Update to work with the ucx provider
+    - fi_ubertest: Cleanup allocations in failure cases
+    - Change ft_reg_mr to not assume hmem iface & device
+    - fi_multinode: Bugfix multinode test for ze + verbs
+    - fi_multinode: Remove unused validation print
+    - fi_multinode: Skip tests for unsupported collective operations
+    - fi_ubertest: Fix data validation with device memory
+    - fi_peek_tagged: Restructure and expand test
+
 -------------------------------------------------------------------
 Mon Mar 20 09:03:29 UTC 2023 - Nicolas Morey
diff --git a/libfabric.spec b/libfabric.spec
index e239585..4a78156 100644
--- a/libfabric.spec
+++ b/libfabric.spec
@@ -17,10 +17,10 @@
 #
-%define git_ver .0.1528ac2d6a1b
+%define git_ver .0.8f3a881e08f5
 Name: libfabric
-Version: 1.17.1
+Version: 1.18.0
 Release: 0
 Summary: User-space RDMA Fabric Interfaces
 License: BSD-2-Clause OR GPL-2.0-only
@@ -128,6 +128,7 @@ rm -f %{buildroot}%{_libdir}/*.la
 %{_includedir}/rdma/fi_trigger.h
 %dir %{_includedir}/rdma/providers
 %{_includedir}/rdma/providers/fi_log.h
+%{_includedir}/rdma/providers/fi_peer.h
 %{_includedir}/rdma/providers/fi_prov.h
 %{_includedir}/rdma/fi_ext_usnic.h
 %ifarch x86_64