- Core
- hmem/cuda: Adding more robust libgdrapi libpaths
- Update bindings/rust/README.md to reflect the recommended build process.
- Update build.rs to support both cargo build & cargo publish work directories.
- Update Cargo.toml in preparation for crates.io publishing.
- configure: Fix sanitizer detection logic
- Introduce lightweight Rust bindings for Libfabric, using bindgen.
- include/ofi_indexer: introduce new ofi_array_at_max function
- man/fi_cxi: fixup info for FI_CXI_RDZV_GET_MIN
- man/fi_getinfo: Update the capabilities with mode bits requirements
- man/fi_cq: Document `FI_GETWAITOBJ` for `fi_control`
- man/fi_fabric: Update `fi_tostr()` datatypes
- CXI
- Bump provider support up to libfabric 2.4
- Add domain rx match mode override
- Set rendezvous eager size default to 2K
- Change cuda dmabuf default to enabled
- Do not abort if MR match count does not reconcile
- Allow CP for triggered CQ to remap to Best Effort
- Fix sl-driver path for testing
- Set max domain TX CQs to 14
- Use cxil_alloc_trig_cp to distinguish trig and tx cmdqs
- Add FI_EBUSY debug messages
- Fix validation of service id
- Fix criterion test_sw tap files
- Cxip_cmdq_cp_modify fix
- Fix RNR protocol send byte/error counting
- Release TX credit when pending RNR retry
- Update rocr test fine grained flags
- Fix DEVICE in fi_info_test
- Introduce non-debug tracing
- Reset timer on rx of ARM packet
- Fix performance issue with close_mc()
- Increase vni range in auth_key tests
- Support auth_key ranges
- Fix use of hw_cps and memory leak
- EFA
- Fix cq data size in efa-rdm pkt post
- fix test_efa_rdm_mr_reg_cuda_memory unit test
- adjust the memory barrier positions
- Optimize RTW packet sending by replacing efa_rdm_ope_post_send
- Adjust logging level for txe releases
- Add tracepoints for handshake
- Add flags to MR logs
- Grow efa_tx_pkt_pool and ope_pool during rdm ep creation
- Do not use rdma write when unsolicited recv support is inconsistent
- Determine whether using device rdma based on p2p
- Introduce pke generation counter for protocol path
- Enable data path direct for efa-rdm
- Update the function signature for efa_data_path_direct_cq_initialize
- Move efa_cq_open_ibv_cq to efa_cq.c
- Do not track rx pkt pool for non-debug build
- Temporarily disable FI_OPT_EFA_SENDRECV_IN_ORDER_ALIGNED_128_BYTES support for efa protocol
- do not ignore local read completion
- Add missing lttng tps in efa_post_send
- Fix the remote cq data flags for zcpy recv
- Optimize the WQE post in data path direct
- fix typos in error messages
- Only show help message for OPE warn logs
- configure: replace no-break space with regular space character
- Remove unused function declarations
- Acquire CQ's `ep_list_lock` during counter progress
- Add asserts to detect erroneous CQE dereferences
- Ignore rma completion to a removed peer
- Remove the incorrect check for device max_msg_size
- Fix function signature mismatch
- Set FI_RX_CQ_DATA for efa direct with NULL hints
- Do not fail fi_getinfo for the wrong fabric
- Log warnings only for internal OPE failures or if CQ error entry not written
- Add unit tests for LRU AH eviction
- Evict AH with no explicit AV entries when AH limit reached
- Add locking assertions and update unit tests
- Remove efa_conn_release unsafe
- Require FI_RX_CQ_DATA on devices without unsolicited write recv
- Add LTTng tracepoints for direct data path operations
- Don't warn users about non-EFA devices
- Support FI_RX_CQ_DATA for efa-direct
- Fix deadlocks in AV insert/remove/close and CQ read paths
- Don't try to release a lock that is not taken
- set RUNPATH if custom rdma-core provided
- Remove rx_msg_flags from efa_rdm_msg_recv/efa_rdm_msg_recvv
- Update tracepoints in the receive path
- Slide recv-win on RTM/RTA error
- Insert read and write packets to tx debug list
- LNX
- remove force setting DEVICE_ONLY flag
- set core hints proto to UNSPEC
- remove iov count failures
- add wait object implementation
- OPX
- Don't fail configure when OPX unhappy
- Add note to FI_OPX_SDMA_MIN_PAYLOAD_BYTES doc
- Simplify uapi configuration
- Unionize 9B and 16B packet SCB models in endpoint structs.
- Support shared contexts in hfisvc bts
- Fix replays for multi-packet eager
- Don't retry forever in send rendezvous.
- Don't ACK packets that were never received
- Segfault in opx_hfi_rdma_context_open() on 2nd endpoint opened
- Fix seg fault in finalize
- Fix SDMA writev error when RDMA core functions are being used.
- Add back accidentally removed opx_domain_hfisvc_poll()
- Add missing function pointers for HFI service
- Check uapi for hfisvc/HFI1 direct verbs
- Rename hfisvc to opx-hfisvc
- Move submodule to rdma core
- Remove stx/srx support in OPX
- Register MRs with HFI service
- Ensure SDMA packet lengths are 8-byte multiples
- Use HFI service by default if enabled in the driver.
- fixup goto labels that need statements
- Update hfisvc_client to 64-bit atomics
- HFISVC: Fix replay payload
- Disable HFI Service by default.
- Disable use of HFI service when driver does not support it.
- Update hfisvc_client to latest patch
- Only open IPC cache if HMEM initialized and IPC enabled
- Handle extended rx bits in common 9B code
- Add IPC to 16B header path
- Make sriov-alpha limitations CN5000-only
- Remove cmake build for hfisvc_client library
- Handle completion errors from HFI service
- Fix setting of rc in deferred recv rts
- Additional HFI Service support changes
- HFI Service initial support
- Asynchronous HMEM memcopy for IPC
Signed-off-by: Nicolas Morey <nmorey@suse.com>
- Core
- configure: Improve the restricted-dl help text
- ofi_list: Introduce dlist_entry_in_list
- man/fi_peer: Fix `FI_ADDR_NOTAVAIL` typo
- common: Make common runtime parameters work for DL providers
- configure.ac: Move cuda cppflag set before DMABUF check
- Add address format FI_SOCKADDR_IP
- include/fi_peer.h: remove fi_peer_rx_entry dlist fields
- configure: Fix clang checking
- hmem/neuron: Implement put_dmabuf_fd op
- man/fi_endpoint: Clarify rx_attr->caps usage
- EFA
- Decrement rx_pkts_posted before efa_rdm_pke_release_rx
- Enable direct data path by default
- Bypass rdma-core in blocking cq read path
- Add traces for RX/TX completions
- Fix the unsolicited write recv check
- Refactor efa_base_ep_create_qp
- Add generic function to process queued op entries
- Deduce queued packet list from op entry
- Add generic utility for fetching RDM packet type
- Create abstraction for IBV CQ polling sequence
- Bypass rdma-core in data path.
- Refactor ibv_cq_ex open call
- Fix stale links in docs/overview.md
- Initialize nevents in efa_domain_cq_open_ext
- Fix conflicting types for efa_mock_efa_ibv_cq_wc_read_opcode_return_mock
- Remove duplicate mock function declarations
- Use efa specific cq trywait
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=120
- Core
- log: Fix buffer overrun when accessing the 'log_levels' array
- man/fi_mr: Clarify fi_close behavior
- rdma/fabric.h: Add new FI_RESCAN flag to fi_getinfo()
- hmem/cuda: Add fallback for dmabuf flag with CUDA_ERROR_NOT_SUPPORTED
- hmem/cuda: Add runtime fallback for unsupported dmabuf flag
- hmem/cuda: Add a flag for exporting dmabuf fd on GB200
- man: Clarify fi_close behavior on FI_ENDPOINT
- av: introduce FI_FIREWALL_ADDR flag for insert operations
- common: ofi_ifname_toaddr check ifa->ifa_addr for null
- man/fi_mr: Add note that requested_key may be ignored w/o remote access
- CXI
- Fix alt_read unit test to use rdzv_threshold
- Adjust cxi environment variable defaults
- Fix regression which could cause deadlock
- Support libfabric 2.2 API
- Set cq_data in peer unexpected message
- Fix locking on the SRX path
- Allow for passing opaque 64-bit data in ctrl_msg
- Fix cxi driver paths for CI
- Fix use of alt_read rget restricted TC type
- Fix compile warnings associated with new dlopen of curl/json
- Fix curl CXIP_WARN that included extra parameter
- Decouple existence CXI_MAP_IOVA_ALLOC for build
- New conf opt for binding of json symbols
- New conf opt for binding of curl symbols
- Pad struct to address hash mismatch bug
- Consistency for initialization of cxip_addr structure
- Fix uninitialized padding in cxip_addr structure causing hash mismatches
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=116
- Core
- man/fi_domain: Define resource mgmt unreachable EP
- man/fi_domain: Update connectionless EP disable
- hmem: Fix missing rocr dlopen function assignments
- Fix data race on log_prefix
- hmem: Define ofi_hmem_put_dmabuf_fd and add support for cuda and rocr
- Fix a few minor man page issues
- CXI
- Fix ss_plugin_auth_key_priority test
- Bump internal CXI version to support 2.1
- Fix possible cq_open segfault
- Fix peer CQ support
- Added collectives logical operators
- Fix bug in constrained LE test cases in test.sh and test_sw.sh
- Fix unit test missing pthread initialization
- Add FI_WAIT_YIELD EQ support
- Make string setup of FI_CXI_CURL_LIB_PATH safe
- Add FI_CXI_CURL_LIB_PATH #define from autoconf
- Test CUDA with DMA buf FD recycling
- Test ROCR with DMA buf FD recycling
- Test ROCR with DMA buf offset
- Integrate with ofi_hmem_put_dmabuf_fd
- Test monitor unsubscribe
- Fix fi_cq_strerror
- CXI EQ does not support wait objects
- Fix CQ wait FD logic
- Disable retry logic for experimental collectives
- Ignore drop count during init
- Remove CXI_MAP_IOVA_ALLOC flag.
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=114
- Update to v2.0.0 (jsc#PED-9661, jsc#PED-10668)
- Core
- hmem/cuda: avoid stub loading at runtime
- Makefile.am: Keep using libfabric.so.1 as the soname
- xpmem: Cleanup xpmem before monitors
- Remove redundant windows.h
- hmem/cuda: Add env variable to enable/disable CUDA DMABUF
- Update ofi_vrb_speed
- xpmem: Fix compilation warning
- Change the xpmem log level to info
- Clarify FI_HMEM support of inject calls
- Introduce Sub-MR
- Define capability for directed receive without wildcard src_addr
- Define capability for tagged message only directed recv
- Define capability bit for tagged multi receive
- Define flag for single use MR
- Move flags only used for memory registration calls to fi_domain.h
- windows/osd.h: fix and refactor logical operations on complex numbers
- man/fi_peer: update peer fid initialization language
- Remove CURRENT_SYMVER() macro
- 1.8 ABI compat
- hmem/ze: Fix mismatched library name in an error message
- Add FI_PEER as a capability
- Add missing FI_AV_USER_ID to cap tostr
- Update and clarify peer SRX API flow
- Prefix public xpmem symbols with ofi
- Add rbmap foreach node utility function
- ofi_mem: Add release bufpool validity check
- hmem/rocr: Don't attempt to get device info when pointer type is unknown.
- hmem: Added handle field to close_handle
OBS-URL: https://build.opensuse.org/request/show/1231350
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=110
- Completely remove building for AVX/AVX2 in PSM3 (bsc#1213538, bsc#1233356, bsc#1234014)
Runtime detection before initializing the provider is not enough, as
PSM3 uses constructors which may include AVX instructions.
Only require SSE4.2, as it does have a large performance impact
when calculating packet hashes.
- Remove psm3-fix-SIGILL-on-system-not-supporting-AVX.patch
- Add psm3-prevent-code-from-building-using-AVX-AVX2.patch
- Add _constraints to mark SSE4.2 as required
OBS-URL: https://build.opensuse.org/request/show/1227697
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=108
- Enable ucx and new efa provider on 64b architectures.
- Use a single changes file for libfabric and fabtests.
- Update to 1.21.0
- Core
- Various updates and fixes in man pages
- Fix xpmem memory corruption
- Extend FI_PROVIDER_PATH to allow setting preferred DL provider
- Add a SECURITY.md file
- Document preferred threading model for scalable endpoints
- Move FI_PRIORITY to internal flag
- Remove FI_PROV_SPECIFIC
- Remove unimplemented or unused features
- Support cntr byte counting
- configure: Do not check for xpmem if disabled
- Add FI_PROGRESS_CONTROL_UNIFIED
- hmem/cuda: Get multiple attributes at once in cuda_is_addr_valid
- configure: Add -pipe by default to CFLAGS
- Selectively generate warnings on failed loading of DL providers
- hmem: introduce ofi_dev_reg_copy_*_iov ops
- Print provider path on fabric creation
- Introduce FI_OPT_SHARED_MEMORY_PERMITTED
- README.md: Add badge for openssf scorecard
- man: Regulate the fi_setopt call sequence.
- man: Clarify the usage of FI_REMOTE_CQ_DATA flag
- man: Add ucx provider to the fi_provider man page
- configure.ac: add extra check for 128 bit atomic support
- include/osd: align atomic complex definitions
- hmem/synapseai: Refine the error handling and warning
- Specify C11 standard for Visual Studio builds
OBS-URL: https://build.opensuse.org/request/show/1164368
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=101
- Update to 1.20.1
- Core
- hmem/ze: Change the library name passed to dlopen
- hmem/ze: map device id to physical device
- hmem/ze: skip duplicate initialization
- hmem/ze: dynamically allocate device resources based on number of devices
- hmem/ze: fix hmem_ze_copy_engine variable look up
- hmem/ze: Increase ZE_MAX_DEVICES to 32
- man: Fix typo in fi_getinfo man page
- Fix compiler warning when compiling with ICX
- man: Fix fi_rxm.7 and fi_collective.3 man pages
- man: Update EFA docs for FI_EFA_INTER_MIN_READ_WRITE_SIZE
- EFA
- efa_rdm_ep_record_tx_op_submitted() rm peer lookup
- Remove peer lookup from efa_rdm_pke_sendv()
- Make handshake response use txe
- test: Only close SHM if SHM peer is Created
- Handshake code allocs txe via efa util
- Initialize txe.rma_iov_count to 0
- Switch fi_addr to efa_rdm_peer in trigger_handshake
- Downgrade EFA Endpoint Creation WARN to INFO
- Init srx_ctx before use
- Clean up generic_send path
- Pass in efa_rdm_ep to efa_rdm_msg_generic_recv()
- Make recv path slightly more efficient
- re-org rma write to avoid duplicate checks
- Add missing sync_memops call to writedata
- use peer pointer from txe in read, write and send
- Pass in peer pointer to txe
- Get rid of noop instruction from empty #define
OBS-URL: https://build.opensuse.org/request/show/1161331
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=99
- Update to 1.19.0
- Core
- General code cleanup and restructuring
- Add ofi_hmem_any_ipc_enabled()
- ofi_consume_iov allows 0-byte consume
- ofi_consume_iov consistency
- ofi_indexer: return error code when iterating
- getinfo: Add post filters for domain and fabric names
- Filter loopback device if iface is specified
- bsock: Fix error checking for -EAGAIN
- windows/osd: Remove unneeded check to silence coverity
- windows/osd: Move variable declaration to silence coverity
- Introduce gdrcopy awareness to hmem copy
- mr/cache: Fix fi_mr_info initialization
- hmem_cuda: remove gdrcopy from cuda hmem copy path
- iouring: Fix wrong indent in ofi_sockapi_accept_uring()
- Implement ofi_sockctx_uring_poll_add()
- hmem: introduce gdrcopy from/to cuda iov functions
- hmem: Deprecate `FI_HMEM_CUDA_ENABLE_XFER`
- hmem_cuda: Restrict CUDA IPC based on peer accessibility
- hmem_cuda: Log number of CUDA devices detected
- hmem_cuda: Refactor global variables
- tostr: Remove the extra dir "shared/" from "include/" and "src/" .
- hmem_ze: fix ZE is valid check
- hmem_rocr: fix offset calculation
- hmem_rocr: use ofi spinlock functions
- hmem_rocr: minor fixes
- hmem_neuron: convert warn to info for nrt_get_dmabuf_fd not found
- hmem_neuron: check existence of neuron devices during initialization
- tostr: Moved Windows functions in shared/ofi_str.c to windows/osd.h
OBS-URL: https://build.opensuse.org/request/show/1108986
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=93
- Update to 1.18.1
- Core
- Fix build warning for ofi_dynpoll_get_fd
- EFA
- Handle 0-byte writes
- Apply byte_in_order_128_byte for all memory types
- Increase default shm_av_size to 256
- Force handshake before selecting rtm for non-system ifaces.
- Only select readbase_rtm when both sides support rdma-read
- Bugfix for initializing SHM offload
- Correct CPPFLAGS during configure
- Make setopt support sendrecv aligned 128 bytes
- Make data size to be 128 byte multiples for in-order aligned send/recv
- prepare local read pkt entry for in-order aligned send/recv.
- Disable gdrcopy and cudamemcpy for in-order aligned recv.
- Increase the pad size in rxr_pkt_entry
- Make readcopy pkt pool 128 byte aligned
- Introduce alignment to support in order aligned ops
- Fix a bug when calling ibv_query_qp_data_in_order
- RMA operations will ensure FI_ATOMIC cap
- RMA operations will ensure FI_RMA cap
- Unittest atomics without FI_ATOMIC cap.
- Unittest RMA without FI_RMA cap.
- Refactor pkt_entry assignment in poll_ibv loop
- Fixes for RDMA Write and Writedata
- RXM
- Revert rxm util peer CQ support
- Fix credit size parameter for flow ctrl
- SHM
- Fix DSA enable
OBS-URL: https://build.opensuse.org/request/show/1096631
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=89
- Update to 1.18.0
- Core
- rocr: fix offset calculation
- rocr: use ofi spinlock functions
- rocr: minor fixes
- neuron: convert warn to info for nrt_get_dmabuf_fd not found
- neuron: check existence of neuron devices during initialization
- neuron: Add support for neuron dma-buf
- ze: update ZE to support new driver index specification
- List variables read from config file
- Add switch to prefer system-config over environment
- Add basic system-config support for setting library variables
- Move peer provider defines into new header
- rocr: Support asynchronous memory copies
- rocr: Add support for ROCR IPC
- rocr: rename rocr data-structures
- synapseai: return 0 for host_register and host_deregister
- fabric: Improve log level of provider mismatch
- cuda: Allow CUDA IPC when P2P disabled
- ze: add ZE command list pool to reuse command lists
- cuda: implement cuda_get_xfer_setting for non cuda build
- cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
- cuda.c: Add const to param to remove warning
- Add IFF_RUNNING check to indicate iface is up and running
- io_uring support enhancements
- EFA
- Implement CUDA support on instance types that do not support GPUDirect RDMA
- Implement fi_write using device's RDMA write capability
- Enrich error messages with debug and connection info
- Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt
OBS-URL: https://build.opensuse.org/request/show/1080188
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=85
- Update to 1.17.1
- Core
- hmem_cuda Add const to param to remove warning
- Fix typos in fi_ext.h
- ofi_epoll: Remove unused hot_index struct member
- EFA
- Print local/peer addresses for RX write errors
- Unit test to verify no copy with shm for small host message
- Avoid unnecessary copy when sending data from shm
- Compare pci bus id in hints
- Fix double free in rxr endpoint init
- Hooks
- dmabuf_peer_mem: Handle IPC handle caching in L0
- OPX
- Exclude from build if missing needed defines
- Move some logs to optimized builds
- Fix build warnings for unused return code from posix_memalign
- Add reliability sanity check to detect when send buffer is illegally altered
- SDMA Completion workaround for driver cache invalidation race condition
- Fix replay payload pointer increment
- Handle completion counter across multiple writes in SDMA
- Cleanup pointers after free()
- Modify domain creation to handle soft cache errors
- Two biband performance improvements
- Fixes based on Coverity Scan related to auto progress patch
- Changed poll many argument to rx_caps instead of caps
- Resynch with server configured for Multi-Engines (DAOS CART Self Tests)
- Remove import_monitor as ENOSYS case
- Address memory leaks reported on OFIWG issues page
- Remove unused fields
- Fix unwanted print statement case
- Add replays over SDMA
- Implement basic TID Cache
- Revert work_pending check change
- Fix use_immediate_blocks
- Restore state after replay packet is NULL
- Fix memory leak from early arrival packets.
- Fix segfault in SHM operations from uninitialized value in atomic path.
- Prevent SDMA work entries from being reused with outstanding
replays pointing to bounce buf.
- Set runtime as default for OPX_AV
- Fix RTS replay immediate data
- Fix errors caught by the upstream libfabric Coverity Scan
- Support multiple HFI devices
- Support OFI_PORT and Contiguous endpoint addresses
- Update man pages
- Util
- util_cq: Remove annoying WARNING message for FI_AFFINITY
OBS-URL: https://build.opensuse.org/request/show/1075155
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=83
- Update to 1.16.1
- Core
- Fix windows implementation to remove fd from poll set
- PSM3
- Add missing files to release tarball
- Util
- Handle NULL address insertion to fi_av_insert
- Drop prov-rxm-Disable-128-bit-atomics.patch which was merged upstream
OBS-URL: https://build.opensuse.org/request/show/1012023
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=79
- Update to 1.16.0 (jsc#PED-351, jsc#PED-190)
- Core
- Added HMEM IPC cache
- Use exact string comparison checks for network interfaces
- Restructuring of poll/epoll abstraction
- Add ability to disable locks completely in debug builds
- Serialize access to modifying the logging calls
- Minor fixes to fi_tostr text formatting
- Add hmem interface checks to memory registration
- EFA
- Added support of Synapse AI memory.
- Improved error message
- Net
- Temporarily forked, optimized version of tcp provider
- Focused on improved performance and scalability over tcp sockets
- Fork ensures tcp provider stability while net provider is developed
- Shares the tcp provider protocol and base implementation for msg endpoints
- Integrates direct support for rdm endpoints, using a derivative from rxm
- Implements own protocol for rdm endpoints, separate from rxm;tcp
- OPX
- Added initial support for SDMA
- General performance enhancements
- Performance improvements to reliability protocol
- Improved deferred work pending complete
- Added support for OPX_AV=runtime
- Support iov memory registration ops
- Added DAOS RPC support
- Atomic ops enhancements
- Improved documentation
- Debug build enhancements
- Fixed compiler warnings
- Reduced time to compile prov/opx code
- General bug fixes
- Fixed PSN wrapping scaling
- Added intranode fence
- Addressed bugs discovered by coverity scan
- PSM2
- Fix sending CQ data in some instances of fi_tsendmsg
- PSM3
- Updated to match Intel Ethernet Fabric Suite (IEFS) 11.3 release
- RxM
- Update to read multiple completions at once from msg provider
- Move RxM AV implementation to util code to share with net provider
- Minor code cleanups
- SHM
- Implement and use ipc_cache
- Add log messages for debugging and error tracking
- Fix check for FI_MR_HMEM mr_mode
- Move shm signal handlers initialization to EP
- Added log messages for errors detected
- TCP
- Fix incorrect signaling of the CQ
- Increase max number of poll events to retrieve
- Acquire ep lock prior to flushing socket in shutdown
- Verify ep state prior to progressing socket data
- Read cm error data when receiving connreq response
- Log error on connect failure
- Fix assertion failure in CQ progress function
- Util
- Fix text in log of UFFD ioctl failure
- Introduce cuda ipc monitor
- Fix CQ memory leak handling overflow
- Fix MR mode bit check for ver 1.5 and greater
- Add max_array_size to track/check array overflow
- Always progress transfers when reading from a CQ
- Handle NULL address insertion
- Try IPv4 before IPv6 addresses when starting name server
- Fix IP util av default address length
- Fix util IP getinfo path to read hints->addr_format
- Fix debug print mismatch
- Fix return code when memory allocation fails.
- Fix build sign warning in ofi_bufpool_region_alloc
- Minor code cleanups
- Print warning if an addr is inserted into an AV again
- Verbs
- Fix support of FI_SOCKADDR_IB when requested by the application
- Ensure all posted receives are flushed to the application
- Update ofi_mr_cache_search API for hmem IPC support
- Reduce logging verbosity for "no active ports"
- Fix incorrect length used in memory registration
- Various minor bug fixes for test failures
- Fix a memory leak getting IB address
- Implement verbs provider on Windows over NetworkDirect API
- Set and check address format correctly
- Only close qp if it was initialized
- Portable detection of loopback device
- Fabtests
- multi_ep: Separate EP resources and fix MR registration
- multi_recv: Fix possible crash and check for valid buffer
- unexpected_msg: Fix printf compiler warning
- dgram_pingpong.c: Use out-of-band sync
- multinode: Make multinode tests platform agnostic, fix formatting
- ubertest: Fix string comparison to include length, fix writedata completion check
- av_test: add support for -e <ep_type>
- New tests:
- dmabuf-rdma: Component level test for dma-buf RDMA
- sock_test: Component level performance test of poll, epoll, and select
- rdm_stress: Multi-threaded, multi-process stress test for RDM endpoints
- sighandler_test: Regression test for signal handler restoration
- Drop patches fixed upstream:
- prov-opx-Correctly-disable-OPX-if-unsupported.patch
- disable-flatten-attr.patch
OBS-URL: https://build.opensuse.org/request/show/1007631
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=75
- Add disable-flatten-attr.patch that drops flatten attribute.
Note that the flatten attribute makes the inliner a huge compile-time
hog (and the resulting binary would be huge as well).
- Use %make_build and enable LTO (boo#1133235).
- Synchronize used Patches.
OBS-URL: https://build.opensuse.org/request/show/998810
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=73
- Update to 1.15.1
- Core
- Fix fi_info indentation error in fi_tostr
- hmem_ze: Add runtime option to choose specific copy engine
- Cleanup of configure HMEM checks
- Fixed stringop-truncation in ofi_ifaddr_get_speed
- Add utility provider log suffix to make logs easier to read
- Fix truncation of ipv6 addressing
- hmem: add support for AWS Trainium devices
- Fix potential sscanf overflows
- hmem: pass through device and flags when querying memory interface
- Rework locking in several areas to convert spinlocks to mutexes
- Add new locking abstractions to select lock types at runtime
- Add new FI_PROTO_RXM_TCP for optimized rxm over tcp path
- Fix windows implementation to remove fd from poll set
- EFA
- Added windows support through efawin (https://github.com/aws/efawin)
- Added support of AWS neuron.
- Added support of using gdrcopy to copy data from host to device.
- Fixed a bug that caused 0 byte read to fail.
- Fixed a memory corruption issue that could cause a forked process to crash.
- Extended testing coverage through new pytest based testing framework.
- HOOKS
- Add new hooking provider dmabuf_peer_mem
- Enable DL build of hooking providers
- Add HMEM memory registration hook
- OPX
- New provider supporting Cornelis Networks Omni-path hardware
- PSM3
- Updated psm3 to match IEFS 11.2.0.0 release
- Added support for sockets (TCP/UDP) via a runtime selectable Hardware
Abstraction Layer (HAL)
- Added support for IPv6 addressing in RoCE and sockets
- Added various NIC selection filtering options (wildcarded NIC name,
address format, wildcarded IP subnet, link speed)
- Performance tuning in conjunction with OneAPI and OneCCL
- Improved PSM3_IDENTIFY output
- Rename most internal symbols to psm3_
- Corrected vulnerabilities found during Coverity scans
- configure options refined and help text improved
- PSM3_MULTI_EP has been deprecated (recommend always enabled, default
is enabled [same default as previous releases])
- Various bug fixes
- RxM
- Add check that atomic size is valid
- Add support to passthru calls to tcp provider in specific
- TCP
- Add assert to verify RMA source/target msg sizes match
- Wake-up threads blocked on CQ to update their poll events
- Fix use of incorrect events in progress handler
- Fixes for various compile warnings, mostly on Windows
- Add support for FI_RMA_EVENT capability
- Add support for completion counters
- Fix check for CQ data in tagged messages
- Add cancel support to shared rx context
- Add src_addr receive buffer matching
- Add provider control to assign a src_addr with an ep
- Handle trecv with FI_PEEK flag
- Allow binding a CQ with an SRX
- Restructuring of code in source files
- Handle EWOULDBLOCK returned by send call
- Add hot (active) pollfd
- SHM
- Properly chain the original signal handlers
- Avoid uninitialized variable with invalid atomic parameters
- Fix 0 byte SAR read
- Initialize len parameter to accept
- Refactor and simplify protocol code
- Remove broken support for 128-bit atomics
- Fix FI_INJECT flag support
- Add assert to verify RMA source/target msg sizes match
- Set domain threading to thread safe
- Fix possible use of uninitialized var in av_insert
- Util
- Fix sign warning in ofi_bufpool_region_alloc
- Remove unused variable from ofi_bufpool_destroy
- Fix check for valid datatype in ofi_atomic_valid
- Return with error if util_coll_sched_copy fails
- Fix use of uninitialized variable in ofi_ep_allreduce
- Fix memory access in ip_av_insertsym
- Track ep per collective operation not with multicast
- Restructure collective av set creation/destruction
- Change most locks from spin locks to mutexes
- Allow selection of spinlocks for CQ and domain objects
- Fix AV default addrlen
- Update fi_getinfo checks to include hints->addr_
- Handle NULL address insertion to fi_av_insert
- Verbs
- Initial changes for compiling on Windows (via NetworkDirect)
- Add a failover path to dma-buf based memory registration
- Replace use of spin locks with mutexes
- Check for valid qp prior to cleanup
- Set and check for address format correctly in fi_getinfo
- Fabtests
- hmem_cuda: used device allocated host buff to fill device buf
- Add python scripts to control test execution
- test_configs: include util provider in core config file
- Add option "--pin-core"
- Only call nrt_init once
- Fix a bug in ft_neuron_cleanup
- Correct help for unit test programs
- Remove duplicate help prints from fi_mcast
- configure.ac: fix --enable-debug=no not properly detected
- msg_inject: handle the case ft_tsendmsg return -FI_EAGAIN
- Add AWS Trainium device support
- fi_inj_complete: Add FI_INJECT to fabtests
- inj_complete.c: Make arguments align with the other tests
- dgram_pingpong: handle the error return of fi_recv
- recv_cancel: Remove requirement for unexpected msg handling
- poll: Fix crash if unable to allocate pollset
- ubertest: Add GPU testing and validation support
- Add HMEM options parsing support
- Update and re-enable fi_multi_ep test
- Add prov-opx-Correctly-disable-OPX-if-unsupported.patch to disable
OPX compilation on non x86_64 systems
OBS-URL: https://build.opensuse.org/request/show/989191
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=71
- Update to 1.14.1
- Core
- Use non-shared memory allocations to use MADV_DONTFORK safely
- Fix incorrect use of gdr_copy_from_mapping
- Ensure proper timeout time for pollfds to avoid early exit
- EFA
- Handle read completion properly for multi_recv
- Use shm's inject write when possible
- Support 0 byte read
- RxM
- Ensure signaling the CQ fd after writing completion
- Fix inject path for sending tagged messages with cq data
- Negotiate credit based flow control support over CM
- Add PID to CM messages to detect stale vs duplicate connections
- Fix race handling unexpected messages from unknown peers
- Fix possible leak of stack data in cm_accept
- Restrict reported caps based on core provider
- Delay starting listen until endpoint fully initialized
- Verify valid atomic size
- Sockets
- Fix coverity reports on uninitialized data
- Check for NULL pointers passed to memcpy
- Add missing error return code from sock_ep_enable
- TCP
- Fix performance regression resulting from sparse pollfd sets
- Fix assertion failure in CQ progress function
- Do not generate error completions for inject msgs
- Fix use of incorrect event names in progress handler
- Fix check for CQ data in tagged messages
- Make start_op array a static to reduce memory
OBS-URL: https://build.opensuse.org/request/show/971079
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=69
- Update to 1.14.0
- Add time stamps to log messages
- Fix gdrcopy calculation of memory region size when aligned
- Allow user to disable use of p2p transfers
- Update fi_tostr print FI_SHARED_CONTEXT text instead of value
- Update fi_tostr to output field names matching header file names
- Fix narrow race condition in ofi_init
- Add new fi_log_sparse API to rate limit repeated log output
- Define memory registration for buffers used for collective operations
- EFA, SHM, TCP, RXM, and verbs fixes
OBS-URL: https://build.opensuse.org/request/show/932983
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=68
- Update to 1.13.0
- Fix behavior of fi_param_get parsing an invalid boolean value
- Add new APIs to open, export, and import specialized fid's
- Define ability to import a monitor into the registration cache
- Add API support for INT128/UINT128 atomics
- Fix incorrect check for provider name in getinfo filtering path
- Allow core providers to return default attributes which are lower than
the maximum supported attributes in getinfo call
- Add option to prefer external providers (in order discovered) over internal
providers, regardless of provider version
- Separate Ze (level-0) and DRM dependencies
- Always maintain a list of all discovered providers
- Fix incorrect CUDA warnings
- Fix bug in cuda init/cleanup checking for gdrcopy support
- Shift order providers are called from in fi_getinfo, move psm2 ahead of
psm3 and efa ahead of psmX
- See NEWS.md for changelog
OBS-URL: https://build.opensuse.org/request/show/905235
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=64