forked from pool/libfabric
Commit Graph

109 Commits

Author SHA256 Message Date
Dominique Leuenberger
f9b9495259 Accepting request 1193128 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1193128
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=48
2024-08-10 17:06:13 +00:00
28daf30db4 - Update to 1.22.0
  - Coll
    - Fix Coverity issues
  - Core
    - General bug fixes
    - hmem: change neuron get_dmabuf_fd error code
    - Fix an error in the error handling path of fi_param_define()
    - Makefile.am: Add Windows build files to distribution tarball
    - hmem: disable ZE IPC
    - Add profile variables for connections and memory allocated
    - hmem: Fix `cuDeviceCanAccessPeer()` error reporting
    - man: Update text for `len` parameter
    - Add page size MR attr field
    - man: Extend fi_mr_refresh support
    - man: Improve FI_MR_ALLOCATED documentation
    - man: Support optional MR desc
    - man: Improve FI_MR_HMEM documentation
    - Added ofi_get_realtime interfaces
    - Add endpoint options for max message size and inject size
    - Add Windows definition for `EREMOTEIO`
  - EFA
    - General improvement and bug fixes
    - Handle recv cancel for zero copy recv
    - Avoid iterating EP list in CQ read
    - Add RDMA core errno for remote unknown peer
    - Map EFA errnos to Libfabric codes
    - Improve the zero-copy receive feature
    - Improve the handshake enforcement procedure
    - Support unsolicited rdma-write recv
    - Support FI_MORE for eager send and rdma-write

OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=104
2024-08-10 14:56:13 +00:00
08f2f39cab - Add -Wno-incompatible-pointer-types to CFLAGS to enable building
for 32bit with GCC 14.

If this request is ok, please forward it soon to factory so that
it is ready when the default compiler is switched.
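
For context, GCC 14 treats incompatible pointer type conversions as errors by default instead of warnings. The sketch below illustrates the general kind of code that can break only on 32-bit targets. It is an assumption for illustration: the helper names `store_len` and `roundtrip_len` are hypothetical and are not taken from libfabric, and the actual failing code in the package may look different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libfabric): writes a value through a
 * pointer to a 64-bit integer. */
static void store_len(uint64_t *out, uint64_t value)
{
    *out = value;
}

/* Returns the value after passing it through store_len(). */
static uint64_t roundtrip_len(uint64_t value)
{
    /*
     * Calling store_len(&len, value) with "size_t len" would be an
     * incompatible pointer conversion on 32-bit targets, where size_t is
     * 32 bits wide and uint64_t is not. GCC 14 rejects that by default.
     * -Wno-incompatible-pointer-types lets such code build again without
     * changing it; a correctly typed temporary avoids the issue entirely.
     */
    uint64_t tmp = 0;
    store_len(&tmp, value);
    return tmp;
}
```

The flag added in this commit is a build-side workaround; a source-level fix such as the typed temporary above would have to come from upstream.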

OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=103
2024-08-08 16:01:35 +00:00
Ana Guerrero
85e3cca968 Accepting request 1164392 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1164392
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=47
2024-04-04 20:24:35 +00:00
779fd8ecd7 Accepting request 1164368 from home:NMorey:branches:science:HPC
- Enable ucx and new efa provider on 64b architectures.
- Use a single changes file for libfabric and fabtests.
- Update to 1.21.0
  - Core
    - Various updates and fixes in man pages
    - Fix xpmem memory corruption
    - Extend FI_PROVIDER_PATH to allow setting preferred DL provider
    - Add a SECURITY.md file
    - Document preferred threading model for scalable endpoints
    - Move FI_PRIORITY to internal flag
    - Remove FI_PROV_SPECIFIC
    - Remove unimplemented or unused features
    - Support cntr byte counting
    - configure: Do not check for xpmem if disabled
    - Add FI_PROGRESS_CONTROL_UNIFIED
    - hmem/cuda: Get multiple attributes at once in cuda_is_addr_valid
    - configure: Add -pipe by default to CFLAGS
    - Selectively generate warnings on failed loading of DL providers
    - hmem: introduce ofi_dev_reg_copy_*_iov ops
    - Print provider path on fabric creation
    - Introduce FI_OPT_SHARED_MEMORY_PERMITTED
    - README.md: Add badge for openssf scorecard
    - man: Regulate the fi_setopt call sequence.
    - man: Clarify the usage of FI_REMOTE_CQ_DATA flag
    - man: Add ucx provider to the fi_provider man page
    - configure.ac: add extra check for 128 bit atomic support
    - include/osd: align atomic complex definitions
    - hmem/synapseai: Refine the error handling and warning
    - Specify C11 standard for Visual Studio builds

OBS-URL: https://build.opensuse.org/request/show/1164368
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=101
2024-04-03 15:32:26 +00:00
Ana Guerrero
1b10814640 Accepting request 1161340 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1161340
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=46
2024-03-25 20:07:15 +00:00
0dfc65be02 Accepting request 1161331 from home:NMorey:branches:science:HPC
- Update to 1.20.1
  - Core
    - hmem/ze: Change the library name passed to dlopen
    - hmem/ze: map device id to physical device
    - hmem/ze: skip duplicate initialization
    - hmem/ze: dynamically allocate device resources based on number of devices
    - hmem/ze: fix hmem_ze_copy_engine variable look up
    - hmem/ze: Increase ZE_MAX_DEVICES to 32
    - man: Fix typo in fi_getinfo man page
    - Fix compiler warning when compiling with ICX
    - man: Fix fi_rxm.7 and fi_collective.3 man pages
    - man: Update EFA docs for FI_EFA_INTER_MIN_READ_WRITE_SIZE
  - EFA
    - efa_rdm_ep_record_tx_op_submitted() rm peer lookup
    - Remove peer lookup from efa_rdm_pke_sendv()
    - Make handshake response use txe
    - test: Only close SHM if SHM peer is Created
    - Handshake code allocs txe via efa util
    - Initialize txe.rma_iov_count to 0
    - Switch fi_addr to efa_rdm_peer in trigger_handshake
    - Downgrade EFA Endpoint Creation WARN to INFO
    - Init srx_ctx before use
    - Clean up generic_send path
    - Pass in efa_rdm_ep to efa_rdm_msg_generic_recv()
    - Make recv path slightly more efficient
    - re-org rma write to avoid duplicate checks
    - Add missing sync_memops call to writedata
    - use peer pointer from txe in read, write and send
    - Pass in peer pointer to txe
    - Get rid of noop instruction from empty #define

OBS-URL: https://build.opensuse.org/request/show/1161331
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=99
2024-03-25 08:50:35 +00:00
Dominique Leuenberger
d15d9152ef Accepting request 1155207 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1155207
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=45
2024-03-06 22:03:43 +00:00
73658dedfa Accepting request 1153473 from home:pgajdos:l
- Use the %autosetup macro. This eliminates the use of the deprecated
  %patchN macros.

OBS-URL: https://build.opensuse.org/request/show/1153473
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=97
2024-03-05 13:40:29 +00:00
Ana Guerrero
77be0f1aba Accepting request 1127574 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1127574
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=44
2023-11-20 20:19:00 +00:00
5587bbb374 Accepting request 1127573 from home:NMorey:branches:science:HPC
- Update to 1.20.0 (jsc#PED-5777, jsc#PED-5893, jsc#PED-5889)

OBS-URL: https://build.opensuse.org/request/show/1127573
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=95
2023-11-19 18:58:48 +00:00
Ana Guerrero
f6a72224bc Accepting request 1108987 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1108987
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=43
2023-09-06 16:55:45 +00:00
a07202472f Accepting request 1108986 from home:NMorey:branches:science:HPC
- Update to 1.19.0
  - Core
    - General code cleanup and restructuring
    - Add ofi_hmem_any_ipc_enabled()
    - ofi_consume_iov allows 0-byte consume
    - ofi_consume_iov consistency
    - ofi_indexer: return error code when iterating
    - getinfo: Add post filters for domain and fabric names
    - Filter loopback device if iface is specified
    - bsock: Fix error checking for -EAGAIN
    - windows/osd: Remove unneeded check to silence coverity
    - windows/osd: Move variable declaration to silence coverity
    - Introduce gdrcopy awareness to hmem copy
    - mr/cache: Fix fi_mr_info initialization
    - hmem_cuda: remove gdrcopy from cuda hmem copy path
    - iouring: Fix wrong indent in ofi_sockapi_accept_uring()
    - Implement ofi_sockctx_uring_poll_add()
    - hmem: introduce gdrcopy from/to cuda iov functions
    - hmem: Deprecate `FI_HMEM_CUDA_ENABLE_XFER`
    - hmem_cuda: Restrict CUDA IPC based on peer accessibility
    - hmem_cuda: Log number of CUDA devices detected
    - hmem_cuda: Refactor global variables
    - tostr: Remove the extra dir "shared/" from "include/" and "src/".
    - hmem_ze: fix ZE is valid check
    - hmem_rocr: fix offset calculation
    - hmem_rocr: use ofi spinlock functions
    - hmem_rocr: minor fixes
    - hmem_neuron: convert warn to info for nrt_get_dmabuf_fd not found
    - hmem_neuron: check existence of neuron devices during initialization
    - tostr: Moved Windows functions in shared/ofi_str.c to windows/osd.h

OBS-URL: https://build.opensuse.org/request/show/1108986
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=93
2023-09-05 07:23:01 +00:00
Dominique Leuenberger
03adceadba Accepting request 1102763 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1102763
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=42
2023-08-09 15:23:55 +00:00
fd28efa431 Accepting request 1102753 from home:NMorey:branches:science:HPC
- Drop support for obsolete TrueScale (bsc#1212146)

OBS-URL: https://build.opensuse.org/request/show/1102753
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=91
2023-08-07 17:25:39 +00:00
Dominique Leuenberger
c63f177d4e Accepting request 1096632 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1096632
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=41
2023-07-04 13:21:43 +00:00
1545d1225e Accepting request 1096631 from home:NMorey:branches:science:HPC
- Update to 1.18.1
  - Core
    - Fix build warning for ofi_dynpoll_get_fd
  - EFA
    - Handle 0-byte writes
    - Apply byte_in_order_128_byte for all memory types
    - Increase default shm_av_size to 256
    - Force handshake before selecting rtm for non-system ifaces.
    - Only select readbase_rtm when both sides support rdma-read
    - Bugfix for initializing SHM offload
    - Correct CPPFLAGS during configure
    - Make setopt support sendrecv aligned 128 bytes
    - Make data size to be 128 byte multiples for in-order aligned send/recv
    - prepare local read pkt entry for in-order aligned send/recv.
    - Disable gdrcopy and cudamemcpy for in-order aligned recv.
    - Increase the pad size in rxr_pkt_entry
    - Make readcopy pkt pool 128 byte aligned
    - Introduce alignment to support in order aligned ops
    - Fix a bug when calling ibv_query_qp_data_in_order
    - RMA operations will ensure FI_ATOMIC cap
    - RMA operations will ensure FI_RMA cap
    - Unittest atomics without FI_ATOMIC cap.
    - Unittest RMA without FI_RMA cap.
    - Refactor pkt_entry assignment in poll_ibv loop
    - Fixes for RDMA Write and Writedata
  - RXM
    - Revert rxm util peer CQ support
    - Fix credit size parameter for flow ctrl
  - SHM
    - Fix DSA enable

OBS-URL: https://build.opensuse.org/request/show/1096631
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=89
2023-07-03 16:43:43 +00:00
Dominique Leuenberger
f78d1c2529 Accepting request 1085713 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1085713
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=40
2023-05-10 14:16:42 +00:00
bd0842cd83 Accepting request 1084707 from home:fcrozat:branches:science:HPC
- Add _multibuild to define additional spec files as additional
  flavors.
  Eliminates the need for source package links in OBS.

OBS-URL: https://build.opensuse.org/request/show/1084707
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=87
2023-05-09 12:56:31 +00:00
Dominique Leuenberger
bb5d2fb283 Accepting request 1080189 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1080189
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=39
2023-04-20 13:13:15 +00:00
add54a60b7 Accepting request 1080188 from home:NMorey:branches:science:HPC
- Update to 1.18.0
  - Core
    - rocr: fix offset calculation
    - rocr: use ofi spinlock functions
    - rocr: minor fixes
    - neuron: convert warn to info for nrt_get_dmabuf_fd not found
    - neuron: check existence of neuron devices during initialization
    - neuron: Add support for neuron dma-buf
    - ze: update ZE to support new driver index specification
    - List variables read from config file
    - Add switch to prefer system-config over environment
    - Add basic system-config support for setting library variables
    - Move peer provider defines into new header
    - rocr: Support asynchronous memory copies
    - rocr: Add support for ROCR IPC
    - rocr: rename rocr data-structures
    - synapseai: return 0 for host_register and host_deregister
    - fabric: Improve log level of provider mismatch
    - cuda: Allow CUDA IPC when P2P disabled
    - ze: add ZE command list pool to reuse command lists
    - cuda: implement cuda_get_xfer_setting for non cuda build
    - cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
    - cuda.c: Add const to param to remove warning
    - Add IFF_RUNNING check to indicate iface is up and running
    - io_uring support enhancements
  - EFA
    - Implement CUDA support on instance types that do not support GPUDirect RDMA
    - Implement fi_write using device's RDMA write capability
    - Enrich error messages with debug and connection info
    - Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt

OBS-URL: https://build.opensuse.org/request/show/1080188
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=85
2023-04-18 20:47:57 +00:00
Dominique Leuenberger
1e1e226034 Accepting request 1075156 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1075156
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=38
2023-03-30 20:50:41 +00:00
6d086ca72a Accepting request 1075155 from home:NMorey:branches:science:HPC
- Update to 1.17.1
  - Core
    - hmem_cuda Add const to param to remove warning
    - Fix typos in fi_ext.h
    - ofi_epoll: Remove unused hot_index struct member
  - EFA
    - Print local/peer addresses for RX write errors
    - Unit test to verify no copy with shm for small host message
    - Avoid unnecessary copy when sending data from shm
    - Compare pci bus id in hints
    - Fix double free in rxr endpoint init
  - Hooks
    - dmabuf_peer_mem: Handle IPC handle caching in L0
  - OPX
    - Exclude from build if missing needed defines
    - Move some logs to optimized builds
    - Fix build warnings for unused return code from posix_memalign
    - Add reliability sanity check to detect when send buffer is illegally altered
    - SDMA Completion workaround for driver cache invalidation race condition
    - Fix replay payload pointer increment
    - Handle completion counter across multiple writes in SDMA
    - Cleanup pointers after free()
    - Modify domain creation to handle soft cache errors
    - Two biband performance improvements
    - Fixes based on Coverity Scan related to auto progress patch
    - Changed poll many argument to rx_caps instead of caps
    - Resynch with server configured for Multi-Engines (DAOS CART Self Tests)
    - Remove import_monitor as ENOSYS case
    - Address memory leaks reported on OFIWG issues page
    - Remove unused fields
    - Fix unwanted print statement case
    - Add replays over SDMA
    - Implement basic TID Cache
    - Revert work_pending check change
    - Fix use_immediate_blocks
    - Restore state after replay packet is NULL
    - Fix memory leak from early arrival packets.
    - Fix segfault in SHM operations from uninitialized value in atomic path.
    - Prevent SDMA work entries from being reused with outstanding
      replays pointing to bounce buf.
    - Set runtime as default for OPX_AV
    - Fix RTS replay immediate data
    - Fix errors caught by the upstream libfabric Coverity Scan
    - Support multiple HFI devices
    - Support OFI_PORT and Contiguous endpoint addresses
    - Update man pages
  - Util
    - util_cq: Remove annoying WARNING message for FI_AFFINITY

OBS-URL: https://build.opensuse.org/request/show/1075155
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=83
2023-03-29 08:24:52 +00:00
Dominique Leuenberger
d5c883a19d Accepting request 1034518 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1034518
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=37
2022-11-09 11:56:28 +00:00
Nicolas Morey-Chaisemartin
1b73b978dd Accepting request 1034517 from home:NMoreyChaisemartin:branches:science:HPC
- Add prov-net-fix-error-path-in-xnet_enable_rdm.patch to fix a deadlock
  when no network interfaces are available (bsc#1205139)

OBS-URL: https://build.opensuse.org/request/show/1034517
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=81
2022-11-08 12:08:06 +00:00
Dominique Leuenberger
b41af68ed2 Accepting request 1012024 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1012024
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=36
2022-10-18 10:44:22 +00:00
Nicolas Morey-Chaisemartin
b4457cf5d3 Accepting request 1012023 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.16.1
  - Core
    - Fix windows implementation to remove fd from poll set
  - PSM3
    - Add missing files to release tarball
  - Util
    - Handle NULL address insertion to fi_av_insert
- Drop prov-rxm-Disable-128-bit-atomics.patch which was merged upstream

OBS-URL: https://build.opensuse.org/request/show/1012023
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=79
2022-10-17 08:21:22 +00:00
Fabian Vogt
99fd313f39 Accepting request 1008574 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1008574
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=35
2022-10-10 16:43:27 +00:00
Nicolas Morey-Chaisemartin
f1f52ea9c9 Accepting request 1008573 from home:NMoreyChaisemartin:branches:science:HPC
- Add prov-rxm-Disable-128-bit-atomics.patch to fix a potential
  segfault on misaligned buffers.

OBS-URL: https://build.opensuse.org/request/show/1008573
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=77
2022-10-06 17:01:30 +00:00
Richard Brown
36926c25e7 Accepting request 1007632 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/1007632
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=34
2022-10-04 18:36:52 +00:00
Nicolas Morey-Chaisemartin
d98f48a74f Accepting request 1007631 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.16.0 (jsc#PED-351, jsc#PED-190)
  - Core
    - Added HMEM IPC cache
    - Use exact string comparison checks for network interfaces
    - Restructuring of poll/epoll abstraction
    - Add ability to disable locks completely in debug builds
    - Serialize access to modifying the logging calls
    - Minor fixes to fi_tostr text formatting
    - Add hmem interface checks to memory registration
  - EFA
    - Added support of Synapse AI memory.
    - Improved error message
  - Net
    - Temporarily forked, optimized version of tcp provider
    - Focused on improved performance and scalability over tcp sockets
    - Fork ensures tcp provider stability while net provider is developed
    - Shares the tcp provider protocol and base implementation for msg endpoints
    - Integrates direct support for rdm endpoints, using a derivative from rxm
    - Implements own protocol for rdm endpoints, separate from rxm;tcp
  - OPX
    - Added initial support for SDMA
    - General performance enhancements
    - Performance improvements to reliability protocol
    - Improved deferred work pending complete
    - Added support for OPX_AV=runtime
    - Support iov memory registration ops
    - Added DAOS RPC support
    - Atomic ops enhancements
    - Improved documentation
    - Debug build enhancements
    - Fixed compiler warnings
    - Reduced time to compile prov/opx code
    - General bug fixes
    - Fixed PSN wrapping scaling
    - Added intranode fence
    - Addressed bugs discovered by coverity scan
  - PSM2
    - Fix sending CQ data in some instances of fi_tsendmsg
  - PSM3
    - Updated to match Intel Ethernet Fabric Suite (IEFS) 11.3 release
  - RxM
    - Update to read multiple completions at once from msg provider
    - Move RxM AV implementation to util code to share with net provider
    - Minor code cleanups
  - SHM
    - Implement and use ipc_cache
    - Add log messages for debugging and error tracking
    - Fix check for FI_MR_HMEM mr_mode
    - Move shm signal handlers initialization to EP
    - Added log messages for errors detected
  - TCP
    - Fix incorrect signaling of the CQ
    - Increase max number of poll events to retrieve
    - Acquire ep lock prior to flushing socket in shutdown
    - Verify ep state prior to progressing socket data
    - Read cm error data when receiving connreq response
    - Log error on connect failure
    - Fix assertion failure in CQ progress function
  - Util
    - Fix text in log of UFFD ioctl failure
    - Introduce cuda ipc monitor
    - Fix CQ memory leak handling overflow
    - Fix MR mode bit check for ver 1.5 and greater
    - Add max_array_size to track/check array overflow
    - Always progress transfers when reading from a CQ
    - Handle NULL address insertion
    - Try IPv4 before IPv6 addresses when starting name server
    - Fix IP util av default address length
    - Fix util IP getinfo path to read hints->addr_format
    - Fix debug print mismatch
    - Fix return code when memory allocation fails.
    - Fix build sign warning in ofi_bufpool_region_alloc
    - Minor code cleanups
    - Print warning if an addr is inserted into an AV again
  - Verbs
    - Fix support of FI_SOCKADDR_IB when requested by the application
    - Ensure all posted receives are flushed to the application
    - Update ofi_mr_cache_search API for hmem IPC support
    - Reduce logging verbosity for "no active ports"
    - Fix incorrect length used in memory registration
    - Various minor bug fixes for test failures
    - Fix a memory leak getting IB address
    - Implement verbs provider on Windows over NetworkDirect API
    - Set and check address format correctly
    - Only close qp if it was initialized
    - Portable detection of loopback device
  - Fabtests
    - multi_ep: Separate EP resources and fix MR registration
    - multi_recv: Fix possible crash and check for valid buffer
    - unexpected_msg: Fix printf compiler warning
    - dgram_pingpong.c: Use out-of-band sync
    - multinode: Make multinode tests platform agnostic, fix formatting
    - ubertest: Fix string comparison to include length, fix writedata completion check
    - av_test: add support for -e <ep_type>
    - New tests:
      - dmabuf-rdma: Component level test for dma-buf RDMA
      - sock_test: Component level performance test of poll, epoll, and select
      - rdm_stress: Multi-threaded, multi-process stress test for RDM endpoints
      - sighandler_test: Regression test for signal handler restoration
- Drop patches fixed upstream:
  - prov-opx-Correctly-disable-OPX-if-unsupported.patch
  - disable-flatten-attr.patch

OBS-URL: https://build.opensuse.org/request/show/1007631
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=75
2022-10-03 07:34:47 +00:00
Dominique Leuenberger
727dd06214 Accepting request 998811 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/998811
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=33
2022-08-24 13:10:49 +00:00
Nicolas Morey-Chaisemartin
deb2507db0 Accepting request 998810 from home:marxin:branches:science:HPC
- Add disable-flatten-attr.patch that drops the flatten attribute.
  The flatten attribute causes very long compile times in the
  inliner (and the resulting binary would also be huge).
- Use %make_build and enable LTO (boo#1133235).
- Synchronize used Patches.

OBS-URL: https://build.opensuse.org/request/show/998810
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=73
2022-08-23 12:14:10 +00:00
Fabian Vogt
8bb8f3d9c7 Accepting request 989962 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/989962
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=32
2022-07-31 21:00:32 +00:00
Nicolas Morey-Chaisemartin
abc00bb762 Accepting request 989191 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.15.1
  - Core
    - Fix fi_info indentation error in fi_tostr
    - hmem_ze: Add runtime option to choose specific copy engine
    - Cleanup of configure HMEM checks
    - Fixed stringop-truncation in ofi_ifaddr_get_speed
    - Add utility provider log suffix to make logs easier to read
    - Fix truncation of ipv6 addressing
    - hmem: add support for AWS Trainium devices
    - Fix potential sscanf overflows
    - hmem: pass through device and flags when querying memory interface
    - Rework locking in several areas to convert spinlocks to mutexes
    - Add new locking abstractions to select lock types at runtime
    - Add new FI_PROTO_RXM_TCP for optimized rxm over tcp path
    - Fix windows implementation to remove fd from poll set
  - EFA
    - Added windows support through efawin (https://github.com/aws/efawin)
    - Added support of AWS neuron.
    - Added support of using gdrcopy to copy data from host to device.
    - Fixed a bug that caused 0 byte read to fail.
    - Fixed a memory corruption issue that could cause forked processes to crash.
    - Extended testing coverage through new pytest based testing framework.
  - HOOKS
    - Add new hooking provider dmabuf_peer_mem
    - Enable DL build of hooking providers
    - Add HMEM memory registration hook
  - OPX
    - New provider supporting Cornelis Networks Omni-path hardware
  - PSM3
    - Updated psm3 to match IEFS 11.2.0.0 release
    - Added support for sockets (TCP/UDP) via a runtime selectable Hardware
      Abstraction Layer (HAL)
    - Added support for IPv6 addressing in RoCE and sockets
    - Added various NIC selection filtering options (wildcarded NIC name,
      address format, wildcarded IP subnet, link speed)
    - Performance tuning in conjunction with OneAPI and OneCCL
    - Improved PSM3_IDENTIFY output
    - Rename most internal symbols to psm3_
    - Corrected vulnerabilities found during Coverity scans
    - configure options refined and help text improved
    - PSM3_MULTI_EP has been deprecated (recommend always enabled, default
      is enabled [same default as previous releases])
    - Various bug fixes
  - RxM
    - Add check that atomic size is valid
    - Add support to passthru calls to tcp provider in specific
  - TCP
    - Add assert to verify RMA source/target msg sizes match
    - Wake-up threads blocked on CQ to update their poll events
    - Fix use of incorrect events in progress handler
    - Fixes for various compile warnings, mostly on Windows
    - Add support for FI_RMA_EVENT capability
    - Add support for completion counters
    - Fix check for CQ data in tagged messages
    - Add cancel support to shared rx context
    - Add src_addr receive buffer matching
    - Add provider control to assign a src_addr with an ep
    - Handle trecv with FI_PEEK flag
    - Allow binding a CQ with an SRX
    - Restructuring of code in source files
    - Handle EWOULDBLOCK returned by send call
    - Add hot (active) pollfd
  - SHM
    - Properly chain the original signal handlers
    - Avoid uninitialized variable with invalid atomic parameters
    - Fix 0 byte SAR read
    - Initialize len parameter to accept
    - Refactor and simplify protocol code
    - Remove broken support for 128-bit atomics
    - Fix FI_INJECT flag support
    - Add assert to verify RMA source/target msg sizes match
    - Set domain threading to thread safe
    - Fix possible use of uninitialized var in av_insert
  - Util
    - Fix sign warning in ofi_bufpool_region_alloc
    - Remove unused variable from ofi_bufpool_destroy
    - Fix check for valid datatype in ofi_atomic_valid
    - Return with error if util_coll_sched_copy fails
    - Fix use of uninitialized variable in ofi_ep_allreduce
    - Fix memory access in ip_av_insertsym
    - Track ep per collective operation not with multicast
    - Restructure collective av set creation/destruction
    - Change most locks from spin locks to mutexes
    - Allow selection of spinlocks for CQ and domain objects
    - Fix AV default addrlen
    - Update fi_getinfo checks to include hints->addr_
    - Handle NULL address insertion to fi_av_insert
  - Verbs
    - Initial changes for compiling on Windows (via NetworkDirect)
    - Add a failover path to dma-buf based memory registration
    - Replace use of spin locks with mutexes
    - Check for valid qp prior to cleanup
    - Set and check for address format correctly in fi_getinfo
  - Fabtests
    - hmem_cuda: used device allocated host buff to fill device buf
    - Add python scripts to control test execution
    - test_configs: include util provider in core config file
    - Add option "--pin-core"
    - Only call nrt_init once
    - Fix a bug in ft_neuron_cleanup
    - Correct help for unit test programs
    - Remove duplicate help prints from fi_mcast
    - configure.ac: fix --enable-debug=no not properly detected
    - msg_inject: handle the case ft_tsendmsg return -FI_EAGAIN
    - Add AWS Trainium device support
    - fi_inj_complete: Add FI_INJECT to fabtests
    - inj_complete.c: Make arguments align with the other tests
    - dgram_pingpong: handle the error return of fi_recv
    - recv_cancel: Remove requirement for unexpected msg handling
    - poll: Fix crash if unable to allocate pollset
    - ubertest: Add GPU testing and validation support
    - Add HMEM options parsing support
    - Update and re-enable fi_multi_ep test
- Add prov-opx-Correctly-disable-OPX-if-unsupported.patch to disable
  OPX compilation on non x86_64 systems

OBS-URL: https://build.opensuse.org/request/show/989191
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=71
2022-07-18 13:06:07 +00:00
Dominique Leuenberger
698f4fa244 Accepting request 971080 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/971080
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=31
2022-04-22 19:53:05 +00:00
Nicolas Morey-Chaisemartin
36cbb47841 Accepting request 971079 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.14.1
  - Core
    - Use non-shared memory allocations to use MADV_DONTFORK safely
    - Fix incorrect use of gdr_copy_from_mapping
    - Ensure proper timeout time for pollfds to avoid early exit
  - EFA
    - Handle read completion properly for multi_recv
    - Use shm's inject write when possible
    - Support 0 byte read
  - RxM
    - Ensure signaling the CQ fd after writing completion
    - Fix inject path for sending tagged messages with cq data
    - Negotiate credit based flow control support over CM
    - Add PID to CM messages to detect stale vs duplicate connections
    - Fix race handling unexpected messages from unknown peers
    - Fix possible leak of stack data in cm_accept
    - Restrict reported caps based on core provider
    - Delay starting listen until endpoint fully initialized
    - Verify valid atomic size
  - Sockets
    - Fix coverity reports on uninitialized data
    - Check for NULL pointers passed to memcpy
    - Add missing error return code from sock_ep_enable
  - TCP
    - Fix performance regression resulting from sparse pollfd sets
    - Fix assertion failure in CQ progress function
    - Do not generate error completions for inject msgs
    - Fix use of incorrect event names in progress handler
    - Fix check for CQ data in tagged messages
    - Make start_op array a static to reduce memory

OBS-URL: https://build.opensuse.org/request/show/971079
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=69
2022-04-20 11:30:05 +00:00
Dominique Leuenberger
77029653be Accepting request 933768 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/933768
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=30
2021-11-28 20:29:57 +00:00
Nicolas Morey-Chaisemartin
a69e2dce28 Accepting request 932983 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.14.0
  - Add time stamps to log messages
  - Fix gdrcopy calculation of memory region size when aligned
  - Allow user to disable use of p2p transfers
  - Update fi_tostr to print FI_SHARED_CONTEXT text instead of value
  - Update fi_tostr to output field names matching header file names
  - Fix narrow race condition in ofi_init
  - Add new fi_log_sparse API to rate limit repeated log output
  - Define memory registration for buffers used for collective operations
  - EFA, SHM, TCP, RXM, and verbs fixes

OBS-URL: https://build.opensuse.org/request/show/932983
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=68
2021-11-25 14:12:36 +00:00
Dominique Leuenberger
b350b1181b Accepting request 928954 from science:HPC
- Enable PSM3 provider (jsc#SLE-18754)

- Update to 1.13.2
  - Sort DL providers to ensure consistent load ordering
  - Update hooking providers to handle fi_open_ops calls to avoid crashes
  - Replace cassert with assert.h to avoid C++ headers in C code
  - Enhance serialization for memory monitors to handle external monitors
  - EFA, SHM, TCP, RxM and verbs fixes

OBS-URL: https://build.opensuse.org/request/show/928954
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=29
2021-11-08 16:24:08 +00:00
Nicolas Morey-Chaisemartin
ad6d9ec62e Accepting request 928952 from home:NMoreyChaisemartin:branches:science:HPC
- Enable PSM3 provider (jsc#SLE-18754)

OBS-URL: https://build.opensuse.org/request/show/928952
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=67
2021-11-03 08:07:55 +00:00
Nicolas Morey-Chaisemartin
dd36aca7a8 Accepting request 928694 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.13.2
  - Sort DL providers to ensure consistent load ordering
  - Update hooking providers to handle fi_open_ops calls to avoid crashes
  - Replace cassert with assert.h to avoid C++ headers in C code
  - Enhance serialization for memory monitors to handle external monitors
  - EFA, SHM, TCP, RxM and verbs fixes

OBS-URL: https://build.opensuse.org/request/show/928694
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=66
2021-11-02 09:39:02 +00:00
Dominique Leuenberger
1668f45c04 Accepting request 917139 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/917139
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=28
2021-09-08 19:36:33 +00:00
Nicolas Morey-Chaisemartin
a480721370 Accepting request 917134 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.13.1
  - Enable loading ZE library with dlopen()
  - Add IPv6 support to fi_pingpong
  - EFA, PSM3 and SHM fixes

OBS-URL: https://build.opensuse.org/request/show/917134
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=65
2021-09-06 14:36:39 +00:00
Dominique Leuenberger
7c552a978d Accepting request 905237 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/905237
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=27
2021-07-16 20:12:28 +00:00
Nicolas Morey-Chaisemartin
c26bb2e322 Accepting request 905235 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.13.0
  - Fix behavior of fi_param_get parsing an invalid boolean value
  - Add new APIs to open, export, and import specialized fid's
  - Define ability to import a monitor into the registration cache
  - Add API support for INT128/UINT128 atomics
  - Fix incorrect check for provider name in getinfo filtering path
  - Allow core providers to return default attributes which are lower than
    maximum supported attributes in getinfo call
  - Add option to prefer external providers (in order discovered) over internal
    providers, regardless of provider version
  - Separate Ze (level-0) and DRM dependencies
  - Always maintain a list of all discovered providers
  - Fix incorrect CUDA warnings
  - Fix bug in cuda init/cleanup checking for gdrcopy support
  - Shift the order in which providers are called in fi_getinfo: move psm2
    ahead of psm3 and efa ahead of psmX
  - See NEWS.md for changelog

OBS-URL: https://build.opensuse.org/request/show/905235
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=64
2021-07-09 10:59:38 +00:00
Richard Brown
890f767b43 Accepting request 882724 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/882724
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=26
2021-04-08 19:01:51 +00:00
Nicolas Morey-Chaisemartin
948cc1e28f Accepting request 882701 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.12.1
  - Fix initialization checks for CUDA HMEM support
  - Fail if a memory monitor is requested but not available
  - Adjust priority of psm3 provider to prefer HW specific providers,
    such as efa and psm2
  - EFA and PSM3 fixes
  - See NEWS.md for changelog

OBS-URL: https://build.opensuse.org/request/show/882701
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=62
2021-04-02 13:48:53 +00:00
Richard Brown
83459a9280 Accepting request 879116 from science:HPC
OBS-URL: https://build.opensuse.org/request/show/879116
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/libfabric?expand=0&rev=25
2021-03-16 14:42:51 +00:00
Nicolas Morey-Chaisemartin
1cc7aa642e Accepting request 879115 from home:NMoreyChaisemartin:branches:science:HPC
- Update to 1.12.0
  - See NEWS.md for changelog

OBS-URL: https://build.opensuse.org/request/show/879115
OBS-URL: https://build.opensuse.org/package/show/science:HPC/libfabric?expand=0&rev=60
2021-03-15 09:05:41 +00:00