Name: libfabric1 | Distribution: openSUSE Tumbleweed
Version: 2.0.0 | Vendor: openSUSE
Release: 1.1 | Build date: Mon Dec 16 09:34:01 2024
Group: System/Libraries | Build host: reproducible
Size: 1624041 | Source RPM: libfabric-2.0.0-1.1.src.rpm
Packager: http://bugs.opensuse.org
Url: http://www.github.com/ofiwg/libfabric
Summary: User-space RDMA fabric interfaces
libfabric provides a user-space API to access high-performance fabric services, such as RDMA. This package contains the runtime library.
License: BSD-2-Clause OR GPL-2.0-only
* Mon Dec 16 2024 Nicolas Morey <nicolas.morey@suse.com> - Update to v2.0.0 (jsc#PED-9661, jsc#PED-10668) - Core - hmem/cuda: avoid stub loading at runtime - Makefile.am: Keep using libfabric.so.1 as the soname - xpmem: Cleanup xpmem before monitors - Remove redundant windows.h - hmem/cuda: Add env variable to enable/disable CUDA DMABUF - Update ofi_vrb_speed - xpmem: Fix compilation warning - Change the xpmem log level to info - Clarify FI_HMEM support of inject calls - Introduce Sub-MR - Define capability for directed receive without wildcard src_addr - Define capability for tagged message only directed recv - Define capability bit for tagged multi receive - Define flag for single use MR - Move flags only used for memory registration calls to fi_domain.h - windows/osd.h: fix and refactor logical operations on complex numbers - man/fi_peer: update peer fid initialization language - Remove CURRENT_SYMVER() macro - 1.8 ABI compat - hmem/ze: Fix mismatched library name in an error message - Add FI_PEER as a capability - Add missing FI_AV_USER_ID to cap tostr - Update and clarify peer SRX API flow - Prefix public xpmem symbols with ofi - Add rbmap foreach node utility function - ofi_mem: Add release bufpool validity check - hmem/rocr: Don't attempt to get device info when pointer type is unknown.
- hmem: Added handle field to close_handle - Introduce new atomic datatypes and operation - Define new tag formats - Add new peer group feature - Add fi_fabric2() API - Deprecate old MR modes - Deprecate FI_WAIT_MUTEX_COND - Deprecate wait set and poll set - Require using libfabric APIs to allocate fi_info structures - Cleanup FI_ORDER flags - Deprecate support for async memory registration - Remove total_buffered_recv - Deprecate comp_order attribute - Simplify progress definition - Simplify threading models - Move FI_BUFFERED_RECV to internal flag - Simplify the AV API - Remove internally used definitions from public headers - hmem/cuda: Modify the logging for nvml dlopen - hmem/rocr: Fix dmabuf for amd gpu implementation - CXI - Add FI_OPT_CUDA_API_PERMITTED tests - Define FI_CXI_FORCE_DEV_REG_COPY - Support FI_OPT_CUDA_API_PERMITTED - Testing FI_RM_ENABLED - Correct checking of MR test rc - Update unit test for collectives - Add test for invalid client RKEY - Fix broken client key check - Ignore FLT_OVERFLOW and FLT_INVALID errors - Update CXI man page. - Enable dmabuf for ROCR by default. 
- Remove disable_dmabuf_cuda and disable_dmabuf_rocr - Disable use of dmabuf by default for cuda - Remove use of deprecated FI_ORDER_NONE - Report RMA order used in debug output - Remove srx unittests - Add FI_PEER capability bit - Support shared receive queues - Implement shared Completion Queues - Update provider man page - Update version to 2.0 - Remove setting total_buffered_recv - Update CXI provider - FI_PATH_MAX is removed in 2.0 API - EFA - Skip rx pkt refill under certain threshold - Fix efa multi recv setopt segfault - Add tracepoints for rma operations - Adjust the location of tracepoint - Implement the rma interface - Fix efa_msg flags - Remove efa_send_wr, send_wr_pool and recv_wr_pool from dgram_ep - Fix the read_bad_recv_status unit test - Implement efa_msg interface - Implement FI_MORE for fi_recv in zero copy recv mode - Fix the error path of zero copy recv - Move inject sizes from rdm ep to base ep - Fix the ep list scan in cq/cntr read - Fix the error handling for unsolicited recv - Fall back to zero sl when non-zero sl qp creation failed - Disable zero copy receive if p2p is not available - Initialize efa fork support in EFA_INI - Update efa_hmem and efa_fork_support log to FI_LOG_CORE - Make efa_hmem_info a global variable - Set max rma order size correctly - Remove unused fields from various data structures - Update efa shm implementation to allocate fi_peer_srx_context - Avoid gdr_pin/gdr_map for dmabuf mrs - Only do dmabuf reg when FI_MR_DMABUF is set - Report correct inject_msg_size for zcpy rx - Add setopt/getopt support for remaining EP sizes - Split RDM EP inject size field into MSG,RMA variants - Use tclass to prioritize the messages from an ep - Remove tx_size and rx_size from efa_rdm_ep - Remove tx_iov_limit and rx_iov_limit from efa_rdm_ep - Remove DC NACK packet from rxe map after recv completed - Correctly handle fallback longcts-rtw send completion - Differentiate unresponsive receiver errors following rdma-core - Make NACK 
protocol fall back to DC longCTS when DC is requested - Update help message for inter_min_read_write_size - Adjust log level for setopt/getopt - Add dependency header file in fi_ext_efa.h - Test: Disable shm via fi_setopt - Rename p2p_available to mr_p2p_available - Always use p2p for system memory - Test: Use correct qp num in the mock - Shrink the size of extra_info array - Improve the zero-copy recv error message. - Update read nack protocol docs - Receiver send NACK if p2p is unavailable - Sender switch to emulated long CTS write if p2p unavailable - Adjust log level for shm disabling. - Check p2p support to use rdma read - Add device to host copy for inject rdma write - Copy user buffer for fi_sendmsg with FI_INJECT - Respect FI_MR_LOCAL in transport path - Zero the cq entry array in dgram ep progress - Remove unit test for libfabric 1.1 API - Replace deprecated MR modes - Remove deprecated FI_ORDER flag - Update EP's `inject_size` in zero-copy mode - Add support for `FI_OPT_INJECT_RMA_SIZE` - Query for shm's FI_PEER capability - Require FI_MR_LOCAL for zero-copy receive - Correctly handle fallback longcts-rtm send completion - Adjust the logging for pke exhaustion - Fix a memory leak in local read - Use dlist_foreach_container_safe to iterate progressed ep list - refactor hmem interface initialization - Fix a memory leak in efa_rdm_ep_post_handshake - disable zero-copy receive if p2p is not supported - Update data types for various IOV operations - Require shm to be disabled for using zero-copy recv - Register user recv buffer for zero-copy receive mode - Make fi_cancel return EOPNOTSUPP for zero copy receive mode. - Handle receive window overflow - Introduce FI_EFA_IFACE to restrict visible NICs - Allow disabling unsolicited write recv via env - Hook - Fix the preprocessor - Trace: Add trace log for domain_attr. 
- LNX - Initialize flags to 0 - Convert peer table to use buffer pools - Fix av strncpy - Fix various issues with initial commit - Initial addition - LPP - Initial addition - OPX - Use page_sizes[OFI_PAGE_SIZE] instead of PAGE_SIZE - Set immediate ACK requested bit when sending last packet of RMA PUT - Add debug check for zero-byte length data packets - Conditionally set FI_REMOTE_CQ_DATA on receive - Include less immediate data in RTS packet to improve rendezvous performance - Investigate and address indeterminate behavior or segfault resulting from ignored context creation error - fi_info -e fix for FI_OPX_UUID env var - Fix last_bytes field for replay over sdma - Fix eager and mp eager - Fix payload copy - Add FI_OPX_TID_MIN_PAYLOAD_BYTES param - Fix incorrect calculation of immediate block offset in send rendezvous - Initialize nic info in fi_info - Simplify fi_opx_check_rma() function. - added OPX Tracer points to RMA code paths - Fix credit return - Remove polling call from internal rma write - Support 16B SDMA CTS work - Fix uepkt 16B headers - 16B SDMA header support - Man: Document OPX max ping envvars - Link bounce support for OPX WFR - Scb/hdr changes - Updated configure.m4 for ROCR - Capitalized env var used for production override, also added opx to the front. - Remove FI_CONTEXT2 requirement - Only posting one completion for rzv truncation receives. - Fixing bug for credit check in inject code path. - Resolve coverity scan defects uncovered after upstream - Replace fi_opx_context_slist with slist - Remove assert from find pkt by tag - Add OPX Tracer EP lock and Recv entries - CN5000/JKR: Changes needed to get RMA working in 16B - Added GDRCopy logging and failure path - Initial 16B header support - Fix wrong function used when copying from HMEM/rocr. 
- Create GPU-specific SDMA/RZV thresholds - Don't try to get HMEM iface for NULL pointers - Limit the number of reliability pings on credit-constrained flows - Remove function table entries for reliability types other than ONLOAD - PSM3 - Fix logical atomic function calls - Check atomic op error code - Disable complex comparison combinations - Fix incorrect unlock function - PSM2 - Check return value of asprintf - Fix incorrect unlock function - RXM - Fix rxm multi recv setopt segfault - Replace rxm managed srx with util srx, support FI_PEER - Add rxm support for using peer CQs and counters - Add FI_AV_USER_ID support - Fix definition of the rxm SAR segment enum - SHM - Fix shm multi recv setopt segfault - Cleanup op flags - Add unmap_region function - Use owner-allocated srx - Fix incorrect capability set - Make progress errors ints instead of uint64 - Remove unused err path from progress_iov - Refactor initialization process - Put smr_map memory into av - Add FI_PEER capability - Refactor ze ipc path to use pidfd - TCP - Fix incorrect usage of av insert apis when multiplexing - Initialize addr_size when duplicating an av - Introduce sub-domains to support FI_THREAD_COMPLETION - Sockets - Fixed coverity issue for unchecked return value.
- UCX - Fix segfault in ucx_send_callback - Fix incorrect return value checking for fi_param_get() - Support FI_OPT_CUDA_API_PERMITTED in fi_setopt() - Fix error code for fi_setopt()/fi_getopt() - Util - Set srx completion flags and msg_len properly - fi_pingpong: Fix coverity issue about integer overflow - Change uffd stop routine to use pipe - Integrate kdreg2 into libfabric - mr_cache: Support compile default monitor - Handle page faults in uffd monitor - Allow providers to update cache MR IOV - Log AV insert with AV's specified address format - Add uffd user mode flag for kernels - Initialize ROCR name in memory monitor struct - Support specific placement of addr into the av - Verbs - Fix coverity issue about overflowed return value - Enable implicit dmabuf mr reg for more HMEM ifaces - Fix resource leak in error handling path - Replace __BITS_PER_LONG with LONG_WIDTH - Fix issue while displaying addresses with fi_info -a <addr_format> - Fabtests - Add opts.min_multi_recv_size to set opt before enable - Add FI_MORE pytest for fi_recv in zcpy recv mode - Allow tests with FI_MORE flag by using fi_recvmsg - New fabtest fi_flood to test over subscription of resources - test_configs/ofi_rxm/tcp.test: remove cntr RMA testing - Fix compiler warning about unitialized variable - Fix compilation error about CMPLX with C99 - Added -E/env option to multinode test script - Change xfer-method variable to xfer_method in runmultinode.sh - Fix complex fill cast - efa: Remove rnr cq error message check - efa: Loose assertion for read request counters - runfabtests.cmd: add atomic tests to windows testing - runfabtests.sh: add rdm_atomic validation tests - rdm_atomic: add data validation - Change ZE memset to use uint8 - Change sync message to be 0 bytes instead of 1 byte - Fix atomic buffer - Add hmem support to common atomic validation - Move ubertest atomic validation code to common - Use new synapse api - Update fi_multinode test - Update runmultinode.py with args - Added 
inband sync to ft_init_fabric_cm - lpp: remove deprecated FI_MR_BASIC - Add option for conditionally building lpp - Make building efa conditional - Call provider specific configure - efa: Skip inter_min_write_write_size test when rdma write is on - efa: Add efa_rdma_checker - lpp: remove invalid condition in fi_tsenddata - Support no prepost RX pingpong test - Split out ft_sync logic - Define common run pingpong function - Move pingpong logic into pre-posted func - lpp: update version and protocol in fi_getinfo - lpp: fix compile warnings - Remove multi_ep from tcp exclude - runfabtests.sh: add more multi_ep tests - Add common threading option - multi_ep: use common long ops, switch shared-av and cq opts - multi_ep: add closing and reopening of MRs - multi_ep: add RMA validation - Create common raw key functions - multi_ep: separate MR resources per EP - efa: Skip memory registration that hit device limit - efa: Avoid testing duplicate mixed memory type workload - lpp: Fix compiler warning about unused variables - Remove deprecated MR modes - Remove fi_poll and fi_dgram_waitset tests (deprecated feature) - Add LPP specific fabtests - Add `inject_size` to `ft_opts` - Add pytests for FI_MORE Test fi_rma_bw and fi_rdm_tagged_bw with flag FI_MORE. - Use fi_writemsg to test rma write/writedata with FI_MORE - Use fi_sendmsg to test rdm_tagged_bw with FI_MORE - Add option for running tests with FI_MORE - synapse: Remove dependency of scal - Pass `memory_type` to client server test
* Mon Dec 02 2024 Nicolas Morey <nicolas.morey@suse.com> - Completely remove building for AVX/AVX2 in PSM3 (bsc#1213538, bsc#1233356, bsc#1234014) Runtime detection before initializing the provider is not enough, as PSM3 uses constructors which may include AVX instructions. Only require SSE4.2, as it has a large performance impact on calculating packet hashes.
- Remove psm3-fix-SIGILL-on-system-not-supporting-AVX.patch - Add psm3-prevent-code-from-building-using-AVX-AVX2.patch - Add _constraints to mark SSE4.2 as required
* Thu Nov 28 2024 Nicolas Morey <nicolas.morey@suse.com> - Add psm3-fix-SIGILL-on-system-not-supporting-AVX.patch to fix SIGILL happening during init on older CPUs (bsc#1213538, bsc#1233356). - Refresh libfabric-libtool.patch to support patch -p0
* Mon Aug 05 2024 Filip Kastl <filip.kastl@suse.com> - Add -Wno-incompatible-pointer-types to CFLAGS to enable building for 32bit with GCC 14.
* Sun Aug 04 2024 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.22.0 - Coll - Fix Coverity issues - Core - General bug fixes - hmem: change neuron get_dmabuf_fd error code - Fix an error in the error handling path of fi_param_define() - Makefile.am: Add Windows build files to distribution tarball - hmem: disable ZE IPC - Add profile variables for connections and memory allocated - hmem: Fix `cuDeviceCanAccessPeer()` error reporting - man: Update text for `len` parameter - Add page size MR attr field - man: Extend fi_mr_refresh support - man: Improve FI_MR_ALLOCATED documentation - man: Support optional MR desc - man: Improve FI_MR_HMEM documentation - Added ofi_get_realtime interfaces - Add endpoint options for max message size and inject size - Add Windows definition for `EREMOTEIO` - EFA - General improvement and bug fixes - Handle recv cancel for zero copy recv - Avoid iterating EP list in CQ read - Add RDMA core errno for remote unknown peer - Map EFA errnos to Libfabric codes - Improve the zero-copy receive feature - Improve the handshake enforcement procedure - Support unsolicited rdma-write recv - Support FI_MORE for eager send and rdma-write - Improve the EFA_IO_COMP error code and explanation - Improve the unit test for LL128 protocol - Distinguish max RMA size from msg size - Hooks - dmabuf: Fix incompatible pointer warning - OPX - Add missing file needed for fabric direct build to release package - Fix
performance issue caused by not setting ACK bit in the single SDMA packet case - TID cache debug improvements - Detection of driver lack of support for TID - Multi-CTS support for TID - Removal of statement that TID is not supported - OPX Tracer improvements - Improvements to OPX shared memory cleanup - H to H performance improvements for build that supports HMEM - Bug fix for a threshold check - Bug fix for FI_SELECTIVE_COMPLETION - CN5000 fixes - Parameterization of various thresholds - Further enhancements to support NVIDIA GPUs, included CUDA-allocated bounce buffers and in-provider support for GDRCopy - Enhancements to enable support for CN5000 hardware - Better checking for TID support - General TID enhancements - Pkey error handling - Send work queue splitting - Support for OPX tracer for profiling purposes - Coverity scan fixes - Fixes and enhancements to logging and debug messages - Intranode RMA read fixes - Fix compile issues - Fix shared memory segment index creation bug - PSM3 - Update provider to sync with IEFS 11.7.0.0.110 - Improved auto-tuning features for PSM3, including dynamic Credit Flows and detecting the presence of the rv kernel module - Improved PSM3 intra-node performance for large message sizes - SHM - Added support for write() method to submit DSA work - Touch all buffer pages after DSA page fault - Add return and more descriptive error message - Fix coverity about incorrect sign - Fix memory leaks for srx - Fix atomic read - Sockets - Fix Coverity issues - USNIC - Fix a few Coverity issues - Util - Discard outstanding operations in util_srx_close - Enable profile on the size of bufpool allocated. - Add more predefined profile variables. 
- Fix issue while displaying addresses with fi_info -a <addr_format> - fi_pingpong: Fix out of scope memory leak - Add source address to fi_pingpong - Verbs - Flush CQ for SQ on no SQ credit - Optimize search for device max inline size - Enable profiling - Fabtests - pytest/shm: reduce the msg size in test_unexpected_msg - Fix synapseai fabtests build - Add pytests for EFA zero-copy receive - Add benchmark option for `FI_OPT_MAX_MSG_SIZE` - benchmarks: Add synapseai support - Disable fi_rdm_tagged_peek test for ucx and psm3 - Add manual init sync to fi_rdm_multiclient and fi_rdm - Refactor ft_sock_sync to take in a socket - Add fi_rdm_bw test - Skip rma_pingpong write tests - Init rx_buf before sending data - Add rma_pingpong tests to makefile - pytest: use different message sizes for rma pingpong - Fix missing fixture memory_type in test_rma_pingpong_range_no_inject - pytest: account for process startup overhead in client-server tests - pytest: save client process output to a file - Support testing inject with cq data - multinode: update arguments - multi_ep: Fix memory leak - rdm_tagged_peek: Align rx's msg_order with tx's - Add backlog > 0 to listen call * Wed Apr 03 2024 Nicolas Morey <nicolas.morey@suse.com> - Enable ucx and new efa provider on 64b architectures. - Use a single changes file for libfabric and fabtests. 
- Update to 1.21.0 - Core - Various update and fixed in man pages - Fix xpmem memory corruption - Extend FI_PROVIDER_PATH to allow setting preferred DL provider - Add a SECURITY.md file - Document preferred threading model for scalable endpoints - Move FI_PRIORITY to internal flag - Remove FI_PROV_SPECIFIC - Remove unimplemented or unused features - Support cntr byte counting - configure: Do not check for xpmem if disabled - Add FI_PROGRESS_CONTROL_UNIFIED - hmem/cuda: Get multiple attributes at once in cuda_is_addr_valid - configure: Add -pipe by default to CFLAGS - Selectively generate warnings on failed loading of DL providers - hmem: introduce ofi_dev_reg_copy_*_iov ops - Print provider path on fabric creation - Introduce FI_OPT_SHARED_MEMORY_PERMITTED - README.md: Add badge for openssf scorecard - man: Regulate the fi_setopt call sequence. - man: Clarify the usage of FI_RMOTE_CQ_DATA flag - man: Add ucx provider to the fi_provider man page - configure.ac: add extra check for 128 bit atomic support - include/osd: align atomic complex definitions - hmem/synapseai: Refine the error handling and warning - Specify C11 standard for Visual Studio builds - configure: Do not check for xpmem if disabled - man page fixes - EFA - General improvement and bug fixes - Propagate errnos from core functions untouched - Create 1:1 relationship between libfabric CQs and IBV CQs - Do not progress ep inside transmission call when hitting EAGAIN - Remove unnecessary check in rdma write. - Handle rx pkts error without ope - Add a new rx pkt counter - Enable runting for neuron with a different runt size - Distinguish unresponsive receiver errors - Remove unnecessary handshake in send path - Don't fail the whole domain init if cudamalloc failed - Introduce efa specific domain operations - Implement FI_OPT_SHARED_MEMORY_PERMITTED - Do not memset rxe to 0 on init - Reduce # of error cases in happy path - Add FI_EFA_USE_HUGE_PAGE to efa man page. 
- Don't do handshake for local fi_write - Add pingpong test after exhausting MRs - Introduce utilities to exhaust MRs on EFA device - Test EFA with a 1GiB message - Do not abort on all deprecated env vars - Onboard fi_mr_dmabuf API in mem reg ops. - Try registering cuda memory via dmabuf when checking p2p - Introduce HAVE_EFA_DMABUF_MR macro in configure - Use long CTS protocol if long read and runting read protocols fail because of memory registration limits - Remove unnecessary check in rdma write. - Enable runting for neuron with a different runt size - Handle rx pkts error without ope - Distinguish unresponsive receiver errors - Add `efa_show_help()` - Refactor error code definitions - Remove error message assertions from CQ unit tests - Refactor `efa_strerror()` - Doxyfile: Configure tabs to 8 spaces - Rename Doxyfile - Hooks - dmabuf_peer_mem: initialize fd to supress compiler warning - NETDIR - Removed. The functionality is intergrated into the verbs provider. - OPX - Fix compiler warnings and coverity issues - General improvement and bug fixes - Add GPU support to expected TID - RZV RTS packet exclude empty immediate data - Add more efficient check for cuda-resident user buffer - Improve default HFI selection logic in multi rail environments - Flush dead list opportunistically - Add RISC-V support - Make update HDRQ register frequency configurable at build time - Removed all references to the reliability nack threshold env var - Added missing tuneables, rearraged to match fi_info -e output - Use BAR load/store macros - Check HFI driver version to allow GPU-enabled build/run - Added kernel and driver version check to allow/disallow expected receive TID - Fix max SHM connections to allow up to 16 HFIs - Use FI_HMEM_SYSTEM for Cuda-Managed (Unified) memory - Handle FI_OPT_CUDA_API_PERMITTED - Use contiguous send when only one iov present - Always replay TID packets over SDMA - Add Virtual Lane and Partition pkey (FI_OPX_SL and FI_OPX_PKEY) - Forced AV type to 
be AV Map when requested AV is unsupported - Reduce size of opx_shm_tx - Add GPU support for RMA Atomic operations - Add GPU support for RMA reads and writes - Add HMEM debug counters - Print debug counters upon receiving SIGUSR1 - Fix multi-receive to work with contiguous rzv payload - Initial support for GPU / FI_HMEM - Limit multipacket eager implementation to tagged sends - Read, verify and store some hfi chip attributes - PSM3 - Update provider to sync with IEFS 11.6.0.0.231 - Fix some conditional build errors - RSTREAM - Removed. - RXM - Add option to auto detect hmem iface of user buffers - SHM - Manually align 8 byte fields in memory region - Close device_fds for connected peers when the EP is closed - Print shm name and error code when failed to open - Mark send as completed when a message is discarded - Don't close dmabuf-fd when a request is done - Revert the smr_region fields adjustment - Fix various coverity issues - Add ep to cq ep list once in cq bind - Add ofi_buf_alloc error handling - Revert the smr_region fields adjustment - Don't close dmabuf-fd when a request is done - Mark send as completed when a message is discarded - Print shm name and error code when failed to open - Close device_fds for connected peers when the EP is closed - SOCKETS - fix compiler warnings and coverity issues - UCX - Fix incorrect enum value in FI_DBG() and FI_WARN() - USNIC - Turn off compiler warnings of possible string truncation - Util - Make ep_list_lock noop for FI_PROGRESS_CONTROL_UNIFIED - Save control progress model to util_domain - Set import monitor state to idle upon close - Add name field to memory monitors - memhooks: Fix a bug when calculating mprotect region - Modify domain_attr based on FI_AV_AUTH_KEY - Verbs - Non-blocking EP creation - Address cm_id resource leak in rdma_reject path - Redirected error handle logic for dmabuf failure in verbs - Added rocr dmabuf support under verbs - Windows: Check error code from GetPrivateData - Add missing lock to 
protect SRX - Fix compiler warnings about out of boundary access - Fabtests - Fix various coverity issues - General improvement and bug fixes - Add multi_ep test - Serialize the run of fi_cq_test - Utilize `junitparser` module directly - Add progress models to SHM/EFA fabtests - Add option to change progress model - efa/rnr_cq_read_err: poll cq when hitting EAGAIN - Allow testing multi_ep with shared/non-shared cq and av - Print warning for HMEM iface init failure - efa: Add small tx_rx size test - pytest: Make ssh connection error pattern less stringent - Add new exclude file for io_uring tests - Add rma_pingpong benchmark - efa: Make 1G tests run faster - pytests: add command line argument for dmabuf reg - Bump Libfabric API version. - Add option to support dmabuf MR - Add dmabuf ops for cuda. - Replace strtok with strtok_r - Add new exclude file for io_uring tests * Mon Mar 25 2024 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.20.1 - Core - hmem/ze: Change the library name passed to dlopen - hmem/ze: map device id to physical device - hmem/ze: skip duplicate initialization - hmem/ze: dynamically allocate device resources based on number of devices - hmem/ze: fix hmem_ze_copy_engine variable look up - hmem/ze: Increase ZE_MAX_DEVICES to 32 - man: Fix typo in fi_getinfo man page - Fix compiler warning when compiling with ICX - man: Fix fi_rxm.7 and fi_collective.3 man pages - man: Update EFA docs for FI_EFA_INTER_MIN_READ_WRITE_SIZE - EFA - efa_rdm_ep_record_tx_op_submitted() rm peer lookup - Remove peer lookup from efa_rdm_pke_sendv() - Make handshake response use txe - test: Only close SHM if SHM peer is Created - Handshake code allocs txe via efa util - Initialize txe.rma_iov_count to 0 - Switch fi_addr to efa_rdm_peer in trigger_handshake - Downgrade EFA Endpoint Creation WARN to INFO - Init srx_ctx before use - Clean up generic_send path - Pass in efa_rdm_ep to efa_rdm_msg_generic_recv() - Make recv path slightly more efficient - re-org rma write to 
avoid duplicate checks - Add missing sync_memops call to writedata - use peer pointer from txe in read, write and send - Pass in peer pointer to txe - Get rid of noop instruction from empty #define - Remove noop memset - Fix the ibv cq error handling. - Don't do handshake for local read - Fix a typo in configure.m4 - Make runt_size aligned - OPX - Initialize cq error data size - RXM - Fix data error with FI_OFI_RXM_USE_RNDV_WRITE=1 - SHM - Fix coverity issue about resource leak - Adjust the order of smr_region fields. - Allocate peer device fds dynamically - Util - Fix coverity issue about missing lock - Implement timeout in util_wait_yield_run() - Fix bug in util_cq startup error case - util_mem_hooks: add missing parantheses - Verbs - Windows: Resolve regression in user data retrieval - Fabtests - efa: Close ibv device after use - efa: Get device MR limit from ibv_query_device - efa: Add simple unexpected test to MR exhaustion test - pytest: add a new ssh connection error pattern * Thu Feb 29 2024 pgajdos@suse.com - Use %autosetup macro. Allows to eliminate the usage of deprecated %patchN * Sun Nov 19 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.20.0 (jsc#PED-5777, jsc#PED-5893, jsc#PED-5889) - Core - General bug fixes and code clean-up - configure.ac: add extra check for 128 bit atomic support - hmem/synapseai: Refine the error handling and warning - Introduce FI_ENOMR - hmem/cuda: fix a bug when calculating aligned size. - Handle dmabuf for ofi_mr_cache* functions. - Handle dmabuf flag in ofi_mr_attr_update - Handle dmabuf for mr_map insert. 
- man: Fix the description of virtual address when FI_MR_DMABUF is set - man: Clarify the definition of FI_OPT_MIN_MULTI_RECV - hmem/cuda: Add dmabuf fd ops functions - include/ofi_atomic_queue: Properly align atomic values - Define fi_av_set_user_id - Support multiple auth keys per EP - Simplify restricted-dl feature - hmem: Only initialize synapseai if device exists - Add "--enable-profile" option - windows: Updated config.h - Add environment variable for selective HMEM initialization - Add restricted dlopen flag to configure options - hmem: generalize the use of OFI_HMEM_DATA to non-cuda iface - hmem: fail cuda_dev_register if gdrcopy is not enabled - Add 1.7 ABI compat - Define fi_domain_attr::max_ep_auth_key - hmem: Add new op to hmem_ops for getting dmabuf fd - hmem/cuda: Update cuda_gdrcopy_dev_register's signature - mr_cache: Define ofi_mr_info::flags - Add ABI compat for fi_cq_err_entry::src_addr - Define fi_cq_err_entry::src_addr - Add base_addr to fi_mr_dmabuf - hmem: Set FI_HMEM_HOST_ALLOC for ze addr valid - hmem: Support dev reg with FI_HMEM_ZE - tostr: Added fi_tostr() for data type struct fi_cq_err_entry.
- hmem_ze: fix incorrect device id in copy function - Introduce new profiling interface for low-level statistics - hmem: Support dev reg with FI_HMEM_CUDA - hmem: Support dev reg with FI_HMEM_ROCR - hmem: Support dev reg with FI_HMEM_SYSTEM - hmem: Define optimized HMEM memcpy APIs - Implement memhooks atfork child handler - hmem: Support ofi_hmem_get_base_addr with sys mem - hmem: Add length field to ofi_hmem_get_base_addr - mr_cache: Improve cache hit rate - mr_cache: Purge dead regions in find - mr_cache: Update find to remove invalid MR entries - mr_cache: Update find with MM valid check - Add direct support for dma-buf memory registration - man/fi_tagged: Remove the peek for data ability - indexer: Add byte idx abstraction - Add missing FI_REMOTE_CQ_DATA for fi_inject_writedata - Add configure flags for more sanitizers - Fix fi_peer man page inconsistency - include/fi_peer: Add cq_data to rx_entry, allow peer to modify on unexp - Add XPMEM support - EFA - General bug fix and code clean-up - Do not abort on all deprecated env vars - Onboard fi_mr_dmabuf API in mem reg ops. 
- Try registering cuda memory via dmabuf when checking p2p - Introduce HAVE_EFA_DMABUF_MR macro in configure - Add read nack protocol docs - Receiver send NACK if runt read fails with ENOMR - Sender switch to long CTS protocol if runt read fails with ENOMR - Receiver send NACK if long read fails with ENOMR - Update efa_rdm_rxe_map_remove to accept msg_id and addr - Sender switch to long CTS protocol if long read fails with ENOMR - Introduce new READ_NACK feature - Use SHM's full inject size - Add testing for small messages without inject - Enable inject rdma write - Use bounce buffer for 0 byte writes - Onboard ofi_hmem_dev_register API - Update cuda_gdrcopy_dev_register's signature - Allocate pke_vec, recv_wr_vec, sge_vec from heap - Close shm resource when it is disabled in ep - Disable RUNTING for Neuron - Move cuda-sync-memops from MR to EP - Do not insert shm av inside efa progress engine - Enable shm when FI_HMEM and FI_ATOMIC are requested - Adjust posted receive size to pkt_size - Do not create SHM peer when SHM is disabled - Use correct threading model for shm - Restrict RDMA read to compatible EFA devices - Add EFA device version to handshake - Add missing locks in efa_cntr_wait. - Add writedata RNR fabtest - Handle RNRs from RDMA writedata - Check opt_len in efa_rdm_ep_getopt - Use correct tx/rx op_flags for shm - Hooks - dmabuf: Initialize fd to supress compiler warning - trace: Add log on FI_VAR_UNEXP_MSG_CNT when enabled. - trace: Fixed trace log format on some attributes. 
- OPX - Fix compiler warnings - PSM3 - Fix compiler warnings - Update provider to sync with IEFS 11.5.1.1.1 - RXM - Remove unused function - Use gdrcopy in rma when emulating injection - Use gdrcopy in eager send/recv - Add hmem gdrcopy functions - Remove unused dynamic rbuf support - SHM - General bug fixes and cleanup - Add ofi_buf_alloc error handling - Only copy header + msg on unexpected path - Add FI_HMEM atomic support - Add memory barrier before updating resp for atomic - Add more error output - Reduce atomic locking with ofi_mr_map_verify - Only increment tx cntr when inject rma succeeded. - Use peer cntr inc ops in smr_progress_cmd - Allow for inject protocol to buffer more unexpected messages - Change pending fs to bufpool to allow it to grow - Add unexpected SAR buffering - Use generic acronym for shm cap - Move CMA to use the p2p infrastructure - Add p2p abstraction - Load DSA dependency dynamically - Replace tx_lock with ep_lock - Calculate comp vars when writing completion - Move progress_sar above progress_cmd - Rename SAR status enum to be more clear - Make SAR protocol handle 0 byte transfer. 
- Move selection logic to smr_select_proto() - Sockets - Fix compiler warnings - Fix provider name and api version in returned fi_info struct - TCP - Add profiling interface support - Pass through rdm_ep flags to msg eps - Derive cq flags from op and msg flags - Do not progress ep that is disconnected - Set FI_MULTI_RECV for last completed RX slice - Return an error if invalid sequence number received - xnet_progress_rx() must only be called when connected - Reset ep->rx_avail to 0 after RX queue is flushed - Disable the EP if an error is detected for zero-copy - Add debug tracking of transfer entries - Negotiate support for rendezvous - Add rendezvous protocol option - Generalize xnet_send_ack - Flatten protocol header definitions - Remove unused dynamic rbuf support - Define tcp specific protocol ops - Remove unneeded and incorrect rx_entry init code - UCX - Add FI_HMEM support - Initialize ep_flush to 1 - Util - General bug fixes - memhooks: Fix a bug when calculating mprotect region - Check the return value of ofi_genlock_init() - Update checks for FI_AV_AUTH_KEY - Define domain primary and secondary caps - Add profiling util functions - Update util_cq to support err_data - Update ofi_cq_readerr to use new memcpy - Update ofi_cq_err_memcpy to handle err_data - Zero util cancel err entry - Move FI_REMOTE/LOCAL_COMM to secondary caps - Alter domain max_ep_auth_key - Add domain checks for max_ep_auth_key - Revert util_cntr->ep_list_lock to ofi_mutex - Add NIC FID functions to ofi.h - Add EP and domain auth key checking - Add bounds checks to ibuf get - Define dlist_first_entry_or_null - Update util_getinfo to dup auth_key - Revert util_av, util_cq and util_cntr to mutex - Add missing calls to (de)initialize monitor's mutexes - Avoid attempting to cleanup an uninitialized MR cache - Rename ofi_mr_info fields - Add rv64g support to memory hooks - Verbs - Windows: Check error code from GetPrivateData - Add missing lock to protect SRX - Add synapseai dmabuf mr support 
- Bug fix for matching domain name with device name - Windows: Fetch rejected connection data - Add support for DMA-buf memory registration - Windows: Fix use-after-free in case of failure in fi_listen - Windows: Map ND request type to ibverbs opcode - Fix memory leak when creating EQ with unsupported wait object - Track ep state to prevent duplicate shutdown events - Fabtests - Update man page - pytests/efa: onboard dmabuf argument for test_mr - pytest: make do_dmabuf_reg_for_hmem a cmdline argument - Bump Libfabric API version. - mr_test: Add dmabuf support - Introduce ft_get_dmabuf_from_iov - unexpected_msg: Use ft_reg_mr to register memory - pytest: Allow registering mr with dmabuf - Add dmabuf support to ft_reg_mr - Add dmabuf ops for cuda. - Test max inject size - Add FI_HMEM support to fi_rdm_rma_event and fi_rdm tests - memcopy-xe: Fix data verification error for device buffer - dmabuf-rdma: Increase the number of NICs that can be tested - dmabuf-rdma: Remove redundant libze_ops definition - fi-mr-reg-xe: Skip native dmabuf reg test for system memory - Check if fi_info is returned correctly in case of FI_CONNREQ - cq_data: relax CQ data validation to cq_data_size - Add ZE host alloc function - Use common device host buffer for check_buf - hmem_ze: allocate one cq and cl on init - fi-mr-reg-xe: Add testing for dmabuf registration - scripts: use yaml safe_load - macos: Fix build error with clang - multinode: Use FI_DELIVERY_COMPLETE for 'barrier' - Handle partial read scenario for fi_xe_rdmabw test for cross node tests - pytest/efa: add cuda memory marker - pytest/efa: Skip some configuration for unexp msg test on neuron. - runfabtests.py: ignore error when no tests are collected. 
- pytest/efa: extend unexpected msg test range - pytest/shm: extend unexpected msg test range - pytest: Allow running shm fabtests in parallel - unexpected_msg.c: Allow running the test with FI_DELIVERY_COMPLETE - runfabtests.sh: run fi_unexpected_msg with data validation - pytest/shm: Extend test_unexpected_message - unexpected_msg: Make tx/rx_size large enough - pytest/shm: Extend shm's rma bw test - Update shm.exclude * Mon Sep 04 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.19.0 - Core - General code cleanup and restructuring - Add ofi_hmem_any_ipc_enabled() - ofi_consume_iov allows 0-byte consume - ofi_consume_iov consistency - ofi_indexer: return error code when iterating - getinfo: Add post filters for domain and fabric names - Filter loopback device if iface is specified - bsock: Fix error checking for -EAGAIN - windows/osd: Remove unneeded check to silence coverity - windows/osd: Move variable declaration to silence coverity - Introduce gdrcopy awareness to hmem copy - mr/cache: Fix fi_mr_info initialization - hmem_cuda: remove gdrcopy from cuda hmem copy path - iouring: Fix wrong indent in ofi_sockapi_accept_uring() - Implement ofi_sockctx_uring_poll_add() - hmem: introduce gdrcopy from/to cuda iov functions - hmem: Deprecate `FI_HMEM_CUDA_ENABLE_XFER` - hmem_cuda: Restrict CUDA IPC based on peer accessibility - hmem_cuda: Log number of CUDA devices detected - hmem_cuda: Refactor global variables - tostr: Remove the extra dir "shared/" from "include/" and "src/". - hmem_ze: fix ZE is valid check - hmem_rocr: fix offset calculation - hmem_rocr: use ofi spinlock functions - hmem_rocr: minor fixes - hmem_neuron: convert warn to info for nrt_get_dmabuf_fd not found - hmem_neuron: check existence of neuron devices during initialization - tostr: Moved Windows functions in shared/ofi_str.c to windows/osd.h - tostr: Add helper functions ofi_tostr_size() and ofi_tostr_count(). 
- EFA - Onboard Peer API, use shm provider as a peer provider - Uses util SRX framework in shared receive procedures. - Register shm MR with hmem_data, allow shm to use gdrcopy for cuda data movement - Finish the refactor for rxr squash. - Use rdma-core WR API for send requests - Check optlen in getopt call - Fix the rdma-read support check in RMA and MSG operations - Optimize ep lock usage - Use an internal fi_mr_attr for memory registration - Hooks - Init field in mr_attr to silence coverity - Add profiling hook provider - Rename cq hooking functions' names - Added trace for resource creation operations - OPX - Initialize ofi_mr_info - Fix dput credit check - Only allocate replay buffer if psn is valid - Support SHM Intra-node communication between single server HFI devices - Fix incorrect packet size in packet header when sending CTS packet - Added check to address Coverity scan defect - Add multi-entry caching to TID rendezvous - Fall back to default domain name for TID fabric - Properly handle multiple IOVs in fi_opx_tsendmsg - Fix OPX Rzv RTS receive operation SHM error (DAOS-related) - Fix non-tagged sends may incorrectly set FI_TAGGED in send completions - Add more info to reliability IOV buffer validation check - Move dput packet build functions to new inline include - Use fi_mr_attr in fi_opx_mr - Disable Pre-NAKing by default, throttle until all outstanding replays ACK'd - Fix reliability bug when NAKing the last PSN - Update HeaderQ Register more frequently - No rbuf_wrap needed for expected receive (TID) - Fixes for Coverity scan issues - Enhanced tag matching - Tune expected recv for unaligned buffers - Observability: Add finer logging granularity - Reduce RTS immediate data and fix packet estimate for odd TID lengths - Add additional sources for FI_OPX_UUID - Peer - Add cq_data to rx_entry, allow peer to modify on unexp - Introduce peer cntr API - Add foreach_unspec_addr API - Add size as an input of the get_tag - PSM3 - Sync with IEFS 11.5.0.0.172 - 
SHM - Only poll IPC list when ROCR IPC is enabled - Allow for SAR and inject protocol to buffer more unexpected messages - Remove unused sar fields - Make SAR protocol handle 0 byte transfer - Load DSA dependency dynamically - Change recv entry freestack into bufpool - Remove shm signal - Use util peer cntr implementation - Make SHM default to domain level threading level - Replace internal shared receive implementation with util_srx - Lock entire progress loop - Fix ROCR data coherency - Add FI_LOCAL_COMM to shm attrs - Handle empty freestack - Fix bug in configure.m4 in atomics_happy assignment happy - Add memory barrier before update resp->status for SAR - Do not use inline/inject for read op - Allow shm to use gdrcopy - Refactor protocol selection code - Init map fi addrs to FI_ADDR_NOTAVAIL - TCP - General code cleanups - Restrict which EPs can be opened per domain - Increase CM error debug output - Avoid calling close() on an invalid socket after accept error - Mark the EP as disconnected before flushing the queues - Add assertion failures for xnet_{monitor,halt}_sock - Disable ofi_dynpoll_wait() for non-blocking progress - Move PEP pollin operations to io_uring - Move EP poll operations to io_uring - Early exit if ofi_bsock_flush() has operation in progress - Implement pollin sockctx in bsock - Add missing call to xnet_submit_uring() - Add return error to xnet_update_pollflag() - Remove the cancel sockctx from the EP structure - Move io_uring cqe from the stack to progress struct - Reduce stack size for epoll event array - handle NULL av in xnet_freeall_conns() - UCX - Publish FI_LOCAL_COMM and FI_REMOTE_COMM capabilities - Fix configure error with newer MOFED - Fix segfault in unsignalled completions - Util - Add FI_PEER support to util counter - Refactor the usage of cntrs - Change util_ep to be a genlock - Add util shared receive implementation - Update log message for invalid AV type message - Fix fi_mr_info initialization - Add peer ID to MR cache - 
Store hmem_data in ofi_mr_map - Split the cq progress and reading entries in ofi_cq_readfrom - Verbs - Add event lock to EQ to serialize closing ep - Remove saved_wc_list and use CQ directly - Consolidate peer_mem and dmabuf support check - Fix vrb_add_credits signature - Introduce new progress engine structure - Simplify (and correct) locking around progress operations - General code restructuring - Fabtests - Fix reading addressing options - Allow to change only the OOB address - Allow to use FI_ADDR_STR with -F - Fix bw buffer utilization - Separate RX and RMA counters - Fix tx counter with RMA - Add FI_CONTEXT mode to rdm_cntr_pingpong - Add HMEM support to fi_unexpected_msg test - Fix array OOB during fabtest list parsing - Enable shm tagged_peek test - Fix windows build warnings - Make tx_buf and rx_buf aligned to 64 bytes by default - Fix windows build warnings for sscanf - Use dummy ft_pin_core on macOS - Fix some header includes - sock_test: Do not use epoll if not available - recv_cancel: initialize error entry - Fix wrong size used to allocate tx_msg_buf - unexpected: change defaults to support tcp - unexpected: add unknown unexpected peer test - Enable a list of arbitrary message sizes - Enabled data validation for rma read & write - bw_rma operates on distinct buffer offsets - ft_post_rma issues reads from remote's tx_buf - General code cleanup and restructuring - rdm_tagged_peek: fix race condition synchronization - Add FI_LOCAL_COMM/FI_REMOTE_COMM presence check to fi_getinfo_test - Correct ft_exchange_keys in prefix-mode - Make rdm_tagged_peek test more general - Add unit test for fi_setopt * Mon Aug 07 2023 Nicolas Morey <nicolas.morey@suse.com> - Drop support for obsolete TrueScale (bsc#1212146) * Mon Jul 03 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.18.1 - Core - Fix build warning for ofi_dynpoll_get_fd - EFA - Handle 0-byte writes - Apply byte_in_order_128_byte for all memory type - Increase default shm_av_size to 256 - Force 
handshake before selecting rtm for non-system ifaces. - Only select readbase_rtm when both sides support rdma-read - Bugfix for initializing SHM offload - Correct CPPFLAGS during configure - Make setopt support sendrecv aligned 128 bytes - Make data size to be 128 byte multiples for in-order aligned send/recv - prepare local read pkt entry for in-order aligned send/recv. - Disable gdrcopy and cudamemcpy for in-order aligned recv. - Increase the pad size in rxr_pkt_entry - Make readcopy pkt pool 128 byte aligned - Introduce alignment to support in order aligned ops - Fix a bug when calling ibv_query_qp_data_in_order - RMA operations will ensure FI_ATOMIC cap - RMA operations will ensure FI_RMA cap - Unittest atomics without FI_ATOMIC cap. - Unittest RMA without FI_RMA cap. - Refactor pkt_entry assignment in poll_ibv loop - Fixes for RDMA Write and Writedata - RXM - Revert rxm util peer CQ support - Fix credit size parameter for flow ctrl - SHM - Fix DSA enable - Assert read op and inject proto are mutually exclusive - Fix ROCR data coherency - Add FI_LOCAL_COMM to shm attrs - Signal peer when peer is out of resources - Handle empty freestack - Fix bug in configure.m4 in atomics_happy assignment happy - Add memory barrier before update resp->status for SAR - Fix resource leak reported by coverity - Switch cmd_ctx pool from freestack to bufpool - Add iface parameter to smr_select_proto - TCP - Fix spinning on fi_trywait() - Handle truncation of active message - Handle prefetched data after reporting ETRUNC error - Progress all ep's on unexp_msg_list when posting recv - Removed unused saved_msg::ep field to fix assert - Continue receiving after truncation error - Create function to allocate internal msg buffer - Add runtime setting for max saved message size - Increase default max_saved value - Dynamically allocate large saved Rx buffers - Separate the max inject and recv buf size - Remove 1-line xnet_cq_add_progress function - Changed default wait object to epoll - 
Handle case where epoll isn't natively supported - Hold domain lock while deregistering memory - Rename DL package from libnet to libtcp - UCX - Align the provider version with the libfabric version - Verbs - Delay device initialization to when fi_getinfo is called - Consolidate peer_mem and dmabuf support check - verbs_nd: Init len to 0 for WCSGetProviderPath call - verbs_nd: Verify CQs are valid in rdma_create_qp - verbs_nd: Initialize ibv_wc fields - verbs_nd: Release lock in network direct error paths - Fix vrb_add_credits signature - Fix credit size parameter for flow ctrl - Recover RXM connection from verbs QP in error state - Fabtests - Add ze-dlopen functions to component tests - Call cudaSetDevice() for selected device - pytest/efa: Adjust get_efa_devices() - pytest/common: Support parallel neuron test - pytest/common: Use different cuda device for parallel cuda set - efa: Test_flood_peer.py increase timeout - pytest/efa: Test to flood peer during startup - fi-rdmabw-xe: Add option to set maximum message size - fi-rdmabw-xe: Add option to set batch size * Thu May 04 2023 Frederic Crozat <fcrozat@suse.com> - Add _multibuild to define additional spec files as additional flavors. Eliminates the need for source package links in OBS. 
* Tue Apr 18 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.18.0 - Core - rocr: fix offset calculation - rocr: use ofi spinlock functions - rocr: minor fixes - neuron: convert warn to info for nrt_get_dmabuf_fd not found - neuron: check existence of neuron devices during initialization - neuron: Add support for neuron dma-buf - ze: update ZE to support new driver index specification - List variables read from config file - Add switch to prefer system-config over environment - Add basic system-config support for setting library variables - Move peer provider defines into new header - rocr: Support asynchronous memory copies - rocr: Add support for ROCR IPC - rocr: rename rocr data-structures - synapseai: return 0 for host_register and host_deregister - fabric: Improve log level of provider mismatch - cuda: Allow CUDA IPC when P2P disabled - ze: add ZE command list pool to reuse command lists - cuda: implement cuda_get_xfer_setting for non cuda build - cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior - cuda.c: Add const to param to remove warning - Add IFF_RUNNING check to indicate iface is up and running - io_uring support enhancements - EFA - Implement CUDA support on instance types that do not support GPUDirect RDMA - Implement fi_write using device's RDMA write capability - Enrich error messages with debug and connection info - Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt - Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt - Add support for neuron dma-buf - Use gdrcopy to improve the intra-node CUDA communication performance for small messages - Use shm provider's FI_AV_USER_ID support - Fix bugs in efa provider’s shm info initialization procedure - Hooks - dmabuf_peer_mem: Handle IPC handle caching in L0 - trace: Add trace log for CM operation APIs - trace: Change tag in trace log to hex format - trace: Enhance trace log for data transfer API calls - trace: Add trace log for API fi_cq_readerr() - trace: Add trace log for CQ 
operation APIs - Add tracing hook provider - Net - Net provider optimizations have been integrated into the tcp provider. - Net provider has been removed as a reported provider. - OPX - Fixes for Coverity scan issues - Enhanced tag matching - Tune expected recv for unaligned buffers - Add finer logging granularity - Reduce RTS immediate data and fix packet estimate for odd TID lengths - Add additional sources for FI_OPX_UUID - Exclude opx from build if missing needed defines - Move some logs to optimized builds - Fix build warnings for unused return code from posix_memalign - Add reliability sanity check to detect when send buffer is illegally altered - SDMA Completion workaround for driver cache invalidation race condition - Fix replay payload pointer increment - Handle completion counter across multiple writes in SDMA - Cleanup pointers after free() - Modify domain creation to handle soft cache errors - Two biband performance improvements - Fixes based on Coverity Scan related to auto progress patch - Changed poll many argument to rx_caps instead of caps - Resync with server configured for Multi-Engines (DAOS CART Self Tests) - Remove import_monitor as ENOSYS case - Address memory leaks reported on OFIWG issues page - General code cleanup - Add replays over SDMA - Implement basic TID Cache - Revert work_pending check change - Fix use_immediate_blocks - Restore state after replay packet is NULL - Fix memory leak from early arrival packets - Fix segfault in SHM operations from uninitialized value in atomic path - Prevent SDMA work entries from being reused with outstanding replays - Set runtime as default for OPX_AV - Fix RTS replay immediate data - Fix errors caught by the upstream libfabric Coverity Scan - fi_getInfo - Support multiple HFI devices - Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury - Add fi_opx_tid.h to Makefile.include - Fix progress checks and default domain - Revert is_intranode simplification. 
- Don't inline handle_ud_ping function - Allow atomic fetch ops to use SDMA for sufficiently large counts - Cleaned up FI_LOG_LEVEL=warn output - Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM - Reset default progress to FI_PROGRESS_MANUAL - Fixed GCC 10 build error with Auto Progress - Add support for FI_PROGRESS_AUTO - Use max allowed packet size in SDMA path when expected TID is off - Expected receive (TID) rendezvous - RMA Read/Write operations over SDMA - Remove origin_rs from cts and dput packet header - Fix for hang in DAOS CART tests - Use single IOV for bounce buffer in SDMA requests. - Check for FI_MULTI_RECV with bitwise OR instead of AND - Fix for intermittent intra-node deadlock hang (DAOS CART tests) - Fix to RPC transport error failure (DAOS CART tests) - Fix for context->buf set to NULL - Fix bad asserts - Ensure atomicity of atomic ops - fi_opx_cq_poll_inline count and head check fix - Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests) - PSM3 - Update provider to sync with IEFS 11.4.1.1.2 - Fix warnings from build - Add oneapi ZE support to OFI configure - RXD - Ignore error path in av_close return - RXM - Handle NULL av in rxm_freeall_conns() - Implement the FI_OPT_CUDA_API_PERMITTED option - Write "len" field for remote write - Ignore error path domain_close return - Free coll_pool on ep close - Update rxm to use util_cq FI_PEER support functions - Fix incorrect CQ completion field - Rename srx to msg_srx - Disable FI_SOURCE if not requested - Memory leaks removed - Set offload_coll_mask based on actual configuration - Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY - Fabric setups collective offload fabric - Create eq for collective offload provider - Close collective providers ep when rxm_ep is closed - Fix incorrect use of OFI_UNUSED() - Rework collective support to use collective provider(s) - SHM - Fix potential deadlock in smr_generic_rma() - smr_generic_rma() write error completion with 
positive errno - Update SHM to use ROCR - Fix incorrect discard call when cleaning up unexpected queues - Separate smr_generic_msg into msg and tagged recv - Fix start_msg call - Implement the FI_OPT_CUDA_API_PERMITTED option - Assert not valid atomic op - Fix a bug in smr_av_insert - Optimize locking on the SAR path - Remove unneeded sar_cnt - Optimize locking - Enable multiple GPU/interface support - Remove HMEM specific calls from atomic path - Use util_cq FI_PEER support - Import shm as device host memory - Add HMEM flag to smr region - Fix user_id support - Write tx err comp to correct cq - Fix index when setting FI_ADDR_USER_ID - TCP - Provider source has been replaced by net provider source - Removed incorrect reporting of support for FI_ATOMIC - Do not save unmatched messages until we have the peer's fi_addr - Use internal flag for FI_CLAIM messages, versus a reserved tag bit - Fix updating error counter when discarding saved messages - Allow saved messages to be received after the underlying ep has been closed - Enhanced debug logging in connection path - Force CM progress on unconnected ep's when posting data transfers - Support connect and accept calls with io_uring - Fix segfault accessing an invalid fi_addr - Add io_uring support for CM message exchange - Move CM progress from fabric to EQ to improve multi-threaded performance - Fix small memory leak destroying an EQ - Fix race where same rx entry could be freed twice - Handle NULL av in rdm ep cleanup - Reduce stack use for epoll event array - UCX - New provider targeting Nvidia fabrics that layers over libucp - Util - Fix the behavior of cq_read for FI_PEER - rocr: Fix compilation issue - cuda: Use correct debug string calls - Free cq->peer_cq on close - Remove extra new line from av insert log - Check for count = 0 in ofi_ip_av_insert - rocr: Add support for ROCR IPC - Add FI_PEER support to util_cq - Disable FI_SOURCE if not requested - Remove FID events from the EQ when closing endpoint - Rework 
collective support to be a peer collective provider(s) - Allow FI_PEER to pass CQ, EQ and AV attr checking - Remove annoying WARNING message for FI_AFFINITY - Add utility collective provider - Verbs - Implement the FI_OPT_CUDA_API_PERMITTED option - Add support for ROCR IPC - Fabtests - Add fi_setopt_test unit test - Update ze device registration calls - fi-rdmabw-xe: Always use host buffer for synchronization - Fix bug in posting RMA operation - fi_cq_data: Extend test to fi_writedata - fi_cq_data: Extend validation of completion data - Rename fi_msg_inject tests to fi_inject_test to reflect its use - fi_rdm_stress: Add count option to json key/pair options - Add and fix OOB option handling in several tests - fi_eq_test: Fix incorrect return value - fi_rdm_multi_client: Increase the size of ep name buffer - Add FI_MR_RAW to default mr_mode - Support larger control messages needed by newer providers - fi-rdmabw-xe: Update to work with the ucx provider - fi_ubertest: Cleanup allocations in failure cases - Change ft_reg_mr to not assume hmem iface & device - fi_multinode: Bugfix multinode test for ze + verbs - fi_multinode: Remove unused validation print - fi_multinode: Skip tests for unsupported collective operations - fi_ubertest: Fix data validation with device memory - fi_peek_tagged: Restructure and expand test * Mon Mar 20 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.17.1 - Core - hmem_cuda Add const to param to remove warning - Fix typos in fi_ext.h - ofi_epoll: Remove unused hot_index struct member - EFA - Print local/peer addresses for RX write errors - Unit test to verify no copy with shm for small host message - Avoid unnecessary copy when sending data from shm - Compare pci bus id in hints - Fix double free in rxr endpoint init - Hooks - dmabuf_peer_mem: Handle IPC handle caching in L0 - OPX - Exclude from build if missing needed defines - Move some logs to optimized builds - Fix build warnings for unused return code from posix_memalign - Add 
reliability sanity check to detect when send buffer is illegally altered - SDMA Completion workaround for driver cache invalidation race condition - Fix replay payload pointer increment - Handle completion counter across multiple writes in SDMA - Cleanup pointers after free() - Modify domain creation to handle soft cache errors - Two biband performance improvements - Fixes based on Coverity Scan related to auto progress patch - Changed poll many argument to rx_caps instead of caps - Resynch with server configured for Multi-Engines (DAOS CART Self Tests) - Remove import_monitor as ENOSYS case - Address memory leaks reported on OFIWG issues page - Remove unused fields - Fix unwanted print statement case - Add replays over SDMA - Implement basic TID Cache - Revert work_pending check change - Fix use_immediate_blocks - Restore state after replay packet is NULL - Fix memory leak from early arrival packets. - Fix segfault in SHM operations from uninitialized value in atomic path. - Prevent SDMA work entries from being reused with outstanding replays pointing to bounce buf. 
- Set runtime as default for OPX_AV - Fix RTS replay immediate data - Fix errors caught by the upstream libfabric Coverity Scan - Support multiple HFI devices - Support OFI_PORT and Contiguous endpoint addresses - Update man pages - Util - util_cq: Remove annoying WARNING message for FI_AFFINITY * Mon Dec 19 2022 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.17.0 - Core - Add IFF_RUNNING check to indicate iface is up and running - General code cleanups - Add abstraction for common io_uring operations - Support ROCR get_base_addr - Add a 'flags' parameter to fi_barrier() - Introduce new calls for opening domain and endpoint with flags - Add ability to re-sort the fi_info list - Allowing layering of rxm over net provider - General cleanup of provider filtering functions - Add io_uring operations to be used by sockapi - Modify internal handling of async socket operations - Sockets operations are moved to a common sockapi abstraction - Add support for Ze host register/unregister - Add new offload provider type - Rename fi_prov_context and simplify its use - Convert interface prefix string checks to exact checks - EFA - Code cleanups and various bug fixes - Improved debug logging and warnings and assertions - Do not ignore hints->domain_attr->name - Fix the calculation of REQ header size for a packet entry - Fix default value for host memory's max_medium_msg_size - Add tracepoints to send/recv/read ops - Simplified emulated read protocol - Set use_device_rdma according to efa device id - Fix shm initialization path on error - Fix Implementation of FI_EFA_INTER_MIN_READ_MESSAGE_SIZE - Do not enable rdma_read if rxr_env.use_device_rdma is false - Remove de-allocated CUDA memory region during registration - Fix the error handling path of efa_mr_reg_impl() - Fix rxr_ep unit tests involving ibv_cq_ex - Add check of rdma-read capability for synapseai - Report correct default for runt_size parameter - Toggle cuda sync memops via environment variable. 
- Net - Continued fork of tcp provider, will eventually merge changes back - Fix inject support - Fix memory leak in peek/claim path - General code cleanups and bug fixes from initial fork - Allow looking ahead in tcp stream to handle out-of-order messages - Add message tracing ability - Fetch correct ep when posting to a loopback connection - Release lock in case of error in rdm_close - Fix error path in xnet_enable_rdm - Add missing progress lock in srx cleanup - Code restructuring and enhancements with longer term goal of supporting io_uring - Disable the progress thread in most situations - Rename DL from libxnet-fi to libnet-fi - Add missing initialization calls for DL provider - Add support for FI_PEEK, FI_CLAIM, and FI_DISCARD - Include source address with CQ entry - Fix support for FI_MULTI_RECV - OPX - Bug fixes and general code cleanup - Fix progress checks and default domain - Allow atomic fetch ops to use SDMA for sufficiently large counts - Cleaned up FI_LOG_LEVEL=warn output - Reset default progress to FI_PROGRESS_MANUAL - Fixed GCC 10 build error with Auto Progress - Add support for FI_PROGRESS_AUTO - Use max allowed packet size in SDMA path when expected TID is turned off - Expected receive (TID) rendezvous - RMA Read/Write operations over SDMA - Remove origin_rs from cts and dput packet header. - Fix for hang - unable to match inbound packets with receive context->src_addr (DAOS CART tests) - Use single IOV for bounce buffer in SDMA requests. 
- Check for FI_MULTI_RECV with bitwise OR instead of AND - Fix for intermittent intra-node deadlock hang (DAOS CART tests) - Fix to RPC transport error failure (DAOS CART tests) - Fix for context->buf set to NULL - Fix bad asserts - Ensure atomicity of atomic ops - fi_opx_cq_poll_inline count and head check fix - Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests) - Temporarily reduce SDMA queue ring size for possible driver bug workaround - Fix alignment issue and asserts - Enable more parallel SDMA operations - PSM3 - Synced to IEFS 11.4.0.0.198 - Tech Preview Ubuntu 22.04 Support - Tech Preview Intel DSA Support - Improved Intel GPU Support - Various performance improvements - Various bug fixes - RxM - Always use rendezvous protocol for ZE device memory send - Code cleanup - Add option to free resources on AV removal - SHM - Fix user_id support - Write tx err comp to correct cq - Fix index when setting FI_ADDR_USER_ID - Remove extraneous ofi_cirque_next() call - Add support for FI_AV_USER_ID - Fix multi_recv messaging - General code restructuring for maintainability - Implement shared completion queues - Decouple error processing from cq completion path to avoid switch - Fix incorrect op passed into recv cancel operation - Enhanced SHM implementation with DSA offload - Use multiple SAR buffers per copy operation - Fix ZE IPC race condition on startup - TCP - Minor updates in preparation for io_uring support (via net provider) - Util - Add option to free resources on AV removal - Add 'flags' parameter to new fi_barrier2() call - Add debugging in ofi_mr_map_verify - Rename internal bitmask struct to include ofi prefix - Verbs - Add option to disable dmabuf support - FI_SOCKADDR includes support of FI_SOCKADDR_IB - Fabtests - shared: Expand hmem support - fi_loopback: Add support for tagged messages - fi_mr_test: add support of hmem - fi_rdm_atomic: Fix hmem support - fi_rdm_tagged_peek: Read messages in order, code cleanup and fixes - 
fi_multinode: Add performance and runtime control options, cleanups - benchmarks: Add data verification to some bw tests - fi_multi_recv: Fix possible crash in cleanup - Drop prov-net-fix-error-path-in-xnet_enable_rdm.patch which was merged upstream. * Tue Nov 08 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Add prov-net-fix-error-path-in-xnet_enable_rdm.patch to fix a deadlock when no network interfaces are available (bsc#1205139) * Mon Oct 10 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.16.1 - Core - Fix windows implementation to remove fd from poll set - PSM3 - Add missing files to release tarball - Util - Handle NULL address insertion to fi_av_insert - Drop prov-rxm-Disable-128-bit-atomics.patch which was merged upstream * Thu Oct 06 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Add prov-rxm-Disable-128-bit-atomics.patch to fix a potential segfault on misaligned buffers. * Fri Sep 30 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.16.0 (jsc#PED-351, jsc#PED-190) - Core - Added HMEM IPC cache - Use exact string comparison checks for network interfaces - Restructuring of poll/epoll abstraction - Add ability to disable locks completely in debug builds - Serialize access to modifying the logging calls - Minor fixes to fi_tostr text formatting - Add hmem interface checks to memory registration - EFA - Added support of Synapse AI memory. 
- Improved error message - Net - Temporarily forked, optimized version of tcp provider - Focused on improved performance and scalability over tcp sockets - Fork ensures tcp provider stability while net provider is developed - Shares the tcp provider protocol and base implementation for msg endpoints - Integrates direct support for rdm endpoints, using a derivative from rxm - Implements own protocol for rdm endpoints, separate from rxm;tcp - OPX - Added initial support for SDMA - General performance enhancements - Performance improvements to reliability protocol - Improved deferred work pending complete - Added support for OPX_AV=runtime - Support iov memory registration ops - Added DAOS RPC support - Atomic ops enhancements - Improved documentation - Debug build enhancements - Fixed compiler warnings - Reduced time to compile prov/opx code - General bug fixes - Fixed PSN wrapping scaling - Added intranode fence - Addressed bugs discovered by coverity scan - PSM2 - Fix sending CQ data in some instances of fi_tsendmsg - PSM3 - Updated to match Intel Ethernet Fabric Suite (IEFS) 11.3 release - RxM - Update to read multiple completions at once from msg provider - Move RxM AV implementation to util code to share with net provider - Minor code cleanups - SHM - Implement and use ipc_cache - Add log messages for debugging and error tracking - Fix check for FI_MR_HMEM mr_mode - Move shm signal handlers initialization to EP - Added log messages for errors detected - TCP - Fix incorrect signaling of the CQ - Increase max number of poll events to retrieve - Acquire ep lock prior to flushing socket in shutdown - Verify ep state prior to progressing socket data - Read cm error data when receiving connreq response - Log error on connect failure - Fix assertion failure in CQ progress function - Util - Fix text in log of UFFD ioctl failure - Introduce cuda ipc monitor - Fix CQ memory leak handling overflow - Fix MR mode bit check for ver 1.5 and greater - Add max_array_size to 
track/check array overflow - Always progress transfers when reading from a CQ - Handle NULL address insertion - Try IPv4 before IPv6 addresses when starting name server - Fix IP util av default address length - Fix util IP getinfo path to read hints->addr_format - Fix debug print mismatch - Fix return code when memory allocation fails. - Fix build sign warning in ofi_bufpool_region_alloc - Minor code cleanups - Print warning if an addr is inserted into an AV again - Verbs - Fix support of FI_SOCKADDR_IB when requested by the application - Ensure all posted receives are flushed to the application - Update ofi_mr_cache_search API for hmem IPC support - Reduce logging verbosity for "no active ports" - Fix incorrect length used in memory registration - Various minor bug fixes for test failures - Fix a memory leak getting IB address - Implement verbs provider on Windows over NetworkDirect API - Set and check address format correctly - Only close qp if it was initialized - Portable detection of loopback device - Fabtests - multi_ep: Separate EP resources and fix MR registration - multi_recv: Fix possible crash and check for valid buffer - unexpected_msg: Fix printf compiler warning - dgram_pingpong.c: Use out-of-band sync - multinode: Make multinode tests platform agnostic, fix formatting - ubertest: Fix string comparison to include length, fix writedata completion check - av_test: add support for -e <ep_type> - New tests: - dmabuf-rdma: Component level test for dma-buf RDMA - sock_test: Component level performance test of poll, epoll, and select - rdm_stress: Multi-threaded, multi-process stress test for RDM endpoints - sighandler_test: Regression test for signal handler restoration - Drop patches fixed upstream: - prov-opx-Correctly-disable-OPX-if-unsupported.patch - disable-flatten-attr.patch * Mon Aug 01 2022 Martin Liška <mliska@suse.cz> - Add disable-flatten-attr.patch that drops flatten attribute. 
Note that the flatten attribute results in a huge compile-time hog in the inliner (and the binary size would be huge as well). - Use %make_build and enable LTO (boo#1133235). - Synchronize used Patches. * Thu Jun 23 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.15.1 - Core - Fix fi_info indentation error in fi_tostr - hmem_ze: Add runtime option to choose specific copy engine - Cleanup of configure HMEM checks - Fixed stringop-truncation in ofi_ifaddr_get_speed - Add utility provider log suffix to make logs easier to read - Fix truncation of ipv6 addressing - hmem: add support for AWS Trainium devices - Fix potential sscanf overflows - hmem: pass through device and flags when querying memory interface - Rework locking in several areas to convert spinlocks to mutexes - Add new locking abstractions to select lock types at runtime - Add new FI_PROTO_RXM_TCP for optimized rxm over tcp path - Fix windows implementation to remove fd from poll set - EFA - Added windows support through efawin (https://github.com/aws/efawin) - Added support of AWS neuron. - Added support of using gdrcopy to copy data from host to device. - Fixed a bug that caused 0 byte read to fail. - Fixed a memory corruption issue that can cause a forked process to crash. - Extended testing coverage through new pytest based testing framework. 
- HOOKS - Add new hooking provider dmabuf_peer_mem - Enable DL build of hooking providers - Add HMEM memory registration hook - OPX - New provider supporting Cornelis Networks Omni-path hardware - PSM3 - Updated psm3 to match IEFS 11.2.0.0 release - Added support for sockets (TCP/UDP) via a runtime selectable Hardware Abstraction Layer (HAL) - Added support for IPv6 addressing in RoCE and sockets - Added various NIC selection filtering options (wildcarded NIC name, address format, wildcarded IP subnet, link speed) - Performance tuning in conjunction with OneAPI and OneCCL - Improved PSM3_IDENTIFY output - Rename most internal symbols to psm3_ - Corrected vulnerabilities found during Coverity scans - configure options refined and help text improved - PSM3_MULTI_EP has been deprecated (recommend always enabled, default is enabled [same default as previous releases]) - Various bug fixes - RxM - Add check that atomic size is valid - Add support to passthru calls to tcp provider in specific - TCP - Add assert to verify RMA source/target msg sizes match - Wake-up threads blocked on CQ to update their poll events - Fix use of incorrect events in progress handler - Fixes for various compile warnings, mostly on Windows - Add support for FI_RMA_EVENT capability - Add support for completion counters - Fix check for CQ data in tagged messages - Add cancel support to shared rx context - Add src_addr receive buffer matching - Add provider control to assign a src_addr with an ep - Handle trecv with FI_PEEK flag - Allow binding a CQ with an SRX - Restructuring of code in source files - Handle EWOULDBLOCK returned by send call - Add hot (active) pollfd - SHM - Properly chain the original signal handlers - Avoid uninitialized variable with invalid atomic parameters - Fix 0 byte SAR read - Initialize len parameter to accept - Refactor and simplify protocol code - Remove broken support for 128-bit atomics - Fix FI_INJECT flag support - Add assert to verify RMA source/target msg sizes 
match - Set domain threading to thread safe - Fix possible use of uninitialized variable in av_insert - Util - Fix sign warning in ofi_bufpool_region_alloc - Remove unused variable from ofi_bufpool_destroy - Fix check for valid datatype in ofi_atomic_valid - Return with error if util_coll_sched_copy fails - Fix use of uninitialized variable in ofi_ep_allreduce - Fix memory access in ip_av_insertsym - Track ep per collective operation not with multicast - Restructure collective av set creation/destruction - Change most locks from spin locks to mutexes - Allow selection of spinlocks for CQ and domain objects - Fix AV default addrlen - Update fi_getinfo checks to include hints->addr_ - Handle NULL address insertion to fi_av_insert - Verbs - Initial changes for compiling on Windows (via NetworkDirect) - Add a failover path to dma-buf based memory registration - Replace use of spin locks with mutexes - Check for valid qp prior to cleanup - Set and check for address format correctly in fi_getinfo - Fabtests - hmem_cuda: use device allocated host buffer to fill device buffer - Add python scripts to control test execution - test_configs: include util provider in core config file - Add option "--pin-core" - Only call nrt_init once - Fix a bug in ft_neuron_cleanup - Correct help for unit test programs - Remove duplicate help prints from fi_mcast - configure.ac: fix --enable-debug=no not properly detected - msg_inject: handle the case where ft_tsendmsg returns -FI_EAGAIN - Add AWS Trainium device support - fi_inj_complete: Add FI_INJECT to fabtests - inj_complete.c: Make arguments align with the other tests - dgram_pingpong: handle the error return of fi_recv - recv_cancel: Remove requirement for unexpected msg handling - poll: Fix crash if unable to allocate pollset - ubertest: Add GPU testing and validation support - Add HMEM options parsing support - Update and re-enable fi_multi_ep test - Add prov-opx-Correctly-disable-OPX-if-unsupported.patch to disable OPX compilation on non-x86_64 systems 
* Tue Apr 19 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.14.1 - Core - Use non-shared memory allocations to use MADV_DONTFORK safely - Fix incorrect use of gdr_copy_from_mapping - Ensure proper timeout time for pollfds to avoid early exit - EFA - Handle read completion properly for multi_recv - Use shm's inject write when possible - Support 0 byte read - RxM - Ensure signaling the CQ fd after writing completion - Fix inject path for sending tagged messages with cq data - Negotiate credit based flow control support over CM - Add PID to CM messages to detect stale vs duplicate connections - Fix race handling unexpected messages from unknown peers - Fix possible leak of stack data in cm_accept - Restrict reported caps based on core provider - Delay starting listen until endpoint fully initialized - Verify valid atomic size - Sockets - Fix coverity reports on uninitialized data - Check for NULL pointers passed to memcpy - Add missing error return code from sock_ep_enable - TCP - Fix performance regression resulting from sparse pollfd sets - Fix assertion failure in CQ progress function - Do not generate error completions for inject msgs - Fix use of incorrect event names in progress handler - Fix check for CQ data in tagged messages - Make start_op array a static to reduce memory - Wake-up threads blocked on CQ to update their poll events - Verbs - Generate error completions for all failed transmits - Set all fields in the fi_fabric_attr for FI_CONNREQ events - Set proper completion flags for all failed transfers - Ensure that all attributes are provided when opening an endpoint - Fix error handling in vrb_eq_read - Fix memory leak in error case in vrb_get_sib - Work around bug in verbs HW not reporting correct send opcodes - Only call ibv_reg_dmabuf_mr when kernel support exists - Add a failover path to dma-buf based memory registration - Negotiate credit based flow control support over CM * Mon Nov 22 2021 Nicolas Morey-Chaisemartin 
<nmoreychaisemartin@suse.com> - Update to 1.14.0 - Add time stamps to log messages - Fix gdrcopy calculation of memory region size when aligned - Allow user to disable use of p2p transfers - Update fi_tostr to print FI_SHARED_CONTEXT text instead of value - Update fi_tostr to output field names matching header file names - Fix narrow race condition in ofi_init - Add new fi_log_sparse API to rate limit repeated log output - Define memory registration for buffers used for collective operations - EFA, SHM, TCP, RXM, and verbs fixes * Wed Nov 03 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Enable PSM3 provider (jsc#SLE-18754) * Fri Oct 29 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.13.2 - Sort DL providers to ensure consistent load ordering - Update hooking providers to handle fi_open_ops calls to avoid crashes - Replace cassert with assert.h to avoid C++ headers in C code - Enhance serialization for memory monitors to handle external monitors - EFA, SHM, TCP, RxM and verbs fixes * Wed Aug 25 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.13.1 - Enable loading ZE library with dlopen() - Add IPv6 support to fi_pingpong - EFA, PSM3 and SHM fixes * Wed Jul 07 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.13.0 - Fix behavior of fi_param_get parsing an invalid boolean value - Add new APIs to open, export, and import specialized fid's - Define ability to import a monitor into the registration cache - Add API support for INT128/UINT128 atomics - Fix incorrect check for provider name in getinfo filtering path - Allow core providers to return default attributes which are lower than the maximum supported attributes in getinfo call - Add option to prefer external providers (in order discovered) over internal providers, regardless of provider version - Separate Ze (level-0) and DRM dependencies - Always maintain a list of all discovered providers - Fix incorrect CUDA warnings - Fix 
bug in cuda init/cleanup checking for gdrcopy support - Shift order providers are called from in fi_getinfo, move psm2 ahead of psm3 and efa ahead of psmX - See NEWS.md for changelog * Fri Apr 02 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.12.1 - Fix initialization checks for CUDA HMEM support - Fail if a memory monitor is requested but not available - Adjust priority of psm3 provider to prefer HW specific providers, such as efa and psm2 - EFA and PSM3 fixes - See NEWS.md for changelog * Tue Mar 09 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.12.0 - See NEWS.md for changelog
/usr/lib/libfabric.so.1
/usr/lib/libfabric.so.1.26.0
/usr/share/doc/packages/libfabric1
/usr/share/doc/packages/libfabric1/AUTHORS
/usr/share/doc/packages/libfabric1/README
/usr/share/licenses/libfabric1
/usr/share/licenses/libfabric1/COPYING
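The file list shows that the runtime library is installed under the soname libfabric.so.1. As a rough sketch, and not part of the package, the following Python snippet checks which libfabric API version the installed runtime reports. It assumes the library exports `fi_version()` returning a packed 32-bit value with the major number in the upper 16 bits and the minor number in the lower 16 bits, as the `FI_MAJOR`/`FI_MINOR` macros in `<rdma/fabric.h>` do. The helper names `decode_fi_version` and `loaded_libfabric_version` are hypothetical and exist only for this illustration.

```python
import ctypes
from typing import Optional, Tuple


def decode_fi_version(version: int) -> Tuple[int, int]:
    """Split a packed libfabric version word into (major, minor).

    Assumption: same layout as the FI_MAJOR / FI_MINOR macros,
    i.e. major in the upper 16 bits and minor in the lower 16 bits.
    """
    return (version >> 16) & 0xFFFF, version & 0xFFFF


def loaded_libfabric_version(soname: str = "libfabric.so.1") -> Optional[Tuple[int, int]]:
    """Return (major, minor) reported by the runtime library.

    Returns None when the shared object cannot be loaded, for example
    when the libfabric1 package is not installed on this machine.
    """
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        return None
    # fi_version() takes no arguments and returns a uint32_t.
    lib.fi_version.restype = ctypes.c_uint32
    lib.fi_version.argtypes = []
    return decode_fi_version(lib.fi_version())


if __name__ == "__main__":
    print(loaded_libfabric_version())
```

Note that the value reported this way is the API version (major.minor) of the library, which is not the same thing as the RPM release number or the numeric suffix of the shared object file name.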
Generated by rpm2html 1.8.1
Fabrice Bellet, Sun Jan 12 02:11:34 2025