SHA256
1
0
forked from pool/mvapich2
mvapich2/mvapich2.changes

484 lines
20 KiB
Plaintext

-------------------------------------------------------------------
Thu Jun 8 13:55:32 UTC 2017 - nmoreychaisemartin@suse.com
- Reenable arm compilation
- Rename and cleanup mvapich-s390_get_cycles.patch to
mvapich2-s390_get_cycles.patch for coherency
- Cleanup mvapich2-pthread_yield.patch
- Add mvapich2-arm-support.patch to provide missing functions for
armv7hl and aarch64
-------------------------------------------------------------------
Thu Jun 8 11:38:36 UTC 2017 - nmoreychaisemartin@suse.com
- Remove version dependencies to libibumad, libibverbs and librdmacm
-------------------------------------------------------------------
Tue May 16 16:29:41 UTC 2017 - nmoreychaisemartin@suse.com
- Fix mvapich2-testsuite packaging
- Disable build on armv7
-------------------------------------------------------------------
Wed Mar 29 08:06:23 CEST 2017 - pth@suse.de
- Make dependencies on libs now coming from rdma-core versioned.
-------------------------------------------------------------------
Tue Nov 29 13:08:18 CET 2016 - pth@suse.de
- Create environment module (bsc#1004628).
-------------------------------------------------------------------
Wed Nov 23 11:00:43 CET 2016 - pth@suse.de
- Fix URL.
- Update to mvapich 2.2 GA. Changes since rc1:
MVAPICH2 2.2 (09/07/2016)
* Features and Enhancements (since 2.2rc2):
- Single node collective tuning for Bridges@PSC, Stampede@TACC and other
architectures
- Enable PSM builds when both PSM and PSM2 libraries are present
- Add support for HCAs that return result of atomics in big endian notation
- Establish loopback connections by default if HCA supports atomics
* Bug Fixes (since 2.2rc2):
- Fix minor error in use of communicator object in collectives
- Fix missing u_int64_t declaration with PGI compilers
- Fix memory leak in RMA rendezvous code path
MVAPICH2 2.2rc2 (08/08/2016)
* Features and Enhancements (since 2.2rc1):
- Enhanced performance for MPI_Comm_split through new bitonic algorithm
- Enable graceful fallback to Shared Memory if LiMIC2 or CMA transfer fails
- Enable support for multiple MPI initializations
- Unify process affinity support in Gen2, PSM and PSM2 channels
- Remove verbs dependency when building the PSM and PSM2 channels
- Allow processes to request MPI_THREAD_MULTIPLE when socket or NUMA node
level affinity is specified
- Point-to-point and collective performance optimization for Intel Knights
Landing
- Automatic detection and tuning for InfiniBand EDR HCAs
- Warn user to reconfigure library if rank type is not large enough to
represent all ranks in job
- Collective tuning for Opal@LLNL, Bridges@PSC, and Stampede-1.5@TACC
- Tuning and architecture detection for Intel Broadwell processors
- Add ability to avoid using --enable-new-dtags with ld
- Add LIBTVMPICH specific CFLAGS and LDFLAGS
* Bug Fixes (since 2.2rc1):
- Disable optimization that removes use of calloc in ptmalloc hook
detection code
- Fix weak alias typos (allows successful compilation with CLANG compiler)
- Fix issues in PSM large message gather operations
- Enhance error checking in collective tuning code
- Fix issues with UD based communication in RoCE mode
- Fix issues with PMI2 support in singleton mode
- Fix default binding bug in hydra launcher
- Fix issues with Checkpoint Restart when launched with mpirun_rsh
- Fix fortran binding issues with Intel 2016 compilers
- Fix issues with socket/NUMA node level binding
- Disable atomics when using Connect-IB with RDMA_CM
- Fix hang in MPI_Finalize when using hybrid channel
- Fix memory leaks
-------------------------------------------------------------------
Tue Nov 15 14:04:50 CET 2016 - pth@suse.de
- Update to version 2.2rc1. Changes since 2.1:
MVAPICH2 2.2rc1 (03/29/2016)
* Features and Enhancements (since 2.2b):
- Support for OpenPower architecture
- Optimized inter-node and intra-node communication
- Support for Intel Omni-Path architecture
- Thanks to Intel for contributing the patch
- Introduction of a new PSM2 channel for Omni-Path
- Support for RoCEv2
- Architecture detection for PSC Bridges system with Omni-Path
- Enhanced startup performance and reduced memory footprint for storing
InfiniBand end-point information with SLURM
- Support for shared memory based PMI operations
- Availability of an updated patch from the MVAPICH project website
with this support for SLURM installations
- Optimized pt-to-pt and collective tuning for Chameleon InfiniBand
systems at TACC/UoC
- Enable affinity by default for TrueScale(PSM) and Omni-Path(PSM2)
channels
- Enhanced tuning for shared-memory based MPI_Bcast
- Enhanced debugging support and error messages
- Update to hwloc version 1.11.2
* Bug Fixes (since 2.2b):
- Fix issue in some of the internal algorithms used for MPI_Bcast,
MPI_Alltoall and MPI_Reduce
- Fix hang in one of the internal algorithms used for MPI_Scatter
- Thanks to Ivan Raikov@Stanford for reporting this issue
- Fix issue with rdma_connect operation
- Fix issue with Dynamic Process Management feature
- Fix issue with de-allocating InfiniBand resources in blocking mode
- Fix build errors caused due to improper compile time guards
- Thanks to Adam Moody@LLNL for the report
- Fix finalize hang when running in hybrid or UD-only mode
- Thanks to Jerome Vienne@TACC for reporting this issue
- Fix issue in MPI_Win_flush operation
- Thanks to Nenad Vukicevic for reporting this issue
- Fix out of memory issues with non-blocking collectives code
- Thanks to Phanisri Pradeep Pratapa and Fang Liu@GaTech for
reporting this issue
- Fix fall-through bug in external32 pack
- Thanks to Adam Moody@LLNL for the report and patch
- Fix issue with on-demand connection establishment and blocking mode
- Thanks to Maksym Planeta@TU Dresden for the report
- Fix memory leaks in hardware multicast based broadcast code
- Fix memory leaks in TrueScale(PSM) channel
- Fix compilation warnings
MVAPICH2 2.2b (11/12/2015)
* Features and Enhancements (since 2.2a):
- Enhanced performance for small messages
- Enhanced startup performance with SLURM
- Support for PMIX_Iallgather and PMIX_Ifence
- Support to enable affinity with asynchronous progress thread
- Enhanced support for MPIT based performance variables
- Tuned VBUF size for performance
- Improved startup performance for QLogic PSM-CH3 channel
- Thanks to Maksym Planeta@TU Dresden for the patch
* Bug Fixes (since 2.2a):
- Fix issue with MPI_Get_count in QLogic PSM-CH3 channel with very large
messages (>2GB)
- Fix issues with shared memory collectives and checkpoint-restart
- Fix hang with checkpoint-restart
- Fix issue with unlinking shared memory files
- Fix memory leak with MPIT
- Fix minor typos and usage of inline and static keywords
- Thanks to Maksym Planeta@TU Dresden for the patch and suggestions
- Fix missing MPIDI_FUNC_EXIT
- Thanks to Maksym Planeta@TU Dresden for the patch
- Remove unused code
- Thanks to Maksym Planeta@TU Dresden for the patch
- Continue with warning if user asks to enable XRC when the system does not
support XRC
MVAPICH2 2.2a (08/17/2015)
* Features and Enhancements (since 2.1 GA):
- Based on MPICH 3.1.4
- Support for backing on-demand UD CM information with shared memory
for minimizing memory footprint
- Reorganized HCA-aware process mapping
- Dynamic identification of maximum read/atomic operations supported by HCA
- Enabling support for intra-node communications in RoCE mode without
shared memory
- Updated to hwloc 1.11.0
- Updated to sm_20 kernel optimizations for MPI Datatypes
- Automatic detection and tuning for 24-core Haswell architecture
* Bug Fixes (since 2.1 GA):
- Fix for error with multi-vbuf design for GPU based communication
- Fix bugs with hybrid UD/RC/XRC communications
- Fix for MPICH putfence/getfence for large messages
- Fix for error in collective tuning framework
- Fix validation failure with Alltoall with IN_PLACE option
- Thanks for Mahidhar Tatineni @SDSC for the report
- Fix bug with MPI_Reduce with IN_PLACE option
- Thanks to Markus Geimer for the report
- Fix for compilation failures with multicast disabled
- Thanks to Devesh Sharma @Emulex for the report
- Fix bug with MPI_Bcast
- Fix IPC selection for shared GPU mode systems
- Fix for build time warnings and memory leaks
- Fix issues with Dynamic Process Management
- Thanks to Neil Spruit for the report
- Fix bug in architecture detection code
- Thanks to Adam Moody @LLNL for the report
-------------------------------------------------------------------
Fri Oct 14 11:28:41 CEST 2016 - pth@suse.de
- Create and include modules file for Mvapich2 (bsc#1004628).
- Remove mvapich2-fix-implicit-decl.patch as the fix is upstream.
- Adapt spec file to the changed micro benchmark install directory.
-------------------------------------------------------------------
Sun Jul 24 14:24:59 UTC 2016 - p.drouand@gmail.com
- Update to version 2.1
* Features and Enhancements (since 2.1rc2):
- Tuning for EDR adapters
- Optimization of collectives for SDSC Comet system
- Based on MPICH-3.1.4
- Enhanced startup performance with mpirun_rsh
- Checkpoint-Restart Support with DMTCP (Distributed MultiThreaded
CheckPointing)
- Thanks to the DMTCP project team (http://dmtcp.sourceforge.net/)
- Support for handling very large messages in RMA
- Optimize size of buffer requested for control messages in large message
transfer
- Enhanced automatic detection of atomic support
- Optimized collectives (bcast, reduce, and allreduce) for 4K processes
- Introduce support to sleep for user specified period before aborting
- Disable PSM from setting CPU affinity
- Install PSM error handler to print more verbose error messages
- Introduce retry mechanism to perform psm_ep_open in PSM channel
* Bug-Fixes (since 2.1rc2):
- Relocate reading environment variables in PSM
- Fix issue with automatic process mapping
- Fix issue with checkpoint restart when full path is not given
- Fix issue with Dynamic Process Management
- Fix issue in CUDA IPC code path
- Fix corner case in CMA runtime detection
* Features and Enhancements (since 2.1rc1):
- Based on MPICH-3.1.4
- Enhanced startup performance with mpirun_rsh
- Checkpoint-Restart Support with DMTCP (Distributed MultiThreaded
CheckPointing)
- Support for handling very large messages in RMA
- Optimize size of buffer requested for control messages in large message
transfer
- Enhanced automatic detection of atomic support
- Optimized collectives (bcast, reduce, and allreduce) for 4K processes
- Introduce support to sleep for user specified period before aborting
- Disable PSM from setting CPU affinity
- Install PSM error handler to print more verbose error messages
- Introduce retry mechanism to perform psm_ep_open in PSM channel
* Bug-Fixes (since 2.1rc1):
- Fix failures with shared memory collectives with checkpoint-restart
- Fix failures with checkpoint-restart when using internal communication
buffers of different size
- Fix undeclared variable error when --disable-cxx is specified with
configure
- Fix segfault seen during connect/accept with dynamic processes
- Fix errors with large messages pack/unpack operations in PSM channel
- Fix for bcast collective tuning
- Fix assertion errors in one-sided put operations in PSM channel
- Fix issue with code getting stuck in infinite loop inside ptmalloc
- Fix assertion error in shared memory large message transfers
- Fix compilation warnings
* Features and Enhancements (since 2.1a):
- Based on MPICH-3.1.3
- Flexibility to use internal communication buffers of different size for
improved performance and memory footprint
- Improve communication performance by removing locks from critical path
- Enhanced communication performance for small/medium message sizes
- Support for linking Intel Trace Analyzer and Collector
- Increase the number of connect retry attempts with RDMA_CM
- Automatic detection and tuning for Haswell architecture
* Bug-Fixes (since 2.1a):
- Fix automatic detection of support for atomics
- Fix issue with void pointer arithmetic with PGI
- Fix deadlock in ctxidup MPICH test in PSM channel
- Fix compile warnings
* Features and Enhancements (since 2.0):
- Based on MPICH-3.1.2
- Support for PMI-2 based startup with SLURM
- Enhanced startup performance for Gen2/UD-Hybrid channel
- GPU support for MPI_Scan and MPI_Exscan collective operations
- Optimize creation of 2-level communicator
- Collective optimization for PSM-CH3 channel
- Tuning for IvyBridge architecture
- Add -export-all option to mpirun_rsh
- Support for additional MPI-T performance variables (PVARs)
in the CH3 channel
- Link with libstdc++ when building with GPU support
(required by CUDA 6.5)
* Bug-Fixes (since 2.0):
- Fix error in large message (>2GB) transfers in CMA code path
- Fix memory leaks in OFA-IB-CH3 and OFA-IB-Nemesis channels
- Fix issues with optimizations for broadcast and reduce collectives
- Fix hang at finalize with Gen2-Hybrid/UD channel
- Fix issues for collectives with non power-of-two process counts
- Make ring startup use HCA selected by user
- Increase counter length for shared-memory collectives
- Use download Url as source
- Some other minor improvements
- Add mvapich2-fix-implicit-decl.patch
-------------------------------------------------------------------
Thu Oct 9 13:32:28 CEST 2014 - pth@suse.de
- Don't provide the full source uri as build servis can't handle it.
-------------------------------------------------------------------
Wed Oct 8 17:12:27 CEST 2014 - pth@suse.de
- Only run autogen.sh if the distribution has a new enough automake.
-------------------------------------------------------------------
Wed Sep 24 17:06:22 CEST 2014 - pth@suse.de
- Update to mvapich2 2.0 GMC:
* Features and Enhancements (since 2.0rc2):
- Consider CMA in collective tuning framework
* Bug-Fixes (since 2.0rc2):
- Fix bug when disabling registration cache
- Fix shared memory window bug when shared memory collectives
are disabled.
- Fix mpirun_rsh bug when running mpmd programs with no arguments
- Exclude Aarch64 for the time being as asm/timex.h seems to be missing
from the glibc kernel headers.
-------------------------------------------------------------------
Tue Jun 3 11:24:34 CEST 2014 - pth@suse.de
- Update to OFED 3.12 final.
-------------------------------------------------------------------
Mon May 26 13:02:24 CEST 2014 - pth@suse.de
- Update to 2.0rc2:
* Features and Enhancements (since 2.0rc1):
- CMA support is now enabled by default
- Optimization of collectives with CMA support
- RMA optimizations for shared memory and atomic operations
- Tuning RGET and Atomics operations
- Tuning RDMA FP-based communication
- MPI-T support for additional performance and control variables
- The --enable-mpit-pvars=yes configuration option will now
enable only MVAPICH2 specific variables
- Large message transfer support for PSM interface
- Optimization of collectives for PSM interface
- Updated to hwloc v1.9
* Bug-Fixes (since 2.0rc1):
- Fix multicast hang when there is a single process on one node
and more than one process on other nodes
- Fix non-power-of-two usage of scatter-doubling-allgather algorithm
- Fix for bcastzero type hang during finalize
- Enhanced handling of failures in RDMA_CM based
connection establishment
- Fix for a hang in finalize when using RDMA_CM
- Finish receive request when RDMA READ completes in RGET protocol
- Always use direct RDMA when flush is used
- Fix compilation error with --enable-g=all in PSM interface
- Fix warnings and memory leaks
-------------------------------------------------------------------
Thu May 15 16:01:50 CEST 2014 - pth@suse.de
- mvapich2-psm-devel requires infinipath-psm-devel.
- Remove redundent requires for the devel-static package.
-------------------------------------------------------------------
Wed May 7 15:40:33 UTC 2014 - stefan.fent@suse.com
- remove typo in mvapich-s390_get_cycles.patch
-------------------------------------------------------------------
Tue Apr 29 13:47:06 CEST 2014 - pth@suse.de
- Remove bogus 0 from spec.
-------------------------------------------------------------------
Mon Apr 28 12:30:12 CEST 2014 - pth@suse.de
- Remove all additional mvapich specific CFLAGS and extra LIBS.
-------------------------------------------------------------------
Fri Apr 25 09:41:47 CEST 2014 - pth@suse.de
- Fix ExclusiveArch
- Only PSM needs explicit configuration so drop the else branch
in configure call.
- mvapich2 now builds in parallel so tell make.
-------------------------------------------------------------------
Thu Apr 24 21:27:06 CEST 2014 - pth@suse.de
- Build Mvapich2 for Qlogic from its own mvapich2-psm.spec.
-------------------------------------------------------------------
Wed Apr 23 18:04:36 CEST 2014 - pth@suse.de
- Add mvapich2-pthread_yield.patch to define GNU_SOURCE before
including pthread.h to get pthread_yield declared.
-------------------------------------------------------------------
Wed Apr 23 14:48:07 CEST 2014 - pth@suse.de
- Don't require libibcommon as it's gone with OFED 3.12.
-------------------------------------------------------------------
Wed Apr 16 15:50:22 UTC 2014 - stefan.fent@suse.com
- add asm code from kernel to properly implement get_cycles on
s390 and s390x (bnc #870424) (mvapich-s390_get_cycles.patch)
-------------------------------------------------------------------
Mon Apr 7 14:49:22 CEST 2014 - pth@suse.de
- Fix spec so that testsuite builds correctly.
-------------------------------------------------------------------
Sat Apr 5 20:28:49 CEST 2014 - pth@suse.de
- Update config.* to make it build on ppc64le.
-------------------------------------------------------------------
Thu Mar 27 12:56:50 CET 2014 - pth@suse.de
- Regenerate autotool files to get ppc64le recognized.
- The predefined platform macros for s390 are lower case not upcase.
-------------------------------------------------------------------
Wed Mar 26 16:15:45 CET 2014 - pth@suse.de
- Finally got the syntax for conditionals in spec right...
- Add a dummy implementation of get_cycles for s390x.
- Update to 2.0rc1 as this is a MPI-3 implementation. For
detailed changes see.
- Fix options passed to mpi-selector
-------------------------------------------------------------------
Tue Mar 25 14:42:43 CET 2014 - pth@suse.de
- Include the two COPYRIGHT files in the package.
- BuildRequire kernel-headers on s390x.
- Fix spec file
-------------------------------------------------------------------
Wed Mar 5 14:04:47 CET 2014 - pth@suse.de
- Compile with support for PSM on ix86 (fate#315889).
- mvapich2 has a testsuite, so run it from a separate spec file.
-------------------------------------------------------------------
Mon Feb 10 13:13:39 CET 2014 - pth@suse.de
- Update to 1.9:
- Remove mvapich2-1.0.2-non-void-rtn.patch as the changes are in
the upstream source.
- Reformat BuildRequires
-------------------------------------------------------------------
Fri Jan 24 19:15:39 CET 2014 - pth@suse.de
- Update to OFED 3.12 daily.
-------------------------------------------------------------------
Fri Feb 29 00:00:00 CET 2008 - - jjolly@suse.de
- Update to 1.0.2 from OFED 1.3 GA release
- Minor changes to return value patch
-------------------------------------------------------------------
Thu Jan 31 00:00:00 CET 2008 - - jjolly@suse.de
- Update to 1.0.1 from OFED 1.3 rc2
- Fixed several 'undefined return value' compile errors
-------------------------------------------------------------------
Tue Jul 10 00:00:00 CET 2007 - - hvogel@suse.de
- Initial Package, Version 0.9.8