- update to 2.3.1 with the following summarized highlights (short usage
  sketches for the APIs mentioned here follow after this list):

  * from 2.0.x:
    - torch.compile is the main API for PyTorch 2.0, which wraps your model and
      returns a compiled model. It is a fully additive (and optional) feature
      and hence 2.0 is 100% backward compatible by definition
    - Accelerated Transformers introduce high-performance support for training
      and inference using a custom kernel architecture for scaled dot product
      attention (SDPA). The API is integrated with torch.compile() and model
      developers may also use the scaled dot product attention kernels directly
      by calling the new scaled_dot_product_attention() operator
  * from 2.1.x:
    - automatic dynamic shape support in torch.compile,
      torch.distributed.checkpoint for saving/loading distributed training jobs
      on multiple ranks in parallel, and torch.compile support for the NumPy
      API.
    - In addition, this release offers numerous performance improvements (e.g.
      CPU inductor improvements, AVX512 support, scaled-dot-product-attention
      support) as well as a prototype release of torch.export, a sound
      full-graph capture mechanism, and torch.export-based quantization.
  * from 2.2.x:
    - 2x performance improvements to scaled_dot_product_attention via
      FlashAttention-v2 integration, as well as AOTInductor, a new
      ahead-of-time compilation and deployment tool built for non-Python
      server-side deployments.
  * from 2.3.x:
    - support for user-defined Triton kernels in torch.compile, allowing
      users to migrate their own Triton kernels from eager without
      experiencing performance regressions or graph breaks. In addition,
      Tensor Parallelism improves the experience for training Large Language
      Models using native PyTorch functions, which has been validated on
      training runs for 100B parameter models.
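
A minimal sketch of the torch.compile entry point described above; the model
and tensor shapes are arbitrary placeholders, not taken from the release notes:

    import torch
    import torch.nn as nn

    # any eager-mode module works; torch.compile wraps it and returns an
    # optimized callable, while the original module stays usable as-is
    model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
    compiled = torch.compile(model)

    x = torch.randn(8, 64)
    out = compiled(x)   # first call triggers compilation, later calls reuse it
    print(out.shape)    # torch.Size([8, 10])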
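
A hedged sketch of the scaled_dot_product_attention() operator mentioned in
the 2.0.x item; the tensor sizes are made up for illustration:

    import torch
    import torch.nn.functional as F

    # layout expected by SDPA: (batch, heads, sequence, head_dim)
    q = torch.randn(2, 8, 128, 64)
    k = torch.randn(2, 8, 128, 64)
    v = torch.randn(2, 8, 128, 64)

    # dispatches to a fused kernel (FlashAttention, memory-efficient, or math)
    # depending on device, dtype and shapes
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    print(out.shape)    # torch.Size([2, 8, 128, 64])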
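
For the 2.1.x item about torch.compile support for the NumPy API, a small
sketch; the function itself is a made-up example:

    import numpy as np
    import torch

    @torch.compile
    def row_norms(x):
        # plain NumPy code; torch.compile traces it into a PyTorch graph
        return np.sqrt(np.sum(x * x, axis=1))

    x = np.random.randn(64, 32).astype(np.float32)
    print(row_norms(x).shape)   # (64,)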
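
torch.export (still a prototype in this release) can be sketched as follows;
the module is a placeholder, not an upstream example:

    import torch
    import torch.nn as nn

    class MLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(16, 4)

        def forward(self, x):
            return torch.relu(self.fc(x))

    # sound full-graph capture: errors out instead of silently graph-breaking
    ep = torch.export.export(MLP(), (torch.randn(2, 16),))
    print(ep)                             # ExportedProgram with an ATen-level graph
    out = ep.module()(torch.randn(2, 16)) # run the captured program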
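
Regarding the FlashAttention-v2 backed scaled_dot_product_attention from the
2.2.x item, PyTorch 2.3 also ships torch.nn.attention.sdpa_kernel to pin a
specific backend; this sketch assumes a CUDA GPU with FlashAttention support
and half-precision inputs:

    import torch
    import torch.nn.functional as F
    from torch.nn.attention import SDPBackend, sdpa_kernel

    q, k, v = (torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
               for _ in range(3))

    # restrict dispatch to the FlashAttention kernel instead of letting SDPA choose
    with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)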
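
The 2.3.x item about user-defined Triton kernels in torch.compile can be
illustrated with the standard vector-add kernel; this assumes the triton
package and a CUDA device, and the kernel is purely illustrative:

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x, y):
        out = torch.empty_like(x)
        n = x.numel()
        grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

    # the hand-written Triton kernel stays inside the compiled graph,
    # so there is no graph break around the kernel launch
    compiled_add = torch.compile(add, fullgraph=True)
    x = torch.randn(4096, device="cuda")
    y = torch.randn(4096, device="cuda")
    print(torch.allclose(compiled_add(x, y), x + y))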

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
committed by Git OBS Bridge on 2024-07-19 12:15:19 +00:00
commit 9c8ce17a59 (parent 73b1680af3)
37 changed files with 307 additions and 253 deletions


@@ -1,3 +1,49 @@
-------------------------------------------------------------------
Thu Jul 11 09:37:17 UTC 2024 - Christian Goll <cgoll@suse.com>
- update to 2.3.1 with the following summarized highlights:
  * from 2.0.x:
    - torch.compile is the main API for PyTorch 2.0, which wraps your model and
      returns a compiled model. It is a fully additive (and optional) feature
      and hence 2.0 is 100% backward compatible by definition
    - Accelerated Transformers introduce high-performance support for training
      and inference using a custom kernel architecture for scaled dot product
      attention (SDPA). The API is integrated with torch.compile() and model
      developers may also use the scaled dot product attention kernels directly
      by calling the new scaled_dot_product_attention() operator
  * from 2.1.x:
    - automatic dynamic shape support in torch.compile,
      torch.distributed.checkpoint for saving/loading distributed training jobs
      on multiple ranks in parallel, and torch.compile support for the NumPy
      API.
    - In addition, this release offers numerous performance improvements (e.g.
      CPU inductor improvements, AVX512 support, scaled-dot-product-attention
      support) as well as a prototype release of torch.export, a sound
      full-graph capture mechanism, and torch.export-based quantization.
  * from 2.2.x:
    - 2x performance improvements to scaled_dot_product_attention via
      FlashAttention-v2 integration, as well as AOTInductor, a new
      ahead-of-time compilation and deployment tool built for non-Python
      server-side deployments.
  * from 2.3.x:
    - support for user-defined Triton kernels in torch.compile, allowing
      users to migrate their own Triton kernels from eager without
      experiencing performance regressions or graph breaks. In addition,
      Tensor Parallelism improves the experience for training Large Language
      Models using native PyTorch functions, which has been validated on
      training runs for 100B parameter models.
- added separate openmpi4 build
- added separate Vulkan build, although this functionality isn't exposed to
  the Python ABI
- For the OBS build, all vendored sources follow the pattern
  NAME-7digitcommit.tar.gz rather than NAME-COMMIT.tar.gz
- added following patches:
  * skip-third-party-check.patch
  * fix-setup.patch
- removed patches:
  * pytorch-rm-some-gitmodules.patch
  * fix-call-of-onnxInitGraph.patch
-------------------------------------------------------------------
Thu Jul 22 14:40:45 UTC 2021 - Guillaume GARDET <guillaume.gardet@opensuse.org>