2019-08-28 13:58:32 +02:00
#
2020-02-21 16:50:33 +01:00
# spec file for package python-torch
2019-08-28 13:58:32 +02:00
#
2024-07-23 15:10:03 +02:00
# Copyright (c) 2024 SUSE LLC
2019-08-28 13:58:32 +02:00
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.
# Please submit bugfixes or comments via https://bugs.opensuse.org/
2020-02-21 16:50:33 +01:00
#
2019-08-28 13:58:32 +02:00
%define srcname pytorch
2020-01-14 15:16:39 +01:00
%define pname torch
2019-08-28 13:58:32 +02:00
2022-05-24 07:53:14 +02:00
%global flavor @BUILD_FLAVOR@%{nil}
%if "%{flavor}" == "standard"
%bcond_with cuda
%endif
2020-02-26 14:47:31 +01:00
2022-05-24 07:53:14 +02:00
%if "%{flavor}" == "cuda-10-2"
%bcond_without cuda
%define cudaver 10-2
%endif
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%bcond_with mpi
%if "%{flavor}" == "openmpi4"
%bcond_without mpi
%bcond_without openmpi4
%global mpi_flavor openmpi
%define mpi_ext 4
%endif
%if "%{flavor}" == "vulkan"
%bcond_without vulkan
%global pkg_suffix -vulkan
%endif
%if %{with mpi}
%global pkg_suffix %{?mpi_flavor:-%{mpi_flavor} %{?mpi_ext} }
%define pkg_prefix %{_libdir}/mpi/gcc/%{mpi_flavor}%{?mpi_ext}
%define pkg_bindir %{pkg_prefix}/bin/
%define pkg_libdir %{pkg_prefix}/%{_lib}/
%define pkg_incdir %{pkg_prefix}/include/
%define pkg_datadir %{pkg_prefix}/share/
%define pkg_sysconfdir %{pkg_prefix}/etc/
%define pkg_skeldir %{pkg_prefix}/etc/skel/
%define package_name %{pname}%{?pkg_suffix}
%endif
%define FP16_version 4dfe081
%define FXdiv_version b408327
%define QNNPACK_version 7d2a4e9
%define XNNPACK_version fcbf55a
%define cpuinfo_version d6860c4
%define flatbuffers 01834de
%define foxi_version c278588
%define fmt_version e69e5f9
%define gemmlowp_version 3fb5c
%define gloo_version 5354032
%define kineto 3f30237
%define libnop 910b558
%define onnx_version 990217f
%define pocketfft 9d3ab05
%define psimd_version 072586a
%define pthreadpool_version 4fe0e1e
%define pybind11_version 3e9dfa2
%define sleef_version e0a003e
%define tensorpipe 52791a2
Name : python-torch%{?pkg_suffix}
Version : 2.3.1
2019-08-28 13:58:32 +02:00
Release : 0
Summary : Deep learning framework aka pytorch/Caffe2
2022-05-24 07:53:14 +02:00
License : Apache-2.0 AND BSD-2-Clause AND BSD-3-Clause AND MIT AND Zlib AND BSL-1.0
2019-08-28 13:58:32 +02:00
Group : Development/Languages/Python
2020-02-21 16:50:33 +01:00
URL : https://pytorch.org
2019-08-28 13:58:32 +02:00
Source0 : https://github.com/pytorch/pytorch/archive/v%{version} .tar.gz#/%{srcname}-%{version}.tar.gz
2020-02-25 15:08:41 +01:00
Source1 : releases.html
2022-05-24 07:53:14 +02:00
# License10: BSD-3-Clause
Source10 : https://github.com/facebookincubator/gloo/archive/%{gloo_version} .tar.gz#/gloo-%{gloo_version}.tar.gz
# License12: BSD-2-Clause
Source12 : https://github.com/pytorch/cpuinfo/archive/%{cpuinfo_version} .tar.gz#/cpuinfo-%{cpuinfo_version}.tar.gz
# License13: BSL-1.0
Source13 : https://github.com/zdevito/sleef/archive/%{sleef_version} .tar.gz#/sleef-%{sleef_version}.tar.gz
# License14: BSD-3-Clause
Source14 : https://github.com/pybind/pybind11/archive/%{pybind11_version} .tar.gz#/pybind11-%{pybind11_version}.tar.gz
2019-08-28 13:58:32 +02:00
# License15: MIT
2022-05-24 07:53:14 +02:00
Source15 : https://github.com/onnx/onnx/archive/%{onnx_version} .tar.gz#/onnx-%{onnx_version}.tar.gz
2019-08-28 13:58:32 +02:00
#License16: BSD-2-Clause
2022-05-24 07:53:14 +02:00
Source16 : https://github.com/Maratyszcza/pthreadpool/archive/%{pthreadpool_version} .tar.gz#/pthreadpool-%{pthreadpool_version}.tar.gz
2019-08-28 13:58:32 +02:00
# License17: MIT
2022-05-24 07:53:14 +02:00
Source17 : https://github.com/Maratyszcza/FXdiv/archive/%{FXdiv_version} .tar.gz#/FXdiv-%{FXdiv_version}.tar.gz
2019-08-28 13:58:32 +02:00
# License18: MIT
2022-05-24 07:53:14 +02:00
Source18 : https://github.com/Maratyszcza/psimd/archive/%{psimd_version} .tar.gz#/psimd-%{psimd_version}.tar.gz
2019-08-28 13:58:32 +02:00
# License19: MIT
2022-05-24 07:53:14 +02:00
Source19 : https://github.com/Maratyszcza/FP16/archive/%{FP16_version} .tar.gz#/FP16-%{FP16_version}.tar.gz
# License20: Apache-2.0
Source20 : https://github.com/google/gemmlowp/archive/%{gemmlowp_version} .tar.gz#/gemmlowp-%{gemmlowp_version}.tar.gz
# License21: MIT
Source21 : https://github.com/houseroad/foxi/archive/%{foxi_version} .tar.gz#/foxi-%{foxi_version}.tar.gz
2020-02-21 16:50:33 +01:00
# License22: MIT
2022-05-24 07:53:14 +02:00
Source22 : https://github.com/pytorch/QNNPACK/archive/%{QNNPACK_version} .tar.gz#/QNNPACK-%{QNNPACK_version}.tar.gz
# License23: BSD-3-Clause
Source23 : https://github.com/google/XNNPACK/archive/%{XNNPACK_version} .tar.gz#/XNNPACK-%{XNNPACK_version}.tar.gz
# License 25: MIT
Source25 : https://github.com/fmtlib/fmt/archive/%{fmt_version} .tar.gz#/fmt-%{fmt_version}.tar.gz
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
# License 26: BSD-3-Clause
Source26 : https://github.com/mreineck/pocketfft/archive/%{pocketfft} .tar.gz#/pocketfft-%{pocketfft}.tar.gz
# License 27: BSD-3-Clause
Source27 : https://github.com/pytorch/kineto/archive/%{kineto} .tar.gz#/kineto-%{kineto}.tar.gz
# License 28: Apache-2.0
Source28 : https://github.com/google/flatbuffers/archive/%{flatbuffers} .tar.gz#/flatbuffers-%{flatbuffers}.tar.gz
# License 29: BSD-3-Clause
Source29 : https://github.com/pytorch/tensorpipe/archive/%{tensorpipe} .tar.gz#/tensorpipe-%{tensorpipe}.tar.gz
# License 30: Apache-2.0
Source30 : https://github.com/google/libnop/archive/%{libnop} .tar.gz#/libnop-%{libnop}.tar.gz
2024-07-23 15:10:03 +02:00
Patch1 : skip-third-party-check.patch
Patch2 : fix-setup.patch
2020-02-21 16:50:33 +01:00
2020-02-26 14:47:31 +01:00
# A python call to cmake fails with a return code of 1 on this arch, disable it for now.
2022-05-24 07:53:14 +02:00
# and 32-bit arm is not supported
ExcludeArch : %ix86 %{arm}
2020-02-26 14:47:31 +01:00
2020-02-21 16:50:33 +01:00
BuildRequires : %{python_module Gloo}
2020-01-14 15:16:11 +01:00
%ifarch x86_64
2019-08-28 13:58:32 +02:00
BuildRequires : %{python_module PeachPy}
2020-01-14 15:16:11 +01:00
%endif
2019-08-28 13:58:32 +02:00
BuildRequires : %{python_module PyYAML}
2020-02-21 16:50:33 +01:00
BuildRequires : %{python_module devel}
BuildRequires : %{python_module hypothesis}
BuildRequires : %{python_module numpy-devel}
BuildRequires : %{python_module opcodes}
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
BuildRequires : %{python_module pip}
2020-02-21 16:50:33 +01:00
BuildRequires : %{python_module protobuf}
BuildRequires : %{python_module psutil}
2024-07-23 15:10:03 +02:00
BuildRequires : %{python_module py-cpuinfo}
2019-08-28 13:58:32 +02:00
BuildRequires : %{python_module setuptools}
2019-09-04 10:17:31 +02:00
BuildRequires : %{python_module typing_extensions}
2019-08-28 13:58:32 +02:00
BuildRequires : %{python_module typing}
2022-05-24 07:53:14 +02:00
%if 0%{?suse_version} <= 1500
# Python 3.6 still need dataclasses
BuildRequires : %{python_module dataclasses}
%ifarch aarch64
# XNNPACK uses +dotprod modifier which requires GCC8+
BuildRequires : gcc8
BuildRequires : gcc8-c++
%endif
%endif
BuildRequires : cmake >= 3.5
2019-08-28 13:58:32 +02:00
BuildRequires : eigen3-devel
BuildRequires : fdupes
BuildRequires : gcc-c++
BuildRequires : glog-devel
BuildRequires : gtest
2020-02-21 16:50:33 +01:00
BuildRequires : leveldb-devel
2019-08-28 13:58:32 +02:00
BuildRequires : libnuma-devel
2020-02-21 16:50:33 +01:00
BuildRequires : libopenblas_pthreads-devel
2024-07-23 15:10:03 +02:00
BuildRequires : libuv-devel
2019-08-28 13:58:32 +02:00
BuildRequires : lmdb-devel
2020-02-21 16:50:33 +01:00
BuildRequires : ninja
2019-08-28 13:58:32 +02:00
BuildRequires : openblas-devel
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
BuildRequires : opencv-devel
2024-07-23 15:10:03 +02:00
BuildRequires : openssl-devel
2019-08-28 13:58:32 +02:00
BuildRequires : protobuf-c
BuildRequires : protobuf-devel
2022-05-24 07:53:14 +02:00
BuildRequires : python-rpm-macros
2019-08-28 13:58:32 +02:00
BuildRequires : snappy-devel
2022-05-24 07:53:14 +02:00
%if %{with cuda}
BuildRequires : cuda-compiler-%cudaver
BuildRequires : cuda-cudart-dev-%cudaver
BuildRequires : cuda-libraries-dev-%cudaver
BuildRequires : cuda-misc-headers-%cudaver
BuildRequires : cuda-nsight-%cudaver
BuildRequires : cuda-toolkit-%cudaver
%if 0%{?suse_version} > 1500
BuildRequires : gcc7
BuildRequires : gcc7-c++
%endif
BuildRequires : libcudnn7-devel
BuildRequires : libnccl-devel
%endif
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%if %{with openmpi4}
BuildRequires : openmpi4-devel
2024-07-23 15:10:03 +02:00
Conflicts : %{python_module torch}
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%endif
%if %{with vulkan}
2024-07-23 15:10:03 +02:00
BuildRequires : VulkanMemoryAllocator-devel
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
BuildRequires : shaderc
BuildRequires : vulkan-devel
2024-07-23 15:10:03 +02:00
Conflicts : %{python_module torch}
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%endif
2020-02-21 16:50:33 +01:00
Requires : python-numpy
Requires : python-protobuf
Requires : python-six
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
Requires : python-typing_extensions
2019-08-28 13:58:32 +02:00
2024-07-23 15:10:03 +02:00
Provides : python-caffe2%{?pkg_suffix} = %version
Provides : python-pytorch%{?pkg_suffix} = %version
2019-08-28 13:58:32 +02:00
2022-05-24 07:53:14 +02:00
%if "%flavor" == ""
ExclusiveArch : do_not_build
%endif
2019-08-28 13:58:32 +02:00
%python_subpackages
%description
PyTorch enables fast, flexible experimentation and efficient production through
a hybrid front-end, distributed training, and ecosystem of tools and libraries.
The library is developed by Facebook and other groups.
PyTorch provides two high-level features:
* Tensor computing (like NumPy) with strong acceleration via graphics
* processing units (GPU) Deep neural networks built on a tape-based autodiff
system
%package devel
Summary : Headers for C/C++, cmake build description and libraries needed for development
Group : Development/Languages/Python
2022-05-24 07:53:14 +02:00
Requires : python-torch = %{version}
2019-08-28 13:58:32 +02:00
%description devel
Although the Python interface is more polished and the primary focus of
development, PyTorch also has a C++ frontend. This package contains the header
to access the C/C++ interface.
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%package converters
2019-08-28 13:58:32 +02:00
Summary : Converters for onnx and caffe2
Group : Development/Languages/Python
2020-01-14 15:16:39 +01:00
BuildArch : noarch
Requires : python3-click
2020-02-21 16:50:33 +01:00
Requires : python3-onnx
Requires : python3-pip
2020-01-14 15:16:39 +01:00
Requires : python3-pname
2019-08-28 13:58:32 +02:00
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%description converters
2019-08-28 13:58:32 +02:00
Converter from caffe2 to onnx and from caffe2 to onnx formated files.
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%package examples
2019-08-28 13:58:32 +02:00
Summary : Examples which can be used for testing
Group : Development/Languages/Python
BuildArch : noarch
2020-01-14 15:16:39 +01:00
Recommends: python3-lmdb
Recommends: python3-networkx
2019-08-28 13:58:32 +02:00
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%description examples
2019-08-28 13:58:32 +02:00
This example files can be used to start an own pytorch/caffe2 project.
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%package -n libtorch%{?pkg_suffix}
2020-02-21 16:50:33 +01:00
Summary : Library which used by %{name}
Group : Development/Libraries/Python
2024-07-23 15:10:03 +02:00
%if %{with openmpi4}
Conflicts : libtorch
%endif
%if %{with vulkan}
Conflicts : libtorch
%endif
2020-02-21 16:50:33 +01:00
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%description -n libtorch%{?pkg_suffix}
2019-08-28 13:58:32 +02:00
Library which is used by %{name}
%prep
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%define make_depend_src() test -e $(basename %1| sed 's/-.*//') && rmdir %{?2}%{!?2:$(basename %1| sed 's/-.*//')}; tar xzf %1; mv $(basename %{1} | sed 's/\.tar\.gz//' )* %{?2}%{!?2:$(basename %1| sed 's/-.*//')}
2022-05-24 07:53:14 +02:00
%define make_depend_src_uppercase() rmdir -p $(basename %1| sed 's/-.*//'| tr '[:upper:]' '[:lower:]'); tar xzf %1; mv $(basename %1 | cut -f 1 -d '.' ) $(basename %1| sed 's/-.*//'| tr '[:upper:]' '[:lower:]')
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%autosetup -p1 -n %{srcname} -%{version}
2020-02-21 16:50:33 +01:00
cp %{S:1} releases.html
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%if %{with vulkan}
sed -i '/-Werror=return-type/d' CMakeLists.txt
%endif
2019-08-28 13:58:32 +02:00
cd third_party
rmdir python-peachpy/
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
rmdir eigen/
2019-08-28 13:58:32 +02:00
%make_depend_src %{SOURCE10}
%make_depend_src %{SOURCE12}
%make_depend_src %{SOURCE13}
%make_depend_src %{SOURCE14}
%make_depend_src %{SOURCE15}
%make_depend_src %{SOURCE16}
%make_depend_src %{SOURCE17}
%make_depend_src %{SOURCE18}
%make_depend_src %{SOURCE19}
%make_depend_src %{SOURCE20} gemmlowp/gemmlowp
%make_depend_src %{SOURCE21}
2020-02-21 16:50:33 +01:00
%make_depend_src %{SOURCE22}
2022-05-24 07:53:14 +02:00
%make_depend_src %{SOURCE23}
%make_depend_src %{SOURCE25}
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%make_depend_src %{SOURCE26}
%make_depend_src %{SOURCE27}
%make_depend_src %{SOURCE28}
%make_depend_src %{SOURCE29}
2024-07-23 15:10:03 +02:00
# getting the vendoring of the vendored source working, this is
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
# insanity at the next level. My onlu exuse is that libnop is header only
rmdir tensorpipe/third_party/libnop
%make_depend_src %{SOURCE30} tensorpipe/third_party/libnop
2019-08-28 13:58:32 +02:00
%build
2022-05-24 07:53:14 +02:00
%define buildvars \
export USE_NNPACK=OFF \
%if %{with cuda} \
export USE_CUDA=ON \
export USE_CUDNN=ON \
export USE_SYSTEM_NCCL=ON \
export PATH=" / u s r / l o c a l / c u d a - 1 0 . 1 / b i n : $ P A T H " \
export CPLUS_INCLUDE_PATH=" / u s r / l o c a l / c u d a - 1 0 . 1 / i n c l u d e " \
export C_INCLUDE_PATH=" / u s r / l o c a l / c u d a - 1 0 . 1 / i n c l u d e " \
export LD_LIBRARY_PATH=" / u s r / l o c a l / c u d a - 1 0 . 1 / l i b " \
export NCCL_INCLUDE_DIR=" / u s r / i n c l u d e / " \
%if 0%{?suse_version} > 1500 \
export CC=gcc-7 \
export CXX=g++-7 \
%endif \
%else \
%if 0%{?suse_version} <= 1500 \
%ifarch aarch64 \
export CC=gcc-8 \
export CXX=g++-8 \
%endif \
%endif \
export USE_CUDA=OFF \
export USE_CUDNN=OFF \
%endif \
export USE_KINETO=OFF \
export USE_LEVELDB=ON \
export USE_LMDB=ON \
export USE_FBGEMM=OFF \
export USE_SYSTEM_BENCHMARK=ON \
export USE_SYSTEM_EIGEN_INSTALL=ON \
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
export USE_OPENCV=ON \
2022-05-24 07:53:14 +02:00
export USE_TBB=OFF \
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
export USE_MKLDNN=OFF \
export USE_KINETO=OFF \
export USE_DISTRIBUTED=ON \
export TP_BUILD_LIBUV=OFF \
export USE_NCCL=OFF \
%if %{with mpi} \
export USE_MPI=ON \
export MPIEXEC_EXECUTABLE=" %{pkg_bindir} / m p i e x e c " \
%else \
export USE_GLOO=ON \
%endif \
export BLAS=OpenBLAS \
2022-05-24 07:53:14 +02:00
export BUILD_CUSTOM_PROTOBUF=OFF \
export BUILD_TEST=OFF \
export MAX_JOBS=%{?jobs} \
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%if %{with vulkan} \
export USE_VULKAN=ON \
export CXXFLAGS=" - I / u s r / - W n o - e r r o r = r e t u r n - t y p e " \
%endif \
2022-05-24 07:53:14 +02:00
%build vars
2019-08-28 13:58:32 +02:00
%python_build
%install
2022-05-24 07:53:14 +02:00
%build vars
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%python_install -q
2019-08-28 13:58:32 +02:00
%python_expand %fdupes %{buildroot} %{$python_sitearch}
2020-01-14 15:16:39 +01:00
install -m 755 -D caffe2/python/examples/* -t %{buildroot} %{_docdir} /%{name} /
2019-08-28 13:58:32 +02:00
install -m 644 -D %{buildroot} %{python_sitearch} /torch/lib/* %{buildroot} /%{_libdir}
2020-02-21 16:50:33 +01:00
#rm -r %{buildroot}%{python_sitearch}/torch/lib
#cd %{buildroot}/%{_libdir}
#rm libtorch.so
#ln -s libtorch.so.1 libtorch.so
#cd -
#for file in $(find %{buildroot}%{python_sitearch} -type f -name \*.py -perm 644 -size +1b); do
2022-05-24 07:53:14 +02:00
#%{__grep} '/usr/bin/env ' $file && sed -i 's@/usr/bin/env python@/usr/bin/python@' $file && chmod 755 $file
2020-02-21 16:50:33 +01:00
#done
#
#%check
#export LD_LIBRARY_PATH=%{buildroot}/%{_libdir}
#%%python_expand PYTHONPATH=%{buildroot}%{$python_sitearch} $python test/run_test.py
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%python_clone -a %{buildroot} %{_bindir} /torchrun
%python_clone -a %{buildroot} %{_bindir} /convert-caffe2-to-onnx
%python_clone -a %{buildroot} %{_bindir} /convert-onnx-to-caffe2
2019-08-28 13:58:32 +02:00
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%post -n libtorch%{?pkg_suffix} -p /sbin/ldconfig
%postun -n libtorch%{?pkg_suffix} -p /sbin/ldconfig
2019-08-28 13:58:32 +02:00
%files %{python_files}
%defattr (-,root,root)
2020-02-21 16:50:33 +01:00
%doc README.md NOTICE releases.html
2019-08-28 13:58:32 +02:00
%license LICENSE
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%{python_sitearch} /torch*
%{python_sitearch} /torchgen/
%{python_sitearch} /functorch/
2019-08-28 13:58:32 +02:00
%exclude %{python_sitearch} /torch/share
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%exclude %{python_sitearch} /torch/include
%exclude %{python_sitearch} /torch/_inductor/codegen
%exclude %{python_sitearch} /torch/utils/benchmark/utils/
%exclude %{python_sitearch} /torchgen/packaged/ATen/templates
%exclude %{python_sitearch} /torchgen/packaged/autograd/templates
%python_alternative %{_bindir} /torchrun
2019-08-28 13:58:32 +02:00
%files %{python_files devel}
%{python_sitearch} /torch/share
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%{python_sitearch} /torch/include
%{python_sitearch} /torch/_inductor/codegen
%{python_sitearch} /torch/utils/benchmark/utils/
%{python_sitearch} /torchgen/packaged/ATen/templates
%{python_sitearch} /torchgen/packaged/autograd/templates
%files %{python_files converters}
%python_alternative %{_bindir} /convert-caffe2-to-onnx
%python_alternative %{_bindir} /convert-onnx-to-caffe2
%files %{python_files examples}
2019-08-28 13:58:32 +02:00
%{_docdir} /%{name}
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%exclude %{_docdir} /%{name} /NOTICE
%exclude %{_docdir} /%{name} /README.md
%exclude %{_docdir} /%{name} /releases.html
2019-08-28 13:58:32 +02:00
- update to 2.3.1 with following summarized highlights:
* from 2.0.x:
- torch.compile is the main API for PyTorch 2.0, which wraps your model and
returns a compiled model. It is a fully additive (and optional) feature
and hence 2.0 is 100% backward compatible by definition
- Accelerated Transformers introduce high-performance support for training
and inference using a custom kernel architecture for scaled dot product
attention (SPDA). The API is integrated with torch.compile() and model
developers may also use the scaled dot product attention kernels directly
by calling the new scaled_dot_product_attention() operato
* from 2.1.x:
- automatic dynamic shape support in torch.compile,
torch.distributed.checkpoint for saving/loading distributed training jobs
on multiple ranks in parallel, and torch.compile support for the NumPy
API.
- In addition, this release offers numerous performance improvements (e.g.
CPU inductor improvements, AVX512 support, scaled-dot-product-attention
support) as well as a prototype release of torch.export, a sound
full-graph capture mechanism, and torch.export-based quantization.
* from 2.2.x:
- 2x performance improvements to scaled_dot_product_attention via
FlashAttention-v2 integration, as well as AOTInductor, a new
ahead-of-time compilation and deployment tool built for non-python
server-side deployments.
* from 2.3.x:
- support for user-defined Triton kernels in torch.compile, allowing for
users to migrate their own Triton kernels from eager without
experiencing performance complications or graph breaks. As well, Tensor
Parallelism improves the experience for training Large Language Models
using native PyTorch functions, which has been validated on training
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/python-torch?expand=0&rev=32
2024-07-19 14:15:19 +02:00
%files -n libtorch%{?pkg_suffix}
2019-08-28 13:58:32 +02:00
%{_libdir} /*.so*
2020-02-21 16:50:33 +01:00
2019-08-28 13:58:32 +02:00
%changelog