#
# spec file for package python-swifter
#
# Copyright (c) 2024 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.
# Please submit bugfixes or comments via https://bugs.opensuse.org/
#
Name: python-swifter
Version: 1.4.0
Release: 0
Summary: Tool to speed up pandas calculations
License: MIT
URL: https://github.com/jmcarpenter2/swifter
Source: https://github.com/jmcarpenter2/swifter/archive/%{version}.tar.gz#/swifter-%{version}.tar.gz
BuildRequires: %{python_module pip}
BuildRequires: %{python_module setuptools}
BuildRequires: %{python_module wheel}
BuildRequires: fdupes
BuildRequires: python-rpm-macros
Requires: python-dask-dataframe >= 2.10.0
Requires: python-pandas >= 1.0
Requires: python-psutil >= 5.6.6
Requires: python-tqdm >= 4.33.0
Suggests: python-ipywidgets >= 7.0.0
Suggests: python-ray >= 1.0
BuildArch: noarch
# SECTION test requirements
BuildRequires: %{python_module dask-dataframe >= 2.10.0}
BuildRequires: %{python_module ipywidgets >= 7.0.0 if %python-base >= 3.10}
BuildRequires: %{python_module pandas >= 1.0}
BuildRequires: %{python_module psutil >= 5.6.6}
BuildRequires: %{python_module pytest-xdist}
BuildRequires: %{python_module pytest}
BuildRequires: %{python_module tqdm >= 4.33.0}
# /SECTION
%python_subpackages
%description
A package which efficiently applies any function to a
pandas dataframe or series in the fastest available manner.
%prep
%setup -q -n swifter-%{version}
%build
%pyproject_wheel
%install
%pyproject_install
%python_expand %fdupes %{buildroot}%{$python_sitelib}
%check
# The speed tests fail on the build service machines, so disable that portion of the tests.
sed -i 's/if self.ncores > 1: # speed test/if False: # no speed test/' swifter/swifter_tests.py
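# Sketch, not part of the upstream packaging: sed exits 0 even when the
# pattern matched nothing, so warn (without failing the build) if the
# replacement above did not land. Assumes the marker text written by sed
# above is unchanged.
grep -q 'if False: # no speed test' swifter/swifter_tests.py || echo "warning: speed test pattern not found in swifter/swifter_tests.py" >&2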
# groupby requires the extra dependency python-ray, which is not available
donttest="(TestPandasDataFrame and groupby)"
# wrong axis parameter
donttest="$donttest or test_nonvectorized_math_apply_on_small_dataframe"
%pytest -n auto swifter/swifter_tests.py -k "not ($donttest)"
%files %{python_files}
%doc README.md
%license LICENSE
%{python_sitelib}/swifter
%{python_sitelib}/swifter-%{version}.dist-info
%changelog