5039 lines
221 KiB
Plaintext
5039 lines
221 KiB
Plaintext
-------------------------------------------------------------------
|
||
Sat Dec 2 14:09:52 UTC 2023 - Dirk Müller <dmueller@suse.com>
|
||
|
||
- update to 2023.12.0:
|
||
* Bokeh 3.3.0 compatibility
|
||
* Add ``network`` marker to
|
||
``test_pyarrow_filesystem_option_real_data``
|
||
* Bump GPU CI to CUDA 11.8 (:pr:`10656`)
|
||
* Tokenize ``pandas`` offsets deterministically
|
||
* Add tokenize ``pd.NA`` functionality
|
||
* Update gpuCI ``RAPIDS_VER`` to ``24.02`` (:pr:`10636`)
|
||
* Fix precision handling in ``array.linalg.norm`` (:pr:`10556`)
|
||
`joanrue`_
|
||
* Add ``axis`` argument to ``DataFrame.clip`` and
|
||
``Series.clip`` (:pr:`10616`) `Richard (Rick) Zamora`_
|
||
* Update changelog entry for in-memory rechunking (:pr:`10630`)
|
||
`Florian Jetter`_
|
||
* Fix flaky ``test_resources_reset_after_cancelled_task``
|
||
* Bump GPU CI to CUDA 11.8
|
||
* Bump ``conda-incubator/setup-miniconda``
|
||
* Add debug logs to P2P scheduler plugin
|
||
* ``O(1)`` access for ``/info/task/`` endpoint
|
||
* Remove stringification from shuffle annotations
|
||
* Don't cast ``int`` metrics to ``float``
|
||
* Drop asyncio TCP backend
|
||
* Add offload support to ``context_meter.add_callback``
|
||
* Test that ``sync()`` propagates contextvars
|
||
* Fix ``test_statistical_profiling_cycle``
|
||
* Replace ``Client.register_plugin`` s ``idempotent`` argument
|
||
with ``.idempotent`` attribute on plugins
|
||
* Fix test report generation
|
||
* Install ``pyarrow-hotfix`` on ``mindeps-pandas`` CI
|
||
* Reduce memory usage of scheduler process - optimize
|
||
``scheduler.py::TaskState`` class
|
||
* Update cuDF test with explicit ``dtype=object``
|
||
* Fix ``Cluster`` / ``SpecCluster`` calls to async close
|
||
methods
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Nov 16 21:26:58 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>
|
||
|
||
- Update to 2023.11.0
|
||
* Zero-copy P2P Array Rechunking
|
||
* Deprecating PyArrow <14.0.1
|
||
* Improved PyArrow filesystem for Parquet
|
||
* Improve Type Reconciliation in P2P Shuffling
|
||
* official support for Python 3.12
|
||
* Reduced memory pressure for multi array reductions
|
||
* improved P2P shuffling robustness
|
||
* Reduced scheduler CPU load for large graphs
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Sep 10 13:29:26 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2023.9.1
|
||
## Enhancements
|
||
* Stricter data type for dask keys (GH#10485) crusaderky
|
||
* Special handling for None in DASK_ environment variables
|
||
(GH#10487) crusaderky
|
||
## Bug Fixes
|
||
- Release 2023.9.0
|
||
## Bug Fixes
|
||
* Remove support for np.int64 in keys (GH#10483) crusaderky
|
||
* Fix _partitions dtype in meta for shuffling (GH#10462) Hendrik
|
||
Makait
|
||
* Don’t use exception hooks to shorten tracebacks (GH#10456)
|
||
crusaderky
|
||
- Release 2023.8.1
|
||
## Enhancements
|
||
* Adding support for cgroup v2 to cpu_count (GH#10419) Johan
|
||
Olsson
|
||
* Support multi-column groupby with sort=True and split_out>1
|
||
(GH#10425) Richard (Rick) Zamora
|
||
* Add DataFrame.enforce_runtime_divisions method (GH#10404)
|
||
Richard (Rick) Zamora
|
||
* Enable file mode="x" with a single_file=True for Dask DataFrame
|
||
to_csv (GH#10443) Genevieve Buckley
|
||
## Bug Fixes
|
||
* Fix ValueError when running to_csv in append mode with
|
||
single_file as True (GH#10441)
|
||
- Release 2023.8.0
|
||
## Enhancements
|
||
* Fix for make_timeseries performance regression (GH#10428) Irina
|
||
Truong
|
||
- Release 2023.7.1
|
||
* This release updates Dask DataFrame to automatically convert
|
||
text data using object data types to string[pyarrow] if
|
||
pandas>=2 and pyarrow>=12 are installed. This should result in
|
||
significantly reduced memory consumption and increased
|
||
computation performance in many workflows that deal with text
|
||
data. You can disable this change by setting the
|
||
dataframe.convert-string configuration value to False with
|
||
dask.config.set({"dataframe.convert-string": False})
|
||
## Enhancements
|
||
* Convert to pyarrow strings if proper dependencies are installed
|
||
(GH#10400) James Bourbeau
|
||
* Avoid repartition before shuffle for p2p (GH#10421) Patrick
|
||
Hoefler
|
||
* API to generate random Dask DataFrames (GH#10392) Irina Truong
|
||
* Speed up dask.bag.Bag.random_sample (GH#10356) crusaderky
|
||
* Raise helpful ValueError for invalid time units (GH#10408) Nat
|
||
Tabris
|
||
* Make repartition a no-op when divisions match (divisions
|
||
provided as a list) (GH#10395) Nicolas Grandemange
|
||
## Bug Fixes
|
||
* Use dataframe.convert-string in read_parquet token (GH#10411)
|
||
James Bourbeau
|
||
* Category dtype is lost when concatenating MultiIndex (GH#10407)
|
||
Irina Truong
|
||
* Fix FutureWarning: The provided callable... (GH#10405) Irina
|
||
Truong
|
||
* Enable non-categorical hive-partition columns in read_parquet
|
||
(GH#10353) Richard (Rick) Zamora
|
||
* concat ignoring DataFrame withouth columns (GH#10359) Patrick
|
||
Hoefler
|
||
- Release 2023.7.0
|
||
## Enhancements
|
||
* Catch exceptions when attempting to load CLI entry points
|
||
(GH#10380) Jacob Tomlinson
|
||
## Bug Fixes
|
||
* Fix typo in _clean_ipython_traceback (GH#10385) Alexander
|
||
Clausen
|
||
* Ensure that df is immutable after from_pandas (GH#10383)
|
||
Patrick Hoefler
|
||
* Warn consistently for inplace in Series.rename (GH#10313)
|
||
Patrick Hoefler
|
||
- Release 2023.6.1
|
||
## Enhancements
|
||
* Remove no longer supported clip_lower and clip_upper (GH#10371)
|
||
Patrick Hoefler
|
||
* Support DataFrame.set_index(..., sort=False) (GH#10342) Miles
|
||
* Cleanup remote tracebacks (GH#10354) Irina Truong
|
||
* Add dispatching mechanisms for pyarrow.Table conversion
|
||
(GH#10312) Richard (Rick) Zamora
|
||
* Choose P2P even if fusion is enabled (GH#10344) Hendrik Makait
|
||
* Validate that rechunking is possible earlier in graph
|
||
generation (GH#10336) Hendrik Makait
|
||
## Bug Fixes
|
||
* Fix issue with header passed to read_csv (GH#10355) GALI PREM
|
||
SAGAR
|
||
* Respect dropna and observed in GroupBy.var and GroupBy.std
|
||
(GH#10350) Patrick Hoefler
|
||
* Fix H5FD_lock error when writing to hdf with distributed client
|
||
(GH#10309) Irina Truong
|
||
* Fix for total_mem_usage of bag.map() (GH#10341) Irina Truong
|
||
## Deprecations
|
||
* Deprecate DataFrame.fillna/Series.fillna with method (GH#10349)
|
||
Irina Truong
|
||
* Deprecate DataFrame.first and Series.first (GH#10352) Irina
|
||
Truong
|
||
- Release 2023.6.0
|
||
## Enhancements
|
||
* Add missing not in predicate support to read_parquet (GH#10320)
|
||
Richard (Rick) Zamora
|
||
## Bug Fixes
|
||
* Fix for incorrect value_counts (GH#10323) Irina Truong
|
||
* Update empty describe top and freq values (GH#10319) James
|
||
Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Jun 10 12:27:59 UTC 2023 - ecsos <ecsos@opensuse.org>
|
||
|
||
- Add %{?sle15_python_module_pythons}
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Jun 5 23:42:44 UTC 2023 - Steve Kowalik <steven.kowalik@suse.com>
|
||
|
||
- Tighten bokeh requirement to match distributed.
|
||
|
||
-------------------------------------------------------------------
|
||
Fri May 26 19:59:37 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2023.5.1
|
||
* This release drops support for Python 3.8. As of this release
|
||
Dask supports Python 3.9, 3.10, and 3.11.
|
||
## Enhancements
|
||
* Drop Python 3.8 support (GH#10295) Thomas Grainger
|
||
* Change Dask Bag partitioning scheme to improve cluster
|
||
saturation (GH#10294) Jacob Tomlinson
|
||
* Generalize dd.to_datetime for GPU-backed collections, introduce
|
||
get_meta_library utility (GH#9881) Charles Blackmon-Luca
|
||
* Add na_action to DataFrame.map (GH#10305) Patrick Hoefler
|
||
* Raise TypeError in DataFrame.nsmallest and DataFrame.nlargest
|
||
when columns is not given (GH#10301) Patrick Hoefler
|
||
* Improve sizeof for pd.MultiIndex (GH#10230) Patrick Hoefler
|
||
* Support duplicated columns in a bunch of DataFrame methods
|
||
(GH#10261) Patrick Hoefler
|
||
* Add numeric_only support to DataFrame.idxmin and
|
||
DataFrame.idxmax (GH#10253) Patrick Hoefler
|
||
* Implement numeric_only support for DataFrame.quantile
|
||
(GH#10259) Patrick Hoefler
|
||
* Add support for numeric_only=False in DataFrame.std (GH#10251)
|
||
Patrick Hoefler
|
||
* Implement numeric_only=False for GroupBy.cumprod and
|
||
GroupBy.cumsum (GH#10262) Patrick Hoefler
|
||
* Implement numeric_only for skew and kurtosis (GH#10258) Patrick
|
||
Hoefler
|
||
* mask and where should accept a callable (GH#10289) Irina Truong
|
||
* Fix conversion from Categorical to pa.dictionary in
|
||
read_parquet (GH#10285) Patrick Hoefler
|
||
## Bug Fixes
|
||
* Spurious config on nested annotations (GH#10318) crusaderky
|
||
* Fix rechunking behavior for dimensions with known and unknown
|
||
chunk sizes (GH#10157) Hendrik Makait
|
||
* Enable drop to support mismatched partitions (GH#10300) James
|
||
Bourbeau
|
||
* Fix divisions construction for to_timestamp (GH#10304) Patrick
|
||
Hoefler
|
||
* pandas ExtensionDtype raising in Series reduction operations
|
||
(GH#10149) Patrick Hoefler
|
||
* Fix regression in da.random interface (GH#10247) Eray Aslan
|
||
* da.coarsen doesn’t trim an empty chunk in meta (GH#10281) Irina
|
||
Truong
|
||
* Fix dtype inference for engine="pyarrow" in read_csv (GH#10280)
|
||
Patrick Hoefler
|
||
- Release 2023.5.0
|
||
## Enhancements
|
||
* Implement numeric_only=False for GroupBy.corr and GroupBy.cov
|
||
(GH#10264) Patrick Hoefler
|
||
* Add support for numeric_only=False in DataFrame.var (GH#10250)
|
||
Patrick Hoefler
|
||
* Add numeric_only support to DataFrame.mode (GH#10257) Patrick
|
||
Hoefler
|
||
* Add DataFrame.map to dask.DataFrame API (GH#10246) Patrick
|
||
Hoefler
|
||
* Adjust for DataFrame.applymap deprecation and all NA concat
|
||
behaviour change (GH#10245) Patrick Hoefler
|
||
* Enable numeric_only=False for DataFrame.count (GH#10234)
|
||
Patrick Hoefler
|
||
* Disallow array input in mask/where (GH#10163) Irina Truong
|
||
* Support numeric_only=True in GroupBy.corr and GroupBy.cov
|
||
(GH#10227) Patrick Hoefler
|
||
* Add numeric_only support to GroupBy.median (GH#10236) Patrick
|
||
Hoefler
|
||
* Support mimesis=9 in dask.datasets (GH#10241) James Bourbeau
|
||
* Add numeric_only support to min, max and prod (GH#10219)
|
||
Patrick Hoefler
|
||
* Add numeric_only=True support for GroupBy.cumsum and
|
||
GroupBy.cumprod (GH#10224) Patrick Hoefler
|
||
* Add helper to unpack numeric_only keyword (GH#10228) Patrick
|
||
Hoefler
|
||
## Bug Fixes
|
||
* Fix clone + from_array failure (GH#10211) crusaderky
|
||
* Fix dataframe reductions for ea dtypes (GH#10150) Patrick
|
||
Hoefler
|
||
* Avoid scalar conversion deprecation warning in numpy=1.25
|
||
(GH#10248) James Bourbeau
|
||
* Make sure transform output has the same index as input
|
||
(GH#10184) Irina Truong
|
||
* Fix corr and cov on a single-row partition (GH#9756) Irina
|
||
Truong
|
||
* Fix test_groupby_numeric_only_supported and
|
||
test_groupby_aggregate_categorical_observed upstream errors
|
||
(GH#10243) Irina Truong
|
||
- Release 2023.4.1
|
||
## Enhancements
|
||
* Implement numeric_only support for DataFrame.sum (GH#10194)
|
||
Patrick Hoefler
|
||
* Add support for numeric_only=True in GroupBy operations
|
||
(GH#10222) Patrick Hoefler
|
||
* Avoid deep copy in DataFrame.__setitem__ for pandas 1.4 and up
|
||
(GH#10221) Patrick Hoefler
|
||
* Avoid calling Series.apply with _meta_nonempty (GH#10212)
|
||
Patrick Hoefler
|
||
* Unpin sqlalchemy and fix compatibility issues (GH#10140)
|
||
Patrick Hoefler
|
||
## Bug Fixes
|
||
* Partially revert default client discovery (GH#10225) Florian
|
||
Jetter
|
||
* Support arrow dtypes in Index meta creation (GH#10170) Patrick
|
||
Hoefler
|
||
* Repartitioning raises with extension dtype when truncating
|
||
floats (GH#10169) Patrick Hoefler
|
||
* Adjust empty Index from fastparquet to object dtype (GH#10179)
|
||
Patrick Hoefler
|
||
- Release 2023.4.0
|
||
## Enhancements
|
||
* Override old default values in update_defaults (GH#10159) Gabe
|
||
Joseph
|
||
* Add a CLI command to list and get a value from dask config
|
||
(GH#9936) Irina Truong
|
||
* Handle string-based engine argument to read_json (GH#9947)
|
||
Richard (Rick) Zamora
|
||
* Avoid deprecated GroupBy.dtypes (GH#10111) Irina Truong
|
||
## Bug Fixes
|
||
* Revert grouper-related changes (GH#10182) Irina Truong
|
||
* GroupBy.cov raising for non-numeric grouping column (GH#10171)
|
||
Patrick Hoefler
|
||
* Updates for Index supporting numpy numeric dtypes (GH#10154)
|
||
Irina Truong
|
||
* Preserve dtype for partitioning columns when read with pyarrow
|
||
(GH#10115) Patrick Hoefler
|
||
* Fix annotations for to_hdf (GH#10123) Hendrik Makait
|
||
* Handle None column name when checking if columns are all
|
||
numeric (GH#10128) Lawrence Mitchell
|
||
* Fix valid_divisions when passed a tuple (GH#10126) Brian
|
||
Phillips
|
||
* Maintain annotations in DataFrame.categorize (GH#10120) Hendrik
|
||
Makait
|
||
* Fix handling of missing min/max parquet statistics during
|
||
filtering (GH#10042) Richard (Rick) Zamora
|
||
## Deprecations
|
||
* Deprecate use_nullable_dtypes= and add dtype_backend=
|
||
(GH#10076) Irina Truong
|
||
* Deprecate convert_dtype in Series.apply (GH#10133) Irina Truong
|
||
- Drop dask-pr10042-parquetstats.patch
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Apr 4 20:46:26 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Drop python38 test flavor
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Mar 30 21:00:52 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Enable pyarrow in the [complete] extra
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Mar 27 16:40:11 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2023.3.2
|
||
## Enhancements
|
||
* Deprecate observed=False for groupby with categoricals
|
||
(GH#10095) Irina Truong
|
||
* Deprecate axis= for some groupby operations (GH#10094) James
|
||
Bourbeau
|
||
* The axis keyword in DataFrame.rolling/Series.rolling is
|
||
deprecated (GH#10110) Irina Truong
|
||
* DataFrame._data deprecation in pandas (GH#10081) Irina Truong
|
||
* Use importlib_metadata backport to avoid CLI UserWarning
|
||
(GH#10070) Thomas Grainger
|
||
* Port option parsing logic from dask.dataframe.read_parquet to
|
||
to_parquet (GH#9981) Anton Loukianov
|
||
## Bug Fixes
|
||
* Avoid using dd.shuffle in groupby-apply (GH#10043) Richard
|
||
(Rick) Zamora
|
||
* Enable null hive partitions with pyarrow parquet engine
|
||
(GH#10007) Richard (Rick) Zamora
|
||
* Support unknown shapes in *_like functions (GH#10064) Doug
|
||
Davis
|
||
## Maintenance
|
||
* Restore Entrypoints compatibility (GH#10113) Jacob Tomlinson
|
||
* Allow pyarrow build to continue on failures (GH#10097) James
|
||
Bourbeau
|
||
* Fix test_set_index_on_empty with pyarrow strings active
|
||
(GH#10054) Irina Truong
|
||
* Temporarily skip pyarrow_compat tests with pandas 2.0
|
||
(GH#10063) James Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Mar 26 17:13:15 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Add dask-pr10042-parquetstats.patch gh#dask/dask#10042
|
||
- Enable python311 build: numba is not a strict requirement
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Mar 11 22:53:32 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to v2023.3.1
|
||
## Enhancements
|
||
* Support pyarrow strings in MultiIndex (GH#10040) Irina Truong
|
||
* Improved support for pyarrow strings (GH#10000) Irina Truong
|
||
* Fix flaky RuntimeWarning during array reductions (GH#10030)
|
||
James Bourbeau
|
||
* Extend complete extras (GH#10023) James Bourbeau
|
||
* Raise an error with dataframe.convert_string=True and pandas<2.0
|
||
(GH#10033) Irina Truong
|
||
* Rename shuffle/rechunk config option/kwarg to method (GH#10013)
|
||
James Bourbeau
|
||
* Add initial support for converting pandas extension dtypes to
|
||
arrays (GH#10018) James Bourbeau
|
||
* Remove randomgen support (GH#9987) Eray Aslan
|
||
## Bug Fixes
|
||
* Skip rechunk when rechunking to the same chunks with unknown
|
||
sizes (GH#10027) Hendrik Makait
|
||
* Custom utility to convert parquet filters to pyarrow expression
|
||
(GH#9885) Richard (Rick) Zamora
|
||
* Consider numpy scalars and 0d arrays as scalars when padding
|
||
(GH#9653) Justus Magin
|
||
* Fix parquet overwrite behavior after an adaptive read_parquet
|
||
operation (GH#10002) Richard (Rick) Zamora
|
||
## Maintenance
|
||
* Remove stale hive-partitioning code from pyarrow parquet engine
|
||
(GH#10039) Richard (Rick) Zamora
|
||
* Increase minimum supported pyarrow to 7.0 (GH#10024) James
|
||
Bourbeau
|
||
* Revert “Prepare drop packunpack (GH#9994) (GH#10037) Florian
|
||
Jetter
|
||
* Have codecov wait for more builds before reporting (GH#10031)
|
||
James Bourbeau
|
||
* Prepare drop packunpack (GH#9994) Florian Jetter
|
||
* Add CI job with pyarrow strings turned on (GH#10017) James
|
||
Bourbeau
|
||
* Fix test_groupby_dropna_with_agg for pandas 2.0 (GH#10001) Irina
|
||
Truong
|
||
* Fix test_pickle_roundtrip for pandas 2.0 (GH#10011) James
|
||
Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Mar 8 15:20:50 UTC 2023 - Benjamin Greiner <code@bnavigator.de>
|
||
|
||
- Update dependencies
|
||
- Skip one more test failing because of missing pyarrow
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Mar 8 09:37:10 UTC 2023 - Dirk Müller <dmueller@suse.com>
|
||
|
||
- update to 2023.3.0:
|
||
* Bag must not pick p2p as shuffle default (:pr:`10005`)
|
||
* Minor follow-up to P2P by default (:pr:`10008`) `James
|
||
Bourbeau`_
|
||
* Add minimum version to optional ``jinja2`` dependency
|
||
(:pr:`9999`) `Charles Blackmon-Luca`_
|
||
* Enable P2P shuffling by default
|
||
* P2P rechunking
|
||
* Efficient `dataframe.convert_string` support for
|
||
`read_parquet`
|
||
* Allow p2p shuffle kwarg for DataFrame merges
|
||
* Change ``split_row_groups`` default to "infer"
|
||
* Add option for converting string data to use ``pyarrow``
|
||
strings
|
||
* Add support for multi-column ``sort_values``
|
||
* ``Generator`` based random-number generation in``dask.array``
|
||
* Support ``numeric_only`` for simple groupby aggregations for
|
||
``pandas`` 2.0 compatibility
|
||
* Fix profilers plot not being aligned to context manager enter
|
||
time
|
||
* Relax dask.dataframe assert_eq type checks
|
||
* Restore ``describe`` compatibility for ``pandas`` 2.0
|
||
* Improving deploying Dask docs
|
||
* More docs for ``DataFrame.partitions``
|
||
* Update docs with more information on default Delayed
|
||
scheduler
|
||
* Deployment Considerations documentation
|
||
* Temporarily rerun flaky tests
|
||
* Update parsing of FULL_RAPIDS_VER/FULL_UCX_PY_VER
|
||
* Increase minimum supported versions to ``pandas=1.3`` and
|
||
``numpy=1.21``
|
||
* Fix ``std`` to work with ``numeric_only`` for ``pandas`` 2.0
|
||
* Temporarily ``xfail``
|
||
``test_roundtrip_partitioned_pyarrow_dataset`` (:pr:`9977`)
|
||
* Fix copy on write failure in `test_idxmaxmin` (:pr:`9944`)
|
||
* Bump ``pre-commit`` versions (:pr:`9955`) `crusaderky`_
|
||
* Fix ``test_groupby_unaligned_index`` for ``pandas`` 2.0
|
||
* Un-``xfail`` ``test_set_index_overlap_2`` for ``pandas`` 2.0
|
||
* Fix ``test_merge_by_index_patterns`` for ``pandas`` 2.0
|
||
* Bump jacobtomlinson/gha-find-replace from 2 to 3 (:pr:`9953`)
|
||
* Fix ``test_rolling_agg_aggregate`` for ``pandas`` 2.0
|
||
compatibility
|
||
* Bump ``black`` to ``23.1.0``
|
||
* Run GPU tests on python 3.8 & 3.10 (:pr:`9940`)
|
||
* Fix ``test_to_timestamp`` for ``pandas`` 2.0 (:pr:`9932`)
|
||
* Fix an error with ``groupby`` ``value_counts`` for ``pandas``
|
||
2.0 compatibility
|
||
* Config converter: replace all dashes with underscores
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Feb 26 00:08:43 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Prepare test multiflavors for python311, but skip python311
|
||
* Numba is not ready for python 3.11 yet gh#numba/numba#8304
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Feb 17 09:06:25 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2023.2.0
|
||
## Enhancements
|
||
* Update numeric_only default in quantile for pandas 2.0
|
||
(GH#9854) Irina Truong
|
||
* Make repartition a no-op when divisions match (GH#9924) James
|
||
Bourbeau
|
||
* Update datetime_is_numeric behavior in describe for pandas 2.0
|
||
(GH#9868) Irina Truong
|
||
* Update value_counts to return correct name in pandas 2.0
|
||
(GH#9919) Irina Truong
|
||
* Support new axis=None behavior in pandas 2.0 for certain
|
||
reductions (GH#9867) James Bourbeau
|
||
* Filter out all-nan RuntimeWarning at the chunk level for nanmin
|
||
and nanmax (GH#9916) Julia Signell
|
||
* Fix numeric meta_nonempty index creation for pandas 2.0
|
||
(GH#9908) James Bourbeau
|
||
* Fix DataFrame.info() tests for pandas 2.0 (GH#9909) James
|
||
Bourbeau
|
||
## Bug Fixes
|
||
* Fix GroupBy.value_counts handling for multiple groupby columns
|
||
(GH#9905) Charles Blackmon-Luca
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Feb 5 13:29:14 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2023.1.1
|
||
## Enhancements
|
||
* Add to_backend method to Array and _Frame (GH#9758) Richard
|
||
(Rick) Zamora
|
||
* Small fix for timestamp index divisions in pandas 2.0 (GH#9872)
|
||
Irina Truong
|
||
* Add numeric_only to DataFrame.cov and DataFrame.corr (GH#9787)
|
||
James Bourbeau
|
||
* Fixes related to group_keys default change in pandas 2.0
|
||
(GH#9855) Irina Truong
|
||
* infer_datetime_format compatibility for pandas 2.0 (GH#9783)
|
||
James Bourbeau
|
||
## Bug Fixes
|
||
* Fix serialization bug in BroadcastJoinLayer (GH#9871) Richard
|
||
(Rick) Zamora
|
||
* Satisfy broadcast argument in DataFrame.merge (GH#9852) Richard
|
||
(Rick) Zamora
|
||
* Fix pyarrow parquet columns statistics computation (GH#9772)
|
||
aywandji
|
||
## Documentation
|
||
* Fix “duplicate explicit target name” docs warning (GH#9863)
|
||
Chiara Marmo
|
||
* Fix code formatting issue in “Defining a new collection
|
||
backend” docs (GH#9864) Chiara Marmo
|
||
* Update dashboard documentation for memory plot (GH#9768) Jayesh
|
||
Manani
|
||
* Add docs section about no-worker tasks (GH#9839) Florian Jetter
|
||
## Maintenance
|
||
* Additional updates for detecting a distributed scheduler
|
||
(GH#9890) James Bourbeau
|
||
* Update gpuCI RAPIDS_VER to 23.04 (GH#9876)
|
||
* Reverse precedence between collection and distributed default
|
||
(GH#9869) Florian Jetter
|
||
* Update xarray-contrib/issue-from-pytest-log to version 1.2.6
|
||
(GH#9865) James Bourbeau
|
||
* Dont require dask config shuffle default (GH#9826) Florian
|
||
Jetter
|
||
* Un-xfail datetime64 Parquet roundtripping tests for new
|
||
fastparquet (GH#9811) James Bourbeau
|
||
* Add option to manually run upstream CI build (GH#9853) James
|
||
Bourbeau
|
||
* Use custom timeout in CI builds (GH#9844) James Bourbeau
|
||
* Remove kwargs from make_blockwise_graph (GH#9838) Florian
|
||
Jetter
|
||
* Ignore warnings on persist call in
|
||
test_setitem_extended_API_2d_mask (GH#9843) Charles
|
||
Blackmon-Luca
|
||
* Fix running S3 tests locally (GH#9833) James Bourbeau
|
||
- Release 2023.1.0
|
||
## Enhancements
|
||
* Use distributed default clients even if no config is set
|
||
(GH#9808) Florian Jetter
|
||
* Implement ma.where and ma.nonzero (GH#9760) Erik Holmgren
|
||
* Update zarr store creation functions (GH#9790) Ryan Abernathey
|
||
* iteritems compatibility for pandas 2.0 (GH#9785) James Bourbeau
|
||
* Accurate sizeof for pandas string[python] dtype (GH#9781)
|
||
crusaderky
|
||
* Deflate sizeof() of duplicate references to pandas object types
|
||
(GH#9776) crusaderky
|
||
* GroupBy.__getitem__ compatibility for pandas 2.0 (GH#9779)
|
||
James Bourbeau
|
||
* append compatibility for pandas 2.0 (GH#9750) James Bourbeau
|
||
* get_dummies compatibility for pandas 2.0 (GH#9752) James
|
||
Bourbeau
|
||
* is_monotonic compatibility for pandas 2.0 (GH#9751) James
|
||
Bourbeau
|
||
* numpy=1.24 compatability (GH#9777) James Bourbeau
|
||
## Documentation
|
||
* Remove duplicated encoding kwarg in docstring for to_json
|
||
(GH#9796) Sultan Orazbayev
|
||
* Mention SubprocessCluster in LocalCluster documentation
|
||
(GH#9784) Hendrik Makait
|
||
* Move Prometheus docs to dask/distributed (GH#9761) crusaderky
|
||
## Maintenance
|
||
* Temporarily ignore RuntimeWarning in
|
||
test_setitem_extended_API_2d_mask (GH#9828) James Bourbeau
|
||
* Fix flaky test_threaded.py::test_interrupt (GH#9827) Hendrik
|
||
Makait
|
||
* Update xarray-contrib/issue-from-pytest-log in upstream report
|
||
(GH#9822) James Bourbeau
|
||
* pip install dask on gpuCI builds (GH#9816) Charles
|
||
Blackmon-Luca
|
||
* Bump actions/checkout from 3.2.0 to 3.3.0 (GH#9815)
|
||
* Resolve sqlalchemy import failures in mindeps testing (GH#9809)
|
||
Charles Blackmon-Luca
|
||
* Ignore sqlalchemy.exc.RemovedIn20Warning (GH#9801) Thomas
|
||
Grainger
|
||
* xfail datetime64 Parquet roundtripping tests for pandas 2.0
|
||
(GH#9786) James Bourbeau
|
||
* Remove sqlachemy 1.3 compatibility (GH#9695) McToel
|
||
* Reduce size of expected DoK sparse matrix (GH#9775) Elliott
|
||
Sales de Andrade
|
||
* Remove executable flag from dask/dataframe/io/orc/utils.py
|
||
(GH#9774) Elliott Sales de Andrade
|
||
- Drop dask-pr9777-np1.24.patch
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Jan 2 20:44:44 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2022.12.1
|
||
## Enhancements
|
||
* Support dtype_backend="pandas|pyarrow" configuration (GH#9719)
|
||
James Bourbeau
|
||
* Support cupy.ndarray to cudf.DataFrame dispatching in
|
||
dask.dataframe (GH#9579) Richard (Rick) Zamora
|
||
* Make filesystem-backend configurable in read_parquet (GH#9699)
|
||
Richard (Rick) Zamora
|
||
* Serialize all pyarrow extension arrays efficiently (GH#9740)
|
||
James Bourbeau
|
||
## Bug Fixes
|
||
* Fix bug when repartitioning with tz-aware datetime index
|
||
(GH#9741) James Bourbeau
|
||
* Partial functions in aggs may have arguments (GH#9724) Irina
|
||
Truong
|
||
* Add support for simple operation with pyarrow-backed extension
|
||
dtypes (GH#9717) James Bourbeau
|
||
* Rename columns correctly in case of SeriesGroupby (GH#9716)
|
||
Lawrence Mitchell
|
||
## Maintenance
|
||
* Add zarr to Python 3.11 CI environment (GH#9771) James Bourbeau
|
||
* Add support for Python 3.11 (GH#9708) Thomas Grainger
|
||
* Bump actions/checkout from 3.1.0 to 3.2.0 (GH#9753)
|
||
* Avoid np.bool8 deprecation warning (GH#9737) James Bourbeau
|
||
* Make sure dev packages aren’t overwritten in upstream CI build
|
||
(GH#9731) James Bourbeau
|
||
* Avoid adding data.h5 and mydask.html files during tests
|
||
(GH#9726) Thomas Grainger
|
||
- Release 2022.12.0
|
||
## Enhancements
|
||
* Remove statistics-based set_index logic from read_parquet
|
||
(GH#9661) Richard (Rick) Zamora
|
||
* Add support for use_nullable_dtypes to dd.read_parquet
|
||
(GH#9617) Ian Rose
|
||
* Fix map_overlap in order to accept pandas arguments (GH#9571)
|
||
Fabien Aulaire
|
||
* Fix pandas 1.5+ FutureWarning in .str.split(..., expand=True)
|
||
(GH#9704) Jacob Hayes
|
||
* Enable column projection for groupby slicing (GH#9667) Richard
|
||
(Rick) Zamora
|
||
* Support duplicate column cum-functions (GH#9685) Ben
|
||
* Improve error message for failed backend dispatch call
|
||
(GH#9677) Richard (Rick) Zamora
|
||
## Bug Fixes
|
||
* Revise meta creation in arrow parquet engine (GH#9672) Richard
|
||
(Rick) Zamora
|
||
* Fix da.fft.fft for array-like inputs (GH#9688) James Bourbeau
|
||
* Fix groupby -aggregation when grouping on an index by name
|
||
(GH#9646) Richard (Rick) Zamora
|
||
## Maintenance
|
||
* Avoid PytestReturnNotNoneWarning in test_inheriting_class
|
||
(GH#9707) Thomas Grainger
|
||
* Fix flaky test_dataframe_aggregations_multilevel (GH#9701)
|
||
Richard (Rick) Zamora
|
||
* Bump mypy version (GH#9697) crusaderky
|
||
* Disable dashboard in test_map_partitions_df_input (GH#9687)
|
||
James Bourbeau
|
||
* Use latest xarray-contrib/issue-from-pytest-log in upstream
|
||
build (GH#9682) James Bourbeau
|
||
* xfail ttest_1samp for upstream scipy (GH#9670) James Bourbeau
|
||
* Update gpuCI RAPIDS_VER to 23.02 (GH#9678)
|
||
- Add dask-pr9777-np1.24.patch gh#dask/dask#9777
|
||
- Move to PEP517 build
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Nov 21 19:03:11 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Go back to bokeh 2.4
|
||
* gh#dask/dask#9659
|
||
* we provide a legacy bokeh2 instead
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Nov 20 10:01:18 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to version 2022.11.1
|
||
## Enhancements
|
||
* Restrict bokeh=3 support (GH#9673) Gabe Joseph (ignored in rpm,
|
||
fixed by bokek 3.0.2, see gh#dask/dask#9659)
|
||
* Updates for fastparquet evolution (GH#9650) Martin Durant
|
||
## Maintenance
|
||
* Revert importlib.metadata workaround (GH#9658) James Bourbeau
|
||
- Release 2022.11.0
|
||
## Enhancements
|
||
* Generalize from_dict implementation to allow usage from other
|
||
backends (GH#9628) GALI PREM SAGAR
|
||
## Bug Fixes
|
||
* Avoid pandas constructors in dask.dataframe.core (GH#9570)
|
||
Richard (Rick) Zamora
|
||
* Fix sort_values with Timestamp data (GH#9642) James Bourbeau
|
||
* Generalize array checking and remove pd.Index call in
|
||
_get_partitions (GH#9634) Benjamin Zaitlen
|
||
* Fix read_csv behavior for header=0 and names (GH#9614) Richard
|
||
(Rick) Zamora
|
||
## Maintenance
|
||
* Allow bokeh=3 (GH#9659) James Bourbeau
|
||
* Add pre-commit to catch breakpoint() (GH#9638) James Bourbeau
|
||
* Bump xarray-contrib/issue-from-pytest-log from 1.1 to 1.2
|
||
(GH#9635)
|
||
* Remove blosc references (GH#9625) Naty Clementi
|
||
* Harden test_repartition_npartitions (GH#9585) Richard (Rick)
|
||
Zamora
|
||
- Release 2022.10.2
|
||
* This was a hotfix and has no changes in this repository. The
|
||
necessary fix was in dask/distributed, but we decided to bump
|
||
this version number for consistency.
|
||
- Release 2022.10.1
|
||
## Enhancements
|
||
* Enable named aggregation syntax (GH#9563) ChrisJar
|
||
* Add extension dtype support to set_index (GH#9566) James
|
||
Bourbeau
|
||
* Redesigning the array HTML repr for clarity (GH#9519) Shingo
|
||
OKAWA
|
||
## Bug Fixes
|
||
* Fix merge with emtpy left DataFrame (GH#9578) Ian Rose
|
||
## Maintenance
|
||
* Require Click 7.0+ in Dask (GH#9595) John A Kirkham
|
||
* Temporarily restrict bokeh<3 (GH#9607) James Bourbeau
|
||
* Resolve importlib-related failures in upstream CI (GH#9604)
|
||
Charles Blackmon-Luca
|
||
* Remove setuptools host dep, add CLI entrypoint (GH#9600)
|
||
Charles Blackmon-Luca
|
||
* More Backend dispatch class type annotations (GH#9573) Ian Rose
|
||
- Create a -test subpackage in order to avoid rpmlint errors
|
||
- Drop extra conftest: included in sdist.
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Oct 21 13:19:48 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to version 2022.10.0
|
||
* Backend library dispatching for IO in Dask-Array and
|
||
Dask-DataFrame (GH#9475) Richard (Rick) Zamora
|
||
* Add new CLI that is extensible (GH#9283) Doug Davis
|
||
* Groupby median (GH#9516) Ian Rose
|
||
* Fix array copy not being a no-op (GH#9555) David Hoese
|
||
* Add support for string timedelta in map_overlap (GH#9559)
|
||
Nicolas Grandemange
|
||
* Shuffle-based groupby for single functions (GH#9504) Ian Rose
|
||
* Make datetime.datetime tokenize idempotantly (GH#9532) Martin
|
||
Durant
|
||
* Support tokenizing datetime.time (GH#9528) Tim Paine
|
||
* Avoid race condition in lazy dispatch registration (GH#9545)
|
||
James Bourbeau
|
||
* Do not allow setitem to np.nan for int dtype (GH#9531) Doug
|
||
Davis
|
||
* Stable demo column projection (GH#9538) Ian Rose
|
||
* Ensure pickle-able binops in delayed (GH#9540) Ian Rose
|
||
* Fix project CSV columns when selecting (GH#9534) Martin Durant
|
||
* Update Parquet best practice (GH#9537) Matthew Rocklin
|
||
- move -all metapackage to -complete, mirroring upstream's
|
||
[complete] extra.
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Sep 30 23:19:11 UTC 2022 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2022.9.2:
|
||
* Enhancements
|
||
+ Remove factorization logic from array auto chunking (:pr:`9507`)
|
||
`James Bourbeau`_
|
||
* Documentation
|
||
+ Add docs on running Dask in a standalone Python script
|
||
(:pr:`9513`) `James Bourbeau`_
|
||
+ Clarify custom-graph multiprocessing example (:pr:`9511`)
|
||
`nouman`_
|
||
* Maintenance
|
||
+ Groupby sort upstream compatibility (:pr:`9486`) `Ian Rose`_
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Sep 16 19:54:12 UTC 2022 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2022.9.1:
|
||
* New Features
|
||
+ Add "DataFrame" and "Series" "median" methods (:pr:`9483`)
|
||
`James Bourbeau`_
|
||
* Enhancements
|
||
+ Shuffle "groupby" default (:pr:`9453`) `Ian Rose`_
|
||
+ Filter by list (:pr:`9419`) `Greg Hayes`_
|
||
+ Added "distributed.utils.key_split" functionality to
|
||
"dask.utils.key_split" (:pr:`9464`) `Luke Conibear`_
|
||
* Bug Fixes
|
||
+ Fix overlap so that "set_index" doesn't drop rows (:pr:`9423`)
|
||
`Julia Signell`_
|
||
+ Fix assigning pandas "Series" to column when "ddf.columns.min()"
|
||
raises (:pr:`9485`) `Erik Welch`_
|
||
+ Fix metadata comparison "stack_partitions" (:pr:`9481`) `James
|
||
Bourbeau`_
|
||
+ Provide default for "split_out" (:pr:`9493`) `Lawrence
|
||
Mitchell`_
|
||
* Deprecations
|
||
+ Allow "split_out" to be "None", which then defaults to "1" in
|
||
"groupby().aggregate()" (:pr:`9491`) `Ian Rose`_
|
||
* Documentation
|
||
+ Fixing "enforce_metadata" documentation, not checking for dtypes
|
||
(:pr:`9474`) `Nicolas Grandemange`_
|
||
+ Fix "it's" --> "its" typo (:pr:`9484`) `Nat Tabris`_
|
||
* Maintenance
|
||
+ Workaround for parquet writing failure using some datetime
|
||
series but not others (:pr:`9500`) `Ian Rose`_
|
||
+ Filter out "numeric_only" warnings from "pandas" (:pr:`9496`)
|
||
`James Bourbeau`_
|
||
+ Avoid "set_index(..., inplace=True)" where not necessary
|
||
(:pr:`9472`) `James Bourbeau`_
|
||
+ Avoid passing groupby key list of length one (:pr:`9495`) `James
|
||
Bourbeau`_
|
||
+ Update "test_groupby_dropna_cudf" based on "cudf" support for
|
||
"group_keys" (:pr:`9482`) `James Bourbeau`_
|
||
+ Remove "dd.from_bcolz" (:pr:`9479`) `James Bourbeau`_
|
||
+ Added "flake8-bugbear" to "pre-commit" hooks (:pr:`9457`) `Luke
|
||
Conibear`_
|
||
+ Bind loop variables in function definitions ("B023")
|
||
(:pr:`9461`) `Luke Conibear`_
|
||
+ Added assert for comparisons ("B015") (:pr:`9459`) `Luke
|
||
Conibear`_
|
||
+ Set top-level default shell in CI workflows (:pr:`9469`) `James
|
||
Bourbeau`_
|
||
+ Removed unused loop control variables ("B007") (:pr:`9458`)
|
||
`Luke Conibear`_
|
||
+ Replaced "getattr" calls for constant attributes ("B009")
|
||
(:pr:`9460`) `Luke Conibear`_
|
||
+ Pin "libprotobuf" to allow nightly "pyarrow" in the upstream CI
|
||
build (:pr:`9465`) `Joris Van den Bossche`_
|
||
+ Replaced mutable data structures for default arguments ("B006")
|
||
(:pr:`9462`) `Luke Conibear`_
|
||
+ Changed "flake8" mirror and updated version (:pr:`9456`) `Luke
|
||
Conibear`_
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Sep 10 15:15:32 UTC 2022 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2022.9.0:
|
||
* Enhancements
|
||
+ Enable automatic column projection for "groupby" aggregations
|
||
(:pr:`9442`) `Richard (Rick) Zamora`_
|
||
+ Accept superclasses in NEP-13/17 dispatching (:pr:`6710`) `Gabe
|
||
Joseph`_
|
||
* Bug Fixes
|
||
+ Rename "by" columns internally for cumulative operations on the
|
||
same "by" columns (:pr:`9430`) `Pavithra Eswaramoorthy`_
|
||
+ Fix "get_group" with categoricals (:pr:`9436`) `Pavithra
|
||
Eswaramoorthy`_
|
||
+ Fix caching-related "MaterializedLayer.cull" performance
|
||
regression (:pr:`9413`) `Richard (Rick) Zamora`_
|
||
* Documentation
|
||
+ Add maintainer documentation page (:pr:`9309`) `James Bourbeau`_
|
||
* Maintenance
|
||
+ Revert skipped fastparquet test (:pr:`9439`) `Pavithra
|
||
Eswaramoorthy`_
|
||
+ "tmpfile" does not end files with period on empty extension
|
||
(:pr:`9429`) `Hendrik Makait`_
|
||
+ Skip failing fastparquet test with latest release (:pr:`9432`)
|
||
`James Bourbeau`_
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Sep 1 06:57:11 UTC 2022 - Steve Kowalik <steven.kowalik@suse.com>
|
||
|
||
- Update to 2022.8.1:
|
||
* Implement ma.*_like functions (:pr:`9378`) `Ruth Comer`_
|
||
* Fuse compatible annotations (:pr:`9402`) `Ian Rose`_
|
||
* Shuffle-based groupby aggregation for high-cardinality groups (:pr:`9302`)
|
||
`Richard (Rick) Zamora`_
|
||
* Unpack namedtuple (:pr:`9361`) `Hendrik Makait`_
|
||
* Fix SeriesGroupBy cumulative functions with axis=1 (:pr:`9377`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Sparse array reductions (:pr:`9342`) `Ian Rose`_
|
||
* Fix make_meta while using categorical column with index (:pr:`9348`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Don't allow incompatible keywords in DataFrame.dropna (:pr:`9366`)
|
||
`Naty Clementi`_
|
||
* Make set_index handle entirely empty dataframes (:pr:`8896`)
|
||
`Julia Signell`_
|
||
* Improve dataclass handling in unpack_collections (:pr:`9345`)
|
||
`Hendrik Makait`_
|
||
* Fix bag sampling when there are some smaller partitions (:pr:`9349`)
|
||
`Ian Rose`_
|
||
* Add support for empty partitions to da.min/da.max functions (:pr:`9268`)
|
||
`geraninam`_
|
||
* Use entry_points utility in sizeof (:pr:`9390`) `James Bourbeau`_
|
||
* Add entry_points compatibility utility (:pr:`9388`) `Jacob Tomlinson`_
|
||
* Upload environment file artifact for each CI build (:pr:`9372`)
|
||
`James Bourbeau`_
|
||
* Remove werkzeug pin in CI (:pr:`9371`) `James Bourbeau`_
|
||
* Fix type annotations for dd.from_pandas and dd.from_delayed (:pr:`9362`)
|
||
`Jordan Yap`_
|
||
* Ensure make_meta doesn't hold ref to data (:pr:`9354`) `Jim Crist-Harif`_
|
||
* Revise divisions logic in from_pandas (:pr:`9221`) `Richard (Rick) Zamora`_
|
||
* Warn if user sets index with existing index (:pr:`9341`) `Julia Signell`_
|
||
* Add keepdims keyword for da.average (:pr:`9332`) `Ruth Comer`_
|
||
* Change repr methods to avoid Layer materialization (:pr:`9289`)
|
||
`Richard (Rick) Zamora`_
|
||
* Make sure order kwarg will not crash the astype method (:pr:`9317`)
|
||
`Genevieve Buckley`_
|
||
* Fix bug for cumsum on cupy chunked dask arrays (:pr:`9320`)
|
||
`Genevieve Buckley`_
|
||
* Match input and output structure in _sample_reduce (:pr:`9272`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Include meta in array serialization (:pr:`9240`) `Frédéric BRIOL`_
|
||
* Fix Index.memory_usage (:pr:`9290`) `James Bourbeau`_
|
||
* Fix division calculation in dask.dataframe.io.from_dask_array (:pr:`9282`)
|
||
`Jordan Yap`_
|
||
* Switch js-yaml for yaml.js in config converter (:pr:`9306`)
|
||
`Jacob Tomlinson`_
|
||
* Update da.linalg.solve for SciPy 1.9.0 compatibility (:pr:`9350`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Update test_getitem_avoids_large_chunks_missing (:pr:`9347`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Import loop_in_thread fixture in tests (:pr:`9337`) `James Bourbeau`_
|
||
* Temporarily xfail test_solve_sym_pos (:pr:`9336`) `Pavithra Eswaramoorthy`_
|
||
* Update gpuCI RAPIDS_VER to 22.10 (:pr:`9314`)
|
||
* Return Dask array if all axes are squeezed (:pr:`9250`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Make cycle reported by toposort shorter (:pr:`9068`) `Erik Welch`_
|
||
* Unknown chunk slicing - raise informative error (:pr:`9285`)
|
||
`Naty Clementi`_
|
||
* Fix bug in HighLevelGraph.cull (:pr:`9267`) `Richard (Rick) Zamora`_
|
||
* Sort categories (:pr:`9264`) `Pavithra Eswaramoorthy`_
|
||
* Use max (instead of sum) for calculating warnsize (:pr:`9235`)
|
||
`Pavithra Eswaramoorthy`_
|
||
* Fix bug when filtering on partitioned column with pyarrow (:pr:`9252`)
|
||
`Richard (Rick) Zamora`_
|
||
*Add type annotations to dd.from_pandas and dd.from_delayed (:pr:`9237`)
|
||
`Michael Milton`_
|
||
* Update test_plot_multiple for upcoming bokeh release (:pr:`9261`)
|
||
`James Bourbeau`_
|
||
* Add typing to common array properties (:pr:`9255`) `Illviljan`_
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Jul 11 02:47:49 UTC 2022 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2022.7.0:
|
||
* Enhancements
|
||
+ Support "pathlib.PurePath" in "normalize_token" (:pr:`9229`)
|
||
`Angus Hollands`_
|
||
+ Add "AttributeNotImplementedError" for properties so IPython
|
||
glob search works (:pr:`9231`) `Erik Welch`_
|
||
+ "map_overlap": multiple dataframe handling (:pr:`9145`) `Fabien
|
||
Aulaire`_
|
||
+ Read entrypoints in "dask.sizeof" (:pr:`7688`) `Angus Hollands`_
|
||
* Bug Fixes
|
||
+ Fix "TypeError: 'Serialize' object is not subscriptable" when
|
||
writing parquet dataset with "Client(processes=False)"
|
||
(:pr:`9015`) `Lucas Miguel Ponce`_
|
||
+ Correct dtypes when "concat" with an empty dataframe
|
||
(:pr:`9193`) `Pavithra Eswaramoorthy`_
|
||
* Documentation
|
||
+ Highlight note about persist (:pr:`9234`) `Pavithra
|
||
Eswaramoorthy`_
|
||
+ Update release-procedure to include more detail and helpful
|
||
commands (:pr:`9215`) `Julia Signell`_
|
||
+ Better SEO for Futures and Dask vs. Spark pages (:pr:`9217`)
|
||
`Sarah Charlotte Johnson`_
|
||
* Maintenance
|
||
+ Use "math.prod" instead of "np.prod" on lists, tuples, and iters
|
||
(:pr:`9232`) `crusaderky`_
|
||
+ Only import IPython if type checking (:pr:`9230`) `Florian
|
||
Jetter`_
|
||
+ Tougher mypy checks (:pr:`9206`) `crusaderky`_
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Jun 24 20:21:01 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to to 2022.6.1
|
||
* Enhancements
|
||
- Dask in pyodide (GH#9053) Ian Rose
|
||
- Create dask.utils.show_versions (GH#9144) Sultan Orazbayev
|
||
- Better error message for unsupported numpy operations on
|
||
dask.dataframe objects. (GH#9201) Julia Signell
|
||
- Add allow_rechunk kwarg to dask.array.overlap function
|
||
(GH#7776) Genevieve Buckley
|
||
- Add minutes and hours to dask.utils.format_time (GH#9116)
|
||
Matthew Rocklin
|
||
- More retries when writing parquet to remote filesystem
|
||
(GH#9175) Ian Rose
|
||
* Bug Fixes
|
||
- Timedelta deterministic hashing (GH#9213) Fabien Aulaire
|
||
- Enum deterministic hashing (GH#9212) Fabien Aulaire
|
||
- shuffle_group(): avoid converting to arrays (GH#9157) Mads R.
|
||
B. Kristensen
|
||
* Deprecations
|
||
- Deprecate extra format_time utility (GH#9184) James Bourbeau
|
||
- Release 2022.6.0
|
||
* Enhancements
|
||
- Add feature to show names of layer dependencies in HLG
|
||
JupyterLab repr (GH#9081) Angelos Omirolis
|
||
- Add arrow schema extraction dispatch (GH#9169) GALI PREM
|
||
SAGAR
|
||
- Add sort_results argument to assert_eq (GH#9130) Pavithra
|
||
Eswaramoorthy
|
||
- Add weeks to parse_timedelta (GH#9168) Matthew Rocklin
|
||
- Warn that cloudpickle is not always deterministic (GH#9148)
|
||
Pavithra Eswaramoorthy
|
||
- Switch parquet default engine (GH#9140) Jim Crist-Harif
|
||
- Use deterministic hashing with _iLocIndexer / _LocIndexer
|
||
(GH#9108) Fabien Aulaire
|
||
- Enfore consistent schema in to_parquet pyarrow (GH#9131) Jim
|
||
Crist-Harif
|
||
* Bug Fixes
|
||
- Fix pyarrow.StringArray pickle (GH#9170) Jim Crist-Harif
|
||
- Fix parallel metadata collection in pyarrow engine (GH#9165)
|
||
Richard (Rick) Zamora
|
||
- Improve pyarrow partitioning logic (GH#9147) James Bourbeau
|
||
- pyarrow 8.0 partitioning fix (GH#9143) James Bourbeau
|
||
- Release 2022.05.2
|
||
* Enhancements
|
||
- Add a dispatch for non-pandas Grouper objects and use it in
|
||
GroupBy (GH#9074) brandon-b-miller
|
||
- Error if read_parquet & to_parquet files intersect (GH#9124)
|
||
Jim Crist-Harif
|
||
- Visualize task graphs using ipycytoscape (GH#9091) Ian Rose
|
||
- Release 2022.05.1
|
||
* New Features
|
||
- Add DataFrame.from_dict classmethod (GH#9017) Matthew Powers
|
||
- Add from_map function to Dask DataFrame (GH#8911) Richard
|
||
(Rick) Zamora
|
||
* Enhancements
|
||
- Improve to_parquet error for appended divisions overlap
|
||
(GH#9102) Jim Crist-Harif
|
||
- Enabled user-defined process-initializer functions (GH#9087)
|
||
ParticularMiner
|
||
- Mention align_dataframes=False option in map_partitions error
|
||
(GH#9075) Gabe Joseph
|
||
- Add kwarg enforce_ndim to dask.array.map_blocks() (GH#8865)
|
||
ParticularMiner
|
||
- Implement Series.GroupBy.fillna / DataFrame.GroupBy.fillna
|
||
methods (GH#8869) Pavithra Eswaramoorthy
|
||
- Allow fillna with Dask DataFrame (GH#8950) Pavithra
|
||
Eswaramoorthy
|
||
- Update error message for assignment with 1-d dask array
|
||
(GH#9036) Pavithra Eswaramoorthy
|
||
- Collection Protocol (GH#8674) Doug Davis
|
||
- Patch around pandas ArrowStringArray pickling (GH#9024) Jim
|
||
Crist-Harif
|
||
- Band-aid for compute_as_if_collection (GH#8998) Ian Rose
|
||
- Add p2p shuffle option (GH#8836) Matthew Rocklin
|
||
* Bug Fixes
|
||
- Fixup column projection with no columns (GH#9106) Jim
|
||
Crist-Harif
|
||
- Blockwise cull NumPy dtype (GH#9100) Ian Rose
|
||
- Fix column-projection bug in from_map (GH#9078) Richard
|
||
(Rick) Zamora
|
||
- Prevent nulls in index for non-numeric dtypes (GH#8963) Jorge
|
||
López
|
||
- Fix is_monotonic methods for more than 8 partitions (GH#9019)
|
||
Julia Signell
|
||
- Handle enumerate and generator inputs to from_map (GH#9066)
|
||
Richard (Rick) Zamora
|
||
- Revert is_dask_collection; back to previous implementation
|
||
(GH#9062) Doug Davis
|
||
- Fix Blockwise.clone does not handle iterable literal
|
||
arguments correctly (GH#8979) JSKenyon
|
||
- Array setitem hardmask (GH#9027) David Hassell
|
||
- Fix overlapping divisions error on append (GH#8997) Ian Rose
|
||
* Deprecations
|
||
- Add pre-deprecation warnings for read_parquet kwargs
|
||
chunksize and aggregate_files (GH#9052) Richard (Rick) Zamora
|
||
- Release 2022.05.0
|
||
* This is a bugfix release with doc changes only
|
||
- Release 2022.04.2
|
||
* This release includes several deprecations/breaking API changes
|
||
to dask.dataframe.read_parquet and dask.dataframe.to_parquet:
|
||
- to_parquet no longer writes _metadata files by default. If
|
||
you want to write a _metadata file, you can pass in
|
||
write_metadata_file=True.
|
||
- read_parquet now defaults to split_row_groups=False, which
|
||
results in one Dask dataframe partition per parquet file when
|
||
reading in a parquet dataset. If you’re working with large
|
||
parquet files you may need to set split_row_groups=True to
|
||
reduce your partition size.
|
||
- read_parquet no longer calculates divisions by default. If
|
||
you require read_parquet to return dataframes with known
|
||
divisions, please set calculate_divisions=True.
|
||
- read_parquet has deprecated the gather_statistics keyword
|
||
argument. Please use the calculate_divisions keyword argument
|
||
instead.
|
||
- read_parquet has deprecated the require_extensions keyword
|
||
argument. Please use the parquet_file_extension keyword
|
||
argument instead.
|
||
* New Features
|
||
- Add removeprefix and removesuffix as StringMethods (GH#8912)
|
||
Jorge López
|
||
* Enhancements
|
||
- Call fs.invalidate_cache in to_parquet (GH#8994) Jim
|
||
Crist-Harif
|
||
- Change to_parquet default to write_metadata_file=None
|
||
(GH#8988) Jim Crist-Harif
|
||
- Let arg reductions pass keepdims (GH#8926) Julia Signell
|
||
- Change split_row_groups default to False in read_parquet
|
||
(GH#8981) Richard (Rick) Zamora
|
||
- Improve NotImplementedError message for da.reshape (GH#8987)
|
||
Jim Crist-Harif
|
||
- Simplify to_parquet compute path (GH#8982) Jim Crist-Harif
|
||
- Raise an error if you try to use vindex with a Dask object
|
||
(GH#8945) Julia Signell
|
||
- Avoid pre_buffer=True when a precache method is specified
|
||
(GH#8957) Richard (Rick) Zamora
|
||
- from_dask_array uses blockwise instead of merging graphs
|
||
(GH#8889) Bryan Weber
|
||
- Use pre_buffer=True for “pyarrow” Parquet engine (GH#8952)
|
||
Richard (Rick) Zamora
|
||
* Bug Fixes
|
||
- Handle dtype=None correctly in da.full (GH#8954) Tom White
|
||
- Fix dask-sql bug caused by blockwise fusion (GH#8989) Richard
|
||
(Rick) Zamora
|
||
- to_parquet errors for non-string column names (GH#8990) Jim
|
||
Crist-Harif
|
||
- Make sure da.roll works even if shape is 0 (GH#8925) Julia
|
||
Signell
|
||
- Fix recursion error issue with set_index (GH#8967) Paul
|
||
Hobson
|
||
- Stringify BlockwiseDepDict mapping values when
|
||
produces_keys=True (GH#8972) Richard (Rick) Zamora
|
||
- Use DataFram`eIOLayer in DataFrame.from_delayed (GH#8852)
|
||
Richard (Rick) Zamora
|
||
- Check that values for the in predicate in read_parquet are
|
||
correct (GH#8846) Bryan Weber
|
||
- Fix bug for reduction of zero dimensional arrays (GH#8930)
|
||
Tom White
|
||
- Specify dtype when deciding division using np.linspace in
|
||
read_sql_query (GH#8940) Cheun Hong
|
||
* Deprecations
|
||
- Deprecate gather_statistics from read_parquet (GH#8992)
|
||
Richard (Rick) Zamora
|
||
- Change require_extension to top-level parquet_file_extension
|
||
read_parquet kwarg (GH#8935) Richard (Rick) Zamora
|
||
- Release 2022.04.1
|
||
* New Features
|
||
- Add missing NumPy ufuncs: abs, left_shift, right_shift,
|
||
positive. (GH#8920) Tom White
|
||
* Enhancements
|
||
- Avoid collecting parquet metadata in pyarrow when
|
||
write_metadata_file=False (GH#8906) Richard (Rick) Zamora
|
||
- Better error for failed wildcard path in dd.read_csv() (fixes
|
||
#8878) (GH#8908) Roger Filmyer
|
||
- Return da.Array rather than dd.Series for non-ufunc
|
||
elementwise functions on dd.Series (GH#8558) Julia Signell
|
||
- Let get_dummies use meta computation in map_partitions
|
||
(GH#8898) Julia Signell
|
||
- Masked scalars input to da.from_array (GH#8895) David Hassell
|
||
- Raise ValueError in merge_asof for duplicate kwargs (GH#8861)
|
||
Bryan Weber
|
||
* Bug Fixes
|
||
- Make is_monotonic work when some partitions are empty
|
||
(GH#8897) Julia Signell
|
||
- Fix custom getter in da.from_array when inline_array=False
|
||
(GH#8903) Ian Rose
|
||
- Correctly handle dict-specification for rechunk. (GH#8859)
|
||
Richard
|
||
- Fix merge_asof: drop index column if left_on == right_on
|
||
(GH#8874) Gil Forsyth
|
||
* Deprecations
|
||
- Warn users that engine='auto' will change in future (GH#8907)
|
||
Jim Crist-Harif
|
||
- Remove pyarrow-legacy engine from parquet API (GH#8835)
|
||
Richard (Rick) Zamora
|
||
- Release 2022.04.0
|
||
* This is the first release with support for Python 3.10
|
||
* New Features
|
||
- Add Python 3.10 support (GH#8566) James Bourbeau
|
||
* Enhancements
|
||
- Add check on dtype.itemsize in order to produce a useful
|
||
error (GH#8860) Davide Gavio
|
||
- Add mild typing to common utils functions (GH#8848) Matthew
|
||
Rocklin
|
||
- Add sanity checks to divisions setter (GH#8806) Jim
|
||
Crist-Harif
|
||
- Use Blockwise and map_partitions for more tasks (GH#8831)
|
||
Bryan Weber
|
||
* Bug Fixes
|
||
- Fix dataframe.merge_asof to preserve right_on column
|
||
(GH#8857) Sarah Charlotte Johnson
|
||
- Fix “Buffer dtype mismatch” for pandas >= 1.3 on 32bit
|
||
(GH#8851) Ben Greiner
|
||
- Fix slicing fusion by altering SubgraphCallable getter
|
||
(GH#8827) Ian Rose
|
||
* Deprecations
|
||
- Remove support for PyPy (GH#8863) James Bourbeau
|
||
- Drop setuptools at runtime (GH#8855) crusaderky
|
||
- Remove dataframe.tseries.resample.getnanos (GH#8834) Sarah
|
||
Charlotte Johnson
|
||
- Drop dask-fix8169-pandas13.patch and dask-py310-test.patch
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Mar 27 19:18:19 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- dask.dataframe requires dask.bag (revealed by swifter test suite)
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Mar 25 19:02:53 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2022.3.0
|
||
* Bag: add implementation for reservoir sampling
|
||
* Add ma.count to Dask array
|
||
* Change to_parquet default to compression="snappy"
|
||
* Add weights parameter to dask.array.reduction
|
||
* Add ddf.compute_current_divisions to get divisions on a sorted
|
||
index or column
|
||
* Pass __name__ and __doc__ through on DelayedLeaf
|
||
* Raise exception for not implemented merge how option
|
||
* Move Bag.map_partitions to Blockwise
|
||
* Improve error messages for malformed config files
|
||
* Revise column-projection optimization to capture common
|
||
dask-sql patterns
|
||
* Useful error for empty divisions
|
||
* Scipy 1.8.0 compat: copy private classes into
|
||
dask/array/stats.py
|
||
- Release 2022.2.1
|
||
* Add aggregate functions first and last to
|
||
dask.dataframe.pivot_table
|
||
* Add std() support for datetime64 dtype for pandas-like objects
|
||
* Add materialized task counts to HighLevelGraph and Layer html
|
||
reprs
|
||
* Do not allow iterating a DataFrameGroupBy
|
||
* Fix missing newline after info() call on empty DataFrame
|
||
* Add groupby.compute as a not implemented method
|
||
* Improve multi dataframe join performance
|
||
* Include bool type for Index
|
||
* Allow ArrowDatasetEngine subclass to override pandas->arrow
|
||
conversion also for partitioned write
|
||
* Increase performance of k-diagonal extraction in da.diag() and
|
||
da.diagonal()
|
||
* Change linspace creation to match numpy when num equal to 0
|
||
* Tokenize dataclasses
|
||
* Update tokenize to treat dict and kwargs differently
|
||
- Release 2022.2.0
|
||
* Add region to to_zarr when using existing array
|
||
* Add engine_kwargs support to dask.dataframe.to_sql
|
||
* Add include_path_column arg to read_json
|
||
* Add expand_dims to Dask array
|
||
* Add scheduler option to assert_eq utilities
|
||
* Fix eye inconsistency with NumPy for dtype=None
|
||
* Fix concatenate inconsistency with NumPy for axis=None
|
||
* Type annotations, part 1
|
||
* Really allow any iterable to be passed as a meta
|
||
* Use map_partitions (Blockwise) in to_parquet
|
||
- Update dask-fix8169-pandas13.patch
|
||
- Add dask-py310-test.patch -- gh#dask/dask#8566
|
||
- Make the distributed/dask update sync requirement even more
|
||
obvious.
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Jan 29 17:35:38 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2022.1.1
|
||
* Add dask.dataframe.series.view()
|
||
* Update tz for fastparquet + pandas 1.4.0
|
||
* Cleaning up misc tests for pandas compat
|
||
* Moving to SQLAlchemy >= 1.4
|
||
* Pandas compat: Filter sparse warnings
|
||
* Fail if meta is not a pandas object
|
||
* Use fsspec.parquet module for better remote-storage
|
||
read_parquet performance
|
||
* Move DataFrame ACA aggregations to HLG
|
||
* Add optional information about originating function call in
|
||
DataFrameIOLayer
|
||
* Blockwise array creation redux
|
||
* Refactor config default search path retrieval
|
||
* Add optimize_graph flag to Bag.to_dataframe function
|
||
* Make sure that delayed output operations still return lists of
|
||
paths
|
||
* Pandas compat: Fix to_frame name to not pass None
|
||
* Pandas compat: Fix axis=None warning
|
||
* Expand Dask YAML config search directories
|
||
* Fix groupby.cumsum with series grouped by index
|
||
* Fix derived_from for pandas methods
|
||
* Enforce boolean ascending for sort_values
|
||
* Fix parsing of __setitem__ indices
|
||
* Avoid divide by zero in slicing
|
||
* Downgrade meta error in
|
||
* Pandas compat: Deprecate append when pandas >= 1.4.0
|
||
* Replace outdated columns argument with meta in DataFrame
|
||
constructor
|
||
* Refactor deploying docs
|
||
* Pin coverage in CI
|
||
* Move cached_cumsum imports to be from dask.utils
|
||
* Update gpuCI RAPIDS_VER to 22.04
|
||
* Update cocstring for from_delayed function
|
||
* Handle plot_width / plot_height deprecations
|
||
* Remove unnecessary pyyaml importorskip
|
||
* Specify scheduler in DataFrame assert_eq
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Jan 25 22:07:53 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Revert python310 enablement -- gh#dask/distributed#5460
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Jan 25 09:35:17 UTC 2022 - Dirk Müller <dmueller@suse.com>
|
||
|
||
- reenable python 3.10 build as distributed is also reenabled
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Jan 20 16:23:05 UTC 2022 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2022.1.0
|
||
* Add groupby.shift method (GH#8522) kori73
|
||
* Add DataFrame.nunique (GH#8479) Sarah Charlotte Johnson
|
||
* Add da.ndim to match np.ndim (GH#8502) Julia Signell
|
||
* Replace interpolation with method and method with
|
||
internal_method (GH#8525) Julia Signell
|
||
* Remove daily stock demo utility (GH#8477) James Bourbeau
|
||
* Add Series and Index is_monotonic* methods (GH#8304) Daniel
|
||
Mesejo-León
|
||
* Deprecate token keyword argument to map_blocks (GH#8464) James
|
||
Bourbeau
|
||
* Deprecation warning for default value of boundary kwarg in
|
||
map_overlap (GH#8397) Genevieve Buckley
|
||
- Skip python310: Not supported by distributed yet
|
||
-- gh#dask/distributed#5350
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Sep 22 12:50:07 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2021.09.1
|
||
* Fix groupby for future pandas
|
||
* Remove warning filters in tests that are no longer needed
|
||
* Add link to diagnostic visualize function in local diagnostic
|
||
docs
|
||
* Add datetime_is_numeric to dataframe.describe
|
||
* Remove references to pd.Int64Index in anticipation of
|
||
deprecation
|
||
* Use loc if needed for series __get_item__
|
||
* Specifically ignore warnings on mean for empty slices
|
||
* Skip groupby nunique test for pandas >= 1.3.3
|
||
* Implement ascending arg for sort_values
|
||
* Replace operator.getitem
|
||
* Deprecate zero_broadcast_dimensions and homogeneous_deepmap
|
||
* Add error if drop_index is negative
|
||
* Allow scheduler to be an Executor
|
||
* Handle asarray/asanyarray cases where like is a dask.Array
|
||
* Fix index_col duplication if index_col is type str
|
||
* Add dtype and order to asarray and asanyarray definitions
|
||
* Deprecate dask.dataframe.Series.__contains__
|
||
* Fix edge case with like-arrays in _wrapped_qr
|
||
* Deprecate boundary_slice kwarg: kind for pandas compat
|
||
- Release 2021.09.0
|
||
* Fewer open files
|
||
* Add FileNotFound to expected http errors
|
||
* Add DataFrame.sort_values to API docs
|
||
* Change to dask.order: be more eager at times
|
||
* Add pytest color to CI
|
||
* FIX: make_people works with processes scheduler
|
||
* Adds deep param to Dataframe copy method and restrict it to
|
||
False
|
||
* Fix typo in configuration docs
|
||
* Update formatting in DataFrame.query docstring
|
||
* Un-xfail sparse tests for 0.13.0 release
|
||
* Add axes property to DataFrame and Series
|
||
* Add CuPy support in da.unique (values only)
|
||
* Unit tests for sparse.zeros_like (xfailed)
|
||
* Add explicit like kwarg support to array creation functions
|
||
* Separate Array and DataFrame mindeps builds
|
||
* Fork out percentile_dispatch to dask.array
|
||
* Ensure filepath exists in to_parquet
|
||
* Update scheduler plugin usage in
|
||
test_scheduler_highlevel_graph_unpack_import
|
||
* Add DataFrame.shuffle to API docs
|
||
* Order requirements alphabetically
|
||
- Release 2021.08.1
|
||
* Add ignore_metadata_file option to read_parquet
|
||
(pyarrow-dataset and fastparquet support only)
|
||
* Add reference to pytest-xdist in dev docs
|
||
* Include tz in meta from to_datetime
|
||
* CI Infra Docs
|
||
* Include invalid DataFrame key in assert_eq check
|
||
* Use __class__ when creating DataFrames
|
||
* Use development version of distributed in gpuCI build
|
||
* Ignore whitespace when gufunc signature
|
||
* Move pandas import and percentile dispatch refactor
|
||
* Add colors to represent high level layer types
|
||
* Upstream instance fix
|
||
* Add dask.widgets and migrate HTML reprs to jinja2
|
||
* Remove wrap_func_like_safe, not required with
|
||
NumPy >= 1.17
|
||
* Fix threaded scheduler memory backpressure regression
|
||
* Add percentile dispatch
|
||
* Use a publicly documented attribute obj in groupby rather than
|
||
private _selected_obj
|
||
* Specify module to import rechunk from
|
||
* Use dict to store data for {nan,}arg{min,max} in certain cases
|
||
* Fix blocksize description formatting in read_pandas
|
||
* Fix "point" -> "pointers" typo in docs
|
||
- Release 2021.08.0
|
||
* Fix to_orc delayed compute behavior
|
||
* Don't convert to low-level task graph in
|
||
compute_as_if_collection
|
||
* Fix multifile read for hdf
|
||
* Resolve warning in distributed tests
|
||
* Update to_orc collection name
|
||
* Resolve skipfooter problem
|
||
* Raise NotImplementedError for non-indexable arg passed to
|
||
to_datetime
|
||
* Ensure we error on warnings from distributed
|
||
* Added dict format in to_bag accessories of DataFrame
|
||
* Delayed docs indirect dependencies
|
||
* Add tooltips to graphviz high-level graphs
|
||
* Close 2021 User Survey
|
||
* Reorganize CuPy tests into multiple files
|
||
* Refactor and Expand Dask-Dataframe ORC API
|
||
* Don't enforce columns if enforce=False
|
||
* Fix map_overlap trimming behavior when drop_axis is not None
|
||
* Mark gpuCI CuPy test as flaky
|
||
* Avoid using Delayed in to_csv and to_parquet
|
||
* Removed redundant check_dtypes
|
||
* Use pytest.warns instead of raises for checking parquet engine
|
||
deprecation
|
||
* Bump RAPIDS_VER in gpuCI to 21.10
|
||
* Add back pyarrow-legacy test coverage for pyarrow>=5
|
||
* Allow pyarrow>=5 in to_parquet and read_parquet
|
||
* Skip CuPy tests requiring NEP-35 when NumPy < 1.20 is available
|
||
* Add tail and head to SeriesGroupby
|
||
* Update Zoom link for monthly meeting
|
||
* Add gpuCI build script
|
||
* Deprecate daily_stock utility
|
||
* Add distributed.nanny to configuration reference docs
|
||
* Require NumPy 1.18+ & Pandas 1.0+
|
||
- Add dask-fix8169-pandas13.patch -- gh#dask/dask#8169
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Aug 8 14:42:17 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2021.7.2
|
||
* This is the last release with support for NumPy 1.17 and pandas
|
||
0.25. Beginning with the next release, NumPy 1.18 and pandas
|
||
1.0 will be the minimum supported versions.
|
||
* Add dask.array SVG to the HTML Repr
|
||
* Avoid use of Delayed in to_parquet
|
||
* Temporarily pin pyarrow<5 in CI
|
||
* Add deprecation warning for top-level ucx and rmm config values
|
||
* Remove skips from doctests (4 of 6)
|
||
* Remove skips from doctests (5 of 6)
|
||
* Adds missing prepend/append functionality to da.diff
|
||
* Change graphviz font family to sans
|
||
* Fix read-csv name - when path is different, use different name
|
||
for task
|
||
* Update configuration reference for ucx and rmm changes
|
||
* Add meta support to __setitem__
|
||
* NEP-35 support for slice_with_int_dask_array
|
||
* Unpin fastparquet in CI
|
||
* Remove skips from doctests (3 of 6)
|
||
- Release 2021.7.1
|
||
* Make array assert_eq check dtype
|
||
* Remove skips from doctests (6 of 6)
|
||
* Remove experimental feature warning from actors docs
|
||
* Remove skips from doctests (2 of 6)
|
||
* Separate out Array and Bag API
|
||
* Implement lazy Array.__iter__
|
||
* Clean up places where we inadvertently iterate over arrays
|
||
* Add numeric_only kwarg to DataFrame reductions
|
||
* Add pytest marker for GPU tests
|
||
* Add support for histogram2d in dask.array
|
||
* Remove skips from doctests (1 of 6)
|
||
* Add node size scaling to the Graphviz output for the high
|
||
level graphs
|
||
* Update old Bokeh links
|
||
* Temporarily pin fastparquet in CI
|
||
* Add dask.array import to progress bar docs
|
||
* Use separate files for each DataFrame API function and method
|
||
* Fix pyarrow-dataset ordering bug
|
||
* Generalize unique aggregate
|
||
* Raise NotImplementedError when using pd.Grouper
|
||
* Add aggregate_files argument to enable multi-file partitions in
|
||
read_parquet
|
||
* Un-xfail test_daily_stock
|
||
* Update access configuration docs
|
||
* Use packaging for version comparisons
|
||
* Handle infinite loops in merge_asof
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Jul 16 09:25:39 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2021.07.0
|
||
* Include fastparquet in upstream CI build
|
||
* Blockwise: handle non-string constant dependencies
|
||
* fastparquet now supports new time types, including ns precision
|
||
* Avoid ParquetDataset API when appending in ArrowDatasetEngine
|
||
* Add retry logic to test_shuffle_priority
|
||
* Use strict channel priority in CI
|
||
* Support nested dask.distributed imports
|
||
* Should check module name only, not the entire directory
|
||
filepath
|
||
* Updates due to https://github.com/dask/fastparquet/pull/623
|
||
* da.eye fix for chunks=-1
|
||
* Temporarily xfail test_daily_stock
|
||
* Set priority annotations in SimpleShuffleLayer
|
||
* Blockwise: stringify constant key inputs
|
||
* Allow mixing dask and numpy arrays in @guvectorize
|
||
* Don't sample dict result of a shuffle group when calculating
|
||
its size
|
||
* Fix scipy tests
|
||
* Deterministically tokenize datetime.date
|
||
* Add sample_rows to read_csv-like
|
||
* Fix typo in config.deserialize docstring
|
||
* Remove warning filter in test_dataframe_picklable
|
||
* Improvements to histogramdd
|
||
* Make PY_VERSION private
|
||
- Release 2021.06.2
|
||
* layers.py compare parts_out with set(self.parts_out)
|
||
* Make check_meta understand pandas dtypes better
|
||
* Remove "Educational Resources" doc page
|
||
* - Release 2021.06.1
|
||
* Replace funding page with 'Supported By' section on dask.org
|
||
* Add initial deprecation utilities
|
||
* Enforce dtype conservation in ufuncs that explicitly use dtype=
|
||
* Add Coiled to list of paid support organizations
|
||
* Small tweaks to the HTML repr for Layer & HighLevelGraph
|
||
* Add dark mode support to HLG HTML repr
|
||
* Remove compatibility entries for old distributed
|
||
* Implementation of HTML repr for HighLevelGraph layers
|
||
* Update default blockwise token to avoid DataFrame column name
|
||
clash
|
||
* Use dispatch concat for merge_asof
|
||
* Fix upstream freq tests
|
||
* Use more context managers from the standard library
|
||
* Simplify skips in parquet tests
|
||
* Remove check for outdated bokeh
|
||
* More test coverage uploads
|
||
* Remove ImportError catching from dask/__init__.py
|
||
* Allow DataFrame.join() to take a list of DataFrames to merge
|
||
with
|
||
* Fix maximum recursion depth exception in dask.array.linspace
|
||
* Fix docs links
|
||
* Initial da.select() implementation and test
|
||
* Layers must implement get_output_keys method
|
||
* Don't include or expect freq in divisions
|
||
* A HighLevelGraph abstract layer for map_overlap
|
||
* Always include kwarg name in drop
|
||
* Only rechunk for median if needed
|
||
* Add add_(prefix|suffix) to DataFrame and Series
|
||
* Move read_hdf to Blockwise
|
||
* Make Layer.get_output_keys officially an abstract method
|
||
* Non-dask-arrays and broadcasting in ravel_multi_index
|
||
* Fix for paths ending with "/" in parquet overwrite
|
||
* Fixing calling .visualize() with filename=None
|
||
* Generate unique names for SubgraphCallable
|
||
* Pin fsspec to 2021.5.0 in CI
|
||
* Evaluate graph lazily if meta is provided in from_delayed
|
||
* Add meta support for DatetimeTZDtype
|
||
* Add dispatch label to automatic PR labeler
|
||
* Fix HDFS tests
|
||
- Release 2021.06.0
|
||
* Remove abstract tokens from graph keys in rewrite_blockwise
|
||
* Ensure correct column order in csv project_columns
|
||
* Renamed inner loop variables to avoid duplication
|
||
* Do not return delayed object from to_zarr
|
||
* Array: correct number of outputs in apply_gufunc
|
||
* Rewrite da.fromfunction with da.blockwise
|
||
* Rename make_meta_util to make_meta
|
||
* Repartition before shuffle if the requested partitions are
|
||
less than input partitions
|
||
* Blockwise: handle constant key inputs
|
||
* Added raise to apply_gufunc
|
||
* Show failing tests summary in CI
|
||
* sizeof sets in Python 3.9
|
||
* Warn if using pandas datetimelike string in
|
||
dataframe.__getitem__
|
||
* Highlight the client.dashboard_link
|
||
* Easier link for subscribing to the Google calendar
|
||
* Automatically show graph visualization in Jupyter notebooks
|
||
* Add autofunction for unify_chunks in API docs
|
||
- Release 2021.05.1
|
||
* Pandas compatibility
|
||
* Fix optimize_dataframe_getitem bug
|
||
* Update make_meta import in docs
|
||
* Implement da.searchsorted
|
||
* Fix format string in error message
|
||
* Fix read_sql_table returning wrong result for single column
|
||
loads
|
||
* Add slack join link in support.rst
|
||
* Remove unused alphabet variable
|
||
* Fix meta creation incase of object
|
||
* Add dispatch for union_categoricals
|
||
* Consolidate array Dispatch objects
|
||
* Move DataFrame dispatch.registers to their own file
|
||
* Fix delayed with dataclasses where init=False
|
||
* Allow a column to be named divisions
|
||
* Stack nd array with unknown chunks
|
||
* Promote the 2021 Dask User Survey
|
||
* Fix typo in DataFrame.set_index()
|
||
* Cleanup array API reference links
|
||
* Accept axis tuple for flip to be consistent with NumPy
|
||
* Bump pre-commit hook versions
|
||
* Cleanup to_zarr docstring
|
||
* Fix the docstring of read_orc
|
||
* Doc ipyparallel & mpi4py concurrent.futures
|
||
* Update tests to support CuPy 9
|
||
* Fix some HighLevelGraph documentation inaccuracies
|
||
* Fix spelling in Series getitem error message
|
||
|
||
-------------------------------------------------------------------
|
||
Tue May 18 10:12:19 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- update to version 2021.5.0
|
||
* Remove deprecated kind kwarg to comply with pandas 1.3.0
|
||
(GH#7653) Julia Signell
|
||
* Fix bug in DataFrame column projection (GH#7645) Richard (Rick)
|
||
Zamora
|
||
* Merge global annotations when packing (GH#7565) Mads R. B.
|
||
Kristensen
|
||
* Avoid inplace= in pandas set_categories (GH#7633) James
|
||
Bourbeau
|
||
* Change the active-fusion default to False for Dask-Dataframe
|
||
(GH#7620) Richard (Rick) Zamora
|
||
* Array: remove extraneous code from RandomState (GH#7487) Gabe
|
||
Joseph
|
||
* Implement str.concat when others=None (GH#7623) Daniel
|
||
Mesejo-León
|
||
* Fix dask.dataframe in sandboxed environments (GH#7601) Noah D.
|
||
Brenowitz
|
||
* Support for cupyx.scipy.linalg (GH#7563) Benjamin Zaitlen
|
||
* Move timeseries and daily-stock to Blockwise (GH#7615) Richard
|
||
(Rick) Zamora
|
||
* Fix bugs in broadcast join (GH#7617) Richard (Rick) Zamora
|
||
* Use Blockwise for DataFrame IO (parquet, csv, and orc)
|
||
(GH#7415) Richard (Rick) Zamora
|
||
* Adding chunk & type information to Dask HighLevelGraph s
|
||
(GH#7309) Genevieve Buckley
|
||
* Add pyarrow sphinx intersphinx_mapping (GH#7612) Ray Bell
|
||
* Remove skip on test freq (GH#7608) Julia Signell
|
||
* Defaults in read_parquet parameters (GH#7567) Ray Bell
|
||
* Remove ignore_abc_warning (GH#7606) Julia Signell
|
||
* Harden DataFrame merge between column-selection and index
|
||
(GH#7575) Richard (Rick) Zamora
|
||
* Get rid of ignore_abc decorator (GH#7604) Julia Signell
|
||
* Remove kwarg validation for bokeh (GH#7597) Julia Signell
|
||
* Add loky example (GH#7590) Naty Clementi
|
||
* Delayed: nout when arguments become tasks (GH#7593) Gabe Joseph
|
||
* Update distributed version in mindep CI build (GH#7602) James
|
||
Bourbeau
|
||
* Support all or no overlap between partition columns and real
|
||
columns (GH#7541) Richard (Rick) Zamora
|
||
- Stress that python-distributed, if used, has to have a matching
|
||
version number. Always update at the same time.
|
||
|
||
-------------------------------------------------------------------
|
||
Mon May 3 01:34:23 UTC 2021 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2021.4.1:
|
||
* Handle Blockwise HLG pack/unpack for concatenate=True (:pr:`7455`)
|
||
Richard (Rick) Zamora
|
||
* map_partitions: use tokenized info as name of the SubgraphCallable
|
||
(:pr:`7524`) Mads R. B. Kristensen
|
||
* Using tmp_path and tmpdir to avoid temporary files and directories
|
||
hanging in the repo (:pr:`7592`) Naty Clementi
|
||
* Contributing to docs (development guide) (:pr:`7591`) Naty
|
||
Clementi
|
||
* Add more packages to Python 3.9 CI build (:pr:`7588`) James
|
||
Bourbeau
|
||
* Array: Fix NEP-18 dispatching in finalize (:pr:`7508`) Gabe Joseph
|
||
* Misc fixes for numpydoc (:pr:`7569`) Matthias Bussonnier
|
||
* Avoid pandas level= keyword deprecation (:pr:`7577`) James
|
||
Bourbeau
|
||
* Map e.g. .repartition(freq="M") to .repartition(freq="MS")
|
||
(:pr:`7504`) Ruben van de Geer
|
||
* Remove hash seeding in parallel CI runs (:pr:`7128`) Elliott Sales
|
||
de Andrade
|
||
* Add defaults in parameters in to_parquet (:pr:`7564`) Ray Bell
|
||
* Simplify transpose axes cleanup (:pr:`7561`) Julia Signell
|
||
* Make ValueError in len(index_names) > 1 explicit it's using
|
||
fastparquet (:pr:`7556`) Ray Bell
|
||
* Fix dict-column appending for pyarrow parquet engines (:pr:`7527`)
|
||
Richard (Rick) Zamora
|
||
* Add a documentation auto label (:pr:`7560`) Doug Davis
|
||
* Add dask.delayed.Delayed to docs so it can be referenced by other
|
||
sphinx docs (:pr:`7559`) Doug Davis
|
||
* Fix upstream idxmaxmin for uneven split_every (:pr:`7538`) Julia
|
||
Signell
|
||
* Make normalize_token for pandas Series/DataFrame future proof (no
|
||
direct block access) (:pr:`7318`) Joris Van den Bossche
|
||
* Redesigned __setitem__ implementation (:pr:`7393`) David Hassell
|
||
* histogram, histogramdd improvements (docs; return consistencies)
|
||
(:pr:`7520`) Doug Davis
|
||
* Force nightly pyarrow in the upstream build (:pr:`7530`) Joris Van
|
||
den Bossche
|
||
* Fix Configuration Reference (:pr:`7533`) Benjamin Zaitlen
|
||
* Use .to_parquet on dask.dataframe in doc string (:pr:`7528`) Ray
|
||
Bell
|
||
* Avoid double msgpack serialization of HLGs (:pr:`7525`) Mads
|
||
R. B. Kristensen
|
||
* Encourage usage of yaml.safe_load() in configuration doc
|
||
(:pr:`7529`) Hristo Georgiev
|
||
* Fix reshape bug. Add relevant test. Fixes #7171. (:pr:`7523`)
|
||
JSKenyon
|
||
* Support custom_metadata= argument in to_parquet (:pr:`7359`)
|
||
Richard (Rick) Zamora
|
||
* Clean some documentation warnings (:pr:`7518`) Daniel Mesejo-León
|
||
* Getting rid of more docs warnings (:pr:`7426`) Julia Signell
|
||
* Added product (alias of prod) (:pr:`7517`) Freyam Mehta
|
||
* Fix upstream __array_ufunc__ tests (:pr:`7494`) Julia Signell
|
||
* Escape from map_overlap to map_blocks if depth is zero
|
||
(:pr:`7481`) Genevieve Buckley
|
||
* Add check_type to array assert_eq (:pr:`7491`) Julia Signell
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Apr 9 13:47:13 UTC 2021 - Benjamin Greiner <code@bnavigator.de>
|
||
|
||
- Reenable 32bit tests after distributed is not cythonized anymore
|
||
gh#dask/dask#7489
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Apr 4 16:38:31 UTC 2021 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2021.4.0:
|
||
* Adding support for multidimensional histograms with
|
||
dask.array.histogramdd (:pr:`7387`) Doug Davis
|
||
* Update docs on number of threads and workers in default
|
||
LocalCluster (:pr:`7497`) cameron16
|
||
* Add labels automatically when certain files are touched in a PR
|
||
(:pr:`7506`) Julia Signell
|
||
* Extract ignore_order from kwargs (:pr:`7500`) GALI PREM SAGAR
|
||
* Only provide installation instructions when distributed is missing
|
||
(:pr:`7498`) Matthew Rocklin
|
||
* Start adding isort (:pr:`7370`) Julia Signell
|
||
* Add ignore_order parameter in dd.concat (:pr:`7473`) Daniel
|
||
Mesejo-León
|
||
* Use powers-of-two when displaying RAM (:pr:`7484`) Guido Imperiale
|
||
* Added License Classifier (:pr:`7485`) Tom Augspurger
|
||
* Replace conda with mamba (:pr:`7227`) Guido Imperiale
|
||
* Fix typo in array docs (:pr:`7478`) James Lamb
|
||
* Use concurrent.futures in local scheduler (:pr:`6322`) John A
|
||
Kirkham
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Mar 30 21:47:53 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2021.3.1
|
||
* Add a dispatch for is_categorical_dtype to handle non-pandas
|
||
objects (GH#7469) brandon-b-miller
|
||
* Use multiprocessing.Pool in test_read_text (GH#7472) John A
|
||
Kirkham
|
||
* Add missing meta kwarg to gufunc class (GH#7423) Peter Andreas
|
||
Entschev
|
||
* Example for memory-mapped Dask array (GH#7380) Dieter Weber
|
||
* Fix NumPy upstream failures xfail pandas and fastparquet
|
||
failures (GH#7441) Julia Signell
|
||
* Fix bug in repartition with freq (GH#7357) Ruben van de Geer
|
||
* Fix __array_function__ dispatching for tril/triu (GH#7457)
|
||
Peter Andreas Entschev
|
||
* Use concurrent.futures.Executors in a few tests (GH#7429) John
|
||
A Kirkham
|
||
* Require NumPy >=1.16 (GH#7383) Guido Imperiale
|
||
* Minor sort_values housekeeping (GH#7462) Ryan Williams
|
||
* Ensure natural sort order in parquet part paths (GH#7249) Ryan
|
||
Williams
|
||
* Remove global env mutation upon running test_config.py
|
||
(GH#7464) Hristo
|
||
* Update NumPy intersphinx URL (GH#7460) Gabe Joseph
|
||
* Add rot90 (GH#7440) Trevor Manz
|
||
* Update docs for required package for endpoint (GH#7454) Nick
|
||
Vazquez
|
||
* Master -> main in slice_array docstring (GH#7453) Gabe Joseph
|
||
* Expand dask.utils.is_arraylike docstring (GH#7445) Doug Davis
|
||
* Simplify BlockwiseIODeps importing (GH#7420) Richard (Rick)
|
||
Zamora
|
||
* Update layer annotation packing method (GH#7430) James Bourbeau
|
||
* Drop duplicate test in test_describe_empty (GH#7431) John A
|
||
Kirkham
|
||
* Add Series.dot method to dataframe module (GH#7236) Madhu94
|
||
* Added df kurtosis-method and testing (GH#7273) Jan Borchmann
|
||
* Avoid quadratic-time performance for HLG culling (GH#7403)
|
||
Bruce Merry
|
||
* Temporarily skip problematic sparse test (GH#7421) James
|
||
Bourbeau
|
||
* Update some CI workflow names (GH#7422) James Bourbeau
|
||
* Fix HDFS test (GH#7418) Julia Signell
|
||
* Make changelog subtitles match the hierarchy (GH#7419) Julia
|
||
Signell
|
||
* Add support for normalize in value_counts (GH#7342) Julia
|
||
Signell
|
||
* Avoid unnecessary imports for HLG Layer unpacking and
|
||
materialization (GH#7381) Richard (Rick) Zamora
|
||
* Bincount fix slicing (GH#7391) Genevieve Buckley
|
||
* Add sliding_window_view (GH#7234) Deepak Cherian
|
||
* Fix typo in docs/source/develop.rst (GH#7414) Hristo
|
||
* Switch documentation builds for PRs to readthedocs (GH#7397)
|
||
James Bourbeau
|
||
* Adds sort_values to dask.DataFrame (GH#7286) gerrymanoim
|
||
* Pin sqlalchemy<1.4.0 in CI (GH#7405) James Bourbeau
|
||
* Comment fixes (GH#7215) Ryan Williams
|
||
* Dead code removal / fixes (GH#7388) Ryan Williams
|
||
* Use single thread for pa.Table.from_pandas calls (GH#7347)
|
||
Richard (Rick) Zamora
|
||
* Replace 'container' with 'image' (GH#7389) James Lamb
|
||
* DOC hyperlink repartition (GH#7394) Ray Bell
|
||
* Pass delimiter to fsspec in bag.read_text (GH#7349) Martin
|
||
Durant
|
||
* Update read_hdf default mode to "r" (GH#7039) rs9w33
|
||
* Embed literals in SubgraphCallable when packing Blockwise
|
||
(GH#7353) Mads R. B. Kristensen
|
||
* Update test_hdf.py to not reuse file handlers (GH#7044) rs9w33
|
||
* Require additional dependencies: cloudpickle, partd, fsspec,
|
||
toolz (GH#7345) Julia Signell
|
||
* Prepare Blockwise + IO infrastructure (GH#7281) Richard (Rick)
|
||
Zamora
|
||
* Remove duplicated imports from test_slicing.py (GH#7365) Hristo
|
||
* Add test deps for pip development (GH#7360) Julia Signell
|
||
* Support int slicing for non-NumPy arrays (GH#7364) Peter
|
||
Andreas Entschev
|
||
* Automatically cancel previous CI builds (GH#7348) James
|
||
Bourbeau
|
||
* dask.array.asarray should handle case where xarray class is in
|
||
top-level namespace (GH#7335) Tom White
|
||
* HighLevelGraph length without materializing layers (GH#7274)
|
||
Gabe Joseph
|
||
* Drop support for Python 3.6 (GH#7006) James Bourbeau
|
||
* Fix fsspec usage in create_metadata_file (GH#7295) Richard
|
||
(Rick) Zamora
|
||
* Change default branch from master to main (GH#7198) Julia
|
||
Signell
|
||
* Add Xarray to CI software environment (GH#7338) James Bourbeau
|
||
* Update repartition argument name in error text (GH#7336) Eoin
|
||
Shanaghy
|
||
* Run upstream tests based on commit message (GH#7329) James
|
||
Bourbeau
|
||
* Use pytest.register_assert_rewrite on util modules (GH#7278)
|
||
Bruce Merry
|
||
* Add example on using specific chunk sizes in from_array()
|
||
(GH#7330) James Lamb
|
||
* Move NumPy skip into test (GH#7247) Julia Signell
|
||
- Update package descriptions
|
||
- Add dask-delayed and dask-diagnostics packages
|
||
- Drop dask-multiprocessing package merged into main
|
||
- Skip python36: upstream dropped support for Python < 3.7
|
||
- Drop dask-pr7247-numpyskip.patch merged upstream
|
||
- Test more optional requirements for better compatibility
|
||
assurance.
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Mar 7 16:40:26 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to 2021.3.0
|
||
* This is the first release with support for Python 3.9 and the
|
||
last release with support for Python 3.6
|
||
* Bump minimum version of distributed (GH#7328) James Bourbeau
|
||
* Fix percentiles_summary with dask_cudf (GH#7325) Peter Andreas
|
||
Entschev
|
||
* Temporarily revert recent Array.__setitem__ updates (GH#7326)
|
||
James Bourbeau
|
||
* Blockwise.clone (GH#7312) Guido Imperiale
|
||
* NEP-35 duck array update (GH#7321) James Bourbeau
|
||
* Don’t allow setting .name for array (GH#7222) Julia Signell
|
||
* Use nearest interpolation for creating percentiles of integer
|
||
input (GH#7305) Kyle Barron
|
||
* Test exp with CuPy arrays (GH#7322) John A Kirkham
|
||
* Check that computed chunks have right size and dtype (GH#7277)
|
||
Bruce Merry
|
||
* pytest.mark.flaky (GH#7319) Guido Imperiale
|
||
* Contributing docs: add note to pull the latest git tags before
|
||
pip installing Dask (GH#7308) Genevieve Buckley
|
||
* Support for Python 3.9 (GH#7289) Guido Imperiale
|
||
* Add broadcast-based merge implementation (GH#7143) Richard
|
||
(Rick) Zamora
|
||
* Add split_every to graph_manipulation (GH#7282) Guido Imperiale
|
||
* Typo in optimize docs (GH#7306) Julius Busecke
|
||
* dask.graph_manipulation support for xarray.Dataset (GH#7276)
|
||
Guido Imperiale
|
||
* Add plot width and height support for Bokeh 2.3.0 (GH#7297)
|
||
James Bourbeau
|
||
* Add NumPy functions tri, triu_indices, triu_indices_from,
|
||
tril_indices, tril_indices_from (GH#6997) Illviljan
|
||
* Remove “cleanup” task in DataFrame on-disk shuffle (GH#7260)
|
||
Sinclair Target
|
||
* Use development version of distributed in CI (GH#7279) James
|
||
Bourbeau
|
||
* Moving high level graph pack/unpack Dask (GH#7179) Mads R. B.
|
||
Kristensen
|
||
* Improve performance of merge_percentiles (GH#7172) Ashwin
|
||
Srinath
|
||
* DOC: add dask-sql and fugue (GH#7129) Ray Bell
|
||
* Example for working with categoricals and parquet (GH#7085)
|
||
McToel
|
||
* Adds tree reduction to bincount (GH#7183) Thomas J. Fan
|
||
* Improve documentation of name in from_array (GH#7264) Bruce
|
||
Merry
|
||
* Fix cumsum for empty partitions (GH#7230) Julia Signell
|
||
* Add map_blocks example to dask array creation docs (GH#7221)
|
||
Julia Signell
|
||
* Fix performance issue in dask.graph_manipulation.wait_on()
|
||
(GH#7258) Guido Imperiale
|
||
* Replace coveralls with codecov.io (GH#7246) Guido Imperiale
|
||
* Pin to a particular black rev in pre-commit (GH#7256) Julia
|
||
Signell
|
||
* Minor typo in documentation: array-chunks.rst (GH#7254) Magnus
|
||
Nord
|
||
* Fix bugs in Blockwise and ShuffleLayer (GH#7213) Richard
|
||
(Rick) Zamora
|
||
* Fix parquet filtering bug for "pyarrow-dataset" with
|
||
pyarrow-3.0.0 (GH#7200) Richard (Rick) Zamora
|
||
* graph_manipulation without NumPy (GH#7243) Guido Imperiale
|
||
* Support for NEP-35 (GH#6738) Peter Andreas Entschev
|
||
* Avoid running unit tests during doctest CI build (GH#7240)
|
||
James Bourbeau
|
||
* Run doctests on CI (GH#7238) Julia Signell
|
||
* Cleanup code quality on set arithmetics (GH#7196) Guido
|
||
Imperiale
|
||
* Add dask.array.delete (GH#7125) Julia Signell
|
||
* Unpin graphviz now that new conda-forge recipe is built
|
||
(GH#7235) Julia Signell
|
||
* Don’t use NumPy 1.20 from conda-forge on Mac (GH#7211) Guido
|
||
Imperiale
|
||
* map_overlap: Don’t rechunk axes without overlap (GH#7233)
|
||
Deepak Cherian
|
||
* Pin graphviz to avoid issue with latest conda-forge build
|
||
(GH#7232) Julia Signell
|
||
* Use html_css_files in docs for custom CSS (GH#7220) James
|
||
Bourbeau
|
||
* Graph manipulation: clone, bind, checkpoint, wait_on (GH#7109)
|
||
Guido Imperiale
|
||
* Fix handling of filter expressions in parquet pyarrow-dataset
|
||
engine (GH#7186) Joris Van den Bossche
|
||
* Extend __setitem__ to more closely match numpy (GH#7033) David
|
||
Hassell
|
||
* Clean up Python 2 syntax (GH#7195) Guido Imperiale
|
||
* Fix regression in Delayed._length (GH#7194) Guido Imperiale
|
||
* __dask_layers__() tests and tweaks (GH#7177) Guido Imperiale
|
||
* Properly convert HighLevelGraph in multiprocessing scheduler
|
||
(GH#7191) Jim Crist-Harif
|
||
* Don’t fail fast in CI (GH#7188) James Bourbeau
|
||
- Add dask-pr7247-numpyskip.patch -- gh#dask/dask#7247
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Feb 17 21:51:48 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Run the full test suite: use rootdir conftest.py
|
||
* importable optional dependencies are skipped automatically
|
||
* can use network marker to skip network tests
|
||
- Don't package and test -dataframe and -array for python36 flavor,
|
||
because python36-numpy and depending packages were dropped from
|
||
Tumbleweed with version 1.20.
|
||
- Skip more distributed tests occasionally failing
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Feb 8 14:24:58 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to version 2020.2.0
|
||
* Add percentile support for NEP-35 (GH#7162) Peter Andreas
|
||
Entschev
|
||
* Added support for Float64 in column assignment (GH#7173) Nils
|
||
Braun
|
||
* Coarsen rechunking error (GH#7127) Davis Bennett
|
||
* Fix upstream CI tests (GH#6896) Julia Signell
|
||
* Revise HighLevelGraph Mapping API (GH#7160) Guido Imperiale
|
||
* Update low-level graph spec to use any hashable for keys
|
||
(GH#7163) James Bourbeau
|
||
* Generically rebuild a collection with different keys (GH#7142)
|
||
Guido Imperiale
|
||
* Make easier to link issues in PRs (GH#7130) Ray Bell
|
||
* Add dask.array.append (GH#7146) D-Stacks
|
||
* Allow dask.array.ravel to accept array_like argument (GH#7138)
|
||
D-Stacks
|
||
* Fixes link in array design doc (GH#7152) Thomas J. Fan
|
||
* Fix example of using blockwise for an outer product (GH#7119)
|
||
Bruce Merry
|
||
* Deprecate HighlevelGraph.dicts in favor of .layers (GH#7145)
|
||
Amit Kumar
|
||
* Align FastParquetEngine with pyarrow engines (GH#7091) Richard
|
||
(Rick) Zamora
|
||
* Merge annotations (GH#7102) Ian Rose
|
||
* Simplify contents of parts list in read_parquet (GH#7066)
|
||
Richard (Rick) Zamora
|
||
* check_meta(): use __class__ when checking DataFrame types
|
||
(GH#7099) Mads R. B. Kristensen
|
||
* Cache several properties (GH#7104) Illviljan
|
||
* Fix parquet getitem optimization (GH#7106) Richard (Rick)
|
||
Zamora
|
||
* Add cytoolz back to CI environment (GH#7103) James Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Jan 28 12:25:51 UTC 2021 - Ben Greiner <code@bnavigator.de>
|
||
|
||
- Update to version 2020.1.1
|
||
Partially fix cumprod (GH#7089) Julia Signell
|
||
* Test pandas 1.1.x / 1.2.0 releases and pandas nightly
|
||
(GH#6996) Joris Van den Bossche
|
||
* Use assign to avoid SettingWithCopyWarning (GH#7092) Julia
|
||
Signell
|
||
* 'mode' argument passed to bokeh.output_file() (GH#7034)
|
||
(GH#7075) patquem
|
||
* Skip empty partitions when doing groupby.value_counts
|
||
(GH#7073) Julia Signell
|
||
* Add error messages to assert_eq() (GH#7083) James Lamb
|
||
* Make cached properties read-only (GH#7077) Illviljan
|
||
- Changelog for 2021.01.0
|
||
* map_partitions with review comments (GH#6776) Kumar Bharath
|
||
Prabhu
|
||
* Make sure that population is a real list (GH#7027) Julia Signell
|
||
* Propagate storage_options in read_csv (GH#7074) Richard (Rick)
|
||
Zamora
|
||
* Remove all BlockwiseIO code (GH#7067) Richard (Rick) Zamora
|
||
* Fix CI (GH#7069) James Bourbeau
|
||
* Add option to control rechunking in reshape (GH#6753) Tom
|
||
Augspurger
|
||
* Fix linalg.lstsq for complex inputs (GH#7056) Johnnie Gray
|
||
* Add compression='infer' default to read_csv (GH#6960) Richard
|
||
(Rick) Zamora
|
||
* Revert parameter changes in svd_compressed #7003 (GH#7004) Eric
|
||
Czech
|
||
* Skip failing s3 test (GH#7064) Martin Durant
|
||
* Revert BlockwiseIO (GH#7048) Richard (Rick) Zamora
|
||
* Add some cross-references to DataFrame.to_bag() and Series.
|
||
to_bag() (GH#7049) Rob Malouf
|
||
* Rewrite matmul as blockwise without contraction/concatenate
|
||
(GH#7000) Rafal Wojdyla
|
||
* Use functools.cached_property in da.shape (GH#7023) Illviljan
|
||
* Use meta value in series non_empty (GH#6976) Julia Signell
|
||
* Revert “Temporarly pin sphinx version to 3.3.1 (GH#7002)”
|
||
(GH#7014) Rafal Wojdyla
|
||
* Revert python-graphviz pinning (GH#7037) Julia Signell
|
||
* Accidentally committed print statement (GH#7038) Julia Signell
|
||
* Pass dropna and observed in agg (GH#6992) Julia Signell
|
||
* Add index to meta after .str.split with expand (GH#7026) Ruben
|
||
van de Geer
|
||
* CI: test pyarrow 2.0 and nightly (GH#7030) Joris Van den Bossche
|
||
* Temporarily pin python-graphviz in CI (GH#7031) James Bourbeau
|
||
* Underline section in numpydoc (GH#7013) Matthias Bussonnier
|
||
* Keep normal optimizations when adding custom optimizations
|
||
(GH#7016) Matthew Rocklin
|
||
* Temporarily pin sphinx version to 3.3.1 (GH#7002) Rafal Wojdyla
|
||
* DOC: Misc formatting (GH#6998) Matthias Bussonnier
|
||
* Add inline_array option to from_array (GH#6773) Tom Augspurger
|
||
* Revert “Initial pass at blockwise array creation routines
|
||
(GH#6931)” (:pr:`6995) James Bourbeau
|
||
* Set npartitions in set_index (GH#6978) Julia Signell
|
||
* Upstream config serialization and inheritance (GH#6987) Jacob
|
||
Tomlinson
|
||
* Bump the minimum time in test_minimum_time (GH#6988) Martin
|
||
Durant
|
||
* Fix pandas dtype inference for read_parquet (GH#6985) Richard
|
||
(Rick) Zamora
|
||
* Avoid data loss in set_index with sorted=True (GH#6980) Richard
|
||
(Rick) Zamora
|
||
* Bugfix in read_parquet for handling un-named indices with
|
||
index=False (GH#6969) Richard (Rick) Zamora
|
||
* Use __class__ when comparing meta data (GH#6981) Mads R. B.
|
||
Kristensen
|
||
* Comparing string versions won’t always work (GH#6979) Rafal
|
||
Wojdyla
|
||
* Fix GH#6925 (GH#6982) sdementen
|
||
* Initial pass at blockwise array creation routines (GH#6931) Ian
|
||
Rose
|
||
* Simplify has_parallel_type() (GH#6927) Mads R. B. Kristensen
|
||
* Handle annotation unpacking in BlockwiseIO (GH#6934) Simon
|
||
Perkins
|
||
* Avoid deprecated yield_fixture in test_sql.py (GH#6968) Richard
|
||
(Rick) Zamora
|
||
* Remove bad graph logic in BlockwiseIO (GH#6933) Richard (Rick)
|
||
Zamora
|
||
* Get config item if variable is None (GH#6862) Jacob Tomlinson
|
||
* Update from_pandas docstring (GH#6957) Richard (Rick) Zamora
|
||
* Prevent fuse_roots from clobbering annotations (GH#6955) Simon
|
||
Perkins
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Jan 13 14:01:09 UTC 2021 - Benjamin Greiner <code@bnavigator.de>
|
||
|
||
- update to version 2020.12.0
|
||
* Switched to CalVer for versioning scheme.
|
||
* Introduced new APIs for HighLevelGraph to enable sending
|
||
high-level representations of task graphs to the
|
||
distributed scheduler.
|
||
* Introduced new HighLevelGraph layer objects including
|
||
BasicLayer, Blockwise, BlockwiseIO, ShuffleLayer, and
|
||
more.
|
||
* Added support for applying custom Layer-level annotations
|
||
like priority, retries, etc. with the dask.annotations
|
||
context manager.
|
||
* Updated minimum supported version of pandas to 0.25.0
|
||
and NumPy to 1.15.1.
|
||
* Support for the pyarrow.dataset API to read_parquet.
|
||
* Several fixes to Dask Array’s SVD.
|
||
- For a full list of changes see
|
||
https://docs.dask.org/en/latest/changelog.html
|
||
- Clean requirements
|
||
- Fix incorrect usage of python3_only macro
|
||
- Test with pytest-xdist in order to avoid hang after test
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Oct 10 19:03:48 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.30.0:
|
||
* Allow rechunk to evenly split into N chunks (:pr:`6420`) Scott
|
||
Sievert
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Oct 5 20:14:32 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.29.0:
|
||
* Array
|
||
+ _repr_html_: color sides darker instead of drawing all the lines
|
||
(:pr:`6683`) Julia Signell
|
||
+ Removes warning from nanstd and nanvar (:pr:`6667`) Thomas J Fan
|
||
+ Get shape of output from original array - map_overlap
|
||
(:pr:`6682`) Julia Signell
|
||
+ Replace np.searchsorted with bisect in indexing (:pr:`6669`)
|
||
Joachim B Haga
|
||
* Bag
|
||
+ Make sure subprocesses have a consistent hash for bag groupby
|
||
(:pr:`6660`) Itamar Turner-Trauring
|
||
* Core
|
||
+ Revert "Use HighLevelGraph layers everywhere in collections
|
||
(:pr:`6510`)" (:pr:`6697`) Tom Augspurger
|
||
+ Use pandas.testing (:pr:`6687`) John A Kirkham
|
||
+ Improve 128-bit floating-point skip in tests (:pr:`6676`)
|
||
Elliott Sales de Andrade
|
||
* DataFrame
|
||
+ Allow setting dataframe items using a bool dataframe
|
||
(:pr:`6608`) Julia Signell
|
||
* Documentation
|
||
+ Fix typo (:pr:`6692`) garanews
|
||
+ Fix a few typos (:pr:`6678`) Pav A
|
||
|
||
- changes from version 2.28.0:
|
||
* Array
|
||
+ Partially reverted changes to Array indexing that produces large
|
||
changes. This restores the behavior from Dask 2.25.0 and
|
||
earlier, with a warning when large chunks are produced. A
|
||
configuration option is provided to avoid creating the large
|
||
chunks, see :ref:`array.slicing.efficiency`. (:pr:`6665`) Tom
|
||
Augspurger
|
||
+ Add meta to to_dask_array (:pr:`6651`) Kyle Nicholson
|
||
+ Fix :pr:`6631` and :pr:`6611` (:pr:`6632`) Rafal Wojdyla
|
||
+ Infer object in array reductions (:pr:`6629`) Daniel Saxton
|
||
+ Adding v_based flag for svd_flip (:pr:`6658`) Eric Czech
|
||
+ Fix flakey array mean (:pr:`6656`) Sam Grayson
|
||
* Core
|
||
+ Removed dsk equality check from SubgraphCallable.__eq__
|
||
(:pr:`6666`) Mads R. B. Kristensen
|
||
+ Use HighLevelGraph layers everywhere in collections (:pr:`6510`)
|
||
Mads R. B. Kristensen
|
||
+ Adds hash dunder method to SubgraphCallable for caching purposes
|
||
(:pr:`6424`) Andrew Fulton
|
||
+ Stop writing commented out config files by default (:pr:`6647`)
|
||
Matthew Rocklin
|
||
* DataFrame
|
||
+ Add support for collect list aggregation via agg API
|
||
(:pr:`6655`) Madhur Tandon
|
||
+ Slightly better error message (:pr:`6657`) Julia Signell
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Sep 19 15:07:55 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.27.0:
|
||
* Array
|
||
+ Preserve dtype in svd (:pr:`6643`) Eric Czech
|
||
* Core
|
||
+ store(): create a single HLG layer (:pr:`6601`) Mads
|
||
R. B. Kristensen
|
||
+ Add pre-commit CI build (:pr:`6645`) James Bourbeau
|
||
+ Update .pre-commit-config to latest black. (:pr:`6641`) Julia
|
||
Signell
|
||
+ Update super usage to remove Python 2 compatibility (:pr:`6630`)
|
||
Poruri Sai Rahul
|
||
+ Remove u string prefixes (:pr:`6633`) Poruri Sai Rahul
|
||
* DataFrame
|
||
+ Improve error message for to_sql (:pr:`6638`) Julia Signell
|
||
+ Use empty list as categories (:pr:`6626`) Julia Signell
|
||
* Documentation
|
||
+ Add autofunction to array api docs for more ufuncs (:pr:`6644`)
|
||
James Bourbeau
|
||
+ Add a number of missing ufuncs to dask.array docs (:pr:`6642`)
|
||
Ralf Gommers
|
||
+ Add HelmCluster docs (:pr:`6290`) Jacob Tomlinson
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Sep 12 19:57:21 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- specfile:
|
||
* added python-mimesis and python-zarr to be able to run more tests
|
||
|
||
- update to version 2.26.0:
|
||
* Array
|
||
+ Backend-aware dtype inference for single-chunk svd (:pr:`6623`)
|
||
Eric Czech
|
||
+ Make array.reduction docstring match for dtype (:pr:`6624`)
|
||
Martin Durant
|
||
+ Set lower bound on compression level for svd_compressed using
|
||
rows and cols (:pr:`6622`) Eric Czech
|
||
+ Improve SVD consistency and small array handling (:pr:`6616`)
|
||
Eric Czech
|
||
+ Add svd_flip #6599 (:pr:`6613`) Eric Czech
|
||
+ Handle sequences containing dask Arrays (:pr:`6595`) Gabe Joseph
|
||
+ Avoid large chunks from getitem with lists (:pr:`6514`) Tom
|
||
Augspurger
|
||
+ Eagerly slice numpy arrays in from_array (:pr:`6605`) Deepak
|
||
Cherian
|
||
+ Restore ability to pickle dask arrays (:pr:`6594`) Noah D
|
||
Brenowitz
|
||
+ Add SVD support for short-and-fat arrays (:pr:`6591`) Eric Czech
|
||
+ Add simple chunk type registry and defer as appropriate to
|
||
upcast types (:pr:`6393`) Jon Thielen
|
||
+ Align coarsen chunks by default (:pr:`6580`) Deepak Cherian
|
||
+ Fixup reshape on unknown dimensions and other testing fixes
|
||
(:pr:`6578`) Ryan Williams
|
||
* Core
|
||
+ Add validation and fixes for HighLevelGraph dependencies
|
||
(:pr:`6588`) Mads R. B. Kristensen
|
||
+ Fix linting issue (:pr:`6598`) Tom Augspurger
|
||
+ Skip bokeh version 2.0.0 (:pr:`6572`) John A Kirkham
|
||
* DataFrame
|
||
+ Added bytes/row calculation when using meta (:pr:`6585`) McToel
|
||
+ Handle min_count in Series.sum / prod (:pr:`6618`) Daniel Saxton
|
||
+ Update DataFrame.set_index docstring (:pr:`6549`) Timost
|
||
+ Always compute 0 and 1 quantiles during quantile calculations
|
||
(:pr:`6564`) Erik Welch
|
||
+ Fix wrong path when reading empty csv file (:pr:`6573`)
|
||
Abdulelah Bin Mahfoodh
|
||
* Documentation
|
||
+ Doc: Troubleshooting dashboard 404 (:pr:`6215`) Kilian Lieret
|
||
+ Fixup extraConfig example (:pr:`6625`) Tom Augspurger
|
||
+ Update supported Python versions (:pr:`6609`) Julia Signell
|
||
+ Document dask/daskhub helm chart (:pr:`6560`) Tom Augspurger
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Aug 29 15:51:43 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.25.0:
|
||
* Core
|
||
+ Compare key hashes in subs() (:pr:`6559`) Mads R. B. Kristensen
|
||
+ Rerun with latest black release (:pr:`6568`) James Bourbeau
|
||
+ License update (:pr:`6554`) Tom Augspurger
|
||
* DataFrame
|
||
+ Add gs read_parquet example (:pr:`6548`) Ray Bell
|
||
* Documentation
|
||
+ Remove version from documentation page names (:pr:`6558`) James
|
||
Bourbeau
|
||
+ Update kubernetes-helm.rst (:pr:`6523`) David Sheldon
|
||
+ Stop 2020 survey (:pr:`6547`) Tom Augspurger
|
||
|
||
- changes from version 2.24.0:
|
||
* Array
|
||
+ Fix setting random seed in tests. (:pr:`6518`) Elliott Sales de
|
||
Andrade
|
||
+ Support meta in apply gufunc (:pr:`6521`) joshreback
|
||
+ Replace cupy.sparse with cupyx.scipy.sparse (:pr:`6530`) John A
|
||
Kirkham
|
||
* Dataframe
|
||
+ Bump up tolerance for rolling tests (:pr:`6502`) Julia Signell
|
||
+ Implement DatFrame.__len__ (:pr:`6515`) Tom Augspurger
|
||
+ Infer arrow schema in to_parquet (for ArrowEngine`) (:pr:`6490`)
|
||
`Richard Zamora`_
|
||
+ Fix parquet test when no pyarrow (:pr:`6524`) Martin Durant
|
||
+ Remove problematic filter arguments in ArrowEngine (:pr:`6527`)
|
||
`Richard Zamora`_
|
||
+ Avoid schema validation by default in ArrowEngine (:pr:`6536`)
|
||
`Richard Zamora`_
|
||
* Core
|
||
+ Use unpack_collections in make_blockwise_graph (:pr:`6517`)
|
||
`Thomas Fan`_
|
||
+ Move key_split() from optimization.py to utils.py (:pr:`6529`)
|
||
Mads R. B. Kristensen
|
||
+ Make tests run on moto server (:pr:`6528`) Martin Durant
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Aug 15 16:59:24 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.23.0:
|
||
* Array
|
||
+ Reduce np.zeros, ones, and full array size with broadcasting
|
||
(:pr:`6491`) Matthias Bussonnier
|
||
+ Add missing meta= for trim in map_overlap (:pr:`6494`) Peter
|
||
Andreas Entschev
|
||
* Bag
|
||
+ Bag repartition partition size (:pr:`6371`) joshreback
|
||
* Core
|
||
+ Scalar.__dask_layers__() to return self._name instead of
|
||
self.key (:pr:`6507`) Mads R. B. Kristensen
|
||
+ Update dependencies correctly in fuse_root optimization
|
||
(:pr:`6508`) Mads R. B. Kristensen
|
||
* DataFrame
|
||
+ Adds items to dataframe (:pr:`6503`) Thomas J Fan
|
||
+ Include compression in write_table call (:pr:`6499`) Julia
|
||
Signell
|
||
+ Fixed warning in nonempty_series (:pr:`6485`) Tom Augspurger
|
||
+ Intelligently determine partitions based on type of first arg
|
||
(:pr:`6479`) Matthew Rocklin
|
||
+ Fix pyarrow mkdirs (:pr:`6475`) Julia Signell
|
||
+ Fix duplicate parquet output in to_parquet (:pr:`6451`)
|
||
michaelnarodovitch
|
||
* Documentation
|
||
+ Fix documentation da.histogram (:pr:`6439`) Roberto Panai
|
||
+ Add agg nunique example (:pr:`6404`) Ray Bell
|
||
+ Fixed a few typos in the SQL docs (:pr:`6489`) Mike McCarty
|
||
+ Docs for SQLing (:pr:`6453`) Martin Durant
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Aug 1 22:09:59 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.22.0:
|
||
* Array
|
||
+ Compatibility for NumPy dtype deprecation (:pr:`6430`) Tom
|
||
Augspurger
|
||
* Core
|
||
+ Implement sizeof for some bytes-like objects (:pr:`6457`) John A
|
||
Kirkham
|
||
+ HTTP error for new fsspec (:pr:`6446`) Martin Durant
|
||
+ When RecursionError is raised, return uuid from tokenize
|
||
function (:pr:`6437`) Julia Signell
|
||
+ Install deps of upstream-dev packages (:pr:`6431`) Tom
|
||
Augspurger
|
||
+ Use updated link in setup.cfg (:pr:`6426`) Zhengnan
|
||
* DataFrame
|
||
+ Add single quotes around column names if strings (:pr:`6471`)
|
||
Gil Forsyth
|
||
+ Refactor ArrowEngine for better read_parquet performance
|
||
(:pr:`6346`) Richard (Rick) Zamora
|
||
+ Add tolist dispatch (:pr:`6444`) GALI PREM SAGAR
|
||
+ Compatibility with pandas 1.1.0rc0 (:pr:`6429`) Tom Augspurger
|
||
+ Multi value pivot table (:pr:`6428`) joshreback
|
||
+ Duplicate argument definitions in to_csv docstring (:pr:`6411`)
|
||
Jun Han (Johnson) Ooi
|
||
* Documentation
|
||
+ Add utility to docs to convert YAML config to env vars and back
|
||
(:pr:`6472`) Jacob Tomlinson
|
||
+ Fix parameter server rendering (:pr:`6466`) Scott Sievert
|
||
+ Fixes broken links (:pr:`6403`) Jim Circadian
|
||
+ Complete parameter server implementation in docs (:pr:`6449`)
|
||
Scott Sievert
|
||
+ Fix typo (:pr:`6436`) Jack Xiaosong Xu
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Jul 18 18:12:13 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.21.0:
|
||
* Array
|
||
+ Correct error message in array.routines.gradient() (:pr:`6417`)
|
||
johnomotani
|
||
+ Fix blockwise concatenate for array with some dimension=1
|
||
(:pr:`6342`) Matthias Bussonnier
|
||
* Bag
|
||
+ Fix bag.take example (:pr:`6418`) Roberto Panai
|
||
* Core
|
||
+ Groups values in optimization pass should only be graph and keys
|
||
-- not an optimization + keys (:pr:`6409`) Ben Zaitlen
|
||
+ Call custom optimizations once, with kwargs provided
|
||
(:pr:`6382`) Clark Zinzow
|
||
+ Include pickle5 for testing on Python 3.7 (:pr:`6379`) John A
|
||
Kirkham
|
||
* DataFrame
|
||
+ Correct typo in error message (:pr:`6422`) Tom McTiernan
|
||
+ Use pytest.warns to check for UserWarning (:pr:`6378`) Richard
|
||
(Rick) Zamora
|
||
+ Parse bytes_per_chunk keyword from string (:pr:`6370`) Matthew
|
||
Rocklin
|
||
* Documentation
|
||
+ Numpydoc formatting (:pr:`6421`) Matthias Bussonnier
|
||
+ Unpin numpydoc following 1.1 release (:pr:`6407`) Gil Forsyth
|
||
+ Numpydoc formatting (:pr:`6402`) Matthias Bussonnier
|
||
+ Add instructions for using conda when installing code for
|
||
development (:pr:`6399`) Ray Bell
|
||
+ Update visualize docstrings (:pr:`6383`) Zhengnan
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Jul 9 08:20:00 UTC 2020 - Marketa Calabkova <mcalabkova@suse.com>
|
||
|
||
- Update to version 2.20.0
|
||
Array
|
||
- Register ``sizeof`` for numpy zero-strided arrays (:pr:`6343`) `Matthias Bussonnier`_
|
||
- Use ``concatenate_lookup`` in ``concatenate`` (:pr:`6339`) `John A Kirkham`_
|
||
- Fix rechunking of arrays with some zero-length dimensions (:pr:`6335`) `Matthias Bussonnier`_
|
||
DataFrame
|
||
- Dispatch ``iloc``` calls to ``getitem`` (:pr:`6355`) `Gil Forsyth`_
|
||
- Handle unnamed pandas ``RangeIndex`` in fastparquet engine (:pr:`6350`) `Richard (Rick) Zamora`_
|
||
- Preserve index when writing partitioned parquet datasets with pyarrow (:pr:`6282`) `Richard (Rick) Zamora`_
|
||
- Use ``ignore_index`` for pandas' ``group_split_dispatch`` (:pr:`6251`) `Richard (Rick) Zamora`_
|
||
Documentation
|
||
- Add doc describing argument (:pr:`6318`) `asmith26`_
|
||
- 2.19.0
|
||
Array
|
||
- Cast chunk sizes to python int ``dtype`` (:pr:`6326`) `Gil Forsyth`_
|
||
- Add ``shape=None`` to ``*_like()`` array creation functions (:pr:`6064`) `Anderson Banihirwe`_
|
||
Core
|
||
- Update expected error msg for protocol difference in fsspec (:pr:`6331`) `Gil Forsyth`_
|
||
- Fix for floats < 1 in ``parse_bytes`` (:pr:`6311`) `Gil Forsyth`_
|
||
- Fix exception causes all over the codebase (:pr:`6308`) `Ram Rachum`_
|
||
- Fix duplicated tests (:pr:`6303`) `James Lamb`_
|
||
- Remove unused testing function (:pr:`6304`) `James Lamb`_
|
||
DataFrame
|
||
- Add high-level CSV Subgraph (:pr:`6262`) `Gil Forsyth`_
|
||
- Fix ``ValueError`` when merging an index-only 1-partition dataframe (:pr:`6309`) `Krishan Bhasin`_
|
||
- Make ``index.map`` clear divisions. (:pr:`6285`) `Julia Signell`_
|
||
Documentation
|
||
- Add link to 2020 survey (:pr:`6328`) `Tom Augspurger`_
|
||
- Update ``bag.rst`` (:pr:`6317`) `Ben Shaver`_
|
||
- 2.18.1
|
||
Array
|
||
- Don't try to set name on ``full`` (:pr:`6299`) `Julia Signell`_
|
||
- Histogram: support lazy values for range/bins (another way) (:pr:`6252`) `Gabe Joseph`_
|
||
Core
|
||
- Fix exception causes in ``utils.py`` (:pr:`6302`) `Ram Rachum`_
|
||
- Improve performance of ``HighLevelGraph`` construction (:pr:`6293`) `Julia Signell`_
|
||
Documentation
|
||
- Now readthedocs builds unrelased features' docstrings (:pr:`6295`) `Antonio Ercole De Luca`_
|
||
- Add ``asyncssh`` intersphinx mappings (:pr:`6298`) `Jacob Tomlinson`_
|
||
- 2.18.0
|
||
Array
|
||
- Cast slicing index to dask array if same shape as original (:pr:`6273`) `Julia Signell`_
|
||
- Fix ``stack`` error message (:pr:`6268`) `Stephanie Gott`_
|
||
- ``full`` & ``full_like``: error on non-scalar ``fill_value`` (:pr:`6129`) `Huite`_
|
||
- Support for multiple arrays in ``map_overlap`` (:pr:`6165`) `Eric Czech`_
|
||
- Pad resample divisions so that edges are counted (:pr:`6255`) `Julia Signell`_
|
||
Bag
|
||
- Random sampling of k elements from a dask bag #4799 (:pr:`6239`) `Antonio Ercole De Luca`_
|
||
DataFrame
|
||
- Add ``dropna``, ``sort``, and ``ascending`` to ``sort_values`` (:pr:`5880`) `Julia Signell`_
|
||
- Generalize ``from_dask_array`` (:pr:`6263`) `GALI PREM SAGAR`_
|
||
- Add derived docstring for ``SeriesGroupby.nunique`` (:pr:`6284`) `Julia Signell`_
|
||
- Remove ``NotImplementedError`` in resample with rule (:pr:`6274`) `Abdulelah Bin Mahfoodh`_
|
||
- Add ``dd.to_sql`` (:pr:`6038`) `Ryan Williams`_
|
||
Documentation
|
||
- Update remote data section (:pr:`6258`) `Ray Bell`_
|
||
- 2.17.2
|
||
Core
|
||
- Re-add the ``complete`` extra (:pr:`6257`) `Jim Crist-Harif`_
|
||
DataFrame
|
||
- Raise error if ``resample`` isn't going to give right answer (:pr:`6244`) `Julia Signell`_
|
||
- 2.17.1
|
||
Array
|
||
- Empty array rechunk (:pr:`6233`) `Andrew Fulton`_
|
||
Core
|
||
- Make ``pyyaml`` required (:pr:`6250`) `Jim Crist-Harif`_
|
||
- Fix install commands from ``ImportError`` (:pr:`6238`) `Gaurav Sheni`_
|
||
- Remove issue template (:pr:`6249`) `Jacob Tomlinson`_
|
||
DataFrame
|
||
- Pass ``ignore_index`` to ``dd_shuffle`` from ``DataFrame.shuffle`` (:pr:`6247`) `Richard (Rick) Zamora`_
|
||
- Cope with missing HDF keys (:pr:`6204`) `Martin Durant`_
|
||
- Generalize ``describe`` & ``quantile`` apis (:pr:`5137`) `GALI PREM SAGAR`_
|
||
- 2.17.0
|
||
Array
|
||
- Small improvements to ``da.pad`` (:pr:`6213`) `Mark Boer`_
|
||
- Return ``tuple`` if multiple outputs in ``dask.array.apply_gufunc``, add test to check for tuple (:pr:`6207`) `Kai Mühlbauer`_
|
||
- Support ``stack`` with unknown chunksizes (:pr:`6195`) `swapna`_
|
||
Bag
|
||
- Random Choice on Bags (:pr:`6208`) `Antonio Ercole De Luca`_
|
||
Core
|
||
- Raise warning ``delayed.visualise()`` (:pr:`6216`) `Amol Umbarkar`_
|
||
- Ensure other pickle arguments work (:pr:`6229`) `John A Kirkham`_
|
||
- Overhaul ``fuse()`` config (:pr:`6198`) `Guido Imperiale`_
|
||
- Update ``dask.order.order`` to consider "next" nodes using both FIFO and LIFO (:pr:`5872`) `Erik Welch`_
|
||
DataFrame
|
||
- Use 0 as ``fill_value`` for more agg methods (:pr:`6245`) `Julia Signell`_
|
||
- Generalize ``rearrange_by_column_tasks`` and add ``DataFrame.shuffle`` (:pr:`6066`) `Richard (Rick) Zamora`_
|
||
- Xfail ``test_rolling_numba_engine`` for newer numba and older pandas (:pr:`6236`) `James Bourbeau`_
|
||
- Generalize ``fix_overlap`` (:pr:`6240`) `GALI PREM SAGAR`_
|
||
- Fix ``DataFrame.shape`` with no columns (:pr:`6237`) `noreentry`_
|
||
- Avoid shuffle when setting a presorted index with overlapping divisions (:pr:`6226`) `Krishan Bhasin`_
|
||
- Adjust the Parquet engine classes to allow more easily subclassing (:pr:`6211`) `Marius van Niekerk`_
|
||
- Fix ``dd.merge_asof`` with ``left_on='col'`` & ``right_index=True`` (:pr:`6192`) `noreentry`_
|
||
- Disable warning for ``concat`` (:pr:`6210`) `Tung Dang`_
|
||
- Move ``AUTO_BLOCKSIZE`` out of ``read_csv`` signature (:pr:`6214`) `Jim Crist-Harif`_
|
||
- ``.loc`` indexing with callable (:pr:`6185`) `Endre Mark Borza`_
|
||
- Avoid apply in ``_compute_sum_of_squares`` for groupby std agg (:pr:`6186`) `Richard (Rick) Zamora`_
|
||
- Minor correction to ``test_parquet`` (:pr:`6190`) `Brian Larsen`_
|
||
- Adhering to the passed pat for delimeter join and fix error message (:pr:`6194`) `GALI PREM SAGAR`_
|
||
- Skip ``test_to_parquet_with_get`` if no parquet libs available (:pr:`6188`) `Scott Sanderson`_
|
||
Documentation
|
||
- Added documentation for ``distributed.Event`` class (:pr:`6231`) `Nils Braun`_
|
||
- Doc write to remote (:pr:`6124`) `Ray Bell`_
|
||
- 2.16.0
|
||
Array
|
||
- Fix array general-reduction name (:pr:`6176`) `Nick Evans`_
|
||
- Replace ``dim`` with ``shape`` in ``unravel_index`` (:pr:`6155`) `Julia Signell`_
|
||
- Moment: handle all elements being masked (:pr:`5339`) `Gabe Joseph`_
|
||
Core
|
||
- Remove Redundant string concatenations in dask code-base (:pr:`6137`) `GALI PREM SAGAR`_
|
||
- Upstream compat (:pr:`6159`) `Tom Augspurger`_
|
||
- Ensure ``sizeof`` of dict and sequences returns an integer (:pr:`6179`) `James Bourbeau`_
|
||
- Estimate python collection sizes with random sampling (:pr:`6154`) `Florian Jetter`_
|
||
- Update test upstream (:pr:`6146`) `Tom Augspurger`_
|
||
- Skip test for mindeps build (:pr:`6144`) `Tom Augspurger`_
|
||
- Switch default multiprocessing context to "spawn" (:pr:`4003`) `Itamar Turner-Trauring`_
|
||
- Update manifest to include dask-schema (:pr:`6140`) `Ben Zaitlen`_
|
||
DataFrame
|
||
- Harden inconsistent-schema handling in pyarrow-based ``read_parquet`` (:pr:`6160`) `Richard (Rick) Zamora`_
|
||
- Add compute ``kwargs`` to methods that write data to disk (:pr:`6056`) `Krishan Bhasin`_
|
||
- Fix issue where ``unique`` returns an index like result from backends (:pr:`6153`) `GALI PREM SAGAR`_
|
||
- Fix internal error in ``map_partitions`` with collections (:pr:`6103`) `Tom Augspurger`_
|
||
Documentation
|
||
- Add phase of computation to index TOC (:pr:`6157`) `Ben Zaitlen`_
|
||
- Remove unused imports in scheduling script (:pr:`6138`) `James Lamb`_
|
||
- Fix indent (:pr:`6147`) `Martin Durant`_
|
||
- Add Tom's log config example (:pr:`6143`) `Martin Durant`_
|
||
- 2.15.0
|
||
Array
|
||
- Update ``dask.array.from_array`` to warn when passed a Dask collection (:pr:`6122`) `James Bourbeau`_
|
||
- Un-numpy like behaviour in ``dask.array.pad`` (:pr:`6042`) `Mark Boer`_
|
||
- Add support for ``repeats=0`` in ``da.repeat`` (:pr:`6080`) `James Bourbeau`_
|
||
Core
|
||
- Fix yaml layout for schema (:pr:`6132`) `Ben Zaitlen`_
|
||
- Configuration Reference (:pr:`6069`) `Ben Zaitlen`_
|
||
- Add configuration option to turn off task fusion (:pr:`6087`) `Matthew Rocklin`_
|
||
- Skip pyarrow on windows (:pr:`6094`) `Tom Augspurger`_
|
||
- Set limit to maximum length of fused key (:pr:`6057`) `Lucas Rademaker`_
|
||
- Add test against #6062 (:pr:`6072`) `Martin Durant`_
|
||
- Bump checkout action to v2 (:pr:`6065`) `James Bourbeau`_
|
||
DataFrame
|
||
- Generalize categorical calls to support cudf ``Categorical`` (:pr:`6113`) `GALI PREM SAGAR`_
|
||
- Avoid reading ``_metadata`` on every worker (:pr:`6017`) `Richard (Rick) Zamora`_
|
||
- Use ``group_split_dispatch`` and ``ignore_index`` in ``apply_concat_apply`` (:pr:`6119`) `Richard (Rick) Zamora`_
|
||
- Handle new (dtype) pandas metadata with pyarrow (:pr:`6090`) `Richard (Rick) Zamora`_
|
||
- Skip ``test_partition_on_cats_pyarrow`` if pyarrow is not installed (:pr:`6112`) `James Bourbeau`_
|
||
- Update DataFrame len to handle columns with the same name (:pr:`6111`) `James Bourbeau`_
|
||
- ``ArrowEngine`` bug fixes and test coverage (:pr:`6047`) `Richard (Rick) Zamora`_
|
||
- Added mode (:pr:`5958`) `Adam Lewis`_
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Apr 20 13:01:44 UTC 2020 - Tomáš Chvátal <tchvatal@suse.com>
|
||
|
||
- Drop py2 dep from py3 only package
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Apr 11 21:45:43 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.14.0:
|
||
* Array
|
||
+ Added np.iscomplexobj implementation (:pr:`6045`) Tom Augspurger
|
||
* Core
|
||
+ Update test_rearrange_disk_cleanup_with_exception to pass
|
||
without cloudpickle installed (:pr:`6052`) James Bourbeau
|
||
+ Fixed flaky test-rearrange (:pr:`5977`) Tom Augspurger
|
||
* DataFrame
|
||
+ Use _meta_nonempty for dtype casting in stack_partitions
|
||
(:pr:`6061`) mlondschien
|
||
+ Fix bugs in _metadata creation and filtering in parquet
|
||
ArrowEngine (:pr:`6023`) Richard (Rick) Zamora
|
||
* Documentation
|
||
+ DOC: Add name caveats (:pr:`6040`) Tom Augspurger
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Mar 28 16:47:35 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.13.0:
|
||
* Array
|
||
+ Support dtype and other keyword arguments in da.random
|
||
(:pr:`6030`) Matthew Rocklin
|
||
+ Register support for cupy sparse hstack/vstack (:pr:`5735`)
|
||
Corey J. Nolet
|
||
+ Force self.name to str in dask.array (:pr:`6002`) Chuanzhu Xu
|
||
* Bag
|
||
+ Set rename_fused_keys to None by default in bag.optimize
|
||
(:pr:`6000`) Lucas Rademaker
|
||
* Core
|
||
+ Copy dict in to_graphviz to prevent overwriting (:pr:`5996`)
|
||
JulianWgs
|
||
+ Stricter pandas xfail (:pr:`6024`) Tom Augspurger
|
||
+ Fix CI failures (:pr:`6013`) James Bourbeau
|
||
+ Update toolz to 0.8.2 and use tlz (:pr:`5997`) Ryan Grout
|
||
+ Move Windows CI builds to GitHub Actions (:pr:`5862`) James
|
||
Bourbeau
|
||
* DataFrame
|
||
+ Improve path-related exceptions in read_hdf (:pr:`6032`) psimaj
|
||
+ Fix dtype handling in dd.concat (:pr:`6006`) mlondschien
|
||
+ Handle cudf's leftsemi and leftanti joins (:pr:`6025`) Richard J
|
||
Zamora
|
||
+ Remove unused npartitions variable in dd.from_pandas
|
||
(:pr:`6019`) Daniel Saxton
|
||
+ Added shuffle to DataFrame.random_split (:pr:`5980`) petiop
|
||
* Documentation
|
||
+ Fix indentation in scheduler-overview docs (:pr:`6022`) Matthew
|
||
Rocklin
|
||
+ Update task graphs in optimize docs (:pr:`5928`) Julia Signell
|
||
+ Optionally get rid of intermediary boxes in visualize, and add
|
||
more labels (:pr:`5976`) Julia Signell
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Mar 8 19:03:37 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.12.0:
|
||
* Array
|
||
+ Improve reuse of temporaries with numpy (:pr:`5933`) Bruce Merry
|
||
+ Make map_blocks with block_info produce a Blockwise (:pr:`5896`)
|
||
Bruce Merry
|
||
+ Optimize make_blockwise_graph (:pr:`5940`) Bruce Merry
|
||
+ Fix axes ordering in da.tensordot (:pr:`5975`) Gil Forsyth
|
||
+ Adds empty mode to array.pad (:pr:`5931`) Thomas J Fan
|
||
* Core
|
||
+ Remove toolz.memoize dependency in dask.utils (:pr:`5978`) Ryan
|
||
Grout
|
||
+ Close pool leaking subprocess (:pr:`5979`) Tom Augspurger
|
||
+ Pin numpydoc to 0.8.0 (fix double autoescape) (:pr:`5961`) Gil
|
||
Forsyth
|
||
+ Register deterministic tokenization for range objects
|
||
(:pr:`5947`) James Bourbeau
|
||
+ Unpin msgpack in CI (:pr:`5930`) JAmes Bourbeau
|
||
+ Ensure dot results are placed in unique files. (:pr:`5937`)
|
||
Elliott Sales de Andrade
|
||
+ Add remaining optional dependencies to Travis 3.8 CI build
|
||
environment (:pr:`5920`) James Bourbeau
|
||
* DataFrame
|
||
+ Skip parquet getitem optimization for some keys (:pr:`5917`) Tom
|
||
Augspurger
|
||
+ Add ignore_index argument to rearrange_by_column code path
|
||
(:pr:`5973`) Richard J Zamora
|
||
+ Add DataFrame and Series memory_usage_per_partition methods
|
||
(:pr:`5971`) James Bourbeau
|
||
+ xfail test_describe when using Pandas 0.24.2 (:pr:`5948`) James
|
||
Bourbeau
|
||
+ Implement dask.dataframe.to_numeric (:pr:`5929`) Julia Signell
|
||
+ Add new error message content when columns are in a different
|
||
order (:pr:`5927`) Julia Signell
|
||
+ Use shallow copy for assign operations when possible
|
||
(:pr:`5740`) Richard J Zamora
|
||
* Documentation
|
||
+ Changed above to below in dask.array.triu docs (:pr:`5984`)
|
||
Henrik Andersson
|
||
+ Array slicing: fix typo in slice_with_int_dask_array error
|
||
message (:pr:`5981`) Gabe Joseph
|
||
+ Grammar and formatting updates to docstrings (:pr:`5963`) James
|
||
Lamb
|
||
+ Update develop doc with conda option (:pr:`5939`) Ray Bell
|
||
+ Update title of DataFrame extension docs (:pr:`5954`) James
|
||
Bourbeau
|
||
+ Fixed typos in documentation (:pr:`5962`) James Lamb
|
||
+ Add original class or module as a kwarg on _bind_* methods
|
||
(:pr:`5946`) Julia Signell
|
||
+ Add collect list example (:pr:`5938`) Ray Bell
|
||
+ Update optimization doc for python 3 (:pr:`5926`) Julia Signell
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Feb 22 18:54:54 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- specfile:
|
||
* require pandas >= 0.23
|
||
|
||
- update to version 2.11.0:
|
||
* Array
|
||
+ Cache result of Array.shape (:pr:`5916`) Bruce Merry
|
||
+ Improve accuracy of estimate_graph_size for rechunk (:pr:`5907`)
|
||
Bruce Merry
|
||
+ Skip rechunk steps that do not alter chunking (:pr:`5909`) Bruce
|
||
Merry
|
||
+ Support dtype and other kwargs in coarsen (:pr:`5903`) Matthew
|
||
Rocklin
|
||
+ Push chunk override from map_blocks into blockwise (:pr:`5895`)
|
||
Bruce Merry
|
||
+ Avoid using rewrite_blockwise for a singleton (:pr:`5890`) Bruce
|
||
Merry
|
||
+ Optimize slices_from_chunks (:pr:`5891`) Bruce Merry
|
||
+ Avoid unnecessary __getitem__ in block() when chunks have
|
||
correct dimensionality (:pr:`5884`) Thomas Robitaille
|
||
* Bag
|
||
+ Add include_path option for dask.bag.read_text (:pr:`5836`)
|
||
Yifan Gu
|
||
+ Fixes ValueError in delayed execution of bagged NumPy array
|
||
(:pr:`5828`) Surya Avala
|
||
* Core
|
||
+ CI: Pin msgpack (:pr:`5923`) Tom Augspurger
|
||
+ Rename test_inner to test_outer (:pr:`5922`) Shiva Raisinghani
|
||
+ quote should quote dicts too (:pr:`5905`) Bruce Merry
|
||
+ Register a normalizer for literal (:pr:`5898`) Bruce Merry
|
||
+ Improve layer name synthesis for non-HLGs (:pr:`5888`) Bruce
|
||
Merry
|
||
+ Replace flake8 pre-commit-hook with upstream (:pr:`5892`) Julia
|
||
Signell
|
||
+ Call pip as a module to avoid warnings (:pr:`5861`) Cyril
|
||
Shcherbin
|
||
+ Close ThreadPool at exit (:pr:`5852`) Tom Augspurger
|
||
+ Remove dask.dataframe import in tokenization code (:pr:`5855`)
|
||
James Bourbeau
|
||
* DataFrame
|
||
+ Require pandas>=0.23 (:pr:`5883`) Tom Augspurger
|
||
+ Remove lambda from dataframe aggregation (:pr:`5901`) Matthew
|
||
Rocklin
|
||
+ Fix exception chaining in dataframe/__init__.py (:pr:`5882`) Ram
|
||
Rachum
|
||
+ Add support for reductions on empty dataframes (:pr:`5804`)
|
||
Shiva Raisinghani
|
||
+ Expose sort= argument for groupby (:pr:`5801`) Richard J Zamora
|
||
+ Add df.empty property (:pr:`5711`) rockwellw
|
||
+ Use parquet read speed-ups from
|
||
fastparquet.api.paths_to_cats. (:pr:`5821`) Igor Gotlibovych
|
||
* Documentation
|
||
+ Deprecate doc_wraps (:pr:`5912`) Tom Augspurger
|
||
+ Update array internal design docs for HighLevelGraph era
|
||
(:pr:`5889`) Bruce Merry
|
||
+ Move over dashboard connection docs (:pr:`5877`) Matthew Rocklin
|
||
+ Move prometheus docs from distributed.dask.org (:pr:`5876`)
|
||
Matthew Rocklin
|
||
+ Removing duplicated DO block at the end (:pr:`5878`) K.-Michael
|
||
Aye
|
||
+ map_blocks see also (:pr:`5874`) Tom Augspurger
|
||
+ More derived from (:pr:`5871`) Julia Signell
|
||
+ Fix typo (:pr:`5866`) Yetunde Dada
|
||
+ Fix typo in cloud.rst (:pr:`5860`) Andrew Thomas
|
||
+ Add note pointing to code of conduct and diversity statement
|
||
(:pr:`5844`) Matthew Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Feb 8 21:45:22 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.10.1:
|
||
* Fix Pandas 1.0 version comparison (:pr:`5851`) Tom Augspurger
|
||
* Fix typo in distributed diagnostics documentation (:pr:`5841`)
|
||
Gerrit Holl
|
||
|
||
- changes from version 2.10.0:
|
||
* Support for pandas 1.0's new BooleanDtype and StringDtype
|
||
(:pr:`5815`) Tom Augspurger
|
||
* Compatibility with pandas 1.0's API breaking changes and
|
||
deprecations (:pr:`5792`) Tom Augspurger
|
||
* Fixed non-deterministic tokenization of some extension-array
|
||
backed pandas objects (:pr:`5813`) Tom Augspurger
|
||
* Fixed handling of dataclass class objects in collections
|
||
(:pr:`5812`) Matteo De Wint
|
||
* Fixed resampling with tz-aware dates when one of the endpoints
|
||
fell in a non-existent time (:pr:`5807`) dfonnegra
|
||
* Delay initial Zarr dataset creation until the computation occurs
|
||
(:pr:`5797`) Chris Roat
|
||
* Use parquet dataset statistics in more cases with the pyarrow
|
||
engine (:pr:`5799`) Richard J Zamora
|
||
* Fixed exception in groupby.std() when some of the keys were large
|
||
integers (:pr:`5737`) H. Thomson Comer
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Jan 18 19:16:38 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.9.2:
|
||
* Array
|
||
+ Unify chunks in broadcast_arrays (:pr:`5765`) Matthew Rocklin
|
||
* Core
|
||
+ xfail CSV encoding tests (:pr:`5791`) Tom Augspurger
|
||
+ Update order to handle empty dask graph (:pr:`5789`) James
|
||
Bourbeau
|
||
+ Redo dask.order.order (:pr:`5646`) Erik Welch
|
||
* DataFrame
|
||
+ Add transparent compression for on-disk shuffle with partd
|
||
(:pr:`5786`) Christian Wesp
|
||
+ Fix repr for empty dataframes (:pr:`5781`) Shiva Raisinghani
|
||
+ Pandas 1.0.0RC0 compat (:pr:`5784`) Tom Augspurger
|
||
+ Remove buggy assertions (:pr:`5783`) Tom Augspurger
|
||
+ Pandas 1.0 compat (:pr:`5782`) Tom Augspurger
|
||
+ Fix bug in pyarrow-based read_parquet on partitioned datasets
|
||
(:pr:`5777`) Richard J Zamora
|
||
+ Compat for pandas 1.0 (:pr:`5779`) Tom Augspurger
|
||
+ Fix groupby/mean error with with categorical index (:pr:`5776`)
|
||
Richard J Zamora
|
||
+ Support empty partitions when performing cumulative aggregation
|
||
(:pr:`5730`) Matthew Rocklin
|
||
+ set_index accepts single-item unnested list (:pr:`5760`) Wes
|
||
Roach
|
||
+ Fixed partitioning in set index for ordered Categorical
|
||
(:pr:`5715`) Tom Augspurger
|
||
* Documentation
|
||
+ Note additional use case for normalize_token.register
|
||
(:pr:`5766`) Thomas A Caswell
|
||
+ Update bag repartition docstring (:pr:`5772`) Timost
|
||
+ Small typos (:pr:`5771`) Maarten Breddels
|
||
+ Fix typo in Task Expectations docs (:pr:`5767`) James Bourbeau
|
||
+ Add docs section on task expectations to graph page (:pr:`5764`)
|
||
Devin Petersohn
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Jan 6 05:05:16 UTC 2020 - Arun Persaud <arun@gmx.de>
|
||
|
||
- specfile:
|
||
* update copyright year
|
||
|
||
- update to version 2.9.1:
|
||
* Array
|
||
+ Support Array.view with dtype=None (:pr:`5736`) Anderson
|
||
Banihirwe
|
||
+ Add dask.array.nanmedian (:pr:`5684`) Deepak Cherian
|
||
* Core
|
||
+ xfail test_temporary_directory on Python 3.8 (:pr:`5734`) James
|
||
Bourbeau
|
||
+ Add support for Python 3.8 (:pr:`5603`) James Bourbeau
|
||
+ Use id to dedupe constants in rewrite_blockwise (:pr:`5696`) Jim
|
||
Crist
|
||
* DataFrame
|
||
+ Raise error when converting a dask dataframe scalar to a boolean
|
||
(:pr:`5743`) James Bourbeau
|
||
+ Ensure dataframe groupby-variance is greater than zero
|
||
(:pr:`5728`) Matthew Rocklin
|
||
+ Fix DataFrame.__iter__ (:pr:`5719`) Tom Augspurger
|
||
+ Support Parquet filters in disjunctive normal form, like PyArrow
|
||
(:pr:`5656`) Matteo De Wint
|
||
+ Auto-detect categorical columns in ArrowEngine-based
|
||
read_parquet (:pr:`5690`) Richard J Zamora
|
||
+ Skip parquet getitem optimization tests if no engine found
|
||
(:pr:`5697`) James Bourbeau
|
||
+ Fix independent optimization of parquet-getitem (:pr:`5613`) Tom
|
||
Augspurger
|
||
* Documentation
|
||
+ Update helm config doc (:pr:`5750`) Ray Bell
|
||
+ Link to examples.dask.org in several places (:pr:`5733`) Tom
|
||
Augspurger
|
||
+ Add missing " in performance report example (:pr:`5724`) James
|
||
Bourbeau
|
||
+ Resolve several documentation build warnings (:pr:`5685`) James
|
||
Bourbeau
|
||
+ add info on performance_report (:pr:`5713`) Ben Zaitlen
|
||
+ Add more docs disclaimers (:pr:`5710`) Julia Signell
|
||
+ Fix simple typo: wihout -> without (:pr:`5708`) Tim Gates
|
||
+ Update numpydoc dependency (:pr:`5694`) James Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Dec 7 19:08:29 UTC 2019 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.9.0:
|
||
* Array
|
||
+ Fix da.std to work with NumPy arrays (:pr:`5681`) James Bourbeau
|
||
* Core
|
||
+ Register sizeof functions for Numba and RMM (:pr:`5668`) John A
|
||
Kirkham
|
||
+ Update meeting time (:pr:`5682`) Tom Augspurger
|
||
* DataFrame
|
||
+ Modify dd.DataFrame.drop to use shallow copy (:pr:`5675`)
|
||
Richard J Zamora
|
||
+ Fix bug in _get_md_row_groups (:pr:`5673`) Richard J Zamora
|
||
+ Close sqlalchemy engine after querying DB (:pr:`5629`) Krishan
|
||
Bhasin
|
||
+ Allow dd.map_partitions to not enforce meta (:pr:`5660`) Matthew
|
||
Rocklin
|
||
+ Generalize concat_unindexed_dataframes to support cudf-backend
|
||
(:pr:`5659`) Richard J Zamora
|
||
+ Add dataframe resample methods (:pr:`5636`) Ben Zaitlen
|
||
+ Compute length of dataframe as length of first column
|
||
(:pr:`5635`) Matthew Rocklin
|
||
* Documentation
|
||
+ Doc fixup (:pr:`5665`) James Bourbeau
|
||
+ Update doc build instructions (:pr:`5640`) James Bourbeau
|
||
+ Fix ADL link (:pr:`5639`) Ray Bell
|
||
+ Add documentation build (:pr:`5617`) James Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Nov 24 17:35:04 UTC 2019 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.8.1:
|
||
* Array
|
||
+ Use auto rechunking in da.rechunk if no value given (:pr:`5605`)
|
||
Matthew Rocklin
|
||
* Core
|
||
+ Add simple action to activate GH actions (:pr:`5619`) James
|
||
Bourbeau
|
||
* DataFrame
|
||
+ Fix "file_path_0" bug in aggregate_row_groups (:pr:`5627`)
|
||
Richard J Zamora
|
||
+ Add chunksize argument to read_parquet (:pr:`5607`) Richard J
|
||
Zamora
|
||
+ Change test_repartition_npartitions to support arch64
|
||
architecture (:pr:`5620`) ossdev07
|
||
+ Categories lost after groupby + agg (:pr:`5423`) Oliver Hofkens
|
||
+ Fixed relative path issue with parquet metadata file
|
||
(:pr:`5608`) Nuno Gomes Silva
|
||
+ Enable gpu-backed covariance/correlation in dataframes
|
||
(:pr:`5597`) Richard J Zamora
|
||
* Documentation
|
||
+ Fix institutional faq and unknown doc warnings (:pr:`5616`)
|
||
James Bourbeau
|
||
+ Add doc for some utils (:pr:`5609`) Tom Augspurger
|
||
+ Removes html_extra_path (:pr:`5614`) James Bourbeau
|
||
+ Fixed See Also referencence (:pr:`5612`) Tom Augspurger
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Nov 16 17:53:12 UTC 2019 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 2.8.0:
|
||
* Array
|
||
+ Implement complete dask.array.tile function (:pr:`5574`) Bouwe
|
||
Andela
|
||
+ Add median along an axis with automatic rechunking (:pr:`5575`)
|
||
Matthew Rocklin
|
||
+ Allow da.asarray to chunk inputs (:pr:`5586`) Matthew Rocklin
|
||
* Bag
|
||
+ Use key_split in Bag name (:pr:`5571`) Matthew Rocklin
|
||
* Core
|
||
+ Switch Doctests to Py3.7 (:pr:`5573`) Ryan Nazareth
|
||
+ Relax get_colors test to adapt to new Bokeh release (:pr:`5576`)
|
||
Matthew Rocklin
|
||
+ Add dask.blockwise.fuse_roots optimization (:pr:`5451`) Matthew
|
||
Rocklin
|
||
+ Add sizeof implementation for small dicts (:pr:`5578`) Matthew
|
||
Rocklin
|
||
+ Update fsspec, gcsfs, s3fs (:pr:`5588`) Tom Augspurger
|
||
* DataFrame
|
||
+ Add dropna argument to groupby (:pr:`5579`) Richard J Zamora
|
||
+ Revert "Remove import of dask_cudf, which is now a part of cudf
|
||
(:pr:`5568`)" (:pr:`5590`) Matthew Rocklin
|
||
* Documentation
|
||
+ Add best practice for dask.compute function (:pr:`5583`) Matthew
|
||
Rocklin
|
||
+ Create FUNDING.yml (:pr:`5587`) Gina Helfrich
|
||
+ Add screencast for coordination primitives (:pr:`5593`) Matthew
|
||
Rocklin
|
||
+ Move funding to .github repo (:pr:`5589`) Tom Augspurger
|
||
+ Update calendar link (:pr:`5569`) Tom Augspurger
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Nov 11 18:24:07 UTC 2019 - Todd R <toddrme2178@gmail.com>
|
||
|
||
- Update to 2.7.0
|
||
+ Array
|
||
* Reuse code for assert_eq util method
|
||
* Update da.array to always return a dask array
|
||
* Skip transpose on trivial inputs
|
||
* Avoid NumPy scalar string representation in tokenize
|
||
* Remove unnecessary tiledb shape constraint
|
||
* Removes bytes from sparse array HTML repr
|
||
+ Core
|
||
* Drop Python 3.5
|
||
* Update the use of fixtures in distributed tests
|
||
* Changed deprecated bokeh-port to dashboard-address
|
||
* Avoid updating with identical dicts in ensure_dict
|
||
* Test Upstream
|
||
* Accelerate reverse_dict
|
||
* Update test_imports.sh
|
||
* Support cgroups limits on cpu count in multiprocess and threaded schedulers
|
||
* Update minimum pyarrow version on CI
|
||
* Make cloudpickle optional
|
||
+ DataFrame
|
||
* Add an example of index_col usage
|
||
* Explicitly use iloc for row indexing
|
||
* Accept dask arrays on columns assignemnt
|
||
* Implement unique and value_counts for SeriesGroupBy
|
||
* Add sizeof definition for pyarrow tables and columns
|
||
* Enable row-group task partitioning in pyarrow-based read_parquet
|
||
* Removes npartitions='auto' from dd.merge docstring
|
||
* Apply enforce error message shows non-overlapping columns.
|
||
* Optimize meta_nonempty for repetitive dtypes
|
||
* Remove import of dask_cudf, which is now a part of cudf
|
||
+ Documentation
|
||
* Make capitalization more consistent in FAQ docs
|
||
* Add CONTRIBUTING.md
|
||
* Document optional dependencies
|
||
* Update helm chart docs to reflect new chart repo
|
||
* Add Resampler to API docs
|
||
* Fix typo in read_sql_table
|
||
* Add adaptive deployments screencast
|
||
- Update to 2.6.0
|
||
+ Core
|
||
* Call ``ensure_dict`` on graphs before entering ``toolz.merge``
|
||
* Consolidating hash dispatch functions
|
||
+ DataFrame
|
||
* Support Python 3.5 in Parquet code
|
||
* Avoid identity check in ``warn_dtype_mismatch``
|
||
* Enable unused groupby tests
|
||
* Remove old parquet and bcolz dataframe optimizations
|
||
* Add getitem optimization for ``read_parquet``
|
||
* Use ``_constructor_sliced`` method to determine Series type
|
||
* Fix map(series) for unsorted base series index
|
||
* Fix ``KeyError`` with Groupby label
|
||
+ Documentation
|
||
* Use Zoom meeting instead of appear.in
|
||
* Added curated list of resources
|
||
* Update SSH docs to include ``SSHCluster``
|
||
* Update "Why Dask?" page
|
||
* Fix typos in docstrings
|
||
- Update to 2.5.2
|
||
+ Array
|
||
* Correct chunk size logic for asymmetric overlaps
|
||
* Make da.unify_chunks public API
|
||
+ DataFrame
|
||
* Fix dask.dataframe.fillna handling of Scalar object
|
||
+ Documentation
|
||
* Remove boxes in Spark comparison page
|
||
* Add latest presentations
|
||
* Update cloud documentation
|
||
- Update to 2.5.0
|
||
+ Core
|
||
* Add sentinel no_default to get_dependencies task
|
||
* Update fsspec version
|
||
* Remove PY2 checks
|
||
+ DataFrame
|
||
* Add option to not check meta in dd.from_delayed
|
||
* Fix test_timeseries_nulls_in_schema failures with pyarrow master
|
||
* Reduce read_metadata output size in pyarrow/parquet
|
||
* Test numeric edge case for repartition with npartitions.
|
||
* Unxfail pandas-datareader test
|
||
* Add DataFrame.pop implementation
|
||
* Enable merge/set_index for cudf-based dataframes with cupy ``values``
|
||
* drop_duplicates support for positional subset parameter
|
||
+ Documentation
|
||
* Add screencasts to array, bag, dataframe, delayed, futures and setup
|
||
* Fix delimeter parsing documentation
|
||
* Update overview image
|
||
- Update to 2.4.0
|
||
+ Array
|
||
* Adds explicit ``h5py.File`` mode
|
||
* Provides method to compute unknown array chunks sizes
|
||
* Ignore runtime warning in Array ``compute_meta``
|
||
* Add ``_meta`` to ``Array.__dask_postpersist__``
|
||
* Fixup ``da.asarray`` and ``da.asanyarray`` for datetime64 dtype and xarray objects
|
||
* Add shape implementation
|
||
* Add chunktype to array text repr
|
||
* Array.random.choice: handle array-like non-arrays
|
||
+ Core
|
||
* Remove deprecated code
|
||
* Fix ``funcname`` when vectorized func has no ``__name__``
|
||
* Truncate ``funcname`` to avoid long key names
|
||
* Add support for ``numpy.vectorize`` in ``funcname``
|
||
* Fixed HDFS upstream test
|
||
* Support numbers and None in ``parse_bytes``/``timedelta``
|
||
* Fix tokenizing of subindexes on memmapped numpy arrays
|
||
* Upstream fixups
|
||
+ DataFrame
|
||
* Allow pandas to cast type of statistics
|
||
* Preserve index dtype after applying ``dd.pivot_table``
|
||
* Implement explode for Series and DataFrame
|
||
* ``set_index`` on categorical fails with less categories than partitions
|
||
* Support output to a single CSV file
|
||
* Add ``groupby().transform()``
|
||
* Adding filter kwarg to pyarrow dataset call
|
||
* Implement and check compression defaults for parquet
|
||
* Pass sqlalchemy params to delayed objects
|
||
* Fixing schema handling in arrow-parquet
|
||
* Add support for DF and Series ``groupby().idxmin/max()``
|
||
* Add correlation calculation and add test
|
||
+ Documentation
|
||
* Numpy docstring standard has moved
|
||
* Reference correct NumPy array name
|
||
* Minor edits to Array chunk documentation
|
||
* Add methods to API docs
|
||
* Add namespacing to configuration example
|
||
* Add get_task_stream and profile to the diagnostics page
|
||
* Add best practice to load data with Dask
|
||
* Update ``institutional-faq.rst``
|
||
* Add threads and processes note to the best practices
|
||
* Update cuDF links
|
||
* Fixed small typo with parentheses placement
|
||
* Update link in reshape docstring
|
||
- Update to 2.3.0
|
||
+ Array
|
||
* Raise exception when ``from_array`` is given a dask array
|
||
* Avoid adjusting gufunc's meta dtype twice
|
||
* Add ``meta=`` keyword to map_blocks and add test with sparse
|
||
* Add rollaxis and moveaxis
|
||
* Always increment old chunk index
|
||
* Shuffle dask array
|
||
* Fix ordering when indexing a dask array with a bool dask array
|
||
+ Bag
|
||
* Add workaround for memory leaks in bag generators
|
||
+ Core
|
||
* Set strict xfail option
|
||
* test-upstream
|
||
* Fixed HDFS CI failure
|
||
* Error nicely if no file size inferred
|
||
* A few changes to ``config.set``
|
||
* Fixup black string normalization
|
||
* Pin NumPy in windows tests
|
||
* Ensure parquet tests are skipped if fastparquet and pyarrow not installed
|
||
* Add fsspec to readthedocs
|
||
* Bump NumPy and Pandas to 1.17 and 0.25 in CI test
|
||
+ DataFrame
|
||
* Fix ``DataFrame.query`` docstring (incorrect numexpr API)
|
||
* Parquet metadata-handling improvements
|
||
* Improve messaging around sorted parquet columns for index
|
||
* Add ``rearrange_by_divisions`` and ``set_index`` support for cudf
|
||
* Fix ``groupby.std()`` with integer colum names
|
||
* Add ``Series.__iter__``
|
||
* Generalize ``hash_pandas_object`` to work for non-pandas backends
|
||
* Add rolling cov
|
||
* Add columns argument in drop function
|
||
+ Documentation
|
||
* Update institutional FAQ doc
|
||
* Add draft of institutional FAQ
|
||
* Make boxes for dask-spark page
|
||
* Add motivation for shuffle docs
|
||
* Fix links and API entries for best-practices
|
||
* Remove "bytes" (internal data ingestion) doc page
|
||
* Redirect from our local distributed page to distributed.dask.org
|
||
* Cleanup API page
|
||
* Remove excess endlines from install docs
|
||
* Remove item list in phases of computation doc
|
||
* Remove custom graphs from the TOC sidebar
|
||
* Remove experimental status of custom collections
|
||
* Adds table of contents to Why Dask?
|
||
* Moves bag overview to top-level bag page
|
||
* Remove use-cases in favor of stories.dask.org
|
||
* Removes redundant TOC information in index.rst
|
||
* Elevate dashboard in distributed diagnostics documentation
|
||
* Updates "add" layer in HLG docs example
|
||
* Update GUFunc documentation
|
||
- Update to 2.2.0
|
||
+ Array
|
||
* Use da.from_array(..., asarray=False) if input follows NEP-18
|
||
* Add missing attributes to from_array documentation
|
||
* Fix meta computation for some reduction functions
|
||
* Raise informative error in to_zarr if unknown chunks
|
||
* Remove invalid pad tests
|
||
* Ignore NumPy warnings in compute_meta
|
||
* Fix kurtosis calc for single dimension input array
|
||
* Support Numpy 1.17 in tests
|
||
+ Bag
|
||
* Supply pool to bag test to resolve intermittent failure
|
||
+ Core
|
||
* Base dask on fsspec
|
||
* Various upstream compatibility fixes
|
||
* Make distributed tests optional again.
|
||
* Fix HDFS in dask
|
||
* Ignore some more invalid value warnings.
|
||
+ DataFrame
|
||
* Fix pd.MultiIndex size estimate
|
||
* Generalizing has_known_categories
|
||
* Refactor Parquet engine
|
||
* Add divide method to series and dataframe
|
||
* fix flaky partd test
|
||
* Adjust is_dataframe_like to adjust for value_counts change
|
||
* Generalize rolling windows to support non-Pandas dataframes
|
||
* Avoid unnecessary aggregation in pivot_table
|
||
* Add column names to apply_and_enforce error message
|
||
* Add schema keyword argument to to_parquet
|
||
* Remove recursion error in accessors
|
||
* Allow fastparquet to handle gather_statistics=False for file lists
|
||
+ Documentation
|
||
* Adds NumFOCUS badge to the README
|
||
* Update developer docs
|
||
* Document DataFrame.set_index computataion behavior
|
||
* Use pip install . instead of calling setup.py
|
||
* Close user survey
|
||
* Fix Google Calendar meeting link
|
||
* Add docker image customization example
|
||
* Update remote-data-services after fsspec
|
||
* Fix typo in spark.rstZ
|
||
* Update setup/python docs for async/await API
|
||
* Update Local Storage HPC documentation
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Jul 23 00:23:55 UTC 2019 - Todd R <toddrme2178@gmail.com>
|
||
|
||
- Update to 2.1.0
|
||
+ Array
|
||
* Add ``recompute=`` keyword to ``svd_compressed`` for lower-memory use
|
||
* Change ``__array_function__`` implementation for backwards compatibility
|
||
* Added ``dtype`` and ``shape`` kwargs to ``apply_along_axis``
|
||
* Fix reduction with empty tuple axis
|
||
* Drop size 0 arrays in ``stack``
|
||
+ Core
|
||
* Removes index keyword from pandas ``to_parquet`` call
|
||
* Fixes upstream dev CI build installation
|
||
* Ensure scalar arrays are not rendered to SVG
|
||
* Environment creation overhaul
|
||
* s3fs, moto compatibility
|
||
* pytest 5.0 compat
|
||
+ DataFrame
|
||
* Fix ``compute_meta`` recursion in blockwise
|
||
* Remove hard dependency on pandas in ``get_dummies``
|
||
* Check dtypes unchanged when using ``DataFrame.assign``
|
||
* Fix cumulative functions on tables with more than 1 partition
|
||
* Handle non-divisible sizes in repartition
|
||
* Handles timestamp and ``preserve_index`` changes in pyarrow
|
||
* Fix undefined ``meta`` for ``str.split(expand=False)``
|
||
* Removed checks used for debugging ``merge_asof``
|
||
* Don't use type when getting accessor in dataframes
|
||
* Add ``melt`` as a method of Dask DataFrame
|
||
* Adds path-like support to ``to_hdf``
|
||
+ Documentation
|
||
* Point to latest K8s setup article in JupyterHub docs
|
||
* Changes vizualize to visualize
|
||
* Fix ``from_sequence`` typo in delayed best practices
|
||
* Add user survey link to docs
|
||
* Fixes typo in optimization docs
|
||
* Update community meeting information
|
||
- Update to 2.0.0
|
||
+ Array
|
||
* Support automatic chunking in da.indices
|
||
* Err if there are no arrays to stack
|
||
* Asymmetrical Array Overlap
|
||
* Dispatch concatenate where possible within dask array
|
||
* Fix tokenization of memmapped numpy arrays on different part of same file
|
||
* Preserve NumPy condition in da.asarray to preserve output shape
|
||
* Expand foo_like_safe usage
|
||
* Defer order/casting einsum parameters to NumPy implementation
|
||
* Remove numpy warning in moment calculation
|
||
* Fix meta_from_array to support Xarray test suite
|
||
* Cache chunk boundaries for integer slicing
|
||
* Drop size 0 arrays in concatenate
|
||
* Raise ValueError if concatenate is given no arrays
|
||
* Promote types in `concatenate` using `_meta`
|
||
* Add chunk type to html repr in Dask array
|
||
* Add Dask Array._meta attribute
|
||
> Fix _meta slicing of flexible types
|
||
> Minor meta construction cleanup in concatenate
|
||
> Further relax Array meta checks for Xarray
|
||
> Support meta= keyword in da.from_delayed
|
||
> Concatenate meta along axis
|
||
> Use meta in stack
|
||
> Move blockwise_meta to more general compute_meta function
|
||
* Alias .partitions to .blocks attribute of dask arrays
|
||
* Drop outdated `numpy_compat` functions
|
||
* Allow da.eye to support arbitrary chunking sizes with chunks='auto'
|
||
* Fix CI warnings in dask.array tests
|
||
* Make map_blocks work with drop_axis + block_info
|
||
* Add SVG image and table in Array._repr_html_
|
||
* ufunc: avoid __array_wrap__ in favor of __array_function__
|
||
* Ensure trivial padding returns the original array
|
||
* Test ``da.block`` with 0-size arrays
|
||
+ Core
|
||
* **Drop Python 2.7**
|
||
* Quiet dependency installs in CI
|
||
* Raise on warnings in tests
|
||
* Add a diagnostics extra to setup.py (includes bokeh)
|
||
* Add newline delimter keyword to OpenFile
|
||
* Overload HighLevelGraphs values method
|
||
* Add __await__ method to Dask collections
|
||
* Also ignore AttributeErrors which may occur if snappy (not python-snappy) is installed
|
||
* Canonicalize key names in config.rename
|
||
* Bump minimum partd to 0.3.10
|
||
* Catch async def SyntaxError
|
||
* catch IOError in ensure_file
|
||
* Cleanup CI warnings
|
||
* Move distributed's parse and format functions to dask.utils
|
||
* Apply black formatting
|
||
* Package license file in wheels
|
||
+ DataFrame
|
||
* Add an optional partition_size parameter to repartition
|
||
* merge_asof and prefix_reduction
|
||
* Allow dataframes to be indexed by dask arrays
|
||
* Avoid deprecated message parameter in pytest.raises
|
||
* Update test_to_records to test with lengths argument(:pr:`4515`) `asmith26`_
|
||
* Remove pandas pinning in Dataframe accessors
|
||
* Fix correlation of series with same names
|
||
* Map Dask Series to Dask Series
|
||
* Warn in dd.merge on dtype warning
|
||
* Add groupby Covariance/Correlation
|
||
* keep index name with to_datetime
|
||
* Add Parallel variance computation for dataframes
|
||
* Add divmod implementation to arrays and dataframes
|
||
* Add documentation for dataframe reshape methods
|
||
* Avoid use of pandas.compat
|
||
* Added accessor registration for Series, DataFrame, and Index
|
||
* Add read_function keyword to read_json
|
||
* Provide full type name in check_meta
|
||
* Correctly estimate bytes per row in read_sql_table
|
||
* Adding support of non-numeric data to describe()
|
||
* Scalars for extension dtypes.
|
||
* Call head before compute in dd.from_delayed
|
||
* Add support for rolling operations with larger window that partition size in DataFrames with Time-based index
|
||
* Update groupby-apply doc with warning
|
||
* Change groupby-ness tests in `_maybe_slice`
|
||
* Add master best practices document
|
||
* Add document for how Dask works with GPUs
|
||
* Add cli API docs
|
||
* Ensure concat output has coherent dtypes
|
||
* Fixes pandas_datareader dependencies installation
|
||
* Accept pathlib.Path as pattern in read_hdf
|
||
+ Documentation
|
||
* Move CLI API docs to relavant pages
|
||
* Add to_datetime function to dataframe API docs `Matthew Rocklin`_
|
||
* Add documentation entry for dask.array.ma.average
|
||
* Add bag.read_avro to bag API docs
|
||
* Fix typo
|
||
* Docs: Drop support for Python 2.7
|
||
* Remove requirement to modify changelog
|
||
* Add documentation about meta column order
|
||
* Add documentation note in DataFrame.shift
|
||
* Docs: Fix typo
|
||
* Put do/don't into boxes for delayed best practice docs
|
||
* Doc fixups
|
||
* Add quansight to paid support doc section
|
||
* Add document for custom startup
|
||
* Allow `utils.derive_from` to accept functions, apply across array
|
||
* Add "Avoid Large Partitions" section to best practices
|
||
* Update URL for joblib to new website hosting their doc (:pr:`4816`) `Christian Hudon`_
|
||
|
||
-------------------------------------------------------------------
|
||
Tue May 21 11:48:23 UTC 2019 - pgajdos@suse.com
|
||
|
||
- version update to 1.2.2
|
||
+ Array
|
||
* Clarify regions kwarg to array.store (:pr:`4759`) `Martin Durant`_
|
||
* Add dtype= parameter to da.random.randint (:pr:`4753`) `Matthew Rocklin`_
|
||
* Use "row major" rather than "C order" in docstring (:pr:`4452`) `@asmith26`_
|
||
* Normalize Xarray datasets to Dask arrays (:pr:`4756`) `Matthew Rocklin`_
|
||
* Remove normed keyword in da.histogram (:pr:`4755`) `Matthew Rocklin`_
|
||
+ Bag
|
||
* Add key argument to Bag.distinct (:pr:`4423`) `Daniel Severo`_
|
||
+ Core
|
||
* Add core dask config file (:pr:`4774`) `Matthew Rocklin`_
|
||
* Add core dask config file to MANIFEST.in (:pr:`4780`) `James Bourbeau`_
|
||
* Enabling glob with HTTP file-system (:pr:`3926`) `Martin Durant`_
|
||
* HTTPFile.seek with whence=1 (:pr:`4751`) `Martin Durant`_
|
||
* Remove config key normalization (:pr:`4742`) `Jim Crist`_
|
||
+ DataFrame
|
||
* Remove explicit references to Pandas in dask.dataframe.groupby (:pr:`4778`) `Matthew Rocklin`_
|
||
* Add support for group_keys kwarg in DataFrame.groupby() (:pr:`4771`) `Brian Chu`_
|
||
* Describe doc (:pr:`4762`) `Martin Durant`_
|
||
* Remove explicit pandas check in cumulative aggregations (:pr:`4765`) `Nick Becker`_
|
||
* Added meta for read_json and test (:pr:`4588`) `Abhinav Ralhan`_
|
||
* Add test for dtype casting (:pr:`4760`) `Martin Durant`_
|
||
* Document alignment in map_partitions (:pr:`4757`) `Jim Crist`_
|
||
* Implement Series.str.split(expand=True) (:pr:`4744`) `Matthew Rocklin`_
|
||
+ Documentation
|
||
* Tweaks to develop.rst from trying to run tests (:pr:`4772`) `Christian Hudon`_
|
||
* Add document describing phases of computation (:pr:`4766`) `Matthew Rocklin`_
|
||
* Point users to Dask-Yarn from spark documentation (:pr:`4770`) `Matthew Rocklin`_
|
||
* Update images in delayed doc to remove labels (:pr:`4768`) `Martin Durant`_
|
||
* Explain intermediate storage for dask arrays (:pr:`4025`) `John A Kirkham`_
|
||
* Specify bash code-block in array best practices (:pr:`4764`) `James Bourbeau`_
|
||
* Add array best practices doc (:pr:`4705`) `Matthew Rocklin`_
|
||
* Update optimization docs now that cull is not automatic (:pr:`4752`) `Matthew Rocklin`_
|
||
- version update to 1.2.1
|
||
+ Array
|
||
* Fix map_blocks with block_info and broadcasting (:pr:`4737`) `Bruce Merry`_
|
||
* Make 'minlength' keyword argument optional in da.bincount (:pr:`4684`) `Genevieve Buckley`_
|
||
* Add support for map_blocks with no array arguments (:pr:`4713`) `Bruce Merry`_
|
||
* Add dask.array.trace (:pr:`4717`) `Danilo Horta`_
|
||
* Add sizeof support for cupy.ndarray (:pr:`4715`) `Peter Andreas Entschev`_
|
||
* Add name kwarg to from_zarr (:pr:`4663`) `Michael Eaton`_
|
||
* Add chunks='auto' to from_array (:pr:`4704`) `Matthew Rocklin`_
|
||
* Raise TypeError if dask array is given as shape for da.ones, zeros, empty or full (:pr:`4707`) `Genevieve Buckley`_
|
||
* Add TileDB backend (:pr:`4679`) `Isaiah Norton`_
|
||
+ Core
|
||
* Delay long list arguments (:pr:`4735`) `Matthew Rocklin`_
|
||
* Bump to numpy >= 1.13, pandas >= 0.21.0 (:pr:`4720`) `Jim Crist`_
|
||
* Remove file "test" (:pr:`4710`) `James Bourbeau`_
|
||
* Reenable development build, uses upstream libraries (:pr:`4696`) `Peter Andreas Entschev`_
|
||
* Remove assertion in HighLevelGraph constructor (:pr:`4699`) `Matthew Rocklin`_
|
||
+ DataFrame
|
||
* Change cum-aggregation last-nonnull-value algorithm (:pr:`4736`) `Nick Becker`_
|
||
* Fixup series-groupby-apply (:pr:`4738`) `Jim Crist`_
|
||
* Refactor array.percentile and dataframe.quantile to use t-digest (:pr:`4677`) `Janne Vuorela`_
|
||
* Allow naive concatenation of sorted dataframes (:pr:`4725`) `Matthew Rocklin`_
|
||
* Fix perf issue in dd.Series.isin (:pr:`4727`) `Jim Crist`_
|
||
* Remove hard pandas dependency for melt by using methodcaller (:pr:`4719`) `Nick Becker`_
|
||
* A few dataframe metadata fixes (:pr:`4695`) `Jim Crist`_
|
||
* Add Dataframe.replace (:pr:`4714`) `Matthew Rocklin`_
|
||
* Add 'threshold' parameter to pd.DataFrame.dropna (:pr:`4625`) `Nathan Matare`_
|
||
+ Documentation
|
||
* Add warning about derived docstrings early in the docstring (:pr:`4716`) `Matthew Rocklin`_
|
||
* Create dataframe best practices doc (:pr:`4703`) `Matthew Rocklin`_
|
||
* Uncomment dask_sphinx_theme (:pr:`4728`) `James Bourbeau`_
|
||
* Fix minor typo fix in a Queue/fire_and_forget example (:pr:`4709`) `Matthew Rocklin`_
|
||
* Update from_pandas docstring to match signature (:pr:`4698`) `James Bourbeau`_
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Apr 22 19:32:28 UTC 2019 - Todd R <toddrme2178@gmail.com>
|
||
|
||
- Update to version 1.2.0
|
||
+ Array
|
||
* Fixed mean() and moment() on sparse arrays
|
||
* Add test for NEP-18.
|
||
* Allow None to say "no chunking" in normalize_chunks
|
||
* Fix limit value in auto_chunks
|
||
+ Core
|
||
* Updated diagnostic bokeh test for compatibility with bokeh>=1.1.0
|
||
* Adjusts codecov's target/threshold, disable patch
|
||
* Always start with empty http buffer, not None
|
||
+ DataFrame
|
||
* Propagate index dtype and name when create dask dataframe from array
|
||
* Fix ordering of quantiles in describe
|
||
* Clean up and document rearrange_column_by_tasks
|
||
* Mark some parquet tests xfail
|
||
* Fix parquet breakages with arrow 0.13.0
|
||
* Allow sample to be False when reading CSV from a remote URL
|
||
* Fix timezone metadata inference on parquet load
|
||
* Use is_dataframe/index_like in dd.utils
|
||
* Add min_count parameter to groupby sum method
|
||
* Correct quantile to handle unsorted quantiles
|
||
+ Documentation
|
||
* Add delayed extra dependencies to install docs
|
||
- Update to version 1.1.5
|
||
+ Array
|
||
* Ensure that we use the dtype keyword in normalize_chunks
|
||
+ Core
|
||
* Use recursive glob in LocalFileSystem
|
||
* Avoid YAML deprecation
|
||
* Fix CI and add set -e
|
||
* Support builtin sequence types in dask.visualize
|
||
* unpack/repack orderedDict
|
||
* Add da.random.randint to API docs
|
||
* Add zarr to CI environment
|
||
* Enable codecov
|
||
+ DataFrame
|
||
* Support setting the index
|
||
* DataFrame.itertuples accepts index, name kwargs
|
||
* Support non-Pandas series in dd.Series.unique
|
||
* Replace use of explicit type check with ._is_partition_type predicate
|
||
* Remove additional pandas warnings in tests
|
||
* Check object for name/dtype attributes rather than type
|
||
* Fix comparison against pd.Series
|
||
* Fixing warning from setting categorical codes to floats
|
||
* Fix renaming on index to_frame method
|
||
* Fix divisions when joining two single-partition dataframes
|
||
* Warn if partitions overlap in compute_divisions
|
||
* Give informative meta= warning
|
||
* Add informative error message to Series.__getitem__
|
||
* Add clear exception message when using index or index_col in read_csv
|
||
+ Documentation
|
||
* Add documentation for custom groupby aggregations
|
||
* Docs dataframe joins
|
||
* Specify fork-based contributions
|
||
* correct to_parquet example in docs
|
||
* Update and secure several references
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Apr 9 10:06:13 UTC 2019 - pgajdos@suse.com
|
||
|
||
- do not require optional python2-sparse for testing, python-sparse
|
||
is going to be python3-only
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Mar 11 12:30:53 UTC 2019 - Tomáš Chvátal <tchvatal@suse.com>
|
||
|
||
- Update to 1.1.4:
|
||
* Various bugfixes in 1.1 branch
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Feb 20 11:19:16 UTC 2019 - Tomáš Chvátal <tchvatal@suse.com>
|
||
|
||
- Enable tests and switch to multibuild
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Feb 2 17:09:28 UTC 2019 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 1.1.1:
|
||
* Array
|
||
+ Add support for cupy.einsum (:pr:`4402`) Johnnie Gray
|
||
+ Provide byte size in chunks keyword (:pr:`4434`) Adam Beberg
|
||
+ Raise more informative error for histogram bins and range
|
||
(:pr:`4430`) James Bourbeau
|
||
* DataFrame
|
||
+ Lazily register more cudf functions and move to backends file
|
||
(:pr:`4396`) Matthew Rocklin
|
||
+ Fix ORC tests for pyarrow 0.12.0 (:pr:`4413`) Jim Crist
|
||
+ rearrange_by_column: ensure that shuffle arg defaults to 'disk'
|
||
if it's None in dask.config (:pr:`4414`) George Sakkis
|
||
+ Implement filters for _read_pyarrow (:pr:`4415`) George Sakkis
|
||
+ Avoid checking against types in is_dataframe_like (:pr:`4418`)
|
||
Matthew Rocklin
|
||
+ Pass username as 'user' when using pyarrow (:pr:`4438`) Roma
|
||
Sokolov
|
||
* Delayed
|
||
+ Fix DelayedAttr return value (:pr:`4440`) Matthew Rocklin
|
||
* Documentation
|
||
+ Use SVG for pipeline graphic (:pr:`4406`) John A Kirkham
|
||
+ Add doctest-modules to py.test documentation (:pr:`4427`) Daniel
|
||
Severo
|
||
* Core
|
||
+ Work around psutil 5.5.0 not allowing pickling Process objects
|
||
Dimplexion
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Jan 20 04:50:39 UTC 2019 - Arun Persaud <arun@gmx.de>
|
||
|
||
- specfile:
|
||
* update copyright year
|
||
|
||
- update to version 1.1.0:
|
||
* Array
|
||
+ Fix the average function when there is a masked array
|
||
(:pr:`4236`) Damien Garaud
|
||
+ Add allow_unknown_chunksizes to hstack and vstack (:pr:`4287`)
|
||
Paul Vecchio
|
||
+ Fix tensordot for 27+ dimensions (:pr:`4304`) Johnnie Gray
|
||
+ Fixed block_info with axes. (:pr:`4301`) Tom Augspurger
|
||
+ Use safe_wraps for matmul (:pr:`4346`) Mark Harfouche
|
||
+ Use chunks="auto" in array creation routines (:pr:`4354`)
|
||
Matthew Rocklin
|
||
+ Fix np.matmul in dask.array.Array.__array_ufunc__ (:pr:`4363`)
|
||
Stephan Hoyer
|
||
+ COMPAT: Re-enable multifield copy->view change (:pr:`4357`)
|
||
Diane Trout
|
||
+ Calling np.dtype on a delayed object works (:pr:`4387`) Jim
|
||
Crist
|
||
+ Rework normalize_array for numpy data (:pr:`4312`) Marco Neumann
|
||
* DataFrame
|
||
+ Add fill_value support for series comparisons (:pr:`4250`) James
|
||
Bourbeau
|
||
+ Add schema name in read_sql_table for empty tables (:pr:`4268`)
|
||
Mina Farid
|
||
+ Adjust check for bad chunks in map_blocks (:pr:`4308`) Tom
|
||
Augspurger
|
||
+ Add dask.dataframe.read_fwf (:pr:`4316`) @slnguyen
|
||
+ Use atop fusion in dask dataframe (:pr:`4229`) Matthew Rocklin
|
||
+ Use parallel_types(`) in from_pandas (:pr:`4331`) Matthew
|
||
Rocklin
|
||
+ Change DataFrame._repr_data to method (:pr:`4330`) Matthew
|
||
Rocklin
|
||
+ Install pyarrow fastparquet for Appveyor (:pr:`4338`) Gábor
|
||
Lipták
|
||
+ Remove explicit pandas checks and provide cudf lazy registration
|
||
(:pr:`4359`) Matthew Rocklin
|
||
+ Replace isinstance(..., pandas`) with is_dataframe_like
|
||
(:pr:`4375`) Matthew Rocklin
|
||
+ ENH: Support 3rd-party ExtensionArrays (:pr:`4379`) Tom
|
||
Augspurger
|
||
+ Pandas 0.24.0 compat (:pr:`4374`) Tom Augspurger
|
||
* Documentation
|
||
+ Fix link to 'map_blocks' function in array api docs (:pr:`4258`)
|
||
David Hoese
|
||
+ Add a paragraph on Dask-Yarn in the cloud docs (:pr:`4260`) Jim
|
||
Crist
|
||
+ Copy edit documentation (:pr:`4267), (:pr:`4263`), (:pr:`4262`),
|
||
(:pr:`4277`), (:pr:`4271`), (:pr:`4279), (:pr:`4265`),
|
||
(:pr:`4295`), (:pr:`4293`), (:pr:`4296`), (:pr:`4302`),
|
||
(:pr:`4306`), (:pr:`4318`), (:pr:`4314`), (:pr:`4309`),
|
||
(:pr:`4317`), (:pr:`4326`), (:pr:`4325`), (:pr:`4322`),
|
||
(:pr:`4332`), (:pr:`4333`), Miguel Farrajota
|
||
+ Fix typo in code example (:pr:`4272`) Daniel Li
|
||
+ Doc: Update array-api.rst (:pr:`4259`) (:pr:`4282`) Prabakaran
|
||
Kumaresshan
|
||
+ Update hpc doc (:pr:`4266`) Guillaume Eynard-Bontemps
|
||
+ Doc: Replace from_avro with read_avro in documents (:pr:`4313`)
|
||
Prabakaran Kumaresshan
|
||
+ Remove reference to "get" scheduler functions in docs
|
||
(:pr:`4350`) Matthew Rocklin
|
||
+ Fix typo in docstring (:pr:`4376`) Daniel Saxton
|
||
+ Added documentation for dask.dataframe.merge (:pr:`4382`)
|
||
Jendrik Jördening
|
||
* Core
|
||
+ Avoid recursion in dask.core.get (:pr:`4219`) Matthew Rocklin
|
||
+ Remove verbose flag from pytest setup.cfg (:pr:`4281`) Matthew
|
||
Rocklin
|
||
+ Support Pytest 4.0 by specifying marks explicitly (:pr:`4280`)
|
||
Takahiro Kojima
|
||
+ Add High Level Graphs (:pr:`4092`) Matthew Rocklin
|
||
+ Fix SerializableLock locked and acquire methods (:pr:`4294`)
|
||
Stephan Hoyer
|
||
+ Pin boto3 to earlier version in tests to avoid moto conflict
|
||
(:pr:`4276`) Martin Durant
|
||
+ Treat None as missing in config when updating (:pr:`4324`)
|
||
Matthew Rocklin
|
||
+ Update Appveyor to Python 3.6 (:pr:`4337`) Gábor Lipták
|
||
+ Use parse_bytes more liberally in dask.dataframe/bytes/bag
|
||
(:pr:`4339`) Matthew Rocklin
|
||
+ Add a better error message when cloudpickle is missing
|
||
(:pr:`4342`) Mark Harfouche
|
||
+ Support pool= keyword argument in threaded/multiprocessing get
|
||
functions (:pr:`4351`) Matthew Rocklin
|
||
+ Allow updates from arbitrary Mappings in config.update, not only
|
||
dicts. (:pr:`4356`) Stuart Berg
|
||
+ Move dask/array/top.py code to dask/blockwise.py (:pr:`4348`)
|
||
Matthew Rocklin
|
||
+ Add has_parallel_type (:pr:`4395`) Matthew Rocklin
|
||
+ CI: Update Appveyor (:pr:`4381`) Tom Augspurger
|
||
+ Ignore non-readable config files (:pr:`4388`) Jim Crist
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Dec 1 18:36:31 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 1.0.0:
|
||
* Array
|
||
+ Add nancumsum/nancumprod unit tests (:pr:`4215`) Guido Imperiale
|
||
* DataFrame
|
||
+ Add index to to_dask_dataframe docstring (:pr:`4232`) James
|
||
Bourbeau
|
||
+ Text and fix when appending categoricals with fastparquet
|
||
(:pr:`4245`) Martin Durant
|
||
+ Don't reread metadata when passing ParquetFile to read_parquet
|
||
(:pr:`4247`) Martin Durant
|
||
* Documentation
|
||
+ Copy edit documentation (:pr:`4222`) (:pr:`4224`) (:pr:`4228`)
|
||
(:pr:`4231`) (:pr:`4230`) (:pr:`4234`) (:pr:`4235`) (:pr:`4254`)
|
||
Miguel Farrajota
|
||
+ Updated doc for the new scheduler keyword (:pr:`4251`) @milesial
|
||
* Core
|
||
+ Avoid a few warnings (:pr:`4223`) Matthew Rocklin
|
||
+ Remove dask.store module (:pr:`4221`) Matthew Rocklin
|
||
+ Remove AUTHORS.md Jim Crist
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Nov 22 22:46:17 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.20.2:
|
||
* Array
|
||
+ Avoid fusing dependencies of atop reductions (:pr:`4207`)
|
||
Matthew Rocklin
|
||
* Dataframe
|
||
+ Improve memory footprint for dataframe correlation (:pr:`4193`)
|
||
Damien Garaud
|
||
+ Add empty DataFrame check to boundary_slice (:pr:`4212`) James
|
||
Bourbeau
|
||
* Documentation
|
||
+ Copy edit documentation (:pr:`4197`) (:pr:`4204`) (:pr:`4198`)
|
||
(:pr:`4199`) (:pr:`4200`) (:pr:`4202`) (:pr:`4209`) Miguel
|
||
Farrajota
|
||
+ Add stats module namespace (:pr:`4206`) James Bourbeau
|
||
+ Fix link in dataframe documentation (:pr:`4208`) James Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Nov 12 05:54:54 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.20.1:
|
||
* Array
|
||
+ Only allocate the result space in wrapped_pad_func (:pr:`4153`)
|
||
John A Kirkham
|
||
+ Generalize expand_pad_width to expand_pad_value (:pr:`4150`)
|
||
John A Kirkham
|
||
+ Test da.pad with 2D linear_ramp case (:pr:`4162`) John A Kirkham
|
||
+ Fix import for broadcast_to. (:pr:`4168`) samc0de
|
||
+ Rewrite Dask Array's pad to add only new chunks (:pr:`4152`)
|
||
John A Kirkham
|
||
+ Validate index inputs to atop (:pr:`4182`) Matthew Rocklin
|
||
* Core
|
||
+ Dask.config set and get normalize underscores and hyphens
|
||
(:pr:`4143`) James Bourbeau
|
||
+ Only subs on core collections, not subclasses (:pr:`4159`)
|
||
Matthew Rocklin
|
||
+ Add block_size=0 option to HTTPFileSystem. (:pr:`4171`) Martin
|
||
Durant
|
||
+ Add traverse support for dataclasses (:pr:`4165`) Armin Berres
|
||
+ Avoid optimization on sharedicts without dependencies
|
||
(:pr:`4181`) Matthew Rocklin
|
||
+ Update the pytest version for TravisCI (:pr:`4189`) Damien
|
||
Garaud
|
||
+ Use key_split rather than funcname in visualize names
|
||
(:pr:`4160`) Matthew Rocklin
|
||
* Dataframe
|
||
+ Add fix for DataFrame.__setitem__ for index (:pr:`4151`)
|
||
Anderson Banihirwe
|
||
+ Fix column choice when passing list of files to fastparquet
|
||
(:pr:`4174`) Martin Durant
|
||
+ Pass engine_kwargs from read_sql_table to sqlalchemy
|
||
(:pr:`4187`) Damien Garaud
|
||
* Documentation
|
||
+ Fix documentation in Delayed best practices example that
|
||
returned an empty list (:pr:`4147`) Jonathan Fraine
|
||
+ Copy edit documentation (:pr:`4164`) (:pr:`4175`) (:pr:`4185`)
|
||
(:pr:`4192`) (:pr:`4191`) (:pr:`4190`) (:pr:`4180`) Miguel
|
||
Farrajota
|
||
+ Fix typo in docstring (:pr:`4183`) Carlos Valiente
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Oct 30 03:04:38 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.20.0:
|
||
* Array
|
||
+ Fuse Atop operations (:pr:`3998`), (:pr:`4081`) Matthew Rocklin
|
||
+ Support da.asanyarray on dask dataframes (:pr:`4080`) Matthew
|
||
Rocklin
|
||
+ Remove unnecessary endianness check in datetime test
|
||
(:pr:`4113`) Elliott Sales de Andrade
|
||
+ Set name=False in array foo_like functions (:pr:`4116`) Matthew
|
||
Rocklin
|
||
+ Remove dask.array.ghost module (:pr:`4121`) Matthew Rocklin
|
||
+ Fix use of getargspec in dask array (:pr:`4125`) Stephan Hoyer
|
||
+ Adds dask.array.invert (:pr:`4127`), (:pr:`4131`) Anderson
|
||
Banihirwe
|
||
+ Raise informative error on arg-reduction on unknown chunksize
|
||
(:pr:`4128`), (:pr:`4135`) Matthew Rocklin
|
||
+ Normalize reversed slices in dask array (:pr:`4126`) Matthew
|
||
Rocklin
|
||
* Bag
|
||
+ Add bag.to_avro (:pr:`4076`) Martin Durant
|
||
* Core
|
||
+ Pull num_workers from config.get (:pr:`4086`), (:pr:`4093`)
|
||
James Bourbeau
|
||
+ Fix invalid escape sequences with raw strings (:pr:`4112`)
|
||
Elliott Sales de Andrade
|
||
+ Raise an error on the use of the get= keyword and set_options
|
||
(:pr:`4077`) Matthew Rocklin
|
||
+ Add import for Azure DataLake storage, and add docs (:pr:`4132`)
|
||
Martin Durant
|
||
+ Avoid collections.Mapping/Sequence (:pr:`4138`) Matthew Rocklin
|
||
* Dataframe
|
||
+ Include index keyword in to_dask_dataframe (:pr:`4071`) Matthew
|
||
Rocklin
|
||
+ add support for duplicate column names (:pr:`4087`) Jan Koch
|
||
+ Implement min_count for the DataFrame methods sum and prod
|
||
(:pr:`4090`) Bart Broere
|
||
+ Remove pandas warnings in concat (:pr:`4095`) Matthew Rocklin
|
||
+ DataFrame.to_csv header option to only output headers in the
|
||
first chunk (:pr:`3909`) Rahul Vaidya
|
||
+ Remove Series.to_parquet (:pr:`4104`) Justin Dennison
|
||
+ Avoid warnings and deprecated pandas methods (:pr:`4115`)
|
||
Matthew Rocklin
|
||
+ Swap 'old' and 'previous' when reporting append error
|
||
(:pr:`4130`) Martin Durant
|
||
* Documentation
|
||
+ Copy edit documentation (:pr:`4073`), (:pr:`4074`),
|
||
(:pr:`4094`), (:pr:`4097`), (:pr:`4107`), (:pr:`4124`),
|
||
(:pr:`4133`), (:pr:`4139`) Miguel Farrajota
|
||
+ Fix typo in code example (:pr:`4089`) Antonino Ingargiola
|
||
+ Add pycon 2018 presentation (:pr:`4102`) Javad
|
||
+ Quick description for gcsfs (:pr:`4109`) Martin Durant
|
||
+ Fixed typo in docstrings of read_sql_table method (:pr:`4114`)
|
||
TakaakiFuruse
|
||
+ Make target directories in redirects if they don't exist
|
||
(:pr:`4136`) Matthew Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Wed Oct 10 01:49:52 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.19.4:
|
||
* Array
|
||
+ Implement apply_gufunc(..., axes=..., keepdims=...) (:pr:`3985`)
|
||
Markus Gonser
|
||
* Bag
|
||
+ Fix typo in datasets.make_people (:pr:`4069`) Matthew Rocklin
|
||
* Dataframe
|
||
+ Added percentiles options for dask.dataframe.describe method
|
||
(:pr:`4067`) Zhenqing Li
|
||
+ Add DataFrame.partitions accessor similar to Array.blocks
|
||
(:pr:`4066`) Matthew Rocklin
|
||
* Core
|
||
+ Pass get functions and Clients through scheduler keyword
|
||
(:pr:`4062`) Matthew Rocklin
|
||
* Documentation
|
||
+ Fix Typo on hpc example. (missing = in kwarg). (:pr:`4068`)
|
||
Matthias Bussonier
|
||
+ Extensive copy-editing: (:pr:`4065`), (:pr:`4064`), (:pr:`4063`)
|
||
Miguel Farrajota
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Oct 8 15:01:22 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.19.3:
|
||
* Array
|
||
+ Make da.RandomState extensible to other modules (:pr:`4041`)
|
||
Matthew Rocklin
|
||
+ Support unknown dims in ravel no-op case (:pr:`4055`) Jim Crist
|
||
+ Add basic infrastructure for cupy (:pr:`4019`) Matthew Rocklin
|
||
+ Avoid asarray and lock arguments for from_array(getitem`)
|
||
(:pr:`4044`) Matthew Rocklin
|
||
+ Move local imports in corrcoef to global imports (:pr:`4030`)
|
||
John A Kirkham
|
||
+ Move local indices import to global import (:pr:`4029`) John A
|
||
Kirkham
|
||
+ Fix-up Dask Array's fromfunction w.r.t. dtype and kwargs
|
||
(:pr:`4028`) John A Kirkham
|
||
+ Don't use dummy expansion for trim_internal in overlapped
|
||
(:pr:`3964`) Mark Harfouche
|
||
+ Add unravel_index (:pr:`3958`) John A Kirkham
|
||
* Bag
|
||
+ Sort result in Bag.frequencies (:pr:`4033`) Matthew Rocklin
|
||
+ Add support for npartitions=1 edge case in groupby (:pr:`4050`)
|
||
James Bourbeau
|
||
+ Add new random dataset for people (:pr:`4018`) Matthew Rocklin
|
||
+ Improve performance of bag.read_text on small files (:pr:`4013`)
|
||
Eric Wolak
|
||
+ Add bag.read_avro (:pr:`4000`) (:pr:`4007`) Martin Durant
|
||
* Dataframe
|
||
+ Added an index parameter to
|
||
:meth:`dask.dataframe.from_dask_array` for creating a dask
|
||
DataFrame from a dask Array with a given index. (:pr:`3991`) Tom
|
||
Augspurger
|
||
+ Improve sub-classability of dask dataframe (:pr:`4015`) Matthew
|
||
Rocklin
|
||
+ Fix failing hdfs test [test-hdfs] (:pr:`4046`) Jim Crist
|
||
+ fuse_subgraphs works without normal fuse (:pr:`4042`) Jim Crist
|
||
+ Make path for reading many parquet files without prescan
|
||
(:pr:`3978`) Martin Durant
|
||
+ Index in dd.from_dask_array (:pr:`3991`) Tom Augspurger
|
||
+ Making skiprows accept lists (:pr:`3975`) Julia Signell
|
||
+ Fail early in fastparquet read for nonexistent column
|
||
(:pr:`3989`) Martin Durant
|
||
* Core
|
||
+ Add support for npartitions=1 edge case in groupby (:pr:`4050`)
|
||
James Bourbeau
|
||
+ Automatically wrap large arguments with dask.delayed in
|
||
map_blocks/partitions (:pr:`4002`) Matthew Rocklin
|
||
+ Fuse linear chains of subgraphs (:pr:`3979`) Jim Crist
|
||
+ Make multiprocessing context configurable (:pr:`3763`) Itamar
|
||
Turner-Trauring
|
||
* Documentation
|
||
+ Extensive copy-editing (:pr:`4049`), (:pr:`4034`), (:pr:`4031`),
|
||
(:pr:`4020`), (:pr:`4021`), (:pr:`4022`), (:pr:`4023`),
|
||
(:pr:`4016`), (:pr:`4017`), (:pr:`4010`), (:pr:`3997`),
|
||
(:pr:`3996`), Miguel Farrajota
|
||
+ Update shuffle method selection docs [skip ci] (:pr:`4048`)
|
||
James Bourbeau
|
||
+ Remove docs/source/examples, point to examples.dask.org
|
||
(:pr:`4014`) Matthew Rocklin
|
||
+ Replace readthedocs links with dask.org (:pr:`4008`) Matthew
|
||
Rocklin
|
||
+ Updates DataFrame.to_hdf docstring for returned values [skip ci]
|
||
(:pr:`3992`) James Bourbeau
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Sep 17 14:54:42 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.19.2:
|
||
* Array
|
||
+ apply_gufunc implements automatic infer of functions output
|
||
dtypes (:pr:`3936`) Markus Gonser
|
||
+ Fix array histogram range error when array has nans (#3980)
|
||
James Bourbeau
|
||
+ Issue 3937 follow up, int type checks. (#3956) Yu Feng
|
||
+ from_array: add @martindurant's explaining of how hashing is
|
||
done for an array. (#3965) Mark Harfouche
|
||
+ Support gradient with coordinate (#3949) Keisuke Fujii
|
||
* Core
|
||
+ Fix use of has_keyword with partial in Python 2.7 (#3966) Mark
|
||
Harfouche
|
||
+ Set pyarrow as default for HDFS (#3957) Matthew Rocklin
|
||
* Documentation
|
||
+ Use dask_sphinx_theme (#3963) Matthew Rocklin
|
||
+ Use JupyterLab in Binder links from main page Matthew Rocklin
|
||
+ DOC: fixed sphinx syntax (#3960) Tom Augspurger
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Sep 8 04:33:17 UTC 2018 - Arun Persaud <arun@gmx.de>
|
||
|
||
- update to version 0.19.1:
|
||
* Array
|
||
+ Don't enforce dtype if result has no dtype (:pr:`3928`) Matthew
|
||
Rocklin
|
||
+ Fix NumPy issubtype deprecation warning (:pr:`3939`) Bruce Merry
|
||
+ Fix arg reduction tokens to be unique with different arguments
|
||
(:pr:`3955`) Tobias de Jong
|
||
+ Coerce numpy integers to ints in slicing code (:pr:`3944`) Yu
|
||
Feng
|
||
+ Linalg.norm ndim along axis partial fix (:pr:`3933`) Tobias de
|
||
Jong
|
||
* Dataframe
|
||
+ Deterministic DataFrame.set_index (:pr:`3867`) George Sakkis
|
||
+ Fix divisions in read_parquet when dealing with filters #3831
|
||
#3930 (:pr:`3923`) (:pr:`3931`) @andrethrill
|
||
+ Fixing returning type in categorical.as_known (:pr:`3888`)
|
||
Sriharsha Hatwar
|
||
+ Fix DataFrame.assign for callables (:pr:`3919`) Tom Augspurger
|
||
+ Include partitions with no width in repartition (:pr:`3941`)
|
||
Matthew Rocklin
|
||
+ Don't constrict stage/k dtype in dataframe shuffle (:pr:`3942`)
|
||
Matthew Rocklin
|
||
* Documentation
|
||
+ DOC: Add hint on how to render task graphs horizontally
|
||
(:pr:`3922`) Uwe Korn
|
||
+ Add try-now button to main landing page (:pr:`3924`) Matthew
|
||
Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Sep 2 17:00:59 UTC 2018 - arun@gmx.de
|
||
|
||
- specfile:
|
||
* remove devel from noarch
|
||
|
||
- update to version 0.19.0:
|
||
* Array
|
||
+ Fix argtopk split_every bug (:pr:`3810`) Guido Imperiale
|
||
+ Ensure result computing dask.array.isnull(`) always gives a
|
||
numpy array (:pr:`3825`) Stephan Hoyer
|
||
+ Support concatenate for scipy.sparse in dask array (:pr:`3836`)
|
||
Matthew Rocklin
|
||
+ Fix argtopk on 32-bit systems. (:pr:`3823`) Elliott Sales de
|
||
Andrade
|
||
+ Normalize keys in rechunk (:pr:`3820`) Matthew Rocklin
|
||
+ Allow shape of dask.array to be a numpy array (:pr:`3844`) Mark
|
||
Harfouche
|
||
+ Fix numpy deprecation warning on tuple indexing (:pr:`3851`)
|
||
Tobias de Jong
|
||
+ Rename ghost module to overlap (:pr:`3830`) `Robert Sare`_
|
||
+ Re-add the ghost import to da __init__ (:pr:`3861`) Jim Crist
|
||
+ Ensure copy preserves masked arrays (:pr:`3852`) Tobias de Jong
|
||
* DataFrame
|
||
+ Added dtype and sparse keywords to
|
||
:func:`dask.dataframe.get_dummies` (:pr:`3792`) Tom Augspurger
|
||
+ Added :meth:`dask.dataframe.to_dask_array` for converting a Dask
|
||
Series or DataFrame to a Dask Array, possibly with known chunk
|
||
sizes (:pr:`3884`) Tom Augspurger
|
||
+ Changed the behavior for :meth:`dask.array.asarray` for dask
|
||
dataframe and series inputs. Previously, the series was eagerly
|
||
converted to an in-memory NumPy array before creating a dask
|
||
array with known chunks sizes. This caused unexpectedly high
|
||
memory usage. Now, no intermediate NumPy array is created, and a
|
||
Dask array with unknown chunk sizes is returned (:pr:`3884`) Tom
|
||
Augspurger
|
||
+ DataFrame.iloc (:pr:`3805`) Tom Augspurger
|
||
+ When reading multiple paths, expand globs. (:pr:`3828`) Irina
|
||
Truong
|
||
+ Added index column name after resample (:pr:`3833`) Eric
|
||
Bonfadini
|
||
+ Add (lazy) shape property to dataframe and series (:pr:`3212`)
|
||
Henrique Ribeiro
|
||
+ Fix failing hdfs test [test-hdfs] (:pr:`3858`) Jim Crist
|
||
+ Fixes for pyarrow 0.10.0 release (:pr:`3860`) Jim Crist
|
||
+ Rename to_csv keys for diagnostics (:pr:`3890`) Matthew Rocklin
|
||
+ Match pandas warnings for concat sort (:pr:`3897`) Tom
|
||
Augspurger
|
||
+ Include filename in read_csv (:pr:`3908`) Julia Signell
|
||
* Core
|
||
+ Better error message on import when missing common dependencies
|
||
(:pr:`3771`) Danilo Horta
|
||
+ Drop Python 3.4 support (:pr:`3840`) Jim Crist
|
||
+ Remove expired deprecation warnings (:pr:`3841`) Jim Crist
|
||
+ Add DASK_ROOT_CONFIG environment variable (:pr:`3849`) `Joe
|
||
Hamman`_
|
||
+ Don't cull in local scheduler, do cull in delayed (:pr:`3856`)
|
||
Jim Crist
|
||
+ Increase conda download retries (:pr:`3857`) Jim Crist
|
||
+ Add python_requires and Trove classifiers (:pr:`3855`) @hugovk
|
||
+ Fix collections.abc deprecation warnings in Python 3.7.0
|
||
(:pr:`3876`) Jan Margeta
|
||
+ Allow dot jpeg to xfail in visualize tests (:pr:`3896`) Matthew
|
||
Rocklin
|
||
+ Add Python 3.7 to travis.yml (:pr:`3894`) Matthew Rocklin
|
||
+ Add expand_environment_variables to dask.config (:pr:`3893`)
|
||
`Joe Hamman`_
|
||
* Docs
|
||
+ Fix typo in import statement of diagnostics (:pr:`3826`) John
|
||
Mrziglod
|
||
+ Add link to YARN docs (:pr:`3838`) Jim Crist
|
||
+ fix of minor typos in landing page index.html (:pr:`3746`)
|
||
Christoph Moehl
|
||
+ Update delayed-custom.rst (:pr:`3850`) Anderson Banihirwe
|
||
+ DOC: clarify delayed docstring (:pr:`3709`) Scott Sievert
|
||
+ Add new presentations (:pr:`3880`) @javad94
|
||
+ Add dask array normalize_chunks to documentation (:pr:`3878`)
|
||
Daniel Rothenberg
|
||
+ Docs: Fix link to snakeviz (:pr:`3900`) Hans Moritz Günther
|
||
+ Add missing ` to docstring (:pr:`3915`) @rtobar
|
||
|
||
- changes from version 0.18.2:
|
||
* Array
|
||
+ Reimplemented argtopk to make it release the GIL (:pr:`3610`)
|
||
Guido Imperiale
|
||
+ Don't overlap on non-overlapped dimensions in map_overlap
|
||
(:pr:`3653`) Matthew Rocklin
|
||
+ Fix linalg.tsqr for dimensions of uncertain length (:pr:`3662`)
|
||
Jeremy Chen
|
||
+ Break apart uneven array-of-int slicing to separate chunks
|
||
(:pr:`3648`) Matthew Rocklin
|
||
+ Align auto chunks to provided chunks, rather than shape
|
||
(:pr:`3679`) Matthew Rocklin
|
||
+ Adds endpoint and retstep support for linspace (:pr:`3675`)
|
||
James Bourbeau
|
||
+ Implement .blocks accessor (:pr:`3689`) Matthew Rocklin
|
||
+ Add block_info keyword to map_blocks functions (:pr:`3686`)
|
||
Matthew Rocklin
|
||
+ Slice by dask array of ints (:pr:`3407`) Guido Imperiale
|
||
+ Support dtype in arange (:pr:`3722`) Guido Imperiale
|
||
+ Fix argtopk with uneven chunks (:pr:`3720`) Guido Imperiale
|
||
+ Raise error when replace=False in da.choice (:pr:`3765`) James
|
||
Bourbeau
|
||
+ Update chunks in Array.__setitem__ (:pr:`3767`) Itamar
|
||
Turner-Trauring
|
||
+ Add a chunksize convenience property (:pr:`3777`) Jacob
|
||
Tomlinson
|
||
+ Fix and simplify array slicing behavior when step < 0
|
||
(:pr:`3702`) Ziyao Wei
|
||
+ Ensure to_zarr with return_stored True returns a Dask Array
|
||
(:pr:`3786`) John A Kirkham
|
||
* Bag
|
||
+ Add last_endline optional parameter in to_textfiles (:pr:`3745`)
|
||
George Sakkis
|
||
* Dataframe
|
||
+ Add aggregate function for rolling objects (:pr:`3772`) Gerome
|
||
Pistre
|
||
+ Properly tokenize cumulative groupby aggregations (:pr:`3799`)
|
||
Cloves Almeida
|
||
* Delayed
|
||
+ Add the @ operator to the delayed objects (:pr:`3691`) Mark
|
||
Harfouche
|
||
+ Add delayed best practices to documentation (:pr:`3737`) Matthew
|
||
Rocklin
|
||
+ Fix @delayed decorator for methods and add tests (:pr:`3757`)
|
||
Ziyao Wei
|
||
* Core
|
||
+ Fix extra progressbar (:pr:`3669`) Mike Neish
|
||
+ Allow tasks back onto ordering stack if they have one dependency
|
||
(:pr:`3652`) Matthew Rocklin
|
||
+ Prefer end-tasks with low numbers of dependencies when ordering
|
||
(:pr:`3588`) Tom Augspurger
|
||
+ Add assert_eq to top-level modules (:pr:`3726`) Matthew Rocklin
|
||
+ Test that dask collections can hold scipy.sparse arrays
|
||
(:pr:`3738`) Matthew Rocklin
|
||
+ Fix setup of lz4 decompression functions (:pr:`3782`) Elliott
|
||
Sales de Andrade
|
||
+ Add datasets module (:pr:`3780`) Matthew Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Jun 24 01:07:09 UTC 2018 - arun@gmx.de
|
||
|
||
- update to version 0.18.1:
|
||
* Array
|
||
+ from_array now supports scalar types and nested lists/tuples in
|
||
input, just like all numpy functions do. It also produces a
|
||
simpler graph when the input is a plain ndarray (:pr:`3556`)
|
||
Guido Imperiale
|
||
+ Fix slicing of big arrays due to cumsum dtype bug (:pr:`3620`)
|
||
Marco Rossi
|
||
+ Add Dask Array implementation of pad (:pr:`3578`) John A Kirkham
|
||
+ Fix array random API examples (:pr:`3625`) James Bourbeau
|
||
+ Add average function to dask array (:pr:`3640`) James Bourbeau
|
||
+ Tokenize ghost_internal with axes (:pr:`3643`) Matthew Rocklin
|
||
+ from_array: special handling for ndarray, list, and scalar types
|
||
(:pr:`3568`) Guido Imperiale
|
||
+ Add outer for Dask Arrays (:pr:`3658`) John A Kirkham
|
||
* DataFrame
|
||
+ Add Index.to_series method (:pr:`3613`) Henrique Ribeiro
|
||
+ Fix missing partition columns in pyarrow-parquet (:pr:`3636`)
|
||
Martin Durant
|
||
* Core
|
||
+ Minor tweaks to CI (:pr:`3629`) Guido Imperiale
|
||
+ Add back dask.utils.effective_get (:pr:`3642`) Matthew Rocklin
|
||
+ DASK_CONFIG dictates config write location (:pr:`3621`) Jim
|
||
Crist
|
||
+ Replace 'collections' key in unpack_collections with unique key
|
||
(:pr:`3632`) Yu Feng
|
||
+ Avoid deepcopy in dask.config.set (:pr:`3649`) Matthew Rocklin
|
||
|
||
- changes from version 0.18.0:
|
||
* Array
|
||
+ Add to/read_zarr for Zarr-format datasets and arrays
|
||
(:pr:`3460`) Martin Durant
|
||
+ Experimental addition of generalized ufunc support,
|
||
apply_gufunc, gufunc, and as_gufunc (:pr:`3109`) (:pr:`3526`)
|
||
(:pr:`3539`) Markus Gonser
|
||
+ Avoid unnecessary rechunking tasks (:pr:`3529`) Matthew Rocklin
|
||
+ Compute dtypes at runtime for fft (:pr:`3511`) Matthew Rocklin
|
||
+ Generate UUIDs for all da.store operations (:pr:`3540`) Martin
|
||
Durant
|
||
+ Correct internal dimension of Dask's SVD (:pr:`3517`) John A
|
||
Kirkham
|
||
+ BUG: do not raise IndexError for identity slice in array.vindex
|
||
(:pr:`3559`) Scott Sievert
|
||
+ Adds isneginf and isposinf (:pr:`3581`) John A Kirkham
|
||
+ Drop Dask Array's learn module (:pr:`3580`) John A Kirkham
|
||
+ added sfqr (short-and-fat) as a counterpart to tsqr…
|
||
(:pr:`3575`) Jeremy Chen
|
||
+ Allow 0-width chunks in dask.array.rechunk (:pr:`3591`) Marc
|
||
Pfister
|
||
+ Document Dask Array's nan_to_num in public API (:pr:`3599`) John
|
||
A Kirkham
|
||
+ Show block example (:pr:`3601`) John A Kirkham
|
||
+ Replace token= keyword with name= in map_blocks (:pr:`3597`)
|
||
Matthew Rocklin
|
||
+ Disable locking in to_zarr (needed for using to_zarr in a
|
||
distributed context) (:pr:`3607`) John A Kirkham
|
||
+ Support Zarr Arrays in to_zarr/from_zarr (:pr:`3561`) John A
|
||
Kirkham
|
||
+ Added recursion to array/linalg/tsqr to better manage the single
|
||
core bottleneck (:pr:`3586`) `Jeremy Chan`_
|
||
* Dataframe
|
||
+ Add to/read_json (:pr:`3494`) Martin Durant
|
||
+ Adds index to unsupported arguments for DataFrame.rename method
|
||
(:pr:`3522`) James Bourbeau
|
||
+ Adds support to subset Dask DataFrame columns using
|
||
numpy.ndarray, pandas.Series, and pandas.Index objects
|
||
(:pr:`3536`) James Bourbeau
|
||
+ Raise error if meta columns do not match dataframe (:pr:`3485`)
|
||
Christopher Ren
|
||
+ Add index to unsupprted argument for DataFrame.rename
|
||
(:pr:`3522`) James Bourbeau
|
||
+ Adds support for subsetting DataFrames with pandas Index/Series
|
||
and numpy ndarrays (:pr:`3536`) James Bourbeau
|
||
+ Dataframe sample method docstring fix (:pr:`3566`) James
|
||
Bourbeau
|
||
+ fixes dd.read_json to infer file compression (:pr:`3594`) Matt
|
||
Lee
|
||
+ Adds n to sample method (:pr:`3606`) James Bourbeau
|
||
+ Add fastparquet ParquetFile object support (:pr:`3573`)
|
||
@andrethrill
|
||
* Bag
|
||
+ Rename method= keyword to shuffle= in bag.groupby (:pr:`3470`)
|
||
Matthew Rocklin
|
||
* Core
|
||
+ Replace get= keyword with scheduler= keyword (:pr:`3448`)
|
||
Matthew Rocklin
|
||
+ Add centralized dask.config module to handle configuration for
|
||
all Dask subprojects (:pr:`3432`) (:pr:`3513`) (:pr:`3520`)
|
||
Matthew Rocklin
|
||
+ Add dask-ssh CLI Options and Description. (:pr:`3476`) @beomi
|
||
+ Read whole files fix regardless of header for HTTP (:pr:`3496`)
|
||
Martin Durant
|
||
+ Adds synchronous scheduler syntax to debugging docs (:pr:`3509`)
|
||
James Bourbeau
|
||
+ Replace dask.set_options with dask.config.set (:pr:`3502`)
|
||
Matthew Rocklin
|
||
+ Update sphinx readthedocs-theme (:pr:`3516`) Matthew Rocklin
|
||
+ Introduce "auto" value for normalize_chunks (:pr:`3507`) Matthew
|
||
Rocklin
|
||
+ Fix check in configuration with env=None (:pr:`3562`) Simon
|
||
Perkins
|
||
+ Update sizeof definitions (:pr:`3582`) Matthew Rocklin
|
||
+ Remove --verbose flag from travis-ci (:pr:`3477`) Matthew
|
||
Rocklin
|
||
+ Remove "da.random" from random array keys (:pr:`3604`) Matthew
|
||
Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Mon May 21 03:57:53 UTC 2018 - arun@gmx.de
|
||
|
||
- update to version 0.17.5:
|
||
* Compatibility with pandas 0.23.0 (:pr:`3499`) Tom Augspurger
|
||
|
||
-------------------------------------------------------------------
|
||
Sun May 6 05:33:50 UTC 2018 - arun@gmx.de
|
||
|
||
- update to version 0.17.4:
|
||
* Dataframe
|
||
+ Add support for indexing Dask DataFrames with string subclasses
|
||
(:pr:`3461`) James Bourbeau
|
||
+ Allow using both sorted_index and chunksize in read_hdf
|
||
(:pr:`3463`) Pierre Bartet
|
||
+ Pass filesystem to arrow piece reader (:pr:`3466`) Martin Durant
|
||
+ Switches to using dask.compat string_types (#3462) James
|
||
Bourbeau
|
||
|
||
- changes from version 0.17.3:
|
||
* Array
|
||
+ Add einsum for Dask Arrays (:pr:`3412`) Simon Perkins
|
||
+ Add piecewise for Dask Arrays (:pr:`3350`) John A Kirkham
|
||
+ Fix handling of nan in broadcast_shapes (:pr:`3356`) John A
|
||
Kirkham
|
||
+ Add isin for dask arrays (:pr:`3363`). Stephan Hoyer
|
||
+ Overhauled topk for Dask Arrays: faster algorithm, particularly
|
||
for large k's; added support for multiple axes, recursive
|
||
aggregation, and an option to pick the bottom k elements
|
||
instead. (:pr:`3395`) Guido Imperiale
|
||
+ The topk API has changed from topk(k, array) to the more
|
||
conventional topk(array, k). The legacy API still works but is
|
||
now deprecated. (:pr:`2965`) Guido Imperiale
|
||
+ New function argtopk for Dask Arrays (:pr:`3396`) Guido
|
||
Imperiale
|
||
+ Fix handling partial depth and boundary in map_overlap
|
||
(:pr:`3445`) John A Kirkham
|
||
+ Add gradient for Dask Arrays (:pr:`3434`) John A Kirkham
|
||
* DataFrame
|
||
+ Allow t as shorthand for table in to_hdf for pandas
|
||
compatibility (:pr:`3330`) Jörg Dietrich
|
||
+ Added top level isna method for Dask DataFrames (:pr:`3294`)
|
||
Christopher Ren
|
||
+ Fix selection on partition column on read_parquet for
|
||
engine="pyarrow" (:pr:`3207`) Uwe Korn
|
||
+ Added DataFrame.squeeze method (:pr:`3366`) Christopher Ren
|
||
+ Added infer_divisions option to read_parquet to specify whether
|
||
read engines should compute divisions (:pr:`3387`) Jon Mease
|
||
+ Added support for inferring division for engine="pyarrow"
|
||
(:pr:`3387`) Jon Mease
|
||
+ Provide more informative error message for meta= errors
|
||
(:pr:`3343`) Matthew Rocklin
|
||
+ add orc reader (:pr:`3284`) Martin Durant
|
||
+ Default compression for parquet now always Snappy, in line with
|
||
pandas (:pr:`3373`) Martin Durant
|
||
+ Fixed bug in Dask DataFrame and Series comparisons with NumPy
|
||
scalars (:pr:`3436`) James Bourbeau
|
||
+ Remove outdated requirement from repartition docstring
|
||
(:pr:`3440`) Jörg Dietrich
|
||
+ Fixed bug in aggregation when only a Series is selected
|
||
(:pr:`3446`) Jörg Dietrich
|
||
+ Add default values to make_timeseries (:pr:`3421`) Matthew
|
||
Rocklin
|
||
* Core
|
||
+ Support traversing collections in persist, visualize, and
|
||
optimize (:pr:`3410`) Jim Crist
|
||
+ Add schedule= keyword to compute and persist. This replaces
|
||
common use of the get= keyword (:pr:`3448`) Matthew Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Mar 24 18:48:24 UTC 2018 - arun@gmx.de
|
||
|
||
- update to version 0.17.2:
|
||
* Array
|
||
+ Add broadcast_arrays for Dask Arrays (:pr:`3217`) John A Kirkham
|
||
+ Add bitwise_* ufuncs (:pr:`3219`) John A Kirkham
|
||
+ Add optional axis argument to squeeze (:pr:`3261`) John A
|
||
Kirkham
|
||
+ Validate inputs to atop (:pr:`3307`) Matthew Rocklin
|
||
+ Avoid calls to astype in concatenate if all parts have the same
|
||
dtype (:pr:`3301`) `Martin Durant`_
|
||
* DataFrame
|
||
+ Fixed bug in shuffle due to aggressive truncation (:pr:`3201`)
|
||
Matthew Rocklin
|
||
+ Support specifying categorical columns on read_parquet with
|
||
categories=[…] for engine="pyarrow" (:pr:`3177`) Uwe Korn
|
||
+ Add dd.tseries.Resampler.agg (:pr:`3202`) Richard Postelnik
|
||
+ Support operations that mix dataframes and arrays (:pr:`3230`)
|
||
Matthew Rocklin
|
||
+ Support extra Scalar and Delayed args in
|
||
dd.groupby._Groupby.apply (:pr:`3256`) Gabriele Lanaro
|
||
* Bag
|
||
+ Support joining against single-partitioned bags and delayed
|
||
objects (:pr:`3254`) Matthew Rocklin
|
||
* Core
|
||
+ Fixed bug when using unexpected but hashable types for keys
|
||
(:pr:`3238`) Daniel Collins
|
||
+ Fix bug in task ordering so that we break ties consistently with
|
||
the key name (:pr:`3271`) Matthew Rocklin
|
||
+ Avoid sorting tasks in order when the number of tasks is very
|
||
large (:pr:`3298`) Matthew Rocklin
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Mar 2 19:52:06 UTC 2018 - sebix+novell.com@sebix.at
|
||
|
||
- correctly package bytecode
|
||
- use %license macro
|
||
|
||
-------------------------------------------------------------------
|
||
Fri Feb 23 03:52:52 UTC 2018 - arun@gmx.de
|
||
|
||
- update to version 0.17.1:
|
||
* Array
|
||
+ Corrected dimension chunking in indices (:issue:`3166`,
|
||
:pr:`3167`) Simon Perkins
|
||
+ Inline store_chunk calls for store's return_stored option
|
||
(:pr:`3153`) John A Kirkham
|
||
+ Compatibility with struct dtypes for NumPy 1.14.1 release
|
||
(:pr:`3187`) Matthew Rocklin
|
||
* DataFrame
|
||
+ Bugfix to allow column assignment of pandas
|
||
datetimes(:pr:`3164`) Max Epstein
|
||
* Core
|
||
+ New file-system for HTTP(S), allowing direct loading from
|
||
specific URLs (:pr:`3160`) `Martin Durant`_
|
||
+ Fix bug when tokenizing partials with no keywords (:pr:`3191`)
|
||
Matthew Rocklin
|
||
+ Use more recent LZ4 API (:pr:`3157`) `Thrasibule`_
|
||
+ Introduce output stream parameter for progress bar (:pr:`3185`)
|
||
`Dieter Weber`_
|
||
|
||
-------------------------------------------------------------------
|
||
Sat Feb 10 17:26:43 UTC 2018 - arun@gmx.de
|
||
|
||
- update to version 0.17.0:
|
||
* Array
|
||
+ Added a support object-type arrays for nansum, nanmin, and
|
||
nanmax (:issue:`3133`) Keisuke Fujii
|
||
+ Update error handling when len is called with empty chunks
|
||
(:issue:`3058`) Xander Johnson
|
||
+ Fixes a metadata bug with store's return_stored option
|
||
(:pr:`3064`) John A Kirkham
|
||
+ Fix a bug in optimization.fuse_slice to properly handle when
|
||
first input is None (:pr:`3076`) James Bourbeau
|
||
+ Support arrays with unknown chunk sizes in percentile
|
||
(:pr:`3107`) Matthew Rocklin
|
||
+ Tokenize scipy.sparse arrays and np.matrix (:pr:`3060`) Roman
|
||
Yurchak
|
||
* DataFrame
|
||
+ Support month timedeltas in repartition(freq=...) (:pr:`3110`)
|
||
Matthew Rocklin
|
||
+ Avoid mutation in dataframe groupby tests (:pr:`3118`) Matthew
|
||
Rocklin
|
||
+ read_csv, read_table, and read_parquet accept iterables of paths
|
||
(:pr:`3124`) Jim Crist
|
||
+ Deprecates the dd.to_delayed function in favor of the existing
|
||
method (:pr:`3126`) Jim Crist
|
||
+ Return dask.arrays from df.map_partitions calls when the UDF
|
||
returns a numpy array (:pr:`3147`) Matthew Rocklin
|
||
+ Change handling of columns and index in dd.read_parquet to be
|
||
more consistent, especially in handling of multi-indices
|
||
(:pr:`3149`) Jim Crist
|
||
+ fastparquet append=True allowed to create new dataset
|
||
(:pr:`3097`) `Martin Durant`_
|
||
+ dtype rationalization for sql queries (:pr:`3100`) `Martin
|
||
Durant`_
|
||
* Bag
|
||
+ Document bag.map_paritions function may recieve either a list or
|
||
generator. (:pr:`3150`) Nir
|
||
* Core
|
||
+ Change default task ordering to prefer nodes with few dependents
|
||
and then many downstream dependencies (:pr:`3056`) Matthew
|
||
Rocklin
|
||
+ Add color= option to visualize to color by task order
|
||
(:pr:`3057`) (:pr:`3122`) Matthew Rocklin
|
||
+ Deprecate dask.bytes.open_text_files (:pr:`3077`) Jim Crist
|
||
+ Remove short-circuit hdfs reads handling due to maintenance
|
||
costs. May be re-added in a more robust manner later
|
||
(:pr:`3079`) Jim Crist
|
||
+ Add dask.base.optimize for optimizing multiple collections
|
||
without computing. (:pr:`3071`) Jim Crist
|
||
+ Rename dask.optimize module to dask.optimization (:pr:`3071`)
|
||
Jim Crist
|
||
+ Change task ordering to do a full traversal (:pr:`3066`) Matthew
|
||
Rocklin
|
||
+ Adds an optimize_graph keyword to all to_delayed methods to
|
||
allow controlling whether optimizations occur on
|
||
conversion. (:pr:`3126`) Jim Crist
|
||
+ Support using pyarrow for hdfs integration (:pr:`3123`) Jim
|
||
Crist
|
||
+ Move HDFS integration and tests into dask repo (:pr:`3083`) Jim
|
||
Crist
|
||
+ Remove write_bytes (:pr:`3116`) Jim Crist
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Jan 11 23:56:36 UTC 2018 - arun@gmx.de
|
||
|
||
- specfile:
|
||
* update copyright year
|
||
|
||
- update to version 0.16.1:
|
||
* Array
|
||
+ Fix handling of scalar percentile values in "percentile"
|
||
(:pr:`3021`) `James Bourbeau`_
|
||
+ Prevent "bool()" coercion from calling compute (:pr:`2958`)
|
||
`Albert DeFusco`_
|
||
+ Add "matmul" (:pr:`2904`) `John A Kirkham`_
|
||
+ Support N-D arrays with "matmul" (:pr:`2909`) `John A Kirkham`_
|
||
+ Add "vdot" (:pr:`2910`) `John A Kirkham`_
|
||
+ Explicit "chunks" argument for "broadcast_to" (:pr:`2943`)
|
||
`Stephan Hoyer`_
|
||
+ Add "meshgrid" (:pr:`2938`) `John A Kirkham`_ and (:pr:`3001`)
|
||
`Markus Gonser`_
|
||
+ Preserve singleton chunks in "fftshift"/"ifftshift" (:pr:`2733`)
|
||
`John A Kirkham`_
|
||
+ Fix handling of negative indexes in "vindex" and raise errors
|
||
for out of bounds indexes (:pr:`2967`) `Stephan Hoyer`_
|
||
+ Add "flip", "flipud", "fliplr" (:pr:`2954`) `John A Kirkham`_
|
||
+ Add "float_power" ufunc (:pr:`2962`) (:pr:`2969`) `John A
|
||
Kirkham`_
|
||
+ Compatability for changes to structured arrays in the upcoming
|
||
NumPy 1.14 release (:pr:`2964`) `Tom Augspurger`_
|
||
+ Add "block" (:pr:`2650`) `John A Kirkham`_
|
||
+ Add "frompyfunc" (:pr:`3030`) `Jim Crist`_
|
||
* DataFrame
|
||
+ Fixed naming bug in cumulative aggregations (:issue:`3037`)
|
||
`Martijn Arts`_
|
||
+ Fixed "dd.read_csv" when "names" is given but "header" is not
|
||
set to "None" (:issue:`2976`) `Martijn Arts`_
|
||
+ Fixed "dd.read_csv" so that passing instances of
|
||
"CategoricalDtype" in "dtype" will result in known categoricals
|
||
(:pr:`2997`) `Tom Augspurger`_
|
||
+ Prevent "bool()" coercion from calling compute (:pr:`2958`)
|
||
`Albert DeFusco`_
|
||
+ "DataFrame.read_sql()" (:pr:`2928`) to an empty database tables
|
||
returns an empty dask dataframe `Apostolos Vlachopoulos`_
|
||
+ Compatability for reading Parquet files written by PyArrow 0.8.0
|
||
(:pr:`2973`) `Tom Augspurger`_
|
||
+ Correctly handle the column name (`df.columns.name`) when
|
||
reading in "dd.read_parquet" (:pr:2973`) `Tom Augspurger`_
|
||
+ Fixed "dd.concat" losing the index dtype when the data contained
|
||
a categorical (:issue:`2932`) `Tom Augspurger`_
|
||
+ Add "dd.Series.rename" (:pr:`3027`) `Jim Crist`_
|
||
+ "DataFrame.merge()" (:pr:`2960`) now supports merging on a
|
||
combination of columns and the index `Jon Mease`_
|
||
+ Removed the deprecated "dd.rolling*" methods, in preperation for
|
||
their removal in the next pandas release (:pr:`2995`) `Tom
|
||
Augspurger`_
|
||
+ Fix metadata inference bug in which single-partition series were
|
||
mistakenly special cased (:pr:`3035`) `Jim Crist`_
|
||
+ Add support for "Series.str.cat" (:pr:`3028`) `Jim Crist`_
|
||
* Core
|
||
+ Improve 32-bit compatibility (:pr:`2937`) `Matthew Rocklin`_
|
||
+ Change task prioritization to avoid upwards branching
|
||
(:pr:`3017`) `Matthew Rocklin`_
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Nov 19 05:11:59 UTC 2017 - arun@gmx.de
|
||
|
||
- update to version 0.16.0:
|
||
* Fix install of fastparquet on travis (#2897)
|
||
* Fix port for bokeh dashboard (#2889)
|
||
* fix hdfs3 version
|
||
* Modify hdfs import to point to hdfs3 (#2894)
|
||
* Explicitly pass in pyarrow filesystem for parquet (#2881)
|
||
* COMPAT: Ensure lists for multiple groupby keys (#2892)
|
||
* Avoid list index error in repartition_freq (#2873)
|
||
* Finish moving `infer_storage_options` (#2886)
|
||
* Support arrow in `to_parquet`. Several other parquet
|
||
cleanups. (#2868)
|
||
* Bugfix: Filesystem object not passed to pyarrow reader (#2527)
|
||
* Fix py34 build
|
||
* Fixup s3 tests (#2875)
|
||
* Close resource profiler process on __exit__ (#2871)
|
||
* Add changelog for to_parquet changes. [ci skip]
|
||
* A few parquet cleanups (#2867)
|
||
* Fixed fillna with Series (#2810)
|
||
* Error nicely on parse dates failure in read_csv (#2863)
|
||
* Fix empty dataframe partitioning for numpy 1.10.4 (#2862)
|
||
* Test `unique`'s inverse mapping's shape (#2857)
|
||
* Move `thread_state` out of the top namespace (#2858)
|
||
* Explain unique's steps (#2856)
|
||
* fix and test for issue #2811 (#2818)
|
||
* Minor tweaks to `_unique_internal` optional result handling
|
||
(#2855)
|
||
* Update dask interface during XArray integration (#2847)
|
||
* Remove unnecessary map_partitions in aggregate (#2712)
|
||
* Simplify `_unique_internal` (#2850)
|
||
* Add more tests for read_parquet(engine='pyarrow') (#2822)
|
||
* Do not raise exception when calling set_index on empty dataframe
|
||
#2819 (#2827)
|
||
* Test unique on more data (#2846)
|
||
* Do not except on set_index on text column with empty partitions
|
||
#2820 (#2831)
|
||
* Compat for bokeh 0.12.10 (#2844)
|
||
* Support `return_*` arguments with `unique` (#2779)
|
||
* Fix installing of pandas dev (#2838)
|
||
* Squash a few warnings in dask.array (#2833)
|
||
* Array optimizations don't elide some getter calls (#2826)
|
||
* test against pandas rc (#2814)
|
||
* df.astype(categorical_dtype) -> known categoricals (#2835)
|
||
* Fix cloudpickle test (#2836)
|
||
* BUG: Quantile with missing data (#2791)
|
||
* API: remove dask.async (#2828)
|
||
* Adds comma to flake8 section in setup.cfg (#2817)
|
||
* Adds asarray and asanyarray to the dask.array public API (#2787)
|
||
* flake8 now checks bare excepts (#2816)
|
||
* CI: Update for new flake8 / pycodestyle (#2808)
|
||
* Fix concat series bug (#2800)
|
||
* Typo in the docstring of read_parquet's filters param (#2806)
|
||
* Docs update (#2803)
|
||
* minor doc changes in bag.core (#2797)
|
||
* da.random.choice works with array args (#2781)
|
||
* Support broadcasting 0-length dimensions (#2784)
|
||
* ResourceProfiler plot works with single point (#2778)
|
||
* Implement Dask Array's unique to be lazy (#2775)
|
||
* Dask Collection Interface
|
||
* Reduce test memory usage (#2782)
|
||
* Deprecate vnorm (#2773)
|
||
* add auto-import of gcsfs (#2776)
|
||
* Add allclose (#2771)
|
||
* Remove `random.different_seeds` from API docs (#2772)
|
||
* Follow-up for atleast_nd (#2765)
|
||
* Use get_worker().client.get if available (#2762)
|
||
* Link PR for "Allow tuples as sharedict keys" (#2766)
|
||
* Allow tuples as sharedict keys (#2763)
|
||
* update docs to use flatten vs concat (#2764)
|
||
* Add atleast_nd functions (#2760)
|
||
* Consolidate changelog for 0.15.4 (#2759)
|
||
* Add changelog template for future date (#2758)
|
||
|
||
-------------------------------------------------------------------
|
||
Mon Oct 30 06:16:22 UTC 2017 - arun@gmx.de
|
||
|
||
- update to version 0.15.4:
|
||
* Drop s3fs requirement (#2750)
|
||
* Support -1 as an alias for dimension size in chunks (#2749)
|
||
* Handle zero dimension when rechunking (#2747)
|
||
* Pandas 0.21 compatability (#2737)
|
||
* API: Add `.str` accessor for Categorical with object dtype (#2743)
|
||
* Fix install failures
|
||
* Reduce memory usage
|
||
* A few test cleanups
|
||
* Fix #2720 (#2729)
|
||
* Pass on file_scheme to fastparquet (#2714)
|
||
* Support indexing with np.int (#2719)
|
||
* Tree reduction support for dask.bag.Bag.foldby (#2710)
|
||
* Update link to IPython parallel docs (#2715)
|
||
* Call mkdir from correct namespace in array.to_npy_stack. (#2709)
|
||
* add int96 times to parquet writer (#2711)
|
||
|
||
-------------------------------------------------------------------
|
||
Sun Sep 24 21:28:49 UTC 2017 - arun@gmx.de
|
||
|
||
- update to version 0.15.3:
|
||
* add .github/PULL_REQUEST_TEMPLATE.md file
|
||
* Make `y` optional in dask.array.learn (#2701)
|
||
* Add apply_over_axes (#2702)
|
||
* Use apply_along_axis name in Dask (#2704)
|
||
* Tweak apply_along_axis's pre-NumPy 1.13.0 error (#2703)
|
||
* Add apply_along_axis (#2698)
|
||
* Use travis conditional builds (#2697)
|
||
* Skip days in daily_stock that have nan values (#2693)
|
||
* TST: Have array assert_eq check scalars (#2681)
|
||
* Add schema keyword to read_sql (#2582)
|
||
* Only install pytest-runner if needed (#2692)
|
||
* Remove resize tool from bokeh plots (#2688)
|
||
* Add ptp (#2691)
|
||
* Catch warning from numpy in subs (#2457)
|
||
* Publish Series methods in dataframe api (#2686)
|
||
* Fix norm keepdims (#2683)
|
||
* Dask array slicing with boolean arrays (#2658)
|
||
* repartition works with mixed categoricals (#2676)
|
||
* Merge pull request #2667 from martindurant/parquet_file_schema
|
||
* Fix for parquet file schemes
|
||
* Optional axis argument for cumulative functions (#2664)
|
||
* Remove partial_by_order
|
||
* Support literals in atop
|
||
* [ci skip] Add flake8 note in developer doc page (#2662)
|
||
* Add filenames return for ddf.to_csv and bag.to_textfiles as they
|
||
both… (#2655)
|
||
* CLN: Remove redundant code, fix typos (#2652)
|
||
* [docs] company name change from Continuum to Anaconda (#2660)
|
||
* Fix what hapend when combining partition_on and append in
|
||
to_parquet (#2645)
|
||
* WIP: Add user defined aggregations (#2344)
|
||
* [docs] new cheatsheet (#2649)
|
||
* Masked arrays (#2301)
|
||
* Indexing with an unsigned integer array (#2647)
|
||
* ENH: Allow the groupby by param to handle columns and index levels
|
||
(#2636)
|
||
* update copyright date (#2642)
|
||
* python setup.py test runs py.test (#2641)
|
||
* Avoid using operator.itemgetter in dask.dataframe (#2638)
|
||
* Add `*_like` array creation functions (#2640)
|
||
* Consistent slicing names (#2601)
|
||
* Replace Continuum Analytics with Anaconda Inc. (#2631)
|
||
* Implement Series.str[index] (#2634)
|
||
* Support complex data with vnorm (#2621)
|
||
|
||
- changes from version 0.15.2:
|
||
* BUG: setitem should update divisions (#2622)
|
||
* Allow dataframe.loc with numpy array (#2615)
|
||
* Add link to Stack Overflow's mcve docpage to support docs (#2612)
|
||
* Improve dtype inference and reflection (#2571)
|
||
* Add ediff1d (#2609)
|
||
* Optimize concatenate on singleton sequences (#2610)
|
||
* Add diff (#2607)
|
||
* Document norm in Dask Array API (#2605)
|
||
* Add norm (#2597)
|
||
* Don't check for memory leaks in distributed tests (#2603)
|
||
* Include computed collection within sharedict in delayed (#2583)
|
||
* Reorg array (#2595)
|
||
* Remove `expand` parameter from df.str.split (#2593)
|
||
* Normalize `meta` on call to `dd.from_delayed` (#2591)
|
||
* Remove bare `except:` blocks and test that none exist. (#2590)
|
||
* Adds choose method to dask.array.Array (#2584)
|
||
* Generalize vindex in dask.array (#2573)
|
||
* Clear `_cached_keys` on name change in dask.array (#2572)
|
||
* Don't render None for unknown divisions (#2570)
|
||
* Add missing initialization to CacheProfiler (#2550)
|
||
* Add argwhere, *nonzero, where (cond) (#2539)
|
||
* Fix indices error message (#2565)
|
||
* Fix and secure some references (#2563)
|
||
* Allows for read_hdf to accept an iterable of files (#2547)
|
||
* Allow split on rechunk on first pass (#2560)
|
||
* Improvements to dask.array.where (#2549)
|
||
* Adds isin method to dask.dataframe.DataFrame (#2558)
|
||
* Support dask array conditional in compress (#2555)
|
||
* Clarify ResourceProfiler docstring [ci skip] (#2553)
|
||
* In compress, use Dask to expand condition array (#2545)
|
||
* Support compress with axis as None (#2541)
|
||
* df.idxmax/df.idxmin work with empty partitions (#2542)
|
||
* FIX typo in accumulate docstring (#2552)
|
||
* da.where works with non-bool condition (#2543)
|
||
* da.repeat works with negative axis (#2544)
|
||
* Check metadata in `dd.from_delayed` (#2534)
|
||
* TST: clean up test directories in shuffle (#2535)
|
||
* Do no attemp to compute divisions on empty dataframe. (#2529)
|
||
* Remove deprecated bag behavior (#2525)
|
||
* Updates read_hdf docstring (#2518)
|
||
* Add dd.to_timedelta (#2523)
|
||
* Better error message for read_csv (#2522)
|
||
* Remove spurious keys from map_overlap graph (#2520)
|
||
* Do not compare x.dim with None in array. (#1847)
|
||
* Support concat for categorical MultiIndex (#2514)
|
||
* Support for callables in df.assign (#2513)
|
||
|
||
-------------------------------------------------------------------
|
||
Thu May 4 22:24:37 UTC 2017 - toddrme2178@gmail.com
|
||
|
||
- Implement single-spec version
|
||
- Update source URL.
|
||
- Split classes into own subpackages to lighten base dependencies.
|
||
- Update to version 0.15.1
|
||
* Add storage_options to to_textfiles and to_csv (:pr:`2466`)
|
||
* Rechunk and simplify rfftfreq (:pr:`2473`), (:pr:`2475`)
|
||
* Better support ndarray subclasses (:pr:`2486`)
|
||
* Import star in dask.distributed (:pr:`2503`)
|
||
* Threadsafe cache handling with tokenization (:pr:`2511`)
|
||
- Update to version 0.15.0
|
||
+ Array
|
||
* Add dask.array.stats submodule (:pr:`2269`)
|
||
* Support ``ufunc.outer`` (:pr:`2345`)
|
||
* Optimize fancy indexing by reducing graph overhead (:pr:`2333`) (:pr:`2394`)
|
||
* Faster array tokenization using alternative hashes (:pr:`2377`)
|
||
* Added the matmul ``@`` operator (:pr:`2349`)
|
||
* Improved coverage of the ``numpy.fft`` module (:pr:`2320`) (:pr:`2322`) (:pr:`2327`) (:pr:`2323`)
|
||
* Support NumPy's ``__array_ufunc__`` protocol (:pr:`2438`)
|
||
+ Bag
|
||
* Fix bug where reductions on bags with no partitions would fail (:pr:`2324`)
|
||
* Add broadcasting and variadic ``db.map`` top-level function. Also remove
|
||
auto-expansion of tuples as map arguments (:pr:`2339`)
|
||
* Rename ``Bag.concat`` to ``Bag.flatten`` (:pr:`2402`)
|
||
+ DataFrame
|
||
* Parquet improvements (:pr:`2277`) (:pr:`2422`)
|
||
+ Core
|
||
* Move dask.async module to dask.local (:pr:`2318`)
|
||
* Support callbacks with nested scheduler calls (:pr:`2397`)
|
||
* Support pathlib.Path objects as uris (:pr:`2310`)
|
||
- Update to version 0.14.3
|
||
+ DataFrame
|
||
* Pandas 0.20.0 support
|
||
- Update to version 0.14.2
|
||
+ Array
|
||
* Add da.indices (:pr:`2268`), da.tile (:pr:`2153`), da.roll (:pr:`2135`)
|
||
* Simultaneously support drop_axis and new_axis in da.map_blocks (:pr:`2264`)
|
||
* Rechunk and concatenate work with unknown chunksizes (:pr:`2235`) and (:pr:`2251`)
|
||
* Support non-numpy container arrays, notably sparse arrays (:pr:`2234`)
|
||
* Tensordot contracts over multiple axes (:pr:`2186`)
|
||
* Allow delayed targets in da.store (:pr:`2181`)
|
||
* Support interactions against lists and tuples (:pr:`2148`)
|
||
* Constructor plugins for debugging (:pr:`2142`)
|
||
* Multi-dimensional FFTs (single chunk) (:pr:`2116`)
|
||
+ Bag
|
||
* to_dataframe enforces consistent types (:pr:`2199`)
|
||
+ DataFrame
|
||
* Set_index always fully sorts the index (:pr:`2290`)
|
||
* Support compatibility with pandas 0.20.0 (:pr:`2249`), (:pr:`2248`), and (:pr:`2246`)
|
||
* Support Arrow Parquet reader (:pr:`2223`)
|
||
* Time-based rolling windows (:pr:`2198`)
|
||
* Repartition can now create more partitions, not just less (:pr:`2168`)
|
||
+ Core
|
||
* Always use absolute paths when on POSIX file system (:pr:`2263`)
|
||
* Support user provided graph optimizations (:pr:`2219`)
|
||
* Refactor path handling (:pr:`2207`)
|
||
* Improve fusion performance (:pr:`2129`), (:pr:`2131`), and (:pr:`2112`)
|
||
- Update to version 0.14.1
|
||
+ Array
|
||
* Micro-optimize optimizations (:pr:`2058`)
|
||
* Change slicing optimizations to avoid fusing raw numpy arrays (:pr:`2075`)
|
||
(:pr:`2080`)
|
||
* Dask.array operations now work on numpy arrays (:pr:`2079`)
|
||
* Reshape now works in a much broader set of cases (:pr:`2089`)
|
||
* Support deepcopy python protocol (:pr:`2090`)
|
||
* Allow user-provided FFT implementations in ``da.fft`` (:pr:`2093`)
|
||
+ Bag
|
||
+ DataFrame
|
||
* Fix to_parquet with empty partitions (:pr:`2020`)
|
||
* Optional ``npartitions='auto'`` mode in ``set_index`` (:pr:`2025`)
|
||
* Optimize shuffle performance (:pr:`2032`)
|
||
* Support efficient repartitioning along time windows like
|
||
``repartition(freq='12h')`` (:pr:`2059`)
|
||
* Improve speed of categorize (:pr:`2010`)
|
||
* Support single-row dataframe arithmetic (:pr:`2085`)
|
||
* Automatically avoid shuffle when setting index with a sorted column
|
||
(:pr:`2091`)
|
||
* Improve handling of integer-na handling in read_csv (:pr:`2098`)
|
||
+ Delayed
|
||
* Repeated attribute access on delayed objects uses the same key (:pr:`2084`)
|
||
+ Core
|
||
* Improve naming of nodes in dot visuals to avoid generic ``apply``
|
||
(:pr:`2070`)
|
||
* Ensure that worker processes have different random seeds (:pr:`2094`)
|
||
- Update to version 0.14.0
|
||
+ Array
|
||
* Fix corner cases with zero shape and misaligned values in ``arange``
|
||
* Improve concatenation efficiency (:pr:`1923`)
|
||
* Avoid hashing in ``from_array`` if name is provided (:pr:`1972`)
|
||
+ Bag
|
||
* Repartition can now increase number of partitions (:pr:`1934`)
|
||
* Fix bugs in some reductions with empty partitions (:pr:`1939`), (:pr:`1950`),
|
||
(:pr:`1953`)
|
||
+ DataFrame
|
||
* Support non-uniform categoricals (:pr:`1877`), (:pr:`1930`)
|
||
* Groupby cumulative reductions (:pr:`1909`)
|
||
* DataFrame.loc indexing now supports lists (:pr:`1913`)
|
||
* Improve multi-level groupbys (:pr:`1914`)
|
||
* Improved HTML and string repr for DataFrames (:pr:`1637`)
|
||
* Parquet append (:pr:`1940`)
|
||
* Add ``dd.demo.daily_stock`` function for teaching (:pr:`1992`)
|
||
+ Delayed
|
||
* Add ``traverse=`` keyword to delayed to optionally avoid traversing nested
|
||
data structures (:pr:`1899`)
|
||
* Support Futures in from_delayed functions (:pr:`1961`)
|
||
* Improve serialization of decorated delayed functions (:pr:`1969`)
|
||
+ Core
|
||
* Improve windows path parsing in corner cases (:pr:`1910`)
|
||
* Rename tasks when fusing (:pr:`1919`)
|
||
* Add top level ``persist`` function (:pr:`1927`)
|
||
* Propagate ``errors=`` keyword in byte handling (:pr:`1954`)
|
||
* Dask.compute traverses Python collections (:pr:`1975`)
|
||
* Structural sharing between graphs in dask.array and dask.delayed (:pr:`1985`)
|
||
- Update to version 0.13.0
|
||
+ Array
|
||
* Mandatory dtypes on dask.array. All operations maintain dtype information
|
||
and UDF functions like map_blocks now require a dtype= keyword if it can not
|
||
be inferred. (:pr:`1755`)
|
||
* Support arrays without known shapes, such as arises when slicing arrays with
|
||
arrays or converting dataframes to arrays (:pr:`1838`)
|
||
* Support mutation by setting one array with another (:pr:`1840`)
|
||
* Tree reductions for covariance and correlations. (:pr:`1758`)
|
||
* Add SerializableLock for better use with distributed scheduling (:pr:`1766`)
|
||
* Improved atop support (:pr:`1800`)
|
||
* Rechunk optimization (:pr:`1737`), (:pr:`1827`)
|
||
+ Bag
|
||
* Avoid wrong results when recomputing the same groupby twice (:pr:`1867`)
|
||
+ DataFrame
|
||
* Add ``map_overlap`` for custom rolling operations (:pr:`1769`)
|
||
* Add ``shift`` (:pr:`1773`)
|
||
* Add Parquet support (:pr:`1782`) (:pr:`1792`) (:pr:`1810`), (:pr:`1843`),
|
||
(:pr:`1859`), (:pr:`1863`)
|
||
* Add missing methods combine, abs, autocorr, sem, nsmallest, first, last,
|
||
prod, (:pr:`1787`)
|
||
* Approximate nunique (:pr:`1807`), (:pr:`1824`)
|
||
* Reductions with multiple output partitions (for operations like
|
||
drop_duplicates) (:pr:`1808`), (:pr:`1823`) (:pr:`1828`)
|
||
* Add delitem and copy to DataFrames, increasing mutation support (:pr:`1858`)
|
||
+ Delayed
|
||
* Changed behaviour for ``delayed(nout=0)`` and ``delayed(nout=1)``:
|
||
``delayed(nout=1)`` does not default to ``out=None`` anymore, and
|
||
``delayed(nout=0)`` is also enabled. I.e. functions with return
|
||
tuples of length 1 or 0 can be handled correctly. This is especially
|
||
handy, if functions with a variable amount of outputs are wrapped by
|
||
``delayed``. E.g. a trivial example:
|
||
``delayed(lambda *args: args, nout=len(vals))(*vals)``
|
||
+ Core
|
||
* Refactor core byte ingest (:pr:`1768`), (:pr:`1774`)
|
||
* Improve import time (:pr:`1833`)
|
||
- update to version 0.12.0:
|
||
* update changelog (#1757)
|
||
* Avoids spurious warning message in concatenate (#1752)
|
||
* CLN: cleanup dd.multi (#1728)
|
||
* ENH: da.ufuncs now supports DataFrame/Series (#1669)
|
||
* Faster array slicing (#1731)
|
||
* Avoid calling list on partitions (#1747)
|
||
* Fix slicing error with None and ints (#1743)
|
||
* Add da.repeat (#1702)
|
||
* ENH: add dd.DataFrame.resample (#1741)
|
||
* Unify column names in dd.read_csv (#1740)
|
||
* replace empty with random in test to avoid nans
|
||
* Update diagnostics plots (#1736)
|
||
* Allow atop to change chunk shape (#1716)
|
||
* ENH: DataFrame.loc now supports 2d indexing (#1726)
|
||
* Correct shape when indexing with Ellipsis and None
|
||
* ENH: Add DataFrame.pivot_table (#1729)
|
||
* CLN: cleanup DataFrame class handling (#1727)
|
||
* ENH: Add DataFrame.combine_first (#1725)
|
||
* ENH: Add DataFrame all/any (#1724)
|
||
* micro-optimize _deps (#1722)
|
||
* A few small tweaks to da.Array.astype (#1721)
|
||
* BUG: Fixed metadata lookup failure in Accessor (#1706)
|
||
* Support auto-rechunking in stack and concatenate (#1717)
|
||
* Forward `get` kwarg in df.to_csv (#1715)
|
||
* Add rename support for multi-level columns (#1712)
|
||
* Update paid support section
|
||
* Add `drop` to reset_index (#1711)
|
||
* Cull dask.arrays on slicing (#1709)
|
||
* Update dd.read_* functions in docs
|
||
* WIP: Feature/dataframe aggregate (implements #1619) (#1678)
|
||
* Add da.round (#1708)
|
||
* Executor -> Client
|
||
* Add support of getitem for multilevel columns (#1697)
|
||
* Prepend optimization keywords with name of optimization (#1690)
|
||
* Add dd.read_table (#1682)
|
||
* Fix dd.pivot_table dtype to be deterministic (#1693)
|
||
* da.random with state is consistent across sizes (#1687)
|
||
* Remove `raises`, use pytest.raises instead (#1679)
|
||
* Remove unnecessary calls to list (#1681)
|
||
* Dataframe tree reductions (#1663)
|
||
* Add global optimizations to compute (#1675)
|
||
* TST: rename dataframe eq to assert_eq (#1674)
|
||
* ENH: Add DataFrame/Series.align (#1668)
|
||
* CLN: dataframe.io (#1664)
|
||
* ENH: Add DataFrame/Series clip_xxx (#1667)
|
||
* Clear divisions on single_partitions_merge (#1666)
|
||
* ENH: add dd.pivot_table (#1665)
|
||
* Typo in `use-cases`? (#1670)
|
||
* add distributed follow link doc page
|
||
* Dataframe elemwise (#1660)
|
||
* Windows file and endline test handling (#1661)
|
||
* remove old badges
|
||
* Fix #1656: failures when parallel testing (#1657)
|
||
* Remove use of multiprocessing.Manager (#1652) (#1653)
|
||
* A few fixes for `map_blocks` (#1654)
|
||
* Automatically expand chunking in atop (#1644)
|
||
* Add AppVeyor configuration (#1648)
|
||
* TST: move flake8 to travis script (#1655)
|
||
* CLN: Remove unused funcs (#1638)
|
||
* Implementing .size and groupby size method (#1627) (#1649)
|
||
* Use strides, shape, and offset in memmap tokenize (#1646)
|
||
* Validate scalar metadata is scalar (#1642)
|
||
* Convert readthedocs links for their .org -> .io migration for
|
||
hosted projects (#1639)
|
||
* CLN: little cleanup of dd.categorical (#1635)
|
||
* Signature of Array.transpose matches numpy (#1632)
|
||
* Error nicely when indexing Array with Array (#1629)
|
||
* ENH: add DataFrame.get_xtype_counts (#1634)
|
||
* PEP8: some fixes (#1633)
|
||
- changes from version 0.11.1:
|
||
* support uniform index partitions in set_index(sorted) (#1626)
|
||
* Groupby works with multiprocessing (#1625)
|
||
* Use a nonempty index in _maybe_partial_time_string
|
||
* Fix segfault in groupby-var
|
||
* Support Pandas 0.19.0
|
||
* Deprecations (#1624)
|
||
* work-around for ddf.info() failing because of
|
||
https://github.com/pydata/pandas/issues/14368 (#1623)
|
||
* .str accessor needs to pass thru both args & kwargs (#1621)
|
||
* Ensure dtype is provided in additional tests (#1620)
|
||
* coerce rounded numbers to int in dask.array.ghost (#1618)
|
||
* Use assert_eq everywhere in dask.array tests (#1617)
|
||
* Update documentation (#1606)
|
||
* Support new_axes= keyword in atop (#1612)
|
||
* pass through node_attr and edge_attr in dot_graph (#1614)
|
||
* Add swapaxes to dask array (#1611)
|
||
* add clip to Array (#1610)
|
||
* Add atop(concatenate=False) keyword argument (#1609)
|
||
* Better error message on metadata inference failure (#1598)
|
||
* ENH/API: Enhanced Categorical Accessor (#1574)
|
||
* PEP8: dataframe fix except E127,E402,E501,E731 (#1601)
|
||
* ENH: dd.get_dummies for categorical Series (#1602)
|
||
* PEP8: some fixes (#1605)
|
||
* Fix da.learn tests for scikit-learn release (#1597)
|
||
* Suppress warnings in psutil (#1589)
|
||
* avoid more timeseries warnings (#1586)
|
||
* Support inplace operators in dataframe (#1585)
|
||
* Squash warnings in resample (#1583)
|
||
* expand imports for dask.distributed (#1580)
|
||
* Add indicator keyword to dd.merge (#1575)
|
||
* Error loudly if `nrows` used in read_csv (#1576)
|
||
* Add versioneer (#1569)
|
||
* Strengthen statement about gitter for developers in docs
|
||
* Raise IndexError on out of bounds slice. (#1579)
|
||
* ENH: Support Series in read_hdf (#1577)
|
||
* COMPAT/API: DataFrame.categorize missing values (#1578)
|
||
* Add `pipe` method to dask.dataframe (#1567)
|
||
* Sample from `read_bytes` ends on a delimiter (#1571)
|
||
* Remove mention of bag join in docs (#1568)
|
||
* Tokenize mmap works without filename (#1570)
|
||
* String accessor works with indexes (#1561)
|
||
* corrected links to documentation from Examples (#1557)
|
||
* Use conda-forge channel in travis (#1559)
|
||
* add s3fs to travis.yml (#1558)
|
||
* ENH: DataFrame.select_dtypes (#1556)
|
||
* Improve slicing performance (#1539)
|
||
* Check meta in `__init__` of _Frame
|
||
* Fix metadata in Series.getitem
|
||
* A few changes to `dask.delayed` (#1542)
|
||
* Fixed read_hdf example (#1544)
|
||
* add section on distributed computing with link to toc
|
||
* Fix spelling (#1535)
|
||
* Only fuse simple indexing with getarray backends (#1529)
|
||
* Deemphasize graphs in docs (#1531)
|
||
* Avoid pickle when tokenizing __main__ functions (#1527)
|
||
* Add changelog doc going up to dask 0.6.1 (2015-07-23). (#1526)
|
||
* update dataframe docs
|
||
* update index
|
||
* Update to highlight the use of glob based file naming option for
|
||
df exports (#1525)
|
||
* Add custom docstring to dd.to_csv, mentioning that one file per
|
||
partition is written (#1524)
|
||
* Run slow tests in Travis for all Python versions, even if coverage
|
||
check is disabled. (#1523)
|
||
* Unify example doc pages into one (#1520)
|
||
* Remove lambda/inner functions in dask.dataframe (#1516)
|
||
* Add documentation for dataframe metadata (#1514)
|
||
* "dd.map_partitions" works with scalar outputs (#1515)
|
||
* meta_nonempty returns types of correct size (#1513)
|
||
* add memory use note to tsqr docstring
|
||
* Fix slow consistent keyname test (#1510)
|
||
* Chunks check (#1504)
|
||
* Fix last 'line' in sample; prevents open quotes. (#1495)
|
||
* Create new threadpool when operating from thread (#1487)
|
||
* Add finalize- prefix to dask.delayed collections
|
||
* Move key-split from distributed to dask
|
||
* State that delayed values should be lists in bag.from_delayed
|
||
(#1490)
|
||
* Use lists in db.from_sequence (#1491)
|
||
* Implement user defined aggregations (#1483)
|
||
* Field access works with non-scalar fields (#1484)
|
||
- Update to 0.11.0
|
||
* DataFrames now enforce knowing full metadata (columns, dtypes)
|
||
everywhere. Previously we would operate in an ambiguous state
|
||
when functions lost dtype information (such as apply). Now all
|
||
dataframes always know their dtypes and raise errors asking for
|
||
information if they are unable to infer (which they usually
|
||
can). Some internal attributes like _pd and _pd_nonempty have
|
||
been moved.
|
||
* The internals of the distributed scheduler have been refactored
|
||
to transition tasks between explicit states. This improves
|
||
resilience, reasoning about scheduling, plugin operation, and
|
||
logging. It also makes the scheduler code easier to understand
|
||
for newcomers.
|
||
* Breaking Changes
|
||
+ The distributed.s3 and distributed.hdfs namespaces are gone.
|
||
Use protocols in normal methods like read_text('s3://...'
|
||
instead.
|
||
+ Dask.array.reshape now errs in some cases where previously
|
||
it would have create a very large number of tasks
|
||
- update to version 0.10.2:
|
||
* raise informative error on merge(on=frame)
|
||
* Fix crash with -OO Python command line (#1388)
|
||
* [WIP] Read hdf partitioned (#1407)
|
||
* Add dask.array.digitize. (#1409)
|
||
* Adding documentation to create dask DataFrame from HDF5 (#1405)
|
||
* Unify shuffle algorithms (#1404)
|
||
* dd.read_hdf: clear errors on exceeding row numbers (#1406)
|
||
* Rename `get_division` to `get_partition`
|
||
* Add nice error messages on import failures
|
||
* Use task-based shuffle in hash_joins (#1383)
|
||
* Fixed #1381: Reimplemented DataFrame.repartition(npartition=N) so
|
||
it doesn't require indexing and just coalesce existing partitions
|
||
without shuffling/balancing (#1396)
|
||
* Import visualize from dask.diagnostics in docs
|
||
* Backport `equal_nans` to older version of numpy
|
||
* Improve checks for dtype and shape in dask.array
|
||
* Progess bar process should be deamon
|
||
* LZMA may not be available in python 3 (#1391)
|
||
* dd.to_hdf: multiple files multiprocessing avoid locks (#1384)
|
||
* dir works with numeric column names
|
||
* Dataframe groupby works with numeric column names
|
||
* Use fsync when appending to partd
|
||
* Fix pickling issue in dataframe to_bag
|
||
* Add documentation for dask.dataframe.to_hdf
|
||
* Fixed a copy-paste typo in DataFrame.map_partitions docstring
|
||
* Fix 'visualize' import location in diagnostics documentation
|
||
(#1376)
|
||
* update cheat sheet (#1371)
|
||
- update to version 0.10.1:
|
||
* `inline` no longer removes keys (#1356)
|
||
* avoid c: in infer_storage_options (#1369)
|
||
* Protect reductions against empty partitions (#1361)
|
||
* Add doc examples for dask.array.histogram. (#1363)
|
||
* Fix typo in pip install requirements path (#1364)
|
||
* avoid unnecessary dependencies between save tasks in
|
||
dataframe.to_hdf (#1293)
|
||
* remove xfail mark for blosc missing const
|
||
* Add `anon=True` for read from s3 test
|
||
* `subs` doesn't needlessly compare keys and values
|
||
* Use pytest.importorskip instead of try/except/return pattern
|
||
* Fixes for bokeh 0.12.0
|
||
* Multiprocess scheduler handles unpickling errors
|
||
* arra.random with array-like parameters (#1327)
|
||
* Fixes issue #1337 (#1338)
|
||
* Remove dask runtime dependence on mock 2.7 backport.
|
||
* Load known but external protocols automatically (#1325)
|
||
* Add center argument to Series/DataFrame.rolling (#1280)
|
||
* Add Bag.random_sample method. (#1332)
|
||
* Correct docs install command and add missing required packages
|
||
(#1333)
|
||
* Mark the 4 slowest tests as slow to get a faster suite by
|
||
default. (#1334)
|
||
* Travis: Install mock package in Python 2.7.
|
||
* Automatic blocksize for read_csv based on available memory and
|
||
number of cores.
|
||
* Replace "Matthew Rocklin" with "Dask Development Team" (#1329)
|
||
* Support column assignment in DataFrame (#1322)
|
||
* Few travis fixes, pandas version >= 0.18.0 (#1314)
|
||
* Don't run hdf test if pytables package is not present. (#1323)
|
||
* Add delayed.compute to api docs.
|
||
* Support datetimes in DataFrame._build_pd (#1319)
|
||
* Test setting the index with datetime with timezones, which is a
|
||
pandas-defined dtype
|
||
* (#1315)
|
||
* Add s3fs to requirements (#1316)
|
||
* Pass dtype information through in Series.astype (#1320)
|
||
* Add draft of development guidelines (#1305)
|
||
* Skip tests needing optional package when it's not present. (#1318)
|
||
* DOC: Document DataFrame.categorize
|
||
* make dd.to_csv support writing to multiple csv files (#1303)
|
||
* quantiles for repartitioning (#1261)
|
||
* DOC: Minimal doc for get_sync (#1312)
|
||
* Pass through storage_options in db.read_text (#1304)
|
||
* Fixes #1237: correctly propagate storage_options through read_*
|
||
APIs and use urlsplit to automatically get remote connection
|
||
settings (#1269)
|
||
* TST: Travis build matrix to specify numpy/pandas ver (#1300)
|
||
* amend doc string to Bag.to_textfiles
|
||
* Return dask.Delayed when saving files with compute = false (#1286)
|
||
* Support empty or small dataframes in from_pandas (#1290)
|
||
* Add validation and tests for order breaking name_function (#1275)
|
||
* ENH: dataframe now supports partial string selection (#1278)
|
||
* Fix typo in spark-dask docs
|
||
* added note and verbose exception about CSV parsing errors (#1287)
|
||
- update to version 0.10.0:
|
||
* Add parametrization to merge tests
|
||
* Add more challenging types to nonempty_sample_df test
|
||
* Windows fixes
|
||
* TST: Fix coveralls badge (#1276)
|
||
* Sort index on shuffle (#1274)
|
||
* Update specification docs to reflect new spec.
|
||
* Add groupby docs (#1273)
|
||
* Update spark docs
|
||
* Rolling class receives normal arguments (unchecked other than
|
||
pandas call), stores at
|
||
* Reduce communication in rolling operations #1242 (#1270)
|
||
* Fix Shuffle (#1255)
|
||
* Work on earlier versions of Pandas
|
||
* Handle additional Pandas types
|
||
* Use non-empty fake dataframe in merge operations
|
||
* Add failing test for merge case
|
||
* Add utility function to create sample dataframe
|
||
* update release procedure
|
||
* amend doc string to Bag.to_textfiles (#1258)
|
||
* Drop Python 2.6 support (#1264)
|
||
* Clean DataFrame naming conventions (#1263)
|
||
* Fix some bugs in the rolling implementation.
|
||
* Fix core.get to use new spec
|
||
* Make graph definition recursive
|
||
* Handle empty partitions in dask.bag.to_textfiles
|
||
* test index.min/max
|
||
* Add regression test for non-ndarray slicing
|
||
* Standardize dataframe keynames
|
||
* bump csv sample size to 256k (#1253)
|
||
* Switch tests to utils.tmpdir (#1251)
|
||
* Fix dot_graph filename split bug
|
||
* Correct documentation to reflect argument existing now.
|
||
* Allow non-zero axis for .rolling (for application over columns)
|
||
* Fix scheduler behavior for top-level lists
|
||
* Various spelling mistakes in docstrings, comments, exception
|
||
messages, and a filename
|
||
* Fix typo. (#1247)
|
||
* Fix tokenize in dask.delayed
|
||
* Remove unused imports, pep8 fixes
|
||
* Fix bug in slicing optimization
|
||
* Add Task Shuffle (#1186)
|
||
* Add bytes API (#1224)
|
||
* Add dask_key_name to docs, fix bug in methods
|
||
* Allow formatting in dask.dataframe.to_hdf path and key parameters
|
||
* Match pandas' exceptions a bit closer in the rolling API. Also,
|
||
correct computation f
|
||
* Add tests to package (#1231)
|
||
* Document visualize method (#1234)
|
||
* Skip new rolling API's tests if the pandas we have is too old.
|
||
* Improve df_or_series.rolling(...) implementation.
|
||
* Remove `iloc` property on `dask.dataframe`
|
||
* Support for the new pandas rolling API.
|
||
* test delayed names are different under kwargs
|
||
* Add Hussain Sultan to AUTHORS
|
||
* Add `optimize_graph` keyword to multiprocessing get
|
||
* Add `optimize_graph` keyword to `compute`
|
||
* Add dd.info() (#1213)
|
||
* Cleanup base tests
|
||
* Add groupby documentation stub
|
||
* pngmath is deprecated in sphinx 1.4
|
||
* A few docfixes
|
||
* Extract dtype in dd.from_bcolz
|
||
* Throw NotImplementedError if old toolz.accumulate
|
||
* Add isnull and notnull for dataframe
|
||
* Add dask.bag.accumulate
|
||
* Fix categorical partitioning
|
||
* create single lock for glob read_hdf
|
||
* Fix failing from_url doctest
|
||
* Add missing api to bag docs
|
||
* Add Skipper Seabold to AUTHORS.
|
||
* Don't use mutable default argument
|
||
* Fix typo
|
||
* Ensure to_task_dasks always returns a task
|
||
* Fix dir for dataframe objects
|
||
* Infer metadata in dd.from_delayed
|
||
* Fix some closure issues in dask.dataframe
|
||
* Add storage_options keyword to read_csv
|
||
* Define finalize function for dask.dataframe.Scalar
|
||
* py26 compatibility
|
||
* add stacked logos to docs
|
||
* test from-array names
|
||
* rename from_array tasks
|
||
* add atop to array docs
|
||
* Add motivation and example to delayed docs
|
||
* splat out delayed values in compute docs
|
||
* Fix optimize docs
|
||
* add html page with logos
|
||
* add dask logo to documentation images
|
||
* Few pep8 cleanups to dask.dataframe.groupby
|
||
* Groupby aggregate works with list of columns
|
||
* Use different names for input and output in from_array
|
||
* Don't enforce same column names
|
||
* don't write header for first block in csv
|
||
* Add var and std to DataFrame groupby (#1159)
|
||
* Move conda recipe to conda-forge (#1162)
|
||
* Use function names in map_blocks and elemwise (#1163)
|
||
* add hyphen to delayed name (#1161)
|
||
* Avoid shuffles when merging with Pandas objects (#1154)
|
||
* Add DataFrame.eval
|
||
* Ensure future imports
|
||
* Add db.Bag.unzip
|
||
* Guard against shape attributes that are not sequences
|
||
* Add dask.array.multinomial
|
||
- update to version 0.9.0:
|
||
* No upstream changelog
|
||
- update to version 0.8.2:
|
||
* No upstream changelog
|
||
- update to version 0.8.1:
|
||
* No upstream changelog
|
||
- update to version 0.8.0:
|
||
* No upstream changelog
|
||
- update to version 0.7.5:
|
||
* No upstream changelog
|
||
- update to version 0.7.5:
|
||
* No upstream changelog
|
||
- update to version 0.7.0:
|
||
* No upstream changelog
|
||
- update to version 0.6.1:
|
||
* No upstream changelog
|
||
|
||
-------------------------------------------------------------------
|
||
Tue Jul 14 13:33:53 UTC 2015 - toddrme2178@gmail.com
|
||
|
||
- Update to 0.6.0
|
||
* No upstream changelog
|
||
|
||
-------------------------------------------------------------------
|
||
Tue May 19 11:03:41 UTC 2015 - toddrme2178@gmail.com
|
||
|
||
- Update to 0.5.0
|
||
* No upstream changelog
|
||
|
||
-------------------------------------------------------------------
|
||
Thu Apr 9 16:57:59 UTC 2015 - toddrme2178@gmail.com
|
||
|
||
- Initial version
|
||
|