forked from pool/python-pandas

Accepting request 1090040 from home:bnavigator:branches:devel:languages:python:numeric

- Update to 2.0.2
  ## Fixed regressions
  * Fixed performance regression in GroupBy.apply() (GH53195)
  * Fixed regression in merge() on Windows when dtype is np.intc
    (GH52451)
  * Fixed regression in read_sql() dropping columns with duplicated
    column names (GH53117)
  * Fixed regression in DataFrame.loc() losing MultiIndex name when
    enlarging object (GH53053)
  * Fixed regression in DataFrame.to_string() printing a backslash
    at the end of the first row of data, instead of headers, when
    the DataFrame doesn’t fit the line width (GH53054)
  * Fixed regression in MultiIndex.join() returning levels in wrong
    order (GH53093)
  ## Bug fixes
  * Bug in arrays.ArrowExtensionArray incorrectly assigning dict
    instead of list for .type with pyarrow.map_ and raising a
    NotImplementedError with pyarrow.struct (GH53328)
  * Bug in api.interchange.from_dataframe() was raising IndexError
    on empty categorical data (GH53077)
  * Bug in api.interchange.from_dataframe() was returning
    DataFrames of incorrect sizes when called on slices (GH52824)
  * Bug in api.interchange.from_dataframe() was unnecessarily
    raising on bitmasks (GH49888)
  * Bug in merge() when merging on datetime columns on different
    resolutions (GH53200)
  * Bug in read_csv() raising OverflowError for engine="pyarrow"
    and parse_dates set (GH53295)
  * Bug in to_datetime() was inferring format to contain "%H"
    instead of "%I" if date contained “AM” / “PM” tokens (GH53147)
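
The last fix above hinges on the difference between the "%H" (24-hour) and "%I" (12-hour) directives. A minimal sketch of that difference, using only the standard-library datetime module rather than pandas' own format inference (the sample string is made up for illustration, and it assumes the default C/English locale for the AM/PM token):

```python
from datetime import datetime

# Hypothetical sample value; any 12-hour clock string with an AM/PM token works.
stamp = "01:30 PM"

# 12-hour directive: %p is honoured, so the PM token shifts the hour.
twelve_hour = datetime.strptime(stamp, "%I:%M %p")

# 24-hour directive: %p is still consumed, but it does not change the hour.
twenty_four_hour = datetime.strptime(stamp, "%H:%M %p")

print(twelve_hour.hour, twenty_four_hour.hour)
```

An inferred format containing "%H" would therefore silently read afternoon times as morning times, which is presumably why the inference now prefers "%I" when AM/PM tokens are present.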

OBS-URL: https://build.opensuse.org/request/show/1090040
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=84
2023-06-05 14:32:15 +00:00
committed by Git OBS Bridge
parent 5d32d9366f
commit 749b1f8721
5 changed files with 619 additions and 96 deletions


@@ -1,3 +1,170 @@
-------------------------------------------------------------------
Sat May 27 13:18:13 UTC 2023 - Ben Greiner <code@bnavigator.de>
- Update to 2.0.2
## Fixed regressions
* Fixed performance regression in GroupBy.apply() (GH53195)
* Fixed regression in merge() on Windows when dtype is np.intc
(GH52451)
* Fixed regression in read_sql() dropping columns with duplicated
column names (GH53117)
* Fixed regression in DataFrame.loc() losing MultiIndex name when
enlarging object (GH53053)
* Fixed regression in DataFrame.to_string() printing a backslash
at the end of the first row of data, instead of headers, when
the DataFrame doesn’t fit the line width (GH53054)
* Fixed regression in MultiIndex.join() returning levels in wrong
order (GH53093)
## Bug fixes
* Bug in arrays.ArrowExtensionArray incorrectly assigning dict
instead of list for .type with pyarrow.map_ and raising a
NotImplementedError with pyarrow.struct (GH53328)
* Bug in api.interchange.from_dataframe() was raising IndexError
on empty categorical data (GH53077)
* Bug in api.interchange.from_dataframe() was returning
DataFrames of incorrect sizes when called on slices (GH52824)
* Bug in api.interchange.from_dataframe() was unnecessarily
raising on bitmasks (GH49888)
* Bug in merge() when merging on datetime columns on different
resolutions (GH53200)
* Bug in read_csv() raising OverflowError for engine="pyarrow"
and parse_dates set (GH53295)
* Bug in to_datetime() was inferring format to contain "%H"
instead of "%I" if date contained “AM” / “PM” tokens (GH53147)
* Bug in DataFrame.convert_dtypes() ignores convert_* keywords
when set to False with dtype_backend="pyarrow" (GH52872)
* Bug in DataFrame.convert_dtypes() losing timezone for tz-aware
dtypes and dtype_backend="pyarrow" (GH53382)
* Bug in DataFrame.sort_values() raising for PyArrow dictionary
dtype (GH53232)
* Bug in Series.describe() treating pyarrow-backed timestamps and
timedeltas as categorical data (GH53001)
* Bug in Series.rename() not making a lazy copy when
Copy-on-Write is enabled when a scalar is passed to it
(GH52450)
* Bug in pd.array() raising for NumPy array and pa.large_string
or pa.large_binary (GH52590)
* Bug in DataFrame.__getitem__() not preserving dtypes for
MultiIndex partial keys (GH51895)
## Other
* Raised a better error message when calling
Series.dt.to_pydatetime() with ArrowDtype with pyarrow.date32
or pyarrow.date64 type (GH52812)
- Release 2.0.1
## Fixed regressions
* Fixed regression for subclassed Series when constructing from a
dictionary (GH52445)
* Fixed regression in SeriesGroupBy.agg() failing when grouping
with categorical data, multiple groupings, as_index=False, and
a list of aggregations (GH52760)
* Fixed regression in DataFrame.pivot() changing Index name of
input object (GH52629)
* Fixed regression in DataFrame.resample() raising on a DataFrame
with no columns (GH52484)
* Fixed regression in DataFrame.sort_values() not resetting index
when DataFrame is already sorted and ignore_index=True
(GH52553)
* Fixed regression in MultiIndex.isin() raising TypeError for
Generator (GH52568)
* Fixed regression in Series.describe() showing RuntimeWarning
for extension dtype Series with one element (GH52515)
* Fixed regression when adding a new column to a DataFrame when
the DataFrame.columns was a RangeIndex and the new key was
hashable but not a scalar (GH52652)
## Bug fixes
* Bug in Series.dt.days that would overflow int32 number of days
(GH52391)
* Bug in arrays.DatetimeArray constructor returning an incorrect
unit when passed a non-nanosecond numpy datetime array
(GH52555)
* Bug in ArrowExtensionArray with duration dtype overflowing when
constructed from data containing numpy NaT (GH52843)
* Bug in Series.dt.round() when passing a freq of equal or higher
resolution compared to the Series would raise a
ZeroDivisionError (GH52761)
* Bug in Series.median() with ArrowDtype returning an approximate
median (GH52679)
* Bug in api.interchange.from_dataframe() was unnecessarily
raising on categorical dtypes (GH49889)
* Bug in api.interchange.from_dataframe() was unnecessarily
raising on large string dtypes (GH52795)
* Bug in pandas.testing.assert_series_equal() where
check_dtype=False would still raise for datetime or timedelta
types with different resolutions (GH52449)
* Bug in read_csv() casting PyArrow datetimes to NumPy when
dtype_backend="pyarrow" and parse_dates is set causing a
performance bottleneck in the process (GH52546)
* Bug in to_datetime() and to_timedelta() when trying to convert
numeric data with an ArrowDtype (GH52425)
* Bug in to_numeric() with errors='coerce' and
dtype_backend='pyarrow' with ArrowDtype data (GH52588)
* Bug in ArrowDtype.__from_arrow__() not respecting if dtype is
explicitly given (GH52533)
* Bug in DataFrame.describe() not respecting ArrowDtype in
include and exclude (GH52570)
* Bug in DataFrame.max() and related casting different Timestamp
resolutions always to nanoseconds (GH52524)
* Bug in Series.describe() not returning ArrowDtype with
pyarrow.float64 type with numeric data (GH52427)
* Bug in Series.dt.tz_localize() incorrectly localizing
timestamps with ArrowDtype (GH52677)
* Bug in arithmetic between np.datetime64 and np.timedelta64 NaT
scalars with units always returning nanosecond resolution
(GH52295)
* Bug in logical and comparison operations between ArrowDtype and
numpy masked types (e.g. "boolean") (GH52625)
* Fixed bug in merge() when merging with ArrowDtype on one side and a
NumPy dtype on the other side (GH52406)
* Fixed segfault in Series.to_numpy() with null[pyarrow] dtype
(GH52443)
## Other
* DataFrame created from empty dicts had columns of dtype object.
It is now a RangeIndex (GH52404)
* Series created from empty dicts had index of dtype object. It
is now a RangeIndex (GH52404)
* Implemented Series.str.split() and Series.str.rsplit() for
ArrowDtype with pyarrow.string (GH52401)
* Implemented most str accessor methods for ArrowDtype with
pyarrow.string (GH52401)
* Supplying a non-integer hashable key that tests False in
api.types.is_scalar() now raises a KeyError for
RangeIndex.get_loc(), like it does for Index.get_loc().
Previously it raised an InvalidIndexError (GH52652).
- Release 2.0.0
## Enhancements
* Installing optional dependencies with pip extras
* Index can now hold numpy numeric dtypes
* Argument dtype_backend, to return pyarrow-backed or
numpy-backed nullable dtypes
* Copy-on-Write improvements
* Other enhancements, see
https://pandas.pydata.org/pandas-docs/version/2.0.2/whatsnew/v2.0.0.html#other-enhancements
## Notable bug fixes
* DataFrameGroupBy.cumsum() and DataFrameGroupBy.cumprod()
overflow instead of lossy casting to float
* DataFrameGroupBy.nth() and SeriesGroupBy.nth() now behave as
filtrations
## Backwards incompatible API changes
* Construction with datetime64 or timedelta64 dtype with
unsupported resolution
* Value counts sets the resulting name to count
* Disallow astype conversion to non-supported
datetime64/timedelta64 dtypes
* UTC and fixed-offset timezones default to standard-library
tzinfo objects
* Empty DataFrames/Series will now default to have a RangeIndex
* DataFrame to LaTeX has a new render engine
* Increased minimum versions for dependencies
* Datetimes are now parsed with a consistent format
* Other API changes, see
https://pandas.pydata.org/pandas-docs/version/2.0.2/whatsnew/v2.0.0.html#other-api-changes
## Deprecations
## Removal of prior version deprecations/changes
## Performance improvements
## Bug fixes
- Drop python38 test flavor and start testing python311 which has
been missing since.
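
Two of the backwards incompatible 2.0.0 changes listed above are easy to trip over when updating dependent packages: value_counts() now names its result "count", and empty objects default to a RangeIndex. A short check, assuming pandas >= 2.0 is installed (the sample data is invented for illustration):

```python
import pandas as pd

# value_counts() on pandas >= 2.0 names the resulting Series "count"
# (earlier releases reused the name of the original Series instead).
counts = pd.Series(["a", "b", "a"]).value_counts()
print(counts.name)

# Empty objects now default to a RangeIndex rather than an object-dtype Index.
empty_frame = pd.DataFrame()
empty_series = pd.Series(dtype="float64")
print(type(empty_frame.index).__name__, type(empty_series.index).__name__)
```

Code that looks up the value_counts() result by its old name, or that compares the index of an empty object against an object-dtype Index, is the likeliest to need adjusting.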
-------------------------------------------------------------------
Mon May 8 06:10:30 UTC 2023 - Johannes Kastl <kastl@b1-systems.de>