- Update to 2.2.1
## Enhancements
* Added pyarrow pip extra so users can install pandas and pyarrow
with `pip install pandas[pyarrow]` (#54466)
## Fixed regressions
* Fixed memory leak in `read_csv` (#57039)
* Fixed performance regression in `Series.combine_first` (#55845)
* Fixed regression causing overflow for near-minimum timestamps
(#57150)
* Fixed regression in `concat` changing long-standing behavior
that always sorted the non-concatenation axis when the axis was
a `DatetimeIndex` (#57006)
* Fixed regression in `merge_ordered` raising TypeError for
fill_method="ffill" and how="left" (#57010)
* Fixed regression in `pandas.testing.assert_series_equal`
defaulting to check_exact=True when checking the `Index`
(#57067)
* Fixed regression in `read_json` where an `Index` would be
returned instead of a `RangeIndex` (#57429)
* Fixed regression in `wide_to_long` raising an AttributeError
for string columns (#57066)
* Fixed regression in `CategoricalIndex.difference` raising
KeyError when other contains null values other than NaN
(#57318)
* Fixed regression in `DataFrame.groupby` raising ValueError when
grouping by a `Series` in some cases (#57276)
* Fixed regression in `DataFrame.loc` raising IndexError for
non-unique, masked dtype indexes where result has more than
10,000 rows (#57027)
* Fixed regression in `DataFrame.loc` which was unnecessarily
throwing "incompatible dtype warning" when expanding with
partial row indexer and multiple columns (see PDEP6) (#56503)
* Fixed regression in `DataFrame.map` with na_action="ignore" not
being respected for NumPy nullable and `ArrowDtypes` (#57316)
* Fixed regression in `DataFrame.merge` raising ValueError for
certain types of 3rd-party extension arrays (#57316)
* Fixed regression in `DataFrame.query` with all NaT column with
object dtype (#57068)
* Fixed regression in `DataFrame.shift` raising AssertionError
for axis=1 and empty `DataFrame` (#57301)
* Fixed regression in `DataFrame.sort_index` not producing a
stable sort for an index with duplicates (#57151)
* Fixed regression in `DataFrame.to_dict` with orient='list' and
datetime or timedelta types returning integers (#54824)
* Fixed regression in `DataFrame.to_json` converting nullable
integers to floats (#57224)
* Fixed regression in `DataFrame.to_sql` when method="multi" is
passed and the dialect type is not Oracle (#57310)
* Fixed regression in `DataFrame.transpose` with nullable
extension dtypes not having F-contiguous data potentially
causing exceptions when used (#57315)
* Fixed regression in `DataFrame.update` emitting incorrect
warnings about downcasting (#57124)
* Fixed regression in `DataFrameGroupBy.idxmin`,
`DataFrameGroupBy.idxmax`, `SeriesGroupBy.idxmin`,
`SeriesGroupBy.idxmax` ignoring the skipna argument (#57040)
* Fixed regression in `DataFrameGroupBy.idxmin`,
`DataFrameGroupBy.idxmax`, `SeriesGroupBy.idxmin`,
`SeriesGroupBy.idxmax` where values containing the minimum or
maximum value for the dtype could produce incorrect results
(#57040)
* Fixed regression in `ExtensionArray.to_numpy` raising for
non-numeric masked dtypes (#56991)
* Fixed regression in `Index.join` raising TypeError when joining
an empty index to a non-empty index containing mixed dtype
values (#57048)
* Fixed regression in `Series.astype` introducing decimals when
converting from integer with missing values to string dtype
(#57418)
* Fixed regression in `Series.pct_change` raising a ValueError
for an empty `Series` (#57056)
* Fixed regression in `Series.to_numpy` when dtype is given as
float and the data contains NaNs (#57121)
* Fixed regression in addition or subtraction of `DateOffset`
objects with millisecond components to datetime64 `Index`,
`Series`, or `DataFrame` (#57529)
## Bug fixes
* Fixed bug in `pandas.api.interchange.from_dataframe` which was
raising for Nullable integers (#55069)
* Fixed bug in `pandas.api.interchange.from_dataframe` which was
raising for empty inputs (#56700)
* Fixed bug in `pandas.api.interchange.from_dataframe` which
wasn't converting column names to strings (#55069)
* Fixed bug in `DataFrame.__getitem__` for empty `DataFrame` with
Copy-on-Write enabled (#57130)
* Fixed bug in `PeriodIndex.asfreq` which was silently converting
frequencies which are not supported as period frequencies
instead of raising an error (#56945)
## Note
* The DeprecationWarning that was raised when pandas was imported
without PyArrow being installed has been removed. This decision
was made because the warning was too noisy for too many users
and a lot of feedback was collected about the decision to make
PyArrow a required dependency. Pandas is currently considering
the decision whether or not PyArrow should be added as a hard
dependency in 3.0. Interested users can follow the discussion
here.
* Added the argument skipna to `DataFrameGroupBy.first`,
`DataFrameGroupBy.last`, `SeriesGroupBy.first`, and
`SeriesGroupBy.last`; achieving skipna=False used to be
available via `DataFrameGroupBy.nth`, but the behavior was
changed in pandas 2.0.0 (#57019)
* Added the argument skipna to `Resampler.first`,
`Resampler.last` (#57019)
- Release notes for 2.2.0
* For full changelog see
https://github.com/pandas-dev/pandas/blob/main/doc/source/whatsnew/v2.2.0.rst
## Enhancements
* ADBC Driver support in to_sql and read_sql
* Create a pandas Series based on one or more conditions
* to_numpy for NumPy nullable and Arrow types converts to
suitable NumPy dtype
* Series.struct accessor for PyArrow structured data
* Series.list accessor for PyArrow list data
* Calamine engine for `read_excel`
## Notable bug fixes
* `merge` and `DataFrame.join` now consistently follow documented
sort behavior
* `merge` and `DataFrame.join` no longer reorder levels when
levels differ
* Increased minimum versions for dependencies
## Deprecations
* Chained assignment
* Deprecate aliases M, Q, Y, etc. in favour of ME, QE, YE, etc.
for offsets
* Deprecated automatic downcasting
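
For the offset alias deprecation above, downstream code that builds frequency strings may want to translate old spellings. The helper below is a hypothetical sketch, not a pandas API: `modernize_offset_alias` and its rename table are assumptions for illustration, and the table covers only a subset of the renamed aliases. It is meant for offset strings (such as the `freq=` argument of `date_range`), not for `Period` frequencies.

```python
import re

# Illustrative subset of the offset alias renames in pandas 2.2.0.
# Neither this table nor the helper is part of pandas.
_RENAMED = {
    "M": "ME", "BM": "BME", "SM": "SME", "CBM": "CBME",
    "Q": "QE", "BQ": "BQE", "Y": "YE", "BY": "BYE",
}

# Optional integer multiple, an upper-case alias, optional "-ANCHOR" suffix.
_PATTERN = re.compile(r"(-?\d*)([A-Z]+)(-.+)?")


def modernize_offset_alias(freq: str) -> str:
    """Return freq with a deprecated offset alias replaced, if any."""
    match = _PATTERN.fullmatch(freq)
    if match is None:
        # Not in the simple shape handled here; leave it untouched.
        return freq
    multiple, alias, anchor = match.groups()
    new_alias = _RENAMED.get(alias, alias)
    return f"{multiple}{new_alias}{anchor or ''}"


for old in ["M", "2Q", "Q-DEC", "MS", "BM"]:
    print(old, "->", modernize_offset_alias(old))
```

Aliases that are not in the table, such as `MS` (month start), pass through unchanged.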
- Simplify flavor test setup: obs can evaluate %{shrink:} now
OBS-URL: https://build.opensuse.org/request/show/1152058
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=110
- Update to 2.1.4
## Fixed regressions
* Fixed regression when trying to read a pickled pandas DataFrame
from pandas 1.3 (GH 55137)
## Bug fixes
* Bug in Series constructor raising DeprecationWarning when index
is a list of Series (GH 55228)
* Bug in Series when trying to cast date-like string inputs to
ArrowDtype of pyarrow.timestamp (GH 56266)
* Bug in DataFrame.apply() where passing raw=True ignored args
passed to the applied function (GH 55753)
* Bug in Index.__getitem__() returning wrong result for Arrow
dtypes and negative stepsize (GH 55832)
* Fixed bug in to_numeric() converting to extension dtype for
string[pyarrow_numpy] dtype (GH 56179)
* Fixed bug in DataFrameGroupBy.min() and DataFrameGroupBy.max()
not preserving extension dtype for empty object (GH 55619)
* Fixed bug in DataFrame.__setitem__() casting Index with
object-dtype to PyArrow backed strings when infer_string option
is set (GH 55638)
* Fixed bug in DataFrame.to_hdf() raising when columns have
StringDtype (GH 55088)
* Fixed bug in Index.insert() casting object-dtype to PyArrow
backed strings when infer_string option is set (GH 55638)
* Fixed bug in Series.__ne__() resulting in False for comparison
between NA and string value for dtype="string[pyarrow_numpy]"
(GH 56122)
* Fixed bug in Series.mode() not keeping object dtype when
infer_string is set (GH 56183)
* Fixed bug in Series.reset_index() not preserving object dtype
when infer_string is set (GH 56160)
* Fixed bug in Series.str.split() and Series.str.rsplit() when
pat=None for ArrowDtype with pyarrow.string (GH 56271)
* Fixed bug in Series.str.translate() losing object dtype when
string option is set (GH 56152)
- Go back to Cython0: the released upstream version has NOT
unpinned it
* https://github.com/pandas-dev/pandas/blob/v2.1.4/pyproject.toml#L8
* See also gh#jsonpickle/jsonpickle#460
OBS-URL: https://build.opensuse.org/request/show/1133481
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=104
- Update to 2.1.3:
* Reverted deprecation of fill_method=None in DataFrame.pct_change(),
Series.pct_change(), DataFrameGroupBy.pct_change(), and
SeriesGroupBy.pct_change(); the values 'backfill', 'bfill', 'pad', and
'ffill' are still deprecated
* Fixed regressions
+ Fixed infinite recursion from operations that return a new object on
some DataFrame subclasses
+ Fixed regression in DataFrame.join() where result has missing values
and dtype is arrow backed string
+ Fixed regression in rolling() where non-nanosecond index or on column
would produce incorrect results
+ Fixed regression in DataFrame.resample() which was extrapolating back
to origin when origin was outside its bounds
+ Fixed regression in DataFrame.sort_index() which was not sorting
correctly when the index was a sliced MultiIndex
+ Fixed regression in DataFrameGroupBy.agg() and SeriesGroupBy.agg()
where if the option compute.use_numba was set to True, groupby methods
not supported by the numba engine would raise a TypeError
+ Fixed performance regression with wide DataFrames, typically
involving methods where all columns were accessed individually
+ Fixed regression in merge_asof() raising TypeError for by with
datetime and timedelta dtypes
+ Fixed regression in read_parquet() when reading a file with a string
column consisting of more than 2 GB of string data and using the
"string" dtype
+ Fixed regression in DataFrame.to_sql() not roundtripping datetime
columns correctly for sqlite when using detect_types
+ Fixed regression in construction of certain DataFrame or Series
subclasses
OBS-URL: https://build.opensuse.org/request/show/1130126
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-pandas?expand=0&rev=58
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=102
- Update to 2.1.1
## Fixed regressions
* Fixed regression in concat() when DataFrames have two
different extension dtypes (GH 54848)
* Fixed regression in merge() when merging over a PyArrow string
index (GH 54894)
* Fixed regression in read_csv() when usecols is given and dtypes
is a dict for engine="python" (GH 54868)
* Fixed regression in read_csv() when delim_whitespace is True
(GH 54918, GH 54931)
* Fixed regression in GroupBy.get_group() raising for axis=1 (GH
54858)
* Fixed regression in DataFrame.__setitem__() raising
AssertionError when setting a Series with a partial MultiIndex
(GH 54875)
* Fixed regression in DataFrame.filter() not respecting the order
of elements for filter (GH 54980)
* Fixed regression in DataFrame.to_sql() not roundtripping
datetime columns correctly for sqlite (GH 54877)
* Fixed regression in DataFrameGroupBy.agg() when aggregating a
DataFrame with duplicate column names using a dictionary (GH
55006)
* Fixed regression in MultiIndex.append() raising when appending
overlapping IntervalIndex levels (GH 54934)
* Fixed regression in Series.drop_duplicates() for PyArrow
strings (GH 54904)
* Fixed regression in Series.interpolate() raising when
fill_value was given (GH 54920)
* Fixed regression in Series.value_counts() raising for numeric
data if bins was specified (GH 54857)
* Fixed regression in comparison operations for PyArrow backed
columns not propagating exceptions correctly (GH 54944)
* Fixed regression when comparing a Series with datetime64 dtype
with None (GH 54870)
## Bug fixes
* Fixed bug for ArrowDtype raising NotImplementedError for
fixed-size list (GH 55000)
* Fixed bug in DataFrame.stack() with future_stack=True and
columns a non-MultiIndex consisting of tuples (GH 54948)
* Fixed bug in Series.dt.tz() with ArrowDtype where a string was
returned instead of a tzinfo object (GH 55003)
* Fixed bug in Series.pct_change() and DataFrame.pct_change()
showing unnecessary FutureWarning (GH 54981)
## Other
* Reverted the deprecation that disallowed Series.apply()
returning a DataFrame when the passed-in callable returns a
Series object (GH 52116)
- Drop pandas-pr55073-pyarrow13.patch merged upstream
OBS-URL: https://build.opensuse.org/request/show/1116287
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=98
- Use git cloned archive gh#pandas-dev/pandas#54907
- Update to 2.1.0
* https://pandas.pydata.org/pandas-docs/version/2.1.0/whatsnew/v2.1.0.html
* Avoid NumPy object dtype for strings by default
* DataFrame reductions preserve extension dtypes
* Copy-on-Write improvements
* New DataFrame.map() method and support for ExtensionArrays
* New implementation of DataFrame.stack()
* Other minor enhancements (see link above)
## Backwards incompatible API changes
* pandas 2.1.0 supports Python 3.9 and higher
* Increased minimum version of numpy to 1.22.3 and minimum
versions of some optional dependencies
* arrays.PandasArray has been renamed NumpyExtensionArray and the
attached dtype name changed from PandasDtype to NumpyEADtype;
importing PandasArray still works until the next major version
(GH 53694)
## Deprecations
* Deprecated silent upcasting in setitem-like Series operations
* Deprecated parsing datetimes with mixed time zones
* Other deprecations (see link above)
## More
* Performance Improvements (see link above)
* Bug fixes (see link above)
- Switch to meson build system
OBS-URL: https://build.opensuse.org/request/show/1109356
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=94
- update to 2.0.3:
* Bug in Timestamp.weekday() was returning incorrect results
before '0000-02-29'
* Fixed performance regression in merging on datetime-like columns
* Fixed regression when DataFrame.to_string() creates extra space
for string dtypes
* Bug in DataFrame.convert_dtypes() and Series.convert_dtypes()
when trying to convert ArrowDtype with dtype_backend="nullable_numpy"
* Bug in RangeIndex.union() when using sort=True with another
RangeIndex
* Bug in Series.reindex() when expanding a non-nanosecond datetime
or timedelta
* Bug in read_csv() when defining dtype with bool[pyarrow] for
the "c" and "python" engines
* Bug in Series.str.split() and Series.str.rsplit() with expand=True
* Bug in indexing methods (e.g. DataFrame.__getitem__()) where
taking the entire DataFrame/Series would raise an OverflowError
when Copy on Write was enabled and the length of the array was
over the maximum size a 32-bit integer can hold
* Bug when constructing a DataFrame with columns of an ArrowDtype
with a pyarrow.dictionary type that reindexes the data
* Bug when indexing a DataFrame or Series with an Index with a
timestamp ArrowDtype would raise an AttributeError
- drop pandas-fix-tests.patch (upstream)
OBS-URL: https://build.opensuse.org/request/show/1104661
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-pandas?expand=0&rev=53
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=92
- Update to 2.0.2
## Fixed regressions
* Fixed performance regression in GroupBy.apply() (GH53195)
* Fixed regression in merge() on Windows when dtype is np.intc
(GH52451)
* Fixed regression in read_sql() dropping columns with duplicated
column names (GH53117)
* Fixed regression in DataFrame.loc() losing MultiIndex name when
enlarging object (GH53053)
* Fixed regression in DataFrame.to_string() printing a backslash
at the end of the first row of data, instead of headers, when
the DataFrame doesn’t fit the line width (GH53054)
* Fixed regression in MultiIndex.join() returning levels in wrong
order (GH53093)
## Bug fixes
* Bug in arrays.ArrowExtensionArray incorrectly assigning dict
instead of list for .type with pyarrow.map_ and raising a
NotImplementedError with pyarrow.struct (GH53328)
* Bug in api.interchange.from_dataframe() was raising IndexError
on empty categorical data (GH53077)
* Bug in api.interchange.from_dataframe() was returning
DataFrames of incorrect sizes when called on slices (GH52824)
* Bug in api.interchange.from_dataframe() was unnecessarily
raising on bitmasks (GH49888)
* Bug in merge() when merging on datetime columns on different
resolutions (GH53200)
* Bug in read_csv() raising OverflowError for engine="pyarrow"
and parse_dates set (GH53295)
* Bug in to_datetime() was inferring format to contain "%H"
instead of "%I" if date contained “AM” / “PM” tokens (GH53147)
OBS-URL: https://build.opensuse.org/request/show/1090040
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=84
- Update to version 1.5.2
## Fixed regressions
* Fixed regression in MultiIndex.join() for extension array
dtypes (GH49277)
* Fixed regression in Series.replace() raising RecursionError
with numeric dtype and when specifying value=None (GH45725)
* Fixed regression in arithmetic operations for DataFrame with
MultiIndex columns with different dtypes (GH49769)
* Fixed regression in DataFrame.plot() preventing Colormap
instance from being passed using the colormap argument if
Matplotlib 3.6+ is used (GH49374)
* Fixed regression in date_range() returning an invalid set of
periods for CustomBusinessDay frequency and start date with
timezone (GH49441)
* Fixed performance regression in groupby operations (GH49676)
* Fixed regression in Timedelta constructor returning object of
wrong type when subclassing Timedelta (GH49579)
## Bug fixes
* Bug in the Copy-on-Write implementation losing track of views
in certain chained indexing cases (GH48996)
* Fixed memory leak in Styler.to_excel() (GH49751)
## Other
* Reverted color as an alias for c and size as an alias for s in
function DataFrame.plot.scatter() (GH49732)
- Add pandas-pr49886-fix-numpy-deprecations.patch
* gh#pandas-dev/pandas#49887
- Move to PEP518 build
OBS-URL: https://build.opensuse.org/request/show/1045082
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=78
- Update to version 1.4.0
* https://pandas.pydata.org/docs/whatsnew/v1.4.0.html
* Enhancements
- Improved warning messages
- Index can hold arbitrary ExtensionArrays
- Enhancements in Styler
- Multi-threaded CSV reading with a new CSV Engine based on
pyarrow
- Rank function for rolling and expanding windows
- Groupby positional indexing
- DataFrame.from_dict and DataFrame.to_dict have new 'tight'
option
* Notable bug fixes
- Inconsistent date string parsing
- Ignoring dtypes in concat with empty or all-NA columns
- Null-values are no longer coerced to NaN-value in
value_counts and mode
- mangle_dupe_cols in read_csv no longer renames unique columns
conflicting with target names
- unstack and pivot_table no longer raise ValueError for
result that would exceed int32 limit
- groupby.apply consistent transform detection
* API changes
- Index.get_indexer_for() no longer accepts keyword arguments
(other than target); in the past these would be silently
ignored if the index was not unique (GH42310)
- Change in the position of the min_rows argument in
DataFrame.to_string() due to change in the docstring
(GH44304)
- Reduction operations for DataFrame or Series now raising a
ValueError when None is passed for skipna (GH44178)
- read_csv() and read_html() no longer raising an error when
one of the header rows consists only of Unnamed: columns
(GH13054)
- Changed the name attribute of several holidays in
USFederalHolidayCalendar to match official federal holiday
names.
* Deprecations
- Deprecated Int64Index, UInt64Index & Float64Index
- Deprecated DataFrame.append and Series.append
- Split out test runs into separate flavors, optimize memory usage
in pytest-xdist runs
OBS-URL: https://build.opensuse.org/request/show/948450
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=67