- Update to 2.1.4
## Fixed regressions
* Fixed regression when trying to read a pickled pandas DataFrame
from pandas 1.3 (GH 55137)
## Bug fixes
* Bug in Series constructor raising DeprecationWarning when index
is a list of Series (GH 55228)
* Bug in Series when trying to cast date-like string inputs to
ArrowDtype of pyarrow.timestamp (GH 56266)
* Bug in DataFrame.apply() where passing raw=True ignored args
passed to the applied function (GH 55753)
* Bug in Index.__getitem__() returning wrong result for Arrow
dtypes and negative stepsize (GH 55832)
* Fixed bug in to_numeric() converting to extension dtype for
string[pyarrow_numpy] dtype (GH 56179)
* Fixed bug in DataFrameGroupBy.min() and DataFrameGroupBy.max()
not preserving extension dtype for empty object (GH 55619)
* Fixed bug in DataFrame.__setitem__() casting Index with
object-dtype to PyArrow backed strings when infer_string option
is set (GH 55638)
* Fixed bug in DataFrame.to_hdf() raising when columns have
StringDtype (GH 55088)
* Fixed bug in Index.insert() casting object-dtype to PyArrow
backed strings when infer_string option is set (GH 55638)
* Fixed bug in Series.__ne__() resulting in False for comparison
between NA and string value for dtype="string[pyarrow_numpy]"
(GH 56122)
* Fixed bug in Series.mode() not keeping object dtype when
infer_string is set (GH 56183)
* Fixed bug in Series.reset_index() not preserving object dtype
when infer_string is set (GH 56160)
* Fixed bug in Series.str.split() and Series.str.rsplit() when
pat=None for ArrowDtype with pyarrow.string (GH 56271)
* Fixed bug in Series.str.translate() losing object dtype when
string option is set (GH 56152)
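
As an illustration of the DataFrame.apply() entry above (GH 55753), the following minimal sketch passes extra positional arguments together with raw=True. The frame contents and the helper name `scaled_sum` are made up for the example; on a release containing the fix, the extra argument should reach the applied function.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})


def scaled_sum(values, factor):
    # With raw=True, `values` is a NumPy array holding one column.
    return values.sum() * factor


# `args` carries extra positional arguments for the applied function.
result = df.apply(scaled_sum, raw=True, args=(10,))
print(result)
```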
- Go back to Cython0: the released upstream version has NOT
unpinned it
* https://github.com/pandas-dev/pandas/blob/v2.1.4/pyproject.toml#L8
* See also gh#jsonpickle/jsonpickle#460
OBS-URL: https://build.opensuse.org/request/show/1133481
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=104
- Update to 2.1.3:
* Reverted deprecation of fill_method=None in DataFrame.pct_change(),
Series.pct_change(), DataFrameGroupBy.pct_change(), and
SeriesGroupBy.pct_change(); the values 'backfill', 'bfill', 'pad', and
'ffill' are still deprecated
* Fixed regressions
+ Fixed infinite recursion from operations that return a new object on
some DataFrame subclasses
+ Fixed regression in DataFrame.join() where result has missing values
and dtype is arrow backed string
+ Fixed regression in rolling() where non-nanosecond index or on column
would produce incorrect results
+ Fixed regression in DataFrame.resample() which was extrapolating back
to origin when origin was outside its bounds
+ Fixed regression in DataFrame.sort_index() which was not sorting
correctly when the index was a sliced MultiIndex
+ Fixed regression in DataFrameGroupBy.agg() and SeriesGroupBy.agg()
where if the option compute.use_numba was set to True, groupby methods
not supported by the numba engine would raise a TypeError
+ Fixed performance regression with wide DataFrames, typically
involving methods where all columns were accessed individually
+ Fixed regression in merge_asof() raising TypeError for by with
datetime and timedelta dtypes
+ Fixed regression in read_parquet() when reading a file with a string
column consisting of more than 2 GB of string data and using the
"string" dtype
+ Fixed regression in DataFrame.to_sql() not roundtripping datetime
columns correctly for sqlite when using detect_types
+ Fixed regression in construction of certain DataFrame or Series
subclasses
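
A small sketch of the pct_change() note at the top of this entry: fill_method=None remains a supported value, and with it missing values are not filled before the change is computed. The sample series is invented for the example.

```python
import pandas as pd

# Hypothetical series with one missing value.
s = pd.Series([1.0, None, 2.0, 4.0])

# fill_method=None: no forward/backward filling before computing changes,
# so every step that touches the missing value stays NaN.
changes = s.pct_change(fill_method=None)
print(changes)
```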
OBS-URL: https://build.opensuse.org/request/show/1130126
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-pandas?expand=0&rev=58
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=102
- Update to 2.1.1
## Fixed regressions
* Fixed regression in concat() when DataFrames have two
different extension dtypes (GH 54848)
* Fixed regression in merge() when merging over a PyArrow string
index (GH 54894)
* Fixed regression in read_csv() when usecols is given and dtype
is a dict for engine="python" (GH 54868)
* Fixed regression in read_csv() when delim_whitespace is True
(GH 54918, GH 54931)
* Fixed regression in GroupBy.get_group() raising for axis=1 (GH
54858)
* Fixed regression in DataFrame.__setitem__() raising
AssertionError when setting a Series with a partial MultiIndex
(GH 54875)
* Fixed regression in DataFrame.filter() not respecting the order
of elements for filter (GH 54980)
* Fixed regression in DataFrame.to_sql() not roundtripping
datetime columns correctly for sqlite (GH 54877)
* Fixed regression in DataFrameGroupBy.agg() when aggregating a
DataFrame with duplicate column names using a dictionary (GH
55006)
* Fixed regression in MultiIndex.append() raising when appending
overlapping IntervalIndex levels (GH 54934)
* Fixed regression in Series.drop_duplicates() for PyArrow
strings (GH 54904)
* Fixed regression in Series.interpolate() raising when
fill_value was given (GH 54920)
* Fixed regression in Series.value_counts() raising for numeric
data if bins was specified (GH 54857)
* Fixed regression in comparison operations for PyArrow backed
columns not propagating exceptions correctly (GH 54944)
* Fixed regression when comparing a Series with datetime64 dtype
with None (GH 54870)
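
For the Series.value_counts() entry above (GH 54857), a minimal sketch of the call shape that is expected to work again: numeric data combined with the bins argument. The values are illustrative only.

```python
import pandas as pd

# Hypothetical numeric data.
s = pd.Series([1, 2, 3, 4])

# bins=2 groups the values into two equal-width intervals
# and counts the members of each interval.
counts = s.value_counts(bins=2)
print(counts)
```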
## Bug fixes
* Fixed bug for ArrowDtype raising NotImplementedError for
fixed-size list (GH 55000)
* Fixed bug in DataFrame.stack() with future_stack=True and
columns a non-MultiIndex consisting of tuples (GH 54948)
* Fixed bug in Series.dt.tz() with ArrowDtype where a string was
returned instead of a tzinfo object (GH 55003)
* Fixed bug in Series.pct_change() and DataFrame.pct_change()
showing unnecessary FutureWarning (GH 54981)
## Other
* Reverted the deprecation that disallowed Series.apply()
returning a DataFrame when the passed-in callable returns a
Series object (GH 52116)
- Drop pandas-pr55073-pyarrow13.patch merged upstream
OBS-URL: https://build.opensuse.org/request/show/1116287
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=98
- Use git cloned archive gh#pandas-dev/pandas#54907
- Update to 2.1.0
* https://pandas.pydata.org/pandas-docs/version/2.1.0/whatsnew/v2.1.0.html
* Avoid NumPy object dtype for strings by default
* DataFrame reductions preserve extension dtypes
* Copy-on-Write improvements
* New DataFrame.map() method and support for ExtensionArrays
* New implementation of DataFrame.stack()
* Other minor enhancements (see link above)
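
A minimal sketch of the new DataFrame.map() method listed above, which applies a function to each element. The fallback to DataFrame.applymap() is an assumption added here so the snippet also runs on releases older than 2.1.0; it is not part of the upstream notes.

```python
import pandas as pd

# Hypothetical frame used only for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# DataFrame.map() exists from pandas 2.1.0 on; older releases
# only provide the equivalent DataFrame.applymap().
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

squared = elementwise(lambda x: x ** 2)
print(squared)
```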
## Backwards incompatible API changes
* pandas 2.1.0 supports Python 3.9 and higher
* Increased minimum versions for numpy 1.22.3 and some optional
dependencies
* arrays.PandasArray has been renamed NumpyExtensionArray and the
attached dtype name changed from PandasDtype to NumpyEADtype;
importing PandasArray still works until the next major version
(GH 53694)
## Deprecations
* Deprecated silent upcasting in setitem-like Series operations
* Deprecated parsing datetimes with mixed time zones
* Other deprecations (see link above)
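
One way to avoid relying on the deprecated silent upcasting is to cast explicitly before assigning a value that does not fit the current dtype. This is only a sketch with made-up values, not a recommendation taken from the upstream notes.

```python
import pandas as pd

# Integer series; assigning 1.5 directly would require upcasting.
s = pd.Series([1, 2, 3])

# Cast explicitly first, then assign the float value.
s = s.astype("float64")
s.iloc[0] = 1.5
print(s)
```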
## More
* Performance Improvements (see link above)
* Bug fixes (see link above)
- Switch to meson build system
OBS-URL: https://build.opensuse.org/request/show/1109356
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=94
- Update to 2.0.3:
* Bug in Timestamp.weekday() was returning incorrect results
before '0000-02-29'
* Fixed performance regression in merging on datetime-like columns
* Fixed regression when DataFrame.to_string() creates extra space
for string dtypes
* Bug in DataFrame.convert_dtypes() and Series.convert_dtypes()
when trying to convert ArrowDtype with dtype_backend="numpy_nullable"
* Bug in RangeIndex.union() when using sort=True with another
RangeIndex
* Bug in Series.reindex() when expanding a non-nanosecond datetime
or timedelta
* Bug in read_csv() when defining dtype with bool[pyarrow] for
the "c" and "python" engines
* Bug in Series.str.split() and Series.str.rsplit() with expand=True
* Bug in indexing methods (e.g. DataFrame.__getitem__()) where
taking the entire DataFrame/Series would raise an OverflowError
when Copy-on-Write was enabled and the length of the array was over
the maximum size a 32-bit integer can hold
* Bug when constructing a DataFrame with columns of an ArrowDtype
with a pyarrow.dictionary type that reindexes the data
* Bug when indexing a DataFrame or Series with an Index with a
timestamp ArrowDtype would raise an AttributeError
- drop pandas-fix-tests.patch (upstream)
OBS-URL: https://build.opensuse.org/request/show/1104661
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-pandas?expand=0&rev=53
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=92
- Update to 2.0.2
## Fixed regressions
* Fixed performance regression in GroupBy.apply() (GH53195)
* Fixed regression in merge() on Windows when dtype is np.intc
(GH52451)
* Fixed regression in read_sql() dropping columns with duplicated
column names (GH53117)
* Fixed regression in DataFrame.loc() losing MultiIndex name when
enlarging object (GH53053)
* Fixed regression in DataFrame.to_string() printing a backslash
at the end of the first row of data, instead of headers, when
the DataFrame doesn’t fit the line width (GH53054)
* Fixed regression in MultiIndex.join() returning levels in wrong
order (GH53093)
## Bug fixes
* Bug in arrays.ArrowExtensionArray incorrectly assigning dict
instead of list for .type with pyarrow.map_ and raising a
NotImplementedError with pyarrow.struct (GH53328)
* Bug in api.interchange.from_dataframe() was raising IndexError
on empty categorical data (GH53077)
* Bug in api.interchange.from_dataframe() was returning
DataFrames of incorrect sizes when called on slices (GH52824)
* Bug in api.interchange.from_dataframe() was unnecessarily
raising on bitmasks (GH49888)
* Bug in merge() when merging on datetime columns on different
resolutions (GH53200)
* Bug in read_csv() raising OverflowError for engine="pyarrow"
and parse_dates set (GH53295)
* Bug in to_datetime() was inferring format to contain "%H"
instead of "%I" if date contained “AM” / “PM” tokens (GH53147)
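
The to_datetime() entry above concerns format inference for 12-hour times. As a sketch, passing an explicit format with "%I" and "%p" sidesteps inference entirely; the timestamp string below is invented for the example.

```python
import pandas as pd

# "%I" is the 12-hour clock hour and "%p" the AM/PM marker;
# "%H" would be the 24-hour clock hour.
ts = pd.to_datetime("2023-05-01 01:30 PM", format="%Y-%m-%d %I:%M %p")
print(ts)
```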
OBS-URL: https://build.opensuse.org/request/show/1090040
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=84
- Update to version 1.5.2
## Fixed regressions
* Fixed regression in MultiIndex.join() for extension array
dtypes (GH49277)
* Fixed regression in Series.replace() raising RecursionError
with numeric dtype and when specifying value=None (GH45725)
* Fixed regression in arithmetic operations for DataFrame with
MultiIndex columns with different dtypes (GH49769)
* Fixed regression in DataFrame.plot() preventing Colormap
instance from being passed using the colormap argument if
Matplotlib 3.6+ is used (GH49374)
* Fixed regression in date_range() returning an invalid set of
periods for CustomBusinessDay frequency and start date with
timezone (GH49441)
* Fixed performance regression in groupby operations (GH49676)
* Fixed regression in Timedelta constructor returning object of
wrong type when subclassing Timedelta (GH49579)
## Bug fixes
* Bug in the Copy-on-Write implementation losing track of views
in certain chained indexing cases (GH48996)
* Fixed memory leak in Styler.to_excel() (GH49751)
## Other
* Reverted color as an alias for c and size as an alias for s in
function DataFrame.plot.scatter() (GH49732)
- Add pandas-pr49886-fix-numpy-deprecations.patch
* gh#pandas-dev/pandas#49887
- Move to PEP518 build
OBS-URL: https://build.opensuse.org/request/show/1045082
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=78
- Update to version 1.4.0
* https://pandas.pydata.org/docs/whatsnew/v1.4.0.html
* Enhancements
- Improved warning messages
- Index can hold arbitrary ExtensionArrays
- Enhancements in Styler
- Multi-threaded CSV reading with a new CSV Engine based on
pyarrow
- Rank function for rolling and expanding windows
- Groupby positional indexing
- DataFrame.from_dict and DataFrame.to_dict have new 'tight'
option
* Notable bug fixes
- Inconsistent date string parsing
- Ignoring dtypes in concat with empty or all-NA columns
- Null-values are no longer coerced to NaN-value in
value_counts and mode
- mangle_dupe_cols in read_csv no longer renames unique columns
conflicting with target names
- unstack and pivot_table no longer raise ValueError for
result that would exceed int32 limit
- groupby.apply consistent transform detection
* API changes
- Index.get_indexer_for() no longer accepts keyword arguments
(other than target); in the past these would be silently
ignored if the index was not unique (GH42310)
- Change in the position of the min_rows argument in
DataFrame.to_string() due to change in the docstring
(GH44304)
- Reduction operations for DataFrame or Series now raising a
ValueError when None is passed for skipna (GH44178)
- read_csv() and read_html() no longer raising an error when
one of the header rows consists only of Unnamed: columns
(GH13054)
- Changed the name attribute of several holidays in
USFederalHolidayCalendar to match official federal holiday
names.
* Deprecations
- Deprecated Int64Index, UInt64Index & Float64Index
- Deprecated DataFrame.append and Series.append
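
For code affected by the append deprecation above, pandas.concat() is the usual replacement. A minimal sketch with made-up frames:

```python
import pandas as pd

# Hypothetical frames; previously one might have written df.append(extra).
df = pd.DataFrame({"a": [1, 2]})
extra = pd.DataFrame({"a": [3]})

# concat() stacks the frames row-wise; ignore_index=True renumbers the rows.
combined = pd.concat([df, extra], ignore_index=True)
print(combined)
```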
- Split out test runs into separate flavors, optimize memory usage
in pytest-xdist runs
OBS-URL: https://build.opensuse.org/request/show/948450
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=67
- Update to version 1.3.5
* Fixed regression in Series.equals() when comparing floats with
dtype object to None (GH44190)
* Fixed regression in merge_asof() raising error when array was
supplied as join key (GH42844)
* Fixed regression when resampling DataFrame with DateTimeIndex
with empty groups and uint8, uint16 or uint32 columns
incorrectly raising RuntimeError (GH43329)
* Fixed regression in creating a DataFrame from a timezone-aware
Timestamp scalar near a Daylight Savings Time transition
(GH42505)
* Fixed performance regression in read_csv() (GH44106)
* Fixed regression in Series.duplicated() and
Series.drop_duplicates() when Series has Categorical dtype with
boolean categories (GH44351)
* Fixed regression in GroupBy.sum() with timedelta64[ns] dtype
containing NaT failing to treat that value as NA (GH42659)
* Fixed regression in RollingGroupby.cov() and
RollingGroupby.corr() when other had the same shape as each
group would incorrectly return superfluous groups in the result
(GH42915)
OBS-URL: https://build.opensuse.org/request/show/943876
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=66
- Update to version 1.3.4
* Fixed regression in DataFrame.convert_dtypes() incorrectly
converting byte strings to strings (GH43183)
* Fixed regression in GroupBy.agg() where it was failing
silently with mixed data types along axis=1 and MultiIndex (GH43209)
* Fixed regression in merge() with integer and NaN keys
failing with outer merge (GH43550)
* Fixed regression in DataFrame.corr() raising ValueError with
method="spearman" on 32-bit platforms (GH43588)
* Fixed performance regression in MultiIndex.equals() (GH43549)
* Fixed performance regression in GroupBy.first() and GroupBy.last()
with StringDtype (GH41596)
* Fixed regression in Series.cat.reorder_categories() failing to
update the categories on the Series (GH43232)
* Fixed regression in Series.cat.categories() setter failing to
update the categories on the Series (GH43334)
* Fixed regression in read_csv() raising UnicodeDecodeError exception
when memory_map=True (GH43540)
* Fixed regression in DataFrame.explode() raising AssertionError
when column is any scalar which is not a string (GH43314)
* Fixed regression in Series.aggregate() attempting to pass args
and kwargs multiple times to the user supplied func in certain
cases (GH43357)
* Fixed regression when iterating over a DataFrame.groupby.rolling
object causing the resulting DataFrames to have an incorrect index
if the input groupings were not sorted (GH43386)
* Fixed regression in DataFrame.groupby.rolling.cov() and
DataFrame.groupby.rolling.corr() computing incorrect results if the
input groupings were not sorted (GH43386)
* Fixed bug in pandas.DataFrame.groupby.rolling() and
pandas.api.indexers.FixedForwardWindowIndexer leading to
segfaults and window endpoints being mixed across groups (GH43267)
* Fixed bug in GroupBy.mean() with datetimelike values
including NaT values returning incorrect results (GH43132)
* Fixed bug in Series.aggregate() not passing the first args
to the user supplied func in certain cases (GH43357)
* Fixed memory leaks in Series.rolling.quantile() and
Series.rolling.median() (GH43339)
OBS-URL: https://build.opensuse.org/request/show/926551
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=65
- Update to version 1.3.3
* Fixed regression in DataFrame constructor failing to broadcast
for defined Index and len one list of Timestamp (GH42810)
* Fixed regression in GroupBy.agg() incorrectly raising in some
cases (GH42390)
* Fixed regression in GroupBy.apply() where nan values were
dropped even with dropna=False (GH43205)
* Fixed regression in GroupBy.quantile() which was failing with
pandas.NA (GH42849)
* Fixed regression in merge() where on columns with
ExtensionDtype or bool data types were cast to object in right
and outer merge (GH40073)
* Fixed regression in RangeIndex.where() and RangeIndex.putmask()
raising AssertionError when result did not represent a
RangeIndex (GH43240)
* Fixed regression in read_parquet() where the fastparquet engine
would not work properly with fastparquet 0.7.0 (GH43075)
* Fixed regression in DataFrame.loc.__setitem__() raising
ValueError when setting array as cell value (GH43422)
* Fixed regression in is_list_like() where objects with __iter__
set to None would be identified as iterable (GH43373)
* Fixed regression in DataFrame.__getitem__() raising error for
slice of DatetimeIndex when index is non monotonic (GH43223)
* Fixed regression in Resampler.aggregate() when used after
column selection would raise if func is a list of aggregation
functions (GH42905)
* Fixed regression in DataFrame.corr() where Kendall correlation
would produce incorrect results for columns with repeated
values (GH43401)
* Fixed regression in DataFrame.groupby() where aggregation on
columns with object types dropped results on those columns
(GH42395, GH43108)
* Fixed regression in Series.fillna() raising TypeError when
filling float Series with list-like fill value having a dtype
which couldn’t be cast losslessly (like float32 filled with
float64) (GH43424)
* Fixed regression in read_csv() raising AttributeError when the
file handle is a tempfile.SpooledTemporaryFile object
(GH43439)
* Fixed performance regression in
core.window.ewm.ExponentialMovingWindow.mean() (GH42333)
* Performance improvement for DataFrame.__setitem__() when the
key or value is not a DataFrame, or key is not list-like
(GH43274)
* Fixed bug in DataFrameGroupBy.agg() and
DataFrameGroupBy.transform() with engine="numba" where index data
was not being correctly passed into func (GH43133)
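
The is_list_like() entry in the 1.3.3 list above (GH43373) can be illustrated with a class that explicitly disables iteration. The class name is hypothetical; on releases containing the fix, such objects are expected not to be treated as list-like.

```python
from pandas.api.types import is_list_like


class NotIterable:
    # Setting __iter__ to None marks the class as non-iterable.
    __iter__ = None


plain_list_result = is_list_like([1, 2, 3])
disabled_iter_result = is_list_like(NotIterable())
print(plain_list_result, disabled_iter_result)
```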
- Release 1.3.2
* Performance regression in DataFrame.isin() and Series.isin()
for nullable data types (GH42714)
* Regression in updating values of Series using boolean index,
created by using DataFrame.pop() (GH42530)
* Regression in DataFrame.from_records() with empty records
(GH42456)
* Fixed regression in DataFrame.shift() where TypeError occurred
when shifting DataFrame created by concatenation of slices and
fills with values (GH42719)
* Regression in DataFrame.agg() when the func argument returned
lists and axis=1 (GH42727)
* Regression in DataFrame.drop() doing nothing if MultiIndex has
duplicates and indexer is a tuple or list of tuples (GH42771)
* Fixed regression where read_csv() raised a ValueError when
parameters names and prefix were both set to None (GH42387)
* Fixed regression in comparisons between Timestamp object and
datetime64 objects outside the implementation bounds for
nanosecond datetime64 (GH42794)
* Fixed regression in Styler.highlight_min() and
Styler.highlight_max() where pandas.NA was not successfully ignored
(GH42650)
* Fixed regression in concat() where copy=False was not honored
in axis=1 Series concatenation (GH42501)
* Regression in Series.nlargest() and Series.nsmallest() with
nullable integer or float dtype (GH42816)
* Fixed regression in Series.quantile() with Int64Dtype (GH42626)
* Fixed regression in Series.groupby() and DataFrame.groupby()
where supplying the by argument with a Series named with a
tuple would incorrectly raise (GH42731)
* Bug in read_excel() modifying the dtypes dictionary when reading
a file with duplicate columns (GH42462)
* 1D slices over extension types turn into N-dimensional slices
over ExtensionArrays (GH42430)
* Fixed bug in Series.rolling() and DataFrame.rolling() not
calculating window bounds correctly for the first row when
center=True and window is an offset that covers all the rows
(GH42753)
* Styler.hide_columns() now hides the index name header row as
well as column headers (GH42101)
* Styler.set_sticky() has amended CSS to control the column/index
names and ensure the correct sticky positions (GH42537)
* Bug in de-serializing datetime indexes in PYTHONOPTIMIZED mode
(GH42866)
OBS-URL: https://build.opensuse.org/request/show/920383
OBS-URL: https://build.opensuse.org/package/show/devel:languages:python:numeric/python-pandas?expand=0&rev=64