Accepting request 724138 from devel:languages:python:numeric
Update to Version 0.25.0
All packages broken by this update should be fixed now.

OBS-URL: https://build.opensuse.org/request/show/724138
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/python-pandas?expand=0&rev=18

commit 0fd36f4242
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4f919f409c433577a501e023943e582c57355d50a724c589e78bc1d551a535a2
size 11837693

pandas-0.25.0.tar.gz (Normal file)
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:914341ad2d5b1ea522798efa4016430b66107d05781dbfe7cf05eba8f37df995
size 12616848
@@ -1,48 +0,0 @@
From 5a73ff8b4e10d016e0fd4162fa14c8f1a41345d9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Tom=C3=A1=C5=A1=20Chv=C3=A1tal?= <tchvatal@suse.com>
Date: Thu, 21 Feb 2019 15:05:21 +0100
Subject: [PATCH] Mark test_pct_max_many_rows as high memory

Fixes issue #25384
---
 pandas/tests/frame/test_rank.py  | 1 +
 pandas/tests/series/test_rank.py | 1 +
 pandas/tests/test_algos.py       | 1 +
 3 files changed, 3 insertions(+)

diff --git a/pandas/tests/frame/test_rank.py b/pandas/tests/frame/test_rank.py
index 10c42e0d1a1..6bb9dea15d1 100644
--- a/pandas/tests/frame/test_rank.py
+++ b/pandas/tests/frame/test_rank.py
@@ -310,6 +310,7 @@ def test_rank_pct_true(self, method, exp):
         tm.assert_frame_equal(result, expected)

     @pytest.mark.single
+    @pytest.mark.high_memory
     def test_pct_max_many_rows(self):
         # GH 18271
         df = DataFrame({'A': np.arange(2**24 + 1),
diff --git a/pandas/tests/series/test_rank.py b/pandas/tests/series/test_rank.py
index 510a51e0029..dfcda889269 100644
--- a/pandas/tests/series/test_rank.py
+++ b/pandas/tests/series/test_rank.py
@@ -499,6 +499,7 @@ def test_rank_first_pct(dtype, ser, exp):


 @pytest.mark.single
+@pytest.mark.high_memory
 def test_pct_max_many_rows():
     # GH 18271
     s = Series(np.arange(2**24 + 1))
diff --git a/pandas/tests/test_algos.py b/pandas/tests/test_algos.py
index 888cf78a1c6..cb7426ce2f7 100644
--- a/pandas/tests/test_algos.py
+++ b/pandas/tests/test_algos.py
@@ -1484,6 +1484,7 @@ def test_too_many_ndims(self):
             algos.rank(arr)

     @pytest.mark.single
+    @pytest.mark.high_memory
     @pytest.mark.parametrize('values', [
         np.arange(2**24 + 1),
         np.arange(2**25 + 2).reshape(2**24 + 1, 2)],
@@ -1,3 +1,409 @@
-------------------------------------------------------------------
Mon Jul 22 15:36:34 UTC 2019 - Todd R <toddrme2178@gmail.com>

- Update to Version 0.25.0
  + Warning
    * Starting with the 0.25.x series of releases, pandas only supports Python 3.5.3 and higher.
    * The minimum supported Python version will be bumped to 3.6 in a future release.
    * Panel has been fully removed. For N-D labeled data structures, please
      use xarray
    * read_pickle and read_msgpack are only guaranteed backwards compatible back to
      pandas version 0.20.3
  + Enhancements
    * Groupby aggregation with relabeling
      Pandas has added special groupby behavior, known as "named aggregation", for naming the
      output columns when applying multiple aggregation functions to specific columns.
    * Groupby Aggregation with multiple lambdas
      You can now provide multiple lambda functions to a list-like aggregation in
      pandas.core.groupby.GroupBy.agg.
    * Better repr for MultiIndex
      Printing of MultiIndex instances now shows tuples of each row and ensures
      that the tuple items are vertically aligned, so it's now easier to understand
      the structure of the MultiIndex.
    * Shorter truncated repr for Series and DataFrame
      Currently, the default display options of pandas ensure that when a Series
      or DataFrame has more than 60 rows, its repr gets truncated to this maximum
      of 60 rows (the display.max_rows option). However, this still gives
      a repr that takes up a large part of the vertical screen estate. Therefore,
      a new option display.min_rows is introduced with a default of 10 which
      determines the number of rows shown in the truncated repr.
    * Json normalize with max_level param support
      json_normalize normalizes the provided input dict to all
      nested levels. The new max_level parameter provides more control over
      the level at which normalization ends.
    * Series.explode to split list-like values to rows
      Series and DataFrame have gained the Series.explode and DataFrame.explode methods to transform
      list-likes to individual rows.
    * DataFrame.plot keywords logy, logx and loglog can now accept the value 'sym' for symlog scaling.
    * Added support for ISO week year format ('%G-%V-%u') when parsing datetimes using to_datetime
    * Indexing of DataFrame and Series now accepts zero-dimensional np.ndarray
    * Timestamp.replace now supports the fold argument to disambiguate DST transition times
    * DataFrame.at_time and Series.at_time now support datetime.time objects with timezones
    * DataFrame.pivot_table now accepts an observed parameter which is passed to underlying calls to DataFrame.groupby to speed up grouping categorical data.
    * Series.str has gained the Series.str.casefold method to remove all case distinctions present in a string
    * DataFrame.set_index now works for instances of abc.Iterator, provided their output is of the same length as the calling frame
    * DatetimeIndex.union now supports the sort argument. The behavior of the sort parameter matches that of Index.union
    * RangeIndex.union now supports the sort argument. If sort=False an unsorted Int64Index is always returned. sort=None is the default and returns a monotonically increasing RangeIndex if possible or a sorted Int64Index if not
    * TimedeltaIndex.intersection now also supports the sort keyword
    * DataFrame.rename now supports the errors argument to raise errors when attempting to rename nonexistent keys
    * Added api.frame.sparse for working with a DataFrame whose values are sparse
    * RangeIndex has gained RangeIndex.start, RangeIndex.stop, and RangeIndex.step attributes
    * datetime.timezone objects are now supported as arguments to timezone methods and constructors
    * DataFrame.query and DataFrame.eval now support quoting column names with backticks to refer to names with spaces
    * merge_asof now gives a clearer error message when merge keys are categoricals that are not equal
    * pandas.core.window.Rolling supports the exponential (or Poisson) window type
    * Error message for missing required imports now includes the original import error's text
    * DatetimeIndex and TimedeltaIndex now have a mean method
    * DataFrame.describe now formats integer percentiles without decimal point
    * Added support for reading SPSS .sav files using read_spss
    * Added new option plotting.backend to be able to select a plotting backend different than the existing matplotlib one. Use pandas.set_option('plotting.backend', '<backend-module>') where <backend-module> is a library implementing the pandas plotting API
    * pandas.offsets.BusinessHour supports multiple opening hours intervals
    * read_excel can now use openpyxl to read Excel files via the engine='openpyxl' argument. This will become the default in a future release
    * pandas.io.excel.read_excel supports reading OpenDocument tables. Specify engine='odf' to enable. Consult the IO User Guide for more details
    * Interval, IntervalIndex, and arrays.IntervalArray have gained an Interval.is_empty attribute denoting if the given interval(s) are empty
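
A minimal sketch of two of the enhancements listed above, named aggregation and Series.explode. It is an illustration only, not part of the upstream release notes; the frame, its column names and the sample values are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
animals = pd.DataFrame({
    "kind": ["cat", "dog", "cat", "dog"],
    "height": [9.1, 6.0, 9.5, 34.0],
    "weight": [7.9, 7.5, 9.9, 198.0],
})

# Named aggregation: each keyword names an output column and maps it to
# a (source column, aggregation function) pair.
summary = animals.groupby("kind").agg(
    min_height=("height", "min"),
    max_height=("height", "max"),
    mean_weight=("weight", "mean"),
)

# Series.explode: every element of a list-like value gets its own row,
# and the original index label is repeated for each element.
tags = pd.Series([[1, 2, 3], [4], []], index=["a", "b", "c"])
exploded = tags.explode()

print(summary)
print(exploded)
```

An empty list explodes to a single missing value, so the label "c" is kept in the result.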
  + Backwards incompatible API changes
    * Indexing with date strings with UTC offsets
      Indexing a DataFrame or Series with a DatetimeIndex with a
      date string with a UTC offset would previously ignore the UTC offset. Now, the UTC offset
      is respected in indexing.
    * MultiIndex constructed from levels and codes
      Constructing a MultiIndex with NaN levels or codes value < -1 was allowed previously.
      Now, construction with codes value < -1 is not allowed and NaN levels' corresponding codes
      are reassigned as -1.
    * Groupby.apply on DataFrame evaluates first group only once
      The implementation of DataFrameGroupBy.apply()
      previously evaluated the supplied function consistently twice on the first group
      to infer if it is safe to use a fast code path. Particularly for functions with
      side effects, this was an undesired behavior and may have led to surprises.
    * Concatenating sparse values
      When passed DataFrames whose values are sparse, concat will now return a
      Series or DataFrame with sparse values, rather than a SparseDataFrame.
    * The .str-accessor performs stricter type checks
      Due to the lack of more fine-grained dtypes, Series.str so far only checked whether the data was
      of object dtype. Series.str will now infer the dtype of the data *within* the Series; in particular,
      'bytes'-only data will raise an exception (except for Series.str.decode, Series.str.get,
      Series.str.len, Series.str.slice).
    * Categorical dtypes are preserved during groupby
      Previously, columns that were categorical, but not the groupby key(s) would be converted to object dtype during groupby operations. Pandas now will preserve these dtypes.
    * Incompatible Index type unions
      When performing Index.union operations between objects of incompatible dtypes,
      the result will be a base Index of dtype object. This behavior holds true for
      unions between Index objects that previously would have been prohibited. The dtype
      of empty Index objects will now be evaluated before performing union operations
      rather than simply returning the other Index object. Index.union can now be
      considered commutative, such that A.union(B) == B.union(A).
    * DataFrame groupby ffill/bfill no longer return group labels
      The methods ffill, bfill, pad and backfill of DataFrameGroupBy
      previously included the group labels in the return value, which was
      inconsistent with other groupby transforms. Now only the filled values
      are returned.
    * DataFrame describe on an empty categorical / object column will return top and freq
      When calling DataFrame.describe with an empty categorical / object
      column, the 'top' and 'freq' columns were previously omitted, which was inconsistent with
      the output for non-empty columns. Now the 'top' and 'freq' columns will always be included,
      with numpy.nan in the case of an empty DataFrame.
    * __str__ methods now call __repr__ rather than vice versa
      Pandas has until now mostly defined string representations in a pandas object's
      __str__/__unicode__/__bytes__ methods, and called __str__ from the __repr__
      method, if a specific __repr__ method is not found. This is not needed for Python 3.
      In Pandas 0.25, the string representations of Pandas objects are now generally
      defined in __repr__, and calls to __str__ in general now pass the call on to
      the __repr__, if a specific __str__ method doesn't exist, as is standard for Python.
      This change is backward compatible for direct usage of Pandas, but if you subclass
      Pandas objects *and* give your subclasses specific __str__/__repr__ methods,
      you may have to adjust your __str__/__repr__ methods.
    * Indexing an IntervalIndex with Interval objects
      Indexing methods for IntervalIndex have been modified to require exact matches only for Interval queries.
      IntervalIndex methods previously matched on any overlapping Interval. Behavior with scalar points, e.g. querying
      with an integer, is unchanged.
    * Binary ufuncs on Series now align
      Applying a binary ufunc like numpy.power now aligns the inputs
      when both are Series.
    * Categorical.argsort now places missing values at the end
      Categorical.argsort now places missing values at the end of the array, making it
      consistent with NumPy and the rest of pandas.
    * Column order is preserved when passing a list of dicts to DataFrame
      Starting with Python 3.7 the key-order of dict is guaranteed
      (https://mail.python.org/pipermail/python-dev/2017-December/151283.html). In practice, this has been true since
      Python 3.6. The DataFrame constructor now treats a list of dicts in the same way as
      it does a list of OrderedDict, i.e. preserving the order of the dicts.
      This change applies only when pandas is running on Python>=3.6.
    * Increased minimum versions for dependencies
    * DatetimeTZDtype will now standardize pytz timezones to a common timezone instance
    * Timestamp and Timedelta scalars now implement the to_numpy method as aliases to Timestamp.to_datetime64 and Timedelta.to_timedelta64, respectively.
    * Timestamp.strptime will now raise a NotImplementedError
    * Comparing Timestamp with unsupported objects now returns NotImplemented instead of raising TypeError. This implies that unsupported rich comparisons are delegated to the other object, and are now consistent with Python 3 behavior for datetime objects
    * Bug in DatetimeIndex.snap which didn't preserve the name of the input Index
    * The arg argument in pandas.core.groupby.DataFrameGroupBy.agg has been renamed to func
    * The arg argument in pandas.core.window._Window.aggregate has been renamed to func
    * Most Pandas classes had a __bytes__ method, which was used for getting a Python 2-style bytestring representation of the object. This method has been removed as a part of dropping Python 2
    * The .str-accessor has been disabled for 1-level MultiIndex, use MultiIndex.to_flat_index if necessary
    * Removed support of gtk package for clipboards
    * Using an unsupported version of Beautiful Soup 4 will now raise an ImportError instead of a ValueError
    * Series.to_excel and DataFrame.to_excel will now raise a ValueError when saving timezone aware data.
    * ExtensionArray.argsort places NA values at the end of the sorted array.
    * DataFrame.to_hdf and Series.to_hdf will now raise a NotImplementedError when saving a MultiIndex with extension data types for a fixed format.
    * Passing duplicate names in read_csv will now raise a ValueError
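
As a rough illustration of the "Binary ufuncs on Series now align" change above (not taken from the upstream notes), the sketch below applies numpy.power to two Series whose labels are in a different order. The Series names and values are invented for the example.

```python
import numpy as np
import pandas as pd

# Two Series with the same labels in a different order (illustrative data).
base = pd.Series([2, 3, 4], index=["a", "b", "c"])
exponent = pd.Series([1, 2, 3], index=["c", "b", "a"])

# With alignment, values are paired by label: "a" pairs base 2 with exponent 3.
aligned = np.power(base, exponent)

# Passing a plain NumPy array for one operand pairs values by position instead.
positional = np.power(base, exponent.to_numpy())

print(aligned)
print(positional)
```

Code that relied on the older positional pairing can convert one operand with to_numpy, as in the second call.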
  + Deprecations
    * Sparse subclasses
      The SparseSeries and SparseDataFrame subclasses are deprecated. Their functionality is better provided
      by a Series or DataFrame with sparse values.
    * msgpack format
      The msgpack format is deprecated as of 0.25 and will be removed in a future version. It is recommended to use pyarrow for on-the-wire transmission of pandas objects.
    * The deprecated .ix[] indexer now raises a more visible FutureWarning instead of DeprecationWarning.
    * Deprecated the units=M (months) and units=Y (year) parameters for units of pandas.to_timedelta, pandas.Timedelta and pandas.TimedeltaIndex
    * pandas.concat has deprecated the join_axes keyword. Instead, use DataFrame.reindex or DataFrame.reindex_like on the result or on the inputs
    * The SparseArray.values attribute is deprecated. You can use np.asarray(...) or
      the SparseArray.to_dense method instead.
    * The functions pandas.to_datetime and pandas.to_timedelta have deprecated the box keyword. Instead, use to_numpy or Timestamp.to_datetime64 or Timedelta.to_timedelta64.
    * The DataFrame.compound and Series.compound methods are deprecated and will be removed in a future version.
    * The internal _start, _stop and _step attributes of RangeIndex have been deprecated.
      Use the public attributes RangeIndex.start, RangeIndex.stop and RangeIndex.step instead.
    * The Series.ftype, Series.ftypes and DataFrame.ftypes methods are deprecated and will be removed in a future version.
      Instead, use Series.dtype and DataFrame.dtypes.
    * The Series.get_values, DataFrame.get_values, Index.get_values,
      SparseArray.get_values and Categorical.get_values methods are deprecated.
      One of np.asarray(..) or Series.to_numpy can be used instead.
    * The 'outer' method on NumPy ufuncs, e.g. np.subtract.outer has been deprecated on Series objects. Convert the input to an array with Series.array first
    * Timedelta.resolution is deprecated and replaced with Timedelta.resolution_string. In a future version, Timedelta.resolution will be changed to behave like the standard library datetime.timedelta.resolution
    * read_table has been undeprecated.
    * Index.dtype_str is deprecated.
    * Series.imag and Series.real are deprecated.
    * Series.put is deprecated.
    * Index.item and Series.item are deprecated.
    * The default value ordered=None in pandas.api.types.CategoricalDtype has been deprecated in favor of ordered=False. When converting between categorical types ordered=True must be explicitly passed in order to be preserved.
    * Index.contains is deprecated. Use key in index (__contains__) instead.
    * DataFrame.get_dtype_counts is deprecated.
    * Categorical.ravel will return a Categorical instead of a np.ndarray
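
The replacements named in the deprecation entries above can be sketched as follows. This is an illustrative example with made-up data; only the replacement calls are run, and the deprecated spellings appear in comments.

```python
import numpy as np
import pandas as pd

# Illustrative objects; names and values are assumptions for this sketch.
s = pd.Series([1, 2, 3], index=["x", "y", "z"])
df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
rng = pd.RangeIndex(0, 10, 2)

values = s.to_numpy()        # instead of s.get_values()
also_values = np.asarray(s)  # alternative replacement for get_values()
has_y = "y" in s.index       # instead of s.index.contains("y")
col_dtypes = df.dtypes       # instead of df.ftypes

# Public attributes instead of rng._start, rng._stop, rng._step
bounds = (rng.start, rng.stop, rng.step)

print(values, has_y, bounds)
print(col_dtypes)
```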
  + Removal of prior version deprecations/changes
    * Removed Panel
    * Removed the previously deprecated sheetname keyword in read_excel
    * Removed the previously deprecated TimeGrouper
    * Removed the previously deprecated parse_cols keyword in read_excel
    * Removed the previously deprecated pd.options.html.border
    * Removed the previously deprecated convert_objects
    * Removed the previously deprecated select method of DataFrame and Series
    * Removed the previously deprecated behavior of Series treated as list-like in Series.cat.rename_categories
    * Removed the previously deprecated DataFrame.reindex_axis and Series.reindex_axis
    * Removed the previously deprecated behavior of altering column or index labels with Series.rename_axis or DataFrame.rename_axis
    * Removed the previously deprecated tupleize_cols keyword argument in read_html, read_csv, and DataFrame.to_csv
    * Removed the previously deprecated DataFrame.from_csv and Series.from_csv
    * Removed the previously deprecated raise_on_error keyword argument in DataFrame.where and DataFrame.mask
    * Removed the previously deprecated ordered and categories keyword arguments in astype
    * Removed the previously deprecated cdate_range
    * Removed the previously deprecated True option for the dropna keyword argument in SeriesGroupBy.nth
    * Removed the previously deprecated convert keyword argument in Series.take and DataFrame.take
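
For packages that still call some of the removed APIs, the sketch below shows possible substitutes for TimeGrouper, reindex_axis and select. The mapping is an assumption based on general pandas usage rather than something the changelog states, and the frame and the extra column name are invented.

```python
import pandas as pd

# Illustrative frame with two observations per day.
df = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2019-07-01 00:00", "2019-07-01 12:00",
        "2019-07-02 00:00", "2019-07-02 12:00",
    ]),
)

# pd.Grouper(freq=...) in place of the removed TimeGrouper
daily = df.groupby(pd.Grouper(freq="D")).sum()

# reindex(columns=...) in place of the removed reindex_axis(..., axis=1);
# "extra" is a hypothetical column that does not exist yet and is filled with NaN
reordered = df.reindex(columns=["value", "extra"])

# Label filtering with .loc in place of the removed select(crit, axis=1)
selected = df.loc[:, [c for c in df.columns if c.startswith("v")]]

print(daily)
print(reordered.columns.tolist(), selected.columns.tolist())
```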
  + Performance improvements
    * Significant speedup in SparseArray initialization that benefits most operations, fixing performance regression introduced in v0.20.0
    * DataFrame.to_stata() is now faster when outputting data with any string or non-native endian columns
    * Improved performance of Series.searchsorted. The speedup is especially large when the dtype is
      int8/int16/int32 and the searched key is within the integer bounds for the dtype
    * Improved performance of pandas.core.groupby.GroupBy.quantile
    * Improved performance of slicing and other selected operations on a RangeIndex
    * RangeIndex now performs standard lookup without instantiating an actual hashtable, hence saving memory
    * Improved performance of read_csv by faster tokenizing and faster parsing of small float numbers
    * Improved performance of read_csv by faster parsing of N/A and boolean values
    * Improved performance of IntervalIndex.is_monotonic, IntervalIndex.is_monotonic_increasing and IntervalIndex.is_monotonic_decreasing by removing conversion to MultiIndex
    * Improved performance of DataFrame.to_csv when writing datetime dtypes
    * Improved performance of read_csv by much faster parsing of MM/YYYY and DD/MM/YYYY datetime formats
    * Improved performance of nanops for dtypes that cannot store NaNs. Speedup is particularly prominent for Series.all and Series.any
    * Improved performance of Series.map for dictionary mappers on categorical series by mapping the categories instead of mapping all values
    * Improved performance of IntervalIndex.intersection
    * Improved performance of read_csv by concatenating date columns faster, without extra conversion to string for integer/float zero and float NaN, and by checking faster whether a string can possibly be a date
    * Improved performance of IntervalIndex.is_unique by removing conversion to MultiIndex
    * Restored performance of DatetimeIndex.__iter__ by re-enabling specialized code path
    * Improved performance when building MultiIndex with at least one CategoricalIndex level
    * Improved performance by removing the need for a garbage collect when checking for SettingWithCopyWarning
    * Changed the default value of the cache parameter of to_datetime to True
    * Improved performance of DatetimeIndex and PeriodIndex slicing given non-unique, monotonic data.
    * Improved performance of pd.read_json for index-oriented data.
    * Improved performance of MultiIndex.shape.
  + Bug fixes
    > Categorical
      * Bug in DataFrame.at and Series.at that would raise exception if the index was a CategoricalIndex
      * Fixed bug in comparison of ordered Categorical that contained missing values with a scalar which sometimes incorrectly resulted in True
      * Bug in DataFrame.dropna when the DataFrame has a CategoricalIndex containing Interval objects incorrectly raised a TypeError
    > Datetimelike
      * Bug in to_datetime which would raise an (incorrect) ValueError when called with a date far into the future and the format argument specified instead of raising OutOfBoundsDatetime
      * Bug in to_datetime which would raise InvalidIndexError: Reindexing only valid with uniquely valued Index objects when called with cache=True, with arg including at least two different elements from the set {None, numpy.nan, pandas.NaT}
      * Bug in DataFrame and Series where timezone aware data with dtype='datetime64[ns]' was not cast to naive
      * Improved Timestamp type checking in various datetime functions to prevent exceptions when using a subclassed datetime
      * Bug in Series and DataFrame repr where np.datetime64('NaT') and np.timedelta64('NaT') with dtype=object would be represented as NaN
      * Bug in to_datetime which does not replace the invalid argument with NaT when errors is set to coerce
      * Bug in adding DateOffset with nonzero month to DatetimeIndex would raise ValueError
      * Bug in to_datetime which raises unhandled OverflowError when called with mix of invalid dates and NaN values with format='%Y%m%d' and errors='coerce'
      * Bug in isin for datetimelike indexes (DatetimeIndex, TimedeltaIndex and PeriodIndex) where the levels parameter was ignored.
      * Bug in to_datetime which raises TypeError for format='%Y%m%d' when called for invalid integer dates with length >= 6 digits with errors='ignore'
      * Bug when comparing a PeriodIndex against a zero-dimensional numpy array
      * Bug in constructing a Series or DataFrame from a numpy datetime64 array with a non-ns unit and out-of-bound timestamps generating rubbish data, which will now correctly raise an OutOfBoundsDatetime error.
      * Bug in date_range with unnecessary OverflowError being raised for very large or very small dates
      * Bug where adding Timestamp to a np.timedelta64 object would raise instead of returning a Timestamp
      * Bug where comparing a zero-dimensional numpy array containing a np.datetime64 object to a Timestamp would incorrectly raise TypeError
      * Bug in to_datetime which would raise ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True when called with cache=True, with arg including datetime strings with different offset
    > Timedelta
      * Bug in TimedeltaIndex.intersection where for non-monotonic indices in some cases an empty Index was returned when in fact an intersection existed
      * Bug with comparisons between Timedelta and NaT raising TypeError
      * Bug when adding or subtracting a BusinessHour to a Timestamp with the resulting time landing in a following or prior day respectively
      * Bug when comparing a TimedeltaIndex against a zero-dimensional numpy array
    > Timezones
      * Bug in DatetimeIndex.to_frame where timezone aware data would be converted to timezone naive data
      * Bug in to_datetime with utc=True and datetime strings that would apply previously parsed UTC offsets to subsequent arguments
      * Bug in Timestamp.tz_localize and Timestamp.tz_convert not propagating freq
      * Bug in Series.at where setting Timestamp with timezone raises TypeError
      * Bug in DataFrame.update when updating with timezone aware data would return timezone naive data
      * Bug in to_datetime where an uninformative RuntimeError was raised when passing a naive Timestamp with datetime strings with mixed UTC offsets
      * Bug in to_datetime with unit='ns' would drop timezone information from the parsed argument
      * Bug in DataFrame.join where joining a timezone aware index with a timezone aware column would result in a column of NaN
      * Bug in date_range where ambiguous or nonexistent start or end times were not handled by the ambiguous or nonexistent keywords respectively
      * Bug in DatetimeIndex.union when combining a timezone aware and timezone unaware DatetimeIndex
      * Bug when applying a numpy reduction function (e.g. numpy.minimum) to a timezone aware Series
    > Numeric
      * Bug in to_numeric in which large negative numbers were being improperly handled
      * Bug in to_numeric in which numbers were being coerced to float, even though errors was not coerce
      * Bug in to_numeric in which invalid values for errors were being allowed
      * Bug in format in which floating point complex numbers were not being formatted to proper display precision and trimming
      * Bug in error messages in DataFrame.corr and Series.corr. Added the possibility of using a callable.
      * Bug in Series.divmod and Series.rdivmod which would raise an (incorrect) ValueError rather than return a pair of Series objects as result
      * Raises a helpful exception when a non-numeric index is sent to interpolate with methods which require a numeric index.
      * Bug in pandas.eval when comparing floats with scalar operators, for example: x < -0.1
      * Fixed bug where casting all-boolean array to integer extension array failed
      * Bug in divmod with a Series object containing zeros incorrectly raising AttributeError
      * Inconsistency in Series floor-division (//) and divmod filling positive//zero with NaN instead of Inf
    > Conversion
      * Bug in DataFrame.astype() where the errors parameter was ignored when passing a dict of columns and types.
    > Strings
      * Bug in the __name__ attribute of several methods of Series.str, which were set incorrectly
      * Improved error message when passing Series of wrong dtype to Series.str.cat
    > Interval
      * Construction of Interval is restricted to numeric, Timestamp and Timedelta endpoints
      * Fixed bug in Series/DataFrame not displaying NaN in IntervalIndex with missing values
      * Bug in IntervalIndex.get_loc where a KeyError would be incorrectly raised for a decreasing IntervalIndex
      * Bug in Index constructor where passing mixed closed Interval objects would result in a ValueError instead of an object dtype Index
    > Indexing
      * Improved exception message when calling DataFrame.iloc with a list of non-numeric objects.
      * Improved exception message when calling .iloc or .loc with a boolean indexer with different length.
      * Bug in KeyError exception message when indexing a MultiIndex with a nonexistent key not displaying the original key.
      * Bug in .iloc and .loc with a boolean indexer not raising an IndexError when too few items are passed.
      * Bug in DataFrame.loc and Series.loc where KeyError was not raised for a MultiIndex when the key was less than or equal to the number of levels in the MultiIndex.
      * Bug in which DataFrame.append produced an erroneous warning indicating that a KeyError will be thrown in the future when the data to be appended contains new columns.
      * Bug in which DataFrame.to_csv caused a segfault for a reindexed data frame, when the indices were single-level MultiIndex.
      * Fixed bug where assigning an arrays.PandasArray to a pandas.core.frame.DataFrame would raise an error
      * Allow keyword arguments for callable local reference used in the DataFrame.query string
      * Fixed a KeyError when indexing a MultiIndex level with a list containing exactly one label, which is missing
      * Bug which produced AttributeError on partial matching Timestamp in a MultiIndex
      * Bug in Categorical and CategoricalIndex with Interval values when using the in operator (__contains__) with objects that are not comparable to the values in the Interval
      * Bug in DataFrame.loc and DataFrame.iloc on a DataFrame with a single timezone-aware datetime64[ns] column incorrectly returning a scalar instead of a Series
      * Bug in CategoricalIndex and Categorical incorrectly raising ValueError instead of TypeError when a list is passed using the in operator (__contains__)
      * Bug in setting a new value in a Series with a Timedelta object incorrectly casting the value to an integer
      * Bug in Series setting a new key (__setitem__) with a timezone-aware datetime incorrectly raising ValueError
      * Bug in DataFrame.iloc when indexing with a read-only indexer
      * Bug in Series setting an existing tuple key (__setitem__) with timezone-aware datetime values incorrectly raising TypeError
|
||||
> Missing
|
||||
* Fixed misleading exception message in Series.interpolate if argument order is required, but omitted .
|
||||
* Fixed class type displayed in exception message in DataFrame.dropna if invalid axis parameter passed
|
||||
* A ValueError will now be thrown by DataFrame.fillna when limit is not a positive integer
|
||||
> MultiIndex
|
||||
* Bug in which incorrect exception raised by Timedelta when testing the membership of MultiIndex
|
||||
> I/O
|
||||
* Bug in DataFrame.to_html() where values were truncated using display options instead of outputting the full content
|
||||
* Fixed bug in missing text when using to_clipboard if copying utf-16 characters in Python 3 on Windows
|
||||
* Bug in read_json for orient='table' when it tries to infer dtypes by default, which is not applicable as dtypes are already defined in the JSON schema
|
||||
* Bug in read_json for orient='table' and float index, as it infers index dtype by default, which is not applicable because index dtype is already defined in the JSON schema
|
||||
* Bug in read_json for orient='table' and string of float column names, as it makes a column name type conversion to Timestamp, which is not applicable because column names are already defined in the JSON schema
|
||||
* Bug in json_normalize for errors='ignore' where missing values in the input data were filled in the resulting DataFrame with the string "nan" instead of numpy.nan
|
||||
* DataFrame.to_html now raises TypeError when using an invalid type for the classes parameter instead of AssertionError
|
||||
* Bug in DataFrame.to_string and DataFrame.to_latex that would lead to incorrect output when the header keyword is used
|
||||
* Bug in read_csv not properly interpreting the UTF8 encoded filenames on Windows on Python 3.6+
|
||||
* Improved performance in pandas.read_stata and pandas.io.stata.StataReader when converting columns that have missing values
|
||||
* Bug in DataFrame.to_html where header numbers would ignore display options when rounding
|
||||
* Bug in read_hdf where reading a table from an HDF5 file written directly with PyTables fails with a ValueError when using a sub-selection via the start or stop arguments
|
||||
* Bug in read_hdf not properly closing store after a KeyError is raised
|
||||
* Improved the explanation for the failure when value labels are repeated in Stata dta files and suggested work-arounds
|
||||
* Improved pandas.read_stata and pandas.io.stata.StataReader to read incorrectly formatted 118 format files saved by Stata
|
||||
* Improved the col_space parameter in DataFrame.to_html to accept a string so CSS length values can be set correctly
|
||||
* Fixed bug in loading objects from S3 that contain # characters in the URL
|
||||
* Adds use_bqstorage_api parameter to read_gbq to speed up downloads of large data frames. This feature requires version 0.10.0 of the pandas-gbq library as well as the google-cloud-bigquery-storage and fastavro libraries.
|
||||
* Fixed memory leak in DataFrame.to_json when dealing with numeric data
|
||||
* Bug in read_json where date strings with Z were not converted to a UTC timezone
|
||||
* Added cache_dates=True parameter to read_csv, which allows caching of unique dates when they are parsed
|
||||
* DataFrame.to_excel now raises a ValueError when the caller's dimensions exceed the limitations of Excel
|
||||
* Fixed bug in pandas.read_csv where a BOM would result in incorrect parsing using engine='python'
|
||||
* read_excel now raises a ValueError when input is of type pandas.io.excel.ExcelFile and engine param is passed since pandas.io.excel.ExcelFile has an engine defined
|
||||
* Bug while selecting from HDFStore with where='' specified
|
||||
* Fixed bug in DataFrame.to_excel() where custom objects (i.e. PeriodIndex) inside merged cells were not being converted into types safe for the Excel writer
|
||||
* Bug in read_hdf where reading a timezone aware DatetimeIndex would raise a TypeError
|
||||
* Bug in to_msgpack and read_msgpack which would raise a ValueError rather than a FileNotFoundError for an invalid path
|
||||
* Fixed bug in DataFrame.to_parquet which would raise a ValueError when the dataframe had no columns
|
||||
* Allow parsing of PeriodDtype columns when using read_csv
|
||||
> Plotting
|
||||
* Fixed bug where api.extensions.ExtensionArray could not be used in matplotlib plotting
|
||||
* Bug in an error message in DataFrame.plot. Improved the error message if non-numerics are passed to DataFrame.plot
|
||||
* Bug in incorrect ticklabel positions when plotting an index that is non-numeric / non-datetime
|
||||
* Fixed bug causing plots of PeriodIndex timeseries to fail if the frequency is a multiple of the frequency rule code
|
||||
* Fixed bug when plotting a DatetimeIndex with datetime.timezone.utc timezone
|
||||
> Groupby/resample/rolling
|
||||
* Bug in pandas.core.resample.Resampler.agg with a timezone aware index where OverflowError would raise when passing a list of functions
|
||||
* Bug in pandas.core.groupby.DataFrameGroupBy.nunique in which the names of column levels were lost
|
||||
* Bug in pandas.core.groupby.GroupBy.agg when applying an aggregation function to timezone aware data
|
||||
* Bug in pandas.core.groupby.GroupBy.first and pandas.core.groupby.GroupBy.last where timezone information would be dropped
|
||||
* Bug in pandas.core.groupby.GroupBy.size when grouping only NA values
|
||||
* Bug in Series.groupby where observed kwarg was previously ignored
|
||||
* Bug in Series.groupby where using groupby with a MultiIndex Series with a list of labels equal to the length of the series caused incorrect grouping
|
||||
* Ensured that ordering of outputs in groupby aggregation functions is consistent across all versions of Python
|
||||
* Ensured that result group order is correct when grouping on an ordered Categorical and specifying observed=True
|
||||
* Bug in pandas.core.window.Rolling.min and pandas.core.window.Rolling.max that caused a memory leak
|
||||
* Bug in pandas.core.window.Rolling.count and pandas.core.window.Expanding.count where the axis keyword was previously ignored
|
||||
* Bug in pandas.core.groupby.GroupBy.idxmax and pandas.core.groupby.GroupBy.idxmin with datetime column would return incorrect dtype
|
||||
* Bug in pandas.core.groupby.GroupBy.cumsum, pandas.core.groupby.GroupBy.cumprod, pandas.core.groupby.GroupBy.cummin and pandas.core.groupby.GroupBy.cummax with categorical column having absent categories, would return incorrect result or segfault
|
||||
* Bug in pandas.core.groupby.GroupBy.nth where NA values in the grouping would return incorrect results
|
||||
* Bug in pandas.core.groupby.SeriesGroupBy.transform where transforming an empty group would raise a ValueError
|
||||
* Bug in pandas.core.frame.DataFrame.groupby where passing a pandas.core.groupby.grouper.Grouper would return incorrect groups when using the .groups accessor
|
||||
* Bug in pandas.core.groupby.GroupBy.agg where incorrect results are returned for uint64 columns.
|
||||
* Bug in pandas.core.window.Rolling.median and pandas.core.window.Rolling.quantile where MemoryError is raised with empty window
|
||||
* Bug in pandas.core.window.Rolling.median and pandas.core.window.Rolling.quantile where incorrect results are returned with closed='left' and closed='neither'
|
||||
* Improved pandas.core.window.Rolling, pandas.core.window.Window and pandas.core.window.EWM functions to exclude nuisance columns from results instead of raising errors and raise a DataError only if all columns are nuisance
|
||||
* Bug in pandas.core.window.Rolling.max and pandas.core.window.Rolling.min where incorrect results are returned with an empty variable window
|
||||
* Raise a helpful exception when an unsupported weighted window function is used as an argument of pandas.core.window.Window.aggregate
|
||||
> Reshaping
|
||||
* Bug in pandas.merge where the string "None" was appended to column names if None was assigned in suffixes, instead of leaving the column name as-is
|
||||
* Bug in merge when merging by index name would sometimes result in an incorrectly numbered index (missing index values are now assigned NA)
|
||||
* to_records now accepts dtypes to its column_dtypes parameter
|
||||
* Bug in concat where order of OrderedDict (and dict in Python 3.6+) is not respected, when passed in as objs argument
|
||||
* Bug in pivot_table where columns with NaN values are dropped even if dropna argument is False, when the aggfunc argument contains a list
|
||||
* Bug in concat where the resulting freq of two DatetimeIndex with the same freq would be dropped
|
||||
* Bug in merge where merging with equivalent Categorical dtypes was raising an error
|
||||
* Bug where instantiating a DataFrame with a dict of iterators or generators (e.g. pd.DataFrame({'A': reversed(range(3))})) raised an error
|
||||
* Bug where instantiating a DataFrame with a range (e.g. pd.DataFrame(range(3))) raised an error
|
||||
* Bug in DataFrame constructor when passing non-empty tuples would cause a segmentation fault
|
||||
* Bug in Series.apply where it failed when the series is a timezone aware DatetimeIndex
|
||||
* Bug in pandas.cut where large bins could incorrectly raise an error due to an integer overflow
|
||||
* Bug in DataFrame.sort_index where an error is thrown when a multi-indexed DataFrame is sorted on all levels with the initial level sorted last
|
||||
* Bug in Series.nlargest where True was treated as smaller than False
|
||||
* Bug in DataFrame.pivot_table where an IntervalIndex as pivot index would raise TypeError
|
||||
* Bug in which DataFrame.from_dict ignored order of OrderedDict when orient='index'
|
||||
* Bug in DataFrame.transpose where transposing a DataFrame with a timezone-aware datetime column would incorrectly raise ValueError
|
||||
* Bug in pivot_table when pivoting a timezone aware column as the values would remove timezone information
|
||||
* Bug in merge_asof when specifying multiple by columns where one is datetime64[ns, tz] dtype
|
||||
> Sparse
|
||||
* Significant speedup in SparseArray initialization that benefits most operations, fixing performance regression introduced in v0.20.0
|
||||
* Bug in SparseFrame constructor where passing None as the data would cause default_fill_value to be ignored
|
||||
* Bug in SparseDataFrame where adding a column whose length of values does not match the length of the index raised AssertionError instead of ValueError
|
||||
* Introduce a better error message in Series.sparse.from_coo so it raises a TypeError for inputs that are not coo matrices
|
||||
* Bug in numpy.modf on a SparseArray. Now a tuple of SparseArray is returned
|
||||
> Build Changes
|
||||
* Fix install error with PyPy on macOS
|
||||
> ExtensionArray
|
||||
* Bug in factorize when passing an ExtensionArray with a custom na_sentinel
|
||||
* Series.count miscounts NA values in ExtensionArrays
|
||||
* Added Series.__array_ufunc__ to better handle NumPy ufuncs applied to Series backed by extension arrays
|
||||
* Keyword argument deep has been removed from ExtensionArray.copy
|
||||
> Other
|
||||
* Removed unused C functions from vendored UltraJSON implementation
|
||||
* Allow Index and RangeIndex to be passed to numpy min and max functions
|
||||
* Use actual class name in repr of empty objects of a Series subclass
|
||||
* Bug in DataFrame where passing an object array of timezone-aware datetime objects would incorrectly raise ValueError
|
||||
- Remove upstream-included pandas-tests-memory.patch
|
||||
|
||||
-------------------------------------------------------------------
|
||||
Sat Mar 16 22:35:08 UTC 2019 - Arun Persaud <arun@gmx.de>
|
||||
|
||||
|
@ -17,84 +17,81 @@
|
||||
|
||||
|
||||
%{?!python_module:%define python_module() python-%{**} python3-%{**}}
|
||||
%define oldpython python
|
||||
%define skip_python2 1
|
||||
Name: python-pandas
|
||||
Version: 0.24.2
|
||||
Version: 0.25.0
|
||||
Release: 0
|
||||
Summary: Python module for working with "relational" or "labeled" data
|
||||
Summary: Python data structures for data analysis, time series, and statistics
|
||||
License: BSD-3-Clause
|
||||
Group: Development/Libraries/Python
|
||||
URL: http://pandas.pydata.org/
|
||||
Source0: https://files.pythonhosted.org/packages/source/p/pandas/pandas-%{version}.tar.gz
|
||||
Patch0: pandas-tests-memory.patch
|
||||
BuildRequires: %{python_module Cython >= 0.28.2}
|
||||
BuildRequires: %{python_module SQLAlchemy}
|
||||
BuildRequires: %{python_module XlsxWriter}
|
||||
BuildRequires: %{python_module beautifulsoup4 >= 4.2.1}
|
||||
BuildRequires: %{python_module devel}
|
||||
BuildRequires: %{python_module hypothesis}
|
||||
BuildRequires: %{python_module lxml}
|
||||
BuildRequires: %{python_module nose}
|
||||
BuildRequires: %{python_module numpy-devel >= 1.15.0}
|
||||
BuildRequires: %{python_module pytest-mock}
|
||||
BuildRequires: %{python_module pytest}
|
||||
BuildRequires: %{python_module python-dateutil >= 2.5}
|
||||
BuildRequires: %{python_module pytz >= 2011k}
|
||||
BuildRequires: %{python_module numpy-devel >= 1.13.3}
|
||||
BuildRequires: %{python_module setuptools >= 24.2.0}
|
||||
BuildRequires: %{python_module six}
|
||||
BuildRequires: %{python_module xlrd}
|
||||
BuildRequires: fdupes
|
||||
BuildRequires: gcc-c++
|
||||
BuildRequires: python-rpm-macros
|
||||
# SECTION test requirements
|
||||
BuildRequires: %{python_module SQLAlchemy >= 1.1.4}
|
||||
BuildRequires: %{python_module XlsxWriter >= 0.9.8}
|
||||
BuildRequires: %{python_module beautifulsoup4 >= 4.6.0}
|
||||
BuildRequires: %{python_module hypothesis}
|
||||
BuildRequires: %{python_module lxml >= 3.8.0}
|
||||
BuildRequires: %{python_module openpyxl >= 2.4.8}
|
||||
BuildRequires: %{python_module pytest-mock}
|
||||
BuildRequires: %{python_module pytest >= 4.0.2}
|
||||
BuildRequires: %{python_module python-dateutil >= 2.6.1}
|
||||
BuildRequires: %{python_module pytz >= 2015.4}
|
||||
BuildRequires: %{python_module xlrd >= 1.1.0}
|
||||
BuildRequires: %{python_module xlwt >= 1.2.0}
|
||||
BuildRequires: xvfb-run
|
||||
# /SECTION
|
||||
Requires: python-Cython >= 0.28.2
|
||||
Requires: python-Tempita
|
||||
Requires: python-lxml
|
||||
Requires: python-numpy >= 1.15.0
|
||||
Requires: python-python-dateutil >= 2.5
|
||||
Requires: python-pytz >= 2011k
|
||||
Requires: python-six
|
||||
Recommends: python-Bottleneck
|
||||
Requires: python-numpy >= 1.13.3
|
||||
Requires: python-python-dateutil >= 2.6.1
|
||||
Requires: python-pytz >= 2015.4
|
||||
Recommends: python-Bottleneck >= 1.2.1
|
||||
Recommends: python-Jinja2
|
||||
Recommends: python-SQLAlchemy >= 0.8.1
|
||||
Recommends: python-XlsxWriter
|
||||
Recommends: python-beautifulsoup4 >= 4.2.1
|
||||
Recommends: python-QtPy
|
||||
Recommends: python-SQLAlchemy >= 1.1.4
|
||||
Recommends: python-XlsxWriter >= 0.9.8
|
||||
Recommends: python-beautifulsoup4 >= 4.6.0
|
||||
Recommends: python-blosc
|
||||
Recommends: python-boto
|
||||
Recommends: python-google-api-python-client
|
||||
Recommends: python-fastparquet >= 0.2.1
|
||||
Recommends: python-gcsfs >= 0.2.2
|
||||
Recommends: python-html5lib
|
||||
Recommends: python-matplotlib
|
||||
Recommends: python-numexpr >= 2.1
|
||||
Recommends: python-oauth2client
|
||||
Recommends: python-openpyxl >= 2.4
|
||||
Recommends: python-pandas-gbq
|
||||
Recommends: python-python-gflags
|
||||
Recommends: python-s3fs
|
||||
Recommends: python-scipy
|
||||
Recommends: python-tables >= 3.0.0
|
||||
Recommends: python-xarray >= 0.7.0
|
||||
Recommends: python-xlrd
|
||||
Recommends: python-xlwt
|
||||
Recommends: python-lxml >= 3.8.0
|
||||
Recommends: python-matplotlib >= 2.2.2
|
||||
Recommends: python-numexpr >= 2.6.2
|
||||
Recommends: python-openpyxl >= 2.4.8
|
||||
Recommends: python-pandas-gbq >= 0.8.0
|
||||
Recommends: python-psycopg2
|
||||
Recommends: python-pyarrow >= 0.9.0
|
||||
Recommends: python-PyMySQL >= 0.7.11
|
||||
Recommends: python-pyreadstat
|
||||
Recommends: python-qt5
|
||||
Recommends: python-scipy >= 0.19.0
|
||||
Recommends: python-tables >= 3.4.2
|
||||
Recommends: python-xarray >= 0.8.2
|
||||
Recommends: python-xlrd >= 1.1.0
|
||||
Recommends: python-xlwt >= 1.2.0
|
||||
Recommends: xclip
|
||||
Recommends: xsel
|
||||
Recommends: python-zlib
|
||||
Obsoletes: python-pandas-doc < %{version}
|
||||
Provides: python-pandas-doc = %{version}
|
||||
%ifpython2
|
||||
Recommends: python-backports.lzma
|
||||
Obsoletes: %{oldpython}-pandas-doc < %{version}
|
||||
Provides: %{oldpython}-pandas-doc = %{version}
|
||||
%endif
|
||||
%python_subpackages
|
||||
|
||||
%description
|
||||
pandas is a Python package providing flexible and expressive data
|
||||
structures for working with "relational" or "labeled" data.
|
||||
|
||||
Documentation is located at
|
||||
http://pandas.pydata.org/pandas-docs/stable/ .
|
||||
Pandas is a Python package providing data structures designed for
|
||||
working with structured (tabular, multidimensional, potentially
|
||||
heterogeneous) and time series data. It is a high-level building
|
||||
block for doing data analysis in Python.
|
||||
|
||||
%prep
|
||||
%setup -q -n pandas-%{version}
|
||||
%patch0 -p1
|
||||
sed -i -e '/^#!\//, 1d' pandas/core/computation/eval.py
|
||||
|
||||
%build
|
||||
@ -107,7 +104,13 @@ export CFLAGS="%{optflags} -fno-strict-aliasing"
|
||||
|
||||
%check
|
||||
# skip test that tries to compile stuff in buildroot test_oo_optimizable
|
||||
%python_expand PYTHONPATH=%{buildroot}%{$python_sitearch} xvfb-run py.test-%{$python_version} -v %{buildroot}%{$python_sitearch}/pandas/tests -k 'not test_oo_optimizable'
|
||||
export PYTHONHASHSEED=$(python -c 'import random; print(random.randint(1, 4294967295))')
|
||||
export http_proxy=http://1.2.3.4 https_proxy=http://1.2.3.4;
|
||||
export LANG=en_US.UTF-8
|
||||
export LC_ALL=en_US.UTF-8
|
||||
%{python_expand export PYTHONPATH=%{buildroot}%{$python_sitearch}
|
||||
xvfb-run py.test-%{$python_version} -v %{buildroot}%{$python_sitearch}/pandas/tests -k 'not test_oo_optimizable'
|
||||
}
|
||||
|
||||
%files %{python_files}
|
||||
%license LICENSE
|
||||
|