-------------------------------------------------------------------
Fri Jun 13 18:22:38 UTC 2025 - Ben Greiner <code@bnavigator.de>

- Update to 20.0.0
  ## Bug Fixes
  * GH-36628 - [Python][Parquet] Fail when instantiating internal
    Parquet metadata classes (#45549)
  * GH-37630 - [C++][Python][Dataset] Allow disabling fragment
    metadata caching (#45330)
  * GH-44188 - [Python] Fix pandas roundtrip with bytes column
    names (#44171)
  * GH-45129 - [Python][C++] Fix usage of deprecated C++
    functionality on pyarrow (#45189)
  * GH-45155 - [Python][CI] Fix path for scientific nightly windows
    wheel upload (#45222)
  * GH-45169 - [Python] Adapt to modified pytest ignore collect
    hook api (#45170)
  * GH-45380 - [Python] Expose RankQuantileOptions to Python
    (#45392)
  * GH-45530 - [Python][Packaging] Add pyarrow.libs dir to
    get_library_dirs (#45766)
  * GH-45582 - [Python] Preserve decimal32/64/256 metadata in
    Schema.metadata (#45583)
  * GH-45733 - [C++][Python] Add biased/unbiased toggle to skew and
    kurtosis functions (#45762)
  * GH-45739 - [C++][Python] Fix crash when calling
    hash_pivot_wider without options (#45740)
  * GH-45758 - [Python] Add AzureFileSystem documentation (#45759)
  * GH-45926 - [Python] Use pytest.approx for float values on
    unbiased skew and kurtosis tests (#45929)
  * GH-46041 - [Python][Packaging] Temporarily remove pandas from
    being installed on free-threaded Windows wheel tests (#46042)
  ## New Features and Improvements
  * GH-14932 - [Python] Add python bindings for JSON streaming
    reader (#45084)
  * GH-35289 - [Python] Support large variable width types in numpy
    conversion (#36701)
  * GH-36412 - [Python][CI] Fix deprecation warnings in the pandas
    nightly build
  * GH-39010 - [Python] Introduce maps_as_pydicts parameter for
    to_pylist, to_pydict, as_py (#45471)
  * GH-41002 - [Python] Remove pins for pytest-cython and
    conda-docs pytest (#45240)
  * GH-41985 - [Python][Docs] Clarify docstring of
    pyarrow.compute.scalar() (#45668)
  * GH-43587 - [Python] Remove no longer used serialize/deserialize
    PyArrow C++ code (#45743)
  * GH-44421 - [Python] Add configuration for building & testing
    free-threaded wheels on Windows (#44804)
  * GH-44790 - [Python] Remove use_legacy_dataset from code base
    (#45742)
  * GH-45156 - [Python][Packaging] Refactor Python Windows wheel
    images to use newer base image (#45442)
  * GH-45237 - [Python] Raise minimum supported cython to >=3
    (#45238)
  * GH-45278 - [Python][Packaging] Updated delvewheel install
    command and updated flags used with delvewheel repair (#45323)
  * GH-45282 - [Python][Parquet] Remove unused readonly properties
    of ParquetWriter (#45281)
  * GH-45288 - [Python][Packaging][Docs] Update documentation for
    PyArrow nightly wheels (#45289)
  * GH-45358 - [C++][Python] Add MemoryPool method to print
    statistics (#45359)
  * GH-45433 - [Python] Remove Cython workarounds (#45437)
  * GH-45457 - [Python] Add pyarrow.ArrayStatistics (#45550)
  * GH-45482 - [CI][Python] Don’t use Ubuntu 20.04 for wheel test
    (#45483)
  * GH-45570 - [Python] Allow Decimal32/64Array.to_pandas (#45571)
  * GH-45676 - [C++][Python][Compute] Add skew and kurtosis
    functions (#45677)
  * GH-45680 - [C++][Python] Remove deprecated functions in 20.0
  * GH-45705 - [Python] Add support for SAS token in
    AzureFileSystem (#45706)
  * GH-45755 - [C++][Python][Compute] Add winsorize function
    (#45763)
  * GH-45848 - [C++][Python][R] Remove deprecated PARQUET_2_0
    (#45849)
  * GH-45920 - [Release][Python] Upload sdist and wheels to GitHub
    Releases not apache.jfrog.io (#45962)

-------------------------------------------------------------------
Mon Feb 17 19:17:26 UTC 2025 - Ben Greiner <code@bnavigator.de>

- Update to 19.0.1
  ## Bug Fixes
  * [Python][CI] Make download_tzdata_on_windows more robust and
    use tzdata package for tzinfo database on Windows for ORC
    (#45425)
  * [Python] Only enable the string dtype on pandas export for
    pandas>=2.3 (#45383) [Python] Fix version comparison in pandas
    compat for pandas 2.3 dev version (#45428)
  ## Improvements
  * [CI][Python] Temporarily avoid newer boto3 version (#45311)
    [CI] Bump Minio version and unpin boto3 (#45320)
- Release 19.0.0
  ## New Features and Improvements
  * [Python] Add more FlightInfo / FlightEndpoint attributes
    (#43537)
  * [Python] Support Arrow PyCapsule stream objects in
    write_dataset (#43771)
  * [Python] Support pandas future default string dtype
  * [CI][Python] Use GitHub Packages for vcpkg cache (#44644)
  * [Python] Add Python wrapper for JsonExtensionType (#44070)
  * [Python][C++] Add version suffix to libarrow_python* libraries
    (#44702)
  * [Python] Add support for Decimal32 and Decimal64 types (#44882)
  * [C++][Python] Add Hyperbolic Trig functions (#44630)
  * [Python] Clean-up name / field_name handling in pandas compat
    (#44963)
  * [CI][Python][Packaging] Test 3.12 wheels on Ubuntu 24.04
    (#45042)
  * [CI][Packaging][Python] Simplify
    dev/tasks/python-wheels/github.linux.yml (#45077)
  * [Python] Honor the strings_to_categorical keyword in to_pandas
    for string view type (#45176)
  ## Bug Fixes
  * [C++][Python] Fix ORC crash when file contains unknown timezone
    (#45051)
  * [Python] Converting month_day_nano_interval to numpy crashes
  * [Python] Allow from_buffers to work with StringView on Python
    (#44701)
  * [C++][Python] Fix Flight Timestamp precision, revert workaround
    from #43537 (#44681)
  * [Docs][Python] Add missing canonical extension types to PyArrow
    arrays and datatypes docs (#44880)
  * [Python] Trigger manual Garbage collection before checking
    allocated bytes for dlpack tests (#44793)
  * [Python][Packaging] Use delvewheel to repair Windows wheels
    (#35323)
  * [CI][Python] Fix and modernize AppVeyor build (#44999)
  * [Python][Docs] Update docstrings for metadata methods on Field
    and Schema classes (#45004)
  * [CI][Python] Fix test_memory failures (#45007)
  * [CI][Packaging][Python] Fix Docker push step for free-threaded
    wheel builds (#45040)
  * [Packaging][Python] Use ORC from vcpkg instead of bundled on
    Linux and macOS (#45046)
- Release 18.1.0
  ## Bug Fixes
  * [Release][Packaging][Python] Set PARQUET_TEST_DATA on
    verify-release-candidate-wheels.bat (#44462)
  ## New Features and Improvements
- Release 18.0.0
  ## Bug Fixes
  * [Python][Packaging] Bump MACOSX_DEPLOYMENT_TARGET to 12 instead
    of 11 (#43137)
  * [Release][Packaging][Python] Add tzdata as conda env
    requirement to avoid ORC failure (#43233)
  * [Python] Give precedence to pycapsule interface in
    pa.schema(..) (#43486)
  * [Python] Sanitize Python reference handling in UDF
    implementation (#43557)
  * [Python] Allow tuple for rename columns (#43609)
  * [Packaging][Python] Fix vcpkg version detection in macOS wheel
    build jobs (#43615)
  * [Python] Fix compilation on Cython<3 (#43765)
  * [Python][CI] Correct PARQUET_TEST_DATA path in wheel tests
    (#43786)
  * [CI][Packaging][Python] Avoid uploading wheel to gemfury if
    version already exists (#43816)
  * [CI][Python] Skip test that requires PARQUET_TEST_DATA env on
    emscripten (#43906)
  * [Python] Fix threading issues with borrowed refs and pandas
    (#44047)
  * [Benchmarking][Python] Avoid uwsgi install failure on macOS
    (#44221)
  * [CI][Release][Python] Do not verify Python on Ubuntu 20.04
    (#44254)
  * [CI][Python] Remove ds requirement from test collection on
    test_dataset.py (#44370)
  ## New Features and Improvements
  * [C++][Python] Native support for UUID (#37298)
  * [C++][Python] Bool8 Extension Type Implementation (#43488)
  * [Python] Make NumPy an optional runtime dependency (#41904)
  * [Python] Add StructType attribute to access all its fields
    (#43481)
  * [CI][Python] Use pipx to install GCS testbench (#43852)
  * [Python][CI][Packaging] Don’t upload sdist to scientific-python
    nightly channel (only wheels) (#43943)
  * [Python][CI][Packaging] Upload nightly wheels to main label of
    scientific-python-nightly-wheels channel (#43932)
  * [CI][Packaging][Python] Upload pyarrow nightly wheels to
    scientific python channel on Anaconda (#43862)
  * [C++][Python][Parquet] Support reading/writing key-value
    metadata from/to ColumnChunkMetaData (#41580)
  * [Python] Ensure (Chunked)Array/RecordBatch/Table methods don’t
    crash with non-CPU data
  * [Python] Let StructArray.from_array accept a type in addition
    to names or fields (#43047)
  * [Python] Test FlightStreamReader iterator (#42086)
  * [Python] Add bindings for CopyTo on RecordBatch and Array
    classes (#42223)
  * [Python] Use Py_IsFinalizing from pythoncapi_compat.h (#43767)
  * [Python] Add bindings for memory manager and device to Context
    class (#43392)
  * [C++][Python] Add Opaque canonical extension type (#43458)
  * [Python] Deprecate passing build flags to setup.py (#43515)
  * [Python][Packaging][CI] Drop Python 3.8 support (#43970)
  * [Python][CI] Add Python 3.13 conda test build (#44192)
  * [Python][CI][Packaging] Use released versions to build and test
    wheels on Python 3.13 (#44193)
  * [Python] Set up wheel building for Python 3.13 (#43539)
  * [Python] Remove usage of deprecated pkg_resources in setup.py
    (#43602)
  * [Python][CI] Add a Crossbow job with the free-threaded build
    (#43671)
  * [Python] Do not use borrowed references APIs (#43540)
  * [Python] Declare support for free-threading in Cython (#43606)
  * [Python][CI] Add a Crossbow job with a debug CPython
    interpreter (#43565)
  * [Python][Dataset] Python / Cython interface to C++
    arrow::dataset::Partitioning::Format (#43740)
  * [Python][CI] Simplify python/requirements-wheel-test.txt file
    (#43691)
  * [Python] RecordBatch fails gracefully on non-cpu devices
    (#43729)
  * [Python] ChunkedArray fails gracefully on non-cpu devices
    (#43795)
  * [Python][Packaging] Remove numpy dependency from pyarrow
    packaging (#44148)
  * [Python] Build macOS and manylinux wheels for free-threading
    (#43965)
  * [Python] Table fails gracefully on non-cpu devices (#43974)
  * [Python] Deprecate the no longer used serialize/deserialize
    Pyarrow C++ functions (#44064)
  * [CI][Python] Enable S3 testing on Windows wheel builds (#44093)
  * [CI][Python] Enable S3 tests on macOS CI (#44129)
  * [Packaging][Python] Use macOS 12 as deployment target to have
    macOS 12 pyarrow wheels (#44315)
  * [Packaging][Python] Disable interactive deb configuration in
    wheel-manylinux--cp313t- (#44362)
- Drop pyarrow-pr433325-extradirs.patch

-------------------------------------------------------------------
Thu Sep 26 23:24:22 UTC 2024 - Guang Yee <gyee@suse.com>

- Enable sle15_python_module_pythons.

-------------------------------------------------------------------
Wed Aug 14 20:27:48 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 17.0.0
  ## Bug Fixes
  * [C++][Python] Fix casting to extension type with fixed size
    list storage type (#42219)
  * [Python] Include metadata when creating pa.schema from
    PyCapsule (#41538)
  * [C++][Python] RecordBatch.filter() segfaults if passed a
    ChunkedArray (#40971)
  * [Python] pa.array: add check for byte-swapped numpy arrays
    inside python objects (#41549)
  * [Python] Fix read_table for encrypted parquet (#39438)
  * [Python] RunEndEncodedArray.from_arrays: bugfix for Array
    arguments (#40560) (#41093)
  * [C++][Python] Map child Array constructed from keys and items
    shouldn’t have offset (#40871)
  * [Python] `test_numpy_array_protocol` test failures with numpy
    2.0.0rc1
  * [Python] Fix StructArray.sort() for by=None (#41495)
  * [Python] Build with Python 3.13 (#42034)
  * [Python] remove special methods related to buffers in python
    <2.6 (#41492)
  * [Python] Fix reading column index with decimal values (#41503)
  * [Docs][Python] Remove duplicate contents (#41588)
  * [C++][Python] Add optional null_bitmap to MapArray::FromArrays
    (#41757)
  * [Python][Parquet] Implement to_dict method on SortingColumn
    (#41704)
  * [Python] CMake: ignore Parquet encryption option if Parquet
    itself is not enabled (fix Java integration build) (#41776)
  * [Python] Disallow direct pa.RecordBatchReader() construction to
    avoid segfaults (#41773)
  * [Python] Fix RecordBatchReader.cast to support casting to equal
    schema for all types (#42098)
  * [Python] Fix tests when using NumPy 2.0 on Windows (#42099)
  * [CI][Python] Use pip install -e instead of setup.py build_ext
    --inplace for installing pyarrow on verification script (#42007)
  * [CI][Python][C++] Fix utf8proc detection for wheel on Windows
    (#42022)
  * [Python][CI] Update expected output for numpy 2.0.0 (#42172)
  ## New Features and Improvements
  * [Python] Replace pandas.util.testing.rands with vendored
    version (#42089)
  * [Python] begin moving static settings to pyproject.toml
    (#41041)
  * [Python] Implement PyCapsule interface for Device data in
    PyArrow (#40717)
  * [Python] Expand the Arrow PyCapsule Interface with C Device
    Data support (#40708)
  * [Python] Let RecordBatch.filter accept a boolean expression in
    addition to mask array (#43043)
  * [Python] Fix pickling of LocalFileSystem for cython 2 (#41459)
  * [Python] Expand the C Device Interface bindings to support
    import on CUDA device (#40385)
  * [Python] Allow passing a mapping of column names to
    rename_columns (#40645)
  * [Python][Packaging] Strip unnecessary symbols when building
    wheels (#42028)
  * [Python][Docs] Update PyArrow installation docs for conda
    package split (#41135)
  * [Python] Basic bindings for Device and MemoryManager classes
    (#41685)
  * [C++][Python] Expose recursive flatten for lists on
    list_flatten kernel function and pyarrow bindings (#41295)
  * [Python][Packaging] Ensure to build with released numpy 2.0
    (instead of RC) in the wheel building workflows (#42194)
  * [CI][Python] Add a job on ARM64 macOS (#41313)
  * [CI][Python] Reduce CI time on macOS (#41378)
  * [Python] Expose byte_width and bit_width of ExtensionType in
    terms of the storage type (#41413)
  * [Python] Update Python development guide about components being
    enabled by default based on Arrow C++ (#41705)
  * [Python] Building PyArrow: enable/disable python components by
    default based on availability in Arrow C++ (#41494)
  * [C++][Python] Extends the add_key_value to parquet::arrow and
    PyArrow (#41633)
  * [Python] Ensure Buffer methods don’t crash with non-CPU data
    (#41889)
  * [C++][Python] PrettyPrint non-cpu data by copying to default
    CPU device (#42010)
  * [Python][Parquet] Update BYTE_STREAM_SPLIT description in
    write_table() docstring (#41759)
  * [Python] Add support for Pyodide (#37822)
  * [Python] Fix pandas tests to follow downstream datetime64 unit
    changes (#41979)
  * [Python] Allow Array.filter() to take general array input
    (#42051)
  * [Python] Expose new FLOAT16 logical type in the pyarrow.parquet
    bindings (#42103)
  * [Python] Array gracefully fails on non-cpu device (#42113)
  * [Python][Parquet] Pyarrow store decimal as integer (#42169)
  * [Python] Add CI job for Numpy 1.X (#42189)
  * [CI][Python] Pin openjdk=17 in python substrait integration
    (#43051)
- Drop pyarrow-pr41319-numpy2-tests.patch
- Add pyarrow-pr433325-extradirs.patch gh#apache/arrow/pull/43325

-------------------------------------------------------------------
Thu Apr 25 08:58:22 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 16.0.0
  * [Python] construct pandas.DataFrame with public API in
    to_pandas (#40897)
  * [Python] Fix ORC test segfault in the python wheel windows test
    (#40609)
  * [Python] Attach Python stacktrace to errors in ConvertPyError
    (#39380)
  * [Python] Plug reference leaks when creating Arrow array from
    Python list of dicts (#40412)
  * [Python] Empty slicing an array backwards beyond the start is
    now empty (#40682)
  * [Python] Slicing an array backwards beyond the start now
    includes first item. (#39240)
  * [Python] Calling
    pyarrow.dataset.ParquetFileFormat.make_write_options as a class
    method results in a segfault (#40976)
  * [Python] Fix parquet import in encryption test (#40505)
  * [Python] fix raising ValueError on _ensure_partitioning
    (#39593)
  * [Python] Validate max_chunksize in Table.to_batches (#39796)
  * [C++][Python] Fix test_gdb failures on 32-bit (#40293)
  * [Python] Make Tensor.__getbuffer__ work on 32-bit platforms
    (#40294)
  * [Python] Avoid using np.take in Array.to_numpy() (#40295)
  * [Python][C++] Fix large file handling on 32-bit Python build
    (#40176)
  * [Python] Update size assumptions for 32-bit platforms (#40165)
  * [Python] Fix OverflowError in foreign_buffer on 32-bit
    platforms (#40158)
  * [Python] Add Type_FIXED_SIZE_LIST to _NESTED_TYPES set (#40172)
  * [Python] Mark ListView as a nested type (#40265)
  * [Python] only allocate the ScalarMemoTable when used (#40565)
  * [Python] Error compiling Cython files on Windows during release
    verification
  * [Python] Fix flake8 failures in python/benchmarks/parquet.py
    (#40440)
  * [Python] Suppress python/examples/minimal_build/Dockerfile.*
    warnings (#40444)
  * [Python][Docs] Add workaround for autosummary (#40739)
  * [Python] BUG: Empty slicing an array backwards beyond the start
    should be empty
  * [CI][Python] Activate ARROW_PYTHON_VENV if defined in
    sdist-test job (#40707)
  * [CI][Python] CI failures on Python builds due to pytest_cython
    (#40975)
  * [Python] ListView pandas tests should use np.nan instead of
    None (#41040)
  * [C++][Python] Sporadic asof_join failures in PyArrow
  ## New Features and Improvements
  * [Python][CI] Remove legacy hdfs tests from hdfs and hypothesis
    setup (#40363)
  * [Python] Remove deprecated pyarrow.filesystem legacy
    implementations (#39825)
  * [C++][Python] Add missing methods to RecordBatch (#39506)
  * [Python][CI] Support ORC in Windows wheels
  * [Python] Correct test marker for join_asof tests (#40666)
  * [Python] Add join_asof binding (#34234)
  * [Python] Add a function to download and extract timezone
    database on Windows (#38179)
  * [Python][CI][Packaging] Enable ORC on Windows Appveyor CI and
    Windows wheels for pyarrow
  * [Python] Add a FixedSizeTensorScalar class (#37533)
  * [Python][CI][Dev][Python] Release and merge script errors
    (#37819)" (#40150)
  * [Python] Construct pyarrow.Field and ChunkedArray through Arrow
    PyCapsule Protocol (#40818)
  * [Python] Fix missing byte_width attribute on DataType class
    (#39592)
  * [Python] Compatibility with NumPy 2.0
  * [Packaging][Python] Enable building pyarrow against numpy 2.0
    (#39557)
  * [Python] Basic pyarrow bindings for Binary/StringView classes
    (#39652)
  * [Python] Expose force_virtual_addressing in PyArrow (#39819)
  * [Python][Parquet] Support hashing for FileMetaData and
    ParquetSchema (#39781)
  * [Python] Add bindings for ListView and LargeListView (#39813)
  * [Python][Packaging] Build pyarrow wheels with numpy RC instead
    of nightly (#41097)
  * [Python] Support creating Binary/StringView arrays from python
    objects (#39853)
  * [Python] ListView support for pa.array() (#40160)
  * [Python][CI] Remove upper pin on pytest (#40487)
  * [Python][FS][Azure] Minimal Python bindings for AzureFileSystem
    (#40021)
  * [Python] Low-level bindings for exporting/importing the C
    Device Interface (#39980)
  * [Python] Add ChunkedArray import/export to/from C (#39985)
  * [Python] Use Cast() instead of CastTo (#40116)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor
    (#40064)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add support for different data types (#40359)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add option to cast NULL to NaN (#40803)
  * [Python] Support requested_schema in __arrow_c_stream__()
    (#40070)
  * [Python] Support Binary/StringView conversion to numpy/pandas
    (#40093)
  * [Python] Allow FileInfo instances to be passed to dataset init
    (#40143)
  * [Python][CI] Add 32-bit Debian build on Crossbow (#40164)
  * [Python] ListView arrow-to-pandas conversion (#40482)
  * [Python][CI] Disable generating C lines in Cython tracebacks
    (#40225)
  * [Python] Support construction of Run-End Encoded arrays in
    pa.array(..) (#40341)
  * [Python] Accept dict in pyarrow.record_batch() function
    (#40292)
  * [Python] Update for NumPy 2.0 ABI change in
    PyArray_Descr->elsize (#40418)
  * [Python][CI] Fix install of nightly dask in integration tests
    (#40378)
  * [Python] Fix byte_width for binary(0) + fix hypothesis tests
    (#40381)
  * [Python][CI] Fix dataset partition filter tests with pandas
    nightly (#40429)
  * [Docs][Python] Added JsonFileFormat to docs (#40585)
  * [Dev][C++][Python][R] Use pre-commit for clang-format (#40587)
  * [Python][C++] Support conversion of pyarrow.RunEndEncodedArray
    to numpy/pandas (#40661)
  * [Python] Simplify and improve perf of creation of the column
    names in Table.to_pandas (#40721)
  * [Docs][C++][Python] Add initial documentation for
    RecordBatch::Tensor conversion (#40842)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add support for row-major (#40867)
  * [CI][Python] check message in test_make_write_options_error for
    Cython 2 (#41059)
  * [Python] Add copy keyword in Array.__array__ for numpy 2.0+
    compatibility (#41071)
  * [Python][Packaging] PyArrow wheel building is failing because
    of disabled vcpkg install of liblzma
- Drop apache-arrow-pr40230-glog-0.7.patch
- Drop apache-arrow-pr40275-glog-0.7-2.patch
- Add pyarrow-pr41319-numpy2-tests.patch gh#apache/arrow#41319

-------------------------------------------------------------------
Sat Mar 23 15:23:23 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 15.0.2
  ## Bug Fixes
  * [Python] Fix except clauses (#40387)
  * [Python][CI] Skip failing test_dateutil_tzinfo_to_string
    (#40486)

-------------------------------------------------------------------
Wed Feb 28 12:12:36 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Move to science/apache-arrow as multibuild package
- Also needs the cpp GLOG patches
  * Add apache-arrow-pr40230-glog-0.7.patch
  * Add apache-arrow-pr40275-glog-0.7-2.patch

-------------------------------------------------------------------
Fri Feb 23 17:35:37 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 15.0.1
  ## Bug Fixes
  * [Python] Fix race condition in _pandas_api#_check_import
    (#39314)
  * [Python] Avoid leaking references to Numpy dtypes (#39636)
  * [Release] Update platform tags for macOS wheels to macosx_10_15
    (#39657)
  * [Python][CI] Fix test failures with latest/nightly pandas
    (#39760)
  * [C#] Restore support for .NET 4.6.2 (#40008)
  * [Python] Make capsule name check more lenient (#39977)
  * [Python][FlightRPC] Release GIL in GeneratorStream (#40005)
  ## New Features and Improvements
  * [Python] Remove the use of pytest-lazy-fixture (#39850)
  * [Python][CI] Pin moto<5 for dask integration tests (#39881)
  * [Python] Fix tests for pandas with CoW / nightly integration
    tests (#40000)
- Release 15.0.0
  ## Bug Fixes
  * [C++][Python] Add a no-op kernel for
    dictionary_encode(dictionary) (#38349)
  * [Python] Fix S3FileSystem equals None segfault (#39276)
  * Fix TestArrowReaderAdHoc.ReadFloat16Files to use new
    uncompressed files (#38825)
  * [Python] Fix spelling (#38945)
  * [CI][Python] Update pandas tests failing on pandas nightly CI
    build (#39498)
  * [CI][JS] Force node 20 on JS build on arm64 to fix build issues
    (#39499)
  ## New Features and Improvements
  * [C++][Python] Add "Z" to the end of timestamp print string when
    tz defined (#39272)
  * [Python] Remove the legacy ParquetDataset custom python-based
    implementation (#39112)
  * [Python] add Table.to/from_struct_array (#38520)
  * [C++][Python] DLPack implementation for Arrow Arrays (producer)
    (#38472)
  * [Python] FixedSizeListArray.from_arrays supports mask parameter
    (#39396)
  * [C++][Python][R] Allow users to adjust S3 log level by
    environment variable (#38267)
  * [Python] Expose Parquet sorting metadata (#37665)
  * [C++][Python][Parquet] Implement Float16 logical type (#36073)
  * [Python] Make CacheOptions configurable from Python (#36627)
  * [Python][Parquet] Parquet Support write and validate Page CRC
    (#38360)
  * [Python][Dataset] Expose file size to python dataset (#37868)
  * [R] Allow code() to return package name prefix. (#38144)
  * [Python] Remove usage of pandas internals DatetimeTZBlock
    (#38321)
  * Add validation logic for offsets and values to
    arrow.array.ListArray.fromArrays (#38531)
  * [Python][Compute] Describe strptime format semantics (#38665)
  * [Python] Remove dead code in _reconstruct_block (#38714)
  * [Python] Fix append mode for cython 2 (#39027)
  * [Python] Add append mode for pyarrow.OsFile (#38820)
  * [Python] Extract libparquet requirements out of
    libarrow_python.so to new libarrow_python_parquet_encryption.so
    (#39316)
  * Create module info compiler plugin (#39135)
  * [Python] RecordBatchReader.from_stream constructor for objects
    implementing the Arrow PyCapsule protocol (#39218)
  * [Python] Pass in type to MapType.from_arrays (#39516)
  * [Python][CI] Skip failing dask tests: test_describe_empty and
    test_view (#39534)
  * [Python] NumPy 2.0 compat: remove usage of np.core (#39535)
  * [Packaging][Python] Add a numpy<2 pin to the install
    requirements for the 15.x release branch (#39538)

-------------------------------------------------------------------
Mon Jan 15 20:42:25 UTC 2024 - Ben Greiner <code@bnavigator.de>

- Update to 14.0.2
  ## New Features and Improvements
  * GH-38342 - [Python] Update to_pandas to use non-deprecated
    DataFrame constructor (#38374)
  * GH-38364 - [Python] Initialize S3 on first use (#38375)
  ## Bug Fixes
  * GH-38345 - [Release] Use local test data for verification if
    possible (#38362)
  * GH-38577 - Reading parquet file behavior change from 13.0.0 to
    14.0.0
  * GH-38626 - [Python] Fix segfault when PyArrow is imported at
    shutdown (#38637)
  * GH-38676 - [Python] Fix potential deadlock when CSV reading
    errors out (#38713)
  * GH-38984 - [Python][Packaging] Verification of wheels on
    AlmaLinux 8 are failing due to missing pip (#38985)
  * GH-39074 - [Release][Packaging] Use UTF-8 explicitly for KEYS
    (#39082)
-------------------------------------------------------------------
|
|
|
|
|
|
Tue Nov 14 23:29:03 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
- Fix CVE in changelog
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Tue Nov 14 09:28:23 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
- Update to 14.0.1
|
|
|
|
|
- drop pyarrow-pr37481-pandas2.1.patch
|
|
|
|
|
|
- fixes boo#1216991 CVE-2023-47248
|
|
|
|
|
|
* GH-38431 - [Python][CI] Update fs.type_name checks for s3fs tests
|
|
|
|
|
|
* GH-38607 - [Python] Disable PyExtensionType autoload
|
|
|
|
|
|
- Update to 14.0.0
|
|
|
|
|
|
* very long list of changes can be found here:
|
|
|
|
|
|
https://arrow.apache.org/release/14.0.0.html
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Thu Aug 31 18:43:55 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Update to 13.0.0
|
|
|
|
|
|
## Compatibility notes:
|
|
|
|
|
|
* The default format version for Parquet has been bumped from 2.4
|
|
|
|
|
|
to 2.6 GH-35746. In practice, this means that nanosecond
|
|
|
|
|
|
timestamps now preserve their resolution instead of being
|
|
|
|
|
|
converted to microseconds.
|
|
|
|
|
|
* Support for Python 3.7 is dropped GH-34788
|
|
|
|
|
|
## New features:
|
|
|
|
|
|
* Conversion to non-nano datetime64 for pandas >= 2.0 is now
|
|
|
|
|
|
supported GH-33321
|
|
|
|
|
|
* Write page index is now supported GH-36284
|
|
|
|
|
|
* Bindings for reading JSON format in Dataset are added GH-34216
|
|
|
|
|
|
* keys_sorted property of MapType is now exposed GH-35112
|
|
|
|
|
|
## Other improvements:
|
|
|
|
|
|
* Common python functionality between Table and RecordBatch
|
|
|
|
|
|
classes has been consolidated (GH-36129, GH-35415, GH-35390,
|
|
|
|
|
|
GH-34979, GH-34868, GH-31868)
|
|
|
|
|
|
* Some functionality for FixedShapeTensorType has been improved
|
|
|
|
|
|
(__reduce__ GH-36038, picklability GH-35599)
|
|
|
|
|
|
* Pyarrow scalars can now be accepted in the array constructor
|
|
|
|
|
|
GH-21761
|
|
|
|
|
|
* DataFrame Interchange Protocol implementation and usage is now
|
|
|
|
|
|
documented GH-33980
|
|
|
|
|
|
* Conversion between Arrow and Pandas for map/pydict now has
|
|
|
|
|
|
enhanced support GH-34729
|
|
|
|
|
|
* Usability of pc.map_lookup / MapLookupOptions is improved
|
|
|
|
|
|
GH-36045
|
|
|
|
|
|
* zero_copy_only keyword can now also be accepted in
|
|
|
|
|
|
ChunkedArray.to_numpy() GH-34787
|
|
|
|
|
|
* Python C++ codebase now has linter support in Archery and the
|
|
|
|
|
|
CI GH-35485
|
|
|
|
|
|
## Relevant bug fixes:
|
|
|
|
|
|
* __array__ numpy conversion for Table and RecordBatch is now
|
|
|
|
|
|
corrected so that np.asarray(pa.Table) doesn’t return a
|
|
|
|
|
|
transposed result GH-34886
|
|
|
|
|
|
* parquet.write_to_dataset doesn’t create empty files for
|
|
|
|
|
|
non-observed dictionary (category) values anymore GH-23870
|
|
|
|
|
|
* Dataset writer now also correctly follows default Parquet
|
|
|
|
|
|
version of 2.6 GH-36537
|
|
|
|
|
|
* Comparing pyarrow.dataset.Partitioning with another type is now
|
|
|
|
|
|
correctly handled GH-36659
|
|
|
|
|
|
* Pickling of pyarrow.dataset PartitioningFactory objects is now
|
|
|
|
|
|
supported GH-34884
|
|
|
|
|
|
* None schema is now disallowed in parquet writer GH-35858
|
|
|
|
|
|
* pa.FixedShapeTensorArray.to_numpy_ndarray no longer fails on
|
|
|
|
|
|
sliced arrays GH-35573
|
|
|
|
|
|
* Halffloat type is now supported in the conversion from Arrow
|
|
|
|
|
|
list to pandas GH-36168
|
|
|
|
|
|
* __from_arrow__ is now also implemented for Array.to_pandas for
|
|
|
|
|
|
pandas extension data types GH-36096
|
|
|
|
|
|
- Add pyarrow-pr37481-pandas2.1.patch gh#apache/arrow#37481
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Fri Aug 25 12:52:17 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Limit to Cython < 3
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Mon Jun 12 12:22:31 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Update to 12.0.1
|
|
|
|
|
|
## Bug Fixes
|
|
|
|
|
|
* [GH-35389] - [Python] Fix coalesce_keys=False option in join
|
|
|
|
|
|
operation (#35505)
|
|
|
|
|
|
* [GH-35821] - [Python][CI] Skip extension type test failing with
|
|
|
|
|
|
pandas 2.0.2 (#35822)
|
|
|
|
|
|
* [GH-35845] - [CI][Python] Fix usage of assert_frame_equal in
|
|
|
|
|
|
test_hdfs.py (#35842)
|
|
|
|
|
|
## New Features and Improvements
|
|
|
|
|
|
* [GH-35329] - [Python] Address pandas.types.is_sparse deprecation
|
|
|
|
|
|
(#35366)
|
|
|
|
|
|
- Drop pyarrow-pr35822-pandas2-extensiontype.patch
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Wed Jun 7 07:39:44 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Skip invalid pandas 2 test
|
|
|
|
|
|
* pyarrow-pr35822-pandas2-extensiontype.patch
|
|
|
|
|
|
* gh#apache/arrow#35822
|
|
|
|
|
|
* gh#apache/arrow#35839
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Thu May 18 07:28:28 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Update to 12.0.0
|
|
|
|
|
|
## Compatibility notes:
|
|
|
|
|
|
* Plasma has been removed in this release (GH-33243). In
|
|
|
|
|
|
addition, the deprecated serialization module in PyArrow was
|
|
|
|
|
|
also removed (GH-29705). IPC (Inter-Process Communication)
|
|
|
|
|
|
functionality of pyarrow or the standard library pickle should
|
|
|
|
|
|
be used instead.
|
|
|
|
|
|
* The deprecated use_async keyword has been removed from the
|
|
|
|
|
|
dataset module (GH-30774)
|
|
|
|
|
|
* Minimum Cython version to build PyArrow from source has been
|
|
|
|
|
|
raised to 0.29.31 (GH-34933). In addition, PyArrow can now be
|
|
|
|
|
|
compiled using Cython 3 (GH-34564).
|
|
|
|
|
|
## New features:
|
|
|
|
|
|
* A new pyarrow.acero module with initial bindings for the Acero
|
|
|
|
|
|
execution engine has been added (GH-33976)
|
|
|
|
|
|
* A new canonical extension type for fixed shaped tensor data has
|
|
|
|
|
|
been defined. This is exposed in PyArrow as the
|
|
|
|
|
|
FixedShapeTensorType (GH-34882, GH-34956)
|
|
|
|
|
|
* Run-End Encoded arrays binding has been implemented (GH-34686,
|
|
|
|
|
|
GH-34568)
|
|
|
|
|
|
* Method is_nan has been added to Array, ChunkedArray and
|
|
|
|
|
|
Expression (GH-34154)
|
|
|
|
|
|
* Dataframe interchange protocol has been implemented for
|
|
|
|
|
|
RecordBatch (GH-33926)
|
|
|
|
|
|
## Other improvements:
|
|
|
|
|
|
* Extension arrays can now be concatenated (GH-31868)
|
|
|
|
|
|
* get_partition_keys helper function is implemented in the
|
|
|
|
|
|
dataset module to access the partitioning field’s key/value
|
|
|
|
|
|
from the partition expression of a certain dataset fragment
|
|
|
|
|
|
(GH-33825)
|
|
|
|
|
|
* PyArrow Array objects can now be accepted by the pa.array()
|
|
|
|
|
|
constructor (GH-34411)
|
|
|
|
|
|
* The default row group size when writing parquet files has been
|
|
|
|
|
|
changed (GH-34280)
|
|
|
|
|
|
* RecordBatch has the select() method implemented (GH-34359)
|
|
|
|
|
|
* New method drop_column on the pyarrow.Table supports passing a
|
|
|
|
|
|
single column as a string (GH-33377)
|
|
|
|
|
|
* User-defined tabular functions, which are user functions
|
|
|
|
|
|
implemented in Python that return a stateful stream of tabular
|
|
|
|
|
|
data, are now also supported (GH-32916)
|
|
|
|
|
|
* Arrow Archery tool now includes linting of the Cython files
|
|
|
|
|
|
(GH-31905)
|
|
|
|
|
|
* Breaking Change: Reorder output fields of “group_by” node so
|
|
|
|
|
|
that keys/segment keys come before aggregates (GH-33616)
|
|
|
|
|
|
## Relevant bug fixes:
|
|
|
|
|
|
* Acero can now detect and raise an error in case a join
|
|
|
|
|
|
operation needs too many bytes of key data (GH-34474)
|
|
|
|
|
|
* Fix for converting non-sequence object in pa.array() (GH-34944)
|
|
|
|
|
|
* Fix erroneous table conversion to pandas if table includes an
|
|
|
|
|
|
extension array that does not implement to_pandas_dtype
|
|
|
|
|
|
(GH-34906)
|
|
|
|
|
|
* Reading from a closed ArrayStreamBatchReader now returns
|
|
|
|
|
|
invalid status instead of segfaulting (GH-34165)
|
|
|
|
|
|
* array() now returns pyarrow.Array and not pyarrow.ChunkedArray
|
|
|
|
|
|
for columns with __arrow_array__ method and only one chunk so
|
|
|
|
|
|
that the conversion of pandas dataframe with categorical column
|
|
|
|
|
|
of dtype string[pyarrow] does not fail (GH-33727)
|
|
|
|
|
|
* Custom type mapper in to_pandas now converts index dtypes
|
|
|
|
|
|
together with column dtypes (GH-34283)
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Wed Mar 29 13:25:55 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Fix tests expecting the jemalloc backend which was disabled in
|
|
|
|
|
|
the apache-arrow package
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Sun Mar 12 05:31:32 UTC 2023 - Ben Greiner <code@bnavigator.de>
|
|
|
|
|
|
|
|
|
|
|
|
- Update to v11.0.0
|
|
|
|
|
|
* [Python][Doc] Add five more numpydoc checks to CI (#15214)
|
|
|
|
|
|
* [Python][CI][Doc] Enable numpydoc check PR03 (#13983)
|
|
|
|
|
|
* [Python] Expose flag to enable/disable storing Arrow schema in Parquet metadata (#13000)
|
|
|
|
|
|
* [Python] Add support for reading record batch custom metadata API (#13041)
|
|
|
|
|
|
* [Python] Add lazy Dataset.filter() method (#13409)
|
|
|
|
|
|
* [Python] ParquetDataset to still take legacy code path when old filesystem is passed (#15269)
|
|
|
|
|
|
* [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset (#14052)
|
|
|
|
|
|
* [Python] Support lazy Dataset.filter
|
|
|
|
|
|
* [Python] Order of columns in pyarrow.feather.read_table (#14528)
|
|
|
|
|
|
* [Python] Construct MapArray from sequence of dicts (instead of list of tuples) (#14547)
|
|
|
|
|
|
* [Python] Unify CMakeLists.txt in python/ (#14925)
|
|
|
|
|
|
* [C++][Python] Implement list_slice kernel (#14395)
|
|
|
|
|
|
* [C++][Python] Enable struct_field kernel to accept string field names (#14495)
|
|
|
|
|
|
* [Python][C++] Add use_threads to run_substrait_query
|
|
|
|
|
|
* [Python][Docs] adding info about TableGroupBy.aggregation with empty list (#14482)
|
|
|
|
|
|
* [Python] DataFrame Interchange Protocol for pyarrow Table
|
|
|
|
|
|
* [Python] Drop older versions of Pandas (<1.0) (#14631)
|
|
|
|
|
* [Python] Pass Cmake args to Python CPP
|
|
|
|
|
* [Docs][Python] Improve docs for S3FileSystem (#14599)
|
|
|
|
|
|
* [Python] Add missing value accessor to temporal types (#14746)
|
|
|
|
|
|
* [Python] Expose time32/time64 scalar values (#14637)
|
|
|
|
|
|
* [Python] Remove gcc 4.9 compatibility code (#14602)
|
|
|
|
|
|
* [C++][Python] Support slicing to end in list_slice kernel (#14749)
|
|
|
|
|
|
* [C++][Python] Support step >= 1 in list_slice kernel (#14696)
|
|
|
|
|
|
* [Release][Python] Upload .wheel/.tar.gz for release not RC (#14708)
|
|
|
|
|
|
* [Python] Expose Scalar.validate() (#15149)
|
|
|
|
|
|
* [Python] PyArrow C++ header files no longer always included in installed pyarrow (#14656)
|
|
|
|
|
|
* [Doc][Python] Update note about bundling Arrow C++ on Windows (#14660)
|
|
|
|
|
|
* [Python] Reduce warnings during tests (#14729)
|
|
|
|
|
|
* [Python] Expose reading a schema from an IPC message (#14831)
|
|
|
|
|
|
* [Python] Expose QuotingStyle to Python (#14722)
|
|
|
|
|
|
* [Python] Add (Chunked)Array sort() method (#14781)
|
|
|
|
|
|
* [Python] Dataset.sort_by (#14976)
|
|
|
|
|
|
* [Python] Avoid dependency on exec plan in Table.sort_by to fix minimal tests (#15268)
|
|
|
|
|
|
* [Python] Remove auto generated pyarrow_api.h and pyarrow_lib.h (#15219)
|
|
|
|
|
|
* [Python] Error if datetime.timedelta to pyarrow.duration conversion overflows (#13718)
|
|
|
|
|
|
* [Python] to_pandas fails with FixedOffset timezones when timestamp_as_object is used (#14448)
|
|
|
|
|
|
* [Python] Pass **kwargs in read_feather to to_pandas() (#14492)
|
|
|
|
|
|
* [Python] Add python test for decimals to csv (#14525)
|
|
|
|
|
|
* [Python] Test that reading of timedelta is stable (read_feather/to_pandas) (#14531)
|
|
|
|
|
|
* [C++][Python] Improve s3fs error message when wrong region (#14601)
|
|
|
|
|
|
* [Python][C++] Adding support for IpcWriteOptions to the dataset ipc file writer (#14414)
|
|
|
|
|
|
* [Python] Support passing create_dir thru pq.write_to_dataset (#14459)
|
|
|
|
|
|
* [CI][Python] Fix pandas master/nightly build failure related to timedelta (#14460)
|
|
|
|
|
|
* [Python] Fix writing files with multi-byte characters in file name (#14764)
|
|
|
|
|
* [Python] Handle pytest 8 deprecations about pytest.warns(None)
|
|
|
|
|
* [Python] Remove ARROW_BUILD_DIR in building pyarrow C++ (#14498)
|
|
|
|
|
|
* [Python] Honor default memory pool in Dataset scanning (#14516)
|
|
|
|
|
|
* [Python] Fully support filesystem in parquet.write_metadata (#14574)
|
|
|
|
|
|
* [Python] Check schema argument type in RecordBatchReader.from_batches (#14583)
|
|
|
|
|
|
* [Python][Docs] PyArrow table join docstring typos for left and right suffix arguments (#14591)
|
|
|
|
|
|
* [Python] pass back time types with correct type class (#14633)
|
|
|
|
|
|
* [Python] Support filesystem parameter in ParquetFile (#14717)
|
|
|
|
|
|
* [Python][Docs] Add missing CMAKE_PREFIX_PATH to allow setup.py CMake invocations to find Arrow CMake package (#14586)
|
|
|
|
|
|
* [Python][CI] Add DYLD_LIBRARY_PATH to avoid requiring PYARROW_BUNDLE_ARROW_CPP on macOS job (#14643)
|
|
|
|
|
|
* [Python] Don't crash when schema=None in FlightClient.do_put (#14698)
|
|
|
|
|
|
* [Python] Change warnings to _warnings in _plasma_store_entry_point (#14695)
|
|
|
|
|
|
* [CI][Python] Update nightly test-conda-python-3.7-pandas-0.24 to pandas >= 1.0 (#14714)
|
|
|
|
|
|
* [CI][Python] Update spark test modules to match spark master (#14715)
|
|
|
|
|
|
* [Python] Fix test_s3fs_wrong_region; set anonymous=True (#14716)
|
|
|
|
|
|
* [Python][CI] Fix nightly job using pandas dev (temporarily skip tests) (#15048)
|
|
|
|
|
|
* [Python] Quadratic memory usage of Table.to_pandas with nested data
|
|
|
|
|
|
* [Python] Fix pyarrow.get_libraries() order (#14944)
|
|
|
|
|
|
* [Python] Fix segfault for dataset ORC write (#15049)
|
|
|
|
|
|
* [Python][Docs] Update docstring for pyarrow.decompress (#15061)
|
|
|
|
|
|
* [Python][CI] Dask nightly tests are failing due to fsspec bug (#15065)
|
|
|
|
|
|
* [C++][Python][FlightRPC] Make DoAction truly streaming (#15118)
|
|
|
|
|
|
* [Benchmarking][Python] Set ARROW_INSTALL_NAME_RPATH=ON for benchmark builds (#15123)
|
|
|
|
|
|
* [Python][macOS] Use `@rpath` for libarrow_python.dylib (#15143)
|
|
|
|
|
|
* [Python] Docstring test failure (#15186)
|
|
|
|
|
|
* [Python] Don't use target_include_directories() for imported target (#33606)
|
|
|
|
|
|
* [Python] Make CSV cancellation test more robust
|
|
|
|
|
|
* [Python][CI] Python sdist installation fails with latest setuptools 58.5
|
|
|
|
|
|
* [Python] Missing bindings for existing_data_behavior makes it impossible to maintain old behavior
|
|
|
|
|
|
* [Python] update trove classifiers to include Python 3.10
|
|
|
|
|
|
* [Release][Python] Use python -m pytest
|
|
|
|
|
|
* [Python][C++] Non-deterministic segfault in "AMD64 MacOS 10.15 Python 3.7" build
|
|
|
|
|
|
* [Python][Doc] Clarify what should be expected if read_table is passed an empty list of columns
|
|
|
|
|
|
* [Python][Packaging] Set deployment target to 10.13 for universal2 wheels
|
|
|
|
|
|
* [Python] Fix crash in take/filter of empty ExtensionArray
|
|
|
|
|
|
* [Python] Move marks from fixtures to individual tests/params
|
|
|
|
|
|
* [Python][CI] Requiring s3fs >= 2021.8
|
|
|
|
|
|
* [Python] Allow writing datasets using a partitioning that only specifies field_names
|
|
|
|
|
|
* [Python] Table.from_arrays should raise an error when array is empty but names is not
|
|
|
|
|
|
* [Python][Packaging] Pin minimum setuptools version for the macos wheels
|
|
|
|
|
|
* [Python][Doc] Document nullable dtypes handling and usage of types_mapper in to_pandas conversion
|
|
|
|
|
|
* [C++][Python] Fix unique/value_counts on empty dictionary arrays
|
|
|
|
|
|
* [Python][CI] Fix tests using OrcFileFormat for Python 3.6 + orc not built
|
|
|
|
|
|
* [Python] Fix FlightClient.do_action
|
|
|
|
|
|
* [Python][Docs] Fix usage of sync scanner in dataset writing docs
|
|
|
|
|
|
* [Packaging][Python] Python 3.9 installation fails in macOS wheel build
|
|
|
|
|
|
* [CI][Python] Fix Spark integration failures
|
|
|
|
|
|
* [Python] Fix version constraints in pyproject.toml
|
|
|
|
|
|
* [Packaging][Python] Disable windows wheel testing for python 3.6
|
|
|
|
|
|
* [Python][C++] Segfault with read_json when a field is missing
|
|
|
|
|
|
* [Python] Support for set/list columns when converting from Pandas
|
|
|
|
|
|
* [Python] Support converting nested sets when converting to arrow
|
|
|
|
|
|
* [Python] Make filesystems compatible with fsspec
|
|
|
|
|
|
* [C++][Python][R] Consolidate coalesce/fill_null
|
|
|
|
|
|
* [Python][Doc] Document the fsspec wrapper for pyarrow.fs filesystems
|
|
|
|
|
|
* [Python] Support core-site.xml default filesystem.
|
|
|
|
|
|
* [Python] Improve HadoopFileSystem docstring
|
|
|
|
|
|
* [Python][Doc] Document missing pandas to arrow conversions
|
|
|
|
|
|
* [Python] Make SubTreeFileSystem print method more informative
|
|
|
|
|
|
* [Doc][Python] Improve documentation regarding dealing with memory mapped files
|
|
|
|
|
|
* [C++][Python] Implement a new scalar function: list_element
|
|
|
|
|
|
* [Python] Allow creating RecordBatch from Python dict
|
|
|
|
|
|
* [Python] Update HadoopFileSystem docs to clarify setting CLASSPATH env variable is required
|
|
|
|
|
|
* [Python] Improve documentation on what 'use_threads' does in 'read_feather'
|
|
|
|
|
|
* [C++][Python] Improve consistency of explicit C++ types in PyArrow files
|
|
|
|
|
|
* [Doc][Python] Improve PyArrow documentation for new users
|
|
|
|
|
|
* [C++][Python] Add CSV convert option to change decimal point
|
|
|
|
|
|
* [Python][Packaging] Build M1 wheels for python 3.8
|
|
|
|
|
|
* [Release][Python] Verify python 3.8 macOS arm64 wheel
|
|
|
|
|
|
* [Doc][Python] Switch ipc/io doc to use context managers
|
|
|
|
|
|
* [Python] Mention alternative deprecation message for ParquetDataset.partitions
|
|
|
|
|
|
* [C++][Python] Implement ExtensionScalar
|
|
|
|
|
|
* [Packaging][Python] Skip test_cancellation test case on M1
|
|
|
|
|
|
* [Python][FlightRPC] pyarrow client do_put close method after write_table did not throw flight error
|
|
|
|
|
|
* [Packaging][Python] Define --with-lg-page for jemalloc in the arm manylinux builds
|
|
|
|
|
|
* [Python] Fix docstrings
|
|
|
|
|
|
* [Python] Expose copy_files in pyarrow.fs
|
|
|
|
|
|
* [Doc][Python] Add a recipe on how to save partitioned datasets to the Cookbook
|
|
|
|
|
|
* [Python] Update deprecated pytest yield_fixture functions
|
|
|
|
|
|
* [Python] Support for MapType with Fields
|
|
|
|
|
|
* [Python][Docs] Improve filesystem documentation
|
|
|
|
|
|
* [Python] Add dataset mark to test_parquet_dataset_deprecated_properties
|
|
|
|
|
|
* [Python] Preview data when printing tables
|
|
|
|
|
|
* [C++][Python] Column projection pushdown for ORC dataset reading + use liborc for column selection
|
|
|
|
|
|
* [C++][Python] Add support for new MonthDayNano Interval Type
|
|
|
|
|
|
* [Doc][Python] Add documentation for unify_schemas
|
|
|
|
|
|
* [C++][Python] Implement C data interface support for extension types
|
|
|
|
|
|
* [Python] Allow more than numpy.array as masks when creating arrays
|
|
|
|
|
|
* [Python] Correct TimestampScalar.as_py() and DurationScalar.as_py() docstrings
|
|
|
|
|
|
* [Python] Migrate Python ORC bindings to use new Result-based APIs
|
|
|
|
|
|
* [Python] Support tuples in unify_schemas
|
|
|
|
|
|
* [C++][Python] Not providing a sort_key in the "select_k_unstable" kernel crashes
|
|
|
|
|
|
* [C++][Python] Support cast of naive timestamps to strings
|
|
|
|
|
|
* [Python] Update kernel categories in compute doc to match C++
|
|
|
|
|
|
* [C++][Python][R] Implement count distinct kernel
|
|
|
|
|
|
* [Python] Allow unsigned integer index type in dictionary() type factory function
|
|
|
|
|
|
* [Python] Missing Python tests for compute kernels
|
|
|
|
|
|
* [Python][CI] Add support for python 3.10
|
|
|
|
|
|
* [C++][Python] Improve error message when trying to use SyncScanner when requiring async
|
|
|
|
|
|
* [Python] Extend CompressedInputStream to work with paths, strings and files
|
|
|
|
|
|
* [Packaging][Python] Enable NEON SIMD optimization for M1 wheels
|
|
|
|
|
|
* [C++][Python] Use std::move() explicitly for g++ 4.8.5
|
|
|
|
|
|
* [Python][Packaging] Use numpy 1.21.3 to build python 3.10 wheels for macOS and windows
|
|
|
|
|
|
- Build via PEP517
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Mon Aug 22 07:06:44 UTC 2022 - John Vandenberg <jayvdb@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
- Update to v9.0.0
|
|
|
|
|
|
|
|
|
|
|
|
-------------------------------------------------------------------
|
|
|
|
|
|
Mon Jan 21 03:51:32 UTC 2019 - Todd R <toddrme2178@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
- Initial version for v0.13.0
|