forked from pool/apache-arrow

63 Commits

Author SHA256 Message Date
97d3f5b9bd Accepting request 1307560 from science
- Remove boost::system dependency for Tumbleweed
  * Add arrow-boost-system-1.89-boo1249599.patch
  * gh#boostorg/system#132
  * boo#1249599

- Update to 21.0.0
  ## Bug Fixes
  * GH-32276 - [C++][FlightRPC] Add option to align RecordBatch
    buffers given to IPC reader (#44279)
  * GH-35166 - [C++][Compute] Increase precision of decimals in sum
    aggregates (#44184)
  * GH-40756 - [C++] Remove dead Boost urls (#46452)
  * GH-45532 - [C++] RunEndEncodedBuilder should clear dimensions
    after a Finish() call (#45533)
  * GH-45534 - [C++] Test: RunEndEncodeTableColumns should update
    REE columns' schema types (#45535)
  * GH-45608 - [C++][Flight] Fix compilation for clang (#46264)
  * GH-45735 - [C++] Broken tests for extract_regex compute function
    (#45900)
  * GH-45853 - [C++][Dev] Fix Meson compilation issues in Docker
    builds (#45858)
  * GH-46011 - [C++] Hide DCHECK family from public headers
    (#46015)
  * GH-46025 - [C++] Use ARROW_CUDA_EXPORT instead of ARROW_EXPORT
    for libarrow_cuda (#46030)
  * GH-46052 - [C++][Benchmarking] Don't build grouper benchmark
    without ARROW_COMPUTE=ON (#46053)
  * GH-46070 - [C++] Remove duplicate storage_type in JsonExtension
    (#46071)
  * GH-46084 - [C++] Always use ARROW_VCPKG to detect vcpkg mode

OBS-URL: https://build.opensuse.org/request/show/1307560
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=21
2025-09-29 14:32:07 +00:00
fe8beb41f1 fix patch application
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=63
2025-09-28 12:00:30 +00:00
95ae261a6d - Remove boost::system dependency for Tumbleweed
  * Add arrow-boost-system-1.89-boo1249599.patch
  * gh#boostorg/system#132
  * boo#1249599

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=62
2025-09-27 09:37:29 +00:00
423747c6f1 OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=61 2025-09-26 11:14:45 +00:00
d7bbabe07b OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=60 2025-09-26 11:03:19 +00:00
f26a64d1c1 - Update to 21.0.0
  ## Bug Fixes
  * GH-32276 - [C++][FlightRPC] Add option to align RecordBatch
    buffers given to IPC reader (#44279)
  * GH-35166 - [C++][Compute] Increase precision of decimals in sum
    aggregates (#44184)
  * GH-40756 - [C++] Remove dead Boost urls (#46452)
  * GH-45532 - [C++] RunEndEncodedBuilder should clear dimensions
    after a Finish() call (#45533)
  * GH-45534 - [C++] Test: RunEndEncodeTableColumns should update
    REE columns' schema types (#45535)
  * GH-45608 - [C++][Flight] Fix compilation for clang (#46264)
  * GH-45735 - [C++] Broken tests for extract_regex compute function
    (#45900)
  * GH-45853 - [C++][Dev] Fix Meson compilation issues in Docker
    builds (#45858)
  * GH-46011 - [C++] Hide DCHECK family from public headers
    (#46015)
  * GH-46025 - [C++] Use ARROW_CUDA_EXPORT instead of ARROW_EXPORT
    for libarrow_cuda (#46030)
  * GH-46052 - [C++][Benchmarking] Don't build grouper benchmark
    without ARROW_COMPUTE=ON (#46053)
  * GH-46070 - [C++] Remove duplicate storage_type in JsonExtension
    (#46071)
  * GH-46084 - [C++] Always use ARROW_VCPKG to detect vcpkg mode
    (#46467)
  * GH-46090 - [C++] Set default IPC option to enabled in Meson
    (#46114)
  * GH-46094 - [C++][Docs] Add note to RleDecoder::Get's doc
    comment (#46874)

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=59
2025-09-25 11:52:33 +00:00
a49d6aac12 Accepting request 1285645 from science
- Update to 20.0.0
  ## Bug Fixes
  * GH-30302 - [C++][Parquet] Preserve the bitwidth of integer
    dictionary indices on round-trip to Parquet (#45685)
  * GH-31992 - [C++][Parquet] Handling the special case when
    DataPageV2 values buffer is empty (#45252)
  * GH-37630 - [C++][Python][Dataset] Allow disabling fragment
    metadata caching (#45330)
  * GH-39023 - [C++][CMake] Add missing launcher path conversion
    for ExternalPackage (#45349)
  * GH-43057 - [C++] Thread-safe AesEncryptor / AesDecryptor
    (#44990)
  * GH-45048 - [C++][Parquet] Deprecate unused chunk_size parameter
    in parquet::arrow::FileWriter::NewRowGroup() (#45088)
  * GH-45129 - [Python][C++] Fix usage of deprecated C++
    functionality on pyarrow (#45189)
  * GH-45132 - [C++][Gandiva] Update LLVM to 18.1 (#45114)
  * GH-45185 - [C++][Parquet] Raise an error for invalid repetition
    levels when delimiting records (#45186)
  * GH-45254 - [C++][Acero] Fix the row offset truncation in row
    table merge (#45255)
  * GH-45266 - [C++][Acero] Fix the running tasks count of
    Scheduler when get error tasks in multi-threads (#45268)
  * GH-45270 - [C++][CI] Disable mimalloc on Valgrind builds
    (#45271)
  * GH-45301 - [C++] Change PrimitiveArray ctor to protected
    (#45444)
  * GH-45334 - [C++][Acero] Fix swiss join overflow issues in row
    offset calculation for fixed length and null masks (#45336)
  * GH-45362 - [C++] Fix identity cast for time and list scalar

OBS-URL: https://build.opensuse.org/request/show/1285645
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=20
2025-06-14 14:17:55 +00:00
b8b054a93e OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=57 2025-06-13 18:46:54 +00:00
26f7f2002b .
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=56
2025-06-13 18:39:08 +00:00
77cc1e4fa0 - Update to 20.0.0
  ## Bug Fixes
  * GH-30302 - [C++][Parquet] Preserve the bitwidth of integer
    dictionary indices on round-trip to Parquet (#45685)
  * GH-31992 - [C++][Parquet] Handling the special case when
    DataPageV2 values buffer is empty (#45252)
  * GH-37630 - [C++][Python][Dataset] Allow disabling fragment
    metadata caching (#45330)
  * GH-39023 - [C++][CMake] Add missing launcher path conversion
    for ExternalPackage (#45349)
  * GH-43057 - [C++] Thread-safe AesEncryptor / AesDecryptor
    (#44990)
  * GH-45048 - [C++][Parquet] Deprecate unused chunk_size parameter
    in parquet::arrow::FileWriter::NewRowGroup() (#45088)
  * GH-45129 - [Python][C++] Fix usage of deprecated C++
    functionality on pyarrow (#45189)
  * GH-45132 - [C++][Gandiva] Update LLVM to 18.1 (#45114)
  * GH-45185 - [C++][Parquet] Raise an error for invalid repetition
    levels when delimiting records (#45186)
  * GH-45254 - [C++][Acero] Fix the row offset truncation in row
    table merge (#45255)
  * GH-45266 - [C++][Acero] Fix the running tasks count of
    Scheduler when get error tasks in multi-threads (#45268)
  * GH-45270 - [C++][CI] Disable mimalloc on Valgrind builds
    (#45271)
  * GH-45301 - [C++] Change PrimitiveArray ctor to protected
    (#45444)
  * GH-45334 - [C++][Acero] Fix swiss join overflow issues in row
    offset calculation for fixed length and null masks (#45336)
  * GH-45362 - [C++] Fix identity cast for time and list scalar

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=55
2025-06-13 18:31:56 +00:00
4a660bf2fd Accepting request 1271193 from science
- to fix cmake-4 build problems, upgrade bundled mimalloc from
  2.0.6 to 2.0.9 and add apache-arrow-19.0.1-mimalloc-version.patch;
  mimalloc changes according to readme.md:
  * 2.0.9:
    - Supports building with asan and improved [Valgrind] support.
    - Support arbitrarily large alignments, in particular for
      `std::pmr` pools.
    - Added C++ STL allocators attached to a specific heap.
    - Heap walks now visit all objects (including huge objects).
    - Support Windows nano server containers.
    - Various small bug fixes.
  * 2.0.7:
    - Initial support for [Valgrind] for leak testing and heap
      block overflow detection.
    - Initial support for attaching heaps to a specific memory area.
    - Fix `realloc` behavior for zero size blocks.
    - Remove restriction to integral multiple of the alignment in
      `alloc_align`.
    - Improved aligned allocation performance.
    - Reduced contention with many threads on few processors.
    - VS2022 support.
    - Support `pkg-config`.

OBS-URL: https://build.opensuse.org/request/show/1271193
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=19
2025-04-22 15:28:04 +00:00
77a7a6c0ae OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=53 2025-04-21 16:33:45 +00:00
b3fe0e46bf changes to fix cmake-4 build problems
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=52
2025-04-21 16:30:53 +00:00
7710aa0469 Accepting request 1264972 from science
- Re-enable flight, grpc has been fixed boo#1237422

OBS-URL: https://build.opensuse.org/request/show/1264972
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=18
2025-04-02 19:05:38 +00:00
96176c78b0 - Re-enable flight, grpc has been fixed boo#1237422
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=50
2025-03-28 08:48:20 +00:00
9b0d645fe4 Accepting request 1252869 from science
- Add missing dependencies for libboost_process explicitly
  boo#1239599 (forwarded request 1252868 from bnavigator)

OBS-URL: https://build.opensuse.org/request/show/1252869
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=17
2025-03-13 21:47:20 +00:00
69d004fa1c Accepting request 1252868 from home:bnavigator:branches:science
- Add missing dependencies for libboost_process explicitly
  boo#1239599

OBS-URL: https://build.opensuse.org/request/show/1252868
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=48
2025-03-13 19:08:14 +00:00
2266545ad6 Accepting request 1247454 from science
- disable flight because of gh#grpc/grpc#37968 boo#1237422 (forwarded request 1247453 from bnavigator)

OBS-URL: https://build.opensuse.org/request/show/1247454
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=16
2025-02-20 18:53:17 +00:00
6abdc26711 - disable flight because of gh#grpc/grpc#37968 boo#1237422
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=46
2025-02-20 16:43:05 +00:00
274e24f951 OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=45 2025-02-18 19:10:15 +00:00
2caac2258f OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=44 2025-02-18 19:08:10 +00:00
cebda09598 OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=43 2025-02-18 15:09:20 +00:00
0389e42d45 OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=42 2025-02-18 13:00:56 +00:00
9f1d8991ae OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=41 2025-02-17 22:38:30 +00:00
1539a8cfb2 - Update to 19.0.1
  ## Bug Fixes
  * [C++] Fix overflow issues for large build side in swiss join
    (#45108)
  * [C++][Fuzzing] Fix Negation bug discovered by fuzzing (#45181)
  * [C++][Parquet] Omit level histogram when max level is 0
    (#45285)
  * [Parquet][C++] Fix statistics load logic for no row group and
    multiple row groups (#45350)
  * [C++] Disable Flight test (#45232)
  ## Improvements
  * [C++][Parquet] Improve performance of generating size
    statistics (#45202)
  * [C++][S3] Workaround compatibility issue between AWS SDK and
    MinIO (#45310)
- Release 19.0.0
  ## New Features and Improvements
  * [CI][C++] Add a nightly job to test offline build (#44721)
  * [C++] GcsFileSystem::Make should return Result (#44503)
  * [C++][Parquet] Implement SizeStatistics (#40594)
  * [C++] Reduce string inlining in Substrait serde (#45174)
  * [C++][Acero] Enhance asof_join to work in multi-threaded
    execution by sequencing input (#44083)
  * [C++] Support the AWS S3 SSE-C encryption (#43601)
  * [C++][Parquet] Parquet Metadata Printer supports print
    sort-columns (#43599)
  * [C++] Add C++ implementation of Async C Data Interface (#44495)
  * [C++][Acero] Support AVX2 swiss join decoding (#43832)
  * [C++] skip -0117 in StrptimeZoneOffset for old glibc (#44621)
  * [C++] Add arrow::RecordBatch::MakeStatisticsArray() (#44252)

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=40
2025-02-17 22:32:29 +00:00
22a0ee3370 Accepting request 1218457 from science
OBS-URL: https://build.opensuse.org/request/show/1218457
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=15
2024-10-27 10:25:51 +00:00
20345967c9 - Set the appropriate C++ compiler for the given platform so
  it will compile on Leap 15.x.

- Enable sle15_python_module_pythons.

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=38
2024-10-26 01:06:02 +00:00
6f40ca4abe Accepting request 1201792 from science
OBS-URL: https://build.opensuse.org/request/show/1201792
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=14
2024-09-22 09:05:54 +00:00
174a699a90 - Add apache-arrow-pr43766-boost1_86.patch for Boost 1.86
  * gh#apache/arrow#43766

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=36
2024-09-18 12:46:47 +00:00
86cdaafbd4 Accepting request 1194086 from science
OBS-URL: https://build.opensuse.org/request/show/1194086
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=13
2024-08-16 10:23:38 +00:00
9c4175a075 Accepting request 1170145 from science
- Update to 16.0.0
  ## Bug Fixes
  * [C++][ORC] Catch all ORC exceptions to avoid crash (#40697)
  * [C++][S3] Handle conventional content-type for directories
    (#40147)
  * [C++] Strengthen handling of duplicate slashes in S3, GCS
    (#40371)
  * [C++] Avoid hash_mean overflow (#39349)
  * [C++] Fix spelling (array) (#38963)
  * [C++][Parquet] Fix crash in Modular Encryption (#39623)
  * [C++][Dataset] Fix failures in dataset-scanner-benchmark
    (#39794)
  * [C++][Device] Fix Importing nested and string types for
    DeviceArray (#39770)
  * [C++] Use correct (non-CPU) address of buffer in
    ExportDeviceArray (#39783)
  * [C++] Improve error message for "chunker out of sync" condition
    (#39892)
  * [C++] Use make -j1 to install bundled bzip2 (#39956)
  * [C++] DatasetWriter avoid creating zero-sized batch when
    max_rows_per_file enabled (#39995)
  * [C++][CI] Disable debug memory pool for ASAN and Valgrind
    (#39975)
  * [C++][Gandiva] Make Gandiva's default cache size to be 5000 for
    object code cache (#40041)
  * [C++][FS][Azure] Fix CreateDir and DeleteDir trailing slash
    issues on hierarchical namespace accounts (#40054)
  * [C++][FS][Azure] Validate containers in
    AzureFileSystem::Impl::MovePaths() (#40086)
  * [C++] Decimal types with different precisions and scales bind
    failed in resolve type when call arithmetic function (#40223)
  * [C++][Docs] Correct the console emitter link (#40146)
  * [C++][Python] Fix test_gdb failures on 32-bit (#40293)
  * [Python][C++] Fix large file handling on 32-bit Python build
    (#40176)
  * [C++] Support glog 0.7 build (#40230)
  * [C++] Fix cast function bind failed after add an alias name
    through AddAlias (#40200)
  * [C++] TakeCC: Concatenate only once and delegate to TakeAA
    instead of TakeCA (#40206)
  * [C++] Fix an abort in the asof_join_benchmark run caused by a
    missing arg (#40234)
  * [C++] Fix a simple buffer-overflow case in decimal_benchmark
    (#40277)
  * [C++] Reduce S3Client initialization time (#40299)
  * [C++] Fix a wrong total_bytes to generate StringType's test
    data in vector_hash_benchmark (#40307)
  * [C++][Gandiva] Add support for compute module's decimal
    promotion rules (#40434)
  * [C++][Parquet] Add missing config.h include in
    key_management_test.cc (#40330)
  * [C++][CMake] Add missing glog::glog dependency to arrow_util
    (#40332)
  * [C++][Gandiva] Add missing OpenSSL dependency to
    encrypt_utils_test.cc (#40338)
  * [C++] Remove const qualifier from Buffer::mutable_span_as
    (#40367)
  * [C++] Avoid simplifying expressions which call impure functions
    (#40396)
  * [C++] Expose protobuf dependency if opentelemetry or ORC are
    enabled (#40399)
  * [C++][FlightRPC] Add missing expiration_time arguments (#40425)
  * [C++] Move key_hash/key_map/light_array related files to
    internal to prevent use by users (#40484)
  * [C++] Add missing Threads::Threads dependency to arrow_static
    (#40433)
  * [C++] Fix static build on Windows (#40446)
  * [C++] Ensure using bundled FlatBuffers (#40519)
  * [C++][CI] Fix TSAN and ASAN/UBSAN crashes (#40559)
  * [C++] Repair FileSystem merge error (#40564)
  * [C++] Fix 3.12 Python support (#40322)
  * [C++] Move mold linker flags to variables (#40603)
  * [C++] Enlarge dest buffer according to dest offset for
    CopyBitmap benchmark (#40769)
  * [C++][Gandiva] 'ilike' function does not work (#40728)
  * [C++] Fix protobuf package name setting for builds with
    substrait (#40753)
  * [C++][ORC] Fix std::filesystem related link error with ORC
    2.0.0 or later (#41023)
  * [C++] Fix TSAN link error for module library (#40864)
  * [C++][FS][Azure] Don't run TestGetFileInfoGenerator() with
    Valgrind (#41163)
  * [C++] Fix null count check in BooleanArray.true_count()
    (#41070)
  * [C++] IO: fixing compiling in gcc 7.5.0 (#41025)
  * [C++][Parquet] Bugfixes and more tests in boolean arrow
    decoding (#41037)
  * [C++] formatting.h: Make sure space is allocated for the 'Z'
    when formatting timestamps (#41045)
  * [C++] Ignore ARROW_USE_MOLD/ARROW_USE_LLD with clang < 12
    (#41062)
  * [C++] Fix: left anti join filter empty rows. (#41122)
  * [CI][C++] Don't use CMake 3.29.1 with vcpkg (#41151)
  * [CI][C++] Use newer LLVM on Ubuntu 24.04 (#41150)
  * [CI][R][C++] test-r-linux-valgrind has started failing
  * [C++][Python] Sporadic asof_join failures in PyArrow
  * [C++] Fix Valgrind error in string-to-float16 conversion
    (#41155)
  * [C++] Stop defining ARROW_TEST_MEMCHECK in config.h.cmake
    (#41177)
  * [C++] Fix mistake in integration test. Explicitly cast
    std::string to avoid compiler interpreting char* -> bool
    (#41202)
  ## New Features and Improvements
  * [C++] Filesystem implementation for Azure Blob Storage
  * [C++] Implement cast to/from halffloat (#40067)
  * [C++] Add residual filter support to swiss join (#39487)
  * [C++] Add support for building with Emscripten (#37821)
  * [C++][Python] Add missing methods to RecordBatch (#39506)
  * [C++][Java][Flight RPC] Add Session management messages
    (#34817)
  * [C++] build filesystems as separate modules (#39067)
  * [C++][Parquet] Rewrite BYTE_STREAM_SPLIT SSE optimizations
    using xsimd (#40335)
  * [C++] Add support for service-specific endpoint for S3 using
    AWS_ENDPOINT_URL_S3 (#39160)
  * [C++][FS][Azure] Implement DeleteFile() (#39840)
  * [C++] Implement Azure FileSystem Move() via Azure DataLake
    Storage Gen 2 API (#39904)
  * [C++] Add ImportChunkedArray and ExportChunkedArray to/from
    ArrowArrayStream (#39455)
  * [CI][C++][Go] Don't run jobs that use a self-hosted GitHub
    Actions Runner on fork (#39903)
  * [C++][FS][Azure] Use the generic filesystem tests (#40567)
  * [C++][Compute] Add binary_slice kernel for fixed size binary
    (#39245)
  * [C++] Avoid creating memory manager instance for every buffer
    view/copy (#39271)
  * [C++][Parquet] Minor: Style enhancement for
    parquet::FileMetaData (#39337)
  * [C++] IO: Reuse same buffer in CompressedInputStream (#39807)
  * [C++] Use more permissible return code for rename (#39481)
  * [C++][Parquet] Use std::count in ColumnReader ReadLevels
    (#39397)
  * [C++] Support cast kernel from large string, (large) binary to
    dictionary (#40017)
  * [C++] Pass -jN to make in external projects (#39550)
  * [C++][Parquet] Add integration test for BYTE_STREAM_SPLIT
    (#39570)
  * [C++] Ensure top-level benchmarks present informative metrics
    (#40091)
  * [C++] Ensure CSV and JSON benchmarks present a bytes/s or
    items/s metric (#39764)
  * [C++] Ensure dataset benchmarks present a bytes/s or items/s
    metric (#39766)
  * [C++][Gandiva] Ensure Gandiva benchmarks present a bytes/s or
    items/s metric (#40435)
  * [C++][Parquet] Benchmark levels decoding (#39705)
  * [C++][FS][Azure] Remove StatusFromErrorResponse as it's not
    necessary (#39719)
  * [C++][Parquet] Make BYTE_STREAM_SPLIT routines type-agnostic
    (#39748)
  * [C++][Device] Generic CopyBatchTo/CopyArrayTo memory types
    (#39772)
  * [C++] Document and micro-optimize ChunkResolver::Resolve()
    (#39817)
  * [C++] Allow building cpp/src/arrow/**/*.cc without waiting
    bundled libraries (#39824)
  * [C++][Parquet] Parquet binary length overflow exception should
    contain the length of binary (#39844)
  * [C++][Parquet] Minor: avoid creating a new Reader object in
    Decoder::SetData (#39847)
  * [C++] Thirdparty: Bump google benchmark to 1.8.3 (#39878)
  * [C++] DataType::ToString support optionally show metadata
    (#39888)
  * [C++][Gandiva] Accept LLVM 18 (#39934)
  * [C++] Use Requires instead of Libs for system RE2 in arrow.pc
    (#39932)
  * [C++] Small CSV reader refactoring (#39963)
  * [C++][Parquet] Expand BYTE_STREAM_SPLIT to support
    FIXED_LEN_BYTE_ARRAY, INT32 and INT64 (#40094)
  * [C++][FS][Azure] Add support for reading user defined metadata
    (#40671)
  * [C++][FS][Azure] Add AzureFileSystem support to
    FileSystemFromUri() (#40325)
  * [C++][FS][Azure] Make attempted reads and writes against
    directories fail fast (#40119)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor
    (#40064)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add support for different data types (#40359)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add option to cast NULL to NaN (#40803)
  * [C++][FS][Azure] Implement DeleteFile() for flat-namespace
    storage accounts (#40075)
  * [CI][C++] Add a job on ARM64 macOS (#40456)
  * [C++][Parquet] Remove AVX512 variants of BYTE_STREAM_SPLIT
    encoding (#40127)
  * [C++][Parquet][Tools] Print FIXED_LEN_BYTE_ARRAY length
    (#40132)
  * [C++] Make S3 narrative test more flexible (#40144)
  * [C++] Remove redundant invocation of BatchesFromTable (#40173)
  * [C++][CMake] Use "RapidJSON" CMake target for RapidJSON
    (#40210)
  * [C++][CMake] Use arrow/util/config.h.cmake instead of
    add_definitions() (#40222)
  * [C++] Fix: improve the backpressure handling in the dataset
    writer (#40722)
  * [C++][CMake] Improve description why we need to initialize AWS
    C++ SDK in arrow-s3fs-test (#40229)
  * [C++] Add support for system glog 0.7 (#40275)
  * [C++] Specialize ResolvedChunk::Value on value-specific types
    instead of entire class (#40281)
  * [C++][Docs] Add documentation of array factories (#40373)
  * [C++][Parquet] Allow use of FileDecryptionProperties after the
    CryptoFactory is destroyed (#40329)
  * [FlightRPC][C++][Java][Go] Add URI scheme to reuse connection
    (#40084)
  * [C++] Add benchmark for ToTensor conversions (#40358)
  * [C++] Define ARROW_FORCE_INLINE for non-MSVC builds (#40372)
  * [C++] Add support for mold (#40397)
  * [C++] Add support for LLD (#40927)
  * [C++] Produce better error message when Move is attempted on
    flat-namespace accounts (#40406)
  * [C++][ORC] Upgrade ORC to 2.0.0 (#40508)
  * [CI][C++] Don't install FlatBuffers (#40541)
  * [C++] Ensure pkg-config flags include -ldl for static builds
    (#40578)
  * [Dev][C++][Python][R] Use pre-commit for clang-format (#40587)
  * [C++] Rename Function::is_impure() to is_pure() (#40608)
  * [C++] Add missing util/config.h in arrow/io/compressed_test.cc
    (#40625)
  * [Python][C++] Support conversion of pyarrow.RunEndEncodedArray
    to numpy/pandas (#40661)
  * [C++] Expand Substrait type support (#40696)
  * [C++] Create registry for Devices to map DeviceType to
    MemoryManager in C Device Data import (#40699)
  * [C++][Parquet] Minor enhancement code of encryption (#40732)
  * [C++][Parquet] Simplify PageWriter and ColumnWriter creation
    (#40768)
  * [C++] Re-order loads and stores in MemoryPoolStats update
    (#40647)
  * [C++] Revert changes from PR #40857 (#40980)
  * [C++] Correctly report asimd/neon in GetRuntimeInfo (#40857)
  * [C++] Thirdparty: bump zstd to 1.5.6 (#40837)
  * [Docs][C++][Python] Add initial documentation for
    RecordBatch::Tensor conversion (#40842)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add support for row-major (#40867)
  * [C++][Parquet] Encoding: Optimize DecodeArrow/Decode(bitmap)
    for PlainBooleanDecoder (#40876)
  * [C++] Suppress shorten-64-to-32 warnings in CUDA/Skyhook codes
    (#40883)
  * [C++] Fix unused function build error (#40984)
  * [C++][Parquet] RleBooleanDecoder supports DecodeArrow with
    nulls (#40995)
  * [C++][FS][Azure] Adjust
    DeleteDir/DeleteDirContents/GetFileInfoSelector behaviors
    against Azure for generic filesystem tests (#41068)
  * [C++][Parquet] Avoid allocating buffer object in RecordReader's
    SkipRecords (#39818)
- Drop apache-arrow-pr40230-glog-0.7.patch
- Drop apache-arrow-pr40275-glog-0.7-2.patch
- Belated inclusion of submission without changelog by
  Shani Hadiyanto <shanipribadi@gmail.com>
  * disable static devel packages by default: The CMake targets
    require them for all builds, if not disabled
  * Add subpackages for Apache Arrow Flight and Flight SQL
  
- Update to 16.0.0
  * [Python] construct pandas.DataFrame with public API in
    to_pandas (#40897)
  * [Python] Fix ORC test segfault in the python wheel windows test
    (#40609)
  * [Python] Attach Python stacktrace to errors in ConvertPyError
    (#39380)
  * [Python] Plug reference leaks when creating Arrow array from
    Python list of dicts (#40412)
  * [Python] Empty slicing an array backwards beyond the start is
    now empty (#40682)
  * [Python] Slicing an array backwards beyond the start now
    includes first item. (#39240)
  * [Python] Calling
    pyarrow.dataset.ParquetFileFormat.make_write_options as a class
    method results in a segfault (#40976)
  * [Python] Fix parquet import in encryption test (#40505)
  * [Python] fix raising ValueError on _ensure_partitioning
    (#39593)
  * [Python] Validate max_chunksize in Table.to_batches (#39796)
  * [C++][Python] Fix test_gdb failures on 32-bit (#40293)
  * [Python] Make Tensor.__getbuffer__ work on 32-bit platforms
    (#40294)
  * [Python] Avoid using np.take in Array.to_numpy() (#40295)
  * [Python][C++] Fix large file handling on 32-bit Python build
    (#40176)
  * [Python] Update size assumptions for 32-bit platforms (#40165)
  * [Python] Fix OverflowError in foreign_buffer on 32-bit
    platforms (#40158)
  * [Python] Add Type_FIXED_SIZE_LIST to _NESTED_TYPES set (#40172)
  * [Python] Mark ListView as a nested type (#40265)
  * [Python] only allocate the ScalarMemoTable when used (#40565)
  * [Python] Error compiling Cython files on Windows during release
    verification
  * [Python] Fix flake8 failures in python/benchmarks/parquet.py
    (#40440)
  * [Python] Suppress python/examples/minimal_build/Dockerfile.*
    warnings (#40444)
  * [Python][Docs] Add workaround for autosummary (#40739)
  * [Python] BUG: Empty slicing an array backwards beyond the start
    should be empty
  * [CI][Python] Activate ARROW_PYTHON_VENV if defined in
    sdist-test job (#40707)
  * [CI][Python] CI failures on Python builds due to pytest_cython
    (#40975)
  * [Python] ListView pandas tests should use np.nan instead of
    None (#41040)
  * [C++][Python] Sporadic asof_join failures in PyArrow
  ## New Features and Improvements
  * [Python][CI] Remove legacy hdfs tests from hdfs and hypothesis
    setup (#40363)
  * [Python] Remove deprecated pyarrow.filesystem legacy
    implementations (#39825)
  * [C++][Python] Add missing methods to RecordBatch (#39506)
  * [Python][CI] Support ORC in Windows wheels
  * [Python] Correct test marker for join_asof tests (#40666)
  * [Python] Add join_asof binding (#34234)
  * [Python] Add a function to download and extract timezone
    database on Windows (#38179)
  * [Python][CI][Packaging] Enable ORC on Windows Appveyor CI and
    Windows wheels for pyarrow
  * [Python] Add a FixedSizeTensorScalar class (#37533)
  * [Python][CI][Dev][Python] Release and merge script errors
    (#37819)" (#40150)
  * [Python] Construct pyarrow.Field and ChunkedArray through Arrow
    PyCapsule Protocol (#40818)
  * [Python] Fix missing byte_width attribute on DataType class
    (#39592)
  * [Python] Compatibility with NumPy 2.0
  * [Packaging][Python] Enable building pyarrow against numpy 2.0
    (#39557)
  * [Python] Basic pyarrow bindings for Binary/StringView classes
    (#39652)
  * [Python] Expose force_virtual_addressing in PyArrow (#39819)
  * [Python][Parquet] Support hashing for FileMetaData and
    ParquetSchema (#39781)
  * [Python] Add bindings for ListView and LargeListView (#39813)
  * [Python][Packaging] Build pyarrow wheels with numpy RC instead
    of nightly (#41097)
  * [Python] Support creating Binary/StringView arrays from python
    objects (#39853)
  * [Python] ListView support for pa.array() (#40160)
  * [Python][CI] Remove upper pin on pytest (#40487)
  * [Python][FS][Azure] Minimal Python bindings for AzureFileSystem
    (#40021)
  * [Python] Low-level bindings for exporting/importing the C
    Device Interface (#39980)
  * [Python] Add ChunkedArray import/export to/from C (#39985)
  * [Python] Use Cast() instead of CastTo (#40116)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor
    (#40064)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add support for different data types (#40359)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add option to cast NULL to NaN (#40803)
  * [Python] Support requested_schema in __arrow_c_stream__()
    (#40070)
  * [Python] Support Binary/StringView conversion to numpy/pandas
    (#40093)
  * [Python] Allow FileInfo instances to be passed to dataset init
    (#40143)
  * [Python][CI] Add 32-bit Debian build on Crossbow (#40164)
  * [Python] ListView arrow-to-pandas conversion (#40482)
  * [Python][CI] Disable generating C lines in Cython tracebacks
    (#40225)
  * [Python] Support construction of Run-End Encoded arrays in
    pa.array(..) (#40341)
  * [Python] Accept dict in pyarrow.record_batch() function
    (#40292)
  * [Python] Update for NumPy 2.0 ABI change in
    PyArray_Descr->elsize (#40418)
  * [Python][CI] Fix install of nightly dask in integration tests
    (#40378)
  * [Python] Fix byte_width for binary(0) + fix hypothesis tests
    (#40381)
  * [Python][CI] Fix dataset partition filter tests with pandas
    nightly (#40429)
  * [Docs][Python] Added JsonFileFormat to docs (#40585)
  * [Dev][C++][Python][R] Use pre-commit for clang-format (#40587)
  * [Python][C++] Support conversion of pyarrow.RunEndEncodedArray
    to numpy/pandas (#40661)
  * [Python] Simplify and improve perf of creation of the column
    names in Table.to_pandas (#40721)
  * [Docs][C++][Python] Add initial documentation for
    RecordBatch::Tensor conversion (#40842)
  * [C++][Python] Basic conversion of RecordBatch to Arrow Tensor -
    add support for row-major (#40867)
  * [CI][Python] check message in test_make_write_options_error for
    Cython 2 (#41059)
  * [Python] Add copy keyword in Array.array for numpy 2.0+
    compatibility (#41071)
  * [Python][Packaging] PyArrow wheel building is failing because
    of disabled vcpkg install of liblzma
- Drop apache-arrow-pr40230-glog-0.7.patch
- Drop apache-arrow-pr40275-glog-0.7-2.patch
- Add pyarrow-pr41319-numpy2-tests.patch gh#apache/arrow#41319

OBS-URL: https://build.opensuse.org/request/show/1170145
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=12
2024-04-25 18:50:23 +00:00
fc3315cd8b OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=32 2024-04-25 13:14:01 +00:00
d947cb7cd2 OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=31 2024-04-25 09:12:59 +00:00
c159005cc1 Accepting request 1170120 from home:bnavigator:numpy
- Update to 16.0.0
  ## Bug Fixes
  * [C++][ORC] Catch all ORC exceptions to avoid crash (#40697)
  * [C++][S3] Handle conventional content-type for directories
    (#40147)
  * [C++] Strengthen handling of duplicate slashes in S3, GCS
    (#40371)
  * [C++] Avoid hash_mean overflow (#39349)
  * [C++] Fix spelling (array) (#38963)
  * [C++][Parquet] Fix crash in Modular Encryption (#39623)
  * [C++][Dataset] Fix failures in dataset-scanner-benchmark
    (#39794)
  * [C++][Device] Fix Importing nested and string types for
    DeviceArray (#39770)
  * [C++] Use correct (non-CPU) address of buffer in
    ExportDeviceArray (#39783)
  * [C++] Improve error message for "chunker out of sync" condition
    (#39892)
  * [C++] Use make -j1 to install bundled bzip2 (#39956)
  * [C++] DatasetWriter avoid creating zero-sized batch when
    max_rows_per_file enabled (#39995)
  * [C++][CI] Disable debug memory pool for ASAN and Valgrind
    (#39975)
  * [C++][Gandiva] Make Gandiva's default cache size to be 5000 for
    object code cache (#40041)
  * [C++][FS][Azure] Fix CreateDir and DeleteDir trailing slash
    issues on hierarchical namespace accounts (#40054)
  * [C++][FS][Azure] Validate containers in
    AzureFileSystem::Impl::MovePaths() (#40086)
  * [C++] Decimal types with different precisions and scales bind

OBS-URL: https://build.opensuse.org/request/show/1170120
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=30
2024-04-25 09:07:39 +00:00
525207619b Accepting request 1163690 from home:shanipribadi
I would like to have the Apache Arrow Flight and Flight SQL libraries built.

Also disable the static build: the generated CMake targets include the static libraries, so building against libarrow would require not just apache-arrow-devel but also all of the devel-static packages.

Note: flight and flight-sql are packaged separately.
In the upstream RPM and Fedora repositories, flight-sql is included in libarrow-flight-libs.

OBS-URL: https://build.opensuse.org/request/show/1163690
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=29
2024-03-30 15:01:58 +00:00
b132f0a6a2 Accepting request 1160967 from science
- Update to 15.0.2
  ## Bug Fixes
  * [C++][Acero] Increase size of Acero TempStack (#40007)
  * [C++][Dataset] Add missing Protobuf static link dependency
    (#40015)
  * [C++] Possible data race when reading metadata of a parquet
    file (#40111)
  * [C++] Make span SFINAE standards-conforming to enable
    compilation with nvcc (#40253)
  

- Update to 15.0.2
  ## Bug Fixes
  * [Python] Fix except clauses (#40387)
  * [Python][CI] Skip failing test_dateutil_tzinfo_to_string
    (#40486) (forwarded request 1160966 from bnavigator)

OBS-URL: https://build.opensuse.org/request/show/1160967
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=11
2024-03-25 20:09:02 +00:00
8d99637b3c Accepting request 1160966 from home:bnavigator:branches:science
- Update to 15.0.2
  ## Bug Fixes
  * [C++][Acero] Increase size of Acero TempStack (#40007)
  * [C++][Dataset] Add missing Protobuf static link dependency
    (#40015)
  * [C++] Possible data race when reading metadata of a parquet
    file (#40111)
  * [C++] Make span SFINAE standards-conforming to enable
    compilation with nvcc (#40253)
  

- Update to 15.0.2
  ## Bug Fixes
  * [Python] Fix except clauses (#40387)
  * [Python][CI] Skip failing test_dateutil_tzinfo_to_string
    (#40486)

OBS-URL: https://build.opensuse.org/request/show/1160966
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=27
2024-03-23 16:14:18 +00:00
74e375d960 Accepting request 1152982 from science
OBS-URL: https://build.opensuse.org/request/show/1152982
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=10
2024-03-01 22:36:05 +00:00
f4b994c8a2 Accepting request 1152980 from home:bnavigator:branches:science
- Reenable logging
  * Add apache-arrow-pr40230-glog-0.7.patch
  * Add apache-arrow-pr40275-glog-0.7-2.patch
  * now requires glog devel files to be present for
    apache-arrow-devel; ArrowConfig.cmake fails otherwise
  * gh#apache/arrow#40181
  * gh#apache/arrow#40230
  * gh#apache/arrow#40275
- Move d:l:p:n/python-pyarrow to science/apache-arrow as a multibuild package: it uses the same source and is tightly connected.

OBS-URL: https://build.opensuse.org/request/show/1152980
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=25
2024-02-28 16:27:53 +00:00
d8d03abd38 Accepting request 1150089 from science
OBS-URL: https://build.opensuse.org/request/show/1150089
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=9
2024-02-25 13:06:15 +00:00
b029d62e8c Accepting request 1150081 from home:bnavigator:branches:science
- Update to 15.0.1
  ## Bug Fixes
  * [C++] "iso_calendar" kernel returns incorrect results for array
    length > 32 (#39360)
  * [C++] Explicit error in ExecBatchBuilder when appending var
    length data exceeds offset limit (int32 max) (#39383)
  * [C++][Parquet] Pass memory pool to decoders (#39526)
  * [C++][Parquet] Validate page sizes before truncating to int32
    (#39528)
  * [C++] Fix tail-word access cross buffer boundary in
    `CompareBinaryColumnToRow` (#39606)
  * [C++] Fix the issue of ExecBatchBuilder when appending
    consecutive tail rows with the same id may exceed buffer
    boundary (for fixed size types) (#39585)
  * [Release] Update platform tags for macOS wheels to macosx_10_15
    (#39657)
  * [C++][FlightRPC] Fix nullptr dereference in PollInfo (#39711)
  * [C++] Fix tail-byte access cross buffer boundary in key hash
    avx2 (#39800)
  * [C++][Acero] Fix AsOfJoin with differently ordered schemas than
    the output (#39804)
  * [C++] Expression ExecuteScalarExpression execute empty args
    function with a wrong result (#39908)
  * [C++] Strip extension metadata when importing a registered
    extension (#39866)
  * [C#] Restore support for .NET 4.6.2 (#40008)
  * [C++] Fix out-of-line data size calculation in
    BinaryViewBuilder::AppendArraySlice (#39994)
  * [C++][CI][Parquet] Fixing parquet column_writer_test building
    (#40175)
  ## New Features and Improvements
  * [C++] PollFlightInfo does not follow rule of 5
  * [C++] Fix filter and take kernel for month_day_nano intervals
    (#39795)
  * [C++] Thirdparty: Bump zlib to 1.3.1 (#39877)
  * [C++] Add missing "#include <algorithm>" (#40010)
- Release 15.0.0
  ## Bug Fixes
  * [C++] Bring back case_when tests for union types (#39308)
  * [C++] Fix the issue of ExecBatchBuilder when appending
    consecutive tail rows with the same id may exceed buffer
    boundary (#39234)
  * [C++][Python] Add a no-op kernel for
    dictionary_encode(dictionary) (#38349)
  * [C++] Use the latest tagged version of flatbuffers (#38192)
  * [C++] Don't use MSVC_VERSION to determine
    -fms-compatibility-version (#36595)
  * [C++] Optimize hash kernels for Dictionary ChunkedArrays
    (#38394)
  * [C++][Gandiva] Avoid registering exported functions multiple
    times in gandiva (#37752)
  * [C++][Acero] Fix race condition caused by straggling input in
    the as-of-join node (#37839)
  * [C++][Parquet] add more closed file checks for
    ParquetFileWriter (#38390)
  * [C++][FlightRPC] Add missing app_metadata arguments (#38231)
  * [C++][Parquet] Fix Valgrind memory leak in
    arrow-dataset-file-parquet-encryption-test (#38306)
  * [C++][Parquet] Don't initialize OpenSSL explicitly with OpenSSL
    1.1 (#38379)
  * [C++] Re-generate flatbuffers C++ for Skyhook (#38405)
  * [C++] Avoid passing null pointer to LZ4 frame decompressor
    (#39125)
  * [C++] Add missing explicit size_t cast for i386 (#38557)
  * [C++] Fix: add TestingEqualOptions for gtest functions.
    (#38642)
  * [C++][Gandiva] Use arrow io util to replace
    std::filesystem::path in gandiva (#38698)
  * [C++] Protect against PREALLOCATE preprocessor defined on macOS
    (#38760)
  * [C++] Check variadic buffer counts in bounds (#38740)
  * [C++][FS][Azure] Do nothing for CreateDir("/container", true)
    (#38783)
  * Fix TestArrowReaderAdHoc.ReadFloat16Files to use new
    uncompressed files (#38825)
  * [C++] S3FileSystem export s3 sdk config
    "use_virtual_addressing" to arrow::fs::S3Options (#38858)
  * [C++][Gandiva] Fix Gandiva to_date function's validation for
    suppress errors parameter (#38987)
  * [C++][Parquet] Fix spelling (#38959)
  * [C++] Fix spelling (acero) (#38961)
  * [C++] Fix spelling (compute) (#38965)
  * [C++] Fix spelling (util) (#38967)
  * [C++] Fix spelling (dataset) (#38969)
  * [C++] Fix spelling (filesystem) (#38972)
  * [C++] Fix spelling (#38978)
  * [C++] Fix spelling (#38980)
  * [C++][Acero] union node output batches should be unordered
    (#39046)
  * [C++][CI] Fix Valgrind failures (#39127)
  * [C++] Remove needless system Protobuf dependency with
    -DARROW_HDFS=ON (#39137)
  * [C++][Compute] Fix negative duration division (#39158)
  * [C++] Add missing data copy in StreamDecoder::Consume(data)
    (#39164)
  * [C++] Remove compiler warnings with -Wconversion
    -Wno-sign-conversion in public headers (#39186)
  * [C++][Benchmarking] Remove hardcoded min times (#39307)
  * [C++] Don't use "if constexpr" in lambda (#39334)
  * [C++] Disable -Werror=attributes for Azure SDK's identity.hpp
    (#39448)
  * [C++] Fix compile warning (#39389)
  * [CI][JS] Force node 20 on JS build on arm64 to fix build issues
    (#39499)
  * [C++] Disable parallelism for jemalloc external project
    (#39522)
  * [C++][Parquet] Fix crash in test_parquet_dataset_lazy_filtering
    (#39632)
  * [C++] Disable parallelism for all `make`-based externalProjects
    when CMake >= 3.28 is used
  ## New Features and Improvements
  * [C++][JSON] Change the max rows to Unlimited(int_32) (#38582)
  * [C++][Python] Add "Z" to the end of timestamp print string when
    tz defined (#39272)
  * [C++][Python] DLPack implementation for Arrow Arrays (producer)
    (#38472)
  * [C++] Diffing of Run-End Encoded arrays (#35003)
  * [C++][Python][R] Allow users to adjust S3 log level by
    environment variable (#38267)
  * [C++][Format] Implementation of the LIST_VIEW and
    LARGE_LIST_VIEW array formats (#35345)
  * [C++] Use Cast() instead of CastTo() for Scalar in test
    (#39044)
  * [C++][Python][Parquet] Implement Float16 logical type (#36073)
  * [C++] Add Utf8View and BinaryView to the c ABI (#38443)
  * [C++][Parquet] Add api to get RecordReader from RowGroupReader
    (#37003)
  * [C++] Expose a span converter for Buffer and ArraySpan (#38027)
  * [C++] Add A Dictionary Compaction Function For DictionaryArray
    (#37418)
  * [C++] Add arrow::ipc::StreamDecoder::Reset() (#37970)
  * [C++] Implement file reads for Azure filesystem (#38269)
  * [C++][Integration] Add C++ Utf8View implementation (#37792)
  * [C++][Gandiva] Add external function registry support (#38116)
  * [C++][Gandiva] Migrate LLVM JIT engine from MCJIT to ORC
    v2/LLJIT (#39098)
  * [C++] Feature: support concatenate recordbatches. (#37896)
  * [C++] Add support for specifying custom Array opening and
    closing delimiters to arrow::PrettyPrintDelimiters (#38187)
  * [R] Allow code() to return package name prefix. (#38144)
  * [C++][Benchmark] Add non-stream Codec Compression/Decompression
    (#38067)
  * [C++][Parquet] Change DictEncoder dtor checking to warning log
    (#38118)
  * [C++][Parquet] Support reading parquet files with multiple gzip
    members (#38272)
  * [C++][Parquet] check the decompressed page size same as size in
    page header (#38327)
  * [C++][Azure] Use properties for input stream metadata (#38524)
  * [C++][FS][Azure] Implement file writes (#38780)
  * [C++] Implement GetFileInfo for a single file in Azure
    filesystem (#38505)
  * [C++][CMake] Use transitive dependency for system GoogleTest
    (#38340)
  * [C++][Parquet] Use new encrypted files for page index
    encryption test (#38347)
  * Add validation logic for offsets and values to
    arrow.array.ListArray.fromArrays (#38531)
  * [C++][Acero] Create a sorted merge node (#38380)
  * [C++][Benchmark] Adding benchmark for LZ4/Snappy Compression
    (#38453)
  * [C++] Support LogicalNullCount for DictionaryArray (#38681)
  * [C++][Parquet] Faster scalar BYTE_STREAM_SPLIT (#38529)
  * [C++][Gandiva] Support registering external C functions
    (#38632)
  * [C++] Implement GetFileInfo(selector) for Azure filesystem
    (#39009)
  * [C++][FS][Azure] Implement CreateDir() (#38708)
  * [C++][FS][Azure] Implement DeleteDir() (#38793)
  * [C++][FS][Azure] Implement DeleteDirContents() (#38888)
  * [C++] Implement AzureFileSystem::DeleteRootDirContents
    (#39151)
  * [C++][FS][Azure] Implement CopyFile() (#39058)
  * [C++][Go][Parquet] Add tests for reading Float16 files in
    parquet-testing (#38753)
  * [C++][FS][Azure] Rename AzurePath to AzureLocation (#38773)
  * [C++] Implement directory semantics even when the storage
    account doesn't support HNS (#39361)
  * [C++][Parquet] Update parquet.thrift to sync with 2.10.0
    (#38815)
  * [C++] Replace "#ifdef ARROW_WITH_GZIP" in dataset test to
    ARROW_WITH_ZLIB (#38853)
  * [C++][Parquet] Using length to optimize bloom filter read
    (#38863)
  * [C++][Parquet] Minor: making parquet TypedComparator operation
    as const method (#38875)
  * [C++] DatasetWriter release rows_in_flight_throttle when
    allocate writing failed (#38885)
  * [C++][Parquet] Move EstimatedBufferedValueBytes from
    TypedColumnWriter to ColumnWriter (#39055)
  * [C++] Stop installing internal bpacking_simd* headers (#38908)
  * [C++][Gandiva] Refactor function holder to return arrow Result
    (#38873)
  * [C++] Use Cast() instead of CastTo() for Dictionary Scalar in
    test (#39362)
  * [C++] Use Cast() instead of CastTo() for Timestamp Scalar in
    test (#39060)
  * [C++] Use Cast() instead of CastTo() for List Scalar in test
    (#39353)
  * [C++][Parquet] Support row group filtering for nested paths for
    struct fields (#39065)
  * [C++] Refactor the Azure FS tests and filesystem class
    instantiation (#39207)
  * [C++][Parquet] Optimize FLBA record reader (#39124)
  * Create module info compiler plugin (#39135)
  * [C++] Try to make Buffer::device_type_ non-optional (#39150)
  * [C++][Parquet] Remove deprecated AppendRowGroup(int64_t
    num_rows) (#39209)
  * [C++][Parquet] Avoid WriteRecordBatch from produce zero-sized
    RowGroup (#39211)
  * [C++] Support binary to fixed_size_binary cast (#39236)
  * [C++][Azure][FS] Add default credential auth configuration
    (#39263)
  * [C++] Don't install bundled Azure SDK for C++ with CMake 3.28+
    (#39269)
  * [C++][FS] Remove the AzureBackend enum and add more flexible
    connection options (#39293)
  * [C++][FS] Inform caller of container not-existing when
    checking for HNS support (#39298)
  * [C++][FS][Azure] Add workload identity auth configuration
    (#39319)
  * [C++][FS][Azure] Add managed identity auth configuration
    (#39321)
  * [C++] Forward arguments to ExceptionToStatus all the way to
    Status::FromArgs (#39323)
  * [C++] Flaky DatasetWriterTestFixture.MaxRowsOneWriteBackpresure
    test (#39379)
  * [C++] Add ForceCachedHierarchicalNamespaceSupport to help with
    testing (#39340)
  * [C++][FS][Azure] Add client secret auth configuration (#39346)
  * [C++] Reduce function.h includes (#39312)
  * [C++] Use Cast() instead of CastTo() for Parquet (#39364)
  * [C++][Parquet] Vectorize decode plain on FLBA (#39414)
  * [C++][Parquet] Style: Using arrow::Buffer data_as api rather
    than reinterpret_cast (#39420)
  * [C++][ORC] Upgrade ORC to 1.9.2 (#39431)
  * [C++] Use default Azure credentials implicitly and support
    anonymous credentials explicitly (#39450)
  * [C++][Parquet] Allow reading dictionary without reading data
    via ByteArrayDictionaryRecordReader (#39153)
- Disable logging until compatibility with glog is restored
  gh#apache/arrow#40181

OBS-URL: https://build.opensuse.org/request/show/1150081
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=23
2024-02-24 09:07:04 +00:00
78e62c5074 Accepting request 1139093 from science
OBS-URL: https://build.opensuse.org/request/show/1139093
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=8
2024-01-16 20:38:38 +00:00
40e5983a49 Accepting request 1139092 from home:bnavigator:branches:science
- Update to 14.0.2
  ## New Features and Improvements
  * GH-38449 - [Release][Go][macOS] Use local test data if possible
    (#38450)
  * GH-38591 - [Parquet][C++] Remove redundant open calls in
    ParquetFileFormat::GetReaderAsync (#38621)
  ## Bug Fixes
  * GH-38345 - [Release] Use local test data for verification if
    possible (#38362)
  * GH-38438 - [C++] Dataset: Trying to fix the async bug in
    Parquet dataset (#38466)
  * GH-38577 - Reading parquet file behavior change from 13.0.0 to
    14.0.0
  * GH-38618 - [C++] S3FileSystem: fix regression in deleting
    explicitly created sub-directories (#38845)
  * GH-38861 - [C++] Add missing “-framework Security” to
    Libs.private in arrow.pc (#38869)
  * GH-39072 - [Release][CI] Python3.11-devel is required for the
    verification job on AlmaLinux 8 (#39073)
  * GH-39074 - [Release][Packaging] Use UTF-8 explicitly for KEYS
    (#39082)

OBS-URL: https://build.opensuse.org/request/show/1139092
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=21
2024-01-16 09:00:47 +00:00
5938d9209e Accepting request 1138300 from science
OBS-URL: https://build.opensuse.org/request/show/1138300
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=7
2024-01-12 22:46:14 +00:00
6b4b71e17d Accepting request 1138181 from home:pgajdos
- disable some tests for s390x [bsc#1218592]

OBS-URL: https://build.opensuse.org/request/show/1138181
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=19
2024-01-12 11:03:12 +00:00
39fe80b539 Accepting request 1125775 from science
OBS-URL: https://build.opensuse.org/request/show/1125775
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=6
2023-11-14 20:42:29 +00:00
John Vandenberg
59b113ad72 Accepting request 1125774 from home:mimi_vx:branches:science
- update to 14.0.1
 * GH-38431 - [Python][CI] Update fs.type_name checks for s3fs tests
 * GH-38607 - [Python] Disable PyExtensionType autoload
- update to 14.0.1
 * very long list of changes can be found here:
 https://arrow.apache.org/release/14.0.0.html

OBS-URL: https://build.opensuse.org/request/show/1125774
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=17
2023-11-14 01:23:03 +00:00
110bca2ab1 Accepting request 1109686 from science
OBS-URL: https://build.opensuse.org/request/show/1109686
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=5
2023-09-08 19:16:00 +00:00
0d83feb674 Accepting request 1109685 from home:bnavigator:branches:devel:languages:python:numeric
- Update to 13.0.0
  ## Acero
  * Handling of unaligned buffers in input nodes can be configured
    programmatically or by setting the environment variable
    ACERO_ALIGNMENT_HANDLING. The default behavior is to warn when
    an unaligned buffer is detected GH-35498.
  ## Compute
  * Several new functions have been added:
    - aggregate functions “first”, “last”, “first_last” GH-34911;
    - vector functions “cumulative_prod”, “cumulative_min”,
      “cumulative_max” GH-32190;
    - vector function “pairwise_diff” GH-35786.
  * Sorting now works on dictionary arrays, with a much better
    performance than the naive approach of sorting the decoded
    dictionary GH-29887. Sorting also works on struct arrays, and
    nested sort keys are supported using FieldRef GH-33206.
  * The check_overflow option has been removed from
    CumulativeSumOptions as it was redundant with the availability
    of two different functions: “cumulative_sum” and
    “cumulative_sum_checked” GH-35789.
  * Run-end encoded filters are efficiently supported GH-35749.
  * Duration types are supported with the “is_in” and “index_in”
    functions GH-36047. They can be multiplied with all integer
    types GH-36128.
  * “is_in” and “index_in” now cast their inputs more flexibly:
    they first attempt to cast the value set to the input type,
    then in the other direction if the former fails GH-36203.
  * Multiple bugs have been fixed in “utf8_slice_codeunits” when
    the stop option is omitted GH-36311.
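
The split between "cumulative_sum" and "cumulative_sum_checked" noted above can be pictured with a plain-Python model. This is only a sketch of the semantics, assuming int64 inputs and no nulls. It does not call Arrow, and the helper names are hypothetical stand-ins for the two compute functions.

```python
# Plain-Python model (not the Arrow API) of an unchecked versus a checked
# running sum over int64 values. Nulls and other data types are not modelled.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def _wrap_int64(value):
    """Reduce an arbitrary Python int to the two's-complement int64 range."""
    return (value - INT64_MIN) % 2**64 + INT64_MIN


def cumulative_sum_wrapping(values):
    """Unchecked variant: an overflowing total wraps around silently."""
    total, out = 0, []
    for v in values:
        total = _wrap_int64(total + v)
        out.append(total)
    return out


def cumulative_sum_checked(values):
    """Checked variant: an overflowing total raises instead of wrapping."""
    total, out = 0, []
    for v in values:
        total += v
        if not INT64_MIN <= total <= INT64_MAX:
            raise OverflowError("int64 overflow in cumulative sum")
        out.append(total)
    return out
```

Because the caller picks the behavior by choosing the function, a separate check_overflow flag on the options object carries no extra information, which matches the reason given for its removal.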
  ## Dataset
  * A custom schema can now be passed when writing a dataset
    GH-35730. The custom schema can alter nullability or metadata
    information, but is not allowed to change the datatypes
    written.
  ## Filesystems
  * The S3 filesystem now writes files in equal-sized chunks, for
    compatibility with Cloudflare’s “R2” Storage GH-34363.
  * A long-standing issue where S3 support could crash at shutdown
    because of resources still being alive after S3 finalization
    has been fixed GH-36346. Now, attempts to use S3 resources
    (such as making filesystem calls) after S3 finalization should
    result in a clean error.
  * The GCS filesystem accepts a new option to set the project id
    GH-36227.
  ## IPC
  * Nullability and metadata information for sub-fields of map
    types is now preserved when deserializing Arrow IPC GH-35297.
  ## Orc
  * The Orc adapter now maps Arrow field metadata to Orc type
    attributes when writing, and vice-versa when reading GH-35304.
  ## Parquet
  * It is now possible to write additional metadata while a
    ParquetFileWriter is open GH-34888.
  * Writing a page index can be enabled selectively per-column
    GH-34949. In addition, page header statistics are not written
    anymore if the page index is enabled for the given column
    GH-34375, as the information would be redundant and less
    efficiently accessed.
  * Parquet writer properties allow specifying the sorting columns
    GH-35331. The user is responsible for ensuring that the data
    written to the file actually complies with the given sorting.
  * CRC computation has been implemented for v2 data pages
    GH-35171. It was already implemented for v1 data pages.
  * Writing compliant nested types is now enabled by default
    GH-29781. This should not have any negative implication.
  * Attempting to load a subset of an Arrow extension type is now
    forbidden GH-20385. Previously, if an extension type’s storage
    is nested (for example a “Point” extension type backed by a
    struct<x: float64, y: float64>), it was possible to load
    selectively some of the columns of the storage type.
  ## Substrait
  * Support for various functions has been added: “stddev”,
    “variance”, “first”, “last” (GH-35247, GH-35506).
  * Deserializing sorts is now supported GH-32763. However, some
    features, such as clustered sort direction or custom sort
    functions, are not implemented.
  ## Miscellaneous
  * FieldRef sports additional methods to get a flattened version
    of nested fields GH-14946. Compared to their non-flattened
    counterparts, the methods GetFlattened, GetAllFlattened,
    GetOneFlattened and GetOneOrNoneFlattened combine a child’s
    null bitmap with its ancestors’ null bitmaps such as to compute
    the field’s overall logical validity bitmap.
  * In other words, given the struct array [null, {'x': null},
    {'x': 5}], FieldRef("x")::Get might return [0, null, 5] while
    FieldRef("x")::GetFlattened will always return [null, null, 5].
  * Scalar::hash() has been fixed for sliced nested arrays
    GH-35360.
  * A new floating-point to decimal conversion algorithm exhibits
    much better precision GH-35576.
  * It is now possible to cast between scalars of different
    list-like types GH-36309.
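
The difference between the flattened and non-flattened FieldRef lookups described under Miscellaneous can be sketched in plain Python. This is a model of the validity-bitmap logic only, not the Arrow C++ API. The function names are hypothetical, and the child value 0 in the first slot is an assumed placeholder for whatever happens to sit under a null parent.

```python
# Plain-Python model of combining a child's validity with its parent's.
# Struct array from the note above: [null, {'x': null}, {'x': 5}]


def get_child(child_values, child_valid):
    """Non-flattened lookup: only the child's own validity is applied."""
    return [v if ok else None for v, ok in zip(child_values, child_valid)]


def get_child_flattened(child_values, child_valid, parent_valid):
    """Flattened lookup: child validity is ANDed with the parent's validity."""
    return [
        v if (ok and parent_ok) else None
        for v, ok, parent_ok in zip(child_values, child_valid, parent_valid)
    ]


parent_valid = [False, True, True]  # the first struct slot is null
x_values = [0, 0, 5]                # slots 0 and 1 hold placeholder values
x_valid = [True, False, True]       # the child is null only in slot 1

plain = get_child(x_values, x_valid)
flattened = get_child_flattened(x_values, x_valid, parent_valid)
```

Here `plain` exposes the placeholder under the null parent, while `flattened` reports that slot as null, which is the logical validity a reader of the struct would expect.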

OBS-URL: https://build.opensuse.org/request/show/1109685
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=15
2023-09-08 07:18:56 +00:00
ad607e3932 Accepting request 1092627 from science
OBS-URL: https://build.opensuse.org/request/show/1092627
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=4
2023-06-13 14:09:16 +00:00
cd7a2c42f0 Accepting request 1092619 from home:bnavigator:pyarrow
- Update to 12.0.1
  * [GH-35423] - [C++][Parquet] Parquet PageReader Force
    decompression buffer resize smaller (#35428)
  * [GH-35498] - [C++] Relax EnsureAlignment check in Acero from
    requiring 64-byte aligned buffers to requiring value-aligned
    buffers (#35565)
  * [GH-35519] - [C++][Parquet] Fixing exception handling in parquet
    FileSerializer (#35520)
  * [GH-35538] - [C++] Remove unnecessary status.h include from
    protobuf (#35673)
  * [GH-35730] - [C++] Add the ability to specify custom schema on a
    dataset write (#35860)
  * [GH-35850] - [C++] Don't disable optimization with
    RelWithDebInfo (#35856)
- Drop cflags.patch -- fixed upstream

OBS-URL: https://build.opensuse.org/request/show/1092619
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=13
2023-06-12 15:49:46 +00:00
b96a9994bf Accepting request 1087840 from science
- Update to 12.0.0
  * Run-End Encoded Arrays have been implemented and are accessible
    (GH-32104)
  * The FixedShapeTensor Logical value type has been implemented
    using ExtensionType (GH-15483, GH-34796)
  ## Compute
  * New kernel to convert timestamp with timezone to wall time
    (GH-33143)
  * Cast kernels are now built into libarrow by default (GH-34388)
  ## Acero
  * Acero has been moved out of libarrow into its own shared
    library, allowing for smaller builds of the core libarrow
    (GH-15280)
  * Exec nodes now can have a concept of “ordering” and will reject
    non-sensible plans (GH-34136)
  * New exec nodes: “pivot_longer” (GH-34266), “order_by”
    (GH-34248) and “fetch” (GH-34059)
  * Breaking Change: Reorder output fields of “group_by” node so
    that keys/segment keys come before aggregates (GH-33616)
  ## Substrait
  * Add support for the round function GH-33588
  * Add support for the cast expression element GH-31910
  * Added API reference documentation GH-34011
  * Added an extension relation to support segmented aggregation
    GH-34626
  * The output of the aggregate relation now conforms to the spec
    GH-34786
  ## Parquet
  * Added support for DeltaLengthByteArray encoding to the Parquet
    writer (GH-33024) (forwarded request 1087839 from bnavigator)

OBS-URL: https://build.opensuse.org/request/show/1087840
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=3
2023-05-19 09:55:41 +00:00
f0e79bb038 Accepting request 1087839 from home:bnavigator:pyarrow
- Update to 12.0.0
  * Run-End Encoded Arrays have been implemented and are accessible
    (GH-32104)
  * The FixedShapeTensor Logical value type has been implemented
    using ExtensionType (GH-15483, GH-34796)
  ## Compute
  * New kernel to convert timestamp with timezone to wall time
    (GH-33143)
  * Cast kernels are now built into libarrow by default (GH-34388)
  ## Acero
  * Acero has been moved out of libarrow into its own shared
    library, allowing for smaller builds of the core libarrow
    (GH-15280)
  * Exec nodes now can have a concept of “ordering” and will reject
    non-sensible plans (GH-34136)
  * New exec nodes: “pivot_longer” (GH-34266), “order_by”
    (GH-34248) and “fetch” (GH-34059)
  * Breaking Change: Reorder output fields of “group_by” node so
    that keys/segment keys come before aggregates (GH-33616)
  ## Substrait
  * Add support for the round function GH-33588
  * Add support for the cast expression element GH-31910
  * Added API reference documentation GH-34011
  * Added an extension relation to support segmented aggregation
    GH-34626
  * The output of the aggregate relation now conforms to the spec
    GH-34786
  ## Parquet
  * Added support for DeltaLengthByteArray encoding to the Parquet
    writer (GH-33024)

OBS-URL: https://build.opensuse.org/request/show/1087839
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=11
2023-05-18 17:02:09 +00:00
41d9d0fb5f Accepting request 1076956 from science
OBS-URL: https://build.opensuse.org/request/show/1076956
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=2
2023-04-03 15:47:02 +00:00
5313afc3ac Accepting request 1076954 from home:Andreas_Schwab:Factory
- cflags.patch: fix option order to compile with optimisation
- Adjust constraints

OBS-URL: https://build.opensuse.org/request/show/1076954
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=9
2023-04-03 12:19:46 +00:00
cb5c8049c6 Accepting request 1075538 from science
second try: now without jemalloc and without gflags-static

apache-arrow is being used more and more by python numeric packages like pandas 2.0 (through pyarrow)

OBS-URL: https://build.opensuse.org/request/show/1075538
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/apache-arrow?expand=0&rev=1
2023-03-31 19:15:18 +00:00
John Vandenberg
d957479563 Accepting request 1075316 from home:bnavigator:branches:science
- Remove gflags-static. It was only needed due to a packaging error
  with gflags which is about to be fixed in Tumbleweed
- Disable build of the jemalloc memory pool backend
  * It requires every consuming application to LD_PRELOAD
    libjemalloc.so.2, even when it is not set as the default memory
    pool, due to static TLS block allocation errors
  * Usage of the bundled jemalloc as a workaround is not desired
    (gh#apache/arrow#13739)
  * jemalloc does not seem to have a clear advantage over the
    system glibc allocator:
    https://ursalabs.org/blog/2021-r-benchmarks-part-1
  * This overrides the default behavior documented in
    https://arrow.apache.org/docs/cpp/memory.html#default-memory-pool
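
Consumers that want a specific allocator rather than the packaged default can request one through the ARROW_DEFAULT_MEMORY_POOL environment variable described on the linked memory page. The following is a minimal sketch, assuming the documented backend names "jemalloc", "mimalloc" and "system". The helper name is hypothetical, and how this build reacts to a request for the disabled jemalloc backend is not covered here.

```python
import os

# Backend names listed for ARROW_DEFAULT_MEMORY_POOL in the Arrow C++ docs.
KNOWN_BACKENDS = ("jemalloc", "mimalloc", "system")


def env_with_memory_pool(backend, base_env=None):
    """Return a copy of the environment that requests the given pool backend.

    Hypothetical helper: it only prepares the environment for a child
    process and does not import or configure Arrow itself.
    """
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown memory pool backend: {backend!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["ARROW_DEFAULT_MEMORY_POOL"] = backend
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching a program linked against libarrow.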

OBS-URL: https://build.opensuse.org/request/show/1075316
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=7
2023-03-29 20:39:22 +00:00
John Vandenberg
ba553e9510 Accepting request 1074321 from home:bnavigator:pyarrow
update to 11.0

OBS-URL: https://build.opensuse.org/request/show/1074321
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=6
2023-03-28 08:42:12 +00:00
880ac17313 Accepting request 1001057 from home:StefanBruens:branches:science
- Revert ccache change, using ccache in a pristine buildroot
  just slows down OBS builds (use --ccache for local builds).
- Remove unused gflags-static-devel dependency.

OBS-URL: https://build.opensuse.org/request/show/1001057
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=5
2022-09-07 13:19:56 +00:00
John Vandenberg
3d9efad54e Accepting request 998575 from home:jayvdb:pyarrow
- Speed up builds with ccache

OBS-URL: https://build.opensuse.org/request/show/998575
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=4
2022-08-22 07:46:52 +00:00
John Vandenberg
34d29c598e Accepting request 994163 from home:StefanBruens:branches:science
- Update to v9.0.0
  No (current) changelog provided
- Spec file cleanup:
  * Remove lots of duplicate, unused, or wrong build dependencies
  * Do not package outdated Readmes and Changelogs
- Enable tests, disable ones requiring external test data

OBS-URL: https://build.opensuse.org/request/show/994163
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=3
2022-08-10 03:06:39 +00:00
John Vandenberg
62106ae456 OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=2
2020-11-18 11:41:14 +00:00
b913299c7b Accepting request 849131 from home:jayvdb:branches:science
- Update to v2.0.0

OBS-URL: https://build.opensuse.org/request/show/849131
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=1
2020-11-18 08:11:49 +00:00
17 changed files with 1501 additions and 387 deletions

@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:423eb4c1d6dbbcb7ca429d548e94f8a99cd4603bc023de9c0578d1950ce0f21d
size 21350177

@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8379554d89f19f2c8db63620721cabade62541f47a4e706dfb0a401f05a713ef
size 21478486

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e92401790fdba33bfb4b8aa522626d800ea7fda4b6f036aaf39849927d2cf88d
size 17241418

@@ -1,3 +1,962 @@
-------------------------------------------------------------------
Fri Sep 26 16:52:42 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Remove boost::system dependency for Tumbleweed
* Add arrow-boost-system-1.89-boo1249599.patch
* gh#boostorg/system#132
* boo#1249599
-------------------------------------------------------------------
Thu Sep 25 10:24:04 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Update to 21.0.0
## Bug Fixes
* GH-32276 - [C++][FlightRPC] Add option to align RecordBatch
buffers given to IPC reader (#44279)
* GH-35166 - [C++][Compute] Increase precision of decimals in sum
aggregates (#44184)
* GH-40756 - [C++] Remove dead Boost urls (#46452)
* GH-45532 - [C++] RunEndEncodedBuilder should clear dimensions
after a Finish() call (#45533)
* GH-45534 - [C++] Test: RunEndEncodeTableColumns should update
REE columns' schema types (#45535)
* GH-45608 - [C++][Flight] Fix compilation for clang (#46264)
* GH-45735 - [C++] Broken tests for extract_regex compute function
(#45900)
* GH-45853 - [C++][Dev] Fix Meson compilation issues in Docker
builds (#45858)
* GH-46011 - [C++] Hide DCHECK family from public headers
(#46015)
* GH-46025 - [C++] Use ARROW_CUDA_EXPORT instead of ARROW_EXPORT
for libarrow_cuda (#46030)
* GH-46052 - [C++][Benchmarking] Don't build grouper benchmark
without ARROW_COMPUTE=ON (#46053)
* GH-46070 - [C++] Remove duplicate storage_type in JsonExtension
(#46071)
* GH-46084 - [C++] Always use ARROW_VCPKG to detect vcpkg mode
(#46467)
* GH-46090 - [C++] Set default IPC option to enabled in Meson
(#46114)
* GH-46094 - [C++][Docs] Add note to RleDecoder::Get's doc
comment (#46874)
* GH-46146 - [C++] Merge metadata in SchemaBuilder::AddMetadata
(#46654)
* GH-46149 - [C++] Opening dataset fails with sshfs-3.7.3 due to
F_RDADVISE error (#46346)
* GH-46157 - [C++] Move test utility RunEndEncodeTableColumns
that uses REE to test_util_internal on acero instead of common
gtest_util (#46161)
* GH-46192 - [C++] Add substrait dep to third party download
script (#46191)
* GH-46197 - [C++] Tests use legacy timezones (#46201)
* GH-46214 - [C++] Improve S3 client initialization (#46723)
* GH-46224 - [C++][Acero] Fix the hang in asof join (#46300)
* GH-46231 - [C++][CMake] Fix arrow_bundled_dependencies to be
externally accessible by FetchContent (#46232)
* GH-46233 - [C++] Fix missing nested braces in QueuedTask
initialization (#46234)
* GH-46268 - [C++] Improve ArrayData docstrings (#46271)
* GH-46270 - [C++][Parquet] Clarify GeoStatistics docstring
(#46649)
* GH-46299 - [C++][Compute] Don't use static inline const for
default options (#46303)
* GH-46306 - [C++][Parquet] Should use LoadEnumSafe for geo enum
(#46307)
* GH-46314 - [C++][Parquet] Fix valgrind error when collecting
parameterized tests for MakeWKBPoint (#46320)
* GH-46326 - [C++][Parquet] Fix stack overflow in rapidjson value
comparison to integer (#46327)
* GH-46359 - [C++][Thirdparty] Bump Apache ORC to 2.1.2 (#46360)
* GH-46394 - [C++][R] gcc-UBSAN errors on CRAN (#46397)
* GH-46395 - [C++][Statistics] Use EqualOptions for min and max
in arrow::ArrayStatistics::Equals() (#46422)
* GH-46407 - [C++] Fix IPC serialization of sliced list arrays
(#46408)
* GH-46414 - [C++] Fix GCS filesystem getFileInfo method (#46416)
* GH-46417 - [C++][Parquet] Fix UB in LoadEnumSafe for
EdgeInterpolationAlgorithm (#46418)
* GH-46419 - [C++] Remove duplicate declaration and sync arg
names on acero test_util_internal functions (#45400)
* GH-46420 - [C++][Dataset] Fix DatasetWriter deadlock on
writing batch greater than max_rows_queued (#46139)
* GH-46424 - [C++][Parquet] Fix erroneous unit test skip (#46425)
* GH-46435 - [Parquet][C++] Fix uninitialized value in writer
test (#46533)
* GH-46478 - [C++] Implement recent JSON changes into Meson
configuration (#46479)
* GH-46481 - [C++][Python] Allow nullable schema in FlightInfo
(#46489)
* GH-46512 - [CI][C++] Install the llvm package explicitly on
MSYS2 (#46525)
* GH-46564 - [C++] Export ARROW_VCPKG in ArrowConfig.cmake
(#46565)
* GH-46576 - [C++] Suppress codecvt_utf8 deprecation warning
(#46622)
* GH-46589 - [C++] Fix utf8_is_digit to support full Unicode
digit range (#46590)
* GH-46599 - [C++][Doc][Parquet] Update supported types
documentation (#46620)
* GH-46611 - [Python][C++] Allow building float16 arrays without
numpy (#46618)
* GH-46623 - [C++][Compute] Fix the failure of large memory test
in arrow-compute-row-test (#46635)
* GH-46659 - [C++] Fix export of extension arrays with binary
view/string view storage (#46660)
* GH-46674 - [C++] Construct Array from ExtensionType Scalar
(#46675)
* GH-46684 - [C++] Fix Meson configuration issue on Windows
(#46685)
* GH-46704 - [C++] Fix OSS-Fuzz build failure (#46706)
* GH-46708 - [C++][Gandiva] Added zero return values for
castDECIMAL_utf8 (#46709)
* GH-46710 - [C++] Fix ownership and lifetime issues in Dataset
Writer (#46711)
* GH-46724 - [C++][Parquet] OSSFuzz: Prevent from Bad-cast in
handling statistics (#46725)
* GH-46761 - [C++] Add executable detection on FreeBSD (#46759)
* GH-46764 - [C++][Gandiva] Fix wrong .bc depends (#46765)
* GH-46777 - [C++] Use SimplifyIsIn only when the value_set of
the expression is lower than a threshold (#46859)
* GH-46811 - [C++][Python] Fix crash on
FileReaderImpl::GetRecordBatchReader (#46931)
* GH-46827 - [C++] Update Meson Configuration for compute shared
lib (#46839)
* GH-46831 - [C++][R] Remove some pending references to CMake <
3.25 (docs + minor CMake references) (#46834)
* GH-46841 - [C++][Gandiva] Fix date trunc edge case (#46842)
* GH-46863 - [CI][C++] Suppress a false positive UBSAN error in
AWS SDK for C++ (#46870)
* GH-46871 - [C++][Parquet] Restore implementation of 3
arrow::FileReader::GetRecordBatchReader() functions (#46868)
* GH-46888 - [C++] Remove override of default buildtype in Meson
config (#46919)
* GH-46915 - [C++][Compute] Initialize Compute kernels on
benchmarks that require extra kernels (#46922)
* GH-46934 - [C++][Parquet] Trying to fix ub in AttachStatistics
(#46940)
* GH-46986 - [CI][C++] Fix a build error with C++20 (#46987)
* GH-46988 - [C++][Parquet] Fix FLBA DecodeArrow multiply
overflow (#46991)
* GH-46995 - [CI][R][C++] Use system memory allocator in
sanitizer jobs (#47007)
* GH-46998 - [C++] Fix mockfs.cc compiling error with C++23
(#46999)
* GH-47015 - [CI][C++] Use mold on conda-cpp to work around
issues with GNU ld (#47028)
* GH-47033 - [C++][Compute] Never use custom gtest main with MSVC
(#47049)
* GH-47037 - [CI][C++] Fix Fedora 39 CI jobs (#47038)
## New Features and Improvements
* GH-25025 - [C++] Move non core compute kernels into separate
shared library (#46261)
* GH-26818 - [C++][Python] Preserve order when writing dataset
multi-threaded (#44470)
* GH-36753 - [C++] Properly pretty-print and diff HalfFloatArrays
(#46857)
* GH-37027 - [C++] Add float16 kernels to if-else and
vector-replace functions (#46446)
* GH-37677 - [C++][FlightRPC] Allow FlightInfo.schema to be
nullable
* GH-37891 - [C++][Parquet] Refine several classes in Parquet
encryption (#46202)
* GH-37891 - [C++] Followup Buffer change to use sptr move
(#46027)
* GH-39294 - [C++][Python] DLPack on Tensor class (#42118)
* GH-40278 - [C++] Support casting string to duration in CSV
converter (#46035)
* GH-40343 - [C++] Move S3FileSystem to the registry (#41559)
* GH-43041 - [C++][Python] Read/write Parquet BYTE_ARRAY as
Large/View types directly (#46532)
* GH-43807 - [C++][Python] Add UUID extension type conversion
support to/from Parquet (#45866)
* GH-43891 - [C++][Parquet] Faster reading of
FIXED_LEN_BYTE_ARRAY data (#46886)
* GH-45028 - [C++][Compute] Allow cast to reorder struct fields
(#45246)
* GH-45083 - [C++] Add HalfFloat kernels for is_nan, is_inf,
is_finite, negate, negate_checked, sign (#46866)
* GH-45195 - [C++] Update bundled AWS SDK for C++ to 1.11.587
(#45306)
* GH-45522 - [Parquet][C++] Parquet GEOMETRY and GEOGRAPHY
logical type implementations (#45459)
* GH-45664 - [C++] Allow
LargeString,LargeBinary,FixedSizeBinary,StringView and
BinaryView for RecordBatch::MakeStatisticsArray() (#46031)
* GH-45750 - [C++][Python][Parquet] Implement Content-Defined
Chunking for the Parquet writer (#45360)
* GH-45794 - [C++] Add array directory to Meson configuration
(#45795)
* GH-45796 - [C++] Add integration directory to Meson
configuration (#45797)
* GH-45798 - [C++] Add extension directory to Meson (#45799)
* GH-45800 - [C++] Implement util configuration in Meson (#45824)
* GH-45829 - [C++] Add compute directory to Meson configuration
(#45830)
* GH-45833 - [C++] Add JSON directory to Meson configuration
(#45834)
* GH-45865 - [C++] Create dedicated benchmark dependency in Meson
(#45909)
* GH-45908 - [C++][Docs] Rename and expose basic
{Array,...}FromJSON helpers as public APIs (#46180)
* GH-45957 - [C++][Python] Expose allow_delayed_open on
S3FileSystem (#46078)
* GH-45978 - [C++] Bump bundled mimalloc version (#45979)
* GH-45991 - [C++] Bump bundled nlohmann_json to v3.12.0 (#46112)
* GH-45992 - [C++] Bump bundled utf8proc version to 2.10.0
(#46032)
* GH-46091 - [C++] Use feature options in Meson configuration
(#46204)
* GH-46092 - [C++] Add filesystem related options to Meson
(#46101)
* GH-46104 - GH-45937: [C++][Parquet] Logical type definition for
variant
* GH-46115 - [C++] Implement compression libraries in Meson
(#46358)
* GH-46116 - [C++] Implement IPC directory in Meson (#46117)
* GH-46118 - [C++] Add tensor directory to Meson (#46119)
* GH-46132 - [C++][Parquet] Remove deprecated parquet APIs from
19.0.0 (#46133)
* GH-46141 - [C++] Add flight directory to Meson configuration
(#46142)
* GH-46153 - [C++] Implement acero directory in Meson (#46154)
* GH-46155 - [C++] Implement Tensorflow directory in Meson
(#46156)
* GH-46163 - [C++] Add vendored directory to Meson (#46164)
* GH-46196 - [C++] Remove ARROW_USE_PRECOMPILED_HEADERS and
related logic (#46200)
* GH-46207 - [C++] Rename arrow::util::StringBuilder and move to
internal namespace (#46813)
* GH-46209 - [Documentation][C++][Compute] Add cpp developer
documentation for row table (#46210)
* GH-46215 - [C++][Docs] Add README for Meson subprojects
directory (#46216)
* GH-46217 - [C++][Parquet] Update the timestamp of
parquet::encryption::TwoLevelCacheWithExpiration correctly
(#46283)
* GH-46219 - [C++][Parquet] Remove PARQUET_MINIMAL_DEPENDENCY
option (#46274)
* GH-46285 - [C++] Add support for Decimal32/64 and HalfFloat to
run_end_encode/run_end_decode (#46286)
* GH-46318 - [Docs][C++] Add Extension Array/Type documents
(#46319)
* GH-46321 - [C++][Doc] Better explain ArrayData IsValid and
GetNullCount (#46332)
* GH-46338 - [C++] Add compile step for Meson in cpp_build.sh
(#46339)
* GH-46367 - [C++] Prevent Meson from using git info if built as
subproject (#46368)
* GH-46386 - [C++] Ensure using our CMake packages not
Find*.cmake (#46387)
* GH-46388 - [C++] Check Snappy::snappy{,-static} in
FindSnappyAlt.cmake (#46389)
* GH-46396 - [C++][Documentation][Statistics] Revise the
documentation to clarify that arrow::ArrayStatistics is ignored
during arrow::Array comparisons (#46470)
* GH-46403 - [C++] Add support for limiting element size when
printing data (#46536)
* GH-46439 - [C++] Use result pattern for all FromJSONString
Helpers (#46696)
* GH-46439 - [C++] Rename internal Converter class in
from_string.cc (#46697)
* GH-46439 - [C++] Remove unneeded namespace prefix in
test_util_internal.h (#46695)
* GH-46444 - [Documentation][C++][Acero] Move internal Swiss
table doc into public C++ developer doc (#46445)
* GH-46459 - [C++] Make some arrow/util headers internal (#46721)
* GH-46462 - [C++][Parquet] Expose currently thrown
EncodedStatistics when checking is_stats_set (#46463)
* GH-46473 - [C++][Docs] Fix typos in decimal comments (#46474)
* GH-46475 - [Documentation][C++][Compute] Consolidate Acero
developer docs (#46476)
* GH-46477 - [C++] Use vendored flatbuffers in Meson
configuration (#46484)
* GH-46487 - [C++] Refactor lz4 from ExternalProject to
FetchContent (#46390)
* GH-46499 - [CI][Crossbow][C++] Use apache/arrow for Meson
(#46501)
* GH-46508 - [C++] Upgrade OpenTelemetry cpp to avoid build error
on recent Clang (#46509)
* GH-46522 - [C++][FlightRPC] Add Arrow Flight SQL ODBC driver
(#40939)
* GH-46529 - [C++] Convert static inline type trait functions to
constexpr (#46559)
* GH-46537 - [Docs][C++] Add RunEndEncodedArray, FlatArray, and
PrimitiveArray API Docs (#46540)
* GH-46551 - [C++] Use std::string_view for type schema API
(#46553)
* GH-46633 - [Docs][C++][Python] Update CombineChunks
documentation to specify that binary columns can be combined
into multiple chunks (#46638)
* GH-46665 - [CI][Crossbow][C++] Use apache/arrow for Alpine
Linux (#46666)
* GH-46676 - [C++][Python][Parquet] Allow reading Parquet LIST
data as LargeList directly (#46678)
* GH-46679 - [C++][Meson] Use WrapDB entry for gflags instead of
CMake wrapper (#46680)
* GH-46683 - [C++][Python] Add utf8_zero_fill compute function
for sign-aware zero padding (#46815)
* GH-46714 - [C++] Use hidden symbol visibility in Meson
configuration (#46715)
* GH-46740 - [C++] Update bundled Thrift
* GH-46745 - [C++] Update bundled Boost to 1.88.0 and Apache
Thrift to 0.22.0 (#46912)
* GH-46746 - [C++] Assume AWS SDK >= 1.11.0 (#46742)
* GH-46748 - [C++] Initial port on AIX (#46749)
* GH-46767 - [C++] Enable EqualOptions::use_atol_ for
arrow::Array, arrow::Scalar, arrow::RecordBatch, and
arrow::ChunkedArray (#46779)
* GH-46771 - [Python][C++] Implement pa.arange function to
generate array sequences (#46778)
* GH-46785 - [CI][Dev][C++] Suppress needless outputs of cpplint
with pre-commit (#46786)
* GH-46788 - [C++][Parquet] Enable SIMD for byte stream split
with 2 streams (#46789)
* GH-46791 - [C++] Add Status::OrElse, IntoStatus<T> and ToStatus
(#46792)
* GH-46843 - [C++] Don't use unity build for bundled AWS SDK for
C++ (#46845)
* GH-46864 - [C++] Add half-float test for ArrayFromJSONString
(#46865)
* GH-46869 - [C++][Parquet] Deprecate arrow::Status
parquet::arrow::FileReader::GetRecordBatchReader() (#46932)
* GH-47025 - [C++][Docs] Increase minimum gcc for building from
7.1 to 9 (#47026)
- Drop apache-arrow-19.0.1-mimalloc-version.patch
-------------------------------------------------------------------
Fri Jun 13 18:22:55 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Update to 20.0.0
## Bug Fixes
* GH-30302 - [C++][Parquet] Preserve the bitwidth of integer
dictionary indices on round-trip to Parquet (#45685)
* GH-31992 - [C++][Parquet] Handling the special case when
DataPageV2 values buffer is empty (#45252)
* GH-37630 - [C++][Python][Dataset] Allow disabling fragment
metadata caching (#45330)
* GH-39023 - [C++][CMake] Add missing launcher path conversion
for ExternalPackage (#45349)
* GH-43057 - [C++] Thread-safe AesEncryptor / AesDecryptor
(#44990)
* GH-45048 - [C++][Parquet] Deprecate unused chunk_size parameter
in parquet::arrow::FileWriter::NewRowGroup() (#45088)
* GH-45129 - [Python][C++] Fix usage of deprecated C++
functionality on pyarrow (#45189)
* GH-45132 - [C++][Gandiva] Update LLVM to 18.1 (#45114)
* GH-45185 - [C++][Parquet] Raise an error for invalid repetition
levels when delimiting records (#45186)
* GH-45254 - [C++][Acero] Fix the row offset truncation in row
table merge (#45255)
* GH-45266 - [C++][Acero] Fix the running tasks count of
Scheduler when get error tasks in multi-threads (#45268)
* GH-45270 - [C++][CI] Disable mimalloc on Valgrind builds
(#45271)
* GH-45301 - [C++] Change PrimitiveArray ctor to protected
(#45444)
* GH-45334 - [C++][Acero] Fix swiss join overflow issues in row
offset calculation for fixed length and null masks (#45336)
* GH-45362 - [C++] Fix identity cast for time and list scalar
(#45370)
* GH-45371 - [C++] Fix data race in SimpleRecordBatch::columns
(#45372)
* GH-45393 - [C++][Compute] Fix wrong decoding for 32-bit column
in row table (#45473)
* GH-45396 - [C++] Use Boost with ARROW_FUZZING (#45397)
* GH-45423 - [C++] Don't require Boost library with
ARROW_TESTING=ON/ARROW_BUILD_SHARED=OFF (#45424)
* GH-45497 - [C++][CSV] Avoid buffer overflow when a line has too
many columns (#45498)
* GH-45510 - [CI][C++] Fix LLVM APT repository preparation on
Debian (#45511)
* GH-45512 - [C++] Clean up undefined symbols in libarrow without
IPC (#45513)
* GH-45514 - [CI][C++][Docs] Set CUDAToolkit_ROOT explicitly in
debian-docs (#45520)
* GH-45537 - [CI][C++] Add missing includes (iwyu) to
file_skyhook.cc (#45538)
* GH-45541 - [Doc][C++] Render ASCII art as-is (#45542)
* GH-45545 - [C++][Parquet] Add missing includes (#45554)
* GH-45564 - [C++][Acero] Add size validation for names and
expressions vectors in ProjectNode (#45565)
* GH-45568 - [C++][Parquet][CMake] Enable zlib automatically when
Thrift is needed (#45569)
* GH-45578 - [C++] Use max not min in
MakeStatisticsArrayMaxApproximate test (#45579)
* GH-45587 - [C++][Docs] Fix the statistics schema link in
arrow::RecordBatch::MakeStatisticsArray()'s docstring (#45588)
* GH-45614 - [C++] Use Boost's CMake packages instead of
FindBoost.cmake in CMake (#45623)
* GH-45628 - [C++] Ensure specifying Boost include directory for
bundled Thrift (#45637)
* GH-45669 - [C++][Parquet] Add missing
ParquetFileReader::GetReadRanges() definition (#45684)
* GH-45693 - [C++][Gandiva] Fix aes_encrypt/decrypt algorithm
selection (#45695)
* GH-45700 - [C++][Compute] Added nullptr check in Equals method
to handle null impl_ pointers (#45701)
* GH-45733 - [C++][Python] Add biased/unbiased toggle to skew and
kurtosis functions (#45762)
* GH-45739 - [C++][Python] Fix crash when calling
hash_pivot_wider without options (#45740)
* GH-45788 - [C++][Acero] Fix data race in aggregate node
(#45789)
* GH-45868 - [C++][CI] Fix test for ambiguous initialization on
C++ 20 (#45871)
* GH-45905 - [C++][Acero] Enlarge the timeout in ConcurrentQueue
test to reduce sporadic failures (#45923)
* GH-45930 - [C++] Don't use ICU C++ API in Azure SDK C++
(#45952)
* GH-45939 - [C++][Benchmarking] Fix compilation failures
(#45942)
* GH-45959 - [C++][CMake] Fix Protobuf dependency in
Arrow::arrow_static (#45960)
* GH-45980 - [C++] Bump Bundled Snappy version to 1.2.2 (#45981)
* GH-45999 - [C++][Gandiva] Fix crashes on LLVM 20.1.1 (#46000)
* GH-46022 - [C++] Fix build error with g++ 7.5.0 (#46028)
* GH-46067 - [CI][C++] Remove system Flatbuffers from macOS
(#46105)
* GH-46077 - [CI][C++] Disable -Werror on macos-13 (#46106)
* GH-46111 - [C++][CI] Fix boost 1.88 on MinGW (#46113)
* GH-46123 - [C++] Undefined behavior in compare_internal.cc and
light_array_internal.cc (#46124)
* GH-46134 - [CI][C++] Explicit conversion of possible
absl::string_view on protobuf methods to std::string (#46136)
* GH-46159 - [CI][C++] Stop using possibly missing
boost/process/v2.hpp on boost 1.88 and use individual includes
(#46160)
* GH-46195 - [Release][C++] verify-rc-source-cpp-macos-amd64
failed to build googlemock
## New Features and Improvements
* GH-26648 - [C++] Optimize union equality comparison (#45384)
* GH-33592 - [C++] support casting nullable fields to
non-nullable if there are no null values (#43782)
* GH-41764 - [Parquet][C++] Support future logical types in the
Parquet reader (#41765)
* GH-41816 - [C++] Add Minimal Meson Build of libarrow (#45441)
* GH-43296 - [C++][FlightRPC] Remove Flight UCX transport
(#43297)
* GH-43573 - [C++] Copy bitmap when casting from string-view to
offset string and binary types (#44822)
* GH-44042 - [C++][Parquet] Limit num-of row-groups when building
parquet for encrypted file (#44043)
* GH-44393 - [C++][Compute] Vector selection functions
inverse_permutation and scatter (#44394)
* GH-44615 - [C++][Compute] Add extract_regex_span function
(#45577)
* GH-44629 - [C++][Acero] Use implicit_ordering for asof_join
rather than require_sequenced_output (#44616)
* GH-44950 - [C++] Bump minimum CMake version to 3.25 (#44989)
* GH-45045 - [C++][Parquet] Add a benchmark for
size_statistics_level (#45085)
* GH-45190 - [C++][Compute] Add rank_quantile function (#45259)
* GH-45196 - [C++][Acero] Small refinement to hash join (#45197)
* GH-45206 - [C++][CMake] Add sanitizer presets (#45207)
* GH-45209 - [C++][CMake] Fix the issue that allocator not
disabled for sanitizer cmake presets (#45210)
* GH-45215 - [C++][Acero] Export SequencingQueue and
SerialSequencingQueue (#45221)
* GH-45216 - [C++][Compute] Refactor Rank implementation (#45217)
* GH-45219 - [C++][Examples] Update examples to disable mimalloc
(#45220)
* GH-45225 - [C++] Upgrade ORC to 2.1.0 (#45226)
* GH-45227 - [C++][Parquet] Enable Size Stats and Page Index by
default (#45249)
* GH-45269 - [C++][Compute] Add “pivot_wider” and
“hash_pivot_wider” functions (#45562)
* GH-45279 - [C++][Compute] Move all Grouper tests to
grouper_test.cc (#45280)
* GH-45344 - [C++][Testing] Generic StepGenerator (#45345)
* GH-45358 - [C++][Python] Add MemoryPool method to print
statistics (#45359)
* GH-45361 - [CI][C++] Curate ci/vcpkg/vcpkg.json (#45081)
* GH-45366 - [C++][Parquet] Set is_compressed to false when data
page v2 is not compressed (#45367)
* GH-45416 - [CI][C++][Homebrew] Backport the latest formula
changes (#45460)
* GH-45478 - [CI][C++] Drop support for Ubuntu 20.04 (#45519)
* GH-45506 - [C++][Acero] More overflow-safe Swiss table (#45515)
* GH-45551 - [C++][Acero] Release temp states of Swiss join
building hash table to reduce memory consumption (#45552)
* GH-45563 - [C++][Compute] Split up hash_aggregate.cc (#45725)
* GH-45566 - [C++][Parquet][CMake] Remove a workaround for
Windows in FindThriftAlt.cmake (#45567)
* GH-45572 - [C++][Compute] Add rank_normal function (#45573)
* GH-45584 - [C++][Thirdparty] Bump zstd to v1.5.7 (#45585)
* GH-45589 - [C++] Enable singular test in Meson configuration
(#45596)
* GH-45591 - [C++][Acero] Refine hash join benchmark and remove
openmp from the project (#45593)
* GH-45605 - [R][C++] Fix identifier … preceded by whitespace
warnings (#45606)
* GH-45611 - [C++][Acero] Improve Swiss join build performance by
partitioning batches ahead to reduce contention (#45612)
* GH-45620 - [CI][C++] Use Visual Studio 2022 not 2019 (#45621)
* GH-45652 - [C++][Acero] Unify ConcurrentQueue and
BackpressureConcurrentQueue API (#45421)
* GH-45676 - [C++][Python][Compute] Add skew and kurtosis
functions (#45677)
* GH-45680 - [C++][Python] Remove deprecated functions in 20.0
* GH-45689 - [C++][Thirdparty] Bump Apache ORC to 2.1.1 (#45600)
* GH-45694 - [C++] Bump vendored flatbuffers to 24.3.6 (#45687)
* GH-45696 - [C++][Gandiva] Accept LLVM 20.1 (#45697)
* GH-45732 - [C++][Compute] Accept more pivot key types (#45945)
* GH-45744 - [C++] Remove deprecated GetNextSegment (#45745)
* GH-45746 - [C++] Remove deprecated functions in 20.0 (C++
subset) (#45748)
* GH-45755 - [C++][Python][Compute] Add winsorize function
(#45763)
* GH-45771 - [C++] Add tests to top level Meson configuration
(#45773)
* GH-45772 - [C++] Export Arrow as dependency from Meson
configuration (#45774)
* GH-45775 - [C++] Use dict.get() in Meson configuration (#45776)
* GH-45779 - [C++] Add testing directory to Meson configuration
(#45780)
* GH-45784 - [C++] Unpin LLVM and OpenSSL in Brewfile (#45785)
* GH-45792 - [C++] Add benchmarks to Meson configuration (#45793)
* GH-45816 - [C++] Make VisitType() fallback branch unreachable
(#45815)
* GH-45820 - [C++] Add optional out_offset for Buffer-returning
CopyBitmap function (#45852)
* GH-45821 - [C++][Compute] Grouper improvements (#45822)
* GH-45825 - [C++] Add c directory to Meson configuration
(#45826)
* GH-45827 - [C++] Add io directory to Meson configuration
(#45828)
* GH-45831 - [C++] Add CSV directory to Meson configuration
(#45832)
* GH-45848 - [C++][Python][R] Remove deprecated PARQUET_2_0
(#45849)
* GH-45877 - [C++][Acero] Cleanup 64-bit temp states of Swiss
join by using 32-bit (#45878)
* GH-45917 - [C++][Acero] Add flush taskgroup to enable
parallelization (#45918)
* GH-45922 - [C++][Flight] Remove deprecated Authenticate and
StartCall (#45932)
* GH-45953 - [C++] Use lock to fix atomic bug in
ReadaheadGenerator (#45954)
* GH-45986 - [C++] Update bundled GoogleTest (#45996)
* GH-45987 - [C++] Set CMAKE_POLICY_VERSION_MINIMUM=3.5 for
bundled dependencies (#45997)
-------------------------------------------------------------------
Mon Apr 21 14:34:37 UTC 2025 - Friedrich Haubensak <hsk17@mail.de>
- to fix cmake-4 build problems, upgrade bundled mimalloc from
  2.0.6 to 2.0.9 and add apache-arrow-19.0.1-mimalloc-version.patch;
  mimalloc changes according to readme.md:
  * 2.0.9:
    - Supports building with asan and improved [Valgrind] support.
    - Support arbitrarily large alignments, in particular for
      `std::pmr` pools.
    - Added C++ STL allocators attached to a specific heap.
    - Heap walks now visit all objects (including huge objects).
    - Support Windows nano server containers.
    - Various small bug fixes.
  * 2.0.7:
    - Initial support for [Valgrind] for leak testing and heap
      block overflow detection.
    - Initial support for attaching heaps to a specific memory area.
    - Fix `realloc` behavior for zero size blocks.
    - Remove restriction to integral multiple of the alignment in
      `alloc_align`.
    - Improved aligned allocation performance.
    - Reduced contention with many threads on few processors.
    - VS2022 support.
    - Support `pkg-config`.
-------------------------------------------------------------------
Fri Mar 28 08:47:10 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Re-enable flight, grpc has been fixed boo#1237422
-------------------------------------------------------------------
Thu Mar 13 18:57:51 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Add missing dependencies for libboost_process explicitly
boo#1239599
-------------------------------------------------------------------
Wed Feb 19 15:58:28 UTC 2025 - Ben Greiner <code@bnavigator.de>
- disable flight because of gh#grpc/grpc#37968 boo#1237422
-------------------------------------------------------------------
Mon Feb 17 19:17:26 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Update to 19.0.1
## Bug Fixes
* [C++] Fix overflow issues for large build side in swiss join
(#45108)
* [C++][Fuzzing] Fix Negation bug discovered by fuzzing (#45181)
* [C++][Parquet] Omit level histogram when max level is 0
(#45285)
* [Parquet][C++] Fix statistics load logic for no row group and
multiple row groups (#45350)
* [C++] Disable Flight test (#45232)
## Improvements
* [C++][Parquet] Improve performance of generating size
statistics (#45202)
* [C++][S3] Workaround compatibility issue between AWS SDK and
MinIO (#45310)
- Release 19.0.0
## New Features and Improvements
* [CI][C++] Add a nightly job to test offline build (#44721)
* [C++] GcsFileSystem::Make should return Result (#44503)
* [C++][Parquet] Implement SizeStatistics (#40594)
* [C++] Reduce string inlining in Substrait serde (#45174)
* [C++][Acero] Enhance asof_join to work in multi-threaded
execution by sequencing input (#44083)
* [C++] Support the AWS S3 SSE-C encryption (#43601)
* [C++][Parquet] Parquet Metadata Printer supports print
sort-columns (#43599)
* [C++] Add C++ implementation of Async C Data Interface (#44495)
* [C++][Acero] Support AVX2 swiss join decoding (#43832)
* [C++] skip -0117 in StrptimeZoneOffset for old glibc (#44621)
* [C++] Add arrow::RecordBatch::MakeStatisticsArray() (#44252)
* [C++] Improve merge step in chunked sorting (#44217)
* [C++][Parquet] Tools: Debug Print for Json should be valid JSON
(#44532)
* [C++][FS][Azure] Implement SAS token authentication (#45021)
* [C++] Don't export template class (#44365)
* [C++][Docs] Update the URL to C++ Development in README.md
(#44427)
* [C++] Added rvalue-reference-qualified overload for
arrow::Result::status() returning value instead of reference
(#44477)
* [C++] StatusConstant - cheaply copied const Status (#44493)
* [C++][Compute] Allow casting struct to bigger nullable struct
(#44587)
* [C++] Use array type to compute min/max statistics Arrow type
(#45094)
* [C++] Minor: ArrayData ctor can assign null_count directly
(#44582)
* [C++] Add const and & to arrow::Array::statistics() return type
(#44592)
* [Python][C++] Add version suffix to libarrow_python* libraries
(#44702)
* [C++] NumericBuilder::AppendValues append vector prevent from
ub (#44794)
* [C++][Parquet] Remove obsolete parquet_constants generated
files from old thrift (#44772)
* [Docs][C++] Add arrow::ArrayStatistics to API doc (#44764)
* [C++] Upgrade ORC to 2.0.3 (#44745)
* [C++][Parquet] Add arrow::Result version of
parquet::arrow::OpenFile() (#44785)
* [C++] Fix a couple of maybe-uninitialized warnings (#44789)
* [C++] Use arrow::util::span on
arrow::util::bitmap_builders_utilities instead of std::vector
(#44796)
* [C++][Parquet] Add arrow::Result version of
parquet::arrow::FileReader::GetRecordBatchReader() (#44809)
* [C++] minor optimize cancel and thread pool (#44812)
* [C++][Parquet] Add an example to dump statistics read as
arrow::ArrayStatistics (#44816)
* [C++] Add the Expm1(exponent) scalar arithmetic function
(#44904)
* [C++] Add WithinUlp testing functions (#44906)
* [C++][Python] Add Hyperbolic Trig functions (#44630)
* [C++] Enable mimalloc by default, disable jemalloc by default
and more (#44951)
* [C++] Add support for building system OpenTelemetry (#44983)
* [C++][CMake] Use librt only for Linux (#44984)
* [C++] Support for fixed-size list in conversion of range tuple
(#45008)
* [C++][Parquet] Allow configuring the default footer read size
(#45016)
* [C++] Remove result_internal.h (#45066)
* [FlightRPC][C++] Deprecate InitializeFlightUcx before removing
UCX (#45080)
* [C++][Parquet] Add GetReadRanges function to FileReader
(#45093)
* [C++] Apply a cstdint patch to bundled Thrift for GCC 15
(#45097)
* [C++] Remove useless “hash table ready” states in swiss join
(#45136)
* [CI][C++] Add a GCC 15 job (#45138)
* [C++] Ensure using cpp/cmake_modules/*.cmake (#45143)
* [CI][C++] Upgrade Alpine Linux to 3.18 from 3.16 (#45168)
## Bug Fixes
* [C++] Fix CopyFiles when destination is a FileSystem with
background_writes (#44897)
* [C++][Python] Fix ORC crash when file contains unknown timezone
(#45051)
* [C++] Replace std::aligned_storage that is deprecated in C++23
(#45019)
* [C++][Parquet] Refuse writing non-nullable column that contains
nulls (#44921)
* [C++] Initialize offset vector head as 0 after memory allocated
in grouper.cc (#43123)
* [C++] io::BufferedInput: Fix invalid state after SetBufferSize
(#44387)
* [C++][Parquet] Fix schema conversion from two-level encoding
nested list (#43995)
* [C++] Use “lib” for generating bundled dependencies even with
“clang-cl” (#44391)
* [C++] Fix unaligned load/store implementation for clang-18
(#44468)
* [C++] Use CMAKE_LIBTOOL on macOS (#44385)
* [CI][C++] Use setup-python on hosted runner (#44411)
* [C++] Update vendored date to 3.0.3 (#44482)
* [GLib][C++] Meson searches libraries with specific versions.
(#44475)
* [C++][Acero] Fix crash when thread in asof_join is not running
(#44584)
* [C++] NumericArray should not use ctor from parent directly
(#44542)
* [C++] FunctionOptions::{Serialize,Deserialize}() return an
error without ARROW_IPC (#45171)
* [C++][Acero] Enhance partition sort example (#44678)
* [C++][Python] Fix Flight Timestamp precision, revert workaround
from #43537 (#44681)
* [C++] Add S3 option to ignore SIGPIPE signals (#44735)
* [C++] Keep field metadata for keys and values when importing a
map type via the C data interface (#44715)
* [C++][CI] Fix arrow-c-bridge-test timeout with threading
disabled (#44737)
* [C++] Use lowercased windows.h to enable cross-platform builds
(#44755)
* [C++] Fix Float16.To{Little,Big}Endian on big endian machines
(#44768)
* [C++][Parquet] Fix read/write of metadata length footer on
big-endian systems (#44787)
* [C++][CI] Migrate to arrow::Result based
parquet::arrow::OpenFile() API in example tutorials (#44807)
* [C++] Fix thread-unsafe access in ConcurrentQueue::UnsyncFront
(#44849)
* [C++] Fix compilation error on GCC 8 (#44899)
* [C++][CI] Silence protobuf-generated deprecations (#44955)
* [C++] Use recommended downloads URLs for ORC and Thrift
(#44977)
* [C++] Include path in the documentation is wrong (#45031)
* [C++] Remove Parquet requirement from Arrow Acero and from
Arrow Dataset when not necessary (#45035)
* [C++] Add support for Boost 1.87.0 (#45057)
* [C++][CI] Fix test-build-cpp-fuzz failures (#45060)
* [C++][Parquet] Fix generation of repetition levels for
encryption test data (#45074)
* [C++] Avoid static const variable in the status.h (#45100)
* [C++][Parquet] Fix Null-dereference READ in
parquet::arrow::ListToSchemaField (#45152)
* [C++][Release] Add llvm-dev back to setup-ubuntu.sh (#45184)
* [C++][Parquet] test-conda-cpp-valgrind fails on
arrow-dataset-file-parquet-encryption-test
- Release 18.1.0
## Bug Fixes
* [C++] Add support for overwriting grpc_cpp_plugin path for
cross-compiling (#44507)
* [Docs][C++] Fix documentation directive for ChunkLocation
(#44505)
* [C++] Add find module for abseil that handles missing version
(#44613)
* [C++][Dev] Update bundled Thrift, update mirrors to use CDN
(#44685)
## New Features and Improvements
* [C++] Move ChunkResolver to the public API (#44357)
- Release 18.0.0
## Bug Fixes
* [C++] data corruption when using `group_by` and `aggregate` on
large data sets
* [C++] Use PutObject request for S3 in OutputStream when only
uploading small data (#41564)
* [C++] Clean up implicit fallthrough warnings (#41892)
* [C++] Fix avx2 gather rows more than 2^31 issue in
CompareColumnsToRows (#43065)
* [C++][ArrowFlight] Crash due to UCS thread mode
* [C++] Add workaround for missing Boost dependency of Thrift
(#43328)
* [C++] Skip not Emscripten ready tests in CSV tests (#43724)
* [C++] Add date{32,64} to date{32,64} cast (#43192)
* [C++][Compute] Detect and explicit error for offset overflow in
row table (#43226)
* [C++] Fix decimal benchmarks to avoid out-of-bounds accesses
(#43212)
* [C++] Resolve Abseil like any other dependency in the build
system (#43219)
* [C++][Parquet] Refactor parquet::encryption::AesEncryptor to
use unique_ptr (#43222)
* [C++] Fix Abseil compile error on GCC 13 (#43157)
* [C++] Add missing serde methods to Location (#43332)
* [C++][Parquet] min-max Statistics doesn't work well when one of
min-max is truncated (#43383)
* [C++][Parquet] parquet-dump-footer: Remove redundant link and
fix debug processing (#43375)
* [C++] Ensure using bundled GoogleTest when we use bundled
GoogleTest (#43465)
* [C++][Compute] Fix invalid memory access when resizing
var-length buffer in row table (#43415)
* [C++][FlightRPC] Fix Flight UCX build issues (#43430)
* [C++] Filter out zero length buffers on gRPC transport (#43448)
* [C++][Gandiva] Always use gdv_function_stubs.h in
context_helper.cc (#43464)
* [C++] Add support for the official LZ4 CMake package (#43468)
* [C++] Register the new Opaque extension type by default
(#43788)
* [C++][Acero] Fix typos in join benchmark (#43871)
* [C++][CI] Catch potential integer overflow in PoolBuffer
(#43886)
* [C++] Leak S3 structures if finalization happens too late
(#44090)
* [C++][Parquet] Fix reported metrics in
parquet-arrow-reader-writer-benchmark (#44082)
* [C++] Don't use Boost.Process with Emscripten (#44097)
* [C++] Add home made _mm256_set_m128i for compilers who are
missing it (#44116)
* [C++] JsonExtensionType equality check ignores storage type
(#44215)
* [CI][C++][AppVeyor] Use conda instead of Mamba (#44235)
* [C++][FS][Azure] Fix edgecase where GetFileInfo incorrectly
returns NotFound on flat namespace and Azurite (#44302)
* [C++][FS][Azure] Catch missing exceptions on HNS support check
(#44274)
* [C++][FS][Azure] Fix minor hierarchical namespace bugs (#44307)
* [C++] Fix S3 error handling in ObjectOutputStream (#44335)
* [C++] Disable jemalloc by default on ARM (#44380)
## New Features and Improvements
* [C++][Python] Native support for UUID (#37298)
* [C++][Python] Bool8 Extension Type Implementation (#43488)
* [C++][Parquet] Add JSON canonical extension type (#13901)
* [C++][Compute] Replace explicit checking with DCHECK for
invariants in row segmenter (#44236)
* [C++][CI] Improve IPC fuzzing seed corpus (#43621)
* [Documentation][C++] Explicitly note that compute is optional
(#43629)
* [C++] Azure file system write buffering & async writes (#43096)
* [C++][Parquet] Separate encoders and decoder (#43972)
* [C++][Python][Parquet] Support reading/writing key-value
metadata from/to ColumnChunkMetaData (#41580)
* [Docs][C++] Is arrow::dataset namespace still experimental?
* [C++] Add arrow::ArrayStatistics (#43273)
* [CI][C++] Update Minio version (#44225)
* [C++][Parquet] Add binary that extracts a footer from a parquet
file (#42174)
* [C++] Support casting to and from utf8_view/binary_view
(#43302)
* [C++] Update bundled vendor/datetime to support for building
with libc++ and C++20 (#43094)
* [C++] Implement PathFromUri support for Azure file system
(#43098)
* [C++][Compute] Fix the unnecessary allocation of extra bytes
when encoding row table (#43125)
* [C++][Parquet] Replace use of int with int32_t in the internal
Parquet encryption APIs (#43413)
* [C++][Parquet] Refactor Encryptor API to use arrow::util::span
instead of raw pointers (#43195)
* [C++][Parquet] Default initialize some parquet metadata
variables (#43144)
* [C++] Fix CMake link order for AWS SDK (#43230)
* [C++] Suggest a cast when Concatenate fails due to offsets
overflow (#43190)
* [C++] Support basic is_in predicate simplification (#43761)
* [C++][AzureFS] Ignore password field in URI (#44220)
* [C++] Add lint for DCHECK in public headers (#43248)
* [C++][FlightRPC] Reduce repetition in flight/types.cc in serde
functions (#43237)
* [C++][Parquet] remove useless template parameter of
DeltaLengthByteArrayEncoder (#43250)
* [C++] Always prefer mimalloc to jemalloc (#40875)
* [C++][Flight] Use a Base CRTP type for the types used in RPC
calls (#43255)
* [C++] Expand the take function tests to cover more
chunked-array cases (#43292)
* [C++][Parquet] Enhance the comment for ColumnReader/Decoder
(#44003)
* [C++] Order classes in flight/types.h according to Flight.proto
(#43330)
* [C++][Parquet] Deprecate ColumnChunk::file_offset field and no
longer write Metadata at end of Chunk (#43428)
* [C++] Add benchmark for binary view builder (#43445)
* [C++][Python] Add Opaque canonical extension type (#43458)
* [Java][C++] Support more CsvFragmentScanOptions in JNI call
(#43482)
* [C++] Thirdparty: Bump lz4 to 1.10.0 (#43493)
* [C++][Compute] Widen the row offset of the row table to 64-bit
(#43389)
* [C++] Use ViewOrCopyTo instead of CopyTo when pretty printing
non-CPU data (#43508)
* [FlightRPC][C++] Reduce the number of references to
protobuf::Any (#43544)
* [C++] Simplify arrow::ArrayStatistics::ValueType (#43581)
* [C++][GLib] Don't install arrow-cuda.pc/arrow-cuda-glib.pc on
Windows (#43593)
* [C++] Remove redundant default constructor/deconstructor in
arrow::ArrayStatistics (#43579)
* [C++] Remove std::optional from
arrow::ArrayStatistics::is_{min,max}_exact (#43595)
* [C++][FlightRPC] Move the FlightTestServer to its own .cc and
.h files (#43678)
* [C++] Compute: fix register kernel SimdLevel for
AddMinMax512AggKernels (#43704)
* [C++] Prevent Snappy from disabling RTTI when bundled (#43706)
* [C++][FS][Azure] Use the latest Azurite and update the bundled
Azure SDK for C++ to azure-identity_1.9.0 (#43723)
* [C++][Parquet][CI] Parquet: Introducing more bad_data for
testing (#43708)
* [C++][Parquet] Dataset: Handle num-nulls in Parquet correctly
when !HasNullCount() (#43726)
* [C++] Clarify the way SIMD-enabled agg kernels come from the
same code in different compilation units (#43720)
* [C++] Fix Scalar boolean handling in row encoder (#43734)
* [C++] Add support for Boost 1.86 (#43766)
* [C++] Compute: More comment in RowEncoder (#43763)
* [C++] Acero: Minor code enhancement for Join (#43760)
* [C++] Fix the case when boolean_{any all} meets constant input
with length in Acero (#43799)
* [C++] Add chunked Take benchmarks with a small selection factor
(#43772)
* [C++] Indent preprocessor directives (#43798)
* [C++] Attach arrow::ArrayStatistics to arrow::ArrayData
(#43801)
* [C++] Enable filesystem automatically when one of
ARROW_{AZURE,GCS,HDFS,S3}=ON is specified (#43806)
* [C++] Expose the set of device types where a ChunkedArray is
allocated (#43853)
* [C++] Make ChunkResolver::ResolveMany output a list of
ChunkLocations (#43928)
* [C++][Parquet] Add support for arrow::ArrayStatistics: non
zero-copy int based types (#43945)
* [C++][Parquet] Guard against use of cleared decryptor/encryptor
(#43947)
* [C++] Add tests based on random data and benchmarks to
ChunkResolver::ResolveMany (#43954)
* [C++] Enhance error message for URI parsing (#43938)
* [CI][C++][Dev] Add cpplint to pre-commit (#43982)
* [C++][Parquet] Add support for arrow::ArrayStatistics:
zero-copy types (#43984)
* [C++][Acero] Some code cleanup to Grouper (#43988)
* [C++] Add missing std::move() in array_nested.cc (#43993)
* [C++][Docs] Add missing install command in building docs
(#44000)
* [C++][Parquet] Add support for arrow::ArrayStatistics: boolean
(#44009)
* [C++] IPC: ipc reader/writer code enhancement (#44019)
* [C++][Compute] Reduce the complexity of row segmenter (#44053)
* [C++][Parquet] Add Float16 reading benchmarks (#44073)
* [C++][Parquet] Remove deprecated APIs (#44080)
* [C++][Acero] Add more row segmenter tests (#44166)
* [C++][Parquet] Fix typo in parquet/column_writer.cc (#40856)
* [C++] Avoid repeated ArrayData::offset lookups (#44190)
* [C++][Gandiva] Accept LLVM 19.1 (#44233)
* [C++] Unify simd header includings (#44250)
* [C++][Decimal] Use 0E+1 not 0.E+1 for broader compatibility
(#44275)
* [Packaging][C++] Enable Azure file system for deb/rpm (#44348)
- Drop apache-arrow-pr43766-boost1_86.patch
- Release notes for 18.0.0 and 19.0.0
-------------------------------------------------------------------
Fri Sep 27 05:31:41 UTC 2024 - Guang Yee <gyee@suse.com>
- Set the appropriate C++ compiler for the given platform so
it will compile on Leap 15.x.
-------------------------------------------------------------------
Wed Sep 18 06:59:36 UTC 2024 - Ben Greiner <code@bnavigator.de>
- Add apache-arrow-pr43766-boost1_86.patch for Boost 1.86
* gh#apache/arrow#43766
-------------------------------------------------------------------
Mon Aug 12 17:11:06 UTC 2024 - Ben Greiner <code@bnavigator.de>


@@ -1,7 +1,7 @@
#
# spec file for package apache-arrow
#
# Copyright (c) 2024 SUSE LLC
# Copyright (c) 2025 SUSE LLC and contributors
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -16,19 +16,28 @@
#
%bcond_without tests
%bcond_without flight
# Remove static build due to devel-static packages being required by the generated CMake Targets
%bcond_with static
%bcond_without tests
# Required for runtime dispatch, not yet packaged
%bcond_with xsimd
%define sonum 1700
%if %{suse_version} <= 1500
# requires __has_builtin with keywords
%define gccver 13
%endif
%define sonum 2100
# See git submodule /testing pointing to the correct revision
%define arrow_testing_commit 735ae7128d571398dd798d7ff004adebeb342883
%define arrow_testing_commit fbf6b703dc93d17d75fa3664c5aa2c7873ebaf06
# See git submodule /cpp/submodules/parquet-testing pointing to the correct revision
%define parquet_testing_commit 74278bc4a1122d74945969e6dec405abd1533ec3
%define parquet_testing_commit 18d17540097fca7c40be3d42c167e6bfad90763c
# See cpp/thirdparty/versions.txt, replace by BuildRequires: pkgconfig(mimalloc) as soon as gh#apache/arrow#42211 is resolved
%define arrow_mimalloc_build_version v2.2.4
Name: apache-arrow
Version: 17.0.0
Version: 21.0.0
Release: 0
Summary: A development platform for in-memory data
License: Apache-2.0 AND BSD-3-Clause AND BSD-2-Clause AND MIT
@@ -38,13 +47,22 @@ URL: https://arrow.apache.org/
Source0: https://github.com/apache/arrow/archive/apache-arrow-%{version}.tar.gz
Source1: https://github.com/apache/arrow-testing/archive/%{arrow_testing_commit}.tar.gz#/arrow-testing-%{version}.tar.gz
Source2: https://github.com/apache/parquet-testing/archive/%{parquet_testing_commit}.tar.gz#/parquet-testing-%{version}.tar.gz
Source3: https://github.com/microsoft/mimalloc/archive/%{arrow_mimalloc_build_version}.tar.gz#/mimalloc-%{arrow_mimalloc_build_version}.tar.gz
# PATCH-FIX-OPENSUSE arrow-boost-system-1.89-boo1249599.patch gh#boostorg/system#132, boo#1249599
Patch1: arrow-boost-system-1.89-boo1249599.patch
BuildRequires: bison
BuildRequires: cmake >= 3.16
BuildRequires: cmake >= 3.25
BuildRequires: fdupes
BuildRequires: flex
BuildRequires: gcc-c++
BuildRequires: gcc%{?gccver}-c++
BuildRequires: libboost_context-devel
BuildRequires: libboost_date_time-devel
BuildRequires: libboost_filesystem-devel
BuildRequires: libboost_system-devel >= 1.64.0
BuildRequires: libboost_headers-devel
BuildRequires: libboost_process-devel
%if 0%{?suse_version} < 1699
BuildRequires: libboost_system-devel
%endif
%if %{with static}
BuildRequires: libzstd-devel-static
%endif
@@ -52,27 +70,27 @@ BuildRequires: pkgconfig
BuildRequires: python-rpm-macros
BuildRequires: python3-base
BuildRequires: (cmake(lz4) >= 1.10 or (pkgconfig(liblz4) >= 1.8.3 with pkgconfig(liblz4) < 1.10))
BuildRequires: cmake(Snappy) >= 1.1.7
BuildRequires: cmake(Snappy) >= 1.2.2
BuildRequires: cmake(absl)
BuildRequires: cmake(double-conversion) >= 3.1.5
BuildRequires: cmake(re2)
BuildRequires: pkgconfig(RapidJSON)
BuildRequires: pkgconfig(bzip2) >= 1.0.8
BuildRequires: pkgconfig(gflags) >= 2.2.0
BuildRequires: pkgconfig(grpc++) >= 1.20.0
BuildRequires: pkgconfig(libbrotlicommon) >= 1.0.7
BuildRequires: pkgconfig(libbrotlidec) >= 1.0.7
BuildRequires: pkgconfig(libbrotlienc) >= 1.0.7
BuildRequires: pkgconfig(libcares) >= 1.15.0
BuildRequires: pkgconfig(libglog) >= 0.3.5
BuildRequires: pkgconfig(gflags) >= 2.2.2
BuildRequires: pkgconfig(grpc++) >= 1.46.3
BuildRequires: pkgconfig(libbrotlicommon) >= 1.0.9
BuildRequires: pkgconfig(libbrotlidec) >= 1.0.9
BuildRequires: pkgconfig(libbrotlienc) >= 1.0.9
BuildRequires: pkgconfig(libcares) >= 1.17.2
BuildRequires: pkgconfig(libglog) >= 0.5.0
BuildRequires: pkgconfig(libopenssl)
BuildRequires: pkgconfig(liburiparser) >= 0.9.3
BuildRequires: pkgconfig(libutf8proc)
BuildRequires: pkgconfig(libzstd) >= 1.4.3
BuildRequires: pkgconfig(protobuf) >= 3.7.1
BuildRequires: pkgconfig(sqlite3) >= 3.45.2
BuildRequires: pkgconfig(thrift) >= 0.11.0
BuildRequires: pkgconfig(zlib) >= 1.2.11
BuildRequires: pkgconfig(libutf8proc) >= 2.10.0
BuildRequires: pkgconfig(libzstd) >= 1.5.7
BuildRequires: pkgconfig(protobuf) >= 21.3
BuildRequires: pkgconfig(sqlite3)
BuildRequires: pkgconfig(thrift) >= 0.22.0
BuildRequires: pkgconfig(zlib) >= 1.3.1
%if %{with tests}
BuildRequires: timezone
BuildRequires: pkgconfig(gmock) >= 1.10
@@ -115,6 +133,20 @@ communication.
This package provides the shared library for the Acero streaming execution engine
%package -n libarrow_compute%{sonum}
Summary: Development platform for in-memory data - shared library
Group: System/Libraries
%description -n libarrow_compute%{sonum}
Apache Arrow is a cross-language development platform for in-memory
data. It specifies a standardized language-independent columnar memory
format for flat and hierarchical data, organized for efficient
analytic operations on modern hardware. It also provides computational
libraries and zero-copy streaming messaging and interprocess
communication.
This package provides the shared library for the C++ Compute module
%package -n libarrow_flight%{sonum}
Summary: Development platform for in-memory data - shared library
Group: System/Libraries
@@ -176,16 +208,22 @@ Summary: Development platform for in-memory data - development files
Group: Development/Libraries/C and C++
Requires: libarrow%{sonum} = %{version}
Requires: libarrow_acero%{sonum} = %{version}
Requires: libarrow_compute%{sonum} = %{version}
Requires: libarrow_dataset%{sonum} = %{version}
%if %{with flight}
Requires: libarrow_flight%{sonum} = %{version}
Requires: libarrow_flight_sql%{sonum} = %{version}
%endif
%if %{with static}
Suggests: %{name}-devel-static = %{version}
Suggests: %{name}-acero-devel-static = %{version}
Suggests: %{name}-compute-devel-static = %{version}
Suggests: %{name}-dataset-devel-static = %{version}
%if %{with flight}
Suggests: %{name}-flight-devel-static = %{version}
Suggests: %{name}-flight-sql-devel-static = %{version}
%endif
%endif
%description devel
Apache Arrow is a cross-language development platform for in-memory
@@ -229,6 +267,21 @@ communication.
This package provides the static library for the Acero streaming execution engine
%package compute-devel-static
Summary: Development platform for in-memory data - development files
Group: Development/Libraries/C and C++
Requires: %{name}-devel = %{version}
%description compute-devel-static
Apache Arrow is a cross-language development platform for in-memory
data. It specifies a standardized language-independent columnar memory
format for flat and hierarchical data, organized for efficient
analytic operations on modern hardware. It also provides computational
libraries and zero-copy streaming messaging and interprocess
communication.
This package provides the static library for the C++ Compute module
%package flight-devel-static
Summary: Development platform for in-memory data - development files
Group: Development/Libraries/C and C++
@@ -324,13 +377,18 @@ This package provides utilities for working with the Parquet format.
%prep
%setup -q -n arrow-apache-arrow-%{version} -a1 -a2
%autopatch -p1
%if 0%{?suse_version} >= 1699
%patch -P1 -p1
%endif
# https://github.com/protocolbuffers/protobuf/issues/12292
sed -i 's/find_package(Protobuf/find_package(Protobuf CONFIG/' cpp/cmake_modules/FindProtobufAlt.cmake
%build
%{?gccver:export CXX=g++-%{gccver}}
%{?gccver:export CC=gcc-%{gccver}}
export CFLAGS="%{optflags} -ffat-lto-objects"
export CXXFLAGS="%{optflags} -ffat-lto-objects"
export ARROW_MIMALLOC_URL=%{SOURCE3}
pushd cpp
%cmake \
@@ -351,14 +409,15 @@ pushd cpp
-DARROW_CSV:BOOL=ON \
-DARROW_DATASET:BOOL=ON \
-DARROW_FILESYSTEM:BOOL=ON \
%if %{with flight}
-DARROW_FLIGHT:BOOL=ON \
-DARROW_FLIGHT_SQL:BOOL=ON \
%endif
-DARROW_GANDIVA:BOOL=OFF \
-DARROW_SKYHOOK:BOOL=OFF \
-DARROW_HDFS:BOOL=ON \
-DARROW_HIVESERVER2:BOOL=OFF \
-DARROW_IPC:BOOL=ON \
-DARROW_JEMALLOC:BOOL=OFF \
-DARROW_JSON:BOOL=ON \
-DARROW_ORC:BOOL=OFF \
-DARROW_PARQUET:BOOL=ON \
@@ -387,16 +446,20 @@ pushd cpp
popd
%if %{with tests}
rm %{buildroot}%{_libdir}/libarrow_testing.so*
rm %{buildroot}%{_libdir}/libarrow_flight_testing.so*
rm %{buildroot}%{_libdir}/pkgconfig/arrow-testing.pc
rm -Rf %{buildroot}%{_libdir}/cmake/ArrowTesting
rm -Rf %{buildroot}%{_includedir}/arrow/testing
%if %{with flight}
rm %{buildroot}%{_libdir}/libarrow_flight_testing.so*
rm %{buildroot}%{_libdir}/pkgconfig/arrow-flight-testing.pc
rm -Rf %{buildroot}%{_libdir}/cmake/ArrowFlightTesting
%endif
%if %{with static}
rm %{buildroot}%{_libdir}/libarrow_testing.a
%if %{with flight}
rm %{buildroot}%{_libdir}/libarrow_flight_testing.a
%endif
rm -Rf %{buildroot}%{_libdir}/cmake/ArrowTesting
rm -Rf %{buildroot}%{_libdir}/cmake/ArrowFlightTesting
rm -Rf %{buildroot}%{_includedir}/arrow/testing
%endif
%endif
rm -r %{buildroot}%{_datadir}/doc/arrow/
%fdupes %{buildroot}%{_libdir}/cmake
@@ -421,7 +484,7 @@ if [ -n "${GTEST_failing}" ]; then
fi
%ifarch s390x
# bsc#1218592
exclude_regex='--exclude-regex (arrow-dataset-file-parquet-test|parquet-internals-test|parquet-reader-test|parquet-arrow-test|parquet-arrow-internals-test|parquet-encryption-test|arquet-encryption-key-management-test)'
exclude_regex='--exclude-regex (arrow-dataset-file-parquet-test|parquet-internals-test|parquet-reader-test|parquet-arrow-test|parquet-arrow-internals-test|parquet-encryption-test|parquet-encryption-key-management-test)'
%endif
%ctest --label-regex unittest $exclude_regex
popd
@@ -431,54 +494,67 @@ popd
%postun -n libarrow%{sonum} -p /sbin/ldconfig
%post -n libarrow_acero%{sonum} -p /sbin/ldconfig
%postun -n libarrow_acero%{sonum} -p /sbin/ldconfig
%post -n libarrow_compute%{sonum} -p /sbin/ldconfig
%postun -n libarrow_compute%{sonum} -p /sbin/ldconfig
%if %{with flight}
%post -n libarrow_flight%{sonum} -p /sbin/ldconfig
%postun -n libarrow_flight%{sonum} -p /sbin/ldconfig
%post -n libarrow_flight_sql%{sonum} -p /sbin/ldconfig
%postun -n libarrow_flight_sql%{sonum} -p /sbin/ldconfig
%endif
%post -n libarrow_dataset%{sonum} -p /sbin/ldconfig
%postun -n libarrow_dataset%{sonum} -p /sbin/ldconfig
%post -n libparquet%{sonum} -p /sbin/ldconfig
%postun -n libparquet%{sonum} -p /sbin/ldconfig
%files
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_bindir}/arrow-file-to-stream
%{_bindir}/arrow-stream-to-file
%files -n libarrow%{sonum}
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow.so.*
%files -n libarrow_acero%{sonum}
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_acero.so.*
%files -n libarrow_compute%{sonum}
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_compute.so.*
%if %{with flight}
%files -n libarrow_flight%{sonum}
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_flight.so.*
%files -n libarrow_flight_sql%{sonum}
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_flight_sql.so.*
%endif
%files -n libarrow_dataset%{sonum}
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_dataset.so.*
%files -n libparquet%{sonum}
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libparquet.so.*
%files devel
%doc README.md
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_includedir}/arrow/
%{_libdir}/cmake/Arrow*
%{_libdir}/libarrow.so
%{_libdir}/libarrow_acero.so
%{_libdir}/libarrow_compute.so
%{_libdir}/libarrow_dataset.so
%if %{with flight}
%{_libdir}/libarrow_flight.so
%{_libdir}/libarrow_flight_sql.so
%endif
%{_libdir}/pkgconfig/arrow*.pc
%dir %{_datadir}/arrow
%{_datadir}/arrow/gdb
@@ -490,29 +566,35 @@ popd
%if %{with static}
%files devel-static
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow.a
%files acero-devel-static
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_acero.a
%files compute-devel-static
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_compute.a
%files dataset-devel-static
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_dataset.a
%if %{with flight}
%files flight-devel-static
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_flight.a
%files flight-sql-devel-static
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libarrow_flight_sql.a
%endif
%endif
%files -n apache-parquet-devel
%doc README.md
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_includedir}/parquet/
%{_libdir}/cmake/Parquet
%{_libdir}/libparquet.so
@@ -520,13 +602,13 @@ popd
%if %{with static}
%files -n apache-parquet-devel-static
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_libdir}/libparquet.a
%endif
%files -n apache-parquet-utils
%doc README.md
%license LICENSE.txt NOTICE.txt header
%license LICENSE.txt NOTICE.txt
%{_bindir}/parquet-*
%changelog
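
The s390x exclusion list in the spec above is passed to `ctest --exclude-regex`, and the diff corrects `arquet-encryption-key-management-test` to `parquet-encryption-key-management-test`. The sketch below models that matching with Python's `re.search`, assuming ctest applies the pattern as an unanchored search; the helper name `is_excluded` and the sample test name `arrow-io-memory-test` are illustrative and not part of the package.

```python
import re

# Alternatives shared by the old and the new exclusion pattern in the spec.
COMMON = [
    "arrow-dataset-file-parquet-test",
    "parquet-internals-test",
    "parquet-reader-test",
    "parquet-arrow-test",
    "parquet-arrow-internals-test",
    "parquet-encryption-test",
]

# Old pattern: last alternative is missing its leading "p".
OLD_PATTERN = "(" + "|".join(COMMON + ["arquet-encryption-key-management-test"]) + ")"
# New pattern: spelling corrected.
NEW_PATTERN = "(" + "|".join(COMMON + ["parquet-encryption-key-management-test"]) + ")"


def is_excluded(test_name, pattern):
    """Return True if the pattern matches anywhere in the test name.

    Assumption: this mirrors an unanchored regex search, which is how
    the exclusion is modelled here.
    """
    return re.search(pattern, test_name) is not None


if __name__ == "__main__":
    for name in ("parquet-encryption-key-management-test", "arrow-io-memory-test"):
        print(name, is_excluded(name, OLD_PATTERN), is_excluded(name, NEW_PATTERN))
```

Under that assumption the misspelled alternative still matches as a substring of the full test name, so the correction reads as a cleanup rather than a change in which tests are skipped.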


@@ -0,0 +1,27 @@
We have to tell cmake that the openSUSE packager removed the boost::system target.
The headers subpackage provides the necessary Boost::system header file.
diff -ur --no-dereference arrow-apache-arrow-21.0.0.orig/cpp/cmake_modules/ThirdpartyToolchain.cmake arrow-apache-arrow-21.0.0/cpp/cmake_modules/ThirdpartyToolchain.cmake
--- arrow-apache-arrow-21.0.0.orig/cpp/cmake_modules/ThirdpartyToolchain.cmake 2025-07-11 09:44:45.000000000 +0200
+++ arrow-apache-arrow-21.0.0/cpp/cmake_modules/ThirdpartyToolchain.cmake 2025-09-26 20:53:58.409119646 +0200
@@ -1259,7 +1259,7 @@
set(Boost_USE_STATIC_LIBS ON)
endif()
if(ARROW_BOOST_REQUIRE_LIBRARY)
- set(ARROW_BOOST_COMPONENTS filesystem system)
+ set(ARROW_BOOST_COMPONENTS filesystem)
if(ARROW_FLIGHT_SQL_ODBC AND MSVC)
list(APPEND ARROW_BOOST_COMPONENTS locale)
endif()
diff -ur --no-dereference arrow-apache-arrow-21.0.0.orig/cpp/src/arrow/io/CMakeLists.txt arrow-apache-arrow-21.0.0/cpp/src/arrow/io/CMakeLists.txt
--- arrow-apache-arrow-21.0.0.orig/cpp/src/arrow/io/CMakeLists.txt 2025-07-11 09:44:45.000000000 +0200
+++ arrow-apache-arrow-21.0.0/cpp/src/arrow/io/CMakeLists.txt 2025-09-26 20:53:51.229519926 +0200
@@ -30,7 +30,7 @@
EXTRA_LINK_LIBS
arrow::hadoop
Boost::filesystem
- Boost::system)
+ Boost::headers)
endif()
add_arrow_test(memory_test PREFIX "arrow-io")
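
The spec ties this patch to the distribution version: `%patch -P1` runs only when `suse_version >= 1699`, while `libboost_system-devel` is a build requirement only when `suse_version < 1699`. A minimal sketch of that decision, with the hypothetical helper `boost_system_plan` standing in for the two `%if` blocks (the sample values 1500 and 1699 are taken from the conditionals in the spec):

```python
def boost_system_plan(suse_version):
    """Model the two %if blocks of the spec for a given %suse_version.

    Returns a tuple (apply_patch, require_libboost_system_devel).
    This is an illustration of the spec logic, not code from the package.
    """
    # %if 0%{?suse_version} >= 1699 -> %patch -P1 -p1
    apply_patch = suse_version >= 1699
    # %if 0%{?suse_version} < 1699 -> BuildRequires: libboost_system-devel
    require_system_devel = suse_version < 1699
    return apply_patch, require_system_devel


if __name__ == "__main__":
    for version in (1500, 1699):
        print(version, boost_system_plan(version))
```

The two conditions are complementary, so a build either links against the packaged Boost::system library or drops the component through the patch, never both.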


@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:87fa36b469cac0a0c95596e7be39548ddf20c8f737a02ea559e30fbebd12c7d3
size 3571960


@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:390bbed0de1c211ad6147df3c27e3be4e5288929ced92aa7e007f90b1ac5919b
size 3572136


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3fa7b34468636ff1642c5c3fdf67d8f86ae4bff283c5185a6a986d623bab1d19
size 3588150

mimalloc-v2.2.4.tar.gz

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:754a98de5e2912fddbeaf24830f982b4540992f1bab4a0a8796ee118e0752bda
size 1295861


@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ac6331205baec1b97e8115de22efaf84561483623e5792d58060e91e84304bce
size 1037654


@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ac6331205baec1b97e8115de22efaf84561483623e5792d58060e91e84304bce
size 1037654


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4496522640dc88635a8bf3c8e7572a5815549188fa00df132eef6e2a97ce0652
size 1077258


@@ -1,26 +0,0 @@
Index: arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_array.py
===================================================================
--- arrow-apache-arrow-16.0.0.orig/python/pyarrow/tests/test_array.py
+++ arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_array.py
@@ -3323,7 +3323,7 @@ def test_numpy_array_protocol():
result = np.asarray(arr)
np.testing.assert_array_equal(result, expected)
- if Version(np.__version__) < Version("2.0"):
+ if Version(np.__version__) < Version("2.0.0rc1"):
# copy keyword is not strict and not passed down to __array__
result = np.array(arr, copy=False)
np.testing.assert_array_equal(result, expected)
Index: arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_table.py
===================================================================
--- arrow-apache-arrow-16.0.0.orig/python/pyarrow/tests/test_table.py
+++ arrow-apache-arrow-16.0.0/python/pyarrow/tests/test_table.py
@@ -3244,7 +3244,7 @@ def test_numpy_array_protocol(constructo
table = constructor([[1, 2, 3], [4.0, 5.0, 6.0]], names=["a", "b"])
expected = np.array([[1, 4], [2, 5], [3, 6]], dtype="float64")
- if Version(np.__version__) < Version("2.0"):
+ if Version(np.__version__) < Version("2.0.0rc1"):
# copy keyword is not strict and not passed down to __array__
result = np.array(table, copy=False)
np.testing.assert_array_equal(result, expected)
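
The dropped patch above changes the comparison bound from `"2.0"` to `"2.0.0rc1"`. Under PEP 440 ordering a release candidate sorts before its final release, so `2.0.0rc1 < 2.0` holds and a NumPy 2.0 release candidate would take the pre-2.0 branch. The sketch below reimplements a small subset of that ordering with the standard library only; `parse` and `less_than` are hypothetical helpers, not the `packaging.version.Version` class the tests use, and they handle only plain releases with an optional `a`/`b`/`rc` suffix.

```python
import re
from itertools import zip_longest


def parse(version):
    """Split e.g. '2.0.0rc1' into ((2, 0, 0), ('rc', 1)); pre-release is None if absent."""
    m = re.fullmatch(r"(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?", version)
    if m is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in m.group(1).split("."))
    pre = (m.group(2), int(m.group(3))) if m.group(2) else None
    return release, pre


def less_than(a, b):
    """Return True if version a sorts strictly before version b (subset of PEP 440)."""
    rel_a, pre_a = parse(a)
    rel_b, pre_b = parse(b)
    # Pad the shorter release with zeros so that 2.0 and 2.0.0 compare equal.
    pairs = list(zip_longest(rel_a, rel_b, fillvalue=0))
    padded_a = tuple(x for x, _ in pairs)
    padded_b = tuple(y for _, y in pairs)
    if padded_a != padded_b:
        return padded_a < padded_b

    # Same release number: any pre-release sorts before the final release.
    def key(pre):
        return (0, pre) if pre else (1, ())

    return key(pre_a) < key(pre_b)


if __name__ == "__main__":
    print(less_than("2.0.0rc1", "2.0"))       # old bound
    print(less_than("2.0.0rc1", "2.0.0rc1"))  # patched bound
```

With the old bound the release candidate counts as older than 2.0; with the patched bound it does not, which is the behavior the test patch was after.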


@@ -1,281 +0,0 @@
From 888a5ae568d155d03fbff0db8849517fd24a99ff Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Thu, 18 Jul 2024 16:48:52 +0200
Subject: [PATCH 1/9] GH-43299: [Release][Packaging] Only include pyarrow and
pyarrow.* when finding packages on setuptools
---
python/pyproject.toml | 1 +
1 file changed, 1 insertion(+)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index d863bb3e5f0ac..d70b7fcce5903 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -74,6 +74,7 @@ include-package-data=true
[tool.setuptools.packages.find]
where = ["."]
+include = ["pyarrow", "pyarrow.*"]
[tool.setuptools.package-data]
pyarrow = ["*.pxd", "*.pyx", "includes/*.pxd"]
From 46d1afc62514ae04a3815aede7722ac5a9ecce64 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Thu, 18 Jul 2024 17:33:33 +0200
Subject: [PATCH 2/9] Update include
---
python/pyproject.toml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index d70b7fcce5903..d1c5a799f870f 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -74,7 +74,7 @@ include-package-data=true
[tool.setuptools.packages.find]
where = ["."]
-include = ["pyarrow", "pyarrow.*"]
+include = ["pyarrow*"]
[tool.setuptools.package-data]
pyarrow = ["*.pxd", "*.pyx", "includes/*.pxd"]
From d954d75432f05723fca0644842deafd941802842 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Thu, 18 Jul 2024 18:00:40 +0200
Subject: [PATCH 3/9] try again without *
---
python/pyproject.toml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index d1c5a799f870f..222f8d2ece681 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -74,7 +74,7 @@ include-package-data=true
[tool.setuptools.packages.find]
where = ["."]
-include = ["pyarrow*"]
+include = ["pyarrow"]
[tool.setuptools.package-data]
pyarrow = ["*.pxd", "*.pyx", "includes/*.pxd"]
From 2fa434ffc03cca1a251c80c51dd6e98f63db19d1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Thu, 18 Jul 2024 18:36:55 +0200
Subject: [PATCH 4/9] Exclude tests from wheels
---
python/pyproject.toml | 2 ++
1 file changed, 2 insertions(+)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index 222f8d2ece681..45c3b60c8aeed 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -75,6 +75,8 @@ include-package-data=true
[tool.setuptools.packages.find]
where = ["."]
include = ["pyarrow"]
+exclude = ["pyarrow.tests"]
+namespaces = false
[tool.setuptools.package-data]
pyarrow = ["*.pxd", "*.pyx", "includes/*.pxd"]
From 204a27b0534161a35e2d79241dcadd0471341c2a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Fri, 19 Jul 2024 12:43:29 +0200
Subject: [PATCH 5/9] Try excluding pyarrow. and pyarrow/tests explicitly
---
python/pyproject.toml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index 45c3b60c8aeed..d675f07a82391 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -75,7 +75,7 @@ include-package-data=true
[tool.setuptools.packages.find]
where = ["."]
include = ["pyarrow"]
-exclude = ["pyarrow.tests"]
+exclude = ["pyarrow/tests", "pyarrow."]
namespaces = false
[tool.setuptools.package-data]
From a1d73a28e3d6e57366ff43d06389a2f3fa47c7de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Fri, 19 Jul 2024 13:40:25 +0200
Subject: [PATCH 6/9] Try removing where from packages find
---
python/pyproject.toml | 1 -
1 file changed, 1 deletion(-)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index d675f07a82391..9a91fd76a4a20 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -73,7 +73,6 @@ zip-safe=false
include-package-data=true
[tool.setuptools.packages.find]
-where = ["."]
include = ["pyarrow"]
exclude = ["pyarrow/tests", "pyarrow."]
namespaces = false
From 346c0f1982735cac2a4b76a13efbb2a201bf158f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Mon, 22 Jul 2024 14:46:54 +0200
Subject: [PATCH 7/9] Try with pyarrow.tests
---
python/pyproject.toml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index 9a91fd76a4a20..d83cf8fe45d8c 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -74,7 +74,7 @@ include-package-data=true
[tool.setuptools.packages.find]
include = ["pyarrow"]
-exclude = ["pyarrow/tests", "pyarrow."]
+exclude = ["pyarrow.tests"]
namespaces = false
[tool.setuptools.package-data]
From f6273223a1b006406bf315f41424be03a51a3b1e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Mon, 22 Jul 2024 15:34:31 +0200
Subject: [PATCH 8/9] Remove excludes
---
python/pyproject.toml | 1 -
1 file changed, 1 deletion(-)
diff --git a/python/pyproject.toml b/python/pyproject.toml
index d83cf8fe45d8c..7e14795428315 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -74,7 +74,6 @@ include-package-data=true
[tool.setuptools.packages.find]
include = ["pyarrow"]
-exclude = ["pyarrow.tests"]
namespaces = false
[tool.setuptools.package-data]
From f855f0c14fbc4703123e36924f1641cf4a48396a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ra=C3=BAl=20Cumplido?= <raulcumplido@gmail.com>
Date: Thu, 25 Jul 2024 11:33:51 +0200
Subject: [PATCH 9/9] Remove PYARROW_INSTALL_TESTS and packages definition on
setup.py
---
ci/scripts/python_wheel_macos_build.sh | 1 -
ci/scripts/python_wheel_manylinux_build.sh | 1 -
ci/scripts/python_wheel_windows_build.bat | 1 -
docs/source/developers/python.rst | 3 ---
python/setup.py | 16 +---------------
5 files changed, 1 insertion(+), 21 deletions(-)
diff --git a/ci/scripts/python_wheel_macos_build.sh b/ci/scripts/python_wheel_macos_build.sh
index 3ed9d5d8dd12f..6c314d0632f60 100755
--- a/ci/scripts/python_wheel_macos_build.sh
+++ b/ci/scripts/python_wheel_macos_build.sh
@@ -152,7 +152,6 @@ echo "=== (${PYTHON_VERSION}) Building wheel ==="
export PYARROW_BUILD_TYPE=${CMAKE_BUILD_TYPE}
export PYARROW_BUNDLE_ARROW_CPP=1
export PYARROW_CMAKE_GENERATOR=${CMAKE_GENERATOR}
-export PYARROW_INSTALL_TESTS=1
export PYARROW_WITH_ACERO=${ARROW_ACERO}
export PYARROW_WITH_AZURE=${ARROW_AZURE}
export PYARROW_WITH_DATASET=${ARROW_DATASET}
diff --git a/ci/scripts/python_wheel_manylinux_build.sh b/ci/scripts/python_wheel_manylinux_build.sh
index aa86494a9d47d..b5b45c54a800d 100755
--- a/ci/scripts/python_wheel_manylinux_build.sh
+++ b/ci/scripts/python_wheel_manylinux_build.sh
@@ -140,7 +140,6 @@ echo "=== (${PYTHON_VERSION}) Building wheel ==="
export PYARROW_BUILD_TYPE=${CMAKE_BUILD_TYPE}
export PYARROW_BUNDLE_ARROW_CPP=1
export PYARROW_CMAKE_GENERATOR=${CMAKE_GENERATOR}
-export PYARROW_INSTALL_TESTS=1
export PYARROW_WITH_ACERO=${ARROW_ACERO}
export PYARROW_WITH_AZURE=${ARROW_AZURE}
export PYARROW_WITH_DATASET=${ARROW_DATASET}
diff --git a/ci/scripts/python_wheel_windows_build.bat b/ci/scripts/python_wheel_windows_build.bat
index 54f02ec6f6ed0..1f1d5dca721d9 100644
--- a/ci/scripts/python_wheel_windows_build.bat
+++ b/ci/scripts/python_wheel_windows_build.bat
@@ -106,7 +106,6 @@ echo "=== (%PYTHON_VERSION%) Building wheel ==="
set PYARROW_BUILD_TYPE=%CMAKE_BUILD_TYPE%
set PYARROW_BUNDLE_ARROW_CPP=ON
set PYARROW_CMAKE_GENERATOR=%CMAKE_GENERATOR%
-set PYARROW_INSTALL_TESTS=ON
set PYARROW_WITH_ACERO=%ARROW_ACERO%
set PYARROW_WITH_DATASET=%ARROW_DATASET%
set PYARROW_WITH_FLIGHT=%ARROW_FLIGHT%
diff --git a/docs/source/developers/python.rst b/docs/source/developers/python.rst
index 2f3e892ce8ede..bed095b4b8d11 100644
--- a/docs/source/developers/python.rst
+++ b/docs/source/developers/python.rst
@@ -632,9 +632,6 @@ PyArrow are:
* - ``PYARROW_BUNDLE_CYTHON_CPP``
- Bundle the C++ files generated by Cython
- ``0`` (``OFF``)
- * - ``PYARROW_INSTALL_TESTS``
- - Add the test to the python package
- - ``1`` (``ON``)
* - ``PYARROW_BUILD_VERBOSE``
- Enable verbose output from Makefile builds
- ``0`` (``OFF``)
diff --git a/python/setup.py b/python/setup.py
index 11cd7028023be..c4517d21c42f1 100755
--- a/python/setup.py
+++ b/python/setup.py
@@ -32,7 +32,7 @@
from distutils import sysconfig
import pkg_resources
-from setuptools import setup, Extension, Distribution, find_namespace_packages
+from setuptools import setup, Extension, Distribution
from Cython.Distutils import build_ext as _build_ext
import Cython
@@ -371,21 +371,7 @@ def has_ext_modules(foo):
return True
-if strtobool(os.environ.get('PYARROW_INSTALL_TESTS', '1')):
- packages = find_namespace_packages(include=['pyarrow*'])
- exclude_package_data = {}
-else:
- packages = find_namespace_packages(include=['pyarrow*'],
- exclude=["pyarrow.tests*"])
- # setuptools adds back importable packages even when excluded.
- # https://github.com/pypa/setuptools/issues/3260
- # https://github.com/pypa/setuptools/issues/3340#issuecomment-1219383976
- exclude_package_data = {"pyarrow": ["tests*"]}
-
-
setup(
- packages=packages,
- exclude_package_data=exclude_package_data,
distclass=BinaryDistribution,
# Dummy extension to trigger build_ext
ext_modules=[Extension('__dummy__', sources=[])],
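Note on the include/exclude patterns tried in the pyproject.toml patches above: the following is a minimal sketch, assuming that setuptools package discovery filters dotted package names with shell-style wildcards. The helper select_packages and the package names are hypothetical and only imitate that filter with the standard library; this is not setuptools' own code.

```python
from fnmatch import fnmatchcase


def select_packages(candidates, include=("*",), exclude=()):
    """Keep names that match any include pattern and no exclude pattern."""
    def matches(name, patterns):
        return any(fnmatchcase(name, pattern) for pattern in patterns)

    return [
        name for name in candidates
        if matches(name, include) and not matches(name, exclude)
    ]


# Hypothetical dotted package names, as a package finder might report them.
found = ["pyarrow", "pyarrow.tests", "pyarrow.tests.parquet", "pyarrow.vendored"]

# "pyarrow*" also matches every subpackage, including the tests.
wide = select_packages(found, include=["pyarrow*"])

# "pyarrow" without a wildcard matches only the top-level package name.
narrow = select_packages(found, include=["pyarrow"])

# A dotted exclude pattern with a wildcard drops the test packages.
no_tests = select_packages(found, include=["pyarrow*"], exclude=["pyarrow.tests*"])

# A slash-separated pattern never matches a dotted name, so nothing is excluded.
slash = select_packages(found, include=["pyarrow*"], exclude=["pyarrow/tests"])

print(wide, narrow, no_tests, slash, sep="\n")
```

Under this assumption, the path-style exclude "pyarrow/tests" tried in patch 5/9 would have no effect on dotted package names, which may be why the series returns to "pyarrow.tests" in patch 7/9.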

@@ -1,3 +1,353 @@
-------------------------------------------------------------------
Thu Sep 25 10:25:07 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Update to 21.0.0
## Bug Fixes
* GH-44366 - [Python][Acero] RecordBatch.filter on expression
raises error if result set is empty (#46057)
* GH-45292 - [Python] test_dtypes hypothesis test fails
sporadically (#46029)
* GH-46080 - [Python][Docs] Provide guidance for tzdata related
issues if installing with pip (#46591)
* GH-46121 - [Python] Add missing column_index argument to
ArrowReaderProperties::read_dictionary's Cython binding
(#46122)
* GH-46174 - [Python] Failing tests in python minimal builds
(#46175)
* GH-46238 - [Release][Python] Use array to avoid empty argument
in dev/release/post-11-python.sh (#46239)
* GH-46343 - [CI][Python] Remove workaround for gdb packaging
issue (#46848)
* GH-46344 - [CI][Python] Skip doctest for s3.get_file_info to
avoid bucket restrictions (#46345)
* GH-46355 - [Python] Fix table.to_struct_array with an empty
table (#46357)
* GH-46481 - [C++][Python] Allow nullable schema in FlightInfo
(#46489)
* GH-46516 - [CI][Python] Force Cython>3.1.1 for docs builds
(#46770)
* GH-46606 - [Python] Do not require numpy when normalizing slice
(#46732)
* GH-46611 - [Python][C++] Allow building float16 arrays without
numpy (#46618)
* GH-46729 - [Python] Allow constructing InMemoryDataset from
RecordBatchReader (#46731)
* GH-46811 - [C++][Python] Fix crash on
FileReaderImpl::GetRecordBatchReader (#46931)
## New Features and Improvements
* GH-26818 - [C++][Python] Preserve order when writing dataset
multi-threaded (#44470)
* GH-38914 - [Python] Add
EncryptionConfiguration.uniform_encryption (#46347)
* GH-39294 - [C++][Python] DLPack on Tensor class (#42118)
* GH-40754 - [Python] Expose tls_ca_file_path to S3FileSystem
(#45881)
* GH-41496 - [Python][Azure][Docs] Turn on azure on debian-docs
(#46892)
* GH-41672 - [Python][Doc] Clarify docstring of
FixedSizeListArray.values that it ignores the offset (#46144)
* GH-42012 - [Python] Add Schema with_field or set_field method
(#46348)
* GH-43041 - [C++][Python] Read/write Parquet BYTE_ARRAY as
Large/View types directly (#46532)
* GH-43807 - [C++][Python] Add UUID extension type conversion
support to/from Parquet (#45866)
* GH-44500 - [Python][Parquet] Map Parquet logical types to Arrow
extension types by default (#46772)
* GH-44900 - [Python] Support explicit fsspec+{protocol} and
hf:// filesystem URIs (#45089)
* GH-45229 - [Python] Migrate from scipy.spmatrix to
scipy.sparray (#46423)
* GH-45229 - [Python] skip scipy.sparse roundtrip tests for
float16 (#46413)
* GH-45531 - [Python] Add the dim_names argument to
from_numpy_ndarray (#46170)
* GH-45619 - [Python] Use f-string instead of string.format
(#45629)
* GH-45653 - [Python] Scalar subclasses should implement Python
protocols (#45818)
* GH-45750 - [C++][Python][Parquet] Implement Content-Defined
Chunking for the Parquet writer (#45360)
* GH-45957 - [C++][Python] Expose allow_delayed_open on
S3FileSystem (#46078)
* GH-46019 - [Python] Raise TypeError on feather read_table if
columns is not a Sequence (#46038)
* GH-46054 - [Python][Packaging] Re-enable pandas on Windows
free-threaded wheel (#46109)
* GH-46058 - [Python] Run Python in AppVeyor outside of source
directory (#46059)
* GH-46130 - [Python] Remove use_legacy_format in favour of
setting IpcWriteOptions (#46131)
* GH-46198 - [Python] Remove deprecated PyExtensionType (#46199)
* GH-46222 - [Python] Allow to specify footer metadata when
opening IPC file for writing (#46354)
* GH-46349 - [Python] Move parquet definitions to
pyarrow/includes/libparquet.pxd (#46437)
* GH-46373 - [Python] Exercise fallback case on tests for
parquet.read_table in case dataset is not available (#46550)
* GH-46544 - [CI][Dev][Python] Use pre-commit for autopep8
(#46552)
* GH-46545 - [CI][Dev][Python] Update pre-commit for cython-lint
(#46580)
* GH-46546 - [CI][Dev][Python] Use pre-commit for numpydoc
(#46595)
* GH-46572 - [Python] expose filter option to python for join
(#46566)
* GH-46633 - [Docs][C++][Python] Update CombineChunks
documentation to specify that binary columns can be combined
into multiple chunks (#46638)
* GH-46652 - [Python][Docs] Update language for row_group_size
parameter (#46653)
* GH-46676 - [C++][Python][Parquet] Allow reading Parquet LIST
data as LargeList directly (#46678)
* GH-46683 - [C++][Python] Add utf8_zero_fill compute function
for sign-aware zero padding (#46815)
* GH-46771 - [Python][C++] Implement pa.arange function to
generate array sequences (#46778)
* GH-46833 - [Python] Expose ConfigureManagedIdentityCredential
and ConfigureClientSecretCredential to AzureFileSystem on
PyArrow (#46837)
* GH-46959 - [Python][Packaging] Drop support for manylinux2014
(#46965)
-------------------------------------------------------------------
Fri Jun 13 18:22:38 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Update to 20.0.0
## Bug Fixes
* GH-36628 - [Python][Parquet] Fail when instantiating internal
Parquet metadata classes (#45549)
* GH-37630 - [C++][Python][Dataset] Allow disabling fragment
metadata caching (#45330)
* GH-44188 - [Python] Fix pandas roundtrip with bytes column
names (#44171)
* GH-45129 - [Python][C++] Fix usage of deprecated C++
functionality on pyarrow (#45189)
* GH-45155 - [Python][CI] Fix path for scientific nightly windows
wheel upload (#45222)
* GH-45169 - [Python] Adapt to modified pytest ignore collect
hook api (#45170)
* GH-45380 - [Python] Expose RankQuantileOptions to Python
(#45392)
* GH-45530 - [Python][Packaging] Add pyarrow.libs dir to
get_library_dirs (#45766)
* GH-45582 - [Python] Preserve decimal32/64/256 metadata in
Schema.metadata (#45583)
* GH-45733 - [C++][Python] Add biased/unbiased toggle to skew and
kurtosis functions (#45762)
* GH-45739 - [C++][Python] Fix crash when calling
hash_pivot_wider without options (#45740)
* GH-45758 - [Python] Add AzureFileSystem documentation (#45759)
* GH-45926 - [Python] Use pytest.approx for float values on
unbiased skew and kurtosis tests (#45929)
* GH-46041 - [Python][Packaging] Temporarily remove pandas from
being installed on free-threaded Windows wheel tests (#46042)
## New Features and Improvements
* GH-14932 - [Python] Add python bindings for JSON streaming
reader (#45084)
* GH-35289 - [Python] Support large variable width types in numpy
conversion (#36701)
* GH-36412 - [Python][CI] Fix deprecation warnings in the pandas
nightly build
* GH-39010 - [Python] Introduce maps_as_pydicts parameter for
to_pylist, to_pydict, as_py (#45471)
* GH-41002 - [Python] Remove pins for pytest-cython and
conda-docs pytest (#45240)
* GH-41985 - [Python][Docs] Clarify docstring of
pyarrow.compute.scalar() (#45668)
* GH-43587 - [Python] Remove no longer used serialize/deserialize
PyArrow C++ code (#45743)
* GH-44421 - [Python] Add configuration for building & testing
free-threaded wheels on Windows (#44804)
* GH-44790 - [Python] Remove use_legacy_dataset from code base
(#45742)
* GH-45156 - [Python][Packaging] Refactor Python Windows wheel
images to use newer base image (#45442)
* GH-45237 - [Python] Raise minimum supported cython to >=3
(#45238)
* GH-45278 - [Python][Packaging] Updated delvewheel install
command and updated flags used with delvewheel repair (#45323)
* GH-45282 - [Python][Parquet] Remove unused readonly properties
of ParquetWriter (#45281)
* GH-45288 - [Python][Packaging][Docs] Update documentation for
PyArrow nightly wheels (#45289)
* GH-45358 - [C++][Python] Add MemoryPool method to print
statistics (#45359)
* GH-45433 - [Python] Remove Cython workarounds (#45437)
* GH-45457 - [Python] Add pyarrow.ArrayStatistics (#45550)
* GH-45482 - [CI][Python] Don't use Ubuntu 20.04 for wheel test
(#45483)
* GH-45570 - [Python] Allow Decimal32/64Array.to_pandas (#45571)
* GH-45676 - [C++][Python][Compute] Add skew and kurtosis
functions (#45677)
* GH-45680 - [C++][Python] Remove deprecated functions in 20.0
* GH-45705 - [Python] Add support for SAS token in
AzureFileSystem (#45706)
* GH-45755 - [C++][Python][Compute] Add winsorize function
(#45763)
* GH-45848 - [C++][Python][R] Remove deprecated PARQUET_2_0
(#45849)
* GH-45920 - [Release][Python] Upload sdist and wheels to GitHub
Releases not apache.jfrog.io (#45962)
-------------------------------------------------------------------
Mon Feb 17 19:17:26 UTC 2025 - Ben Greiner <code@bnavigator.de>
- Update to 19.0.1
## Bug Fixes
* [Python][CI] Make download_tzdata_on_windows more robust and
use tzdata package for tzinfo database on Windows for ORC
(#45425)
* [Python] Only enable the string dtype on pandas export for
pandas>=2.3 (#45383)
* [Python] Fix version comparison in pandas compat for pandas
2.3 dev version (#45428)
## Improvements
* [CI][Python] Temporarily avoid newer boto3 version (#45311)
* [CI] Bump Minio version and unpin boto3 (#45320)
- Release 19.0.0
## New Features and Improvements
* [Python] Add more FlightInfo / FlightEndpoint attributes
(#43537)
* [Python] Support Arrow PyCapsule stream objects in
write_dataset (#43771)
* [Python] Support pandas future default string dtype
* [CI][Python] Use GitHub Packages for vcpkg cache (#44644)
* [Python] Add Python wrapper for JsonExtensionType (#44070)
* [Python][C++] Add version suffix to libarrow_python* libraries
(#44702)
* [Python] Add support for Decimal32 and Decimal64 types (#44882)
* [C++][Python] Add Hyperbolic Trig functions (#44630)
* [Python] Clean-up name / field_name handling in pandas compat
(#44963)
* [CI][Python][Packaging] Test 3.12 wheels on Ubuntu 24.04
(#45042)
* [CI][Packaging][Python] Simplify
dev/tasks/python-wheels/github.linux.yml (#45077)
* [Python] Honor the strings_to_categorical keyword in to_pandas
for string view type (#45176)
## Bug Fixes
* [C++][Python] Fix ORC crash when file contains unknown timezone
(#45051)
* [Python] Converting month_day_nano_interval to numpy crashes
* [Python] Allow from_buffers to work with StringView on Python
(#44701)
* [C++][Python] Fix Flight Timestamp precision, revert workaround
from #43537 (#44681)
* [Docs][Python] Add missing canonical extension types to PyArrow
arrays and datatypes docs (#44880)
* [Python] Trigger manual Garbage collection before checking
allocated bytes for dlpack tests (#44793)
* [Python][Packaging] Use delvewheel to repair Windows wheels
(#35323)
* [CI][Python] Fix and modernize AppVeyor build (#44999)
* [Python][Docs] Update docstrings for metadata methods on Field
and Schema classes (#45004)
* [CI][Python] Fix test_memory failures (#45007)
* [CI][Packaging][Python] Fix Docker push step for free-threaded
wheel builds (#45040)
* [Packaging][Python] Use ORC from vcpkg instead of bundled on
Linux and macOS (#45046)
- Release 18.1.0
## Bug Fixes
* [Release][Packaging][Python] Set PARQUET_TEST_DATA on
verify-release-candidate-wheels.bat (#44462)
## New Features and Improvements
- Release 18.0.0
## Bug Fixes
* [Python][Packaging] Bump MACOSX_DEPLOYMENT_TARGET to 12 instead
of 11 (#43137)
* [Release][Packaging][Python] Add tzdata as conda env
requirement to avoid ORC failure (#43233)
* [Python] Give precedence to pycapsule interface in
pa.schema(..) (#43486)
* [Python] Sanitize Python reference handling in UDF
implementation (#43557)
* [Python] Allow tuple for rename columns (#43609)
* [Packaging][Python] Fix vcpkg version detection in macOS wheel
build jobs (#43615)
* [Python] Fix compilation on Cython<3 (#43765)
* [Python][CI] Correct PARQUET_TEST_DATA path in wheel tests
(#43786)
* [CI][Packaging][Python] Avoid uploading wheel to gemfury if
version already exists (#43816)
* [CI][Python] Skip test that requires PARQUET_TEST_DATA env on
emscripten (#43906)
* [Python] Fix threading issues with borrowed refs and pandas
(#44047)
* [Benchmarking][Python] Avoid uwsgi install failure on macOS
(#44221)
* [CI][Release][Python] Do not verify Python on Ubuntu 20.04
(#44254)
* [CI][Python] Remove ds requirement from test collection on
test_dataset.py (#44370)
## New Features and Improvements
* [C++][Python] Native support for UUID (#37298)
* [C++][Python] Bool8 Extension Type Implementation (#43488)
* [Python] Make NumPy an optional runtime dependency (#41904)
* [Python] Add StructType attribute to access all its fields
(#43481)
* [CI][Python] Use pipx to install GCS testbench (#43852)
* [Python][CI][Packaging] Don't upload sdist to scientific-python
nightly channel (only wheels) (#43943)
* [Python][CI][Packaging] Upload nightly wheels to main label of
scientific-python-nightly-wheels channel (#43932)
* [CI][Packaging][Python] Upload pyarrow nightly wheels to
scientific python channel on Anaconda (#43862)
* [C++][Python][Parquet] Support reading/writing key-value
metadata from/to ColumnChunkMetaData (#41580)
* [Python] Ensure (Chunked)Array/RecordBatch/Table methods don't
crash with non-CPU data
* [Python] Let StructArray.from_array accept a type in addition
to names or fields (#43047)
* [Python] Test FlightStreamReader iterator (#42086)
* [Python] Add bindings for CopyTo on RecordBatch and Array
classes (#42223)
* [Python] Use Py_IsFinalizing from pythoncapi_compat.h (#43767)
* [Python] Add bindings for memory manager and device to Context
class (#43392)
* [C++][Python] Add Opaque canonical extension type (#43458)
* [Python] Deprecate passing build flags to setup.py (#43515)
* [Python][Packaging][CI] Drop Python 3.8 support (#43970)
* [Python][CI] Add Python 3.13 conda test build (#44192)
* [Python][CI][Packaging] Use released versions to build and test
wheels on Python 3.13 (#44193)
* [Python] Set up wheel building for Python 3.13 (#43539)
* [Python] Remove usage of deprecated pkg_resources in setup.py
(#43602)
* [Python][CI] Add a Crossbow job with the free-threaded build
(#43671)
* [Python] Do not use borrowed references APIs (#43540)
* [Python] Declare support for free-threading in Cython (#43606)
* [Python][CI] Add a Crossbow job with a debug CPython
interpreter (#43565)
* [Python][Dataset] Python / Cython interface to C++
arrow::dataset::Partitioning::Format (#43740)
* [Python][CI] Simplify python/requirements-wheel-test.txt file
(#43691)
* [Python] RecordBatch fails gracefully on non-cpu devices
(#43729)
* [Python] ChunkedArray fails gracefully on non-cpu devices
(#43795)
* [Python][Packaging] Remove numpy dependency from pyarrow
packaging (#44148)
* [Python] Build macOS and manylinux wheels for free-threading
(#43965)
* [Python] Table fails gracefully on non-cpu devices (#43974)
* [Python] Deprecate the no longer used serialize/deserialize
Pyarrow C++ functions (#44064)
* [CI][Python] Enable S3 testing on Windows wheel builds (#44093)
* [CI][Python] Enable S3 tests on macOS CI (#44129)
* [Packaging][Python] Use macOS 12 as deployment target to have
macOS 12 pyarrow wheels (#44315)
* [Packaging][Python] Disable interactive deb configuration in
wheel-manylinux--cp313t- (#44362)
- Drop pyarrow-pr433325-extradirs.patch
-------------------------------------------------------------------
Thu Sep 26 23:24:22 UTC 2024 - Guang Yee <gyee@suse.com>
- Enable sle15_python_module_pythons.
-------------------------------------------------------------------
Wed Aug 14 20:27:48 UTC 2024 - Ben Greiner <code@bnavigator.de>
@@ -351,12 +701,12 @@ Mon Jan 15 20:42:25 UTC 2024 - Ben Greiner <code@bnavigator.de>
-------------------------------------------------------------------
Tue Nov 14 23:29:03 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>
- Fix cve in changelog
-------------------------------------------------------------------
Tue Nov 14 09:28:23 UTC 2023 - Ondřej Súkup <mimi.vx@gmail.com>
- Update to 14.0.1
- drop pyarrow-pr37481-pandas2.1.patch
- fixes boo#1216991 CVE-2023-47248
* GH-38431 - [Python][CI] Update fs.type_name checks for s3fs tests
@@ -539,7 +889,7 @@ Sun Mar 12 05:31:32 UTC 2023 - Ben Greiner <code@bnavigator.de>
* [Python][Docs] adding info about TableGroupBy.aggregation with empty list (#14482)
* [Python] DataFrame Interchange Protocol for pyarrow Table
* [Python] Drop older versions of Pandas (<1.0) (#14631)
* [Python] Pass Cmake args to Python CPP
* [Docs][Python] Improve docs for S3FileSystem (#14599)
* [Python] Add missing value accessor to temporal types (#14746)
* [Python] Expose time32/time64 scalar values (#14637)
@@ -567,7 +917,7 @@ Sun Mar 12 05:31:32 UTC 2023 - Ben Greiner <code@bnavigator.de>
* [Python] Support passing create_dir thru pq.write_to_dataset (#14459)
* [CI][Python] Fix pandas master/nightly build failure related to timedelta (#14460)
* [Python] Fix writing files with multi-byte characters in file name (#14764)
* [Python] Handle pytest 8 deprecations about pytest.warns(None)
* [Python] Remove ARROW_BUILD_DIR in building pyarrow C++ (#14498)
* [Python] Honor default memory pool in Dataset scanning (#14516)
* [Python] Fully support filesystem in parquet.write_metadata (#14574)

@@ -1,7 +1,7 @@
#
# spec file for package python-pyarrow
#
-# Copyright (c) 2024 SUSE LLC
+# Copyright (c) 2025 SUSE LLC and contributors
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -16,12 +16,19 @@
#
%{?sle15_python_module_pythons}
%bcond_with xsimd
%define plainpython python
# See git submodule /testing pointing to the correct revision
-%define arrow_testing_commit 735ae7128d571398dd798d7ff004adebeb342883
+%define arrow_testing_commit fbf6b703dc93d17d75fa3664c5aa2c7873ebaf06
# See git submodule /cpp/submodules/parquet-testing pointing to the correct revision
%define parquet_testing_commit 18d17540097fca7c40be3d42c167e6bfad90763c
%if %{suse_version} <= 1500
# requires __has_builtin with keywords
%define gccver 13
%endif
Name: python-pyarrow
-Version: 17.0.0
+Version: 21.0.0
Release: 0
Summary: Python library for Apache Arrow
License: Apache-2.0 AND BSD-3-Clause AND BSD-2-Clause AND MIT
@@ -29,19 +36,18 @@ URL: https://arrow.apache.org/
# SourceRepository: https://github.com/apache/arrow
Source0: apache-arrow-%{version}.tar.gz
Source1: arrow-testing-%{version}.tar.gz
Source2: parquet-testing-%{version}.tar.gz
Source99: python-pyarrow.rpmlintrc
-# PATCH-FIX-UPSTREAM pyarrow-pr433325-extradirs.patch gh#apache/arrow/pull/43325
-Patch0: pyarrow-pr433325-extradirs.patch
-BuildRequires: %{python_module Cython >= 0.29.31}
-BuildRequires: %{python_module devel >= 3.8}
+BuildRequires: %{python_module Cython >= 3}
+BuildRequires: %{python_module devel >= 3.9}
BuildRequires: %{python_module numpy-devel >= 1.25}
BuildRequires: %{python_module pip}
BuildRequires: %{python_module setuptools_scm}
BuildRequires: %{python_module setuptools}
BuildRequires: %{python_module wheel}
-BuildRequires: cmake
+BuildRequires: cmake >= 3.25
BuildRequires: fdupes
-BuildRequires: gcc-c++
+BuildRequires: gcc%{?gccver}-c++
BuildRequires: openssl-devel
BuildRequires: pkgconfig
BuildRequires: python-rpm-macros
@@ -91,13 +97,13 @@ This package provides the header files within the python
platlib for consuming modules using cythonization.
%prep
-%setup -n arrow-apache-arrow-%{version} -a1
+%setup -n arrow-apache-arrow-%{version} -a1 -a2
%autopatch -p1
# we disabled the jemalloc backend in apache-arrow
sed -i 's/should_have_jemalloc = sys.platform == "linux"/should_have_jemalloc = False/' python/pyarrow/tests/test_memory.py
%build
pushd python
%{?gccver:export CXX=g++-%{gccver}}
%{?gccver:export CC=gcc-%{gccver}}
export CFLAGS="%{optflags}"
export PYARROW_BUILD_TYPE=relwithdebinfo
export PYARROW_BUILD_VERBOSE=1
@@ -126,7 +132,10 @@ pushd python
popd
%check
%{?gccver:export CXX=g++-%{gccver}}
%{?gccver:export CC=gcc-%{gccver}}
export ARROW_TEST_DATA="${PWD}/arrow-testing-%{arrow_testing_commit}/data"
export PARQUET_TEST_DATA="${PWD}/parquet-testing-%{parquet_testing_commit}/data"
# flaky tests
donttest="test_total_bytes_allocated"
donttest="$donttest or test_batch_lifetime"