forked from pool/apache-arrow

10 Commits

Author SHA256 Message Date
8697b15a63 .
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=56
2025-06-13 18:39:08 +00:00
853a205aac - Update to 20.0.0
  ## Bug Fixes
  * GH-30302 - [C++][Parquet] Preserve the bitwidth of integer
    dictionary indices on round-trip to Parquet (#45685)
  * GH-31992 - [C++][Parquet] Handling the special case when
    DataPageV2 values buffer is empty (#45252)
  * GH-37630 - [C++][Python][Dataset] Allow disabling fragment
    metadata caching (#45330)
  * GH-39023 - [C++][CMake] Add missing launcher path conversion
    for ExternalPackage (#45349)
  * GH-43057 - [C++] Thread-safe AesEncryptor / AesDecryptor
    (#44990)
  * GH-45048 - [C++][Parquet] Deprecate unused chunk_size parameter
    in parquet::arrow::FileWriter::NewRowGroup() (#45088)
  * GH-45129 - [Python][C++] Fix usage of deprecated C++
    functionality on pyarrow (#45189)
  * GH-45132 - [C++][Gandiva] Update LLVM to 18.1 (#45114)
  * GH-45185 - [C++][Parquet] Raise an error for invalid repetition
    levels when delimiting records (#45186)
  * GH-45254 - [C++][Acero] Fix the row offset truncation in row
    table merge (#45255)
  * GH-45266 - [C++][Acero] Fix the running tasks count of
    Scheduler when get error tasks in multi-threads (#45268)
  * GH-45270 - [C++][CI] Disable mimalloc on Valgrind builds
    (#45271)
  * GH-45301 - [C++] Change PrimitiveArray ctor to protected
    (#45444)
  * GH-45334 - [C++][Acero] Fix swiss join overflow issues in row
    offset calculation for fixed length and null masks (#45336)
  * GH-45362 - [C++] Fix identity cast for time and list scalar

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=55
2025-06-13 18:31:56 +00:00
285eb6979a Accepting request 1247453 from home:bnavigator:branches:science
- disable flight because of gh#grpc/grpc#37968 boo#1237422

OBS-URL: https://build.opensuse.org/request/show/1247453
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=46
2025-02-20 16:43:05 +00:00
55775895c9 - Update to 19.0.1
  ## Bug Fixes
  * [C++] Fix overflow issues for large build side in swiss join
    (#45108)
  * [C++][Fuzzing] Fix Negation bug discovered by fuzzing (#45181)
  * [C++][Parquet] Omit level histogram when max level is 0
    (#45285)
  * [Parquet][C++] Fix statistics load logic for no row group and
    multiple row groups (#45350)
  * [C++] Disable Flight test (#45232)
  ## Improvements
  * [C++][Parquet] Improve performance of generating size
    statistics (#45202)
  * [C++][S3] Workaround compatibility issue between AWS SDK and
    MinIO (#45310)
- Release 19.0.0
  ## New Features and Improvements
  * [CI][C++] Add a nightly job to test offline build (#44721)
  * [C++] GcsFileSystem::Make should return Result (#44503)
  * [C++][Parquet] Implement SizeStatistics (#40594)
  * [C++] Reduce string inlining in Substrait serde (#45174)
  * [C++][Acero] Enhance asof_join to work in multi-threaded
    execution by sequencing input (#44083)
  * [C++] Support the AWS S3 SSE-C encryption (#43601)
  * [C++][Parquet] Parquet Metadata Printer supports print
    sort-columns (#43599)
  * [C++] Add C++ implementation of Async C Data Interface (#44495)
  * [C++][Acero] Support AVX2 swiss join decoding (#43832)
  * [C++] skip -0117 in StrptimeZoneOffset for old glibc (#44621)
  * [C++] Add arrow::RecordBatch::MakeStatisticsArray() (#44252)

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=40
2025-02-17 22:32:29 +00:00
be27bc1230 Accepting request 1218425 from home:yeey:OpenWebUI
- Set the appropriate C++ compiler for the given platform so
  it will compile on Leap 15.x.

- Enable sle15_python_module_pythons.

OBS-URL: https://build.opensuse.org/request/show/1218425
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=38
2024-10-26 01:06:02 +00:00
9bed06f66b Accepting request 1194085 from home:bnavigator:branches:science
- Update to 17.0.0
  ## Bug Fixes
  * [C++] Add option to string ‘center’ kernel to control
    left/right alignment on odd number of padding (#41449)
  * [C++][Python] Fix casting to extension type with fixed size
    list storage type (#42219)
  * [C++] Replace null_count with MayHaveNulls in
    ListArrayFromArray and MapArray (#41957)
  * [C++][Python] RecordBatch.filter() segfaults if passed a
    ChunkedArray (#40971)
  * [C++][Parquet] Timestamp conversion from Parquet to Arrow does
    not follow compatibility guidelines for convertedType
  * [C++] Use LargeStringArray for casting when writing tables to
    CSV (#40271)
  * [C++][Python] Map child Array constructed from keys and items
    shouldn’t have offset (#40871)
  * [C++] Fix compile warning with ‘implicitly-defined constructor
    does not initialize’ in encoding_benchmark (#41060)
  * [C++] Get null_bit_id according to are_cols_in_encoding_order
    in NullUpdateColumnToRow_avx2 (#40998)
  * [C++] Clean up unused parameter warnings (#41111)
  * [C++][Acero] Fix asof join race (#41614)
  * [C++] support for single threaded joins (#41125)
  * [C++] Fix hashjoin benchmark failed at make utf8’s random
    batches (#41195)
  * [C++] Check to avoid copying when NullBitmapBuffer is Null
    (#41452)
  * [C++] Fix crash on invalid Parquet file (#41366)
  * [C++][Parquet] More strict Parquet level checking (#41346)
  * [C++][Gandiva] Fix gandiva cache size env var (#41330)
  * [C++][CMake][Windows] Remove needless .dll suffix from link
    libraries (#41341)
  * [C++][CMake] Remove unused ARROW_NO_DEPRECATED_API (#41345)
  * [C++] Replace [[maybe_unused]] with Arrow macro (#41359)
  * [C++] Add [Large]ListView and Map nested types for
    scalar_if_else’s kernel functions (#41419)
  * [C++][Gandiva] Fix ascii_utf8 function to return same result on
    x86 and Arm (#41434)
  * [C++] Reuse deduplication logic for direct registration
    (#41466)
  * [C++] Clean up more redundant move warnings (#41487)
  * [C++][Compute] Remove redundant logic for ArrayData as
    ExecResults in ExecScalarCaseWhen (#41380)
  * [C++][CMake] correctly use Protobuf_PROTOC_EXECUTABLE (#41582)
  * [C++][CMake] Fix ARROW_USE_BOOST detect condition (#41622)
  * [C++][Python] Add optional null_bitmap to MapArray::FromArrays
    (#41757)
  * [C++] macros.h: Fix ARROW_FORCE_INLINE for MSVC (#41712)
  * [C++][Acero] Remove a useless parameter for QueryContext::Init
    called in hash_join_benchmark (#41716)
  * [C++] Fix the issue that temp vector stack may be under sized
    (#41746)
  * [C++] Check that extension metadata key is present before
    attempting to delete it (#41763)
  * [C++] Iterator releases its resource immediately when it reads
    all values (#41824)
  * [C++][Flight][Benchmark] Ensure waiting server ready (#41793)
  * [C++] Fix avx2 gather offset larger than 2GB in
    CompareColumnsToRows (#42188)
  * [C++][S3] Fix potential deadlock when closing output stream
    (#41876)
  * [CI][C++] Clear cache for mamba on AppVeyor (#41977)
  * [CI][Python][C++] Fix utf8proc detection for wheel on Windows
    (#42022)
  * [C++] Support list-views on list_slice (#42067)
  * [C++] Fix an OTel test failure and remove needless logs
    (#42122)
  * [C++][FS][Azure] Ensure setting BlobSasBuilder::Protocol
    (#42108)
  * [C++] Support list-view typed arrays in array_take and
    array_filter (#42117)
  * [C++] Fix some potential uninitialized variable warnings
    (#42207)
  * [C++] Avoid invalid accesses in parquet-encoding-benchmark
    (#42141)
  * [C++] Use FetchContent for bundled ORC (#43011)
  * [C++] Fix GetRecordBatchPayload crashes for device data
    (#42199)
  * [C++] Use non-stale c-ares download URL (#42250)
  * [C++][Parquet] Check for valid ciphertext length to prevent
    segfault (#43071)
  * [C++][Compute] Mark KeyCompare.CompareColumnsToRowsLarge as
    large memory test (#43128)
  * [C++] Upgrade bundled google-cloud-cpp to 2.22.0 (#43136)
  ## New Features and Improvements
  * [C++][Compute] Implement Grouper::Reset (#41352)
  * [Go][C++] Implement Flight SQL Bulk Ingestion (#38385)
  * [C++][FS][Azure] Support azure cli auth (#41976)
  * [C++][FS][Azure] Add support for environment credential
    (#41715)
  * [C++] Optimize Take for fixed-size types including nested
    fixed-size lists (#41297)
  * [C++][Device] Add Copy/View slice functions to a CPU pointer
    (#41477)
  * [C++] Add support for OpenTelemetry logging (#39905)
  * [C++] Import/Export ArrowDeviceArrayStream (#40807)
  * [C++] move LocalFileSystem to the registry (#40356)
  * [C++] Make flatbuffers serialization more deterministic
    (#40392)
  * [C++][Gandiva] add RE2::Options set_dot_nl(true) for Like
    function (#40970)
  * [C++] Introduce portable compiler assumptions (#41021)
  * [C++] Add a grouper benchmark for preventing performance
    regression (#41036)
  * [C++] Support flatten for combining nested list related types
    (#41092)
  * [C++] Clean up remaining tasks related to half float casts
    (#41084)
  * [C++][FS][Azure] Add support for CopyFile with hierarchical
    namespace support (#41276)
  * [C++] Add is_validity_defined_by_bitmap() predicate (#41115)
  * [C++] IO: enhance boundary checking in CompressedInputStream
    (#41117)
  * [C++][Python] Expose recursive flatten for lists on
    list_flatten kernel function and pyarrow bindings (#41295)
  * [C++][Parquet][Doc] Denote PARQUET:field_id in parquet.rst
    (#41187)
  * [C++] Extract the kernel loops used for PrimitiveTakeExec and
    generalize to any fixed-width type (#41373)
  * [C++][Acero] Use per-node basis temp vector stack to mitigate
    overflow (#41335)
  * [C++][Parquet] Optimize DelimitRecords by batch execution when
    max_rep_level > 1 (#41362)
  * [C++][FS][Azure][Docs] Add AzureFileSystem to Filesystems API
    reference (#41411)
  * [C++] Use ASAN to poison temp vector stack memory (#41695)
  * [C++][S3] Add a new option to check existence before CreateDir
    (#41822)
  * [C++][Parquet] Fix
    DeltaLengthByteArrayEncoder::EstimatedDataEncodedSize (#41546)
  * [C++] Thirdparty: Upgrade xsimd to 13.0.0 (#41548)
  * [C++] Improve fixed_width_test_util.h (#41575)
  * [C++] ChunkResolver: Implement ResolveMany and add unit tests
    (#41561)
  * [C++] fixed_width_internal.h: Simplify docstring and support
    bit-sized types (BOOL) (#41597)
  * [C++][Python] Extends the add_key_value to parquet::arrow and
    PyArrow (#41633)
  * [C++][CMake][Windows] Don’t build needless object libraries
    (#41658)
  * [C++][Python] PrettyPrint non-cpu data by copying to default
    CPU device (#42010)
  * [C++][Parquet] Thrift: generate template method to accelerate
    reading thrift (#41703)
  * [C++][Parquet] Minor: moving EncodedStats by default rather
    than copying (#41727)
  * [C++][ORC] Ensure setting detected ORC version (#41767)
  * [C++][Parquet] Add file metadata read/write benchmark (#41761)
  * [C++] Make git-dependent definitions internal (#41781)
  * [C++][S3] Remove GetBucketRegion hack for newer AWS SDK
    versions (#41798)
  * [C++][Parquet] normalize dictionary encoding to use
    RLE_DICTIONARY (#41819)
  * [C++] IPC: Minor enhance the code of writer (#41900)
  * [C++] Fix ExecuteScalar deduce all_scalar with chunked_array
    (#41925)
  * [C++] Minor enhance code style for FixedShapeTensorType
    (#41954)
  * [C++] Follow up of adding null_bitmap to MapArray::FromArrays
    (#41956)
  * [C++] Misc changes making code around list-like types and
    list-view types behave the same way (#41971)
  * [C++] kernel.cc: Remove defaults on switch so that compiler
    can check full enum coverage for us (#41995)
  * [C++][Parquet] ParquetFilePrinter::JSONPrint print length of
    FLBA (#41981)
  * [C++][CMake] Add preset for Valgrind (#42110)
  * [C++] Move TakeXXX free functions into TakeMetaFunction and
    make them private (#42127)
  * [C++][FS][Azure] Validate
    AzureOptions::{blob,dfs}_storage_scheme (#42135)
  * [C++] list_parent_indices: Add support for list-view types
    (#42236)
  * [C++] Reduce the recursion of many-join test (#43042)
  * [C++] Limit buffer size in BufferedInputStream::SetBufferSize
    with raw_read_bound (#43064)
- Require cmake lz4 for 1.10
- Update to 17.0.0
  ## Bug Fixes
  * [C++][Python] Fix casting to extension type with fixed size
    list storage type (#42219)
  * [Python] Include metadata when creating pa.schema from
    PyCapsule (#41538)
  * [C++][Python] RecordBatch.filter() segfaults if passed a
    ChunkedArray (#40971)
  * [Python] pa.array: add check for byte-swapped numpy arrays
    inside python objects (#41549)
  * [Python] Fix read_table for encrypted parquet (#39438)
  * [Python] RunEndEncodedArray.from_arrays: bugfix for Array
    arguments (#40560) (#41093)
  * [C++][Python] Map child Array constructed from keys and items
    shouldn’t have offset (#40871)
  * [Python] `test_numpy_array_protocol` test failures with numpy
    2.0.0rc1
  * [Python] Fix StructArray.sort() for by=None (#41495)
  * [Python] Build with Python 3.13 (#42034)
  * [Python] remove special methods related to buffers in python
    <2.6 (#41492)
  * [Python] Fix reading column index with decimal values (#41503)
  * [Docs][Python] Remove duplicate contents (#41588)
  * [C++][Python] Add optional null_bitmap to MapArray::FromArrays
    (#41757)
  * [Python][Parquet] Implement to_dict method on SortingColumn
    (#41704)
  * [Python] CMake: ignore Parquet encryption option if Parquet
    itself is not enabled (fix Java integration build) (#41776)
  * [Python] Disallow direct pa.RecordBatchReader() construction to
    avoid segfaults (#41773)
  * [Python] Fix RecordBatchReader.cast to support casting to equal
    schema for all types (#42098)
  * [Python] Fix tests when using NumPy 2.0 on Windows (#42099)
  * [CI][Python] Use pip install -e instead of setup.py build_ext
    --inplace for installing pyarrow on verification script (#42007)
  * [CI][Python][C++] Fix utf8proc detection for wheel on Windows
    (#42022)
  * [Python][CI] Update expected output for numpy 2.0.0 (#42172)
  ## New Features and Improvements
  * [Python] Replace pandas.util.testing.rands with vendored
    version (#42089)
  * [Python] begin moving static settings to pyproject.toml
    (#41041)
  * [Python] Implement PyCapsule interface for Device data in
    PyArrow (#40717)
  * [Python] Expand the Arrow PyCapsule Interface with C Device
    Data support (#40708)
  * [Python] Let RecordBatch.filter accept a boolean expression in
    addition to mask array (#43043)
  * [Python] Fix pickling of LocalFileSystem for cython 2 (#41459)
  * [Python] Expand the C Device Interface bindings to support
    import on CUDA device (#40385)
  * [Python] Allow passing a mapping of column names to
    rename_columns (#40645)
  * [Python][Packaging] Strip unnecessary symbols when building
    wheels (#42028)
  * [Python][Docs] Update PyArrow installation docs for conda
    package split (#41135)
  * [Python] Basic bindings for Device and MemoryManager classes
    (#41685)
  * [C++][Python] Expose recursive flatten for lists on
    list_flatten kernel function and pyarrow bindings (#41295)
  * [Python][Packaging] Ensure to build with released numpy 2.0
    (instead of RC) in the wheel building workflows (#42194)
  * [CI][Python] Add a job on ARM64 macOS (#41313)
  * [CI][Python] Reduce CI time on macOS (#41378)
  * [Python] Expose byte_width and bit_width of ExtensionType in
    terms of the storage type (#41413)
  * [Python] Update Python development guide about components being
    enabled by default based on Arrow C++ (#41705)
  * [Python] Building PyArrow: enable/disable python components by
    default based on availability in Arrow C++ (#41494)
  * [C++][Python] Extends the add_key_value to parquet::arrow and
    PyArrow (#41633)
  * [Python] Ensure Buffer methods don’t crash with non-CPU data
    (#41889)
  * [C++][Python] PrettyPrint non-cpu data by copying to default
    CPU device (#42010)
  * [Python][Parquet] Update BYTE_STREAM_SPLIT description in
    write_table() docstring (#41759)
  * [Python] Add support for Pyodide (#37822)
  * [Python] Fix pandas tests to follow downstream datetime64 unit
    changes (#41979)
  * [Python] Allow Array.filter() to take general array input
    (#42051)
  * [Python] Expose new FLOAT16 logical type in the pyarrow.parquet
    bindings (#42103)
  * [Python] Array gracefully fails on non-cpu device (#42113)
  * [Python][Parquet] Pyarrow store decimal as integer (#42169)
  * [Python] Add CI job for Numpy 1.X (#42189)
  * [CI][Python] Pin openjdk=17 in python substrait integration
    (#43051)
- Drop pyarrow-pr41319-numpy2-tests.patch
- Add pyarrow-pr433325-extradirs.patch gh#apache/arrow/pull/43325

OBS-URL: https://build.opensuse.org/request/show/1194085
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=34
2024-08-15 09:43:24 +00:00
c159005cc1 Accepting request 1170120 from home:bnavigator:numpy
- Update to 16.0.0
  ## Bug Fixes
  * [C++][ORC] Catch all ORC exceptions to avoid crash (#40697)
  * [C++][S3] Handle conventional content-type for directories
    (#40147)
  * [C++] Strengthen handling of duplicate slashes in S3, GCS
    (#40371)
  * [C++] Avoid hash_mean overflow (#39349)
  * [C++] Fix spelling (array) (#38963)
  * [C++][Parquet] Fix crash in Modular Encryption (#39623)
  * [C++][Dataset] Fix failures in dataset-scanner-benchmark
    (#39794)
  * [C++][Device] Fix Importing nested and string types for
    DeviceArray (#39770)
  * [C++] Use correct (non-CPU) address of buffer in
    ExportDeviceArray (#39783)
  * [C++] Improve error message for "chunker out of sync" condition
    (#39892)
  * [C++] Use make -j1 to install bundled bzip2 (#39956)
  * [C++] DatasetWriter avoid creating zero-sized batch when
    max_rows_per_file enabled (#39995)
  * [C++][CI] Disable debug memory pool for ASAN and Valgrind
    (#39975)
  * [C++][Gandiva] Make Gandiva's default cache size to be 5000 for
    object code cache (#40041)
  * [C++][FS][Azure] Fix CreateDir and DeleteDir trailing slash
    issues on hierarchical namespace accounts (#40054)
  * [C++][FS][Azure] Validate containers in
    AzureFileSystem::Impl::MovePaths() (#40086)
  * [C++] Decimal types with different precisions and scales bind

OBS-URL: https://build.opensuse.org/request/show/1170120
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=30
2024-04-25 09:07:39 +00:00
525207619b Accepting request 1163690 from home:shanipribadi
I would like to have the Apache Flight and Apache Flight SQL libraries built.

Also disabling the static build, because the generated CMake targets include the static libraries, so building against libarrow would require not just apache-arrow-devel but also all of the devel-static packages.

Note: flight and flight-sql are packaged separately.
In the upstream RPM and the Fedora repo, flight-sql is included in libarrow-flight-libs.

OBS-URL: https://build.opensuse.org/request/show/1163690
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=29
2024-03-30 15:01:58 +00:00
8d99637b3c Accepting request 1160966 from home:bnavigator:branches:science
- Update to 15.0.2
  ## Bug Fixes
  * [C++][Acero] Increase size of Acero TempStack (#40007)
  * [C++][Dataset] Add missing Protobuf static link dependency
    (#40015)
  * [C++] Possible data race when reading metadata of a parquet
    file (#40111)
  * [C++] Make span SFINAE standards-conforming to enable
    compilation with nvcc (#40253)
  

- Update to 15.0.2
  ## Bug Fixes
  * [Python] Fix except clauses (#40387)
  * [Python][CI] Skip failing test_dateutil_tzinfo_to_string
    (#40486)

OBS-URL: https://build.opensuse.org/request/show/1160966
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=27
2024-03-23 16:14:18 +00:00
f4b994c8a2 Accepting request 1152980 from home:bnavigator:branches:science
- Reenable logging
  * Add apache-arrow-pr40230-glog-0.7.patch
  * Add apache-arrow-pr40275-glog-0.7-2.patch
  * now requires glog devel files to be present for
    apache-arrow-devel; ArrowConfig.cmake fails otherwise
  * gh#apache/arrow#40181
  * gh#apache/arrow#40230
  * gh#apache/arrow#40275
- Move d:l:p:n/python-pyarrow to science/apache-arrow as a multibuild package: it uses the same source and is tightly connected.

OBS-URL: https://build.opensuse.org/request/show/1152980
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=25
2024-02-28 16:27:53 +00:00