forked from pool/apache-arrow

7 Commits

Author SHA256 Message Date
853a205aac - Update to 20.0.0
  ## Bug Fixes
  * GH-30302 - [C++][Parquet] Preserve the bitwidth of integer
    dictionary indices on round-trip to Parquet (#45685)
  * GH-31992 - [C++][Parquet] Handling the special case when
    DataPageV2 values buffer is empty (#45252)
  * GH-37630 - [C++][Python][Dataset] Allow disabling fragment
    metadata caching (#45330)
  * GH-39023 - [C++][CMake] Add missing launcher path conversion
    for ExternalPackage (#45349)
  * GH-43057 - [C++] Thread-safe AesEncryptor / AesDecryptor
    (#44990)
  * GH-45048 - [C++][Parquet] Deprecate unused chunk_size parameter
    in parquet::arrow::FileWriter::NewRowGroup() (#45088)
  * GH-45129 - [Python][C++] Fix usage of deprecated C++
    functionality on pyarrow (#45189)
  * GH-45132 - [C++][Gandiva] Update LLVM to 18.1 (#45114)
  * GH-45185 - [C++][Parquet] Raise an error for invalid repetition
    levels when delimiting records (#45186)
  * GH-45254 - [C++][Acero] Fix the row offset truncation in row
    table merge (#45255)
  * GH-45266 - [C++][Acero] Fix the running tasks count of
    Scheduler when get error tasks in multi-threads (#45268)
  * GH-45270 - [C++][CI] Disable mimalloc on Valgrind builds
    (#45271)
  * GH-45301 - [C++] Change PrimitiveArray ctor to protected
    (#45444)
  * GH-45334 - [C++][Acero] Fix swiss join overflow issues in row
    offset calculation for fixed length and null masks (#45336)
  * GH-45362 - [C++] Fix identity cast for time and list scalar

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=55
2025-06-13 18:31:56 +00:00
55775895c9 - Update to 19.0.1
  ## Bug Fixes
  * [C++] Fix overflow issues for large build side in swiss join
    (#45108)
  * [C++][Fuzzing] Fix Negation bug discovered by fuzzing (#45181)
  * [C++][Parquet] Omit level histogram when max level is 0
    (#45285)
  * [Parquet][C++] Fix statistics load logic for no row group and
    multiple row groups (#45350)
  * [C++] Disable Flight test (#45232)
  ## Improvements
  * [C++][Parquet] Improve performance of generating size
    statistics (#45202)
  * [C++][S3] Workaround compatibility issue between AWS SDK and
    MinIO (#45310)
- Release 19.0.0
  ## New Features and Improvements
  * [CI][C++] Add a nightly job to test offline build (#44721)
  * [C++] GcsFileSystem::Make should return Result (#44503)
  * [C++][Parquet] Implement SizeStatistics (#40594)
  * [C++] Reduce string inlining in Substrait serde (#45174)
  * [C++][Acero] Enhance asof_join to work in multi-threaded
    execution by sequencing input (#44083)
  * [C++] Support the AWS S3 SSE-C encryption (#43601)
  * [C++][Parquet] Parquet Metadata Printer supports print
    sort-columns (#43599)
  * [C++] Add C++ implementation of Async C Data Interface (#44495)
  * [C++][Acero] Support AVX2 swiss join decoding (#43832)
  * [C++] skip -0117 in StrptimeZoneOffset for old glibc (#44621)
  * [C++] Add arrow::RecordBatch::MakeStatisticsArray() (#44252)

OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=40
2025-02-17 22:32:29 +00:00
be27bc1230 Accepting request 1218425 from home:yeey:OpenWebUI
- Set the appropriate C++ compiler for the given platform so
  it will compile on Leap 15.x.

- Enable sle15_python_module_pythons.

OBS-URL: https://build.opensuse.org/request/show/1218425
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=38
2024-10-26 01:06:02 +00:00
9bed06f66b Accepting request 1194085 from home:bnavigator:branches:science
- Update to 17.0.0
  ## Bug Fixes
  * [C++] Add option to string ‘center’ kernel to control
    left/right alignment on odd number of padding (#41449)
  * [C++][Python] Fix casting to extension type with fixed size
    list storage type (#42219)
  * [C++] Replace null_count with MayHaveNulls in
    ListArrayFromArray and MapArray (#41957)
  * [C++][Python] RecordBatch.filter() segfaults if passed a
    ChunkedArray (#40971)
  * [C++][Parquet] Timestamp conversion from Parquet to Arrow does
    not follow compatibility guidelines for convertedType
  * [C++] Use LargeStringArray for casting when writing tables to
    CSV (#40271)
  * [C++][Python] Map child Array constructed from keys and items
    shouldn’t have offset (#40871)
  * [C++] Fix compile warning with ‘implicitly-defined constructor
    does not initialize’ in encoding_benchmark (#41060)
  * [C++] Get null_bit_id according to are_cols_in_encoding_order
    in NullUpdateColumnToRow_avx2 (#40998)
  * [C++] Clean up unused parameter warnings (#41111)
  * [C++][Acero] Fix asof join race (#41614)
  * [C++] support for single threaded joins (#41125)
  * [C++] Fix hashjoin benchmark failed at make utf8’s random
    batches (#41195)
  * [C++] Check to avoid copying when NullBitmapBuffer is Null
    (#41452)
  * [C++] Fix crash on invalid Parquet file (#41366)
  * [C++][Parquet] More strict Parquet level checking (#41346)
  * [C++][Gandiva] Fix gandiva cache size env var (#41330)
  * [C++][CMake][Windows] Remove needless .dll suffix from link
    libraries (#41341)
  * [C++][CMake] Remove unused ARROW_NO_DEPRECATED_API (#41345)
  * [C++][maybe_unused] with Arrow macro (#41359)
  * [C++][Large] ListView and Map nested types for scalar_if_else’s
    kernel functions (#41419)
  * [C++][Gandiva] Fix ascii_utf8 function to return same result on
    x86 and Arm (#41434)
  * [C++] Reuse deduplication logic for direct registration
    (#41466)
  * [C++] Clean up more redundant move warnings (#41487)
  * [C++][Compute] Remove redundant logic for ArrayData as
    ExecResults in ExecScalarCaseWhen (#41380)
  * [C++][CMake] correctly use Protobuf_PROTOC_EXECUTABLE (#41582)
  * [C++][CMake] Fix ARROW_USE_BOOST detect condition (#41622)
  * [C++][Python] Add optional null_bitmap to MapArray::FromArrays
    (#41757)
  * [C++] macros.h: Fix ARROW_FORCE_INLINE for MSVC (#41712)
  * [C++][Acero] Remove an useless parameter for QueryContext::Init
    called in hash_join_benchmark (#41716)
  * [C++] Fix the issue that temp vector stack may be under sized
    (#41746)
  * [C++] Check that extension metadata key is present before
    attempting to delete it (#41763)
  * [C++] Iterator releases its resource immediately when it reads
    all values (#41824)
  * [C++][Flight][Benchmark] Ensure waiting server ready (#41793)
  * [C++] Fix avx2 gather offset larger than 2GB in
    CompareColumnsToRows (#42188)
  * [C++][S3] Fix potential deadlock when closing output stream
    (#41876)
  * [CI][C++] Clear cache for mamba on AppVeyor (#41977)
  * [CI][Python][C++] Fix utf8proc detection for wheel on Windows
    (#42022)
  * [C++] Support list-views on list_slice (#42067)
  * [C++] Fix an OTel test failure and remove needless logs
    (#42122)
  * [C++][FS][Azure] Ensure setting BlobSasBuilder::Protocol
    (#42108)
  * [C++] Support list-view typed arrays in array_take and
    array_filter (#42117)
  * [C++] Fix some potential uninitialized variable warnings
    (#42207)
  * [C++] Avoid invalid accesses in parquet-encoding-benchmark
    (#42141)
  * [C++] Use FetchContent for bundled ORC (#43011)
  * [C++] Fix GetRecordBatchPayload crashes for device data
    (#42199)
  * [C++] Use non-stale c-ares download URL (#42250)
  * [C++][Parquet] Check for valid ciphertext length to prevent
    segfault (#43071)
  * [C++][Compute] Mark KeyCompare.CompareColumnsToRowsLarge as
    large memory test (#43128)
  * [C++] Upgrade bundled google-cloud-cpp to 2.22.0 (#43136)
  ## New Features and Improvements
  * [C++][Compute] Implement Grouper::Reset (#41352)
  * [Go][C++] Implement Flight SQL Bulk Ingestion (#38385)
  * [C++][FS][Azure] Support azure cli auth (#41976)
  * [C++][FS][Azure] Add support for environment credential
    (#41715)
  * [C++] Optimize Take for fixed-size types including nested
    fixed-size lists (#41297)
  * [C++][Device] Add Copy/View slice functions to a CPU pointer
    (#41477)
  * [C++] Add support for OpenTelemetry logging (#39905)
  * [C++] Import/Export ArrowDeviceArrayStream (#40807)
  * [C++] move LocalFileSystem to the registry (#40356)
  * [C++] Make flatbuffers serialization more deterministic
    (#40392)
  * [C++][Gandiva] add RE2::Options set_dot_nl(true) for Like
    function (#40970)
  * [C++] Introduce portable compiler assumptions (#41021)
  * [C++] Add a grouper benchmark for preventing performance
    regression (#41036)
  * [C++] Support flatten for combining nested list related types
    (#41092)
  * [C++] Clean up remaining tasks related to half float casts
    (#41084)
  * [C++][FS][Azure] Add support for CopyFile with hierarchical
    namespace support (#41276)
  * [C++] Add is_validity_defined_by_bitmap() predicate (#41115)
  * [C++] IO: enhance boundary checking in CompressedInputStream
    (#41117)
  * [C++][Python] Expose recursive flatten for lists on
    list_flatten kernel function and pyarrow bindings (#41295)
  * [C++][Parquet][Doc] Denote PARQUET:field_id in parquet.rst
    (#41187)
  * [C++] Extract the kernel loops used for PrimitiveTakeExec and
    generalize to any fixed-width type (#41373)
  * [C++][Acero] Use per-node basis temp vector stack to mitigate
    overflow (#41335)
  * [C++][Parquet] Optimize DelimitRecords by batch execution when
    max_rep_level > 1 (#41362)
  * [C++][FS][Azure][Docs] Add AzureFileSystem to Filesystems API
    reference (#41411)
  * [C++] Use ASAN to poison temp vector stack memory (#41695)
  * [C++][S3] Add a new option to check existence before CreateDir
    (#41822)
  * [C++][Parquet] Fix
    DeltaLengthByteArrayEncoder::EstimatedDataEncodedSize (#41546)
  * [C++] Thirdparty: Upgrade xsimd to 13.0.0 (#41548)
  * [C++] Improve fixed_width_test_util.h (#41575)
  * [C++] ChunkResolver: Implement ResolveMany and add unit tests
    (#41561)
  * [C++] fixed_width_internal.h: Simplify docstring and support
    bit-sized types (BOOL) (#41597)
  * [C++][Python] Extends the add_key_value to parquet::arrow and
    PyArrow (#41633)
  * [C++][CMake][Windows] Don’t build needless object libraries
    (#41658)
  * [C++][Python] PrettyPrint non-cpu data by copying to default
    CPU device (#42010)
  * [C++][Parquet] Thrift: generate template method to accelerate
    reading thrift (#41703)
  * [C++][Parquet] Minor: moving EncodedStats by default rather
    than copying (#41727)
  * [C++][ORC] Ensure setting detected ORC version (#41767)
  * [C++][Parquet] Add file metadata read/write benchmark (#41761)
  * [C++] Make git-dependent definitions internal (#41781)
  * [C++][S3] Remove GetBucketRegion hack for newer AWS SDK
    versions (#41798)
  * [C++][Parquet] normalize dictionary encoding to use
    RLE_DICTIONARY (#41819)
  * [C++] IPC: Minor enhance the code of writer (#41900)
  * [C++] Fix ExecuteScalar deduce all_scalar with chunked_array
    (#41925)
  * [C++] Minor enhance code style for FixedShapeTensorType
    (#41954)
  * [C++] Follow up of adding null_bitmap to MapArray::FromArrays
    (#41956)
  * [C++] Misc changes making code around list-like types and
    list-view types behave the same way (#41971)
  * [C++] : kernel.cc: Remove defaults on switch so that compiler
    can check full enum coverage for us (#41995)
  * [C++][Parquet] ParquetFilePrinter::JSONPrint print length of
    FLBA (#41981)
  * [C++][CMake] Add preset for Valgrind (#42110)
  * [C++] Move TakeXXX free functions into TakeMetaFunction and
    make them private (#42127)
  * [C++][FS][Azure] Validate
    AzureOptions::{blob,dfs}_storage_scheme (#42135)
  * [C++] list_parent_indices: Add support for list-view types
    (#42236)
  * [C++] Reduce the recursion of many-join test (#43042)
  * [C++] Limit buffer size in BufferedInputStream::SetBufferSize
    with raw_read_bound (#43064)
- Require cmake lz4 for 1.10
- Update to 17.0.0
  ## Bug Fixes
  * [C++][Python] Fix casting to extension type with fixed size
    list storage type (#42219)
  * [Python] Include metadata when creating pa.schema from
    PyCapsule (#41538)
  * [C++][Python] RecordBatch.filter() segfaults if passed a
    ChunkedArray (#40971)
  * [Python] pa.array: add check for byte-swapped numpy arrays
    inside python objects (#41549)
  * [Python] Fix read_table for encrypted parquet (#39438)
  * [Python] RunEndEncodedArray.from_arrays: bugfix for Array
    arguments (#40560) (#41093)
  * [C++][Python] Map child Array constructed from keys and items
    shouldn’t have offset (#40871)
  * [Python] `test_numpy_array_protocol` test failures with numpy
    2.0.0rc1
  * [Python] Fix StructArray.sort() for by=None (#41495)
  * [Python] Build with Python 3.13 (#42034)
  * [Python] remove special methods related to buffers in python
    <2.6 (#41492)
  * [Python] Fix reading column index with decimal values (#41503)
  * [Docs][Python] Remove duplicate contents (#41588)
  * [C++][Python] Add optional null_bitmap to MapArray::FromArrays
    (#41757)
  * [Python][Parquet] Implement to_dict method on SortingColumn
    (#41704)
  * [Python] CMake: ignore Parquet encryption option if Parquet
    itself is not enabled (fix Java integration build) (#41776)
  * [Python] Disallow direct pa.RecordBatchReader() construction to
    avoid segfaults (#41773)
  * [Python] Fix RecordBatchReader.cast to support casting to equal
    schema for all types (#42098)
  * [Python] Fix tests when using NumPy 2.0 on Windows (#42099)
  * [CI][Python] Use pip install -e instead of setup.py build_ext
    --inplace for installing pyarrow on verification script (#42007)
  * [CI][Python][C++] Fix utf8proc detection for wheel on Windows
    (#42022)
  * [Python][CI] Update expected output for numpy 2.0.0 (#42172)
  ## New Features and Improvements
  * [Python] Replace pandas.util.testing.rands with vendored
    version (#42089)
  * [Python] begin moving static settings to pyproject.toml
    (#41041)
  * [Python] Implement PyCapsule interface for Device data in
    PyArrow (#40717)
  * [Python] Expand the Arrow PyCapsule Interface with C Device
    Data support (#40708)
  * [Python] Let RecordBatch.filter accept a boolean expression in
    addition to mask array (#43043)
  * [Python] Fix pickling of LocalFileSystem for cython 2 (#41459)
  * [Python] Expand the C Device Interface bindings to support
    import on CUDA device (#40385)
  * [Python] Allow passing a mapping of column names to
    rename_columns (#40645)
  * [Python][Packaging] Strip unnecessary symbols when building
    wheels (#42028)
  * [Python][Docs] Update PyArrow installation docs for conda
    package split (#41135)
  * [Python] Basic bindings for Device and MemoryManager classes
    (#41685)
  * [C++][Python] Expose recursive flatten for lists on
    list_flatten kernel function and pyarrow bindings (#41295)
  * [Python][Packaging] Ensure to build with released numpy 2.0
    (instead of RC) in the wheel building workflows (#42194)
  * [CI][Python] Add a job on ARM64 macOS (#41313)
  * [CI][Python] Reduce CI time on macOS (#41378)
  * [Python] Expose byte_width and bit_width of ExtensionType in
    terms of the storage type (#41413)
  * [Python] Update Python development guide about components being
    enabled by default based on Arrow C++ (#41705)
  * [Python] Building PyArrow: enable/disable python components by
    default based on availability in Arrow C++ (#41494)
  * [C++][Python] Extends the add_key_value to parquet::arrow and
    PyArrow (#41633)
  * [Python] Ensure Buffer methods don’t crash with non-CPU data
    (#41889)
  * [C++][Python] PrettyPrint non-cpu data by copying to default
    CPU device (#42010)
  * [Python][Parquet] Update BYTE_STREAM_SPLIT description in
    write_table() docstring (#41759)
  * [Python] Add support for Pyodide (#37822)
  * [Python] Fix pandas tests to follow downstream datetime64 unit
    changes (#41979)
  * [Python] Allow Array.filter() to take general array input
    (#42051)
  * [Python] Expose new FLOAT16 logical type in the pyarrow.parquet
    bindings (#42103)
  * [Python] Array gracefully fails on non-cpu device (#42113)
  * [Python][Parquet] Pyarrow store decimal as integer (#42169)
  * [Python] Add CI job for Numpy 1.X (#42189)
  * [CI][Python] Pin openjdk=17 in python substrait integration
    (#43051)
- Drop pyarrow-pr41319-numpy2-tests.patch
- Add pyarrow-pr433325-extradirs.patch gh#apache/arrow/pull/43325

OBS-URL: https://build.opensuse.org/request/show/1194085
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=34
2024-08-15 09:43:24 +00:00
c159005cc1 Accepting request 1170120 from home:bnavigator:numpy
- Update to 16.0.0
  ## Bug Fixes
  * [C++][ORC] Catch all ORC exceptions to avoid crash (#40697)
  * [C++][S3] Handle conventional content-type for directories
    (#40147)
  * [C++] Strengthen handling of duplicate slashes in S3, GCS
    (#40371)
  * [C++] Avoid hash_mean overflow (#39349)
  * [C++] Fix spelling (array) (#38963)
  * [C++][Parquet] Fix crash in Modular Encryption (#39623)
  * [C++][Dataset] Fix failures in dataset-scanner-benchmark
    (#39794)
  * [C++][Device] Fix Importing nested and string types for
    DeviceArray (#39770)
  * [C++] Use correct (non-CPU) address of buffer in
    ExportDeviceArray (#39783)
  * [C++] Improve error message for "chunker out of sync" condition
    (#39892)
  * [C++] Use make -j1 to install bundled bzip2 (#39956)
  * [C++] DatasetWriter avoid creating zero-sized batch when
    max_rows_per_file enabled (#39995)
  * [C++][CI] Disable debug memory pool for ASAN and Valgrind
    (#39975)
  * [C++][Gandiva] Make Gandiva's default cache size to be 5000 for
    object code cache (#40041)
  * [C++][FS][Azure] Fix CreateDir and DeleteDir trailing slash
    issues on hierarchical namespace accounts (#40054)
  * [C++][FS][Azure] Validate containers in
    AzureFileSystem::Impl::MovePaths() (#40086)
  * [C++] Decimal types with different precisions and scales bind

OBS-URL: https://build.opensuse.org/request/show/1170120
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=30
2024-04-25 09:07:39 +00:00
8d99637b3c Accepting request 1160966 from home:bnavigator:branches:science
- Update to 15.0.2
  ## Bug Fixes
  * [C++][Acero] Increase size of Acero TempStack (#40007)
  * [C++][Dataset] Add missing Protobuf static link dependency
    (#40015)
  * [C++] Possible data race when reading metadata of a parquet
    file (#40111)
  * [C++] Make span SFINAE standards-conforming to enable
    compilation with nvcc (#40253)

- Update to 15.0.2
  ## Bug Fixes
  * [Python] Fix except clauses (#40387)
  * [Python][CI] Skip failing test_dateutil_tzinfo_to_string
    (#40486)

OBS-URL: https://build.opensuse.org/request/show/1160966
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=27
2024-03-23 16:14:18 +00:00
f4b994c8a2 Accepting request 1152980 from home:bnavigator:branches:science
- Reenable logging
  * Add apache-arrow-pr40230-glog-0.7.patch
  * Add apache-arrow-pr40275-glog-0.7-2.patch
  * now requires glog devel files to be present for
    apache-arrow-devel; ArrowConfig.cmake fails otherwise
  * gh#apache/arrow#40181
  * gh#apache/arrow#40230
  * gh#apache/arrow#40275
- Move d:l:p:n/python-pyarrow to science/apache-arrow as a
  multibuild package: it uses the same source and is tightly
  connected.

OBS-URL: https://build.opensuse.org/request/show/1152980
OBS-URL: https://build.opensuse.org/package/show/science/apache-arrow?expand=0&rev=25
2024-02-28 16:27:53 +00:00