------------------------------------------------------------------- Wed Nov 2 10:13:14 UTC 2022 - Dominique Leuenberger - IF clang-devel is >= 15, force dependency to clang14-devel. ------------------------------------------------------------------- Tue Sep 6 16:34:33 UTC 2022 - Stefan Brüns - Update to version 3.0 * Minimal OpenCL 3.0 feature set should be now supported (official conformance stamp still to apply for). * Support for Clang/LLVM 14.0. * Improved tracing and visualization. * Support for generating specialized work-group functions and include them in the PoCL kernel program binaries. * Fixed printf for SPIR-V. * A lot of other fixes and improvements. ------------------------------------------------------------------- Sat Jun 4 14:33:49 UTC 2022 - Aaron Puchert - Use LLVM 13 on Tumbleweed, since LLVM 14 does not yet work according to upstream. (gh#pocl/pocl#1047, gh#pocl/pocl#1048) - Require at least version 6 of clang-devel, older versions are not supported. (Otherwise configuration will fail.) - Strip prefix from CMAKE_INSTALL_LIBDIR on older distributions to fix paths there. ------------------------------------------------------------------- Sat Oct 30 11:37:16 UTC 2021 - Martin Hauke - Update to version 1.8 * Support for Clang/LLVM 13 * Improved debugging support with Valgrind, LTTNG * Improved support for SPIR/SPIR-V on CUDA - Update to version 1.7 * Support for Clang/LLVM 12 * Improved support for cross-compiling * Improved support for SPIR-V binaries when using CPU device * Implemented OpenCL 3.0 features: clGetDeviceInfo queries + CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES (Minimal implementation) + CL_DEVICE_ATOMIC_FENCE_CAPABILITIES (Minimal implementation) ------------------------------------------------------------------- Fri Dec 25 14:30:16 UTC 2020 - Martin Hauke - Update to version 1.6 * Support for LLVM 11. * CUDA kernels using constant __local blocks are now ABI incompatible with previous release. Users need to delete their pocl cache. * Improved debugging of OpenCL code with CPU driver. * Improved the PTX code generation for __local blocks. * Improved handling of command queue barriers * Fix LLVM loop vectorizing remarks printing (POCL_VECTORIZER_REMARKS=1). * Fix an issue in which the loop vectorizer produced code with invalid memory reads (issue #757). * Fix compilation error when CMake option SINGLE_LLVM_LIB is set to OFF. * Fix wrongly output dlerror (Undefined symbol) after dlopen, caused by a previous libdl call in an ICD loader * [CPU] safety margin of pocl's CPU driver local memory allocation has been reduced to a much more reasonable value * [CPU] buffer size for OpenCL printf is now configurable with PRINTF_BUFFER_SIZE CMake variable * [CPU] local memory size reported is now the size of last level of non-shared data cache (usually L1 or L2 depending on CPU), if hwloc can determine it. - Update patch link_against_libclang-cpp_so.patch ------------------------------------------------------------------- Fri Oct 23 22:46:28 UTC 2020 - Ondřej Súkup - remove broken installation workaround ------------------------------------------------------------------- Fri Jun 26 11:05:12 UTC 2020 - Stefan Dirsch - moved pocl.icd to /usr/share/OpenCL/vendors for real ... ------------------------------------------------------------------- Thu Jun 25 09:53:25 UTC 2020 - Stefan Dirsch - Update to version 1.5 * Added support for LLVM/Clang 10.0 - adjusted link_against_libclang-cpp_so.patch - move pocl.icd from /usr/etc/OpenCL/vendors to /usr/share/OpenCL/vendors (boo#1173005) ------------------------------------------------------------------- Mon Nov 4 20:04:34 UTC 2019 - Stefan Brüns - Update to version 1.4 * Support for LLVM/Clang 8.0 and 9.0 * Support for LLVM older than 6.0 has been removed. * Improved SPIR and SPIR-V support for CPU device * pocl-accel: An example driver and support infrastructure for OpenCL 1.2 CL_DEVICE_TYPE_CUSTOM hardware accelerators. - Remove upstreamed fix_resources_path_version_dependency.patch - Fix build with single-component libclang-cpp.so, add link_against_libclang-cpp_so.patch ------------------------------------------------------------------- Sun Jul 28 19:15:03 UTC 2019 - Stefan Brüns - Use GCC (default host compiler) for compiling the library itself, and only compile the openCL kernel bytecode with clang, which is the upstream default setup. This also fixes problems where clang chokes on the GCC LTO options. - Drop unused boost_headers, glew, ncurses and uthash devel BuildRequires. - Remove unneeded extra linker flags. - Fix build on ARM, and enable Arch64 (needs explicit CPU specification), supported since pocl 1.1. - Fix failing header lookup when minor libclang version changes (https://github.com/pocl/pocl/issues/747), add fix_resources_path_version_dependency.patch - Require implementation (libpocl2) from the main package which contains the ICD referencing it. ------------------------------------------------------------------- Fri Apr 5 19:56:21 UTC 2019 - Martin Hauke - Adjust required clang version (clang < 9) since clang 8 is now supported by upstream. ------------------------------------------------------------------- Thu Apr 4 19:21:26 UTC 2019 - Martin Hauke - Update to version 1.3 * Support for Clang/LLVM 8.0. Bug Fixes: * Fixed kernel debug symbol generation. * HSA: fix kernel caching. * Fix clCreateImage doesn't fail with unsupported image type. * Fix handle non-kernel functions with barriers properly. * Fix Unable to build pocl with CUDA support with LLVM 7 and host GCC 8.2. * Fix image format/size handling with multiple devices in context. * Fix padding issue with context arrays that manifested as unaligned access errors after autovectorization. Notable Internal Changes * Add group ids as hidden kernel arguments instead of digging them up from the context struct. * Ability to generate the final binary via separate assembly text + assembler call. Useful for supporting LLVM targets without direct binary emission support. * Use Clang's Driver API for launching the final linkage step. This way we utilize the toolchain registry with correct linkage steps required for the target at hand. * Add 'device_aux_functions' to the driver layer attributes. This can be used to retain device-specific functions required by the target across the pruning of unused globals. * The "default kernels" hack which was used to store kernel metadata, has been removed. Kernel metadata are now stored only once, in cl_program struct; every new cl_kernel structs holds only a pointer. * Major 'pthread' CPU driver cleanup. * Major Workgroup.cc cleanup. - Remove reproducible.patch (fixed upstream) ------------------------------------------------------------------- Wed Oct 31 12:13:35 UTC 2018 - Bernhard Wiedemann - Add reproducible.patch to make build result independent of build system CPU (boo#1110722) ------------------------------------------------------------------- Tue Sep 25 10:30:51 UTC 2018 - Ondřej Súkup - update to version 1.2 * Support for LLVM/Clang 7.0 and 6.0 * HWLOC 2.0 support - build kernels with distro support - detect and load cpu optimized code on runtime ------------------------------------------------------------------- Mon Jul 30 04:53:06 UTC 2018 - bwiedemann@suse.com - Disable compile time CPU-detection instead always asume core2 (boo#1100677) ------------------------------------------------------------------- Tue May 15 20:34:02 UTC 2018 - mimi.vx@gmail.com - move nonversioned lib to main package ------------------------------------------------------------------- Fri Mar 9 17:17:10 UTC 2018 - mardnh@gmx.de - Update to version 1.1 * Support for LLVM/Clang 6.0 and 5.0. * Experimental SPIR and SPIR-V support * Improved kernel compilation speed - Several tests have problems on some OBS workers while the same tests run perfectly fine in a local chroot. Disable tests for now. ------------------------------------------------------------------- Mon Mar 5 19:34:47 UTC 2018 - mardnh@gmx.de - Create subpackage for the shared library - Run tests after the build ------------------------------------------------------------------- Wed Jan 31 14:51:17 UTC 2018 - msrb@suse.com - Remove dependency on clang-devel-static. (bnc#1065464) * It was removed, clang-devel now again provides everything necessary as shared libraries. ------------------------------------------------------------------- Tue Dec 19 18:44:43 UTC 2017 - mardnh@gmx.de - Update to version 1.0 Highlights * Improved automatic local work-group sizing on kernel enqueue, taking into account standard constraints, SIMD width for vectorization as well as the number of compute units available on the device. * Support for NVIDIA GPUs via a new CUDA backend (currently experimental). * Removed support for BBVectorizer. * LLVM 5.0 is now supported. * A few build options have been added for distribution builds, see README.packaging. * Somewhat improved scalability in the CPU driver. CPUs with many cores and programs using a lot of WIs with small kernels can run somewhat faster. * Full conformance with OpenCL 1.2 standard, enabled by default. There are some caveats though - see the documentation. * When conformance is enabled, some kernel library functions might be slower than in previous releases. * Pocl now reports OpenCL 1.2 instead of 2.0, except HSA enabled builds. * Updated format of pocl binaries, which is NOT backwards compatible. * You'll need to clean any kernel caches. * Fixed several memory leaks. * Unresolved symbols (missing/misspelled functions etc) in a kernel will result in error in clBuildProgram() instead of pocl silently ignoring them and then aborting at dlopen(). * New env variable POCL_MEMORY_LIMIT=N limits the Global memory size reported by pocl to N gigabytes. * New env variable POCL_AFFINITY (defaults to 0): if enabled, sets the affinity of each CPU driver pthread to a single core. * Improved AVX512 support (with LLVM 5.0). Note that even with LLVM 5.0 there are still a few bugs (see pocl issue #555); AVX512 + LLVM 4.0 are a lot more broken, and probably not worth trying. * POCL_DEBUG env var has been revamped. You can now limit debuginfo to these categories (or their combination): all,error,warning,general memory,llvm,events,cache,locking,refcounts,timing,hsa,tce,cuda * The old setting POCL_DEBUG=1 now equals error+warning+general. - Remove patch: * pocl-disable-tests.diff - Disable CUDA backend since it depends on CUDA_TOOLKIT which is not available in Factory ------------------------------------------------------------------- Thu Oct 26 14:13:18 UTC 2017 - mpluskal@suse.com - Simplify spec file a bit - Enable CUDA backend - Enable all available cpu specific kernels for intel platform ------------------------------------------------------------------- Sat Oct 21 06:28:35 UTC 2017 - mpluskal@suse.com - We need clang4-devel-static to build with current clang packaging - Small spec-file cleanup * drop conditionals for older releases then Factory as building was not possible anyways ------------------------------------------------------------------- Wed Jul 12 10:50:47 UTC 2017 - jengelh@inai.de - Description should say what it is, not what it plans in the future. ------------------------------------------------------------------- Sat Jul 8 17:29:43 UTC 2017 - mardnh@gmx.de - Fix runtime linking issues (missing crtbeginS.so) - Require gcc for Factory ------------------------------------------------------------------- Mon Jul 3 21:23:21 UTC 2017 - mardnh@gmx.de - Fix path in the ICD-file ------------------------------------------------------------------- Fri May 19 08:49:22 UTC 2017 - idonmez@suse.com - Update library name for uthash -> libut2 ------------------------------------------------------------------- Tue May 16 18:59:30 UTC 2017 - mardnh@gmx.de - Update to 0.14 - Support for LLVM/Clang versions 3.9 and 4.0. Version 3.9 was the first release to include all frontend features for OpenCL 2.0. - Ability to build pocl in a mode where online compilation is not supported to run in hosts without LLVM and binaries compiled offline e.g. using poclcc. - pocl's binary format now can contain all the necessary bits to execute the programs on a host without online compiler support. - Initial support for out-of-order execution execution of command queues. - It's now possible to cross-compile pocl when building an offline compiler build. - New driver api extension to support out-of-order and asynchronous devices/drivers. - Pthread and HSA drivers are now fully asynchronous. - CMake now the only supported build system, autotools removed. - LTTng tracing support - Add patches: - pocl-disable-tests.diff - compilation errors on some tests disable tests for now - Remove patches: - 0001-Fixes-357-broken-build-with-GCC-6.1.patch - fixed upstream ------------------------------------------------------------------- Thu Feb 2 10:52:12 UTC 2017 - adam.majer@suse.de - use individual libboost-*-devel packages instead of boost-devel ------------------------------------------------------------------- Fri Jan 20 08:45:21 UTC 2017 - mpluskal@suse.com - Use llvm3_8 for building and as runtime dependency ------------------------------------------------------------------- Sun Oct 9 09:17:33 UTC 2016 - mpluskal@suse.com - Use cmake macros - Use ninja to speedup building ------------------------------------------------------------------- Mon Jul 4 15:04:04 UTC 2016 - mardnh@gmx.de - Add patch: 0001-Fixes-357-broken-build-with-GCC-6.1.patch * Fix build with GCC 6.x ------------------------------------------------------------------- Tue Apr 5 07:18:43 UTC 2016 - mpluskal@suse.com - Update to 0.13 * kernel compiler support for LLVM/Clang 3.8 * initial (partial) OpenCL 2.0 support * CMake build system almost on parity with autotools * Improved HSA support * Other optimizations and bug fixes ------------------------------------------------------------------- Mon Oct 26 20:02:35 UTC 2015 - mardnh@gmx.de - update to version 0.12 Highlights * Support for HSA-compliant devices (kernel agents). The GPU of AMD Kaveri now works through pocl with a bunch of test cases in the AMD SDK 2.9 example suite. * New and improved kernel cache system that enables caching kernels with #includes. * Support for LLVM/Clang 3.7. * Little endian MIPS32 now passes almost all pocl testsuite tests. OpenCL Runtime/Platform API support * Transferred buffer read/write/copy offset calculation to device driver side. - these driver api functions have changed; got offset as a new argument. * Maximum allocation is not limited to 1/4th of total memory size. * Maximum image dimensions grow to fit maximum allocation. * clGetDeviceInfo() reports better information about CPU vendor and cache. * experimental clCreateSubDevices() for pthread CPU device. OpenCL C Builtin Function Implementations * Implemented get_image_dim(). Bugfixes * Avoid infinite loops when users recycle an event waiting list. * Correctly report the base address alignment. * Lots of others. Misc * Tests now using new cl2.hpp, removing dependency on OpenGL headers - remove OpenGL-related packages from BuildRequires - add rpmlintrc ------------------------------------------------------------------- Wed Jul 1 14:54:09 UTC 2015 - cdenicolo@suse.com - license update: MIT overall license is MIT, other licenses refere to build scripts only. ------------------------------------------------------------------- Thu Mar 12 19:11:26 UTC 2015 - mardnh@gmx.de - update to version 0.11 This release adds: * kernel compiler support for LLVM/Clang 3.6, * caching of compiled OpenCL kernels * initial Android support * experimental Windows support (many things still broken there) * two new examples, Cloverleaf and Halide, updated AMDSDK examples * better debugging possibilities * initial MIPS architecture support ------------------------------------------------------------------- Tue Oct 7 19:16:42 UTC 2014 - mardnh@gmx.de - initial stable package, version 0.10 based on home:mnhauke:opencl:testing/pocl