-------------------------------------------------------------------
Tue Dec 19 18:44:43 UTC 2017 - mardnh@gmx.de

- Update to version 1.0
  Highlights
  * Improved automatic local work-group sizing on kernel enqueue,
    taking into account standard constraints, SIMD width for
    vectorization as well as the number of compute units available
    on the device.
  * Support for NVIDIA GPUs via a new CUDA backend (currently
    experimental).
  * Removed support for BBVectorizer.
  * LLVM 5.0 is now supported.
  * A few build options have been added for distribution builds,
    see README.packaging.
  * Somewhat improved scalability in the CPU driver. CPUs with many
    cores and programs using a lot of WIs with small kernels can
    run somewhat faster.
  * Full conformance with OpenCL 1.2 standard, enabled by default.
    There are some caveats though - see the documentation.
  * When conformance is enabled, some kernel library functions
    might be slower than in previous releases.
  * Pocl now reports OpenCL 1.2 instead of 2.0, except HSA enabled
    builds.
  * Updated format of pocl binaries, which is NOT backwards
    compatible.
  * You'll need to clean any kernel caches.
  * Fixed several memory leaks.
  * Unresolved symbols (missing/misspelled functions etc) in a
    kernel will result in error in clBuildProgram() instead of
    pocl silently ignoring them and then aborting at dlopen().
  * New env variable POCL_MEMORY_LIMIT=N limits the Global memory
    size reported by pocl to N gigabytes.
  * New env variable POCL_AFFINITY (defaults to 0): if enabled,
    sets the affinity of each CPU driver pthread to a single core.
  * Improved AVX512 support (with LLVM 5.0). Note that even with
    LLVM 5.0 there are still a few bugs (see pocl issue #555);
    AVX512 + LLVM 4.0 are a lot more broken, and probably not
    worth trying.
  * POCL_DEBUG env var has been revamped.
    You can now limit debuginfo to these categories (or their
    combination):
    all,error,warning,general,memory,llvm,events,cache,locking,
    refcounts,timing,hsa,tce,cuda
  * The old setting POCL_DEBUG=1 now equals error+warning+general.
- Remove patch:
  * pocl-disable-tests.diff
- Disable CUDA backend since it depends on CUDA_TOOLKIT which is
  not available in Factory

-------------------------------------------------------------------
Thu Oct 26 14:13:18 UTC 2017 - mpluskal@suse.com

- Simplify spec file a bit
- Enable CUDA backend
- Enable all available cpu specific kernels for intel platform

-------------------------------------------------------------------
Sat Oct 21 06:28:35 UTC 2017 - mpluskal@suse.com

- We need clang4-devel-static to build with current clang packaging
- Small spec-file cleanup
  * drop conditionals for older releases than Factory as building
    was not possible anyways

-------------------------------------------------------------------
Wed Jul 12 10:50:47 UTC 2017 - jengelh@inai.de

- Description should say what it is, not what it plans in the
  future.

-------------------------------------------------------------------
Sat Jul  8 17:29:43 UTC 2017 - mardnh@gmx.de

- Fix runtime linking issues (missing crtbeginS.so)
- Require gcc for Factory

-------------------------------------------------------------------
Mon Jul  3 21:23:21 UTC 2017 - mardnh@gmx.de

- Fix path in the ICD-file

-------------------------------------------------------------------
Fri May 19 08:49:22 UTC 2017 - idonmez@suse.com

- Update library name for uthash -> libut2

-------------------------------------------------------------------
Tue May 16 18:59:30 UTC 2017 - mardnh@gmx.de

- Update to 0.14
  - Support for LLVM/Clang versions 3.9 and 4.0. Version 3.9 was
    the first release to include all frontend features for
    OpenCL 2.0.
  - Ability to build pocl in a mode where online compilation is
    not supported to run in hosts without LLVM and binaries
    compiled offline e.g. using poclcc.
  - pocl's binary format now can contain all the necessary bits to
    execute the programs on a host without online compiler
    support.
  - Initial support for out-of-order execution of command queues.
  - It's now possible to cross-compile pocl when building an
    offline compiler build.
  - New driver api extension to support out-of-order and
    asynchronous devices/drivers.
  - Pthread and HSA drivers are now fully asynchronous.
  - CMake is now the only supported build system, autotools
    removed.
  - LTTng tracing support
- Add patches:
  - pocl-disable-tests.diff
    - compilation errors on some tests; disable tests for now
- Remove patches:
  - 0001-Fixes-357-broken-build-with-GCC-6.1.patch
    - fixed upstream

-------------------------------------------------------------------
Thu Feb  2 10:52:12 UTC 2017 - adam.majer@suse.de

- use individual libboost-*-devel packages instead of boost-devel

-------------------------------------------------------------------
Fri Jan 20 08:45:21 UTC 2017 - mpluskal@suse.com

- Use llvm3_8 for building and as runtime dependency

-------------------------------------------------------------------
Sun Oct  9 09:17:33 UTC 2016 - mpluskal@suse.com

- Use cmake macros
- Use ninja to speedup building

-------------------------------------------------------------------
Mon Jul  4 15:04:04 UTC 2016 - mardnh@gmx.de

- Add patch: 0001-Fixes-357-broken-build-with-GCC-6.1.patch
  * Fix build with GCC 6.x

-------------------------------------------------------------------
Tue Apr  5 07:18:43 UTC 2016 - mpluskal@suse.com

- Update to 0.13
  * kernel compiler support for LLVM/Clang 3.8
  * initial (partial) OpenCL 2.0 support
  * CMake build system almost on parity with autotools
  * Improved HSA support
  * Other optimizations and bug fixes

-------------------------------------------------------------------
Mon Oct 26 20:02:35 UTC 2015 - mardnh@gmx.de

- update to version 0.12
  Highlights
  * Support for HSA-compliant devices (kernel agents).
    The GPU of AMD Kaveri now works through pocl with a bunch of
    test cases in the AMD SDK 2.9 example suite.
  * New and improved kernel cache system that enables caching
    kernels with #includes.
  * Support for LLVM/Clang 3.7.
  * Little endian MIPS32 now passes almost all pocl testsuite
    tests.
  OpenCL Runtime/Platform API support
  * Transferred buffer read/write/copy offset calculation to
    device driver side.
    - these driver api functions have changed; got offset as a
      new argument.
  * Maximum allocation is not limited to 1/4th of total memory
    size.
  * Maximum image dimensions grow to fit maximum allocation.
  * clGetDeviceInfo() reports better information about CPU vendor
    and cache.
  * experimental clCreateSubDevices() for pthread CPU device.
  OpenCL C Builtin Function Implementations
  * Implemented get_image_dim().
  Bugfixes
  * Avoid infinite loops when users recycle an event waiting list.
  * Correctly report the base address alignment.
  * Lots of others.
  Misc
  * Tests now using new cl2.hpp, removing dependency on OpenGL
    headers
- remove OpenGL-related packages from BuildRequires
- add rpmlintrc

-------------------------------------------------------------------
Wed Jul  1 14:54:09 UTC 2015 - cdenicolo@suse.com

- license update: MIT
  overall license is MIT, other licenses refer to build scripts
  only.

-------------------------------------------------------------------
Thu Mar 12 19:11:26 UTC 2015 - mardnh@gmx.de

- update to version 0.11
  This release adds:
  * kernel compiler support for LLVM/Clang 3.6,
  * caching of compiled OpenCL kernels
  * initial Android support
  * experimental Windows support (many things still broken there)
  * two new examples, Cloverleaf and Halide, updated AMDSDK
    examples
  * better debugging possibilities
  * initial MIPS architecture support

-------------------------------------------------------------------
Tue Oct  7 19:16:42 UTC 2014 - mardnh@gmx.de

- initial stable package, version 0.10
  based on home:mnhauke:opencl:testing/pocl