Changelog for accelerate-llvm-ptx-1.3.0.0
Change Log
Notable changes to the project will be documented in this file.
The format is based on Keep a Changelog and the project adheres to the Haskell Package Versioning Policy (PVP).
1.3.0.0 - 2018-08-27
Changed
- Code generation improvements for stencil operations
Fixed
- Segmented folds crash or give inconsistent results (accelerate#423)
- Synchronisation problems on SM7+ (#436)
Contributors
Special thanks to those who contributed patches as part of this release:
- Trevor L. McDonell (@tmcdonell)
- Josh Meredith (@JoshMeredith)
- Ivo Gabe de Wolff (@ivogabe)
- Lars van den Haak (@sakehl)
1.2.0.0 - 2018-04-03
Changed
- `run` variants which do not take an explicit execution context now execute on the first available device in an exclusive fashion. Multi-GPU systems can specify the default set of GPUs to use with environment variable `ACCELERATE_LLVM_PTX_DEVICES` as a list of device ordinals.
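The entry above does not spell out the syntax of the device list. As a minimal sketch, assuming the value is written as a Haskell list literal such as `[0,1]` (an assumption, not something this changelog states; check the backend documentation before relying on it), the variable could be inspected like this. The helper name `parseOrdinals` is hypothetical and is not part of the package.

```haskell
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- | Parse a list of device ordinals.
-- ASSUMPTION: the list uses Haskell list syntax, e.g. "[0,1]".
-- 'parseOrdinals' is a hypothetical helper for illustration only.
parseOrdinals :: String -> Maybe [Int]
parseOrdinals = readMaybe

main :: IO ()
main = do
  -- Look up the variable named in the changelog entry.
  menv <- lookupEnv "ACCELERATE_LLVM_PTX_DEVICES"
  case menv >>= parseOrdinals of
    Just ds -> putStrLn ("requested device ordinals: " ++ show ds)
    Nothing -> putStrLn "variable unset or not parsable as a list"
```

Under this assumption a program would be launched with the variable already set in its environment, for example with the value `[0,1]` to request the first two devices.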
Added
- support for half-precision floats
- support for struct-of-array-of-struct representations
- support 64-bit atomic-add instruction in forward permutations (#363)
- support for LLVM-6.0
- support for GHC-8.4
Contributors
Special thanks to those who contributed patches as part of this release:
- Trevor L. McDonell (@tmcdonell)
- Moritz Kiefer (@cocreature)
1.1.0.1 - 2018-01-08
Fixed
- add support for building with CUDA-9.x
1.1.0.0 - 2017-09-21
Added
- support for GHC-8.2
- caching of compilation results (accelerate-llvm#17)
- support for ahead-of-time compilation (`runQ` and `runQAsync`)
Changed
- generalise `run1*` to polyvariadic `runN*`
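To illustrate what the polyvariadic form allows, here is a small sketch of a two-argument Accelerate function executed through `runN`. The dot-product function, the top-level names `xs`, `ys` and `dotp'`, and the input sizes are chosen for illustration only. Running it assumes the `accelerate` and `accelerate-llvm-ptx` packages are installed and a CUDA-capable device is available.

```haskell
import Data.Array.Accelerate          as A
import Data.Array.Accelerate.LLVM.PTX as PTX

-- An embedded function of two array arguments. A unary 'run1' cannot
-- take this directly; the arguments would first have to be tupled.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp as bs = A.fold (+) 0 (A.zipWith (*) as bs)

-- Illustrative inputs (hypothetical data, ten elements each).
xs, ys :: Vector Float
xs = A.fromList (Z :. 10) [0 ..]
ys = A.fromList (Z :. 10) [1, 3 ..]

-- 'runN' turns the embedded function into an ordinary Haskell function
-- with one plain array argument per 'Acc' argument.
dotp' :: Vector Float -> Vector Float -> Scalar Float
dotp' = PTX.runN dotp

main :: IO ()
main = print (dotp' xs ys)
```

Binding the result of `runN` once, as `dotp'` does here, lets repeated calls reuse the same compiled function instead of each call going through `runN` again.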
Fixed
- synchronisation bug in multidimensional reduction
1.0.0.1 - 2017-05-25
Fixed
- device kernel image is invalid (#386)
1.0.0.0 - 2017-03-31
- initial release