Thrust 1.5.0
Thrust 1.5.0 introduces new programmer productivity and performance enhancements. New functionality for creating anonymous "lambda" functions has been added. A faster host sort is 2-10x faster when sorting arithmetic types on (single-threaded) CPUs. A new OpenMP sort provides a 2.5x-3.0x speedup over the host sort on a quad-core CPU. When sorting arithmetic types with the OpenMP backend, the combined performance improvement is 5.9x for 32-bit integers and ranges from 3.0x (64-bit types) to 14.2x (8-bit types). A new CUDA reduce_by_key implementation is 2-3x faster.
Breaking Changes
- device_ptr<void> no longer unsafely converts to device_ptr<T> without an explicit cast. Use the expression device_pointer_cast(static_cast<int*>(void_ptr.get())) to convert, for example, device_ptr<void> to device_ptr<int>.
New Features
- Algorithms:
  - Stencil-less thrust::transform_if
- Lambda placeholders
New Examples
- lambda
Other Enhancements
- Host sort is 2-10x faster for arithmetic types
- OMP sort provides a 2.5x-3.0x speedup over host sort on a quad-core CPU
- reduce_by_key is 2-3x faster
- reduce_by_key no longer requires O(N) temporary storage
- CUDA scan algorithms are 10-40% faster
- host_vector and device_vector are now documented
- out-of-memory exceptions now provide detailed information from CUDART
- improved histogram example
- device_reference now has a specialized swap
- reduce_by_key and scan algorithms are compatible with discard_iterator
Bug Fixes
- #44 allow host_vector to compile when value_type uses __align__
- #198 allow adjacent_difference to permit safe in-situ operation
- #303 make Thrust thread-safe
- #313 avoid race conditions in device_vector::insert
- #314 avoid unintended ADL invocation when dispatching copy
- #365 fix merge and set operation failures
Known Issues
- None
Acknowledgments
- Thanks to Manjunath Kudlur for contributing his Carbon library, from which the lambda functionality is derived.
- Thanks to Jean-Francois Bastien for suggesting a fix for #303.