This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Thrust 1.5.0

@brycelelbach brycelelbach released this 16 May 09:44
· 3225 commits to master since this release

Thrust 1.5.0 introduces new programmer productivity and performance enhancements. New functionality for creating anonymous "lambda" functions has been added. A faster host sort provides 2-10x faster performance for sorting arithmetic types on (single-threaded) CPUs. A new OpenMP sort provides a 2.5x-3.0x speedup over the host sort on a quad-core CPU. When sorting arithmetic types with the OpenMP backend, the combined performance improvement is 5.9x for 32-bit integers and ranges from 3.0x (64-bit types) to 14.2x (8-bit types). A new CUDA reduce_by_key implementation provides 2-3x faster performance.

Breaking Changes

  • device_ptr<void> no longer unsafely converts to device_ptr<T> without an explicit cast. Use the expression device_pointer_cast(static_cast<int*>(void_ptr.get())) to convert, for example, device_ptr<void> to device_ptr<int>.

New Features

  • Algorithms:
    • Stencil-less thrust::transform_if.
  • Lambda placeholders

New Examples

  • lambda

Other Enhancements

  • Host sort is 2-10x faster for arithmetic types
  • OMP sort provides a 2.5x-3.0x speedup over host sort on a quad-core CPU
  • reduce_by_key is 2-3x faster
  • reduce_by_key no longer requires O(N) temporary storage
  • CUDA scan algorithms are 10-40% faster
  • host_vector and device_vector are now documented
  • out-of-memory exceptions now provide detailed information from CUDART
  • improved histogram example
  • device_reference now has a specialized swap
  • reduce_by_key and scan algorithms are compatible with discard_iterator

Bug Fixes

  • #44 allow host_vector to compile when value_type uses __align__
  • #198 allow adjacent_difference to permit safe in-situ operation
  • #303 make thrust thread-safe
  • #313 avoid race conditions in device_vector::insert
  • #314 avoid unintended ADL invocation when dispatching copy
  • #365 fix merge and set operation failures

Known Issues

  • None

Acknowledgments

  • Thanks to Manjunath Kudlur for contributing his Carbon library, from which the lambda functionality is derived.
  • Thanks to Jean-Francois Bastien for suggesting a fix for #303.