rocBLAS-2.24.0 for ROCm 3.6.0
New Features
- Improvements to the User Guide and Design Document
- L1 dot function optimized to utilize shuffle instructions (improvements on bf16, f16, and f32 data types)
- L1 dot function: added an optimized kernel for the x dot x case
- Standardized L1 rocblas-bench to use device pointer mode, to focus on GPU memory bandwidth
- Adjustments for hipcc (hip-clang) as the standard build compiler, and CentOS 8 support
- Added Fortran interface for all rocBLAS functions
Known Issues
None