Merge remote-tracking branch 'upstream/develop' into deprecate/in_parallel
ajpowelsnl committed Nov 29, 2023
2 parents 8a561c0 + 0d34280 commit 6c6ca08
Showing 98 changed files with 2,194 additions and 991 deletions.
11 changes: 11 additions & 0 deletions .github/workflows/clang-format-check.yml
@@ -0,0 +1,11 @@
name: clang-format check
on: [push, pull_request]
jobs:
formatting-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run clang-format style check.
uses: DoozyX/[email protected]
with:
clangFormatVersion: 8
8 changes: 4 additions & 4 deletions .github/workflows/continuous-integration-workflow.yml
@@ -25,22 +25,22 @@ jobs:
backend: ['OPENMP']
clang-tidy: ['']
include:
- distro: 'fedora:intel'
- distro: 'ubuntu:intel'
cxx: 'icpc'
cxx_extra_flags: '-diag-disable=177,10441'
cmake_build_type: 'Release'
backend: 'OPENMP'
- distro: 'fedora:intel'
- distro: 'ubuntu:intel'
cxx: 'icpc'
cxx_extra_flags: '-diag-disable=177,10441'
cmake_build_type: 'Debug'
backend: 'OPENMP'
- distro: 'fedora:intel'
- distro: 'ubuntu:intel'
cxx: 'icpx'
cxx_extra_flags: '-fp-model=precise -Wno-pass-failed'
cmake_build_type: 'Release'
backend: 'OPENMP'
- distro: 'fedora:intel'
- distro: 'ubuntu:intel'
cxx: 'icpx'
cxx_extra_flags: '-fp-model=precise -Wno-pass-failed'
cmake_build_type: 'Debug'
93 changes: 92 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,97 @@
# CHANGELOG

## [4.1.00](https://github.com/kokkos/kokkos/tree/4.0.01) (2023-06-16)
## [4.2.00](https://github.com/kokkos/kokkos/tree/4.2.00) (2023-11-06)
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.1.00...4.2.00)

### Features:
- SIMD: significant improvements to SIMD support and alignment with C++26 SIMD
- add `Kokkos::abs` overload for SIMD types [\#6069](https://github.com/kokkos/kokkos/pull/6069)
- add generator constructors [\#6347](https://github.com/kokkos/kokkos/pull/6347)
- convert binary operators to hidden friends [\#6320](https://github.com/kokkos/kokkos/pull/6320)
- add shift operators [\#6109](https://github.com/kokkos/kokkos/pull/6109)
- add `float` support [\#6177](https://github.com/kokkos/kokkos/pull/6177)
- add remaining `gather_from` and `scatter_to` overloads [\#6220](https://github.com/kokkos/kokkos/pull/6220)
- define simd math function overloads in the Kokkos namespace [\#6465](https://github.com/kokkos/kokkos/pull/6465), [\#6487](https://github.com/kokkos/kokkos/pull/6487)
- `Kokkos_ARCH_NATIVE=ON` autodetects SIMD types supported [\#6188](https://github.com/kokkos/kokkos/pull/6188)
- fix AVX2 SIMD support for ZEN2 AMD CPU [\#6238](https://github.com/kokkos/kokkos/pull/6238)
- `Kokkos::printf` [\#6083](https://github.com/kokkos/kokkos/pull/6083)
- `Kokkos::sort`: support custom comparator [\#6253](https://github.com/kokkos/kokkos/pull/6253)
- `half_t` and `bhalf_t` numeric traits [\#5778](https://github.com/kokkos/kokkos/pull/5778)
- `half_t` and `bhalf_t` mixed comparisons [\#6407](https://github.com/kokkos/kokkos/pull/6407)
- `half_t` and `bhalf_t` mathematical functions [\#6124](https://github.com/kokkos/kokkos/pull/6124)
- `TeamThreadRange` `parallel_scan` with return value [\#6090](https://github.com/kokkos/kokkos/pull/6090), [\#6301](https://github.com/kokkos/kokkos/pull/6301), [\#6302](https://github.com/kokkos/kokkos/pull/6302), [\#6303](https://github.com/kokkos/kokkos/pull/6303), [\#6307](https://github.com/kokkos/kokkos/pull/6307)
- `ThreadVectorRange` `parallel_scan` with return value [\#6235](https://github.com/kokkos/kokkos/pull/6235), [\#6242](https://github.com/kokkos/kokkos/pull/6242), [\#6308](https://github.com/kokkos/kokkos/pull/6308), [\#6305](https://github.com/kokkos/kokkos/pull/6305), [\#6292](https://github.com/kokkos/kokkos/pull/6292)
- Add team-level std algorithms [\#6200](https://github.com/kokkos/kokkos/pull/6200), [\#6205](https://github.com/kokkos/kokkos/pull/6205), [\#6207](https://github.com/kokkos/kokkos/pull/6207), [\#6208](https://github.com/kokkos/kokkos/pull/6208), [\#6209](https://github.com/kokkos/kokkos/pull/6209), [\#6210](https://github.com/kokkos/kokkos/pull/6210), [\#6211](https://github.com/kokkos/kokkos/pull/6211), [\#6212](https://github.com/kokkos/kokkos/pull/6212), [\#6213](https://github.com/kokkos/kokkos/pull/6213), [\#6256](https://github.com/kokkos/kokkos/pull/6256), [\#6258](https://github.com/kokkos/kokkos/pull/6258), [\#6350](https://github.com/kokkos/kokkos/pull/6350), [\#6351](https://github.com/kokkos/kokkos/pull/6351)
- Serial: Allow for distinct execution space instances [\#6441](https://github.com/kokkos/kokkos/pull/6441)

### Backend and Architecture Enhancements:

#### CUDA:
- Fixed potential data race in Cuda `parallel_reduce` [\#6236](https://github.com/kokkos/kokkos/pull/6236)
- Use `cudaMallocAsync` by default [\#6402](https://github.com/kokkos/kokkos/pull/6402)
- Bugfix for using Kokkos from a thread of execution [\#6299](https://github.com/kokkos/kokkos/pull/6299)

#### HIP:
- New naming convention for AMD GPU: VEGA906, VEGA908, VEGA90A, NAVI1030 to AMD_GFX906, AMD_GFX908, AMD_GFX90A, AMD_GFX1030 [\#6266](https://github.com/kokkos/kokkos/pull/6266)
- Add initial support for gfx942: [\#6358](https://github.com/kokkos/kokkos/pull/6358)
- Improve reduction performance [\#6229](https://github.com/kokkos/kokkos/pull/6229)
- Deprecate `HIP(hipStream_t,bool)` constructor [\#6401](https://github.com/kokkos/kokkos/pull/6401)
- Add support for Graph [\#6370](https://github.com/kokkos/kokkos/pull/6370)
- Improve reduction performance when using Teams [\#6284](https://github.com/kokkos/kokkos/pull/6284)
- Fix concurrency calculation [\#6479](https://github.com/kokkos/kokkos/pull/6479)
- Fix potential data race in HIP `parallel_reduce` [\#6429](https://github.com/kokkos/kokkos/pull/6429)

#### SYCL:
- Enforce external `sycl::queues` to be in-order [\#6246](https://github.com/kokkos/kokkos/pull/6246)
- Improve reduction performance: [\#6272](https://github.com/kokkos/kokkos/pull/6272) [\#6271](https://github.com/kokkos/kokkos/pull/6271) [\#6270](https://github.com/kokkos/kokkos/pull/6270) [\#6264](https://github.com/kokkos/kokkos/pull/6264)
- Allow using the SYCL execution space on AMD GPUs [\#6321](https://github.com/kokkos/kokkos/pull/6321)
- Allow sorting via native oneDPL to support Views with stride=1 [\#6322](https://github.com/kokkos/kokkos/pull/6322)
- Make in-order queues the default via macro [\#6189](https://github.com/kokkos/kokkos/pull/6189)

#### OpenACC:
- Support Clacc compiler [\#6250](https://github.com/kokkos/kokkos/pull/6250)

### General Enhancements
- Add missing `is_*_view` traits and `is_*_view_v` helper variable templates for `DynRankView`, `DynamicView`, `OffsetView`, `ScatterView` containers [\#6195](https://github.com/kokkos/kokkos/pull/6195)
- Make `nvcc_wrapper` and `compiler_launcher` scripts more portable by switching to a `#!/usr/bin/env` shebang [\#6357](https://github.com/kokkos/kokkos/pull/6357)
- Add an improved `Kokkos::malloc` / `Kokkos::free` performance test [\#6377](https://github.com/kokkos/kokkos/pull/6377)
- Ensure `Views` with `size==0` can be used with `deep_copy` [\#6273](https://github.com/kokkos/kokkos/pull/6273)
- `Kokkos::abort` is moved to header `Kokkos_Abort.hpp` [\#6445](https://github.com/kokkos/kokkos/pull/6445)
- `KOKKOS_ASSERT`, `KOKKOS_EXPECTS`, `KOKKOS_ENSURES` are moved to header `Kokkos_Assert.hpp` [\#6445](https://github.com/kokkos/kokkos/pull/6445)
- Add a permuted-index mode to the gups benchmark [\#6378](https://github.com/kokkos/kokkos/pull/6378)
- Check for overflow during backend initialization [\#6159](https://github.com/kokkos/kokkos/pull/6159)
- Make constraints on `Kokkos::sort` more visible [\#6234](https://github.com/kokkos/kokkos/pull/6234) and cleanup API [\#6239](https://github.com/kokkos/kokkos/pull/6239)
- Add converting assignment to `DualView`: [\#6474](https://github.com/kokkos/kokkos/pull/6474)


### Build System Changes

- Export `Kokkos_CXX_COMPILER_VERSION` [\#6282](https://github.com/kokkos/kokkos/pull/6282)
- Disable default oneDPL support in Trilinos [\#6342](https://github.com/kokkos/kokkos/pull/6342)

### Incompatibilities (i.e. breaking changes)
- Ensure that `Kokkos::complex` only gets instantiated for cv-unqualified floating-point types [\#6251](https://github.com/kokkos/kokkos/pull/6251)
- Removed (deprecated-3) support for volatile join operators in reductions [\#6385](https://github.com/kokkos/kokkos/pull/6385)
- Enforce `ViewCtorArgs` restrictions for `create_mirror_view` [\#6304](https://github.com/kokkos/kokkos/pull/6304)
- SIMD types for ARM NEON are not autodetected anymore but need `Kokkos_ARCH_ARM_NEON` or `Kokkos_ARCH_NATIVE=ON` [\#6394](https://github.com/kokkos/kokkos/pull/6394)
- Remove `#include <iostream>` from headers where possible [\#6482](https://github.com/kokkos/kokkos/pull/6482)
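
The first incompatibility above (restricting `Kokkos::complex` to cv-unqualified floating-point types) can be illustrated with a small standard-C++ sketch. This is not the Kokkos implementation: `my_complex` and `is_valid_complex_value_type` are hypothetical names invented here, and the snippet only assumes that the restriction is enforced with compile-time checks of this general shape.

```cpp
#include <cassert>      // for the runtime checks in the usage note
#include <type_traits>

// Hypothetical stand-in for a complex-number template; NOT Kokkos::complex.
template <class RealType>
class my_complex {
  // std::is_floating_point is also true for `const double`, so the
  // cv-qualification check has to be made separately.
  static_assert(std::is_floating_point<RealType>::value,
                "RealType must be a floating-point type");
  static_assert(std::is_same<RealType, std::remove_cv_t<RealType>>::value,
                "RealType must not be const- or volatile-qualified");

 public:
  constexpr my_complex(RealType re = RealType(), RealType im = RealType())
      : re_(re), im_(im) {}
  constexpr RealType real() const { return re_; }
  constexpr RealType imag() const { return im_; }

 private:
  RealType re_;
  RealType im_;
};

// Hypothetical trait mirroring the two static_asserts, so the rule can be
// queried without triggering a hard compile error.
template <class T>
struct is_valid_complex_value_type
    : std::integral_constant<
          bool, std::is_floating_point<T>::value &&
                    std::is_same<T, std::remove_cv_t<T>>::value> {};
```

Under these assumptions, `my_complex<double>` compiles, while `my_complex<const double>` and `my_complex<int>` are rejected at compile time; `is_valid_complex_value_type<const double>::value` evaluates to `false`.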

### Deprecations
- Deprecated `Kokkos::vector` [\#6252](https://github.com/kokkos/kokkos/pull/6252)
- All host allocation mechanisms except for `STD_MALLOC` have been deprecated [\#6341](https://github.com/kokkos/kokkos/pull/6341)

### Bug Fixes
- Missing memory fence in `RandomPool::free_state` functions [\#6290](https://github.com/kokkos/kokkos/pull/6290)
- Fix for corner case in `Kokkos::Experimental::is_partitioned` algorithm [\#6257](https://github.com/kokkos/kokkos/pull/6257)
- Fix initialization of scratch lock variables in the `Cuda` backend [\#6433](https://github.com/kokkos/kokkos/pull/6433)
- Fixes for `Kokkos::Array` [\#6372](https://github.com/kokkos/kokkos/pull/6372)
- Fixed symlink configure issue for Windows [\#6241](https://github.com/kokkos/kokkos/pull/6241)
- OpenMPTarget init-join fix [\#6444](https://github.com/kokkos/kokkos/pull/6444)
- Fix atomic operations bug for Min and Max [\#6435](https://github.com/kokkos/kokkos/pull/6435)
- Fix implementation for `cyl_bessel_i0` [\#6484](https://github.com/kokkos/kokkos/pull/6484)
- Fix various NVCC warnings in `BinSort`, `Array`, and bit manipulation function templates [\#6483](https://github.com/kokkos/kokkos/pull/6483)

## [4.1.00](https://github.com/kokkos/kokkos/tree/4.1.00) (2023-06-16)
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.0.01...4.1.00)

### Features:
6 changes: 6 additions & 0 deletions Makefile.kokkos
@@ -1440,6 +1440,12 @@ ifeq ($(KOKKOS_INTERNAL_USE_OPENMPTARGET), 1)
else
tmp := $(call desul_append_header,"/* $H""undef DESUL_ATOMICS_ENABLE_OPENMP */")
endif

ifeq ($(KOKKOS_INTERNAL_USE_OPENACC), 1)
tmp := $(call desul_append_header,"$H""define DESUL_ATOMICS_ENABLE_OPENACC")
else
tmp := $(call desul_append_header,"/* $H""undef DESUL_ATOMICS_ENABLE_OPENACC */")
endif
tmp := $(call desul_append_header, "")
tmp := $(call desul_append_header, "$H""endif")

34 changes: 16 additions & 18 deletions algorithms/src/Kokkos_Random.hpp
@@ -849,18 +849,17 @@ class Random_XorShift64 {
return drand(end - start) + start;
}

// Marsaglia polar method for drawing a standard normal distributed random
// Box-Muller method for drawing a standard normal distributed random
// number
KOKKOS_INLINE_FUNCTION
double normal() {
double S = 2.0;
double U;
while (S >= 1.0) {
U = 2.0 * drand() - 1.0;
const double V = 2.0 * drand() - 1.0;
S = U * U + V * V;
}
return U * std::sqrt(-2.0 * std::log(S) / S);
constexpr auto two_pi = 2 * Kokkos::numbers::pi_v<double>;

const double u = drand();
const double v = drand();
const double r = Kokkos::sqrt(-2.0 * Kokkos::log(u));
const double theta = v * two_pi;
return r * Kokkos::cos(theta);
}

KOKKOS_INLINE_FUNCTION
@@ -1094,18 +1093,17 @@ class Random_XorShift1024 {
return drand(end - start) + start;
}

// Marsaglia polar method for drawing a standard normal distributed random
// Box-Muller method for drawing a standard normal distributed random
// number
KOKKOS_INLINE_FUNCTION
double normal() {
double S = 2.0;
double U;
while (S >= 1.0) {
U = 2.0 * drand() - 1.0;
const double V = 2.0 * drand() - 1.0;
S = U * U + V * V;
}
return U * std::sqrt(-2.0 * std::log(S) / S);
constexpr auto two_pi = 2 * Kokkos::numbers::pi_v<double>;

const double u = drand();
const double v = drand();
const double r = Kokkos::sqrt(-2.0 * Kokkos::log(u));
const double theta = v * two_pi;
return r * Kokkos::cos(theta);
}

KOKKOS_INLINE_FUNCTION
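
The two `normal()` hunks above swap the Marsaglia polar method for the Box-Muller transform. The sketch below shows both techniques side by side in plain standard C++ so the difference is easier to see: the polar method loops until a point lands inside the unit circle, while Box-Muller uses a fixed amount of work per sample. This is an illustration, not the Kokkos code: it uses `std::mt19937_64` instead of the Kokkos generators, and the helper names (`uniform_open0`, `normal_box_muller`, `normal_polar`, `estimate_moments`) are hypothetical. One deliberate difference, flagged as an assumption: the uniform draw here lies in (0, 1] so that `log` never sees zero; the diff itself passes `drand()` straight to `Kokkos::log`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// Hypothetical helper: uniform double in (0, 1]. The top 53 bits give a
// value in [0, 1); taking 1 - u keeps log() away from zero.
inline double uniform_open0(std::mt19937_64& gen) {
  const double u =
      static_cast<double>(gen() >> 11) * (1.0 / 9007199254740992.0);  // 2^-53
  return 1.0 - u;
}

// Box-Muller transform: two uniforms in, one standard normal out, no loop.
inline double normal_box_muller(std::mt19937_64& gen) {
  constexpr double two_pi = 6.283185307179586476925286766559;
  const double u     = uniform_open0(gen);
  const double v     = uniform_open0(gen);
  const double r     = std::sqrt(-2.0 * std::log(u));
  const double theta = v * two_pi;
  return r * std::cos(theta);
}

// Marsaglia polar method: rejects points outside the unit circle, so the
// number of uniform draws per sample varies.
inline double normal_polar(std::mt19937_64& gen) {
  double s = 2.0;
  double u = 0.0;
  while (s >= 1.0 || s == 0.0) {
    u              = 2.0 * uniform_open0(gen) - 1.0;
    const double v = 2.0 * uniform_open0(gen) - 1.0;
    s              = u * u + v * v;
  }
  return u * std::sqrt(-2.0 * std::log(s) / s);
}

struct Moments {
  double mean;
  double variance;
};

// Draws n samples from `sample` and returns their mean and variance.
template <class Sampler>
Moments estimate_moments(Sampler sample, std::uint64_t seed, int n) {
  assert(n > 1);
  std::mt19937_64 gen(seed);
  double sum = 0.0, sum_sq = 0.0;
  for (int i = 0; i < n; ++i) {
    const double x = sample(gen);
    sum += x;
    sum_sq += x * x;
  }
  const double mean = sum / n;
  return {mean, sum_sq / n - mean * mean};
}
```

Calling `estimate_moments(normal_box_muller, seed, n)` and `estimate_moments(normal_polar, seed, n)` with a large `n` should give a mean near 0 and a variance near 1 for both samplers. A loop-free sampler may suit GPU execution better because every thread does the same amount of work, but the diff does not state its motivation, so treat that as a guess.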
4 changes: 3 additions & 1 deletion algorithms/src/std_algorithms/impl/Kokkos_Unique.hpp
@@ -105,7 +105,9 @@ IteratorType unique_exespace_impl(const std::string& label,
// using the same algorithm used for unique_copy but we now move things
using value_type = typename IteratorType::value_type;
using tmp_view_type = Kokkos::View<value_type*, ExecutionSpace>;
tmp_view_type tmp_view("std_unique_tmp_view", num_elements_to_explore);
tmp_view_type tmp_view(Kokkos::view_alloc(ex, Kokkos::WithoutInitializing,
"std_unique_tmp_view"),
num_elements_to_explore);

// scan extent is: num_elements_to_explore - 1
// for same reason as the one explained in unique_copy
14 changes: 7 additions & 7 deletions algorithms/unit_tests/Makefile
@@ -27,13 +27,13 @@ TARGETS =

tmp := $(foreach device, $(KOKKOS_DEVICELIST), \
$(if $(filter Test$(device).cpp, $(shell ls Test$(device).cpp 2>/dev/null)),,\
$(shell echo "\#include <Test"${device}"_Category.hpp>" > Test$(device).cpp); \
$(shell echo "\#include <TestRandom.hpp>" >> Test$(device).cpp); \
$(shell echo "\#include <TestSort.hpp>" >> Test$(device).cpp); \
$(shell echo "\#include <TestBinSortA.hpp>" >> Test$(device).cpp); \
$(shell echo "\#include <TestBinSortB.hpp>" >> Test$(device).cpp); \
$(shell echo "\#include <TestNestedSort.hpp>" >> Test$(device).cpp); \
$(shell echo "\#include <TestSortCustomComp.hpp>" >> Test$(device).cpp); \
$(shell echo "$(H)include <Test"${device}"_Category.hpp>" > Test$(device).cpp); \
$(shell echo "$(H)include <TestRandom.hpp>" >> Test$(device).cpp); \
$(shell echo "$(H)include <TestSort.hpp>" >> Test$(device).cpp); \
$(shell echo "$(H)include <TestBinSortA.hpp>" >> Test$(device).cpp); \
$(shell echo "$(H)include <TestBinSortB.hpp>" >> Test$(device).cpp); \
$(shell echo "$(H)include <TestNestedSort.hpp>" >> Test$(device).cpp); \
$(shell echo "$(H)include <TestSortCustomComp.hpp>" >> Test$(device).cpp); \
) \
)

4 changes: 2 additions & 2 deletions algorithms/unit_tests/TestStdAlgorithmsModOps.cpp
@@ -48,7 +48,7 @@ struct MyMovableType {
TEST(std_algorithms_mod_ops_test, move) {
MyMovableType a;
using move_t = decltype(std::move(a));
static_assert(std::is_rvalue_reference<move_t>::value, "");
static_assert(std::is_rvalue_reference<move_t>::value);

// move constr
MyMovableType b(std::move(a));
@@ -70,7 +70,7 @@ struct StdAlgoModSeqOpsTestMove {
void operator()(const int index) const {
typename ViewType::value_type a{11};
using move_t = decltype(std::move(a));
static_assert(std::is_rvalue_reference<move_t>::value, "");
static_assert(std::is_rvalue_reference<move_t>::value);
m_view(index) = std::move(a);
}

6 changes: 2 additions & 4 deletions algorithms/unit_tests/TestStdAlgorithmsPartitionCopy.cpp
@@ -110,11 +110,9 @@ void verify_data(const std::string& name, ResultType my_result,
ViewTypeDestFalse view_dest_false, PredType pred) {
using value_type = typename ViewTypeFrom::value_type;
static_assert(
std::is_same<value_type, typename ViewTypeDestTrue::value_type>::value,
"");
std::is_same<value_type, typename ViewTypeDestTrue::value_type>::value);
static_assert(
std::is_same<value_type, typename ViewTypeDestFalse::value_type>::value,
"");
std::is_same<value_type, typename ViewTypeDestFalse::value_type>::value);

const std::size_t ext = view_from.extent(0);

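
The test-file hunks above drop the empty `""` message from `static_assert`. A minimal sketch of why that is possible, assuming a C++17 (or later) compiler: since C++17 the message argument is optional, whereas C++11 and C++14 require a string literal, which is why the older code passed an empty one. `Movable` and `moves_as_rvalue` are hypothetical names used only for illustration.

```cpp
#include <cassert>      // for the runtime check in the usage note
#include <type_traits>
#include <utility>

// Hypothetical type used only for illustration.
struct Movable {
  int value = 0;
};

template <class T>
constexpr bool moves_as_rvalue() {
  // Type produced by std::move applied to an lvalue of type T.
  using move_t = decltype(std::move(std::declval<T&>()));

  // C++17 and later: the message can be omitted.
  static_assert(std::is_rvalue_reference<move_t>::value);

  // C++11/C++14 form: a string literal is mandatory, even an empty one.
  static_assert(std::is_rvalue_reference<move_t>::value, "");

  return std::is_rvalue_reference<move_t>::value;
}
```

Both forms check the same condition. For example, `static_assert(moves_as_rvalue<Movable>());` compiles, and `assert(moves_as_rvalue<int>());` passes at run time.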
2 changes: 1 addition & 1 deletion bin/nvcc_wrapper
@@ -229,7 +229,7 @@ do
fi
;;
#Handle known nvcc args
--dryrun|--verbose|--keep|--source-in-ptx|-src-in-ptx|--keep-dir*|-G|-lineinfo|-extended-lambda|-expt-extended-lambda|-expt-relaxed-constexpr|--resource-usage|--fmad=*|--use_fast_math|--Wext-lambda-captures-this|-Wext-lambda-captures-this)
--dryrun|-dryrun|--verbose|-v|--keep|-keep|--source-in-ptx|-src-in-ptx|--keep-dir*|-keep-dir*|-G|-lineinfo|--generate-line-info|-extended-lambda|-expt-extended-lambda|-expt-relaxed-constexpr|--resource-usage|-res-usage|-fmad=*|--use_fast_math|-use_fast_math|--Wext-lambda-captures-this|-Wext-lambda-captures-this)
cuda_args="$cuda_args $1"
;;
#Handle more known nvcc args
13 changes: 9 additions & 4 deletions cmake/kokkos_arch.cmake
@@ -585,16 +585,20 @@ IF (KOKKOS_ENABLE_SYCL)
ENDIF()

# Check support for device_global variables
# FIXME_SYCL Even if SYCL_EXT_ONEAPI_DEVICE_GLOBAL is defined, we still can't
# use device global variables with shared libraries
IF(KOKKOS_ENABLE_SYCL AND NOT BUILD_SHARED_LIBS)
# FIXME_SYCL If SYCL_EXT_ONEAPI_DEVICE_GLOBAL is defined, we can use device
# global variables with shared libraries using the "non-separable compilation"
# implementation. Otherwise, the feature is not supported when building shared
# libraries. Thus, we don't even check for support if shared libraries are
# requested and SYCL_EXT_ONEAPI_DEVICE_GLOBAL is not defined.
IF(KOKKOS_ENABLE_SYCL)
STRING(REPLACE ";" " " CMAKE_REQUIRED_FLAGS "${KOKKOS_COMPILE_OPTIONS}")
INCLUDE(CheckCXXSymbolExists)
CHECK_CXX_SYMBOL_EXISTS(SYCL_EXT_ONEAPI_DEVICE_GLOBAL "sycl/sycl.hpp" KOKKOS_IMPL_HAVE_SYCL_EXT_ONEAPI_DEVICE_GLOBAL)
IF (KOKKOS_IMPL_HAVE_SYCL_EXT_ONEAPI_DEVICE_GLOBAL)
SET(KOKKOS_IMPL_SYCL_DEVICE_GLOBAL_SUPPORTED ON)
# Use the non-separable compilation implementation to support shared libraries as well.
COMPILER_SPECIFIC_FLAGS(DEFAULT -DDESUL_SYCL_DEVICE_GLOBAL_SUPPORTED)
ELSE()
ELSEIF(NOT BUILD_SHARED_LIBS)
INCLUDE(CheckCXXSourceCompiles)
CHECK_CXX_SOURCE_COMPILES("
#include <sycl/sycl.hpp>
@@ -614,6 +618,7 @@ IF(KOKKOS_ENABLE_SYCL AND NOT BUILD_SHARED_LIBS)
KOKKOS_IMPL_SYCL_DEVICE_GLOBAL_SUPPORTED)

IF(KOKKOS_IMPL_SYCL_DEVICE_GLOBAL_SUPPORTED)
# Only the separable compilation implementation is supported.
COMPILER_SPECIFIC_FLAGS(
DEFAULT -fsycl-device-code-split=off -DDESUL_SYCL_DEVICE_GLOBAL_SUPPORTED
)
2 changes: 1 addition & 1 deletion containers/src/Kokkos_DynRankView.hpp
@@ -1340,7 +1340,7 @@ class ViewMapping<

template <class MemoryTraits>
struct apply {
static_assert(Kokkos::is_memory_traits<MemoryTraits>::value, "");
static_assert(Kokkos::is_memory_traits<MemoryTraits>::value);

using traits_type =
Kokkos::ViewTraits<data_type, array_layout,
5 changes: 5 additions & 0 deletions containers/unit_tests/CMakeLists.txt
@@ -46,6 +46,11 @@ foreach(Tag Threads;Serial;OpenMP;HPX;Cuda;HIP;SYCL)
LIST(REMOVE_ITEM UnitTestSources ${dir}/TestCuda_DynViewAPI_generic.cpp)
endif()

# FIXME_NVHPC: NVC++-S-0000-Internal compiler error. extractor: bad opc 0
if(KOKKOS_ENABLE_CUDA AND KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
LIST(REMOVE_ITEM UnitTestSources ${dir}/TestCuda_WithoutInitializing.cpp)
endif()

KOKKOS_ADD_EXECUTABLE_AND_TEST(ContainersUnitTest_${Tag} SOURCES ${UnitTestSources})
endif()
endforeach()
4 changes: 2 additions & 2 deletions containers/unit_tests/Makefile
@@ -35,8 +35,8 @@ TESTS = Bitset DualView DynamicView DynViewAPI_generic DynViewAPI_rank12345 DynV
tmp := $(foreach device, $(KOKKOS_DEVICELIST), \
tmp2 := $(foreach test, $(TESTS), \
$(if $(filter Test$(device)_$(test).cpp, $(shell ls Test$(device)_$(test).cpp 2>/dev/null)),,\
$(shell echo "\#include<Test"$(device)"_Category.hpp>" > Test$(device)_$(test).cpp); \
$(shell echo "\#include<Test"$(test)".hpp>" >> Test$(device)_$(test).cpp); \
$(shell echo "$(H)include<Test"$(device)"_Category.hpp>" > Test$(device)_$(test).cpp); \
$(shell echo "$(H)include<Test"$(test)".hpp>" >> Test$(device)_$(test).cpp); \
)\
) \
)
6 changes: 6 additions & 0 deletions core/src/CMakeLists.txt
@@ -18,10 +18,16 @@ IF (NOT desul_FOUND)
ENDIF()
IF(KOKKOS_ENABLE_SYCL)
SET(DESUL_ATOMICS_ENABLE_SYCL ON)
IF(KOKKOS_IMPL_SYCL_DEVICE_GLOBAL_SUPPORTED AND NOT KOKKOS_IMPL_HAVE_SYCL_EXT_ONEAPI_DEVICE_GLOBAL)
SET(DESUL_ATOMICS_ENABLE_SYCL_SEPARABLE_COMPILATION ON)
ENDIF()
ENDIF()
IF(KOKKOS_ENABLE_OPENMPTARGET)
SET(DESUL_ATOMICS_ENABLE_OPENMP ON) # not a typo Kokkos OpenMPTarget -> Desul OpenMP
ENDIF()
IF(KOKKOS_ENABLE_OPENACC)
SET(DESUL_ATOMICS_ENABLE_OPENACC ON)
ENDIF()
CONFIGURE_FILE(
${CMAKE_CURRENT_SOURCE_DIR}/../../tpls/desul/Config.hpp.cmake.in
${CMAKE_CURRENT_BINARY_DIR}/desul/atomics/Config.hpp