I am a student of Computational Engineering at Friedrich-Alexander-University Erlangen-Nuernberg in Germany. This semester I am taking a seminar on benchmarking multicore architectures. My task is to evaluate MiniAero/Kokkos using the roofline model.
I ran measurements on a Tesla K40 and on a NUMA machine (OpenMP) with 2x Intel E5-2650v2. On both architectures I barely reach 20% of the roofline when using the input file for the ramp test in the test folder with varying numbers of cells.
What could be the reason for such low performance? Or am I using unreasonable problem sizes (e.g. 128x128x32)?
Best regards,
Johannes