AMD Milan S2 M4 C32
- Processor: AMD EPYC 7543 32-Core Processor
- Base frequency: 2.8 GHz
- Number of sockets: 2
- Number of memory domains per socket: 4
- Number of cores per socket: 32
- Number of HWThreads per core: 1
- MachineState output: json
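The topology above determines the ranges used in the result tables further down. A minimal sketch of that arithmetic, assuming the cores are spread evenly over the memory domains (the constant names are illustrative and not part of the benchmark):

```python
# Topology values copied from the list above.
SOCKETS = 2
DOMAINS_PER_SOCKET = 4
CORES_PER_SOCKET = 32

# Assumption: cores are distributed evenly across the memory domains.
CORES_PER_DOMAIN = CORES_PER_SOCKET // DOMAINS_PER_SOCKET
DOMAINS_PER_NODE = SOCKETS * DOMAINS_PER_SOCKET
CORES_PER_NODE = SOCKETS * CORES_PER_SOCKET

print(f"cores per memory domain: {CORES_PER_DOMAIN}")
print(f"memory domains per node: {DOMAINS_PER_NODE}")
print(f"cores per node:          {CORES_PER_NODE}")
```

Under that assumption, a memory domain holds 8 cores and the node has 8 memory domains, which matches the 1 to 8 ranges in the scaling tables.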
+----------+-----------------------------+
| Compiler | icc (ICC)                   |
|----------|-----------------------------|
| Version  | icc (ICC) 2021.9.0 20230302 |
+----------+-----------------------------+
Optimizing flags: -fast -xCORE-AVX2 -march=znver2 -qopt-streaming-stores=always -std=c99 -ffreestanding -qopenmp
All results are in GB/s
Summary results:
+---------------+---------------------------------------------+
| Single core   | 40.80 (Copy)                                |
| Memory domain | 45.35 (Sum with 8 cores)                    |
| Socket        | 180.78 (Sum with 8 cores per memory domain) |
| Node          | 354.15 (Sum with 8 cores per memory domain) |
+---------------+---------------------------------------------+
Results for scaling within a memory domain:
#nt Init Sum Copy Update Triad Daxpy STriad SDaxpy
1 25.50 32.71 40.80 36.75 39.80 37.95 38.71 36.68
2 25.52 41.33 42.96 43.10 41.82 41.58 40.92 40.52
3 25.54 41.70 42.20 42.19 40.62 40.52 39.70 39.39
4 25.53 42.42 41.79 42.05 40.16 39.88 39.35 38.96
5 30.43 43.74 42.35 42.87 41.44 41.28 40.80 40.66
6 34.58 44.67 42.57 43.11 42.25 42.09 41.67 41.57
7 38.24 45.13 42.74 43.24 42.68 42.43 42.11 42.07
8 41.64 45.35 42.79 43.49 42.68 42.33 42.23 42.29
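The within-domain table above shows how quickly each kernel saturates a single memory domain. As a rough illustration, the sketch below computes the speedup from 1 to 8 cores per kernel, using only the first and last rows of that table; the helper name `speedups` is hypothetical.

```python
# Within-domain results copied from the table above (GB/s).
KERNELS = ["Init", "Sum", "Copy", "Update", "Triad", "Daxpy", "STriad", "SDaxpy"]
ONE_CORE = [25.50, 32.71, 40.80, 36.75, 39.80, 37.95, 38.71, 36.68]
EIGHT_CORES = [41.64, 45.35, 42.79, 43.49, 42.68, 42.33, 42.23, 42.29]


def speedups(one, eight):
    """Return the 8-core over 1-core bandwidth ratio for each kernel."""
    return [e / o for o, e in zip(one, eight)]


SPEEDUP = dict(zip(KERNELS, speedups(ONE_CORE, EIGHT_CORES)))

for name, ratio in SPEEDUP.items():
    print(f"{name:7s} {ratio:.2f}x")
```

With these numbers, Copy gains only about 5% from 1 to 8 cores, while Init gains the most (about 63%), so a single core already comes close to the domain's bandwidth for most kernels.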
Results for scaling across memory domains. In each table below, the rows give the number of cores used per memory domain and the columns give the number of memory domains used (nm).
Init:
#nm 1 2 3 4 5 6 7 8
1 25.50 50.94 76.43 101.88 126.50 152.34 176.78 199.56
2 25.52 51.00 76.50 101.97 126.69 152.48 177.11 200.10
3 25.54 51.05 75.84 102.10 126.92 152.87 177.54 200.92
4 25.53 51.04 76.58 102.10 127.17 152.90 177.83 201.96
5 30.43 60.86 91.21 121.59 151.84 182.09 212.40 242.45
6 34.58 69.22 103.77 137.85 172.21 206.33 241.10 275.57
7 38.24 76.54 114.74 152.61 190.69 228.69 266.98 304.72
8 41.64 83.20 124.33 165.40 206.24 247.69 288.03 326.45
Sum:
#nm 1 2 3 4 5 6 7 8
1 32.71 64.80 97.45 128.66 159.37 193.39 221.68 248.04
2 41.33 83.96 125.40 168.93 205.81 248.41 281.74 326.52
3 41.70 84.91 124.13 167.18 207.18 249.06 286.82 327.74
4 42.42 83.15 124.72 166.20 205.81 247.64 286.48 323.18
5 43.74 87.20 130.34 175.26 217.22 260.07 300.90 341.82
6 44.67 88.71 133.25 176.78 220.46 264.68 308.65 345.35
7 45.13 90.00 134.67 179.21 223.52 268.54 310.77 351.52
8 45.35 90.52 135.52 180.78 224.35 269.56 312.89 354.15
Copy:
#nm 1 2 3 4 5 6 7 8
1 40.80 81.24 121.92 162.29 202.33 242.69 282.83 318.65
2 42.96 85.69 128.29 171.32 213.34 256.65 298.21 337.10
3 42.20 84.17 125.97 168.48 210.09 251.90 293.08 332.76
4 41.79 83.25 124.97 166.56 207.98 249.42 290.30 328.30
5 42.35 84.46 126.68 169.07 211.45 253.88 296.58 338.78
6 42.57 84.88 127.34 169.72 212.26 254.54 297.51 340.23
7 42.74 85.21 127.86 170.32 213.09 255.61 298.48 341.68
8 42.79 85.32 127.98 170.52 213.23 255.99 298.46 341.20
Update:
#nm 1 2 3 4 5 6 7 8
1 36.75 73.41 110.09 146.85 182.77 219.99 255.54 289.42
2 43.10 86.02 129.04 172.73 214.28 259.35 301.43 343.33
3 42.19 84.20 126.13 168.81 210.53 252.81 294.53 333.78
4 42.05 83.69 125.76 167.97 209.54 251.97 292.97 330.84
5 42.87 85.50 128.86 172.28 216.13 259.94 303.79 348.15
6 43.11 86.03 129.34 172.89 217.23 261.01 305.69 350.73
7 43.24 86.42 130.03 173.87 218.54 262.61 307.23 352.35
8 43.49 87.08 131.12 175.36 219.57 264.73 309.55 350.84
Triad:
#nm 1 2 3 4 5 6 7 8
1 39.80 78.72 118.43 157.77 197.33 234.69 274.38 312.04
2 41.82 83.27 124.85 166.28 207.37 248.85 288.86 328.42
3 40.62 81.14 120.97 161.56 202.27 241.70 281.98 320.59
4 40.16 79.60 119.43 159.12 199.71 238.62 278.25 315.22
5 41.44 82.54 123.56 165.21 206.07 247.11 288.29 328.94
6 42.25 84.11 126.24 168.20 210.13 252.14 294.17 335.45
7 42.68 85.21 127.71 170.24 212.79 255.03 297.68 339.89
8 42.68 85.27 127.91 170.59 212.62 255.15 297.34 337.61
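One way to read the cross-domain tables above is as a scaling efficiency: the measured bandwidth on nm memory domains divided by nm times the single-domain bandwidth. The sketch below does this for the Sum kernel with 8 cores per memory domain, which is the row the summary's Socket and Node figures come from; the function name `efficiency` is hypothetical.

```python
# Sum kernel, 8 cores per memory domain, for 1..8 memory domains (GB/s),
# copied from the last row of the Sum table above.
SUM_8_CORES = [45.35, 90.52, 135.52, 180.78, 224.35, 269.56, 312.89, 354.15]


def efficiency(bandwidths):
    """Measured bandwidth relative to ideal linear scaling from one domain."""
    base = bandwidths[0]
    return [bw / (nm * base) for nm, bw in enumerate(bandwidths, start=1)]


EFF = efficiency(SUM_8_CORES)

for nm, eff in enumerate(EFF, start=1):
    print(f"{nm} memory domain(s): {eff:.3f}")
```

For this row the efficiency stays above 0.97 throughout: roughly 0.997 for one socket (4 domains) and roughly 0.976 for the full node (8 domains).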
Memory bandwidth scaling within one memory domain:
The following plots illustrate the performance scaling across multiple memory domains for different numbers of cores per memory domain.
Memory bandwidth scaling across memory domains for Init:
Memory bandwidth scaling across memory domains for Sum:
Memory bandwidth scaling across memory domains for Copy:
Memory bandwidth scaling across memory domains for Triad: