Figure 6
(a) Aggregated memory bandwidth and (b) single-precision computation performance for the first node of a hybrid run using ten cluster nodes with 32 threads each. The STREAM benchmark memory bandwidth result for 32 threads is shown as a reference (horizontal blue line). Each thread is plotted in a different colour. Only one thread per socket reports memory bandwidth, because these data are based on uncore events.