
Figure 7
(a)–(c) Enlargements of the difference-map iterations of Fig. 6, running on ten nodes with 32 threads each. (d)–(f) Enlargements of the difference-map iterations on a single node with 32 threads. For each run, (a), (d) the memory bandwidth, (b), (e) the single-precision performance and (c), (f) the clocks per instruction (CPI) are shown. The CPI can be seen as a measure of efficiency and is given for each thread of the first node. The increase in synchronization time in the case of multiple nodes and the resulting rise in CPI are clearly visible. Furthermore, the Fourier projection shows significantly better performance than the overlap projection, e.g. from timestamp 80 s to 81 s in panels (a)–(c) and from 81 s to 82 s in panels (a)–(c), respectively. These effects can also be attributed to the increased synchronization time between threads.

Journal of Applied Crystallography, ISSN: 1600-5767