
Figure 4
Scheme of the OpenMP reduction algorithm for six threads. The object array is subdivided into submatrices (blue) and distributed between the threads, shown as object layers 1–6. Each submatrix is further subdivided into smaller blocks of 64 cache lines, and each block is assigned a consecutive sequence number (e.g. 1–28 for the first thread). On each addition, a thread fetches the current sequence counter and increases it atomically. A reservation lock prevents multiple threads from working on the same array block. Blocks that are not fully covered (orange) are treated separately.
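The mechanism described in the caption can be illustrated with a minimal sketch. This is one possible reading of the scheme, not the authors' implementation. The function name `reduce_layers`, the constant `kBlockElems`, the 64-byte cache line, the flat `double` array layout and the mapping of sequence numbers to (layer, block) pairs are all assumptions made for illustration. Standard C++ atomics stand in for whatever atomic and lock primitives the original code uses.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed block size: 64 cache lines of 64 bytes each, counted in doubles.
constexpr std::size_t kBlockElems = 64 * 64 / sizeof(double);

// Hypothetical helper: adds every thread-private layer into `target`,
// one block at a time. Each (layer, block) pair is one unit of work.
void reduce_layers(const std::vector<std::vector<double>>& layers,
                   std::vector<double>& target)
{
    const std::size_t n = target.size();
    for (const auto& layer : layers) assert(layer.size() == n);

    const std::size_t n_blocks = (n + kBlockElems - 1) / kBlockElems;
    const std::size_t n_tasks = layers.size() * n_blocks;

    std::atomic<std::size_t> sequence{0};              // shared sequence counter
    std::vector<std::atomic_flag> reserved(n_blocks);  // one reservation flag per block
    for (auto& flag : reserved) flag.clear();          // all blocks start unreserved

    #pragma omp parallel
    {
        for (;;) {
            // Fetch the current sequence number and increase it atomically.
            const std::size_t task = sequence.fetch_add(1);
            if (task >= n_tasks) break;

            const std::size_t layer = task / n_blocks;
            const std::size_t block = task % n_blocks;
            const std::size_t begin = block * kBlockElems;
            // The last block may be only partially covered by the array.
            const std::size_t end = std::min(begin + kBlockElems, n);

            // Reserve the block so that no other thread adds to it concurrently.
            while (reserved[block].test_and_set(std::memory_order_acquire)) {}
            for (std::size_t i = begin; i < end; ++i)
                target[i] += layers[layer][i];
            reserved[block].clear(std::memory_order_release);
        }
    }
}
```

With GCC or Clang the parallel region is enabled by `-fopenmp`; without that flag the pragma is ignored and the same loop runs serially with an identical result. Because consecutive sequence numbers map to different blocks in this sketch, threads rarely wait on the same reservation flag.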

Journal of Applied Crystallography
ISSN: 1600-5767