
Figure 6
(a) Low-order fringe pattern for a photosystem I crystallite, calculated on a GPU and similar to that actually observed at the LCLS (Chapman et al., 2011). (b) Computational efficiency of evaluating equation (1), which scales as N² (number of atoms × number of structure factors). The CPU calculation was performed single-threaded on a 64-bit Intel Xeon (2.4 GHz) with 8 MB cache and 23.5 GB RAM, running Scientific Linux 5.4, with code compiled under GCC 4.4.2. GPU calculations were run either on an Nvidia C1060 (Tesla, 1.30 GHz) with 4.0 GB on-device memory and 960 hardware cores, or on the higher-performance Nvidia C2050 (Fermi, 1.15 GHz) with 2.6 GB on-device memory and 448 hardware cores; both were programmed in CUDA. The top plot (blue crosses) depicts calculations run with 32-bit (single) precision; all other calculations used 64-bit (double) precision. A comparison is given with the FFT method, which scales as N log N. The loss of accuracy observed on moving from 64-bit to 32-bit precision is generally less than the loss of accuracy (typically 0.8%) that results from using the FFT approximation rather than equation (1). Example code is available at https://cctbx.svn.sourceforge.net/viewvc/cctbx/trunk/cctbx/x-ray/structure_factors/from_scatterers_direct_parallel.py . Python bindings for CUDA use the PyCUDA package (Klöckner et al., 2012). The benchmarks in (b) were performed on a single unit cell in space group P1, whereas the simulation in (a) runs over all atoms in 10 × 12 × 14 unit cells in space group P6₃. The simulation in (a) also scales as N² because it uses equation (1).
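To make the N² scaling concrete, the following is a minimal CPU sketch of a direct structure-factor summation. It assumes that equation (1), which is not reproduced in this caption, has the usual form F(h) = Σ_j f_j exp[2πi h·x_j]. The function name `structure_factors_direct`, the toy two-atom cell, and the constant form factors are illustrative assumptions; this is not the cctbx or CUDA implementation, and it omits the resolution dependence of the form factors and any displacement parameters.

```python
import cmath
import math


def structure_factors_direct(frac_coords, form_factors, miller_indices):
    """Direct summation, assumed form: F(h) = sum_j f_j * exp(2*pi*i * h.x_j).

    The nested loops visit every (reflection, atom) pair, so the work is
    (number of atoms) x (number of structure factors).
    """
    results = []
    for h, k, l in miller_indices:              # one pass per structure factor
        total = 0j
        for (x, y, z), f in zip(frac_coords, form_factors):  # one pass per atom
            phase = 2.0 * math.pi * (h * x + k * y + l * z)
            total += f * cmath.exp(1j * phase)
        results.append(total)
    return results


# Hypothetical P1 cell with two atoms in fractional coordinates.
atoms = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
f = [6.0, 8.0]  # constant form factors, an assumption made for brevity
hkl = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]

F = structure_factors_direct(atoms, f, hkl)
pair_count = len(atoms) * len(hkl)  # number of exponentials evaluated
```

Each (reflection, atom) term is independent of every other, which is what makes the sum a natural fit for a GPU: in a parallel version the outer loop would typically be distributed across threads, although the mapping actually used in the linked code may differ.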

Acta Crystallographica Section D: Biological Crystallography
ISSN: 1399-0047