computer programs

JOURNAL OF SYNCHROTRON RADIATION
ISSN: 1600-5775

GAPD: a GPU-accelerated atom-based polychromatic diffraction simulation code

(a) The Peac Institute of Multiscale Sciences, Chengdu, Sichuan 610031, People's Republic of China, (b) College of Science, Hunan Agricultural University, Changsha, Hunan 410128, People's Republic of China, and (c) Key Laboratory of Advanced Technologies of Materials, Ministry of Education, Southwest Jiaotong University, Chengdu, Sichuan 610031, People's Republic of China
*Correspondence e-mail: sluo@pims.ac.cn

Edited by A. F. Craievich, University of São Paulo, Brazil (Received 15 August 2017; accepted 20 November 2017; online 6 February 2018)

GAPD, a graphics-processing-unit (GPU)-accelerated atom-based polychromatic diffraction simulation code for direct, kinematics-based, simulations of X-ray/electron diffraction of large-scale atomic systems with mono-/polychromatic beams and arbitrary plane detector geometries, is presented. This code implements GPU parallel computation via both real- and reciprocal-space decompositions. With GAPD, direct simulations are performed of the reciprocal lattice node of ultralarge systems (∼5 billion atoms) and diffraction patterns of single-crystal and polycrystalline configurations with mono- and polychromatic X-ray beams (including synchrotron undulator sources), and validation, benchmark and application cases are presented.

1. Introduction

X-ray/electron diffraction is widely used to characterize microstructure at the lattice level. Diffraction simulations are useful in experimental design and interpretation (Bristowe & Sass, 1980[Bristowe, P. D. & Sass, S. L. (1980). Acta Metall. 28, 575-588.]; Budai et al., 1983[Budai, J., Bristowe, P. & Sass, S. (1983). Acta Metall. 31, 699-712.]; Derlet et al., 2005[Derlet, P. M., Van Petegem, S. & Van Swygenhoven, H. (2005). Phys. Rev. B, 71, 024114.]; Hawreliak et al., 2006[Hawreliak, J., Colvin, J. D., Eggert, J. H., Kalantar, D. H., Lorenzana, H. E. S., Stölken, J., Davies, H. M., Germann, T. C., Holian, B. L., Kadau, K., Lomdahl, P. S., Higginbotham, A., Rosolankova, K., Sheppard, J. & Wark, J. S. (2006). Phys. Rev. B, 74, 184107.]; Brandstetter et al., 2008[Brandstetter, S., Derlet, P., Van Petegem, S. & Van Swygenhoven, H. (2008). Acta Mater. 56, 165-176.]; Liu et al., 2014[Liu, H. K., Lin, Y. & Luo, S. N. (2014). J. Phys. Chem. C, 118, 24797-24802.]; Wang et al., 2015[Wang, L., E, J. C., Cai, Y., Zhao, F., Fan, D. & Luo, S. N. (2015). J. Appl. Phys. 117, 084301.]; E et al., 2015[E, J. C., Wang, L., Cai, Y., Wu, H. A. & Luo, S. N. (2015). J. Chem. Phys. 142, 064704.]; Huang, 2010[Huang, X. R. (2010). J. Appl. Cryst. 43, 926-928.]; Sun & Fezzaa, 2016[Sun, T. & Fezzaa, K. (2016). J. Synchrotron Rad. 23, 1046-1053.]). To correlate diffraction signatures with real structure characteristics such as defects, impurities, precipitates and finite grain size, direct atomic-based simulations of diffraction or reciprocal-space mapping are useful. The capability to simulate directly diffraction patterns of large atomic structures (a billion atoms or more) with arbitrary configurations is highly desired. For example, size- and strain-induced diffraction peak broadenings of nanocrystalline solids under deformation are always intertwined (Gleiter, 1989[Gleiter, H. (1989). Prog. Mater. Sci. 33, 223-315.], 2000[Gleiter, H. (2000). Acta Mater. 48, 1-29.]; Révész et al., 1996[Révész, A., Ungár, T., Borbély, A. & Lendvai, J. (1996). Nanostruct. Mater. 7, 779-788.]; Ungár, 2001[Ungár, T. (2001). Mater. Sci. Eng. A, 309, 14-22.]; Budrovic et al., 2004[Budrovic, Z., Van Swygenhoven, H., Derlet, P. M., Van Petegem, S. & Schmitt, B. (2004). Science, 304, 273-276.]; Barabash & Ice, 2014[Barabash, R. & Ice, G. (2014). Strain and Dislocation Gradients from Diffraction: Spatially-Resolved Local Structure and Defects. Singapore: World Scientific.]). In order to decouple these two effects, knowledge of the minimum size, above which size-induced broadening can be neglected, would be helpful but the exact value is still under debate.

Given the heavy amount of direct calculations from atomic configurations, the system size is quite limited. To improve computation speed, a method was proposed to implement fast Fourier transformation (FFT) with the assumption that electron density distribution is a sum of Gaussian profiles for each atomic position (Kimminau et al., 2008[Kimminau, G., Nagler, B., Higginbotham, A., Murphy, W. J., Park, N., Hawreliak, J., Kadau, K., Germann, T. C., Bringa, E. M., Kalantar, D. H., Lorenzana, H. E., Remington, B. A. & Wark, J. S. (2008). J. Phys. Condens. Matter, 20, 505203.]). One issue is related to the FFT sampling grid, in particular regarding choosing the parameters of Gaussian functions for different configurations. Besides possible artifacts induced by the Gaussian assumption, FFT in reciprocal space is limited to a triperiodic grid, inappropriate for simulations of curved surfaces (Ewald spheres) (Favre-Nicolin et al., 2011[Favre-Nicolin, V., Coraux, J., Richard, M.-I. & Renevier, H. (2011). J. Appl. Cryst. 44, 635-640.]). Currently, direct diffraction simulations for systems consisting of millions of atoms are performed by the central processing unit (CPU) MPI (Message Passing Interface) parallel diffraction package (Coleman et al., 2013[Coleman, S. P., Spearot, D. E. & Capolungo, L. (2013). Modell. Simul. Mater. Sci. Eng. 21, 055020.], 2014[Coleman, S. P., Sichani, M. M. & Spearot, D. E. (2014). JOM, 66, 408-416.]) integrated in the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) (Plimpton, 1995[Plimpton, S. (1995). J. Comput. Phys. 117, 1-19.]), or with the stand-alone CPU MPI parallel code SLADS (Chen et al., 2017[Chen, S., E, J. & Luo, S. N. (2017). J. Appl. Cryst. 50, 951-958.]) and single-node graphics processing unit (GPU) CUDA (Compute Unified Device Architecture) parallel computations (Favre-Nicolin et al., 2011[Favre-Nicolin, V., Coraux, J., Richard, M.-I. & Renevier, H. (2011). J. Appl. Cryst. 44, 635-640.]). 
For the CPU parallel code, calculations involving [10^{8}] atoms and [10^{5}] points in reciprocal space are still time-consuming (Chen et al., 2017[Chen, S., E, J. & Luo, S. N. (2017). J. Appl. Cryst. 50, 951-958.]).

For high and ultrahigh strain rate dynamic loading experiments, real-time in situ X-ray diffraction represents a highly promising development at synchrotron radiation (Turneaure et al., 2009[Turneaure, S. J., Gupta, Y. M., Zimmerman, K., Perkins, K., Yoo, C. S. & Shen, G. (2009). J. Appl. Phys. 105, 053520.]; Luo et al., 2012[Luo, S. N., Jensen, B. J., Hooks, D. E., Fezzaa, K., Ramos, K. J., Yeager, J. D., Kwiatkowski, K. & Shimada, T. (2012). Rev. Sci. Instrum. 83, 073903.]; Fan et al., 2014[Fan, D., Lu, L., Li, B., Qi, M. L., E, J. C., Zhao, F., Sun, T., Fezzaa, K., Chen, W. & Luo, S. N. (2014). Rev. Sci. Instrum. 85, 113902.]; Hudspeth et al., 2015[Hudspeth, M., Sun, T., Parab, N., Guo, Z., Fezzaa, K., Luo, S. & Chen, W. (2015). J. Synchrotron Rad. 22, 49-58.]; Fan et al., 2016[Fan, D., Huang, J. W., Zeng, X. L., Li, Y. E. J. C., Huang, J. Y., Sun, T., Fezzaa, K., Wang, Z. & Luo, S. N. (2016). Rev. Sci. Instrum. 87, 053903.]) and X-ray free-electron laser (Briggs et al., 2017[Briggs, R., Gorman, M. G., Coleman, A. L., McWilliams, R. S., McBride, E. E., McGonegle, D., Wark, J. S., Peacock, L., Rothman, S., Macleod, S. G., Bolme, C. A., Gleason, A. E., Collins, G. W., Eggert, J. H., Fratanduono, D. E., Smith, R. F., Galtier, E., Granados, E., Lee, H. J., Nagler, B., Nam, I., Xing, Z. & McMahon, M. I. (2017). Phys. Rev. Lett. 118, 025501.]) facilities. Unlike traditional quasi-static diffraction experiments performed with monochromatic X-ray beams (bandwidth ≃ [10^{-4}]), higher-bandwidth `pink' beams (bandwidth > [10^{-2}]) (Hudspeth et al., 2015[Hudspeth, M., Sun, T., Parab, N., Guo, Z., Fezzaa, K., Luo, S. & Chen, W. (2015). J. Synchrotron Rad. 22, 49-58.]; Fan et al., 2016[Fan, D., Huang, J. W., Zeng, X. L., Li, Y. E. J. C., Huang, J. Y., Sun, T., Fezzaa, K., Wang, Z. & Luo, S. N. (2016). Rev. Sci. Instrum. 87, 053903.]; LCLS, 2017[LCLS (2017). LCLS Instruments, https://portal.slac.stanford.edu/sites/lcls_public/instruments/Pages/de.]) are used for dynamic experiments, allowing more photons in a short exposure time (fs to µs).

Simulation codes for traditional polychromatic Laue diffraction have been developed to find the position and peak intensity of a Laue spot (Huang, 2010[Huang, X. R. (2010). J. Appl. Cryst. 43, 926-928.]) and to correlate the spot shape with a given slip system (Tamura, 2014[Tamura, N. (2014). Strain and Dislocation Gradients from Diffraction: Spatially Resolved Local Structure and Defects, edited by R. Barabash & G. Ice, pp. 125-155. London: Imperial College Press.]). Different from atom-based diffraction simulations, only the unit-cell structure is normally considered to represent a whole system in these codes. As a result, it is difficult to properly describe diffraction patterns of complex heterogeneous microstructures with varying grain sizes, shapes, strain gradients and phases as in real experiments.

Atom-based diffraction calculations for polychromatic beams require considerably more computing resources than those for monochromatic beams since more wavelengths need to be considered. For a calculation of [10^{8}] atoms and [10^{5}] q-points, and a spectrum with 100 wavelengths, the computation time is 100 times that of the monochromatic case, which is largely unacceptable for a CPU cluster. In contrast, a GPU implementation may accelerate such calculations by tens to hundreds of times relative to purely CPU-based calculations.

In this work, we present GAPD (https://www.pims.ac.cn/Resources.html), a GPU-accelerated atom-based polychromatic diffraction simulation code for direct, kinematics-based, simulations of X-ray/electron diffraction of large-scale atomic systems with mono-/polychromatic beams and arbitrary plane detector geometries. This code implements GPU parallel computation via both real- and reciprocal-space decompositions, and runs on a multi-node GPU cluster. With GAPD, we perform direct simulations of the reciprocal lattice node of ultra-large systems (∼5 billion atoms), obtain diffraction patterns of single-crystal and polycrystalline configurations with mono- and polychromatic X-ray beams, and present validation, benchmark and application cases.

2. Methodology

2.1. Reciprocal-space representation

Lattice points in reciprocal space as well as the diffraction geometry are shown in Fig. 1(a)[link] for face-centered-cubic single-crystal Cu (blue); only a small portion is displayed for clarity. Here O is the origin of the reciprocal space, and A is the center of the Ewald sphere with a radius of [\lambda^{{-1}}], with λ being the wavelength of the incident beam. [{\bf k}_{0}] = [{\bf{AO}}] is the incident wave vector, and [{\bf k}] = [{\bf{AB}}] is the diffracted wave vector, and the angle between [{\bf k}_{0}] and [{\bf k}] is [2\theta]. B is a reciprocal lattice point that intersects the Ewald sphere, satisfying Bragg's law (Warren, 1969[Warren, B. E. (1969). X-ray Diffraction. Courier Corporation.]; Hammond, 2009[Hammond, C. (2009). The Basics of Crystallography and Diffraction, No. 12, 3rd ed. Oxford University Press.]),

[2d_{{hkl}}\sin\theta=n\lambda,\eqno(1)]

where [d_{hkl}] is the crystal interplanar spacing. [{\bf q}] = [{\bf{OB}}] is the reciprocal lattice vector or scattering vector, defined as

[{{\bf q}}={{({\bf s}-{\bf s}_{{0}})} \over {\lambda}}={\bf k}-{\bf k}_{0},\eqno(2)]

where [{\bf s}] and [{\bf s}_{{0}}] are unit vectors representing the scattered and incident beam directions, respectively. In addition,

[q=|{{\bf q}}|={{2\sin\theta} \over {\lambda}}.\eqno(3)]

An iso-[2\theta] circle on the Ewald sphere, centered at C, defines the azimuthal angles (γ). For lattice point B, [BC \perp AO]; [\gamma] = 0 ([{\bf{CD}}]) is defined by the user. Here the direction of [{\bf{CD}}] is referred to as the transverse direction in GAPD.

[Figure 1]
Figure 1
Diffraction and detection geometries used by GAPD in (a) reciprocal space and (b) real space.

GAPD provides two modes for reciprocal-space construction. In the reciprocal space (RS) mode, one specifies the range and spacing of reciprocal lattice vectors along three reciprocal space axes, [q_{x}], [q_{y}] and [q_{z}]. In the Ewald sphere (ES) mode, one specifies the direction of the incident beam, the transverse direction and the range and spacing of 2θ and γ, so only the reciprocal space points on an Ewald sphere are sampled.
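The two construction modes can be illustrated with a minimal Python sketch. It is not GAPD's implementation: the function names, the use of radians and the handedness of γ (measured from the transverse direction toward s0 × t) are assumptions made here for illustration. The ES-mode point follows from equation (2) once the scattered direction is built from 2θ and γ.

```python
import math

def _unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rs_mode_points(ranges, spacing):
    """RS mode: regular grid over ((qx_lo, qx_hi), (qy_lo, qy_hi), (qz_lo, qz_hi))."""
    axes = []
    for lo, hi in ranges:
        n = int(round((hi - lo) / spacing)) + 1
        axes.append([lo + i * spacing for i in range(n)])
    return [(qx, qy, qz) for qx in axes[0] for qy in axes[1] for qz in axes[2]]

def es_mode_point(s0, transverse, two_theta, gamma, wavelength):
    """ES mode: q on the Ewald sphere for given 2theta and gamma (radians)."""
    s0 = _unit(s0)
    # Keep only the part of the transverse direction perpendicular to s0.
    proj = sum(a * b for a, b in zip(transverse, s0))
    t = _unit(tuple(a - proj * b for a, b in zip(transverse, s0)))
    u = _cross(s0, t)  # completes the right-handed frame (assumed convention)
    s = tuple(math.cos(two_theta) * s0[i]
              + math.sin(two_theta) * (math.cos(gamma) * t[i] + math.sin(gamma) * u[i])
              for i in range(3))
    # Equation (2): q = (s - s0) / lambda
    return tuple((s[i] - s0[i]) / wavelength for i in range(3))
```

Whatever the value of γ, the modulus of the returned vector equals 2 sin θ/λ, in agreement with equation (3).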

2.2. Calculation of scattering intensity

The diffraction intensity I of N atoms at scattering vector [{\bf q}] is the product of structure factor [F({\bf q})] with its complex conjugate, [F^{\,{*}}({\bf q})] (Warren, 1969[Warren, B. E. (1969). X-ray Diffraction. Courier Corporation.]),

[I({\bf q})={{F^{\,{*}}({\bf q})\,F({\bf q})} \over {N}},\eqno(4)]

with

[F({\bf q})=\textstyle\sum\limits_{{j\,=\,1}}^{{N}}f_{{j}}\exp\left(2\pi i{\bf q}\cdot{\bf r}_{{j}}\right). \eqno(5)]

Here, [{\bf r}_{{j}}] is the position of the jth atom in real space. [f_{j}] is the atomic scattering factor, which describes the scattering amplitude contributed by atom j at a scattering angle [2\theta], and is parameterized as

[f_{{j}}\left({{\sin\theta} \over {\lambda}}\right)=\sum\limits_{{i\,=\,1}}^{{4}}a_{{i}}\exp\left(-b_{{i}}{{\sin^{{2}}\theta} \over {\lambda^{{2}}}}\right)+c\eqno(6)]

for X-ray scattering and

[f_{{j}}\left({{\sin\theta} \over {\lambda}}\right)=\sum\limits_{{i\,=\,1}}^{{5}}a_{{i}}\exp\left(-b_{{i}}{{\sin^{{2}}\theta} \over {\lambda^{{2}}}}\right)\eqno(7)]

for electron scattering. Parameters a, b and c have been tabulated for the majority of elements (Fox et al., 1989[Fox, A. G., O'Keefe, M. A. & Tabbernor, M. A. (1989). Acta Cryst. A45, 786-793.]; Peng et al., 1996[Peng, L.-M., Ren, G., Dudarev, S. L. & Whelan, M. J. (1996). Acta Cryst. A52, 257-276.]). Here the kinematical approximation is used, which assumes full coherent scattering (Vartanyants & Robinson, 2001[Vartanyants, I. A. & Robinson, I. K. (2001). J. Phys. Condens. Matter, 13, 10593-10611.]). Different X-ray sources have different coherence properties. For example, an X-ray free-electron laser has significantly higher coherence than synchrotron radiation (Geloni et al., 2010[Geloni, G., Saldin, E., Samoylova, L., Schneidmiller, E., Sinn, H., Tschentscher, T. & Yurkov, M. (2010). New J. Phys. 12, 035021.]; Vartanyants et al., 2011[Vartanyants, I. A., Singer, A., Mancuso, A. P., Yefanov, O. M., Sakdinawat, A., Liu, Y., Bang, E., Williams, G. J., Cadenazzi, G., Abbey, B., Sinn, H., Attwood, D., Nugent, K. A., Weckert, E., Wang, T., Zhu, D., Wu, B., Graves, C., Scherz, A., Turner, J. J., Schlotter, W. F., Messerschmidt, M., Lüning, J., Acremann, Y., Heimann, P., Mancini, D. C., Joshi, V., Krzywinski, J., Soufli, R., Fernandez-Perea, M., Hau-Riege, S., Peele, A. G., Feng, Y., Krupin, O., Moeller, S. & Wurth, W. (2011). Phys. Rev. Lett. 107, 144801.]; Vartanyants & Singer, 2016[Vartanyants, I. A. & Singer, A. (2016). Synchrotron Light Sources and Free-Electron Lasers, edited by E. J. Jaeschke, S. Khan, J. R. Schneider & J. B. Hastings, pp. 821-863. Springer International Publishing.]), and there exists a coherence effect on diffraction (Vartanyants & Robinson, 2001[Vartanyants, I. A. & Robinson, I. K. (2001). J. Phys. Condens. Matter, 13, 10593-10611.]). This effect will be included in a future version.
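Equations (4)-(7) can be sketched for a monatomic system as follows. This is a serial illustration under stated assumptions, not GAPD code: the function names are hypothetical, the scattering factor is passed as a single number valid for one q (all atoms being the same element), and any coefficients a, b, c must be taken from the cited tables rather than from this sketch.

```python
import cmath
import math

def scattering_factor(s, a, b, c=0.0):
    """Equations (6)/(7) with s = sin(theta)/lambda = |q|/2.

    a and b are sequences of tabulated coefficients; c is zero for electrons.
    """
    return sum(ai * math.exp(-bi * s * s) for ai, bi in zip(a, b)) + c

def structure_factor(q, positions, f=1.0):
    """Equation (5): F(q) = sum_j f exp(2 pi i q . r_j) for identical atoms."""
    return sum(f * cmath.exp(2j * math.pi * (q[0] * x + q[1] * y + q[2] * z))
               for x, y, z in positions)

def intensity(q, positions, f=1.0):
    """Equation (4): I(q) = F*(q) F(q) / N."""
    F = structure_factor(q, positions, f)
    return (F.conjugate() * F).real / len(positions)

def fcc_positions(n_cells, a):
    """Atoms of an n_cells x n_cells x n_cells fcc block with lattice constant a."""
    basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
    return [((i + bx) * a, (j + by) * a, (k + bz) * a)
            for i in range(n_cells) for j in range(n_cells) for k in range(n_cells)
            for bx, by, bz in basis]
```

With f = 1, an fcc block of N atoms gives I = N at an allowed node such as (111)/a and a vanishing intensity at a forbidden node such as (100)/a, which is a convenient sanity check for any implementation of equation (5).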

The Lorentz-polarization factor (Warren, 1969[Warren, B. E. (1969). X-ray Diffraction. Courier Corporation.]) for X-ray diffraction,

[{\rm{Lp}}(\theta)={{1+\cos^{{2}}2\theta} \over {\cos\theta\sin^{{2}}\theta}}, \eqno(8)]

can be considered as an optional parameter in GAPD. Then the X-ray diffraction intensity [I_{x}({\bf q})] follows as

[I_{x}({{\bf q}})={\rm{Lp}}(\theta){{F({{\bf q}})\,F^{\,{*}}({{\bf q}})} \over {N}}. \eqno(9)]
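As a small illustration of equations (8) and (9), the following sketch evaluates the Lorentz-polarization factor and applies it to a precomputed kinematic intensity. The function names are hypothetical, and θ is assumed to be given in radians.

```python
import math

def lorentz_polarization(theta):
    """Equation (8); theta is half the scattering angle, in radians."""
    return (1.0 + math.cos(2.0 * theta) ** 2) / (math.cos(theta) * math.sin(theta) ** 2)

def xray_intensity(kinematic_intensity, theta):
    """Equation (9): scale F F* / N by Lp(theta)."""
    return lorentz_polarization(theta) * kinematic_intensity
```

The factor diverges as θ approaches 0 or 90°, so a practical calculation would exclude the direct-beam direction.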

GAPD is developed in C++, and its parallelization is achieved by combining CUDA with MPI for implementation on GPU high-performance computing clusters. We consider [N_{\rm a}] atoms and [N_{{{\bf q}}}] scattering vectors in reciprocal space. Parallelization is realized in both real space (in terms of atoms) and reciprocal space (in terms of [{\bf q}] points).

First, atoms are distributed over M MPI CPU threads, with the thread ranked p holding [N_{p}] atoms, i.e.

[N_{{\rm a}}=\textstyle\sum\limits_{{p=1}}^{{M}}N_{{p}}. \eqno(10)]

For each CPU thread, [N_{{{\bf q}}}] structure factor calculations need to be performed. For each [{{\bf q}}],

[F_{{l}}({{\bf q}})=\sum\limits_{ {j\,=\,\left(\Sigma_{{p=1}}^{{l-1}}N_{{p}}\right)+1} }^{ {\Sigma_{{p=1}}^{{l}}N_{{p}}} } f_{{j}}\exp\left({2\pi i{\bf q}\cdot{\bf r}_{{j}}}\right). \eqno(11)]

Here, the CPU thread ranked l deals with the atoms labeled from [(\Sigma_{{p=1}}^{{l-1}}N_{{p}})+1] to [(\Sigma_{{p=1}}^{{l}}N_{{p}})].

Then, the calculations of [F_{{l}}({{\bf q}})] for the [N_{{{\bf q}}}] scattering vectors are distributed over [N_{{{\bf q}}}] CUDA GPU threads. The results are copied to CPU from GPU when a calculation is finished, and then saved into an array of dimension [N_{{{\bf q}}}]. When all the threads finish their own calculations, the MPI master thread sums individual [F_{{l}}({\bf q})] in the array as

[F({\bf q})=\textstyle\sum\limits_{{l\,=\,1}}^{{M}}F_{{l}}({\bf q}).\eqno(12)]
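The decomposition in equations (10)-(12) can be emulated serially. In the sketch below, each chunk stands in for one MPI thread and the loop over q points stands in for the CUDA threads; the names and the near-equal contiguous split are assumptions for illustration, since the text does not specify how the [N_{p}] are chosen.

```python
import cmath
import math

def split_atoms(positions, m):
    """Equation (10): contiguous chunks whose sizes N_p sum to N_a."""
    n = len(positions)
    bounds = [n * p // m for p in range(m + 1)]
    return [positions[bounds[p]:bounds[p + 1]] for p in range(m)]

def partial_structure_factor(q, chunk, f=1.0):
    """Equation (11): contribution F_l(q) of the atoms held by one chunk."""
    return sum((f * cmath.exp(2j * math.pi * (q[0] * x + q[1] * y + q[2] * z))
                for x, y, z in chunk), 0j)

def total_structure_factors(q_points, positions, m, f=1.0):
    """Equation (12): sum the partial results over the M chunks for every q."""
    chunks = split_atoms(positions, m)
    # partial[l][k] plays the role of the per-thread array of dimension N_q.
    partial = [[partial_structure_factor(q, chunk, f) for q in q_points]
               for chunk in chunks]
    return [sum((partial[l][k] for l in range(m)), 0j)
            for k in range(len(q_points))]
```

Because the sum in equation (5) is linear in the atoms, the result is independent of M up to floating-point rounding, which is what makes the real-space decomposition straightforward.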

2.3. Projection from reciprocal space to 2D detector

To simulate diffraction patterns in synchrotron experiments, GAPD calculates the projection of diffracted wave vectors onto a two-dimensional (2D) detector in an arbitrary position. The geometry in real space is shown in Fig. 1(b)[link]; xyz denotes the sample coordinate system or laboratory coordinate system (these two systems coincide), while [d_{x}d_{y}] denotes the detector plane. The coordinates of the scattered beam on the detector are expressed as

[(x_{{\rm proj}},y_{{\rm proj}})=(\,p_{x}+{\bf M}\cdot\hat{d_{x}},\,\,p_{y}+{\bf M}\cdot\hat{d_{y}}), \eqno(13)]

where P([p_{x}], [p_{y}]) is the point of normal incidence on the detector. [\hat{d_{x}}] and [\hat{d_{y}}] are unit vectors of the detector axes. [{\bf M}] is the projecting vector of the scattered beam in the sample coordinate system, and

[{\bf M}\,=\,{{|{\bf L}|^{2}} \over {{\bf L}\cdot{\bf s}}}\,{{\bf s}}-{\bf L}, \eqno(14)]

where [{\bf L}] is parallel to the normal incident direction and [|{\bf L}|] is the sample-to-detector distance. The unit vector of the scattered beam [{\bf s}] can be obtained from [{\bf q}], [{\bf s}_{0}] and λ with equation (2)[link]. In GAPD, [\hat{d_{x}}] is assumed to be always parallel to the xy-plane, and thus [\hat{d_{x}}] and [\hat{d_{y}}] can be derived from [{\bf L}] as

[\eqalign{ \displaystyle\hat{d_{x}}&\displaystyle=(L_{y},-L_{x},0)\left(L_{y}^{2}+L_{x}^{2}\right)^{{-{{1}/{2}}}}, \cr \displaystyle\hat{d_{y}}&\displaystyle={{\hat{d_{x}}\times{\bf L}} \over {|\hat{d_{x}}\times{\bf L}|}}.} \eqno(15)]
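Equations (13)-(15) translate directly into a short routine. The sketch below is an illustration with hypothetical function names; it assumes that [{\bf s}_{0}] is a unit vector, and it rejects the two cases the formulas do not cover, namely a detector normal along z (where equation (15) is undefined) and a scattered beam that travels away from the detector.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def detector_axes(L):
    """Equation (15): unit vectors of the detector axes derived from L."""
    norm_xy = math.hypot(L[0], L[1])
    if norm_xy == 0.0:
        raise ValueError("equation (15) is undefined when L is parallel to z")
    dx = (L[1] / norm_xy, -L[0] / norm_xy, 0.0)
    c = _cross(dx, L)
    n = math.sqrt(_dot(c, c))
    dy = (c[0] / n, c[1] / n, c[2] / n)
    return dx, dy

def project_to_detector(q, s0, wavelength, L, p=(0.0, 0.0)):
    """Equations (13) and (14): detector coordinates of the beam scattered at q."""
    s = tuple(wavelength * q[i] + s0[i] for i in range(3))  # from equation (2)
    ls = _dot(L, s)
    if ls <= 0.0:
        raise ValueError("scattered beam does not reach the detector plane")
    scale = _dot(L, L) / ls
    M = tuple(scale * s[i] - L[i] for i in range(3))
    dx, dy = detector_axes(L)
    return (p[0] + _dot(M, dx), p[1] + _dot(M, dy))
```

For a detector perpendicular to the incident beam, a beam scattered by 2θ lands at a distance |L| tan 2θ from the point of normal incidence, which gives a quick check of the geometry.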

In this way, user-defined input includes (i) the coordinates of the point of normal incidence on the detector, (ii) the normal incident direction, (iii) the sample-to-detector distance, (iv) the beam incident direction, and (v) the wavelength of the probe beam. The first four items are defined in the sample system. Then, each [{\bf q}] point constructed in reciprocal space can be projected onto an arbitrarily positioned 2D detector. In GAPD, the beam incident direction and normal incident direction can be input either as vectors in the sample coordinate system, or as the angles that they make with the xy-plane ([\alpha_{1}] and [\alpha_{2}]) together with the angles that their orthogonal projections onto the xy-plane make with the y-axis ([\beta_{1}] and [\beta_{2}]).

For a polychromatic beam, the intensity at a specific position on a 2D detector, [I(2\theta,\gamma)], is the weighted integration over the incident beam wavelength range, [[\lambda_{0},\lambda_{1}]],

[I(2\theta,\gamma)=\textstyle\int\limits_{{\lambda_{0}}}^{{\lambda_{1}}} I(2\theta,\gamma,\lambda)\,w(\lambda)\,{\rm{d}}\lambda. \eqno(16)]

Here [w(\lambda)] is the weight factor, e.g. the flux fraction of the incident beam. Each set of ([2\theta,\gamma,\lambda]) corresponds to a scattering vector [{\bf q}].
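A discrete version of equation (16) is sketched below for one fixed detector direction [{\bf s}]. The trapezoidal rule, the function names and the callable `intensity_of_q` (any routine returning I for a given q) are assumptions for illustration; the text does not state which quadrature GAPD uses.

```python
def q_for_direction(s, s0, wavelength):
    """Equation (2): the scattering vector probed along s at a given wavelength."""
    return tuple((s[i] - s0[i]) / wavelength for i in range(3))

def polychromatic_intensity(s, s0, wavelengths, weights, intensity_of_q):
    """Trapezoidal estimate of equation (16) on an ascending wavelength grid."""
    values = [w * intensity_of_q(q_for_direction(s, s0, lam))
              for lam, w in zip(wavelengths, weights)]
    total = 0.0
    for k in range(len(wavelengths) - 1):
        total += 0.5 * (values[k] + values[k + 1]) * (wavelengths[k + 1] - wavelengths[k])
    return total
```

Since [{\bf s}] and [{\bf s}_{0}] are fixed for a given pixel, changing λ moves q along a straight line through the origin of reciprocal space, so each pixel accumulates intensity from a radial segment rather than from a single point.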

3. Validation and benchmark

GAPD is validated with electron/X-ray diffraction simulations of single-crystal Cu with various geometries. The detector is set to be perpendicular to the incident beams. The crystal coordinate system coincides with the sample coordinate system. We examine the following cases for the sake of validation: an electron beam (200 keV) with zone axes [001] and [[\bar{1}11]], 18.86 keV X-rays with zone axis [100], and 8.91 keV X-rays with zone axis [111]. Their corresponding 2D diffraction patterns are shown in Fig. 2[link], and are identical to standard indexed diffraction patterns and analytical predictions (Williams & Carter, 1996[Williams, D. B. & Carter, C. B. (1996). Transmission Electron Microscopy, p. 299. Berlin: Springer.]).

[Figure 2]
Figure 2
2D diffraction patterns calculated for Cu single crystals with X-rays or electrons along different zone axes: (a) 200 keV electrons, z = [001], (b) 200 keV electrons, z = [[\bar{1}11]], (c) 18.86 keV X-rays, z = [100] and (d) 8.91 keV X-rays, z = [111].

With the implementation of GPU acceleration, reciprocal-space mapping and 2D diffraction simulations of atomic systems with several billion atoms become realistic. We use GTX 980 GPUs, and each cluster node contains three GPUs. To evaluate the computing performance of GAPD, several tests with varying number of atoms ([N_{\rm a}]), number of [{\bf q}] points ([N_{{{\bf q}}}]) and number of GPU cluster nodes ([N_{\rm nodes}]) are performed (Fig. 3[link]).

[Figure 3]
Figure 3
Computing performance of GAPD: computation time/speed as a function of number of atoms, number of [{\bf q}] points and computing nodes as noted. Computation speed: [N_{{\rm a}}N_{{{\bf q}}}] per second.

As shown in Fig. 3(a)[link], the computing time increases linearly as [N_{\rm a}] increases at fixed [N_{\rm nodes}] (9) and [N_{{{\bf q}}}] ([1.3\times 10^{5}]). For instance, the computation time is about 5 h for a system of 5.4 billion atoms, which corresponds to a cube-shaped Cu single crystal with edge length of 400 nm. The computing efficiency increases with increasing [N_{{{\bf q}}}] [Fig. 3(b)[link]]; the lower efficiency at small [N_{{{\bf q}}}] is likely to be due to insufficient use of GPU cores. The reciprocal and linear relations in Figs. 3(c) and 3(d)[link], respectively, indicate satisfactory parallelization efficiency of GAPD.
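The numbers quoted for the largest test case can be cross-checked with a few lines of arithmetic. The Cu lattice constant of 0.3615 nm is an assumption introduced here (it is not stated in the text), and the run time is taken as exactly 5 h although the text gives it only approximately.

```python
def fcc_atom_count(edge_nm, lattice_nm):
    """Atoms in a cube-shaped fcc crystal: four atoms per cubic unit cell."""
    return 4.0 * (edge_nm / lattice_nm) ** 3

def computation_speed(n_atoms, n_q, seconds):
    """Speed in the units of Fig. 3: atom-q pairs evaluated per second."""
    return n_atoms * n_q / seconds

A_CU_NM = 0.3615  # assumed lattice constant of Cu, in nm

n_atoms = fcc_atom_count(400.0, A_CU_NM)
speed = computation_speed(n_atoms, 1.3e5, 5 * 3600.0)
print(f"N_a = {n_atoms:.2e}, speed = {speed:.1e} per second")
```

Under these assumptions the atom count comes out at roughly 5.4 billion, consistent with the text, and the aggregate speed on nine nodes is roughly 4 × 10^10 atom-q pairs per second.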

4. Application cases

The main features of GAPD, including reciprocal-space visualization, simulation of 2D diffraction patterns on an arbitrarily positioned detector, and considering polychromatic beams, are illustrated with the following three cases: the crystal size effect on reciprocal lattice nodes, the polychromaticity effect on single-crystal diffraction, and the polychromaticity effect on polycrystalline diffraction.

4.1. Crystal size effect on node-broadening in reciprocal space

It is well known that diffraction spot broadening becomes significant for small crystals. But an open question is how small is small? The lower bound above which crystal size-induced broadening can be neglected is still controversial, ranging from 100 nm (Warren, 1969[Warren, B. E. (1969). X-ray Diffraction. Courier Corporation.]) to 500 nm (Ungár, 2001[Ungár, T. (2001). Mater. Sci. Eng. A, 309, 14-22.]). A direct simulation of such broadening with GAPD is instrumental for this matter.

A diffraction pattern measured in experiments is essentially a sampling of the Fourier or reciprocal-space representation of a specimen. Therefore, the node broadening in reciprocal space can be used to examine the size effect, without consideration of diffraction geometry and X-ray wavelength. A 3D node in reciprocal space is more appropriate to reflect the size effect than a 2D diffraction pattern, since the latter only samples a slice of the former.

Cube-shaped, defect-free, Cu single crystals with an edge length ranging from 4 nm to 400 nm are constructed, and examined in reciprocal space. The region around the ([1\bar{1}3]) reciprocal lattice node of a 4 nm crystal is shown in Fig. 4(a)[link], along with the relrods due to the small crystal size. In Fig. 4(b)[link], the perimeter of the node is of a cube shape with round corners, which is more evident in the contour plot of the center cross-section of the node [Fig. 4(c)[link]]. When an Ewald sphere intersects the node, the resulting intersection (a curved surface) depends on the diffraction geometry and X-ray wavelength, and may vary considerably. Consequently, one may obtain diffraction spots of different sizes and shapes, and thus different diffraction broadening for the same reflection plane, so diffraction patterns are less appropriate for such analysis.

[Figure 4]
Figure 4
(a) 3D visualization of reciprocal lattice node ([1\bar{1}3]) enclosed in the red cube, along with the relrods due to the small crystal size (4 nm). (b) Enlarged view of the ([1\bar{1}3]) node in (a). (c) Cross-sectional contour plot of a slice through the node center along the [q_{x}q_{y}]-plane.

For each crystal size, we obtain three reciprocal lattice nodes, ([1\bar{1}1]), ([1\bar{1}3]) and (002). For each node, we obtain the intensity profile along the [q_{x}]-direction on the central [q_{x}q_{y}]-plane which cuts through the node center. The [q_{x}] profiles are fitted with a pseudo-Voigt function,

[\eqalignno{ y = {}& y_{0}+A\Bigg\{ m_{{\rm u}}{{2}\over{\pi}} {{\omega}\over{4(x-x_{{\rm c}})^{2}+\omega^{2}}} \cr& +\left(1-m_{{\rm u}}\right) {{\sqrt{4\ln 2}}\over{\sqrt{\pi}\omega}} \exp\left[-{{4\ln2}\over{\omega^{2}}}\left(x-x_{{\rm c}}\right)^{2}\right]\Bigg\}.&(17)}]

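Here x and y refer to the abscissa of the profile ([q_{x}] for the node profiles here, or [2\theta] for a diffraction curve) and the scattering intensity, respectively; [y_{0}] is the offset, [m_{\rm u}] is a profile shape factor, [x_{\rm c}] is the peak center, A is the peak area and ω is the full width at half-maximum (FWHM).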
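Equation (17) can be written as a short function, which is the form a least-squares fitting routine would need. The argument names below are chosen here for readability and are not taken from GAPD.

```python
import math

def pseudo_voigt(x, y0, area, mu, xc, fwhm):
    """Equation (17): area-normalized Lorentzian and Gaussian sharing one FWHM."""
    lorentz = (2.0 / math.pi) * fwhm / (4.0 * (x - xc) ** 2 + fwhm ** 2)
    gauss = (math.sqrt(4.0 * math.log(2.0)) / (math.sqrt(math.pi) * fwhm)
             * math.exp(-4.0 * math.log(2.0) / fwhm ** 2 * (x - xc) ** 2))
    return y0 + area * (mu * lorentz + (1.0 - mu) * gauss)
```

Both components fall to half of their maximum at x = x_c ± ω/2, so ω is the FWHM of the mixed profile for any value of the shape factor.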

The FWHM of an intensity profile can be used to characterize node broadening or diffraction peak broadening. The fitted FWHMs are presented in Fig. 5[link] as a function of crystal size, and the dependence can be described with a reciprocal function. The FWHM versus crystal size curves are nearly identical for the three reciprocal-space nodes. The FWHM decreases rapidly with increasing crystal size and approaches zero asymptotically. For a moderately high reciprocal resolution in experiments, [\Delta q/q] is about [10^{-3}] (Lienert et al., 2017[Lienert, U., Ribárik, G., Ungár, T., Wejdemann, C. & Pantleon, W. (2017). Synchrotron Radiat. News, 30(3), 35-40.]). The corresponding [\Delta q] values are [4.8\times 10^{-4}], [5.5\times 10^{-4}] and [9.1\times 10^{-4}] Å⁻¹ for ([1\bar{1}1]), (002) and ([1\bar{1}3]), respectively. For FWHM values below the resolution [\Delta q], the peak broadening is deemed negligible. Then, the minimum crystal size beyond which peak broadening can be neglected ranges from 110 to 208 nm for the three reciprocal lattice nodes investigated. For most metals, q of diffraction index [\leq\!5] ranges from 0.2 to 1.65 Å⁻¹. Then minimum crystal sizes can be deduced from the reciprocal relation between FWHM and crystal size, ranging from 60 nm to 500 nm.
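The resolution values and size limits quoted above can be reproduced approximately with the sketch below. Two inputs are assumptions that do not appear in the text: a Cu lattice constant of 3.615 Å, and a reciprocal relation FWHM = K/L with K = 1, chosen only because it roughly recovers the quoted 110-208 nm range; the fitted constant behind Fig. 5 may differ.

```python
import math

A_CU = 3.615  # assumed Cu lattice constant, in angstrom

def q_hkl(h, k, l, a=A_CU):
    """Modulus of the reciprocal lattice vector of a cubic crystal, in 1/angstrom."""
    return math.sqrt(h * h + k * k + l * l) / a

def resolution(q, relative=1e-3):
    """Delta q for a given relative reciprocal resolution Delta q / q."""
    return relative * q

def min_size_nm(delta_q, k_shape=1.0):
    """Crystal size at which FWHM = k_shape / L drops to delta_q (assumed relation)."""
    return k_shape / delta_q / 10.0  # angstrom to nm

for hkl in [(1, -1, 1), (0, 0, 2), (1, -1, 3)]:
    dq = resolution(q_hkl(*hkl))
    print(hkl, f"dq = {dq:.1e} 1/A, minimum size = {min_size_nm(dq):.0f} nm")
```

With the same assumed constant, q = 0.2 and 1.65 Å⁻¹ give about 500 nm and 61 nm, in line with the 60-500 nm range stated for most metals.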

[Figure 5]
Figure 5
FWHM of intensity profiles of reciprocal lattice nodes versus crystal size for three reciprocal-space nodes.

4.2. Diffraction of small single crystals with polychromatic X-rays

To evaluate the effect of polychromaticity on the diffraction of small single crystals, we simulate diffraction patterns of Cu single crystals with four different sizes: 4 nm, 14 nm, 36 nm and 100 nm. X-rays are incident perpendicularly on the (112) plane. We choose the first harmonic of undulator U18G13 (period = 18 mm, gap = 13 mm) at the Advanced Photon Source (APS) 32-ID beamline (Fan et al., 2016[Fan, D., Huang, J. W., Zeng, X. L., Li, Y. E. J. C., Huang, J. Y., Sun, T., Fezzaa, K., Wang, Z. & Luo, S. N. (2016). Rev. Sci. Instrum. 87, 053903.]) as the polychromatic X-ray source [Fig. 6(a)[link]]; its spectral flux peaks at 24.65 keV or [\lambda_{{\rm c}}] = 0.5029 Å. As a reference, we also simulate the diffraction for monochromatic X-rays with a wavelength of [\lambda_{{\rm B}}] = 0.5366 Å satisfying Bragg's law in the same geometry. [\lambda_{{\rm B}}] is indicated by the red dot in Fig. 6(a)[link]. The simulated diffraction patterns for diffraction spot ([13\bar{1}]) are plotted as γ–2θ plots in Figs. 6(b) and 6(c)[link], and as diffraction curves in Fig. 6(d)[link].
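The two wavelengths quoted above can be checked from the geometry alone. Requiring the node q = (hkl)/a to lie on the Ewald sphere, |s0/λ + q| = 1/λ, gives λ = −2(s0·q)/|q|². The sketch below assumes a Cu lattice constant of 3.615 Å and a beam travelling antiparallel to [112]; both are choices made here for illustration, and the function names are hypothetical.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # photon energy (keV) times wavelength (angstrom)

def energy_to_wavelength(energy_kev):
    return HC_KEV_ANGSTROM / energy_kev

def bragg_wavelength(hkl, beam_direction, a):
    """Wavelength that places node hkl/a exactly on the Ewald sphere."""
    n = math.sqrt(sum(c * c for c in beam_direction))
    s0 = [c / n for c in beam_direction]
    q = [c / a for c in hkl]
    s0_dot_q = sum(x * y for x, y in zip(s0, q))
    q2 = sum(x * x for x in q)
    lam = -2.0 * s0_dot_q / q2
    if lam <= 0.0:
        raise ValueError("node cannot be excited for this beam direction")
    return lam

lam_c = energy_to_wavelength(24.65)
lam_b = bragg_wavelength((1, 3, -1), (-1, -1, -2), 3.615)
two_theta = 2.0 * math.degrees(math.asin(0.5 * lam_b * math.sqrt(11.0) / 3.615))
print(f"lambda_c = {lam_c:.4f} A, lambda_B = {lam_b:.4f} A, 2theta = {two_theta:.1f} deg")
```

Under these assumptions the Bragg wavelength for the ([13\bar{1}]) spot agrees with the quoted 0.5366 Å to within rounding, and it lies on the long-wavelength side of the spectral peak at 0.5029 Å.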

[Figure 6]
Figure 6
(a) X-ray spectrum of the first harmonic of the U18 undulator with a gap of 13 mm at the APS beamline 32-ID (APS U18G13). (b) The γ–2θ plots of diffraction spot [(13\bar{1})] from 4 nm single-crystal Cu for monochromatic X-rays ([\lambda_{\rm B}] = 0.5366 Å), and (c) for polychromatic X-rays with APS U18G13. (d) Diffraction profiles along [2\theta] through the center of the [(13\bar{1})] diffraction spot of single-crystal Cu obtained with the polychromatic source, for various crystal sizes. The black arrow denotes the Bragg peak position.

For the 4 nm single crystal, diffraction spot ([13\bar{1}]) is circular on the γ–2θ plot in the case of the monochromatic X-rays [Fig. 6(b)[link]]. However, the corresponding diffraction spot for the polychromatic source is elliptical, elongated along the 2θ direction [Fig. 6(c)[link]], and its center shifts by 0.6° toward lower angle.

The intensity versus [2\theta] profiles through the center of diffraction spot ([13\bar{1}]) formed with the polychromatic source are presented in Fig. 6(d)[link] for various crystal sizes. There is an increasing broadening and peak shift toward lower [2\theta] with decreasing grain size. Essentially, the broadening in the [2\theta] direction is induced by the crystal size effect discussed in §4.1[link], while multiple wavelengths do augment the broadening.

For a monochromatic beam, the Ewald sphere can only sample a small portion of a reciprocal lattice node, while the node intersects with multiple Ewald spheres for a polychromatic beam. A broader bandwidth indicates a larger portion to be sampled by Ewald spheres of different radii, and, thus, more pronounced broadening. As shown above, the size of a reciprocal lattice node increases with decreasing crystal size. The size and intensity of a diffraction spot is the combined result of crystal size (node dimensions) and multiple wavelengths (polychromaticity; multiple Ewald spheres).

The Ewald sphere corresponding to the Bragg wavelength intersects the center of a reciprocal lattice node with the highest intensity for small crystals. Since the intensity at a pixel on the detector is an integration of contributions from different localities of a node weighted by the flux amplitudes of corresponding wavelengths, non-Bragg wavelengths with higher fluxes may still lead to the highest diffraction intensity, and thus a peak shift.

4.3. Diffraction of nanocrystalline solids with polychromatic X-rays

A nanocrystalline Cu structure consisting of 250 randomly oriented grains with a mean grain size of 8 nm is examined. Figs. 7(a)–7(d)[link] present 2D diffraction patterns obtained with monochromatic X-rays (wavelength [\lambda_{{\rm c}}] = 0.5029 Å, 0% bandwidth), polychromatic X-rays with a Gaussian-shaped spectrum (centered at [\lambda_{{\rm c}}], 4% bandwidth), the APS U18G13 undulator source with single harmonic shown in Fig. 6(a)[link] (spectral flux peak at [\lambda_{{\rm c}}], ∼8% bandwidth) and the APS U33G25 undulator source with multiple harmonics, respectively. The X-ray spectrum for APS U33G25 is shown in Fig. 7(e)[link]. With increasing bandwidth, the diffraction spots on Debye–Scherrer rings become stretched along the [2\theta] direction. The broadening can be simply explained by Bragg's law [equation (1)[link]]. For a polycrystalline solid with random grain orientations and sufficient number of grains, the incident angle of X-rays relative to the same group of crystal planes, θ, can be arbitrary. Thus, for fixed dhkl, the number of θ angles that satisfy Bragg's law increases with increasing number of wavelengths or bandwidth.

[Figure 7]
Figure 7
2D diffraction patterns of a 250 grain polycrystalline system for (a) 0.5029 Å monochromatic X-rays, (b) polychromatic X-rays with a Gaussian-shaped spectrum, (c) the first harmonic of APS undulator source U18G13, and (d) the first three harmonics of APS undulator source U33G25. (e) X-ray spectrum of APS U33G25. (f) Corresponding diffraction curves.

We integrate the 2D diffraction patterns azimuthally to obtain the 1D diffraction curves in Fig. 7(f), each normalized by its maximum intensity. The diffraction curves from both the Gaussian and APS U18G13 spectra are broadened along the [2\theta] direction, and the peak shapes are similar to those of the corresponding X-ray spectra. However, the {111} and {200} peaks shift by 0.14–0.26° toward higher [2\theta] for the APS U18G13 spectrum, while no peak shift is seen for the Gaussian spectrum. The difference arises because the APS U18G13 spectrum is asymmetric, whereas the Gaussian spectrum is symmetric.
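A minimal NumPy sketch of such an azimuthal reduction is given below. It is an illustration rather than the routine used in GAPD: it assumes a flat detector normal to the incident beam, averages (rather than sums) the pixels in each [2\theta] bin, and uses hypothetical argument names.

```python
import numpy as np


def azimuthal_integrate(image, center, pixel_size, distance, n_bins=200):
    """Reduce a 2D pattern to I(2θ) by azimuthal averaging, normalized to max 1.

    Assumes a flat detector perpendicular to the incident beam. `center` is
    the (row, col) position of the direct beam; `pixel_size` and `distance`
    (sample to detector) must share the same length unit.
    """
    image = np.asarray(image, dtype=float)
    rows, cols = np.indices(image.shape)
    radius = np.hypot(rows - center[0], cols - center[1]) * pixel_size
    two_theta = np.degrees(np.arctan2(radius, distance))

    edges = np.linspace(0.0, two_theta.max(), n_bins + 1)
    # Summed intensity and pixel count per 2θ bin; their ratio is the mean.
    total, _ = np.histogram(two_theta, bins=edges, weights=image)
    counts, _ = np.histogram(two_theta, bins=edges)
    profile = np.divide(total, counts, out=np.zeros_like(total), where=counts > 0)

    peak = profile.max()
    if peak > 0:
        profile = profile / peak          # normalize by the maximum intensity
    bin_centers = 0.5 * (edges[:-1] + edges[1:])
    return bin_centers, profile


if __name__ == "__main__":
    # Synthetic test pattern: a single ring of radius 50 pixels.
    r, c = np.indices((201, 201))
    ring = np.exp(-0.5 * ((np.hypot(r - 100, c - 100) - 50.0) / 2.0) ** 2)
    angles, curve = azimuthal_integrate(ring, (100, 100), 1.0, 500.0)
    print(f"ring found at 2θ ≈ {angles[np.argmax(curve)]:.2f}°")
```

Averaging per bin removes the trivial growth of ring circumference with radius. A plain sum would instead weight outer rings more heavily, so the choice should match how the experimental curves are reduced.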

The finite grain size leads to diffraction peak broadening in the [2\theta] direction for each wavelength in a polychromatic beam, since the broadened nodes in the reciprocal-space representation of the nanocrystalline Cu allow their intersection with Ewald spheres of different radii. As a result, summing those broadened peaks (one for each wavelength) over the asymmetric APS U18G13 spectrum does not necessarily place the maximum intensity at the Bragg angle for [\lambda_{{\rm c}}], where the spectral flux peaks.
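This summation argument can be reproduced with a toy 1D model, sketched below in Python. Each wavelength contributes a Gaussian peak of fixed width centered at its Bragg angle, weighted by the spectral flux. The model is a sketch under stated assumptions, not the GAPD calculation: the split-Gaussian spectrum is a hypothetical stand-in for an undulator spectrum (not the APS U18G13 data), the Cu lattice constant is an assumed textbook value, and the 0.33° peak width is a rough Scherrer-type estimate for 8 nm grains.

```python
import numpy as np

A_CU = 3.615                    # assumed Cu lattice constant (Å)
D111 = A_CU / np.sqrt(3.0)      # {111} plane spacing (Å)
LAMBDA_C = 0.5029               # wavelength of peak flux (Å), as in the text


def bragg_two_theta(wavelength, d_spacing=D111):
    """2θ in degrees from Bragg's law for scalar or array wavelengths."""
    return 2.0 * np.degrees(np.arcsin(wavelength / (2.0 * d_spacing)))


def split_gaussian(wavelength, sigma_short, sigma_long, center=LAMBDA_C):
    """Toy spectrum peaking at `center` with a different width on each side."""
    sigma = np.where(wavelength < center, sigma_short, sigma_long)
    return np.exp(-0.5 * ((wavelength - center) / sigma) ** 2)


def diffraction_curve(two_theta, wavelengths, flux, size_fwhm_deg):
    """Flux-weighted sum of Gaussian size-broadened peaks, one per wavelength."""
    sigma = size_fwhm_deg / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    centers = bragg_two_theta(wavelengths)[:, None]
    peaks = np.exp(-0.5 * ((two_theta[None, :] - centers) / sigma) ** 2)
    curve = (flux[:, None] * peaks).sum(axis=0)
    return curve / curve.max()


def peak_shift(sigma_short, sigma_long, size_fwhm_deg=0.33):
    """Offset (degrees) of the summed peak from the Bragg angle of LAMBDA_C."""
    wavelengths = np.linspace(0.45, 0.56, 1101)
    two_theta = np.linspace(12.0, 16.0, 2001)
    flux = split_gaussian(wavelengths, sigma_short, sigma_long)
    curve = diffraction_curve(two_theta, wavelengths, flux, size_fwhm_deg)
    return two_theta[np.argmax(curve)] - bragg_two_theta(LAMBDA_C)


if __name__ == "__main__":
    print(f"symmetric spectrum : shift = {peak_shift(0.005, 0.005):+.3f}°")
    print(f"asymmetric spectrum: shift = {peak_shift(0.003, 0.012):+.3f}°")
```

In this toy model the symmetric spectrum leaves the peak at the Bragg angle of [\lambda_{{\rm c}}] to within the grid spacing, while a spectrum with a longer tail on the long-wavelength side moves the maximum toward higher [2\theta]. The size of the shift depends on the assumed widths and should not be compared quantitatively with Fig. 7(f).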

For multiple harmonics [APS U33G25; the green curve in Fig. 7(f)], both the {111} reflection from the second harmonic and the {220} reflection from the third harmonic contribute to the diffraction peak. The `plateau' following the peak is due to the {200} reflection from the second harmonic and the {311} reflection from the third harmonic. Thus, the overlap of diffraction intensities from different crystal planes and different harmonics makes it difficult to analyze diffraction peaks with conventional methods. At present, comparing experiments with diffraction patterns from forward simulation codes such as GAPD is a useful way to interpret multiple-harmonic diffraction data of poly/nanocrystalline solids.
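The proximity of these reflections follows from Bragg's law alone, as the following Python sketch suggests. It assumes an ideal fcc lattice and harmonics at exactly [\lambda_1/n], where [\lambda_1] denotes the fundamental wavelength (a symbol introduced here for illustration). The function names are hypothetical, and the quantity tabulated is [\sin\theta] in units of [\lambda_1/(2a)], so no value of [\lambda_1] or the lattice constant a needs to be assumed.

```python
import math

# First four allowed fcc reflections (h, k, l all odd or all even).
FCC_REFLECTIONS = [(1, 1, 1), (2, 0, 0), (2, 2, 0), (3, 1, 1)]


def reduced_sin_theta(hkl, harmonic):
    """sin θ in units of λ1/(2a) for reflection hkl and the n-th harmonic.

    With d = a / sqrt(h^2 + k^2 + l^2) and λ_n = λ1 / n, Bragg's law gives
    sin θ = (λ1 / 2a) * sqrt(h^2 + k^2 + l^2) / n.
    """
    h, k, l = hkl
    return math.sqrt(h * h + k * k + l * l) / harmonic


def harmonic_table(max_harmonic=3):
    """All (reduced sin θ, hkl, harmonic) combinations, sorted by angle."""
    rows = [(reduced_sin_theta(hkl, n), hkl, n)
            for hkl in FCC_REFLECTIONS
            for n in range(1, max_harmonic + 1)]
    return sorted(rows)


if __name__ == "__main__":
    for value, hkl, n in harmonic_table():
        print(f"{{{hkl[0]}{hkl[1]}{hkl[2]}}}  harmonic {n}:  {value:.3f}")
```

In the sorted table, {111} from the second harmonic and {220} from the third are neighbors, immediately followed by {200} from the second harmonic and {311} from the third, in line with the peak and plateau assignment above. The relative intensities, which depend on the spectral flux and structure factors, are not captured by this sketch.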

5. Conclusions

We present a GPU-accelerated parallel simulation code, GAPD, for simulating electron/X-ray diffraction with mono-/polychromatic beams directly from atomic configurations. Diffraction simulation of ultralarge systems (∼5 billion atoms) is demonstrated, and the system size can be scaled up by a factor of 10–100 on more powerful clusters. GAPD is used to explore the effect of crystal size on node broadening in reciprocal space, and the influence of polychromaticity on peak broadening for single-crystal and polycrystalline nanomaterials.

In particular, (i) for a moderately high reciprocal resolution ([\Delta q/q \sim 10^{-3}]), peak-/node-broadening can be neglected for crystal size above 500 nm. Precise minimum crystal sizes depend on (hkl) and q-resolution.

(ii) For small single crystals, polychromatic beams with asymmetric spectra induce peak shift and broadening in the [2\theta]-direction, which diminish with increasing crystal size.

(iii) For polycrystalline solids, a diffraction peak is broadened by a polychromatic beam, and its shape follows that of the beam spectrum. Asymmetric spectra induce both peak broadening and shift.

Acknowledgements

We benefited from valuable discussions within the PIMS X-ray team.

Funding information

Funding for this research was provided by: National Natural Science Foundation of China (grant No. 11627901); Science Challenge Project of China (grant No. TZ2018001).

References

Barabash, R. & Ice, G. (2014). Strain and Dislocation Gradients from Diffraction: Spatially-Resolved Local Structure and Defects. Singapore: World Scientific.
Brandstetter, S., Derlet, P., Van Petegem, S. & Van Swygenhoven, H. (2008). Acta Mater. 56, 165–176.
Briggs, R., Gorman, M. G., Coleman, A. L., McWilliams, R. S., McBride, E. E., McGonegle, D., Wark, J. S., Peacock, L., Rothman, S., Macleod, S. G., Bolme, C. A., Gleason, A. E., Collins, G. W., Eggert, J. H., Fratanduono, D. E., Smith, R. F., Galtier, E., Granados, E., Lee, H. J., Nagler, B., Nam, I., Xing, Z. & McMahon, M. I. (2017). Phys. Rev. Lett. 118, 025501.
Bristowe, P. D. & Sass, S. L. (1980). Acta Metall. 28, 575–588.
Budai, J., Bristowe, P. & Sass, S. (1983). Acta Metall. 31, 699–712.
Budrovic, Z., Van Swygenhoven, H., Derlet, P. M., Van Petegem, S. & Schmitt, B. (2004). Science, 304, 273–276.
Chen, S., E, J. & Luo, S. N. (2017). J. Appl. Cryst. 50, 951–958.
Coleman, S. P., Sichani, M. M. & Spearot, D. E. (2014). JOM, 66, 408–416.
Coleman, S. P., Spearot, D. E. & Capolungo, L. (2013). Modell. Simul. Mater. Sci. Eng. 21, 055020.
Derlet, P. M., Van Petegem, S. & Van Swygenhoven, H. (2005). Phys. Rev. B, 71, 024114.
E, J. C., Wang, L., Cai, Y., Wu, H. A. & Luo, S. N. (2015). J. Chem. Phys. 142, 064704.
Fan, D., Huang, J. W., Zeng, X. L., Li, Y., E, J. C., Huang, J. Y., Sun, T., Fezzaa, K., Wang, Z. & Luo, S. N. (2016). Rev. Sci. Instrum. 87, 053903.
Fan, D., Lu, L., Li, B., Qi, M. L., E, J. C., Zhao, F., Sun, T., Fezzaa, K., Chen, W. & Luo, S. N. (2014). Rev. Sci. Instrum. 85, 113902.
Favre-Nicolin, V., Coraux, J., Richard, M.-I. & Renevier, H. (2011). J. Appl. Cryst. 44, 635–640.
Fox, A. G., O'Keefe, M. A. & Tabbernor, M. A. (1989). Acta Cryst. A45, 786–793.
Geloni, G., Saldin, E., Samoylova, L., Schneidmiller, E., Sinn, H., Tschentscher, T. & Yurkov, M. (2010). New J. Phys. 12, 035021.
Gleiter, H. (1989). Prog. Mater. Sci. 33, 223–315.
Gleiter, H. (2000). Acta Mater. 48, 1–29.
Hammond, C. (2009). The Basics of Crystallography and Diffraction, No. 12, 3rd ed. Oxford University Press.
Hawreliak, J., Colvin, J. D., Eggert, J. H., Kalantar, D. H., Lorenzana, H. E. S., Stölken, J., Davies, H. M., Germann, T. C., Holian, B. L., Kadau, K., Lomdahl, P. S., Higginbotham, A., Rosolankova, K., Sheppard, J. & Wark, J. S. (2006). Phys. Rev. B, 74, 184107.
Huang, X. R. (2010). J. Appl. Cryst. 43, 926–928.
Hudspeth, M., Sun, T., Parab, N., Guo, Z., Fezzaa, K., Luo, S. & Chen, W. (2015). J. Synchrotron Rad. 22, 49–58.
Kimminau, G., Nagler, B., Higginbotham, A., Murphy, W. J., Park, N., Hawreliak, J., Kadau, K., Germann, T. C., Bringa, E. M., Kalantar, D. H., Lorenzana, H. E., Remington, B. A. & Wark, J. S. (2008). J. Phys. Condens. Matter, 20, 505203.
LCLS (2017). LCLS Instruments, https://portal.slac.stanford.edu/sites/lcls_public/instruments/Pages/de
Lienert, U., Ribárik, G., Ungár, T., Wejdemann, C. & Pantleon, W. (2017). Synchrotron Radiat. News, 30(3), 35–40.
Liu, H. K., Lin, Y. & Luo, S. N. (2014). J. Phys. Chem. C, 118, 24797–24802.
Luo, S. N., Jensen, B. J., Hooks, D. E., Fezzaa, K., Ramos, K. J., Yeager, J. D., Kwiatkowski, K. & Shimada, T. (2012). Rev. Sci. Instrum. 83, 073903.
Peng, L.-M., Ren, G., Dudarev, S. L. & Whelan, M. J. (1996). Acta Cryst. A52, 257–276.
Plimpton, S. (1995). J. Comput. Phys. 117, 1–19.
Révész, A., Ungár, T., Borbély, A. & Lendvai, J. (1996). Nanostruct. Mater. 7, 779–788.
Sun, T. & Fezzaa, K. (2016). J. Synchrotron Rad. 23, 1046–1053.
Tamura, N. (2014). Strain and Dislocation Gradients from Diffraction: Spatially Resolved Local Structure and Defects, edited by R. Barabash & G. Ice, pp. 125–155. London: Imperial College Press.
Turneaure, S. J., Gupta, Y. M., Zimmerman, K., Perkins, K., Yoo, C. S. & Shen, G. (2009). J. Appl. Phys. 105, 053520.
Ungár, T. (2001). Mater. Sci. Eng. A, 309, 14–22.
Vartanyants, I. A. & Robinson, I. K. (2001). J. Phys. Condens. Matter, 13, 10593–10611.
Vartanyants, I. A. & Singer, A. (2016). Synchrotron Light Sources and Free-Electron Lasers, edited by E. J. Jaeschke, S. Khan, J. R. Schneider & J. B. Hastings, pp. 821–863. Springer International Publishing.
Vartanyants, I. A., Singer, A., Mancuso, A. P., Yefanov, O. M., Sakdinawat, A., Liu, Y., Bang, E., Williams, G. J., Cadenazzi, G., Abbey, B., Sinn, H., Attwood, D., Nugent, K. A., Weckert, E., Wang, T., Zhu, D., Wu, B., Graves, C., Scherz, A., Turner, J. J., Schlotter, W. F., Messerschmidt, M., Lüning, J., Acremann, Y., Heimann, P., Mancini, D. C., Joshi, V., Krzywinski, J., Soufli, R., Fernandez-Perea, M., Hau-Riege, S., Peele, A. G., Feng, Y., Krupin, O., Moeller, S. & Wurth, W. (2011). Phys. Rev. Lett. 107, 144801.
Wang, L., E, J. C., Cai, Y., Zhao, F., Fan, D. & Luo, S. N. (2015). J. Appl. Phys. 117, 084301.
Warren, B. E. (1969). X-ray Diffraction. Courier Corporation.
Williams, D. B. & Carter, C. B. (1996). Transmission Electron Microscopy, p. 299. Berlin: Springer.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
