computer programs

JOURNAL OF APPLIED CRYSTALLOGRAPHY
ISSN: 1600-5767
Volume 49| Part 4| August 2016| Pages 1356-1362

Condor: a simulation tool for flash X-ray imaging

(a) Laboratory of Molecular Biophysics, Department of Cell and Molecular Biology, Uppsala University, Husargatan 3 (Box 596), SE-751 24 Uppsala, Sweden, and (b) NERSC, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
*Correspondence e-mail: hantke@xray.bmc.uu.se

Edited by N. D. Loh, National University of Singapore (Received 3 February 2016; accepted 7 June 2016; online 14 July 2016)

Flash X-ray imaging has the potential to determine structures down to molecular resolution without the need for crystallization. The ability to accurately predict the diffraction signal and to identify the optimal experimental configuration within the limits of the instrument is important for successful data collection. This article introduces Condor, an open-source simulation tool to predict X-ray far-field scattering amplitudes of isolated particles for customized experimental designs and samples, which the user defines by an atomic or a refractive index model. The software enables researchers to test whether their envisaged imaging experiment is feasible, and to optimize critical parameters for reaching the best possible result. It also aims to support researchers who intend to create or advance reconstruction algorithms by simulating realistic test data. Condor is designed to be easy to use and can be either installed as a Python package or used from its web interface (https://lmb.icm.uu.se/condor). X-ray free-electron lasers have high running costs and beam time at these facilities is precious. Data quality can be substantially improved by using simulations to guide the experimental design and simplify data analysis.

1. Introduction

Flash X-ray imaging (FXI) may become a tool to solve structures down to molecular resolution without the need for crystallization (Neutze et al., 2000[Neutze, R., Wouts, R., van der Spoel, D., Weckert, E. & Hajdu, J. (2000). Nature, 406, 752-757.]; Bergh et al., 2008[Bergh, M., Huldt, G., Tîmneanu, N., Maia, F. R. N. C. & Hajdu, J. (2008). Q. Rev. Biophys. 41, 181-204.]). By employing femtosecond pulses produced by X-ray free-electron lasers, FXI can outrun radiation damage processes that limit resolution (Chapman et al., 2006[Chapman, H. N., Barty, A. et al. (2006). Nat. Phys. 2, 839-843.]). FXI dispenses with image-forming lenses and thereby circumvents the difficulty of manufacturing efficient lenses for X-rays (Chapman & Nugent, 2010[Chapman, H. N. & Nugent, K. A. (2010). Nat. Photon. 4, 833-839.]). Aerosol sample delivery avoids a sample support, which means that the structure can be imaged with practically no background (Bogan et al., 2008[Bogan, M. J. et al. (2008). Nano Lett. 8, 310-316.]; Seibert et al., 2011[Seibert, M. M. et al. (2011). Nature, 470, 78-81.]; Hantke et al., 2014[Hantke, M. F. et al. (2014). Nat. Photon. 8, 943-949.]).

To reach the goal of 3 Å resolution, the Single Particle Imaging Initiative identifies the need for simulations that realistically represent the experimental conditions to guide future development (Aquila et al., 2015[Aquila, A. et al. (2015). Struct. Dyn. 2, 041701.]). It is essential to optimize and harmonize all relevant experimental parameters, such as photon wavelength, photon flux, illumination profile, camera distance, detector settings, sample density and even sample type. Being able to accurately predict diffraction data facilitates optimization of the experimental setup and helps to provide accurate estimates of the expected data quality. Simulation tools can help researchers to use their beam time more efficiently and measure diffraction data at the highest possible quality.

Software for simulating X-ray diffraction data exists. For example, CCP4 (Winn et al., 2011[Winn, M. D. et al. (2011). Acta Cryst. D67, 235-242.]) is widely used, but it is aimed at crystal diffraction, which makes it hard to use for simulating continuous diffraction patterns. In a couple of publications (Yefanov & Vartanyants, 2013[Yefanov, O. M. & Vartanyants, I. A. (2013). J. Phys. B At. Mol. Opt. Phys. 46, 164013.]; Serkez et al., 2013[Serkez, S., Kocharyan, V. & Saldin, E. (2013). 35th International Free-Electron Laser Conference, pp. 574-582. Red Hook: Curran Associates.]; Ayyer et al., 2015[Ayyer, K., Geloni, G., Kocharyan, V., Saldin, E., Serkez, S., Yefanov, O. & Zagorodnov, I. (2015). Struct. Dyn. 2, 041702.]) the program Moltrans is mentioned and described as a software package to simulate FXI data for atomic models. Unfortunately, the code is not openly available. Very recently, SimS2E was released, which is a sophisticated start-to-end simulation framework specialized for single-molecule FXI at the European X-ray free-electron laser (Yoon et al., 2015[Yoon, C. H., Yurkov, M. V., Schneidmiller, E. A., Samoylova, L., Buzmakov, A., Jurek, Z., Santra, R., Loh, N. D. & Mancuso, A. P. (2015). Sci. Rep. 6, 24791.]). What is missing is a practical, convenient and openly available FXI software tool that covers a range of sample models.

Here we introduce Condor, an easy-to-use software package to simulate FXI far-field scattering amplitudes from an experimental setup customized by the user. The user may define the sample either by atom positions or at lower resolution by a three-dimensional refractive index map. This allows one to simulate diffraction from samples that are unknown at atomic resolution but for which low-resolution densities from, for example, electron microscopy studies exist. Common challenges that a researcher faces with real data (Seibert et al., 2011[Seibert, M. M. et al. (2011). Nature, 470, 78-81.]; Loh et al., 2012[Loh, N. D. et al. (2012). Proc. SPIE, 8504, 850403.]; Hantke et al., 2014[Hantke, M. F. et al. (2014). Nat. Photon. 8, 943-949.]; van der Schot et al., 2015[Schot, G. van der et al. (2015). Nat. Commun. 6, 5704.]; Ekeberg et al., 2015[Ekeberg, T. et al. (2015). Phys. Rev. Lett. 114, 098102, 1-6.]) can be introduced by adding, for example, noise, signal variation, missing data regions, fluctuation of the beam tilt, sample heterogeneity or sample contamination. So far, Condor has demonstrated its usefulness for the preparation of experiments, data validation (Hantke et al., 2014[Hantke, M. F. et al. (2014). Nat. Photon. 8, 943-949.]), and the development of new software and algorithms (Daurer et al., 2016[Daurer, B. J., Hantke, M. F., Nettelblad, C. & Maia, F. R. N. C. (2016). J. Appl. Cryst. 49, 1-6.]).

Condor is distributed under the free open-source Simplified Berkeley Software Distribution (BSD) License to ensure transparency and to ease future development and availability of the code. The source code can be downloaded from https://github.com/mhantke/condor. Condor does not require a local installation. It can be used directly from its web interface at https://lmb.icm.uu.se/condor (Fig. 1[link]).

[Figure 1]
Figure 1
The interface of Condor's web application (https://lmb.icm.uu.se/condor). In the left panels the user configures the source, particle and detector model. From the upper-right panel the job is submitted, and after the job has completed a preview and download links of the simulated data appear in the lower-right panel.

In this paper we describe the theoretical diffraction model that the code is based on (§2[link]), explain how to use Condor (§3[link]) and outline details of the current implementation (§4[link]). The last section summarizes the paper and draws conclusions (§5[link]).

2. Theory

Condor attempts to predict coherent X-ray diffraction patterns on the basis of a sample model. Below we briefly outline the necessary approximations and the derivation of the scattering formulas that are used. For a comprehensive description of the underlying theory, see, for example, Paganin (2006[Paganin, D. M. (2006). Coherent X-ray Optics. Oxford University Press.]) and Als-Nielsen & McMorrow (2001[Als-Nielsen, J. & McMorrow, D. (2001). Elements of Modern X-ray Physics. New York: Wiley.]).

For X-ray energies far from any absorption edges and well below the rest mass energy of an electron (511 keV) we may neglect Compton scattering. The samples that are considered here have a thickness of up to a few hundred nanometres and interact, because of their small size, only weakly with X-rays. This circumstance allows us to neglect the perturbation of the primary wave by the scattered wave within the sample. This approximation is well known as the first-order Born approximation.

Predictions suggest that femtosecond X-ray pulses can outrun radiation damage processes (Neutze et al., 2000[Neutze, R., Wouts, R., van der Spoel, D., Weckert, E. & Hajdu, J. (2000). Nature, 406, 752-757.]). Hence, in the simulations we model the sample by a scattering potential [\varphi({\bf x})], which is invariant over the duration of the pulse.

The sample particle is placed in vacuum and illuminated by a plane wave with wavevector [{\bf k}_{0}] (see Fig. 2[link]). We seek to predict the wavefield Ψ at pixel positions [{\bf x^{{\prime}}}] in the detector plane that is orthogonal to the beam axis and at a far distance from the object. In this scenario [\Psi({\bf x^{{\prime}}})] can be expressed as the sum of the primary wave [\Psi^{{(0)}}({\bf x^{{\prime}}})] and the scattered wave (or scattering amplitude) [\Psi^{{(1)}}({\bf x^{{\prime}}})]. The direct beam [\Psi^{{(0)}}({\bf x^{{\prime}}})] does not carry any structural information and is confined to the forward direction. It usually passes through a gap between the detector panels or is blocked by a beam stop and is never measured. Structural information about the sample is encoded by the scattering amplitude [\Psi^{{(1)}}], which is the superposition of spherical waves with amplitude [\varphi(\bf{x})] originating from all points [\bf{x}] in the scattering volume:

[\Psi^{{(1)}}({\bf x^{{\prime}}}) = \int\!\!\int\!\!\int{{\exp(ik\left|{\bf x^{{\prime}}}-{\bf x}\right|)} \over {\left|{\bf x^{{\prime}}}-{\bf x}\right|}}\,\varphi({\bf x})\exp(i{\bf k}_{0}\cdot{\bf x})\,{\rm d}{\bf x}. \eqno(1)]

In our scenario, the sample volume is small and the detector distance large. Hence, we may safely assume [|{\bf x^{{\prime}}}|\gg|{\bf x}|] and obtain the far-field approximation of (1[link]):

[\Psi^{{(1)}}({\bf q}) = {{\exp(ikr)} \over {r}}\int\!\!\int\!\!\int\varphi({\bf x})\exp(-i{\bf q}\cdot{\bf x})\,{\rm d}{\bf x}, \eqno(2)]

where [{\bf q} = {\bf k}_{1}-{\bf k}_{0}] denotes the scattering vector and [r = |{\bf x}^{{\prime}}|]. Since we only consider elastic scattering the energy is conserved and so is the wavenumber [k = |{\bf k}_{0}| = |{\bf k}_{1}| = 2\pi/\lambda], where λ denotes the wavelength. As we are only interested in relative phase differences we neglect the phase factor exp(ikr) in the following equations.

[Figure 2]
Figure 2
Schematic representation of the geometry in real and Fourier space. (a) A plane wave illuminates the sample. It is placed in vacuum and confined to the scattering volume illustrated by the green box. The signal at the detector plane is the superposition of the primary wave with wavevector [{\bf k}_{0}] and the scattered wave with wavevector [{\bf k}_{1}]. (b) The diffraction space is the reciprocal space of scattering vectors [{\bf q} = {\bf k}_{1}-{\bf k}_{0}] and contains the Fourier transform of the scattering potential [\varphi({\bf x})].

For numerical calculation of the scattering amplitude [\Psi^{{(1)}}({\bf q})] we have to either solve the integral in (2[link]) or approximate it by a discrete function. Analytical solutions exist for certain sample models, such as uniformly filled spheres or spheroids (Feigin & Svergun, 1987[Feigin, L. A. & Svergun, D. I. (1987). Structure Analysis by Small-Angle X-ray and Neutron Scattering. New York: Plenum Press.]; Hamzeh & Bragg, 1974[Hamzeh, F. M. & Bragg, R. H. (1974). J. Appl. Phys. 45, 3189-3195.]). In Condor these solutions of (2[link]) are implemented and can be customized by a few parameters. For more complex samples Condor provides two ways of defining the sample: either by a positional arrangement of atoms or by a gridded refractive index map. In the following subsections numerical solutions for these two particle models are presented. Both involve approximating the integral in (2[link]) by discrete Fourier transforms (DFTs) that have the general form

[\hat{F}({\bf s}) = {{1} \over {N}}\,\sum _{{j = 0}}^{{N-1}}\, F({\bf x}_{j})\exp\left(-2\pi i{{{\bf s}\cdot{\bf x}_{j}} \over {N}}\right). \eqno(3)]

This formulation allows Condor to deploy efficient fast Fourier transform algorithms and exploit rapid parallel computing architectures.
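As a one-dimensional illustration of (3[link]) on a regular grid, the sketch below compares a direct evaluation of the sum with NumPy's fast Fourier transform. The helper `dft_direct` is a hypothetical name introduced for this example; note that `numpy.fft.fft` omits the factor 1/N, so it is applied explicitly.

```python
import numpy as np

def dft_direct(F):
    """Direct O(N^2) evaluation of eq. (3) for sample index j = 0..N-1."""
    N = len(F)
    j = np.arange(N)
    s = j.reshape(-1, 1)                    # one row per output frequency s
    return (F * np.exp(-2j * np.pi * s * j / N)).sum(axis=1) / N

rng = np.random.default_rng(0)
F = rng.normal(size=16) + 1j * rng.normal(size=16)

F_hat_direct = dft_direct(F)
F_hat_fft = np.fft.fft(F) / len(F)          # FFT with the 1/N normalisation of eq. (3)
```

Both results agree to numerical precision; the FFT merely reduces the cost from O(N²) to O(N log N).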

2.1. Atomic model

FXI studies often target small sample particles that have sufficient resemblance to systems for which atomic structures have been determined by either X-ray crystallography, cryo-electron microscopy or nuclear magnetic resonance spectroscopy. X-rays are scattered by atoms because of their bound electrons. The scattering strength of a single free electron is known as the Thomson scattering length r0. The scattering potential for N free electrons located at the respective positions [{\bf x}_{j}] may be written as

[\varphi _{{\rm{electrons}}}({\bf x}) = \textstyle\sum\limits _{{j = 1}}^{{N}}\delta({\bf x}-{\bf x}_{j}) \,r_{0}.\eqno(4)]

By substituting (4[link]) into (2[link]) the δ functions conveniently reduce the integral in (2[link]) to a sum and we obtain the scattering amplitude in a simpler form:

[\Psi^{{(1)}}_{{\rm{electrons}}}({\bf q}) = r^{{-1}}\textstyle\sum\limits _{{j = 1}}^{{N}} r_{0}\exp(-i{\bf q}\cdot{\bf x}_{j}).\eqno(5)]

For electrons bound to an atom of species a the scattering length can be calculated by multiplying r0 by the atomic scattering factor [f^{{(\lambda)}}_{ a}(\theta)]. The atomic scattering factor is a semi-empirically determined element-specific quantity that is tabulated for a large range of wavelengths λ and scattering angles θ (Brown et al., 2006[Brown, P. J., Fox, A. G., Maslen, E. N., O'Keefe, M. A. & Willis, B. T. M. (2006). International Tables for Crystallography, Vol. C, 1st online ed., edited by E. Prince, ch. 6.1. Chester: International Union of Crystallography.]; Henke et al., 1993[Henke, B. L., Gullikson, E. & Davis, J. (1993). At. Data Nucl. Data Tables, 54, 181-342.]). The shape of the atom is reflected in the angular dependency; hence the atomic scattering factor is also known as the atomic form factor.

This permits us to replace the integral in (2[link]) as in (5[link]) by a sum. The scattering amplitude can be evaluated by separating the calculation into sums for each atom species a that accounts for Na atoms at positions [\{{\bf x}_{j}^{{(a)}}\}]. We obtain

[\Psi^{{(1)}}_{{\rm {atoms}}}({\bf q}) = r^{{-1}}\textstyle\sum\limits _{{a}}\, f^{{(\lambda)}}_{{a}}(\theta)\left[\sum\limits _{{j = 1}}^{{N_{a}}}r_{0}\exp\left(-i{\bf q}\cdot {\bf x}_{j}^{{(a)}}\right)\right].\eqno(6)]

[ \Psi^{{(1)}}_{{\rm{atoms}}}({\bf q})] now has the form of a sum of DFTs (3[link]) with [F({\bf x}_{j}) = r_{0}] computed on the nonregular grid [\{{\bf x}_{j}^{{(a)}}\}].
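The structure of (6[link]) can be illustrated by a direct summation in Python. This is a sketch under stated assumptions and not Condor's implementation: the function name `amplitude_atoms` is hypothetical, the factor 1/r is omitted so that the result is a scattering length in metres, and the scattering factor of carbon is set to its atomic number purely for illustration.

```python
import numpy as np

R0 = 2.8179403e-15  # Thomson scattering length (classical electron radius), m

def amplitude_atoms(q, positions_by_species, f_by_species):
    """Direct evaluation of the species-wise sums in eq. (6), without 1/r.

    q                    : (M, 3) scattering vectors in 1/m
    positions_by_species : dict species -> (N_a, 3) atom positions in m
    f_by_species         : dict species -> scattering factor, scalar or (M,) array
    """
    q = np.asarray(q, dtype=float)
    psi = np.zeros(len(q), dtype=complex)
    for a, x in positions_by_species.items():
        x = np.asarray(x, dtype=float)
        phases = np.exp(-1j * (q @ x.T))          # shape (M, N_a)
        psi += f_by_species[a] * R0 * phases.sum(axis=1)
    return psi

# Two atoms separated by d along x (illustrative values)
d = 1.0e-10
positions = {"C": [[0.0, 0.0, 0.0], [d, 0.0, 0.0]]}
f = {"C": 6.0}                                    # assumption: f = Z
q = np.array([[0.0, 0.0, 0.0],                    # forward direction
              [np.pi / d, 0.0, 0.0]])             # q.d = pi
psi = amplitude_atoms(q, positions, f)
```

In the forward direction the two contributions add up, whereas for [{\bf q}\cdot{\bf d} = \pi] they cancel. The cost of this direct summation grows with the product of the number of atoms and the number of pixels, which is why fast transform algorithms and parallel hardware matter for realistic models.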

2.2. Refractive index model

For larger objects, such as big protein complexes or virus particles, the atomic structure is rarely on hand. However, at lower than atomic resolution electron density maps [\rho _{\rm e}({\bf x})] of a wide range of structures have been measured by electron microscopy. Also, for many relevant optical media we can estimate the atomic composition (see Table 1[link]) and are able to model samples by customized density maps of optical media. For these cases the scattering potential [\varphi _{n}({\bf x})] can be derived from the Maxwell equations and written as a function of the complex valued refractive index [n({\bf x})]:

[\varphi _{{{n}}}({\bf x}) = {{2\pi} \over {\lambda^{2}}}[1-n({\bf x})].\eqno(7)]

For convenience we define [\partial n({\bf x})\equiv 1-n({\bf x})]. By inserting (7[link]) into (2[link]) we obtain the scattering amplitude as a function of [\partial n]:

[\Psi^{{(1)}}_{{{\partial n}}}({\bf q}) = r^{{-1}}\int\!\!\int\!\!\int{{2\pi} \over {\lambda^{2}}}\partial n({\bf x})\exp(-i{\bf q}\cdot{\bf x})\,{\rm d}{\bf x}.\eqno(8)]

If this equation is interpreted as the continuum limit of (5[link]) the relationship between the refractive index and the electron density distribution [\rho _{\rm e}({\bf x})] becomes

[n_{{\rm {el}}}({\bf x}) = 1-{{\lambda^{2}} \over {2\pi}}\, r_{0}\,\rho _{\rm e}({\bf x}), \eqno(9)]

and the relationship between refractive index and the atom density distribution [\rho _{a}({\bf x})] becomes

[n_{{\rm{at}}}({\bf x}) = 1-{{\lambda^{2}} \over {2\pi}}\, r_{0}\sum _{{a = 0}}^{M}\, f^{{(\lambda)}}_{a}(0) \,\rho _{a}({\bf x}).\eqno(10)]

Using the relationships (9[link]) and (10[link]), Condor converts electron and atom density maps into refractive index maps. We presume here that [f^{{(\lambda)}}_{a}(\theta)\simeq f^{{(\lambda)}}_{a}(0)] for all scattering angles θ, which is a valid assumption as long as the measurement does not resolve atomic length scales.
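The conversion in (10[link]) can be sketched in a few lines of Python. The function name `refractive_index` is hypothetical, and the forward scattering factors are set to the atomic numbers (free-electron limit), in which case (10[link]) reduces to (9[link]). This is an assumption made for illustration: with tabulated wavelength-dependent scattering factors the result differs, so the number obtained here is not expected to reproduce the values listed in Table 1[link].

```python
import numpy as np

R0 = 2.8179403e-15    # Thomson scattering length, m
N_A = 6.02214076e23   # Avogadro constant, 1/mol

def refractive_index(wavelength, atom_densities, f0):
    """Eq. (10): refractive index from atom number densities.

    wavelength     : in m
    atom_densities : dict species -> number density in 1/m^3 (scalar or map)
    f0             : dict species -> forward scattering factor f_a(0)
    """
    total = sum(f0[a] * np.asarray(rho) for a, rho in atom_densities.items())
    return 1.0 - wavelength**2 / (2.0 * np.pi) * R0 * total

# Water: 1.00 g/cm^3, H2O, molar mass 18.015 g/mol
molecules_per_m3 = 1.00e6 / 18.015 * N_A
rho = {"H": 2 * molecules_per_m3, "O": molecules_per_m3}
f_free = {"H": 1.0, "O": 8.0}   # assumption: f_a(0) = Z (free electrons)

n_water = refractive_index(1.240e-9, rho, f_free)
delta = 1.0 - n_water           # about 2.3e-4 in this free-electron estimate
```

Passing arrays instead of scalars for the densities converts a whole density map in one call.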

Table 1
Mass density, atomic composition and refractive index for a selection of optical media

The material constants were taken from Bergh et al. (2008[Bergh, M., Huldt, G., Tîmneanu, N., Maia, F. R. N. C. & Hajdu, J. (2008). Q. Rev. Biophys. 41, 181-204.]), Molla et al. (1991[Molla, A., Paul, A. & Wimmer, E. (1991). Science, 254, 1647-1651.]) and Dans et al. (1966[Dans, P. E., Forsyth, B. R. & Chanock, R. M. (1966). J. Bacteriol. 91, 1605-1611.]), and the refractive index was calculated from these values with the relationship given by (10[link]).

Material type | Density (g cm^-3) | Atomic composition | Refractive index (λ = 1.240 nm)
Water | 1.00 | H2O | 1 - (1.54 + 0.18i) × 10^-4
Protein | 1.35 | H86C52N13O15S | 1 - (2.03 + 0.16i) × 10^-4
DNA | 1.70 | H11C10N4O6P | 1 - (2.44 + 0.23i) × 10^-4
Lipid | 1.00 | H69C36O6P | 1 - (1.54 + 0.10i) × 10^-4
Cell | 1.00 | H23C3NO10S | 1 - (1.51 + 0.16i) × 10^-4
Poliovirus particle | 1.34 | C332652H492388N98245O131196P7501S2340 | 1 - (1.99 + 0.17i) × 10^-4

Discretization of the Fourier integral in (2[link]) with (3[link]) on a three-dimensional cubic grid of L×L×L points at spacing [\Delta x] results in

[\Psi^{{(1)}}_{{{n}}}({\bf q},r) = r^{{-1}}\,{{2\pi} \over {\lambda^{2}}}\, L^{3}\,\widehat{\partial n}\left({{{\bf q}\Delta x} \over {2\pi}}\right)\,\Delta x^{3}, \eqno(11)]

with [\widehat{\partial n}] being the discrete Fourier transform (3[link]) of [\partial n]. This expression allows Condor to efficiently calculate the scattering amplitude for any discrete map [\{\partial n({\bf x}_{j})\}] on the regular grid [\{{\bf x}_{j}\}].
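A minimal sketch of (11[link]) with NumPy is given below. It is not Condor's implementation: the function name `amplitude_from_map` is hypothetical, the factor 1/r is omitted, and the amplitude is evaluated on the regular three-dimensional FFT grid rather than on the curved Ewald sphere. The example map is a uniform sphere with the value of [\partial n] of water from Table 1[link]; grid size and radius are assumed values.

```python
import numpy as np

def amplitude_from_map(dn, dx, wavelength):
    """Eq. (11) without 1/r, evaluated on the regular FFT grid.

    dn : (L, L, L) complex map of 1 - n
    dx : grid spacing in m
    Returns the q values along one axis (1/m) and the amplitudes (m).
    """
    L = dn.shape[0]
    dn_hat = np.fft.fftn(dn) / dn.size            # 1/N normalisation of eq. (3), N = L^3
    psi = 2.0 * np.pi / wavelength**2 * L**3 * dn_hat * dx**3
    q_axis = 2.0 * np.pi * np.fft.fftfreq(L, d=dx)
    return q_axis, psi

# Uniform sphere of water (Table 1) on a 64^3 grid; sizes are illustrative
L, dx, wavelength = 64, 2.5e-9, 1.240e-9
radius = 40e-9
dn_water = (1.54 + 0.18j) * 1e-4
idx = (np.arange(L) - L // 2) * dx
X, Y, Z = np.meshgrid(idx, idx, idx, indexing="ij")
dn = np.where(X**2 + Y**2 + Z**2 <= radius**2, dn_water, 0.0)

q_axis, psi = amplitude_from_map(dn, dx, wavelength)
forward = psi[0, 0, 0]                            # amplitude at q = 0
```

At [{\bf q} = 0] the result reduces to [(2\pi/\lambda^{2})\,\partial n\,V], with V the volume of the sphere, which provides a simple consistency check of the discretization.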

2.3. Diffraction measurement

To predict the absolute scattering signal [I^{{(1)}}({\bf q})] measured with a photon detector we need to take into account the intensity I0 of the illumination, the solid angle [\Omega(\theta)] that is covered by the detector pixel, and the polarization factor [P(\theta)], which accounts for the effects of the polarization of the incoming beam in the scattered signal (Als-Nielsen & McMorrow, 2001[Als-Nielsen, J. & McMorrow, D. (2001). Elements of Modern X-ray Physics. New York: Wiley.]). With these parameters the expectation value for the number of scattered photons measured in a pixel (without noise and any losses) is given by

[I^{{(1)}}({\bf q}) = I_{0}\,\left|\Psi^{{(1)}}({\bf q})\right|^{2}\, P(\theta)\,\Omega(\theta).\eqno(12)]

Owing to the quantum nature of photons the measurement of [I^{{(1)}}({\bf q})] inevitably suffers from shot noise and thus follows Poisson statistics. This type and other types of measurement errors such as detector noise, parasitic scattering and limited quantum efficiency may be added to the simulated intensity values if desired.
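The following Python sketch illustrates (12[link]) followed by Poisson sampling. Several assumptions are made that are not taken from Condor: the function names are hypothetical, the amplitude F is treated as a scattering length in metres (factor 1/r omitted) so that multiplication with a fluence in photons per square metre and a solid angle yields a photon count, the polarization factor defaults to 1, and the pixel solid angle uses the common flat-detector expression p² cos³θ / D², with θ the angle between the scattered ray and the beam axis.

```python
import numpy as np

def pixel_solid_angle(pixel_size, distance, angle):
    """Solid angle of a pixel on a flat detector orthogonal to the beam.

    angle is measured between the scattered ray and the beam axis.
    """
    return pixel_size**2 * np.cos(angle)**3 / distance**2

def expected_photons(F, photons_per_area, solid_angle, polarization=1.0):
    """Eq. (12): expectation value of the photon count per pixel."""
    return photons_per_area * np.abs(F)**2 * polarization * solid_angle

rng = np.random.default_rng(1)

# Illustrative (assumed) numbers for a few pixels close to the beam axis
D, p = 0.7, 110e-6                   # detector distance and pixel size, m
angles = np.zeros((4, 4))            # small-angle limit
F = np.full((4, 4), 2.0e-8)          # scattering length, m
I0 = 1.0e24                          # photons per m^2

omega = pixel_solid_angle(p, D, angles)
expected = expected_photons(F, I0, omega)
counts = rng.poisson(expected)       # shot noise
```

Further error sources, such as detector noise or a limited quantum efficiency, would be applied to `expected` or `counts` in additional steps.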

For the refractive index model the agreement of data from a real FXI experiment and simulated data calculated by using the formalism that has been described here is demonstrated in Fig. 3[link]. For the atomic model such a comparison cannot be made because we lack suitable experimental data at this point.

[Figure 3]
Figure 3
Comparison of experimental data and simulated data. (a) From the measured FXI diffraction pattern (top) of a single carboxysome (i.e. an icosahedral cell organelle) the projection image (bottom) was reconstructed by iterative phase retrieval down to 18.1 nm resolution. (b) At the given resolution the Condor simulation of a diffraction pattern and projection image for a uniformly filled icosahedron in matching orientation and size provides an acceptable approximation for the data shown in (a). Figure adapted from Hantke et al. (2014[Hantke, M. F. et al. (2014). Nat. Photon. 8, 943-949.]).

3. Usage

In the following paragraphs we give an introduction to the usage and functionality of Condor. For a detailed description of all features please see Condor's documentation at https://lmb.icm.uu.se/condor/documentation.

Every Condor simulation requires the configuration of at least three components: the X-ray source, at least one sample and a pixel array detector. The configuration of the X-ray source defines the photon wavelength and intensity at the interaction point. The model of the sample can be of different kinds, either an atomic model or a refractive index description. The atomic description requires knowledge about all atom positions and atom species in the scattering volume. For example the online Protein Data Bank (PDB; Berman et al., 2000[Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235-242.]) is a resource that provides a wide range of structures at atomic resolution. The structure can be provided either by a list of coordinates and atomic numbers or by a PDB file or PDB ID code.

To define a refractive index map Condor accepts a three-dimensional array of data points on a cubic Cartesian grid or the geometrical parameters of a sphere or spheroid. The map values can be refractive indices, electron densities or atom densities. For the last two, formulas (9[link]) or (10[link]) are used for the conversion to refractive indices. Condor interfaces to the Electron Microscopy Data Bank (EMDB; Lawson et al., 2011[Lawson, C. L. et al. (2011). Nucleic Acids Res. 39, D456-D464.]), from which density maps can be retrieved. The orientation of the particle is defined by an extrinsic rotation. The rotation can be defined by either a triple of Euler angles, a rotation matrix or a quaternion. Multiple particles at different positions in the beam can be simulated as well. The configuration of the pixel detector determines the position of all pixels in space with respect to the interaction point. The detector noise, the fluctuating beam tilt, the saturation level, a missing data mask etc. may also be specified.
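The rotation representations mentioned above are interchangeable. As an illustration, the sketch below converts a unit quaternion into a rotation matrix and applies it to particle coordinates in the fixed laboratory frame. The component order (w, x, y, z) and the function name are assumptions made for this example and do not necessarily match the conventions used by Condor.

```python
import numpy as np

def quaternion_to_matrix(quat):
    """Rotation matrix of a quaternion (w, x, y, z); it is normalised first."""
    w, x, y, z = np.asarray(quat, dtype=float) / np.linalg.norm(quat)
    return np.array([
        [1 - 2 * (y**2 + z**2), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x**2 + z**2), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x**2 + y**2)],
    ])

# Rotation by 90 degrees about the z axis
angle = np.pi / 2
quat = [np.cos(angle / 2), 0.0, 0.0, np.sin(angle / 2)]
R = quaternion_to_matrix(quat)

# Rotate a set of coordinates (one position per row) in the laboratory frame
positions = np.array([[1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])
rotated = positions @ R.T
```

In this example the x axis is mapped onto the y axis while the z axis is left unchanged.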

The default way of carrying out a Condor simulation is by calling the executable condor from a folder that contains a configuration file named condor.conf. Fig. 4[link] shows two example configuration files, one for the calculation with an atomic model (Fig. 4[link]a) and one for the calculation with a refractive index model (Fig. 4[link]b). Every configuration file is subdivided into at least three sections [X-ray source, sample particle(s), pixel array detector]. All quantities follow the convention of the International System of Units. If a parameter is unspecified it is set to a default value. At the end of execution the results are written to an HDF5 file. The acronym HDF5 stands for Hierarchical Data Format version 5 (The HDF Group, 2016[The HDF Group (2016). Hierarchical Data Format. Version 5. https://www.hdfgroup.org/HDF5/.]), which is a widely used file format for scientific applications and ensures high portability and performance. The data structure within the file follows the guidelines for the Coherent X-ray Imaging file format (Maia, 2012[Maia, F. R. N. C. (2012). Nat. Methods, 9, 854-855.]).

[Figure 4]
Figure 4
Configuration files for simulation with (a) an atomic and (b) a refractive index model. The source section defines the illumination properties, the particle section the sample properties and the detector section the parameters for the area pixel detector. Many parameters are optional and are set to default values if not specified. These configuration files together with the required structure files are included in the online repository of Condor and are located in the folder examples_publication.

The two example configuration files shown in Fig. 4[link] define experimentally feasible configurations at the Linac Coherent Light Source (LCLS). The selected particle structures are the GroEL–GroES protein complex (Fig. 4[link]a) and the poliovirus particle (Fig. 4[link]b). The structure for the GroEL–GroES protein complex is taken from the atom positions of PDB entry 1aon (Xu et al., 1997[Xu, Z., Horwich, A. L. & Sigler, P. B. (1997). Nature, 388, 741-750.]). The poliovirus particle is modelled by the density map derived from EMDB entry 1144 (Bubeck et al., 2005[Bubeck D., Filman, D. J., Cheng, N., Steven, A. C., Hogle, J. M. & Belnap, D. M. (2005). J. Virol. 79, 7745-7755.]). We converted the EMDB map to electron densities using experimentally determined values for atomic composition (Molla et al., 1991[Molla, A., Paul, A. & Wimmer, E. (1991). Science, 254, 1647-1651.]) and mass density (Dans et al., 1966[Dans, P. E., Forsyth, B. R. & Chanock, R. M. (1966). J. Bacteriol. 91, 1605-1611.]) of poliovirus virions. Simulated results from these examples are shown in Fig. 5[link].

[Figure 5]
Figure 5
Simulation of diffraction patterns for two biological structures: (a) the GroEL–GroES complex and (b) the poliovirus particle. For each model the projection image is shown on the left and the noise-free intensity pattern on the right.

Condor provides not only intensities but also phases. Here the curvature of the Ewald sphere is small, and hence projection images in real space (left column in Fig. 5[link]) can be readily calculated by inverse Fourier transforming the scattering amplitudes.
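In the flat-Ewald-sphere limit this step amounts to a two-dimensional inverse FFT. The sketch below assumes that the amplitudes are stored as a two-dimensional complex array with [{\bf q} = 0] at the array centre; the function names are hypothetical and the disc-shaped test object is only used to demonstrate the round trip.

```python
import numpy as np

def projection_image(amplitudes):
    """Inverse Fourier transform of centred scattering amplitudes."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(amplitudes)))

def forward_amplitudes(image):
    """Centred Fourier transform of a projection image (flat Ewald sphere)."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))

# Round trip with a disc-shaped test projection
n = 64
y, x = np.indices((n, n)) - n // 2
disc = (x**2 + y**2 <= 10**2).astype(complex)

amplitudes = forward_amplitudes(disc)
recovered = projection_image(amplitudes)
```

Such a direct inversion is only possible because the simulation provides the phases; in an experiment only the intensities are measured and the phases have to be retrieved.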

For a more customizable use, Condor's application programming interface (API) can be called directly from any Python software. The Condor engine can thus be easily integrated into any software tool or pipeline that relies on simulated diffraction data. An example of a script that uses the Condor API is shown in Fig. 6[link]. Projection images and diffraction patterns that were generated with this script are presented in Fig. 7[link]. The script simulates an experiment where spheroidal water droplets contaminate the particle stream of GroEL–GroES protein complexes. Both particle species arrive in the scattering volume in random orientations and at random positions. The arrival statistics are modelled by a Poisson process with arrival rates of 0.2 for the water droplets and 0.9 for the protein complexes. The water droplets are not simulated as perfectly reproducible structures but as spheroids of varying size and shape. This is reflected in the model by size parameters that follow a normal distribution centred at 8 nm and values of the flattening parameter that follow a uniform distribution between 0.8 and 1.0.
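The statistical model of this example can be sketched independently of the Condor API with NumPy's random number generator. The function `sample_shot` and its dictionary layout are hypothetical, and the width of the normal size distribution (1 nm) is an assumed value, since only its centre is stated above.

```python
import numpy as np

def sample_shot(rng, rate_droplets=0.2, rate_proteins=0.9,
                size_mean=8e-9, size_sigma=1e-9,   # sigma is an assumed value
                flattening_range=(0.8, 1.0)):
    """Draw the particle content of a single shot."""
    n_droplets = rng.poisson(rate_droplets)        # Poisson arrival statistics
    n_proteins = rng.poisson(rate_proteins)
    droplets = [
        {"size": rng.normal(size_mean, size_sigma),
         "flattening": rng.uniform(*flattening_range)}
        for _ in range(n_droplets)
    ]
    return {"n_proteins": n_proteins, "droplets": droplets}

rng = np.random.default_rng(42)
shots = [sample_shot(rng) for _ in range(10000)]

mean_proteins = np.mean([s["n_proteins"] for s in shots])
mean_droplets = np.mean([len(s["droplets"]) for s in shots])
```

With these rates a considerable fraction of the shots contains no protein complex at all, and some shots contain several particles at once, which is part of what makes such simulated data sets realistic test cases.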

[Figure 6]
Figure 6
Customized use of Condor by direct interaction with the Python API. The script simulates diffraction patterns from a mixture of GroEL–GroES complex and spheroidal water droplets of varying shape and size. A missing data region and Poisson noise are taken into account for the intensity estimate at the detector pixels. After the initialization, patterns are simulated sequentially in a loop by calling the method propagate() and appending the results to an HDF5 file.

[Figure 7]
Figure 7
Simulation of diffraction patterns for a mixture of two particle species: the GroEL–GroES complex and spheroidal water droplets. (a) Real-space projection images and (b) respective simulated diffraction intensity patterns with Poisson noise and a pixel mask. The physical parameters resemble the conditions at the AMO beamline at the LCLS.

4. Implementation

Condor is a Python package including C extensions for the computationally heavy operations. For the calculation of the discrete Fourier transform in equations (6[link]) and (11[link]), Condor makes use of the non-equispaced fast Fourier transform (NFFT) C library (Keiner et al., 2009[Keiner, J., Kunis, S. & Potts, D. (2009). ACM Trans. Math. Softw. 36, 1-30.]). This library provides routines to calculate the discrete Fourier transform at non-equispaced points, for example on the curved surface of the Ewald sphere. For the refractive index model Condor deploys the common NFFT algorithm, which still requires equispaced sampling in the real-space domain. For the atomic model the generalized NNFFT algorithm is used, as it allows for non-equispaced sampling in both domains. The computation of the sums in the discrete Fourier transform can benefit from parallelization. Compilation with OpenMP (https://openmp.org) allows for an easy parallelization with moderate speed-ups. Diffraction from atomic models is normally more computationally demanding and here Condor supports the use of CUDA-capable graphics cards (https://nvidia.com/cuda), which can provide a drastic increase in performance.

Computation times were measured for the simulations of the examples shown in Figs. 4[link] and 5[link], which were carried out on a MacBook Pro computer [2.5 GHz Intel Core i7 (4 cores, 8 threads), 16 GB 1600 MHz DDR3] equipped with a CUDA-capable graphics card (NVIDIA GeForce GT 750M, 2048 MB memory). The atomic model consisted of 58 870 atom positions, and diffraction was predicted at 256 × 256 detector pixels. Using a single CPU and with CUDA disabled the calculation took 208 s. Enabling CUDA resulted in a computation time of 3 s, giving a speed-up of 69.3×. The refractive index map consisted of 173 × 173 × 173 voxels, and diffraction was predicted at 512 × 512 detector pixels. Using a single CPU the calculation took 19 s, and using four CPU threads it took 6.8 s, resulting in a speed-up of 2.8×.

Fig. 8[link](a) illustrates the representation of an experiment in Condor as a Python object. It contains a source object, one or several particle objects, and a detector object. The experiment object has a method propagate() that starts the simulation of a single shot and returns the results in the form of a Python dictionary.

[Figure 8]
Figure 8
Implementation architecture. (a) An experiment is represented in Condor as a Python object that includes a source object, one or several sample particle objects, and a detector object. A call of the method propagate() starts a simulation and returns the diffraction data. (b) The web version of Condor is realized as a hierarchical client–server model. The web server provides a dynamic web page under the address https://lmb.icm.uu.se/condor. Under this page users can configure their experiment, and upon submission data are validated and then cached in a database. Simulation jobs are scheduled by a scheduling server that manages a network of worker clients. This worker farm is dynamically extended and shrunk depending on the number of requests. After completion of a simulation the web server presents previews and links to download to the user.

As an alternative to a local installation, Condor is also provided as a web application (Fig. 1[link]) that supports most of the functionality of the full package. In the left panel of the web application one can configure the X-ray source, sample particle and detector. The upper right panel is used to submit simulation requests and monitor their progress. After a simulation has finished its results can be previewed and downloaded from the bottom right panel.

The web implementation of Condor is based on the Django (https://www.djangoproject.com/) web framework and uses a database for caching user inputs. The system is hosted by the Davinci GPU computer cluster of the Laboratory of Molecular Biophysics (Uppsala University, Sweden).

The architecture of the server–client model of the web implementation is illustrated in Fig. 8(b). When a user submits a simulation request, the web server first checks the input. If the input passes validation, the web server sends the request to the Condor server (the scheduling server), which manages a number of Condor worker clients. The first worker client that becomes available starts the Condor simulation. The number of worker clients is dynamically adjusted to the current load of the web page, such that at least one worker client is always available for processing a simulation request. The hierarchical architecture keeps the servers responsive even when multiple simulations are running simultaneously. While a simulation is running, the scheduling server monitors its progress. When the simulation has finished, the results are sent to the web server, which presents the user with previews and links for downloading the results as an HDF5 file.
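The scheduling policy described above can be illustrated with a deterministic toy model. This is a sketch under stated assumptions, not the code running on the Condor servers: the function validate(), the class SchedulingServer, the validation rule and the result file name are all hypothetical. The sketch shows only the bookkeeping that keeps one idle worker available while the worker farm grows and shrinks with the load.

```python
def validate(request: dict) -> bool:
    """Web-server-side input check (hypothetical rule: positive numbers only)."""
    return bool(request) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) and v > 0
        for v in request.values()
    )


class SchedulingServer:
    """Toy scheduler that always keeps exactly one idle worker in reserve."""

    def __init__(self) -> None:
        self.idle_workers = 1
        self.running = {}  # job id -> request being simulated
        self.next_job_id = 0

    @property
    def total_workers(self) -> int:
        return self.idle_workers + len(self.running)

    def submit(self, request: dict) -> int:
        """Hand the request to the idle worker, then extend the farm."""
        job_id = self.next_job_id
        self.next_job_id += 1
        self.idle_workers -= 1
        self.running[job_id] = request
        if self.idle_workers == 0:
            self.idle_workers = 1  # start a new worker for the next request
        return job_id

    def finish(self, job_id: int) -> dict:
        """Collect the result and shrink the farm back to one idle worker."""
        self.running.pop(job_id)
        self.idle_workers = 1  # the freed worker is surplus and is retired
        return {"job_id": job_id, "download": f"job_{job_id}.h5"}  # made-up name


def handle_submission(server: SchedulingServer, request: dict):
    """Web-server entry point: validate first, schedule only valid input."""
    if not validate(request):
        return None
    return server.submit(request)


server = SchedulingServer()
rejected = handle_submission(server, {"wavelength": -1.0})
job_a = handle_submission(server, {"wavelength": 1e-9})
job_b = handle_submission(server, {"wavelength": 2e-9})
workers_under_load = server.total_workers
result_a = server.finish(job_a)
workers_after_finish = server.total_workers
```

In this sketch two running jobs occupy two workers plus one spare, and the farm shrinks to one busy and one idle worker once the first job has finished.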

5. Conclusion

FXI experiments at free-electron laser facilities are expensive and beam time is precious. Easy-to-use software can help researchers to improve data quality and can simplify data analysis. The software Condor is a fast simulation tool specialized for FXI research and covers a wide range of use cases and functionalities. Practically anybody is able to use Condor because of its simple structure and because common hurdles such as limited cross-platform compatibility or demanding hardware requirements have been avoided by making key features available through a web application. We, the developers, encourage and support the integration of the code into other software that relies on simulated FXI data. Reusability of the source code is facilitated by the availability of a simple and flexible Python API and by the distribution of the code under the Simplified BSD license.

Beyond its relevance in research, Condor may be a useful educational software tool. Students may gain an understanding of the laws of X-ray diffraction by studying how the diffraction pattern changes as experimental parameters are varied. Moreover, entire experimental data sets can be readily simulated by the students themselves, who may then be invited to pursue a reconstruction from the simulated data.

In conclusion, Condor will enhance and stimulate collaborative activities in software development within the FXI community. Furthermore, the software will underpin efforts in FXI education, experiment planning, the conduct of experiments, algorithm development and data validation.

Footnotes

1This article will form part of a virtual special issue of the journal on free-electron laser software.

References

Als-Nielsen, J. & McMorrow, D. (2001). Elements of Modern X-ray Physics. New York: Wiley.
Aquila, A. et al. (2015). Struct. Dyn. 2, 041701.
Ayyer, K., Geloni, G., Kocharyan, V., Saldin, E., Serkez, S., Yefanov, O. & Zagorodnov, I. (2015). Struct. Dyn. 2, 041702.
Bergh, M., Huldt, G., Tîmneanu, N., Maia, F. R. N. C. & Hajdu, J. (2008). Q. Rev. Biophys. 41, 181–204.
Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235–242.
Bogan, M. J. et al. (2008). Nano Lett. 8, 310–316.
Brown, P. J., Fox, A. G., Maslen, E. N., O'Keefe, M. A. & Willis, B. T. M. (2006). International Tables for Crystallography, Vol. C, 1st online ed., edited by E. Prince, ch. 6.1. Chester: International Union of Crystallography.
Bubeck, D., Filman, D. J., Cheng, N., Steven, A. C., Hogle, J. M. & Belnap, D. M. (2005). J. Virol. 79, 7745–7755.
Chapman, H. N., Barty, A. et al. (2006). Nat. Phys. 2, 839–843.
Chapman, H. N. & Nugent, K. A. (2010). Nat. Photon. 4, 833–839.
Dans, P. E., Forsyth, B. R. & Chanock, R. M. (1966). J. Bacteriol. 91, 1605–1611.
Daurer, B. J., Hantke, M. F., Nettelblad, C. & Maia, F. R. N. C. (2016). J. Appl. Cryst. 49, 1–6.
Ekeberg, T. et al. (2015). Phys. Rev. Lett. 114, 098102, 1–6.
Feigin, L. A. & Svergun, D. I. (1987). Structure Analysis by Small-Angle X-ray and Neutron Scattering. New York: Plenum Press.
Hamzeh, F. M. & Bragg, R. H. (1974). J. Appl. Phys. 45, 3189–3195.
Hantke, M. F. et al. (2014). Nat. Photon. 8, 943–949.
Henke, B. L., Gullikson, E. & Davis, J. (1993). At. Data Nucl. Data Tables, 54, 181–342.
Keiner, J., Kunis, S. & Potts, D. (2009). ACM Trans. Math. Softw. 36, 1–30.
Lawson, C. L. et al. (2011). Nucleic Acids Res. 39, D456–D464.
Loh, N. D. et al. (2012). Proc. SPIE, 8504, 850403.
Maia, F. R. N. C. (2012). Nat. Methods, 9, 854–855.
Molla, A., Paul, A. & Wimmer, E. (1991). Science, 254, 1647–1651.
Neutze, R., Wouts, R., van der Spoel, D., Weckert, E. & Hajdu, J. (2000). Nature, 406, 752–757.
Paganin, D. M. (2006). Coherent X-ray Optics. Oxford University Press.
Schot, G. van der et al. (2015). Nat. Commun. 6, 5704.
Seibert, M. M. et al. (2011). Nature, 470, 78–81.
Serkez, S., Kocharyan, V. & Saldin, E. (2013). 35th International Free-Electron Laser Conference, pp. 574–582. Red Hook: Curran Associates.
The HDF Group (2016). Hierarchical Data Format. Version 5. https://www.hdfgroup.org/HDF5/
Winn, M. D. et al. (2011). Acta Cryst. D67, 235–242.
Xu, Z., Horwich, A. L. & Sigler, P. B. (1997). Nature, 388, 741–750.
Yefanov, O. M. & Vartanyants, I. A. (2013). J. Phys. B At. Mol. Opt. Phys. 46, 164013.
Yoon, C. H., Yurkov, M. V., Schneidmiller, E. A., Samoylova, L., Buzmakov, A., Jurek, Z., Santra, R., Loh, N. D. & Mancuso, A. P. (2015). Sci. Rep. 6, 24791.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
