research papers
pinkIndexer – a universal indexer for pink-beam X-ray and electron diffraction snapshots
^{a}Center for Free-Electron Laser Science, Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg, Germany, ^{b}Vision Systems, Hamburg University of Technology, 21071 Hamburg, Germany, ^{c}Department of Physics, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany, and ^{d}The Hamburg Center for Ultrafast Imaging, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
^{*}Correspondence email: yaroslav.gevorkov@desy.de
A crystallographic indexing algorithm, pinkIndexer, is presented for the analysis of snapshot diffraction patterns. It can be used in a variety of contexts including measurements made with a monochromatic radiation source, a polychromatic source or with radiation of very short wavelength. As such, the algorithm is particularly suited to automated data processing for two emerging measurement techniques for macromolecular structure determination, serial pink-beam X-ray crystallography and serial electron crystallography, which until now lacked reliable programs for analyzing many individual diffraction patterns from crystals of uncorrelated orientation. The algorithm requires approximate knowledge of the unit-cell parameters of the crystal, but not the wavelengths associated with each Bragg spot. The use of pinkIndexer is demonstrated by obtaining 1005 lattices from a published pink-beam serial crystallography data set that had previously yielded 140 indexed lattices. Additionally, in tests on experimental serial crystallography diffraction data recorded with quasi-monochromatic X-rays and with electrons, the algorithm indexed more patterns than other programs tested.
Keywords: indexing; pinkIndexer; CrystFEL; pink X-ray beam; serial electron diffraction.
1. Introduction
Protein crystallography is a vibrant and continually evolving field spurred by the development of new radiation sources, detectors, measurement techniques and analysis methods. One example is the relatively recent development of serial crystallography using femtosecond-duration X-ray pulses from free-electron lasers (SFX), which is suited to the study of micron-sized and smaller macromolecular crystals (Chapman et al., 2011; Boutet et al., 2012; Schlichting, 2015; Gati et al., 2017). With pulses that outrun atomic motions initiated by photoabsorption, doses may far exceed conventional radiation damage limits to provide structures of radiation-sensitive proteins free of obvious radiation damage, permitting time-resolved studies of biomolecular dynamics at physiologically relevant temperatures (Suga et al., 2014; Tenboer et al., 2014; Kang et al., 2015; Pande et al., 2016; Stagno et al., 2016). The approach of measuring only a single snapshot diffraction pattern from each of many crystals also allows for a lower overall exposure per crystal, and hence lower doses than would be accrued in conventional rotation measurements. This, and the potential for high-throughput measurements, has motivated the development of serial crystallography at synchrotron radiation facilities (Stellato et al., 2014; Nogly et al., 2015) and using electron microscopes (Smeets et al., 2018; Bücker et al., 2019).
The speed of serial crystallography measurements, and often the corresponding consumption of sample, is primarily limited by the radiation fluence on the sample and the detector frame rate. At synchrotron radiation sources, higher fluences can be obtained by forgoing the monochromator and using a polychromatic beam. Combined with the enhanced coverage of reciprocal space, a moderate bandwidth of the order of a few per cent (referred to as a `pink' beam) may offer the additional advantage of fewer necessary diffraction patterns for a serial crystallography measurement (White et al., 2013; Dejoie et al., 2013). For example, Meents et al. (2017) demonstrated room-temperature serial crystallography with 100 ps exposure times using the full spectrum of an undulator harmonic (5% relative bandwidth), with the ability to determine structures from as few as 50 indexed diffraction patterns. However, the automated analysis of pink-beam diffraction patterns has been found to be problematic, with only 15% of patterns successfully indexed in the demonstration of Meents et al. (2017). We were therefore motivated to create a new robust algorithm to index snapshot diffraction patterns recorded with a quasi-collimated beam of arbitrary bandwidth, with the requirement to index weak or incomplete patterns, using approximately known unit-cell parameters. In meeting this goal, we produced an algorithm, `pinkIndexer'. We found that pinkIndexer can also be applied to several other data collection methods. In addition to superior performance in processing pink-beam diffraction compared with the state-of-the-art algorithms, the algorithm indexes more patterns in monochromatic serial crystallography data sets than all other programs tested, and is successful in indexing snapshot crystal diffraction patterns recorded with electrons.
The determination of the 3D macromolecular structure requires the measurement of diffraction intensities at reciprocal-lattice points throughout a volume of reciprocal space, yet a single snapshot diffraction pattern accesses just a cut through this space. Conservation of photon energy and momentum dictates that for a specific X-ray wavelength this cut is given by a spherical surface, the Ewald sphere, which passes through the origin of reciprocal space. In a serial diffraction experiment the reciprocal-space volume (reduced by the symmetry of the crystal) is sampled by many such patterns recorded from crystals at various random orientations. Usually the orientation of each crystal is initially unknown, and therefore so too is the orientation of its reciprocal lattice. A key analysis step is to identify the crystal orientation, which is equivalent to providing the correct indices to the observed diffraction spots. Furthermore, the distribution of crystal orientations is usually assumed to be random, precluding the use of correlations between successive patterns to deduce the crystal orientations. In the case of a broad-bandwidth X-ray beam, indexing is complicated by the uncertainty of the particular incident wavelength that gave rise to a given Bragg spot, while for electron diffraction the short wavelength results in an almost flat Ewald sphere for which the determination of unknown 3D lattice parameters is ill conditioned. Indeed, the main bottleneck in pink-beam and electron serial crystallography analysis has been the indexing step.
Automatic indexing algorithms implemented in widely used software including MOSFLM (Powell, 1999), XDS (Kabsch, 1993, 2010) and DirAx (Duisenberg, 1992) were originally devised for data collected in a rotation series with monochromatic radiation. They typically perform poorly when presented with individual pink-beam or electron snapshot diffraction patterns due to their reliance on the particular conditions of monochromatic rotation measurements.
Recent algorithms designed for indexing snapshot diffraction patterns encountered in serial crystallography include TakeTwo (Ginn et al., 2016), FELIX (Beyerlein et al., 2017) and XGANDALF (Gevorkov et al., 2019). These all assume monochromatic radiation and do not fare much better than other indexers when processing polychromatic diffraction patterns. Several indexing approaches have been developed for polychromatic crystal diffraction, also referred to as Laue diffraction (Moffat, 1997). These include an approach due to Jacobson (1986) that requires the use of an energy-resolving position-sensitive detector; the Daresbury software suite for indexing Laue patterns (Helliwell et al., 1989; Campbell et al., 1998) and the Precognition software (Ren et al., 1999) based on searching arcs of reflections so that prominent zone axes can be identified; geometric approaches of Carr et al. (1993) and Wenk et al. (1997); and the LaueUtil toolkit (Kalinowski et al., 2011) which carries out a clustering analysis of possible orientations that map lattice vectors to observed peaks. The latter algorithm requires measurements of a crystal at several known relative orientations and is therefore not suited to serial crystallography. Of these, the current state-of-the-art software for indexing single wide-bandwidth diffraction patterns of macromolecular crystals is Precognition. However, while this works well for patterns recorded with a very wide spectrum (e.g. that of a wiggler or bending magnet where the bandwidth is more than 10% of the nominal X-ray energy), it becomes less reliable as the number of Bragg spots decreases, as occurs with either reduced spectral width (less than 5% of the nominal X-ray energy) or with small crystals, where only several tens of Bragg reflections are observed.
Here we present the principles and performance of our general indexing algorithm, pinkIndexer. As described in Section 2, the algorithm maps observed Bragg reflections into trajectories of possible lattice orientations. The most likely orientation is then determined as the orientation in which most trajectories intersect. As such, pinkIndexer covers the cases of monochromatic serial X-ray crystallography, X-ray crystallography using the unmodified spectrum of an undulator of 1% to 25% bandwidth and approximately 1 Å wavelength, and serial electron crystallography at approximately 0.01 Å wavelength. These cases are evaluated in Section 3. The algorithm can be employed in automated processing of serial crystallography data sets, for example using the CrystFEL software suite (White et al., 2012; White, 2019).
2. The pinkIndexer algorithm
2.1. Diffraction geometry
Consider elastic scattering from an object by a plane monochromatic wave characterized by a wavevector $\mathbf{k}_{\text{in}}$ with a wavelength $\lambda = 1/|\mathbf{k}_{\text{in}}|$. In the kinematic approximation, the strength of scattering in a direction $\mathbf{k}_{\text{out}}$ is given by the magnitude of the Fourier component of the object at a spatial frequency equal to the momentum transfer $\mathbf{q} = \mathbf{k}_{\text{out}} - \mathbf{k}_{\text{in}}$ (James, 1950). Energy conservation ($|\mathbf{k}_{\text{out}}| = |\mathbf{k}_{\text{in}}|$) confines the observable spatial frequencies to the Ewald sphere, shown as a circle in Fig. 1. Each pixel in a detector placed in the far field measures a particular direction given by the unit vector $\hat{\mathbf{k}}_{\text{out}}$, which unambiguously maps to a point in reciprocal space. The spatial frequency spectrum of a crystal of infinite extent is a lattice of points that are commonly referred to as reciprocal-lattice points (RLPs), shown as black dots in Fig. 1. As can be seen in that figure, a diffraction spot observed in a particular direction is unambiguously mapped to a particular RLP (green dot in Fig. 1).
Consider now the case where the radiation source emits a finite but continuous distribution of wavelengths within some known range. Instead of a single Ewald sphere as in the case of monochromatic illumination, each incident wavelength produces an Ewald sphere with a radius inversely proportional to that wavelength. Thus a volume of reciprocal space can be excited in a diffraction experiment, contributing to the 2D diffraction pattern. This volume is depicted in Fig. 2, bounded by the red and blue Ewald spheres (longest to shortest wavelengths in the range). There is a significant difference to the monochromatic case: with a polychromatic source, a particular scattering direction no longer maps to a single point in reciprocal space. There may be many diffracted wavevectors, each with a different wavelength (and hence different wavevector magnitude and different placement in the Ewald-sphere construction), but pointing in the same direction and thus arriving at the same point on the detector. These wavevectors are depicted by the red, purple and blue arrows in Fig. 2. Turning this around, for a given diffraction direction $\hat{\mathbf{k}}_{\text{out}}$, there are many points in reciprocal space that contribute to the diffracted intensity. All these points lie on a straight line segment (green line in Fig. 2), the extension of which passes through the origin of reciprocal space. The line segment can be described by $\mathbf{q} = (\hat{\mathbf{k}}_{\text{out}} - \hat{\mathbf{k}}_{\text{in}})/\lambda$, with $\lambda$ ranging over the wavelengths of the source spectrum.
We therefore see that, in the case of broad bandwidth, a point on the detector integrates signal from a line segment in reciprocal space, in contrast to a single point in the monochromatic case. The RLP which generates a Bragg peak observed at some position on the detector may therefore lie anywhere on the corresponding line segment. We call this line segment the uncertainty line segment (ULS), shown in green in Fig. 2. The main challenge for analyzing broad-bandwidth snapshot crystal diffraction patterns is to determine where along the ULS is the RLP which generated the observed Bragg peak. This is equivalent to identifying the wavelength that excited the measured RLP. Note that if more than one RLP lies on the ULS, they will contribute to the observed intensity, excited by different wavelengths. The bandwidth in that case would be too broad to distinguish those particular reflections in the peak-finding stage without energy-resolving detectors. It is nevertheless possible to separate the summed intensities after indexing and integration (Zurek et al., 1985; Shrive et al., 1990).
Since the crystal orientation is not known, and thus the orientation of the reciprocal lattice is also not known, candidate RLPs may lie anywhere in the volume between shells centered at the origin with radii set by the scattering direction and range of wavelengths, as depicted by the dashed circles in Fig. 2. We call the RLPs that can match a ULS by rotation of the reciprocal lattice `candidate RLPs' (candidates to predict the particular Bragg spot). The candidate RLPs are plotted in dark green in Fig. 2.
2.2. Determining the crystal orientation
The task of indexing is to find the crystal orientation which gives rise to a particular measured diffraction pattern and then to assign indices to predicted reflection locations. In practice this is achieved by finding the crystal orientation which best predicts the set of Bragg peaks observed on the detector. We assume that the unit-cell parameters of the crystal are known. pinkIndexer determines the likely crystal orientation as follows.
(i) For each Bragg spot observed on the detector, find all RLPs of a crystal that can be intersected by the Bragg spot's ULS by rotation around the origin (candidate RLPs).
(ii) For each observed Bragg spot, find all rotations of the crystal that place at least one candidate RLP onto the corresponding ULS. This is equivalent to finding all orientations of the reciprocal lattice that could predict the measured Bragg spot.
(iii) Find the orientation which predicts the most Bragg spots from the list of candidate orientations for all Bragg peaks observed in the pattern. The orientation which correctly predicts the most observed Bragg spots will be the chosen indexing solution.
(iv) Refine the lattice parameters and other experimental parameters to further improve the agreement of predicted and observed Bragg peaks [if the original parameters were not accurate, one could repeat steps (i) to (iv) using the refined parameters].
Once the crystal orientation is determined it is of course possible to predict the location and wavelength of all potential reflections including absent or weak reflections not present in the set of observed Bragg peaks. These can then be included in the integration of the observed intensities for structure determination.
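Step (i) can be illustrated with a minimal Python sketch. It is not the pinkIndexer implementation: the function names, the `hkl_max` search bound and the convention that a wavevector has length 1/λ (so that the ULS is the set of points (k̂_out − k̂_in)/λ) are assumptions made for this example. A candidate RLP is taken to be any lattice point whose distance from the origin lies between the lengths of the two ULS end points, since only such a point can be rotated onto the ULS.

```python
import itertools
import math


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def uls_endpoints(k_out_hat, k_in_hat, lam_min, lam_max):
    """End points of the uncertainty line segment for one detector direction.

    Assumes |k| = 1/lambda, so q = (k_out_hat - k_in_hat) / lambda.
    Returns (q_short, q_long): the longest wavelength gives the shortest q.
    """
    d = tuple(o - i for o, i in zip(k_out_hat, k_in_hat))
    q_short = tuple(c / lam_max for c in d)
    q_long = tuple(c / lam_min for c in d)
    return q_short, q_long


def candidate_rlps(basis, q_short, q_long, hkl_max=5):
    """Miller indices whose RLP length lies between the ULS end-point lengths.

    basis: three reciprocal basis vectors (a*, b*, c*) in any fixed
    reference orientation; only the lengths of the RLPs matter here.
    hkl_max: hypothetical bound on |h|, |k|, |l| for the brute-force search.
    """
    r_min, r_max = norm(q_short), norm(q_long)
    a, b, c = basis
    found = []
    for h, k, l in itertools.product(range(-hkl_max, hkl_max + 1), repeat=3):
        if (h, k, l) == (0, 0, 0):
            continue
        rlp = tuple(h * a[i] + k * b[i] + l * c[i] for i in range(3))
        if r_min <= norm(rlp) <= r_max:
            found.append((h, k, l))
    return found
```

For a cubic cell with a 0.1 Å⁻¹ reciprocal spacing, a beam along z and a wavelength range of 0.99 to 1.05 Å, the direction (0.6, 0, 0.8) selects only reflections with h² + k² + l² between 37 and 40. A broader bandwidth widens the shell and increases the number of candidates per spot.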
The main challenge lies in making the search outlined above tractable and robust. As we will now discuss, for each candidate RLP there is an infinite set of reciprocal-lattice rotations which place it onto its particular ULS. We identify all rotations in this family by constructing a rotation operation in two steps: first the reciprocal lattice is rotated such that the vector of the RLP is rotated by an angle π around the axis that bisects the RLP vector and the direction of the ULS, as shown in Fig. 3. This rotation brings the candidate RLP onto the ULS. Next, the reciprocal lattice is rotated around the direction of the ULS by a rotation of ϕ (see Fig. 3). Since the rotated candidate RLP lies on the ULS it is invariant to the second rotation and thus all rotations ϕ are potential orientations of the lattice. This construction is only valid for one particular candidate RLP and a particular ULS. The particular RLP might not actually give rise to the Bragg spot which generated the ULS. That is, none of the orientations of the lattice generated by the operations might be the indexing solution. In a triclinic lattice only one candidate RLP, as well as other RLPs lying on the same straight line through the origin, generate the correct indexing solution. In lattices with higher symmetry, multiple candidate RLPs generate correct indexing solutions.
To determine common orientations that bring a number of RLPs onto ULSs we define a 3D rotation space that contains curves parameterized by the bisector axis, the ULS direction and the rotation angle ϕ, satisfying the conditions stated above that place a particular candidate RLP onto a particular ULS (corresponding to an observed Bragg spot). The rotation space is 3D since it is spanned by three variables describing rotations, such as the three Euler angles. In this 3D space, all candidate RLPs for a particular Bragg spot will form a set of nonintersecting curves. We call this collection a rotogram. By combining rotograms for all measured Bragg spots, a total rotogram for a diffraction pattern is formed, depicted schematically in Fig. 4. The point in a rotogram with the highest density of overlapping curves provides the lattice orientation that predicts most of the observed Bragg spots. This point represents the rotation of the lattice onto that of the measured crystal, i.e. it is the indexing solution. The task of crystal orientation determination is therefore now reduced to one of finding the point in rotation space with the largest number of intersecting curves.
2.3. Algorithm details
In practice, many additional issues arise when dealing with data from a real experiment that complicate the indexing process, such as spurious intensity peaks resulting from experimental noise, or multiple crystals in the beam contributing to the same diffraction pattern. The robustness of the algorithm to these factors becomes a critical issue.
pinkIndexer uses the same basic approach as another indexing method for monochromatic crystal diffraction patterns, FELIX (Beyerlein et al., 2017), which similarly parameterizes possible orientations as curves in a 3D rotation space. Both methods are similar to the Hough and Radon transforms that operate on 2D parameter spaces. With such approaches, the choice of the mapping function is crucial for the performance and simplicity of the algorithm. Well-known mappings of 3D rotations to a 3D space are: the Euler-angles representation, the axis-angle representation, the Gibbs representation and the modified Rodrigues parameters (Terzakis et al., 2018). We employ a novel mapping function by which we achieve a drastic reduction of complexity and the number of necessary parameters compared with FELIX (which uses Rodrigues parameters), while at the same time increasing the noise tolerance. The following features of the transform are desired for robustness and efficient construction of the rotogram: (a) adjacent voxels in the rotogram correspond to similar rotations; (b) rotations are distributed uniformly across a volume of the 3D rotation space; (c) the results of the transformation are efficiently discretized in cuboid samples (i.e. on an orthogonal lattice); (d) the transform is calculable in an efficient way.
Since none of the well-known examples sufficiently fulfill these requirements, we propose another transform that better fulfills the requirements and is the major factor in the quality of the pinkIndexer algorithm. In our scheme, a single rotation operation is determined from the composite rotation . This rotation then is mapped to the point in the rotogram given by , where is the rotation axis, θ is the rotation angle and is a nonlinear scaling factor. Compared with the well-known axis-angle representation which maps a rotation to (i.e. the length of the vector encodes the rotation angle), this definition only slightly increases the computational burden and inherits its property of adjacent voxels corresponding to similar rotations. The nonlinear scaling of rotation angle to the length of the vector gives a more uniform distribution of points in the rotogram than the axis-angle representation. Our transform maps all possible rotations to a finite-size ball of radius and is in some sense the opposite of the modified Rodrigues parameters, .
The construction of from is achieved in a computationally inexpensive way by employing the composition law for finite rotations first derived by Olinde Rodrigues (Altmann, 1989; Pujol, 2013). This describes the consecutive operations of two general rotations and to give by solving
For our problem, the first rotation axis is the bisector , and is the direction of the ULS. This choice of allows setting such that the equations simplify to
Setting the parameters , , and we obtain
which can be solved as
For machines where 1/x^{1/2} is implemented in hardware, replacing by 1/[1(c_{1}d_{1})^{2}]^{1/2} can lead to faster execution.
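The two-step construction (a rotation by π about the bisector, followed by a rotation by ϕ about the ULS direction) can be checked numerically with a short sketch. This is not the closed-form solution referred to above: it composes the two rotations with unit quaternions, which is equivalent to the Rodrigues composition law. All function names are hypothetical, and the scaling arctan(θ/4) in `rotogram_point` is an illustrative assumption, chosen only as the inverse-function counterpart of the modified Rodrigues parameters tan(θ/4); it is not taken from the text.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about unit `axis`."""
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def quat_mul(p, q):
    """Hamilton product p*q: as a rotation, q is applied first, then p."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)


def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q (computes q v q*)."""
    w, x, y, z = q
    r = quat_mul(quat_mul(q, (0.0,) + tuple(v)), (w, -x, -y, -z))
    return r[1:]


def composite_rotation(h, u, phi):
    """Rotation taking RLP vector h onto ULS unit direction u, then spinning by phi.

    Degenerate if h is exactly antiparallel to u (the bisector vanishes).
    """
    m = normalize(tuple(a + b for a, b in zip(normalize(h), u)))  # bisector
    first = quat_from_axis_angle(m, math.pi)
    second = quat_from_axis_angle(u, phi)
    return quat_mul(second, first)


def axis_angle(q):
    """Rotation axis and angle theta in [0, pi] of a unit quaternion."""
    w, x, y, z = q
    if w < 0.0:  # q and -q describe the same rotation
        w, x, y, z = -w, -x, -y, -z
    theta = 2.0 * math.acos(min(1.0, w))
    s = math.sqrt(x * x + y * y + z * z)
    if s < 1e-12:
        return (1.0, 0.0, 0.0), 0.0
    return (x / s, y / s, z / s), theta


def rotogram_point(q):
    """Map a rotation to a 3D point: axis scaled by arctan(theta / 4) (ASSUMED)."""
    axis, theta = axis_angle(q)
    scale = math.atan(theta / 4.0)
    return tuple(scale * c for c in axis)
```

Sweeping `phi` for a fixed candidate RLP and ULS traces one curve of the rotogram. For every `phi` the rotated RLP stays on the ULS direction, which is the invariance that the construction relies on.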
An example of a rotogram for a particular Bragg peak is shown in Fig. 5, for 56 candidate RLPs on a cubic lattice and a relative bandwidth of 6.5%. Each of the nonintersecting 56 colored curves is a plot of the vector for a full rotation of the lattice. For a given point in the plot, the corresponding rotation angle of the lattice to bring the RLP onto the ULS defined by is , and the rotation axis is . As seen from equation (2), each trajectory lies in the plane containing the orthogonal vectors and , which is to say the plane normal to . The trajectories form closed curves in the rotation space over the range of ϕ from 0 to , but we only require a single rotation of the lattice. To keep the rotogram volume as small as possible we choose the range . In the example of Fig. 5 the curves were sampled over that range at steps of 0.1 rad and, while the curves are not necessarily uniformly sampled in the rotation space, the choice of generates curves that more uniformly fill the space than the axis-angle representation or any other construction that we tried.
In the implementation of pinkIndexer, rotograms are not calculated continuously as shown in Fig. 5 but computed on discrete sets of N×N×N voxels that circumscribe the ball of radius . For each Bragg spot the voxel array is initialized with zeros and voxels are set to 1 that are intersected by the curve for each of the candidate RLPs, with a uniform sampling of ϕ that is chosen to ensure that the curve is contiguous across the voxels. This is accomplished by computing the parameters and at those values once for the whole rotogram and using values of the parameters and that need to be computed once per curve. To make the discretization of ϕ smoother, the flagged voxels are dilated by setting all of their 26 neighboring voxels to 1. This reduces the effective resolution of the rotogram, but increases the noise tolerance. The rotogram indicates all orientations of the crystal that predict the respective Bragg spot.
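The voxelization and dilation described here, together with the per-spot voting that a sum over spots provides, can be sketched as follows. This is a simplified illustration and not the pinkIndexer data structure: it stores flagged voxels in sets rather than in dense N×N×N arrays, and the function names and the `radius` argument (the half-width of the cube enclosing the rotogram) are assumptions of the example.

```python
import itertools
from collections import Counter


def voxel_index(point, radius, n):
    """Index of the voxel containing `point` on an n^3 grid over [-radius, radius]^3."""
    idx = []
    for c in point:
        i = int((c + radius) / (2.0 * radius) * n)
        idx.append(min(max(i, 0), n - 1))  # clamp points on the upper face
    return tuple(idx)


def spot_rotogram(curves, radius, n):
    """Binary rotogram of one Bragg spot.

    curves: one list of sampled 3D points per candidate RLP.
    Voxels hit by any curve are flagged, then dilated by their
    26 neighbours to tolerate noise and coarse sampling.
    """
    hit = set()
    for curve in curves:
        for p in curve:
            hit.add(voxel_index(p, radius, n))
    dilated = set()
    for i, j, k in hit:
        for di, dj, dk in itertools.product((-1, 0, 1), repeat=3):
            a, b, c = i + di, j + dj, k + dk
            if 0 <= a < n and 0 <= b < n and 0 <= c < n:
                dilated.add((a, b, c))
    return dilated


def total_rotogram(spot_rotograms):
    """Sum of binary per-spot rotograms: each spot adds at most 1 per voxel."""
    total = Counter()
    for rotogram in spot_rotograms:
        total.update(rotogram)
    return total


def best_voxel(total):
    """Voxel with the most votes and its vote count."""
    return max(total.items(), key=lambda kv: kv[1])
```

Because each per-spot rotogram is binary, a spot with many candidate RLPs still contributes only one vote to any orientation, so the maximum of the sum counts distinct Bragg spots rather than candidate curves.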
By adding each Bragg spot's rotogram, a total rotogram is created where the value of each voxel gives the number of Bragg spots predicted by the corresponding orientation. The voxel with the maximum value thus indicates the most likely lattice orientation that provides the correct indexing solution. The task of indexing is thus reduced to finding the location of this maximum. Since the rotogram is discrete, the determined indexing solution is approximate. A subsequent refinement is carried out to increase the precision of the indexing solution, in which the lattice basis is refined to minimize the mean Euclidean distance between the ULS and the respective closest RLP using a gradient descent approach. Only the RLPs close enough to a ULS are used for this to improve noise tolerance.
2.4. Implementation details
The algorithm has been implemented such that parameters have effects that are easy to understand. Besides experimental settings like detector distance and pixel geometry, beam parameters and lattice parameters, pinkIndexer requires a relative tolerance, set by the parameter tolerance, to decide when a peak is correctly fitted. Additional parameters trade off fitting performance against execution time. The parameter consideredPeaksCount specifies the number of found Bragg spots that are used to compute the initial indexing solution from the maximum of the rotogram. All Bragg spots are considered during refinement. The parameter angleResolution sets the resolution of the rotogram in terms of the number of voxels N spanning each dimension of the rotogram. Choosing larger voxels (lower resolution) leads to a faster calculation but lower precision in the initial step of determining the orientation from the rotogram. The second step of refining the orientation is controlled by the parameter refinementType. Refinement can be performed by a gradient descent method, fitting all parameters of the lattice or keeping the cell parameters constant and just refining the orientation. All parameters take descriptive values wherever possible.
3. Evaluation of algorithm performance
We evaluated the performance of the pinkIndexer algorithm on data from macromolecular crystal diffraction experiments utilizing three different types of radiation: monochromatic X-rays, pink X-rays and electrons. For the evaluation we used the CrystFEL (White et al., 2012) software suite 0.8.0+50a3cb06 with modifications to include the pinkIndexer library and enable prediction for wide-bandwidth and electron beams.
3.1. Monochromatic X-ray beam crystallography
The performance of pinkIndexer in treating monochromatic serial femtosecond X-ray diffraction data was compared with the indexers MOSFLM (Powell, 1999), XDS (Kabsch, 1993, 2010), DirAx (Duisenberg, 1992), TakeTwo (Ginn et al., 2016), FELIX (Beyerlein et al., 2017) and XGANDALF (Gevorkov et al., 2019) using the indexamajig program from the CrystFEL (White et al., 2012) software suite. For the test, all CrystFEL optimizations were turned off by using the options --no-retry --no-refine --no-multi --no-check-cell --no-check-peaks. Only one indexing solution per pattern was accepted. Indexing solutions that differed from the original indexing solution by less than 3° were counted as correct. The diffraction data set was retrieved from the CXIDB (Maia, 2012), entry 21, from SFX measurements of a G-protein-coupled receptor (the serotonin 5-HT_{2B} receptor bound to ergotamine) (Liu et al., 2013).
Comparing algorithms using real data provides results that indicate their performance under real conditions. However, unlike when using simulated data, the true indexing solution is unknown. The indexing solutions can be tested for correctness by comparing the predicted Bragg spots with the found ones. This is a precise method when there are many Bragg spots, but when the number of found spots is small there can be several incorrect orientations of a crystal that predict the found spots well enough to pass the indexing test. Following a practice we introduced earlier to compare indexing algorithms (Gevorkov et al., 2019) we created semi-simulated data sets with different numbers of Bragg spots by removing spots from patterns with large numbers of spots (which have reliable indexing solutions). As previously, we tested the indexers in two modes of Bragg-peak removal. In one mode the sets contained patterns with only five to 50 Bragg spots selected randomly from the patterns. In the other mode the sets of patterns contained five to 50 Bragg spots only at low resolution. The comparison is given in Fig. 6. All algorithms performed well when there were sufficient measured Bragg spots to determine the reciprocal lattice. With both cases of randomly distributed Bragg spots and low-resolution Bragg spots, the pinkIndexer algorithm outperformed all others over the whole range of Bragg-spot counts. The settings of pinkIndexer in these tests were chosen to favor precision over speed (angleResolution = `dense'). In all cases the lattice parameters were specified to the indexing algorithm. No additional tuning of the indexing algorithms was performed apart from an option that allows FELIX to index patterns with as few as five peaks.
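The two modes of Bragg-peak removal can be sketched as below. This is only an illustration of the idea, not the procedure used to produce Fig. 6: it assumes that each found spot is available as a reciprocal-space vector, that `low resolution' means small distance from the origin of reciprocal space, and the function names are hypothetical.

```python
import math
import random


def random_subset(spots, n, seed=0):
    """Keep n spots chosen uniformly at random (first removal mode)."""
    rng = random.Random(seed)  # fixed seed so the reduced pattern is reproducible
    return rng.sample(spots, n)


def low_resolution_subset(spots, n):
    """Keep the n spots closest to the reciprocal-space origin (second mode)."""
    return sorted(spots, key=lambda q: math.sqrt(sum(c * c for c in q)))[:n]
```

Applying either function with n between 5 and 50 to a pattern that has a reliable indexing solution gives a reduced pattern whose correct orientation is already known, which is what makes the comparison with the original solution possible.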
The average times for the various algorithms to index monochromatic diffraction patterns are given in Table 1, computed by indexing a set of 1000 diffraction patterns chosen randomly from the same data set as used above. To ensure a fair comparison, all indexers were called from CrystFEL with the --no-retry flag set. No attempt was made to index multiple crystals per pattern. The program was executed on a dual-socket Intel Xeon E5-2698 v4 CPU (2.20 GHz, 20 cores, 50 MB cache, 512 GB RAM). pinkIndexer was tested with settings to maximize the speed (`fast mode' in Table 1, angleResolution = `loose') and with settings to maximize the yield (`precise mode', angleResolution = `dense'), which took about five times longer. Settings in between are also possible. Even in the fast mode the algorithm takes considerably longer than other algorithms except for TakeTwo. The slower speed of pinkIndexer is because the algorithm is memory intensive. Nevertheless, due to its high indexing success rate, pinkIndexer can be profitably used as a fallback option for monochromatic diffraction patterns that cannot be indexed by other indexers. This can be implemented in CrystFEL by placing pinkIndexer last in the list of indexers.

3.2. Pink-beam serial crystallography
Diffraction patterns collected using pink-beam radiation (1% to 25% relative bandwidth) contain many observed peaks, which means that indexing solutions can easily be verified by comparing the predicted with the observed Bragg spots. To evaluate pinkIndexer using real pink-beam serial crystallography data we used the data set of proteinase K crystal diffraction from Meents et al. (2017) measured at the 14-ID-B (BioCARS) beamline at the Advanced Photon Source (APS), using the full polychromatic spectrum of an undulator harmonic. A representative diffraction pattern is depicted in Fig. 7. Although the FWHM of the incident X-ray beam spectrum was 5% of the mean photon energy, the tails of the spectrum extended up to 25% of the mean photon energy. The data set contained 999 patterns that had been classified as crystal diffraction `hits' based on the detection of at least 35 Bragg spots in the original work of Meents et al. (2017). Of these, 667 patterns were successfully indexed by pinkIndexer, with 428 determined to contain a single lattice, 168 with two lattices, and 71 with three or more lattices. This gave a total of 1005 indexed lattices. A vast majority of the 332 patterns that could not be indexed appeared to be falsely identified as crystal diffraction patterns due to fitting peaks to noise. The pinkIndexer parameters used in this test are given in Appendix A.
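As a quick consistency check of these counts (arithmetic only, using the numbers quoted above), the pattern classes add up to the indexed and unindexed totals, and the patterns with three or more lattices carry slightly more than three lattices each on average:

```latex
\begin{align*}
428 + 168 + 71 &= 667 && \text{indexed patterns},\\
999 - 667 &= 332 && \text{patterns not indexed},\\
1005 - (428 \cdot 1 + 168 \cdot 2) &= 241 && \text{lattices in the 71 patterns with three or more lattices},\\
241 / 71 &\approx 3.4 && \text{lattices per such pattern on average}.
\end{align*}
```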
The comparison of polychromatic indexing programs in an automated way for serial crystallography data sets is complicated by these programs not being able to be called from within the CrystFEL software suite. Indeed, the analysis of the pink-beam serial crystallography data carried out by Meents et al. (2017) could only be achieved in a semi-manual way using the software Precognition. This resulted in the indexing of 140 patterns of the 999 (Meents et al., 2017), and only single-crystal diffraction patterns could be indexed. This comparison shows that, for serial crystallography, pinkIndexer indexes about five times as many patterns (667 versus 140) and yields about seven times as many indexed lattices (1005 versus 140) as the current state-of-the-art software. Moreover, pinkIndexer can deal with multiple crystals per pattern and is fully automatic, thus making serial crystallography with a pink beam much easier. Fig. 7 shows two crystals contributing to the pattern which are both indexed correctly. We have also successfully used pinkIndexer for serial crystallography data from Tolstikova et al. (2019) measured with X-rays with 2.5% relative bandwidth produced using a multilayer monochromator.
3.3. Serial electron crystallography
Electron crystallography poses a challenge for conventional indexing algorithms due to the flatness of the Ewald sphere caused by the short de Broglie wavelength of the electrons. To demonstrate the applicability of pinkIndexer to serial electron diffraction data, a rotation series data set from Cruz et al. (2017) was treated like serial data by indexing each pattern individually. The known rotation increment available for this data set was used to check the correctness of the indexing solutions. The results are displayed in Fig. 8. All patterns from the data set were indexed correctly, as can be seen from the linear increment of the determined rotation angle. The maximum deviation of the angle determined by indexing from a linear fit was 0.14°, which represents an upper bound to the indexing precision since goniometer errors may also contribute. This result opens up the possibility of serial electron crystallography using randomly oriented crystals exposed in individual data frames as performed in SFX measurements.
4. Conclusion
The indexer presented in this paper has been developed for pink-beam serial crystallography using the full polychromatic spectrum of an undulator harmonic at a synchrotron radiation facility. Starting from known unit-cell parameters, the pinkIndexer algorithm works by mapping all possible rotations of candidate reciprocal-lattice points onto line segments in reciprocal space that correspond to Bragg peaks of unknown wavelength. By examining these mappings in a novel rotation space the most likely lattice orientation can be found. The main limitations of pinkIndexer are its slower speed compared with many other existing algorithms for monochromatic diffraction and the requirement of knowing the cell parameters of the studied crystals. The benefit, however, is its higher success rate in indexing snapshot diffraction patterns than all other algorithms we tested.
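The line segments mentioned above follow from elementary diffraction geometry: a Bragg peak observed in a fixed direction but with unknown wavelength fixes only the direction of the scattering vector, not its length. The following minimal sketch illustrates that relation and is not taken from the pinkIndexer source code. The function names, the beam direction along +z, the example peak position and the wavelength limits are all assumptions chosen for illustration, and the convention |q| = 1/d (no factor of 2π) is assumed.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def scattering_vector_segment(peak_xyz, lam_min, lam_max, beam_dir=(0.0, 0.0, 1.0)):
    """Endpoints of the segment on which the scattering vector must lie.

    peak_xyz : position of the peak relative to the sample (any length unit)
    lam_min, lam_max : assumed wavelength limits of the beam (e.g. in angstrom)
    beam_dir : assumed incident beam direction
    """
    s_in = unit(beam_dir)
    s_out = unit(peak_xyz)
    # q = (s_out - s_in) / lambda, so the direction is fixed by the geometry
    # and only the length depends on the unknown wavelength.
    direction = tuple(o - i for o, i in zip(s_out, s_in))
    q_near = tuple(c / lam_max for c in direction)  # longest wavelength, shortest |q|
    q_far = tuple(c / lam_min for c in direction)   # shortest wavelength, longest |q|
    return q_near, q_far


# Hypothetical peak 50 mm off-axis on a detector 100 mm downstream of the sample.
q_near, q_far = scattering_vector_segment((0.05, 0.0, 0.1), lam_min=1.0, lam_max=1.3)
print("segment start:", q_near)
print("segment end:  ", q_far)
```

Both endpoints lie on the same ray through the origin of reciprocal space, and the ratio of their lengths equals lam_max / lam_min. For a monochromatic beam the segment collapses to a single point.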
Due to the generality of the approach to different wavelengths and spectral characteristics, the algorithm presented here has the ability to open up emerging and as-yet-unexplored avenues of serial crystallography. The higher X-ray flux of the polychromatic beam enables exposure times as short as the pulse emitted from a single electron bunch in the storage ring, while the broad bandwidth leads to a high fraction of fully integrated Bragg peaks recorded in a snapshot pattern and a greater coverage of reciprocal space. These advantages have long been appreciated for time-resolved Laue diffraction experiments at synchrotron facilities (Moffat, 1997), macromolecular crystallography at neutron facilities (Blakeley et al., 2008), and have motivated the generation of pulses with broader bandwidth at free-electron laser facilities (White et al., 2013; Dejoie et al., 2013). As demonstrated here, the pinkIndexer program overcomes difficulties previously encountered in automatically analyzing thousands of polychromatic diffraction patterns. Additionally, the generality of the algorithm makes it useful for indexing monochromatic serial crystallography. In this case we found that pinkIndexer demonstrates a superior success rate in indexing diffraction patterns, especially for the difficult case of a small number of detected Bragg spots. We also showed that the approach works well for indexing snapshot diffraction patterns recorded with very short wavelengths, which is usually the situation in electron diffraction. The method might additionally find application in neutron diffraction and could be slightly modified to treat the case of convergent-beam diffraction.
5. Code availability
pinkIndexer is implemented in C++ and released as an open-source library under the LGPLv3 licence. This library can be compiled independently or together with the program suite CrystFEL (White et al., 2012). The full processing pipeline, including indexing, high-precision prediction and integration, will be realized soon as a part of CrystFEL. The source code can be downloaded at https://stash.desy.de/users/gevorkov/repos/pinkindexer/browse.
APPENDIX A
pinkIndexer settings
The pinkIndexer settings for Section 3.2 were chosen as follows (format as used by CrystFEL):
photon_energy = 11000,
photon_energy_bandwidth = 0.25,
pinkIndexer-considered-peaks-count=4,
pinkIndexer-angle-resolution=3,
pinkIndexer-refinement-type=4,
pinkIndexer-tolerance=0.03.
Unit cell:
lattice_type = tetragonal,
unique_axis = c,
centering = P,
a = 69.10 A,
b = 69.10 A,
c = 106.60 A,
al = 90.00 deg,
be = 90.00 deg,
ga = 90.00 deg.
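Because the algorithm starts from approximately known unit-cell parameters, it can be useful to check a cell definition such as the one above before indexing. The sketch below computes the cell volume and the reciprocal-cell edge lengths from the six parameters using the standard triclinic formulae. It is an illustration only, not part of pinkIndexer or CrystFEL, and the function names are hypothetical.

```python
import math


def cell_volume(a, b, c, al, be, ga):
    """Volume of a unit cell from edge lengths and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (al, be, ga))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def reciprocal_lengths(a, b, c, al, be, ga):
    """Reciprocal-cell edge lengths (a*, b*, c*), convention |a*| = 1/d_100."""
    v = cell_volume(a, b, c, al, be, ga)
    sa, sb, sg = (math.sin(math.radians(x)) for x in (al, be, ga))
    return (b * c * sa / v, a * c * sb / v, a * b * sg / v)


# Tetragonal cell from the settings listed above (lengths in angstrom).
cell = (69.10, 69.10, 106.60, 90.00, 90.00, 90.00)
print("volume:", cell_volume(*cell))
print("a*, b*, c*:", reciprocal_lengths(*cell))
```

For this cell all angles are 90°, so the volume reduces to the product a·b·c and each reciprocal length is simply the inverse of the corresponding direct-cell edge.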
The pinkIndexer settings for Section 3.1 were chosen as follows (format as used by CrystFEL):
pinkIndexer-tolerance=0.13,
pinkIndexer-angle-resolution=3.
The XGANDALF settings for Section 3 were chosen as follows (format as used by CrystFEL):
xgandalf-sampling-pitch=2,
xgandalf-grad-desc-iterations=2,
xgandalf-tolerance=0.02.
The FELIX settings for Section 3 were chosen as follows (format as used by CrystFEL):
felix-max-visits=5.
Acknowledgements
We thank Vukica Šrajer for discussions about pink-beam indexing and Robert Bücker, Pascal Hogan-Lamarre and Pedram Mehrabi for discussions and collaboration on indexing serial electron diffraction.
Funding information
We acknowledge support from the Cluster of Excellence 'Advanced Imaging of Matter' of the Deutsche Forschungsgemeinschaft (DFG) – EXC 2056 – project ID 390715994; the 'X-probe' project funded by the European Union's 2020 Research and Innovation Program under the Marie Skłodowska-Curie Grant Agreement 637295; and the European Research Council, 'Frontiers in Attosecond X-ray Science: Imaging and Spectroscopy (AXSIS)', ERC-2013-SyG 609920 (2014–2018).
References
Altmann, S. L. (1989). Math. Mag. 62, 291–308.
Beyerlein, K. R., White, T. A., Yefanov, O., Gati, C., Kazantsev, I. G., Nielsen, N. F.-G., Larsen, P. M., Chapman, H. N. & Schmidt, S. (2017). J. Appl. Cryst. 50, 1075–1083.
Blakeley, M. P., Langan, P., Niimura, N. & Podjarny, A. (2008). Curr. Opin. Struct. Biol. 18, 593–600.
Boutet, S., Lomb, L., Williams, G. J., Barends, T. R. M., Aquila, A., Doak, R. B., Weierstall, U., DePonte, D. P., Steinbrener, J., Shoeman, R. L., Messerschmidt, M., Barty, A., White, T. A., Kassemeyer, S., Kirian, R. A., Seibert, M. M., Montanez, P. A., Kenney, C., Herbst, R., Hart, P., Pines, J., Haller, G., Gruner, S. M., Philipp, H. T., Tate, M. W., Hromalik, M., Koerner, L. J., van Bakel, N., Morse, J., Ghonsalves, W., Arnlund, D., Bogan, M. J., Caleman, C., Fromme, R., Hampton, C. Y., Hunter, M. S., Johansson, L. C., Katona, G., Kupitz, C., Liang, M., Martin, A. V., Nass, K., Redecke, L., Stellato, F., Timneanu, N., Wang, D., Zatsepin, N. A., Schafer, D., Defever, J., Neutze, R., Fromme, P., Spence, J. C. H., Chapman, H. N. & Schlichting, I. (2012). Science, 337, 362–364.
Bücker, R., Hogan-Lamarre, P., Mehrabi, P., Schulz, E. C., Bultema, L. A., Gevorkov, Y., Brehm, W., Yefanov, O., Oberthür, D., Kassier, G. H. & Miller, R. J. D. (2019). https://doi.org/10.1101/682575.
Campbell, J. W., Hao, Q., Harding, M. M., Nguti, N. D. & Wilkinson, C. (1998). J. Appl. Cryst. 31, 496–502.
Carr, P. D., Dodd, I. M. & Harding, M. M. (1993). J. Appl. Cryst. 26, 384–387.
Chapman, H. N., Fromme, P., Barty, A., White, T. A., Kirian, R. A., Aquila, A., Hunter, M. S., Schulz, J., DePonte, D. P., Weierstall, U., Doak, R. B., Maia, F. R. N. C., Martin, A. V., Schlichting, I., Lomb, L., Coppola, N., Shoeman, R. L., Epp, S. W., Hartmann, R., Rolles, D., Rudenko, A., Foucar, L., Kimmel, N., Weidenspointner, G., Holl, P., Liang, M., Barthelmess, M., Caleman, C., Boutet, S., Bogan, M. J., Krzywinski, J., Bostedt, C., Bajt, S., Gumprecht, L., Rudek, B., Erk, B., Schmidt, C., Hömke, A., Reich, C., Pietschner, D., Strüder, L., Hauser, G., Gorke, H., Ullrich, J., Herrmann, S., Schaller, G., Schopper, F., Soltau, H., Kühnel, K.-U., Messerschmidt, M., Bozek, J. D., Hau-Riege, S. P., Frank, M., Hampton, C. Y., Sierra, R. G., Starodub, D., Williams, G. J., Hajdu, J., Timneanu, N., Seibert, M. M., Andreasson, J., Rocker, A., Jönsson, O., Svenda, M., Stern, S., Nass, K., Andritschke, R., Schröter, C.-D., Krasniqi, F., Bott, M., Schmidt, K. E., Wang, X., Grotjohann, I., Holton, J. M., Barends, T. R. M., Neutze, R., Marchesini, S., Fromme, R., Schorb, S., Rupp, D., Adolph, M., Gorkhover, T., Andersson, I., Hirsemann, H., Potdevin, G., Graafsma, H., Nilsson, B. & Spence, J. C. H. (2011). Nature, 470, 73–77.
Cruz, M. J. de la, Hattne, J., Shi, D., Seidler, P., Rodriguez, J., Reyes, F. E., Sawaya, M. R., Cascio, D., Weiss, S. C., Kim, S. K., Hinck, C. S., Hinck, A. P., Calero, G., Eisenberg, D. & Gonen, T. (2017). Nat. Methods, 14, 399–402.
Dejoie, C., McCusker, L. B., Baerlocher, C., Abela, R., Patterson, B. D., Kunz, M. & Tamura, N. (2013). J. Appl. Cryst. 46, 791–794.
Duisenberg, A. J. M. (1992). J. Appl. Cryst. 25, 92–96.
Gati, C., Oberthuer, D., Yefanov, O., Bunker, R. D., Stellato, F., Chiu, E., Yeh, S.-M., Aquila, A., Basu, S., Bean, R., Beyerlein, K. R., Botha, S., Boutet, S., DePonte, D. P., Doak, R. B., Fromme, R., Galli, L., Grotjohann, I., James, D. R., Kupitz, C., Lomb, L., Messerschmidt, M., Nass, K., Rendek, K., Shoeman, R. L., Wang, D., Weierstall, U., White, T. A., Williams, G. J., Zatsepin, N. A., Fromme, P., Spence, J. C. H., Goldie, K. N., Jehle, J. A., Metcalf, P., Barty, A. & Chapman, H. N. (2017). Proc. Natl Acad. Sci. USA, 114, 2247–2252.
Gevorkov, Y., Yefanov, O., Barty, A., White, T. A., Mariani, V., Brehm, W., Tolstikova, A., Grigat, R.-R. & Chapman, H. N. (2019). Acta Cryst. A75, 694–704.
Ginn, H. M., Roedig, P., Kuo, A., Evans, G., Sauter, N. K., Ernst, O. P., Meents, A., Mueller-Werkmeister, H., Miller, R. J. D. & Stuart, D. I. (2016). Acta Cryst. D72, 956–965.
Helliwell, J. R., Habash, J., Cruickshank, D. W. J., Harding, M. M., Greenhough, T. J., Campbell, J. W., Clifton, I. J., Elder, M., Machin, P. A., Papiz, M. Z. & Zurek, S. (1989). J. Appl. Cryst. 22, 483–497.
Jacobson, R. A. (1986). J. Appl. Cryst. 19, 283–286.
James, R. (1950). The Optical Principles of the Diffraction of X-rays. London: Bell.
Kabsch, W. (1993). J. Appl. Cryst. 26, 795–800.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Kalinowski, J. A., Makal, A. & Coppens, P. (2011). J. Appl. Cryst. 44, 1182–1189.
Kang, Y., Zhou, X. E., Gao, X., He, Y., Liu, W., Ishchenko, A., Barty, A., White, T. A., Yefanov, O., Han, G. W., Xu, Q., de Waal, P. W., Ke, J., Tan, M. H. E., Zhang, C., Moeller, A., West, G. M., Pascal, B. D., Van Eps, N., Caro, L. N., Vishnivetskiy, S. A., Lee, R. J., Suino-Powell, K. M., Gu, X., Pal, K., Ma, J., Zhi, X., Boutet, S., Williams, G. J., Messerschmidt, M., Gati, C., Zatsepin, N. A., Wang, D., James, D., Basu, S., Roy-Chowdhury, S., Conrad, C. E., Coe, J., Liu, H., Lisova, S., Kupitz, C., Grotjohann, I., Fromme, R., Jiang, Y., Tan, M., Yang, H., Li, J., Wang, M., Zheng, Z., Li, D., Howe, N., Zhao, Y., Standfuss, J., Diederichs, K., Dong, Y., Potter, C. S., Carragher, B., Caffrey, M., Jiang, H., Chapman, H. N., Spence, J. C. H., Fromme, P., Weierstall, U., Ernst, O. P., Katritch, V., Gurevich, V. V., Griffin, P. R., Hubbell, W. L., Stevens, R. C., Cherezov, V., Melcher, K. & Xu, H. E. (2015). Nature, 523, 561–567.
Liu, W., Wacker, D., Gati, C., Han, G. W., James, D., Wang, D., Nelson, G., Weierstall, U., Katritch, V., Barty, A., Zatsepin, N. A., Li, D., Messerschmidt, M., Boutet, S., Williams, G. J., Koglin, J. E., Seibert, M. M., Wang, C., Shah, S. T. A., Basu, S., Fromme, R., Kupitz, C., Rendek, K. N., Grotjohann, I., Fromme, P., Kirian, R. A., Beyerlein, K. R., White, T. A., Chapman, H. N., Caffrey, M., Spence, J. C. H., Stevens, R. C. & Cherezov, V. (2013). Science, 342, 1521–1524.
Maia, F. R. N. C. (2012). Nat. Methods, 9, 854–855.
Meents, A., Wiedorn, M. O., Srajer, V., Henning, R., Sarrou, I., Bergtholdt, J., Barthelmess, M., Reinke, P. Y. A., Dierksmeyer, D., Tolstikova, A., Schaible, S., Messerschmidt, M., Ogata, C. M., Kissick, D. J., Taft, M. H., Manstein, D. J., Lieske, J., Oberthuer, D., Fischetti, R. F. & Chapman, H. N. (2017). Nat. Commun. 8, 1281.
Moffat, K. (1997). Methods Enzymol. 277, 433–447.
Nogly, P., James, D., Wang, D., White, T. A., Zatsepin, N., Shilova, A., Nelson, G., Liu, H., Johansson, L., Heymann, M., Jaeger, K., Metz, M., Wickstrand, C., Wu, W., Båth, P., Berntsen, P., Oberthuer, D., Panneels, V., Cherezov, V., Chapman, H., Schertler, G., Neutze, R., Spence, J., Moraes, I., Burghammer, M., Standfuss, J. & Weierstall, U. (2015). IUCrJ, 2, 168–176.
Pande, K., Hutchison, C. D. M., Groenhof, G., Aquila, A., Robinson, J. S., Tenboer, J., Basu, S., Boutet, S., DePonte, D. P., Liang, M., White, T. A., Zatsepin, N. A., Yefanov, O., Morozov, D., Oberthuer, D., Gati, C., Subramanian, G., James, D., Zhao, Y., Koralek, J., Brayshaw, J., Kupitz, C., Conrad, C., Roy-Chowdhury, S., Coe, J. D., Metz, M., Xavier, P. L., Grant, T. D., Koglin, J. E., Ketawala, G., Fromme, R., Šrajer, V., Henning, R., Spence, J. C. H., Ourmazd, A., Schwander, P., Weierstall, U., Frank, M., Fromme, P., Barty, A., Chapman, H. N., Moffat, K., van Thor, J. J. & Schmidt, M. (2016). Science, 352, 725–729.
Powell, H. R. (1999). Acta Cryst. D55, 1690–1695.
Pujol, J. (2013). Appl. Mech. Rev. 65, 054501.
Ren, Z., Bourgeois, D., Helliwell, J. R., Moffat, K., Šrajer, V. & Stoddard, B. L. (1999). J. Synchrotron Rad. 6, 891–917.
Schlichting, I. (2015). IUCrJ, 2, 246–255.
Shrive, A. K., Clifton, I. J., Hajdu, J. & Greenhough, T. J. (1990). J. Appl. Cryst. 23, 169–174.
Smeets, S., Zou, X. & Wan, W. (2018). J. Appl. Cryst. 51, 1262–1273.
Stagno, J. R., Liu, Y., Bhandari, Y. R., Conrad, C. E., Panja, S., Swain, M., Fan, L., Nelson, G., Li, C., Wendel, D. R., White, T. A., Coe, J. D., Wiedorn, M. O., Knoska, J., Oberthuer, D., Tuckey, R. A., Yu, P., Dyba, M., Tarasov, S. G., Weierstall, U., Grant, T. D., Schwieters, C. D., Zhang, J., Ferré-D'Amaré, A. R., Fromme, P., Draper, D. E., Liang, M., Hunter, M. S., Boutet, S., Tan, K., Zuo, X., Ji, X., Barty, A., Zatsepin, N. A., Chapman, H. N., Spence, J. C. H., Woodson, S. A. & Wang, Y.-X. (2016). Nature, 541, 242–246.
Stellato, F., Oberthür, D., Liang, M., Bean, R., Gati, C., Yefanov, O., Barty, A., Burkhardt, A., Fischer, P., Galli, L., Kirian, R. A., Meyer, J., Panneerselvam, S., Yoon, C. H., Chervinskii, F., Speller, E., White, T. A., Betzel, C., Meents, A. & Chapman, H. N. (2014). IUCrJ, 1, 204–212.
Suga, M., Akita, F., Hirata, K., Ueno, G., Murakami, H., Nakajima, Y., Shimizu, T., Yamashita, K., Yamamoto, M., Ago, H. & Shen, J.-R. (2014). Nature, 517, 99–103.
Tenboer, J., Basu, S., Zatsepin, N., Pande, K., Milathianaki, D., Frank, M., Hunter, M., Boutet, S., Williams, G. J., Koglin, J. E., Oberthuer, D., Heymann, M., Kupitz, C., Conrad, C., Coe, J., Roy-Chowdhury, S., Weierstall, U., James, D., Wang, D., Grant, T., Barty, A., Yefanov, O., Scales, J., Gati, C., Seuring, C., Srajer, V., Henning, R., Schwander, P., Fromme, R., Ourmazd, A., Moffat, K., Van Thor, J. J., Spence, J. C. H., Fromme, P., Chapman, H. N. & Schmidt, M. (2014). Science, 346, 1242–1246.
Terzakis, G., Lourakis, M. & Ait-Boudaoud, D. (2018). J. Math. Imaging Vis. 60, 422–442.
Tolstikova, A., Levantino, M., Yefanov, O., Hennicke, V., Fischer, P., Meyer, J., Mozzanica, A., Redford, S., Crosas, E., Opara, N. L., Barthelmess, M., Lieske, J., Oberthuer, D., Wator, E., Mohacsi, I., Wulff, M., Schmitt, B., Chapman, H. N. & Meents, A. (2019). IUCrJ, 6, 927–937.
Wenk, H. R., Heidelbach, F., Chateigner, D. & Zontone, F. (1997). J. Synchrotron Rad. 4, 95–101.
White, T. A. (2019). Acta Cryst. D75, 219–233.
White, T. A., Barty, A., Stellato, F., Holton, J. M., Kirian, R. A., Zatsepin, N. A. & Chapman, H. N. (2013). Acta Cryst. D69, 1231–1240.
White, T. A., Kirian, R. A., Martin, A. V., Aquila, A., Nass, K., Barty, A. & Chapman, H. N. (2012). J. Appl. Cryst. 45, 335–341.
Zurek, S., Papiz, M., Machin, P. & Helliwell, J. (1985). Info. Q. Protein Crystallogr. 16, 37–40.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.