research papers
MeshAndCollect: an automated multi-crystal data-collection workflow for synchrotron macromolecular crystallography beamlines
aStructural Biology Group, European Synchrotron Radiation Facility, CS 40220, 38043 Grenoble, France, bEuropean Molecular Biology Laboratory, Hamburg Outstation, Notkestrasse 85, 22607 Hamburg, Germany, cEuropean Molecular Biology Laboratory, Grenoble Outstation, 71 Avenue des Martyrs, CS 90181, 38042 Grenoble, France, dUnit of Virus Host-Cell Interactions, Université Grenoble Alpes–EMBL–CNRS, 38042 Grenoble, France, eUniversité Grenoble Alpes, IBS, 38044 Grenoble, France, fCNRS, IBS, 38044 Grenoble, France, gCEA, IBS, 38044 Grenoble, France, hICS-6: Molecular Biophysics, Institute of Complex Systems (ICS), Research Centre Juelich, 52425 Juelich, Germany, and iLaboratory for Advanced Studies of Membrane Proteins, Moscow Institute of Physics and Technology, Dolgoprudniy 141700, Russian Federation
*Correspondence e-mail: leonard@esrf.fr
Here, an automated procedure is described to identify the positions of many cryocooled crystals mounted on the same sample holder, to rapidly predict and rank their relative diffraction strengths and to collect partial X-ray diffraction data sets from as many of the crystals as desired. Subsequent hierarchical cluster analysis then allows the best combination of partial data sets, optimizing the quality of the final data set obtained. The results of applying the method developed to various systems and scenarios, including the compilation of a complete data set from tiny crystals of the membrane protein bacteriorhodopsin and the collection of data sets for successful structure solution using the single-wavelength anomalous dispersion technique, are also presented.
Keywords: macromolecular crystallography; synchrotron serial crystallography; multi-crystal data collection; hierarchical cluster analysis.
PDB references: thermolysin, 5a3y; lysozyme, 5a3z; bacteriorhodopsin, 5a44; 5a45; thaumatin, 5a47
1. Introduction
Merging partial X-ray diffraction data sets from many crystals to produce a complete data set can be traced back to the very beginnings of macromolecular crystallography (MX). Indeed, in order to cope with the radiation damage observed at room temperature, the structure solution of myoglobin required the merging of partial data sets, each comprising a single precession photograph, from 22 crystals per heavy-atom derivative (Kendrew et al., 1960). However, with the introduction of cryogenic data-collection techniques (Hope, 1988) the effects of radiation damage could be limited dramatically. This generally allowed the collection of complete data sets from single crystals of biological macromolecules, even at beamlines at high-intensity third-generation synchrotron sources, and this soon became the norm.
The emergence of X-ray free-electron lasers (XFELs) saw multi-crystal data collection in MX revived and taken to its logical limits. Owing to the exceptionally bright nature of XFEL beams, experimenters adopted a `diffraction before destruction' approach, dubbed serial femtosecond crystallography (SFX), streaming microcrystals through the X-ray beam and collecting still diffraction images where the crystal and X-ray laser pulse coincide (Chapman et al., 2011). Complete data sets are then compiled by combining data from many thousands of still diffraction images. While SFX is likely to prove a watershed in MX, chiefly because the crystal structures determined using the technique should be largely free of radiation damage (Neutze et al., 2000; Boutet et al., 2012), the technique is not without its disadvantages. In particular, the amount of protein required in SFX experiments is currently rather high even for structure solution based on molecular-replacement techniques (i.e. only native data sets are collected). Moreover, software for SFX data analysis is developing rapidly (Sauter et al., 2013; White et al., 2012, 2013; Barty et al., 2014; Sawaya et al., 2014), but the diffraction images collected in such experiments contain predominantly partially recorded reflections measured from crystals of different sizes with laser pulses of different spectral content. Estimation of the intensity (and its standard deviation) of any given reflection is therefore problematic, and data-processing methods will have to evolve significantly if the quality of SFX-collected data is to approach that currently available in `traditional' MX experiments.
Inspired by the success of SFX, experimenters at synchrotron MX beamlines have used similar paradigms (bright X-ray beams, fast read-out detectors, small crystals, single-exposure experiments) to develop synchrotron serial crystallography (SSX), showing that it is possible to compile useful data sets from hundreds or thousands of crystals introduced into the synchrotron beam either via jets (Nogly et al., 2015), liquid streams in glass capillaries (Stellato et al., 2014), free-standing high-viscosity micro-streams (Botha et al., 2015), sandwiched between two silicon nitride (Si₃N₄; Coquelle et al., 2015) or cyclic olefin (COC; Huang et al., 2015) wafers that are translated through the X-ray beam, or contained on a cryocooled sample holder (Gati et al., 2014). In the latter case the whole sample mount is continuously rastered through the X-ray beam, being rotated at the same time (de Sanctis et al., 2012), and diffraction images are recorded on the fly at set time intervals. As for crystals introduced into the X-ray beam in liquid streams or on silicon nitride wafers, the large majority of diffraction images collected contain no useful information. However, the fact that the sample is also rotated while being rastered means that where the crystal and the X-ray beam coincide some diffraction images could contain fully recorded reflections, thus rendering the processing and scaling of diffraction images using standard software packages relatively straightforward and improving the overall data quality. Moreover, for crystals larger than the X-ray beam, diffraction images can be grouped into those originating from the same crystal, thus also facilitating data processing and improving the resulting data quality (Gati et al., 2014).
While for the same crystal volume and X-ray beam size the resolution obtainable in SSX experiments is likely to always be lower than that in SFX, SSX will become an important technique in MX. In particular, initial crystals of many systems are often small and SSX provides a means to study them without the need for the often time-consuming and cumbersome optimization of crystal size and/or quality. Indeed, when combined with the extremely bright X-ray beams that will be available at future low-emittance fourth-generation storage rings (see, for example, https://www.esrf.fr/Apache_files/Upgrade/ESRF-orange-book.pdf), such experiments may well become the norm. However, even when rastering samples contained on a cryocooled sample holder through the X-ray beam, SSX often suffers, as does SFX, from the fact that no attempt is made to synchronize the intersection of the X-ray beam and crystal during the experiment. Moreover, as the SFX `diffraction before destruction' principle currently does not apply in SSX experiments on cryocooled samples, the amount of diffraction data collected from any given crystal is far from optimized.
Recent developments based on either the optical (Huang et al., 2015) or diffraction-based (Soares et al., 2014) pre-interrogation of multi-crystal sample holders have ensured the synchronization of X-ray beam and crystals in SSX protocols and have enormously reduced the amount of sample required for a successful experiment. In a further step towards the optimal collection of diffraction data in SSX experiments from samples which can sustain the collection of many X-ray diffraction images before significant radiation damage occurs, we have developed an automatic procedure (Fig. 1). Here, the positions of many randomly oriented (micro)crystals contained in a single cryocooled sample holder are determined using an X-ray-based two-dimensional scan, the diffraction strength of each crystal found is automatically ranked and partial data sets from each crystal are collected and processed online. Subsequent manual hierarchical cluster analysis (HCA; Giordano et al., 2012) is then used to decide the most correlated partial data sets to merge to produce the best quality data set for use in downstream analysis and structure solution. The protocol developed can in principle be applied to crystals mounted in almost any type of currently available mounting platform (i.e. nylon loops, micro-meshes, Si₃N₄ or COC wafers etc.) and is applicable not only to multi-crystal data collection but additionally automates multi-position data collection from large crystals when exploiting mini-focus or micro-focus X-ray beams.
As proof of the general usefulness of the protocol developed, we present the results of applying this method to various systems and scenarios. These include the compilation of a complete data set from microcrystals of the membrane protein bacteriorhodopsin, the collection and merging of partial data sets collected from different positions of larger crystals and the collection of data sets for use in structure solution using single-wavelength anomalous dispersion techniques.
2. Methods
In the experiments described here, the best results were obtained from crystals mounted in a flat sample holder (i.e. MiTeGen MicroMeshes; MiTeGen, USA; Fig. 1a), avoiding stacking of crystals and an excess of surrounding mother liquor, before either flash-cooling in liquid nitrogen or directly on the beamline. When mounted on a goniometer, the plane of the sample holder should be perpendicular to the direction of the X-ray beam. This ensures that any crystal brought into the X-ray beam will remain illuminated over a relatively small rotation range (±5° in the experiments described here). To make this adjustment, we usually exploit the mini-kappa goniometers (Brockhauser et al., 2013) installed on most of the MX beamlines at the ESRF. The MeshAndCollect protocol (Fig. 1b) is implemented in a customized Passerelle-EDM workflow engine (https://isencia.be/passerelle-edm-en) called the Beamline Expert System that is based on previous developments (Brockhauser et al., 2012) and is embedded in the MXCuBE2 beamline-control graphical user interface (Gabadinho et al., 2010; de Sanctis & Leonard, 2014). Once the workflow has been launched the user defines the size of the X-ray beam to be used. Ideally, this should correspond to, or be smaller than, the minimum dimension of the crystals contained in the sample holder. The area over which the initial two-dimensional mesh scan is performed (Fig. 1a) is drawn by the user, with the dimensions of the grid and the X-ray beam size defining the number of points in the mesh scan. Diffraction images collected at each of these points are analysed on the fly for protein diffraction using the software DOZOR (§2.1). The user receives a heat map (Fig. 1), also stored in the ISPyB database (Delagenière et al., 2011), showing the grid points at which diffraction has been observed. The user then has the possibility of adjusting the contrast level to include or exclude points for subsequent data collection.
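The relationship between the drawn scan area, the beam size and the number of mesh points, together with the thresholding of the heat map, can be illustrated with a minimal sketch. It assumes that the raster step equals the beam diameter and that grid points are simply ranked by score; the function names, the scan dimensions and the toy scores are hypothetical and are not taken from the Beamline Expert System.

```python
import math

def mesh_grid(width_um, height_um, beam_um):
    """Return (n_cols, n_rows, centres) for a raster scan.

    Assumption: the step between grid points equals the beam diameter,
    so the number of points follows from the drawn area and the beam size.
    """
    n_cols = max(1, math.ceil(width_um / beam_um))
    n_rows = max(1, math.ceil(height_um / beam_um))
    centres = [((c + 0.5) * beam_um, (r + 0.5) * beam_um)
               for r in range(n_rows) for c in range(n_cols)]
    return n_cols, n_rows, centres

def select_points(scores, threshold):
    """Indices of grid points scoring above the threshold, strongest first."""
    hits = [i for i, s in enumerate(scores) if s > threshold]
    return sorted(hits, key=lambda i: scores[i], reverse=True)

# Hypothetical 200 x 100 µm scan area and a 10 µm beam.
n_cols, n_rows, centres = mesh_grid(200.0, 100.0, 10.0)

# Toy diffraction scores: three grid points show some signal.
scores = [0.0] * len(centres)
scores[5], scores[42], scores[117] = 3.2, 11.5, 0.4

# Raising or lowering the threshold includes or excludes points.
ranked = select_points(scores, threshold=1.0)
print(n_cols, n_rows, len(centres), ranked)
```

Here the threshold plays the role of the contrast level adjusted by the user: only the points that remain above it would be passed on to the collection of partial data sets.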
In the last experimental step partial data sets (±5°, i.e. a 10° total rotation range, 100 images per partial data set) are collected sequentially at each grid point with a DOZOR score above the threshold. Each partial data set is automatically processed using the GrenADes pipeline (Monaco et al., 2013) based on XDS (Kabsch, 2010) running in parallel with the data collection. Partial data sets that have been successfully processed are then scaled together using XSCALE (Kabsch, 2010). The resulting correlation coefficients CCI(i, j), calculated for the common unique intensities of each pair of data sets, are used in a HCA protocol (Giordano et al., 2012) to produce a dendrogram (Fig. 1). This is then used to decide which partial data sets to combine to produce, using the CCP4 programs POINTLESS and AIMLESS (Evans & Murshudov, 2013), the final data set for structure solution and refinement (Fig. 1). A feature of POINTLESS is that it uses the first partial data set provided as input as a reference data set. This avoids, where it might have been possible during automatic data processing, indexing ambiguities between partial data sets, with the result that the merged data set obtained is not artifactually merohedrally twinned (for a discussion of this, see Brehm & Diederichs, 2014). Any twinning then detected (i.e. using the `H-test'; Yeates, 1997) in the final merged data set, although an average over all crystals included, is likely to be real, facilitating determination of the true space group for use with the correct twinning fraction (if appropriate) in subsequent structure solution and refinement.
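The clustering step can be sketched as follows. This is an illustration under stated assumptions rather than the protocol of Giordano et al. (2012): the distance metric d = sqrt(1 − CC²), the average-linkage rule, the cutoff value and the toy correlation matrix are all assumed here for the purpose of the example.

```python
import math

def cc_to_distance(cc):
    """Assumed metric: d = sqrt(1 - CC^2), so that CC = 1 gives d = 0."""
    return math.sqrt(max(0.0, 1.0 - cc * cc))

def cluster(cc, cutoff):
    """Average-linkage agglomerative clustering, stopped at a distance cutoff.

    cc is a symmetric matrix (list of lists) of pairwise correlation
    coefficients between partial data sets. Returns the clusters as
    sorted lists of data-set indices, largest cluster first.
    """
    n = len(cc)
    dist = [[cc_to_distance(cc[i][j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if d > cutoff:
            break  # the closest remaining pair is too dissimilar to merge
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters, key=len, reverse=True)

# Toy matrix: data sets 0-2 correlate well, data set 3 is an outlier.
cc = [
    [1.00, 0.98, 0.97, 0.60],
    [0.98, 1.00, 0.96, 0.55],
    [0.97, 0.96, 1.00, 0.58],
    [0.60, 0.55, 0.58, 1.00],
]
clusters = cluster(cc, cutoff=0.4)
main_cluster = clusters[0]
print(main_cluster)
```

In this toy case the three mutually well correlated data sets form the main cluster and the poorly correlated one is left out, which mirrors the way in which a dendrogram is used to choose the partial data sets to merge.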
2.1. DOZOR
One of the core features of the protocol described here is the ability to automatically recognize and rank the series of single diffraction patterns collected during the low-dose mesh scan of the sample holder. This is carried out using the program DOZOR. As the algorithm used will be illustrated in more detail elsewhere, it will be only briefly described here.
In a first step, DOZOR determines the distribution of background intensity on a diffraction image as a function of the diffraction vector length h. This is accomplished by the iterative summation of pixel intensities and the sequential rejection of outliers. After azimuthal averaging this produces a one-dimensional background function of h. This function should be smooth: any sharp peaks are an indication of ice rings or salt diffraction, and such areas are not used in further calculations.
In the case of diffraction from a crystal of a biological macromolecule, the function defined in (1) (where N(h) is the number of detector pixels and Ii,j is the intensity in any pixel which belongs to the resolution shell h) will give the estimate of the mean intensity of Bragg spots as a function of resolution and will represent the well known Wilson plot, which for any protein crystal can be modelled using the unique pattern of average squared structure-factor magnitudes (Bourenkov & Popov, 2006). DOZOR approximates the experimental data by applying an isotropic Debye–Waller factor to the standard protein Wilson plot model (2).
The quality of the resulting fit is evaluated via the correlation coefficient between the left and right parts of (2), CCpowder. The program also identifies individual Bragg spots and makes a few simple geometrical checks which additionally validate the presence of diffraction from macromolecular crystals and allow the rejection of ice or salt contamination. Finally, a score of diffraction strength is estimated as the total averaged diffraction intensity multiplied by CCpowder (3), where V(h) is the reciprocal volume of the resolution shell.
In the case where DOZOR cannot find any Bragg spots, the score is determined as zero.
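The scoring idea described above can be illustrated with a toy calculation. The sketch below is not the DOZOR algorithm and does not reproduce its formulae: it assumes a Debye–Waller factor of the form exp(−Bh²/2), a log-linear least-squares fit, spherical-shell volumes for V(h) and a volume-weighted mean intensity, and all names and numbers are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def fit_scale_and_b(h, i_obs, j0):
    """Least-squares fit of ln(I_obs / J0) = ln(k) - B * h^2 / 2."""
    x = [-0.5 * hv * hv for hv in h]
    y = [math.log(io / jv) for io, jv in zip(i_obs, j0)]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    k = math.exp(my - b * mx)
    return k, b

def toy_score(h_edges, i_obs, j0):
    """Return (score, B, CC) for per-shell mean intensities.

    h_edges are the shell boundaries in 1/Å, i_obs the background-corrected
    mean intensities per shell and j0 the model Wilson-plot values per shell.
    """
    shells = list(zip(h_edges, h_edges[1:]))
    h = [0.5 * (lo + hi) for lo, hi in shells]
    # Assumed V(h): volume of a spherical shell in reciprocal space.
    vol = [4.0 / 3.0 * math.pi * (hi ** 3 - lo ** 3) for lo, hi in shells]
    if min(i_obs) <= 0.0:
        return 0.0, 0.0, 0.0  # the log fit needs positive intensities
    k, b = fit_scale_and_b(h, i_obs, j0)
    model = [k * jv * math.exp(-0.5 * b * hv * hv) for jv, hv in zip(j0, h)]
    cc = pearson(i_obs, model)
    mean_i = sum(io * v for io, v in zip(i_obs, vol)) / sum(vol)
    return cc * mean_i, b, cc

# Synthetic shells generated with k = 100 and B = 20 Å^2 (toy values).
h_edges = [0.1, 0.2, 0.3, 0.4, 0.5]
j0 = [4.0, 3.0, 2.5, 1.5]
h_mid = [0.15, 0.25, 0.35, 0.45]
i_obs = [100.0 * j * math.exp(-0.5 * 20.0 * h * h) for j, h in zip(j0, h_mid)]

score, b_fit, cc = toy_score(h_edges, i_obs, j0)
print(round(b_fit, 3), round(cc, 3), score > 0.0)
```

Multiplying the averaged intensity by the correlation coefficient means that strong but non-protein-like signal (for example from ice or salt, which would not follow the model Wilson plot) is down-weighted in the ranking.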
3. Results
3.1. Bacteriorhodopsin
Crystals of bacteriorhodopsin (BR) were prepared as described previously (Gordeliy et al., 2003). In this study, two batches of bacteriorhodopsin crystals were used: BR1 (Fig. 2a), with dimensions of ∼20 × 20 × 5 µm, and BR2 (Fig. 3a), with dimensions of ∼5 × 5 × 2 µm. Diffraction data (Table 1) were collected on ESRF beamline ID29 (de Sanctis et al., 2012) using a PILATUS3 6M pixel detector (Dectris, Baden, Switzerland).
For BR1 the initial mesh scan was carried out using a Gaussian X-ray beam of 20 µm in diameter with a flux of 3 × 10¹¹ photons s⁻¹. The resulting heat map (Fig. 2b) revealed ten well diffracting positions from which partial data sets were collected. All partial data sets could be automatically processed and, after HCA (Fig. 2c), nine were chosen for scaling and merging to produce a final data set to dmin = 2.3 Å (Table 1; Wilson plot shown in Fig. 2d).
For BR2, the initial mesh scan (X-ray beam of 10 µm in diameter with a flux of 1.5 × 10¹¹ photons s⁻¹) produced a heat map (Fig. 3b) showing 59 diffracting positions in the sample holder from which partial data sets were collected. 38 partial data sets could be automatically processed and, after HCA (Fig. 3c), ten were merged to produce a final data set to dmin = 2.6 Å (Table 1; Wilson plot shown in Fig. 3d).
For both BR1 (twinning fraction 0.06) and BR2 (twinning fraction 0.39) structure solution was carried out by molecular replacement (MR) using MOLREP (Vagin & Teplyakov, 2010) with PDB entry 3ns0 (Borshchevskiy et al., 2011) stripped of water molecules and ligands as a search model. Structure refinement (Table 2) was carried out using the twin-refinement option in REFMAC5 (Murshudov et al., 2011) interspersed with rounds of manual rebuilding in Coot (Emsley et al., 2010). In both crystal structures the retinal cofactor could be assigned from the interpretation of electron-density and difference density maps and is well defined both in the final 2mFobs − DFcalc electron density and in OMIT difference density maps (Figs. 2e, 2f, 3e and 3f).
3.2. Thaumatin
Thaumatin (Sigma–Aldrich catalogue No. T7638) was dissolved in double-distilled water to a concentration of 20 mg ml⁻¹. Crystals of approximate dimensions 40 × 40 × 60 µm were obtained in 2 µl (1:1 ratio) hanging drops using 0.1 M HEPES pH 7.5, 0.7 M potassium/sodium tartrate, 20% glycerol as a reservoir. Crystals were mounted as described in §2 without further cryoprotection. Data were collected on ESRF beamline ID29. The initial mesh scan was performed with an X-ray beam of 10 µm in diameter with a flux of 8.7 × 10¹¹ photons s⁻¹. From the resulting heat map (Fig. 4a), 100 well diffracting points were chosen for the collection of partial data sets, of which 78 could be automatically integrated. After HCA (Fig. 4b) 74 were merged to produce a final data set to dmin = 1.2 Å (Table 1; Wilson plot shown in Fig. 4c).
Structure solution was carried out by molecular replacement using MOLREP with PDB entry 4axu (Cipriani et al., 2012) stripped of water molecules and ligands as a search model. Structure refinement (Table 2, Fig. 4d), during which analysis of difference electron-density maps clearly allowed the assignment of tartrate (one molecule; Figs. 4e and 4f) and glycerol (one molecule) moieties bound to the protein, was carried out in REFMAC5 alternated with manual rebuilding in Coot.
3.3. Monoclinic lysozyme
Lysozyme (Roche Applied Science, catalogue No. 10837059001) was dissolved in double-distilled water to a concentration of 40 mg ml⁻¹. `Flowers' of monoclinic (space group P2₁) lysozyme crystals (Fig. 5a), with each petal ∼80 µm in the largest dimension, were then obtained from 2 µl (1:1 ratio) hanging drops using 0.6 M NaNO₃ as the precipitant/reservoir. Prior to mounting, 1 µl of 75% glycerol was added to the crystallization drop for cryoprotection. Diffraction data were collected on ESRF beamline ID23-1 (Nurizzo et al., 2006) using an X-ray beam of 10 µm in diameter with a flux of 3.5 × 10¹⁰ photons s⁻¹. The initial mesh scan produced a heat map (Fig. 5b) which was used as the basis for the collection of 54 partial data sets, of which 40 could be automatically processed. After HCA (Fig. 5c) 21 partial data sets were merged to produce a final data set to dmin = 1.6 Å (Table 1; Wilson plot shown in Fig. 5d). Structure solution and refinement (Table 2, Fig. 5d) were then carried out as described above for thaumatin (using PDB entry 4axt stripped of water molecules and ligands as the search model for molecular replacement; Cipriani et al., 2012), during which analysis of electron-density and difference electron-density maps allowed the assignment of a nitrate (NO₃⁻) ion bound to one of the lysozyme molecules in the asymmetric unit (Fig. 5f).
3.4. Thermolysin
Bacillus thermoproteolyticus thermolysin (Sigma–Aldrich catalogue No. T0331) was dissolved to 100 mg ml⁻¹ in 45% DMSO, 0.05 M MES pH 6.0. The reservoir contained 35% saturated ammonium sulfate, whereas the drops were composed of the protein solution and a solution consisting of 0.05 M MES pH 6.0, 1 M NaCl, 45% DMSO in a 1:1 ratio. Rod-shaped crystals of between 40 × 40 × 150 and 40 × 40 × 300 µm in size were quick-soaked in 6 M trimethylamine N-oxide (TMAO; Mueller-Dieckmann et al., 2011) for cryoprotection before mounting on a sample support (Fig. 6). Diffraction data were collected using an X-ray beam of 10 µm in diameter with a flux of 4.0 × 10¹⁰ photons s⁻¹ at the peak of the Zn K absorption edge (λ = 1.256 Å) on beamline ID23-1 of the ESRF. The initial mesh scan produced a heat map (Fig. 6a) which was used as a basis for the collection of 96 partial data sets, 77 of which were automatically processed and 49 of which were manually merged after HCA to produce a final data set to dmin = 1.37 Å (Table 1, Figs. 6b and 6c). Structure solution (Fig. 6d) was carried out by the SAD method (Dauter et al., 2002) using the SHELXC/D/E pipeline (Sheldrick, 2008) as implemented in HKL2MAP (Pape & Schneider, 2004), with the initial de novo-obtained model of the structure refined (Table 2, Fig. 6e) using iterative rounds of REFMAC5 and manual rebuilding in Coot.
Our experiments with crystals of thermolysin reveal other features of the developed pipeline. In particular, when, as was the case here, the sample holder contains a series of crystals much larger than the X-ray beam (Fig. 6a) multi-crystal/multi-position data collection is also automated. Indeed, for crystals that are larger than the X-ray beam the rapid online analysis and ranking of diffraction characteristics using DOZOR (§2.1) provides diffraction cartographs (Bowler et al., 2010) of the crystals contained on the sample mount (Fig. 6b). The workflow thus ensures that partial data sets are collected from only well diffracting areas of any given crystal.
3.5. MAEL domain of Bombyx mori Maelstrom
Diffraction data from crystals of the selenomethionyl derivative of the MAEL domain of B. mori Maelstrom (for crystallization conditions, see Chen et al., 2015) were collected using an X-ray beam of 10 µm in diameter with a flux of ∼9.5 × 10¹⁰ photons s⁻¹ at the peak of the Se K absorption edge (λ = 0.979 Å) on beamline ID23-1 at the ESRF. Crystals of this system (20–50 µm in the largest dimension) diffract rather poorly; therefore, in order to increase the data multiplicity to allow a more accurate determination of anomalous differences, six different sample holders were used in this experiment. The initial mesh scans produced heat maps (Fig. 7a) used to direct the collection of 137 partial data sets, 122 of which could be automatically processed and 45 of which were merged to produce a final data set to dmin = 3.46 Å after HCA (Table 1, Figs. 7b and 7c). Structure solution (Figs. 7d and 7e) was carried out using the SAD technique as implemented in the CRANK2 pipeline (Skubák & Pannu, 2013).
4. Discussion
4.1. General comments
The method that we describe here, while similar to the multi-crystal data-collection methods for samples mounted in micro-meshes described previously (Soares et al., 2014), presents fundamental differences. Notably, a very low X-ray dose pre-screening of a sample mount is used to both identify the positions of crystals contained on the sample mount and to rank the diffraction characteristics of the crystals in order to create a priority for the subsequent automatic collection of partial data sets, and a HCA protocol is used to choose which partial data sets to merge to produce the best final data set. Moreover, when the sample holder contains a series of crystals much larger than the X-ray beam the method also automates the type of multi-crystal/multi-position data collection (Riekel et al., 2005) that has become essential in the structural study of G protein-coupled receptors (GPCRs; Rasmussen et al., 2011; Hollenstein et al., 2013; Lebon et al., 2011). Furthermore, for crystals larger than the X-ray beam the rapid online analysis and ranking of diffraction characteristics using DOZOR (§2.1) also provides diffraction cartographs (Bowler et al., 2010) of the crystals contained on the sample mount, ensuring that partial data sets are only collected from well diffracting areas of any given crystal.
To demonstrate the general applicability of the workflow described here, we have applied it to various systems and scenarios in which many crystals of the same type are mounted on the same cryocooled sample holder. In all of the cases presented our workflow has yielded data sets that are fit for purpose (Table 1, §4.2). As might be expected (Fry et al., 1996), the protocol described here is particularly amenable to systems (i.e. thaumatin, bacteriorhodopsin, Maelstrom) that crystallize in high-symmetry space groups. However, our experiments using monoclinic crystals of lysozyme show that the method can also be applied to low-symmetry systems. Furthermore, as the monoclinic form of lysozyme crystallized as clumps of intergrown crystals (Fig. 5a), the success of this latter experiment demonstrates that the protocol developed also automates the collection of diffraction data using mini-focus or micro-focus X-ray beams under conditions where mounting single crystals of a particular sample may prove to be difficult or impossible.
It is worth noting that the data set obtained for monoclinic lysozyme following the HCA-directed merging of the partial data sets collected is rather incomplete (21 of 40 automatically processed partial data sets merged, 85% completeness; Table 1). However, this is not the result of low-symmetry crystals lying in preferred orientations in the sample holder. Indeed, merging 39 of the 40 automatically processed partial data sets greatly improves the completeness (Fig. 5g). However, the quality of the resulting data set is seriously degraded compared with that obtained by merging only partial data sets in the main HCA cluster (Fig. 5g). Moreover, in contrast to what is observed following HCA-directed merging, the resulting difference electron density does not allow the proper identification of nitrate ions bound to the protein (Figs. 5f and 5h). It is thus clear that HCA is an indispensable tool for the proper merging of partial data sets. Nevertheless, the fact that the merged data set for monoclinic lysozyme obtained following HCA is somewhat incomplete suggests, for some low-symmetry systems at least, that data collection from samples in two loops with different orientations in the X-ray beam may be required to ensure a fully complete, high-quality data set.
The examples that we present include the compilation of complete diffraction data from partial data sets collected from a series of microcrystals (∼5 µm in the largest dimension), contained on the same sample holder, of a membrane protein (bacteriorhodopsin) grown in a lipidic mesophase. Such mesophases are very important media for the growth of membrane-protein crystals (Gordeliy et al., 2003), but are often opaque in nature, particularly when cooled. It can thus be challenging to identify, mount and centre in the X-ray beam small crystals produced in such media. That the workflow described here uses diffraction-based methods to identify the positions of crystals in a sample holder is clearly a major advantage in such cases as it obviates such problems, particularly when entire crystallization drops are harvested, by automating the collection of partial data sets from multiple crystals.
4.2. Structure solution and refinement
4.2.1. Diffraction data for structure solution by molecular replacement
The examples of bacteriorhodopsin (BR1 and BR2), thaumatin and monoclinic lysozyme described above clearly show that the protocol that we have developed yields, even for very small crystals, complete diffraction data sets that allow structure solution by molecular replacement (MR). Moreover, despite the fact that all data sets were obtained by the merging of multiple partial data sets, electron-density (2mFobs − DFcalc, αcalc) and difference density (mFobs − DFcalc, αcalc) maps calculated during structure refinement clearly allow the identification of moieties not included in the MR search models: retinal (BR1 and BR2; Figs. 2f and 3f), tartrate (thaumatin; Fig. 4f) and NO₃⁻ (monoclinic lysozyme; Fig. 5f). This suggests that the method developed may, in the future, have a significant role to play in projects aimed at fragment screening (Murray & Blundell, 2010) as an aid in drug design. Traditionally, such projects are based around the production of relatively large, robust crystals for use in soaking experiments (Oster et al., 2015). However, the results presented here show that this clearly does not need to be the case and that complete, high-quality data sets could straightforwardly be compiled from a series of smaller crystals mounted on the same sample holder. Moreover, as evidence suggests that smaller crystals require reduced fragment/ligand-soaking times to obtain the same occupancy of the fragment/ligand in crystal structures (Cole et al., 2014), microcrystal-based fragment screening experiments may well become the norm, with soaking times based on the largest crystal contained in the crystallization drop ensuring the maximum occupancy of ligands/fragments in all of the crystals mounted on a single sample loop.
4.2.2. Diffraction data for structure solution exploiting anomalous scattering
In order to demonstrate the possibilities of the workflow presented here to produce data suitable for experimental phasing techniques that exploit anomalous scattering, two different systems were investigated. The first of these, thermolysin, contains one catalytic Zn²⁺ ion and three Ca²⁺ ions per protein chain (316 residues), producing a theoretical anomalous diffraction ratio (〈ΔF/F〉) of ∼2% for data collected at the peak of the Zn K absorption edge. The second, the selenomethionyl derivative of the MAEL domain of B. mori Maelstrom (Chen et al., 2015), produces a theoretical anomalous diffraction ratio of 4.0% for data collected at the peak of the Se K absorption edge. However, the crystals of this system diffract rather poorly (see Table 1). The collection of data of sufficiently high quality for the structure solution of both systems is thus clearly challenging, even from single crystals. Nevertheless, as can be seen in Figs. 6 and 7, for both systems our multi-crystal workflow clearly yields diffraction data of sufficient quality for structure solution. As might be expected, a high data multiplicity was important in both cases (Table 1) and to achieve this for Maelstrom required combining partial data sets from crystals mounted on six different sample holders (Fig. 7a).
4.3. Perspectives
We have developed an automatic procedure to locate, rank the diffraction characteristics of and collect partial data sets from large numbers of crystals contained on the same sample holder. Subsequent HCA of the partial data sets collected then allows the choice of which partial data sets to merge to produce a final data set for downstream structure solution and refinement. Compared with previously presented SSX protocols (Gati et al., 2014; Nogly et al., 2015; Stellato et al., 2014), MeshAndCollect has several advantages, notably that small but contiguous data sets can, if desired, be collected from all crystals contained on the sample holder. Crystal wastage is thus not an issue, data reduction from raw diffraction images to structure factors and standard deviations is comparatively straightforward and the quality of the final data set is improved. Moreover, the experiments described in §3 clearly demonstrate the capability of DOZOR to detect diffraction signal in low-dose two-dimensional mesh scans even for the smallest crystals (BR2; §3.1) studied in this work, which had an average volume of ∼50 µm³.
When starting this work, we presumed that cryocooled crystals contained on the same loop would be relatively isomorphous as all crystals are from the same crystallization drop and subject to similar handling during mounting and cryocooling (Giordano et al., 2012). The dendrograms shown in Figs. 2, 3, 4, 5 and 6 suggest that this is the case, although in several of our examples many of the partial data sets collected are not used to construct the final result. Most of the above dendrograms contain one main cluster with high mutual correlation coefficients and a continuum of data sets with decreasing correlation to the main cluster. Such a pattern is indicative of strongly varying data quality between partial data sets rather than crystal non-isomorphism and suggests that some partial data sets were collected from positions with overlapping crystal lattices or other issues such as crystal damage. Clearly, the evaluation of initial two-dimensional mesh scans with DOZOR did not filter such positions out. Furthermore, with only a 10° rotation range measured at each position it is difficult to detect such problematic data sets on the basis of their internal processing statistics, and HCA is required to filter out the worst partial data sets. In the case of Maelstrom, where partial data sets were measured from crystals on several different sample mounts, the dendrogram (Fig. 7) shows well populated clusters above a cutoff of dist(i, j) = 0.15 and a continuum of poorly correlated data sets below this cutoff. This suggests that both non-isomorphism and variation in data quality between partial data sets are present. However, as can be seen, both poor-quality and non-isomorphous partial data sets are successfully filtered by the HCA procedure.
Despite the success of the experiments described above, the procedure developed will eventually be improved in many areas. Here, all samples were mounted and cryocooled manually, and it may be that better results can be achieved by taking advantage of robotic crystal-handling methods both for the removal of mother liquor from the crystallization drop and the mounting and cryocooling of crystals in a suitable sample holder (Cipriani et al., 2012). Moreover, for the different experiments described here the total absorbed doses per crystal (Table 1; calculated post-experiment using RADDOSE; Paithankar & Garman, 2010) are rather low compared with the Henderson/Garman limits (Henderson, 1990; Owen et al., 2006) generally used in diffraction data collection from cryocooled single crystals of macromolecules. In future versions of the pipeline presented here, following the low-dose two-dimensional mesh scan the optimum total exposure time per crystal (partial data set) will be calculated before the data-collection step using the EDNA characterization software (Bourenkov & Popov, 2010; Incardona et al., 2009), the result being better quality and/or higher resolution data collected per crystal. For crystals that are highly radiation-sensitive one might even imagine the use of a `Burn Strategy' workflow (Leal et al., 2011) to provide a precise estimation of the maximum allowable total absorbed dose per crystal.
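The margin between the doses used here and the commonly quoted limits can be made concrete with simple dose-budget arithmetic. The sketch below is not the EDNA strategy calculation referred to above; it only divides an allowed dose by a dose rate. The limits of 20 MGy (Henderson, 1990) and 30 MGy (Owen et al., 2006) are the commonly quoted values, whereas the dose rate, the oscillation width per image and the function names are hypothetical values chosen for illustration.

```python
HENDERSON_LIMIT_MGY = 20.0  # commonly quoted limit from Henderson (1990)
GARMAN_LIMIT_MGY = 30.0     # commonly quoted limit from Owen et al. (2006)


def max_exposure_time(dose_rate_mgy_per_s, dose_limit_mgy=GARMAN_LIMIT_MGY,
                      fraction=1.0):
    """Total exposure time (s) that deposits `fraction` of the dose limit."""
    if dose_rate_mgy_per_s <= 0:
        raise ValueError("dose rate must be positive")
    return fraction * dose_limit_mgy / dose_rate_mgy_per_s


def exposure_per_image(total_time_s, rotation_range_deg=10.0,
                       oscillation_deg=0.1):
    """Split a total exposure time evenly over the images of one wedge."""
    n_images = round(rotation_range_deg / oscillation_deg)
    return total_time_s / n_images, n_images


# Hypothetical dose rate of 0.5 MGy/s, using half of the 30 MGy limit
# for one 10-degree partial data set.
total_s = max_exposure_time(0.5, fraction=0.5)
per_image_s, n_images = exposure_per_image(total_s)
print(total_s, n_images, per_image_s)
```

In practice the dose rate would come from a program such as RADDOSE and the optimum exposure from a full strategy calculation, so the numbers above serve only to show the arithmetic.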
As the EDNA procedure implies the indexing of diffraction patterns (Incardona et al., 2009), for crystals larger than the X-ray beam a comparison of orientation matrices will allow either the pre-clustering of partial data sets collected from different points on the same crystal or the measurement of crystal size and alignment in the sample holder. In the latter case this information could be used to automatically guide helical data collections (Flot et al., 2010; de Sanctis et al., 2012) that, provided that diffraction is homogeneous, may allow the collection of complete data sets from each of the different crystals contained in the sample holder. For crystals of a similar size to or smaller than the X-ray beam, prior knowledge of the crystal orientation in the X-ray beam will allow a broader range of experiments than is currently the case. In particular, the order of the collection of partial data sets could be constructed to ensure the compilation of a complete data set when only a few crystals are available or to ensure the collection of as highly redundant data as possible. Finally, for sample mounts containing many small, robust, well diffracting crystals one can also imagine a modification to the pipeline in which complete diffraction data sets for structure solution and subsequent refinement are collected from all crystals contained in the sample holder. Separating such data sets into different clusters would result in ensembles of crystal structures for each target.
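One way in which known crystal orientations could be used to order the collection of partial data sets is a greedy selection that always picks the crystal adding the most new unique reflections. The following is a minimal sketch of that idea and is not part of the pipeline described here. It assumes that the set of unique reflections each 10° wedge would record has already been predicted from the orientation matrix; these predicted sets, the crystal identifiers and the function name are hypothetical.

```python
def greedy_collection_order(predicted, n_unique):
    """Order crystals so that completeness grows as fast as possible.

    predicted maps a crystal identifier to the set of unique reflection
    indices its wedge is expected to record. n_unique is the total number
    of unique reflections. Returns (crystal, cumulative completeness) pairs;
    crystals that would add nothing new are left out.
    """
    covered = set()
    remaining = dict(predicted)
    order = []
    while remaining:
        # Ties are broken alphabetically because max() keeps the first maximum.
        best = max(sorted(remaining), key=lambda cid: len(remaining[cid] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining crystal adds new reflections
        covered |= gain
        del remaining[best]
        order.append((best, len(covered) / n_unique))
    return order


# Hypothetical predictions for four crystals and ten unique reflections.
example_predicted = {
    "xtal_A": {0, 1, 2, 3},
    "xtal_B": {2, 3, 4, 5, 6},
    "xtal_C": {0, 1, 2},
    "xtal_D": {7, 8, 9},
}
example_order = greedy_collection_order(example_predicted, n_unique=10)
for crystal, completeness in example_order:
    print(crystal, completeness)
```

Here three of the four crystals suffice for a complete data set. A variant aimed at high redundancy would instead keep collecting after completeness reaches 100%.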
Once data collection and processing have been completed, a final improvement to the pipeline is in the choice of partial data sets to merge to produce a final data set. This choice clearly depends on the aim of the experiment in hand (i.e. structure solution by molecular replacement, de novo structure solution using SAD etc.), and in principle is best made using HCA based on CCI(i, j) (§2; Giordano et al., 2012). However, for partial data sets from low-symmetry crystals the number of common unique reflections for each pair of data sets may be low, thus leading to artefacts, and a better approach may be to combine HCA with the type of `scale-and-merge' algorithms currently implemented in the PHENIX package (Adams et al., 2010; https://www.phenix-online.org/version_docs/dev-1977/reference/scale_and_merge.html) or recently described for other SSX protocols (Huang et al., 2015).
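The problem of too few common reflections can be made explicit when the pairwise correlation coefficient is computed. The sketch below calculates a Pearson correlation coefficient on the intensities of reflections shared by two partial data sets and declines to report a value when fewer than a chosen minimum are shared. It is an illustration only: the data layout (a dictionary from Miller indices to merged intensities), the threshold of ten reflections and the function name are assumptions, and the sketch does not reproduce the exact definition of CCI(i, j) given in §2.

```python
import math


def pairwise_cc(set_a, set_b, min_common=10):
    """Pearson CC of intensities on reflections common to two data sets.

    Each data set maps a Miller index (h, k, l) to a merged intensity.
    Returns (cc, n_common); cc is None when fewer than min_common
    reflections are shared or when either set has zero variance.
    """
    common = sorted(set(set_a) & set(set_b))
    n = len(common)
    if n < min_common:
        return None, n
    xs = [set_a[hkl] for hkl in common]
    ys = [set_b[hkl] for hkl in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return None, n
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical intensities: sets a and b share ten reflections on a
# different scale; sets a and c share only three.
data_a = {(h, 0, 0): 10.0 * h for h in range(1, 21)}
data_b = {(h, 0, 0): 20.0 * h + 1.0 for h in range(11, 31)}
data_c = {(h, 0, 0): 5.0 * h for h in range(18, 40)}

cc_ab, n_ab = pairwise_cc(data_a, data_b)
cc_ac, n_ac = pairwise_cc(data_a, data_c)
print(cc_ab, n_ab)
print(cc_ac, n_ac)
```

Pairs for which no coefficient is returned could be excluded from the clustering or handled by a scale-and-merge approach rather than being assigned an unreliable distance.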
5. Conclusions
We have presented here a pipeline for the routine collection of partial diffraction data sets from many randomly oriented crystals of the same biological macromolecule contained in a single cryocooled sample holder. The major advantages of the pipeline developed are (i) that it can be applied to crystals mounted in almost any available sample holder suitable for cryocooling, thus rendering the methodology available to the widest possible range of potential users, (ii) that the positions of all well diffracting crystals are determined and that their diffraction strength is ranked prior to data collection, (iii) that small, but contiguous, partial data sets are collected from as many crystals contained in the sample holder as is desired and (iv) that HCA is used to choose partial data sets for merging to produce the best possible data set for downstream analysis and structure solution. As described above, the protocol developed can be applied to both SSX-type experiments involving microcrystals and to multi-position data collection from crystals larger than the X-ray beam size. The results presented here suggest that the method developed will be useful in all areas of macromolecular crystallography, including the compilation of a complete data set from many very small crystals (∼5 µm in the largest dimension), in exploiting and in projects aimed at rational drug design.
While we have confined our experiments to crystals mounted on cryocooled sample holders, there is no reason, providing that the increased radiation damage is taken into account, that the automated screening and data-collection procedure developed cannot also be applied at room temperature, particularly in experiments that involve in situ screening and data collection (Axford et al., 2012; Jacquamet et al., 2004; le Maire et al., 2011; Huang et al., 2015). Moreover, MeshAndCollect should also be extendable to structure solution based on radiation damage-induced phasing (RIP; Ravelli et al., 2003; de Sanctis & Nanao, 2012) or SAD experiments based on inverse-beam protocols (González, 2003).
Footnotes
1 The setup of the goniometers on the ESRF beamlines on which our experiments were performed means that, once its position has been identified, each crystal is moved into the X-ray beam. Here, the movement of the crystal is via two motors (sampx and sampy) and the rotation axis of the goniometer is not displaced. This movement ensures the correct vertical position of the crystal but leaves open the possibility that the crystal will be misaligned in the direction of the X-ray beam. A misalignment of 10 µm in this direction will result in a misplacement of less than 1 µm over a ±5° rotation. Thus, provided that the beam is larger than 1 µm in size the crystal will remain in the X-ray beam during the data collection.
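The figure of less than 1 µm quoted in this footnote follows from elementary geometry and can be checked with a short calculation. The sketch assumes a crystal offset from the rotation axis purely along the beam direction and a rotation axis perpendicular to the beam; the function name is hypothetical.

```python
import math


def misalignment_shift(offset_um, half_range_deg):
    """Displacement of a crystal offset along the beam after rotation.

    A point at distance offset_um from the rotation axis, measured along
    the beam, moves by offset*sin(theta) perpendicular to the beam and by
    offset*(1 - cos(theta)) along the beam when rotated by theta.
    """
    theta = math.radians(half_range_deg)
    transverse_um = offset_um * math.sin(theta)
    along_beam_um = offset_um * (1.0 - math.cos(theta))
    return transverse_um, along_beam_um


transverse_um, along_beam_um = misalignment_shift(offset_um=10.0, half_range_deg=5.0)
print(round(transverse_um, 2), round(along_beam_um, 3))  # 0.87 0.038
```

For a 10 µm offset and a rotation of 5° either side of the starting angle, the transverse displacement is about 0.87 µm, consistent with the statement above.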
Acknowledgements
The authors would like to gratefully acknowledge the allocation of beamtime (`SSX BAG') for this and other ongoing SSX projects at ESRF by the ESRF MX Beamtime Allocation Panel.
References
Adams, P. D. et al. (2010). Acta Cryst. D66, 213–221.
Axford, D. et al. (2012). Acta Cryst. D68, 592–600.
Barty, A., Kirian, R. A., Maia, F. R. N. C., Hantke, M., Yoon, C. H., White, T. A. & Chapman, H. (2014). J. Appl. Cryst. 47, 1118–1131.
Borshchevskiy, V. I., Round, E. S., Popov, A. N., Büldt, G. & Gordeliy, V. I. (2011). J. Mol. Biol. 409, 813–825.
Botha, S., Nass, K., Barends, T. R. M., Kabsch, W., Latz, B., Dworkowski, F., Foucar, L., Panepucci, E., Wang, M., Shoeman, R. L., Schlichting, I. & Doak, R. B. (2015). Acta Cryst. D71, 387–397.
Bourenkov, G. P. & Popov, A. N. (2006). Acta Cryst. D62, 58–64.
Bourenkov, G. P. & Popov, A. N. (2010). Acta Cryst. D66, 409–419.
Boutet, S. et al. (2012). Science, 337, 362–364.
Bowler, M. W., Guijarro, M., Petitdemange, S., Baker, I., Svensson, O., Burghammer, M., Mueller-Dieckmann, C., Gordon, E. J., Flot, D., McSweeney, S. M. & Leonard, G. A. (2010). Acta Cryst. D66, 855–864.
Brehm, W. & Diederichs, K. (2014). Acta Cryst. D70, 101–109.
Brockhauser, S., Ravelli, R. B. G. & McCarthy, A. A. (2013). Acta Cryst. D69, 1241–1251.
Brockhauser, S., Svensson, O., Bowler, M. W., Nanao, M., Gordon, E., Leal, R. M. F., Popov, A., Gerring, M., McCarthy, A. A. & Gotz, A. (2012). Acta Cryst. D68, 975–984.
Chapman, H. N. et al. (2011). Nature (London), 470, 73–77.
Chen, K. M., Campbell, E., Pandey, R. R., Yang, Z., McCarthy, A. A. & Pillai, R. S. (2015). RNA, 21, 833–839.
Cipriani, F., Röwer, M., Landret, C., Zander, U., Felisaz, F. & Márquez, J. A. (2012). Acta Cryst. D68, 1393–1399.
Cole, K. et al. (2014). PLoS One, 9, e101036.
Coquelle, N., Brewster, A. S., Kapp, U., Shilova, A., Weinhausen, B., Burghammer, M. & Colletier, J.-P. (2015). Acta Cryst. D71, 1184–1196.
Dauter, Z., Dauter, M. & Dodson, E. J. (2002). Acta Cryst. D58, 494–506.
Delagenière, S. et al. (2011). Bioinformatics, 27, 3186–3192.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Evans, P. R. & Murshudov, G. N. (2013). Acta Cryst. D69, 1204–1214.
Flot, D., Mairs, T., Giraud, T., Guijarro, M., Lesourd, M., Rey, V., van Brussel, D., Morawe, C., Borel, C., Hignette, O., Chavanne, J., Nurizzo, D., McSweeney, S. & Mitchell, E. (2010). J. Synchrotron Rad. 17, 107–118.
Fry, E., Logan, D. & Stuart, D. (1996). Crystallographic Methods and Protocols, edited by C. Jones, B. Mulloy & M. Sanderson, pp. 319–363. Totowa: Humana Press.
Gabadinho, J. et al. (2010). J. Synchrotron Rad. 17, 700–707.
Gati, C., Bourenkov, G., Klinge, M., Rehders, D., Stellato, F., Oberthür, D., Yefanov, O., Sommer, B. P., Mogk, S., Duszenko, M., Betzel, C., Schneider, T. R., Chapman, H. N. & Redecke, L. (2014). IUCrJ, 1, 87–94.
Giordano, R., Leal, R. M. F., Bourenkov, G. P., McSweeney, S. & Popov, A. N. (2012). Acta Cryst. D68, 649–658.
González, A. (2003). Acta Cryst. D59, 1935–1942.
Gordeliy, V. I., Schlesinger, R., Efremov, R., Büldt, G. & Heberle, J. (2003). Methods Mol. Biol. 228, 305–316.
Henderson, R. (1990). Proc. R. Soc. B Biol. Sci. 241, 6–8.
Hollenstein, K., Kean, J., Bortolato, A., Cheng, R. K., Doré, A. S., Jazayeri, A., Cooke, R. M., Weir, M. & Marshall, F. H. (2013). Nature (London), 499, 438–443.
Hope, H. (1988). Acta Cryst. B44, 22–26.
Huang, C.-Y., Olieric, V., Ma, P., Panepucci, E., Diederichs, K., Wang, M. & Caffrey, M. (2015). Acta Cryst. D71, 1238–1256.
Incardona, M.-F., Bourenkov, G. P., Levik, K., Pieritz, R. A., Popov, A. N. & Svensson, O. (2009). J. Synchrotron Rad. 16, 872–879.
Jacquamet, L., Ohana, J., Joly, J., Borel, F., Pirocchi, M., Charrault, P., Bertoni, A., Israel-Gouy, P., Carpentier, P., Kozielski, F., Blot, D. & Ferrer, J.-L. (2004). Structure, 12, 1219–1225.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Kendrew, J. C., Dickerson, R. E., Strandberg, B. E., Hart, R. G., Davies, D. R., Phillips, D. C. & Shore, V. C. (1960). Nature (London), 185, 422–427.
Leal, R. M. F., Bourenkov, G. P., Svensson, O., Spruce, D., Guijarro, M. & Popov, A. N. (2011). J. Synchrotron Rad. 18, 381–386.
Lebon, G., Warne, T., Edwards, P. C., Bennett, K., Langmead, C. J., Leslie, A. G. W. & Tate, C. G. (2011). Nature (London), 474, 521–525.
Maire, A. le, Gelin, M., Pochet, S., Hoh, F., Pirocchi, M., Guichou, J.-F., Ferrer, J.-L. & Labesse, G. (2011). Acta Cryst. D67, 747–755.
Monaco, S., Gordon, E., Bowler, M. W., Delagenière, S., Guijarro, M., Spruce, D., Svensson, O., McSweeney, S. M., McCarthy, A. A., Leonard, G. & Nanao, M. H. (2013). J. Appl. Cryst. 46, 804–810.
Mueller-Dieckmann, C., Kauffmann, B. & Weiss, M. S. (2011). J. Appl. Cryst. 44, 433–436.
Murray, C. W. & Blundell, T. L. (2010). Curr. Opin. Struct. Biol. 20, 497–507.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Neutze, R., Wouts, R., van der Spoel, D., Weckert, E. & Hajdu, J. (2000). Nature (London), 406, 752–757.
Nogly, P. et al. (2015). IUCrJ, 2, 168–176.
Nurizzo, D., Mairs, T., Guijarro, M., Rey, V., Meyer, J., Fajardo, P., Chavanne, J., Biasci, J.-C., McSweeney, S. & Mitchell, E. (2006). J. Synchrotron Rad. 13, 227–238.
Oster, L., Tapani, S., Xue, Y. & Kack, H. (2015). Drug Discov. Today, 20, 1104–1411.
Owen, R. L., Rudiño-Piñera, E. & Garman, E. F. (2006). Proc. Natl Acad. Sci. USA, 103, 4912–4917.
Paithankar, K. S. & Garman, E. F. (2010). Acta Cryst. D66, 381–388.
Pape, T. & Schneider, T. R. (2004). J. Appl. Cryst. 37, 843–844.
Rasmussen, S. G. F. et al. (2011). Nature (London), 477, 549–555.
Ravelli, R. B. G., Leiros, H.-K. S., Pan, B., Caffrey, M. & McSweeney, S. (2003). Structure, 11, 217–224.
Riekel, C., Burghammer, M. & Schertler, G. (2005). Curr. Opin. Struct. Biol. 15, 556–562.
Sanctis, D. de et al. (2012). J. Synchrotron Rad. 19, 455–461.
Sanctis, D. de & Leonard, G. (2014). Not. Neutroni Luce Sincrotrone, 19, 24–26.
Sanctis, D. de & Nanao, M. H. (2012). Acta Cryst. D68, 1152–1162.
Sauter, N. K., Hattne, J., Grosse-Kunstleve, R. W. & Echols, N. (2013). Acta Cryst. D69, 1274–1282.
Sawaya, M. R. et al. (2014). Proc. Natl Acad. Sci. USA, 111, 12769–12774.
Sheldrick, G. M. (2008). Acta Cryst. A64, 112–122.
Skubák, P. & Pannu, N. S. (2013). Nature Commun. 4, 2777.
Soares, A. S., Mullen, J. D., Parekh, R. M., McCarthy, G. S., Roessler, C. G., Jackimowicz, R., Skinner, J. M., Orville, A. M., Allaire, M. & Sweet, R. M. (2014). J. Synchrotron Rad. 21, 1231–1239.
Stellato, F. et al. (2014). IUCrJ, 1, 204–212.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
White, T. A., Barty, A., Stellato, F., Holton, J. M., Kirian, R. A., Zatsepin, N. A. & Chapman, H. N. (2013). Acta Cryst. D69, 1231–1240.
White, T. A., Kirian, R. A., Martin, A. V., Aquila, A., Nass, K., Barty, A. & Chapman, H. N. (2012). J. Appl. Cryst. 45, 335–341.
Yeates, T. O. (1997). Method Enzymol. 276, 344–358.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.