Serial femtosecond crystallography of soluble proteins in lipidic cubic phase

A new approach of using lipidic cubic phase as a carrier matrix for delivering soluble protein microcrystals for serial crystallography helps to dramatically reduce protein consumption. The structures of two soluble test proteins have been determined by this method using less than 0.1 mg of each protein.


Introduction
The recent advent of hard X-ray free-electron lasers (XFELs) has opened up many exciting opportunities in structural biology (Feld & Frank, 2014). Intense XFEL pulses of extremely short duration (<50 fs) make it possible to outrun radiation damage, as predicted by Neutze et al. (2000) and as later demonstrated by Chapman et al. (2006), enabling structure determination from tiny crystals at room temperature. Data are usually acquired using a serial femtosecond crystallography (SFX) approach, in which a continuous stream of microcrystals is intersected with the XFEL beam and diffraction patterns from individual crystals at random orientations are recorded at the pulse repetition rate of the laser. Following the automated indexing and merging of thousands of diffraction patterns, a Monte Carlo method is used to integrate over the angular profile of the Bragg reflections and other stochastic fluctuations (Kirian et al., 2011). SFX has been successfully applied to both soluble proteins (Boutet et al., 2012; Redecke et al., 2013; Sawaya et al., 2014) and membrane proteins (Johansson et al., 2012; Kern et al., 2013; Johansson et al., 2013) using crystals ranging in size from nanometres to micrometres. These were formed in either aqueous solution or lipidic sponge phase and continuously delivered to the XFEL beam through a fast-running liquid microjet produced by a gas dynamic virtual nozzle (GDVN) injector (DePonte et al., 2008). Initial proof-of-concept experiments have shown the potential of this technology for structural studies of proteins that are refractory to the growth of sufficiently large, high-quality crystals suitable for data collection at synchrotron sources, as well as for time-resolved studies of unstable intermediate states and irreversible processes.
This technique, however, requires very large amounts (10-100 mg) of crystalline material for data collection, most of which runs to waste between shots, limiting its use to only well expressed and well behaved proteins. To address this issue, we have previously introduced an LCP-SFX method (Liu et al., 2013, 2014; Weierstall et al., 2014) in which membrane proteins are crystallized and delivered for data collection inside a gel-like lipidic cubic phase (LCP). The specific texture and high viscosity of the LCP medium allow a reduced flow rate compared with the GDVN injector, so that the crystals are used more efficiently. Using LCP as a carrier medium thus greatly reduces the amount of protein required to collect a complete set of diffraction data.
Apart from membrane proteins, which can be crystallized in LCP and are amenable to LCP-SFX, there are many important soluble macromolecules that represent challenging crystallization targets (Garman, 2014). These include large protein complexes, protein-DNA and protein-RNA complexes and proteins with dynamic domains. Quite often, limited quantities of sub-10 µm crystals of such macromolecules are available; however, obtaining large crystals suitable for data collection at synchrotrons or producing large quantities of microcrystals for SFX using a liquid injector may represent formidable obstacles. Here, we have modified our LCP-SFX protocol to demonstrate that LCP can serve as a suitable medium for the efficient delivery of soluble protein crystals for crystallographic data collection at XFELs.
The suspension of lysozyme microcrystals was centrifuged at 500g for 5 min. The supernatant (precipitant; around 980 µl) was carefully removed and the remaining 40 µl of lysozyme microcrystal/precipitant solution was used for LCP preparations.

Phycocyanin microcrystal sample
Thermosynechococcus elongatus cells were preprocessed with a microfluidizer to break the cell walls. This was followed by a series of centrifugation cycles to isolate the thylakoid membrane. Phycobiliproteins such as phycocyanin (PC) and allophycocyanin (APC) were isolated by ultracentrifugation of the supernatant obtained after microfluidizer treatment. Cell debris and larger particles were spun down at 50 000g for 1 h. The supernatant was concentrated using Centricon spin filters with a molecular-weight cutoff of 100 kDa to obtain concentrated protein. PC microcrystals were produced by the free-interface diffusion method (Kupitz et al., 2014) at a starting concentration of 50 mg ml⁻¹ and using 75 mM HEPES pH 7, 20 mM MgCl₂, 17% PEG 3350 as the precipitant. The final concentrations of protein and PEG 3350 were half of the starting values. The procedure of LCP sample preparation was identical to that described above. PC microcrystals with average dimensions of 10 × 10 × 5 µm were grown at 4 °C over 9-14 h and pooled together before mixing with LCP as described above for the lysozyme samples.

XFEL data collection and treatment
Experiments were performed using the CXI instrument (Boutet & Williams, 2010) at the Linac Coherent Light Source (LCLS) at SLAC National Accelerator Laboratory. LCLS was operated at a wavelength of 1.56 Å (7.95 keV), delivering individual X-ray pulses of nominally 35 fs pulse duration. Protein microcrystals in LCP medium were injected at an average flow rate of 170 nl min⁻¹ into the XFEL beam focus region inside a vacuum chamber using an LCP injector with a 50 µm diameter nozzle. Single-shot diffraction patterns of randomly oriented crystals were recorded at 120 Hz with a Cornell-SLAC Pixel Array Detector (CSPAD) positioned at a distance of 100 mm from the sample (Hart et al., 2012).
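The beam and injection parameters quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the constant is the standard value of hc in keV·Å, and the per-pulse volume assumes that the LCP is extruded continuously at the stated average flow rate.

```python
# Physical constant: h*c expressed in keV·Å (standard value, rounded).
HC_KEV_ANGSTROM = 12.398419843


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength given in Å (E = hc / lambda)."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def volume_per_pulse_pl(flow_nl_per_min: float, rep_rate_hz: float) -> float:
    """Sample volume (picolitres) extruded between consecutive X-ray pulses.

    Assumes a constant flow rate (hypothetical simplification).
    """
    if rep_rate_hz <= 0:
        raise ValueError("repetition rate must be positive")
    pulses_per_min = rep_rate_hz * 60.0
    return flow_nl_per_min / pulses_per_min * 1000.0  # nl -> pl


# Values taken from the text: 1.56 Å, 170 nl/min, 120 Hz.
print(f"{photon_energy_kev(1.56):.2f} keV")
print(f"{volume_per_pulse_pl(170.0, 120.0):.1f} pl per pulse")
```

Under these assumptions, only a few tens of picolitres of LCP pass the interaction region between pulses, which illustrates why the slow LCP stream wastes so little sample.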
In the case of lysozyme, a total of 299 569 images were collected within 45 min, of which 119 844 were identified as crystal diffraction patterns by Cheetah, corresponding to an average hit rate of 40%. In the case of PC, a total of 287 520 images were collected within 40 min, of which 18 794 were identified as crystal hits (average hit rate of 6.5%). The peak-detection parameters and the experimental geometry were optimized to ensure the best quality of peak finding and indexing. Autoindexing and structure-factor integration of the crystal hits were performed using CrystFEL (White et al., 2012), which involved the application of fast Fourier transform (FFT)-based autoindexing algorithms, MOSFLM (Leslie, 2006), DirAx (Duisenberg, 1992) and XDS (Kabsch, 2010) followed by averaging and integration of Bragg peaks using a Monte Carlo algorithm. The final statistics of the quality of the data sets are summarized in Tables 1 and 2. The maximal radiation dose per crystal was estimated using RADDOSE (Paithankar et al., 2009). σ(I) values were estimated as the standard deviations of the means of the intensity measurements (White et al., 2012). The appropriate resolution cutoff was based on the behavior of the Pearson correlation coefficient CC1/2 versus resolution, and on the improvements in the Rwork/Rfree values after including higher resolution shells in refinement (Karplus & Diederichs, 2012). After integration with CrystFEL, the initial phases were obtained by molecular replacement using known structures of the protein from the PDB (PDB entries 4et8 for lysozyme and 3l0f for PC; Boutet et al., 2012; R. Fromme, D. Brune & P. Fromme, unpublished work) and the structures were refined using phenix.refine (Afonine et al., 2012), including several simulated-annealing cycles in order to reduce phase bias. Structure images were prepared using PyMOL (Schrödinger).
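The statistics named in this paragraph (hit rate, Monte Carlo merged intensity with its error estimate, and CC1/2) can be illustrated with a short sketch. This is an illustration under stated assumptions and not the CrystFEL implementation: the function names are hypothetical, the data layout (a dictionary mapping Miller indices to lists of observed intensities) is chosen only for this example, and the half data sets are formed by a simple alternating split so that the result is deterministic.

```python
import math
import statistics


def hit_rate(n_hits: int, n_images: int) -> float:
    """Fraction of recorded images identified as crystal hits."""
    return n_hits / n_images


def merge_reflection(observations):
    """Monte Carlo merge of one reflection.

    Returns (mean intensity, sigma), where sigma is the standard
    deviation of the mean of the individual measurements.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations to estimate sigma")
    mean = statistics.fmean(observations)
    sigma = statistics.stdev(observations) / math.sqrt(n)
    return mean, sigma


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations_by_hkl):
    """CC1/2: correlation between merged intensities of two half data sets.

    The alternating split used here is an assumption of this sketch.
    """
    half1, half2 = [], []
    for obs in observations_by_hkl.values():
        if len(obs) < 2:
            continue  # reflection cannot be split into two halves
        half1.append(statistics.fmean(obs[0::2]))
        half2.append(statistics.fmean(obs[1::2]))
    return pearson(half1, half2)


# Image counts taken from the text.
print(f"lysozyme hit rate: {hit_rate(119844, 299569):.1%}")
print(f"PC hit rate:       {hit_rate(18794, 287520):.1%}")
```

Because sigma scales as the inverse square root of the number of observations, the error of each merged intensity shrinks only slowly as more patterns are added, which is why SFX data sets rely on high multiplicity.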
The final coordinates and structure factors were deposited in the PDB under accession codes 4zix (lysozyme) and 4ziz (PC).

Results
We have collected full XFEL diffraction data sets from two soluble proteins as model systems: the small, 14.3 kDa, chicken egg-white lysozyme and the relatively large, heterohexameric, 120 kDa, phycocyanin disk-like complex involved in light harvesting in photosynthesis as part of a phycobilisome in cyanobacteria. The proteins were first crystallized in their corresponding crystallization buffers, after which the slurries of microcrystals were concentrated by centrifugation to the desired concentration and mixed with appropriate LCP host lipids (Figs. 1a and 1b). [...] and efficient formation of LCP; however, the shear forces can damage or dissolve the crystals if mixing is too vigorous. We observed that with gentle mixing, crystals over 10-20 µm in size, especially those with needle or plate-like shapes, tended to break into smaller pieces of less than 10 µm in size which, in turn, are sturdy enough to withstand the mechanical stress associated with mixing. Such crystal breakage did not affect the diffraction properties of the crystals in our test samples. The resulting dispersion of microcrystals in LCP was then transferred to an LCP injector and diffraction data (Figs. 1c and 1d) were collected at 120 Hz at the LCLS using an LCP flow rate of 170 nl min⁻¹ as described previously (Liu et al., 2013; Weierstall et al., 2014).

Lysozyme structure
The structure of lysozyme was solved at 1.9 Å resolution using 54 544 indexed single-crystal diffraction snapshots. The resulting overall structure is very similar to the previously published lysozyme structure determined by SFX using a GDVN injector (Boutet et al., 2012). Since the software has substantially advanced over the last two years, we have also reprocessed the data from Boutet et al. (2012) using more recent versions of Cheetah (v.2013.3) and CrystFEL (v.0.5.3a), resulting in considerably improved data statistics. Both data sets were processed in the same way using the same versions of the software to avoid bias (Table 1). We compared the structures of lysozyme obtained using both LCP and GDVN injectors to ensure the validity of our procedure. The structures aligned very closely, with an r.m.s.d. of only 0.5 Å. The B-factor distributions are very similar, with a slightly higher average B factor in the case of the LCP structure. The quality of the maps is also very similar, with no structural differences observable in difference maps (Figs. 2a and 2b). Small differences were only detected around the side chains of solvent-exposed bulky amino-acid residues. These minor differences can be expected given the variations in the crystal-preparation protocols, crystal size and crystal-delivery methods. We can therefore conclude that using LCP as a carrier medium for SFX data collection results in a structure of similar quality to that obtained with the previously used GDVN injector, while offering the important advantage of consuming considerably less crystallized protein (0.1 mg using the LCP injector versus 15 mg using the GDVN injector).

Phycocyanin structure
As the second test protein, microcrystals of phycocyanin (PC), a photosynthetic pigment protein from the thermophilic cyanobacterium T. elongatus, were used. SFX data were collected using the LCP injector and compared with the structure of PC obtained previously using the GDVN injector. Both structures were of very similar resolution and quality. During the LCP-SFX experiment, data were collected from less than 7 µl of LCP sample (containing 3 µl crystal suspension), which yielded 18 794 hits, from which 6629 patterns were indexed at 1.75 Å resolution (Table 2). The quality of the structural model can be assessed from the details presented for one of the chromophores of PC shown in Fig. 3. The simulated-annealing composite OMIT 2mFo − DFc electron-density map at a contour level of 1.5σ fits tightly to the chromophore, which is difficult to achieve (even at higher resolution). Most importantly, using the LCP-SFX technique, the amount of protein crystals used is dramatically reduced, even though crystallization was carried out in liquid medium (as described in §2). Hence, previously established crystallization conditions can be successfully coupled with this technique. The entire data set required just 3 µl of crystal suspension (~0.1 mg of protein), compared with the hundreds of microlitres of crystal suspension (~30 mg of protein) that were used for the GDVN structure. The quality of the resulting electron-density map is very high, despite only 6629 crystals contributing to the data set.
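The sample-consumption figures can be checked against the flow rate and collection time given in the data-collection section. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that LCP was extruded continuously at the average flow rate of 170 nl min⁻¹ for the full 40 min of PC data collection.

```python
def lcp_volume_ul(flow_nl_per_min: float, minutes: float) -> float:
    """Total extruded LCP volume in microlitres (assumes constant flow)."""
    return flow_nl_per_min * minutes / 1000.0  # nl -> µl


def fold_reduction(reference_mg: float, new_mg: float) -> float:
    """How many times less protein the new method consumes."""
    if new_mg <= 0:
        raise ValueError("protein amount must be positive")
    return reference_mg / new_mg


# PC data set: 170 nl/min for 40 min; ~0.1 mg (LCP) versus ~30 mg (GDVN).
print(f"LCP extruded: {lcp_volume_ul(170.0, 40.0):.1f} µl")
print(f"Protein saving: about {fold_reduction(30.0, 0.1):.0f}-fold")
```

With these inputs the extruded volume comes out just below 7 µl, in line with the sample volume stated in the text.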

Figure 2
Lysozyme structure. (a) Overall structure of lysozyme showing 2mFo − DFc electron density around several residues in the active site contoured at 1σ. (b) Difference electron density between the data collected in this work and those presented in Boutet et al. (2012), contoured at 3σ. Positive difference density is green and negative is red.

Figure 3
Phycocyanin structure. Simulated-annealing composite OMIT 2mFo − DFc electron-density map contoured at the 1.5σ level for the chromophore phycobilin inside its binding pocket. The map was built using 1.75 Å resolution XFEL data from PC crystals delivered in LCP.

Discussion
Several considerations affect the use of LCP as a delivery matrix, the most important of which are the choice of the LCP host lipids and the optimal crystal size and density. The lipid choice is dictated by two factors. The first factor is the compatibility of the lipids with LCP extrusion in vacuum. The XFEL beam path usually runs in vacuum to reduce X-ray scattering. Previously, we have observed that the most commonly used lipid for LCP crystallization, monoolein, can partially solidify, forming the lamellar crystalline Lc phase, upon the injection of monoolein-based LCP in vacuum (Weierstall et al., 2014). The Lc phase produces strong diffraction that can damage the detector; therefore, it should be avoided. Monoolein belongs to a lipid class known as monoacylglycerols (MAGs), which are denoted N.T MAG according to the number of carbon atoms between the ester and the double bond (N) and between the double bond and the terminal methyl group (T). Therefore, monoolein corresponds to 9.9 MAG in this terminology. We have found that the shorter-chained monoolein analogues 9.7 MAG and 7.9 MAG, as well as a 1:1 mixture of 9.9 MAG and 7.9 MAG, do not form the Lc phase upon injection into vacuum and therefore are suitable as host lipids when LCP is extruded in vacuum. We should note that this limitation is relieved when LCP is extruded at atmospheric pressure, and all common LCP-forming lipids that were tested, including monoolein, can be used for this purpose.
The second factor is the compatibility of LCP with the precipitant solution used for protein crystallization. Many components of crystallization cocktails can disrupt LCP when used at high concentrations or extreme pH values (Cherezov et al., 2001; Joseph et al., 2011). Of all of the MAGs, monoolein-based LCP is one of the most stable towards the broadest range of commonly used precipitants. It is likely that new LCP-forming lipids with even higher stability will be identified. The compatibility of the chosen precipitant solution should be tested with different combinations of lipids before producing the actual samples containing protein crystals. Instability of LCP towards a particular precipitant composition can often be rectified by lowering the concentration of one of the components or by replacing one of the components by another, if this does not adversely affect the stability of the protein crystals. In the case of lysozyme, the precipitant was compatible with 7.9 MAG, 9.7 MAG and the 9.9 MAG/7.9 MAG mixture, while the PC precipitant was not compatible with 9.7 MAG but worked well with a 1:1 9.9 MAG:7.9 MAG mixture.
Data collected by SFX contain many randomly changing parameters, such as the exact crystal orientation, the crystal size and mosaicity and the XFEL pulse intensity and energy. Data processing relies on averaging of all these fluctuations, in the simplest case by Monte Carlo integration over many observations of the same reflection, and this requires a high multiplicity of data for convergence. Therefore, for efficient SFX data collection it is important to attain a high density of microcrystals, ensuring a high crystal hit rate. A clear advantage of working with crystals of soluble proteins, compared with membrane-protein crystals grown in LCP, is the ability to concentrate the crystal slurry to the desired concentration simply by centrifugation. Thus, the optimal concentration of crystals in LCP can be achieved, which corresponds to a 30-40% hit rate (number of images with crystal diffraction/total number of images). Higher hit rates would increase the occurrence of diffraction patterns from multiple crystals owing to the high density of crystals in the LCP stream, which is undesirable. Having adjusted the crystal density to an optimal value, we could collect full data sets in the shortest time: 43 min for lysozyme and 40 min for PC.
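The trade-off between hit rate and multi-crystal patterns can be made concrete with a simple model. The sketch below is an assumption introduced here for illustration and is not taken from the original analysis: it supposes that the number of crystals in the beam per pulse follows a Poisson distribution with mean λ, so that the hit rate is h = 1 − exp(−λ). The function names are hypothetical.

```python
import math


def mean_crystals_per_shot(hit_rate: float) -> float:
    """Invert h = 1 - exp(-lam) for the Poisson mean lam (model assumption)."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit rate must be in [0, 1)")
    return -math.log(1.0 - hit_rate)


def multi_crystal_fraction(hit_rate: float) -> float:
    """Fraction of hits containing two or more crystals under the Poisson model."""
    if hit_rate == 0.0:
        return 0.0
    lam = mean_crystals_per_shot(hit_rate)
    p_single = lam * math.exp(-lam)  # probability of exactly one crystal
    return (hit_rate - p_single) / hit_rate


for h in (0.1, 0.4, 0.8):
    print(f"hit rate {h:.0%}: {multi_crystal_fraction(h):.0%} of hits are multi-crystal")
```

Under this model the share of multi-crystal patterns among the hits grows steadily with the hit rate, which is consistent with the qualitative argument above for not pushing the crystal density beyond the 30-40% range.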
While this manuscript was under preparation, a paper describing soluble protein crystal delivery in a grease matrix using a different injector was published (Sugahara et al., 2014). One of the principal differences between grease and LCP as the crystal-delivery matrix is that in LCP the crystals remain in contact with the original precipitant solution, while upon mixing with grease most of the precipitant solution is removed. We anticipate that different crystals could have different stabilities in such disparate matrices, and therefore further development of both of these and other approaches will expand the usage of SFX for challenging soluble protein targets at XFEL sources.