Protein–ligand complex structure from serial femtosecond crystallography using soaked thermolysin microcrystals and comparison with structures from synchrotron radiation

The applicability of the ligand-soaking method in serial femtosecond crystallography has been examined in order to assess the feasibility of pharmaceutical applications of X-ray free-electron lasers.


Introduction
X-ray free-electron lasers (XFELs) generate very short, intense pulses that enable the collection of diffraction data before the destruction of the specimen (Neutze et al., 2000). This 'diffraction-before-destruction' principle of XFELs has successfully been applied in serial femtosecond crystallography (SFX), in which hundreds of thousands of single-shot diffraction images from randomly oriented microcrystals at room temperature are merged to determine a crystal structure (Chapman et al., 2011; Boutet et al., 2012). To date, a substantial number of SFX structures have been reported, including those of natively inhibited trypanosome protease from in vivo-grown microcrystals (Redecke et al., 2013), of membrane proteins from microcrystals grown in lipidic cubic phase (Zhang et al., 2015; Kang et al., 2015) and of the photoactive yellow protein in a time-resolved pump-probe experiment (Pande et al., 2016). Because SFX provides crystal structures at room temperature without radiation damage, it has the potential to be a useful tool in structural biology, which requires structural information under physiological conditions. For instance, a damage-free structure from SFX could account for the proton-transfer mechanism of nitrite reductase (Fukuda et al., 2016). From this point of view, structure-based drug design (SBDD) is expected to be a likely application of SFX (Zhang et al., 2015; Hol, 2015). In SBDD, a small-molecule ligand is designed so as to improve its affinity for the target protein based on the structure of protein-ligand complex crystals, which are typically prepared by soaking protein crystals in a solution containing the ligand (Hol, 1986; Klebe, 2000). However, the applicability of soaked crystals in SFX has not been fully examined to date. Here, we present a ligand-soaking experiment in SFX using microcrystals of thermolysin, which has recently been demonstrated as a model system (Hattne et al., 2014). From a comparison of the SFX structures with those of a conventional experiment using synchrotron radiation at low temperatures, the applicability of SFX to SBDD is discussed.

Sample preparation
Lyophilized thermolysin powder from Bacillus stearothermophilus (Hampton Research) was solubilized in 50 mM NaOH in water. Microcrystals of thermolysin were prepared as reported previously (Hattne et al., 2014) with slight modifications. Crystallization was performed by a batch method on a 50 ml scale; equal volumes (25 ml each) of the thermolysin solution at a concentration of 42.5 mg ml⁻¹ and a reservoir solution comprising 40% PEG 2000 MME, 0.1 M MES-NaOH pH 6.5, 5 mM CaCl₂ were mixed and incubated at 277 K for 5 h. Elliptical microcrystals grew to approximate dimensions of 4 × 4 × 8 µm. After the batch crystallization, the microcrystals were collected by centrifugation at 3000g, suspended in 500 ml harvest solution comprising 20% PEG 2000 MME, 0.1 M MES-NaOH pH 6.5, 5 mM CaCl₂ and filtered through a mesh with a 50 µm pore size. To remove a copurified ligand (Birrane et al., 2014), the microcrystal suspension was incubated at room temperature for 24 h (back-soaking). The back-soaked microcrystals were collected by centrifugation and resuspended in the harvest solution for the unliganded oil-SFX form, whereas they were resuspended in a soaking solution comprising 20% PEG 2000 MME, 60 mM N-carbobenzoxy-L-aspartic acid (ZA), 0.1 M MES-NaOH pH 6.5, 5 mM CaCl₂ for the liganded oil/water-SFX forms. The soaking samples were incubated at room temperature for 48 h (soaking) and the microcrystals were collected by centrifugation and resuspended in the soaking solution. After back-soaking or soaking, the suspensions contained about 10⁸ microcrystals per millilitre. Because a 1:9 mixture of the microcrystal suspension and the crystal carrier (oil-based or water-based) was used in the SFX experiment (Sugahara et al., 2015, 2016), the final specimen contained about 10⁷ microcrystals per millilitre.
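The dilution quoted above (about 10⁸ microcrystals per millilitre in the suspension, mixed 1:9 with the crystal carrier) can be checked with a short calculation. The following is a minimal sketch; the function name is hypothetical and it assumes that the volumes of suspension and carrier are simply additive.

```python
def diluted_density(stock_per_ml: float, sample_parts: float, carrier_parts: float) -> float:
    """Number density after mixing a suspension with a carrier by volume.

    Assumption (not from the paper): volumes are additive on mixing.
    """
    total_parts = sample_parts + carrier_parts
    if total_parts <= 0:
        raise ValueError("total volume must be positive")
    return stock_per_ml * sample_parts / total_parts


# 1 part suspension at ~1e8 crystals per ml mixed with 9 parts carrier
final_density = diluted_density(1e8, 1, 9)
print(f"{final_density:.1e} microcrystals per ml")
```

A tenfold dilution of this kind gives the order of magnitude stated for the final specimen.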
For the liganded/unliganded oil-SFX forms and the liganded water-SFX form, the synthetic grease Super Lube (Synco Chemical) and an aqueous solution of 12% hydroxyethyl cellulose (Sigma) containing 20% PEG 2000 MME, 60 mM ZA, 50 mM MES-NaOH pH 6.5 and 2.5 mM CaCl₂ were used as the crystal carrier, respectively. The details of cellulose as a water-based crystal carrier for SFX will be published elsewhere.
Conventional macrocrystals soaked with ligand were prepared as reported by Birrane et al. (2014) with slight modifications. Crystallization was performed by the hanging-drop vapour-diffusion method at 293 K using protein solution at a concentration of 25 mg ml⁻¹ and a reservoir solution comprising 10% PEG 2000 MME, 0.1 M MES-NaOH pH 6.5, 5 mM CaCl₂. From a 2 µl crystallization drop prepared by mixing equal volumes of protein solution and reservoir solution, hexagonal crystals grew in 5 d to approximate dimensions of 60 × 60 × 150 µm. The crystals were back-soaked for 48 h and then soaked for 24 h at room temperature using the same solutions as used for the microcrystals apart from a reduced ZA concentration of 30 mM in the soaking solution. The soaked macrocrystals were flash-cooled in liquid nitrogen either after treatment with a cryoprotectant solution [30%(v/v) PEG 400, 14% PEG 2000 MME, 30 mM ZA, 70 mM MES-NaOH pH 6.5, 3.5 mM CaCl₂] for the liganded SR1 form or as is for the liganded SR2 form (20% PEG 2000 MME, 30 mM ZA, 0.1 M MES-NaOH pH 6.5, 5 mM CaCl₂). Although faint ice rings were observed in the diffraction images of the liganded SR2 form, the data were acceptable for structure determination, as shown later.
X-ray data collection and structure determination
SFX data at room temperature (about 300 K) were collected using a custom-built multi-port CCD detector (MPCCD; Kameshima et al., 2014) on beamline BL3 at SACLA, Japan (Ishikawa et al., 2012; Tono et al., 2013). The parameters of the XFEL beam used were a wavelength of 1.771 Å with about 0.1% standard deviation, a repetition rate of 30 Hz, a temporal width of about 10 fs (FWHM) and a pulse energy of about 450 µJ at the light source (about 200 µJ at the sample). The XFEL beam was focused to 1.5 × 1.5 µm using Kirkpatrick-Baez mirrors (Yumoto et al., 2013). An aluminium attenuator with a thickness of 50 µm was used to prevent saturation from strong reflections. The sample-to-detector distance was set to 51.5 mm. The SFX experiment was performed using the DAPHNIS chamber with a humid helium ambience (Tono et al., 2015). The microcrystals suspended in an oil-based or water-based crystal carrier were delivered to the interaction region with the XFEL pulses using the syringe-injector system as described in Sugahara et al. (2015, 2016); the inner diameter of the needle used and the sample flow rate were 110 µm and 0.48 µl min⁻¹, respectively. Image data from SFX were retrieved using the SACLA data-acquisition system (Joti et al., 2015) with filtering by Cheetah (Barty et al., 2014; Nakane et al., 2016) to extract images containing Bragg spots. The SFX data were processed and scaled using CrystFEL v.0.6.0 (White et al., 2012) without a cutoff. The processed data did not include overloaded reflections. Unit-cell parameters were analyzed using the cell_explorer function of CrystFEL with the DIRAX (Duisenberg, 1992) or MOSFLM (Leslie, 2006; Powell, 1999) indexing method. The sample-to-detector distance and indexing parameters were optimized manually so as to narrow the unit-cell parameter distributions. The final indexing was performed using the MOSFLM method. The optimized sample-to-detector distance was 52.0 ± 0.1 mm, indicating about 0.2% accuracy (Supplementary Fig. S2).
SR diffraction data were collected at 100 K using a MAR Mosaic 225 CCD detector on beamline BL26B2 at SPring-8, Japan. The wavelength used was 1.000 Å and the sample-to-detector distance was set to 200.0 mm. The SR data collected with an oscillation angle of 0.5° were processed and scaled without an intensity cutoff using HKL-2000 (Otwinowski & Minor, 1997). For both SFX and SR, the experimental data including negative intensities were converted to positive-amplitude data based on Bayesian statistics using CTRUNCATE (French & Wilson, 1978) from the CCP4 program suite (Winn et al., 2011). All of the crystal structures were solved and refined using the PHENIX program package (Adams et al., 2002), in which the previously reported SFX structure of thermolysin (PDB entry 4ow3; Hattne et al., 2014) was used as the search model for molecular replacement. In each cycle of the PHENIX refinement except for the last few cycles, the simulated-annealing (torsion-dynamics) protocol was adopted to eliminate model bias. The structure was visualized and revised using Coot (Emsley & Cowtan, 2004). Special care was taken in the water placement: in each cycle of the refinement, only water models with regular electron density of not less than 0.5σ in a 2mFo − DFc map and satisfying the criteria of interatomic interactions (hydrogen bonds with distances not less than 2.2 Å and not greater than 3.4 Å; nonpolar interactions with distances not less than 2.65 Å and not greater than 4.2 Å) were selected by visual inspection. For the comparison of effective resolutions between data sets, the resolution limit of each data set was adjusted at the last stage of the structure refinement so that the R_free value for the outermost shell was in the range 20-30%. The statistics from crystallographic analysis are summarized in Table 1. Structural superposition at corresponding Cα atoms was performed using LSQKAB (Kabsch, 1976) in the CCP4 suite. The distribution of Cα deviations from the superposition analysis was statistically examined by the Mann-Whitney U-test (Mann & Whitney, 1947), which confirmed the correctness of our conclusion from the superposition analysis described in §3 (Supplementary Tables S1 and S2). The annealed OMIT maps from PHENIX (Adams et al., 2002) were produced at the same resolution as that used for the refinement of the corresponding structure from all atoms of the final model except for those of ZA; a torsion-dynamics protocol of simulated annealing at temperatures from 2500 to 300 K followed by positional refinement and individual B-factor refinement was used.

Table 1 (footnotes). PDB codes: 5wr2, 5wr3, 5wr4, 5wr5 and 5wr6.
† $R_{\mathrm{split}} = 2^{1/2}\,\sum |I_{\mathrm{even}} - I_{\mathrm{odd}}| \,/\, \sum (I_{\mathrm{even}} + I_{\mathrm{odd}})$, where $I_{\mathrm{even}}$ and $I_{\mathrm{odd}}$ represent the intensities of equivalent reflections from even-numbered and odd-numbered images, respectively.
‡ Pearson's correlation coefficient between averaged intensities of two corresponding observation subsets in which observations of each unique reflection are randomly divided into two half data sets. The programs CrystFEL and HKL-2000 were used for the SFX data and the SR data, respectively; overall values were not available from HKL-2000.
… $I_i(hkl)$ is the ith observation of reflection hkl and $\langle I(hkl)\rangle$ is the weighted average intensity for all observations i of reflection hkl.
¶ R_free was calculated using 5% of the reflections that were omitted from refinement.
†† The number of atoms was calculated as the sum of occupancies.
‡‡ The maximum-likelihood-based method in PHENIX was used.
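The R_split statistic defined in the Table 1 footnote compares intensities merged from even-numbered and odd-numbered images. As an illustration only, the sketch below evaluates that formula for paired lists of intensities; the function name and the input layout (two equal-length lists, one entry per unique reflection) are assumptions made for this example and do not reproduce the CrystFEL implementation.

```python
import math


def r_split(i_even, i_odd):
    """R_split = sqrt(2) * sum|I_even - I_odd| / sum(I_even + I_odd).

    i_even, i_odd: merged intensities of the same unique reflections,
    taken from even- and odd-numbered images (hypothetical input layout).
    """
    if len(i_even) != len(i_odd):
        raise ValueError("both half data sets must list the same reflections")
    numerator = sum(abs(e - o) for e, o in zip(i_even, i_odd))
    denominator = sum(e + o for e, o in zip(i_even, i_odd))
    if denominator == 0:
        raise ValueError("sum of intensities is zero")
    return math.sqrt(2) * numerator / denominator


# Identical half data sets give R_split = 0
print(r_split([100.0, 250.0, 40.0], [100.0, 250.0, 40.0]))
```

Lower values indicate better agreement between the two half data sets.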

Quality of the crystal structures
Three SFX and two SR structures of thermolysin have been determined at comparable resolutions in the range 1.9-2.3 Å (Table 1 and Supplementary Fig. S1). The present structures and four previously reported structures (PDB entries 3qgo, 3qh1, 3qh5 and 4ow3; Birrane et al., 2014; Hattne et al., 2014) share the same crystal packing; all of the crystals belong to the same space group, P6₁22, with similar unit-cell parameters and contain a thermolysin monomer in the asymmetric unit. The final models of the present thermolysin structures, with well defined electron densities, contained all amino-acid residues 1-316, a functional zinc ion at the active site and four structural calcium ions (Fig. 1). In addition, the liganded SR1 structure contained molecules of polyethylene glycol, which was used as a cryoprotectant. In the liganded forms, all atoms comprising the ligand ZA (Fig. 2a) were identified in the electron-density map with reasonable B values (Table 1). In the liganded water-SFX form using cellulose as a crystal carrier, one of the four calcium sites had a notably low occupancy of 0.72, which may be relevant to the calcium-chelating effect of cellulose in the presence of certain carboxylic acids (Rhee & Tanaka, 2000). The average B values calculated from the final models were comparable to the corresponding Wilson B values from the diffraction data. Stereochemical analysis in the PHENIX program suite (Adams et al., 2002) revealed no residues in the outlier region of the Ramachandran plot. Probably owing to the limited flexibility of the thermolysin molecule, all of the present structures share essentially the same backbone conformations, with similar patterns of B-factor distribution; this is in contrast to previous work on a G-protein-coupled receptor (GPCR) that showed structural differences in certain flexible loops between SFX and SR (Liu et al., 2013).

Comparison between SFX and SR
Many experiments have been reported on the thermal contraction of protein crystals at low temperatures using conventional X-ray sources (in-house sources and synchrotron radiation; Haas & Rossmann, 1970; Walter et al., 1982; Hartmann et al., 1982; Frauenfelder et al., 1987; Tilton et al., 1992; Young et al., 1993; Keedy et al., 2015). In agreement with these reports, the unit-cell lengths of the present thermolysin crystals are 0.7-1.6% longer in the SFX structures at 300 K when compared with those in the SR structures at 100 K (Table 1). This difference is comparable to those in reported experiments at cryogenic (80-100 K) and ambient (298-300 K) temperatures: 1.7-2.4% for myoglobin crystals (Hartmann et al., 1982), 0.9-2.7% for ribonuclease A crystals (Tilton et al., 1992), 0.4-2.8% for lysozyme crystals (Young et al., 1993) and 1.3-1.8% for cyclophilin A crystals (Keedy et al., 2015). Thus, the unit-cell parameters obtained from our SFX experiments are likely to be correct for thermolysin crystals at room temperature. Notably, the unit-cell parameters agree well with each other in the SFX structures. In the SR structures, the difference in the cryoprotection procedure resulted in 0.4-0.7% differences in unit-cell lengths, whereas the difference was within 0.2% in the SFX structures.

Figure 1. Overall structure of the thermolysin-ligand complex from the liganded oil-SFX form. The thermolysin molecule is shown as a ribbon model coloured from the N-terminus to the C-terminus. Bound zinc and calcium ions are shown as grey and green balls, respectively. The bound ZA molecule is shown as a ball-and-stick model with atom-type colouring, apart from the alternate conformation, which is coloured cyan. This figure was prepared with Discovery Studio (Accelrys).
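The relative differences in unit-cell lengths quoted in this section are simple ratios of a room-temperature length to a cryogenic length. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the numerical values in the example are placeholders, not the values from Table 1.

```python
def percent_longer(length_room_temp: float, length_cryo: float) -> float:
    """Percentage by which a room-temperature cell length exceeds the cryogenic one."""
    if length_cryo <= 0:
        raise ValueError("cell length must be positive")
    return 100.0 * (length_room_temp - length_cryo) / length_cryo


# Placeholder cell lengths in angstroms (illustrative only, not from Table 1)
print(f"{percent_longer(101.0, 100.0):.1f}% longer at room temperature")
```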
Phenomena involving atom displacement such as thermal vibration and alternate conformations can be modulated by the temperature at which the diffraction data were collected. It has been reported that the B factors of protein atoms are reduced at low temperatures (Walter et al., 1982;Hartmann et al., 1982). In the present structures, the average B values are higher for SFX at room temperature, as expected (Table 1).
However, it has been reported that the Monte Carlo integration of still images can produce artificially large B factors in SFX (Kroon-Batenburg et al., 2015). Thus, it is unclear whether the larger B factors observed in the present SFX structures reflect higher thermal vibration. A restrained conformational fluctuation of protein atoms at low temperatures has also been reported (Keedy et al., 2015). In the present structures, it is not conclusive whether the conformational fluctuation is restrained at low temperature. However, in the SR structures the locations of the observed alternate conformations varied depending on the cryoprotection procedure, whereas they were identical in the SFX structures (Table 1). When the residues with alternate conformations are compared between the SR and SFX structures, only four of 12 residues were found in common. The SFX structures may more closely represent the physiological mode of the conformational fluctuation at room temperature when compared with the SR structures. Another difference between SFX and SR is in the water structure. An increase in the number of ordered water molecules at low temperatures has been reported in conventional crystal structures (Earnest et al., 1991; Young et al., 1993). However, from a statistical analysis of PDB entries, the increase in the number of water molecules at low temperature was not significant owing to a large variation in the ratio between water and protein atoms in the low-temperature structures (Carugo & Bordo, 1999). In the present work, the water:protein ratio is calculated to be 0.158-0.197 for the SR structures and 0.113-0.119 for the SFX structures (Table 1). The values for the SFX structures are similar to each other and are close to the statistically predicted value of 0.111 ± 0.004 for a 2.0 Å resolution structure at room temperature (Carugo & Bordo, 1999), regardless of the type of crystal carrier used.
On the other hand, the values for the SR structures are very different from each other depending on the cryoprotection procedure. Furthermore, the values are much higher than the predicted value of 0.114 ± 0.008 for a 2.0 Å resolution structure at low temperature (Carugo & Bordo, 1999). From visual inspection of the structures, water molecules in the SFX structures are only observed within the first layer of the water coordination shell, in which a water molecule directly interacts with the protein atoms (Fig. 3a). In particular, in a pair of liganded SFX structures 80-82% of the water molecules can be overlaid at common positions with interatomic distances of less than 1.5 Å (Table 2), indicating high reproducibility of the water structure regardless of the type of crystal carrier used. The percentage of common waters in the SFX structures is substantially lower, at 70-74%, for pairs involving the unliganded structure, probably reflecting the ligand binding. Of the water molecules in the SFX structures, 67-76% are commonly observed in the SR structures. However, in the SR structures many additional water molecules are observed both in the first and the outer layers of the water coordination shell depending on the cryoprotection procedure (Figs. 3b and 3c). The degree of water coordination in the SR structures is highly dependent on the cryoprotection procedure, indicating poor reproducibility of the water structure from SR. In conclusion, SFX may provide a closer representation of the physiological water structure when compared with SR.

Figure 2. Structural differences in ligand recognition between SFX and SR. (a) Chemical structure of the N-carbobenzoxy-L-aspartic acid ligand. The carboxymethyl moiety showing alternate conformations is coloured red. (b, c) Stereo representations of the crystal structure relevant to ligand binding in the liganded oil-SFX form (b) and the liganded SR1 form (c). Atoms in the asymmetric unit are shown with atom-type colouring, apart from those of the alternate conformation, which are coloured cyan; symmetry-related atoms are coloured magenta. Important residues, the ligand ZA and two alternate conformations of the carboxymethyl moiety of ZA are labelled. Ligand-protein hydrogen bonds are indicated as dotted lines. The annealed OMIT maps for the ligand molecule with Fourier coefficients 2mFo − DFc (blue; 0.5σ contour level) and mFo − DFc (orange; 3σ contour level) are overlaid. (b) and (c) were prepared with Discovery Studio (Accelrys).
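The count of common waters used in this section (water molecules of one structure lying within 1.5 Å of a water molecule of another structure after Cα superposition) can be sketched as follows. This is an illustrative sketch rather than the procedure used for Table 2: the function names are hypothetical, the coordinates are assumed to be already superposed, and every water is counted with unit weight, whereas the paper counts atoms as the sum of occupancies.

```python
import math


def count_common_waters(waters_a, waters_b, cutoff=1.5):
    """Count waters in structure A with at least one water of B closer than cutoff (in angstroms).

    waters_a, waters_b: lists of (x, y, z) tuples in a common, superposed frame.
    """
    return sum(
        1 for a in waters_a
        if any(math.dist(a, b) < cutoff for b in waters_b)
    )


def common_fraction(waters_a, waters_b, cutoff=1.5):
    """Ratio of common waters to all waters of structure A (occupancies ignored)."""
    if not waters_a:
        raise ValueError("structure A has no water molecules")
    return count_common_waters(waters_a, waters_b, cutoff) / len(waters_a)


# Toy coordinates (illustrative only)
a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
b = [(0.5, 0.0, 0.0), (10.0, 1.0, 0.0), (30.0, 0.0, 0.0)]
print(count_common_waters(a, b), round(common_fraction(a, b), 2))
```

Because the ratio is taken relative to the waters of the first structure, swapping the two structures generally gives a different percentage, which is why a pair of structures yields a range of values.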

Difference in ligand recognition
From an SFX experiment using thermolysin microcrystals soaked with the small-molecule ligand ZA, a protein-ligand complex structure was successfully obtained (Fig. 1). Both oil-based and water-based crystal carriers provided identical structures, including the mode of ligand recognition (Fig. 2b and Supplementary Fig. S1). The enzymatic active site of the thermolysin molecule binds a ZA molecule in place of the substrate. The carboxymethyl moiety of ZA has two alternate conformations: A and B conformers with occupancies of 0.56 and 0.44, respectively. The A and B conformers are hydrogen-bonded to the Oη atom of Tyr157 and the Nδ2 atom of Asn112, respectively. On the other hand, an SFX experiment without ligand soaking provided an apoenzyme structure. The apo structure was essentially the same as the reported SFX structure (PDB entry 4ow3; Hattne et al., 2014), except for subtle conformational differences in the side chains of Asn112, Thr157 and Glu166 with interatomic distances of less than 1 Å between corresponding atoms after Cα superposition of the two structures (Supplementary Fig. S1). The overall root-mean-square deviation (r.m.s.d.) value from the superposition was 0.203 Å, which was significantly lower than those from the other comparisons of the SFX structures with PDB entry 4ow3 (0.224-0.228 Å), probably reflecting the ligand binding (Table 3 and Supplementary Table S2).
The SR experiment using conventional ligand soaking with ZA also provided protein-ligand complex structures similar to but distinct from the liganded SFX structures (Fig. 2c and Supplementary Fig. S1). Essentially the same structures were obtained from both of the cryoconditions for the SR experiment, except that two alternate conformations were observed for Tyr157 in the liganded SR1 structure. The ZA molecule had no alternate conformations and was hydrogen-bonded to the Oη atom of Tyr157 in the same manner as in the A conformer in the liganded SFX structures. When the present liganded SR structures are compared with the reported cognate structure, PDB entry 3qh1 (Birrane et al., 2014), the r.m.s.d. values from the Cα superposition are 0.135-0.148 Å, which are significantly lower than those from the other comparisons (0.178-0.228 Å), indicating an overall similarity among the liganded SR structures (Table 3 and Supplementary Table S2). However, at the ligand-binding site of PDB entry 3qh1 the carboxymethyl moiety of ZA is hydrogen-bonded to the Nδ2 atom of Asn112 in the same manner as in the B conformer of our liganded SFX structures, and alternate conformations are observed for Asn112 but not for Tyr157, which is distinct from the present liganded SR structures. Therefore, in terms of the ligand-recognition mode, the three available liganded SR structures differ from each other.
In contrast, the same ligand-binding mode was reproducibly observed from the SFX soaking experiments. In fact, the Cα-superposition analysis could detect the ligand binding in the SFX structures (Table 2 and Supplementary Table S1); of the r.m.s.d. values from the superposition between the SFX structures, that of 0.057 Å for a pair of liganded structures was significantly lower than those for the other pairs (0.106-0.112 Å). The other comparisons including the SR structures gave much higher r.m.s.d. values in the range 0.158-0.192 Å. Furthermore, the ligand-binding mode observed in the SFX structures was not available from the SR experiments.

Table 2. Superposition of the present structures and analysis of common waters. A Cα superposition was performed between the present structures of thermolysin crystals as shown at the left and top. Amino-acid residues with alternate conformations were excluded from the calculation. The upper value is the r.m.s.d. value of the interatomic distances between corresponding Cα atoms after superposition; 304-309 Cα atoms were used for the calculation. A statistical examination of the positional differences between the distributions of Cα deviations using the Mann-Whitney U-test (Mann & Whitney, 1947) is available in Supplementary Table S1. After the Cα superposition, the common water molecules with close interatomic distances of less than 1.5 Å were counted. The number of atoms was calculated as the sum of occupancies. The ratio of the number of common waters to the total number of waters in the structure on the left is shown as the lower value.
The present SFX and SR structures of thermolysin reveal considerable differences in ligand recognition, which is in contrast to the previous work on GPCR, which reported no differences in ligand recognition between SFX and SR structures (Liu et al., 2013). Differences in the temperature of data collection and in the procedure of cryoprotection are suggested as possible reasons for the structural differences that are observed. In addition, several groups have pointed out structural changes in proteins owing to radiation damage when crystal structures from XFELs and conventional X-ray sources are compared (Hirata et al., 2014;Suga et al., 2015;Fukuda et al., 2016). Thus, certain modifications from radiation damage may be another reason for the present structural differences.
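The r.m.s.d. values compared in this section summarize per-atom Cα deviations after superposition, and the Mann-Whitney U-test compares two such distributions of deviations. The sketch below illustrates both quantities under stated assumptions: the coordinates are taken to be already superposed and paired (the superposition itself, performed with LSQKAB in the paper, is not reproduced), the function names are hypothetical, and only the U statistic is computed, without a significance level.

```python
import math


def ca_deviations(coords_a, coords_b):
    """Per-atom distances between paired, already superposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be paired")
    return [math.dist(p, q) for p, q in zip(coords_a, coords_b)]


def rmsd(deviations):
    """Root-mean-square of a list of deviations."""
    if not deviations:
        raise ValueError("no deviations given")
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def mann_whitney_u(x, y):
    """U statistic for sample x: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Toy coordinates in angstroms (illustrative only)
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
devs = ca_deviations(a, b)
print(round(rmsd(devs), 3))
```

A U value close to zero or to the product of the two sample sizes indicates that one distribution of deviations lies almost entirely below the other.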

Conclusions
In this work, the feasibility of ligand screening in SFX has been examined using thermolysin as a model system. As a result, a ligand-soaking experiment using SFX successfully provided untreated protein-ligand complex structures at room temperature. From a structural comparison between SFX and SR, clear structural differences in the ligand-binding mode were observed. Notably, the SFX structures were highly reproducible regardless of the type of crystal carrier used, whereas the SR structures showed substantial differences depending on the cryoprotection procedure that was used; Cα superposition between the liganded SR1 form and the liganded SR2 form provided a distribution with significantly higher values of the Cα deviation when compared with any superposition between a pair of SFX structures (Table 2 and Supplementary Table S1). In conclusion, ligand screening in SFX may be useful for the design of small-molecule ligands for SBDD in the near future, because it provides structural information without factors that may affect the physiological structure of proteins.

Table 3. Superposition of the present structures with reported structures. A Cα superposition was performed between the present structures (top) and the reported structures (left). Amino-acid residues with alternate conformations were excluded from the calculation. The r.m.s.d. value of the interatomic distances between corresponding Cα atoms after the superposition is shown; 289-313 Cα atoms were used for the calculation. A statistical examination (Mann & Whitney, 1947) of the positional differences between the distributions of Cα deviations using the Mann-Whitney U-test is available in Supplementary Table S2.