Structure determination of an integral membrane protein at room temperature from crystals in situ

The X-ray structure determination of an integral membrane protein using synchrotron diffraction data measured in situ at room temperature is demonstrated.


Introduction
Membrane-protein structure determination routinely uses X-ray diffraction data recorded at cryogenic temperatures from a single crystal, requiring a significant investment of effort to grow samples of sufficient size to allow a complete data set to be recorded. These two criteria have been driven by the typical nature of membrane-protein crystals: they are formed by limited crystal contacts, owing to a high solvent content and poor order, and are prone to non-isomorphism; these factors typically lead to weak diffraction (compared with most crystals of soluble proteins), requiring proportionally higher X-ray doses to allow measurement of high-resolution reflections. To compound the issue, phase transitions in any amphiphilic molecules in the crystal, such as detergents, can make the results of cryocooling less consistent and more likely to further compromise crystal order (Pebay-Peyroula, 2008).
It has been demonstrated that membrane-protein diffraction data can be recorded from micro/nanocrystal preparations injected into the intense pulsed beam of an X-ray free-electron laser (XFEL) at room temperature (Weierstall et al., 2014). This significant step forward has been a consequence of the 'diffraction before destruction' experiment (Chapman et al., 2011) made feasible by the very short, intense pulses from XFELs. Membrane-protein crystal structure determination has been beyond the reach of room-temperature crystal diffraction measurements at synchrotron-radiation sources, principally owing to the significant primary and secondary radiation damage that occurs (Garman, 2010). In situ data-collection methodology (from crystals in crystallization plates) has matured to the point where the structure determination of viruses and other soluble proteins is now approaching routine (Heidari Khajepour et al., 2013; Wang et al., 2012). In situ screening at synchrotrons has shown that membrane-protein crystals yield only a small number of images before losing their diffracting ability. Nevertheless, high-resolution diffraction data can be recorded at room temperature for membrane-protein crystals. In situ data collection removes the need for cryoprotectant, a potential obstacle in membrane-protein crystallography, where the detergent composition can vary (Pellegrini et al., 2011). Sufficient data for structure determination would require many isomorphous crystals. A recent development in data analysis of multiple crystals is the software BLEND, which has been shown to be applicable to the cases of soluble and membrane proteins and brings the benefit of accelerating the often time-consuming procedure of managing multiple data sets (Foadi et al., 2013) and identifying isomorphous crystals.
This, combined with high-frame-rate pixel-array detectors (Broennimann et al., 2006) and the discovery of prolonged crystal lifetimes at room temperature for high dose and frame rates (Owen et al., 2012, 2014), brings the possibility of room-temperature structure determination of membrane proteins using synchrotron radiation within the grasp of crystallographers. Here, we describe the first in situ structure determination of a membrane protein, using Haemophilus influenzae TehA (HiTehA), which has previously been solved to 1.2 Å resolution from a single cooled crystal (Chen et al., 2010). We present a method to collect data from multiple in situ crystals of membrane proteins and to form a sufficiently complete data set from many partial data sets. The validity of the approach is demonstrated both by the quality of the electron-density maps associated with the assembled data set and by a detailed comparison between the derived structure and the reference structure solved using data collected at 100 K from a single crystal.

Protein expression, purification and crystallization
HiTehA was cloned into pWaldoGFPe and purified as described previously (Drew et al., 2006), with the final buffer consisting of 20 mM Tris pH 7.5, 150 mM NaCl, 60 mM n-octyl-β-D-glucopyranoside. The protein was screened for crystallization at 20 mg ml⁻¹ using the vapour-diffusion method. Crystals for the in situ data-collection experiment were grown by mixing 100 nl HiTehA solution with 100 nl reservoir solution in sitting drops using a Mosquito robot (TTP Labtech); drops were dispensed onto a hydrophobic-coated 96-well plate (CrystalQuick X). The best diffracting crystals grew over 7-10 d at 277 K from a reservoir solution consisting of 0.1 M NaCl, 120 mM Tris pH 9.4, 20%(v/v) PEG 400. The crystal plate was moved to ambient temperature before mounting on a modified goniometer as described previously.

In situ data collection
Data were collected on beamline I24 at Diamond Light Source using a dedicated goniometer for the mounting of SBS-format (now ANSI/SLAS standard; http://www.slas.org) crystallization plates and a Pilatus3 6M detector. We have previously shown that a 100 µm offset must be added to the position of the rotation axis in the direction of the beam to account for the optical effect of viewing the crystals through the plate-base material, thereby ensuring that the crystals could be precisely located on the axis of rotation (Axford et al., 2012). Centring was performed by positioning the crystals onto a cross-hair coincident with the beam position and then translating them along the beam axis into the focal plane of the on-axis microscope. Visible radiation damage to the crystals following data collection was clearly contained within the crystal volume rather than appearing as a vertical line, indicating that the crystals were indeed well centred using this method. The goniometer allows an angular movement of the plate of approximately ±20° from the vertical.
A few crystals were initially used to optimize and fix the data-collection parameters. Based on the observed diffraction from these crystals, d min at the edge of the detector was set to 2.5 Å resolution (1.83 Å resolution in the detector corners). This was necessarily a best guess and could not be optimized on a per-crystal basis owing to the rapid onset of radiation damage at room temperature. Subsequent analysis has shown that several crystals diffracted to a higher resolution and into the corners of the detector. These effects are reflected in the completeness and multiplicity of the data in the highest resolution bin, as shown in §2.4. Thus, our initial estimate of 2.5 Å resolution turned out to be too conservative and the structure was eventually refined using data to 2.3 Å resolution, with the initial electron-density maps and model building being aided by data to 2.1 Å resolution (Fig. 1). The final 2.3 Å resolution limit was selected in order to achieve an overall data completeness of greater than 90%.

Figure 1
Example section of a diffraction image with spots extending to 2.1 Å resolution into the corners of the detector. The inset at the top right is an on-beam-axis view of two example crystals located at the edge of a crystallization drop captured during data collection. The red circle represents the full-width half-maximum of the beam profile and has been matched to the crystal size. The matching of the beam size to that of the crystal optimized the signal-to-noise ratio of the data.

Multiple wedges of data were measured, each consisting of 30-50 images of 0.2° rotation recorded at 25 frames s⁻¹ with 12% of the total beam flux, equating to ~2 × 10¹¹ photons s⁻¹. Each wedge therefore consisted of 6-10° of data after X-ray exposure for a total of 1.2-2 s.
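The wedge geometry quoted above follows from simple arithmetic, which the short Python sketch below reproduces. The function name and its arguments are hypothetical and serve only to illustrate the calculation; the flux value is the approximate figure given in the text, and the photon count is simply flux multiplied by exposure time.

```python
def wedge_summary(n_images, oscillation_deg=0.2, frame_rate_hz=25.0, flux_ph_per_s=2e11):
    """Return (rotation range in degrees, exposure time in s, photons delivered) for one wedge."""
    rotation_deg = n_images * oscillation_deg   # total angular range swept
    exposure_s = n_images / frame_rate_hz       # shutter-open time at the given frame rate
    photons = flux_ph_per_s * exposure_s        # approximate number of incident photons
    return rotation_deg, exposure_s, photons


# Shortest and longest wedges described in the text.
for n in (30, 50):
    rotation, exposure, photons = wedge_summary(n)
    print(f"{n} images: {rotation:.1f} deg in {exposure:.1f} s, ~{photons:.1e} photons")
```

For 30 and 50 images this gives rotation ranges of 6 and 10° and exposure times of 1.2 and 2.0 s, matching the values stated above.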
A total of 67 wedges of data were recorded from 56 separate crystals ranging in size from 10 to 75 µm in the largest dimension. The beam size on the sample was adjusted between 10 and 50 µm to best match the size of each crystal in order to optimize the signal-to-noise ratio of the measurements while distributing the X-ray dose through the whole crystal volume. For the larger crystals data could be recorded from up to three points on the sample using a beam size smaller than the crystal. The starting angle for each wedge was varied to cover a total sampled angular range of 24° with the intention of maximizing reciprocal-space coverage in the eventuality that the crystals were systematically orientated in the drops.

100 K data collection
The reference cryocooled data set was recorded on beamline I24 from a single crystal grown using identical crystallization conditions to those described above. The crystal was flash-cooled in liquid nitrogen and maintained at 100 K in an open flow of cold N2 gas for measurement.

Data analysis, assessment of radiation damage and merging
Integration with XDS (Kabsch, 1993) proceeded smoothly for all but the last four data sets (64-67), for which XDS failed to integrate the data even when given the correct space group. These data sets were subsequently discarded from the analysis. A check of the diffraction images for the discarded data sets revealed split diffraction spots that were indicative of poor crystal integrity and were likely to be the reason that XDS failed to index the data.
The unit-cell parameters for all of the remaining wedges are displayed in Supplementary Table S1 along with the completeness up to 2.1 Å resolution.
BLEND was run in analysis mode on the remaining 63 data sets to produce a cluster dendrogram (Fig. 2a). The linear cell variation (LCV), which describes the maximum percentage change in the unit-cell face diagonals across all data sets, is 1.18%. Two major clusters emerged (Fig. 2a), cluster 60 and cluster 61, which showed a completeness of 89.7 and 70.7%, respectively, to 2.1 Å resolution. Cluster 60, being the most complete, was used for subsequent phasing by molecular replacement, model building and refinement.
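To make the LCV figure concrete, the following Python sketch computes a linear cell variation from a list of unit cells. It is an illustration only and is not taken from BLEND: the choice of the diagonal |a + b| on each face and the normalization by the smaller diagonal of each pair are assumptions, and the two example cells are invented values.

```python
import math
from itertools import combinations


def face_diagonals(a, b, c, alpha, beta, gamma):
    """Diagonals |a+b|, |a+c| and |b+c| of the three unit-cell faces (angles in degrees)."""
    def diag(x, y, angle_deg):
        return math.sqrt(x * x + y * y + 2.0 * x * y * math.cos(math.radians(angle_deg)))
    return diag(a, b, gamma), diag(a, c, beta), diag(b, c, alpha)


def linear_cell_variation(cells):
    """Maximum percentage change of any face diagonal over all pairs of cells (assumed definition)."""
    diagonals = [face_diagonals(*cell) for cell in cells]
    lcv = 0.0
    for d1, d2 in combinations(diagonals, 2):
        for x, y in zip(d1, d2):
            lcv = max(lcv, 100.0 * abs(x - y) / min(x, y))
    return lcv


# Two made-up cells differing by 1 Å in a; not values from Supplementary Table S1.
example_cells = [(100.0, 100.0, 100.0, 90.0, 90.0, 90.0),
                 (101.0, 100.0, 100.0, 90.0, 90.0, 90.0)]
print(f"LCV = {linear_cell_variation(example_cells):.2f}%")
```

Working with face diagonals rather than individual cell edges folds changes in both lengths and angles into a single length-like quantity, so that one number summarizes the spread of the whole set.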
Each wedge suffered to a varying extent from radiation damage. Rather than retaining a fixed number of images per wedge, a custom selection of data was made based on the procedure described in Appendix A. Briefly, a moving average intensity is determined as a function of diffraction image and resolution for each set and data are rejected when this intensity falls below a threshold, in this case 75% of the starting intensity. The number of images retained per wedge (ranging between 15 and 50) after application of this procedure is shown in Fig. 2(b). This approach is quite conservative, removing images only where it was statistically evident that global radiation damage had affected the data. Different approaches using the elimination of either a fixed number or a fixed fraction of images for all data sets have also been attempted, but in neither case were the merging statistics better than with this custom procedure.

Figure 2
Crystal selection and data processing carried out with BLEND. (a) Dendrogram showing all integrated data sets and their merging nodes, with two major clusters at nodes 60 and 61. (b) Graph showing the number of measured images in each wedge of data (grey bars) and the number of accepted images (blue bars) after radiation-damage assessment. (c) Final stage of data processing. The Rmeas for each cluster (represented by a grey circle) is displayed in blue next to the node. (d) Plot of Rmeas versus completeness for all subclusters of cluster 60, at 2.3 Å resolution, after the removal of data sets 45 and 46. Cluster 60a (red dot) includes the same data sets as cluster 60b (blue dot), but with some images removed, after correction for radiation damage. The reduction in Rmeas is evident.

Scaling and merging were performed using AIMLESS (Evans & Murshudov, 2013) at 2.3 Å resolution. Most data sets in cluster 60 merge well, with the exception of cluster 49 and data set 45 (Fig. 2c). Excluding data set 45 from cluster 58 reduced the overall Rmeas from 0.182 to 0.100. Of the four data sets composing cluster 49 (52, 56, 41 and 46), data set 46 was found to be solely responsible for the poor merging and was therefore excluded. A new cluster, 60a, was therefore produced by discarding data sets 45 and 46. Fig. 2(d) shows a plot of Rmeas versus completeness at 2.3 Å resolution for all of the clusters (nodes) in the left branch of the dendrogram after the removal of data sets 45 and 46. Structure factors were determined from scaled and merged intensities using TRUNCATE (French & Wilson, 1978).

Determination of high-resolution limits
Data to 2.1 Å resolution were initially used for phasing and model building as they resulted in a very clear and interpretable electron-density map. At a later stage, the resolution limit was cut to 2.3 Å resolution based on the application of an overall CC1/2 > 0.5 criterion. This fairly stringent cutoff was used so that an overall completeness of greater than 90% was retained. This made comparison with the complete 100 K data set more meaningful.
The final overall Rmeas was 0.107, Rp.i.m. was 0.044 and the completeness was 92.9%. The final statistical summary from AIMLESS for this data set is given in Table 1. As a note of interest, equivalent analysis without the removal of radiation-damaged images gave an overall Rmeas of 0.145, an Rp.i.m. of 0.050 and a completeness of 95.4%.
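For readers unfamiliar with the two merging statistics quoted here, the Python sketch below evaluates them from their standard definitions: for each unique reflection measured n times, the summed absolute deviation from the mean intensity is weighted by sqrt(n/(n − 1)) for Rmeas and by sqrt(1/(n − 1)) for Rp.i.m., and the total is divided by the sum of all contributing intensities. The function name, the dictionary layout and the toy intensities are assumptions made for illustration; this is not the AIMLESS implementation, and reflections measured only once are simply skipped here.

```python
import math


def merging_r_factors(observations):
    """Return (Rmeas, Rpim) for a dict mapping (h, k, l) to a list of measured intensities."""
    num_meas = 0.0
    num_pim = 0.0
    denom = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # a single measurement carries no information about agreement
        mean = sum(intensities) / n
        abs_dev = sum(abs(i - mean) for i in intensities)
        num_meas += math.sqrt(n / (n - 1)) * abs_dev    # multiplicity-independent weighting
        num_pim += math.sqrt(1.0 / (n - 1)) * abs_dev   # precision of the merged mean
        denom += sum(intensities)
    if denom == 0.0:
        raise ValueError("no reflection was measured more than once")
    return num_meas / denom, num_pim / denom


# Toy data with invented intensities, for illustration only.
toy = {(1, 0, 0): [10.0, 12.0], (0, 1, 0): [20.0, 20.0, 23.0], (0, 0, 1): [5.0]}
r_meas, r_pim = merging_r_factors(toy)
print(f"Rmeas = {r_meas:.3f}, Rpim = {r_pim:.3f}")
```

Because Rp.i.m. decreases as multiplicity increases while Rmeas does not, the two values together indicate both the spread of the individual measurements and the precision of the merged intensities.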
It is important to stress that structure determination was not complicated by the fragmented nature of the multiple data sets composing the final data. Molecular replacement, model building and final refinement were carried out in exactly the same way as for a complete data set from a single crystal.

Structure determination and refinement
Phases related to the final data were obtained by molecular replacement in Phaser (McCoy et al., 2007) using the deposited structure of HiTehA (PDB entry 3m71; Chen et al., 2010) as a search model. The initial electron-density map was inspected and the model was built using Coot (Emsley & Cowtan, 2004). Model refinement was performed using PHENIX (Adams et al., 2010). The structure was refined against the 2.3 Å resolution multi-crystal data set to an Rwork of 15.6% and an Rfree of 20.01%. The TehA structure has 97% of the residues in the favoured Ramachandran region and no outliers. The structure factors and coordinates have been deposited in the Protein Data Bank as PDB entry 4ycr. Detailed refinement statistics are given in Table 2.

Data collection and analysis
A complete data set to 2.3 Å resolution was assembled from 63 partial data sets obtained by irradiating in situ 56 crystals of the membrane protein HiTehA distributed across a number of cells of a single 96-well crystallization plate mounted on a specialized goniometer. Each crystal was exposed to ~2 × 10¹¹ photons s⁻¹ for 1.2-2.0 s, during which 30-50 0.2° images were recorded at 25 frames s⁻¹. The total data collection for all crystals took less than 3 h. Data integration was carried out with XDS (Kabsch, 1993). Of the 67 wedges of data integrated with XDS only 63 indexed correctly in space group H3; the remaining four were associated with split crystals. The completeness of the individual data wedges varied between about 12.3 and 22.6% at 2.1 Å resolution. These partial data sets were fed into BLEND (Foadi et al., 2013) to carry out radiation-damage assessment, cluster analysis of unit-cell variation and to manage the subsequent collation, scaling and merging. Assessment and rejection of diffraction images overly affected by radiation damage was made by analysis of the average intensity reduction as a function of image and resolution (see §2.4). Diffraction images suffering from radiation damage were rejected from the analysis if their average intensity in the highest resolution shell dropped below 75% of the starting value. The final data set had an overall Rmeas of 0.107, an Rp.i.m. of 0.044 and a completeness of 92.9% to 2.3 Å resolution (Table 1).

Table 1
Final merging statistics for cluster 60a. All statistics were obtained after removing the outlier data sets. This data set was used to determine the room-temperature structure. Columns: overall, inner shell and outer shell.

Table 2
Data-collection and refinement statistics (molecular replacement). Values in parentheses are for the highest resolution shell. For the 100 K and room-temperature (RT) data sets, one and 56 crystals were used, respectively.

Figure 3
Crystal structure of HiTehA from in situ and cryogenic data. Cartoon representation of TehA (a) parallel to the membrane and (b) from the periplasmic face. The structure is coloured in a rainbow from the N-terminus (blue) to the C-terminus (red). (c) 2Fo − Fc electron-density map section within TM9 contoured at 1.0σ with the model in stick representation with carbon in yellow, nitrogen in blue and oxygen in red. Clear electron density is visible for the highly conserved gating residue Phe262. (d) 2Fo − Fc electron-density map for OG with its proximate residues from the in situ 2.3 Å resolution data contoured at 1σ. (e) The same representation as in (d) for the 1.5 Å resolution 100 K structure. (f) Ribbon representation with the OG detergent molecule and surrounding side chains shown as sticks. (g) A slice through the channel shows the path with the gating residue Phe262 (red sticks) on TM9. The OG detergent is bound to HiTehA on the cytoplasmic side and reaches deep into the hydrophobic channel.

Figure 4
Comparison of the in situ and cryogenic structures. (a) The two models superimpose quite well in general. One notable exception is the loop connecting TM6 and TM7, as detailed in (b). This shifting loop is located towards the adjacent monomer and is proximate to the C-terminal region. (c) Colour representation of B factor across the chain for the two models. The blue to red spectrum indicates low to high B factors. The structure obtained with the in situ data exhibits higher B factors, especially at the ends of the helices C-terminal to TM10, than the cryogenic structure. The respective overall B factors are 26.4 Å² for the 295 K structure and 24.4 Å² for the 100 K structure. On average, the intracellular part of the models has a higher B factor than the extracellular part owing to the presence of larger loops.

Structure of HiTehA
Initial structure determination was carried out via molecular replacement using the HiTehA structure (PDB entry 3m7b; Chen et al., 2010) followed by refinement using the PHENIX platform (Adams et al., 2010) to 2.3 Å resolution with final Rwork and Rfree values of 15.6 and 20.01%, respectively (Table 2). The overall in situ structure is very similar to the published cryogenic structure, with an r.m.s.d. of 0.66 Å for all atoms. HiTehA is a trimeric membrane protein, with each monomer consisting of ten transmembrane (TM) helices linked by short loops (Fig. 3a). The HiTehA monomer consists of five two-transmembrane-helix hairpin repeats. TM1, TM3, TM5, TM7 and TM9 are part of the inner pore of the channel perpendicular to the membrane surrounded by TM2, TM4, TM6, TM8 and TM10 (Fig. 3b). The electron-density map after molecular replacement at 2.3 Å resolution was of high quality, allowing individual amino-acid side chains, water and detergent molecules to be fitted with accuracy. Residue Phe262, which was reported to be important for gating, was found to be in the same position and orientation (Fig. 3c) as observed by Chen et al. (2010).

Comparison of room-temperature and 100 K structures
In addition to the room-temperature data, a reference data set from a single crystal cryocooled to 100 K was collected and its structure was determined via molecular replacement in an identical way to the room-temperature structure (Figs. 3d-3g). These 100 K data were subsequently refined to 1.5 Å resolution with final Rwork and Rfree values of 13.6 and 16.7%, respectively (Table 2).
The two structures superimpose very well with an r.m.s.d. of 0.55 Å for all atoms, but a clear shift in the loop connecting TM6 and TM7 is observed, with a maximum distance of 2.9 Å measured at residue Ser192 (Figs. 4a and 4b). In the case of the room-temperature structure the loop folds back towards the inside of HiTehA, whereas in the cryogenic model it folds outwards towards the cytoplasmic side. This loop is located on the interface with the next monomer of the trimeric HiTehA protein and interacts with the C-terminal end of TM helix 4. Ser192 interacts with the backbone of the adjacent monomer of the trimer in proximity to the backbone of the residues Gly130, Gly129 and Gln129. This loop shift in the monomeric interface does not impact the overall trimeric arrangement of HiTehA between the room-temperature and the 100 K model, as the trimeric superimposition involving a total of 912 Cα atoms results in an r.m.s.d. of 0.279 Å. Furthermore, the loop shift neither alters the position of the gating Phe262, located on TM9, nor blocks the channel.
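The two quantities used in this comparison, the r.m.s.d. and the largest single displacement, can be sketched in a few lines of Python. The sketch assumes that the two models have already been superposed and that atoms are supplied as matched pairs of (x, y, z) coordinates in Å; the function name and the example coordinates are hypothetical, and the superposition step itself is not shown.

```python
import math


def rmsd_and_max_shift(coords_a, coords_b):
    """Return (r.m.s.d., largest pair distance, index of that pair) for matched, superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    distances = [math.dist(p, q) for p, q in zip(coords_a, coords_b)]
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    worst = max(range(len(distances)), key=distances.__getitem__)
    return rmsd, distances[worst], worst


# Invented coordinates: two atoms coincide and one has moved by 3 Å.
model_rt = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_100k = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 3.0, 0.0)]
print(rmsd_and_max_shift(model_rt, model_100k))
```

Reporting the largest shift alongside the r.m.s.d. is useful because a localized movement, such as that of a single loop, is averaged out in the global value.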
Analysis of the B-factor distribution reveals, as expected, regions of greater flexibility in the in situ structure compared with the 100 K structure (Fig. 4c). However, the magnitude of this difference is small.
The electron-density maps from the room-temperature and 100 K data both reveal one octylglucoside (OG) detergent molecule inside the channel cavity on the cytoplasmic side (Figs. 3d and 3e) that was not reported in the original structure. The hydrophobic alkyl tail of the OG detergent reaches deep into the channel and is surrounded by the hydrophobic residues Phe262, Ile203, Leu18, Leu144, Leu85 and Phe82. The polar glycoside head group of OG is proximate to the charged groups Arg97 and Gln196 and the backbone of HiTehA (Fig. 3f). As a note of interest, electron-density maps calculated using the structure factors from the structure of Chen and coworkers show OG-like density in the channel, but its interpretation was presumably hindered by discontinuity in this electron density.

Assessment of data quality using OMIT maps
In order to validate the multi-crystal data-set quality and to exclude model bias, an initial model of HiTehA with a C-terminal deletion ranging up to residue Val279, including the entire TM10 helix, was generated and used for refinement against the merged raw data set. Electron-density maps for the omitted region of the structure are shown in Figs. 5(a) and 5(b). The map shows continuously connected backbone and side-chain density for the omitted TM10 region.

Assessment of data quality by molecular replacement
There are no known structural homologues of HiTehA to provide a search model for molecular replacement. A feasible solution to this problem is suggested by the observation that α-helical structures of membrane transporters and channels typically share common domains, motifs and repeats. These individual domains or motifs often serve as ensembles of plausible search models for molecular replacement (Pornillos & Chang, 2006; Sciara & Mancia, 2012). A potential match is represented by the backbone Cα-atom superposition of helices TM1-TM4 onto helices TM7-TM10 (Fig. 5c); in this case the r.m.s.d. using secondary-structure mapping (SSM; Krissinel & Henrick, 2004) amounts to 2.6 Å calculated over 98 atoms. In order to simulate a de novo molecular replacement, TM1-TM4 were selected as the search model (Fig. 5c). The new truncated model consisted of 96 amino acids, making up 29% of the total HiTehA sequence. Molecular replacement using Phaser (McCoy et al., 2007) indicated a prominent top solution with a rotation Z-score of 10.3, a translation Z-score of 16.8 and a log-likelihood gain (LLG) of 307.1. The calculated electron-density map at 2.3 Å resolution from the molecular-replacement solution clearly displayed the missing part of the model, and four additional TM helices were automatically traced by Buccaneer (Cowtan, 2006; Fig. 5d). The figure of merit associated with the resulting electron-density map was 0.599, a value indicating a high degree of map interpretability. Manual building of the model to completion was, at this stage, a straightforward procedure.

Conclusion
The first structure of an integral membrane protein at room temperature determined by in situ data collection at a synchrotron has been presented. From a total of 56 measured crystals, a final scaled and merged data set reaching 2.3 Å resolution was obtained from 63 partial data sets. The results are of great value since, by their nature, membrane proteins struggle to form large, well ordered crystals that are amenable to cryocooling. The approach used here is fairly conservative regarding radiation-damage assessment and data rejection.
One could easily expect, however, that a more stringent application of the procedures outlined in the Supporting Information and the measurement of data from many more crystals could yield complete data for more challenging membrane-protein systems; for example, G-protein coupled receptors (GPCRs), which are typically grown in lipidic cubic phase and are known to require multiple data sets even under cryogenic conditions (Hanson et al., 2008). Membrane-protein crystals grown in lipidic cubic phase are already screened routinely in plates for their initial diffraction on Diamond beamline I24. The collection of in situ data can decrease the number of crystals compromised through handling and increase the throughput, facilitating the acquisition of a full data set as produced by a suitable software package such as BLEND.
It is important to note that radiation damage in synchrotron X-ray diffraction data is inevitable. Recent free-electron laser (FEL) studies have shown that essentially radiation-damage-free membrane-protein diffraction data can be measured from crystals within a lipidic cubic phase 'jet' (Weierstall et al., 2014). Currently, access to FELs is in heavy demand and the analysis of data obtained from serial femtosecond crystallography is still in its infancy (Barends, 2014; White et al., 2012) and is reliant on massive levels of averaging from tens of thousands of crystals to obtain data quality that approaches that attainable using a synchrotron. Whereas the radiation-damage-free nature of FEL diffraction data from biological macromolecules may be valued from the perspective of functional studies and biological interpretability (Neutze et al., 2004), the practical problem of obtaining data that are of sufficient quality to determine de novo phase information and interpretable electron-density maps remains.
Using the approach presented in this paper, data collection from 56 crystals and the identification of 813 images of highest quality data sufficient for structure solution required around 150 min of beamtime. There is significant scope to increase the throughput of the data-acquisition procedure by automation, possibly by the use of image-recognition software to identify samples.

Figure 5
The quality of the electron-density maps reflects the good quality of the in situ data. (a) 2Fo − Fc electron-density map after an OMIT map related to a C-terminal HiTehA deletion including TM10. The map is calculated at 2.3 Å resolution and contoured at 1.0σ. The fitted TM10 is shown for clarity. (b) Positive Fo − Fc electron-density map at 2.3 Å resolution contoured at 3.0σ showing the missing TM10. TM10 is also shown here for clarity. (c) The four-transmembrane-helix search model. The TM1-TM4 helices (yellow) superimpose well onto the TM7-TM10 helices (salmon). (d) 2Fo − Fc electron-density map calculated using the molecular-replacement phases at 2.3 Å resolution contoured at 1.0σ. The missing part of the structure in the search model is revealed in the electron-density map and is well connected, with visible density for the side chains; the initial search model (yellow) and the built model (salmon red) are shown.

APPENDIX A
A procedure to assess and modify data sets affected by radiation damage
Intensity averages in data sets affected by radiation damage have relatively lower values than those in unaffected data sets. The effect is especially evident and is normally greater at increasing resolution (Garman, 2010). Several studies have ascertained that equivalent intensities vary monotonically with time once the crystal has been irradiated (Diederichs et al., 2003), but the exact form of such behaviour is not easy to capture. If the average intensity is monitored in resolution shells during data collection, such a monotonic decrease should be quantitatively observable. In scaling programs a subdivision of data into resolution shells and time intervals (equivalent to a group of images) is always performed to implement any scaling algorithm, under the assumption of a constant irradiated dose. Here, we have adopted a similar approach for the determination of radiation damage. The goal of this procedure is to determine whether it is worth removing part of the data from the full data set and, if this is the case, which part should be removed. The main steps are as follows.
(i) Each data set is divided into resolution shells and groups of images. In the following, we will use s to indicate inverse resolution (s = 1/d) and t (for time) to indicate image number.
(ii) Running averages are computed for each resolution shell. These can be represented as curves in an intensity-time plot. The average intensity in each shell has a behaviour characteristic of the specific data-collection experiment. In general, this average changes with time owing to factors such as beam-flux fluctuations, the exposure of different parts of the crystal during rotation, crystal absorption and radiation damage. For all data sets described in this paper the exposure was short (essentially constant beam flux), covered a small rotation range (a negligible change in absorption) and the crystals were quite small (fully bathed crystal); thus, it is relatively safe to consider radiation damage as the most prominent cause of dynamic behaviour for each data set. Under these conditions the average intensity is expected to follow an exponential decrease with time, with the decline being more rapid at higher resolutions.
(iii) An exponential regression is performed for individual curves in each resolution shell. The exponential coefficient, indicated as −β, is calculated and stored. The regression model has the form
$$I(t) = A \exp(-\beta t). \qquad (1)$$
(iv) A linear regression is carried out over all β in relation to the resolution s. If the coefficient a of the regression is positive and different from zero by at least one standard deviation, then the decay is considered to be a genuine effect of radiation damage. In such a case the decay coefficient (β) will be a function of resolution, with
$$\beta(s) = a s + b. \qquad (2)$$
(v) Radiation is supposed to affect decay from the start of data collection, as implied by the continuous nature of the model for I(t) (1). It is thus appropriate to limit the number of images used in further analysis to be compatible with a desired amount of decay. Let f indicate such an amount as a fraction between 0 and 1. A value of 1 means no decay and 0 means that the whole intensity average has been reduced to zero. A default value of f = 0.75 (corresponding to data for which the average intensity has been reduced to 75% of its initial value) has been used for all data sets described in this paper. We are looking for the time at which the average intensity has decreased to f of its initial value. Using (1), it is found that
$$t = -\frac{\ln f}{\beta}. \qquad (3)$$
(vi) If β were constant with resolution, (3) would yield a single time (image) after which the intensity averages fall below a fraction f of their initial value. However, (2) tells us that β typically increases with resolution. Thus, to retain data with average intensities greater than f of their initial value it is necessary to discard data from certain resolution shells in each image rather than whole images. The exact analytical value, obtained by substituting (2) into (3), is
$$t(s) = -\frac{\ln f}{a s + b}. \qquad (4)$$
(vii) Rather than the elimination of part of each image according to (4), we have preferred to use this formula to select an image after which all data are discarded. This is the average image among all reflections retained.
The reason for this is related to the way that AIMLESS (Evans & Murshudov, 2013), and indeed most scaling programs, deals with radiation damage. This is implicitly included in scaling models as the temperature factor B and, as long as it is not too severe, the determined scale factors should correct for global radiation effects to a good approximation.
The regression model adopted in the procedure described here is an approximation to the actual decay. The linear increase of β with resolution is also an approximation. They are, in essence, the simplest available models compatible with the observed phenomenon of radiation damage in crystals.
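The steps of this appendix can be summarized in a short Python sketch. It is a simplified illustration under stated assumptions and not the BLEND implementation: the function names are hypothetical, the exponential fit is carried out as a straight-line fit to the logarithm of the running-average intensities, the image number is used as the time variable and the single cutoff image is taken as the unweighted mean of the per-shell cutoff times, which is only one possible reading of step (vii).

```python
import math


def linear_fit(x, y):
    """Least-squares fit y = slope*x + intercept; returns (slope, intercept, slope std error)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    # With only two points the standard error is undefined; 0.0 is returned as a placeholder.
    slope_se = math.sqrt(resid / (n - 2) / sxx) if n > 2 else 0.0
    return slope, intercept, slope_se


def decay_coefficient(mean_intensities):
    """Decay rate of I(t) = A*exp(-rate*t), from a linear fit to log(I) against image number."""
    t = list(range(len(mean_intensities)))
    slope, _, _ = linear_fit(t, [math.log(i) for i in mean_intensities])
    return -slope


def cutoff_image(shells, f=0.75):
    """shells maps inverse resolution s (1/Å) to the average intensity per image in that shell.

    Returns the last image to keep, or None if no significant damage trend is found.
    """
    s_values = sorted(shells)
    rates = [decay_coefficient(shells[s]) for s in s_values]
    a, b, a_se = linear_fit(s_values, rates)
    if not (a > 0 and a > a_se):  # slope must be positive and exceed one standard deviation
        return None
    cutoffs = [-math.log(f) / (a * s + b) for s in s_values if a * s + b > 0]
    if not cutoffs:
        return None
    return int(sum(cutoffs) / len(cutoffs))  # assumed averaging over shells


# Synthetic example (invented numbers): the decay rate grows linearly with s.
synthetic = {s: [1000.0 * math.exp(-0.05 * s * t) for t in range(50)] for s in (0.2, 0.3, 0.4)}
print(cutoff_image(synthetic))
```

A wedge for which `cutoff_image` returns None would be kept in full, whereas otherwise only the images up to the returned index would be passed on to scaling.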