Choosing your (Friedel) mates wisely: grouping data sets to improve anomalous signal
(a) Structural Biology Group, European Synchrotron Radiation Facility, 71 Avenue des Martyrs, F-38000 Grenoble, France, and (b) Department of Agricultural, Food and Environmental Sciences, Marche Polytechnic University, Via Brecce Bianche, 60131 Ancona, Italy
*Correspondence e-mail: max.nanao@esrf.fr
Single-wavelength anomalous diffraction (SAD) phasing from multiple crystals can be especially challenging in samples with weak anomalous signals and/or strong non-isomorphism. Here, advantage is taken of the combinatorial diversity possible in such experiments to study the relationship between merging statistics and downstream metrics of phasing signals. It is furthermore shown that a genetic algorithm (GA) can be used to optimize the grouping of data sets to enhance weak anomalous signals based on these merging statistics.
Keywords: single-wavelength anomalous diffraction; anomalous scattering; multi-crystal crystallography; serial crystallography; merging.
1. Introduction
Although de rigueur before the advent of modern cryocooling methods, until recent years the assembly of a complete data set from multiple incomplete data sets had become less common. Recently, thanks in no small part to the development of methods for serial crystallography at free-electron lasers, there has been renewed interest in all aspects of multi-crystal crystallography. The use of multiple crystals allows much better data quality for a given X-ray dose because of the potential to use many hundreds or thousands of crystals. While this has previously been explored via microbeams and high-precision sample-manipulation devices (Cusack et al., 1998), the scale and the nature of sample delivery have expanded tremendously. One of the key challenges associated with the use of multiple crystals is non-isomorphism. This is particularly relevant to anomalous phasing, especially in cases where the anomalous signal is very low. Previous work has shown that better data can be obtained by merging subsets of data using, for example, correlation coefficients between sub-data sets or unit-cell parameter clustering (Santoni et al., 2017; Foadi et al., 2013; Liu et al., 2013). For a handful of sub-data sets all combinations of sub-data sets can be evaluated, but this becomes unfeasible with even a relatively small number of data sets, since the number of combinations is 2^n − 1 for n data sets. We have recently introduced an alternative approach in which a genetic algorithm (GA) is used to partition the pool of data sets into subgroups (Zander et al., 2016; Foos et al., 2018). The GA formulates the selection of isomorphous groups in evolutionary terms. The key concept is the encoding of the sub-data-set grouping in a data structure known as a chromosome. Here, a chromosome is an array of integers of length n, where n is the number of sub-data sets. The numeric value of each integer in the array encodes the merging group to which a sub-data set belongs.
There is no limit to the number of groups that can be used, and the number can be chosen based on the number of non-isomorphous groups that are present in the data. In practice, it is common for only two groups to exist, but the default behaviour is to use three groups in case a third non-overlapping (and non-isomorphous) group exists (or, for example, two high-quality but non-isomorphous groups and a third `low-quality' group of sub-data sets that do not merge with either of the two former groups). The first step of the GA is to randomly initialize a set of chromosome data structures. This population of chromosomes is then submitted to cycles of GA optimization, which consist of mutating positions in the array (changing the merging group), single- and double-crossover events between chromosomes, and evaluation of the `fitness' of individual chromosomes. The `fitness' is derived from the Rmeas value, the 〈I/σ(I)〉 value, the CC1/2 value, the completeness, the multiplicity and the anomalous correlation coefficient. The relative weighting of these different components can be adjusted depending, for example, on the presence or absence of anomalous scatterers. Furthermore, default values have been determined which are generally effective. If anomalously scattering elements are present, this approach depends critically upon a connection between merging statistics and the `solvability' of a data set. Considerable work has gone into studying the relationship between merging statistics and anomalous signal in single crystals, but less is known about the multi-crystal case. Although normally an impediment, we have taken advantage of the fact that sub-data sets can be assembled into single data sets in a large number of different ways. We use this fact to generate a large number of data sets with different merging statistics, and re-evaluate commonly used merging statistics in the context of multi-crystal data.
In doing so, we show that the anomalous signal can be improved using the GA approach, that the anomalous correlation coefficient appears to be the best target for GA optimization and that this metric translates into improved downstream metrics such as anomalous difference-map peak heights, the ability to determine substructures and phasing success.
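The chromosome encoding described above can be illustrated with a minimal Python sketch. The function names (`count_nonempty_subsets`, `decode_chromosome`) and the example chromosome are hypothetical and are not taken from the published script; only the encoding itself (one integer group label per sub-data set) and the 2^n − 1 count follow the text.

```python
from collections import defaultdict


def count_nonempty_subsets(n):
    """Number of ways to pick a non-empty subset of n sub-data sets (2^n - 1)."""
    return 2 ** n - 1


def decode_chromosome(chromosome):
    """Map each merging-group label to the indices of the sub-data sets assigned to it."""
    groups = defaultdict(list)
    for index, group in enumerate(chromosome):
        groups[group].append(index)
    return dict(groups)


# Hypothetical chromosome: n = 8 sub-data sets assigned to three merging groups (0, 1, 2).
chromosome = [0, 2, 0, 1, 1, 0, 2, 0]
groups = decode_chromosome(chromosome)

# Exhaustive evaluation is already out of reach for the 158 thermolysin sub-data sets.
subsets_for_158 = count_nonempty_subsets(158)
```

Each group returned by `decode_chromosome` would then be merged separately, so a single chromosome describes one complete partition of the pool rather than one subset.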
2. Methods
2.1. Sample preparation
Thermolysin crystals from Bacillus thermoproteolyticus and urease crystals from Sporosarcina pasteurii were prepared as described previously in Zander et al. (2016). Briefly, thermolysin was crystallized in 35% ammonium sulfate and was then cryoprotected with 2 M trimethylamine N-oxide. Urease was crystallized in 1.6–2.0 M ammonium sulfate in 50 mM sodium citrate buffer pH 6.3 and was then cryoprotected with 20% ethylene glycol, 2.4 M ammonium sulfate in 50 mM sodium citrate buffer pH 6.3. Cerulean crystals were grown using a microseeding method. A macrocrystal was obtained using the protocol described by Lelimousin et al. (2009). Macrocrystals were ground in 100 µl seeding buffer (0.1 M HEPES pH 6.75, 22% PEG 8000). The seeds were diluted to 1/100 in seeding buffer. The protein (15 mg ml−1) was digested with trypsin (0.5 mg ml−1) for 1 h [the ratio of trypsin to protein was 1:10(v:v)]. The seeds were mixed with digested protein at a ratio of 10%(v/v). Crystals were grown in 0.1 M HEPES pH 7, 14% PEG 8000, 0.1 M MgCl2 in 1–1.5 µl hanging drops using the vapour-diffusion method. The crystals were transferred to a cryoprotectant solution consisting of the mother liquor supplemented with 20%(v/v) glycerol (S. Aumonier, personal communication).
2.2. Data collection
2.2.1. Thermolysin and urease
The thermolysin data described in Zander et al. (2016) (Table 1) were re-analyzed and were reprocessed using XDS v.20180126. These data consisted of four different MeshAndCollect workflows on four different samples of thermolysin (Zander et al., 2015). The urease sub-data sets from Zander et al. (2016) were re-analyzed for anomalous phasing without modification.
2.2.2. Cerulean
Cerulean data were collected at 100 K using a Dectris PILATUS3 6M detector on the ID30B MAD beamline at 6.0 keV at the European Synchrotron Radiation Facility (Mueller-Dieckmann et al., 2015) (Table 1). Data collection was performed using the MeshAndCollect workflow (Zander et al., 2015). This resulted in 480 sub-data sets in four mesh scans that were processed by XDS (Kabsch, 2010).
2.3. Genetic algorithm
Our GA implementation is based on DEAP (https://github.com/DEAP/deap) as described previously (Zander et al., 2016). However, the Python script was modified to use the overall statistics rather than the inner-shell statistics for all metrics except the Rmeas value. The GA uses default probabilities for mutation and crossover: 0.6 and 0.3, respectively. The overall target function is
where
The weighting terms wRmeas, w〈I/σ(I)〉, wanomalous CC, wCC1/2, wcompleteness and wmultiplicity can be any floating-point value and are by default 1.0, 2.0, 0.0, 1.0, 0.2 and 0.0, respectively. In these tests, a simple weight-balancing scheme was employed based on the statistical values after a single optimization cycle. Specifically, an Ruser value is specified by the user to which the other components are scaled. The optimization is then run for a single cycle. The best values from each term are then obtained. The weighting scores are then computed as w〈I/σ(I)〉 = Ruser/〈I/σ(I)〉overall best, wanomalous CC = Ruser/anomalous CCoverall best, wCC1/2 = Ruser/CC1/2 overall best and wcompleteness = Ruser/completenessoverall best. While this method has the major drawbacks of not being based on the fully optimized values and requiring some knowledge of Rmeas, it nevertheless mitigates the domination of the fitness function by a single term. Automatic balancing of the weights brings the terms to approximately equal weights. Since in many cases, and in particular in this study, we wish to emphasize the importance of one term (i.e. the anomalous signal), user-specified weights can then be applied by multiplying the user weight by the automatic weight.
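The weight-balancing scheme can be sketched as follows. This is an illustration only: the function name `balance_weights`, the dictionary keys and all numerical values are hypothetical, and the sketch covers only the scaling w = Ruser/best value and the optional user multiplier described above, not the target function itself.

```python
def balance_weights(r_user, best_values, user_weights=None):
    """Return one weight per term so that weight * best_value equals r_user,
    multiplied by an optional user-specified emphasis factor (default 1.0)."""
    user_weights = user_weights or {}
    return {
        name: user_weights.get(name, 1.0) * r_user / best
        for name, best in best_values.items()
    }


# Hypothetical best values of each term after a single optimization cycle.
best_after_one_cycle = {
    "i_over_sigma": 9.5,
    "anom_cc": 20.0,
    "cc_half": 99.0,
    "completeness": 98.0,
}

# Hypothetical Ruser value (same units as Rmeas, assumed here to be per cent).
auto = balance_weights(60.0, best_after_one_cycle)

# Emphasize the anomalous CC term by a user weight of 3 on top of the automatic weight.
emphasised = balance_weights(60.0, best_after_one_cycle, {"anom_cc": 3.0})
```

After balancing, every term contributes about Ruser at its best value, so a user weight larger than 1 on one term raises its contribution by that factor relative to the others.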
Normally, intermediate generations in GA optimization are deleted. However, we realized that the large number of combinations of sub-data sets calculated during the course of the GA optimization could provide more data with which to explore the relationships between merging statistics and downstream metrics (such as anomalous difference map peak heights, substructure correctness and phase errors against refined structures). We therefore disabled the deletion of intermediate solutions. As a control, this was also performed with the GA mutation and crossover functions set to a probability of 0, and the selection scheme changed from fitness-based to random.
The preparation of FA values was performed using SHELXC (Sheldrick, 2010). Reference structures for anomalous difference map and phase-error calculations were created by downloading PDB files 2wso for Cerulean (Lelimousin et al., 2009), 4ceu for urease (Benini et al., 2014) and 3zi6 for thermolysin (Ferrer et al., 2013), followed by several rounds of refinement in REFMAC5 (Murshudov et al., 2011) and manual rebuilding in Coot. A final refinement in PDB-REDO was then performed (Joosten et al., 2014). The refined reference structures for thermolysin (Rcryst = 16.8%, Rfree = 19.7%), Cerulean (Rcryst = 17.4%, Rfree = 22.5%) and urease (Rcryst = 16.5%, Rfree = 19.8%) were used to calculate model-phased anomalous difference maps using ANODE (Thorn & Sheldrick, 2011).
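The idea of retaining every intermediate grouping, together with the random-selection control, can be sketched in plain Python. This is not the DEAP-based script: the function names, the tournament selection, the restriction to single crossover and the toy fitness are all assumptions made for illustration. In practice the fitness would be computed from the merging statistics of each grouping.

```python
import random


def mutate(chromosome, n_groups, rng):
    """Reassign one randomly chosen sub-data set to a randomly chosen merging group."""
    child = list(chromosome)
    child[rng.randrange(len(child))] = rng.randrange(n_groups)
    return child


def single_crossover(a, b, rng):
    """Swap the tails of two chromosomes at one random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def evolve(fitness, n_sets, n_groups=3, pop_size=20, cycles=10,
           p_mut=0.6, p_cx=0.3, control=False, seed=0):
    """Run a toy GA (pop_size assumed even) and return every grouping evaluated."""
    rng = random.Random(seed)
    if control:
        p_mut = p_cx = 0.0  # control run: no mutation, no crossover
    population = [[rng.randrange(n_groups) for _ in range(n_sets)]
                  for _ in range(pop_size)]
    history = {}  # keyed by grouping, so repeated individuals are stored once
    for _ in range(cycles):
        offspring = []
        for a, b in zip(population[::2], population[1::2]):
            if rng.random() < p_cx:
                a, b = single_crossover(a, b, rng)
            offspring.extend([a, b])
        offspring = [mutate(c, n_groups, rng) if rng.random() < p_mut else c
                     for c in offspring]
        for c in offspring:
            history[tuple(c)] = fitness(c)  # intermediate solutions are kept
        if control:
            population = [rng.choice(offspring) for _ in range(pop_size)]
        else:
            # Assumed selection scheme: tournaments of size 3.
            population = [max(rng.sample(offspring, 3), key=fitness)
                          for _ in range(pop_size)]
    return history


def toy_fitness(chromosome):
    """Hypothetical stand-in for a score derived from merging statistics."""
    return chromosome.count(0)


ga_history = evolve(toy_fitness, n_sets=12)
control_history = evolve(toy_fitness, n_sets=12, control=True)
```

Because the history is keyed by the grouping itself, the number of stored solutions can be smaller than population size times number of cycles whenever the same grouping recurs.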
2.4. Substructure determination and phasing
FA values and amplitudes from SHELXC were then used in SHELXD and SHELXE for phasing (Sheldrick, 2010). The SHELXD settings for thermolysin were FIND 5, NTRY 8000, SHEL 50 2.5. For urease and Cerulean, different strategies were tried with different resolution cutoffs, atom numbers and NTRY keywords.
3. Results and discussion
3.1. Thermolysin
3.1.1. Merging statistics and anomalous peak height
Merging all data produced a data set (All_T) with an extremely poor Rmeas overall value of 71.4% (Table 2). By contrast, 〈I/σ(I)〉overall was 9.7, CCanom overall was 20% and CC1/2 overall was 99.3%. Randomly selecting data sets to create 7500 merged data sets produced 〈I/σ(I)〉overall values between 1.6 and 4.2 (Fig. 1, left panel). In order to improve this signal further and to explore the relationship between anomalous peak height and various indicators of data quality, we optimized the grouping of these 158 sub-data sets using a GA. The algorithm was run for 150 cycles with a population of 50 individuals. This produced an improvement in the maximal 〈I/σ(I)〉overall and CC1/2 overall (Fig. 1), as well as other merging statistics, showing the efficacy of the GA method in improving these merging statistics. The best merging group (GA_T) showed improvements in the CC1/2 overall (99.6%), CCanom overall (22%) and Rmeas overall (58.8%) values, but 〈I/σ(I)〉overall was unchanged (9.7) compared with merging all sub-data sets. In order to explore the relationship between merging statistics and anomalous signal, we next looked at the relationship between model-phased anomalous peak heights and merging statistics. For this analysis, all intermediate GA solutions were used as input for SHELXC and ANODE. The individual merging statistics were then plotted as a function of anomalous peak height (Fig. 2). Nearly all merging statistics provided a good correlation with anomalous peak height. However, the Rmeas values would normally all be deemed to be unacceptably high. Despite this fact, the mean anomalous peak height for Rmeas overall values in the range 50–70% was 46 standard deviations above the mean density value (σ). Similarly, CC1/2 overall has a strong correlation with anomalous peak height, but only at higher values (above 95%).
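‡ $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i}|I_{i}(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i}I_{i}(hkl)$ and $R_{\mathrm{meas}} = \sum_{hkl}[N(hkl)/(N(hkl)-1)]^{1/2}\sum_{i}|I_{i}(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i}I_{i}(hkl)$, where $I_{i}(hkl)$ is the $i$th observation of reflection $hkl$, $\langle I(hkl)\rangle$ is the mean of all observations of that reflection and $N(hkl)$ is its multiplicity.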
3.1.2. Substructure determination and phasing
Despite large anomalous peak heights for many merged data sets (even merging all data produced a maximal anomalous difference map peak height of 46σ), structure solution was not straightforward. For both the All_T data set and the best GA data (GA_T), the position of the Zn atom could easily be determined. The four calcium sites had significantly lower peak heights and could only be found in anomalous residual maps. SHELXD was run for ∼5500 of the intermediate GA solutions and all of the resultant substructure-solution statistics were evaluated. The success of substructure determination, as evaluated by a plot of CCall versus CCweak from SHELXD, showed a clear trend of larger CCall/CCweak values for data sets with higher 〈I/σ(I)〉overall, CCanom overall and CC1/2 overall values and lower Rmeas overall values (Fig. 3). Phasing by SHELXE was possible for both the All_T data set and the GA_T data, but the GA data required only four rounds of solvent flattening/automatic building to obtain a partial CC value of >25%, while the `all data' data set required eight rounds. High-quality models with partial CC values of 34% were produced from both data sets. In order to further examine the relationship between merging statistics and phasing, we ran SHELXE for all of the data sets for which SHELXD had been run. This extremely large-scale set of phasing data was analysed in order to determine which merging statistics are correlated with phasing success. As in previous steps, we observed that CC1/2 overall, 〈I/σ(I)〉overall and anomalous CC all correlated well with the successful phasing of structures (Fig. 4). Previous studies and anecdotal evidence have suggested that one of the most reliable metrics of phasing in SHELXE is the correlation coefficient of the partially automatically built model with the native data (Usón & Sheldrick, 2018). A threshold of 25% has been given as a cutoff value above which the structure is likely to be solved.
We therefore plotted this metric against the weighted mean-phase error (wMPE) and were surprised to see that excellent wMPE values could be obtained even at CC values of 12% (Fig. 5). This reinforces the rule that at a CC of 25% the structure is almost certainly solved, but also suggests that values as low as 10% are worth examining in more detail.
3.2. Cerulean
The fluorescent protein Cerulean is a 239-residue protein from the green fluorescent protein family with five S atoms. The very small number of S atoms compared with the number of amino-acid residues makes de novo phasing of this protein extremely difficult, but it represents a good test case for the improvement of weak anomalous signals.
3.2.1. Merging statistics and anomalous peak height
Merging all data produced a data set (All_C) with an extremely poor Rmeas overall value of 60.5% (Table 2). By contrast, 〈I/σ(I)〉overall was 16.9, CCanom overall was 9% and CC1/2 overall was 98.6%. The GA was run for 500 cycles with a population of 25 individuals. This produced improvements in CC1/2 overall (99.1%), Rmeas overall (56.2%) and CCanom overall (15%), but did not improve 〈I/σ(I)〉overall (15.9). Despite these relatively modest gains in the merging statistics, the GA optimization yielded a significantly improved maximal anomalous peak height (11.8σ) compared with 9.7σ on merging all data (over Sγ of Cys170). The average anomalous peak heights were improved to 9.8σ for the cysteine Sγ atom and 9.2σ for the methionine Sδ atom, compared with 8.3σ and 8.2σ, respectively, when merging all data. As for thermolysin, all XSCALE merging runs were examined to explore the relationship between the merging statistics and anomalous peak heights (Fig. 6). In this case, the overall 〈I/σ(I)〉, CCanom overall and CC1/2 overall are also good choices for optimization. However, taken together with the fact that the 〈I/σ(I)〉overall for merging all data was higher than that for the GA selected grouping (GA_C) but produced lower peak heights, it would be advisable to weight the CCanom overall term highest.
3.2.2. Substructure determination and phasing
For both GA_C and All_C, we attempted to determine the substructure and calculate phases. To improve the chance of success, our attempts used two different programs: SHELXD and HySS from PHENIX (Grosse-Kunstleve & Adams, 2003; McCoy et al., 2004). Although the peak heights obtained by GA optimization were reasonably high, the substructure could not be determined de novo, even when trying multiple low- and high-resolution cutoffs (2.7–4.3 Å in 0.2 Å increments) and different numbers of atoms (five or seven) in SHELXD with NTRY = 30 000 for each attempt. We next attempted to identify a consensus solution within all of the SHELXD results using SITCOM (Dall'Antonia & Schneider, 2006). In post-analysis using phenix.emma to compare this consensus site with the known sites extracted from the refined structure, we retrieved a subset of four correct sites using the GA_C data set versus three using the All_C data set (Adams et al., 2010; Grosse-Kunstleve & Adams, 2003). However, these four sites were distributed in a large list (82 total) of sites which was obtained by merging multiple coordinate files output from SITCOM. Therefore, identifying these correct sites without previous knowledge is unlikely. The most promising result was obtained for one SHELXD run, which resulted in two correct sites that were found using the GA_C data set. Despite this, it was not possible to optimize the substructure nor to determine the phases from this partial solution. Indeed, for both data sets (GA_C and All_C), even starting with the known substructure, obtaining interpretable phases using SHELXE or Phaser (0.4% solvent content and 2.2 and 2.19 Å resolution, respectively) was impossible (Read & McCoy, 2011). One possible explanation for this is that 〈I/σ(I)〉overall was lower than for previously determined S-SAD proteins in general, including the previously described thermolysin data (Cianci et al., 2008; Akey et al., 2016).
3.3. Urease
The S. pasteurii urease data described in Zander et al. (2016) were re-analyzed here, with a focus on anomalous phasing. Urease has 29 S atoms in 26 methionines and three cysteines on three different polypeptide chains, with 790 amino acids in total.
3.3.1. Merging statistics and anomalous peak height
Merging all data (All_U) produced an Rmeas overall value of 87.7%, an 〈I/σ(I)〉overall of 24.6, a CC1/2 overall of 99.9% and a CCanom overall of 14% (Table 2). The GA was run for 150 cycles with a population of 30 individuals and three groups. This produced a data set with the same CC1/2 overall (99.9%), a slightly worse 〈I/σ(I)〉overall (23.7) and improvements to both Rmeas overall (64.9%) and CCanom overall (18%). Despite the similar statistics for CC1/2 overall and 〈I/σ(I)〉overall, inspection of model-phased anomalous difference maps revealed increased peak heights for the GA data set (GA_U). While the All_U data set produced a maximal peak height of 16.6σ over Met479 and average values of 11.2σ and 7.7σ over methionine and cysteine residues, the GA_U data set produced a maximal peak height of 18.8σ over Met479 and average values of 11.9σ and 8.6σ over methionine and cysteine residues. As in the previous systems, we then examined the overall trends of merging statistics versus anomalous difference map peak heights (Fig. 7). As in Cerulean and thermolysin, a generally good correlation exists between all of the metrics and anomalous difference peak heights. However, CCanom overall appears to be particularly useful in discriminating between the groupings that yield the highest anomalous difference peak heights.
3.3.2. Substructure determination and phasing
Despite the large anomalous difference map peak heights, the sulfur substructure could not be determined de novo using SHELXD, PHENIX or PRASA (Skubák, 2018). We used the same approach as that described for the Cerulean case but with NTRY = 50 000 and a set of high-resolution cutoffs from 2.5 to 3.9 Å every 0.2 Å. These produced multiple sets of possible substructures, which were then combined in SITCOM to identify the most represented sites. Unfortunately, very few sites from the known substructure could be found in the consensus substructure. The most promising individual SHELXD run resulted in five correct sites that were found using the GA_U data set, but this could not be bootstrapped to successful phasing. Nevertheless, there was adequate signal in the data to determine high-quality phases starting from the known substructure (Fig. 8). Interestingly, when merging all data SHELXE could only produce a model with a partial CC of >20% (20.7%) after 14 macrocycles, whereas the GA-selected data produced a model with a partial CC of 22% after only four macrocycles and of 29% after six macrocycles. This suggests that even improvements of a few percent in CCanom overall can make a significant difference in phasing. A subset of the ∼12 000 intermediate merging runs (the total number is less than 30 × 150 × 3 because some individuals with the same grouping can occur during the GA evolution) was randomly selected for phasing starting from the known substructure. Specifically, 4300 merged groups were submitted to phasing and automatic building in SHELXE. Of these, 950 runs produced SHELXE partial CCs greater than 20%. Because CCanom overall appears to be the most correlated to anomalous peak height at larger values (Section 3.3.1), we examined this merging metric and its relationship to how many SHELXE macrocycles of automatic building/solvent flattening were required for successful phasing. The minimum CCanom overall that produced partial CCs of >20% was 13%, and this run required six iterations. The maximum CCanom overall was 18%.
There were 1856 data sets with a CCanom overall of 18% and, of these, 57 could be solved with six iterations, 150 with five iterations, 389 with four iterations and 60 with three iterations. There was significant overlap between the most readily solved data sets (three, four and five iterations required) and the best GA merge, with an average similarity to the GA merge of 87%. It is not immediately obvious why the best overall SHELXE run required only three iterations compared with the best GA merge, which required four, despite both data sets having a CCanom overall of 18% and having been started from the same substructure. Indeed, the best overall data set had a slightly lower maximal anomalous peak height (18.4σ) and slightly worse Rmeas overall (66.2%), 〈I/σ(I)〉overall (23.04) and CC1/2 overall (99.7%) values compared with the GA. This suggests that for the purposes of optimization, a new metric for anomalous signal could be useful.
4. Conclusions
Here, we have examined three experiments which combine relatively low anomalous signals with multi-crystal data collections. For thermolysin, urease and Cerulean, the estimated Bijvoet difference ratios are 1.7%, 1.3% and 1.1%, respectively. According to Olczak & Cianci (2018), phasing could be successful for the data sets presented here when 〈I/σ(I)〉overall is in the range 23–69 for thermolysin, 30–90 for urease and 34–104 for Cerulean. It is therefore not surprising that de novo phasing was not possible for urease and Cerulean, given their overall 〈I/σ(I)〉overall values of 24 and 16.9, respectively. However, it is somewhat surprising to see that a solution was found for thermolysin, albeit not in a straightforward manner, for which a data set with an 〈I/σ(I)〉overall of only 9.7 could be assembled. Previously, thermolysin was solved by zinc SAD, with a similarly low estimated Bijvoet difference ratio of 1.1%, but with an 〈I/σ(I)〉overall of 53.7 (Ferrer et al., 2013). It is possible then that these estimates could be re-evaluated in the context of multi-crystal phasing.
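For orientation, a Bijvoet difference ratio of this size can be reproduced with the widely used Hendrickson–Teeter-type estimate, ⟨|ΔF|⟩/⟨|F|⟩ ≈ √2 · √N_A · f″ / (√N_P · Z_eff). The sketch below is an illustration under stated assumptions and is not necessarily the estimator used for the values quoted above: f″ ≈ 0.95 e for sulfur near 6 keV, about 7.8 non-H atoms per residue and Z_eff = 6.7 are approximate, assumed inputs, and the function name is hypothetical.

```python
import math


def bijvoet_ratio(n_anom, f_double_prime, n_protein_atoms, z_eff=6.7):
    """Estimated <|dF|>/<|F|> for n_anom anomalous scatterers with imaginary
    scattering contribution f_double_prime (electrons) among n_protein_atoms
    non-H atoms of effective atomic number z_eff."""
    return (math.sqrt(2.0) * math.sqrt(n_anom) * f_double_prime
            / (math.sqrt(n_protein_atoms) * z_eff))


# Cerulean: five S atoms in 239 residues (from the text); other inputs are assumptions.
cerulean_ratio = bijvoet_ratio(n_anom=5, f_double_prime=0.95,
                               n_protein_atoms=239 * 7.8)
```

With these inputs the estimate for Cerulean comes out at about 1%, of the same order as the value quoted above.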
One of the goals of this work was to study the connection between different merging statistics and multi-crystal phasing. Previous work by Terwilliger and coworkers and Zwart showed a strong correlation between CCanom and experimental map correlation (Terwilliger et al., 2016; Zwart, 2005). We used a similar approach, but took advantage of the fact that sub-data sets can be assembled into a large number of unique data sets. We could then analyse the relationship between the merging statistics for these data sets and the structure-solution statistics. This analysis showed that while most traditionally used merging statistics can be used in a GA target function to optimize anomalous signal, there are some particularities of multi-crystal data collections. In particular, R values that would normally indicate very poor quality data sets still produce large anomalous difference map peak heights and, in the case of thermolysin, de novo phases. This reinforces the notion that other merging indicators should generally be used in place of R values, and that this is especially true for multi-crystal data. Interestingly, in all three cases the GA data sets have a significantly reduced multiplicity (by a factor of 2) while retaining similar or better merging statistics. This suggests that non-isomorphism plays an appreciable role that is perhaps even larger than that observed in previous work (Assmann et al., 2016). Indeed, it can be inferred that systematic errors play a significant role, because 〈I/σ(I)〉 should increase with greater multiplicity if the errors were predominantly random. A deeper analysis using recently developed methods for analysing the random and systematic error components of multi-crystal data will be pursued in future work (Diederichs, 2017). 
This will have implications for which metrics are the most suitable for inclusion in the GA target function, especially because the relationship between 〈I/σ(I)〉 and CC1/2 in particular can be quite different depending on the dominant type of error.
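The multiplicity argument above can be made concrete with a short calculation. If the errors were purely random, 〈I/σ(I)〉 of the merged data would scale as the square root of the multiplicity, so halving the multiplicity should lower it by a factor of √2. The sketch below applies this to the thermolysin values; the function name is hypothetical and the purely random error model is an idealization.

```python
import math


def expected_i_over_sigma(initial, multiplicity_ratio):
    """Predicted <I/sigma(I)> after changing the multiplicity by multiplicity_ratio,
    assuming purely random (uncorrelated) errors."""
    return initial * math.sqrt(multiplicity_ratio)


# Thermolysin: All_T has <I/sigma(I)>overall = 9.7; the GA grouping keeps
# roughly half of the multiplicity.
predicted_ga_t = expected_i_over_sigma(9.7, 0.5)
```

The prediction is about 6.9, whereas the observed 〈I/σ(I)〉overall for GA_T is unchanged at 9.7, which is consistent with a substantial systematic (non-isomorphism) component in the discarded sub-data sets.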
Nevertheless, for all three test cases CCanom overall is correlated with the highest anomalous peak heights and is thus a logical target for GA or other optimization (Diederichs & Karplus, 2013). Indeed, differences of only a few percent in this value can make the difference between solving and not solving a structure. Increasing this signal, which is related to the strength of the anomalous signal and to the noise introduced by non-isomorphism, is not a new concept. However, the combination of a very large number of crystals with very weak anomalous signal is a relatively recent development. In this work, we have explored the connection between merging statistics and phasing for three such systems. In order to further improve multi-crystal phasing experiments and analysis, it will be necessary to revisit the origins of non-isomorphism and to perhaps identify new metrics for anomalous signal.
Acknowledgements
The authors would like to acknowledge Sylvain Aumonier, Guillaume Gotthard and Antoine Royant for Cerulean crystals, Ulrich Zander for assistance with serial crystallography experiments, the allocation of beamtime (SSX BAG) for this and other ongoing SSX projects at the ESRF by the ESRF MX Beamtime Allocation Panel and the support of the ESRF Molecular Biology Laboratory, in particular Montserrat Soler-Lopez.
References
Adams, P. D., Afonine, P. V., Bunkóczi, G., Chen, V. B., Davis, I. W., Echols, N., Headd, J. J., Hung, L.-W., Kapral, G. J., Grosse-Kunstleve, R. W., McCoy, A. J., Moriarty, N. W., Oeffner, R., Read, R. J., Richardson, D. C., Richardson, J. S., Terwilliger, T. C. & Zwart, P. H. (2010). Acta Cryst. D66, 213–221.
Akey, D. L., Terwilliger, T. C. & Smith, J. L. (2016). Acta Cryst. D72, 296–302.
Assmann, G., Brehm, W. & Diederichs, K. (2016). J. Appl. Cryst. 49, 1021–1028.
Benini, S., Cianci, M., Mazzei, L. & Ciurli, S. (2014). J. Biol. Inorg. Chem. 19, 1243–1261.
Cianci, M., Helliwell, J. R. & Suzuki, A. (2008). Acta Cryst. D64, 1196–1209.
Cusack, S., Belrhali, H., Bram, A., Burghammer, M., Perrakis, A. & Riekel, C. (1998). Nature Struct. Mol. Biol. 5, 634–637.
Dall'Antonia, F. & Schneider, T. R. (2006). J. Appl. Cryst. 39, 618–619.
Diederichs, K. (2017). Acta Cryst. D73, 286–293.
Diederichs, K. & Karplus, P. A. (2013). Acta Cryst. D69, 1215–1222.
Ferrer, J.-L., Larive, N. A., Bowler, M. W. & Nurizzo, D. (2013). Exp. Opin. Drug. Discov. 8, 835–847.
Foadi, J., Aller, P., Alguel, Y., Cameron, A., Axford, D., Owen, R. L., Armour, W., Waterman, D. G., Iwata, S. & Evans, G. (2013). Acta Cryst. D69, 1617–1632.
Foos, N., Seuring, C., Schubert, R., Burkhardt, A., Svensson, O., Meents, A., Chapman, H. N. & Nanao, M. H. (2018). Acta Cryst. D74, 366–378.
Grosse-Kunstleve, R. W. & Adams, P. D. (2003). Acta Cryst. D59, 1966–1973.
Joosten, R. P., Long, F., Murshudov, G. N. & Perrakis, A. (2014). IUCrJ, 1, 213–220.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Lelimousin, M., Noirclerc-Savoye, M., Lazareno-Saez, C., Paetzold, B., Le Vot, S., Chazal, R., Macheboeuf, P., Field, M. J., Bourgeois, D. & Royant, A. (2009). Biochemistry, 48, 10038–10046.
Liu, Q., Liu, Q. & Hendrickson, W. A. (2013). Acta Cryst. D69, 1314–1332.
McCoy, A. J., Storoni, L. C. & Read, R. J. (2004). Acta Cryst. D60, 1220–1228.
Mueller-Dieckmann, C., Bowler, M. W., Carpentier, P., Flot, D., McCarthy, A. A., Nanao, M. H., Nurizzo, D., Pernot, P., Popov, A., Round, A., Royant, A., de Sanctis, D., von Stetten, D. & Leonard, G. A. (2015). Eur. Phys. J. Plus, 130, 70.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Olczak, A. & Cianci, M. (2018). Crystallogr. Rev. 24, 73–101.
Read, R. J. & McCoy, A. J. (2011). Acta Cryst. D67, 338–344.
Santoni, G., Zander, U., Mueller-Dieckmann, C., Leonard, G. & Popov, A. (2017). J. Appl. Cryst. 50, 1844–1851.
Sheldrick, G. M. (2010). Acta Cryst. D66, 479–485.
Skubák, P. (2018). Acta Cryst. D74, 117–124.
Terwilliger, T. C., Bunkóczi, G., Hung, L.-W., Zwart, P. H., Smith, J. L., Akey, D. L. & Adams, P. D. (2016). Acta Cryst. D72, 346–358.
Thorn, A. & Sheldrick, G. M. (2011). J. Appl. Cryst. 44, 1285–1287.
Usón, I. & Sheldrick, G. M. (2018). Acta Cryst. D74, 106–116.
Zander, U., Bourenkov, G., Popov, A. N., de Sanctis, D., Svensson, O., McCarthy, A. A., Round, E., Gordeliy, V., Mueller-Dieckmann, C. & Leonard, G. A. (2015). Acta Cryst. D71, 2328–2343.
Zander, U., Cianci, M., Foos, N., Silva, C. S., Mazzei, L., Zubieta, C., de Maria, A. & Nanao, M. H. (2016). Acta Cryst. D72, 1026–1035.
Zwart, P. H. (2005). Acta Cryst. D61, 1437–1448.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.