The PDB_REDO server for macromolecular structure model optimization
The refinement and validation of a crystallographic structure model is the last step before the coordinates and the associated data are submitted to the Protein Data Bank (PDB). The success of the refinement procedure is typically assessed by validating the models against geometrical criteria and the diffraction data, and is an important step in ensuring the quality of the PDB public archive [Read et al. (2011), Structure, 19, 1395–1412]. The PDB_REDO procedure aims for `constructive validation', aspiring to consistent and optimal refinement parameterization and pro-active model rebuilding, not only correcting errors but striving for optimal interpretation of the electron density. A web server for PDB_REDO has been implemented, allowing thorough, consistent and fully automated optimization of the refinement procedure in REFMAC and partial model rebuilding. The goal of the web server is to help practicing crystallographers to improve their model prior to submission to the PDB. For this, additional steps were implemented in the PDB_REDO pipeline, both in the refinement procedure, e.g. testing of resolution limits and k-fold cross-validation for small test sets, and as new validation criteria, e.g. the density-fit metrics implemented in EDSTATS and ligand validation as implemented in YASARA. Innovative ways to present the refinement and validation results to the user are also described, which together with auto-generated Coot scripts can guide users to subsequent model inspection and improvement. It is demonstrated that using the server can lead to substantial improvement of structure models before they are submitted to the PDB.
Crystallographic structure elucidation is a stepwise process with many decision points, and is therefore complex and labour-intensive. Over the years, this process has become more and more streamlined by automation. The crystallographic process, starting from the diffraction experiment itself, has greatly benefitted from faster computers and advances in crystallographic software. Automated pipelines are available for data reduction (e.g. Otwinowski & Minor, 1997; Vonrhein et al., 2011; Krug et al., 2012; Monaco et al., 2013; Winter et al., 2013), experimental phasing (e.g. Panjikar et al., 2005; Terwilliger et al., 2009; Pannu et al., 2011), molecular replacement (e.g. Keegan & Winn, 2007; Long et al., 2008; McCoy et al., 2007), density-map tracing and model building (e.g. Perrakis et al., 1999; Ioerger & Sacchettini, 2002; Cowtan, 2006; Terwilliger et al., 2008) and combinations thereof (e.g. Brunzelle et al., 2003; Holton & Alber, 2004; Kroemer et al., 2004).
The PDB_REDO pipeline (Joosten et al., 2012) focuses on automating the final steps of the crystallographic process, i.e. the optimization of the structure model through refinement and rebuilding. We have called this procedure `constructive validation' because throughout the process model quality assessment is used to determine whether the structure model can be improved. This takes away the need to define what is `good' or `bad', which is difficult for many metrics of model quality and can let `good enough' stand in the way of `excellent'. For example, an Rfree value of 18% is certainly `good' under most circumstances, but if it could be lowered to 16% this would be preferred. Automation allows a comprehensive approach to exploiting a large number of possible ways that can lead to a more accurate and reliable model, while minimizing user intervention. In addition, automation provides a consistency that would be difficult to achieve manually. For instance, PDB_REDO checks every amino-acid side chain to determine whether an alternative rotameric conformation can be found with an equal or better fit to the electron density. This is a daunting task if performed manually, especially for large structures, but rather trivial to perform computationally in a systematic manner: in many cases, this exhaustive search leads to substantial improvement of the geometric quality of the model (Joosten et al., 2011). PDB_REDO also optimizes refinement parameters, e.g. by finding good restraint weights and selecting the most suitable B-factor model. This also frequently leads to model-quality improvement, but again is quite time-consuming when performed manually.
The PDB_REDO pipeline was developed as command-line-oriented software for Linux. This allows high-throughput analysis of large sets of structure models, which is used to create a data bank of optimized and consistently treated PDB entries (Joosten et al., 2009). Our recent developments of PDB_REDO have made it an attractive tool for helping model refinement prior to submission to the PDB, to help ongoing crystallographic studies. However, the rather abstract textual output and a number of third-party software dependencies make the installation and routine/efficient use of PDB_REDO challenging. Moreover, as PDB_REDO is a work-in-progress project, we wanted to always make the latest software available to end users. To alleviate these barriers for end users, we chose to implement a PDB_REDO web server (https://xtal.nki.nl/PDB_REDO ). Additions to the PDB_REDO pipeline as well as new visual output that makes PDB_REDO more accessible to end users are discussed.
The PDB_REDO procedure consists of model refinement in REFMAC (Murshudov et al., 1997, 2011), rebuilding by Centrifuge (which deletes waters without electron density), SideAide [which (re)builds side chains in rotameric conformations] and pepflip (which flips the orientation of peptide planes to improve the fit with electron-density maps and the Ramachandran plot; Ramachandran et al., 1963) (Joosten et al., 2011), and validation in WHAT_CHECK (Hooft et al., 1996). Additional validation is performed by pdb-care for carbohydrates (Lütteke & von der Lieth, 2004) and FoldX (Guerois et al., 2002) for calculation of the Gibbs folding energy. These programs are linked together by a large set of decision-making algorithms which we have discussed in detail previously (Joosten et al., 2012).
Many decision-making algorithms in PDB_REDO aim to optimize the refinement parameterization for REFMAC. Two new features were added to improve this parameterization.
An additional improvement concerns the calculation of Rfree (Brünger, 1992) in cases where the test set of reflections used for calculation of Rfree is very small and stochastic effects can cause Rfree for a particular set of test reflections to be substantially (up to several percentage points) higher or lower than for an alternative test-set selection. In such cases, a single value of Rfree can be misleading and k-fold cross-validation (where k is the number of alternative test sets, e.g. 20 if the original test set constitutes 5% of all reflections) can be used to obtain averages and standard deviations for R, Rfree and their difference. In PDB_REDO, k-fold cross-validation (Picard & Cook, 1984; Kleywegt & Brünger, 1996) is used if the test set is smaller than 500 reflections. To ensure that the alternative test sets are `free', the input model for the cross-validation is perturbed by resetting the atomic B factors to a fixed value or by small shifts to the atomic coordinates in cases where individual atomic B factors cannot be used.
Two new validation routines were added to PDB_REDO.
The optimization of a structure model leads to changes at the global and at the local level, which can be quantified by the changes in the values of specific metrics that assess the fit of the model to the experimental data or the conformity of atomic geometry to a priori chemical knowledge. An important question is how to define the significance thresholds in the changes of established metrics. Such thresholds are important for the subsequent visualization of the results.
and we subsequently define a change of 2.6 as significant (as it corresponds to a p-value of 0.01 assuming a normal distribution).
Similarly, for the change of the free correlation coefficient a Zchange score of 2.6 is required. This Zchange score is calculated as
where N is the number of data points used to calculate the correlation coefficient and where the ZCC values are obtained by transforming the correlation coefficients of the initial and the final model using the Fisher transformation (Fisher, 1915),
There are no well defined targets for bond-length and bond-angle deviations expressed as root-mean-square Z-scores (r.m.s.Z), other than that the scores should not exceed 1.0 (Tickle, 2007). Therefore, an alternative scheme to mark changes that are likely to be significant was devised: if the initial r.m.s.Z is greater than 1.0 then any increase is considered to be a significant deterioration and any decrease to be a significant improvement. In addition, an increase from an r.m.s.Z value of less than 1.0 to an r.m.s.Z values of greater than 1.0 is always considered to be a significant deterioration. All other changes are considered to be insignificant.
Overall geometric quality scores are presented as the percentile rank with respect to all PDB entries (Bernstein et al., 1977; Berman et al., 2007) that are also represented in the PDB_REDO data bank (Joosten & Vriend, 2007). Given that about 70 000 structures are available in the PDB_REDO data bank, a change in percentile rank of a single point is thus equivalent to overtaking some 700 PDB entries for that specific metric. Thus, a change in percentile rank of one or greater is considered to be a significant change. The Ramachandran-plot, rotamer-quality and fine-packing percentiles are directly derived from the corresponding Z-scores from WHAT_CHECK (Vriend & Sander, 1993; Chinea et al., 1995; Hooft et al., 1997); the bump-severity percentile is based on the weighted bump severity BSw (4), which penalizes severe atomic clashes [i.e. clashes with large van der Waals (VdW) overlaps] and downweights minor clashes. We use this weighted score to downweight the minor clashes that could be brought about by too liberal VdW restraints or refinement without riding H atoms, and focus on bumps that are caused by actual fitting errors,
These global model quality scores (R, Rfree, free correlation coefficient, bond-length r.m.s.Z, bond-angle r.m.s.Z, Gibbs folding energy, Ramachandran-plot percentile, rotamer-quality percentile, bump-severity percentile and fine-packing percentile) are presented in a tabular form (Fig. 1a). The significant changes, according the criteria outlined above, are marked in red for deteriorations and green for improvements.
Here, we aim to show graphically how key quality indicators of the model at hand compare with similar resolution structures in the PDB and the PDB_REDO data bank before and after applying the PDB_REDO pipeline. To create this graph, we first retrieve `on the fly' the structures that are closest in terms of resolution to the structure at hand. For this, we sort all available entries by their distance in resolution space from the working model. The top 1000 nearest neighbours are then chosen, including any structures that have exactly the same distance as the 1000th nearest neighbour. Three global quality metrics (Rfree, Ramachandran-plot quality and rotamer quality) for all of these structures are used to create box-and-whisker plots representing the distributions of these values from these at least 1000 structure models of similar resolution (Fig. 1b). Outliers are flagged according to the 1.5 inter-quartile range (IQR) criterion (Tukey, 1977). These underlying distributions are calculated and plotted for both the PDB and the PDB_REDO data bank, and the values for the model at hand are then plotted as a blue line (before PDB_REDO) and an orange line (after PDB_REDO), allowing the user to judge how the structure models before and after PDB_REDO compare with similar published structure models and with the same structure models as optimized in the PDB_REDO data bank.
To present a static view of local changes in the model, per-residue changes in the fit to the electron-density map are plotted as the change in real-space correlation coefficient (RSCC) as calculated by EDSTATS (Fig. 1c). If the RSCC of a residue has an absolute Zchange of >2.6, as calculated in (2), it is marked in red for deterioration and green for improvement; if not, it is marked in grey. As Ninitial and Nfinal in (2) now represent the number of independent map grid points used to calculate the RSCC, Zchange not only depends on the magnitude of the RSCC change but also on the size of the residue and the resolution of the electron-density map. This means that at lower resolution Zchange becomes undefined for small compounds such as waters or ions. These residues are marked in white.
In addition to the static visuals implemented in the web browser, we want to enable the user to have an easily interactive view: local changes in the conformation of the structure such as changed rotamers, flipped peptides, flipped His/Asn/Gln residues, side chains, deleted waters or completed side chains are listed in scripts (in Scheme and Python format) that can be read directly in Coot (Emsley & Cowtan, 2004; Emsley et al., 2010). The scripts create a pop-up Coot window with a list of buttons that guide the user through all of the structural changes (Fig. 1d). The scripts are also available for the PDB_REDO data bank and are automatically loaded when the current version of the PDB_REDO plugin for Coot (Cereto-Massagué et al., 2013) is used.
The server is based on the framework of the YSBL software server (Long et al., 2008) consisting of a front-end web server for user interaction and a back-end computation node that manages the PDB_REDO jobs. The computation node (with four six-core Intel Xeon E5-4610 processors and 64 GB RAM) is set up to run a maximum of 46 PDB_REDO jobs in parallel. When this capacity is fully used, additional jobs are queued on a `first come, first served' basis. Job `requests' from the web server are accumulated as single files with references to the job parameters, based on user input. The web server and the computation node simply share common disk space for input, control and output files; the actual jobs are run on a local disk of the computation node. Job management on the computation node (starting and stopping processes) is implemented as a series of crontab scripts that are executed every few minutes and collect information from the files created by the web server.
New users register an account, after which immediate access is granted (no approval or control mechanism is in place). Each user has their personal, password-protected, workspace to submit and monitor PDB_REDO jobs (Fig. 2). Submitting a job consists of uploading reflection data in MTZ format (the correct data columns are automatically selected by the program kollumer), model coordinates in PDB format and (optionally) a file with geometric restraints describing nonstandard compounds and atomic links for REFMAC.
Detailed information on job progress is accumulated in real time as the PDB_REDO procedure goes through the several steps, and decisions are explained to the user. After a job is completed, the model change indicators described in §3 are computed and displayed in tabular and graphical form. A table with hyperlinks to the optimized structure model, an MTZ file with electron-density map coefficients, a keyword file to run REFMAC with the settings as optimized by PDB_REDO, the Coot scripts, a detailed log file and a compressed file with additional validation results, intermediate models and other files is also available.
Job results are stored for up to 30 days, but can be deleted sooner by the user. All data are considered private, except for statistical indicators of the PDB_REDO performance. After job deletion, only these statistical data (no coordinates or reflections) are retained.
The PDB_REDO server is set up to allow updates without taking the server offline. This way, the server can always run the latest version of the pipeline, allowing us to fix bugs that stop jobs from completing at very short notice. More importantly, new developments in the PDB_REDO pipeline are made available to users with minimal delay.
The server is free to use for academic and commercial users as long as they already hold a licence for the CCP4 suite (Winn et al., 2011). License requirements for YASARA and FoldX have been waived by their respective developers because only small parts of the programs' functionality are used.
To evaluate the performance of the PDB_REDO server, we analysed the results of 1167 jobs from data sets that already contained a test-set selection for Rfree and were submitted from December 2013 to April 2014. It should be noted that the jobs on the server do not constitute a carefully selected `test set'.
Fig. 3 shows the distributions of model quality changes resulting from using PDB_REDO in terms of Rfree and four geometric quality metrics. Rfree improvements of several percentage points are observed, with some of greater than 10 percentage points. A total of 40% of the runs result in an Rfree improvement of greater than 2.6 . Deteriorations of Rfree of the same magnitude occur for 9% of the runs: however, more than a quarter of these cases had initial Rfree values that were lower than the initial R, indicating that the test set was not `free' (i.e. the data in the test set were previously used to construct the structure model; see §2.2.1 of Joosten et al., 2012) at the start of the PDB_REDO run and that Rfree is underestimated. In such cases Rfree can increase by several percentage points when the refinement converges. We also, rather expectedly, observe that the chance of improving Rfree is higher if the initial R is high: at very low initial R factors (lower than 15%) there is little room for improvement and only 22% of all cases shows a significant drop in Rfree; this success rate increases to 76% for cases with an initial R factor of between 35 and 45%. At even higher starting R factors the success rate drops again to 52%: this may be explained by the presence of models that either need substantial rebuilding or models that are incorrect molecular-replacement solutions of a trial subset.
Looking at the basic global model quality scores we described above, we see reassuring trends.
PDB_REDO is under active development, which means that new features will be added to improve the results. The results show that particularly models that do not (yet) fit the experimental data well are likely to benefit from special treatment. For instance, more systematic use of jelly-body refinement in REFMAC (Murshudov et al., 2011) may improve the performance of PDB_REDO, especially for models that are in the early stages of refinement and model building or that only have very low-resolution data.
Not all server jobs complete successfully; currently, an average of 3% of submitted jobs stop prematurely. Limitations of the pipeline as well as problems with the input data may cause these stops. For instance, PDB_REDO cannot deal with unmerged reflections or with models derived from multi-model refinement at the moment. Examples of problems with the input data are atom names conflicting with the PDB standard or different residues with the same chain ID and residue number that are not each other's alternates. In most cases an error message describing the problem and possible solutions is given when the optimization process is halted. Further development is needed to improve the feedback to the user, but we have already corrected many problems that have occurred in the server, thus continuously improving the PDB_REDO pipeline.
Any automated tool has the risk of becoming a black box, and the PDB_REDO server is no exception. To reduce this risk, many items in the output of the server (e.g. the validation metrics and the choice of the B-factor model) are hyperlinked to a `frequently asked questions' web page for background information. The explicit comparison in the output of the input model and the final model and the visualization scripts for Coot will hopefully also encourage users to critically review the results.
The PDB_REDO web server is a tool to optimize structure models by refinement and model rebuilding. Significant model changes are highlighted at the global (macromolecular entity) and at the local (residue) level. The use of PDB_REDO can lead to substantial model improvements in terms of fit to the experimental data and geometric quality, provided that the initial model is a reasonable representation of the experimental data.
The authors thank Eleonore von Castelmur and Andrea Murachelli for helpful comments on the presentation of the server's output, and the CCP4/APS Crystallographic School (https://www.ccp4.ac.uk/schools/APS-school/ ) and the CSHL X-ray Methods in Structural Biology course students (https://meetings.cshl.edu/courses/2013/c-crys13.shtml ) for real-life testing of the server. RPJ is supported by a Veni grant 722.011.011 from the Netherlands Organization for Scientific Research (NWO). GNM and FL are supported by MRC grant MC_UP_A025_1012.
Berman, H., Henrick, K., Nakamura, H. & Markley, J. L. (2007). Nucleic Acids Res. 35, D301–D303. Web of Science CrossRef PubMed CAS Google Scholar
Bernstein, F. C., Koetzle, T. F., Williams, G. J., Meyer, E. F., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T. & Tasumi, M. (1977). J. Mol. Biol. 112, 535–542. CrossRef CAS PubMed Web of Science Google Scholar
Branden, C.-I. & Jones, T. A. (1990). Nature (London), 343, 687–689. Google Scholar
Brünger, A. T. (1992). Nature (London), 355, 472–475. PubMed Web of Science Google Scholar
Brunzelle, J. S., Shafaee, P., Yang, X., Weigand, S., Ren, Z. & Anderson, W. F. (2003). Acta Cryst. D59, 1138–1144. Web of Science CrossRef CAS IUCr Journals Google Scholar
Cereto-Massagué, A., Ojeda, M. J., Joosten, R. P., Valls, C., Mulero, M., Salvado, M., Arola-Arnal, A., Arola, L., Garcia-Vallvé, S. & Pujadas, G. (2013). J. Cheminformatics, 5, 36. Google Scholar
Chinea, G., Padron, G., Hooft, R. W. W., Sander, C. & Vriend, G. (1995). Proteins, 23, 415–421. CrossRef CAS PubMed Google Scholar
Cowtan, K. (2006). Acta Cryst. D62, 1002–1011. Web of Science CrossRef CAS IUCr Journals Google Scholar
Emsley, P. & Cowtan, K. (2004). Acta Cryst. D60, 2126–2132. Web of Science CrossRef CAS IUCr Journals Google Scholar
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501. Web of Science CrossRef CAS IUCr Journals Google Scholar
Fisher, R. A. (1915). Biometrika, 10, 507–521. Google Scholar
Guerois, R., Nielsen, J. E. & Serrano, L. (2002). J. Mol. Biol. 320, 369–387. Web of Science CrossRef PubMed CAS Google Scholar
Holton, J. & Alber, T. (2004). Proc. Natl Acad. Sci. USA, 101, 1537–1542. Web of Science CrossRef PubMed CAS Google Scholar
Hooft, R. W. W., Sander, C. & Vriend, G. (1997). Comput. Appl. Biosci. 13, 425–430. CAS PubMed Web of Science Google Scholar
Hooft, R. W. W., Vriend, G., Sander, C. & Abola, E. E. (1996). Nature (London), 381, 272. CrossRef PubMed Web of Science Google Scholar
Ioerger, T. R. & Sacchettini, J. C. (2002). Acta Cryst. D58, 2043–2054. Web of Science CrossRef CAS IUCr Journals Google Scholar
Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Acta Cryst. A47, 110–119. CrossRef CAS Web of Science IUCr Journals Google Scholar
Joosten, R. P., Joosten, K., Cohen, S. X., Vriend, G. & Perrakis, A. (2011). Bioinformatics, 27, 3392–3398. Web of Science CrossRef CAS PubMed Google Scholar
Joosten, R. P., Joosten, K., Murshudov, G. N. & Perrakis, A. (2012). Acta Cryst. D68, 484–496. Web of Science CrossRef CAS IUCr Journals Google Scholar
Joosten, R. P. et al. (2009). J. Appl. Cryst. 42, 376–384. Web of Science CrossRef CAS IUCr Journals Google Scholar
Joosten, R. P. & Vriend, G. (2007). Science, 317, 195–196. Web of Science CrossRef PubMed CAS Google Scholar
Karplus, P. A. & Diederichs, K. (2012). Science, 336, 1030–1033. Web of Science CrossRef CAS PubMed Google Scholar
Keegan, R. M. & Winn, M. D. (2007). Acta Cryst. D63, 447–457. Web of Science CrossRef CAS IUCr Journals Google Scholar
Kleywegt, G. J. & Brünger, A. T. (1996). Structure, 4, 897–904. CrossRef CAS PubMed Web of Science Google Scholar
Krieger, E., Joo, K., Lee, J., Lee, J., Raman, S., Thompson, J., Tyka, M., Baker, D. & Karplus, K. (2009). Proteins, 77, 114–122. Web of Science CrossRef PubMed CAS Google Scholar
Krieger, E., Koraimann, G. & Vriend, G. (2002). Proteins, 47, 393–402. Web of Science CrossRef PubMed CAS Google Scholar
Kroemer, M., Dreyer, M. K. & Wendt, K. U. (2004). Acta Cryst. D60, 1679–1682. Web of Science CrossRef CAS IUCr Journals Google Scholar
Krug, M., Weiss, M. S., Heinemann, U. & Mueller, U. (2012). J. Appl. Cryst. 45, 568–572. Web of Science CrossRef CAS IUCr Journals Google Scholar
Long, F., Vagin, A. A., Young, P. & Murshudov, G. N. (2008). Acta Cryst. D64, 125–132. Web of Science CrossRef CAS IUCr Journals Google Scholar
Lütteke, T. & von der Lieth, C. W. (2004). BMC Bioinformatics, 5, 69. Google Scholar
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674. Web of Science CrossRef CAS IUCr Journals Google Scholar
Monaco, S., Gordon, E., Bowler, M. W., Delagenière, S., Guijarro, M., Spruce, D., Svensson, O., McSweeney, S. M., McCarthy, A. A., Leonard, G. & Nanao, M. H. (2013). J. Appl. Cryst. 46, 804–810. Web of Science CrossRef CAS IUCr Journals Google Scholar
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367. Web of Science CrossRef CAS IUCr Journals Google Scholar
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Acta Cryst. D53, 240–255. CrossRef CAS Web of Science IUCr Journals Google Scholar
Otwinowski, Z. & Minor, W. (1997). Methods Enzymol. 276, 307–326. CrossRef CAS Web of Science Google Scholar
Panjikar, S., Parthasarathy, V., Lamzin, V. S., Weiss, M. S. & Tucker, P. A. (2005). Acta Cryst. D61, 449–457. Web of Science CrossRef CAS IUCr Journals Google Scholar
Pannu, N. S., Waterreus, W.-J., Skubák, P., Sikharulidze, I., Abrahams, J. P. & de Graaff, R. A. G. (2011). Acta Cryst. D67, 331–337. Web of Science CrossRef CAS IUCr Journals Google Scholar
Perrakis, A., Morris, R. & Lamzin, V. S. (1999). Nature Struct. Biol. 6, 458–463. Web of Science CrossRef PubMed CAS Google Scholar
Picard, R. R. & Cook, R. D. (1984). J. Am. Stat. Assoc. 79, 575–583. CrossRef Google Scholar
Ramachandran, G. N., Ramakrishnan, C. & Sasisekharan, V. (1963). J. Mol. Biol. 7, 95–99. CrossRef PubMed CAS Web of Science Google Scholar
Read, R. J. et al. (2011). Structure, 19, 1395–1412. Web of Science CrossRef CAS PubMed Google Scholar
Terwilliger, T. C., Adams, P. D., Read, R. J., McCoy, A. J., Moriarty, N. W., Grosse-Kunstleve, R. W., Afonine, P. V., Zwart, P. H. & Hung, L.-W. (2009). Acta Cryst. D65, 582–601. Web of Science CrossRef CAS IUCr Journals Google Scholar
Terwilliger, T. C., Grosse-Kunstleve, R. W., Afonine, P. V., Moriarty, N. W., Zwart, P. H., Hung, L.-W., Read, R. J. & Adams, P. D. (2008). Acta Cryst. D64, 61–69. Web of Science CrossRef CAS IUCr Journals Google Scholar
Tickle, I. J. (2007). Acta Cryst. D63, 1274–1281. Web of Science CrossRef CAS IUCr Journals Google Scholar
Tickle, I. J. (2012). Acta Cryst. D68, 454–467. Web of Science CrossRef CAS IUCr Journals Google Scholar
Tukey, J. W. (1977). Exploratory Data Analysis. Reading: Addison-Wesley. Google Scholar
Vonrhein, C., Flensburg, C., Keller, P., Sharff, A., Smart, O., Paciorek, W., Womack, T. & Bricogne, G. (2011). Acta Cryst. D67, 293–302. Web of Science CrossRef CAS IUCr Journals Google Scholar
Vriend, G. & Sander, C. (1993). J. Appl. Cryst. 26, 47–60. CrossRef CAS Web of Science IUCr Journals Google Scholar
Winn, M. D. et al. (2011). Acta Cryst. D67, 235–242. Web of Science CrossRef CAS IUCr Journals Google Scholar
Winter, G., Lobley, C. M. C. & Prince, S. M. (2013). Acta Cryst. D69, 1260–1273. Web of Science CrossRef CAS IUCr Journals Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.