Paired refinement under the control of PAIREF

Application of the PAIREF program providing automation of paired refinement is demonstrated on six data sets. The results prove that the inclusion of high-resolution data beyond the conventional criteria can lead to more accurate structure models.


Introduction
Crystallographic resolution is understood as the minimum plane spacing given by Bragg's law for a particular set of X-ray diffraction intensities that are included in the structure analysis (Online Dictionary of Crystallography, https://dictionary.iucr.org/Resolution). In contrast, optical resolution is defined as the expected minimum distance between two resolved peaks in the electron-density map (Vaguine et al., 1999). The resolution of data is limited due to a decrease in the intensity-to-noise ratio of reflections with the resolution. The weakness of the high-resolution data is caused by several factors, including the Lorentz-polarization factor, temperature factor and crystal imperfection. Therefore, the diffraction data are usually cut off at a certain resolution, with the aim of rejecting the data that do not improve the model.
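As a brief illustration of the definition above, Bragg's law (with n = 1) gives the minimum plane spacing as d_min = λ/(2 sin θ_max). The following minimal Python sketch evaluates this relation; the function name and its arguments are illustrative assumptions and are not part of PAIREF.

```python
import math

def d_min(wavelength_angstrom, two_theta_max_deg):
    """Minimum plane spacing (in Å) from Bragg's law with n = 1: lambda = 2 d sin(theta)."""
    theta = math.radians(two_theta_max_deg / 2.0)
    if not 0.0 < theta <= math.pi / 2:
        raise ValueError("the scattering angle 2-theta must lie in (0, 180] degrees")
    return wavelength_angstrom / (2.0 * math.sin(theta))
```

For example, a wavelength of 1.0 Å and a maximum scattering angle 2θ of 60° correspond to d_min = 1.0 Å.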
In previous decades, conservative criteria were applied to estimate the resolution of crystallographic data. These criteria were based on a user-defined value of data quality indicators such as the signal-to-noise ratio ⟨I/σ(I)⟩, the disagreement residual of multiple observations Rmerge, etc. (Evans, 2011). Later, the Pearson correlation coefficient CC1/2, quantifying the internal consistency of observations, was added to these criteria (Karplus & Diederichs, 2012). Inspection of the data deposited in the PDB (Berman et al., 2000) shows that there is no consensus in the application of these statistics. Moreover, the possibility of improvement of a refined model by employing a different resolution range was often not considered. Nowadays, the application of strict cutoff values on selected data quality indicators has been shown to be an obsolete approach (Diederichs & Karplus, 2013; Evans & Murshudov, 2013). Very recently, it became possible to estimate the information gain from each reflection using likelihood-based methods (Read et al., 2020). Yet this approach does not answer the question of which high-resolution cutoff should be used with current refinement programs.
The ambiguity in the high-resolution-cutoff estimation has been removed with the advent of the 'paired refinement' protocol (Karplus & Diederichs, 2012). Initially, a conservative criterion is applied as usual to the high-resolution data and the phase problem is solved. Usually, the model is then significantly improved by refinement. In the paired refinement protocol, the influence of the previously rejected high-resolution data during the structure refinement is tested. The structure model is refined stepwise against data at higher and higher resolution until no improvement of the model is observed. More specifically, each increase in resolution is checked against the original resolution for its added value, particularly by comparing R values of models against the same data. Only those resolution shells that prove beneficial are included in the final data set, against which the structure is refined.
In this paper, we present a new tool, PAIREF, which helps to make the decision about the useful resolution of the data set. The program performs paired refinement for validation of the high-resolution data in a fully automatic way. PAIREF is not the first utility that implements paired refinement since a similar function is present in PDB-REDO (Joosten et al., 2014). Nevertheless, PAIREF provides additional features (e.g. complete cross-validation, modification of the structure refinement protocol) and reports. These naturally require more extensive input, but allow the user to make a better-informed decision.

Design and implementation
PAIREF is a command-line tool that can be installed as a module into the CCTBX toolbox (Grosse-Kunstleve et al., 2002) on various platforms (GNU/Linux, MS Windows). Currently, it has been developed in Python 2.7 (Hunter, 2007; Rossum, 1995) but is ready to move to Python 3. It depends on the following programs of the CCP4 software package: REFMAC5, SFCHECK (Vaguine et al., 1999), MTZDUMP, SFTOOLS and BAVERAGE; and on the module pdbtools (Adams et al., 2010).

Parameters and algorithm
The algorithm implemented in PAIREF depends on the amount of data provided by the user. The minimal function of the program requires the following input files: structure model refined at the starting resolution (PDB or mmCIF format) and higher-resolution merged diffraction data in MTZ format which have the same free reflection flags as the data previously used in the refinement (Fig. 1). Nevertheless, the minimal requirement is not sufficient for deep data analysis including statistics such as CC*, etc. The protocol can be further supplemented by the full-resolution unmerged data for calculating merging statistics, by the external restraints in CIF format in the case where non-standard ligands are present and by the command file for REFMAC5 (alternatively generated by PDB-REDO) for better control of the structure refinement. Moreover, a definition of domains for translation-libration-screw (TLS) refinement can be provided by the user. The program allows the selection of resolution shells (with a default width of 0.05 Å) and optional model modifications before the paired refinement.
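To illustrate the division of the added data into resolution shells of constant width, the following minimal Python sketch generates consecutive shell limits between a starting and a maximum resolution. The function name and its arguments are hypothetical, and the way in which PAIREF itself constructs the shells is not described here; decimal arithmetic is used only to avoid floating-point drift in the limits.

```python
from decimal import Decimal

def resolution_shells(start, end, width="0.05"):
    """Return consecutive (low, high) resolution limits in Å, going from start to end.

    start -- starting (lower) resolution, e.g. 1.8
    end   -- maximum (higher) resolution, e.g. 1.5; must be numerically smaller
    width -- shell width in Å (0.05 Å is the default width quoted in the text)
    """
    low, stop, step = Decimal(str(start)), Decimal(str(end)), Decimal(str(width))
    if step <= 0 or stop >= low:
        raise ValueError("need width > 0 and end < start (i.e. higher resolution)")
    shells = []
    while low > stop:
        # The last shell is shortened if the range is not a multiple of the width.
        high = max(low - step, stop)
        shells.append((float(low), float(high)))
        low = high
    return shells
```

For instance, a starting resolution of 1.80 Å and a maximum resolution of 1.65 Å give three shells: 1.80-1.75, 1.75-1.70 and 1.70-1.65 Å.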
Our paired refinement protocol with REFMAC5 is an adaptation of the original protocol that has been performed with phenix.refine (Karplus & Diederichs, 2012; Afonine et al., 2012). Initially, the input files are checked using MTZDUMP and CCTBX for consistency. The model, previously refined at the starting resolution A, is then refined against the data up to resolution B (higher than A), and this model is compared with the original one, both against the data at resolution A (see Section 2.2). This step is then repeated from resolution B up to resolution C (higher than B) and reproduced again until the maximum limit is reached. CCwork and CCfree statistics are calculated using SFTOOLS (Karplus & Diederichs, 2012). Finally, merging statistics are calculated using the CCTBX library if unmerged diffraction data were provided.
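The stepwise scheme described above can be summarized in a short Python sketch. It is not the PAIREF implementation: the callables refine and r_values are hypothetical placeholders for a refinement program and for the calculation of R values, and the sketch assumes that each step starts from the model refined in the previous step.

```python
def paired_refinement(model, cutoffs, refine, r_values):
    """Sketch of stepwise paired refinement (A -> B, B -> C, ...).

    model    -- model refined at the starting resolution cutoffs[0]
    cutoffs  -- resolution limits in Å, from the starting one to higher resolution
    refine   -- hypothetical callable: refine(model, d_min) -> refined model
    r_values -- hypothetical callable: r_values(model, d_min) -> (R_work, R_free)
    """
    steps = []
    for lower, higher in zip(cutoffs, cutoffs[1:]):
        # Refine against data extended to the higher resolution limit.
        refined = refine(model, higher)
        # Compare both models against the same data, cut at the lower limit.
        rw_old, rf_old = r_values(model, lower)
        rw_new, rf_new = r_values(refined, lower)
        steps.append({
            "step": (lower, higher),
            "dR_work": rw_new - rw_old,
            "dR_free": rf_new - rf_old,
        })
        # Assumption: the next step continues from the newly refined model.
        model = refined
    return steps
```

A negative dR_free for a given step indicates that the added shell improved the model as judged against the lower-resolution data.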
As an option, PAIREF provides a complete cross-validation protocol (Brünger, 1993; Jiang & Brünger, 1994), also referred to as k-fold cross-validation ([...] 2015), to investigate the impact of the selection of free reflections. Here, the paired refinement protocol is run in parallel for each selection individually. To remove the bias given by previous refinement with a particular set of free reflections, a number of optional input model modifications prior to refinement have been implemented: the perturbation of the atomic coordinates, the reset of atomic displacement parameters (ADPs) to a particular or average value and the addition of a fixed value to them (achieved by module pdbtools from CCTBX and BAVERAGE). In the final report, both the averaged statistics as well as the individual statistics for each selection are reported. Application of this protocol is demonstrated on a data set from cysteine dioxygenase (Section 3.3). The complete cross-validation requires the CCP4-style test set description in the input MTZ file, i.e. multiple free reflection labels must be present.

Figure 1
Schematic diagram of the PAIREF algorithm. Optional input files and routines are drawn in grey, the complete cross-validation protocol is outlined in blue.
The program PAIREF does not have any decision-making routines and it remains up to the user to decide on the resolution cutoff based on the comprehensive analysis that was performed. Structure refinement is a multiparametric calculation and the user should be aware of potential problems. For example, nonconvergent refinement may result in misleading statistics and a suboptimal model (Tickle, 2011). One of the parameters that may potentially play a role is the FFT grid size (Drenth & Jeroen, 2010).

Program output and interpretation of results
Paired refinement does not reduce the problem of high-resolution cutoff estimation to a single monitoring statistic. Rather, a comprehensive data analysis is summarized on an HTML page. Here, various plots, tables and links to many intermediate files and log files are presented or easily accessible via hyperlinks.
The first monitoring statistics reported by PAIREF are the differences in R values between the models refined at adjacent resolutions (both computed at the lower resolution to provide a valid comparison). A decrease in Rfree is expected in shells beneficial to the model quality. However, a constant Rfree and a simultaneous increase in Rwork are usually acceptable as well because these indicate less overfitting of the structure model (Karplus & Diederichs, 2012). Therefore, the next monitoring statistic is Rgap (Rgap = Rfree − Rwork), which is calculated at the starting resolution (corresponding to resolution A in Section 2.1) for all analyzed shells. This is an implementation of a previously published protocol (Winter et al., 2018). In the case of the complete cross-validation protocol, R values for each set of free reflections and average values are reported. Moreover, the standard deviations of R values of structure models refined using different free reflection sets are calculated (Kleywegt & Brünger, 1996).
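The two quantities introduced above, Rgap and the spread of R values over the free reflection sets, can be sketched in a few lines of Python. The function names are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the exact estimator is not specified here.

```python
from statistics import mean, stdev

def r_gap(r_work, r_free):
    """R_gap = R_free - R_work, both evaluated at the starting resolution."""
    return r_free - r_work

def summarize_over_free_sets(r_by_set):
    """Average and standard deviation of an R value over the free reflection sets.

    r_by_set -- mapping {free set number: R value of the model refined with that set}
    Assumption: the sample standard deviation is used.
    """
    values = list(r_by_set.values())
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread
```

A decreasing Rgap with increasing resolution cutoff points to less overfitting, while a large standard deviation over the free sets warns that a conclusion drawn from a single free set may not be robust.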
However, the overall R values are not the only parameters to be taken into account when deciding on the high-resolution cutoff. The analysis is further supplemented by plots of Rwork, Rfree, CCwork and CCfree (CCwork and CCfree are correlation coefficients between experimental and calculated intensities) of the refined structure models at defined resolution. Since a perfect model gives an R value of 0.42 against random data (i.e. pure noise), assuming non-tNCS (translational noncrystallographic symmetry) data from a non-twinned crystal (Evans & Murshudov, 2013), a higher R value in the (current) high-resolution shell indicates either the involvement of high-resolution data without information content (the data are even worse than noise), or poor quality of the model, or the presence of tNCS.
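The check against the value of 0.42 can be illustrated with a minimal Python sketch. It uses the standard definition of the crystallographic R value and assumes that the calculated amplitudes are already on the scale of the observed ones; the function names are hypothetical and do not correspond to PAIREF routines.

```python
R_RANDOM = 0.42  # R value of a perfect model against pure noise, as quoted in the text

def r_value(f_obs, f_calc):
    """Standard R value: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumption: f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)

def shells_above_noise_level(r_by_shell, limit=R_RANDOM):
    """Return the shells whose R value exceeds the given limit.

    r_by_shell -- mapping {(low, high) resolution limits in Å: R value in that shell}
    """
    return [shell for shell, r in r_by_shell.items() if r > limit]
```

Any shell returned by the second function deserves closer inspection, since the possible causes listed above (uninformative data, a poor model or tNCS) cannot be distinguished from the R value alone.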
When unmerged data are available, values of CC* are added to the CCwork and CCfree plots. Comparison of CC values (correlation coefficients) with CC* serves for direct linking of the data and structure model quality (Diederichs & Karplus, 2013; Karplus & Diederichs, 2012). CCwork or CCfree greater than CC* in a high-resolution shell indicates undesirable overfitting of the structure model as the calculated intensities agree with the observed data better than the (usually unavailable) true data. Owing to the independence of CC* on a model, its comparison with CCwork is just as informative as comparison with CCfree. However, the usage of CCwork should be preferred since it is based on much more data.
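The comparison described above can be sketched in Python using the relation CC* = [2CC1/2/(1 + CC1/2)]^(1/2) introduced by Karplus & Diederichs (2012). The function names and the input layout are illustrative assumptions; the sketch is restricted to CC1/2 values between 0 and 1, for which CC* is defined.

```python
import math

def cc_star(cc_half):
    """CC* = sqrt(2 * CC1/2 / (1 + CC1/2)), defined here for 0 <= CC1/2 <= 1."""
    if not 0.0 <= cc_half <= 1.0:
        raise ValueError("this sketch expects CC1/2 between 0 and 1")
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))

def overfitted_shells(shells):
    """Return labels of the shells in which CC_work exceeds CC*.

    shells -- iterable of (label, CC1/2, CC_work) tuples, one per resolution shell
    """
    return [label for label, cc_half, cc_work in shells
            if cc_work > cc_star(cc_half)]
```

A shell reported by overfitted_shells is one in which the model agrees with the measured intensities better than the estimated agreement of the data with the true signal, which is the overfitting criterion stated above.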
For additional information, PAIREF reports the optical resolution as calculated using SFCHECK for each resolution cutoff. When all previous procedures are finished and unmerged diffraction data are available, the merging statistics are listed in a table and shown in graphs. Finally, the progress of the refinement procedures is reported to check for convergence etc.

Distribution and documentation
Full documentation of PAIREF is available online at https://pairef.fjfi.cvut.cz and the program is distributed at https://pypi.org/project/pairef/.
A comprehensive summary of crystallographic data as well as the refinement statistics are shown in Tables 1 and 2. To be consistent with the previous results, the free reflection flags from the original data were preserved except for TL, because of inaccessibility.

Simulated data set of lysozyme
The ability to generate artificial X-ray diffraction patterns based on a well defined 'true' structure offers the possibility of monitoring the progress of paired refinement, especially the convergence of the refined models towards the 'true' structure.
We generated one hundred diffraction images using a modified structure of lysozyme (data set SIM). At first, all alternative conformations were removed from the structure with the PDB entry 1h87 (originally determined at 1.72 Å resolution) (Girard et al., 2002). The data collection was simulated using MLFSOM (Holton et al., 2014) with a crystal-to-detector distance of 150 mm. MLFSOM also simulated global radiation damage for a beam of 8.4 × 10¹⁰ photons s⁻¹ and 100 µm diameter, exposure of 0.1 s and a crystal size of 77.8 µm. Afterwards, the diffraction data set was processed using DIALS/AIMLESS (Evans & Murshudov, 2013; Winter et al., 2018) or XDS/XSCALE (Kabsch, 2010) up to a resolution of 1.20 Å, although the CC1/2 values become not significantly different from zero (at the 1:1000 level) at 1.35 Å resolution.
The input model for paired refinement was generated from the structure used for the generation of the diffraction images by perturbation of atomic coordinates by an average of 0.25 Å; the ADPs were set to their mean value (15 Å²). In the final preparation step, several cycles of restrained refinement at the starting resolution (1.72 Å) against the processed simulated data were performed. In the next step, we performed the paired refinement protocol using PAIREF.
Structure models refined against the simulated data set have considerably lower R values when compared with the other structures (based on real experimental data) mentioned later (Rfree = 0.071 for SIM versus Rfree = 0.195 for TL, both at 1.72 Å). This effect, caused by the simulated character of the data, was also observed in the original work by Holton et al. (2014). However, the trends of nearly all indicators of data quality are similar to those of the real cases [see Fig. 2(a)]. Based on the plot of stepwise differences in overall R values, we decided to estimate the high-resolution limit as 1.3 Å because the R values increase for resolution shells beyond that limit.

Table 1
Data collection and merging statistics.
Values for the highest resolution shell in the case of conservative cutoff are given in parentheses () and for the cutoff chosen as optimal are given in square brackets []. SIM represents a simulated data set generated by MLFSOM (Holton et al., 2014).
† For the BO data set, values for a resolution shell beyond the optimal cutoff are listed in angle brackets ⟨⟩. ‡ Number of additional reflections suggested by paired refinement results to be involved in the refinement in contrast to the starting resolution. Added resolution range, in Å, is given in {} brackets. § Range where CC1/2 is significantly different from 0 at the 1:1000 level.
We monitored the root-mean-square deviation (RMSD) values (DeLano Scientific, 2017) calculated on all 1217 atoms of the simulated structure with respect to the original structure model [Fig. 2(c)]. A systematic decrease was observed for the atomic coordinates when reflections from an additional high-resolution shell were added to the refinement up to 1.3 Å resolution. This is in agreement with the high-resolution cutoff based solely on the behaviour of the differences in overall R values. In general, the RMSD of ADP values calculated for all the atoms (see equation given in the supporting information) follows a similar but not identical trend. Moreover, the ADPs continue to decrease and converge to the 'true' value even for the highest resolution shell, which was later omitted from the data based on the other data quality indicators. As a result of our calculations, we suggest here application of a high-resolution cutoff at 1.3 Å when using our combination of programs and following our refinement protocol. Similar results were also obtained using XDS/XSCALE for data processing.
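For illustration, the coordinate RMSD between a refined model and the 'true' structure can be computed as in the following minimal Python sketch. It is not the procedure used to produce Fig. 2(c): the function name is hypothetical, the atoms are assumed to be listed in the same order in both models, and no superposition is performed because both models are assumed to share the same crystal frame.

```python
import math

def coordinate_rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two equally ordered lists of (x, y, z).

    Assumption: both models are in the same frame, so no superposition is applied.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Evaluating this quantity for the model refined at each resolution cutoff gives a curve comparable in spirit to the coordinate RMSD trend discussed above.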

Thermolysin
Successful application of paired refinement was previously demonstrated on the crystal structure of thermolysin (TL) from B. thermoproteolyticus (Winter et al., 2018). In the original protocol, the structure was modified (perturbation of atomic positions) and refined at a defined high-resolution limit in the range from 1.80 to 1.50 Å. Model improvement was monitored on Rgap only, which decreased until 1.56 Å resolution. A further increase in the resolution did not cause a substantial change of Rgap.
To reproduce most of the original procedures by Winter et al., the diffraction data were processed with xia2 (Winter, 2010) using DIALS/AIMLESS software. The structure of thermolysin (PDB entry 3n21; Behnen et al., 2012) was used as a starting model. The atomic coordinates were perturbed and all ADPs were generally set to their average value of 22 Å² with phenix.pdbtools (Adams et al., 2010). A total of 30 cycles of restrained refinement were performed with REFMAC5 at a resolution of 1.80 Å. After that, ligands (peptide in the active site, three molecules of DMSO) and solvent were built in Coot (Emsley et al., 2010), refined with REFMAC5 and finally used in PAIREF to analyse the high-resolution cutoff.
We performed two PAIREF runs that added stepwise high-resolution shells with a width of 0.10 and 0.01 Å. Rfree has a decreasing trend up to 1.50 Å for the first run [Fig. 2(d)], which suggests that the data should be cut at this resolution. Moreover, the plot of Rgap [Fig. 2(f)] from the second run further confirms a good agreement between the previously published results and our calculations.

Cysteine dioxygenase
The cysteine-bound complex of cysteine dioxygenase from R. norvegicus (CDO) (Simmons et al., 2008) [...] 2012). The conservative criterion for Rmeas suggests setting the high-resolution diffraction limit to 1.80 Å and the requirement of ⟨I/σ(I)⟩ higher than 2 suggests setting the limit to 1.60 Å, but paired refinement proved that data are useful up to 1.42 Å. All refinement was previously performed using phenix.refine (Afonine et al., 2012).

Table 2
Structure refinement and validation statistics.
Values are listed for the models refined at the starting and the optimal resolution in square brackets []. ΔR is the difference between R values relating to the model refined at the optimal and the starting resolution (both calculated at the starting resolution). SIM is a simulated data set generated by MLFSOM (Holton et al., 2014).
† [...], R values and CC values averaged over all 20 free reflection sets and the associated standard deviation are listed. The remaining statistics relate to the refinements with free reflection set 0. ‡ For the BO data set, values for a resolution shell beyond the optimal cutoff are listed in angle brackets ⟨⟩.

Figure 2
Results from paired refinement for SIM (a)-(c), TL (d)-(f) and CDO (g)-(l). Note for bar charts showing the differences in the overall R values: for each incremental step of resolution for X→Y, the R values were calculated at resolution X. SIM: (a) differences in the overall R values; resolution shells with a width of 0.10 Å were added stepwise. Rfree decreases up to 1.30 Å. (b) Comparison of CC* and CCwork of refined models. (c) Both RMSDs of the coordinates and the ADPs (RMSD coordinates and RMSD ADP) have a decreasing trend up to 1.3 Å resolution. TL: (d) differences in the overall R values; resolution shells with a width of 0.10 Å were added stepwise. (e) Comparison of CC* and CCwork of the refined models. (f) Rgap calculated using data up to 1.80 Å depending on the high-resolution cutoff; resolution shells with a width of 0.01 Å were added stepwise (a different PAIREF run, see the supporting information). CDO: (g) differences in the overall R values; resolution shells with a width of 0.10 Å were added stepwise. (h) Comparison of CC* and CCfree of the model refined at 1.42 Å, averaged over all of the 20 free sets. The standard error of the mean is shown in orange. (i) Rgap calculated using data up to 2.00 Å depending on the high-resolution cutoff; resolution shells with a width of 0.01 Å were added stepwise (a different PAIREF run, see the supporting information). (j) Differences in the overall R values averaged over all 20 free sets. The standard error of the mean is shown in orange. (k) and (l) Differences in the overall R values relating to all 20 free sets, refinements at 1.50 and 1.42 Å, respectively. The numbers with arrows in the legends indicate how many rises and falls were observed while using individual free reflection sets.
Here, we tried to reproduce the previous results in PAIREF which uses REFMAC5 as a structure refinement program. We have reprocessed the original images with XDS. The input structure model was prepared according to the following protocol: the protein atomic positions of the unliganded CDO structure (PDB entry 2b5h; Simmons et al., 2006) were perturbed by an average of 0.25 Å with phenix.pdbtools; the ligand (cysteine persulfenate) was built manually with Coot. Subsequently, the model was refined with REFMAC5 at 2.00 Å resolution, solvent was added automatically using ARP/wARP (Lamzin & Wilson, 1993) followed by a manual check of the ligand and solvent and restrained refinement with REFMAC5. This model was later used as the input file for PAIREF to analyze the high-resolution shells with a width of 0.10 Å. Unlike the protocol published previously, solvent molecules were not automatically updated during paired refinement.
The differences of overall R values [Fig. 2(g)] indicate that the high-resolution diffraction limit may be set to 1.60 Å using our combination of software and free reflection set. However, the selection of free reflections may have an impact on the results and conclusions from paired refinement; therefore, we ran the second procedure of 20-fold cross-validation across all free reflection sets, as described in Section 2.1. The differences of overall Rfree averaged over the free sets are negative up to 1.50 Å resolution [Fig. 2(j)]. CC* remains higher than CCwork in the whole resolution range for all the refined models. Moreover, the trend of Rgap [Fig. 2(i)] shows a moderate decrease for higher resolution going up to 1.42 Å when shells with a width of 0.01 Å were analyzed in the third run of paired refinement using the original free flag 0. To conclude, our calculations indicate that the data improve the model up to 1.50 Å resolution. This suggestion originates from the complete cross-validation protocol which should always be considered when deciding on the high-resolution cutoff.
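The way in which the averaged differences in Rfree and the counts of rises and falls [as in Figs. 2(j)-2(l)] can be derived from the individual free sets is sketched below in Python. PAIREF itself does not make the decision on the cutoff; the helper functions, their names and the input layout are hypothetical and are meant only as a user-side aid for reading such plots.

```python
from statistics import mean

def averaged_delta_r_free(delta_by_set):
    """Average the R_free differences over all free sets, step by step.

    delta_by_set -- mapping {free set number: [dR_free for step 1, step 2, ...]}
    """
    return [mean(step_values) for step_values in zip(*delta_by_set.values())]

def rises_and_falls(delta_by_set, step):
    """Count free sets with an increase and with a decrease of R_free at one step."""
    values = [deltas[step] for deltas in delta_by_set.values()]
    return sum(v > 0 for v in values), sum(v < 0 for v in values)

def beneficial_steps(mean_deltas):
    """Number of leading steps for which the averaged dR_free is negative."""
    count = 0
    for delta in mean_deltas:
        if delta >= 0:
            break
        count += 1
    return count
```

The value returned by beneficial_steps should be taken as one indicator among several; the CC* comparison and the Rgap trend discussed above remain part of the decision.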

Endothiapepsin in complex with fragment B53
In the cases reported above, the improvement of structure models using paired refinement was shown on statistical criteria. However, the increase in information gained from the data may also be shown by the interpretability of electron-density maps. Such enhancement was already reported for the crystal structure of the prokaryotic sodium channel pore (improvement from 4.0 to 3.5 Å resolution) and on the crystal structure of the YfbU protein from E. coli (improvement from 3.1 to 2.5 Å resolution) (Karplus & Diederichs, 2015). To demonstrate this effect using PAIREF, we reprocessed the diffraction data from the crystal structure of endothiapepsin (EP) from C. parasitica in complex with fragment B53 (PDB entry 4y4g; Huschmann et al., 2016) using XDS. The data set originates from a fragment screening project; fragment B53 has a partial occupancy.
The data were originally processed up to 1.44 Å resolution with an ⟨I/σ(I)⟩ value of 2 in the highest resolution shell (1.52-1.44 Å). Here, we tried to simulate the regular workflow of model building and structure refinement. We removed all solvent molecules including ligands from the deposited model. The atomic coordinates were perturbed as done previously, the ADPs were manually set to their mean value of 16 Å². Subsequently, 15 cycles of restrained refinement using anisotropic ADPs were performed with REFMAC5. These procedures were later followed by PAIREF calculations up to a resolution of 1.05 Å. According to our results, the optimal high-resolution limit was set to 1.20 Å [Fig. 3(a)] since positive Rfree differences are observed for the higher resolution shells.
Inclusion of more intensities in the working data set considerably improved the quality of the omit map belonging to the partially occupied ligand [Fig. 3(c)]. In general, we expect that the greatest improvement in interpretability will occur for weak density features because the noise level of the map decreases due to improved phases resulting from a more accurate model. This will not significantly influence the observation of atoms with strong density. However, for a feature in the electron-density map that is close to the lower contour levels used in interpreting the map, having a bit less noise will have a higher impact on the reliability and interpretability of the electron-density map. In our case, this effect was observed in the stage of ligand and solvent building, which may be valuable especially in difficult cases and with low-occupied ligands.

Interferon gamma
All the above-mentioned cases are high-resolution crystal structures. The crystal structure of interferon gamma from P. olivaceus (POLI) was previously determined at a medium resolution of 2.3 Å (Zahradník et al., 2018). Moreover, the data exhibited severe anisotropy. Resolution limits were estimated in the range from 2.26 to 2.71 Å, according to the criterion of ⟨I/σ(I)⟩ being higher than 1.5 in the highest resolution shell (Evans & Murshudov, 2013). The data were reprocessed in XDS up to 1.9 Å resolution. The deposited structure (PDB entry 6f1e; Zahradník et al., 2018) was refined using all of the reflections in the final refinement step. However, we used the last model refined using work reflections only in our paired refinement.
Several parameters were used to evaluate the high-resolution cutoff. Monitoring of Rfree differences suggests a high-resolution cutoff at 2.0 Å [see Fig. 3(d)]. The value of Rwork of the model refined at 1.9 Å calculated against the data in the highest resolution shell (2.0-1.9 Å) is high: 0.43 [Fig. 3(f)], i.e. it exceeds the R value of a perfect model refined against random data (see Section 2.2). We suggest omitting the highest resolution shell in further refinement and cutting the data at 2.0 Å resolution. Poor CC* values in the high resolution are probably caused by the anisotropy of the diffraction data which affects the correlation between reflections. These results show that the decision on diffraction data resolution should not be based only on a single/certain value of data quality indicator, but on a more comprehensive evaluation of the available data.

Bilirubin oxidase
The choice of the structure refinement program and parameters of refinement are the most decisive tools in paired refinement. PAIREF supports broad modification of structure refinement protocols using a command file for REFMAC5, including modification of ligand libraries. To demonstrate this functionality, we have analyzed the crystal structure of bilirubin oxidase in complex with ferricyanide (BO) (PDB entry 6i3j). The structure was previously refined at 2.59 Å resolution with ⟨I/σ(I)⟩ equal to 2 in the highest resolution shell (Koval' et al., 2019) as shown in Fig. 3(i).
We have reprocessed the diffraction data up to a resolution of 2.3 Å with XDS. The last model originally refined using working reflections only was used as an input file for paired refinement. The library definitions for hexacyanoferrate, weighting matrix and several external harmonic restraints were supplied to the refinement protocol (see the supporting information). In this case, no improvement in resolution can be expected according to PAIREF. Although the values of CC* are higher than CCwork and CCfree in the whole resolution range [Fig. 3(h)], an increase in Rfree values indicates that the original high-resolution cutoff was set reasonably [Fig. 3(g)].

Figure 3
Results from paired refinement for EP (a)-(c), POLI (d)-(f) and BO (g)-(i). Note for bar charts showing the differences in the overall R values: for each incremental step of resolution for X→Y, the R values were calculated at resolution X. EP: (a) differences in the overall R values; resolution shells with a width of 0.05 Å were added stepwise. A systematic decrease in Rfree was observed up to 1.20 Å. (b) CC* remains higher than CCwork in the whole resolution range for all the refined models. (c) Improvement in electron-density quality of the partially occupied fragment B53. Omit maps after refinement up to 1.44 (magenta) and 1.20 Å (green) are contoured at a level of 0.56 e Å⁻³. Atomic positions of the fragment molecule originate from PDB entry 4y4g (Huschmann et al., 2016). The graphic was rendered in CCP4mg (McNicholas et al., 2011). POLI: (d) differences in the overall R values; resolution shells with a width of 0.10 Å were added stepwise. (e) Comparison of CC* and CCwork of refined models. (f) Rwork of refined models. The level Rwork = 0.42 is shown as a red line. BO: (g) differences in the overall R values; resolution shells with a width of 0.10 Å were added stepwise. (h) Comparison of CC* and CCwork of refined models. (i) ⟨I/σ(I)⟩ and CC1/2 of the diffraction data depending on resolution; the level ⟨I/σ(I)⟩ = 2 is shown as a red line.
To further prove this, we ran the paired refinement protocol with 2.8 Å resolution as a starting resolution. At such low resolution, it was important to perform moderate atomic coordinate perturbation (mean shift 0.02 Å); the ADPs were set to their mean value of 35 Å². In this case, paired refinement suggested the data should be cut at 2.6 Å resolution, which was the original conservative cutoff (see the supporting information).
In addition, we ran the paired refinement protocol starting at 2.59 Å resolution without supplying the external harmonic restraints. An apparent improvement up to 2.5 Å resolution was observed in the data quality indicators. However, refinement lacking the important restraints led to unacceptable geometry of hexacyanoferrate molecules and of several amino acid residues (away from the active site) in the output files and could not be accepted as a positive result. Analysis of the geometry of the refined model is not implemented in PAIREF and is beyond the scope of the program. Therefore, it remains the user's responsibility to perform such analysis. To that end, PAIREF provides direct links to input, output and log files from all calculation procedures.

Impact of the model quality
We performed a limited analysis of the impact of the starting model quality on results from paired refinement. We selected the EP and POLI data sets as examples of structures solved using molecular replacement and an experimental phasing method, respectively. Several models from different model building stages were used in the analysis.
3.7.1. Molecular replacement and the EP data set. We solved the structure using the molecular replacement method with Phaser (McCoy et al., 2007). The crystal structure of penicillopepsin (54% identity, 67% similarity; PDB entry 2wea; Ding et al., 1998) was used as a search model. Subsequently, the protein chain was built automatically by ARP/wARP (Langer et al., 2008) at the starting resolution (1.45 Å). Altogether, we analyzed four stages of the model building: (i) model placed by molecular replacement (i.e. containing the penicillopepsin sequence), (ii) the protein chain built by ARP/wARP, (iii) the original model of the final structure (PDB entry 4y4g) without solvent and (iv) the final complete deposited model. We used an identical setup for all the paired refinement protocols. Initially, the coordinates were perturbed by an average of 0.25 Å and the ADPs were set to their mean value, followed by 250 refinement cycles at the starting resolution (required for refinement convergence). Then, high-resolution shells with a width of 0.05 Å were added stepwise (see the supporting information).
Surprisingly, utilization of the data in the whole resolution range (up to 1.10 Å) is suggested when using a distant protein model correctly placed in the asymmetric unit. In contrast to this, improvement only up to 1.30 Å is observed using the model after complete protein rebuilding with ARP/wARP. Use of a protein model with no solvent molecules suggests the application of a high-resolution cutoff at 1.25 Å and for the most complete model at 1.20 Å.

3.7.2. Experimental phasing and the POLI data set. The crystal structure of interferon gamma from P. olivaceus was solved using SAD phasing. The following stages of model building were analysed: a poly-Ala model from SHELXE (Sheldrick, 2002), a complete protein model without solvent from PHENIX AutoBuild (Terwilliger et al., 2008) [Figs. 4(e) and 4(f)] and the model prior to the final refinement [Fig. 3(d)] at the starting resolution (2.3 Å). Here we used optimized parameters of the paired refinement protocol for each specific model (see the supporting information).
The use of incomplete models in paired refinement suggested a high-resolution cutoff of 2.2 Å, whereas the use of the most complete model suggested a cutoff of 2.0 Å.
Both examples indicate that model quality and completeness may play a significant role in the results from paired refinement.

Limitations and further development
Amongst the hundreds of trials we performed, we did not register any failure of PAIREF itself. However, in a few cases the external programs may fail to report an appropriate value, which may cause the PAIREF run to crash. These cases were observed mostly at unreasonable resolutions, e.g. in a third or fourth resolution shell that should already have been omitted, or during analysis of very thin shells (e.g. 0.01 Å).
Results of paired refinement are strongly influenced by the structure refinement protocol (and in some cases also by the specific REFMAC5 version). In most of the cases mentioned above, a possible improvement in model accuracy owing to the use of higher-resolution data was detected using PAIREF. However, no improvement from the conservative cutoff was observed in the case of bilirubin oxidase.
The main focus of our further development will be the implementation of structure refinement using phenix.refine. Most of the procedures cannot be parallelized. Nevertheless, the parallelization of the complete cross-validation protocol is planned to significantly reduce computational time. Moreover, the inclusion of other monitoring statistics, e.g. Rcomplete (Luebben & Gruene, 2015), in the final report is under development.

Discussion
In macromolecular refinement, the maximum amount of valuable data should be used to obtain the best possible structural models. Hence, evaluation of data significance should be based on novel approaches. This involves the implementation of correlation coefficients and simultaneous monitoring of trends of several statistics that are directly linked to the quality of the refined model. Paired refinement is currently generally accepted as the optimal protocol for the determination of the high-resolution cutoff. The PAIREF program is a command-line tool that performs such an analysis and creates a compact report that allows users to make a self-contained decision on the data limit.
In one of the examples documented here, we first analyzed the progress of the paired refinement procedure as well as the PAIREF functionality on data that have been artificially generated from a known structure. This structure later served as a target to monitor the convergence of the refined models. Continuous improvement in agreement between the original structure and models from paired refinement was observed in a range where our criteria suggested acceptance of further data. Here, the RMSD calculations showed that use of the high-resolution cutoff suggested by paired refinement produces models closest to the truth. The gap between CCwork and CC* visible for all projects except SIM corresponds to the R-value gap discussed by Holton et al. (2014), and is due to deficiencies in modelling the experiment.
We also tested the program on five other real cases, some of them previously used in paired refinement. In four cases, we showed that the model could be further improved by the use of data beyond conservative cutoffs. Our program is able to successfully reproduce two particular paired refinement protocols that were published previously [TL in the work by Winter et al. (2018) and CDO in the work by Karplus & Diederichs (2012)] and the results obtained are in good agreement with the original ones. Slight differences could be caused by the use of a newer version of REFMAC5 (in the case of TL), or by the utilization of other refinement software and the absence of an automatic solvent update during paired refinement (in the case of CDO).
In the case of bilirubin oxidase, the conservative and paired refinement approaches agreed in their estimate of the high-resolution cutoff. In all reported cases, the values of ⟨I/σ(I)⟩ and CC1/2 in the highest accepted resolution shell range from 0.1 to 1.7 and from 0.027 to 0.524, respectively. Therefore, it is clear that a resolution cutoff based purely on certain values of these statistics does not correspond to the information content in the last or next additional resolution shell, as shown in previous works (Karplus & Diederichs, 2012; Diederichs & Karplus, 2013; Evans & Murshudov, 2013; Winter et al., 2018).
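For orientation, CC1/2 values can be converted into CC*, the estimate of the correlation of the merged data with the underlying true signal, using the relation CC* = sqrt[2CC1/2/(1 + CC1/2)] introduced by Karplus & Diederichs (2012). The sketch below applies this relation to the two extreme CC1/2 values quoted above; the function name cc_star is illustrative only, and the sketch assumes non-negative CC1/2.

```python
import math


def cc_star(cc_half):
    """Estimate CC* from CC1/2 via CC* = sqrt(2 * CC1/2 / (1 + CC1/2)).

    Illustrative helper: negative CC1/2 values (which can occur for pure
    noise) are rejected here because the square root would be undefined.
    """
    if not 0.0 <= cc_half <= 1.0:
        raise ValueError("this sketch expects 0 <= CC1/2 <= 1")
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


# Extreme CC1/2 values in the highest accepted shells, as quoted in the text.
for cc_half in (0.027, 0.524):
    print(f"CC1/2 = {cc_half:.3f} -> CC* = {cc_star(cc_half):.3f}")
```

The two values map to CC* of roughly 0.23 and 0.83, which illustrates how widely the signal content of the last accepted shell varies between the reported cases.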
The addition of high-resolution reflections suggested by the paired refinement results influences the amount of experimental data used in structure refinement as well as the overall agreement of the model with the data. In addition, it produces cleaner and more detailed maps, which enable further manual improvement and removal of model errors by refinement. In the case of the data set from fragment screening (EP), we demonstrated that the inclusion of valid data from higher-resolution shells may have a positive impact on the quality of the electron-density map. Such an effect is clearly useful for low-occupancy ligands, partially disordered regions, alternative positions or low-resolution data.
We tested the influence of model quality on the results from paired refinement. We randomly chose a distant model for molecular replacement of the structure of endothiapepsin and simulated the procedure of structure building and refinement. We also used three models from various stages of structure determination of interferon gamma from Paralichthys olivaceus. In these two cases, we observed that the use of a poor starting model suggested a more conservative high-resolution cutoff than the use of the most complete models. This notwithstanding, the use of a (partially) incorrect model may also result in a misleading suggestion, e.g. inclusion of the whole resolution range. Therefore, the input structure model should be selected carefully; paired refinement is particularly sensible in the final stage of structure refinement.
PAIREF worked well for the examples described using this general protocol: (i) processing of diffraction data at (almost) the full resolution; (ii) provisional resolution cutoff according to a conservative criterion, structure solution, model building and refinement; (iii) paired refinement with sufficient model quality at a later stage of model refinement.
With the introduction of paired refinement into X-ray crystallography, the high-resolution diffraction limit has gained a new meaning, as the only criterion for the data cutoff is now the 'additional value' of the data in model refinement. Following the current trends in diffraction data evaluation, resolution cannot be directly related to a specific value of the conventional indicators of diffraction data quality.
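The notion of 'additional value' can be illustrated with a deliberately simplified decision rule. The sketch below assumes that, for each added shell, Rfree has been calculated at the same (previous) resolution limit for the model refined without and with the new shell, and it accepts shells for as long as Rfree does not increase. This is an assumption made for illustration only: PAIREF reports several statistics and leaves the decision to the user. The function name, the tolerance parameter and all numbers are hypothetical.

```python
def suggest_cutoff(start_resolution, steps, tolerance=0.0):
    """Return the highest-resolution limit whose added shell did not worsen Rfree.

    steps: list of (new_limit, rfree_before, rfree_after), ordered from low to
    high resolution. Both Rfree values are evaluated at the previous limit, so
    they are directly comparable.
    """
    cutoff = start_resolution
    for new_limit, rfree_before, rfree_after in steps:
        if rfree_after - rfree_before > tolerance:
            break  # the added shell made the model worse: stop extending
        cutoff = new_limit
    return cutoff


# Hypothetical numbers for illustration, not results from this work.
steps = [
    (1.40, 0.2000, 0.1990),
    (1.35, 0.1995, 0.1991),
    (1.30, 0.1998, 0.2004),  # Rfree rises, so this shell is rejected
    (1.25, 0.2001, 0.1999),  # never reached in this simple scheme
]
print(suggest_cutoff(1.45, steps))
```

Stopping at the first rejected shell is one possible design choice; in practice, trends over several shells and several statistics should be inspected together.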
Reflections that were added during the paired refinement protocol generally represent data with the lowest information content. Since they come from the highest resolution shells, their ⟨I/σ(I)⟩ is lower, Rmeas higher and CC1/2 lower. Nonetheless, they may represent a significant portion of the data. For most of the cases reported above, the reflections added through paired refinement account for more than 40% of all data. This of course is highly dependent on the conservative criteria that were used previously, before the paired refinement protocol was applied. Moreover, paired refinement has shown its importance for the improvement of structure models or even interpretability of electron-density maps.
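The size of this portion can be rationalized with an idealized estimate. The number of unique reflections within a resolution sphere scales approximately as 1/d^3, so extending the limit from d_old to d_new adds a fraction of about 1 - (d_new/d_old)^3 of the final data. The sketch below assumes complete data and neglects the low-resolution limit; the function name added_fraction is illustrative, and the result is an estimate rather than a reflection count taken from this work.

```python
def added_fraction(d_old, d_new):
    """Approximate fraction of all reflections gained by extending d_old -> d_new.

    Idealized estimate: the reflection count within a sphere of radius 1/d in
    reciprocal space is proportional to 1/d**3 (complete data assumed).
    """
    if not 0 < d_new <= d_old:
        raise ValueError("need 0 < d_new <= d_old (resolution in Angstrom)")
    return 1.0 - (d_new / d_old) ** 3


# EP example from the text: starting resolution 1.45 A, suggested cutoff 1.20 A.
print(f"{added_fraction(1.45, 1.20):.2f}")
```

For the EP limits of 1.45 and 1.20 Å this gives roughly 0.43, which shows that even a modest extension of the resolution limit can contribute a large share of the reflections.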