Electronic Reprint Biological Crystallography Decision-making in Structure Solution Using Bayesian Estimates of Map Quality: the Phenix Autosol Wizard

permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.

Estimates of the quality of experimental maps are important in many stages of structure determination of macromolecules. Map quality is defined here as the correlation between a map and the corresponding map obtained using phases from the final refined model. Here, ten different measures of experimental map quality were examined using a set of 1359 maps calculated by re-analysis of 246 solved MAD, SAD and MIR data sets. A simple Bayesian approach to estimation of map quality from one or more measures is presented. It was found that a Bayesian estimator based on the skewness of the density values in an electron-density map is the most accurate of the ten individual Bayesian estimators of map quality examined, with a correlation between estimated and actual map quality of 0.90. A combination of the skewness of electron density with the local correlation of r.m.s. density gives a further improvement in estimating map quality, with an overall correlation coefficient of 0.92.
The PHENIX AutoSol wizard carries out automated structure solution based on any combination of SAD, MAD, SIR or MIR data sets. The wizard is based on tools from the PHENIX package and uses the Bayesian estimates of map quality described here to choose the highest quality solutions after experimental phasing.


Introduction
Structure solution in macromolecular crystallography is a multi-step procedure in which more than one plausible possibility often exists at the conclusion of each step. At the start of the process one or more MAD, SAD, SIR or MIR datasets are collected and reduced to a list of indices and structure factor amplitudes (Leslie, 1992; Otwinowski & Minor, 1997; Pflugrath, 1999). Even at this stage there are often several possibilities for the space group that must be considered. For each possible space group, the process continues with finding a substructure containing heavy atoms or anomalously scattering atoms (Grosse-Kunstleve & Adams, 2003; Schneider & Sheldrick, 2002; Terwilliger & Berendzen, 1999; Weeks et al., 2003). There is often more than one plausible substructure at this stage. For example, in space groups that are not chiral the two possible hands of the substructure cannot normally be distinguished. Furthermore, for MAD datasets there may be alternative solutions found by searching for the substructure using different datasets (from various wavelengths or combining data from different wavelengths using $F_A$ values; Terwilliger & Berendzen, 1994). Similarly, for MIR datasets there may also be substructures found for several different derivatives. In addition to these intrinsic possibilities, it is possible that more than one set of parameters or even more than one set of software might be used to generate possible solutions. The potential heavy-atom substructures found are then used to calculate phases of structure factors, which in turn are used as the starting point for density modification (Wang, 1985) and subsequent model-building (e.g., Perrakis et al., 1999; Terwilliger et al., 2007). Normally one of the best indications of map quality is that the map can be interpreted in terms of an atomic model.
If every possibility at every stage were investigated fully by calculating maps, carrying out density modification and model-building, the process might take many hours or days to complete.
To speed up the process, the possibilities at each stage are generally ranked, with only the highest-ranked possibilities being considered for the next step. This approach can be quite efficient, but if it is to yield the best solution at the end, it requires a reliable method for deciding which members of a set of solutions are of the highest quality.
The definition of "quality" when applied to electron density maps normally refers to the correlation between values of electron density in the map and the values of electron density in a hypothetical "true" map for the same structure. In this work, when tests are carried out to assess various measures of map quality, the "true" quality or map correlation is calculated between the map in question and a map calculated from a refined model of the corresponding structure. Maps that have a high map correlation as defined in this way are generally more useful for model-building and interpretation than those with a low map correlation. It should be noted, however, that map correlation is not a perfect way to assess the utility of a map, as low-resolution terms are generally stronger and therefore have a higher relative contribution to the correlation than high-resolution terms, while the high-resolution terms are generally essential for interpretation of a map.
A number of methods for evaluating the quality of experimental macromolecular electron density maps have been developed. The methods can generally be grouped into real-space calculations and reciprocal-space calculations. Real-space methods are based on an examination of the electron density map and generally answer the question: "Does this map look like an electron density map of a macromolecule?" There are many distinctive features of macromolecular electron density maps that can be used to answer this question. A good map may be expected to have continuous chains of density (Baker et al., 1993). It may have local patterns of density that reflect shapes and interatomic spacings common to macromolecules (Colovos et al., 2000; Terwilliger, 2003). It may have a distribution of electron densities with a positive skew, reflecting the large number of points with moderate or low electron density, the lack of points with negative density, and the points with very positive electron density located near atoms in the structure (Podjarny, 1976; Lunin, 1993). There may be a large variation (contrast) in the local rmsd of electron density, reflecting regions of the structure containing the macromolecule (with high local variation) and solvent (with low local variation; Terwilliger & Berendzen, 1999a; Schneider & Sheldrick, 2002). The contiguous nature of the regions of relatively flat solvent may be detected from the correlation of local rmsd at one point in a map with that at neighboring points (Terwilliger & Berendzen, 1999b). If non-crystallographic symmetry is present in the structure, then the correlation of NCS-related density can be detected (Cowtan & Main, 1998; Vellieux et al., 1999; Terwilliger, 2002a).
Reciprocal-space methods for evaluation of map quality generally address questions involving structure factors and expectations about the structure such as the model for the solvent region or for the heavy-atom substructure. One such question is simply, "Given the anomalously scattering atom model and the observed data, what is the expected correlation between the experimental map and the true map?" The value of the figure of merit of phasing (Blow & Crick, 1969; Terwilliger & Berendzen, 1999), when estimated correctly, is similar in magnitude to the correlation between the experimental and true maps and can be used as an estimate of this correlation. Another question addresses the data and the expectations about the electron density map: "Is the amplitude of each structure factor consistent with the value expected based on the amplitudes and phases of all other reflections and the model of the solvent region?" This question can be answered based on the R-factor in the first cycle of density modification (which reflects the agreement between each measured amplitude and an estimate of that amplitude based on all other amplitudes and phases along with expectations about features in the map; Cowtan & Main, 1996; Terwilliger, 2001). A related question can be asked about the phases: "If a phase is estimated from the model of the solvent region, measured amplitudes of structure factors, and the experimental values of all other phases, is this phase correlated with its experimentally determined value?" This question can be answered using the correlation of experimental phases with map-probability phases obtained in statistical density modification (Terwilliger, 2001). A third question that might be asked is, "Do the phases calculated using only the highest peaks in the map match the experimental phases?" This question can be answered by truncating the density at a high level, calculating phases from the map, and comparing these with the experimental phases (Baker et al., 1993).
It is important to note that the measures of map quality are analyzed here for their utility in distinguishing quality of experimental electron density maps, as opposed to maps that have been calculated using a partially-correct model or maps that have had density modification applied.
An important difference between experimental maps and those obtained using a model or based on density modification is that in the latter cases the maps have been specifically adjusted to maximize one or more of the properties that is being measured. For example, density modification typically flattens the solvent region of the map. Similarly, a map calculated from a model will tend to have a high skew of electron density and a high connectivity of high electron density. Some of these measures may also be useful in these two other important cases, but the values of each measure corresponding to a particular quality of map are likely to be substantially different.
In this work we implement 10 different measures of quality of experimental electron density maps, develop a simple Bayesian approach to estimating map quality from each, and show how the individual estimates can be combined to yield useful overall estimates of map quality. These map quality estimates are incorporated into the PHENIX AutoSol Wizard and are used to make decisions during automated structure solution.

Structure solution with the AutoSol Wizard
The AutoSol Wizard carries out structure solution for SAD/MAD or MIR/SIR/SIRAS data and any combination of these. If data representing more than one heavy-atom substructure is available, the data are grouped into "datasets" with common heavy-atom substructures.
Analysis with phenix.xtriage. Each available set of data is analyzed using phenix.xtriage (Zwart et al., 2005) for circumstances such as twinning, translational non-crystallographic symmetry, unexpectedly strong or weak reflections or groups of reflections, or anisotropic overall atomic displacement parameters that may complicate structure determination. The data are corrected for anisotropy before structure solution is carried out if the overall anisotropy correction yields values that are highly anisotropic (by default, defined as greater than 1.5-fold ratio among the atomic displacement parameter values along the three principal reciprocal axes and greater than 20 Å² difference between the highest and lowest values). If an anisotropy correction is applied, then the resulting corrected data are used for structure solution only and not for refinement (as an anisotropy correction is applied as part of the refinement process itself).
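The default anisotropy criterion can be illustrated with a short sketch. The function name, argument layout and the treatment of a non-positive lowest value are assumptions made for illustration; this is not the phenix.xtriage interface.

```python
def needs_anisotropy_correction(b_principal, min_ratio=1.5, min_diff=20.0):
    """Return True if the overall displacement parameters look highly anisotropic.

    b_principal: the three overall atomic displacement parameter values (in Å^2)
    along the principal reciprocal axes (hypothetical input format).
    """
    highest, lowest = max(b_principal), min(b_principal)
    large_difference = (highest - lowest) > min_diff
    if lowest <= 0.0:
        # Assumption: the ratio is undefined here, so rely on the difference only.
        return large_difference
    return (highest / lowest) > min_ratio and large_difference


print(needs_anisotropy_correction([20.0, 25.0, 50.0]))
print(needs_anisotropy_correction([100.0, 110.0, 125.0]))
```

Both conditions must hold, so data with a large absolute spread but a small ratio (the second call) are left uncorrected.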
Substructure solution with HYSS. For each dataset (i.e. a MAD or SAD dataset or a SIR dataset) possible heavy-atom substructures are found using the hybrid substructure search (HYSS; Grosse-Kunstleve & Adams, 2003) from isomorphous, anomalous, or dispersive differences, or from $F_A$ values (Terwilliger, 1994). The high-resolution limit used for the search is typically 3 Å.
Phasing with Phaser and SOLVE and map evaluation. Each potential heavy-atom substructure found above (along with its inverse) is used to calculate phases with Phaser (for SAD phasing; McCoy et al., 2004) or SOLVE (for MAD, SIR and MIR phasing; Terwilliger & Berendzen, 1996; Terwilliger & Berendzen, 1997; Terwilliger & Berendzen, 1999). The resulting phases and amplitudes of structure factors, along with weights (the figure of merit of phasing), are used to calculate experimental electron density maps using a high-resolution limit of 2.5 Å (or lower, if data are not available to this resolution). The high-resolution limit is applied to reduce the effects of resolution cutoffs on the features of electron density maps. These maps are evaluated with the measures of map quality described in this work and the overall Bayesian estimate of quality is used to rank solutions. In cases where two solutions have very similar heavy-atom parameters (rmsd among heavy-atom coordinates of less than 1/10 the high-resolution limit of the data) The estimate of uncertainty in the map quality is used to identify solutions that might plausibly (5% possibility or greater) be the best solution and normally all such solutions are considered at each step. By default up to 3 of the highest-ranking solutions (6 for MIR structures) for the heavy-atom substructure are used to calculate phases and weights at the full available resolution of the data and for density modification. In the structure determinations carried out below for development of the map evaluation criteria, rankings were done using a Z-score procedure (Terwilliger & Berendzen, 1999) based only on the skew of electron density (as defined below).
Statistical density modification with RESOLVE. The experimental phases obtained above are used as a starting point for statistical density modification using RESOLVE (Terwilliger, 2000).
In statistical density modification with the AutoSol Wizard, a probabilistic estimate of the boundary between macromolecule and solvent is identified in two ways, and the one leading to the lower R-factor in density modification is used. The first method (Wang, 1985) is based on the local rms density, smoothing the squared density using a sphere (Leslie, 1987) with a smoothing radius ($r_{\mathrm{smooth}}$) given by an empirically derived formula (chosen by optimizing parameters while carrying out density modification using model data) in terms of $d_{\mathrm{min}}$, the high-resolution limit of the data, and $\langle m \rangle$, the mean figure of merit of phasing.
The second method for solvent boundary identification uses a comparison of histograms of density based on model maps calculated with partially-randomized phases with local histograms of density in the experimental map to assign a probability that each point in the map is part of the macromolecule or part of the solvent region. In both cases a probabilistic solvent boundary is obtained (Terwilliger, 1999).
Non-crystallographic symmetry is used in density modification if it is detected based on the heavy-atom substructure and the presence of correlated density at NCS-related positions in the electron density map (Terwilliger, 2002a; Terwilliger, 2002b). The value of $r_{\mathrm{smooth}}$ described above is used as a smoothing radius in a local correlation map to identify the region over which NCS holds (Vellieux et al., 1995).
Model-building with RESOLVE. After density modification, the AutoSol Wizard carries out automated model-building using a single cycle of building with the PHENIX AutoBuild Wizard (Terwilliger et al., 2007), or using rapid methods for building secondary structure of proteins and nucleic acids (TT, unpublished). Initially a secondary-structure-only model is built into each map.
The correlation between a map calculated from the model and the density-modified map is then determined. If the value of the map-model correlation is less than a cutoff value (typically 0.35) then the building procedure is repeated with a standard cycle of building using the methods in the PHENIX AutoBuild Wizard. If a map-model correlation of a given cutoff (typically 0.20) or greater is obtained for at least one solution, then the top solution is identified as the one with the highest value of the map-model correlation. If a lower map-model correlation is obtained, then the top solution is identified (see below) based on the Bayesian estimates of quality using the skew of electron density (skew) and the correlation of local rms density ($r^2_{\mathrm{RMS}}$).
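The decision rules in this paragraph can be summarized in a minimal sketch. The dictionary keys, function names and return values are hypothetical and chosen only for illustration; the cutoffs are the typical values quoted in the text.

```python
def needs_standard_build(map_model_cc, rebuild_cutoff=0.35):
    """A quick secondary-structure model scoring below the cutoff triggers
    a standard cycle of building."""
    return map_model_cc < rebuild_cutoff


def choose_top_solution(solutions, cc_cutoff=0.20):
    """Pick the top solution from a list of dicts with the hypothetical keys
    'name', 'map_model_cc' and 'bayes_cc' (the Bayesian quality estimate)."""
    best_by_model = max(solutions, key=lambda s: s["map_model_cc"])
    if best_by_model["map_model_cc"] >= cc_cutoff:
        # At least one model fits its map well enough: trust the model.
        return best_by_model["name"], "map-model correlation"
    # No model is convincing: fall back on the Bayesian map-quality estimate.
    best_by_bayes = max(solutions, key=lambda s: s["bayes_cc"])
    return best_by_bayes["name"], "Bayesian estimate"


solutions = [
    {"name": "sol_1", "map_model_cc": 0.12, "bayes_cc": 0.41},
    {"name": "sol_2", "map_model_cc": 0.15, "bayes_cc": 0.33},
]
print(choose_top_solution(solutions))
```

In this toy input neither model reaches 0.20, so the ranking falls back on the Bayesian estimate even though `sol_2` has the slightly higher map-model correlation.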

Evaluation of measures of map quality
A set of measures of map quality was applied to a set of experimental maps (or structure factor amplitudes, phases and weights) obtained from real but re-enacted structure determinations. Each of the structures considered had been determined previously, so that a refined model could be used to calculate a model map to use as a standard. The "true" quality of each map was taken to be the correlation with the corresponding standard map, calculated at the same nominal resolution. Each measure of quality was applied to each map and the resulting scores were saved along with the corresponding "true" quality. The structure solution process was automatically carried out by the PHENIX AutoSol Wizard, and each experimentally phased map that was obtained during the structure solution process was examined in this way. To reduce the number of near-duplicate solutions considered, all solutions for a structure that had nearly identical values of the map correlation to the standard map (within a range of ±0.0005 in map correlation) were considered the same, and only the first was used in the analysis. For comparisons involving two possible enantiomers of a solution, the two enantiomers of a solution sometimes differed only slightly (i.e., the heavy-atom substructure was nearly centrosymmetric).
In these analyses of enantiomeric pairs, only those that differed by an rmsd of at least 0.5 Å were considered.
For analysis of map quality, electron density maps and structure factors were calculated using a high-resolution limit of 2.5 Å (if data were available to that resolution), as described for the AutoSol Wizard above. Before applying each of the measures of map quality, the experimental maps were normalized to a mean of zero and a variance of unity. They were then adjusted in two steps to reduce the contribution from high density at the coordinates of heavy-atom sites. (The high density at heavy-atom sites might otherwise lead to high values for the skew, NCS correlation, contrast, and possibly other measures.) First, the electron density within a radius (r) of each heavy-atom site used in phasing (where r was given by twice the resolution of the data or 5 Å, whichever was greater) was limited to values less than or equal to twice the rms (2σ) of the map. Second, the electron density everywhere in the map was limited to values in the range of -5σ to +5σ. This modified map is referred to below as the normalized, truncated experimental electron density map.
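The normalization and two-step truncation can be sketched as below. The flat list of (coordinate, density) pairs is a hypothetical data layout, and the sketch ignores unit-cell periodicity and symmetry copies of the heavy-atom sites, which a real implementation would need to handle.

```python
import math


def normalize_truncate(points, sites, d_min, site_cap=2.0, global_cap=5.0):
    """points: list of ((x, y, z), rho) with coordinates in Å.
    sites: heavy-atom coordinates (x, y, z) in Å used in phasing.
    Returns the same layout with normalized, truncated density."""
    n = len(points)
    mean = sum(rho for _, rho in points) / n
    sigma = math.sqrt(sum((rho - mean) ** 2 for _, rho in points) / n)
    if sigma == 0.0:
        raise ValueError("map has zero variance")
    radius = max(2.0 * d_min, 5.0)  # twice the resolution or 5 Å, whichever is greater
    result = []
    for xyz, rho in points:
        z = (rho - mean) / sigma  # mean 0, variance 1, so sigma is now 1
        if any(math.dist(xyz, site) <= radius for site in sites):
            z = min(z, site_cap)  # step 1: cap density near heavy-atom sites at 2 sigma
        z = max(-global_cap, min(global_cap, z))  # step 2: limit to [-5, +5] sigma
        result.append((xyz, z))
    return result


# Toy 1-D "map": two strong peaks, one at a heavy-atom site and one elsewhere.
points = [((float(i), 0.0, 0.0), 100.0 if i in (0, 50) else 0.0) for i in range(100)]
truncated = normalize_truncate(points, sites=[(0.0, 0.0, 0.0)], d_min=2.5)
print(truncated[0][1], truncated[50][1])
```

The peak at the heavy-atom site is capped at 2σ while the equally strong peak far from any site is only limited by the global ±5σ bound.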
Weighted electron density maps were calculated in the PHENIX environment (Adams et al., 2002) using RESOLVE (Terwilliger, 2000) on a grid with spacing of 1/3 the high-resolution limit of the data or finer. Map correlations were obtained by calculating the correlation coefficient of a pair of maps at all the grid points in the asymmetric unit of the unit cell. Model-map correlations were calculated in the same way, except that one map was calculated from the model and an overall B-factor (b_overall) was adjusted to maximize the correlation. For protein chains, an increment in B-factors (beta_b) for each bond between side-chain atoms and the Cβ atom was also applied.
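A map correlation of this kind is the ordinary Pearson correlation coefficient taken over matching grid points. A minimal sketch, assuming both maps are supplied as flat lists sampled on the same grid in the asymmetric unit (the function name is illustrative):

```python
import math


def map_correlation(map_a, map_b):
    """Pearson correlation coefficient between two maps on the same grid."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must be sampled on the same grid")
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / math.sqrt(var_a * var_b)


print(map_correlation([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))
```

Because the coefficient is invariant to the scale and offset of either map, the two maps do not need to be on the same absolute scale.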

Real-space map-quality measures
The measures of map quality used in this work are described in this and the following section and are summarized in Table 1.

Skew of electron density:
The skew of each normalized, truncated map (as described in section 2.1) was calculated using the relation

$$\mathrm{skew} = \frac{\langle \rho^3 \rangle}{\langle \rho^2 \rangle^{3/2}}$$

where the electron density (ρ) was calculated at all the grid points in the asymmetric unit.
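As an illustration, the skew can be computed in a few lines. This sketch assumes the conventional third-moment form ⟨ρ³⟩/⟨ρ²⟩^(3/2) for a map that has already been normalized to zero mean; the function name and the flat-list input are illustrative.

```python
def skew(rho):
    """Skew of a zero-mean density map given as a flat list of grid values."""
    n = len(rho)
    second_moment = sum(r * r for r in rho) / n
    third_moment = sum(r ** 3 for r in rho) / n
    return third_moment / second_moment ** 1.5


# A symmetric distribution has no skew; a few strong positive peaks give a positive skew.
print(skew([-1.0, 1.0, -2.0, 2.0]))
print(skew([-1.0, -1.0, -1.0, 3.0]))
```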
Contrast of electron density. The contrast between the rms (root-mean-square) density in the solvent region and the rms density in the macromolecular region was calculated from the standard deviation of the local rms density over the entire asymmetric unit (Terwilliger & Berendzen, 1999a; Schneider & Sheldrick, 2002). The normalized, truncated density described in section 2.1 was first squared. The squared density was then smoothed by averaging all values within a moving sphere with radius (r) given by the larger of 6 Å or twice the high-resolution limit of the data. The standard deviation (s) of the smoothed squared density was then calculated. To compensate for the effect of the solvent fraction in the crystal (f) on the resulting value, the standard deviation (s) calculated above was multiplied by the factor $[(1-f)/f]^{1/2}$ to yield the contrast c:

$$c = s\,[(1-f)/f]^{1/2}$$

The correction factor $[(1-f)/f]^{1/2}$ was chosen because it leads to a value of 1 for the contrast for a map for which the entire solvent region has a zero variance and the non-solvent region has a constant and non-zero variance.
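The limiting value of 1 can be checked with a short worked calculation. It assumes an idealized map with overall variance 1 (as after normalization), a solvent fraction f with local mean-square density 0, a macromolecular fraction 1−f with constant local mean-square density v, and a smoothing sphere small enough that boundary effects can be ignored; the symbol v is introduced here only for the illustration.

```latex
\begin{align*}
(1-f)\,v &= 1 \quad\Rightarrow\quad v = \frac{1}{1-f}
  && \text{(overall variance is unity)} \\
s^2 &= f\,(0-1)^2 + (1-f)\,(v-1)^2
     = f + \frac{f^2}{1-f} = \frac{f}{1-f}
  && \text{(mean of the local values is 1)} \\
c &= s\left[\frac{1-f}{f}\right]^{1/2}
   = \left[\frac{f}{1-f}\right]^{1/2}\left[\frac{1-f}{f}\right]^{1/2} = 1
\end{align*}
```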
Correlation of local rms density. The presence of contiguous flat solvent regions in a map was detected using the correlation coefficient of the smoothed squared electron density calculated as described above, with the same quantity calculated using half the value of the smoothing radius, yielding the correlation of rms density, $r^2_{\mathrm{RMS}}$. In this way the local value of the rms density within a small local region (typically within a radius of 3 Å) is compared with the local rms density in a larger local region (typically within a radius of 6 Å). If there is a large, contiguous solvent region and another large contiguous region containing the macromolecule, the local rms density in the small region would be expected to be highly correlated with the rms density in the larger region.
On the other hand, if the "solvent" region is broken up into many small flat regions, then this correlation would be expected to be smaller.
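The contrast and the correlation of local rms density share the same smoothing step, so both can be sketched together. This is an illustrative sketch only: the map is a nested list on a periodic grid, radii are given in grid units rather than Å, crystallographic symmetry is ignored, and all function names are hypothetical.

```python
import math
import statistics


def smooth_squared(rho, radius):
    """Average rho**2 within a sphere (radius in grid units) on a periodic grid."""
    nx, ny, nz = len(rho), len(rho[0]), len(rho[0][0])
    r = int(radius)
    offsets = [(i, j, k)
               for i in range(-r, r + 1)
               for j in range(-r, r + 1)
               for k in range(-r, r + 1)
               if i * i + j * j + k * k <= radius * radius]
    flat = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                total = sum(rho[(x + i) % nx][(y + j) % ny][(z + k) % nz] ** 2
                            for i, j, k in offsets)
                flat.append(total / len(offsets))
    return flat  # flattened list of local mean-square densities


def contrast(rho, radius, solvent_fraction):
    """Standard deviation of the smoothed squared density times [(1-f)/f]^(1/2)."""
    s = statistics.pstdev(smooth_squared(rho, radius))
    return s * math.sqrt((1.0 - solvent_fraction) / solvent_fraction)


def local_rms_correlation(rho, radius):
    """Correlation of the smoothed squared density at radius and at radius/2."""
    a = smooth_squared(rho, radius)
    b = smooth_squared(rho, radius / 2.0)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # assumption: report 0 when either map is featureless
    return cov / math.sqrt(var_a * var_b)


# Toy map: one half of the cell is flat "solvent", the other half fluctuates.
half_flat = [[[0 if x < 4 else (1 if (x + y + z) % 2 else -1)
               for z in range(8)] for y in range(8)] for x in range(8)]
print(contrast(half_flat, 2, 0.5), local_rms_correlation(half_flat, 2))
```

In the toy map the flat region is contiguous, so the local mean-square density at the two radii is strongly correlated; a map whose flat regions were scattered in small patches would score lower.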
Flatness of solvent region. A normalized, truncated electron density map was partitioned between regions of solvent and macromolecule as described in section 2.1. Then the rms electron density in the solvent region ($\mathrm{rms}_{\mathrm{SOLVENT}}$) and in the region of the macromolecule ($\mathrm{rms}_{\mathrm{PROT}}$) were calculated. The flatness (F) of the solvent region was expressed as the difference between the two (Eq. 4).

Number of regions enclosing high density. A threshold of density (t) was found such that 5% of the volume of the asymmetric unit of the crystal had a density greater than this threshold t. All the grid points in the map above the threshold t were marked. Then the number of discrete regions ($N_{\mathrm{regions}}$) containing marked points was counted. For this purpose, a discrete region was defined as a set of all marked grid points that can be connected by tracing from one adjacent marked grid point to another (including symmetry-related marked grid points). To partially compensate for the fact that lower-resolution maps have fewer grid points, the number of regions is multiplied by the high-resolution limit of the data used to calculate the map ($d_{\mathrm{min}}$). To further compensate for the volume of the asymmetric unit containing the macromolecule, the number of regions is then divided by the fraction of the asymmetric unit that contains macromolecule (f) and the volume of the asymmetric unit (V) to yield the normalized number of regions per unit volume ($N_r$):

$$N_r = \frac{N_{\mathrm{regions}}\, d_{\mathrm{min}}}{f\,V}$$

Overlap of NCS-related density. If non-crystallographic symmetry is found in the heavy-atom substructure for a solution then the map is examined for the presence of correlated density at NCS-related locations in the map (Cowtan & Main, 1998; Vellieux et al., 1995). The overlap ($O_{\mathrm{NCS}}$) between density at NCS-related locations is used to evaluate non-crystallographic symmetry, where $\rho_i$ and $\rho_j$ are the densities at NCS-related locations in the asymmetric unit and the average is taken either within a sphere with radius $r_{\mathrm{smooth}}$ (as described above for identifying the solvent boundary), or over a region within the asymmetric unit. The values of density $\rho_i$ used are those from the normalized truncated map described above. The region where NCS applies is identified as a contiguous region where the local mean of the overlap is at least $c_{\mathrm{MIN}}$, where this cutoff $c_{\mathrm{MIN}}$ is selected to yield a total volume occupied by all NCS copies approximately the same as the total volume (f) occupied by the macromolecule in the asymmetric unit (Terwilliger, 2002a

Reciprocal-space map-quality measures
R-factor and phase correlation from statistical density modification. The amplitudes and phases of structure factors calculated using statistical density modification, but without including the experimental phase probabilities, can be compared with the observed amplitudes and experimental phases (Cowtan & Main, 1996; Terwilliger, 2001). These comparisons yield an R-value ($R_{\mathrm{DENMOD}}$) for the amplitudes and a mean cosine of the phase difference ($m_{\mathrm{DENMOD}}$) for the phases.

Density truncation (peak-picking). The number of non-hydrogen atoms (n) in the asymmetric unit is roughly estimated from the fraction of the asymmetric unit that contains macromolecule (f) and the volume of the asymmetric unit (V) using an approximate average atomic volume. Then the highest n grid points in the asymmetric unit of the electron density map are identified and C atoms are placed at these grid points. A map is calculated from these C atoms and the correlation ($r^2_{\mathrm{TRUNCATION}}$) with the original map is obtained, after adjusting an overall thermal factor to maximize this correlation.

Bayesian estimates of map quality
A simple Bayesian approach was used to create estimators of map quality based on one or more of the measures of map quality described in sections 2.3 and 2.4. For each measure (e.g., skew) the analysis of maps corresponding to solved structures yielded a list of values of "true" map correlation ($r^2_{\mathrm{MODEL}}$) and the measure of quality (e.g., skew). A two-dimensional histogram was created to represent the joint distribution $p(r^2_{\mathrm{MODEL}}, \mathrm{skew})$. The distributions were sampled with 30 bins for each variable, with the allowed values of each ranging from -0.1 to 1.1. Any values obtained outside this range were put in the closest available bin. To compensate for the fact that insufficient data (1359 maps) were present to generate an accurate value for all 900 bins, the values of $p(r^2_{\mathrm{MODEL}}, \mathrm{skew})$ were smoothed using a Gaussian smoothing algorithm in which $p(r^2_{\mathrm{MODEL}}, \mathrm{skew})$ was convolved with a Gaussian function G(r) with a radius (σ) of 3 bins, $G(r) \propto \exp\{-(u^2+v^2)/(2\sigma^2)\}$, reducing the effective number of bins to about 100.
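The histogram construction and smoothing can be sketched as follows. The kernel cutoff at 3σ and the final renormalization to unit total probability are assumptions made for this illustration, as are all names.

```python
import math

N_BINS, LO, HI = 30, -0.1, 1.1


def bin_index(value):
    """Map a value to one of 30 bins on [-0.1, 1.1], clamping outliers to the closest bin."""
    i = int((value - LO) / (HI - LO) * N_BINS)
    return max(0, min(N_BINS - 1, i))


def joint_histogram(pairs):
    """pairs: iterable of (true map correlation, quality measure) observations."""
    hist = [[0.0] * N_BINS for _ in range(N_BINS)]
    for cc, measure in pairs:
        hist[bin_index(cc)][bin_index(measure)] += 1.0
    return hist


def gaussian_smooth(hist, sigma=3.0):
    """Convolve the histogram with G ∝ exp(-(u²+v²)/(2σ²)) and normalize to sum 1."""
    reach = int(3 * sigma)  # assumption: truncate the kernel at 3 sigma
    kernel = {(du, dv): math.exp(-(du * du + dv * dv) / (2.0 * sigma * sigma))
              for du in range(-reach, reach + 1)
              for dv in range(-reach, reach + 1)}
    out = [[0.0] * N_BINS for _ in range(N_BINS)]
    for u in range(N_BINS):
        for v in range(N_BINS):
            if hist[u][v] == 0.0:
                continue
            for (du, dv), weight in kernel.items():
                uu, vv = u + du, v + dv
                if 0 <= uu < N_BINS and 0 <= vv < N_BINS:
                    out[uu][vv] += weight * hist[u][v]
    total = sum(map(sum, out))
    return [[value / total for value in row] for row in out]


smoothed = gaussian_smooth(joint_histogram([(0.51, 0.51)]))
print(smoothed[15][15], smoothed[15][16])
```

A single observation is spread over its neighbors, which is how a sparse set of observations can populate a 900-bin table.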
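To estimate the value of map quality ($r^2_{\mathrm{MODEL}}$) from a new observation of the quality measure (skew), Bayes' rule (Hamilton, 1964) was used:

$$p(r^2_{\mathrm{MODEL}} \mid \mathrm{skew}) = \frac{p_o(r^2_{\mathrm{MODEL}})\; p(\mathrm{skew} \mid r^2_{\mathrm{MODEL}})}{A} \qquad (7a)$$

where the normalization factor A assures that the integrated probability for $r^2_{\mathrm{MODEL}}$ is unity and is given by

$$A = \int p_o(r^2_{\mathrm{MODEL}})\; p(\mathrm{skew} \mid r^2_{\mathrm{MODEL}})\; d r^2_{\mathrm{MODEL}}$$

Eq. (7a) says that the (posterior) probability of a particular value of $r^2_{\mathrm{MODEL}}$, given the measurement skew, is the prior probability of $r^2_{\mathrm{MODEL}}$, $p_o(r^2_{\mathrm{MODEL}})$, multiplied by the conditional probability $p(\mathrm{skew} \mid r^2_{\mathrm{MODEL}})$ of measuring this value of skew given that $r^2_{\mathrm{MODEL}}$ is the correct value, divided by a normalization factor. We calculated the conditional probability $p(\mathrm{skew} \mid r^2_{\mathrm{MODEL}})$ in Eq. 7a from the joint probability distribution $p(r^2_{\mathrm{MODEL}}, \mathrm{skew})$ using the relation

$$p(\mathrm{skew} \mid r^2_{\mathrm{MODEL}}) = \frac{p(r^2_{\mathrm{MODEL}}, \mathrm{skew})}{\int p(r^2_{\mathrm{MODEL}}, \mathrm{skew})\; d\,\mathrm{skew}}$$

For the present work we assume the prior probability distribution $p_o(r^2_{\mathrm{MODEL}})$ is uniform on [0,1]. If several measures of map quality (e.g., skew and contrast c) have been measured, then the estimates can be combined using the same approach:

$$p(r^2_{\mathrm{MODEL}} \mid \mathrm{skew}, c) = \frac{p_o(r^2_{\mathrm{MODEL}})\; p(\mathrm{skew}, c \mid r^2_{\mathrm{MODEL}})}{A} \qquad (8a)$$

where A is again the factor that normalizes the integrated probability to unity. We approximate the probability distribution $p(\mathrm{skew}, c \mid r^2_{\mathrm{MODEL}})$ as the product of the two 2-dimensional conditional probabilities that we have estimated above:

$$p(\mathrm{skew}, c \mid r^2_{\mathrm{MODEL}}) \approx p(\mathrm{skew} \mid r^2_{\mathrm{MODEL}})\; p(c \mid r^2_{\mathrm{MODEL}}) \qquad (9)$$

which amounts to assuming that the skew and contrast c are conditionally independent for a given fixed $r^2_{\mathrm{MODEL}}$ value.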
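On a binned distribution, the estimator reduces to a few sums. The following sketch is illustrative rather than the AutoSol implementation: each joint table is a nested list indexed as `joint[cc_bin][measure_bin]`, the prior is uniform on [0, 1], the measures are combined under the conditional-independence approximation, and the posterior mean and variance are taken over bin centers. All names are hypothetical.

```python
def likelihood(joint, measure_bin):
    """p(measure in measure_bin | cc bin) for every cc bin, from a joint table."""
    values = []
    for row in joint:
        row_total = sum(row)
        values.append(row[measure_bin] / row_total if row_total > 0 else 0.0)
    return values


def posterior(joints, observed_bins, centers):
    """Posterior over cc bins given one observed bin per quality measure."""
    post = [1.0 if 0.0 <= c <= 1.0 else 0.0 for c in centers]  # uniform prior on [0, 1]
    for joint, measure_bin in zip(joints, observed_bins):
        like = likelihood(joint, measure_bin)
        post = [p * l for p, l in zip(post, like)]  # independence given the cc bin
    norm = sum(post)  # normalization factor A
    if norm == 0.0:
        raise ValueError("observation has zero probability under the tables")
    return [p / norm for p in post]


def mean_and_variance(post, centers):
    mean = sum(p * c for p, c in zip(post, centers))
    variance = sum(p * (c - mean) ** 2 for p, c in zip(post, centers))
    return mean, variance


# Tiny made-up table: 3 bins of map correlation (rows) by 2 bins of a measure (columns).
centers = [0.2, 0.5, 0.8]
joint = [[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
one = mean_and_variance(posterior([joint], [1], centers), centers)
two = mean_and_variance(posterior([joint, joint], [1, 1], centers), centers)
print(one, two)
```

In the toy table, a second measure pointing the same way raises the estimated map correlation and narrows its variance, which is the behavior expected when estimates are combined.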
To obtain the estimated value and variance of $r^2_{\mathrm{MODEL}}$ given a set of observations of predictor variables (e.g., skew, c) we used the probability distribution given by Eq. 8a and calculated the expectation value $\langle r^2_{\mathrm{MODEL}} \rangle$:

$$\langle r^2_{\mathrm{MODEL}} \rangle = \int r^2_{\mathrm{MODEL}}\; p(r^2_{\mathrm{MODEL}} \mid \mathrm{skew}, c)\; d r^2_{\mathrm{MODEL}}$$

An improved estimate of the conditional probability distributions such as $p(\mathrm{skew}, c \mid r^2_{\mathrm{MODEL}})$ could potentially be obtained by calculating the covariance of the variables skew and c for each fixed value of $r^2_{\mathrm{MODEL}}$ and assuming a normal distribution of skew and c for this fixed value of $r^2_{\mathrm{MODEL}}$. This formulation differs from that in Eq. 9 by including correlations between skew and c instead of assuming that they are zero, and also through the assumption of normality in the distributions of skew and c for fixed $r^2_{\mathrm{MODEL}}$. Leaving out the fixed value of $r^2_{\mathrm{MODEL}}$ for clarity, representing the two-dimensional vector (skew, c) as x = (skew, c) and the mean values of skew and c for this value of $r^2_{\mathrm{MODEL}}$ as u = (⟨skew⟩, ⟨c⟩), we can write (Eq. 10; Hamilton, 1964):

$$p(\mathbf{x}) = \frac{1}{(2\pi)^{N/2}\,|\Sigma|^{1/2}} \exp\left\{-\tfrac{1}{2}\,(\mathbf{x}-\mathbf{u})^{T}\,\Sigma^{-1}\,(\mathbf{x}-\mathbf{u})\right\}$$

where Σ is the covariance matrix with elements $\sigma_{ij}$ representing the variation of skew and c around their means ⟨skew⟩ and ⟨c⟩:

$$\sigma_{ij} = \langle (x_i - u_i)(x_j - u_j) \rangle$$

To test this approach we used the data described above, but grouped in bins of $r^2_{\mathrm{MODEL}}$. The observations in each bin of $r^2_{\mathrm{MODEL}}$ were analyzed using Eqs. 10a-10d based on the values of the N predictor variables (skew, c…) for all the observations in that bin to obtain an approximation of the conditional probability distribution $p(\mathrm{skew}, c \mid r^2_{\mathrm{MODEL}})$ for that bin. This set of approximations (one for each bin of $r^2_{\mathrm{MODEL}}$) was then used in Eq. 8 to estimate $r^2_{\mathrm{MODEL}}$ for individual sets of observations of the N predictor variables. This approach gave correlations that were at most marginally improved over those obtained using estimates of the conditional probability distribution $p(\mathrm{skew}, c \mid r^2_{\mathrm{MODEL}})$ based on Eq. 9.
For example, using skew and correlation of local rms density ($r^2_{\mathrm{RMS}}$) as predictor variables, and analyzing the same data shown in Table 3 (but without cross-validation), the overall correlation coefficient between true values of $r^2_{\mathrm{MODEL}}$ and estimates using Eq. 9 (in which independence of skew and $r^2_{\mathrm{RMS}}$ is assumed) was 0.925. Using Eq. 10 (assuming Gaussian distributions for skew and $r^2_{\mathrm{RMS}}$) and setting the covariance terms to zero (assuming independence of skew and $r^2_{\mathrm{RMS}}$) yielded a value of 0.926, and the same analysis, but including the covariance terms, yielded a value of 0.927. As this approach did not significantly improve the correlation, it was not used. Fig. 1c suggests that the assumption of normality in the distributions of the predictor variables (e.g., skew and $r^2_{\mathrm{RMS}}$) for fixed $r^2_{\mathrm{MODEL}}$ is not well justified. This may partially explain the poor performance of this approach.

Crystallography Course), mbp (1YTT, Burling et al., 1996), mev-kinase (1KKH, Yang et al., 2002), myoglobin (Ana Gonzales, personal communication), nsf-d2 (1NSF, Yu et al., 1998), nsfn (1QCS, Yu et al., 1999), p32 (1P32, Jiang et al., 1999), p9 (1BKB, Peat et al., 1998), pdz (1KWA, Daniels et al., 1998), penicillopepsin (3APP, James & Sielecki, 1983), psd-95 (1JXM, Tavares et al., 2001), qaprtase (1QPO, Sharma et al., 1998), rab3a (1ZBD, Ostermeier & Brunger, 1999), rh-dehalogenase (1BN7, Newman et al., 1999), rnase-p (1NZ0, Kazantsev et al., 2003), rnase-s (1RGE, Sevcik et al., 1996), rop (1F4N, Willis et al., 2000), s-hydrolase (1A7A, Turner et al., 1998), sec17 (1QQE, Rice & Brunger, 1999), synapsin (1AUV, Esser et al., 1998), Sutton et al., 1999), tryparedoxin (1QK8, Alphey et al., 1999), utsynthase (1E8C, Gordon et al., 2001), vmp (1L8W, Eicken et al., 2002).
The structures from the JCSG included PDB entries 1O1X (Xu et al., 2004)

3.1. Measures of map quality
A key goal of this work was to identify one or more quality measures of maps or of structure factors that are simple to calculate and that can yield accurate estimates of the qualities of the corresponding electron density maps. Table 1 lists 6 measures of map quality examined here that are based on the features in the maps (real-space measures), and Table 2 lists 4 additional measures we have examined that depend on the structure factors and phases used to calculate maps. The measures we have examined were chosen to represent a range of possible measures that cover many important features of electron density maps and structure factors.
To evaluate possible measures of map quality, we carried out a re-analysis of data for 246 previously-solved MAD, SAD and MIR structures, creating electron density maps during the structure-determination process and analyzing them with each of the measures in Tables 1 and 2.
As the structures are all known, the "true" map quality for each map could be calculated as the correlation coefficient r 2 MODEL between each map and the corresponding map obtained from the refined model of the structure (after any necessary origin shifts are applied) using the PHENIX tool phenix.get_cc_mtz_mtz. Figures 1A through 1J show the values of each measure plotted against r 2 MODEL for 1359 maps based on structures calculated from the MAD, SAD, and MIR data listed in section 2.6. The maps represent phases obtained at several stages in structure determination. Some are calculated using heavy-atom solutions found from anomalous or isomorphous differences or from F A values with HYSS (Grosse-Kunstleve & Adams, 2003). Others are calculated using the corresponding substructures with inverted hand. Others are obtained from difference Fourier (MIR) and anomalous difference Fourier (MAD) analyses. In the case of MIR, a large number of additional solutions are obtained by combinations of partial solutions from different derivatives.
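The definition of true map quality used above can be illustrated with a short sketch. It assumes that the two maps have already been sampled on the same grid with any origin shift applied, and it is not the implementation of phenix.get_cc_mtz_mtz; the function name map_correlation is hypothetical.

```python
import numpy as np

def map_correlation(rho_a, rho_b):
    """Linear correlation coefficient between two density maps sampled on
    the same grid (assumed to be already superimposed on a common origin)."""
    a = np.asarray(rho_a, dtype=float).ravel()
    b = np.asarray(rho_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("maps must be sampled on the same grid")
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))
```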
The general features of the plots in Fig. 1 are illustrated by a discussion of Fig. 1A, which shows the skew of electron density in experimental maps as a function of the true map quality, r 2 MODEL . In Fig. 1A the purple squares correspond to datasets with a nominal resolution lower than 2 Å, and the black diamonds to datasets with resolutions of 2 Å or higher. (Note that the data for all these calculations are truncated at a resolution of 2.5 Å, so that most resolution-dependent differences are likely to be due to dataset-dependent decreases of intensities with resolution, rather than the resolution of the data.) Fig. 1A shows that the skew of the electron density depends strongly on the map quality, as represented by the correlation of the density in the map with that of a model map (r 2 MODEL ). The skew is approximately zero for maps with a correlation in the range of 0.0 < r 2 MODEL < 0.2. It increases slightly for maps with correlations in the range of 0.2 < r 2 MODEL < 0.4, and then it increases substantially for maps with higher correlations (r 2 MODEL > 0.4). The standard deviation of values of the skew is about 0.05-0.10 over most ranges of map correlation. For example, for values of map correlation with r 2 MODEL < 0.2, the mean skew is -0.02 and the standard deviation is 0.07, and for values of map correlation with 0.4 < r 2 MODEL < 0.5 the mean skew is 0.14 with standard deviation of 0.06. For values of map correlation with 0.6 < r 2 MODEL < 0.7, the mean skew is 0.38 with standard deviation of 0.10. Another way to view these relationships is to note that the difference (0.16) in mean values of the skew between values of map correlation of r 2 MODEL < 0.2 and values of map correlation in the range of 0.4 < r 2 MODEL < 0.5 is about twice the standard deviation of the skew in either range. 
This means that the skew can be expected to differentiate between maps with model correlations r 2 MODEL of zero and 0.4, but that it cannot differentiate them correctly all of the time. This can also be seen directly from

A somewhat different behaviour is shown by the number of contiguous regions (N r ) required to enclose the highest 5% of density in a map (Fig. 1E). This measure decreases with increasing map quality, but only slightly, so that it is not a strong discriminator between maps of low and moderate quality.
The overlap of NCS-related density (Fig. 1F) is a measure which, as implemented here, only applies to maps where NCS can be identified from the symmetry present in the heavy-atom sites.
It is therefore different from the measures discussed so far and cannot be used as a general measure of map quality. It is nevertheless useful in differentiating between maps of very high model map correlations (r 2 MODEL ) and those that have lower model map correlation. Figs. 1G and 1H show the phase correlations (m DENMOD ) and R-factors (R DENMOD ) obtained from the first cycle of statistical density modification using the same structure factors, phases, and weights that are used to calculate electron density maps analyzed in Figs. 1A-1F. In the first cycle of statistical density modification (Terwilliger, 2000) estimates of the phase and amplitude of a reflection k are obtained using only information from all the other reflections in the dataset.
The amplitude and phase for reflection k from the density modification procedure can then be compared with the experimentally observed amplitude and the "experimental" phase (derived using isomorphous or anomalous differences) to yield an R-factor for density modification (R DENMOD ) and a mean cosine of the phase difference (m DENMOD ). Figure 1G shows that, as expected, the R-factor for density modification decreases with increasing map quality, while Fig. 1H shows that the phase correlation increases over the same range. Fig. 1I shows that the correlation of pseudo-maps calculated using dummy atoms placed at the highest peaks in a map with their corresponding original maps (r 2 TRUNCATION ) is weakly related to the quality of the map. It seems possible that more sophisticated methods of map skeletonization (Baker et al., 1993) might be more useful in map evaluation than our simple measure.
Finally, Fig. 1J shows that the mean figure of merit of phasing (<m>) is related to the quality of the map, but that there are many maps with very low correlation to the corresponding model maps that nevertheless have high mean figures of merit. This complex relationship can be understood by considering that the figures of merit of phasing of two maps that are calculated using the same data, but opposite enantiomers of the heavy-atom substructure, are normally identical for SAD phasing if all the anomalous scatterers are of the same type. Typically one of these maps may have a high correlation to the model map, while the other may have a very low correlation.
Overall, Fig. 1 shows that several measures of map quality based on different features of the map and of structure factors and phases leading to the map have strong relationships to the quality of the electron density map, with the skew of electron density clearly being one of the best indicators of map quality.
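As an illustration of the measure singled out above, the skew of a map can be computed directly from its grid values. The sketch below assumes the usual third-moment definition of skewness; the normalization used for the measure in Table 1 may differ, and the function name density_skew is hypothetical.

```python
import numpy as np

def density_skew(rho):
    """Skewness of the density values in a map: the mean cubed deviation
    from the mean divided by the 3/2 power of the variance (assumed form)."""
    rho = np.asarray(rho, dtype=float).ravel()
    d = rho - rho.mean()
    variance = np.mean(d ** 2)
    if variance == 0.0:
        raise ValueError("map is constant; skew is undefined")
    return float(np.mean(d ** 3) / variance ** 1.5)
```

A map dominated by a few strong positive peaks on a flat background gives a positive value, while a map whose density values are distributed symmetrically about their mean gives a value of zero.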

3.2. Estimation of map quality using features of the map and of structure factors used to calculate the map
Figure 1 showed that each of the 6 different features of electron density maps and 4 characteristics of structure factors we examined depend in some way on the quality of the corresponding map. We applied the Bayesian approach described in section 2.5 to estimate map quality from these 10 features. The general idea of this approach is very simple. Imagine that a particular map has been examined, yielding a skew of 0.20. Based on Fig. 1A, it is reasonable to conclude that this map is very likely to have a correlation (r 2 MODEL ) with the corresponding model map in the range of 0.4 < r 2 MODEL < 0.6, because nearly all examples in Fig. 1A with a skew of about 0.20 are in this range. Equation 7a is simply a mathematical way to make this statement. Eq. 8a is a similar statement, except that it includes more than one measure of map quality. As described in section 2.5, we assume here that the various measures of map quality (skew, contrast, etc.) are independent. This allows a very simple calculation (Eq. 8a) to be used to estimate r 2 MODEL from several measures of map quality.

Fig. 2A shows the results of using Eq. 7a to estimate r 2 MODEL from the skew of electron density. In Fig. 2A the abscissa is the Bayesian estimate of r 2 MODEL using the skew of electron density, and the ordinate is the true value of r 2 MODEL . To ensure that the parameters in the Bayesian estimator did not contain information on the specific cases being tested, a jackknife procedure was used in which all solutions for the structure being examined were excluded when constructing the Bayesian estimators. Fig. 2A shows that in cases where the true value of r 2 MODEL is in the range of 0.0 < r 2 MODEL < 0.2, the estimates of r 2 MODEL all have very similar values of about 0.1. This can be understood from Fig. 1A, in which the skew is seen to be insensitive to values of r 2 MODEL in this range. The Bayesian estimates of r 2 MODEL for low values of skew are all close to the midpoint of this range, as they are simply the average of plausible values of r 2 MODEL , given the observation of the value of the skew. For higher values of r 2 MODEL , the estimates of r 2 MODEL are closer to the true values. Overall, the correlation coefficient between the Bayesian estimates and true values of r 2 MODEL is 0.90 and the rms error in prediction of r 2 MODEL is 0.10. As a check on our procedures, we note that the mean uncertainty estimate for r 2 MODEL obtained from the Bayesian procedure was 0.11, quite similar to the actual rms error in prediction of r 2 MODEL of 0.10.

Table 3 summarizes the accuracy of the Bayesian estimates of map quality based on each of the measures described in Tables 1 and 2 (with the exception that the overlap of NCS density is not included because it does not apply to most of the maps in our tests). For each measure, Table 3 lists the values of the correlation coefficient of the Bayesian estimates and the true map quality (r 2 MODEL ) along with the rms prediction error in r 2 MODEL . Overall, the skew of electron density, with a correlation coefficient between Bayesian estimates and true values of r 2 MODEL of 0.90, is the most reliable indicator of map quality, with the correlation of local rms density next best (correlation of 0.85), and with contrast, flatness of solvent region, and density-modification phase correlations and R-factor giving only slightly poorer predictions of r 2 MODEL with correlations in the range of 0.75-0.80.
To identify an optimal combination of measures for estimation of map quality, we began with the best single measure (skew) and used Eq. 9 to combine information from each of the other measures. The measure giving the best prediction of r 2 MODEL in combination with skew was the correlation of local rms density (r 2 RMS , Table 3). Figure 2B shows how the estimates of map quality obtained using just the correlation of rms electron density compare with actual map quality, and Fig. 2C shows estimates based on both skew and correlation of rms electron density.
The correlation of rms density was the next-best single predictor after skew and in addition the correlation of prediction errors from these two variables was relatively low (0.61, Table 4). The assumptions in Eq. 9 are therefore relatively well-justified and it is not surprising that the resulting estimator is improved over the one using just the skew of electron density. This process was continued but no further improvement was obtained in the Bayesian estimator. The optimized combination of measures based on skew and correlation of local rms density yielded a correlation coefficient between the Bayesian estimates and true values of r 2 MODEL of 0.92 and an rms prediction error of 0.09 (Table 3 and Fig. 2C).
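The estimation and jackknife procedure described in this section can be sketched with a simple histogram-based estimator. This is a minimal sketch and not the AutoSol code: the class and function names, the number of bins and the pseudocount are all assumptions, and the product of per-measure conditional probabilities stands in for the independence assumption of Eqs. 8a and 9.

```python
import numpy as np

class BinnedBayesEstimator:
    """Estimate r2_MODEL from one or more quality measures using binned
    probabilities, treating the measures as independent for fixed r2_MODEL."""

    def __init__(self, r_true, measures, n_r=10, n_x=10, pseudo=0.1):
        r_true = np.asarray(r_true, dtype=float)
        x = np.asarray(measures, dtype=float)
        if x.ndim == 1:
            x = x[:, None]  # a single measure
        r_edges = np.linspace(0.0, 1.0, n_r + 1)
        self.r_centers = 0.5 * (r_edges[:-1] + r_edges[1:])
        r_idx = np.clip(np.digitize(r_true, r_edges) - 1, 0, n_r - 1)
        prior = np.bincount(r_idx, minlength=n_r) + pseudo
        self.prior = prior / prior.sum()  # p(r2_MODEL bin)
        self.x_edges, self.likelihoods = [], []
        for j in range(x.shape[1]):
            edges = np.linspace(x[:, j].min(), x[:, j].max(), n_x + 1)
            x_idx = np.clip(np.digitize(x[:, j], edges) - 1, 0, n_x - 1)
            table = np.full((n_r, n_x), pseudo)  # pseudocount avoids zeros
            np.add.at(table, (r_idx, x_idx), 1.0)
            self.x_edges.append(edges)
            # each row is p(measure bin | r2_MODEL bin)
            self.likelihoods.append(table / table.sum(axis=1, keepdims=True))

    def estimate(self, observed):
        """Return the expectation value and standard deviation of r2_MODEL."""
        post = self.prior.copy()
        for value, edges, table in zip(np.atleast_1d(observed),
                                       self.x_edges, self.likelihoods):
            k = int(np.clip(np.digitize(value, edges) - 1, 0, table.shape[1] - 1))
            post *= table[:, k]
        post /= post.sum()
        mean = float(np.sum(post * self.r_centers))
        sd = float(np.sqrt(np.sum(post * (self.r_centers - mean) ** 2)))
        return mean, sd

def jackknife_estimates(r_true, measures, structure_ids):
    """Estimate r2_MODEL for every map with all maps from the same
    structure excluded when the estimator is constructed."""
    r_true = np.asarray(r_true, dtype=float)
    x = np.asarray(measures, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    ids = np.asarray(structure_ids)
    out = np.empty(len(r_true))
    for sid in np.unique(ids):
        held_out = ids == sid
        est = BinnedBayesEstimator(r_true[~held_out], x[~held_out])
        for i in np.flatnonzero(held_out):
            out[i] = est.estimate(x[i])[0]
    return out
```

Passing a two-column array of (skew, r 2 RMS ) values combines the two measures, and the standard deviation returned by estimate plays the role of the uncertainty estimate that is compared with the rms prediction error above.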

3.3. Identification of the hand of heavy-atom substructures using measures of map quality
A particularly important application of measures of map quality is the identification of the hand of heavy-atom substructures. In space groups that are not enantiomorphic, the hand of the heavy-atom substructure can normally not be identified directly during substructure determination by direct methods such as the HYSS procedure (Grosse-Kunstleve & Adams, 2003) used here.
Consequently some procedure is needed for identifying which hand of the heavy-atom substructure is correct. Figures

It is somewhat remarkable that these 9 measures of map quality all give very good discrimination between the correct and incorrect hands of heavy-atom substructures (Fig. 3 and Table 5), even though they are not all so useful in estimating the absolute quality of maps (Table 3). The best discrimination between correct and incorrect hands is obtained with the skew of electron density (Fig. 3A), as expected from the high correlation of estimates of map quality based on skew with actual map quality (Table 3). Using the skew of electron density to make decisions on handedness (Fig. 3A), 98% of decisions (in cases where the quality of the maps for the two hands differs by at least 0.05) would lead to a map with higher quality than that of the opposite hand (Table 5). Note that for SIR or MIR data without anomalous differences, none of these techniques can identify the correct hand because the inverse hand of the heavy atoms leads to a map that has inverse chirality but is otherwise identical. A similar argument would partially apply in cases where the anomalous signal is weak. This situation is presumably the cause of the large number of MIR-derived points along the diagonal of the panels in Fig. 3.
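The percentage of correct hand decisions quoted above can be tallied as in the following sketch. The data layout (one pair of (measure, map correlation) tuples per substructure) and the function name are assumptions made for illustration; ties in the measure are counted here as incorrect decisions.

```python
def fraction_correct_hand(pairs, min_diff=0.05, higher_is_better=True):
    """Fraction of hand decisions that select the map with the higher true
    correlation, using only pairs whose true correlations differ by at
    least min_diff.

    pairs: iterable of ((measure_a, cc_a), (measure_b, cc_b)), one entry
    per heavy-atom substructure, giving the two hands."""
    n_used = 0
    n_correct = 0
    for (m_a, cc_a), (m_b, cc_b) in pairs:
        if abs(cc_a - cc_b) < min_diff:
            continue  # the two hands are too similar to count
        n_used += 1
        if m_a == m_b:
            continue  # the measure cannot decide; counted as incorrect
        a_preferred = (m_a > m_b) if higher_is_better else (m_a < m_b)
        if a_preferred == (cc_a > cc_b):
            n_correct += 1
    return n_correct / n_used if n_used else float("nan")
```

Setting higher_is_better=False covers measures such as an R-factor, for which the lower value indicates the better map.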

3.4. Identification of the highest-quality density modified map for a structure
The scoring procedures described above are based on an analysis of the phases and structure factor amplitudes corresponding to an experimental electron density map. Prior to final map interpretation, however, the experimentally-determined phases of structure factors are normally optimized by density modification (Wang, 1985). It seemed possible that the best experimental maps would not always lead to the best density-modified maps, and consequently that some additional method of scoring the density modified maps might be useful.
To investigate this possibility, we carried out automated structure determination using the datasets used in Fig. 1, this time with default parameters in the AutoSol Wizard, including Bayesian estimates of experimental map quality based on the skew of electron density (skew) and the correlation of local rms density (r 2 RMS ). For each structure, the final steps were to carry out density modification on the top-ranked solution or solutions and then to build a preliminary atomic model. In cases where there was one solution that was much better than all others (see Methods), then only that solution was used in density modification. However in most cases there were multiple solutions with similar Bayesian estimates of quality and up to 3 (MAD, SAD) or 6 (MIR) of these were used in density modification. Figure 4A shows the relationship between qualities of experimental maps and the qualities of the corresponding density-modified maps for 545 experimental maps for 240 datasets. For experimental maps of high quality (correlation with model map over 0.6), the quality of the density-modified map is generally (but not always) very high, typically ranging from 0.75 to 0.90.
On the other hand, for experimental maps of low or moderate quality (map correlation of less than about 0.5), there is remarkably little correspondence between the quality of the experimental map and that of the density-modified map.
Part of the variability in density modification illustrated in Fig. 4A could be due to the differences in solvent content, non-crystallographic symmetry, type of experiment and resolution between the different structures. To examine this we have plotted in Fig. 4B

The variation in effects of density modification illustrated in Fig. 4B suggests that it might be useful to carry out a final ranking of solutions based on a measure of quality of the corresponding density-modified maps. We used the map-model correlation between density-modified maps and the preliminary atomic models built with the AutoSol Wizard as such a measure of quality. Table 6 shows the utility of this map-model correlation in identifying the solution with the best density-modified map for each of the 134 structures used in Fig. 4A in which there was more than one solution tested by density modification and model-building, and in which the model-building process yielded a model with a model-map correlation of at least 0.20. The first row in Table 6 provides a background for this analysis by considering the use of our Bayesian estimates of experimental map quality to identify the best solutions. Using the Bayesian estimates (based on the experimental maps) the best experimental map for a particular structure could be identified 92% of the time. Furthermore the worst error in identification of the best map corresponded to a difference in map correlation of only 0.16. On the other hand, the solution with the highest Bayesian estimate of experimental map quality led to the best density-modified map only 57% of the time, and this density modified map had a true map correlation as much as 0.64 lower than the best density modified map for the corresponding structure.
Using the map-model correlation for the model built into the density-modified maps, the situation is reversed, with the best experimental map identified only 61% of the time and the best density-modified map identified 70% of the time. Most importantly, the density-modified map yielding the highest map-model correlation was never worse than the very best density-modified map obtained by more than a difference in correlation of 0.09, indicating that the model-map correlation is a useful criterion for final ranking of solutions.
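The two statistics used in this comparison, the percentage of structures for which the top-scoring solution is the best one and the worst error, can be computed as in the sketch below. The dictionary layout and the function name are assumptions made for illustration.

```python
def ranking_performance(structures):
    """Summarize how well a scoring measure picks the best map.

    structures: dict mapping a structure name to a list of
    (score, true_cc) tuples, one per solution tested.
    Returns (fraction of structures whose top-scoring solution has the
    highest true correlation, largest shortfall in true correlation)."""
    n_used = 0
    n_best = 0
    worst_error = 0.0
    for solutions in structures.values():
        if len(solutions) < 2:
            continue  # nothing to choose between
        best_cc = max(cc for _, cc in solutions)
        _, chosen_cc = max(solutions, key=lambda s: s[0])
        n_used += 1
        if chosen_cc == best_cc:
            n_best += 1
        worst_error = max(worst_error, best_cc - chosen_cc)
    fraction = n_best / n_used if n_used else float("nan")
    return fraction, worst_error
```

Running this once with Bayesian estimates as the score and once with map-model correlations, against the same true correlations, would give a comparison of the kind described above.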

3.5. Using the AutoSol Wizard to redetermine structures from the PHENIX structure library
To test the utility of the Bayesian estimates of map quality obtained using the skew and correlation of local rms density as described in section 3.2, we carried out structure determinations on all 48 MAD, SAD, and MIR structures in the PHENIX structure library and used these quality estimates to make decisions about which solutions to pursue. The structures in this library range from relatively straightforward cases of SAD and MAD structure determination to considerably more complex cases that involved combinations of SAD or MAD with MIR. In the automated tests carried out here, only one source of phase information was used for each structure (i.e., MAD, SAD, or MIR) except in the case of the fusion-complex structure (1SFC, Sutton et al., 1998) in which SAD and SIR data were combined. We compared the qualities of the maps obtained after density-modification from this automated procedure using two methods of making decisions. The first method was to use the Bayesian estimates based on the combination of the skew of electron density and the correlation of local rms density, as described above. The second method was to use a decision-making process using perfect scores in which the actual correlation coefficient of each map with that of the corresponding model map was used to decide which map was best. Figure 5A illustrates these comparisons for MAD structure determinations, Fig. 5B illustrates them for SAD structure determinations, and Fig. 5C for MIR structure determinations.
For MAD and SAD structure determinations the decision-making procedure using Bayesian estimates based on the combination of the skew of electron density and the correlation of local rms density led to density-modified electron density maps that were of comparable quality to those obtained using a perfect decision-making process (Fig. 5). In the case of fusion-complex, the Bayesian decision-making procedure led to a slightly better density-modified map than a procedure using the actual quality of experimental maps for decision-making. This occurred because the solution with the best experimental map led to a density-modified map that was not quite the best. For MIR structure determinations the decision-making process was not as good. In several MIR cases the final maps obtained using the Bayesian estimates were substantially poorer than those obtained using perfect map correlation. The AutoSol Wizard failed, using either method of decision-making, to find a solution in one difficult case (groEL; Braig et al., 1995) that was previously solved by MIR. In this case heavy-atom solutions could not be automatically found for any of the derivatives.

Conclusions
Each of the 10 measures of quality of experimental electron density maps evaluated here has some utility in estimating the true quality of these maps. These measures of map quality have a wide range of bases (Tables 1 and 2) ranging from the flatness of the solvent region typically found in macromolecular structures to the connectivity of regions of high electron density corresponding to the chains of polymers in these structures. Overall, however, the skew of electron density stands out as the best of these measures (Table 3 and Fig. 2). Used in a simple Bayesian estimator, the correlation between map quality estimated with the skew of electron density with true map quality is about 0.90, while the next-best estimator (correlation of local rms density) gives a correlation of only 0.85. Combining the two yields the most useful estimator we have developed, with a correlation between estimated and actual map quality of 0.92 and an rms prediction error in map quality of 0.09.
With the exception of mean figure of merit of phasing, which does not depend on the hand of the heavy-atom substructure, all the measures of map quality analyzed are remarkably good discriminators between maps calculated using the correct and inverse hands of the heavy-atom substructure (Fig. 3). Using the combination of skew of electron density and correlation of local rms density in a Bayesian estimator of map quality, the AutoSol Wizard is able to carry out automated structure solution. The AutoSol Wizard makes decisions about the heavy-atom substructures to pursue based on these map quality estimates. This process yields density-modified electron density maps of approximately the same overall quality as those obtainable with a perfect decision-making system (Fig. 5).
Our Bayesian estimates of map quality, while highly useful in evaluating experimental maps, are nevertheless not the best indicators of the quality of the corresponding density modified maps.
The map-model correlation obtained after preliminary model-building is a considerably better indicator of the quality of density modified maps (Fig. 4 and Table 6).
In this work we have ignored the resolution-dependence of the measures of map quality. This is made possible in part by the use of a high-resolution cutoff of 2.5 Å for all the calculations of map quality and is generally justified by the relatively small remaining resolution dependence of most of the measures of map quality (Fig. 1). Nevertheless it seems possible that some improvement in estimation of map quality might be obtained by including the resolution dependence (or the effective overall isotropic displacement factor) of the data in the analysis.
Additionally, we have assumed independence of the various measures of map quality in Eq. 8a.
We were not able to improve the estimates of map quality using a simple covariance-matrix approach to combining estimates of map quality, but other more sophisticated approaches, along with a much greater set of sample data, might also lead to improved estimates of map quality.

Fig. 1 were used in Eqs. 7a and 8a to estimate overall map quality. The calculations were carried out one dataset at a time. For each dataset, joint probability distributions of each measure of quality and true quality (e.g., p(skew, r 2 MODEL )) were calculated excluding data from all solutions for that structure. Then these jackknifed joint probability distributions were used in Eqs. 7a and 8a to estimate map quality using the measures of quality for each map associated with that dataset. In each case true map quality (r 2 MODEL ) is plotted as a function of the Bayesian estimates of map quality. A. Estimates of map quality using the skew of electron density in Eq. 7a. B. Estimates using the correlation of local rms density in Eq. 7a. C. Estimates using the skew and correlation of local rms density in Eq. 8a.

Figure 5
Comparison of quality of density-modified maps obtained using the skew of electron density and correlation of local rms density for scoring with those obtained using the true map quality (correlation to the corresponding model map) for scoring. See text for details. The light blue bars labelled "Perfect scoring" correspond to running the AutoSol Wizard and using the actual map quality to make decisions at each step. The dark maroon bars labelled "Bayesian scoring" correspond to using the Bayesian scores based on the skew of electron density and correlation of local rms density. A. Structures determined using MAD. Structures shown are: aep-transaminase (1M32, Chen et al., 2002), armadillo (3BCT, Huber et al., 1997), cobd (1KUS, Cheong et al., 2002), cp-synthase (1L1E, Huang et al., 2002), cyanase (1DW9, Walsh et al., 2000), epsin (1EDU, Hyman et al., 2000), gene-5 (1VQB, Skinner et al., 1994), gere (1FSE, Ducros

Fig. 3. Then the true values of r 2 MODEL were subtracted, yielding prediction errors for each map for each measure of map quality. The correlation coefficients (r 2) of prediction errors among the various measures of map quality are listed.

* The percentage of cases in which the higher (or lower, as appropriate) value of the quality measure is associated with the higher value of the actual map correlation coefficient with the corresponding model map. Only cases in which the actual map correlations differ by at least 0.05 are considered.

* The percentage of correct predictions of best maps is the percentage of cases in which the highest value of the quality measure is associated with the highest value of the actual map correlation coefficient with the corresponding model map. The analysis is based on 331 sets of structure factors and associated maps obtained from 134 datasets as in Fig. 1, selecting the top-ranked 2 to 6 solutions and carrying out density modification with RESOLVE (Terwilliger, 2002) to yield density-modified maps.
A model was built into each density-modified map using a rapid method for building helices and strands. If the value of the map-model correlation was less than 0.35 then the building procedure was repeated with a standard cycle of building using the methods in the PHENIX AutoBuild Wizard (Terwilliger et al., 2007) and the value of the map-model correlation from the full standard procedure was used. Only structures for which at least one model-map correlation was at least 0.20 are included in the analysis. The worst error in identification of best maps is the largest value of the difference between the correlation coefficient of the best map with the corresponding model map and that of the map with the highest value of the quality measure.