Figure 6
Resolution-dependent mean number of water molecules per amino acid. The mean number of water molecules per residue is computed as the number of water molecules divided by the number of amino acids of all chains present in the asymmetric unit. This number was computed for each of the 77 346 protein structures determined by X-ray crystallography that were downloaded from the PDB on 23 May 2014. The figure uses box plots to visualize the distribution of the number of water molecules per residue observed in 0.1 Å bins over the experimental resolution range 0.6–3.5 Å. In each box plot, the grey box spans the values between the first and third quartiles, the black bar represents the median and the whiskers extend to the most extreme data points lying no more than 1.5 times the interquartile range from the box; data points outside this region are highlighted as black dots. The graph reflects the plausible trend that more discrete waters can be built in structures with higher resolution than in those with lower resolution. In the low-resolution range the distributions are skewed towards zero water molecules per residue, as can be read from the location of the median in these distributions. This trend holds until resolutions as low as 2.5 Å, from which point on the distributions start to become symmetric. Note that the bins for very high (better than 0.8 Å) and very low (worse than 3.3 Å) resolution are each populated with fewer than 100 measurements. A corresponding plot for nucleic acid structure models, which displays similar but more limited information owing to the much smaller number of DNA/RNA structures determined by X-ray crystallography, is provided in Supplementary Fig. S2.
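The quantities described in this caption can be illustrated with a minimal sketch. It is not the code used to produce the figure: the function names, the `(resolution, n_waters, n_residues)` input tuples, the convention of labelling each 0.1 Å bin by its lower edge and the choice of the "inclusive" quartile method are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import quantiles


def waters_per_residue(n_waters, n_residues):
    """Water molecules divided by amino acids over all chains in the asymmetric unit."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_waters / n_residues


def resolution_bin(resolution, width=0.1):
    """Lower edge of the bin containing `resolution` (assumed binning convention)."""
    # The small epsilon guards against floating-point error, e.g. 2.3 / 0.1 = 22.999...
    return round(math.floor(resolution / width + 1e-9) * width, 1)


def box_stats(values):
    """Quartiles, whisker ends and outliers using the 1.5 x IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme data points inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences correspond to the black dots in the plot.
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


def summarize(entries):
    """Group (resolution, n_waters, n_residues) tuples by bin and summarize each bin."""
    bins = defaultdict(list)
    for resolution, n_waters, n_residues in entries:
        bins[resolution_bin(resolution)].append(waters_per_residue(n_waters, n_residues))
    # Quartiles need at least two values, so sparser bins are skipped here.
    return {b: box_stats(v) for b, v in sorted(bins.items()) if len(v) >= 2}
```

Different quartile conventions give slightly different box edges for small bins, so the values from this sketch need not match those of a particular plotting package exactly.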