Figure 1
Coverage attainable by combining X-ray crystallography and homology modeling for modeling families. (a) Relationship between the crystallization propensity score cutoff and the corresponding coverage of modeling families by X-ray structures (a modeling family is considered to have a structural model when any of its members has a crystallization score above the given cutoff), shown for the combined set of all considered complete proteomes, for eukaryotes, bacteria, archaea and viruses, and for proteins located in chloroplasts and mitochondria. The vertical lines mark the cutoff values that correspond to the 25th centile, the median and the 75th centile of the crystallization propensity scores of the clustered proteins from the PDB data set. (b) Scatter plot of the median propensity scores of complete proteomes, grouped by superkingdom, against the corresponding number of modeling families (y axis on a logarithmic scale). The points for each superkingdom were fitted with a straight line (thin line), and the corresponding Pearson correlation coefficient (PCC) is shown. Smaller proteomes (<100 modeling families) and viruses, which also mostly have small proteomes, were excluded to ensure statistically sound estimates of propensities.
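The coverage rule in panel (a) can be illustrated with a minimal sketch. It assumes each protein is given as a (family identifier, propensity score) pair; the function name `family_coverage`, the input layout and the toy values are hypothetical and are not taken from the original analysis.

```python
def family_coverage(scored_proteins, cutoff):
    """Fraction of modeling families with at least one member scoring above cutoff.

    scored_proteins: iterable of (family_id, score) pairs (hypothetical layout).
    """
    # Keep the best (highest) score observed for each family.
    best_score = {}
    for family_id, score in scored_proteins:
        if family_id not in best_score or score > best_score[family_id]:
            best_score[family_id] = score

    if not best_score:
        return 0.0

    # A family counts as covered when its best member exceeds the cutoff,
    # which is equivalent to "any member above the cutoff".
    covered = sum(1 for s in best_score.values() if s > cutoff)
    return covered / len(best_score)


# Toy data (invented for illustration): three families, four proteins.
toy = [("F1", 0.2), ("F1", 0.7), ("F2", 0.4), ("F3", 0.9)]
coverage_at_half = family_coverage(toy, 0.5)
```

Evaluating such a function over a range of cutoffs would give a curve of the kind plotted in panel (a); the scores themselves would come from whichever propensity predictor is used.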