Figure 4
Sequence-identification and assignment benchmarks for EM models. (a) Identity of the sequence identified for models built de novo using ARP/wARP as a function of the HMMsearch best-single-domain sequence-alignment score. (b) Identity of the sequence assigned to continuous fragments of deposited EM models as a function of the sequence-assignment score (p value) for protein-fragment lengths of 10, 50 and 100 residues selected at random from test-set models. The continuous curves on the plots are logistic regression estimates of the probability that an identified sequence will have at least 80% sequence identity to the reference model. The orange circles represent three reference chains with register errors that were not used for the logistic regression calculations.
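
The logistic regression curves described in the caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fit_logistic`, `predict`), the gradient-ascent fit, and the toy scores and labels are all hypothetical and chosen only to show how a score is mapped to the probability that a sequence reaches at least 80% identity. For a p value score, as in panel (b), the fitted slope would be expected to have the opposite sign.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(scores, labels, lr=0.1, n_iter=5000):
    """Fit P(label = 1 | score) = sigmoid(a + b * score).

    Uses plain gradient ascent on the mean log-likelihood.
    labels are 1 when sequence identity >= 80%, otherwise 0.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in zip(scores, labels):
            err = y - sigmoid(a + b * x)
            grad_a += err
            grad_b += err * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def predict(a, b, score):
    # Estimated probability of >= 80% sequence identity at this score.
    return sigmoid(a + b * score)


# Hypothetical toy data (NOT taken from the benchmark):
# an arbitrary alignment score and whether identity reached 80%.
toy_scores = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

a, b = fit_logistic(toy_scores, toy_labels)
for s in (1.0, 2.5, 4.0):
    print(f"score={s:.1f}  P(identity >= 80%) = {predict(a, b, s):.2f}")
```

Points known to be unreliable, such as the chains with register errors, would simply be left out of `toy_scores` and `toy_labels` before fitting.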