
Figure 5
Left: receiver operating characteristic (ROC) curves for varying RSZD and distance to difference map peak for identifying potentially oxidatively damaged cysteines. Curves closer to the top-left corner, and therefore with a greater area under the curve (AUC), indicate better performance. Each curve is plotted with a continuously varying RSZD cutoff at a fixed distance range (1.40–1.55 Å, purple; 1.00–1.50 Å, green; 1.50–2.00 Å, orange; 1.00–2.00 Å, gold; 1.00–2.50 Å, dark blue). The RSZD threshold that maximizes Youden's J statistic (Youden, 1950) is noted for each distance range. An RSZD of ≥3.0σ outperforms other cutoffs. Using chemically constrained values (1.40–1.55 Å) selects potentially damaged sites less effectively than a 1.00–2.50 Å range. Therefore, the classifier used an RSZD ≥ 3.0σ and a distance range of 1.00–2.50 Å. The ROC curve of the unbiased validation dataset using the determined thresholds is shown as a red dashed line. Right: the confusion matrix for the classifier using RSZD ≥ 3.0σ and 1.00 ≤ distance to difference map peak ≤ 2.50 Å, coloured by the number of cysteines within each class. The classification strategy correctly identifies 2036/2109 non-oxidatively damaged cysteines and 19/28 oxidatively damaged cysteines in the validation dataset. As the two classes differ in size by about 75-fold, the Matthews correlation coefficient is the preferred test for robustness (Supplementary Table S3).
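The thresholds and validation counts quoted in the caption can be turned into a short worked example. The following is a minimal sketch, not the authors' code: the function names `is_flagged` and `confusion_metrics` are hypothetical, the RSZD value is assumed to be supplied in the same form that is compared against the 3.0σ cutoff, and the metrics use the standard textbook definitions of sensitivity, specificity, Youden's J and the Matthews correlation coefficient (MCC). The values reported in Supplementary Table S3 are not reproduced here and may have been calculated differently.

```python
from math import sqrt

# Thresholds as stated in the caption.
RSZD_MIN = 3.0                    # sigma
DIST_MIN, DIST_MAX = 1.00, 2.50   # angstroms, distance to difference map peak


def is_flagged(rszd, distance):
    """Hypothetical helper: True if a cysteine meets both thresholds."""
    return rszd >= RSZD_MIN and DIST_MIN <= distance <= DIST_MAX


def confusion_metrics(tp, fn, tn, fp):
    """Standard metrics derived from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    youden_j = sensitivity + specificity - 1
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_j": youden_j,
        "mcc": mcc,
    }


# Validation counts from the caption:
# 19 of 28 damaged cysteines and 2036 of 2109 non-damaged cysteines
# were classified correctly.
metrics = confusion_metrics(tp=19, fn=28 - 19, tn=2036, fp=2109 - 2036)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because the MCC uses all four cells of the confusion matrix, it is less sensitive to the large class imbalance than accuracy, which would be dominated by the non-damaged class.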

Acta Cryst. D: Structural Biology, ISSN 2059-7983