
Figure 3
Summary of training, validation and testing of five XGBoost models built on different structural descriptors. Variances are reported in the last row. The 10-fold CV results give the regression mean-squared error (MSE) or classification accuracy averaged over the 10 folds, together with the standard deviation among folds. Note that we used 750 and 7500 CARTs in the 10-fold CV and final training processes, respectively. The shaded models are judged subjectively to perform poorly, based on the 10-fold CV results, their performance across all datasets and comparison with other models trained on the same structural descriptor. Overall, the numbers indicate that the XGBoost models learn the patterns in the training data and generalize to unseen test data, which suggests they could also be applied to noisy experimental data and to different molecular systems.
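A minimal sketch of the evaluation protocol described in the caption, assuming a generic feature matrix X of structural descriptors and a regression target y; the placeholder data, descriptor construction and any additional XGBoost hyperparameters are assumptions, not the authors' exact setup.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from xgboost import XGBRegressor

# Placeholder data standing in for the structural descriptors and target values.
X, y = np.random.rand(200, 50), np.random.rand(200)

# 750 CARTs (boosted trees) for the 10-fold cross-validation stage, as stated in the caption.
cv_model = XGBRegressor(n_estimators=750)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
mse_per_fold = -cross_val_score(cv_model, X, y, cv=cv,
                                scoring="neg_mean_squared_error")
print(f"10-fold CV MSE: {mse_per_fold.mean():.4f} +/- {mse_per_fold.std():.4f}")

# 7500 CARTs for the final training run on the full training set.
final_model = XGBRegressor(n_estimators=7500)
final_model.fit(X, y)
```

For a classification descriptor, XGBClassifier with scoring="accuracy" would be used in the same way, matching the accuracy values reported alongside the MSE results.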
