Figure 5
Performance of four trained XGBoost models on noisy synthesized data from the testing set. Twenty sampled SWAXS profiles with low, medium and high error levels are shown in the top row. The subsequent rows show boxed panels, each containing four histograms of predictions made by the indicated models: noise-free, noisy, sparsely sampled and densely sampled. The vertical lines mark the real values, extracted from detailed molecular analysis. The transparency of the histograms encodes the error level: the higher the error, the more transparent the lines. Overall, all the trained models perform well on noisy data with reasonable error levels (low and medium). As the error level increases, corresponding to an unphysically low signal-to-noise ratio, outlier values start to appear and the prediction distribution spreads. Even in this extreme case, however, some of the peak values still recapitulate the real ones.
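As a rough illustration of the kind of evaluation the figure summarizes, the sketch below shows how several trained XGBoost regressors could be applied to noisy test profiles and their prediction distributions compared against a reference value in per-model histograms. This is not the authors' code; all file names, variable names (`models`, `X_test_noisy`, `y_true`) and the single-parameter setup are hypothetical placeholders.

```python
# Minimal sketch (assumed workflow, not the published implementation):
# evaluate four trained XGBoost models on noisy SWAXS-like profiles and
# plot a histogram of predictions per model, with the reference value marked.
import numpy as np
import xgboost as xgb
import matplotlib.pyplot as plt

# Hypothetical: four previously trained boosters, keyed by training condition.
models = {
    "noise-free": xgb.Booster(model_file="model_noise_free.json"),
    "noisy": xgb.Booster(model_file="model_noisy.json"),
    "sparse": xgb.Booster(model_file="model_sparse.json"),
    "dense": xgb.Booster(model_file="model_dense.json"),
}

# Hypothetical noisy test profiles (rows = sampled profiles) and the
# reference value obtained from the detailed molecular analysis.
X_test_noisy = np.load("swaxs_profiles_noisy.npy")
y_true = 3.2  # placeholder reference value

dtest = xgb.DMatrix(X_test_noisy)

fig, axes = plt.subplots(1, len(models), figsize=(12, 3), sharey=True)
for ax, (name, booster) in zip(axes, models.items()):
    preds = booster.predict(dtest)       # one prediction per sampled profile
    ax.hist(preds, bins=30, alpha=0.6)   # spread of predictions for this model
    ax.axvline(y_true, color="k", lw=1)  # vertical line: reference value
    ax.set_title(name)
fig.tight_layout()
plt.show()
```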
