scientific commentaries

IUCrJ
ISSN: 2052-2525

AI-enhanced X-ray diffraction analysis: towards real-time mineral phase identification and quantification

Laboratory for Waste Management, Paul Scherrer Institute, Forschungsstrasse 111, Villigen PSI, 5232 Switzerland
*Correspondence e-mail: nikolaos.prasianakis@psi.ch

A large number of scientific disciplines benefit from advanced X-ray-based analytical techniques. X-ray diffraction (XRD) is a powerful method that provides qualitative and quantitative information about the crystallographic structure and composition of matter in a non-destructive way. Such information is crucial in several fields, for example in materials science for research on novel materials, or in environmental and geological sciences, where it improves the understanding of subsurface composition and chemistry that is essential for resource exploitation and pollutant dispersion studies.

XRD techniques range from powder XRD, where the samples are provided in a homogeneous powder form, to XRD computed tomography (XRD-CT), where spatial investigation of heterogeneous samples in 2D and 3D is possible. At the same time, XRD techniques are paired with several synchrotron X-ray-based techniques and provide crucial complementary information (Allen, 2023). The working principle of XRD is based on the diffraction of X-rays resulting from their interaction with the crystalline planes of the considered material. The diffracted X-rays are collected by a specialized detector, and the diffraction pattern is recorded as a plot of the diffracted X-ray intensity versus the scattering angle. These diffraction patterns are subsequently analysed and compared with reference patterns. Each crystal is characterized by a unique fingerprint of peaks, which allows the identification of multiple crystals within the same signal. Signal fitting to known patterns is a complex, computationally demanding process, especially when multiple crystals are present. While powder XRD typically provides a single diffraction pattern, in XRD-CT there is one XRD pattern for each pixel within the considered 2D/3D domain, resulting in multi-dimensional big data. The number of pixels within such a 3D tomogram can easily become very large (>10^5), introducing several challenges related to tomographic reconstruction, high-throughput data acquisition and their respective modelling (Hayashi et al., 2015, 2019; Finegan et al., 2020). More specifically, because of the large number of XRD-CT patterns that have to be processed, the model refinement computations are a few orders of magnitude slower than powder XRD computations. A potential solution to accelerate these computationally intensive processes is the application of artificial intelligence (AI) techniques.
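
The idea of fitting a measured signal to known reference patterns can be illustrated with a deliberately simplified sketch. It assumes that a mixture pattern is a linear combination of fixed reference patterns and recovers the scale factors by ordinary least squares. The peak positions, heights and widths below are hypothetical placeholders rather than real crystallographic data, and the linear model is an assumption made for illustration: an actual Rietveld refinement is nonlinear and also refines lattice parameters, peak shapes and instrumental terms, which is why it is far more expensive.

```python
import numpy as np

def gaussian_peaks(two_theta, positions, heights, width=0.15):
    """Sum of Gaussian peaks on a 2-theta grid (toy stand-in for a reference pattern)."""
    pattern = np.zeros_like(two_theta)
    for pos, height in zip(positions, heights):
        pattern += height * np.exp(-0.5 * ((two_theta - pos) / width) ** 2)
    return pattern

# Scattering-angle grid in degrees (hypothetical range and resolution).
two_theta = np.linspace(10.0, 70.0, 3000)

# Hypothetical reference patterns, one column per phase; NOT real mineral data.
references = np.column_stack([
    gaussian_peaks(two_theta, [23.1, 29.4, 39.4], [0.1, 1.0, 0.2]),
    gaussian_peaks(two_theta, [18.3, 20.3, 37.7], [1.0, 0.4, 0.3]),
    gaussian_peaks(two_theta, [24.1, 33.2, 35.6], [0.3, 1.0, 0.7]),
])

# Simulate a "measured" mixture pattern with a small amount of Gaussian noise.
true_weights = np.array([0.5, 0.3, 0.2])
rng = np.random.default_rng(0)
measured = references @ true_weights + rng.normal(0.0, 0.005, two_theta.size)

# Least-squares fit of the scale factors, then normalisation to fractions.
weights, *_ = np.linalg.lstsq(references, measured, rcond=None)
weights = np.clip(weights, 0.0, None)  # discard small negative values caused by noise
fractions = weights / weights.sum()

print(np.round(fractions, 3))
```

In this toy setting, `np.linalg.lstsq` also accepts a two-dimensional right-hand side, so many pixel patterns could be fitted in a single call. The full nonlinear refinement has no such shortcut and has to be repeated for every pixel of an XRD-CT dataset.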

In recent years, developments in AI and machine learning have been impressive, opening new avenues in numerical modelling and data interpretation. There are already several applications which take advantage of these advancements, ranging from the acceleration of reactive transport simulations using machine learning (Jatnieks et al., 2016) to image and pattern recognition in the sense of ultrafast processing and detection which surpasses human capability (Ragone et al., 2023; Boiger et al., 2024; Omori et al., 2023). More specifically, for XRD measurement interpretation, the recent seminal works of Dong et al. (2021, 2023) and Lee et al. (2021) have implemented convolutional neural network (CNN) models to extract important information such as phase identification and lattice parameters directly from XRD patterns. Dong et al. demonstrate that with a trained CNN model the results can be interpreted up to three orders of magnitude faster, while Lee et al. (2021) reported completing the task in a few seconds instead of the several hours required by traditional techniques (Rietveld method).

In the same direction, there are recent efforts towards phase quantification using deep neural network processing of XRD patterns (Simonnet et al., 2024; Poline et al., 2024). In the article by Simonnet et al. (2024) in this issue of IUCrJ, the authors introduce a CNN-based method for direct mineral phase quantification from XRD signals. The training of these models requires a very large dataset, and for that purpose XRD patterns of four pure minerals and their mixtures are generated synthetically, using crystallographic information files. Interestingly, the authors incorporate the effect of instrumental factors, such as the wavelength function and the attenuation factor, into the synthetic dataset in order to increase the realism of the model and to account accurately for the exact experimental geometry.
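
The principle of building such a labelled training set can be sketched as follows. This is a minimal illustration and not the authors' generator: the pure-phase patterns are hypothetical Gaussian peak lists instead of patterns computed from crystallographic information files, the mixing proportions are drawn from a uniform Dirichlet distribution (an assumption), and the instrumental effects are reduced to a single hypothetical peak-width parameter plus additive noise.

```python
import numpy as np

def toy_pattern(two_theta, positions, heights, width):
    """Toy pure-phase pattern: a sum of Gaussian peaks (hypothetical, not from a CIF)."""
    pattern = np.zeros_like(two_theta)
    for pos, height in zip(positions, heights):
        pattern += height * np.exp(-0.5 * ((two_theta - pos) / width) ** 2)
    return pattern

def make_dataset(pure_patterns, n_samples, noise_std, rng):
    """Return (patterns, proportions) for randomly mixed pure-phase patterns."""
    n_phases = pure_patterns.shape[0]
    # Each row of proportions is non-negative and sums to one.
    proportions = rng.dirichlet(np.ones(n_phases), size=n_samples)
    # Linear mixing of the pure patterns; shape (n_samples, n_points).
    patterns = proportions @ pure_patterns
    # Additive Gaussian noise as a crude stand-in for measurement noise.
    patterns = patterns + rng.normal(0.0, noise_std, patterns.shape)
    return patterns, proportions

two_theta = np.linspace(10.0, 70.0, 2000)
instrument_width = 0.2  # hypothetical broadening standing in for instrumental effects

# Four hypothetical phases; the peak lists are placeholders, NOT real mineral data.
pure_patterns = np.stack([
    toy_pattern(two_theta, [23.1, 29.4, 39.4], [0.1, 1.0, 0.2], instrument_width),
    toy_pattern(two_theta, [18.3, 20.3, 37.7], [1.0, 0.4, 0.3], instrument_width),
    toy_pattern(two_theta, [31.0, 41.1, 51.0], [1.0, 0.3, 0.2], instrument_width),
    toy_pattern(two_theta, [24.1, 33.2, 35.6], [0.3, 1.0, 0.7], instrument_width),
])

rng = np.random.default_rng(42)
X, y = make_dataset(pure_patterns, n_samples=1000, noise_std=0.01, rng=rng)
print(X.shape, y.shape)
```

Each row of `X` would serve as a network input and the matching row of `y` as the regression target. Because the labels are known by construction, a dataset of any size can be produced without additional measurements.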

This approach results in a significant improvement in the accuracy and efficiency of the model for the considered real mineral mixture example, as shown in Fig. 1 (Simonnet et al., 2024). Although the authors provide information, validation benchmarks and examples for a four-component system, the extension to systems composed of more than four minerals, which are common in practice, appears straightforward if the same methodology is followed. However, it should be noted that the number of training samples and the training effort needed to build a more general model are expected to increase with the number of components in order to maintain the same level of accuracy.

Figure 1
Scatter plot of the CNN model prediction versus the ground truth for mixtures composed of four mineral phases (calcite, gibbsite, dolomite and hematite). The training database takes into account the instrumental effects [DwIE in Simonnet et al. (2024)]. The centers of gravity are the weighted means of each subset of data points. Figure reproduced from Simonnet et al. (2024).

It is of paramount importance, and an excellent example of open research, that the code has been made freely available as open-access software on the GitHub software development platform (https://github.com/titouansimonnet/XRD_Proportion_Inference). The repository includes well-documented input files, the code for generating the training datasets (synthetic XRD patterns), the machine learning model architecture and parameters, and the test cases. This gives scientists in the community an excellent starting point to familiarize themselves with the tool, to use it, to optimize it and to extend it to the description of more complex systems.

AI-based tools enable ultrafast interpretation and processing of XRD patterns, significantly reducing the time required compared with traditional methods. The speed of interpretation and the resulting reduction in computational cost should not be underestimated. With these advancements, there is the potential to develop real-time experimental companions capable of interpreting and possibly reducing the dimensionality of the acquired data during the experiment. This opens new horizons such as conducting dynamic experiments with parallel XRD analysis, similar to Cai et al. (2020), and obtaining real-time high-resolution 3D information, which could be further used for the steering and adjustment of the experiment. Additionally, integrating real-time experimental data with highly sophisticated physical modelling algorithms can facilitate the development of digital twins of experiments, supporting the fitting of physical model parameters of interest and the exploration of parametric system responses within a single experiment.

References

Allen, A. J. (2023). J. Appl. Cryst. 56, 787–800.
Boiger, R., Churakov, S. V., Ballester Llagaria, I., Kosakowski, G., Wüst, R. & Prasianakis, N. I. (2024). Swiss J. Geosci. 117, 1–26.
Cai, L., Youngman, R. E., Baker, D. E., Rezikyan, A., Zhang, M., Wheaton, B., Dutta, I., Aitken, B. G. & Allen, A. J. (2020). J. Non-Cryst. Solids, 548, 120330.
Dong, H. (2023). Doctoral Thesis, University College London.
Dong, H., Butler, K. T., Matras, D., Price, S. W. T., Odarchenko, Y., Khatry, R., Thompson, A., Middelkoop, V., Jacques, S. D. M., Beale, A. M. & Vamvakeros, A. (2021). npj Comput. Mater. 7, 74.
Finegan, D. P., Vamvakeros, A., Tan, C., Heenan, T. M. M., Daemi, S. R., Seitzman, N., Di Michiel, M., Jacques, S., Beale, A. M., Brett, D. J. L., Shearing, P. R. & Smith, K. (2020). Nat Commun, 11, 631.
Hayashi, Y., Hirose, Y. & Seno, Y. (2015). J. Appl. Cryst. 48, 1094–1101.
Hayashi, Y., Setoyama, D., Hirose, Y., Yoshida, T. & Kimura, H. (2019). Science, 366, 1492–1496.
Jatnieks, J., De Lucia, M., Dransch, D. & Sips, M. (2016). Eur. Geosci. Union Gen. Assem. 97, 447–453.
Lee, J. W., Park, W. B., Kim, M., Pal Singh, S., Pyo, M. & Sohn, K. S. (2021). Inorg. Chem. Front. 8, 2492–2504.
Omori, N. E., Bobitan, A. D., Vamvakeros, A., Beale, A. M. & Jacques, S. D. M. (2023). Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 381, https://doi.org/10.1098/RSTA.2022.0350.
Poline, V., Purushottam Raj Purohit, R. R. P., Bordet, P., Blanc, N. & Martinetto, P. (2024). J. Appl. Cryst. 57, 831–841.
Ragone, M., Shahabazian-Yassar, R., Mashayek, F. & Yurkiv, V. (2023). Prog. Mater. Sci. 138, 101165.
Simonnet, T., Grangeon, S., Claret, F., Maubec, N., Fall, M. D., Harba, R. & Galerne, B. (2024). IUCrJ, 11, 859–870.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.