research papers
A simple solution to the
recipe problemaArgonne National Laboratory, 9700 S. Cass Avenue, 401/B4192, Lemont, IL 60439, USA
*Correspondence e-mail: brian.toby@anl.gov
Rietveld refinements are widely used for many purposes in the physical sciences. Conducting a GSAS-II program is discussed. This new method is discussed as an important step towards the development of automated technology.
typically requires expert input because correct results may require that parameters be added to the fit in the proper order. This order will depend on the nature of the data and the initial parameter values. A mechanism for computing the next parameter to add to the is shown. The fitting function is evaluated with the current parameter value set and each parameter incremented and decremented by a small offset. This provides the partial derivatives with respect to each parameter, along with information to discriminate meaningful values from numerical computational errors. The implementation of this mechanism in the open-sourceKeywords: Rietveld analysis; powder diffraction; parameter selection; GSAS-II.
1. Introduction
Rietveld analysis is the process where crystallographic models are directly fitted to powder diffraction data (Rietveld, 1969). Rietveld analysis has been a cornerstone of materials characterization for decades and is seeing increasing use for many research and process applications, including characterization of materials properties such as texture and crystallite sizes; and quantification of the component amounts in multiphase samples in fields including chemistry, physics, geosciences, pharmaceuticals and engineering. On the basis of citations of the software, a minimum of several thousand Rietveld refinements are reported in the literature every year.
One of the more challenging aspects of Rietveld analysis is determining the order in which add parameters to the ). A crystallographer experienced with can look at the plot of the observed powder pattern, the computed pattern from the current model and their differences and from that graphic can tell at a glance which parameter(s) should be included next. However, transferring this knowledge to a novice is quite a challenge (Young, 1993). If an optimal `recipe' is not followed, parameters may refine to unrealistic values. At this point it becomes unlikely that including additional parameters into the fit will allow the model to recover and find the true minimum. With older programs, refinements might even `blow up,' where fitted values might exceed the computer implementation ranges for numbers and the software would fail. More modern minimization strategies, such as use of conjugate-gradient optimizers (Coelho, 2005), Levenberg–Marquardt and singular value decomposition Hessian modification decrease optimizer sensitivity to correlated parameters, but in the end, an accurate Hessian is still needed to determine the standard uncertainties for the fitted parameters. Other approaches, such as genetic algorithms, global optimization and Monte Carlo minimization have also been applied to powder diffraction, but more commonly for structure solution (David et al., 2006; Padgett et al., 2007; Pagola & Stephens, 2010; Mattei et al., 2020; Habermehl et al., 2022).
In addition to fitting crystallographic parameters, the must also fit the background, lattice and peak shape parameters, and sometimes intensity correction terms, such as for texture, absorption or extinction (Toby, 2019The difficulty inherent in parameter order selection was summarized nicely by Ozaki et al. (2020) who stated
`It is commonly known that refining all parameters at once often leads to physically unreasonable results… it is not guaranteed… [to] lead researchers to the optimal crystal structure… Considering the wide use of Rietveld refinement… that only proficient experts can exploit
properly, should be improved.'
Presented here is a direct and compact computational approach that can identify the next parameter(s) to be added to the GSAS-II. This method is conceptually simple, should be easy to implement in other codes and uses relatively minimal computer time. It is envisioned as a step towards the development of fully automated tools.
This method has been implemented as an option within2. The worst-fit parameter concept
The key for determining the order in which add parameters to a (a) demonstrates that, while the structure does provide a general match to the observed intensities, the model is incorrect in that it does not closely match the observed intensities. This visualization provides a simple way to access the quality of a fit in a single graphic, albeit one that should be viewed at multiple magnification scales. One of the great strengths of powder diffraction crystallography is that this plot provides a clear view of the fit quality, particularly since metrics alone cannot be used as a guide to quality (Toby, 2006).
is a plot with the observed powder pattern, the computed pattern from the current model and their differences; this is sometimes called a Rietveld plot. The Rietveld plot shown in Fig. 1However, note that all minimization processes utilize some weighting of data. It has been shown that the optimal fit is obtained when observations are weighted by their experimental uncertainties (Prince, 2004). When these uncertainties are unknown or other weighting is used, the precision of the result is degraded, but the accuracy is not, unless the weighting were to accentuate some form of systematic error. The traditional Rietveld plot can be made more valuable if the difference values are plotted as the weighted differences, i.e. displaying the differences between the observed and computed points divided by the for each point, as taken from the weight [Fig. 1(b)], rather than plotting the differences directly. Showing the fit relative to weighting has three advantages. First, it provides information on how the data are being weighted. Further, as intensities in the pattern increase, typically so do their uncertainties. The unweighted differences tend to accentuate deviations that occur in intense parts of the pattern, even though these differences may be statistically insignificant. Also, while the direct differences are on the scale of the diffraction intensities, which is an inherently arbitrary axis, when optimally weighted, differences have a statistical of unity, and thus the weighted differences are on a statistically well-defined absolute scale.
When an experienced crystallographer views a Rietveld plot, they look to see what is causing the greatest disagreement (McCusker et al., 1999). If the observed peaks are shifted relative to the calculated peaks, the lattice parameters (or related instrumental corrections) are at fault. However, if all the intensities in either pattern are significantly larger than those in the other, for example, then the scale factor is not likely to be optimal. Alternatively, if the intensity agreement shows systematic deviations that vary as a function of Q, then the atomic displacement parameters (ADPs; typically Uiso values) are problematic. A few examples of this are shown in Fig. 2. If the deviations are for some reflections but not all, as is seen in Fig. 1, this is a likely indication of a problem that the atom positions of the model do not match the experiment; of atomic displacement parameters may address this. In the case of the example in Fig. 1, there are as-yet unidentified inadequacies in the crystallographic model, so parameter optimization will not address this.
What the crystallographer attempts to determine visually from a Rietveld plot is the nature of the discrepancies between the observed data and the intensity values computed from the model. From that, one estimates which parameters are causing the greatest deviations between the data and the model. These parameters will be deemed the `worst fit.' There may be a large number of parameters that are far from their optimum values, but the worst-fit parameters will have the largest impact on the overall agreement. Owing to parameter correlation, it may be impossible to optimize any other parameters before the differences due to these worst-fit parameters are addressed. Certainly, when the lattice parameters are not optimal, it makes no sense to attempt to optimize peak shape or structural parameters, and even background parameters may not refine well. Once the worst-fit parameter(s) have been fitted, some of the remaining unfit parameters will then become the worst fit and should be addressed next. While discerning the worst-fit parameters from visual features in a Rietveld plot is very likely a skill that a neural network could be taught, a relatively straightforward computational method will now be presented to determine the worst-fit parameters directly.
3. Computing the worst-fit parameter
In minimization problems, a factor described as χ2 is minimized. For Rietveld fitting, , where yj is the diffraction intensity for point j, ycalc(j) is the calculated intensity for point j and wj is the weight for point j, where optimally and σj is the for yj. Note that if yj is an intensity in scaled counts, yj = Ij/n, then , where n is the scaling factor (unity for unscaled counts). For detection methods that do not count quanta, then optimal weighting requires that σj be estimated for the detector. With 2D detection, σj can be estimated from the intensity spread in nominally equivalent pixels. Note that χ2 here should not be confused with the quality metric, the reduced χ2 = , where n and v are the number of observed data points and the number of refined parameters, respectively. In single-crystal refinements, the term GOF or goodness of fit is used, where the square of the GOF is equivalent to the reduced χ2. The reduced χ2 metric will be unity with an ideal model because the statistical for is the definition of σj.
How a function responds as a parameter is changed is, by definition, the partial derivative of that function with respect to the parameter. The sign of that partial derivative indicates if increasing or decreasing the parameter improves the fit. If we evaluate at the values for all parameters in our current model, pj, the one where the magnitude of the derivative is largest should be the one that will have the largest impact on minimizing χ2, but, as will be discussed, other considerations will be needed due to the discrete numerical computation to be done here. Note that the sign of the offset to be applied to a parameter is not relevant for our purpose, which is only to determine which parameters will have the largest effect on χ2 if the parameter is varied. The Rietveld minimizer (traditionally a variant on least squares) will determine both the magnitude and the sign of the shift to be applied to each parameter as additional parameters are included in the refinement.
To consider how this works in practice, note that the scale factor for a dataset will multiply every point in the computed diffraction pattern. One can expect the partial derivative of χ2 with respect to the scale factor to be quite large, except when the scale factor has been exactly minimized. Likewise, background values are subtracted from every point in the pattern and will also very significantly affect χ2. When away from the best-fit value, either the scale factor or the background values will almost certainly be the `worst-fit parameters' since the former has a large impact on the agreement for every peak in the pattern and the latter will affect every point. This is why, if one writes a naive prescribed parameter order `recipe' for Rietveld, the scale factor or background are almost always the first parameters to be minimized. Once those have been fitted, one can advance to other parameters. Note that this assumes that the unit-cell parameters are fairly close to correct values, so that there is appreciable overlap between the observed and computed peaks. If there is no significant overlap between the observed and computed peaks, neither the initial scale factor nor the cell constants are likely to refine to better values. On the other hand, if the cell parameters provide some peak overlap but are far from optimal, optimization of the is needed before the scale factor can be properly minimized.
With the scale factor and background fitted, the next parameter to be fitted will depend on how the observed and calculated patterns differ. This may be where the unit-cell parameters need to be added. If the cell parameters do agree well with the observed data, but the
do not, it may be necessary to treat the microstrain or crystallite size before additional progress can be made.Returning to the calculus, if a function is fully minimized, the first derivative for all parameters is zero and the second derivative is positive, ; > 0. Note that these statements are only true if the model is continuous and the derivatives are evaluated analytically, meaning that computation accuracy is essentially infinite. However, for crystallographic fitting, we evaluate χ2 with numerical computations and with discretely observed data points. This means that we have finite precision in these computations. As an example for how this affects computations, consider the partial derivative for the scale factor. If a least-squares minimization cycle has been applied, then it will be at the `correct' minimum, at least for all other parameter values held at their current values. This parameter is linear, so least squares is not an approximation; it usually converges quickly. However, these computations are still not exact. When we describe a parameter as having converged, we mean that the parameter is still showing shifts in optimization cycles, but the shifts are less than the of that parameter. In fact, any two values for a parameter that are separated by less than two times the of that parameter are considered indistinguishable. Taking the result from the minimization and computing for the scale factor is likely to give a large value even when this parameter has been properly minimized because, as noted previously, the χ2 function is extremely sensitive to the scale factor. Even when the parameter has converged to the point where shifts are at insignificant levels, very small differences due to roundoff error in the numerical computation can still allow for a very large derivative. So, alas, evaluating these derivatives for the current parameter set will not allow identification of the worst-fit parameters.
Informed by the second derivative, which examines how changes as pj changes, we can discern the worst-fit parameters from those that have a large first derivative due to computational inaccuracies, despite being well minimized. If for parameter pj we evaluate the partial derivative at locations pj − δ and pj + δ, where δ is a small perturbation to pj, on the order of or less than the on that parameter, we can test if the minimum is between pj − δ and pj + δ. If the minimum value for the parameter is well removed from the current value, pj, then we would expect to be about the same when evaluated at the three values pj − δ, pj and pj + δ. On the other hand, if the current value pj is already close to the minimum, then we would expect to have opposite sign when evaluated at pj − δ and pj + δ, indicating that the second derivative is zero somewhere in that range. If this is true, we would not expect to see a significant improvement in the fit by optimizing that parameter. We would also expect to see these opposite signs if near a maximum for χ2 with respect to pj, but since we must start a fit with parameter values that are close to the correct model, we would not expect to encounter a local maximum in χ2. Fig. 3 illustrates this derivative computation process graphically. Thus, by computing the partial derivative at two locations, and requiring that be large near the current value of pj and that have same sign when evaluated at pj − δ and pj + δ, we can discern the parameter(s) that are the worst fit.
This computation has been implemented in the GSAS-II software suite, with details provided in Appendix A, as it is hoped that the discussion in that section and the accessibility of the source code will facilitate incorporation of this capability into other Rietveld codes. When invoked, the derivatives are computed for all appropriate parameters in the model, and those parameters where both derivatives have the same sign are reported to the user in a table, ordered so that the largest-magnitude derivative (the worst-fit parameter) is reported first. The time needed for this process will depend greatly on the number of computed reflections, the diffraction points in the pattern(s) and the number of parameters in the model, but a computation time on the order of seconds to minutes is likely.
4. Discussion and conclusions
This work has presented a tool that offers novice crystallographers a mechanism to determine the order in which parameters should be added into a Rietveld analysis. This capability is easily added to a Rietveld code and herein we outline how this can be done. Access to the Rietveld engine source code may not even be needed. A script could be written to modify the input parameters supplied to a compiled Rietveld code. The required partial derivatives could then be accumulated from the resulting χ2 values.
Nonetheless, despite this advance, powder diffraction crystallographers will still need to understand the meaning of the parameters and how they interact with respect to changing the agreement between the data and the computed pattern. The method presented here will identify the parameters in the model that will offer the largest change in χ2, but not all of these parameters are appropriate to vary. For example, with the model presented in Fig. 2(c), where the peak shapes are not well fitted, a much larger derivative is seen for an instrumental broadening term than for sample broadening, but the latter is a more appropriate term to include in the model.
What has been presented here, or, for that matter, the work of Ozaki et al. (2020), addresses the serious problem of the order to introduce parameters into a model seen in many contemporary crystallographic codes. This problem may be less acute for Rietveld implementations with more robust minimization strategies (Coelho, 2018), but it has not been established whether the order that the parameters are introduced remains important. Addressing this problem still leaves several other significant tasks that at present still require the attention of an experienced crystallographer: this is determining how a model should be parameterized, such as what intensity correction terms are appropriate for the measurement. A lack of useful observables (Sivia, 2000) may prompt introduction of constraints, such as grouping the ADP values for similar atoms or use of geometrical constraints, such as a rigid body, or similarly use of compositional or geometric restraints. These constraints and restraints change how the overall fit responds to changes in the parameters. Likewise, the crystallographer must also decide when to expand the model description, for example, to treat anisotropic peak broadening, or when the data are insufficient to support full parameterization; for example, limited data range may prevent simultaneously fitting crystallite size broadening and microstrain. In those cases, on the basis of information on the sample origins, a choice must be made as to which parameter provides a more sensible model. The quality of a fit must also be determined from the validity of the results, so an expert system that truly automates must not only determine parameterization but also judge the physical and chemical plausibility of the refined parameters. Thus, there is considerable work that remains before Rietveld analysis can be made automatic, but it is now possible to envision automating parameter introduction, even if the analyst must specify considerable information based on the measurement type(s) and the family of materials. Likewise, it becomes possible to imagine software that performs automatic testing of and structures or even helps wade through the wealth of models for background, peak shape and intensity correction factors that add so much complexity to Rietveld analysis. Nonetheless, even if Rietveld analysis were to be automated, an even more difficult problem is raised by the example in Fig. 1, where the model clearly shows significant agreement with the data but, even with all parameters optimized, still inadequately fits the data. Parameter optimization will not address this. Additional models must be developed and explored. This at present very much depends on the imagination and experience of the crystallographer.
APPENDIX A
Implementation in GSAS-II
In this section, specific GSAS-II routines are discussed. Further documentation on those routines, as well as a listing of their source code, can be found in the code developer documentation, available as web pages or for download as portable (PDF or Epub) documents (Toby & Von Dreele, 2023). Note that, while GSAS-II computes analytic derivatives for most varied parameters as part of the process, the code for that is embedded into assembly of the Hessian matrix, which makes it difficult to access in other contexts. Since the derivative computation needed here is performed at two points for each parameter and since the analytic derivative computation takes a similar amount of time to compute as does evaluation of χ2, computation of numerical derivatives does not take appreciably more time than analytic derivatives would. Analytic derivatives would likely be somewhat more precise than numerical derivatives, but since these derivatives are only used to identify parameters that have significant leverage over χ2, this precision is not needed.
The computation of the `worst-fit' parameters is performed in the GSAS-II graphical user interface from an entry in the `Calculate' menu labelled `Parameter Impact'. This invokes the method OnDerivCalc() in class GSASIIdataGUI.GSASII, which in turn invokes the routine, GSASIIstrMain.Refine(), but with a special argument that indicates that partial derivatives should be computed without completing a This Refine() routine initially constructs a Python dict data structure with all the parameter values, as well as assembling the powder data and other information needed for computation of χ2 in the routine GSASIIstrMath.errRefine(). Note that the parameter dict is keyed by a variable name that uniquely identifies the role the parameter has in the model computation and the value corresponding to the dict entry is the numerical value of the parameter. There are some entries in this data structure that cannot be varied sensibly. For example, any parameter in that dictionary that does not have a floating point value is ignored. In the case of lattice parameters, the unit-cell symmetry is used to determine which of the reciprocal cell tensor entries should be included.
The process for computing partial derivatives for all appropriate parameters is done in routine GSASIIstrMain.AllPrmDerivs(), where the value of χ2 is evaluated with the current parameter set and saved, to be used later in the evaluation of the partial derivative for each parameter. We will label this here as χ2(0). Next, there is a loop over appropriate parameters (noting that some entries in the dict are not appropriate to be changed). The parameter being considered will be labelled pj. An offset to be applied to that parameter, δj, is determined by the role that parameter has in the model. Note that, if the value for δj is too small, the derivative computation will be dominated by roundoff errors and will be inaccurate. If δj is too large, it will not sample the derivative in the vicinity of the current value for that parameter. For most parameters used in GSAS-II, δj is set to 10−6 or the value of the parameter ×0.0001, whichever is larger. For fractional coordinates, δj is set to 10−6, whereas for ADPs 10−5 is used. A few parameters related to sample positioning usually have values greater than one and for these δj is set to 0.1. There may be a need to further improve these assumptions as more experience is gained.
For each appropriate parameter, χ2 is evaluated using GSASIIstrMath.errRefine() with the selected parameter set to values of pj − δ and pj + δ. The χ2 values are labelled here as χ2(pj − δj) and χ2(pj + δj), respectively. Note that these computations are completely independent and could thus be distributed to different CPUs if speed via parallelization is desired. From the three χ2 values, three approximate partial derivative values are computed: ; and the signs for the partial derivatives above and below the current parameter values are determined from and . These values are assembled into a dict that is returned by AllPrmDerivs() and after sorting are displayed in a table.
This capability has also been introduced into the GSAS-II scripting mechanism (O'Donnell et al., 2018). In this case, the method ComputeWorstFit() is provided for the GSAS-II project class (G2Project). This method calls GSASIIstrMain.Refine() and returns a list of parameters, sorted by their derivative values, as well as a tuple with the values , and .
Funding information
Use of the Advanced Photon Source, an Office of Science User Facility operated for the US Department of Energy (DOE) Office of Science by Argonne National Laboratory was supported by the US DOE (contract No. DE-AC02-06CH11357).
References
Coelho, A. A. (2005). J. Appl. Cryst. 38, 455–461. Web of Science CrossRef CAS IUCr Journals Google Scholar
Coelho, A. A. (2018). J. Appl. Cryst. 51, 210–218. Web of Science CrossRef CAS IUCr Journals Google Scholar
Cui, X., Feng, Z., Jin, Y., Cao, Y., Deng, D., Chu, H., Cao, S., Dong, C. & Zhang, J. (2015). J. Appl. Cryst. 48, 1581–1586. Web of Science CrossRef CAS IUCr Journals Google Scholar
David, W. I. F., Shankland, K., van de Streek, J., Pidcock, E., Motherwell, W. D. S. & Cole, J. C. (2006). J. Appl. Cryst. 39, 910–915. Web of Science CrossRef CAS IUCr Journals Google Scholar
Habermehl, S., Schlesinger, C. & Schmidt, M. U. (2022). Acta Cryst. B78, 195–213. Web of Science CSD CrossRef IUCr Journals Google Scholar
Mattei, G. S., Dagdelen, J. M., Bianchini, M., Ganose, A. M., Jain, A., Suard, E., Fauth, F., Masquelier, C., Croguennec, L., Ceder, G., Persson, K. A. & Khalifah, P. G. (2020). Chem. Mater. 32, 8981–8992. Web of Science CrossRef ICSD CAS Google Scholar
McCusker, L. B., Von Dreele, R. B., Cox, D. E., Louër, D. & Scardi, P. (1999). J. Appl. Cryst. 32, 36–50. Web of Science CrossRef CAS IUCr Journals Google Scholar
O'Donnell, J. H., Von Dreele, R. B., Chan, M. K. Y. & Toby, B. H. (2018). J. Appl. Cryst. 51, 1244–1250. Web of Science CrossRef CAS IUCr Journals Google Scholar
Ozaki, Y., Suzuki, Y., Hawai, T., Saito, K., Onishi, M. & Ono, K. (2020). npj Comput. Mater. 6, 75. CrossRef Google Scholar
Padgett, C. W., Arman, H. D. & Pennington, W. T. (2007). Cryst. Growth Des. 7, 367–372. Web of Science CSD CrossRef CAS Google Scholar
Pagola, S. & Stephens, P. W. (2010). J. Appl. Cryst. 43, 370–376. Web of Science CSD CrossRef CAS IUCr Journals Google Scholar
Peplow, M. (2023). Nature, 618, 21–24. CrossRef CAS PubMed Google Scholar
Prince, E. (2004). Mathematical Techniques in Crystallography and Materials Science, 3rd ed. New York: Springer-Verlag. Google Scholar
Rietveld, H. M. (1969). J. Appl. Cryst. 2, 65–71. CrossRef CAS IUCr Journals Web of Science Google Scholar
Rodríguez-Carvajal, J. (1993). Physica B, 192, 55–69. CrossRef Web of Science Google Scholar
Sivia, D. S. (2000). J. Appl. Cryst. 33, 1295–1301. Web of Science CrossRef CAS IUCr Journals Google Scholar
Szymanski, N. J., Rendy, B., Fei, Y., Kumar, R. E., He, T., Milsted, D., McDermott, M. J., Gallant, M., Cubuk, E. D., Merchant, A., Kim, H., Jain, A., Bartel, C. J., Persson, K., Zeng, Y. & Ceder, G. (2023). Nature, 624, 86–91. CrossRef CAS PubMed Google Scholar
Toby, B. H. (2006). Powder Diffr. 21, 67–70. Web of Science CrossRef CAS Google Scholar
Toby, B. H. (2019). International Tables for Crystallography, Vol. H, Powder Diffraction, 1st online ed., edited by C. J. Gilmore, J. A. Kaduk & H. Schenk, pp. 465–472. Chester: IUCr. Google Scholar
Toby, B. H. & Von Dreele, R. B. (2013). J. Appl. Cryst. 46, 544–549. Web of Science CrossRef CAS IUCr Journals Google Scholar
Toby, B. H. & Von Dreele, R. B. (2023). GSAS-II Developer's Documentation, https://gsas-ii.readthedocs.io/en/latest/. Google Scholar
Young, R. A. (1993). The Rietveld Method, edited by R. A. Young, pp. 1–38. Oxford University Press. Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.