research papers
Bayesian inference of metal oxide ultrathin film structure based on crystal truncation rod measurements
aGraduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan, bGraduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro, Tokyo 153-8902, Japan, and cGraduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
*Correspondence e-mail: anada@crystal.mp.es.osaka-u.ac.jp
Monte Carlo (MC)-based 3 film fabricated on top of SrTiO3. The software successfully found the global optima from an initial model prepared by a small grid search calculation. The standard deviations of the atomic positions derived from a dataset taken at a second-generation synchrotron are ±0.02 Å for metal sites and ±0.03 Å for oxygen sites.
software to analyze the atomic arrangements of perovskite oxide ultrathin films from the crystal truncation rod intensity is developed on the basis of Bayesian inference. The advantages of the MC approach are (i) it is applicable to multi-domain structures, (ii) it provides the posterior probability of structures through Bayes' theorem, which allows one to evaluate the uncertainty of estimated structural parameters, and (iii) one can involve any information provided by other experiments and theories. The simulated annealing procedure efficiently searches for the optimum model owing to its stochastic updates, regardless of the initial values, without being trapped by local optima. The performance of the software is examined with a five-unit-cell-thick LaAlOKeywords: Bayesian inference; crystal truncation rods; perovskite films; Monte Carlo.
1. Introduction
Perovskite oxides have been studied for a long time and still attract much attention because of their variety of conductive, electronic and magnetic properties, as well as their practical applications (Tokura & Nagaosa, 2000; Goodenough, 2001; Dagotto, 2005). In the past 15 years, extensive work has been performed on epitaxial interfaces of perovskite oxides (Ohtomo et al., 2002; Ohtomo & Hwang, 2004; Nakagawa et al., 2006; Hwang et al., 2012; Salluzzo et al., 2013; Chakhalian et al., 2014; Middey et al., 2016). Epitaxial interfaces allow us to study atomically ordered interfaces between two different electronic systems, such as ferromagnet/superconductor interfaces or metal/insulator interfaces. Such structures are useful to test our understanding of condensed matter physics (Chaloupka & Khaliullin, 2008) and can be used to create new electronic phases (Reyren et al., 2007). For this reason, enormous effort has been made to control the oxide interface structures.
Most perovskite interfaces are fabricated by using pulsed laser deposition or molecular beam et al., 2010; Cantoni et al., 2012). This requires slicing the sample, which sometimes causes damage to the newly exposed surface. In addition, the sample environment is hardly controlled. The resolution of the atomic displacement is typically 0.5 Å for heavy elements, which is atomic resolution. However, when we want to study the electric polarization, which is an essential value for the physical properties of oxides, the resolution is still insufficient. For this purpose, a 0.1 Å resolution is required. Such resolution can be achieved by using surface X-ray diffraction to measure the crystal truncation rods (CTRs) arising from a sudden change in electron density at a specific plane (Robinson, 1986; Feidenhans'l, 1989). This method does not require slicing of the samples, allows tuning of the sample environment and has a resolution better than 0.1 Å. The main drawback of using this technique is the difficulty of the analysis. There are many software programs and algorithms to make the analysis easier. ROD (Vlieg, 2000) and its successor (Vonk, 2011) are classical examples, whose main purpose is the of the surface atomic arrangement. Holographic phase retrieval methods (Takahashi et al., 2001; Yacoby et al., 2002) are often used to generate the initial models for the refinements (Fong et al., 2005; Willmott et al., 2007; Yamamoto et al., 2011; Fister et al., 2014). Such techniques to find a good initial model are indispensable for the interfacial structure analysis of perovskite oxides (Willmott et al., 2007). Electron density analysis techniques provide flexible models, although the derivation of the standard deviations of structure parameters is not straightforward. Holographic analysis relies on the assumption that the sample surface is homogeneous, which is often untrue. Iterative reconstruction (Fung et al., 2007; Björck et al., 2008) shows that the true structure cannot be obtained without enforcing very strict constraints on the electron density in real space, such as positivity, atomicity, and similarity between the substrate and film structures.
techniques while monitoring the quality of the layer-by-later growth by The interfacial structure is often examined by using which is sometimes combined with electron energy loss spectroscopy (PernaHere, we develope a Monte Carlo (MC)-based ; D'Alessandro & Cilloco, 2010; D'Alessandro, 2011). Each MC step makes a specific structural model, which can be regarded as a `very strict positivity and atomicity constraint in real space'. In principle, the MC technique can treat inhomogeneous surfaces, including domain structures. In addition, the MC calculation can provide the probability density of the structure model, that is, the standard deviations of structure parameters are directly provided when a set of experimental results is given. In this paper, we consider (001)-oriented LaAlO3/SrTiO3 ultrathin films. Fig. 1 presents a schematic of the film. La and Sr occupy the A site, and Al and Ti occupy the B site. Because of the assumed sample orientation, the structure is regarded as an alternating stack of AO and BO2 planes; the oxygen site in the AO plane is called the O1 site, and those in the BO2 plane are called O2 sites. In the present study, we refined the structure of a ten-unit-cell-thick region from the surface.
program to find reliable structural models for perovskite oxide interfaces. The method we adopt is similar to the reverse Monte Carlo method used for liquids or amorphous bodies (McGreevy & Pusztai, 19882. Theory
2.1. Structure model
The in-plane structure is assumed to have no a lattice parameter to be the same as that of SrTiO3. Now we have only two parameters per site, that is, the atomic displacements along the surface normal direction from the ideal substrate lattice and the occupancy , where denotes A, B, O1 or O2, denotes La, Sr, Al, Ti, O1 or O2, and n denotes the layer index starting from the ideal substrate (see Fig. 1). Conditions of occ(La,n) + occ(Sr,n) = 1 and occ(Al,n) + occ(Ti,n) = 1 are assumed, except for n values close to the surface. The isotropic atomic displacement parameter B (Å2) was defined as being common to all atoms within the film. The total number of independent parameters m depends on the constraints we use, and the typical value of m for the present study was 60. The structural parameters of the model are expressed as = = [z( A,1), z(B,1), z(O1,1), z(O2,1), z( A,2),…, occ(La,1), occ(Sr,1), occ(Al,1), occ(Ti,1), occ(O1,1), occ(O2,1),…].
and the2.2. Bayesian inference
The experimentally observed diffraction intensity from a unit area of the sample surface at scattering vector is expressed by . A set of CTR data is composed of the diffraction intensities at different , = [, ,…, ], where N denotes the total number of measured positions. Our MC software estimates the surface structure based on a posterior probability , that is, a conditional probability of structural parameters under the conditions of the measured CTR data. In Bayesian inference, maximizing the posterior probability to estimate parameters is called the maximum a posteriori estimation. The distribution of the posterior probability reflects the uncertainty of the estimation. According to Bayes' theorem, the posterior probability is given by
where is called the likelihood function, which represents the statisical property of the measurement noise, and is called the prior probability, which represents prior information on the structure provided by other experiments and theories. In this paper, is set to a uniform distribution except in the last paragraph of §4, which means that no prior information is assumed for . As a result, is proportional to = . The conditional probability is assumed to be a Gaussian distribution with a standard deviation of :
The calculated intensity for structure at is here expressed as = , where denotes the scattering amplitude for structure at and S is a scale factor. Let be given by to express the Gaussian noise whose standard deviation is proportional to the intensity, which imitates the error arising from the optical misalignment. Here, is the background intensity at . From the typical standard deviation of our CTR data at equivalent positions, we estimate the value of .
The structure that gives the maximum value of is the structure model most likely to reproduce the experimental data. For convenience, we introduce a cost function E() = , whose minimization is equivalent to maximizing a corresponding posterior distribution. Substituting equations (1) and (2) into the definition of E(), one obtains
up to a constant that is independent of .
The advantage of the MC-based technique is the flexibility of the model construction. can include the effect of BO6 octahedral rotation or any kind of domain structure, such as domains having different film thickness, through the ordinary formula of kinematical diffraction theory. In the practical use of the MC technique, the amount of information carried by the CTR data limits the complexity of the model. at very close values are correlated, especially when the distance between the two points is closer than the instrumental resolution. This means that the amount of information in the CTR profile is proportional not to N but to the range of space. In this study, we selected the data step in the direction to be 0.02, which corresponds to ∼0.2° in the 2θ angle for 16.5 keV X-rays. Thus, the resolution functions of the neighboring data points are well separated. Under these conditions, we will examine if the information derived from the CTR profiles is sufficient to obtain the global optima, by estimating both the accuracy and the precision in §3. The application to real experimental data will be presented in §4.
2.3. Initial model construction
The search for the global minimum of is divided into two parts: preparing initial parameters and the Erf(x-xc,s), where xc and s are the peak position and the width of the corresponding Gaussian function. The initial values of the interplanar distance c+z( A,n+1)-z(A,n) are expressed as , where c is the bulk substrate lattice parameter, denotes the difference in the interplanar distance between the film and substrate, nint denotes the nominal interface position, and sz gives the sharpness of the interface. The of the initial distance c+z( A,n+1)-z( A,n) is presented in Fig. 2(a), and the corresponding initial z( A,n) profile is shown in panel (b) by the blue triangles. The depth dependence of the initial value of is expressed by the product of two error functions, Erf(n-nsurf,ssurf)[1-Erf(n-nint,sint)] (where nsurf, ssurf, sint denote the nominal surface position, the sharpness of the surface and that of the interface, respectively), as presented in Fig. 2(c) (blue triangles). The number of parameters is reduced to six (, nint, nsurf, sz, sint, ssurf), which allows us to use the grid search method to find a good initial model.
of the parameters. The requirements of the initial model largely depend on the robustness of the process. While the MC technique is robust to the initial parameters, a good initial model is preferable to achieve the model providing the global minimum of . The initial values of the structural parameters are modeled by using the error functions2.4. Monte Carlo sampling
part of the MC calculation was performed by with the Metropolis method (Metropolis(1) One component of is randomly selected to be modified. The parameter is modified by a small step defined by a Gaussian random number with a standard deviation of 0.004 Å for and 0.01 for .
(2) The change in caused by the parameter modification, = - , where and are the structure models before and after the modification, is evaluated. The probability that the modification is accepted is given by r, which is defined as
Here, TMC is a temperature parameter to adjust how often modifications are accepted in MC sampling. If the modification is not accepted, is restored to .
(3) The value of TMC is decreased according to an exponential annealing schedule. This process is called simulated annealing (Kirkpatrick et al., 1983). The number of Monte Carlo steps is typically 106–107. A 107 cycle iteration takes half an hour using a 4 GHz core i7 CPU with single-core calculation.
(4) After we have constructed a structural model that gives a satisfactory small value of , is recorded during the MC calculation with a constant temperature TMC = 1/N, the inverse of the number of data points. This procedure generates samples from the posterior probability because the detailed balance condition is satisfied by the Metropolis method. Each component-wise posterior probability has a peak. The peak position and width are interpreted as the resulting structural parameter and its uncertainty, respectively. Since it evaluates the precision of structural parameters by directly sampling from their posterior probabilities, the MC technique can be applied to general cases where a strong correlation exists between the structural parameters. The values of and the scale factor S are refined together with .
3. Analysis of virtual measurement data
To evaluate the performance of the MC O3 film on a TiO2-terminated SrTiO3(001) substrate (Yamamoto et al., 2011); we will refer to the structure as . We define = + . Artificial noise was introduced with a Gaussian distribution having a standard deviation of , where was chosen to be 0.2. Hereafter, we call the virtual measurement (VM) data. are plotted in Fig. 3. The prepared VM data were 00, 01 and 11 rods. The total number of data points (N) was 575. The minimum value of the cost function was 6.84. The initial model made with a grid search provided = 29.38, which corresponds to R = = 0.354, using the initial value of . After the grid search, all variable parameters and γ were relaxed by the MC software. The MC calculation was divided into two stages by different temperature sequences. In the first temperature sequence, z(O1, n), occ(O1, n), z(O2, n) and occ(O2, n) were constrained to the corresponding values of metal ions on the same layer. The initial temperature was set to 0.5 to sample a wide parameter space. The second MC calculation with a different temperature sequence was applied to the model resulting from the first sequence by removing the constraints on z(O1, n) and z(O2, n), while the constraints on the occupancies for oxygen sites were maintained. The initial TMC was set to 0.1 and decreased by 1% per 2000 steps. In total 2×106 calculation steps were performed for both stages of MC calculation.
the software developed for this purpose was applied to artificial CTR intensity profiles calculated from the reported structure parameters of a five-unit-cell-thick LaAlThe intensity profiles calculated from the resulting structure are shown in Fig. 3 as red curves. The structure model was improved to = 7.25 (R = 0.04), and the resulting was 0.215 ± 0.002. The refined values of the structural parameters of the A site are presented in Fig. 2. Fig. 2(b) shows the depth dependence of z(A,n). A positive displacement represents atomic movement towards the surface. Fig. 2(c) shows the depth dependence of . The error bars are defined by the standard deviation of each parameter.
The precision of the structure parameters is also expressed by the standard deviation of each parameter. The average value of the standard deviation of for each site is listed in Table 1. Here, the average is taken over the sites whose occupancy is larger than 0.5. A typical value of for metal sites is ±0.01 Å and that for the oxygen sites is ±0.04 Å. Standard deviations for the occupancy parameters at the interface and at the surface are also listed in the same table. The typical precision of the metal occupancy is 3%.
|
The accuracy of the structure parameters is estimated from the differences between the resulting and , which are plotted in Figs. 2(b) and 2(c). The difference for was approximately the same as the magnitude of the error bar, and the typical difference for was three times larger than the standard deviation estimated by this procedure. The quantitative similarity between the accuracy and the precision suggests that the amount of information provided by the CTR profiles is well estimated by the number of data points N in the present case.
4. Analysis of experimental data
The experimental data of LaAlO3/SrTiO3 along the 00, 01 and 11 rods with N = 575 (Yamamoto et al., 2011) were analyzed with our MC software. The initial value of S was determined in advance of the MC calculation by the steepest descent method using only the data near the Bragg peaks, where the CTR intensity is rather insensitive to the detail of the surface structure model. The initial values of the other structural parameters were the same as those used for the analysis made on the VM data. The initial systematic noise scale was chosen to be 0.15.
The initial model made with the grid search provides the value of = 42.54 (R = 0.391). The analysis process for the MC sampling was the same as in §3, except for the of the atomic displacement parameter B for the whole film. Fig. 4 shows the experimentally observed CTR intensity profiles together with the calculated profiles from the resulting structure. The value of was reduced to 7.66 (R = 0.110). The MC is proven to be robust enough for practical use of perovskite interfacial Fig. 5(a) shows the depth dependence of . The main structural features reported by Yamamoto et al. (2011), namely the shift of the O atoms in SrTiO3 towards the surface and the lattice expansion around the interface, are reproduced in this analysis. Fig. 5(b) shows the depth dependence of . Some amount of atomic interdiffusion at the interface is visible. The average values of the standard deviation of the structural parameters are listed in Table 1, where the average was taken over the sites where the occupancy was larger than 0.5. The precision of the parameters was similar to that in the VM analysis. The B value for the film region was found to be 1.16 ± 0.08 Å2, which is about three times larger than that of the substrate atoms. We also tried to refine the B parameters for each atom, and found that the B parameters were scattered from site to site unphysically, while the other parameters were nearly unchanged. This is because the effect of each B parameter on the cost function is too small to refine with the present dataset.
Lastly, we present the effect of the prior probability . The thickness of the film in the refined structure can be defined as the total values of occ(La,n) or occ(Al,n), which are 4.91 ± 0.08 and 4.46 ± 0.12 in the results of the presented in Fig. 5. By applying a Gaussian prior probability with an average value of 5.0 and a standard deviation of 0.1, we obtained thicknesses of 4.92 ± 0.07 and 4.74 ± 0.13 for the La and Ti sites with the cost of a slight increase in R value. The R values before and after the application of the prior probability were 0.110 and 0.111, respectively. One can define an arbitrary , which will help to find a physically reasonable solution in a short time.
5. Conclusions
We have developed MC analysis software for the CTR scattering from perovskite-type transition metal oxide interfaces. The performance of the software was demonstrated by using a five-unit-cell-thick LaAlO3 ultrathin film on an SrTiO3 substrate as an example. The precision of the structural parameters estimated from the VM data analysis was ±0.01 and ±0.04 Å for the displacement of metal and O atoms, and that for the occupancies was ±0.03. The accuracy of the structural parameters was also examined, and it was found that the accuracy was similar to the precision. Experimental data were successfully analyzed by the same procedure. We are planning to apply this method to films with some inhomogeneity in thickness.
Footnotes
‡Present address: Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan.
Acknowledgements
The synchrotron radiation experiments at the Photon Factory were performed with the approval of the Photon Factory Program Advisory Committee (proposal Nos. 2015S2-009 and 2008-G099).
Funding information
This work was supported by a Grant-in-Aid for Scientific Research (Japan Society for the Promotion of Science KAKENHI, grant Nos. JP26105008 and JP26287080).
References
Björck, M., Schlepütz, C. M., Pauli, S. A., Martoccia, D., Herger, R. & Willmott, P. R. (2008). J. Phys. Condens. Matter, 20, 445006. Google Scholar
Cantoni, C., Gazquez, J., Miletto Granozio, F., Oxley, M. P., Varela, M., Lupini, A. R., Pennycook, S. J., Aruta, C., di Uccio, U. S., Perna, P. & Maccariello, D. (2012). Adv. Mater. 24, 3952–3957. Web of Science CrossRef CAS PubMed Google Scholar
Chakhalian, J., Freeland, J., Millis, A., Panagopoulos, C. & Rondinelli, J. (2014). Rev. Mod. Phys. 86, 1189–1202. Web of Science CrossRef CAS Google Scholar
Chaloupka, J. & Khaliullin, G. (2008). Phys. Rev. Lett. 100, 016404. Web of Science CrossRef PubMed Google Scholar
Dagotto, E. (2005). Science, 309, 257–262. Web of Science CrossRef PubMed CAS Google Scholar
D'Alessandro, M. (2011). Phys. Rev. E, 84, 041130. Google Scholar
D'Alessandro, M. & Cilloco, F. (2010). Phys. Rev. E, 82, 021128. Google Scholar
Feidenhans'l, R. (1989). Surf. Sci. Rep. 10, 105–188. CrossRef CAS Web of Science Google Scholar
Fister, T., Zhou, H., Luo, Z., Seo, S., Hruszkewycz, S., Proffit, D., Eastman, J., Fuoss, P., Baldo, P., Lee, H. & Fong, D. (2014). APL Mater. 2, 021102. Google Scholar
Fong, D. D., Cionca, C., Yacoby, Y., Stephenson, G. B., Eastman, J. A., Fuoss, P. H., Streiffer, S. K., Thompson, C., Clarke, R., Pindak, R. & Stern, E. A. (2005). Phys. Rev. B, 71, 144112. Web of Science CrossRef Google Scholar
Fung, R., Shneerson, V. L., Lyman, P. F., Parihar, S. S., Johnson-Steigelman, H. T. & Saldin, D. K. (2007). Acta Cryst. A63, 239–250. Web of Science CrossRef CAS IUCr Journals Google Scholar
Goodenough, J. (2001). Editor. Localized to Itinerant Electronic Transition in Perovskite Oxides. Structure and Bonding Vol. 98. Berlin, Heidelberg, New York: Springer. Google Scholar
Hwang, H., Iwasa, Y., Kawasaki, M., Keimer, B., Nagaosa, N. & Tokura, Y. (2012). Nat. Mater. 11, 103–113. Web of Science CrossRef CAS PubMed Google Scholar
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. (1983). Science, 220, 671–680. CrossRef PubMed CAS Web of Science Google Scholar
McGreevy, R. L. & Pusztai, L. (1988). Mol. Simul. 1, 359–367. Web of Science CrossRef Google Scholar
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. (1953). J. Chem. Phys. 21, 1087–1092. CrossRef CAS Web of Science Google Scholar
Middey, S., Chakhalian, J., Mahadevan, P., Freeland, J., Millis, A. & Sarma, D. (2016). Annu. Rev. Mater. Res. 46, 305–334. Web of Science CrossRef CAS Google Scholar
Nakagawa, N., Hwang, H. & Muller, D. (2006). Nat. Mater. 5, 204–209. Web of Science CrossRef CAS Google Scholar
Ohtomo, A. & Hwang, H. (2004). Nature, 427, 423–426. Web of Science CrossRef PubMed CAS Google Scholar
Ohtomo, A., Muller, D., Grazul, J. & Hwang, H. (2002). Nature, 419, 378–380. Web of Science CrossRef PubMed CAS Google Scholar
Perna, P., Maccariello, D., Radovic, M., Scotti di Uccio, U., Pallecchi, I., Codda, M., Marré, D., Cantoni, C., Gazquez, J., Varela, M., Pennycook, S. J. & Granozio, F. M. (2010). Appl. Phys. Lett. 97, 152111. Web of Science CrossRef Google Scholar
Reyren, N., Thiel, S., Caviglia, A. D., Kourkoutis, L. F., Hammerl, G., Richter, C., Schneider, C. W., Kopp, T., Rüetschi, A.-S., Jaccard, D., Gabay, M., Muller, D. A., Triscone, J.-M. & Mannhart, J. (2007). Science, 317, 1196–1199. Web of Science CrossRef PubMed CAS Google Scholar
Robinson, I. K. (1986). Phys. Rev. B, 33, 3830–3836. CrossRef CAS Web of Science Google Scholar
Salluzzo, M., Gariglio, S., Torrelles, X., Ristic, Z., Di Capua, R., Drnec, J., Sala, M. M., Ghiringhelli, G., Felici, R. & Brookes, N. B. (2013). Adv. Mater. 25, 2333–2338. Web of Science CrossRef CAS PubMed Google Scholar
Takahashi, T., Sumitani, K. & Kusano, S. (2001). Surf. Sci. 493, 36–41. Web of Science CrossRef CAS Google Scholar
Tokura, Y. & Nagaosa, N. (2000). Science, 288, 462–468. Web of Science CrossRef PubMed CAS Google Scholar
Vlieg, E. (2000). J. Appl. Cryst. 33, 401–405. Web of Science CrossRef CAS IUCr Journals Google Scholar
Vonk, V. (2011). J. Appl. Cryst. 44, 1217–1221. Web of Science CrossRef CAS IUCr Journals Google Scholar
Willmott, P., Pauli, S., Herger, R., Schlepütz, C., Martoccia, D., Patterson, B., Delley, B., Clarke, R., Kumah, D., Cionca, C. & Yacoby, Y. (2007). Phys. Rev. Lett. 99, 155502. Web of Science CrossRef PubMed Google Scholar
Yacoby, Y., Sowwan, M., Stern, E., Cross, J., Brewe, D., Pindak, R., Pitney, J., Dufresne, E. & Clarke, R. (2002). Nat. Mater. 1, 99–101. Web of Science CrossRef PubMed CAS Google Scholar
Yamamoto, R., Bell, C., Hikita, Y., Hwang, H., Nakamura, H., Kimura, T. & Wakabayashi, Y. (2011). Phys. Rev. Lett. 107, 036104. Web of Science CrossRef PubMed Google Scholar
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.