research papers

STRUCTURAL SCIENCE
CRYSTAL ENGINEERING
MATERIALS
ISSN: 2052-5206

ChemEnv: a fast and robust coordination environment identification tool

aInstitute of Condensed Matter and Nanosciences, Université catholique de Louvain, Chemin des Étoiles 8, 1348 Louvain-la-Neuve, Belgium, bEnergy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, cDepartment of Materials Science and Engineering, University of California, Berkeley, CA 94720, USA, dBASF SE, Digitalization of R&D, Carl-Bosch-Str. 38, 67056 Ludwigshafen, Germany, and eSkolkovo Institute of Science and Technology, Skolkovo Innovation Center, Nobel St. 3, Moscow, 143026, Russia
*Correspondence e-mail: geoffroy.hautier@uclouvain.be

Edited by J. Lipkowski, Polish Academy of Sciences, Poland (Received 3 December 2019; accepted 15 June 2020; online 21 July 2020)

Coordination or local environments have been used to describe, analyze and understand crystal structures for more than a century. Here, a new tool called ChemEnv, which can identify coordination environments in a fast and robust manner, is presented. In contrast to previous tools, the assessment of the coordination environments is not biased by small distortions of the crystal structure. Its robust and fast implementation enables the analysis of large databases of structures. The code is available open source within the pymatgen package and the software can also be used through a web app available on http://crystaltoolkit.org through the Materials Project.

1. Introduction

Inorganic crystal structures are typically described by their structure prototype or by a more local concept of `coordination environment' (Müller, 2007[Müller, U. (2007). Inorganic Structural Chemistry. Wiley.]; Allmann & Hinek, 2007[Allmann, R. & Hinek, R. (2007). Acta Cryst. 63, 412-417.]). Coordination environments or local environments (e.g. octahedral, tetrahedral) are often used in structure visualization as they clarify the crystal arrangement. These environments can also be used to understand crystal structures and their properties. P. Pfeiffer was the first to transfer this concept of coordination environments from coordination complexes to crystals to rationalize crystals as large molecules (Pfeiffer, 1915[Pfeiffer, P. (1915). Z. Anorg. Allg. Chem. 92, 376-380.], 1916[Pfeiffer, P. (1916). Z. Anorg. Allg. Chem. 97, 161-174.]). Very often these coordination environments are determined in a non-automatic manner by the individual researcher. Local environments play a major role in solid state chemistry and physics as well as materials science. For instance, the famous Pauling rules, which have been used to understand and rationalize crystal structures for 90 years, rely heavily on this concept (Pauling, 1929[Pauling, L. (1929). J. Am. Chem. Soc. 51, 1010-1026.]). In the Pauling rules, the analysis of the coordination environments is used to determine the stability of a material. Electronic, optical, magnetic and other properties of crystals have also been related to and explained by local environments (Hoffmann, 1987[Hoffmann, R. (1987). Angew. Chem. Int. Ed. Engl. 26, 846-878.], 1988[Hoffmann, R. (1988). Solids and Surfaces: A Chemist's View of Bonding in Extended Structures. New York: VCH Publishers.]; Lueken, 2013[Lueken, H. (2013). Magnetochemie: Eine Einführung in Theorie und Anwendung. Teubner Studienbücher Chemie. Vieweg+Teubner Verlag.]; Peng et al., 2015[Peng, H., Ndione, P. F., Ginley, D. S., Zakutayev, A. & Lany, S. (2015). Phys. Rev. X, 5, 021016.]).
In recent years, coordination environments have been discussed and used as structural descriptors to derive structure–property relationships via machine-learning methods (Jain et al., 2016[Jain, A., Hautier, G., Ong, S. P. & Persson, K. (2016). J. Mater. Res. 31, 977-994.]; Zimmermann et al., 2017[Zimmermann, N. E. R., Horton, M. K., Jain, A. & Haranczyk, M. (2017). Front. Mater. 4, 34.]). Some of us have analyzed the coordination environments present in oxides in a statistical manner (Waroquiers et al., 2017[Waroquiers, D., Gonze, X., Rignanese, G.-M., Welker-Nieuwoudt, C., Rosowski, F., Göbel, M., Schenk, S., Degelmann, P., André, R., Glaum, R. & Hautier, G. (2017). Chem. Mater. 29, 8346-8360.]). Such large-scale analyses require an easily reproducible, robust and automatic determination of coordination environments. Since the transfer of the concept of coordination environments from coordination complexes to crystals, various approaches to determine coordination numbers, coordination environments, or the distortion of coordination environments have been developed (Frank & Kasper, 1958[Frank, F. C. & Kasper, J. S. (1958). Acta Cryst. 11, 184-190.]; Brunner & Schwarzenbach, 1971[Brunner, G. O. & Schwarzenbach, D. (1971). Z. Kristallogr. 133, 127-133.]; Carter, 1978[Carter, F. L. (1978). Acta Cryst. B34, 2962-2966.]; O'Keeffe, 1979[O'Keeffe, M. (1979). Acta Cryst. A35, 772-775.]; Hoppe, 1979[Hoppe, R. (1979). Z. Kristallogr. 150, 23-52.]; Pinsky & Avnir, 1998[Pinsky, M. & Avnir, D. (1998). Inorg. Chem. 37, 5575-5582.]; Guńka & Zachara, 2019[Guńka, P. A. & Zachara, J. (2019). Acta Cryst. B75, 86-96.]; Stoiber & Niewa, 2019[Stoiber, D. & Niewa, R. (2019). Z. Kristallogr. 234, 201-209.]). 
However, the methods mentioned so far are not well suited for a robust and automatic assessment of coordination environments in very large databases consisting of several thousands of crystal structures such as the Inorganic Crystal Structure Database (Bergerhoff & Brown, 1987[Bergerhoff, G. & Brown, I. D. (1987). Crystallographic Databases, 360, 77-95.], Zagorac et al., 2019[Zagorac, D., Müller, H., Ruehl, S., Zagorac, J. & Rehme, S. (2019). J. Appl. Cryst. 52, 918-925.]), Pearson's database (Villars & Cenzual, 2018[Villars, P. & Cenzual, K. (2018). Pearson's Crystal Data: Crystal Structure Database for Inorganic Compounds. Release 2018/19 (on DVD). ASM International, Materials Park, Ohio, USA.]) or the Cambridge Structural Database (Groom et al., 2016[Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. (2016). Acta Cryst. B72, 171-179.]). Indeed, some of these methods can be sensitive to small distortions due to predefined cut-offs while others rely on additional chemical information that is not directly available from the sole consideration of the geometry of the crystal. Moreover, some of these methods only deal with the identification of the coordination number without assigning a specific environment to a given site. To fill this gap we developed ChemEnv, a fast and robust tool to identify coordination environments. It has already been applied in the study of coordination environments of oxides (Waroquiers et al., 2017[Waroquiers, D., Gonze, X., Rignanese, G.-M., Welker-Nieuwoudt, C., Rosowski, F., Göbel, M., Schenk, S., Degelmann, P., André, R., Glaum, R. & Hautier, G. (2017). Chem. Mater. 29, 8346-8360.]) and in a rigorous assessment study of the Pauling rules (George et al., 2020[George, J., Waroquiers, D., Di Stefano, D., Petretto, G., Rignanese, G.-M. & Hautier, G. (2020). Angew. Chem. Int. Ed. 59, 7569-7575.]). 
It is embedded in pymatgen – a Python library for materials analysis which is part of the Materials Project that aims at the accelerated design of new materials (Ong et al., 2013[Ong, S. P., Richards, W. D., Jain, A., Hautier, G., Kocher, M., Cholia, S., Gunter, D., Chevrier, V. L., Persson, K. A. & Ceder, G. (2013). Comput. Mater. Sci. 68, 314-319.]; Jain et al., 2013[Jain, A., Ong, S. P., Hautier, G., Chen, W., Richards, W. D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., Ceder, G. & Persson, K. A. (2013). APL Mater. 1, 011002. ]). Our approach relies on the similarity of such distorted polyhedra present in the crystal structure to ideal reference polyhedra. After a neighbor analysis, we identify potential local environments and compare them through a distance metric to a database of perfect local environments. Different algorithms called strategies are then used to decide on a local environment assignment and the final result can present a unique environment or a mixture of several environments. This approach which is robust to distortion will be described in detail in this paper.

2. Method/algorithm

2.1. Aspects of coordination environments identification

In the process of identifying coordination environments of a given atom, two main questions have to be considered:

(a) What are the neighbors of this atom?

(b) What is the overall arrangement of these neighbors around this atom?

The first question refers to what is called the coordination number while the second corresponds to the coordination or local environment. The answer to these questions is very clear when the local structure of the atom is close to a perfect environment. However, when relatively large distortions are present, the identification can be much more difficult. In particular, a given local environment can be identified as a mix of two or more coordination environments (which can be of the same coordination number or not).

2.2. Voronoï analysis

The neighbors of a given atom in a given structure are determined using a modified Voronoï approach similar to what was proposed by O'Keeffe (1979[O'Keeffe, M. (1979). Acta Cryst. A35, 772-775.]). The Voronoï analysis allows for the splitting of the space into regions that are closer to one atom than to any other one. In the standard Voronoï approach for determining the neighbors of a given atom X, all the atoms {Y1,…,Yn} whose regions are contiguous to the region of atom X are considered as coordinated to atom X. The distances between atom X and each of its neighbors are written [\lbrace d^X_{Y_1}, \ldots d^X_{Y_n}\rbrace]. The common faces [\lbrace f^X_{Y_1}, \ldots f^X_{Y_n}\rbrace] between the region of atom X and each of the regions of atoms {Y1,…,Yn} define solid angles [\lbrace \Omega^X_{Y_1}, \ldots \Omega^X_{Y_n}\rbrace] subtended by these faces at atom X.
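The solid angle subtended by a Voronoï face at atom X can be evaluated numerically. The following is a minimal sketch, assuming that the face is a convex planar polygon whose vertices are given in order; it splits the face into triangles and applies the Van Oosterom–Strackee formula to each of them. The function names are illustrative and are not taken from ChemEnv.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triangle_solid_angle(origin, a, b, c):
    """Solid angle (in steradians) of triangle abc seen from origin."""
    r1, r2, r3 = _sub(a, origin), _sub(b, origin), _sub(c, origin)
    n1, n2, n3 = (math.sqrt(_dot(r, r)) for r in (r1, r2, r3))
    numerator = _dot(r1, _cross(r2, r3))
    denominator = (n1 * n2 * n3 + _dot(r1, r2) * n3
                   + _dot(r1, r3) * n2 + _dot(r2, r3) * n1)
    return abs(2.0 * math.atan2(numerator, denominator))


def face_solid_angle(origin, vertices):
    """Solid angle of a convex planar face (ordered vertices), by fan triangulation."""
    return sum(triangle_solid_angle(origin, vertices[0], vertices[i], vertices[i + 1])
               for i in range(1, len(vertices) - 1))


# One face of a cube seen from the cube center subtends one sixth of the full sphere.
cube_face = [(1.0, -1.0, -1.0), (1.0, 1.0, -1.0), (1.0, 1.0, 1.0), (1.0, -1.0, 1.0)]
omega = face_solid_angle((0.0, 0.0, 0.0), cube_face)
```

The cube face is a convenient check because its solid angle, 4π/6, is known analytically.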

The Voronoï regions are easily understood by drawing the perpendicular area bisectors for each pair of atoms X and Y. Fig. 1[link] illustrates the concept in two dimensions (in which area bisectors are thus replaced by line bisectors). The example shown is a slightly distorted square lattice [see Fig. 1[link](a)] where the atoms at the corners (atoms 1, 3, 6 and 8) are displaced towards the central green atom (atom 0). The perfect square lattice is shown by the gray atoms. In Fig. 1[link](b), the perpendicular line bisectors (in red) are drawn for each segment from the central (green) atom and all other (black) atoms around it. The Voronoï region of the central atom corresponds to the region in light green in Fig. 1[link](c). Fig. 1[link](d) shows the faces [\lbrace f^0_{1}, \ldots f^0_{8}\rbrace] attributed to each pair of atoms 0i with i = 1…8. The solid angle is illustrated for neighbors 1 and 5 by [\Omega^0_{1}] and [\Omega^0_{5}], respectively.

[Figure 1]
Figure 1
Voronoï construction.

In our modified approach, two additional cut-offs can be added as shown schematically in two dimensions in Fig. 2[link]:

[Figure 2]
Figure 2
Schematic representation of the cut-off parameters used in the Voronoï analysis of neighbors: (a) distance cut-off and (b) angle cut-off. (a) Distance cut-off parameter κ. [d^X_{{\rm min}}] is the distance to the closest neighbor (one of the dark blue atoms). Any atom that lies outside the sphere of radius [\kappa\times d^{X}_{{\rm min}}] (in dashed orange) is not considered as a coordinated neighbor. Atoms at the corner (in light blue) are not considered as neighbors. (b) Angle cut-off parameter γ. [\Omega^X_{{\rm max}}] is the largest solid angle to a neighbor atom. Any atom for which the solid angle is smaller than [\gamma\times\Omega^X_{{\rm max}}] (in orange) is not considered as a coordinated neighbor. Atoms at the corner (in light blue) are not considered as neighbors. [Adapted with permission from Waroquiers et al. (2017[Waroquiers, D., Gonze, X., Rignanese, G.-M., Welker-Nieuwoudt, C., Rosowski, F., Göbel, M., Schenk, S., Degelmann, P., André, R., Glaum, R. & Hautier, G. (2017). Chem. Mater. 29, 8346-8360.]).]

(a) The first cut-off excludes neighbors on the basis of the distance [Fig. 2[link](a)]. Let [d^X_{{\rm min}} = {\rm min}(\lbrace d^X_{Y_1}, \ldots d^X_{Y_n}\rbrace)] be the distance to the closest neighbor of atom X and κ ≥ 1.0 be the distance cut-off parameter. All atoms lying inside the sphere of radius [\kappa\times d^X_{{\rm min}}] are considered as coordinated neighbors while those lying outside are disregarded. We define the normalized distance [\overline{d^X_{Y_i}}] of each neighbor Yi as [{{d^X_{Y_i}}/ {d^X_{{\rm min}}}}].

(b) The second cut-off is based on the solid angles [\lbrace \Omega^X_{Y_1}, \ldots \Omega^X_{Y_n}\rbrace] introduced before [Fig. 2[link](b)]. Let [\Omega^X_{{\rm max}} = {\rm max}(\lbrace \Omega^X_{Y_1}, \ldots \Omega^X_{Y_n}\rbrace)] be the biggest solid angle to a neighbor for atom X and γ ≤ 1.0 be the angle cut-off parameter. All neighboring atoms with a solid angle smaller than [\gamma\times\Omega^X_{{\rm max}}] are not considered as coordinated to atom X. We define the normalized angle [\overline{\Omega^X_{Y_i}}] of each neighbor Yi as [{{\Omega^X_{Y_i}}/ {\Omega^X_{{\rm max}}}}].

It is possible to use both cut-offs at the same time in which case a given atom is not considered as a coordinated neighbor if either one of the cut-offs disregards it as a coordinated neighbor.
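The combined use of the two cut-offs can be sketched as follows. This is an illustrative implementation only: the function name, the input layout (a dictionary mapping a neighbor label to its distance and solid angle) and the numerical values are hypothetical, and atoms lying exactly on a cut-off are kept here by convention.

```python
def select_neighbors(candidates, kappa=1.4, gamma=0.3):
    """Return the labels of the Voronoi neighbors that pass both cut-offs.

    candidates maps a label to a (distance, solid_angle) pair.
    kappa >= 1.0 is the distance cut-off parameter.
    gamma <= 1.0 is the angle cut-off parameter.
    """
    d_min = min(distance for distance, _ in candidates.values())
    omega_max = max(angle for _, angle in candidates.values())
    kept = []
    for label, (distance, angle) in candidates.items():
        within_distance = distance <= kappa * d_min
        wide_enough = angle >= gamma * omega_max
        # A neighbor is discarded as soon as one of the two cut-offs rejects it.
        if within_distance and wide_enough:
            kept.append(label)
    return kept


# Hypothetical Voronoi neighbors of one atom: (distance in angstrom, solid angle in sr).
example = {
    "O1": (1.90, 2.0),
    "O2": (1.95, 1.9),
    "O3": (2.80, 1.5),  # too far: 2.80 > 1.4 x 1.90
    "O4": (2.10, 0.4),  # too narrow: 0.4 < 0.3 x 2.0
}
neighbors = select_neighbors(example)
```

With κ = 1.4 and γ = 0.3, only the first two candidates are retained in this example.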

The modified Voronoï procedure presented above allows for the determination of the coordinated neighbors of a given atom X for a given set of distance/angle parameters. The identification of the coordinated neighbors of atom X defines the local environment of this atom. The identification of the model environment which this local environment resembles the most is described in the next section.

2.3. The shape recognition problem and the continuous symmetry measure

The shape recognition problem consists in identifying the model environment that a local and possibly distorted environment most resembles. Fig. 3[link] illustrates this problem. A distorted octahedron is shown in Fig. 3[link](a). Determining whether this distorted octahedron is more similar to a perfect octahedron [see Fig. 3[link](b)] than to any other (model) shape is precisely the purpose of shape recognition. This inherently implies that a list of model polyhedra to be compared to is known a priori. We use the list of coordination environments recommended by the IUPAC (Hartshorn et al., 2007[Hartshorn, R. M., Hey-Hawkins, E., Kalio, R. & Leigh, G. J. (2007). Pure Appl. Chem. 79, 1779-1799.]) and by the IUCr (Lima-de-Faria et al., 1990[Lima-de-Faria, J., Hellner, E., Liebau, F., Makovicky, E. & Parthé, E. (1990). Acta Cryst. A46, 1-11.]). This list of environments, together with their symbols, coordinates and additional meta-information, is given as supplementary information.

[Figure 3]
Figure 3
The shape recognition problem. It consists in identifying whether the distorted octahedron in (a) is more similar to the perfect (model) octahedron in (b) than to any other model polyhedron. This presupposes that there exists a list of model polyhedra to be compared to.

In order to measure the closeness of a local environment to each perfect model environment, the Continuous Symmetry Measure (CSM) is used, as proposed by Pinsky & Avnir (1998[Pinsky, M. & Avnir, D. (1998). Inorg. Chem. 37, 5575-5582.]). This CSM can be interpreted as a measure of similarity between shapes. For a given structure [{\cal Q}] composed of [N = N_{\cal Q}] atoms (vertices) with coordinates {qk, k = 1, 2, …, N}, the CSM [{\rm S}_{{\cal P}}\left[{\cal Q}\right]] with respect to a model polyhedron [{\cal P}] with [N = N_{\cal P} = N_{\cal Q}] vertices {pk, k = 1, 2, …, N} is defined as:

[S_{{\cal P}}\left[{\cal Q}\right] = \min {{\sum\limits_{k = 1}^N\left|{\bf q}_k-{\bf p}_k\right|^2} \over {\sum\limits_{k = 1}^N\left|{\bf q}_k-\overline{{\bf q}}\right|^2}}\times 100 \eqno(1)]

with [\overline{{\bf q}} = {{1} \over {N}}\sum\nolimits_{k = 1}^{N}{\bf q}_k].

With this definition, the value of the CSM is guaranteed to be in the [0.0, 100.0] interval. A value of 0.0 for the CSM indicates that the two shapes are identical, i.e. the structure [{\cal Q}] corresponds to the perfect structure [{\cal P}]. Instead, when the structure is distorted, the value of the CSM gives a degree of distortion of the structure [{\cal Q}] with respect to the perfect structure [{\cal P}]. As such, the CSM can be understood as one definition of a distance to a shape.

In equation (1[link]), the minimization has to be performed with respect to four different degrees of freedom:

(i) Translation [see Fig. 4[link](a)]. This minimization is easily addressed by translating the local structures to their center of mass.

[Figure 4]
Figure 4
Translational and ordering degrees of freedom for the minimization in equation (1[link]). (a) Translation of the polyhedron and (b) ordering of the vertices.

(ii) Ordering of the atoms [see Fig. 4[link](b)]. The simplest method is to test all possible permutations of indices. This guarantees a correct value for the CSM but the number of permutations scales as N! making it time-consuming for large (N > 6) coordination numbers. The symmetry of the model polyhedra is used to reduce the number of independent permutations for N ≤ 6. For larger N, a different approach is adopted (see Section 2.4[link]).

(iii) Orientation of the structure [see Fig. 5[link](a)]. The local (distorted) structure is rotated in order to minimize the numerator in equation (1[link]) by using an alignment procedure based on the singular value decomposition (Kabsch, 1976[Kabsch, W. (1976). Acta Cryst. A32, 922-923.]; Kabsch, 1978[Kabsch, W. (1978). Acta Cryst. A34, 827-828.]).

[Figure 5]
Figure 5
Rotational and size degrees of freedom for the minimization in equation (1[link]). (a) Orientation and (b) size.

(iv) Size of the structure [see Fig. 5[link](b)]. A scaling factor is applied to the local structure to avoid size effects: the local structure is normalized to the root-mean square distance from the center of mass of the structure to all vertices.

The minimization process presented above is equivalent to the point set registration algorithms used in shape or pattern recognition (Pomerleau et al., 2015[Pomerleau, F., Colas, F. & Siegwart, R. (2015). Found. Trends Robot. 1, 1-104.]). The main challenge comes from the fact that the correspondence between points in [{\cal Q}] and [{\cal P}] (i.e. the ordering problem described above) is unknown. In pattern recognition, in which the number of points is usually large, algorithms based on pair correlation functions combined with statistical analysis are widely used [see Maiseli et al. (2017[Maiseli, B., Gu, Y. & Gao, H. (2017). J. Visual Commun. Image Represent. 46, 95-106.]) and references therein]. In contrast, for a small number of points, a different approach has to be adopted. As briefly outlined above, the simplest solution (which is used for N ≤ 6) is to test all possible permutations of indices (ignoring symmetrically identical ones), while for larger N the number of permutations is reduced using the separation-plane algorithm (see Section 2.4[link]). In any case, for a given permutation of points, the CSM can be obtained with algorithm 1 (see Fig. 6[link]; the points in [{\cal Q}] have been translated such that their center of mass coincides with that of [{\cal P}]). The exact CSM is then the smallest of all the CSMs computed for the permutations considered.

[Figure 6]
Figure 6
Algorithm 1. Computation of the CSM for a given permutation.
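Since algorithm 1 is only given as a figure, the following NumPy sketch shows one possible way to evaluate equation (1) for a fixed pairing of vertices and then to minimize over all permutations. It is written under stated assumptions and is not the ChemEnv implementation: the rotation is restricted to proper rotations obtained with the Kabsch procedure, and the size degree of freedom is handled by an analytic least-squares scale factor applied to the model. The function names are illustrative.

```python
import itertools

import numpy as np


def csm_for_permutation(q, p):
    """CSM of local points q (N x 3) against model points p (N x 3), paired row by row."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    # Translation: bring both point sets to their center of mass.
    q = q - q.mean(axis=0)
    p = p - p.mean(axis=0)
    # Orientation: Kabsch alignment (proper rotation that best maps p onto q).
    u, _, vt = np.linalg.svd(p.T @ q)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p @ rotation.T
    # Size: least-squares isotropic scale of the model (assumption of this sketch).
    scale = np.sum(q * p_rot) / np.sum(p_rot * p_rot)
    residual = np.sum((q - scale * p_rot) ** 2)
    return 100.0 * residual / np.sum(q * q)


def csm_brute_force(q, p):
    """Smallest CSM over all N! vertex orderings (practical only for small N)."""
    p = np.asarray(p, dtype=float)
    return min(csm_for_permutation(q, p[list(perm)])
               for perm in itertools.permutations(range(len(p))))


# Perfect octahedron used as the model polyhedron.
octahedron = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                       [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
```

With these choices the result always lies between 0 and 100, and a rotated, rescaled, translated and reordered copy of the model gives a value that is zero to within rounding errors.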

2.4. Separation-plane algorithm

The number of permutations needed to minimize equation (1[link]) scales as N!, where N is the number of coordinated neighbors. When the correspondence of vertices between the local distorted structure and the perfect model polyhedron is not known (which is usually the case for the application of the procedure to large databases of structures), this makes the computation of the CSM almost infeasible for N > 10 and very time-consuming for 6 < N ≤ 10 with the standard procedure (e.g. 9! = 362 880, 12! = 479 001 600).

In order to overcome this difficulty, the separation plane algorithm has been devised to drastically reduce the computational time needed. The basic idea is to identify possible planes in the distorted structure that can be assigned to a plane in the model polyhedron in order to reduce the number of permutations needed to find the right correspondence between points and hence the correct CSM. This idea is illustrated in Fig. 7[link] for a two-dimensional case. The points of the perfect model shape are separated into three different groups: the set of points supposed to lie within the plane and the two sets of points on either side of the plane. The permutation space is thus reduced because N! is always larger than N1!N2!N3! if at least two of N1, N2, N3 are larger than or equal to 1. For the example in Fig. 7[link], the number of permutations is reduced from 6! = 720 to 2! × 2! × 2! = 8. Additionally, for larger environments in which the separating plane contains more than three points, these can be ordered using clockwise or counterclockwise ordering, hence reducing the number of permutations even further.

[Figure 7]
Figure 7
Illustration of the separation plane algorithm. (a) Model and local (distorted) structure, (b) first trial for separation plane algorithm and (c) second trial for separation plane algorithm.
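The reduction of the permutation space can be checked with a few lines of arithmetic. The sketch below simply compares N! with N1! × N2! × N3!; the function names are illustrative.

```python
from math import factorial


def full_permutations(n):
    """Number of vertex orderings to test without any separation plane."""
    return factorial(n)


def separated_permutations(n_side_1, n_plane, n_side_2):
    """Number of orderings left once the points are split into three groups."""
    return factorial(n_side_1) * factorial(n_plane) * factorial(n_side_2)


# Two-dimensional example with six points split into three groups of two.
before = full_permutations(6)             # 720
after = separated_permutations(2, 2, 2)   # 8
```

For the two-dimensional example, the number of orderings to test drops by a factor of 90.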

A separation is defined by its separation plane Pperf passing through at least three points of the perfect polyhedron [{\cal P}] and by the two separated groups of points Sperf and Tperf located on either side of the plane. The set of points in the plane is written as P = {pj, j = 1,…NP} while S = {sm, m = 1, …NS} and T = {tn, n = 1, …NT} stand for the two sets of points on either side of the plane. By construction, {qk} = {pj} ∪ {sm} ∪ {tn} and N = NP + NS + NT. We use ϒperf = (NS, NP, NT) as an abridged notation for the separation. For the example illustrated in Fig. 7[link], the separation is noted (2, 2, 2). An illustration of two separation planes for the cubic and cuboctahedral environments is provided in Fig. 8[link].

[Figure 8]
Figure 8
Examples of separation planes. (a) Separation (2, 4, 2) in the cubic environment: points A, B, H and E (in red) belong to the plane that separates points D and F (in green) from points C and G (in blue). (b) Separation (3, 6, 3) in a cuboctahedron: points A, I, G, D, L and F (in red) belong to the plane that separates points C, E and J (in blue) from points H, K and B (in green).
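A separation can be obtained by classifying each vertex according to its signed distance to the plane. The sketch below does this for a cube and a diagonal plane, and recovers a (2, 4, 2) separation. The vertex indices follow the order in which the cube is generated here and do not correspond to the letters used in Fig. 8; the function name and the tolerance are illustrative choices.

```python
from itertools import product


def split_by_plane(points, plane_triplet, tol=1e-8):
    """Split point indices into (side S, in plane, side T) for the plane through three points."""
    a, b, c = (points[i] for i in plane_triplet)
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    # Normal vector of the plane (cross product of two in-plane vectors).
    normal = (ab[1] * ac[2] - ab[2] * ac[1],
              ab[2] * ac[0] - ab[0] * ac[2],
              ab[0] * ac[1] - ab[1] * ac[0])
    side_s, in_plane, side_t = [], [], []
    for index, point in enumerate(points):
        signed = sum(normal[i] * (point[i] - a[i]) for i in range(3))
        if abs(signed) <= tol:
            in_plane.append(index)
        elif signed > 0:
            side_s.append(index)
        else:
            side_t.append(index)
    return side_s, in_plane, side_t


# Cube with vertices at (+-1, +-1, +-1); vertices 0, 1 and 7 lie in the diagonal plane x = y.
cube = [tuple(float(v) for v in corner) for corner in product((-1, 1), repeat=3)]
side_s, in_plane, side_t = split_by_plane(cube, (0, 1, 7))
separation = (len(side_s), len(in_plane), len(side_t))
```

The same classification applied to the local (distorted) structure requires a finite tolerance, since the candidate in-plane atoms are then only approximately coplanar.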

The procedure for the computation of the CSM of environments with more than six atoms is described in algorithm 2 which is shown in Fig. 9[link]. Separation planes have been defined for all the perfect model environments above six atoms. Usually, more than one separation plane can be defined in a given model polyhedron. In practice, the overall algorithm tests all the available separation planes that have been defined for the polyhedron under consideration. The list of separation planes for each coordination environment is available as SI and is also easily viewable with a script provided in the ChemEnv subpackage of pymatgen.

[Figure 9]
Figure 9
Algorithm 2. The separation plane algorithm.

The algorithm has been optimized by ordering the points of the separation plane in a clockwise or counterclockwise direction whenever possible. This makes it possible to reduce the number of permutations related to the separation plane. For example, for the separation (3, 6, 3) of the cuboctahedron shown in Fig. 8[link](b), the number of permutations of the points in the plane is 6! = 720. Ordering the points in the perfect and local environments makes it possible to reduce the number of trials to six. A similar optimization is also possible for the two separated groups of points for the separations in which these groups contain a sufficient number of points (e.g. in the icosahedral environment, the separation plane contains four points and splits the other points into two groups of four points each).
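The effect of ordering the in-plane points can be sketched as follows, assuming that the points have already been projected onto the separation plane (two-dimensional coordinates) and that the model and local point sets are traversed in the same rotational sense. Once the points are sorted by angle around their centroid, only the cyclic shifts of that order remain to be tested. The function names are illustrative.

```python
import math


def order_around_centroid(points_2d):
    """Indices of the points sorted by polar angle around their centroid."""
    cx = sum(x for x, _ in points_2d) / len(points_2d)
    cy = sum(y for _, y in points_2d) / len(points_2d)
    return sorted(range(len(points_2d)),
                  key=lambda i: math.atan2(points_2d[i][1] - cy, points_2d[i][0] - cx))


def cyclic_shifts(ordered):
    """All cyclic rotations of an ordered list of indices."""
    return [ordered[k:] + ordered[:k] for k in range(len(ordered))]


def on_unit_circle(degrees):
    angle = math.radians(degrees)
    return (math.cos(angle), math.sin(angle))


# Six in-plane points of a regular hexagon, listed in scrambled order.
hexagon = [on_unit_circle(a) for a in (130, 10, 250, 70, 310, 190)]
order = order_around_centroid(hexagon)
trials = cyclic_shifts(order)
```

For six in-plane points this leaves six candidate correspondences instead of 6! = 720.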

2.5. Neighbor sets and distance/angle parameters maps

The distance and angle parameters defined in Section 2.2[link] are very sensitive parameters for the determination of the neighbors of a given atom. Indeed, a very slight change in one of the parameters can change the atoms considered as neighbors and hence the coordination. Each neighbor set of atom A with coordination N is denoted by ΞN, j(A). The j index comes from the fact that two different neighbor sets can have the same coordination N. A two-dimensional example of such a case is illustrated in Fig. 10[link] in which two sets of distance and angle cut-off parameters result in two different neighbor sets of the same coordination.

[Figure 10]
Figure 10
Illustration in two dimensions of two sets of neighbors having the same coordination number. (a) Local environment of atom 0. Normalized distances to neighbors 1, 2, 3 and 4 are [\overline{d^0_1} = \overline{d^0_3} = 1.0], [\overline{d^0_2}] = 1.15 and [\overline{d^0_4}] = 1.35. Normalized angles to neighbors 1, 2, 3 and 4 are [\overline{\Omega^0_{4}}] = 1.0, [\overline{\Omega^0_{1}} = \overline{\Omega^0_{3}} \approx 0.924] and [\overline{\Omega^0_{2}} \approx 0.42]. (b) Set of neighbors (1, 2 and 3) of atom 0 with N = 3. This set of neighbors is obtained with e.g. κ = 1.25 and γ = 0.3 cut-offs. (c) Another set of neighbors (1, 3 and 4) of atom 0 with N = 3. This set of neighbors is obtained with e.g. κ = 1.4 and γ = 0.5 cut-offs.
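The two neighbor sets of Fig. 10 can be reproduced from the normalized distances and angles given in the caption. The sketch below also scans a coarse grid of (κ, γ) values and groups the grid points by the neighbor set they produce, which is the idea behind the distance/angle parameter maps. The grid itself and the function names are arbitrary choices made for illustration.

```python
# Normalized distances and angles of neighbors 1-4 of atom 0, from the caption of Fig. 10.
normalized_distance = {1: 1.0, 2: 1.15, 3: 1.0, 4: 1.35}
normalized_angle = {1: 0.924, 2: 0.42, 3: 0.924, 4: 1.0}


def neighbor_set(kappa, gamma):
    """Neighbors kept for a distance cut-off kappa and an angle cut-off gamma."""
    return frozenset(n for n in normalized_distance
                     if normalized_distance[n] <= kappa and normalized_angle[n] >= gamma)


def parameter_map(kappas, gammas):
    """Group the (kappa, gamma) grid points by the neighbor set they produce."""
    regions = {}
    for kappa in kappas:
        for gamma in gammas:
            regions.setdefault(neighbor_set(kappa, gamma), []).append((kappa, gamma))
    return regions


# Grid points are offset so that none of them falls exactly on a cut-off boundary.
kappas = [1.025 + 0.05 * i for i in range(10)]
gammas = [0.05 + 0.1 * j for j in range(10)]
regions = parameter_map(kappas, gammas)
three_coordinated = {s for s in regions if len(s) == 3}
```

On this grid, two distinct neighbor sets with N = 3 appear, namely (1, 2, 3) and (1, 3, 4), in agreement with panels (b) and (c).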

In order to ensure robustness with respect to the distance and angle cut-off parameters, the identification procedure is performed in two steps. First, all sets of neighbors ΞN, j(A) are obtained for all possible distance/angle parameters in the Voronoï analysis. For each neighbor set, CSMs are computed with respect to each model polyhedron of the same coordination. This can be represented by a map of distance/angle parameters with regions defined for each neighbor set (see Fig. 11[link] for examples of such maps for Si and O sites in SiO2 as well as for Cr and Te sites in Cr2Te4O11). The second step tests the sensitivity of the identification to the distance/angle parameters by means of strategies (see Section 2.6[link]). For the first three cases in Fig. 11[link], the `correct' environment is reasonably clear by just looking at the figure [assigning tetrahedral (T:4), angular (A:2) and octahedral (O:6) environments, respectively, to Si in SiO2, O in SiO2 and Cr in Cr2Te4O11]. The situation is more complex for Te in Cr2Te4O11, for which the identification is less evident. In this case, the environment could be seen as an intermediate between two different environments. The use of strategies can clarify such ambiguous cases.

[Figure 11]
Figure 11
Examples of distance/angle parameters maps for Si and O in α-SiO2 (Materials Project id: mp-7000) and Cr and Te in Cr2Te4O11 (Materials Project id: mp-540537): (a) Si site in α-SiO2, (b) O site in α-SiO2, (c) Cr site in Cr2Te4O11 and (d) Te site in Cr2Te4O11. Each neighbor set corresponds to a region in which any combination of distance/angle parameters results in the same set. The color level of each region gives an indication of the CSM value of the model polyhedron that the corresponding neighbor set most resembles (i.e. for which the CSM is the lowest). For the larger regions, this model polyhedron is indicated by its symbol. The square and triangle symbols correspond to fixed distance/angle parameters of 1.3/0.6 and 1.6/0.4, respectively, showing a clear ambiguity for the Te site in Cr2Te4O11 (see Section 2.6[link] on how to clarify such cases).

The neighbors in each set, the CSMs for each model polyhedron in each set, and other data related to each neighbor set are stored in a so-called StructureEnvironments (see also Section 3[link]) or SE (hereafter also symbolized by ΦA for atom A) object. As exemplified in Fig. 11[link], this SE is not very useful as such as it contains a lot of information that is difficult to interpret directly. In the second step presented below, strategies are used to analyze the SE and extract usable and valuable information from the SE.

2.6. Strategies

For the final step of the identification procedure, strategies are used to reliably analyze the SE object and extract a usable and meaningful result. Reliability refers to the robustness of our algorithm, in which the sensitivity of the identification to the distance/angle parameters is tested and challenged. Hence, the local environments can be interpreted as one unique environment or as an intermediate between two (or more) coordination environments, each of which is assigned a fraction or percentage. Different strategies can be used depending on the goals, needs and constraints of the user. This flexibility provided by the strategies is one of the strengths of our identification procedure. For visualization purposes, a strategy resulting in the identification of a single coordination environment for each site has to be used, while reviewing the most commonly observed environments can be done with a strategy allowing for multiple environments for the same site. One can also favor specific or larger/smaller environments depending on the project. In the following, two strategies are developed further.

2.6.1. Fixed distance/angle cut-offs strategy

The simplest way to identify the environment is to use fixed distance and angle cut-off parameters. In this SimplestChemenvStrategy, the set of neighbors is thus unique and the environment is identified as the one for which the CSM is the lowest. The advantage of such a simple procedure is that it makes it possible to describe a local environment by its unique corresponding model environment, which is easier to use for visualization purposes. However, some (distorted or very distorted) local environments can be considered to be an intermediate between two or more model coordination polyhedra. In such cases, this strategy will simply `choose' one environment, depending on the distance and angle parameters. As a simple illustration, Fig. 12[link] shows the sudden switch from the square-pyramidal environment to the octahedral environment when the distance cut-off is increased. Similarly, for fixed distance and angle cut-offs, when an octahedron is smoothly distorted by moving away one of the atoms, the resulting environment from this simplest strategy changes abruptly from octahedral to square-pyramidal as shown in Fig. 13[link] (thin lines correspond to the SimplestChemenvStrategy). It is thus very sensitive to small changes in the positions of the atoms. Nevertheless, with reasonable distance and angle parameters (e.g. κ = 1.4 and γ = 0.3), the identified environment is reasonably correct in about 85% of the cases.

[Figure 12]
Figure 12
Coordination environments for a distorted octahedron in which the bottom atom is at a distance 1.45 times that of the other five neighbors. When the distance cut-off is lower than 1.45, the bottom atom is not considered as a neighbor and the environment is identified as a square pyramid. When the distance cut-off is larger than 1.45, the bottom atom is taken into account and the environment is identified as an octahedron.

[Figure 13]
Figure 13
Smooth distortion from octahedral to square-pyramidal environment by moving away the bottom atom. The deformation parameter α = 0 corresponds to the perfect octahedron while for α = 1, the bottom atom has been moved to a distance that is twice the distance to the other neighbors. The thin lines give the fractions of octahedral and square-pyramidal environments obtained with the SimplestChemenvStrategy (with a distance cut-off of 1.4) while the thick lines correspond to the fractions obtained with the MultiWeightsChemenvStrategy. Octahedral and square-pyramidal are respectively shown as solid and dashed lines.

Another illustration of this strategy is shown in Fig. 14[link] in which a triangular prism is smoothly distorted towards an octahedron by rotating the upper and lower triangular planes in opposite directions (thin lines correspond to the SimplestChemenvStrategy). In this case, the number of neighbors remains the same while the identified environment switches abruptly from triangular prismatic to octahedral when the CSM of the latter becomes smaller than that of the former. Once again, the sensitivity to small changes in the positions of the atoms is the critical weakness of this strategy.

[Figure 14]
Figure 14
Smooth distortion from triangular prismatic to octahedral environment by twisting the triangular prism around the principal axis. The deformation parameter α = 0 corresponds to the perfect trigonal prism while for α = 1, the upper (red → orange) and lower (green → cyan) triangles have been rotated respectively clockwise and counterclockwise by 30°, corresponding to an octahedron. The thin lines give the fractions of triangular prismatic and octahedral environments obtained with the SimplestChemenvStrategy while the thick lines correspond to the fractions obtained with the MultiWeightsChemenvStrategy. Octahedral and triangular prismatic are respectively shown as solid and dashed lines.

2.6.2. Strategy based on multiple weights

A second strategy is developed hereafter, in which special care has been taken to remove the artificial abrupt transitions observed with the SimplestChemenvStrategy. The idea is to smooth these transitions using a combination of smooth step functions. A given local environment can thus be identified either as one unique coordination environment if distortions are small, or as a mix of two or more environments for larger distortions. In practice, the local environment is described as a list of environments, each being assigned a fraction or percentage.

The percentage or fraction [f_{\varepsilon}(A)] of a given model coordination environment [\varepsilon] depends on the results (CSMs, Voronoï parameters, …) for each possible set of neighbors contained in [\Phi_A].

[f_{\varepsilon}(A) = f[\Phi_A](\varepsilon) \eqno(2)]

The fraction of a model polyhedron [\varepsilon] for a given local environment is obtained as the product of two terms. Suppose [\varepsilon] occurs in a given neighbor set [\Xi_A^N]. The first term results from the relative weight of the neighbor set displaying model environment [\varepsilon], as compared to the other neighbor sets. The second term comes from the relative weight of the model polyhedron within that specific neighbor set.

[f[\Phi_A](\varepsilon) = f^{{\rm outer}}[\Phi_A]\times f^{{\rm inner}}[\Xi_A^N](\varepsilon) \eqno(3)]

In the following, the first term is referred to as the outer weight (i.e. the weight that depends on other so-called outer neighbor sets) and the second term is referred to as the inner weight (i.e. the weight inside a specific neighbor set).

Inner weight. For a given neighbor set [\Xi_A^N] of atom A in a given coordination N, the relative weight (and hence fraction) of each model polyhedron is not straightforward to define. Let [\Theta^N] be the set of K model environments with coordination N:

[\Theta^{N} = \lbrace\varepsilon_1^N,\ldots,\varepsilon_i^N,\ldots,\varepsilon_K^N\rbrace \eqno(4)]

For example, the set [\Theta^6] of six-coordinated model polyhedra [as reported by Hartshorn et al. (2007[Hartshorn, R. M., Hey-Hawkins, E., Kalio, R. & Leigh, G. J. (2007). Pure Appl. Chem. 79, 1779-1799.]) and Lima-de-Faria et al. (1990[Lima-de-Faria, J., Hellner, E., Liebau, F., Makovicky, E. & Parthé, E. (1990). Acta Cryst. A46, 1-11.]) and implemented in the ChemEnv package] is composed of the octahedron (symbolized O:6), the trigonal prism (symbolized T:6) and the pentagonal pyramid (symbolized PP:6).

For each model polyhedron [\varepsilon_i^N], the CSM [S_{\varepsilon_i^N}[\Xi_A^N]] with respect to the local environment [\Xi_A^N] is used to assign a weight to that model polyhedron by means of a suitably shaped function. Model environments with a lower CSM (i.e. more similar to the local environment) are assigned a larger weight. In particular, if one of the model environments has a CSM of 0.0 (i.e. the local environment is perfect), its weight should be infinite so that it is the only model environment identified. The function should also allow for the assignment of a zero weight to a model polyhedron for which the CSM is larger than a given maximum value [S^{\rm max}]. One example of such a function is the `modified' inverse function defined in equation (5[link]) and shown in Fig. 15[link].

[w_{S^{\rm max}}(S) = \left\{\matrix {1/S^{\rm max}\times{{(S-S^{\rm max})^2} \over {S}} &{\rm if }\,\, S \leq S^{\rm max},\cr \ 0.0 & {\rm if }\,\, S\,\, \gt\,\, S^{\rm max}.\hfill}\right. \eqno(5)]

in which the numerator [(S-S^{\rm max})^2] ensures continuity at [S = S^{\rm max}], while the prefactor [1/S^{\rm max}] arises from the normalization of the interval between 0 and [S^{\rm max}] to the [0, 1] interval.

[Figure 15]
Figure 15
Weight function for the inner weight of model polyhedra. In this example, [S^{\rm max}] is set to 8.0, so that the weight of any model polyhedron with a CSM larger than 8.0 is zero.

Fractions of each model environment [\varepsilon_i] are then obtained from these weights using equation (6[link]):

[f^{{\rm inner}}[\Xi_A^N](\varepsilon_i) = {{w_{S^{{\rm max}}}(S_{\varepsilon_i})} \over {\sum_{j = 1}^{j = K}w_{S^{{\rm max}}}(S_{\varepsilon_j})}} \eqno(6)]

A small example is also given in Fig. 15[link] in which CSMs for a fictitious six-coordinated case are provided.
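Equations (5) and (6) can be sketched in a few lines of Python. The function names and the CSM values below are illustrative assumptions (they are not the values of Fig. 15 and not the ChemEnv API); a CSM of exactly 0.0 is treated as an infinite weight, as required by the text.

```python
import math


def inner_weight(csm, s_max=8.0):
    """`Modified' inverse function of equation (5)."""
    if csm > s_max:
        return 0.0
    if csm == 0.0:
        # Perfect environment: infinite weight.
        return math.inf
    return (csm - s_max) ** 2 / (s_max * csm)


def inner_fractions(csms, s_max=8.0):
    """Fractions of equation (6) for a dict {symbol: CSM}."""
    weights = {sym: inner_weight(s, s_max) for sym, s in csms.items()}
    perfect = [sym for sym, w in weights.items() if math.isinf(w)]
    if perfect:
        # Only the perfect environment(s) are identified.
        return {sym: (1.0 / len(perfect) if sym in perfect else 0.0)
                for sym in weights}
    total = sum(weights.values())
    if total == 0.0:
        # No model environment below s_max (choice made for this sketch).
        return {sym: 0.0 for sym in weights}
    return {sym: w / total for sym, w in weights.items()}


# Fictitious six-coordinated case (illustrative CSM values).
fractions = inner_fractions({"O:6": 1.0, "T:6": 4.0, "PP:6": 10.0})
for symbol, fraction in fractions.items():
    print(f"{symbol}: {100 * fraction:.1f}%")
```

With these values the weights are 6.125 for O:6, 0.5 for T:6 and 0.0 for PP:6, so the octahedron receives about 92% and the trigonal prism about 8%.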

When the coordination is clearly defined (i.e. when only one neighbor set is identified using the procedure outlined in Section 2.5), the fractions of each model polyhedron are solely determined by this inner weight. On the other hand, when different neighbor sets are identified, an additional complexity arises from the fact that smaller environments usually tend to be more easily recognized as similar (i.e. they have smaller CSMs). The extreme case is the single neighbor, which is always assigned a CSM of zero. For cases in which more than one neighbor set is present, the outer weight is used to determine the relative predominance of each of the neighbor sets (and hence of their corresponding model polyhedra).

Outer weight. The outer weight or neighbor set weight refers to the weight of a given neighbor set with respect to the other neighbor sets. This outer weight is defined as a product of several `partial weights' (the definition being general enough to allow for flexibility in the choice of the weights):

[w^{{\rm outer}}[\Psi_A](\Xi_A) = \prod_{i = 1}^{i = n_w}\widehat{w^i}[\Psi_A](\Xi_A) \eqno(7)]

in which [n_w] is the number of partial weights used.

Some of the partial neighbor set weights compare the CSMs of this neighbor set with the ones for the other neighbor sets. The simplest approach is to take the smallest CSM for each of the neighbor sets. In practice, to ensure continuity, an effective CSM is defined. The effective CSM of a given neighbor set [\Xi_A^N], denoted [S_{{\rm eff}}(\Xi_A^N)], is obtained from a weighted average using the `modified' inverse function defined in equation (5[link])

[S_{{\rm eff}}(\Xi_A^N) = {{\sum\limits_{\varepsilon\in\Theta^N}w_{S^{{\rm max}}}(S_{\varepsilon})\times S_{\varepsilon}} \over {\sum\limits_{\varepsilon\in\Theta^N}w_{S^{{\rm max}}}(S_{\varepsilon})}} \eqno(8)]

in which [S_{\varepsilon}] is a short form for [S_{\varepsilon}(\Xi_A^N)], i.e. the CSM of the neighbor set with respect to the perfect environment [\varepsilon].
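A minimal sketch of the effective CSM of equation (8) is given here, reusing the `modified' inverse function of equation (5). The function names are hypothetical, and the handling of the two limiting cases (a perfect environment, or no model environment below the maximum CSM) is an assumption of this sketch rather than a description of the ChemEnv implementation.

```python
import math


def inner_weight(csm, s_max=8.0):
    """`Modified' inverse function of equation (5)."""
    if csm > s_max:
        return 0.0
    if csm == 0.0:
        return math.inf
    return (csm - s_max) ** 2 / (s_max * csm)


def effective_csm(csms, s_max=8.0):
    """Weighted average of equation (8) over the CSMs of one neighbor set."""
    weights = [inner_weight(s, s_max) for s in csms]
    if any(math.isinf(w) for w in weights):
        # A perfect model environment dominates the average (assumed limit).
        return 0.0
    total = sum(weights)
    if total == 0.0:
        # All CSMs exceed s_max: the average is undefined (assumed convention).
        return None
    return sum(w * s for w, s in zip(weights, csms)) / total


# Illustrative CSMs of one neighbor set with respect to three model polyhedra.
print(effective_csm([1.0, 4.0, 10.0]))
```

Because the weights favor small CSMs, the effective CSM stays close to the smallest CSM of the neighbor set while varying continuously when two model environments have comparable CSMs.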

Partial weights. In the following, the partial weights used in the `default' multi-weights strategy [used in a previous publication (Waroquiers et al., 2017[Waroquiers, D., Gonze, X., Rignanese, G.-M., Welker-Nieuwoudt, C., Rosowski, F., Göbel, M., Schenk, S., Degelmann, P., André, R., Glaum, R. & Hautier, G. (2017). Chem. Mater. 29, 8346-8360.])] are described. The strategy with these default parameters is easily obtained with the following class method (see examples in the tutorials provided in the supplementary material):

[\tt MultiWeightsChemenvStrategy.stats\_article\_weights\_parameters()]

Other weights have also been implemented in the ChemEnv package in pymatgen.

`Distance–angle area' weight. The idea is to restrict the neighbor sets to those originating from a specific range of values for the distance and angle cutoffs. For example, one might only consider distance cutoffs between 1.2 and 1.8. One might also consider that the Voronoï angle towards a neighbor should always be between 0.3 and 0.8. In practice, a special area of distance–angle parameters is defined such as the one shown in Fig. 16[link]. Indeed, it makes little sense to allow for neighbors with a small angle parameter and a small distance parameter or with a large angle parameter and a large distance parameter. If the region of a given neighbor set (as defined in Section 2.5[link]) crosses the above-mentioned area, the weight of this neighbor set is 1.0 (indicated in white on Fig. 16[link]), otherwise it is set to 0.0. A possible extension would be to make this weight continuous.

[Figure 16]
Figure 16
Schematic of the distance–angle area weight. The shaded area is used to determine which neighbor sets are considered. If the region of a given neighbor set is crossing the shaded area, the set is assigned a `distance–angle area' weight of 1.0. In the opposite case, the set is assigned a weight of 0.0 (white regions).
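The binary character of this weight can be illustrated with a simplified sketch. It assumes that both the region of a neighbor set and the allowed area are plain rectangles in the distance–angle parameter plane, using the example bounds quoted in the text (1.2 to 1.8 and 0.3 to 0.8); the special area of Fig. 16 is not a simple rectangle, and the function names are hypothetical.

```python
def intervals_overlap(lo1, hi1, lo2, hi2):
    """Return True if the closed intervals [lo1, hi1] and [lo2, hi2] overlap."""
    return lo1 <= hi2 and lo2 <= hi1


def distance_angle_area_weight(region, area=((1.2, 1.8), (0.3, 0.8))):
    """Return 1.0 if the neighbor-set region crosses the allowed area, else 0.0.

    Both arguments are ((distance_min, distance_max), (angle_min, angle_max)).
    """
    (d_lo, d_hi), (a_lo, a_hi) = region
    (area_d_lo, area_d_hi), (area_a_lo, area_a_hi) = area
    crosses = (intervals_overlap(d_lo, d_hi, area_d_lo, area_d_hi)
               and intervals_overlap(a_lo, a_hi, area_a_lo, area_a_hi))
    return 1.0 if crosses else 0.0


print(distance_angle_area_weight(((1.0, 1.3), (0.2, 0.5))))  # crosses the area
print(distance_angle_area_weight(((1.0, 1.1), (0.2, 0.5))))  # distances too short
```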

`Self CSM' weight. This weight makes use of the effective CSM [S_{\rm eff}] of each neighbor set defined in equation (8[link]). Each neighbor set is assigned a weight depending on the value of its effective CSM. The idea is to disfavor neighbor sets that are globally more distorted than others. One example function used to estimate this weight is defined in equation (9[link]) and shown in Fig. 17[link].

[w_{S^{\rm max}, \lambda}(S_{\rm eff}) = \left\{\matrix { (\overline{S_{\rm eff}}-1.0)^2\times{\rm e}^{-\lambda \overline{S_{\rm eff}}} & {\rm if }\,\, \overline{S_{{\rm eff}}} \leq 1.0, \cr 0.0 & {\rm if }\,\, \overline{S_{\rm eff}} \,\,\gt \,\,1.0.\hfill}\right. \eqno(9)]

where [\overline{S_{{\rm eff}}}] is the normalized effective CSM defined as [{{S_{{\rm eff}}} \over {S_{{\rm max}}}}].

[Figure 17]
Figure 17
Weight function for the Self CSM outer weight of neighbor sets as defined in equation (9[link]). The default parameters for this weight are shown in blue, while the green and purple curves illustrate other parameters. Arrows indicate the threshold values (i.e. [S^{\rm max}]) of the effective CSM [S_{\rm eff}] above which each of the weight functions is set to zero.
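Equation (9) translates directly into a short function. The parameter values used as defaults below (a maximum CSM of 8.0 and λ = 1.0) are illustrative assumptions for this sketch and are not claimed to be the defaults of the package; the function name is hypothetical.

```python
import math


def self_csm_weight(s_eff, s_max=8.0, lam=1.0):
    """Self CSM weight of equation (9) for an effective CSM s_eff."""
    s_bar = s_eff / s_max  # normalized effective CSM
    if s_bar > 1.0:
        return 0.0
    return (s_bar - 1.0) ** 2 * math.exp(-lam * s_bar)


for s_eff in (0.0, 2.0, 4.0, 8.0, 10.0):
    print(f"S_eff = {s_eff:4.1f} -> weight = {self_csm_weight(s_eff):.4f}")
```

The weight equals 1.0 for an undistorted neighbor set, decreases monotonically with the effective CSM and vanishes continuously at the maximum CSM, so that strongly distorted neighbor sets are progressively disfavored.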

`Delta CSM' weight. The goal of this neighbor set weight is to reduce the importance of a given neighbor set [\Xi^{N_1}] if another neighbor set [\Xi^{N_2}] of larger coordination number [N_2 \gt N_1] is present and not too distorted with respect to the first one. In practice, this weight depends on the difference [\Delta S_{\rm eff}] between the effective CSMs [as defined in equation (8[link])] of the neighbor sets [\Xi^{N_2}] and [\Xi^{N_1}]:

[\Delta S_{{\rm eff}}(\Xi^{N_1}, \Xi^{N_2}) = S_{{\rm eff}}(\Xi^{N_2}) - S_{{\rm eff}}(\Xi^{N_1}) \eqno(10)]

The Delta CSM weight is defined as:

[w^{\delta}_{\chi; \Delta^{\rm min},\Delta^{\rm max}}[\Phi_A](\Xi_A) \!=\! \min_{{\Xi_i\in\Phi_A; \,N(\Xi_i) \gt N(\Xi_A)}}\!\chi_{\Delta^{\rm min},\Delta^{\rm max}} \left(\Delta S_{{\rm eff}}(\Xi _{A}, \!\Xi_{i})\right) \eqno(11)]

in which χ is a sigmoid-like function (e.g. a smooth step or smoother step function), N(Ξ) is the coordination of neighbor set Ξ and Δmin, Δmax are the edges used in the χ function.

An example of a χ function is the smoother step function shown in Fig. 18[link] and defined as:

[\chi^{smootherstep}_{a,b}(x) = \left\{\matrix {0.0 &\!\!\!\!\!\!\!\!\!\! {\rm if }\,\, x \leq a, \cr 6\overline{x}^5-15\overline{x}^4+10\overline{x}^3 & \,\,{\rm if }\,\, a\,\, \lt \,\,x \,\,\lt \,\,b, \cr 1.0 &\,\, {\rm if }\,\, x \geq b.\hfill}\right. \eqno(12)]

in which [\overline{x} = (x-a)/(b-a)] is the scaled value of x mapping the [a,b] interval to the [0, 1] interval.

[Figure 18]
Figure 18
Smoother step function used in the Delta CSM weight. The `Delta CSM' weight assigned to the Ξ1 neighbor set is equal to 0.0 if the difference ΔSeff(Ξ1, Ξ2) between the effective CSM Seff(Ξ2) of the Ξ2 neighbor set and its own effective CSM Seff(Ξ1) is lower than Δmin. If the difference ΔSeff(Ξ1, Ξ2) is larger than Δmax, the Ξ1 set is assigned a weight of 1.0. The smoother step function is used between these two extremes. The Δmin and Δmax values can be changed if needed and examples of smoother step functions for different values are shown.
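Equations (10) to (12) can be sketched as follows. The edges used as defaults (0.5 and 3.0) are illustrative assumptions, as are the function names and the representation of a neighbor set by a pair (coordination number, effective CSM).

```python
def smootherstep(x, a, b):
    """Smoother step function of equation (12) with edges a < b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    t = (x - a) / (b - a)  # scaled value mapping [a, b] to [0, 1]
    return 6 * t**5 - 15 * t**4 + 10 * t**3


def delta_csm_weight(n_own, s_eff_own, other_sets, delta_min=0.5, delta_max=3.0):
    """Delta CSM weight of equation (11).

    other_sets is an iterable of (coordination number, effective CSM) pairs.
    Only neighbor sets with a larger coordination number are compared; if
    there is none, the weight is 1.0.
    """
    larger = [s_eff for n, s_eff in other_sets if n > n_own]
    if not larger:
        return 1.0
    return min(smootherstep(s_eff - s_eff_own, delta_min, delta_max)
               for s_eff in larger)


# Five-coordinated set (effective CSM 0.0) next to a six-coordinated set.
print(delta_csm_weight(5, 0.0, [(6, 0.2)]))   # larger set barely distorted
print(delta_csm_weight(5, 0.0, [(6, 1.75)]))  # intermediate case
print(delta_csm_weight(5, 0.0, [(6, 5.0)]))   # larger set strongly distorted
```

Taking the minimum over all larger neighbor sets means that a single well-behaved larger set is enough to suppress the smaller one.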

Choice of partial weights. The default list of outer weights consists of the three above-mentioned partial weights. As an example and in particular to illustrate the need to use both the Self CSM weight and the Delta CSM weight, Fig. 19[link] shows the fractions of environments obtained for different choices of weights in the case of the smooth distortion from octahedral to square-pyramidal environment (see Fig. 13[link]).

[Figure 19]
Figure 19
Choice of partial weights: comparison and combination of Self CSM and Delta CSM weights in the case of the smooth distortion from octahedral to square-pyramidal environment. Curves in blue (green) correspond to the octahedral (square-pyramidal) environment. See text for details.

The upper left panel shows the CSMs of the octahedral environment (increasing with the distortion) and of the square-pyramidal environment (always equal to 0.0). The middle left and lower left panels show the Self CSM and Delta CSM weights for both environments. The Self CSM weight for the square-pyramidal environment is always 1.0 as its CSM is always 0.0. Conversely, the Delta CSM weight for the octahedral environment is always 1.0 as there is no larger neighbor set to be compared to. As shown in the upper right panel, when only the Self CSM weight is included, the fractions obtained are 50% octahedral and 50% square-pyramidal when no or little distortion is applied (while one would expect 100% octahedral and 0% square-pyramidal). Indeed, for both environments, the value of the CSM is 0.0 and hence the Self CSM weight is 1.0. In contrast, the middle right panel illustrates the fractions obtained when only the Delta CSM weight is included. In that case, for large distortions, the fractions obtained are also 50% for each environment. Indeed, when the distortion is large, the Delta CSM weight for the square-pyramidal environment reaches 1.0 as the larger environment is too distorted to disfavor the square-pyramidal environment. The lower right panel illustrates the case when both the Self CSM and Delta CSM weights are included.
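The behavior described for Fig. 19 can be checked with a simplified sketch of the octahedral versus square-pyramidal case. Several assumptions are made: each neighbor set is represented by a single model environment (so that its effective CSM is simply its CSM and its inner fraction is 1), the outer fraction is taken as the outer weight normalized over both neighbor sets, and the parameter values are illustrative. All names are hypothetical.

```python
import math

S_MAX = 8.0                      # illustrative maximum CSM
LAMBDA = 1.0                     # illustrative decay parameter
DELTA_MIN, DELTA_MAX = 0.5, 3.0  # illustrative edges of the step function


def self_csm_weight(s_eff):
    """Self CSM weight of equation (9)."""
    s_bar = s_eff / S_MAX
    if s_bar > 1.0:
        return 0.0
    return (s_bar - 1.0) ** 2 * math.exp(-LAMBDA * s_bar)


def smootherstep(x, a, b):
    """Smoother step function of equation (12)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    t = (x - a) / (b - a)
    return 6 * t**5 - 15 * t**4 + 10 * t**3


def fractions(csm_oct, use_self=True, use_delta=True):
    """Return the (octahedral, square-pyramidal) fractions.

    The square pyramid is assumed perfect (CSM of 0.0) and the octahedron
    is the only neighbor set with a larger coordination number.
    """
    w_oct, w_spy = 1.0, 1.0
    if use_self:
        w_oct *= self_csm_weight(csm_oct)
        w_spy *= self_csm_weight(0.0)
    if use_delta:
        # No set is larger than the octahedron: its Delta CSM weight is 1.0.
        w_spy *= smootherstep(csm_oct - 0.0, DELTA_MIN, DELTA_MAX)
    total = w_oct + w_spy
    return w_oct / total, w_spy / total


print(fractions(0.0, use_delta=False))  # Self CSM only, no distortion
print(fractions(10.0, use_self=False))  # Delta CSM only, large distortion
print(fractions(0.0), fractions(10.0))  # both weights combined
```

Under these assumptions, each weight taken alone yields an equal split in one of the two limits, whereas the product of both weights gives a purely octahedral result without distortion and a purely square-pyramidal result for a large distortion.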

3. Description of the package

The ChemEnv module is written in Python and can be found in the pymatgen package (Ong et al., 2013[Ong, S. P., Richards, W. D., Jain, A., Hautier, G., Kocher, M., Cholia, S., Gunter, D., Chevrier, V. L., Persson, K. A. & Ceder, G. (2013). Comput. Mater. Sci. 68, 314-319.]) as part of the analysis submodule. The organization of the package is shown diagrammatically in Fig. 20[link]. The description of each of the objects referenced as circled numbers in this figure is given hereafter:

[Figure 20]
Figure 20
Organization of the ChemEnv package. Directories are indicated in black and surrounded by a rectangle. Files are indicated in typewriter (blue for python files, purple for other files). The most important python objects are indicated in italic (green). See text for more information.

[\bigcirc{\scriptstyle \kern-.65em 1}\ ] LocalGeometryFinder

Main class used to identify the local environments in a structure.

[\bigcirc{\scriptstyle \kern-.65em 2}\ ] AllCoordinationGeometries

Class containing the list of all the available model coordination geometries (as CoordinationGeometry objects, see [\bigcirc{\scriptstyle \kern-.65em 3}\ ]).

[\bigcirc{\scriptstyle \kern-.65em 3}\ ] CoordinationGeometry

Generic class for the description of all the model coordination geometries. An instance of this class is created for each model environment (from the json files stored in the coordination-geometry-files directory). It contains information about its perfect coordinates as well as its edges and faces, name(s), symbol(s), technical details for the identification procedure, etc.

[\bigcirc{\scriptstyle \kern-.65em 4}\ ] StructureEnvironments

Class containing the information (CSMs, neighbors, …) on all possible neighbor sets for all sites in the structure as introduced in Section 2.5[link]. This object is meant to be post-processed with a strategy in order to get relevant and usable data about the local environments of the structure.

[\bigcirc{\scriptstyle \kern-.65em 5}\ ] LightStructureEnvironments

Class containing the processed data from the StructureEnvironments class using one strategy. This object lists the environment(s) and their corresponding fractions (in case of a strategy allowing for mixtures of environments) for each site of the structure.

[\bigcirc{\scriptstyle \kern-.65em 6}\ ] DetailedVoronoiContainer

Class containing the information on the Voronoï analysis (see Section 2.2[link]) performed at the beginning of the identification procedure in order to define the different possible neighbor sets.

[\bigcirc{\scriptstyle \kern-.65em 7}\ ] SimplestChemenvStrategy

Class used to apply the fixed distance/angle cutoff strategy introduced in Section 2.6.1[link].

[\bigcirc{\scriptstyle \kern-.65em 8}\ ] MultiWeightsChemenvStrategy

Class used to apply the strategy based on multiple weights as introduced in Section 2.6.2[link].

The most relevant objects needed by users of the ChemEnv package are illustrated in Fig. 21[link].

[Figure 21]
Figure 21
Main objects of the ChemEnv package.

The LocalGeometryFinder object is the main class used to initialize and set up the structure as well as to compute the StructureEnvironments object (containing the raw coordination environments data as introduced in Section 2.5[link]). Combining this StructureEnvironments object with a strategy (e.g. SimplestChemenvStrategy or MultiWeightsChemenvStrategy) leads to the LightStructureEnvironments object. This latter object contains the usable information about the environments in a structure, i.e. the environment or mix of environments (with their corresponding fractions) that is identified for each site.

4. Interactive web app

An interactive web app has been developed as part of the Materials Project's Crystal Toolkit platform to improve the accessibility of the ChemEnv algorithms. While the Python interface is intuitive and well documented, not all scientists are Python users, and the web app enables use of ChemEnv by any user without installing custom software. The web app supports uploading of any file format supported by the pymatgen code, including the Crystallographic Information File (CIF) format. Alternatively, structures can be loaded directly from the Materials Project database containing more than 100 000 inorganic materials.

The web app is designed to offer one-to-one equivalent functionality to ChemEnv by directly calling the corresponding pymatgen interface, specifically using the LightStructureEnvironments and SimplestChemenvStrategy classes, and allowing the user full interactive control over the distance and angle cut-offs. Each symmetrically distinct chemical environment is shown in 3D using a custom atomic visualizer, along with its Wyckoff label, IUPAC symbol, CSM and human-readable environment label. Oxidation states are used in the analysis if atoms are appropriately annotated in the uploaded file; if they are not supplied, oxidation states can be guessed on the fly using pymatgen's bond valence analysis algorithms. The web app is hosted by the Materials Project and is available at http://crystaltoolkit.org.

5. Conclusion

We have developed a tool that can analyze coordination or local environments of large numbers of crystal structures in a fast and robust manner. The analysis of the neighboring atoms relies on a modified Voronoï approach based on a grid of distance and angle cut-offs. From this grid, the coordination environments are determined with the help of a similarity metric to the shape of ideal polyhedra. Two different strategies are implemented to arrive at the final assignment of the coordination environments. One of these strategies is especially robust against small distortions of the crystal structures, making the algorithm particularly useful for automatic, unsupervised, local environment assignment. This new tool can be used as part of the open-source Python library (pymatgen) and within an interactive web app available on http://crystaltoolkit.org through the Materials Project.

6. Supporting information

A tutorial for the ChemEnv package, in both pdf and jupyter-notebook format, is available in the supporting information. The list of all environments as well as some details about the implementation are also available in the supporting information.

Footnotes

Present address: Matgenix SRL, Rue Armand Bury 185, B-6534 Gozée, Belgium (david.waroquiers@matgenix.com)

Funding information

Funding for this research was provided by: European Union's Horizon 2020, Marie Skłodowska-Curie grant (grant No. 837910 to Janine George); US Department of Energy, Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division [contract No. DE-AC02-05CH11231 (Materials Project program KC23MP) to Matthew Horton and Kristin A. Persson].

References

Allmann, R. & Hinek, R. (2007). Acta Cryst. 63, 412–417.
Bergerhoff, G. & Brown, I. D. (1987). Crystallographic Databases, 360, 77–95.
Brunner, G. O. & Schwarzenbach, D. (1971). Z. Kristallogr. 133, 127–133.
Carter, F. L. (1978). Acta Cryst. B34, 2962–2966.
Frank, F. C. & Kasper, J. S. (1958). Acta Cryst. 11, 184–190.
George, J., Waroquiers, D., Di Stefano, D., Petretto, G., Rignanese, G.-M. & Hautier, G. (2020). Angew. Chem. Int. Ed. 59, 7569–7575.
Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. (2016). Acta Cryst. B72, 171–179.
Guńka, P. A. & Zachara, J. (2019). Acta Cryst. B75, 86–96.
Hartshorn, R. M., Hey-Hawkins, E., Kalio, R. & Leigh, G. J. (2007). Pure Appl. Chem. 79, 1779–1799.
Hoffmann, R. (1987). Angew. Chem. Int. Ed. Engl. 26, 846–878.
Hoffmann, R. (1988). Solids and Surfaces: A Chemist's View of Bonding in Extended Structures. New York: VCH Publishers.
Hoppe, R. (1979). Z. Kristallogr. 150, 23–52.
Jain, A., Hautier, G., Ong, S. P. & Persson, K. (2016). J. Mater. Res. 31, 977–994.
Jain, A., Ong, S. P., Hautier, G., Chen, W., Richards, W. D., Dacek, S., Cholia, S., Gunter, D., Skinner, D., Ceder, G. & Persson, K. A. (2013). APL Mater. 1, 011002.
Kabsch, W. (1976). Acta Cryst. A32, 922–923.
Kabsch, W. (1978). Acta Cryst. A34, 827–828.
Lima-de-Faria, J., Hellner, E., Liebau, F., Makovicky, E. & Parthé, E. (1990). Acta Cryst. A46, 1–11.
Lueken, H. (2013). Magnetochemie: Eine Einführung in Theorie und Anwendung. Teubner Studienbücher Chemie. Vieweg+Teubner Verlag.
Maiseli, B., Gu, Y. & Gao, H. (2017). J. Visual Commun. Image Represent. 46, 95–106.
Müller, U. (2007). Inorganic Structural Chemistry. Wiley.
O'Keeffe, M. (1979). Acta Cryst. A35, 772–775.
Ong, S. P., Richards, W. D., Jain, A., Hautier, G., Kocher, M., Cholia, S., Gunter, D., Chevrier, V. L., Persson, K. A. & Ceder, G. (2013). Comput. Mater. Sci. 68, 314–319.
Pauling, L. (1929). J. Am. Chem. Soc. 51, 1010–1026.
Peng, H., Ndione, P. F., Ginley, D. S., Zakutayev, A. & Lany, S. (2015). Phys. Rev. X, 5, 021016.
Pfeiffer, P. (1915). Z. Anorg. Allg. Chem. 92, 376–380.
Pfeiffer, P. (1916). Z. Anorg. Allg. Chem. 97, 161–174.
Pinsky, M. & Avnir, D. (1998). Inorg. Chem. 37, 5575–5582.
Pomerleau, F., Colas, F. & Siegwart, R. (2015). Found. Trends Robot. 1, 1–104.
Stoiber, D. & Niewa, R. (2019). Z. Kristallogr. 234, 201–209.
Villars, P. & Cenzual, K. (2018). Pearson's Crystal Data: Crystal Structure Database for Inorganic Compounds. Release 2018/19 (on DVD). ASM International, Materials Park, Ohio, USA.
Waroquiers, D., Gonze, X., Rignanese, G.-M., Welker-Nieuwoudt, C., Rosowski, F., Göbel, M., Schenk, S., Degelmann, P., André, R., Glaum, R. & Hautier, G. (2017). Chem. Mater. 29, 8346–8360.
Zagorac, D., Müller, H., Ruehl, S., Zagorac, J. & Rehme, S. (2019). J. Appl. Cryst. 52, 918–925.
Zimmermann, N. E. R., Horton, M. K., Jain, A. & Haranczyk, M. (2017). Front. Mater. 4, 34.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
