On the combinatorics of crystal structures. II. Number of Wyckoff sequences of a given subdivision complexity

Hornfeck, W.; Červený, K.

doi:10.1107/S2053273323002437

crystal lattices

FOUNDATIONS
ADVANCES

ISSN: 2053-2733

Volume 79| Part 3| May 2023| Pages 280-294

https://doi.org/10.1107/S2053273323002437

Open

access

On the combinatorics of crystal structures. II. Number of Wyckoff sequences of a given subdivision complexity

Wolfgang Hornfeck ^a ^* and Kamil Červený ^a

^aFZU – Institute of Physics of the Czech Academy of Sciences, Na Slovance 1999/2, 182 00 Prague 8, Czechia
^*Correspondence e-mail: hornfeck@fzu.cz

Edited by M. I. Aroyo, Universidad del País Vasco, Spain (Received 1 July 2022; accepted 13 March 2023; online 11 May 2023)

Wyckoff sequences are a way of encoding combinatorial information about crystal structures of a given symmetry. In particular, they offer an easy access to the calculation of a crystal structure's combinatorial, coordinational and configurational complexity, taking into account the individual multiplicities (combinatorial degrees of freedom) and arities (coordinational degrees of freedom) associated with each Wyckoff position. However, distinct Wyckoff sequences can yield the same total numbers of combinatorial and coordinational degrees of freedom. In this case, they share the same value for their Shannon entropy based subdivision complexity. The enumeration of Wyckoff sequences with this property is a combinatorial problem solved in this work, first in the general case of fixed subdivision complexity but non-specified Wyckoff sequence length, and second for the restricted case of Wyckoff sequences of both fixed subdivision complexity and fixed Wyckoff sequence length. The combinatorial results are accompanied by calculations of the combinatorial, coordinational, configurational and subdivision complexities, performed on Wyckoff sequences representing actual crystal structures.

Keywords: Wyckoff sequences; combinatorics; Shannon entropy; structural complexity.

1. Introduction

Any standardized crystal structure can be conveniently related to a descriptor uniquely encoding its combinatorial properties, namely its Wyckoff sequence: a string composed of the space group type number (sometimes the Hermann–Mauguin symbol is used instead) and followed by all the Wyckoff letters for each partially or fully occupied Wyckoff position in the crystal structure; the letters are put in reverse alphabetic order and augmented by their superscripted frequency of occurrence, in case a certain non-fixed Wyckoff position is occupied multiple times.

Standardization is necessary because many crystal structures do have equivalent descriptions in terms of their unit cell and atomic coordinates, depending either on matters of possible unit-cell choices or the symmetry properties of space groups [keywords: symmetry of symmetry, Cheshire groups, Euclidean normalizers; see Müller (2013 ), ch. 8]. A comprehensive scheme for crystal structure standardization has been theoretically developed by Parthé and coworkers (TYPIX) and implemented into the software STRUCTURE TIDY (Parthé & Gelato, 1984 , 1985 ; Gelato & Parthé, 1987 ; Parthé et al., 1993a ,b ). The uniqueness of the Wyckoff sequence thus depends on and follows from the uniqueness of the standardization.

The Wyckoff positions of a space group encompass all possible distinct sets of symmetry-equivalent sites within a unit cell. Or, to put it in rigorous mathematical terms: a Wyckoff position of a space group $[{\cal G}]$ consists of all points X for which the site-symmetry groups $[{\cal S}(X)]$ are conjugate subgroups of $[{\cal G}]$ (see Hahn, 2005 , ch. 8.3.2, p. 733; Aroyo, 2016 , ch. 1.4.4.2, p. 62).

It is important to distinguish the notions of space group and space-group type. A space group encompasses all of the symmetry of an actual crystal structure, including its translation symmetry, specified by its lattice parameters. A space-group type is an abstract notion comprising all those, otherwise distinct, space groups which share the same representation by matrix-column pairs of symmetry operations (Nespolo et al., 2018 ). This is of special importance in the study of group–subgroup relations, in which a group and one of its proper subgroups can be of the same space-group type, but not of the same space group. Another consequence is that the number of space groups is infinite, whereas the number of space-group types is finite: 230 distinct ones in three dimensions, including 11 enantiomorphous types. In particular, the Hermann–Mauguin symbol represents the space-group type, while concepts related to the Wyckoff positions, such as the Wyckoff sequences studied in this work, are related to the space group, in the sense that they are inseparably connected with the coordinate description of crystal structures. However, the number of distinct Wyckoff positions is considered to be finite, with a total of 1731 positions for all space groups, thus allowing for the combinatorial analysis to follow.

The Wyckoff positions are labelled by up to 27 possible Wyckoff letters (a to z and α), being devoid of any further meaning, but depending on the choice of space group. Since the Wyckoff sequence composed of these letters does not include any specific information about either the unit-cell parameters or the atomic coordinates, it is a more abstract notion as well, and useful as such for purposes of crystal structure classification and systematics (see, e.g., Allmann & Hinek, 2007 ). Note that this abstraction means that actual crystal structures can share the same Wyckoff sequence, while being distinct with respect to the specific values of the free parameter(s) of their non-fixed Wyckoff position(s). Such crystal structures are called isopointal (Lima-de-Faria et al., 1990 ).

Yet, Wyckoff sequences can be studied without referring to specific geometric crystal structures. Thus, any Wyckoff sequence really describes an infinite family of crystal structures, sharing a common parametrization in terms of their geometric degrees of freedom, according to the observed partial or full occupancy of their general position(s) and, if present, special position(s) of their corresponding space-group type. Indeed, the Wyckoff sequence has been used in the aforementioned sense, as a coordinate-free representation of crystal structures in modern machine learning approaches to materials discovery (Goodall et al., 2022 ).

One particular advantage of this abstract, combinatorial point of view is due to the fact that the number of Wyckoff sequences of given length k is finite (Hornfeck, 2022a ), making an exhaustive study possible for small values of k. In fact, most of the actual crystal structures found so far in nature have Wyckoff sequences of length below k = 50. For the space-group type Pmmm (No. 47) with ν = 19 non-fixed sites and φ = 8 fixed ones, constituting the case with the highest number of possible sequences, this would correspond to about 3.5×10¹⁸ distinct sequences to consider in total, up to the length of k = 50 inclusive (see Appendix A for the computation and the exact result).

Another advantage, and one focus of this work, is due to the fact that the Wyckoff sequence translates into the information about a crystal structure's combinatorial (M) and coordinational (A) collective degrees of freedom, associated with a weighted sum of each Wyckoff position's individual multiplicity ( M_i) and arity ( A_i), respectively, the latter being the number of independent coordinate parameters required to be specified in a standardized description of an actual crystal structure.

Both the combinatorial and coordinational degrees of freedom can be used to assess a crystal structure's combinatorial and coordinational complexity by means of an approach pioneered by Krivovichev (2012 , 2014 ) and extended by Hornfeck (2020 , 2022b ) based on the utilization of the Shannon entropy as a complexity measure.

Taking a general point of view, the collective degrees of freedom represent a certain system (a macrostate), while the individual degrees of freedom each represent a certain subsystem (a microstate). This is true on different levels of hierarchy. For instance, on the crystal structure level, the collective configurational degrees of freedom, M+A, result as the sum of the individual combinatorial (M) and coordinational (A) degrees of freedom. In a similar way, on the Wyckoff position level, the collective combinatorial and coordinational degrees of freedom, M or A, result as the sum of the individual combinatorial and coordinational degrees of freedom, M_i or A_i, respectively.

In combinatorial terms this splitting of a system into subsystems corresponds to an integer partition. However, in their crystallographic application, the number of partitions is not simply given by the number-theoretic partition function, yet restricted by crystallographic symmetry and the coupling of combinatorial and coordinational degrees of freedoms as found within each Wyckoff position. In particular, the possible values of the Wyckoff multiplicities considering all space-group types are restricted to the set $[M_{i}\in\{1, 2, 3, 4, 6, 8, 12, 16, 24, 48\}]$ (assuming primitive unit cells, i.e. modulo centring translations, which, in the following, will be the preferred choice of description), while the possible values of the Wyckoff arities are restricted to the set $[A_{i}\in\{0,1,2,3\}]$ . Moreover, for a given choice of space-group type, the values of either set that can occur are restricted further (although not their frequencies of occurrence, except for fixed positions, which can occur only once), in accordance with the existing Wyckoff positions. A final restriction is imposed by each Wyckoff position introducing a coupling of values from both sets. The combinatorial problem under consideration thus is one of counting the number of restricted, coupled partitions.

Thus, due to these particular restrictions, it is a non-trivial task to describe a macrostate defined by the collective degrees of freedom (M,A) by the composition from or the subdivision into its associated microstates (M_i,A_i). Foremost, it is a natural question to ask, given one macrostate, what is the number of microstates corresponding to it?

The following section will give the answer to this question, with the main results being a generating polynomial approach (Section 2.4, theory; Section 2.7, algorithm) and a dynamic programming approach (Section 2.8, theory and algorithm).

2. Combinatorics of Wyckoff sequences

To state the problem succinctly: how many distinct Wyckoff sequences (of optionally fixed length k) exist, given a pair of combinatorial and coordinational degrees of freedom (M,A) together with a choice of space group? (Note that a space group has to be fixed, since this determines the alphabet of Wyckoff letters which can appear in the Wyckoff sequence.)

2.1. A problem of crystals – exposition

The problem statement can be translated rather straightforwardly into some algebraic form. Note that two integer values exist for each individual Wyckoff position i, its multiplicity M_i and its arity A_i (coordinational degree of freedom). Both have to be considered in a coupled way, which will be done in the notation by the use of column vectors. Then, the total multiplicity and arity, M and A, respectively, corresponding to a certain set of Wyckoff sequences of crystal structures, are given as

$[\left[\matrix{M\cr A}\right] = \sum_{i = 1}^{n}\mu_{i}\left[\matrix{M_{i}\cr A_{i}}\right].\eqno(1)]$

Here, the sum index i runs over all the n existing Wyckoff positions for a given space group, with the integer multipliers $[\mu_{i}]$ denoting the frequency of occurrence of a given site in the sum of individual multiplicities M_i and arities A_i of the combined sites. In one general point of view, as mentioned in the Introduction, one can call the pair (M,A) the system variables (describing the macrostate), while the M_i and arities A_i would be the subsystem variables (describing the microstates).

Information about the Wyckoff positions of the 230 three-dimensional space-group types is compiled in Vol. A of the International Tables for Crystallography (Aroyo, 2016), which is the authoritative source. Alternatively, it can also be retrieved online from the Bilbao Crystallographic Server (https://www.cryst.ehu.es/) using the routine WYCKPOS (Aroyo et al., 2006a ,b , 2011 ).

The values of the Wyckoff multiplicities M_i are explicitly stated as the numeral part of the Wyckoff symbol assigned to each Wyckoff position (the non-numeral part is given by the Wyckoff letter). Note, however, that these values are given for the centred unit cells, in which case M should also be specified with respect to the unit-cell content of the centred unit cell, in order to maintain the correct correspondence. Alternatively, the values of the Wyckoff multiplicities can be reduced by the division of an integer factor depending on the centring type (2 for C, A and I centring; 3 for R centring; 4 for F centring), if the primitive unit cell, and the number of atoms it contains, is taken as a reference. Indeed, choosing the primitive unit cell as a reference is strongly recommended in the context of crystallographic complexity calculations (cf. Section 3), in order to make the results of these calculations comparable for crystal structures differing in their centring type, since any existing centring translation just repeats parts of a crystal structure inside a unit cell, thereby contributing no additional information to its description, and, accordingly, its complexity.

The values of the Wyckoff arities A_i, although not explicitly stated in the aforementioned sources, can be deduced from them, namely by means of visual inspection with respect to the number of positional variables x, y and z occurring in the listing of the general coordinates as provided for each Wyckoff position.

In general, this information can be obtained for any crystallographic space group, including higher-dimensional ones, namely by inspecting the number of symmetry-equivalent positions modulo translations, to obtain the multiplicities, as well as by elucidating the dependency of their general coordinates with respect to symmetry, to obtain the arities. In fact, all this information can be obtained in a purely algorithmic manner [see Brown et al. (1978 ) for the case of four dimensions].

It should be noted that the problem at hand can be given different mathematical interpretations and representations. Equation (1) already suggests a geometrical interpretation as two-dimensional vectors, which happen to live on the two-dimensional integer lattice $[{\bb Z}^{2}]$ (square lattice). The problem then appears as a special case of a lattice path problem (Fig. 1), in which the vectors associated with the Wyckoff positions of a space group define the set of steps.

Figure 1
Geometric interpretation for the problem of finding the number of ways of combining individual Wyckoff positions of given multiplicities and arities, (M_i,A_i), adding up to a given total multiplicity and arity (M,A). In the illustration, the target vector (M,A) = (12,3), here written in row form, is denoting a point in the two-dimensional integer lattice $[{\bb Z}^{2}]$ (square lattice) in the upper-right corner. On the top, the individual vectors (M_i,A_i) corresponding to each Wyckoff position are shown: a (2,0) red, b (2,0) green, c (2,1) blue, d (4,1) dark red, e (4,2) dark green, f (8,3) dark blue. On the bottom, their combinations adding up to (M,A) = (12,3) are shown, with vectors composed in reverse lexicographic order. Other possible combinations, in which only the order of the vectors are changed, are not shown. However, all lattice points which can be reached by any possible combinations of vectors are highlighted as open circles instead of filled ones. To see the full graph one has to invert the depicted half of it in the point (6, 1.5). In this interpretation the problem becomes a special case of a lattice path enumeration problem with the set of steps governed by the Wyckoff multiplicities and arities for a given choice of space-group symmetry.

Alternatively, the two-dimensional plane can be identified with the complex plane of Wessel, Argand and Gauß suggesting a change of notation, in which i is the imaginary unit ( $[i^{2} = -1]$ ):

$[M+A\,i = \textstyle\sum\limits_{j = 1}^{n}\mu_{j}\left(M_{j}+A_{j}\,i\right).\eqno(2)]$

While these interpretations and representations are mathematically equivalent, and thus do not seem to make any difference per se, the knowledge of these alternatives can be of importance when it comes to searching for subfields of mathematics discussing already existing solutions to a given problem, or general methods to find them, and also in the case of implementation into computer code.

2.2. Conditions on the multipliers

For any given site the multipliers $[\mu_{i}]$ are restricted to discrete intervals

$[\{\mu_{i,{\min}},\mu_{i,{\min}}+1,\ldots,\mu_{i,{\max} }-1,\mu_{i,{\max}}\}\eqno(3)]$

of potential integer values, which are denoted in shorthand as $[[[\mu_{i,{\min}},\mu_{i,{\max}}]]]$ in the following. Naturally, by definition, the frequencies of occurrence are bounded from below, likewise for all sites, by all minimal multipliers $[\mu_{i,{\min}} = 0]$ , meaning that a site is absent in this case. The variable upper bounds, the maximal multipliers, are determined according to the case distinction

$[\mu_{i,{\max}} = \left\{\matrix{r_{i}&{\rm for }\,A_{i} = 1,2,3 = A_{i}^{+}\cr 1\hfill &{\rm for }\,A_{i} = 0\hfill}\right.,\eqno(4)]$

differentiating between non-fixed and fixed sites. Collecting the terms for the non-fixed and fixed sites separately, the summation of equation (1) can be split into separate parts,

$[\left[\matrix{M\cr A}\right] = \sum_{i = 1}^{\nu}r_{i}\left[\matrix{M_{i}\cr A_{i}^{+}}\right]+\sum_{i = 1}^{\varphi}\left[\matrix{M_{i }\cr 0}\right],\eqno(5)]$

in which ν and φ now denote the total number of non-fixed and fixed sites, respectively (compare Hornfeck, 2022a).

In any case, the repetition r_i is given as

$[r_{i} = \min\left\{\left\lfloor{{M} \over {M_{i}}}\right\rfloor,\left\lfloor{{A} \over {A_{i}}}\right\rfloor\right\}, \eqno(6)]$

with $[\lfloor\cdot\rfloor]$ denoting the floor function, and the smaller ratio restricting the number of possible occurrences of the corresponding site from above. Stated in a different way, each multiplier for any non-fixed site has to fulfil both of the conditions

$[\mu_{i}M_{i}\leq M\quad{\rm and}\quad\mu_{i}A_{i}\leq A \eqno(7)]$

simultaneously, with $[\mu_{i,{\max}}]$ being defined as the largest integer doing so.

These conditions originate from the fact that the Wyckoff multiplicities and arities are non-negative integers; thus any surpassing of either one of the limits M or A cannot be balanced by the addition of another Wyckoff position, hence cannot be a part of the solution, and thus signifies an end to the Wyckoff sequence construction process. This guarantees the existence of maximal values $[\mu_{i,{\max}}]$ and fixes the size of the search space of which the solution space is a subspace.

While the determination of the size of the solution space, the number of Wyckoff sequences existing for a given choice of (M,A), is our main task, the determination of the size of the search space appears as a first step towards a result, in that it gives a numerical upper bound for the size of the solution space, a combinatorial overview about the Wyckoff sequences to expect, with respect to their length, as well as an estimate regarding the computational tractability of their actual construction based on an exhaustive exploration of the search space.

2.3. Size of the search space

For non-fixed sites the multiplier intervals are given as $[[[0,r_{i}]] = \{0,1,\ldots,r_{i}\}]$ , with variable r_i, while for the fixed sites the multiplier intervals are given as [[0,1]] = {0,1}, invariably. The Cartesian product over all interval sets of multipliers of either type determines the search space:

$[{\cal S} = [[0,\mu_{1,{\max}}]]\times[[0,\mu_{2,{\max}}]] \times\ldots\times[[0,\mu_{n,{\max}}]].\eqno(8)]$

This corresponds to the set of all possible Wyckoff letter sequences, being Wyckoff sequences without the space-group type symbol/number prefix explicitly stated, as constructed from the multiset $[[\mu_{1,{\max}}\alpha_{1},\mu_{2,{\max}}\alpha_{2},\ldots,\mu_ {n,{\max}}\alpha_{n}]]$ , in which the $[\alpha_{i}]$ denote a general Wyckoff letter out of n possible letters for a given space group. Note that the choice of space group, while defining the alphabet of Wyckoff letters and limiting the number of terms to consider in the Cartesian product, does not determine the size of the search space by itself. This size, the cardinality

$[|{\cal S}| = \textstyle\prod\limits_{i = 1}^{n}\left(\mu_{i,{\max}}+1\right)\eqno(9)]$

of the search space, in which individual solutions have to be found, if they exist, is determined by the values of the maximal multipliers. Thus, the size of the search space is determined only if both the space group and the total numbers of degrees of freedom M and A are fixed. Then, the search space size gives an absolute upper bound on the potential number of solutions, yet usually, and in anticipation of our following results, the actual number of solutions will be much lower or even zero. This difference in the number of solutions is due to the construction of the search space by means of the Cartesian product, namely because the restrictions imposed by the $[\mu_{i,{\max}}]$ values for individual Wyckoff positions do not take into account their cumulative, conditional interactions.

As is often the case in combinatorial problems, the same cardinality $[|{\cal S}|]$ can be obtained by an alternative counting method, namely as the result of the summation of all coefficients in the expansion of the univariate polynomial defined by

$[P(x) = \textstyle\prod\limits_{i = 1}^{n}\sum\limits_{k = 0}^{\mu_{i,{\max}}}x^{k}.\eqno(10)]$

This is the generating polynomial for a multiset with finite multiplicities [compare its description and, in particular, equation (4) in Hornfeck (2022a)]. Upon its expansion the coefficients c_k for each term x^k of this polynomial count the number of sequences of length k, thereby representing a more differentiated view of the search space's contents. Eventually,

$[|{\cal S}| = \textstyle\sum\limits_{k = 0}^{\mu}c_{k},\eqno(11)]$

in which $[\mu = \mu_{1,{\max}}+\mu_{2,{\max}}+\ldots+\mu_{n,{\max}}]$ is the multiset's cardinality.

The size of the search space gains some importance due to the fact that the combinatorial approach we will describe in the remainder of this work is non-constructive, as is commonly the case for such combinatorial questions, since it gives only the number of potential Wyckoff sequences matching with a given parameter pair (M,A), but does not reveal the Wyckoff sequences themselves; these can be discovered by an exhaustive check of all admissible multiplier combinations existing within the combined multiplier intervals.

We envision that the search space size can be reduced to some degree by applying effective intermediate checks for multiplier combinations already violating the upper limit as imposed by the choice of the parameters (M,A), overshooting either one parameter at a time or both simultaneously, possibly in combination with the use of a clever data structure such as pruned trees. However, search space sizes below $[|{\cal S}|\,\lt\,10^{6}]$ are easily tractable on a standard desktop personal computer, which should encompass most tasks related to the comparison of the potential combinatorial solutions with Wyckoff sequences representing actual crystal structures.

2.4. A generating polynomial approach to find solutions

Our combinatorial problem stated above is solved in two steps: first, by reducing it, conceptually, to an analogous classical problem of combinatorics, the coin change problem, as can be found in many textbooks on the topic (for instance, Marcus, 1998 , p. 89), and second, by adapting the classical problem to the crystallographic one.

The classical coin change problem is stated in Appendix B, and can be seen as an illustration of the use of generating polynomials, defined in some abstract variable. Notably, the variable is an indeterminate symbol only, entailing no specific meaning other than to allow algebraic operations performed on it; hence it merely acts as a placeholder and bookkeeping device, yet is the decisive one, in order to systematically find a solution.

This use of generating polynomials has already been described in the first entry of this series (Hornfeck, 2022a) to which the reader, interested in more detailed information, is referred. An introduction to the wider field of generating functions is given by Graham et al. (1994 ), and a more detailed exposition of the main ideas involved is given by Wilf (2006 ).

Now, adapting the classical coin change problem to our crystallographic one is carried out in three steps: (i) matching the number of distinct types of coins with the number of distinct Wyckoff positions; (ii) identifying the values of distinct types of coins with the pair of values of the Wyckoff position's multiplicity M_i and arity A_i; and (iii) treating the pair of values ( M_i, A_i) in a coupled way, by introducing two abstract variables (x,y) in the generating polynomials, instead of one.

Thus, in some way, the crystallographic problem is a coin change problem with a twist, based on imaginary coins with denominations on both their front and back sides and a pair of target values to reach upon summation. Similar combinatorial problems arise for the case of real cards, coupling values denoted by digits or letters with symbolic ones such as diamonds, hearts, spades and clubs.

In particular, the third adaptation step is crucial for the correct enumeration. As a consequence of it, generating polynomials of the kind

$[P_{i}(x,y) = \textstyle\sum\limits_{j = 0}^{\mu_{i,{\max}}}x^{j\,M_{i}}y^{j\,A_{i}}\eqno(12)]$

are assigned to each Wyckoff position, with their product over all n Wyckoff positions,

$[{\cal P}(x,y) = \textstyle\prod\limits_{i = 1}^{n}P_{i}(x,y),\eqno(13)]$

yielding the solution, namely by the value of the coefficient of x^My^A in the expanded form of the polynomial $[{\cal P}(x,y)]$ .

As an aside, one can note that the product over all n sites can be split,

$[{\cal P}(x,y) = \left[\textstyle\prod\limits_{i = 1}^{\nu}\sum\limits_{j = 0}^{r_{i}}x^{j\,M_{i}}y^{j\,A_ {i}}\right]\left[\textstyle\prod\limits_{\phantom{j}i = 1\phantom{j}}^{\varphi}\left(1+x^ {M_{i}}\right)\right],\eqno(14)]$

according to the contributions of ν non-fixed and φ fixed sites [compare equation (5)]. This splitting reduces the problem for the fixed sites to a univariate one.

It should be noted that this approach can be further generalized, in principle, by taking into account chemical degrees of freedom [as introduced by Hornfeck (2020)] in terms of atomic numbers of atoms occupying a given Wyckoff position as well. Then, one would have to introduce a third variable z into the respective polynomials with all other procedures considered to be analogously performed. However, there is a difference, in that the atomic numbers are not restricted in any way, that is to say there exists no natural coupling between them and the Wyckoff multiplicities and arities – they are a pure matter of choice. In contrast, the Wyckoff multiplicities and arities are coupled in their values for each Wyckoff position of a space group. Thus, regarding this relative arbitrariness of choice and the relative unimportance of this general case, we refrain from expanding in this direction for the moment. However, our opinion on this topic might change in the future, if there should be an interesting application for fixing the total chemical degrees of freedom, that is the total electron count within a reduced unit cell, to a set value, together with the other degrees of freedom, and asking for the number of crystal structures fulfilling this condition. In this case, any generalization required can be obtained in a straightforward manner by following the same extension procedure as described in the following.

2.5. A generalization including the Wyckoff sequence length

Another and considerably more useful generalization is made by taking into account the length k of the Wyckoff sequence as a further restriction, which can be seen as an extension and a refinement to a previous result in the combinatorics of Wyckoff sequences (Hornfeck, 2022a). This can be achieved by adjusting equation (1) according to

$[\left[\matrix{M\cr A\cr k}\right] = \sum_{i = 1}^{n}\mu_{i}\left[\matrix{M_{i}\cr A_{i}\cr k_{i}}\right]\eqno(15)]$

in addition to

$[r_{i} = \min\left\{\left\lfloor{{M} \over {M_{i}}}\right\rfloor,\left\lfloor{{A} \over {A_{i}}}\right\rfloor,k\right\},\eqno(16)]$

used for the determination of the $[\mu_{i,{\max}}]$ for each individual site i. Trivially, each site itself is of length k_i = 1 (thus, $[\lfloor k/k_{i}\rfloor = k]$ ), which gives the total length of the Wyckoff sequence as the sum of the multipliers $[\mu_{i}]$ : $[k = \mu_{1}+\mu_{2}+\ldots+\mu_{n}]$ . Again, the final result is given as the value of the coefficient of x^My^Az^k in the expanded polynomial

$[{\cal P}(x,y,z) = \textstyle\prod\limits_{i = 1}^{n}P_{i}(x,y,z),\eqno(17)]$

in which

$[P_{i}(x,y,z) = \textstyle\sum\limits_{j = 0}^{\mu_{i,{\max}}}x^{j\,M_{i}}y^{j\,A_{i}}z^{j}\eqno(18)]$

defines the individual terms.

As was the case before, the product over all n sites

$[{\cal P}(x,y,z) = \left[\textstyle\prod\limits_{i = 1}^{\nu}\sum\limits_{j = 0}^{r_{i}}x^{j\,M_{i}}y^{j\, A_{i}}z^{j}\right]\left[\textstyle\prod\limits_{\phantom{j}i = 1\phantom{j}}^{\varphi} \left(1+x^{M_{i}}z\right)\right]\eqno(19)]$

can be split according to the contributions of ν non-fixed and φ fixed sites.

Finally, it should be noted that there exist general methods for obtaining explicit formulas for the coefficients of generating functions, for instance for the powers of bivariate generating functions (Kruchinin et al., 2021 ), as this is an active field of mathematical research.

2.6. A problem of crystals – exemplification

In the following, an illustrative example is discussed in full calculational detail.

The tetragonal space-group type $[P\overline{4}2_{1}m]$ (No. 113) encompasses a total of six distinct Wyckoff positions,

$[{\cal W}_{i}(M_{i},A_{i}) = \left[\matrix{M_{i}\cr A_{i}}\right],\eqno(20)]$

here written in a more convenient in-line notation, indexed according to their Wyckoff letters, and with their corresponding multiplicities and arities stated:

$[{\cal W}^{113} = \left\{\matrix{{\cal W}_{a}(2,0),&{\cal W} _{b}(2,0),&{\cal W}_{c}(2,1),\cr {\cal W}_{d}(4,1),&{\cal W}_{e}(4,2),&{\cal W}_{f}(8,3).}\right. \eqno(21)]$

The chosen space group is the one with the smallest number of Wyckoff positions for which all possible arity values A_i = 0,1,2,3 are present. The six Wyckoff positions comprise a total of three and four distinct values for the Wyckoff multiplicity and arity, respectively, thereby allowing us to illustrate the combinatorial calculation in some detail.

Now, with some arbitrary chosen M = 12 and A = 3 given for this particular example, this results in the following maximal multipliers:

$[\eqalignno{&\mu_{a,{\max}} = 1,\,\mu_{b,{\max}} = 1,\,\mu_{c, {\max}} = 3,&\cr &\mu_{d,{\max}} = 3,\,\mu_{e,{\max}} = 1,\,\mu_{f,{\max}} = 1, &(22)}]$

for each Wyckoff position, and consequently in the following generating polynomials:

$[\eqalignno{ P_{a}(x)& = 1+x^{2},&\cr P_{b}(x)& = 1+x^{2},&\cr P_{c}(x,y)& = 1+x^{2}y+x^{4}y^{2}+x^{6}y^{3},&\cr P_{d}(x,y)& = 1+x^{4}y+x^{8}y^{2}+x^{12}y^{3},&\cr P_{e}(x,y)& = 1+x^{4}y^{2},&\cr P_{f}(x,y) &= 1+x^{8}y^{3}.&(23)}]$

Note how the two fixed sites with Wyckoff letters a and b give rise to exactly the same polynomial factor in one variable x only (namely the one used for the Wyckoff multiplicities), and how the polynomials corresponding to the non-fixed sites are restricted in terms of the monomials of highest degree in either x or y by the given values of either M or A or both.

Note also that the number of variables in each polynomial corresponds to the splitting of all n sites into ν contributions from the non-fixed sites (bivariate case) and φ contributions from the fixed ones (univariate case) [compare equation (5)].

The expanded polynomial $[{\cal P}(x,y)]$ consisting of a total of 60 terms is given by

$[\eqalignno{& 1+2x^{2}+x^{4}+x^{2}y+3x^{4}y&\cr &+3x^{6}y+x^{8}y+2x^{4}y^{2}+5x^{6}y^{2}+5x^{8}y^{2}&\cr &+3x^{10}y^{2}+x^{12}y^{2}+2x^{6}y^{3}+7x^{8}y^{3}+9x^{10}y^{3}&\cr &+6x^{12}y^{3}+3x^{14}y^{3}+x^{16}y^{3}+x^{8}y^{4}+5x^{10}y^{4}&\cr &+10x^{12}y^{4}+10x^{14}y^{4}+5x^{16}y^{4}+x^{18}y^{4}+x^{10}y^{5}&\cr &+5x^{12}y^{5}+10x^{14}y^{5}+12x^{16}y^{5}+9x^{18}y^{5}+3x^{20}y^{5}&\cr &+3x^{14}y^{6}+9x^{16}y^{6}+12x^{18}y^{6}+10x^{20}y^{6}+5x^{22}y^{6}&\cr &+x^{24}y^{6}+x^{16}y^{7}+5x^{18}y^{7}+10x^{20}y^{7}+10x^{22}y^{7}&\cr &+5x^{24}y^{7}+x^{26}y^{7}+x^{18}y^{8}+3x^{20}y^{8}+6x^{22}y^{8}&\cr &+9x^{24}y^{8}+7x^{26}y^{8}+2x^{28}y^{8}+x^{22}y^{9}+3x^{24}y^{9}&\cr &+5x^{26}y^{9}+5x^{28}y^{9}+2x^{30}y^{9}+x^{26}y^{10}+3x^{28}y^{10}&\cr &+3x^{30}y^{10}+x^{32}y^{10}+x^{30}y^{11}+2x^{32}y^{11}+x^{34}y^{11}.&(24)}]$

The decisive term x^My^A = x¹²y³ has 6 as its coefficient, which thus corresponds to the six existing solutions, looked for in the search space; they can be stated in terms of their Wyckoff letter sequence, with multiple occurrences of the same letter marked by superscripts:

$[fba,edba,d^{3},d^{2}cb,d^{2}ca,dc^{2}ba.\eqno(25)]$

Note, in particular, the cases d²cb and d²ca which differ only in the exact choice of the fixed site, being otherwise degenerate in their column vector summation [compare equation (1)].

In fact, the bivariate polynomial given in equation (24) contains the information about all possible solutions (M,A) which can be constructed from the multiset of Wyckoff letters [a,b,3c,3d,e,f] with restricted multiset multiplicities, as determined by the maximal multipliers. Since the multiplier intervals are [[0,3]] for both the Wyckoff positions c and d, as well as [[0,1]] for all other sites, the search space has a size of 4²×2⁴ = 256 cases, whose distribution according to the length k of the sequence can be read off from the expansion (11 terms; not shown) of the generating polynomial

$[(1+x)^{4}(1+x+x^{2}+x^{3})^{2}\eqno(26)]$

since $[\mu_{i,{\max}} = 1]$ for four out of the six Wyckoff sites and $[\mu_{i,{\max}} = 3]$ for the remaining two. The observed six solutions with the property (M,A) = (12,3) constitute only a tiny fraction of the search space, which is generally true, the size of the search space typically being overwhelmingly larger than the number of solutions.

In particular, the full set of solutions contains the singular occurrence of the empty sequence, { }, corresponding to the trivial monomial x⁰y⁰ = 1, as well as that of the maximal length sequence, abcccdddef, corresponding to the monomial x³⁴y¹¹ of highest degree, for which M = 34 and A = 11. Note how one can instantly check that there exists no solution for, say, the case M = 7 and A = 1, because no monomial of the form x⁷y appears in $[{\cal P}(x,y)]$ , or, to put it another way, the coefficient of this monomial in the expansion of $[{\cal P}(x,y)]$ is equal to zero. Note also how one specific solution, namely solution d³, corresponds to the occurrence of the monomial x¹²y³, the j = 3 case of the basic monomial x^j 4y^j 1 representing the $[{\cal W}_{d}(4,1)]$ Wyckoff position, in the generating polynomial P_d(x,y) of equation (23).

Taking into account the length k as an additional parameter, one has to adjust the individual polynomial terms according to

$[\eqalignno{ P_{a}(x,z)& = 1+x^{2}z,&\cr P_{b}(x,z)& = 1+x^{2}z,&\cr P_{c}(x,y,z) &= 1+x^{2}yz+x^{4}y^{2}z^{2}+x^{6}y^{3}z^{3},&\cr P_{d}(x,y,z)& = 1+x^{4}yz+x^{8}y^{2}z^{2}+x^{12}y^{3}z^{3},&\cr P_{e}(x,y,z)& = 1+x^{4}y^{2}z,&\cr P_{f}(x,y,z)& = 1+x^{8}y^{3}z.&(27)}]$

Expansion of their product yields a polynomial in 142 terms (not shown), with, for instance, the value of the coefficient for x^My^Az^k = x¹²y³z⁴ being equal to 3, corresponding to the Wyckoff letter sequences

$[edba,d^{2}cb,d^{2}ca,\eqno(28)]$

which form a subset of the six aforementioned solutions.

2.7. Summary of the generation function approach

To summarize the aforementioned results in an algorithmic form, using the generating polynomial approach one has to perform the following steps to obtain a solution (with the first two items in the enumeration defining the input to the algorithm and the last one its output):

(i) Fix a space group and thereby the set of Wyckoff positions and the potential alphabet of Wyckoff letters from which a Wyckoff sequence can be formed.

(ii) Fix the values for the total Wyckoff multiplicity M and the total Wyckoff arity A and (optionally) the Wyckoff sequence length k, all of which are expected to be integers.

(iii) Retrieve all the individual values for the Wyckoff multiplicities and arities $[{\cal W}_{i}(M_{i},A_{i})]$ .

(iv) From the individual Wyckoff multiplicities and arities compute the individual maximal multipliers $[\mu_{i,{\max}}]$ .

(v) From the individual maximal multipliers construct the individual generating polynomials, P_i(x,y) or P_i(x,y,z).

(vi) From the individual generating polynomials build their product, $[{\cal P}(x,y)]$ or $[{\cal P}(x,y,z)]$ , and expand it.

(vii) From the expansion read off the coefficient for the monomial of the form x^My^A or x^My^Az^k; this is the solution.

2.8. A dynamic programming approach to find solutions

Finally, after illustrating the aforementioned combinatorial method on an explicit example, we want to highlight the possibility of an alternative algorithmic way of calculation which turns out to be very efficient, by making optimal use of the recursive nature and overlapping substructure of the problem, in such a way that subsolutions are only ever computed once and retrieved as needed for the calculation of the solution. This algorithmic way is known under the name dynamic programming. A simple example for the classical univariate and infinite coin change problem is given in Appendix B. In the following, we show the adapted pseudocode for the bivariate and finite coin change problem in its crystallographic application, thus taking into account the case distinction for the multipliers of non-fixed and fixed Wyckoff sites (Fig. 2). The adaptation to the trivariate and finite case including the length k as a parameter follows a straightforward procedure (Fig. 3), as would be the case for any future multivariate generalization, for instance also taking into account chemical degrees of freedom.

Figure 2
Dynamic programming algorithm (given in pseudocode) for the determination of the number of Wyckoff sequences of a given space group and subdivision complexity as determined by the total number of degrees of freedom for the Wyckoff multiplicity and arity (bivariate case).

Figure 3
Dynamic programming algorithm (given in pseudocode) for the determination of the number of Wyckoff sequences of a given space group, subdivision complexity and length as determined by the total number of degrees of freedom for the Wyckoff multiplicity, Wyckoff arity and length (trivariate case). Note that k_i = 1 for all Wyckoff sites – this does not have to be specified for each Wyckoff position independently. However, possible simplifications due to this fact have not been included in the pseudocode in order to highlight its similarity with the bivariate case shown in Fig. 2

In either case, the approach starts with the trivial base case (there is always one possibility for the null tuple), from which it proceeds bottom-up making use of a tabulation implementation of subsolution values cached into either a matrix (bivariate case) or a tensor (trivariate case), which is updated iteratively until the target case is reached and the solution is returned. Equation (29) shows the solution matrix $[{\cal T}]$ for the case defined by the target tuple (M,A) = (12,3) and the multiset of tuples $[{\cal W} = [(2,0),(2,0),(2,1),(4,1),(4,2),(8,3)]]$ in which the values for M_i and A_i increase along the rows and the columns, respectively. Note that for a more economic display, the matrix is shown in its transposed form, with columns and rows interchanged, such that its upper-left corner represents the matrix element $[{\cal T}_{0,0}]$ and the lower-right corner represents the matrix element $[{\cal T}_{12,3}]$ , from which the solution, $[{\cal T}_{12,3} = 6]$ , can be read off:

$[\left[\matrix{1&0&2&0&1&0&0&0&0&0&0&0&0\cr 0&0&1&0&3&0&3&0&1&0&0&0&0\cr 0&0&0&0&2&0&5&0&5&0&3&0&1\cr 0&0&0&0&0&0&2&0&7&0&9&0&6\cr }\right].\eqno(29)]$

A direct comparison shows that all non-zero entries represent the non-vanishing coefficients as occurring in the polynomial expansion given in equation (24), yet with terms only up to the solution x¹²y³ monomial inclusive. Note that the algebraic structure of the Wyckoff positions is reflected in the matrix as well, since the reachable positions are determined by the smallest increments and their parity as observed for the Wyckoff multiplicities and arities. For instance, since all the Wyckoff multiplicities are even numbers, all odd columns of the above matrix are given by the null vector.

A Python implementation of the algorithm for the bivariate case is given in Appendix C, which also contains a Mathematica implementation of the generating polynomial approach.

3. Complexity of Wyckoff sequences

The aforementioned problem of crystals arose in the context of calculating Shannon entropy based complexity measures for crystal structures, taking into account a crystal structure's fundamental chemical, combinatorial and coordinational degrees of freedom (Hornfeck, 2020). Apart from its chemical degrees of freedom – its decoration (colouring) of Wyckoff sites with atoms – a crystal structure is geometrically defined by its combined combinatorial and coordinational degrees of freedom, the collective multiplicities M and arities A of its occupied Wyckoff positions.

3.1. Shannon entropy based complexity measures

In general, a given multiset $[[X_{1},X_{2},\ldots,X_{n}]]$ of n individual degrees of freedom X_i defines a discrete probability distribution

$[{\cal X} = \left[X_{1}/X,X_{2}/X,\ldots,X_{n}/X\right] = \left[x_{1},x_{2},\ldots,x_{n}\right],\eqno(30)]$

where $[X = X_{1}+X_{2}+\ldots+X_{n}]$ denotes the collective number of degrees of freedom as obtained from the partition of the individual numbers of degrees of freedom. From this a Shannon entropy

$[H({\cal X}) = \textstyle\sum\limits_{i = 1}^{n}L(x_{i}) = \sum\limits_{i = 1}^{n}-x_{i}\log_{2}x_{i}\eqno(31)]$

can be obtained. Note that L(0) = 0, by definition.

As mentioned before, the general interpretation is that of a system with X degrees of freedom on a higher level of structural hierarchy being subdivided into n subsystems of X_i degrees of freedom, each on a lower level of structural hierarchy. For more details of the general theory, the reader is referred to previous work done by the author (Hornfeck, 2020, 2022b).

In particular, a crystal structure's M collective combinatorial degrees of freedom are associated with the multiset $[[M_{1},M_{2},\ldots,M_{n}]]$ of individual Wyckoff multiplicities M_i, thereby yielding a fundamental combinatorial Shannon entropy:

$[H_{{\rm comb}} = H({\cal M}) = \textstyle\sum\limits_{i = 1}^{n}L(M_{i}/M).\eqno(32)]$

Note that in order to ensure the comparability of combinatorial complexity values between different crystal structures, the values of the Wyckoff multiplicities M_i have to refer to a primitive unit cell [although this is really important only for derived complexity values such as the maximal complexity, $[H_{{\rm comb},\max}= \log_{2}M]$ , the normal complexity, $[H_{{\rm comb,\,norm}}= H_{{\rm comb}}/H_{{\rm comb},\max}]$ , or the total complexity, $[H_{{\rm comb,\,total}} = M\times H_{{\rm comb}}]$ , which have been described by Hornfeck (2020)].

Now, in the same way, a crystal structure's A collective coordinational degrees of freedom are associated with the multiset $[[A_{1},A_{2},\ldots,A_{n}]]$ of individual Wyckoff arities A_i, thereby yielding a fundamental coordinational Shannon entropy

$[H_{{\rm coor}} = H({\cal A}) = \textstyle\sum\limits_{i = 1}^{n}L(A_{i}/A).\eqno(33)]$

Now, proceeding in a completely analogous manner, a crystal structure's F = M+A collective configurational degrees of freedom are associated with the combined multiset

$[\left[M_{1},M_{2},\ldots,M_{n},A_{1},A_{2},\ldots,A_{n}\right] = \left[F_{1},F_{ 2},\ldots,F_{2n}\right]\eqno(34)]$

of individual Wyckoff multiplicities M_i and individual Wyckoff arities A_i, corresponding to the individual configurational degrees of freedom F_i, thereby yielding a composite configurational Shannon entropy

$[H_{{\rm conf}} = H({\cal F}) = \textstyle\sum\limits_{i = 1}^{2n}L(F_{i}/F).\eqno(35)]$

While the association with the combined multiset is natural, it is noteworthy to mention that this does not mean that the corresponding entropies are (simply) additive; on the contrary

$[H_{{\rm conf}}\neq H_{{\rm comb}}+H_{{\rm coor}}.\eqno(36)]$

Most importantly, however, by means of the strong additivity property of the Shannon entropy, the configurational entropy is equivalent to the general expression

$[H_{{\rm conf}} = H_{{\rm subdiv}}+w_{M}H_{{\rm comb}}+w_{A}H_{{\rm coor}},\eqno(37)]$

including appropriate weighting factors w_M = M/(M+A) and w_A = A/(M+A) and an additional subdivision (mixing) complexity based on them,

$[H_{{\rm subdiv}} = L\left(w_{M}\right)+L\left(w_{A}\right).\eqno(38)]$

This equivalence now allows an assessment of the relative complexity related to the splitting of a given collective number of degrees of freedom, M+A, for the composite system, into individual numbers of degrees of freedom, M and A, for the fundamental (sub)systems.

The subdivision complexity $[H_{{\rm subdiv}}]$ is a Shannon entropy of the weighting factors occurring in equation (37) and for all the combinatorially enumerated cases it is an invariant characterizing the subdivision step $[(M+A)\rightarrow(M,A)]$ on the higher crystal structure level of hierarchy. It can be seen as connecting the collective treatment of degrees of freedom, within the combined configurational complexity $[H_{{\rm conf}}]$ , with the individual treatment of degrees of freedom, within the separated combinatorial and coordinational complexities, $[H_{{\rm comb}}]$ and $[H_{{\rm coor}}]$ , respectively. This allows for a complexity partition analysis (Hornfeck, 2022b). This central importance of the subdivision complexity $[H_{{\rm subdiv}}]$ and its relationship to the other complexity measures is graphically depicted in Fig. 4.

Figure 4
Schematic representation of the relation between the configurational complexity $[H_{{\rm conf}}]$ , calculated for M+A degrees of freedom, and the combinatorial and coordinational complexities, $[H_{{\rm comb}}]$ and $[H_{{\rm coor}}]$ , calculated for M and A degrees of freedom, respectively. Shown also are the weighting factors M/(M+A) and A/(M+A) as well as the unit weight contribution of the invariant subdivision complexity $[H_{{\rm subdiv}}(M,A)]$ which taken all together sum to the configurational complexity $[H_{{\rm conf}}]$ . For a given choice of M and A the subdivision complexity $[H_{{\rm subdiv}}(M,A)]$ is a constant, while the combinatorial and coordinational complexities, $[H_{{\rm comb}}]$ and $[H_{{\rm coor}}]$ , depend on the respective partitions of M and A possible for a given space-group type.

In the same manner, other subdivision complexities exist on the lower Wyckoff position level of hierarchy, characterizing the subdivision steps $[(X_{1}+X_{2}+\ldots+X_{n})]$ $[\rightarrow(X_{1},X_{2},\ldots,X_{n})]$ , with X denoting either M or A.

Furthermore, it should be noted that the strong additivity property also holds for the case of maximal Shannon entropies

$[H_{{\rm conf},\max} = H_{{\rm subdiv}}+w_{M}H_{{\rm comb},\max}+w_{A} H_{{\rm coor},\max},\eqno(39)]$

where $[H_{{\rm subdiv}}]$ takes on the same value as in equation (37). The maximal entropies are reached in the case of maximal subdivision and thus perfect equidistribution of degrees of freedom (corresponding to maximally expanded partitions $[1+1+\ldots+1]$ of variable length equal to the number of degrees of freedom). This makes their calculation particularly simple, resulting eventually in $[H_{{\rm comb},\max} = \log_{2}M]$ , $[H_{{\rm coor},\max} ]$ = log₂A and $[H_{{\rm conf},\max} = \log_{2}(M+A) = \log_{2}F]$ .

Each maximal entropy can also be used in turn to define its corresponding non-maximal entropy, for instance

$[H_{{\rm conf}} = \log_{2}F-\textstyle\sum\limits_{i = 1}^{2n}(F_{i}/F)\log_{2}F_{i}\eqno(40)]$

by means of taking the difference between the maximal entropies for the collective (here, F) and individual (here, F_i) degrees of freedom, the latter ones attributed with their appropriate weighting factors F_i/F.

All the aforementioned interrelations are depicted in Fig. 5.

Figure 5
Schematic representation of the interrelation between various maximal and non-maximal entropies (combinatorial, coordinational, configurational) and the subdivision entropy on different levels of structural hierarchy. On the Wyckoff position level of hierarchy the difference of (maximal) entropies defines the conventional non-maximal entropies (from left to right), while on the crystal structure level of hierarchy the (partially weighted) sum of entropies defines the maximal and non-maximal configurational entropy (from bottom to top). Note that the distinction between differences and sums of entropies depends on a deliberate choice of how to distribute terms on either side of the equals sign. In the scheme as presented here, the case for differences of entropies is based on the choice of collecting all maximal entropies together, while distributing collective and individual contributions on opposite sides of the equation could highlight the fact that the non-maximal entropies fulfil the same role as a subdivision complexity on the Wyckoff position level as the subdivision complexity $[H_{{\rm subdiv}}]$ does on the crystal structure level, thus highlighting the applicability of the strong additivity property on both levels of hierarchy (in this case the common weighting factors w_M or w_A can be omitted on the Wyckoff position level of hierarchy).

4. Combining the combinatorics and complexity of Wyckoff sequences

Some elementary statistical results shall be stated about the magnitude of the collective degrees of freedom to expect, in extreme and on average, and with respect to actual crystal structure data as retrieved from the 20 040 unique Wyckoff sequences compiled in the Pearson's Crystal Data Crystal Structure Database for Inorganic Compounds (Villars & Cenzual, 2020 ).

The minimal observed combinatorial and coordinational degrees of freedom are, as determined for the individual, independent distributions, M_min = 1, A_min = 0, the maximal are M_max = 5926, A_max = 1476, the mean values are $[M_{{\rm mean}}<fi> = <fi type=$ 94.3]" class="img_align_bottom mathimage" style="position: relative; display: inline; padding-top: 0px; padding-bottom: 0px; vertical-align: -2.465474pt;" height="14" width="227" />, $[A_{{\rm mean}} = 53.1]$ , the median values are $[M_{{\rm median}} = 48]$ , $[A_{{\rm median}} = 20]$ , and the mode values are $[M_{{\rm mode}} = 24]$ , $[A_{{\rm mode}} = 6]$ , respectively.

Our combinatorial result now tells us how many of these partitions, being solutions to a combinatorial problem restricted by crystallographic symmetry, can be realized for a given space group, given the fixed number of degrees of freedom (M,A) and, potentially, the length k of the Wyckoff sequence. In the following we will illustrate this by performing explicit calculations for three examples.

It should be noted that the examples given here were chosen such that the discussion of the results could be made explicit, with the full set of potential Wyckoff sequence solutions being listed, and with some actual representatives being found among the known crystal structures. In general, the number of potential solutions might exhibit a rapid growth (combinatorial explosion), while the number of actual representatives lags behind greatly.

4.1. Example 1: space-group type P42₁m (No. 113) revisited

As our first example, we revisit the case of space-group type $[P\overline{4}2_{1}m]$ (No. 113). Table 1 compiles some values of specific Shannon entropies for the Wyckoff sequences given in this example, taking into account the combinatorial, coordinational and configurational degrees of freedom, highlighting the constant term $[H_{{\rm subdiv}}]$ , with the other Shannon entropies related according to their strong additive sum:

$[H_{{\rm conf}} = H_{{\rm subdiv}}+(12/15)H_{{\rm comb}}+(3/15)H_{ {\rm coor}}.\eqno(41)]$

Table 1
Configurational complexities for the six possible Wyckoff sequences (of length k) with 12 combinatorial and three coordinational degrees of freedom of space group type number 113

Shannon entropies are stated in units of bit per freedom.

Wyckoff sequence	k	(M,A) = (12,3)	$[H_{{\rm subdiv}}]$	$[H_{{\rm comb}}]$	$[H_{{\rm coor}}]$	$[H_{{\rm conf}}]$
113 fba	3	(8+2+2,3+0+0)	0.722	1.252	0.000	1.723
113 d³	3	(4+4+4,1+1+1)	0.722	1.585	1.585	2.307
113 edba	4	(4+4+2+2,2+1+0+0)	0.722	1.918	0.918	2.440
113 d²cb	4	(4+4+2+2,1+1+1+0)	0.722	1.918	1.585	2.574
113 d²ca	4	(4+4+2+2,1+1+1+0)	0.722	1.918	1.585	2.574
113 dc²ba	5	(4+2+2+2+2,1+1+1+0+0)	0.722	2.252	1.585	2.840

A search in the Pearson's Crystal Data Crystal Structure Database for Inorganic Compounds (Villars & Cenzual, 2020) for the space-group type $[P\overline{4}2_{1}m]$ (No. 113) reveals a total of 759 Wyckoff sequences for which crystal structure prototypes have been assigned. Reducing for multiple entries arising from multiple crystal structure determinations sharing the same prototype yields 79 entries with a unique Wyckoff sequence/prototype combination. (Notably, 538 entries share the same Wyckoff sequence 113 fe³ca, with seven distinct prototypes associated to it, of which the tP24–Ca₂MgSi₂O₇ prototype alone occurs 307 times, thereby explaining the heavy degree of reduction observed.) Reducing again for distinct prototypes sharing the same Wyckoff sequence yields 63 unique Wyckoff sequences.

Of these, 55 correspond to a unique pair (M,A) of combinatorial and coordinational degrees of freedom, with six pairs occurring two times, and one pair, (28,12), occurring three times. The associated Wyckoff sequences and prototypes are: fe⁴cb (Th₂Se₅ prototype), fe⁴ca [(Ca_0.25La_0.75)₂Ga₃O_7.25 prototype] and e⁴dc³a (Ce₂CoAl₇Ge₄/LiSmTiO₄ prototypes).

On the basis of this work, one obtains a larger number of 44 possible Wyckoff letter sequence solutions [based on the Wyckoff multiplicities and arities, compare equation (21)], which can be constructed from an exhaustive exploration of a search space of much larger size of 11 648 multiplier choices (representatives with prototypes marked by an asterisk):

$[\eqalignno{& f^{2}e^{3},fe^{4}d,e^{5}d^{2},f^{2}e^{2}c^{2},fe^{3}dc^{2},e^{4} d^{2}c^{2},f^{2}ec^{4},fe^{2}dc^{4},&\cr &e^{3}d^{2}c^{4},f^{2}c^{6},fedc^{6},e^{2}d^{2}c^{6},fdc^{8},ed^{ 2}c^{8},d^{2}c^{10},\ast\,fe^{4}cb,&\cr &e^{5}dcb,fe^{3}c^{3}b,e^{4}dc^{3}b,fe^{2}c^{5}b,e^{3}dc^{5}b,fec ^{7}b,e^{2}dc^{7}b,fc^{9}b,&\cr &edc^{9}b,dc^{11}b,\ast\,fe^{4}ca,e^{5}dca,fe^{3}c^{3}a,\ast\,e^{ 4}dc^{3}a,fe^{2}c^{5}a,e^{3}dc^{5}a,&\cr &fec^{7}a,e^{2}dc^{7}a,fc^{9}a,edc^{9}a,dc^{11}a,e^{6}ba,e^{5}c^{ 2}ba,e^{4}c^{4}ba,&\cr &e^{3}c^{6}ba,e^{2}c^{8}ba,ec^{10}ba,c^{12}ba.&(42)}]$

While the combinatorial approach is non-constructive, it might often suffice to only know the number of possible solutions.

4.2. Example 2: space-group type Fm3m (No. 225)

As another example, we consider the cubic space-group type $[Fm\overline{3}m]$ (No. 225) encompassing a total of 12 distinct Wyckoff positions of the following multiplicities and arities:

$[{\cal W}^{225} = \left\{\matrix{{\cal W}_{a}(1,0),&{\cal W }_{b}(1,0),&{\cal W}_{c}(2,0),\cr {\cal W}_{d}(6,0),&{\cal W}_{e}(6,1),&{\cal W}_{f}(8,1),\cr {\cal W}_{g}(12,1),&{\cal W}_{h}(12,1),&{\cal W}_{i}(12,1),\cr {\cal W}_{j}(24,2),&{\cal W}_{k}(24,2),&{\cal W}_{l}(48,3).}\right.\eqno(43)]$

Note that the multiplicities stated here are those for a reduced (primitive) unit cell, while the multiplicities for an F-centred unit cell would be four times larger. Again, all of the four possible values for the Wyckoff arity are present in this example, in addition to seven distinct values for the Wyckoff multiplicity. This example was chosen because it partly corresponds to actual crystal structures.

Now, for the given choice of (M,A) = (28,3) degrees of freedom one finds 17 matching Wyckoff letter sequences:

$[\eqalignno{& if^{2},hf^{2},\ast\,gf^{2},f^{2}ed,\ast\,ifec,\ast\,hfec,&\cr &gfec,\ast\,fe^{2}dc,\ast\,ifeba,hfeba,gfeba,fe^{2}dba,&\cr &f^{3}cba,ie^{2}cba,he^{2}cba,\ast\,ge^{2}cba,\ast\,e^{3}dcba.&(44)}]$

Here, the seven sequences preceded by an asterisk are realized as crystal structures. A compilation of their complexity values, which are interrelated according to

$[H_{{\rm conf}} = H_{{\rm subdiv}}+(28/31)H_{{\rm comb}}+(3/31)H_{ {\rm coor}},\eqno(45)]$

is given in Table 2. Here, only the combinatorial contribution $[H_{{\rm comb}}]$ is responsible for the variable amount of configurational complexity $[H_{{\rm conf}}]$ , the other terms, in particular the value for $[H_{{\rm subdiv}}]$ , being constant.

Table 2
Configurational complexities for the seven possible Wyckoff sequences (of length k) with 28 combinatorial and three coordinational degrees of freedom of space group type number 225 which have realizations as crystal structures in nature

Shannon entropies are stated in units of bit per freedom. Information regarding the structure type for which the Wyckoff sequence is realized is taken from Pearson's Crystal Data Crystal Structure Database for Inorganic Compounds (Villars & Cenzual, 2020). If more than one structure type is associated with the same Wyckoff sequence a selection has been made.

Wyckoff sequence	k	(M,A) = (28,3)	$[H_{{\rm subdiv}}]$	$[H_{{\rm comb}}]$	$[H_{{\rm coor}}]$	$[H_{{\rm conf}}]$	Structure type
225 gf²	3	(12+8+8,1+1+1)	0.459	1.557	1.585	2.018	Y_0.25Bi_0.75O_1.5
225 ifec	4	(12+8+6+2,1+1+1+0)	0.459	1.788	1.585	2.227	KY₃F₁₀, Zr₃PbO₄F₆
225 hfec	4	(12+8+6+2,1+1+1+0)	0.459	1.788	1.585	2.227	Y_0.27Bi_0.73O_1.5
225 fe²dc	5	(8+6+6+6+2,1+1+1+0+0)	0.459	2.217	1.585	2.615	Cs₃Re₃S₄I₄
225 ifeba	5	(12+8+6+1+1,1+1+1+0+0)	0.459	1.860	1.585	2.292	(Ag_0.5Pd_0.5)₁₁Se₃
225 ge²cba	6	(12+6+6+2+1+1,1+1+1+0+0+0)	0.459	2.092	1.585	2.501	Cr_0.8Mn_0.2Mn(CN)₆(H₂O)₄
225 e³dcba	7	(6+6+6+6+2+1+1,1+1+1+0+0+0+0)	0.459	2.520	1.585	2.888	K_0.04(VO)Co_0.88(CN)_3.8(H₂O)_1.1

If one performs the same calculation with the additional restriction of the Wyckoff sequence length to the value k = 5 one obtains the number of four solutions, which the reader can easily check on the list given above.

4.3. Example 3: space-group type Cmcm (No. 63)

As a final example, we consider the orthorhombic space-group type Cmcm (No. 63) encompassing a total of eight distinct Wyckoff positions of the following multiplicities and arities:

$[{\cal W}^{63} = \left\{\matrix{{\cal W}_{a}(2,0),&{\cal W} _{b}(2,0),&{\cal W}_{c}(2,1),\cr {\cal W}_{d}(4,0),&{\cal W}_{e}(4,1),&{\cal W}_{f}(4,2),\cr {\cal W}_{g}(4,2),&{\cal W}_{h}(8,3).&}\right.\eqno(46)]$

Note that the multiplicities stated here are those for a reduced (primitive) unit cell, while the multiplicities for a C-centred unit cell would be two times larger. Again, all of the four possible values for the Wyckoff arity are present in this example, in addition to three distinct values for the Wyckoff multiplicity.

Now, for the given choice of (M,A) = (22,10) degrees of freedom, one finds 67 matching Wyckoff letter sequences:

$[\eqalignno{& hg^{3}c,hg^{2}fc,hgf^{2}c,\ast\,hf^{3}c,g^{4}ec,g^{3}fec,\ast\,g ^{2}f^{2}ec,&\cr &gf^{3}ec,f^{4}ec,hg^{2}c^{3},hgfc^{3},hf^{2}c^{3},g^{3}ec^{3}, \ast\,g^{2}fec^{3},&\cr &gf^{2}ec^{3},f^{3}ec^{3},hgc^{5},\ast\,hfc^{5},g^{2}ec^{5},gfec^ {5},f^{2}ec^{5},&\cr &\ast\,hc^{7},gec^{7},fec^{7},ec^{9},g^{5}b,g^{4}fb,g^{3}f^{2}b,&\cr &g^{2}f^{3}b,gf^{4}b,f^{5}b,g^{4}c^{2}b,g^{3}fc^{2}b,g^{2}f^{2}c^ {2}b,gf^{3}c^{2}b,&\cr &\ast\,f^{4}c^{2}b,g^{3}c^{4}b,g^{2}fc^{4}b,gf^{2}c^{4}b,f^{3}c^{4 }b,g^{2}c^{6}b,gfc^{6}b,&\cr &f^{2}c^{6}b,gc^{8}b,fc^{8}b,c^{10}b,g^{5}a,g^{4}fa,g^{3}f^{2}a,&\cr &g^{2}f^{3}a,gf^{4}a,f^{5}a,g^{4}c^{2}a,g^{3}fc^{2}a,g^{2}f^{2}c^ {2}a,\ast\,gf^{3}c^{2}a,&\cr &\ast\,f^{4}c^{2}a,g^{3}c^{4}a,\ast\,g^{2}fc^{4}a,gf^{2}c^{4}a, \ast\,f^{3}c^{4}a,g^{2}c^{6}a,gfc^{6}a,&\cr &\ast\,f^{2}c^{6}a,gc^{8}a,fc^{8}a,c^{10}a.&(47)}]$

Here, the 11 sequences preceded by an asterisk are realized as crystal structures. A compilation of their complexity values, which are interrelated according to

$[H_{{\rm conf}} = H_{{\rm subdiv}}+(22/32)H_{{\rm comb}}+(10/32)H_{ {\rm coor}},\eqno(48)]$

is given in Table 3. Here, both the combinatorial and coordinational degrees of freedom are responsible for the variable amount of configurational complexity $[H_{{\rm conf}}]$ .

Table 3
Configurational complexities for the 11 possible Wyckoff sequences (of length k) with 22 combinatorial and ten coordinational degrees of freedom of space group type number 63 which have realizations as crystal structures in nature

Wyckoff sequence	k	(M,A) = (22,10)	$[H_{{\rm subdiv}}]$	$[H_{{\rm comb}}]$	$[H_{{\rm coor}}]$	$[H_{{\rm conf}}]$	Structure type
63 hf³c	5	(8+4+4+4+2,3+2+2+2+1)	0.896	2.187	2.246	3.101	SrGe_5.5
63 g²f²ec	6	(4+4+4+4+4+2,2+2+2+2+1+1)	0.896	2.550	2.522	3.438	K₂Zn₅As₄
63 g²fec³	7	(4+4+4+4+2+2+2,2+2+2+1+1+1+1)	0.896	2.732	2.722	3.625	[UO₂]Cl₂[NH₃]₆
63 hfc⁵	7	(8+4+2+2+2+2+2,3+2+1+1+1+1+1)	0.896	2.550	2.646	3.476	Yb₂Mn_0.33Si_3.67
63 f⁴c²b	7	(4+4+4+4+2+2+2,2+2+2+2+1+1+0)	0.896	2.732	2.522	3.562	K₂Cu₃US₅
63 gf³c²a	7	(4+4+4+4+2+2+2,2+2+2+2+1+1+0)	0.896	2.732	2.522	3.562	Zn(Zn_0.45Co_0.55)Co₃Sn₄
63 f⁴c²a	7	(4+4+4+4+2+2+2,2+2+2+2+1+1+0)	0.896	2.732	2.522	3.562	CaFe₄O₆
63 hc⁷	8	(8+2+2+2+2+2+2+2, 3+1+1+1+1+1+1+1)	0.896	2.732	2.846	3.664	KNaOs[NO][F₅][H₂O]
63 g²fc⁴a	8	(4+4+4+2+2+2+2+2, 2+2+2+1+1+1+1+0)	0.896	2.914	2.722	3.750	β-U₃O₈
63 f³c⁴a	8	(4+4+4+2+2+2+2+2, 2+2+2+1+1+1+1+0)	0.896	2.914	2.722	3.750	Cs₂Cu₅Se₄
63 f²c⁶a	9	(4+4+2+2+2+2+2+2+2, 2+2+1+1+1+1+1+1+0)	0.896	3.096	2.922	3.938	Sr₂Ta₂O₇

5. Conclusion

As a contribution to the combinatorics of Wyckoff sequences, we have presented two methods to calculate their number for a fixed space group, given a pair of combinatorial and coordinational total degrees of freedom, and, optionally, their length. The first method is based on a generating polynomial approach (see Sections 2.4 and 2.7 for the key results), while the second makes use of a dynamic programming algorithm (Section 2.8). While the generating polynomial approach appears to be conceptually easier to understand, the dynamic programming algorithm is considerably better in its computational performance. The methods have been exemplified on cases of ideal and actual crystal structures with invariant subdivision complexity and variable configurational complexity in the sense of Hornfeck (2020), thus relating the combinatorics of Wyckoff sequences to the complexities of crystal structures.

APPENDIX A

Number of Wyckoff sequences of length k

The number of Wyckoff sequences of length k is given as

$[W(\nu,\varphi,k) = \sum_{i = 0}^{k}{{\nu+i-1}\choose{i}}{{\varphi}\choose{k-i}}\eqno(49)]$

in which ν and φ, respectively, denote the number of non-fixed and fixed Wyckoff positions of a given space group [equation (7) in Hornfeck (2022a)]. In the worst case $[\nu = 19]$ and $[\varphi = 8]$ , thus

$[\textstyle\sum\limits_{k = 1}^{50}W(19,8,k) = 3\,519\,297\,892\,616\,574\,305\eqno(50)]$

gives the number of distinct Wyckoff sequences up to k = 50 inclusive.

APPENDIX B

The classical coin change problem

Assume you are in the possession of the following multiset of coins: three coins of value v = 1, four coins of value v = 2, two coins of value v = 5. How many ways exist to pay a total price of value V = 12?

Let S_v denote the individual sum of coins of value v, hence

$[V = S_{1}+S_{2}+S_{5} = 12.\eqno(51)]$

The values which each individual sum may take are determined and limited in number by the number of available coins of each nominal value. Thus, one gets

$[\eqalignno{& S_{1}\in\{0,1,2,3\},&\cr &S_{2}\in\{0,2,4,6,8\},&\cr &S_{5}\in\{0,5,10\}&(52)}]$

as all possible values. Now, the trick is that one identifies these values with the exponents of a couple of generating polynomials, one for each type of coin, namely

$[\eqalignno{P_{1}(x)& = 1+x+x^{2}+x^{3},&\cr P_{2}(x)& = 1+x^{2}+x^{4}+x^{6}+x^{8}&\cr P_{5}(x)& = 1+x^{5}+x^{10}&(53)}]$

in which x⁰ = 1 by definition. From these individual polynomials one forms their product

$[{\cal P}(x) = P_{1}(x)P_{2}(x)P_{5}(x)\eqno(54)]$

and analyses its expansion

$[\eqalignno{& 1+x+2x^{2}+2x^{3}+2x^{4}+3x^{5}+3x^{6}+4x^{7}&\cr &+4x^{8}+4x^{9}+4x^{10}+4x^{11}+4x^{12}+4x^{13}+4x^{14}&\cr &+3x^{15}+3x^{16}+2x^{17}+2x^{18}+2x^{19}+x^{20}+x^{21}.&(55)}]$

The coefficient of x¹² in $[{\cal P}(x)]$ gives the desired solution, here it is equal to four, describing the four cases (i) 2×1+2×5, (ii) 1×2+2×5, (iii) 3×1+2×2+1×5 and (iv) 1×1+3×2+1×5, all amounting to 12.

A generalization is possible, in case the number of coins available for all given values v is assumed to be infinite. Then, the generating polynomials are replaced by the respective generating functions of a geometric series,

$[F_{v}(x) = 1+x^{v}+x^{2v}+x^{3v}+\cdots = {{1} \over {1-x^{v}}},\eqno(56)]$

and the solution to the problem is given by their product

$[{\cal F}(x) = F_{1}(x)F_{2}(x)F_{5}(x) = {{1} \over {\left(1-x\right)\left(1-x^{ 2}\right)\left(1-x^{5}\right)}}\eqno(57)]$

in the same way as before, namely as the value of the coefficient [x^V] in $[{\cal F}(x)]$ , in which V denotes the total price. In the infinite case the result is $[[x^{12}]{\cal F}(x) = 13]$ , necessarily containing at least as many or even more solutions than were found in the finite case. In this general form the problem has been addressed by Pólya (1956 ). A closed form solution for the univariate infinite case has been described by Graham et al. (1994, pp. 327–330 and pp. 344–346).

Note that

$[\prod_{i = 1}^{\infty}{{1} \over {1-x^{i}}}\eqno(58)]$

is the generating function of the number-theoretic integer partition function (Alfonsín, 2005 , p. 71), and our problem asks for the number of restricted partitions into specified, possibly repeated parts (a finite or infinite number of coins with a finite set of distinct denominations). In this context the number of non-negative integer solutions is known as Sylvester's denumerant (Sylvester, 1857 ) which has a rich history in number theory, partly due to the difficulty of obtaining exact results (Alfonsín, 2005, ch. 4).

Finally, it should be emphasized that an elegant method of solving this problem in practice is given by the concept of dynamical programming. Since the problem again and again reduces to simpler subproblems of the same kind, the solutions of these subproblems can be stored whenever they occur for the first time. Moreover, using a bottom-up approach (top-down would be equally feasible), the solutions for subproblems can be systematically generated from the already known solutions. For this purpose the function

here given in a Python implementation is initialized with a result vector of length n+1 of the form $[(1,0,0,\ldots,0)]$ . This result vector is subsequently updated in a separate loop for every coin value. The final value at position n of the result vector then is the solution to the (infinite) coin problem described before (for n = 12 the result is equal to 13).

APPENDIX C

Code examples

Figs. 6 and 7 give working examples for the calculation of the number of Wyckoff sequences with the same subdivision complexity by means of the generating polynomial and dynamic programming approaches. The input values are taken from the example described in Section 2.6, with the output being equal to six.

Figure 6
Generating polynomial approach (Mathematica).

Figure 7
Dynamic programming approach (Python). Here, the target tuple is given as [M,A] = [12,3] and the Wyckoff tuples are given as $[{\cal W} = [[2,0],[2,0],[2,1],[4,1],[4,2],[8,3]]]$ .

Acknowledgements

The author thanks Philipp Kuhn for discussions, one reviewer for helpful comments which greatly improved the manuscript, Michal Dušek for his kind endorsement, as well as the organizers of the conference Mathematics and Computer Science for Materials Innovation (MACSMIN), Vitaliy Kurlin and Mois Aroyo, for the invitation to present this work at MACSMIN-2022 (Crystal Lattice Classifications) on location at the Materials Innovation Factory, Liverpool, United Kingdom.

Funding information

This work was supported by the Czech Science Foundation, project No. 21-05926X. We also acknowledge CzechNanoLab Research Infrastructure supported by MEYS CR (LM2018110).

References

Alfonsín, J. L. R. (2005). The Diophantine Frobenius Problem. Oxford: Oxford University Press. Google Scholar
Allmann, R. & Hinek, R. (2007). Acta Cryst. A63, 412–417. Web of Science CrossRef IUCr Journals Google Scholar
Aroyo, M. I. (2016). Editor. International Tables for Crystallography, Vol. A, Space-group Symmetry, 6th ed. Chichester: Wiley. Google Scholar
Aroyo, M. I., Kirov, A., Capillas, C., Perez-Mato, J. M. & Wondratschek, H. (2006a). Acta Cryst. A62, 115–128. Web of Science CrossRef CAS IUCr Journals Google Scholar
Aroyo, M. I., Perez-Mato, J. M., Capillas, C., Kroumova, E., Ivantchev, S., Madariaga, G., Kirov, A. & Wondratschek, H. (2006b). Z. Kristallogr. 221, 15–27. Web of Science CrossRef CAS Google Scholar
Aroyo, M. I., Perez-Mato, J. M., Orobengoa, D., Tasci, E., de la Flor, G. & Kirov, A. (2011). Bulg. Chem. Commun. 43, 183–197. CAS Google Scholar
Brown, H., Bülow, R., Neubüser, J., Wondratschek, H. & Zassenhaus, H. (1978). Crystallographic Groups of Four Dimensional Space. New York: Wiley. Google Scholar
Gelato, L. M. & Parthé, E. (1987). J. Appl. Cryst. 20, 139–143. CrossRef Web of Science IUCr Journals Google Scholar
Goodall, R. E. A., Parackal, A. S., Faber, F. A., Armiento, R. & Lee, A. A. (2022). Sci. Adv. 8, eabn4117. CrossRef PubMed Google Scholar
Graham, R. L., Knuth, D. L. & Patashnik, O. (1994). Concrete Mathematics – a Foundation for Computer Science, 2nd ed. New York: Addison-Wesley. Google Scholar
Hahn, T. (2005). Editor. International Tables for Crystallography, Vol. A, Space-group symmetry. 5th ed. Heidelberg: Springer. Google Scholar
Hornfeck, W. (2020). Acta Cryst. A76, 534–548. Web of Science CrossRef IUCr Journals Google Scholar
Hornfeck, W. (2022a). Acta Cryst. A78, 149–154. CrossRef IUCr Journals Google Scholar
Hornfeck, W. (2022b). Z. Kristallogr. 237, 127–134. CrossRef CAS Google Scholar
Krivovichev, S. (2012). Acta Cryst. A68, 393–398. Web of Science CrossRef IUCr Journals Google Scholar
Krivovichev, S. V. (2014). Angew. Chem. Int. Ed. 53, 654–661. Web of Science CrossRef CAS Google Scholar
Kruchinin, D., Kruchinin, V. & Shablya, Y. (2021). Mathematics, 9, 428. CrossRef Google Scholar
Lima-de-Faria, J., Hellner, E., Liebau, F., Makovicky, E. & Parthé, E. (1990). Acta Cryst. A46, 1–11. CrossRef CAS IUCr Journals Google Scholar
Marcus, D. A. (1998). Combinatorics – a Problem Oriented Approach. Washington, D. C.: The Mathematical Association of America. Google Scholar
Müller, U. (2013). Symmetry Relationships between Crystal Structures – Applications of Crystallographic Group Theory in Crystal Chemistry. Oxford: Oxford University Press. Google Scholar
Nespolo, M., Aroyo, M. I. & Souvignier, B. (2018). J. Appl. Cryst. 51, 1481–1491. Web of Science CrossRef CAS IUCr Journals Google Scholar
Parthé, E., Cenzual, K. & Gladyshevskii, R. E. (1993a). J. Alloys Compd. 197, 291–301. Google Scholar
Parthé, E., Gelato, L., Chabot, B., Penzo, M., Cenzual, K. & Gladyshevskii, R. (1993b). TYPIX – Standardized Data and Crystal Chemical Characterization of Inorganic Structure Types (Vol. 1). In Gmelin Handbook of Inorganic and Organometallic Chemistry, 8th ed. Berlin: Springer. Google Scholar
Parthé, E. & Gelato, L. M. (1984). Acta Cryst. A40, 169–183. CrossRef Web of Science IUCr Journals Google Scholar
Parthé, E. & Gelato, L. M. (1985). Acta Cryst. A41, 142–151. CrossRef Web of Science IUCr Journals Google Scholar
Pólya, G. (1956). Am. Math. Mon. 63, 689–697. Google Scholar
Sylvester, J. J. (1857). Q. J. Pure Appl. Math. 1, 141–152. Google Scholar
Villars, P. & Cenzual, K. (2020). Pearson's Crystal Data Crystal Structure Database for Inorganic Compounds. ASM International, Materials Park, Ohio, USA. https://www.crystalimpact.com/pcd/Default.htm. Google Scholar
Wilf, H. S. (2006). Generatingfunctionology, 3rd ed. Wellesley: A. K. Peters. Google Scholar

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.

FOUNDATIONS
ADVANCES

ISSN: 2053-2733

Volume 79| Part 3| May 2023| Pages 280-294

https://doi.org/10.1107/S2053273323002437

Open

access

Format		BIBTeX
		EndNote
		RefMan
		Refer
		Medline
		CIF
		SGML
		Plain Text
		Text

Format		BIBTeX
		EndNote
		RefMan
		Refer
		Medline
		CIF
		SGML
		Plain Text
		Text

Search IUCr Journals		doi		Advanced search
Author		volume	page

crystal lattices\(\def\hfill{\hskip 5em}\def\hfil{\hskip 3em}\def\eqno#1{\hfil {#1}}\)

On the combinatorics of crystal structures. II. Number of Wyckoff sequences of a given subdivision complexity

1. Introduction

2. Combinatorics of Wyckoff sequences

2.1. A problem of crystals – exposition

2.2. Conditions on the multipliers

2.3. Size of the search space

2.4. A generating polynomial approach to find solutions

2.5. A generalization including the Wyckoff sequence length

2.6. A problem of crystals – exemplification

2.7. Summary of the generation function approach

2.8. A dynamic programming approach to find solutions

3. Complexity of Wyckoff sequences

3.1. Shannon entropy based complexity measures

4. Combining the combinatorics and complexity of Wyckoff sequences

4.1. Example 1: space-group type P421m (No. 113) revisited

4.2. Example 2: space-group type Fm3m (No. 225)

4.3. Example 3: space-group type Cmcm (No. 63)

5. Conclusion

APPENDIX A

Number of Wyckoff sequences of length k

APPENDIX B

The classical coin change problem

APPENDIX C

Code examples

Acknowledgements

Funding information

References

crystal lattices

4.1. Example 1: space-group type P42₁m (No. 113) revisited