- 1. Introduction: the paper's background, organization, motivation, primary goal and secondary objective
- 2. Fedorov-type pseudosymmetries illustrated on a noise-free synthetic pattern
- 3. Pertinent equations, inequalities, plane symmetry and 2D Laue class hierarchy trees, and their usages
- 4. Objective crystallographic symmetry classifications of three synthetic crystal patterns and an optimal crystallographic-image-processing-induced noise suppression
- 5. Comparisons of our classification results with suggestions by the CRISP program and associated comments
- 6. Summary and conclusions
- C1. Quaternary symmetry and pseudosymmetry of intrinsic membrane protein complexes
- C2. Development of an information-theoretic projected point symmetry classification and quantifications method
- Supporting information
- References
research papers
Objective crystallographic symmetry classifications of a noisy crystal pattern with strong Fedorov-type pseudosymmetries and its optimal image-quality enhancement
Department of Physics, Portland State University, Portland 97201-0751, USA
*Correspondence e-mail: pmoeck@pdx.edu
Statistically sound crystallographic symmetry classifications are obtained with information-theory-based methods in the presence of approximately Gaussian distributed noise. A set of three synthetic patterns with strong Fedorov-type pseudosymmetries and varying amounts of noise serves as examples. Contrary to traditional classifications with an image processing program such as CRISP, the classification process does not need to be supervised by a human being and is free of any subjectively set thresholds in the geometric model selection process. This enables classification of digital images that are more or less periodic in two dimensions (2D), also known as crystal patterns, as recorded with sufficient structural resolution from a wide range of crystalline samples with different types of scanning probe and transmission electron microscopes. Correct symmetry classifications enable the optimal crystallographic processing of such images. That processing consists of the averaging over all asymmetric units in all unit cells in the selected image area and significantly enhances both the signal-to-noise ratio and the structural resolution of a microscopic study of a crystal. For sufficiently complex crystal patterns, the information-theoretic symmetry classification methods are more accurate than both visual classifications by human experts and the recommendations of one of the popular crystallographic image processing programs of electron crystallography.
Keywords: plane symmetry groups; projected Laue classes; Fedorov-type pseudosymmetries; information theory; crystallographic image processing.
1. Introduction: the paper's background, organization, motivation, primary goal and secondary objective
1.1. Crystallographic symmetries and pseudosymmetries
The symmetries of the Euclidean plane that are compatible with translation periodicity in two dimensions (2D) are tabulated exhaustively in Volume A of International Tables for Crystallography (Aroyo, 2016) and in the Brief Teaching Edition of Volume A (Hahn, 2010) of that series of authoritative reference books from the International Union of Crystallography (IUCr). Noncrystallographic symmetry has been defined in the IUCr's Online Dictionary of Crystallography as a `symmetry operation that is not compatible with the periodicity of a crystal pattern' (https://dictionary.iucr.org/Noncrystallographic_symmetry).
It is also noted in this dictionary and by Nespolo et al. (2008) that this term is often improperly used in biological crystallography, where one should refer either to local and partial symmetry operations, on the one hand, or to pseudosymmetries, on the other hand. The above-mentioned online dictionary defines a crystallographic pseudosymmetry simply as featuring a `deviation' from a space-group symmetry (of one, two, or three dimensions) that `is limited' without explaining how the deviation is to be quantified (https://dictionary.iucr.org/Pseudo_symmetry). In this paper, we will provide such quantifications for three synthetic crystal patterns.
A crystal pattern is defined as the `generalization of a crystal structure to any pattern, concrete or abstract, in any dimension, which obeys the conditions of periodicity and discreteness' (https://dictionary.iucr.org/Crystal_pattern). Physical realizations of a crystal pattern can be undisturbed or disturbed/noisy.
Pseudosymmetry is `a spatial arrangement that feigns a symmetry without fulfilling it' (Moeck, 2018) and can exist at either the site/point symmetry level of a plane symmetry group or the projected Bravais lattice type level, or a combination thereof. When a very strong translational pseudosymmetry results in components and lattice parameters that are, within experimental error bars, indistinguishable from those of a higher-symmetry Bravais lattice type, one speaks of a metric specialization (Moeck & DeStefano, 2018). On the site/point symmetry level, one can make a distinction between crystallographic pseudosymmetries that are either compatible with the of the of the genuine symmetries or a of the genuine symmetries. These kinds of pseudosymmetries are often collectively called Fedorov-type pseudosymmetries (Chuprunov, 2007).
Pseudosymmetries of the Fedorov type form plane `pseudosymmetry groups', which are either disjoint or non-disjoint from the plane symmetry groups of the genuine symmetries. The lowest-symmetry pseudosymmetry group is per definition always disjoint from the lowest-symmetry genuine symmetry group that provides the best fit to experimental data. The minimal Fedorov-type pseudosymmetry supergroups of lowest-symmetry maximal pseudosymmetry subgroups can, however, be non-disjoint from the lowest-symmetry genuine symmetry group.
When Fedorov-type pseudosymmetries and genuine symmetries exist in direct space, they exist in reciprocal/Fourier space as well. In noisy experimental data, local and partial symmetries may become difficult to distinguish from pseudosymmetries and genuine symmetries alike.
1.2. Assignments of symmetries in the presence of noise
Note that only the idealized structure of a real-world crystal is strictly periodic in three dimensions (3D) and features an unbroken discrete space symmetry group. Analogously, the idealized structure of a subperiodic crystal (such as a regular array of intrinsic membrane protein complexes in a lipid bilayer) is strictly periodic in 2D and features an unbroken discrete layer symmetry group (Kopský & Litvin, 2010).
The 2D projection of the structure of a real crystal that contains only a few localized symmetry-breaking structural defects is, however, deemed to possess a discrete plane symmetry group on average over multiple unit cells as well. The genuine plane symmetry group of the projected real structure is per definition the plane symmetry group that is least broken. The lowest-symmetry plane symmetry group of the genuine symmetries is referred to here as the `anchoring group' and is measurably least broken by `aggregated noise' from multiple sources.
By these definitions, Fedorov-type pseudosymmetry groups are broken to a measurably larger extent than the symmetry group of the genuine symmetries (and all maximal subgroups of these symmetries and their respective maximal subgroups). This will be further elaborated on in Section 2 of this paper, where a visual example is provided.
In the presence of noise, it may become difficult for human classifiers to distinguish Fedorov-type pseudosymmetries from their genuine-symmetry counterparts. This difficulty arises from the unaided human classifier's need to extrapolate `on sight' to a hypothetical noise-free version of the crystal pattern.
1.3. Crystallographic image processing and the symmetry inclusion problem
The essence of crystallographic image processing (Hovmöller, 1992; Valpuesta et al., 1994; Wan et al., 2003; Kilaas et al., 2005; Gipson et al., 2007; Zou et al., 2011) is the enforcing of the 2D site/point symmetries that correspond to a certain higher-symmetry plane symmetry group on all of the pixel intensity values within the direct-space translation-averaged unit cell.
The Fourier-space representation of the translation-averaged unit cell is obtained by calculating the discrete Fourier transform of the image intensity and filtering out all non-structure-bearing Fourier coefficients. Fourier back-transforming the periodic structure-bearing Fourier coefficients (that are laid out on a reciprocal lattice in the amplitude map of the discrete Fourier transform) leads to the translation-averaged unit cell in direct space.
Obtaining the translation-averaged direct-space unit cell is, therefore, known as traditional Fourier filtering (Park & Quate, 1987). The non-structure-bearing Fourier coefficients represent the bulk of the noise in the direct-space image. Accordingly, their filtering out enhances the signal-to-noise ratio and structural resolution of the Fourier-filtered image.
The enforcing of the symmetries of a certain higher-symmetry plane symmetry group on the structure-bearing Fourier coefficients of a more or less 2D periodic image is, loosely speaking, obtained by averaging over the corresponding symmetry-related sets of structure-bearing Fourier coefficients. (These sets are specific to each plane symmetry group.) This averaging/symmetrizing enforces all site/point symmetries of the chosen plane symmetry group onto the translation-averaged unit cell when the symmetrized structure-bearing Fourier coefficients are back-transformed into a direct-space image. In effect, one has averaged in Fourier space over all asymmetric units in all unit cells of a selected region of a digital direct-space input image.
When done correctly, crystallographic image processing significantly increases the signal-to-noise ratio and intrinsic quality1 (Paganin et al., 2019; Gureyev et al., 2019) of a digital image. Compared with traditional Fourier filtering, the processing of a digital image in the correctly determined plane symmetry group leads to a further increase of the signal-to-noise ratio and an associated increase of the structural resolution of a crystallographic study. For (approximately) Gaussian distributed noise, crystallographic image processing is more effective in the suppression of noise than Fourier filtering alone by (approximately) the square root of the multiplicity of the general position per lattice point. (That multiplicity is equal to the number of non-translational symmetry operations in a plane symmetry group.)
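The two averaging steps described above can be illustrated on a one-dimensional toy signal. The following Python sketch is not the author's software and does not reproduce any function of CRISP; it assumes a hypothetical mirror-symmetric motif, a single mirror as a stand-in for a plane symmetry group with a general-position multiplicity of two, and strictly Gaussian noise. All function names and parameter values are illustrative choices.

```python
import cmath
import math
import random


def dft(values):
    """Naive discrete Fourier transform (adequate for a short toy signal)."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(coeffs):
    """Naive inverse transform; returns the real part of each sample."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def fourier_filter(signal, period):
    """Keep only the structure-bearing coefficients of a signal of known period."""
    n_cells = len(signal) // period
    coeffs = dft(signal)
    # A signal made of exactly n_cells repeats has periodic content only at
    # indices that are multiples of n_cells; all other coefficients are
    # treated as non-structure-bearing and set to zero.
    kept = [c if k % n_cells == 0 else 0.0 for k, c in enumerate(coeffs)]
    return inverse_dft(kept)


def enforce_mirror(cell):
    """Average each sample with its mirror image about index 0."""
    p = len(cell)
    return [(cell[i] + cell[(-i) % p]) / 2 for i in range(p)]


def rms_error(a, b):
    """Root-mean-square difference between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


random.seed(0)  # fixed seed so that the sketch is reproducible
PERIOD, N_CELLS, NOISE_LEVEL = 16, 16, 0.5  # assumed toy values

# Hypothetical motif that is mirror symmetric about index 0.
motif = [math.cos(2 * math.pi * i / PERIOD) + 0.5 * math.cos(4 * math.pi * i / PERIOD)
         for i in range(PERIOD)]
clean = motif * N_CELLS
noisy = [v + random.gauss(0.0, NOISE_LEVEL) for v in clean]

filtered = fourier_filter(noisy, PERIOD)          # translation averaging
averaged_cell = filtered[:PERIOD]
symmetrized_cell = enforce_mirror(averaged_cell)  # symmetry enforcing

err_raw = rms_error(noisy, clean)
err_filtered = rms_error(averaged_cell, motif)
err_symmetrized = rms_error(symmetrized_cell, motif)
print(f"raw: {err_raw:.3f}  filtered: {err_filtered:.3f}  symmetrized: {err_symmetrized:.3f}")
```

For Gaussian noise, one expects the error after Fourier filtering to be roughly `NOISE_LEVEL` divided by the square root of `N_CELLS`, and the mirror averaging to reduce it further by a factor that approaches the square root of two. Samples that lie on the mirror itself (here indices 0 and `PERIOD // 2`) gain nothing from the second step, so the observed factor is somewhat smaller in this toy case.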
The knowledge of the most likely plane symmetry that a hypothetical version of an image would possess in the absence of noise is the precondition for the correct/optimal crystallographic processing of that image. For a previously not classified crystal or crystal pattern, this knowledge has historically not been easy to come by. Elucidating that kind of plane symmetry group has been a long-standing problem in both the computational symmetry subfield of computer science (Liu et al., 2009) and electron crystallography.
The main reason that this problem remained unsolved for more than half a century is the existence of mathematically defined inclusion relations between the individual symmetry groups, classes and types. In other words, the main reason was the non-disjointness of many of the geometric models that are to be compared with the input image data and from which the best, i.e. statistically most justified, model for the digital input image data is to be selected. Symmetry inclusion relations, non-disjointness and disjointness are explained in some detail in Section 3 of this paper. Section 3 also presents the plane symmetry hierarchy tree as a visualization of disjoint and non-disjoint symmetry inclusion relationships between the translationengleiche (Aroyo, 2016; Hahn, 2010; Burzlaff et al., 1968) maximal subgroups and minimal supergroups of the plane symmetry groups. The symmetry hierarchy tree of the 2D point symmetries that are projected Laue classes is also provided there.
1.4. Using a geometric form of information theory offers a workaround to the symmetry inclusion problem
This author recently presented so far unique interpretation-threshold-free solutions to identifying the genuine plane symmetry group and projected Laue class in digital more or less 2D periodic images in the presence of pseudosymmetries and generalized noise (Moeck, 2018, 2019, 2021d; Moeck & Dempsey, 2019; Dempsey & Moeck, 2020; Moeck, 2021b,c). Fedorov-type pseudosymmetries do not present challenges to these solutions as they are reliably identified (and can be quantified) as long as noise levels are moderate. This will be demonstrated in this paper.
The author's solutions are based on Kenichi Kanatani's geometric form of information theory2 (Kanatani, 1997, 1998, 2004, 2005). Kanatani's theory presents a geometric `workaround' to the symmetry inclusion relations problem and has the added benefit that the prevailing noise level does not need to be estimated for the comparison of non-disjoint geometric models of digital image data. This statistical theory tackles the inclusion problem that a less restricted, e.g. lower symmetry, model of some input image data will always feature a smaller deviation (by any kind of distance measure) to the input image data than any more restricted, e.g. higher symmetry, model that is non-disjoint (Kanatani, 1997, 1998). In other words, the fit to some experimental data with more parameters will always be better than a fit with fewer parameters. The adaptation of Kanatani's framework to crystallographic symmetry classifications and quantifications is described in detail in Moeck (2018). Section 3 of this paper gives the relevant equations and inequalities for making objective plane symmetry and projected Laue class classifications with the author's methods. (The usage of those relations has led to the results that are presented in Section 4.)
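The inclusion problem described above can be made concrete with a small numerical sketch. The Python code below is only an illustration under stated assumptions and is not the author's classification method: the two one-dimensional motifs, the noise level and all function names are hypothetical, and Kanatani's model-selection criterion itself is not implemented. The sketch merely shows that the more restricted (mirror-symmetric) model never fits noisy data better than the less restricted (merely periodic) model that contains it, regardless of whether the mirror is genuine or only a pseudosymmetry.

```python
import random


def translation_average(data, period):
    """Least-squares periodic model: mean over all repeats of each sample."""
    n_cells = len(data) // period
    return [sum(data[i + c * period] for c in range(n_cells)) / n_cells
            for i in range(period)]


def enforce_mirror(cell):
    """More restricted model: additionally impose a mirror about index 0."""
    p = len(cell)
    return [(cell[i] + cell[(-i) % p]) / 2 for i in range(p)]


def sum_squared_residuals(data, cell):
    """Sum of squared differences between the data and a periodic model."""
    p = len(cell)
    return sum((v - cell[i % p]) ** 2 for i, v in enumerate(data))


def residual_ratio(motif, n_cells, noise_level, rng):
    """Residuals of the mirror model divided by those of the periodic model."""
    data = [v + rng.gauss(0.0, noise_level) for v in motif * n_cells]
    periodic_model = translation_average(data, len(motif))
    mirror_model = enforce_mirror(periodic_model)
    ssr_low_symmetry = sum_squared_residuals(data, periodic_model)
    ssr_high_symmetry = sum_squared_residuals(data, mirror_model)
    return ssr_high_symmetry / ssr_low_symmetry


rng = random.Random(1)  # fixed seed so that the sketch is reproducible

# Hypothetical motifs: the first has an exact mirror about index 0,
# the second only an approximate one (a toy pseudosymmetry).
genuine_mirror = [4.0, 3.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
pseudo_mirror = [4.0, 3.3, 2.0, 1.0, 0.0, 1.0, 2.0, 2.7]

ratio_genuine = residual_ratio(genuine_mirror, 50, 0.2, rng)
ratio_pseudo = residual_ratio(pseudo_mirror, 50, 0.2, rng)
print(f"genuine mirror: {ratio_genuine:.3f}   pseudo mirror: {ratio_pseudo:.3f}")
```

Both ratios are at least one, which is why a comparison of residuals on its own cannot select between non-disjoint models. The ratio is expected to stay close to one when the mirror is genuine and to be clearly larger when it is only a pseudosymmetry; deciding how much larger is too large without a subjectively set threshold is the task that the geometric form of information theory takes on.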
Objectivity is in this paper to be understood as only stating what digital image data actually reveal about a crystal pattern without any subjective interpretation of any symmetry distance measure. This objectivity is obtained by using a geometric form of information theory.
Note that the information-theory-based classification methods of this author should be generalized to three spatial dimensions. This is because there is also subjectivity in the current practice of single-crystal X-ray and neutron crystallography (Moeck, 2018). Fedorov-type pseudosymmetries exist also in three dimensions and are not rare in nature (Chuprunov, 2007; Somov & Chuprunov, 2009; Moeck, 2018). The symmetry inclusion relationships of the space groups occupy the bulk of Volume A1 of International Tables for Crystallography (Wondratschek & Müller, 2004). Note in passing that Kanatani's statistical theory is valid in any dimension.
It is very well known that the structural resolution of crystallographic studies depends on the number of structural entities over which one averages (McLachlan, 1958). The optimal averaging can, however, only be obtained for the correct prior symmetry classification of the data that enter into such studies when no prior knowledge of the crystal and/or symmetry is available.
Optimal crystallographic averaging in 2D and crystallographic image processing on the basis of the correctly identified plane symmetry group are synonymous. One enforces in this case all of the site/point symmetries that the translation-averaged image needs to feature in order to be the best representation of the input image data in the information-theoretic sense. This best representation is often called the `Kullback–Leibler best', `minimal geometric Akaike information criterion (G-AIC) value' or simply the K-L best geometric model that the input image data maximally support.
1.5. Prior information-theoretic distinctions between genuine symmetries and Fedorov-type pseudosymmetries based on a reasonable noise distribution estimate
Generalized noise (Moeck, 2018, 2019, 2021d; Dempsey & Moeck, 2020) is defined in this paper as the sum of all deviations from the genuine translation periodic symmetries in a crystal's structure and/or the imaged 2D periodic properties of the crystal. At the experimental level, generalized noise as defined here combines all effects of a less-than-perfect imaging of a crystal, all rounding errors and effects of approximations in the applied image processing algorithms, effects such as uneven staining in the cryo-electron microscopy of subperiodic intrinsic membrane protein crystals, slight deviations from exact zone-axis orientations in transmission electron microscopes, and the real structure that typically exists in addition to the ideal structure of a crystal. This definition applies also to undisturbed and disturbed/noisy crystal patterns in two dimensions as analyzed in this paper. For the author's information-theoretic classification methods (Moeck, 2018, 2019, 2021d; Moeck & Dempsey, 2019; Dempsey & Moeck, 2020; Moeck, 2021b,c) to work reliably, the generalized noise needs to be Gaussian distributed [with mean zero and standard deviation ɛ, which Kanatani calls the `noise level' (Kanatani, 2005)] to a sufficient approximation.
The information-theoretic distinction between Fedorov-type pseudosymmetries that are compatible with a of the underlying and the genuine symmetries has been demonstrated already in a very short conference paper (Moeck & Dempsey, 2019). Those symmetry classifications used a crystal pattern of low complexity to which moderate to large amounts of Gaussian distributed noise were added.
Dempsey & Moeck (2020) simulated the amounts and types of noise that needed to be added to a crystal pattern with site/point and translational pseudosymmetries for the plane symmetry classifications by the information-theoretic method to misclassify pseudosymmetries as genuine symmetries. Fourteen versions of the same medium-complexity pattern were used in that study. For each version, four classifications were made for pattern regions of different sizes and shapes. The addition of strictly Gaussian distributed noise, up to the limit that a freely available computer program (GIMP 2.10, for Windows 7 and above, downloadable from https://www.gimp.org/) enabled, did not result in any misclassification. Changing the aggregate composition of the noise systematically so that it was to lesser extents approximately Gaussian distributed resulted in a single misclassification (out of 56 classifications in total). The misclassification happened for the noisiest image and the smallest image-region selection. Note that human expert classifiers would probably have made more than one misclassification when confronted with the same tasks (Dempsey & Moeck, 2020).
This paper will demonstrate statistically sound distinctions between genuine symmetries and strong Fedorov-type pseudosymmetries for a highly complex crystal pattern and two of its noisy versions in Section 4.
1.6. Crystallographic symmetry classifications and image processing in contemporary electron crystallography
The common practice in electron crystallography is to make crystallographic symmetry classifications on the basis of subjective interpretations of the values of Fourier-space `symmetry deviation quantifiers' that measure distances between the translation-averaged input image and differently symmetrized versions of that image (Hovmöller, 1992; Zou et al., 2011; Gipson et al., 2007; Wan et al., 2003; Kilaas et al., 2005; Henderson et al., 2012; Lawson et al., 2020). Following up on a report by Henderson et al. (2012) on the first electron crystallography validation task force meeting, it has recently been noted with respect to cryo-electron microscopy that `… as currently practiced, the procedure is not sufficiently standardized: a number of different variables (e.g. … threshold value for interpretation) can substantially impact the outcome. As a result, different expert practitioners can arrive at different resolution estimates for the same level of map details.' (Lawson et al., 2020). In the context of computational imaging1 (Gureyev et al., 2019; Paganin et al., 2019), `resolution' in this direct quote stands for structural resolution and intrinsic image quality.
Two different sets of symmetry deviation quantifiers that are based on structure-bearing Fourier coefficients, as implemented in the crystallographic image processing programs CRISP (Hovmöller, 1992; Zou et al., 2011; Zou & Hovmöller, 2012) and ALLSPACE (Valpuesta et al., 1994), are most popular in the electron crystallography community. Neither of these two sets of quantifiers consists of maximal-likelihood estimates combined with geometric model selection-bias correction terms for objective symmetry model selections of digital input image data. A geometric form of information theory can, therefore, not be based on these quantifiers in order to avoid a necessarily subjective decision of what the underlying plane symmetry most likely is (in the considered opinion of the users of these two computer programs).
Whereas the sets of typically employed symmetry deviation quantifiers in contemporary electron crystallography provide quantitative numerical measures, the decision as to which plane symmetry group should be enforced on the input image data as part of their crystallographic image processing is with necessity left to the electron crystallographer. In the presence of symmetry inclusion relations, Fedorov-type pseudosymmetries and generalized noise, optimizing the fit between geometric models for experimental data and the data themselves by minimizing symmetry deviation quantifiers and using overriding rules of thumb such as `when in doubt, choose the higher symmetry' (Hovmöller, 2010; Zou et al., 2011; Zou & Hovmöller, 2012; Eades, 2012) are certainly not a foolproof strategy for optimal model selection.
The CRISP program makes a suggestion that the user may either accept or overwrite, but relies heavily on visual comparisons between differently symmetrized versions of the input image data. This author has not used ALLSPACE [in its 2dx (Gipson et al., 2007) and Focus (Biyani et al., 2017) incarnations] so far, as no version that runs on Microsoft Windows compatible computers seems to exist. There are also competing computer programs with less comprehensive symmetry deviation quantifiers, e.g. VEC (Wan et al., 2003) and EDM (Kilaas et al., 2005), that rely even more heavily on visual comparisons of the translation-averaged image to its symmetrized versions.
When the underlying plane symmetry in a noisy experimental image has been underestimated, i.e. only a subgroup of the most likely plane symmetry group has been identified, one does not make the most out of the available image data in the subsequent symmetry-enforcing step of the crystallographic image processing procedure. On the other hand, if the plane symmetry is overestimated, `non-information' due to noise will unavoidably be averaged with genuine structural information in the subsequent crystallographic processing of the image. In the latter case, one may have wrongly identified a minimal supergroup of the correct plane symmetry group that the analyzed image would possess in the absence of generalized noise. That supergroup could be the union of a genuine plane symmetry group and a Fedorov-type pseudosymmetry group.
It is, accordingly, very important to get the crystallographic symmetry classification step of the crystallographic image processing procedure just right. For that, one should only rely on the digital image data themselves and refrain from any subjective considerations.
With the author's objective and interpretation-threshold-free methods (Moeck, 2018, 2019, 2021d; Moeck & Dempsey, 2019; Dempsey & Moeck, 2020; Moeck, 2021b,c), one can now make advances with respect to the above-stated situation in the cryo-electron microscopy subfield that deals with subperiodic intrinsic membrane protein crystals, in the electron crystallography of inorganic materials and the crystallographic processing of digital crystal patterns in general.
1.7. Primary goal and secondary objective of this paper
The primary goal of this paper is to demonstrate the author's interpretation-threshold-free crystallographic symmetry classification methods on a series of three synthetic crystal patterns, where one is free of noise and the other two are noisy. The achievement of this goal might entice the computational symmetry and electron crystallography communities to replace their subjectivity in classifications with the objectivity that the information-theory-based methodology enables.
The demonstration of the benefits of the correct crystallographic processing of a more or less 2D periodic image is the secondary objective of this paper. Scanning probe microscopists should take note as these demonstrations are mainly directed to them. This is because crystallographic image processing is just as applicable to more or less 2D periodic images from scanning probe microscopes (Moeck, 2017, 2020, 2021b,c) as it is to images from parallel-illumination transmission electron microscopes (as used in electron crystallography).
Scanning probe microscopists may, however, like to correct for scanning distortions in their images of 2D periodic samples with tools such as Jitterbug (Jones & Nellist, 2013) before they make classifications and process their images crystallographically. The achievement of the secondary objective, i.e. demonstrating the benefits of the correct crystallographic processing of a more or less 2D periodic image, may eventually lead to the widespread use of crystallographic image processing techniques in scanning probe microscopy.
The limiting effects of noise and Fedorov-type pseudosymmetries in more or less 2D periodic images on the accuracy of crystallographic symmetry classifications have so far rarely been analyzed. As one would expect, the distinction between genuine symmetries and pseudosymmetries of the Fedorov type becomes more difficult with increasing amounts of noise even when a geometric form of information theory is used (Moeck & Dempsey, 2019; Dempsey & Moeck, 2020). This will be demonstrated here once more in Section 4 of this paper. That section constitutes this paper's main part and features four subsections containing nine numerical data tables as well as four figures. Two of these figures demonstrate the beneficial noise reduction and crystallographic-averaging-induced structural resolution enhancement effects of crystallographic image processing.
In order to facilitate direct comparisons with results obtained by one of the two most popular traditional classification programs of electron crystallography, *.hka files were exported from the CRISP program and used for the calculation of the ratios of sums of squared residuals of non-disjoint geometric models for the image input data.
Section 5 of this paper compares the results of our three classifications (by the author's information-theory-based methods) with plane symmetry group estimates by the program CRISP as applied to the same and adjacent areas of the three synthetic crystal patterns. The paper ends with a summary and conclusions section.
1.8. The three appendices of this paper
Appendix A provides `Notes on the text'. They are in essence expanded footnotes. Analogously to footnotes, they are in the main text marked by superscripts Ax on a key word, where x is an integer starting with unity. For example, a brief account of the physical creationA1 of the undisturbed crystal pattern that is analyzed in this paper is given in that appendix as note A1, as it is the first of such notes. From the account in that particular end-note, it is obvious that the accurate symmetry classification of the crystal pattern in Fig. 1 can only be plane symmetry group p4. Strong pseudosymmetries of the Fedorov type are present in this pattern that human classifiers will, at least at first sight, most likely misinterpret as the genuine symmetries of plane symmetry group p4gm.
Appendix B presents the formulae for ad hoc defined confidence levels for classifications into minimal supergroups of the genuine symmetries for the special case that all geometric models of the digital input image data are based on the same number of structure-bearing Fourier coefficients. Outlooks on ongoing developments of the information-theory-based classification and quantification methods and some of their potential applicationsA2 are provided in Appendix C.
2. Fedorov-type pseudosymmetries illustrated on a noise-free synthetic pattern
Fig. 1 shows a slightly enlarged reproduction of a crystal pattern that originated with the artist Eva Knoll (Knoll, 2003). There are about 15.5 translation periodic motifs in the digital representation of this particular graphic work of art in Knoll's paper.
After expansion by periodic motif stitching of a digital representation of the original artwork as presented in Knoll (2003), that pattern featured approximately 144 primitive unit cells in total. Approximately 16 of these unit cells are shown in Fig. 1. The computer program Image Composite Editor (Microsoft ICE 2.0, Image Composite Editor, for Windows Vista SP2, 7, 8 and 10) was used for the periodic motif stitching. The expanded image/crystal pattern is provided in the supporting material of this paper in the *.jpg format (1160 by 1165 pixels with 24 bit depth, and 413 058 bytes) as well as in the uncompressed *.tif format (1160 by 1165 pixels with 32 bit depth, 120 by 120 d.p.i., resolution unit 2, color representation sRGB, attribute A, and 5 442 642 bytes). Just as in Dempsey & Moeck (2020), the periodic motif stitching was done in order to enable more precise crystallographic analyses.
The stitched/expanded crystal pattern (of which Fig. 1 shows a small section) serves in this paper as the basis of three synthetic patterns that are to be classified with respect to their crystallographic symmetries and Fedorov-type pseudosymmetries. The two per design noisy versions of the pattern (in the series of analyzed patterns) are processed crystallographically in order to demonstrate that technique's benefits with respect to the noise suppression and site/point symmetry enforcing of such a processing.
Because the physical piece of graphic art from which the digital pattern in Fig. 1 was created is hand made,A1 none of the 2D translation compatible crystallographic symmetries of the Euclidean plane are strictly speaking present as they are only mathematical abstractions. It is, however, standard practice to assign a plane symmetry group to such a crystal pattern as one would also do for any sufficiently well resolved image from a real crystal in the real world, see Section 1.2 above. That symmetry group of the pattern or image is per definition the one that is least broken by structural, sample preparation, imaging and image processing imperfections (generalized noise).
For the purpose of the crystallographic symmetry classification, the assumption is made that the imaging and image processing imperfections of the crystal pattern in Fig. 1 are negligible and that there are no structural imperfections/defects that are intrinsic to the represented physical object. The generalized noise in that pattern is, therefore, negligible and we call the corresponding pattern the noise-free member of a series of three crystal patterns that are to be classified with respect to their crystallographic symmetries and Fedorov-type pseudosymmetries in this paper.
A human expert classifier would most likely assign plane symmetry group p4gm to the crystal pattern in Fig. 1 at first sight because approximate fourfold and twofold rotation points as well as mirror and glide lines are all visibly recognizable in their required spatial arrangements in all of the 2D translation periodic unit cells. (This author assigned plane symmetry group p4gm to the pattern in this figure as well at first sight, but corrected his mistake after a more careful visual analysis.)
The different types of visually recognizable point/site symmetries in each individual unit cell are probably broken by slightly different amounts, but these differences appear to be so minor that a human being may just assume they are all broken by the same amount. Under this assumption, plane symmetry group p4gm would indeed underlie the completely symmetric idealization of the pattern in Fig. 1. The rather sharp peaks in the histogram in Fig. 1 are to be interpreted as genuine characteristics of the underlying pattern since no noise was added to deliberately disturb this pattern.
The image-pixel-value-based classification of this pattern with the author's method reveals, however, plane symmetry groups p2 and p4 as genuine, with the least broken group, p2, being the anchoring group, and the Fedorov-type pseudosymmetry groups p1g1, p11g, c1m1 and c11m as quantitatively more severely broken than the p2 and p4 symmetries. These pseudosymmetries combine with the genuine symmetries to form the two minimal pseudosupergroups p2gg and c2mm, as well as their respective minimal pseudosupergroup p4gm. (With hindsight, this is as it must be given the sequence of creative processesA1 that resulted in this particular graphic piece of art.) Section 4 of this paper gives the details of the corresponding analysis.
The point/site symmetry of the centers of the conspicuous bright `bow ties' in this pattern is visibly no higher than point group 2, which is one of the maximal subgroups of 2mm. Point group 2mm is, on the other hand, one of the minimal supergroups of point group 2, but visibly more severely broken in the pattern in Fig. 1.
This becomes even clearer in Figs. 2 and 3. Approximately four primitive (or two centered) unit cells of the pattern in Fig. 1 are displayed in Fig. 2 after translation averaging by Fourier filtering.A3 Note that each bright bow tie in Fig. 2 is shared between two adjacent unit cells that are based on what seems to be a square lattice. The centers of the bright bow ties are at fractional coordinates (½, 0), (½, 1), (0, ½) and (1, ½), as marked in Fig. 2.
These points feature visually the approximate point group 2 at best, rather than 2mm, which would be required if the underlying plane symmetry group were to be c2mm or p4gm. The observed point symmetry 2 at these fractional coordinates is, on the other hand, compatible with plane symmetry groups p2, p2gg and p4.
At the fractional coordinates (0, 0), (1, 0), (0, 1) and (1, 1) as well as (½, ½) in Fig. 2, there are also approximate fourfold rotation points at the centers of dark `curved diamonds', so that a p4 or p4gm classification by a human expert is probably the best anyone could come up with when the slight differences in the breaking of the individual symmetry operations are not noticed and quantified. The genuine plane symmetry group of this pattern can, however, only be p2, p2gg or p4 when the visible site/point symmetry around the centers of the bright bow ties is taken into account.
Fig. 3 zooms into the translation periodic motif of Fig. 2 and features a single bright bow tie and its immediate surrounding.
Both of the arrows in Fig. 3 point to positions in the motif where the tips of the bright bow ties end and meet straight edges from the gray `right angle ruler' parts of the motif. There is approximately a 20% difference in the distance of these points from the horizontal and vertical edges of the gray right-angle-ruler shaped motif parts, so that there is definitely no mirror line from the top-right corner to the bottom-left corner in this figure. Such a mirror line would be required for the whole motif to be part of a primitive unit cell with plane symmetry group p4gm or a centered unit cell with plane symmetry group c2mm.
3. Pertinent equations, inequalities, plane symmetry and 2D Laue class hierarchy trees, and their usages
Kanatani's G-AIC relies on the noise being approximately Gaussian distributed. For that kind of noise, the residuals need to be sums of squares of the differences between the input data and geometric models for those data. Since crystallographic symmetry classifications are best done in Fourier space, the maximal-likelihood estimate for approximately Gaussian distributed noise in more or less 2D periodic patterns takes the form of the sums of squared residuals of the complex structure-bearing Fourier coefficients for plane symmetry group classifications. For projected Laue class classifications, they take the form of the sums of squared residuals of the amplitudes of those Fourier coefficients.
Equation (1) gives the sum of squared residuals of the complex Fourier coefficients of a symmetrized (geometric) model of the input image data with respect to the translation-averaged-only (Fourier filtered) version of these data:
$$\hat{J}_{\mathrm{cFC}} = \sum_{j=1}^{N} \left(F_{j,\mathrm{trans}} - F_{j,\mathrm{sym}}\right)^{*} \left(F_{j,\mathrm{trans}} - F_{j,\mathrm{sym}}\right), \qquad (1)$$
where (.)* stands for the complex conjugate of the difference of a pair of complex numbers (.). The sum is over the differences of all N structure-bearing Fourier coefficients with matching Laue indices and the subscripts on the right-hand side stand for translation averaged and symmetrized, respectively. The subscript on the left-hand side stands for complex Fourier coefficients. Note that there is a zero sum of residuals per equation (1) for the case of Fj,trans = Fj,sym, i.e. the translation-averaged-only model of the input image data, which features plane symmetry group p1.
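The sum of squared residuals of the amplitudes of the Fourier coefficients is calculated in an analogous manner from the real-valued amplitudes of the structure-bearing Fourier coefficients:
$$\hat{J}_{\mathrm{aFC}} = \sum_{j=1}^{N} \left(\left|F_{j,\mathrm{trans}}\right| - \left|F_{j,\mathrm{sym}}\right|\right)^{2}, \qquad (2)$$
where the subscript on the left-hand side stands for amplitude of Fourier coefficients.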
Note again that the sum of residuals is zero when all of the translation-averaged and symmetrized Fourier coefficient amplitudes with matching Laue indices are equal to each other. This happens for the translation-averaged-only model of the input image data, which features point group 2 due to the Fourier transform being centrosymmetric. Projected Laue class 2 features, accordingly, a zero sum of amplitude residuals in the data tables that are shown in Section 4 of this paper.
In order to restrict the sums of squared residuals to small numbers, the structure-bearing Fourier coefficients of the input image intensity and their symmetrized versions are in this paper normalized through division by the maximal amplitudes that the CRISP program provides for both the translation-averaged model and the symmetrized models of the input image data in both equations (1) and (2).
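The two residual sums described for equations (1) and (2), together with a normalization by the maximal amplitude, can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the paper: the function names are hypothetical, the coefficients are assumed to be supplied as two equally ordered lists (matching Laue indices at matching positions), and normalizing each coefficient set by its own maximal amplitude is an assumption about how the maxima provided by CRISP are applied.

```python
from typing import List, Sequence


def normalize(coeffs: Sequence[complex]) -> List[complex]:
    """Divide all coefficients by the largest amplitude in the set (assumed scheme)."""
    max_amp = max(abs(c) for c in coeffs)
    if max_amp == 0:
        raise ValueError("all Fourier coefficients are zero")
    return [c / max_amp for c in coeffs]


def residual_complex(f_trans: Sequence[complex], f_sym: Sequence[complex]) -> float:
    """Sum over j of (F_trans - F_sym)* (F_trans - F_sym), i.e. of |difference|^2."""
    if len(f_trans) != len(f_sym):
        raise ValueError("coefficient lists must have matching Laue indices")
    return sum(abs(a - b) ** 2 for a, b in zip(f_trans, f_sym))


def residual_amplitude(f_trans: Sequence[complex], f_sym: Sequence[complex]) -> float:
    """Sum over j of (|F_trans| - |F_sym|)^2, which ignores all phase information."""
    if len(f_trans) != len(f_sym):
        raise ValueError("coefficient lists must have matching Laue indices")
    return sum((abs(a) - abs(b)) ** 2 for a, b in zip(f_trans, f_sym))


# Toy data (not from the paper): same amplitudes, one phase flipped by 180 degrees.
f_trans = [1 + 0j, 0 + 1j]
f_sym = [1 + 0j, 0 - 1j]
print(residual_complex(f_trans, f_sym))    # 4.0
print(residual_amplitude(f_trans, f_sym))  # 0.0
```

The toy data illustrate why the two sums serve different purposes: a pure phase change contributes to the complex residual that is used for plane symmetry groups, but leaves the amplitude residual that is used for projected Laue classes at zero.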
What follows below is valid for classifications into both plane symmetry groups and projected Laue classes. The same equations and inequalities as well as analogous considerations concerning the plane symmetry group hierarchy and the hierarchy of 2D point groups that are projected Laue classes apply, so that the subscripts cFC and aFC on the sums of squared residuals from equations (1) and (2) are dropped below. Two different symmetry hierarchy trees will, however, be applicable. The first one for plane symmetry groups is presented in Fig. 4(a) below. The second one is given in Fig. 4(b) for projected Laue classes.
Kanatani's G-AIC has the general form
$$\text{G-AIC}(S) = \hat{J} + 2(dN + n)\,\varepsilon^{2} + O(\varepsilon^{4}) + \ldots, \qquad (3)$$
where $\hat{J}$ is a sum of squared residuals, as for example given in equations (1) and (2), for the geometric model S, d is the dimension of S, N is the number of data points that represent the model S, n is the number of degrees of freedom of S, and $\varepsilon^{2}$ is the variance of a generalized noise term, which obeys a Gaussian distribution to a sufficient approximation. The term $O(\varepsilon^{4})$ in (3) represents unspecified terms that are second order in $\varepsilon^{2}$, while the ellipsis indicates higher-order terms that become progressively smaller.
For small and moderate amounts of generalized noise, it is justified to ignore all of the higher-order terms in (3),
$$\text{G-AIC}(S) = \hat{J} + 2(dN + n)\,\varepsilon^{2}, \qquad (4)$$
because they will make only minor contributions to the G-AIC values of all geometric models. The number of data points, N, can either be constant for all geometric models in a set of models or differ from model to model, but should in the latter case be of the same order. The dimension of the model is defined by the geometric type of model. [Note in passing that Kanatani refers to the equivalent of (4) as normalized geometric AIC involving normalized residuals and normalized covariance matrices that are isotropic in his monograph, and designates it as AIC0(S) (Kanatani, 2005).]
Equation (4) is to be interpreted as a `balanced geometric model residual' for geometric model selections that is well suited to deal with symmetry inclusion relations. A non-disjoint and less constrained model, which has lower symmetry, will always fit the input data better than the more constrained model that features a higher non-disjoint symmetry. The sum of squared residuals of the less constrained (more general) model that is in a non-disjoint relationship with a higher-symmetry model will, therefore, be smaller than its counterpart for the more constrained model. In other words, the more general model fits the data better than the more restricted model. This is because the more general (less constrained) model has more degrees of freedom.
As long as the G-AIC value of a more constrained (more symmetric) model, subscript m, is smaller than that of the less constrained (less symmetric) model, subscript l, the former model is a better representation (with more predictive power) of the input image data than the latter:
$$\text{G-AIC}(S_{m}) < \text{G-AIC}(S_{l}). \qquad (5)$$
The rational/objective geometric model selection strategy is to minimize the G-AIC values (rather than only the sums of squared residuals) for a whole set of geometric models by means of repeated applications of inequality (5). As there are two models, Sm and Sl, in (5), one sets this inequality up for non-disjoint pairs of geometric models, one at a time, and tests if the inequality is fulfilled.
The geometric model selection-bias correction term in equation (4) will for a less constrained model be larger than its counterpart for a more constrained model (with equal N and d). In other words, the better fitting, less constrained, model features a higher `geometric model selection penalty' than its worse fitting, more constrained, counterpart. This kind of interplay between fitting the input image data better at the expense of a higher model selection penalty provides the basis for objective geometric model selections by minimizing their G-AIC values over a complete set of geometric models.
The fulfillment of inequality (5) allows for a more constrained/symmetric model of the input data to be selected in a statistically sound manner as a better representation of the said data although its numerical fit, as measured by its sum of squared residuals, is worse than that of the less constrained/symmetric model. Note that the identification of which of the two geometric models is the better representation of the input image data is based solely on the input data themselves and the underlying mathematics of Kanatani's theory.
There is no arbitrarily set threshold for the identification of the better model in the presence of a symmetry inclusion relationship, just an inequality that needs to be fulfilled numerically. All of the other crystallographic symmetry classification methods that were so far used in electron crystallography (Hovmöller, 1992; Valpuesta et al., 1994; Wan et al., 2003; Kilaas et al., 2005; Gipson et al., 2007; Zou et al., 2011) and the computational symmetry community (Liu et al., 2009) feature such thresholds.
At first sight, it would seem that estimates of $\varepsilon^{2}$ are needed to make objective geometric model selections by the minimization of their G-AIC values by means of inequality (5) and the definition of the first-order model selection criterion (4). Each geometric model features a different separation of the presumed geometric information content, on the one hand, and presumed non-information (generalized noise) content, on the other hand.
There are, however, workarounds to estimating $\varepsilon^{2}$ that not only identify the best possible separation of geometric information and non-information, but also give an estimate of the prevailing noise in the input image data. In this paper, the two workarounds take advantage of both the translationengleiche symmetry inclusion relationships between plane symmetry groups as shown in Fig. 4(a) and the symmetry inclusion relationships between the 2D point groups that are projected Laue classes as shown in Fig. 4(b), i.e. of non-disjointness.
For crystallographic symmetry classifications of more or less 2D periodic images, the dimension of the geometric models is zero (as the data are in the form of the intensity of individual pixels that are considered to be zero-dimensional, i.e. points). The degrees of freedom of the geometric models in this paper depend on the number of non-translational symmetry operations in the plane symmetry groups to which the translation-averaged input image data have been symmetrized. They are obtained by the ratio
$$n = \frac{N}{k}, \qquad (6)$$
where k is the number of non-translational symmetry operations, which is equal to the multiplicity of the general position per lattice point in all plane symmetry groups. [This number is also one of the two ordering principles of Figs. 4(a) and 4(b).]
Equation (6) and what follows from it are good approximations when N is largeA4 (as in this paper). A necessary but not sufficient precondition for N being large in Fourier space is that a digital representation of the image to be classified should have a large number of individual pixels in direct space. A complex translation periodic motif with sharp edges and strong contrast changes will produce a large number of complex Fourier coefficients when Fourier transformed.
As already mentioned above, the number of non-translational symmetry operations, k in (6), is one of the two ordering principles of the hierarchy tree of the translationengleiche plane symmetry groups, Fig. 4(a). This number is given both on the left- and right-hand side of this figure and increases from the bottom to the top of the symmetry hierarchy tree. The other ordering principle in this figure is the non-disjointness of maximal subgroups and minimal supergroups of the plane symmetry groups specified for their crystallographic settings. These symmetry inclusion relations are in Fig. 4(a) marked by arrows between maximal subgroups and minimal supergroups that are translationengleich. The ratios of the sums of squared residuals of the complex structure-bearing Fourier coefficients for `climbing up' from a lower level (subscript l for less symmetric) of the hierarchy to a higher level (subscript m for more symmetric) that is permitted by the fulfillment of inequality (5) for the special case of equal numbers of complex Fourier coefficients of the lower- and higher-symmetry geometric model of the input image data (Nm = Nl) are also given in Fig. 4(a).
Translationengleich in the previous paragraph means that the addition of a non-translational symmetry operation to the unit cell of a lower-symmetry group, which has the status of a maximal subgroup, results in a unit cell of a higher-symmetry group, which is the former's minimal supergroup. Changes from a primitive to a centered unit cell and vice versa are permitted (Burzlaff et al., 1968), as they represent, effectively, orientation changes of symmetry operations with respect to the conventional unit-cell vectors. Analogous considerations apply to the hierarchy of the projected 2D Laue classes, where there are per definition only point symmetries to consider.
The translation-averaged geometric model of some input image data (with plane symmetry group p1) is, for example, non-disjoint from the c1m1 symmetrized model of these data, as that plane symmetry group is a minimal supergroup of p1. The centered plane symmetry group c1m1 with k = 2 is in turn in a maximal subgroup relationship with plane symmetry group p3m1 with k = 6, see Fig. 4(a). Whenever there is no connecting arrow between two plane symmetry groups in Fig. 4(a) or between two projected Laue classes in Fig. 4(b), that pair of symmetry groups is disjoint.
The two ordering principles in Fig. 4(b) are analogous to those in Fig. 4(a). The order of the 2D point group/projected Laue class on the left- and right-hand side of the hierarchy tree increases from the bottom to the top. Maximal subgroups are connected to their minimal supergroups by arrows. The ratios of the sums of squared residuals of the amplitudes of the structure-bearing Fourier coefficients for climbing up from a lower level of the hierarchy to a permitted higher level of the 2D point groups are also given in this figure for Nm = Nl. For an analogous pair of geometric models with hierarchy levels km and kl, the same ratios of squared residuals are given in both parts of Fig. 4. This is because the same inequalities are applicable for climbing-up tests in both hierarchy trees.
In the above-mentioned workarounds to estimating $\varepsilon^{2}$, one sets up inequality (5) for two non-disjoint models of the input image data that were symmetrized to non-disjoint plane symmetry groups, and takes advantage of the estimate
$$\hat{\varepsilon}_{l}^{2} = \frac{\hat{J}_{l}}{r_{l} N_{l} - n_{l}} \qquad (7a)$$
for the square of the amount of approximately Gaussian distributed noise in the lower-symmetry model (designated by the subscript l). The variable rl stands in this estimate for the so-called co-dimension in Kanatani's framework. [In our case, the co-dimension is equal to unity,A5 just as rbest in equation (7b) below.]
As long as inequality (5) is fulfilled, one is allowed to climb up in the hierarchy trees of Fig. 4. One always starts with the lower-symmetry model that corresponds to the anchoring group or class.
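A single pairwise test of this kind can be sketched in Python. The sketch assumes that the first-order criterion has the standard form of Kanatani's geometric AIC, residual + 2(dN + n) times the noise variance, that the degrees of freedom are n = N/k, and that the noise variance is estimated from the lower-symmetry model as its residual divided by (rN - n) with co-dimension r = 1. All function names and numbers are hypothetical and serve only as an illustration.

```python
def estimate_noise_variance(j_l: float, n_points_l: int, k_l: int, codim: int = 1) -> float:
    """Noise variance estimated from the less symmetric model (assumed form)."""
    if k_l < 2:
        raise ValueError("k_l = 1 leaves no residual degrees of freedom for the estimate")
    dof_l = n_points_l / k_l
    return j_l / (codim * n_points_l - dof_l)


def first_order_g_aic(j_hat: float, n_points: int, k: int,
                      noise_var: float, dim: int = 0) -> float:
    """First-order geometric AIC with n = N/k degrees of freedom (assumed form)."""
    dof = n_points / k
    return j_hat + 2.0 * (dim * n_points + dof) * noise_var


def more_symmetric_is_better(j_l: float, j_m: float, k_l: int, k_m: int,
                             n_l: int, n_m: int) -> bool:
    """True when the more symmetric model has the smaller G-AIC value."""
    eps2 = estimate_noise_variance(j_l, n_l, k_l)
    return first_order_g_aic(j_m, n_m, k_m, eps2) < first_order_g_aic(j_l, n_l, k_l, eps2)


# Hypothetical residuals: the k = 4 model fits 50% worse than the k = 2 model,
# yet it is preferred because its model selection penalty is only half as large.
print(more_symmetric_is_better(j_l=0.10, j_m=0.15, k_l=2, k_m=4, n_l=1000, n_m=1000))
```

With these numbers the estimated noise variance is 0.10/500, so the two G-AIC values are approximately 0.30 for the less symmetric and 0.25 for the more symmetric model, and climbing up would be permitted.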
Inequality (5) is fulfilled under the conditions
and
So far, we followed Kanatani's general derivation in the `Model comparison by AIC' section of his monograph (2005) closely. Now we turn to our specific case of classifications of more or less 2D periodic patterns. For our case,A5 with dm = dl = 0, rm = rl = 1 and (6), we obtain from (8a)
$$\frac{\hat{J}_{m}}{\hat{J}_{l}} < 1 + \frac{2\,(k_{m} - k_{l})}{k_{m}\,(k_{l} - 1)} \qquad (9a)$$
when the number of data points in both the more and the less symmetric geometric model is the same, Nm = Nl. This problem-specific inequality is a special case of the general inequality (5) for rational/objective geometric model selections.
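The step from the general inequality to the problem-specific one can be retraced as follows. This is a sketch under stated assumptions rather than a quotation of the original derivation: it assumes the first-order criterion $\hat{J} + 2(dN + n)\varepsilon^{2}$, the noise estimate $\hat{\varepsilon}_{l}^{2} = \hat{J}_{l}/(r_{l}N_{l} - n_{l})$, the degrees of freedom $n = N/k$, and $d_m = d_l = 0$, $r_m = r_l = 1$, $N_m = N_l = N$.

```latex
\begin{align*}
\hat{J}_{m} + 2 n_{m}\,\hat{\varepsilon}_{l}^{2}
  &< \hat{J}_{l} + 2 n_{l}\,\hat{\varepsilon}_{l}^{2},
  \qquad \hat{\varepsilon}_{l}^{2} = \frac{\hat{J}_{l}}{N - n_{l}} \\[4pt]
\frac{\hat{J}_{m}}{\hat{J}_{l}}
  &< 1 + \frac{2\,(n_{l} - n_{m})}{N - n_{l}}
   = 1 + \frac{2\left(\dfrac{1}{k_{l}} - \dfrac{1}{k_{m}}\right)}{1 - \dfrac{1}{k_{l}}}
   = 1 + \frac{2\,(k_{m} - k_{l})}{k_{m}\,(k_{l} - 1)} \\[4pt]
k_{l} = 2,\; k_{m} = 4 &: \quad \frac{\hat{J}_{m}}{\hat{J}_{l}} < 2,
\qquad
k_{l} = 4,\; k_{m} = 8 : \quad \frac{\hat{J}_{m}}{\hat{J}_{l}} < \frac{4}{3}
\end{align*}
```

Under these assumptions the permitted residual ratio depends only on the two hierarchy levels, which is why a single number per arrow suffices in a hierarchy tree for the equal-N case.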
For the purpose of this paper, we need a generalization of (9a) for the Nm ≠ Nl case of the geometric models that we want to compare with respect to their predictive power. This is because we want to compare our classification results directly with the suggestions that the CRISP program provides, working with the same numerical representations of the geometric models for the input image data that this program allows one to export. Such a generalization of inequality (9a) is provided in Dempsey & Moeck (2020):
$$\frac{\hat{J}_{m}}{\hat{J}_{l}} < 1 + \frac{2\,(k_{m} N_{l} - k_{l} N_{m})}{k_{m} N_{l}\,(k_{l} - 1)} \qquad (9b)$$
and it will be used throughout the rest of this paper with Nm ≃ Nl and large.
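A climbing-up test for unequal numbers of Fourier coefficients can be sketched as below. The closed form used here is an assumption: it is what follows from the first-order G-AIC with d = 0, r = 1, n = N/k and the noise estimate taken from the less symmetric model, and it reduces to the equal-N ratio when Nm = Nl. The exact expression in Dempsey & Moeck (2020) should be consulted before relying on it; the function names are hypothetical.

```python
def ratio_limit(k_l: int, k_m: int, n_l: int, n_m: int) -> float:
    """Upper limit for J_m / J_l under the stated assumptions (d = 0, r = 1, n = N/k)."""
    if k_l < 2:
        raise ValueError("no climbing-up test is possible from k_l = 1 (p1)")
    return 1.0 + 2.0 * (k_m * n_l - k_l * n_m) / (k_m * n_l * (k_l - 1))


def may_climb(j_l: float, j_m: float, k_l: int, k_m: int, n_l: int, n_m: int) -> bool:
    """True when the more symmetric model (subscript m) is the better representation."""
    # Written as a product so that a vanishing j_l does not cause a division by zero.
    return j_m < ratio_limit(k_l, k_m, n_l, n_m) * j_l


print(ratio_limit(2, 4, 1000, 1000))  # 2.0, the equal-N case for k = 2 -> 4
print(ratio_limit(2, 4, 1000, 900))   # 2.1, fewer coefficients in the symmetrized model
```

The guard against kl = 1 mirrors the observation in the text that no climbing up from the translation-averaged-only model is possible.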
Note that per inequality (9b), climbing up from the translation-averaged-only model of the input image data to all geometric models that have been symmetrized to minimal supergroups of p1 is impossible, as kl = 1 in all of these cases. [There is also a zero sum of squared complex Fourier coefficient residuals for the translation-averaged-only model, equation (1), so that there is no inconsistency.]
One, therefore, simply assumes that there is more than translation symmetry in the input image data and uses inequality (9b) with kl = 2 and 3 as a minimum. After having made that assumption, one proceeds with determining what individual symmetry operations there are in the input image data and to what plane symmetry group they combine.
One needs to carefully distinguish between genuine plane symmetry groups and possibly existing Fedorov-type pseudosymmetry groups in the input image data based on the model pair's $\hat{J}_{m}$, $\hat{J}_{l}$, km and kl values, and Nm to Nl ratio. Based on the definitions in Section 1.2 of this paper, the least broken symmetry at the kl = 2 or 3 levels is the first genuine symmetry that is identified and all other genuine symmetries need necessarily be anchored to this particular symmetry group.
In practice, one begins an objective plane symmetry classification by calculating the sums of squared residuals for all of the geometric models that feature a multiplicity of the general position per lattice point (number of non-translational symmetry operations) of two and three, see Fig. 4(a). (Note that plane symmetry groups c1m1 and c11m feature two non-translational symmetry operations each: the multiplicity of the general position in the centered unit cell is four, but there are two lattice points per unit cell.)
All of the geometric models with two and three non-translational plane symmetry operations are disjoint from each other per definition. Combinations of the groups with two and three non-translational plane symmetry operations lead to the majority of plane symmetry groups that are higher up in the hierarchy tree, Fig. 4(a).
When there is more than translation symmetry in the input image data, at least one of the geometric models that have been symmetrized to a plane symmetry group with two or three non-translational symmetry operations will have a low sum of squared residuals of the complex structure-bearing Fourier coefficients. The plane symmetry group of that model is necessarily non-disjoint from its minimal supergroups so that tests of whether a climbing up in the plane symmetry hierarchy tree is allowed by inequality (9b) can proceed until the Kullback–Leibler best geometric model of the image input data has been found.
By first calculating the sums of squared residuals for all eight geometric models of the input image data that feature k = 2 and 3, we make sure we know from which plane symmetry group the anchoring and climbing up in the hierarchy tree of plane symmetry groups, Fig. 4(a), shall proceed in this paper, as long as permitted by the fulfillment of inequality (9b).
The sums of squared residuals of the complex structure-bearing Fourier coefficients of the geometric models of the input image data that have been symmetrized to higher-symmetry plane symmetry groups may be calculated on an as-needed basis. Note that the whole procedure can be programmed and does not require visual inspections and comparisons of differently symmetrized versions of the input image data. This makes the information-theory-based classification techniques very different to the other plane symmetry classification methods that are used in contemporary electron crystallography.
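Because the procedure can be programmed, a compact sketch of the anchoring and climbing-up logic is given here. It is an illustration under several assumptions and not the hkaAICnorm script: the ratio limit uses the form that follows from d = 0, r = 1 and n = N/k, the hierarchy table contains only a small subset of the translationengleiche maximal subgroup relations, residuals are assumed to be available for every subgroup that is tested, and all residual values are made up.

```python
from typing import Dict, List, Set, Tuple

Model = Tuple[int, int, float]  # (k, N, sum of squared residuals)


def ratio_limit(k_l: int, k_m: int, n_l: int, n_m: int) -> float:
    """Upper limit for J_m / J_l (assumed form; requires k_l >= 2)."""
    return 1.0 + 2.0 * (k_m * n_l - k_l * n_m) / (k_m * n_l * (k_l - 1))


def subgroups_of(group: str, maximal: Dict[str, List[str]]) -> Set[str]:
    """All listed subgroups reachable through maximal-subgroup links (p1 excluded)."""
    found: Set[str] = set()
    stack = list(maximal.get(group, []))
    while stack:
        g = stack.pop()
        if g not in found:
            found.add(g)
            stack.extend(maximal.get(g, []))
    return found


def classify(models: Dict[str, Model],
             maximal: Dict[str, List[str]]) -> Tuple[str, List[str]]:
    """Return the anchoring group and the groups accepted as genuine."""
    low = [g for g, (k, _, _) in models.items() if k in (2, 3)]
    anchor = min(low, key=lambda g: models[g][2])  # least broken k = 2 or 3 model
    genuine = [anchor]
    for group in sorted(models, key=lambda g: models[g][0]):
        k_m, n_m, j_m = models[group]
        subs = subgroups_of(group, maximal)
        if k_m <= 3 or anchor not in subs:
            continue  # not reachable by climbing up from the anchoring group
        # The test has to hold against every subgroup, not just the best fitting one.
        if all(s in models and
               j_m < ratio_limit(models[s][0], k_m, models[s][1], n_m) * models[s][2]
               for s in subs):
            genuine.append(group)
    return anchor, genuine


# Partial hierarchy (illustrative subset) and hypothetical residuals.
MAXIMAL = {"p4": ["p2"], "p2gg": ["p2", "p1g1", "p11g"],
           "c2mm": ["p2", "c1m1", "c11m"], "p4gm": ["p4", "p2gg", "c2mm"]}
MODELS = {"p2": (2, 1000, 0.010), "p1g1": (2, 1000, 0.030), "p11g": (2, 1000, 0.031),
          "c1m1": (2, 1000, 0.033), "c11m": (2, 1000, 0.034), "p4": (4, 1000, 0.018),
          "p2gg": (4, 1000, 0.045), "c2mm": (4, 1000, 0.050), "p4gm": (8, 1000, 0.060)}

print(classify(MODELS, MAXIMAL))  # ('p2', ['p2', 'p4'])
```

In this toy data set the models for p2gg, c2mm and p4gm fail the test against p2 or p4 and would therefore be read as Fedorov-type pseudosymmetries, whereas p4 passes the test against the anchoring group p2.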
Note that to conclude that a certain minimal supergroup is a plane symmetry that minimizes the G-AIC value of a geometric model of the image input data within a set of models, inequality (9b) has to be fulfilled for all maximal subgroups (and in turn their maximal subgroups). If that is not the case, that plane symmetry is only a Fedorov-type pseudosymmetry as it is broken to a larger extent than the genuine plane symmetry that the hypothetical noise-free version of the input image most likely possesses. The formally correct crystallographic symmetry classification of a more or less 2D periodic pattern is the plane symmetry group and projected Laue class that minimize the respective G-AIC values.
In the case of projected Laue classes, there is a zero sum of squared structure-bearing Fourier coefficient amplitude residuals for point group 2, see equation (2), because the Fourier transform is centrosymmetric. The anchoring group is, therefore, to be found at the kl = 4 or 6 levels of the hierarchy tree in Fig. 4(b). All other considerations for finding the K-L best projected Laue class are analogous to those for finding the K-L best plane symmetry group.
For consistent crystallographic symmetry classifications of more or less 2D periodic patterns, the K-L best projected Laue class and the K-L best plane symmetry group need to be compatible with each other as they are based on complementing aspects of the same input image data. As the example of the noisiest pattern classified below will show, it is possible that the formally correct K-L best plane symmetry group and the formally correct K-L best projected Laue class are crystallographically incompatible with each other. When this happens, it signifies a partial breakdown of the information-theoretic methodology that results from equation (4) being no longer a good approximation of equation (3) and/or the generalized noise not being Gaussian distributed to a sufficient approximation.
A good estimate of the variance of the amount of generalized noise that needs to be approximately Gaussian distributed can be obtained after the correct crystallographic symmetry classification has been made, i.e. the K-L best model in the set has been identified, from
$$\hat{\varepsilon}_{\mathrm{best}}^{2} = \frac{\hat{J}_{\mathrm{best}}}{r_{\mathrm{best}} N_{\mathrm{best}} - n_{\mathrm{best}}}, \qquad (7b)$$
where the subscript `best' stands for the Kullback–Leibler best model of the input image data. This estimate is in the same format as (7a), i.e. the representation of the estimated square of the noise level of the geometric model that features the lower-symmetry group or class in a pairwise model comparison procedure. When the K-L best model of the input image data has been identified, there is obviously no further climbing up allowed in the symmetry hierarchy trees of Fig. 4. This is because inequality (5) for the G-AIC values can no longer be fulfilled using inequalities (8a) and (8b) as well as (9a) or (9b).
The estimate in (7b) is needed for calculations of geometric Akaike weights of a set of geometric models for the input image data. These weights are the probabilities that a certain geometric model of the input image data is indeed the K-L best model in a set of geometric models. They are to be calculated on the basis of the G-AIC values according to equation (4) with (7b) for the noise term. This is not done in this paper and the reader is referred to Moeck (2018) and Dempsey & Moeck (2020) for details on how likelihoods of geometric models are transformed into model probabilities. Providing geometric Akaike weights is a route to deriving uncertainty measures for plane symmetry group and projected Laue class classifications, without which measurements, i.e. quantifications, are simply incomplete (Helliwell, 2021). Another route to deriving classification uncertainty measures is to use Nm ≠ Nl generalizations of the confidence-level equations for selecting minimal supergroups over their maximal subgroups, see Appendix B.
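The G-AIC values that would enter such weights can be sketched as follows; the weights themselves are not computed here, as their definition is left to the cited papers. The sketch assumes the first-order criterion residual + 2(dN + n) times the noise variance with d = 0 and n = N/k, and a noise variance estimated from the K-L best model as its residual divided by (N - n). Names and numbers are hypothetical.

```python
from typing import Dict, Tuple

Model = Tuple[int, int, float]  # (k, N, sum of squared residuals)


def noise_variance_from_best(best: Model, codim: int = 1) -> float:
    """Noise variance estimated from the K-L best model (assumed form)."""
    k, n_points, j_best = best
    return j_best / (codim * n_points - n_points / k)


def g_aic_values(models: Dict[str, Model], best_name: str) -> Dict[str, float]:
    """First-order G-AIC of every model, all with the same noise estimate."""
    eps2 = noise_variance_from_best(models[best_name])
    return {name: j + 2.0 * (n_points / k) * eps2
            for name, (k, n_points, j) in models.items()}


# Hypothetical residuals for three non-disjoint models.
models = {"p2": (2, 1000, 0.010), "p4": (4, 1000, 0.018), "p4gm": (8, 1000, 0.060)}
values = g_aic_values(models, best_name="p4")
print(min(values, key=values.get))  # p4
```

As a consistency check, the model that was assumed to be the best one should come out with the smallest G-AIC value, as it does for these numbers (approximately 0.034, 0.030 and 0.066 for p2, p4 and p4gm).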
Note that to obtain reasonable results for the geometric Akaike weights, a normalization of the residuals, as described in Dempsey & Moeck (2020), is mandatory when one works with *.hka files from the CRISP program. We use the same normalization in this paper as it is inconsequential for the ranking of geometric models by their G-AIC values.
4. Objective crystallographic symmetry classifications of three synthetic crystal patterns and an optimal crystallographic-image-processing-induced noise suppression
4.1. Details of the classification procedure as employed in this paper
As already mentioned in the introductory Section 1.7 to this paper, classifications are done here with both the author's methods and the electron crystallography program CRISP (Hovmöller, 1992; Zou et al., 2011; Zou & Hovmöller, 2012) using the same *.hka filesA6,A7 of the latter program. An appropriately chosen series of these files contains all of the information on the structure-bearing Fourier coefficients of the differently symmetrized geometric models of the input image data that is needed for objective classification into plane symmetry groups and projected Laue classes.
In the CRISP program, these files are internally used to calculate symmetry deviation quantifiers in the form of sets of normalized amplitude and phase-angle differences of symmetrized structure-bearing complex Fourier coefficient sets of the input image data with respect to the structure-bearing complex Fourier coefficient set of these data themselves. (Ratios of sums of odd to even Fourier coefficient amplitudes are also calculated from these files when they are meaningful.) The *.hka files are also used internally to create symmetrized direct-space versions of the input image data by Fourier back transforming for visual comparisons by the CRISP program's user.
These files can be interactively edited in CRISP. This allows, for example, for restrictions of the geometric models of the input image to a desired dynamic range of the Fourier coefficient amplitudes. The program's default value for this dynamic range is 200. (The maximal amplitude is always set to 10 000.) Such a restriction leads to a reduction of the number of complex structure-bearing Fourier coefficients of the geometric models, and we will make use of that for both the noise-free pattern and the pattern with a modest amount of added noise in the analyzed series of crystal patterns, see Fig. 1 for the noise-free pattern.
Calculating the discrete Fourier transform with CRISP in its maximal dynamic range setting resulted in 3666 complex structure-bearing Fourier coefficients for the translation-averaged model of the undisturbed pattern that underlies Fig. 1. The patterns that underlie Figs. 2 and 3 are, on the other hand, restricted to the back-transform of the strongest 956 complex Fourier coefficients without any symmetrizing.
A limited dynamic range of the Fourier coefficient amplitudes may lead to a reduction in the accuracy of the geometric models of the input image data. As the direct visual comparison of the crystal patterns in Figs. 1 and 2 suggests, this is not a problem in the present study. Limiting the dynamic range has, on the other hand, the benefit of reducing `Fourier ripples' around features with very strong contrast changes, as can be seen in Fig. 2.
With a very large number of data points with very small amplitudes in the discrete Fourier transform of some input image data, one has to wonder if the accuracies of geometric models of the input image data are not compromised by the limited representation length of real numbers in a computer program, accumulated rounding errors and numerical approximations in the calculation of the discrete Fourier transform.
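The effect of a dynamic range restriction on the number of retained Fourier coefficients can be illustrated with a short sketch. It rests on an assumed reading of the setting, namely that the dynamic range is the ratio between the largest amplitude (scaled to 10 000) and the smallest amplitude that is kept; how CRISP implements the restriction internally is not stated in the text. The function name and the toy coefficients are hypothetical.

```python
from typing import Dict, Tuple

Index = Tuple[int, int]  # Laue indices (h, k)


def restrict_dynamic_range(coeffs: Dict[Index, complex], dynamic_range: float,
                           max_amplitude: float = 10000.0) -> Dict[Index, complex]:
    """Scale the largest amplitude to max_amplitude and drop weak coefficients.

    Assumption: a coefficient is kept when its scaled amplitude is at least
    max_amplitude / dynamic_range.
    """
    if not coeffs:
        return {}
    largest = max(abs(c) for c in coeffs.values())
    if largest == 0:
        raise ValueError("all Fourier coefficients are zero")
    scale = max_amplitude / largest
    cutoff = max_amplitude / dynamic_range
    return {hk: c * scale for hk, c in coeffs.items() if abs(c) * scale >= cutoff}


# Toy coefficients with amplitudes 10000, 100, 50 and 49.
toy = {(1, 0): 10000 + 0j, (0, 1): 100j, (1, 1): 50 + 0j, (2, 0): 49 + 0j}
print(len(restrict_dynamic_range(toy, dynamic_range=200)))  # 3 (cutoff amplitude 50)
print(len(restrict_dynamic_range(toy, dynamic_range=100)))  # 2 (cutoff amplitude 100)
```

A smaller dynamic range raises the cutoff and therefore lowers the number of data points N that enter the residual sums.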
The CRISP program also allows for restrictions of the spatial resolution of the geometric models of the input image data in reciprocal space. This spatial resolution is akin to the Abbe1 resolution. Restricting the spatial resolution is typically necessary for noisy crystal patterns that are to be classified and will be done here as well for both of the noisy patterns, Figs. 5 and 6. What will be called `spread noise' below is particularly effective in reducing the number of well resolved data points in a discrete Fourier transform, as demonstrated by Dempsey & Moeck (2020). Without judicious restrictions of the dynamic range of the structure-bearing Fourier coefficient amplitudes and the Abbe resolution of a noisy pattern, one may produce conspicuous artifacts in the subsequent crystallographic processing of the more or less 2D periodic image when one works with *.hka files.
The MATLAB script hkaAICnorm, as written by a graduate student of this author (Dempsey & Moeck, 2020), was used for the extraction of the pertinent information from the exported *.hka files. That script can be freely downloaded (https://github.com/nanocrystallography/hkaAIC_Public) and calculates the sums of normalized squared residuals for all of the geometric models that are used in this study from a series of *.hka files from the CRISP program. [As described in Dempsey & Moeck (2020), the script works with normalized amplitudes of the structure-bearing Fourier coefficients in order to keep the numbers in the data tables small.]
The noise-free pattern, Fig. 1, of the synthetic series is classified with respect to its plane symmetry group and projected Laue class in Section 4.2. Section 4.3 presents the classifications of the two noisy patterns, Figs. 5 and 6, of the series.
The results of the crystallographic processing of the two noisy patterns of the series are given in Section 4.4.
4.2. Classification of the noise-free pattern in the series of crystal patterns
Table 1 lists the sums of squared residuals for a judicious selection of geometric models of the noise-free pattern, of which a small section is shown in Fig. 1. In all three analyses of this paper, circular area selections with a diameter of 1024 pixels were made in direct space for the calculation of the discrete Fourier transforms. These sections contained approximately 88 primitive unit cells of the crystal patterns that are to be classified.
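The geometry of such a circular area selection can be sketched in plain Python. The mask convention (pixel centers within the radius count as selected) and the function names are assumptions made for illustration; the text does not state how CRISP treats pixels on the boundary.

```python
import math
from typing import List


def circular_mask(size: int, diameter: int) -> List[List[bool]]:
    """Boolean mask of a centered disk on a size x size pixel grid."""
    center = (size - 1) / 2.0
    radius_sq = (diameter / 2.0) ** 2
    return [[(x - center) ** 2 + (y - center) ** 2 <= radius_sq
             for x in range(size)]
            for y in range(size)]


def pixels_per_unit_cell(diameter: int, n_cells: float) -> float:
    """Average number of pixels per unit cell within the circular selection."""
    return math.pi * (diameter / 2.0) ** 2 / n_cells


# A 1024 pixel diameter with about 88 primitive unit cells corresponds to
# roughly 9.4e3 pixels per unit cell, i.e. an edge of about 97 pixels for a
# square cell.
area = pixels_per_unit_cell(1024, 88)
print(round(area), round(math.sqrt(area)))
```

This back-of-the-envelope number indicates that each unit cell of the analyzed patterns is sampled by several thousand pixels.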
No explicit spatial resolution restriction was made in Fourier space for the calculation of the entries in Table 1, as the pattern is considered to be free of generalized noise. The dynamic range of the Fourier coefficient amplitudes was set to 100 in order to restrict the number of data points N in inequality (9b) to something that is easily managed. (This amounts to an implicit spatial resolution restriction.)
Note that the first seven entries in this table consist of the geometric models of the input data that feature two non-translational symmetry operations, whereas the 8th entry features three such operations. All of these eight models are disjoint from each other [and there are no connecting vectors between them in the plane symmetry hierarchy tree in Fig. 4(a)].
The subsequent three entries in Table 1 consist of geometric models that feature four non-translational symmetry operations. The last two entries feature eight such operations and the two corresponding models are disjoint from each other (in the translationengleiche sense; Burzlaff et al., 1968).
For the pattern that underlies Fig. 1, the lowest sum of squared residuals of the complex Fourier coefficients is obtained for the geometric model that has been symmetrized to plane symmetry group p2, see Table 1. The geometric model with plane symmetry group p4 is listed in this table as the one that has the lowest (non-zero) sum of squared residuals of the amplitudes of the Fourier coefficients.
The point symmetry in the amplitude map of the discrete Fourier transform is, for the p4 symmetry model of the input image data, point group 4 (Aroyo, 2016; Hahn, 2010), which is a projected Laue class. For easy reference, the entries for geometric models with plane symmetry groups p2 and p4 are marked in Table 1 in bold.
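The distinction between the two kinds of residual sums can be illustrated with a minimal Python sketch. It assumes that a residual is the difference between a raw Fourier coefficient and its symmetrized counterpart; the function names, the list-based data layout and the two made-up coefficients are hypothetical and do not reproduce the *.hka format or the normalization used by hkaAICnorm.

```python
# Sketch only: `raw` and `sym` are hypothetical lists of complex Fourier
# coefficients (same reflections, same order). This is not the *.hka format.

def complex_residual_sum(raw, sym):
    """Sum of squared residuals of the complex Fourier coefficients."""
    return sum(abs(f - g) ** 2 for f, g in zip(raw, sym))


def amplitude_residual_sum(raw, sym):
    """Sum of squared residuals of the Fourier coefficient amplitudes."""
    return sum((abs(f) - abs(g)) ** 2 for f, g in zip(raw, sym))


# Two made-up coefficients: equal amplitudes, but one phase differs.
raw = [1 + 0j, 0 + 2j]
sym = [0 + 1j, 0 + 2j]

J_complex = complex_residual_sum(raw, sym)      # sensitive to phase changes
J_amplitude = amplitude_residual_sum(raw, sym)  # blind to phase changes
```

In this toy case the amplitude residual sum vanishes while the complex residual sum does not, which is why the model with the lowest amplitude residual sum need not coincide with the model with the lowest complex residual sum.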
The selection of entries in Table 1 has been made in order to demonstrate the climbing up from a lower level of the hierarchy of plane symmetry groups, see Fig. 4(a), to the next higher level. The tests of whether such a climbing up is allowed by the fulfillment of inequality (9b) always start at the anchoring group, i.e. the geometric model with the plane symmetry that has the lowest sum of squared residuals of the complex Fourier coefficients amongst the mutually disjoint models with two and three non-translational symmetry operations. That starting model always features, by definition, a genuine symmetry, but more genuine symmetries can potentially be identified by the fulfillment of inequality (9b) for some of its non-disjoint models that may combine with the first identified genuine symmetry to form some higher-level genuine symmetry.
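The selection of the anchoring group described above can be sketched in a few lines of Python. The residual sums below are invented for illustration and are not the entries of Table 1; the dictionary layout and the function name are likewise hypothetical.

```python
# Hypothetical residual sums (NOT the values of Table 1), keyed by plane group.
# "k" is the number of non-translational symmetry operations of each model,
# "J" the sum of squared residuals of the complex Fourier coefficients.
models = {
    "p2":   {"k": 2, "J": 0.10},
    "p1g1": {"k": 2, "J": 0.25},
    "c1m1": {"k": 2, "J": 0.30},
    "p1m1": {"k": 2, "J": 4.00},
    "p3":   {"k": 3, "J": 6.00},
    "p4":   {"k": 4, "J": 0.12},
}


def anchoring_group(models):
    """Return the least broken plane symmetry group at the k = 2 or 3 level."""
    candidates = {g: m for g, m in models.items() if m["k"] in (2, 3)}
    return min(candidates, key=lambda g: candidates[g]["J"])


anchor = anchoring_group(models)
```

Models with more than three non-translational symmetry operations are deliberately excluded from the search, because they can only be reached by climbing-up tests that start from the anchoring group.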
As already mentioned above, the geometric model that was symmetrized to plane symmetry group p2 features the lowest sum of squared residuals of the complex Fourier coefficients in Table 1. Symmetry models that are candidates for climbing up from the geometric model that was symmetrized to p2 in the plane symmetry group hierarchy tree, Fig. 4(a), e.g. p2mg, p2gm, p2gg, p2mm, c2mm or p4, need to have a sufficiently small sum of squared residuals (and G-AIC values) with respect to all of their maximal subgroups in order to be declared genuine. Otherwise, they can only be Fedorov-type pseudosymmetries by definition. Geometric models of the input image data with low (but not the lowest) sums of squared complex Fourier coefficient residuals and two or three non-translational symmetry operations may either reveal a genuine symmetry or a Fedorov-type pseudosymmetry.
Plane symmetry group p4 has only one maximal subgroup, i.e. p2, so that only one inequality fulfillment test is needed to find out if the former is a genuine symmetry of the pattern that underlies Fig. 1 or not. For each of the other five geometric models mentioned in the previous paragraph, one would need to complete three inequality fulfillment tests. It is, however, already quite clear from the entries in Table 1 that only the models that were symmetrized to plane symmetry groups p1g1, p11g, c1m1 and c11m have sums of squared residuals (and G-AIC values) that are low enough to make them reasonable candidates for climbing-up tests to geometric models that feature a minimal supergroup that they share with p2. The models with plane symmetry groups p1m1 and p11m feature very high sums of squared residuals of the complex Fourier coefficients in Table 1 so that it is unreasonable to expect that they could possibly combine with the geometric model that features the p2 anchoring group. The pattern that underlies Fig. 1 can, therefore, not be classified as belonging to plane symmetry groups p2mm, p2gm and p2mg. Analogously, given that the entry in the second column of Table 1 is even higher for the geometric model that was symmetrized to plane symmetry group p3, the pattern in this figure is definitely not hexagonal.
Table 2 gives the ratios of the sums of squared residuals of the complex Fourier coefficients for the non-disjoint models of Table 1 [left-hand side of inequality (9b) in the second column] together with the maximal value that these ratios may have [right-hand side of inequality (9b) in the third column] in the context of minimization of the G-AIC value of the higher-symmetry model of a pair of non-disjoint geometric models of the input image data. The tests if climbing up to the next level of the plane symmetry hierarchy tree is allowed consist of a simple comparison of the numerical values in the second and third column of Table 2, which is recorded in the fourth column.
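A climbing-up test of this kind can be sketched as follows. Inequality (9b) is not reproduced in this section and accounts for the numbers of data points of both models, so the sketch falls back on a simpler equal-N form, J_m/J_l < 1 + 2(k_m - k_l)/[k_m(k_l - 1)], which is an assumption made here for illustration. All function names and numerical values are hypothetical and not taken from Tables 1 or 2.

```python
def max_ratio(k_l, k_m):
    """Assumed maximal ratio J_m / J_l for equal numbers of data points.

    k_l and k_m are the numbers of non-translational symmetry operations of
    the lower- and higher-symmetry model. This equal-N form is an assumption;
    the paper's inequality (9b) also involves the numbers of data points.
    """
    if k_l <= 1 or k_m <= k_l:
        raise ValueError("requires k_m > k_l > 1")
    return 1.0 + 2.0 * (k_m - k_l) / (k_m * (k_l - 1))


def may_climb(J_l, J_m, k_l, k_m):
    """True if the residual-sum ratio stays below the assumed maximal value."""
    return (J_m / J_l) < max_ratio(k_l, k_m)


# Hypothetical sums of squared residuals of the complex Fourier coefficients.
genuine_candidate = may_climb(J_l=0.10, J_m=0.12, k_l=2, k_m=4)
pseudo_candidate = may_climb(J_l=0.10, J_m=0.45, k_l=2, k_m=4)
```

The comparison mirrors the reading of Table 2: the second column corresponds to `J_m / J_l`, the third column to the maximal value, and the fourth column to the Boolean outcome.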
There is only one unconditional `yes' in the fourth column of this table, as marked by the row of entries in bold, so that the conclusion has to be drawn that the geometric model which has been symmetrized to plane symmetry group p4 features the only other genuine symmetry in the pattern that underlies Fig. 1, i.e. the noise-free pattern of the series.
It is important to realize that all genuine symmetries above the k = 2 and 3 level must by definition be anchored to the least broken plane symmetry group, i.e. the one with the lowest sum of squared residuals for the complex Fourier coefficients at the kl = 2 and 3 levels in Fig. 4(a). The fulfillment of inequality (9b) for a pair of non-disjoint geometric models that does not fulfil this overriding requirement can per definition only signify a Fedorov-type pseudosymmetry.
The `strength' of a Fedorov-type pseudosymmetry correlates inversely with the sum of the squared residuals of the complex Fourier coefficients of its corresponding geometric model of the input image data. Plane symmetry groups p2gg and c2mm must be Fedorov-type pseudosymmetries of the pattern in Fig. 1 because climbing up from p2 is not permitted, see the first and fourth entries in Table 2. These two plane symmetry groups are strong Fedorov-type pseudosymmetries because the sums of squared complex Fourier coefficient residuals of the corresponding two geometric models of the input image data are low in Table 1. Their maximal subgroups p1g1, p11g, c1m1 and c11m are even stronger Fedorov-type pseudosymmetries as they are disjoint from the p2 anchoring group and the corresponding geometric models feature lower sums of squared residuals of the complex Fourier coefficients in Table 1 than the models that represent the minimal supergroups p2gg and c2mm.
Note that climbing-up tests for strong Fedorov-type pseudosymmetries to the km = 4 level, i.e. p2gg and c2mm, and up to km = 8, i.e. p4gm, result in rather low values for the left-hand side of inequality (9b) in Table 2. This is due to the corresponding sums of squared complex Fourier coefficient residuals for the matching kl = 2 and 4 levels being of the same order in Table 1. The ratios of such sums may, for strong Fedorov-type pseudosymmetries, even fall below unity (note A7), as shown for the last entry in Table 2.
The identification of the projected Laue class that minimizes the G-AIC value for the pattern that underlies Fig. 1 proceeds analogously. Point group 4 has already been identified above as the point symmetry of the amplitude map of the geometric model that has been symmetrized to plane symmetry group p4. Because the p4 model has the lowest sum of squared Fourier coefficient amplitude residuals in Table 1, 4 is the anchoring Laue class for the projected Laue class classification of the pattern that underlies Fig. 1. Both this projected Laue class and 2D Laue class 2mm feature four operations, kl = 4, and are disjoint from each other, see the hierarchy tree in Fig. 4(b).
Table 3 gives the ratios of the sums of the squared Fourier coefficient amplitude residuals for the non-disjoint models of Table 1 (with kl = 4) together with the maximal value that these ratios may have for a climbing up to the km = 8 level. Obviously, one cannot climb up from the model with projected Laue class 4 to the non-disjoint model with projected Laue class 4mm with km = 8 [in Fig. 4(b)], based on the numbers in this table.
Based on the low sums of squared Fourier coefficient amplitude residuals in Table 1, the models for projected Laue classes 2mm and 4mm reveal pseudosymmetries in the input image data. This is fully consistent with the identified Fedorov-type pseudosymmetries at the plane symmetry group level.
To conclude this subsection: plane symmetry group p4 (which contains p2 as its only maximal subgroup) and projected Laue class 4 are identified as both genuine in the pattern that underlies Fig. 1 and crystallographically consistent with each other. The identified Fedorov-type pseudosymmetries at the lowest level of the hierarchy tree of plane symmetry groups are p1g1, p11g, c1m1 and c11m. These pseudosymmetries combine with each other and the identified genuine symmetries to form the pseudosymmetry groups p2gg, c2mm and p4gm. There are corresponding 2mm and 4mm pseudosymmetries in the Fourier transform amplitude map of the noise-free pattern in Fig. 1, but no 4mm pseudo-site symmetry in the direct space of the input image data, since the p1m1 and p11m models of these data feature sums of squared complex Fourier coefficient residuals that are far too large to pass climbing-up tests in the plane symmetry hierarchy tree of Fig. 4(a).
4.3. Classifications of the two noisy patterns of the series of crystal patterns
Figs. 5 and 6 show sections of the two synthetic patterns that were obtained by adding approximately Gaussian distributed noise to the pattern that served as the basis of Fig. 1, i.e. the approximately 144 periodic motif repeats containing an expanded representation of the original graphic artwork (Knoll, 2003) that is considered to be free of generalized noise. The freeware program GIMP (GIMP 2.10, for Windows 7 and above, freely downloadable at https://www.gimp.org/) was used to add the noise.
Spread noise swaps individual pixel intensities in the horizontal and vertical directions by a selected number of pixels (note A8). Strictly Gaussian distributed noise only changes the individual pixel values but not their positions in the translation periodic pattern. The employed mixtures of strictly Gaussian distributed noise and spread noise add up to approximately Gaussian distributed noise. (The strictly Gaussian distributed noise had been added to the pattern in Fig. 1 before the spread noise was added with GIMP.)
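The difference between the two noise types can be made concrete with a small pure-Python sketch. It is only a rough stand-in for the GIMP filters that were actually used: the function names, the clamping at the image border, the noise parameters and the tiny test image are all assumptions made for illustration.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Change pixel values only; pixel positions stay where they are."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in image]


def add_spread_noise(image, amount, rng):
    """Swap each pixel with a random partner at most `amount` pixels away.

    The offset is drawn independently in the horizontal and vertical
    directions; partners outside the image are clamped to the border.
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            r2 = min(max(r + rng.randint(-amount, amount), 0), rows - 1)
            c2 = min(max(c + rng.randint(-amount, amount), 0), cols - 1)
            out[r][c], out[r2][c2] = out[r2][c2], out[r][c]
    return out


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
clean = [[float((r * c) % 7) for c in range(16)] for r in range(16)]

# Same order as in the text: Gaussian noise first, spread noise second.
gaussian = add_gaussian_noise(clean, sigma=0.5, rng=rng)
noisy = add_spread_noise(gaussian, amount=2, rng=rng)
```

Because spread noise only exchanges pixels, it leaves the intensity histogram of an image unchanged, whereas Gaussian noise broadens the histogram peaks.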
The effects of the added noise are clearly visible in Figs. 5 and 6 and their histogram insets when compared with the histogram inset in Fig. 1 and that figure itself. Compared with Fig. 5, there is approximately five times as much added noise in Fig. 6.
We classify the noisy pattern that underlies Fig. 5 first. The dynamic range in the employed *.hka files from CRISP was set to 100. The selection in Fourier space was set to a 350 pixel radius (out of the maximal possible 512 pixel radius). The combination of both of these settings resulted in a reasonable number of Fourier coefficients in the last column of Table 4. A consequence of these two settings is a contrast reduction of the crystallographically processed version of this pattern, Fig. 7 (in Section 4.4 below), with respect to the pattern in Fig. 1. These settings ensured, on the other hand, that there are only very minor processing artifacts in the pattern of Fig. 7.
The geometric model with plane symmetry group p2 features again the lowest sum of squared residuals of the complex Fourier coefficients in Table 4. Also as before, the model that was symmetrized to plane symmetry group p4 features the lowest sum of Fourier coefficient amplitude residuals. Again, the rows for these two geometric models of the input image data are highlighted in bold in Table 4 for easy reference.
Analogous to Table 2, Table 5 gives the ratios of the sums of the squared residuals of the complex Fourier coefficients for climbing-up tests. There are four unconditional `yes' entries in Table 5 when the prior information on the objective symmetry classification of the noise-free pattern of the series from the previous subsection is not used. The rows of the corresponding entries are again marked in bold.
The preliminary conclusion from the bold rows in Table 5 is that the genuine plane symmetry group of the noisy pattern in Fig. 5 must either be p2gg or p4. These two plane symmetry groups are disjoint from each other, see Fig. 4(a), so that one of these two groups has to be a Fedorov-type pseudosymmetry by definition. The decision about which of these two plane symmetries is genuine relies on the necessity of the crystallographic consistency of the plane symmetry classification with the Laue class classification of the noisy pattern in Fig. 5.
The anchoring Laue class is point group 4 because the corresponding p4 symmetrized model of the noisy pattern in Fig. 5 features in Table 4 the lowest sum of squared residuals of the Fourier coefficient amplitudes. The symmetry in the amplitude maps of the discrete Fourier transforms of the geometric models of the pattern that underlies Fig. 5 that were symmetrized to plane symmetry groups p1m1, p11m, p1g1, p11g, c1m1, c11m, p2gg and c2mm is point symmetry/Laue class 2mm (Aroyo, 2016; Hahn, 2010).
Table 6 is analogous to Table 3 and lists the ratios of sums of squared Fourier coefficient amplitude residuals for the pattern with the modest amount of added noise that underlies Fig. 5. The conclusion from this table is that projected Laue class 4 is the only genuine class as climbing up from the anchoring class to Laue class/point group 4mm is not allowed. Crystallographically consistent with this is that ascent from the geometric model that was symmetrized to plane symmetry group p4 to the p4gm symmetrized model of the input image data is not allowed, see Table 5.
Note that point group 4 captures the symmetry in the amplitude map of the discrete Fourier transform of the noisy pattern that underlies Fig. 5 better by more than a factor of 2.7 than 2mm, which is at the same kl = 4 level of the hierarchy tree of Fig. 4(b). It is, therefore, without doubt the Laue class of the Kullback–Leibler best geometric model of the amplitude map of that pattern.
Laue class 2mm is according to Table 6 a pseudosymmetry at the Laue class level and the corresponding plane symmetry group p2gg can also only be a strong Fedorov-type pseudosymmetry. With 2mm identified as pseudosymmetry and 4 as the genuine symmetry in the amplitude map of the discrete Fourier transform of the noisy pattern in Fig. 5, there must also be a 4mm pseudosymmetry in this map. This is confirmed by the numerical values in Table 6.
Note in passing that the ratio of the sums of squared residuals of the complex Fourier coefficients is smaller than unity for the `p4 over p2' row of Table 5. This is probably the result of both small accumulated calculation errors in the analysis and slight differences in the accuracy of the representation of the geometric models in the employed *.hka files from CRISP (notes A6 and A7).
There is again no 4mm pseudo-site symmetry in the direct space of that pattern because ascent from the geometric model that was symmetrized to plane symmetry group p4 to its counterpart with plane symmetry p4mm is blocked in Table 5 by a very wide margin.
Clear distinctions between genuine symmetries and Fedorov-type pseudosymmetries were, thus, again obtained. The added approximately Gaussian distributed noise presented no challenge to the task of classifying the pattern with respect to its crystallographic symmetries when the amount of noise was modest.
The preliminary issue of which of the two disjoint plane symmetry groups, p2gg or p4, is the symmetry of the Kullback–Leibler best model of the noisy pattern that underlies Fig. 5 was straightforwardly resolved by recognizing 4 as the anchoring Laue class. Note that no prior knowledge of the classification of the noise-free pattern in the series of crystal patterns from Section 4.2 was used to reach the final conclusions. As expected, the effect of adding noise is an obscuring of the differences in the amounts of breakings of the various plane symmetry groups. Adding larger amounts of noise that is to a lesser approximation Gaussian distributed should confirm the general trend that genuine symmetries and pseudosymmetries in crystal patterns get more difficult to distinguish. As we will see below, this is indeed the case.
In analogy to Tables 1 and 4, Table 7 gives the characteristics of the geometric models for the rather noisy pattern that underlies Fig. 6. All of the sums of squared residuals except those for p1m1, p11m, p3 and p4mm are highlighted in this table in bold. This is because, as Table 8 shows, genuine symmetries at the plane symmetry group level can no longer be distinguished from strong Fedorov-type pseudosymmetries as the result of the large amount of added noise.
Plane symmetry group p4gm is now identified as genuine and as the symmetry that most likely underlies the rather noisy pattern of Fig. 6. Note that ascent in the plane symmetry hierarchy tree of Fig. 4(a) is now permitted all the way up to the top of the p4gm branch, since inequality (9b) is fulfilled for all of the relevant non-disjoint geometric models of the input image data. The single row that features a `no, blocking ascent' in the fourth column of Table 8 is, accordingly, the only one that is not in bold font.
It is interesting to check if this classification is consistent with the classification of the rather noisy pattern into the most likely projected Laue class as well. Table 9 provides the basis for checking this out. Laue class 4 is, however, still identified by inequality (9b) as the one that minimizes the expected Kullback–Leibler divergence. This could be due to projected Laue class determinations being somewhat less susceptible to added noise, especially to spread noise (note A8), than plane symmetry group classifications.
Also, there are many more calculations going into classifications with respect to plane symmetry groups as compared with their counterparts for projected Laue classes. Rounding errors and approximations in the algorithms may therefore accumulate in the calculations for plane symmetry classifications more than for their counterparts for 2D Laue classes.
From the obvious crystallographic inconsistency that plane symmetry group p4gm and Laue class 4 have both been identified as K-L best representations of the rather noisy pattern that underlies Fig. 6, one needs to conclude that the plane symmetry classification result is incorrect (too high) and Fedorov-type pseudosymmetries have been misinterpreted as genuine symmetries. Note that this conclusion is informed by prior knowledge of the classification of the noise-free pattern of the series, but not exclusively based on that knowledge.
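The crystallographic consistency check used in this section can be sketched as a simple lookup. The mapping below assigns to each plane symmetry group the 2D Laue class of its Fourier transform amplitude map, following standard crystallography and the assignments quoted in this section; the dictionary and function names are hypothetical.

```python
# 2D Laue class (point symmetry of the Fourier amplitude map) per plane group.
LAUE_CLASS = {
    "p1": "2", "p2": "2",
    "p1m1": "2mm", "p11m": "2mm", "p1g1": "2mm", "p11g": "2mm",
    "c1m1": "2mm", "c11m": "2mm",
    "p2mm": "2mm", "p2mg": "2mm", "p2gm": "2mm", "p2gg": "2mm", "c2mm": "2mm",
    "p4": "4", "p4mm": "4mm", "p4gm": "4mm",
    "p3": "6", "p3m1": "6mm", "p31m": "6mm", "p6": "6", "p6mm": "6mm",
}


def consistent(plane_group, laue_class):
    """True if the plane group belongs to the identified 2D Laue class."""
    return LAUE_CLASS[plane_group] == laue_class


def consistent_candidates(plane_groups, laue_class):
    """Keep only those candidate plane groups that match the Laue class."""
    return [g for g in plane_groups if consistent(g, laue_class)]


# Modest noise (Fig. 5): p2gg and p4 both passed, the Laue class is 4.
modest_noise = consistent_candidates(["p2gg", "p4"], "4")
# Large noise (Fig. 6): p4gm passed, but the Laue class is still 4.
large_noise_ok = consistent("p4gm", "4")
```

For the modestly noisy pattern only p4 survives the check, whereas the p4gm result for the very noisy pattern is flagged as inconsistent, in line with the conclusions drawn in the text.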
Crystallographic symmetry classification results as obtained in this section were to be expected and are in line with those of Moeck & Dempsey (2019) and Dempsey & Moeck (2020) for other series of synthetic crystal patterns with and without added noise that feature pseudosymmetries. The conclusion from all three studies must be that the information-theory-based classification methods work very well for small to moderate amounts of noise that is to a sufficient approximation Gaussian distributed.
Methods that rely on ignoring higher-order terms in equation (3) must, however, fail when there is far too much noise in a more or less 2D periodic pattern that is to be classified with respect to its crystallographic symmetries. Everything depends, of course, also on the relative complexity of a pattern and the strength of its pseudosymmetries.
The identification failure is for the pattern in Fig. 6 not `catastrophic' as, even when a misidentification is obtained for the most likely underlying plane symmetry group of the noisiest pattern, most human experts would have made the same mistake. Because it is well known that Fedorov-type pseudosymmetries are not rare in nature (Chuprunov, 2007; Somov & Chuprunov, 2009), one needs to be extra careful with the crystallographic processing of very noisy images from crystals in order not to misinterpret noise as structural information. Translational pseudosymmetries (de Gelder & Janner, 2005a,b; Somov & Chuprunov, 2009) are also not rare in nature.
In Section 4.4, the modestly noisy pattern that underlies Fig. 5 is symmetrized to plane symmetry group p4, as this was the crystallographically consistent Kullback–Leibler best representation of the plane symmetry of that pattern. We will symmetrize the very noisy pattern of Fig. 6 to plane symmetry group p4gm for demonstration purposes, although our analysis indicated that there was a crystallographic inconsistency, which is to be interpreted as that group being only a pseudosymmetry group.
4.4. Results of crystallographic image processing of the two noisy patterns of the analyzed series of crystal patterns
In order to demonstrate the benefits of the crystallographic image processing procedure, the classification results of the noisy patterns in Figs. 5 and 6 are now used to boost the signal-to-noise ratio in these two crystal patterns. Fig. 7 shows approximately 2.2 unit cells of the p4 symmetrized pattern of Fig. 5.
The conspicuous bright bow ties in Fig. 7 feature site symmetry 2 as perfectly as it is possible for real-world entities that have been derived from disturbed real-world entities by the employed algorithmic enforcing procedure. Note that these bow ties feature site symmetry 2 to a good approximation in Figs. 1 to 3 and 5. (This represents the highest and second highest site symmetries in plane symmetry groups p2 and p4, respectively.)
Plane symmetry group p2 was the anchoring group, i.e. the least broken plane symmetry at the kl = 2 or 3 level of Fig. 4(a). The sum of squared residuals of the complex structure-bearing Fourier coefficients of the p2 symmetrized model of the pattern in Fig. 5 was, accordingly, the lowest in Table 4.
A visual comparison between the patterns in Figs. 5 and 7 shows how much of the added noise has been removed by the crystallographic image processing (note A10). This also becomes clear from a comparison of the histogram insets of both figures.
The overall contrast in Fig. 7 is lower than in Fig. 1. There are also very minor (almost imperceptible) processing artifacts (note A11) in this pattern. These are small prices to pay in the opinion of the author for a significant enhancement of the signal-to-noise ratio and intrinsic image quality (footnote 1) by means of the crystallographic processing of a noisy image. (To see these artifacts more clearly, it might be better to look at the online version of this paper on a computer screen at a high magnification rather than directly at a printout.)
Essentially the same can be said about the crystallographically processed (note A10) version of the very noisy pattern that underlies Fig. 6. The contrast in the crystallographically processed version of this pattern, Fig. 8, is even lower (so that processing artifacts are imperceptible). This is mainly a consequence of using a smaller number of symmetrized complex Fourier coefficients for both the classification and the transformation back into direct space. Note that Fig. 8 shows the bright bow ties quite clearly, whereas they were visually unrecognizable (in the absence of prior knowledge) in the pattern that underlies Fig. 6.
Because plane symmetry group p4gm has been enforced on the very noisy pattern in Fig. 6, strong Fedorov-type pseudosymmetries have been rendered visibly indistinguishable from genuine symmetries in direct space. The conspicuous bow ties feature in Fig. 8, therefore, site symmetry 2mm, although the corresponding site symmetry in the undisturbed pattern was at best 2, as clearly visible in Figs. 2 and 3. Noise in the image has, thus, been misinterpreted as structure as part of a crystallographic image processing that ignored a detected crystallographic inconsistency.
The pattern with the large amount of added noise, Fig. 6, was crystallographically processed in plane symmetry group p4gm, Fig. 8, although the projected Laue class classification, i.e. 2D Laue class 4, identified a problem with the p4gm classification that is caused by the large amount of added noise. This was done here to demonstrate what happens when one symmetrizes a more or less 2D periodic pattern to a plane symmetry group that is not crystallographically consistent with the corresponding 2D Laue class classification by the information-theory-based methods.
The increased narrowness of the peaks in the histogram inset of Fig. 8 with respect to their counterparts in the histogram inset of Fig. 7 is due to averaging over twice as many (wrongly identified) asymmetric units during the crystallographic image processing. This wrongful averaging created sites in the translation-averaged unit cells that now feature point group 2mm at the fractional coordinates (½, 0), (0, ½), (½, 1) and (1, ½), as labeled in Fig. 2.
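The effect of the number of averaged asymmetric units on the histogram peak widths can be estimated with an idealized calculation. It assumes independent, zero-mean noise with the same standard deviation in every averaged copy, which is a simplification: the two processed patterns started from different noise levels, and different numbers of Fourier coefficients were used for them. The symbols σ₀ (noise standard deviation of a single pixel), n (number of averaged unit cells) and k (number of asymmetric units per unit cell, 4 for p4 and 8 for p4gm) are introduced here for illustration only.

```latex
% Idealized estimate: independent, zero-mean noise of equal variance
% in all n*k averaged copies of the asymmetric unit.
\sigma_{k} = \frac{\sigma_{0}}{\sqrt{n\,k}}, \qquad
\frac{\sigma_{8}}{\sigma_{4}} = \sqrt{\frac{4}{8}} = \frac{1}{\sqrt{2}} \approx 0.71
```

Under these assumptions, doubling the number of averaged asymmetric units narrows the noise-related spread by a factor of about 0.71, regardless of whether the additional symmetry operations are genuine.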
Nevertheless, the suppression of the noise in both of the noisy patterns is quite impressive when judged from the histogram insets in Figs. 5 and 6. Again, scanning probe microscopists should take notice of this fact as crystallographic image processing on the basis of objective classifications is now available to them as well. They need, however, to be wary of Fedorov-type pseudosymmetries that are easily misinterpreted as genuine symmetries when noise levels are high. Scanning probe microscopists in general and structural biologists who analyze subperiodic intrinsic membrane protein crystals should heed the advice that noisy images are only to be symmetrized to plane symmetry groups that are crystallographically consistent with the projected Laue class classification of a more or less 2D periodic image.
5. Comparisons of our classification results with suggestions by the CRISP program and associated comments
The objectively obtained crystallographic symmetry classification results of Section 4 are summed up in Table 10 and are now compared with the results of a traditional classification with the electron crystallography program CRISP, Table 11. It is clear from the latter table that the CRISP suggestions do not make distinctions between genuine symmetries and Fedorov-type pseudosymmetries.
Note that the comparison of the classification results is based on exactly the same structure-bearing Fourier coefficients and their symmetrized versions as facilitated by using the same *.hka files (without any manual editing; notes A6 and A7) in both types of classifications for the same pattern area selections.
As one can interactively test adjacent pattern areas for their CRISP program classification suggestions, one can not only assess the accuracy of that program's classification suggestions but also their precision. It was found that adjacent areas in both the noise-free pattern and the pattern with the moderate amount of added noise resulted in either p4gm or p2gg classifications with CRISP. The p4gm suggestion by CRISP for the noisiest pattern did, however, not change with the selected pattern regions.
At least the noise-free pattern in the series should be homogeneous so that all adjacent image areas should be classified as featuring the same plane symmetry. One has to note that a large number of calculations goes into a plane symmetry classification so that CRISP's symmetry deviation quantifiers for different geometric models of the input image data are indeed slightly different for each different region.
The p2gg classification suggestions by CRISP are consistent with the bright bow ties featuring a site symmetry that is no higher than point group 2, as clearly revealed in Figs. 2 and 3. These classification suggestions assign point group 2 as well to the centers of the dark curved diamonds in Fig. 1, which is an underestimation according to the classification results that were obtained with the information-theoretic methods. The strong Fedorov-type pseudosymmetries p1g1 and p11g in the selected regions of the noise-free and moderately noisy crystal patterns were misinterpreted by CRISP as genuine symmetries.
For the pattern with the modest amount of added noise, see the second entry in Table 11, the p2gg classification is consistent with the CRISP-derived lattice parameter set of a = 97.1 pixels, b = 97.0 pixels and γ = 90.0° for the pattern that underlies Fig. 5. The small difference in the magnitudes of the lattice vectors should probably be ignored based on what has been shown by Moeck & DeStefano (2018).
Crystallographic symmetry classifications with the CRISP program rely in practice heavily on visual comparisons between the translation-averaged (Fourier filtered) and differently symmetrized versions of the input image data by an expert practitioner of electron crystallography. Faced with a p2gg classification by CRISP and a 2D lattice that is almost of the square type (as obtained for the pattern with the moderate amount of added noise), most electron crystallographers would probably have simply overridden that suggestion after visual inspections and concluded that the correct plane symmetry group is p4gm (based on a square unit cell). In doing so, they would have discounted the possibility of a very strong translational pseudosymmetry or (Moeck & DeStefano, 2018).
As mentioned above repeatedly, most human experts would most likely have classified all three synthetic patterns of the series as belonging to plane symmetry group p4gm because it would not occur to them that distinctions between genuine symmetries and pseudosymmetries might be necessary. As the analyses in the preceding sections demonstrate, the p4gm classifications by CRISP for the noise-free pattern and the pattern with the large amount of added noise, see Table 11, constitute overestimations of the plane symmetry that is genuinely there, i.e. p4, due to Eva Knoll's handiwork (note A1).
Using the author's information-theory-based methods, no visual comparisons between the translation-averaged and differently symmetrized versions of the input image data are necessary. Crystallographic symmetry classifications can, therefore, be made without human supervision, but under the currently necessary assumption that there is indeed more than translation symmetry in a noisy image.
To employ crystallographic image processing techniques, the researcher no longer needs to be an electron crystallographer. This fact allows sufficiently well resolved more or less 2D periodic images from a wide range of crystalline samples that are recorded with different types of microscopes to be processed crystallographically. Previous successes in the crystallographic processing of images from scanning tunneling and atomic force microscopes are quoted by Moeck (2021b,c) and shown in Moeck (2017, 2020).
6. Summary and conclusions
Information-theory-based crystallographic symmetry classification methods for plane symmetry groups and projected Laue classes have been demonstrated on three synthetic crystal patterns. For the two noisy patterns, the classifications were complemented by showing the corresponding patterns and their histograms before and after their crystallographic processing. Note that these pairs of crystal patterns needed to be shown in this paper for demonstration purposes, but crystallographic image processing by the information-theoretic methods can proceed without prior visual inspections of such patterns by human beings.
It is concluded that the information-theory-based classification methods are statistically sound and superior to all other existing methods, including the visual insights of human expert classifiers as far as their accuracy at first sight is concerned. Information-theory-based methods should be developed for crystallographic symmetry classifications and quantifications in three spatial dimensions as there is also subjectivity in the current practice of single-crystal X-ray and neutron crystallography (note A12). The detection of noncrystallographic symmetries (defined in the introductory Section 1.1 as being incompatible with translation symmetry) is beyond the scope of the demonstrated methods and there are no plans by this author to try to tackle that kind of problem.
6.1. Notes added in proof
(1) As quoted in Moeck (2018, 2019), there is a direct-space G-AIC approach by Xanxi Liu and co-workers to plane symmetry group classifications of more or less 2D periodic patterns. The number of analyzed translation periodic tiles, t, in the pattern enters in that approach the direct-space analog to (9a) so that inequality (9c) results. There is no translation-averaged p1-symmetrized model of the input image data in that approach, so that the benefits of substantial noise reductions by working exclusively with the periodic structure-bearing Fourier coefficients vanish. For t > 1, a non-zero ratio of sums of squared direct-space pixel-intensity residuals for ascent to a geometric model of the data with km = 2 or 3 is, however, defined by (9c). (This might be the only advantage of working in direct space; note A10.) When all of the sums of squared complex Fourier coefficient residuals [equation (1)] at the km = 2 or 3 level of the plane symmetry hierarchy tree [Fig. 4(a)] are rather high, using inequality (9c) with km = 2 or 3, kl = 1 and t > 1 could either help with the identification of the anchoring plane symmetry group or provide a statistical proof that there is only translation symmetry in the pattern. This would, however, work reliably only for low and moderate levels of approximately Gaussian distributed noise. The propensity of misidentifying Fedorov-type pseudosymmetries as genuine symmetries increases in a direct-space approach more strongly with the noise level than in the present study (which was performed exclusively in Fourier space).
(2) If one were to have a trustworthy a priori estimate of the noise level, ɛa priori, from the presumed accuracy of the geometric data acquisition process, Kanatani's framework allows for a replacement of inequality (5) with the following inequality:
which reduces for our case, dl = dm = 0, and assuming Nm = Nl = N to
Note that kl does not need to be an integer larger than unity in this formulation of inequality (5). Kanatani (2005) remarked that `it is very difficult to predict the noise level … a priori in real situations' and that the noise level `can be estimated a posteriori only if the hypothesis is true'. (Italics as in the original, the ellipsis being due to Kanatani using another symbol for the noise level.) Note that when one has ascended as high as it was possible in the hierarchy trees of Figs. 4(a) and 4(b) by using inequality (9a), one has with estimate (7b) a numerical value for the square of the noise level for the geometric model that is maximally supported by the input image data, the restrictions of the Euclidian plane, and the shifting of all deviations from these restrictions into an all-inclusive generalized noise term. The selected K-L best geometric model of the input image data is as close to the `real truth' as one could get under the quite reasonable assumptions that have been made. An analog to inequality (10b) can with the estimate (7b) for the square of the a posteriori noise level and (6) be used as a consistency check of a classification with
Such checks were not part of this study (as Nm ≠ Nl for most of our cases). Note that (11) is defined even for the translation-averaged (Fourier filtered/p1-symmetrized/projected Laue class 2) geometric models of the input image data.
(3) The development of an information-theory-based method for the classification and quantification of electron diffraction patterns, as motivated at the end of Appendix C2, progresses well. The first objective projected point symmetry classification and quantification results were obtained from an experimental spot pattern, as discussed in Moeck & von Koch (2022a,b).
APPENDIX A
Notes on the text
A1. The artist Eva Knoll painted a single asymmetric unit onto a single ceramic tile by hand, see the last appendix in Moeck (2021a). (That reference is to a significantly expanded version of this paper where the artist describes the genesis of `Tiles with quasi-ellipses' in her own words and gives a reference to her portfolio.) The painted asymmetric unit featured a broken mirror line across one of its two diagonals, but covered the whole ceramic tile. That tile had a square shape (to a very good approximation) and was 6 inches (15.24 cm) long on its edges. For a color reproduction of the original painted tile, see Moeck (2021a).
The artist took a color photo of that square and produced multiple copies of that photo with the shape of squares of the same size. Sets of four photos of the tile were assembled into fourfold larger squares with fourfold rotation points at their centers by making sure that the broken mirror lines run along the fractional coordinates x, x + ½; −x, −x + ½; −x + ½, x and x + ½, −x of the thus-created unit cell. (The multiplicity of the general position in this primitive unit cell is four.) It is quite remarkable that three pairs of slightly broken glide lines were created in the unit cell as a result of this assembly process.
The so-created (fourfold larger) unit-cell squares were then laid out on a square lattice without overlaps or gaps. This created fourfold rotation points at each of the four vertices of the unit cell and twofold rotation points in the middle of each of its four edges.
The whole piece of Eva Knoll's graphic artwork consists, thus, of a translation periodic array of four properly assembled photocopies of her original tile (asymmetric unit). The graphic artwork features plane symmetry group p4 as the result of its creation process. (The genuine site symmetries in the assembly are point groups 4 and 2, which are non-disjoint.)
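The assembly process described in this note can be mimicked numerically. The following is a minimal sketch, assuming NumPy and a random array as a stand-in for the photographed tile; the function names `assemble_p4_cell` and `tile_pattern` and the particular placement of the rotated copies are illustrative assumptions, not a record of the artist's procedure.

```python
import numpy as np

def assemble_p4_cell(tile: np.ndarray) -> np.ndarray:
    """Place four rotated copies of a square tile into a 2x2 block so that
    the block is invariant under a 90 degree rotation about its center."""
    if tile.ndim != 2 or tile.shape[0] != tile.shape[1]:
        raise ValueError("tile must be a square 2D array")
    return np.block([
        [tile,              np.rot90(tile, 3)],
        [np.rot90(tile, 1), np.rot90(tile, 2)],
    ])

def tile_pattern(cell: np.ndarray, reps: int) -> np.ndarray:
    """Lay the unit cell out on a square lattice without gaps or overlaps."""
    return np.tile(cell, (reps, reps))

rng = np.random.default_rng(0)
tile = rng.integers(0, 256, size=(8, 8))  # hypothetical stand-in for the painted tile
cell = assemble_p4_cell(tile)             # unit cell with a fourfold point at its center
pattern = tile_pattern(cell, 3)           # 3 x 3 unit cells
```

Rotating `cell` or `pattern` by 90 degrees with `np.rot90` reproduces the same array, which is the numerical counterpart of the fourfold rotation points mentioned above. No mirror symmetry is imposed by this construction, in line with plane symmetry group p4 being the genuine symmetry of the artwork.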
The artistically sophisticated distribution of paint, the broken mirror line in the original asymmetric unit, and the two- and fourfold rotation points that resulted from the translation-periodic assembly process combined to several Fedorov-type pseudosymmetries. The latter give the visual impression that the graphic artwork features a crystal pattern with plane symmetry group p4gm, at least at first sight.
Owing to the large reduction in the size of the photocopies of the original tile, the diagonal pseudo-mirror line of the original tile feigns a genuine mirror line pretty well, at least at first sight. The grayscale reproduction of the original digital-color artwork in Knoll (2003) has an edge length of 5.7 cm only (and is of a square shape). There was, thus, a linear reduction of the edge length of the original painted tile to one of its digital photocopy counterparts by approximately a factor of 21.
The artist also created random assemblies of photocopies of her original tile without gaps or overlaps, see the last appendix in Moeck (2021a) for a color version of such an assembly.
A2. So far unpublished results on the classification of parallel-illumination transmission electron microscope images from a subperiodic intrinsic membrane protein crystal are mentioned briefly in Appendix C. The ongoing development of an information-theoretic classification and quantification method for projected crystallographic point symmetries from transmission electron diffraction patterns in approximate zone-axis orientations is also mentioned in Appendix C.
That method has the potential to (i) distinguish genuine quaternary symmetries of intrinsic membrane protein complexes from pseudosymmetries at the point symmetry level and (ii) solve the symmetry inclusion problem in a recently demonstrated symmetry-contrast mode (Krajnak & Etheridge, 2020) of 2D […] on a 2D grid with fast pixelated direct electron detectors (Ophus, 2019; commonly referred to as 4D-STEM).
A3. The obtaining of satisfactory Fourier filtering results was facilitated by the above-mentioned increase in the number of unit cells in the crystal pattern that underlies Fig. 1 by computational periodic motif stitching. This kind of computational increase of a digital image of the original graphic work of art is also highly beneficial to the subsequent classification and a possible follow-up step of the enforcing of the plane symmetry that most likely underlies the pattern in a statistically sound sense.
Note also that Fourier filtering (Park & Quate, 1987) is an integral part of symmetry classifications and any subsequent crystallographic processing of a digital image. This is because the sums of squared residuals and the symmetrizing of the input image data are based only on the structure-bearing Fourier coefficients of a digital image (that are laid out on a lattice in reciprocal space).
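The link between Fourier filtering and translation averaging can be illustrated with a short numerical sketch. It assumes NumPy, a square image that contains an integer number of identical unit cells along each axis, and additive Gaussian noise; the function name `fourier_filter` and all parameter values are hypothetical, and this is not the procedure implemented in CRISP or in the author's software.

```python
import numpy as np

def fourier_filter(image: np.ndarray, n_cells: int) -> np.ndarray:
    """Keep only the Fourier coefficients on the reciprocal lattice of a
    pattern whose n_cells x n_cells unit cells fill the image exactly."""
    coeffs = np.fft.fft2(image)
    mask = np.zeros(image.shape, dtype=bool)
    mask[::n_cells, ::n_cells] = True  # reciprocal-lattice nodes in the FFT index grid
    return np.fft.ifft2(np.where(mask, coeffs, 0)).real

rng = np.random.default_rng(0)
n, p = 8, 16                                    # 8 x 8 unit cells of 16 x 16 pixels
unit_cell = rng.random((p, p))                  # hypothetical noise-free motif
clean = np.tile(unit_cell, (n, n))
noisy = clean + rng.normal(0.0, 0.5, clean.shape)

filtered = fourier_filter(noisy, n)

# Direct-space translation average over all unit cells, for comparison.
averaged_cell = noisy.reshape(n, p, n, p).mean(axis=(0, 2))
averaged = np.tile(averaged_cell, (n, n))

rms_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
rms_filtered = np.sqrt(np.mean((filtered - clean) ** 2))
```

Under these idealized assumptions, `filtered` coincides with `averaged`, and the root-mean-square deviation from the noise-free pattern drops by roughly the square root of the number of averaged unit cells. Experimental images rarely contain an exact integer number of unit cells in a square frame, so practical implementations need to locate the structure-bearing coefficients in a more general way.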
A4. The analogy between Wyckoff positions in the direct space of an ideal crystal pattern and so-called `domain maps' (Verberck, 2012) of the symmetries of the Fourier coefficients of such a pattern may be helpful to appreciate this statement. There is also an analogy between the asymmetric unit in direct space and the `minimal domain' in Fourier space.
Typically, there are many more general Wyckoff positions with site symmetry 1 and their characteristic multiplicity than special Wyckoff positions with higher site symmetries and their reduced multiplicities. For unit cells that contain a large number of points in the asymmetric unit, the multiplicity of the general position approximates the combined-weighted multiplicities of all Wyckoff positions in an ideal crystal pattern of high complexity reasonably well.
A5. This is because the dimension of the data space is in our case one, i.e. intensity values of pixels. The co-dimension is the difference between the dimension of the data space and the dimension of the model space, d in (3) and (4). The dimension of the model space is zero, in our case, as geometric points are representations of the individual pixels.
A6. Relying on the *.hka files of CRISP without further editing is not ideal, see also note A7, but was done in this study in order to enable a direct comparison of the symmetry classification results. The geometric models that are represented by *.hka files with different numbers of data points, different dynamic ranges and different spatial resolutions do not necessarily always give the best possible symmetrized version of the input image data in Fourier space. For the purpose of the demonstrations in this paper and to allow for the comparison of classification results that were obtained using the information-theory-based methods with those of the CRISP program, the accuracy of all geometric models is deemed to be more than sufficient.
On all accounts, the geometric models that CRISP provides in the form of exportable *.hka files are always quite representative of symmetrized versions of analyzed images as demonstrated by the successes of countless electron crystallography studies despite necessarily different choices for the spatial resolution and numbers of included structure-bearing Fourier coefficients.
A7. Ideally, one would base all calculations on symmetrized models of the input image data that feature exactly the same appropriately indexed structure-bearing Fourier coefficients and number of such coefficients. To obtain the same number of data points (complex Fourier coefficients of the image intensity) in all geometric models of the input image data, one would need to treat Fourier coefficients that are absent in certain geometric models as featuring zero amplitude and arbitrary phase. The absences can either be systematic or incidental. In both cases, the zero-amplitude Fourier coefficients are characteristics of the properly symmetrized geometric models of the input image data.
One can then give confidence levels for the classification into minimal supergroups over maximal subgroups by using equations (12) to (15) of Appendix B and provide a complete measurement result. In the absence of generalized noise (including small calculation errors), the smallest possible entry in the second column of Table 2 should for genuine symmetries then be restricted to unity.
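The zero-padding idea of note A7 can be sketched in a few lines. The following is a minimal illustration in plain Python, assuming that each geometric model is held as a dictionary that maps a pair of indices (h, k) to a complex Fourier coefficient; the function name `pad_to_common_indices`, the model labels and all coefficient values are hypothetical and do not reflect the *.hka file format of CRISP.

```python
def pad_to_common_indices(models):
    """Give every model the same list of (h, k) indices, treating absent
    coefficients as zero amplitude (their phase is then arbitrary)."""
    all_indices = sorted(set().union(*(m.keys() for m in models.values())))
    padded = {
        name: [coeffs.get(hk, 0j) for hk in all_indices]
        for name, coeffs in models.items()
    }
    return all_indices, padded

# Hypothetical coefficients; (1, 0) is absent in the second model.
models = {
    "p2":   {(1, 0): 1.0 + 1.0j, (0, 1): 2.0 + 0.0j, (1, 1): 0.5j},
    "p2mg": {(0, 1): 2.0 + 0.0j, (1, 1): 0.5j},
}
indices, padded = pad_to_common_indices(models)
```

After padding, both models have the same number of data points, which is the precondition Nm = Nl for the special case treated in Appendix B.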
A8. Spread noise `mimics' to some extent the effects of small random crystal-sample movements in a microscope during the recording of a more or less 2D periodic image.
A9. The rather fat tails in the histogram in Fig. 6 are actually artifacts of the way the Gimp program adds Gaussian-distributed noise to the individual pixel intensity values. All pixel intensities that would, after the adding of the noise, fall below zero are set to zero (black) and all pixel intensities that would be larger than 255 are set to 255 (white). This fat-tails effect can also be seen in the histogram of the moderately noisy crystal pattern that underlies Fig. 5.
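The pile-up of intensity values at both ends of an 8-bit histogram can be reproduced with a small sketch. It assumes NumPy, a uniformly mid-gray test image and a deliberately large noise level; the function name `add_clipped_gaussian_noise` and the parameter values are illustrative assumptions and only imitate the clipping behavior described in this note, not the internals of the Gimp program.

```python
import numpy as np

def add_clipped_gaussian_noise(image, sigma, rng):
    """Add Gaussian noise to an 8-bit image and clip the result to 0..255."""
    noisy = image.astype(float) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

rng = np.random.default_rng(1)
image = np.full((256, 256), 128, dtype=np.uint8)   # hypothetical mid-gray image
noisy = add_clipped_gaussian_noise(image, sigma=100.0, rng=rng)

# Histogram with one bin per gray level.
hist = np.bincount(noisy.ravel(), minlength=256)
```

The bins `hist[0]` and `hist[255]` collect all values that the noise pushed outside of the available dynamic range, so they stand out far above their neighboring bins.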
The histogram in Fig. 6 may actually to a better approximation be described by one of Mandelbrot's stable distributions (Mandelbrot, 1963). Such a distribution may acquire approximate Gaussian tails with the addition of more stably distributed noise from a multitude of sources. This is in line with Mandelbrot's bon mot: `approximations are absurd in some problems but are adequate in many others, and they are so simple that one must consider them first' (1963). The central limit theorem applies to both stable distributions and Gaussian distributions.
A10. Note that much of the noise removal is due to the translation averaging by Fourier filtering over approximately 88 unit cells. In order to obtain a good image-quality enhancement in an experimental study of a crystal, one needs to start with an image with a large field of view and medium magnification. That is somewhat unusual in the microscopical practice where the focus is often on structural defects and images are recorded with small fields of view and very high magnifications.
As discussed in detail in Moeck (2019), the Fourier-space approach to classifications and the subsequent optimal processing of a 2D crystal pattern offers significant advantages over any direct-space approach. Wiener filters can be used in direct space to increase the image quality, but that does not restore the broken site symmetries in the translation-averaged unit cell.
The precondition for using the Fourier-space approach is, on the other hand, a direct-space image with a sufficient number of more or less translation-periodic unit cells which are represented by a large number of pixels. Depending on the complexity of the crystal pattern, several tens of unit cells may suffice for good image-processing results. The results of the processing of larger regions of more or less 2D periodic images are always better than their counterparts for smaller regions (Dempsey & Moeck, 2020). As for the shape of the processed image regions, circular discs are preferable over any other shapes. In the electron crystallography of intrinsic membrane protein crystals, one typically averages over several hundred to a few thousand unit cells in a TEM image and uses magnifications of around 50 000 only. The averaging of the periodic structure-bearing Fourier components with matching […] from multiple images of the same crystalline sample and plane symmetry group p1 is analogous to merging X-ray or neutron diffraction data from several crystals of the same kind and is common practice in electron crystallography.
The stitching together of experimental direct-space images that were recorded under different imaging conditions in order to increase the number of unit cells in the composite image is not recommended. Using a computer program such as Microsoft ICE 2.0, this may lead to additional Fourier coefficients that represent the created […]. The stitching together of the crystal pattern that underlies Fig. 1 did not lead to additional Fourier coefficients because it was free of noise, i.e. all unit cells were exactly identical due to the creation process of the graphic piece of art, see note A1.
A11. Note that the `faint square crosses' inside the `dark curved diamonds' with site symmetry 4 in Fig. 7 at the coordinates 0, 0; 1, 0; 0, 1 and 1, 1 (as marked in Fig. 2) originate partly from the tiling of digital photos of the same painted ceramic square tile, see Fig. A-7 in the expanded online version of this paper (Moeck, 2021a). There are corresponding `narrow cross' features at these positions in the expanded digital version of Eva Knoll's piece of graphic art that served as basis of all demonstrations in this paper and which is available in the *.jpg and *.tif formats in the supporting information for this paper.
The very low contrast `fourfold feature' inside the dark curved diamond at the fractional coordinates ½, ½ originates mainly from the `symmetrization of remains of the added noise' by the crystallographic image processing. Analogously, note that the bright bow ties (at fractional coordinates ½, 0; ½, 1; 0, ½ and 1, ½, as marked in Fig. 2) are not homogeneously bright (as they appear to be in Figs. 1 and 8). They feature instead a `fine structure' with the intensity distribution of a twofold rotation point that originates partly from the symmetrization of local Fourier ripples. A more thorough discussion of these artifacts is provided in Moeck (2021a).
All of these artifacts could have been suppressed by larger spatial and […] restrictions of the noisy structure-bearing Fourier coefficients in Fourier space, resulting unavoidably in lower contrasts in the direct-space pattern after back transforming. This has in principle been demonstrated with the processing of the noisiest crystal pattern of the series, see Fig. 8.
A12. In every single-crystal X-ray or neutron diffraction based determination of an unknown crystal structure, one needs to assign a space group in which the subjectively most reasonable model for the structure is to be refined. Information theory, as defined in footnote 2, is partly about the selection of the model for experimental data that is statistically/objectively most justified by the data themselves. Since the experimental data are in diffraction-based crystallography of a geometric nature, a geometric form of information theory such as the one by Kenichi Kanatani is applicable.
When the symmetry classification (and quantification) methods of this paper have been generalized to three spatial dimensions, Walter C. Hamilton's well known significance tests of crystallographic R values after refinements into non-disjoint space groups (Hamilton, 1965) could be considered superseded. This is because they have been set up as null-hypothesis tests. Information theory is widely considered to offer a superior alternative to null-hypothesis testing, see Anderson (2008) for a gentle introduction on how to bring more objectivity to scientific studies.
APPENDIX B
Ad hoc defined confidence levels for classifications into minimal supergroups for a special case of the inequality on which the author's information-theory-based methods are based
For the special case Nm = Nl, inequality (9b) reduces to
which has been labeled as inequality (9a) in the main part of this paper.
When Nm = Nl, one can take advantage of the inequality having the simple form of a numerical value on its right-hand side that is just the sum of unity and a constant term that only depends on the difference in the hierarchy levels, k, of the respective two symmetrized non-disjoint models that are to be compared with each other, see Figs. 4(a) and 4(b). The respective ratios of sums of squared complex Fourier coefficient residuals and sums of squared Fourier coefficient amplitude residuals are provided in these figures as insets for easy reference. (The comparison of two non-disjoint symmetrized models with respect of their ability to represent the input image data is based on having an appropriate `relative measure' of their numerical distance to the common translation-averaged-only model in the first place.)
Inequality (9a) can be used in connection with ad hoc defined confidence levels for geometric model selections. Providing such confidence levels can be understood as giving a quantitative measure of the corresponding model-selection uncertainty, which needs to accompany any measurement results in order to be complete (Helliwell, 2021).
Based on Kanatani's information content ratio equation (Kanatani, 1998), ad hoc defined confidence levels for model selections in favor of a non-disjoint more symmetric/restricted geometric model can for the special case Nm = Nl be straightforwardly defined [whenever inequality (9a) is fulfilled]. For two non-disjoint geometric models one obtains
and the critical value for K is obtained by inserting the condition
results.
Obviously, K ≥ Kcritical is valid as the ratio of the two sums of squared residuals ranges from unity (13) to a constant value that is larger than unity and depends on the particular combination of km and kl in inequality (9a).
When the ratio of the squared residuals is unity [as in (13)], one has 100% confidence in choosing the more symmetric model over the less symmetric model. Both models fit the input image data equally well in that special case, which will in practice only be obtained for noise-free mathematical idealizations of real-world images, perfect geometric models and with a perfectly accurate algorithm. When inequality (9a) is not fulfilled, one has zero confidence in the selection of the more symmetric model over its less symmetric counterpart. This is all formalized by the definition of the confidence level in identifying a minimal supergroup over its maximal subgroup,
which takes on values between 100% and zero as a function of the ratio of the sums of squared residuals. [Negative values, which are meaningless, result from (15) when inequality (9a) is not fulfilled so that K > 1.] It makes sense to define an average confidence level for a transition from all maximal subgroups to their common minimal supergroup. For small symmetry breakings of each individual maximal subgroup or class and low-noise data, this average confidence level can be rather high.
APPENDIX C
Outlooks on ongoing developments of the information-theoretic crystallographic symmetry classification and quantification methodology, and their potential applications
Formulations of geometric information criteria are possible where the generalized noise does not need to be approximately Gaussian distributed. For a non-Gaussian noise model, the appropriate logarithmic likelihood estimate needs to be used instead of a sum of squared residuals. The generalized inverse of the Fisher information matrix needs then to replace the isotropic covariance matrix of Gaussian-distributed noise. In Kenichi Kanatani's own words: `such an extension does not seem to have much practical significance because of the difficulty of estimating the parameters of a non-Gaussian noise distribution' (1998). Note that the generalized noise arises from multiple sources with different characteristics, but the overall distribution is not supposed to be dominated by any one of these sources.
The assumption had to be made in the main part of this paper that there is indeed more than translation symmetry in a more or less 2D periodic pattern that is to be classified with respect to its crystallographic symmetries. This may, however, not always be the case.
There are certainly approximately 2D periodic patterns with and without noise in which all point/site symmetries higher than the identity operation are only pseudosymmetries and not genuine. These patterns are revealed by large sums of squared complex Fourier coefficient residuals for all plane symmetry groups with kl = 2 and 3 and large sums of squared Fourier coefficient amplitude residuals for all projected Laue classes with kl = 4 and 6. (Note that the definition of crystal pattern at https://dictionary.iucr.org/Crystal_pattern leaves it open if there are site/point symmetries higher than the identity operation or a single glide line in the unit cell of the pattern or not.)
Those crystal patterns or images of crystals would be misclassified by the author's methods at the present stage of their development if the facts were ignored that the sums of squared residuals for all of these groups and classes are rather large. The first of the notes added in proof in Section 6.1 above identifies a practical workaround to this problem. The second of these notes mentions a consistency check that can be generalized to the Nm ≠ Nl case, administered a posteriori, and does not require kl being an integer larger than unity.
C1. Quaternary symmetry and pseudosymmetry of intrinsic membrane protein complexes
Crystallographic studies of the structure of intrinsic membrane protein complexes in lipid bilayers are in the structural biology field based on parallel-illumination transmission electron microscope (TEM) images that are dominated by Poisson-distributed shot noise. As mentioned at the beginning of this Appendix, an information-theoretic approach to the classification and quantification of crystallographic symmetries in such highly beam sensitive crystalline samples (and the digital images that were recorded from them) could be specifically developed by a generalization of Kanatani's geometric framework.
For the time being, this author sees no harm in using the methods of this paper in that particular field as well. This is for two reasons: (i) shot noise becomes approximately Gaussian distributed at moderate electron doses and (ii) the subjective (and less accurate) traditional crystallographic symmetry classification methods (that do not model the noise at all) are currently used for exactly this purpose.
So far unpublished results of this author on the plane symmetry group and Laue class classification of the cyclic nucleotide-modulated potassium channel MloK1 from the bacterium Mesorhizobium loti in both the open and closed conformations indicate that the projected genuine, i.e. least broken, quaternary symmetry of this protein complex is 2. There is, however, a strong fourfold pseudosymmetry along the channel axis as indicated by a relatively low sum of squared residuals of the complex Fourier coefficients for plane symmetry group p4gm.
This makes the potassium channel a dimer of two dimers, while other authors (Chiu et al., 2007; Kowal et al., 2014, 2018) claimed it to be a tetramer. Their claim relies, however, on the traditional classification methodology, which contains elements of subjectivity.
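The statement that shot noise becomes approximately Gaussian distributed at moderate electron doses can be checked numerically. The sketch below assumes NumPy and uses the mean number of counts per pixel as a simple proxy for the electron dose; the function name `sample_skewness` and the chosen mean values are hypothetical. A Gaussian distribution has zero skewness, whereas a Poisson distribution with mean λ has a skewness of 1/√λ, so the skewness of simulated counts should shrink as the mean grows.

```python
import numpy as np

def sample_skewness(values):
    """Third standardized moment of a sample (zero for a Gaussian)."""
    values = np.asarray(values, dtype=float)
    deviations = values - values.mean()
    return np.mean(deviations ** 3) / np.mean(deviations ** 2) ** 1.5

rng = np.random.default_rng(2)
mean_counts = (1, 10, 100, 1000)   # hypothetical mean counts per pixel
skew = {
    mean: sample_skewness(rng.poisson(mean, size=200_000))
    for mean in mean_counts
}
```

The entries of `skew` decrease monotonically from close to unity at a mean of one count per pixel to a few hundredths at a mean of one thousand counts per pixel, i.e. the simulated shot noise becomes increasingly symmetric and Gaussian-like.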
Incidentally, the experimental facts of this author's study on the above mentioned MloK1 potassium channel are similar to the results of the information-theoretic analysis of the noisiest crystal pattern in the main part of this paper. The histograms of the experimental TEM images revealed a single broad peak with slim tails and a mean value that corresponded to approximately 50% of the whole dynamic intensity range. This peak looked visually like some Gaussian function to a much better approximation than the histogram inset in Fig. 6. In other words, there was apparently enough shot noise in the experimental images and contributions from other noise sources so that the generalized noise became approximately Gaussian distributed.
According to other authors (Chiu et al., 2007; Kowal et al., 2014, 2018), the plane projected symmetry of MloK1 potassium channel crystals from this bacterium is plane symmetry group p4gm. This author's analysis indicates, on the other hand, that this can only be a strong pseudosymmetry because projected Laue class 2mm has been identified as the K-L best representation of the symmetry information in the amplitude maps of the discrete Fourier transforms of the TEM images. Note that this analysis was based on some of the same experimental images that Kowal et al. (2014) used in their study, as downloaded from the EMDataResource (2021). Those experimental images were recorded by these other authors with a large underfocus at a nominal zero-tilt setting of the specimen goniometer. A model mechanism for the opening and closing of this particular potassium channel that is restricted to fourfold rotation symmetry and supported by tomographic images and derived electron density maps, such as the one proposed by Kowal et al. (2014), has accordingly (at the present time) less `geometric support' than an alternative mechanism that is restricted to twofold rotation symmetry only.
Note that the identification of projected Laue class 2mm for the K-L best model of the experimental data rules out the existence of genuine fourfold rotation points as site symmetries in the unit cell of the MloK1 potassium channel crystal in an analogous manner, as the entries for projected Laue class 4 in Tables 7 and 9 rule out plane symmetry group p4gm for the very noisy crystal pattern that underlies Fig. 6. It is notable that it was again the information-theoretic projected Laue class determination that led to the identification of a strong Fedorov-type pseudosymmetry at the site/point symmetry level. Presumably, projected Laue class determinations by the new method are less sensitive to noise than the corresponding plane symmetry group determinations. (Amplitude maps of discrete Fourier transforms of perfect crystal patterns are known to be translation invariant.)
Complementing information-theoretic classification studies of transmission electron diffraction spot patterns from intrinsic membrane protein complexes under zero-crystal-tilt conditions would be helpful as these patterns typically feature more spots than the number of structure-bearing Fourier coefficients of the corresponding TEM images and the spot intensities are not affected by aberrations of the objective lens. This means they contain more point/site symmetry specific information. Electron diffraction patterns from perfect plane-parallel crystals are translation invariant in an ideal TEM so that small random sample movements under the electron beam might be tolerable when projected point symmetry classifications are made on the basis of such patterns.
C2. Development of an information-theoretic projected point symmetry classification and quantifications method
A first motivation for the development of an information-theoretic projected point symmetry classification and quantifications method was provided in the last paragraph of Appendix C1. There are, in addition, very interesting developments in 4D-STEM (Ophus, 2019) with fast pixelated direct electron detectors. A new symmetry-contrast imaging mode has, for example, been recently demonstrated by Krajnak & Etheridge (2020).
Future developments of that contrast mechanism into a widely accepted standard are, however, hampered by the well known symmetry inclusion relationships. The incorporation of a newly developed information-theoretic projected point symmetry group classification and quantification method on the basis of experimental electron diffraction patterns would solve this problem. As N is not likely to be large in electron diffraction patterns of crystals with small unit cells and structural defects, suitable replacements for equation (6) have to be used.
This author has taken up the challenge to develop such a method for spot patterns, precession electron diffraction patterns, nearly-parallel-illumination nanodiffraction disc patterns, and convergent beam microdiffraction patterns with essentially non-overlapping and featureless (blank) electron diffraction discs.
Supporting information
Expansion of a digital representation of the original artwork by Eva Knoll (tif). DOI: https://doi.org/10.1107/S2053273322000845/ou5022sup1.tif
Expansion of a digital representation of the original artwork by Eva Knoll (jpg). DOI: https://doi.org/10.1107/S2053273322000845/ou5022sup2.jpg
Footnotes
1Crystallographic image processing is in an appendix to Moeck (2021a), i.e. an expanded version of this paper, discussed as a form of computational imaging. The concept of intrinsic image quality is defined there by means of an equation. The concept of `Abbe resolution' is also defined in the main part of that open-access paper.
2According to the Merriam–Webster Dictionary, information theory is defined as `a theory that deals statistically with information, with the measurement of its content in terms of its distinguishing essential characteristics or by the number of alternatives from which it makes a choice possible, and with the efficiency of processes of communication between humans and machines' (https://www.merriam-webster.com/dictionary/information_theory).
Acknowledgements
The current fellow members of Portland State University's Nano-Crystallography Group, Regan Garner, Choomno Moos, Grayson Kolar, Gabriel Eng, Noah Allen and Lukas von Koch, are thanked for critical proofreads of the manuscript. Regan Garner is also thanked for the graphs in Figs. 4(a) and 4(b). Professor Eva Knoll of the Department of Mathematics of the University of Quebec at Montreal is thanked for both a critical proofread and enlightening discussions of her early tessellation artwork by e-mail. Professor Emeritus Emil Makovicky of the Department of Geosciences and Natural Resource Management of the University of Copenhagen is thanked for comments and stimulating discussions both by e-mail and on the Zoom video conferencing platform. Professor Emeritus Kenichi Kanatani of the Department of Computer Science of Okayama University is thanked for his interest in this author's work and associated e-mail conversations in recent years. Professor Emeritus Bryant York of the Department of Computer Science of Portland State University is likewise thanked for his interests in this author's recent work and numerous personal discussions over the years.
References
Anderson, D. R. (2008). Model Based Inference in the Life Sciences, a Primer on Evidence. Springer.
Aroyo, M. I. (2016). Editor. International Tables for Crystallography, Vol. A, Space-Group Symmetry, 6th ed. Chichester: Wiley.
Biyani, N., Righetto, R. D., McLeod, R., Caujolle-Bert, D., Castano-Diez, D., Goldie, K. N. & Stahlberg, H. (2017). J. Struct. Biol. 198, 124–133.
Burzlaff, H., Fischer, W. & Hellner, F. (1968). Acta Cryst. A24, 57–67.
Chiu, P. L., Pagel, M. D., Evans, J., Chou, H.-T., Zeng, X., Gipson, B., Stahlberg, H. & Nimigean, C. M. (2007). Structure, 15, 1053–1064.
Chuprunov, E. V. (2007). Crystallogr. Rep. 52, 1–11.
Dempsey, A. & Moeck, P. (2020). arXiv:2009.08539 [cs.CV].
Eades, J. A. (2012). Personal communication.
EMDataResource (2021). Data entries EMD-2526 and EMD-2527. https://www.emdataresource.org/. (Last accessed 19 December 2021.)
Gelder, R. de & Janner, A. (2005a). Acta Cryst. B61, 287–295.
Gelder, R. de & Janner, A. (2005b). Acta Cryst. B61, 296–303.
Gipson, B., Zeng, X., Zhang, Z. Y. & Stahlberg, H. (2007). J. Struct. Biol. 157, 64–72.
Gureyev, T. E., Paganin, D. M., Kozlov, A. & Quiney, H. M. (2019). Proc. SPIE, 10887, Quantitative Phase Imaging V, 108870J.
Hahn, Th. (2010). Editor. International Tables for Crystallography, Brief Teaching Edition of Volume A, Space-Group Symmetry, 5th ed. Chichester: John Wiley & Sons.
Hamilton, W. C. (1965). Acta Cryst. 18, 502–510.
Helliwell, J. R. (2021). Acta Cryst. A77, 173–185.
Henderson, R., Sali, A., Baker, M. L., Carragher, B., Devkota, B., Downing, K. H., Egelman, E. H., Feng, Z., Frank, J., Grigorieff, N., Jiang, W., Ludtke, S. J., Medalia, O., Penczek, P. A., Rosenthal, P. B., Rossmann, M., Schmid, M. F., Schröder, G. F., Steven, A. C., Stokes, D. L., Westbrook, J. D., Wriggers, W., Yang, H., Young, J., Berman, H. M., Chiu, W., Kleywegt, G. J. & Lawson, C. L. (2012). Structure, 20, 205–214.
Hovmöller, S. (1992). Ultramicroscopy, 41, 121–135.
Hovmöller, S. (2010). Personal communication.
Jones, L. & Nellist, P. D. (2013). Microsc. Microanal. 19, 1050–1060.
Kanatani, K. (1997). IEEE Trans. Pattern Anal. Mach. Intell. 19, 246–247.
Kanatani, K. (1998). Int. J. Comput. Vis. 26, 171–189.
Kanatani, K. (2004). IEEE Trans. Pattern Anal. Mach. Intell. 26, 1307–1319.
Kanatani, K. (2005). Statistical Optimization for Geometric Computation: Theory and Practice. Slightly corrected paperback edition. Dover Books on Mathematics. Mineola: Dover Publications Inc.
Kilaas, R., Marks, L. D. & Own, C. S. (2005). Ultramicroscopy, 102, 233–237.
Knoll, E. (2003). M. C. Escher's Legacy, edited by D. Schattschneider and M. Emmer, pp. 189–198. Berlin, Heidelberg: Springer.
Kopský, V. & Litvin, D. B. (2010). Editors. International Tables for Crystallography, Vol. E, Subperiodic Groups, 2nd ed. Chichester: John Wiley & Sons.
Kowal, J., Biyani, N., Chami, M., Scherer, S., Rzepiela, A. J., Baumgartner, P., Upadhyay, V., Nimigean, C. M. & Stahlberg, H. (2018). Structure, 26, 20–27.e3.
Kowal, J., Chami, M., Baumgartner, P., Arheit, M., Chiu, P.-L., Rangl, M., Scheuring, S., Schröder, G. F., Nimigean, C. M. & Stahlberg, H. (2014). Nat. Commun. 5, 3106.
Krajnak, M. & Etheridge, J. (2020). Proc. Natl Acad. Sci. USA, 117, 27805–27810.
Lawson, C. L., Berman, H. M. & Chiu, W. (2020). Struct. Dyn. 7, 014701.
Liu, Y., Hel-Or, H., Kaplan, C. S. & Van Gool, L. (2009). Foundations Trends. Comput. Graph. Vis. 5, 1–195.
Mandelbrot, B. (1963). J. Polit. Econ. 71, 421–440.
McLachlan, D. Jr (1958). Proc. Natl Acad. Sci. USA, 44, 948–956.
Moeck, P. (2017). Microscopy and Imaging Science: Practical Approaches to Applied Research and Education, edited by A. Méndez-Vilas, pp. 503–514. Microscopy Book Series, No. 7. Badajoz: FORMATEX Research Centre.
Moeck, P. (2018). Symmetry, 10, 133.
Moeck, P. (2019). IEEE Trans. Nanotechnol. 18, 1166–1173.
Moeck, P. (2020). arXiv:2011.13102v2 [cond-mat.mtrl-sci].
Moeck, P. (2021a). arXiv:2108.00829 [eess.IV].
Moeck, P. (2021b). IEEE 21st International Conference on Nanotechnology (NANO), pp. 68–71.
Moeck, P. (2021c). arXiv:2108.01237 [physics.data-an].
Moeck, P. (2021d). arXiv:1902.04155v4 [cond-mat.mtrl-sci].
Moeck, P. & Dempsey, A. (2019). Microsc. Microanal. 25, 1936–1937.
Moeck, P. & DeStefano, P. (2018). Adv. Struct. Chem. Imag. 4, 5.
Moeck, P. & von Koch, L. (2022a). arXiv:2201.04789 [cond-mat.mtrl-sci].
Moeck, P. & von Koch, L. (2022b). arXiv:2202.00220 [cond-mat.mtrl-sci].
Nespolo, M., Souvignier, B. & Litvin, D. B. (2008). Z. Kristallogr. 223, 605–606.
Ophus, C. (2019). Microsc. Microanal. 25, 563–582.
Paganin, D. M., Kozlov, A. & Gureyev, T. E. (2019). arXiv:1909.11797 [physics.optics].
Park, S. & Quate, C. F. (1987). J. Appl. Phys. 62, 312–314.
Somov, N. V. & Chuprunov, E. V. (2009). Crystallogr. Rep. 54, 727–733.
Valpuesta, J. M., Carrascosa, J. L. & Henderson, R. (1994). J. Mol. Biol. 240, 281–287.
Verberck, B. (2012). Symmetry, 4, 379–426.
Wan, Z., Liu, Y., Fu, Z., Li, Y., Cheng, T., Li, F. & Fan, H. (2003). Z. Kristallogr. 218, 308–315.
Wondratschek, H. & Müller, U. (2004). Editors. International Tables for Crystallography, Vol. A1, Symmetry Relations between Space Groups, 1st ed. Dordrecht, Boston, London: Kluwer.
Zou, X. D. & Hovmöller, S. (2012). CRISP 2.2 Manual. Calidris, Sweden. https://www.calidris-em.com.
Zou, X. D., Hovmöller, S. & Oleynikov, P. (2011). Electron Crystallography: Electron Microscopy and Electron Diffraction. IUCr Texts on Crystallography No. 16. IUCr/Oxford University Press.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.