- 1. Introduction
- 2. Solving the phase problem for data expanded to space group P1
- 3. Using phases to find the origin shift and space group
- 4. Assigning chemical elements to the electron-density peaks
- 5. Isotropic refinement and absolute structure determination
- 6. Building the structure
- 7. Examples
- 8. Program development and distribution
- References
research papers
SHELXT – Integrated space-group and crystal-structure determination
aDepartment of Structural Chemistry, Georg-August Universität Göttingen, Tammannstrasse 4, Göttingen, 37077, Germany
*Correspondence e-mail: gsheldr@shelx.uni-ac.gwdg.de
The new computer program SHELXT employs a novel dual-space algorithm to solve the phase problem for single-crystal reflection data expanded to the space group P1. Missing data are taken into account and the resolution extended if necessary. All space groups in the specified Laue group are tested to find which are consistent with the P1 phases. After applying the resulting origin shifts and space-group symmetry, the solutions are subject to further dual-space recycling followed by a peak search and summation of the electron density around each peak. Elements are assigned to give the best fit to the integrated peak densities and if necessary additional elements are considered. An isotropic refinement is followed for non-centrosymmetric space groups by the calculation of a Flack parameter and, if appropriate, inversion of the structure. The structure is assembled to maximize its connectivity and centred optimally in the unit cell. SHELXT has already solved many thousand structures with a high success rate, and is optimized for multiprocessor computers. It is, however, unsuitable for severely disordered and twinned structures because it is based on the assumption that the structure consists of atoms.
Keywords: Patterson superposition; direct methods; dual-space recycling; space-group determination; element assignment.
1. Introduction
Although crystal structure determination by means of X-ray diffraction has had a major scientific impact for the last 100 years, it still requires the solution of the crystallographic phase problem. This problem arises because although methods for measuring the intensities of the diffracted X-rays have made considerable progress during that time, the direct experimental measurement of their relative phases is still only rarely practicable. Small-molecule crystal structures are usually solved by the use of probability relationships involving the phases of the stronger reflections, the so-called direct methods (Sheldrick et al., 2001; Giacovazzo, 2014) or more recently by the iterative use of Fourier transforms, e.g. dual-space methods such as charge flipping (Oszlányi & Sütő, 2004; Palatinus, 2013), in which the phases are constrained by the observed reflection intensities in reciprocal space and by the properties of the electron density in real space.
Before the phase problem can be solved, the usual procedure is to determine the space group of the crystal with the help of the Laue symmetry of the diffraction pattern, the presence or absence of certain reflections (the systematic absences) and statistical tests (e.g. to distinguish between centrosymmetric and non-centrosymmetric structures). This space-group determination may be upset by the presence of dominant heavy atoms or by pseudo-symmetry affecting the intensities of certain classes of reflections, and in some cases the space group is ambiguous. For example, the space groups I222 and I2₁2₁2₁ have the same systematic absences, as do Pmmn and two different orientations of Pmn2₁.
Many dual-space methods perform at least as well when the data are first expanded to the nominal space group P1 (Sheldrick & Gould, 1995). In this paper `P1' will be used to cover the centred triclinic non-centrosymmetric space-group settings such as C1 as well; the data do not need to be re-indexed for the primitive cell. After solving the phase problem in P1, the space group can be determined using the P1 phases (Burla et al., 2000; Palatinus & van der Lee, 2008) and this turns out to be a very robust general approach. SHELXT also employs this strategy. The systematic absences are not then used for the space-group determination, but all the weak reflections are still useful for identifying the best solution. Fig. 1 summarizes the course of structure determination using SHELXT. The individual stages will now be discussed in detail. The current version of SHELXT is intended for single-crystal X-ray data and is not suitable for neutron diffraction data.
2. Solving the phase problem for data expanded to space group P1
SHELXT reads standard SHELX format .ins and .hkl files. It extracts the Laue group (but not space group) and the elements that are expected to be present (but not how many atoms of each). A number of options, e.g. that all trigonal and hexagonal Laue groups should be considered (-L15), may be specified by command-line switches. A summary of the possible options is output when no filename is given on the SHELXT command line and further details are available on the SHELX home page.
The data are first merged according to the specified Laue group and then expanded to P1. In theory, SHELXT could also have been programmed to determine the Laue group, e.g. by calculating the R values or correlation coefficients when the equivalent reflections are merged. However, the Laue group has to be known to scale the data, which is an essential step for the highly focused beams now common for synchrotrons and laboratory microsources, because the effective volume of the crystal irradiated is different for different reflections and needs to be corrected for. So in practice it is best to determine the Laue group first anyway. Even though programs such as XPREP (Bruker AXS, Madison, WI 53711, USA) are no longer required to determine the space group, it is still necessary to identify the correct Laue group and metric symmetry.
2.1. Dual-space iteration starting from a Patterson superposition
The P1 dual-space recycling in SHELXT may start with random phases, but the default option of starting from a Patterson superposition minimum function (Buerger, 1959; Sheldrick, 1997) is usually more effective. Two copies of the Patterson function displaced from each other by a strong Patterson vector are superimposed and the minimum value of the two is calculated at each grid point. The resulting map is used as the initial electron density for the dual-space recycling. In an ideal case it is a double image of the structure consisting of 2N peaks, where N is the number of unique atoms, but the space-group symmetry has been lost. Since the dual-space recycling is being performed in P1 anyway, this is a good start and 2N is a significant reduction from the N² peaks in the original Patterson. The subsequent dual-space recycling is performed using the modified structure factors
$$G_o = E^{q} F_o^{1-q},$$
where E is the normalized structure factor, and a new density map is calculated by a hybrid difference Fourier synthesis with phases φc and coefficients
$$m G_o - (m - 1) G_c,$$
where φc and Gc are obtained by Fourier transformation of the current map. The default values for m and q are 3 and 0.5, respectively, but may be changed by the user. Based on experience with other structure-solution programs, q should probably be larger for large equal-atom structures and smaller for structures involving heavy atoms (to reduce Fourier ripples), but in practice it is rarely necessary to change the default values.
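As a rough illustration of the two operations described above, the following Python sketch (using NumPy, which has nothing to do with the SHELXT source code) forms a superposition minimum function on a periodic grid and builds hybrid Fourier coefficients. It assumes modified amplitudes of the form Go = E^q Fo^(1-q) and coefficients m Go - (m - 1) Gc combined with the phases φc. The function names and the representation of the Patterson vector as a whole number of grid steps are hypothetical choices made for this example.

```python
import numpy as np

def superposition_minimum(patterson, shift):
    """Point-wise minimum of a periodic Patterson map and a displaced copy.

    `shift` is the displacement in grid steps, one integer per map axis
    (an assumption of this sketch; a real vector need not lie on the grid).
    """
    axes = tuple(range(patterson.ndim))
    displaced = np.roll(patterson, shift=shift, axis=axes)
    return np.minimum(patterson, displaced)

def modified_amplitudes(E, F_obs, q=0.5):
    """Assumed form of the modified structure factors: Go = E**q * Fo**(1 - q)."""
    return np.asarray(E) ** q * np.asarray(F_obs) ** (1.0 - q)

def hybrid_coefficients(G_obs, G_calc, phi_calc, m=3.0):
    """Complex coefficients (m*Go - (m - 1)*Gc) * exp(i*phi_c) for the next map."""
    amplitude = m * np.asarray(G_obs) - (m - 1.0) * np.asarray(G_calc)
    return amplitude * np.exp(1j * np.asarray(phi_calc))

# One-dimensional toy Patterson of two equal atoms three grid steps apart:
# origin peak of height 2 and vector peaks of height 1 at +3 and -3.
toy = np.array([2, 0, 0, 1, 0, 0, 0, 1, 0, 0], dtype=float)
image = superposition_minimum(toy, (3,))
```

In this toy case the minimum function retains peaks only at grid points 0 and 3, which is an image of the two-atom structure; with m = 3 the coefficients reduce to 3Go - 2Gc.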
SHELXT adds unmeasured data above and below the resolution limit of the data in the .hkl file similar to the free lunch method described by Caliandro et al. (2005). This enables structures to be solved at an earlier stage in the data collection and is particularly useful for data collected with diamond-anvil high-pressure cells, with which it is not always possible to collect complete data. It reduces the effects of series-termination errors in the Fourier syntheses, but tends to make the electron-density integration used to assign the element types less reliable.
2.2. The random omit procedure
Omit maps are frequently used in macromolecular crystallography to reduce model bias. A small part of the structure is deleted and the rest is refined to reduce memory effects, then a new difference-density map is generated and interpreted. This concept plays an important role in SHELXT, but because no model is available at the P1 dual-space stage, it is implemented differently. The following density modification is performed unless otherwise specified by the user. A mask M(x) is constructed consisting of Gaussian-shaped peaks of unit volume at the positions of the maxima in the electron-density map. A small number of these Gaussian peaks are then deleted from the mask at random, usually every third dual-space cycle, and the new density is obtained by multiplying the original density ρ(x) with the mask:
$$\rho'(x) = \rho(x)\,M(x)$$
at each grid point x in the unit cell. This allows the random omit method to be implemented efficiently using fast Fourier transforms (FFTs) in both directions. Imposing a shape function in this way improves the atomicity of the map. Negative density is truncated to zero, a common theme in phase improvement by density modification (Shiono & Woolfson, 1992). Compared with charge flipping, the stronger imposition of atomicity probably allows the resolution requirements to be relaxed. On the other hand, charge flipping should be better for the solution of severely disordered or modulated structures, precisely because they are not atomistic!
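The masking step can be sketched in a few lines of Python. This is a deliberately simplified one-dimensional version under several assumptions that are not taken from SHELXT: the Gaussians have unit height rather than unit volume, the peak positions are given as grid indices, and the parameters `sigma` and `omit_fraction` are hypothetical names with arbitrary default values.

```python
import numpy as np

def random_omit(rho, peak_positions, sigma=1.5, omit_fraction=0.3, rng=None):
    """Multiply a periodic 1D density by a mask of Gaussians placed on the peaks.

    A random subset of the peaks is left out of the mask, so the density
    around those peaks is suppressed. Negative density is truncated to zero.
    Returns the modified density and the peak positions that were kept.
    """
    rng = np.random.default_rng() if rng is None else rng
    rho = np.asarray(rho, dtype=float)
    peaks = np.asarray(peak_positions, dtype=int)
    # Keep each peak with probability (1 - omit_fraction).
    kept = peaks[rng.random(peaks.size) >= omit_fraction]
    n = rho.size
    x = np.arange(n)
    mask = np.zeros(n)
    for p in kept:
        d = np.abs(x - p)
        d = np.minimum(d, n - d)          # distance on a periodic grid
        mask += np.exp(-0.5 * (d / sigma) ** 2)
    return np.maximum(rho, 0.0) * mask, kept
```

Because the modification is a point-wise product on the grid, the cost per cycle is dominated by the two FFTs that convert between the map and the structure factors.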
To decide which P1 solution is best, three criteria are considered:
(a) The correlation coefficient CC between Go and Gc, where Gc are the amplitudes obtained by Fourier back-transformation of the modified electron density.
(b) The structure factors Gc are normalized to give Ec and Rweak is calculated as the average value of Ec² for the 10% of unique reflections (including systematic absences) with the smallest observed normalized structure factors E (Burla et al., 2013). In this way, the weak reflections can still play a decisive role in the structure solution even though they were not used directly to determine the space group.
(c) The chemical figure of merit CHEM is calculated by performing a peak search and calculating all bond angles involving two distances in the range 1.1 to 1.8 Å. CHEM is the fraction of these angles that lie between 95 and 135° (Langs & Hauptman, 2011).
The combined figure of merit CFOM is given by
$$\mathrm{CFOM} = 0.01\,\mathrm{CC} - X R_{\mathrm{weak}},$$
where X is 1.0 unless reset by the user. For organic or organometallic structures, especially for low resolution or incomplete data, an alternative CFOM that also takes CHEM into account is sometimes better, but this is not the default option because it is not appropriate for inorganic and mineral structures. If CFOM is less than a preset threshold, the program refines further sets of starting phases, increasing the number of iterations each time this is done.
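The three criteria are simple enough to sketch in Python. The version below makes several assumptions that should be treated with caution: CC is expressed as a percentage, the default CFOM is taken to be 0.01 CC - X Rweak, and CHEM is evaluated on Cartesian peak coordinates in Å without applying any symmetry or lattice translations. All function names are hypothetical.

```python
import numpy as np

def cc_percent(G_obs, G_calc):
    """Correlation coefficient between Go and Gc, expressed in percent."""
    return 100.0 * np.corrcoef(G_obs, G_calc)[0, 1]

def r_weak(E_obs, E_calc, fraction=0.10):
    """Mean Ec**2 over the given fraction of reflections with the smallest E_obs."""
    E_obs, E_calc = np.asarray(E_obs), np.asarray(E_calc)
    n = max(1, int(round(fraction * E_obs.size)))
    weakest = np.argsort(E_obs)[:n]
    return float(np.mean(E_calc[weakest] ** 2))

def chem(coords, dmin=1.1, dmax=1.8, lo=95.0, hi=135.0):
    """Fraction of angles between two bonded distances that lie in [lo, hi] degrees."""
    coords = np.asarray(coords, dtype=float)
    good = total = 0
    for i, centre in enumerate(coords):
        # Vectors from the central peak to all peaks at a bonded distance.
        vecs = [c - centre for j, c in enumerate(coords)
                if j != i and dmin <= np.linalg.norm(c - centre) <= dmax]
        for a in range(len(vecs)):
            for b in range(a + 1, len(vecs)):
                cosine = np.dot(vecs[a], vecs[b]) / (
                    np.linalg.norm(vecs[a]) * np.linalg.norm(vecs[b]))
                angle = np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))
                total += 1
                good += int(lo <= angle <= hi)
    return good / total if total else 0.0

def cfom(cc, rweak, x=1.0):
    """Assumed default combination: 0.01*CC - X*Rweak."""
    return 0.01 * cc - x * rweak
```

For example, a trigonal-planar fragment with 1.5 Å bonds gives CHEM = 1, whereas a linear arrangement of three peaks gives CHEM = 0, which hints at why a criterion of this kind suits organic structures better than minerals.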
3. Using phases to find the origin shift and space group
The idea of trying all possible space groups in a specified Laue group is also sometimes used in macromolecular structure determination. For example, if the crystal is orthorhombic P, Laue group mmm, and only the Sohncke space groups need to be considered, a molecular-replacement program can be asked to test all eight possibilities. If only one of the eight gives a solution with good figures of merit, both the space group and the structure have been determined! For chemical problems the situation is more interesting, because there are 30 possible orthorhombic P space groups and a total of 120 possibilities when different orientations of the axes are taken into account (as in SHELXT).
The procedure used in SHELXT to find space groups and origin shifts that are consistent with the P1 phases is based closely on the methods proposed by Burla et al. (2000) and Palatinus & van der Lee (2008), so it only needs to be summarized here. For a reflection h with P1 phase ψ and its mth symmetry equivalent hm = hRm with P1 phase ψm, where Rm is a 3 × 3 rotation matrix and tm is the corresponding translation vector, we define
$$\eta = \left[\psi_m - \psi + 2\pi\,\mathbf{h}\cdot\left(\mathbf{t}_m + \Delta\mathbf{x} - \mathbf{R}_m\,\Delta\mathbf{x}\right)\right] \bmod 2\pi ,$$
where Δx is a trial origin shift and η is taken in the range -π to π.
For the correct space group and the correct origin shift Δx, η should be close to zero. To facilitate comparisons, the figure of merit α is defined as the F²-weighted sum of η² over all pairs of equivalents for all reflections, normalized so that it should be unity for random phases. α should be as small as possible for the correct combination of space group and origin shift.
SHELXT first calculates α for the space group P1̄; this value is referred to as α0. If α0 is less than about 0.3, the space group is probably centrosymmetric. For centrosymmetric space groups, the origin shift may be used to place a centre of symmetry on the origin; however, SHELXT has to take into account that the space group may possess more than one non-equivalent centre of symmetry. For P1̄, η is calculated with a FFT and for non-centrosymmetric, non-polar space groups a two-dimensional grid search followed by a one-dimensional search is performed to speed up the calculation. The space-group search is performed in parallel for all space groups that need to be tested. Although the solution with the lowest α value is often the correct one, only unlikely solutions with α greater than a specified value (default 0.3) are eliminated before going on to the next stage.
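To make the role of α more concrete, here is a minimal Python sketch that evaluates it for a single symmetry operation and a single trial origin shift. It is written under stated assumptions rather than taken from SHELXT: structure factors follow the convention F(h) = Σ exp(2πi h·x), Δx is the displacement of the P1 solution from the symmetry-consistent origin, so that η = ψm - ψ + 2π[h·tm + (h - hm)·Δx], and the normalization divides by π²/3, the mean of η² for phases distributed uniformly between -π and π. The function name and the dictionary layout are hypothetical.

```python
import numpy as np

def alpha(phases, F2, R, t, dx):
    """F^2-weighted mean of eta^2 for one operation (R, t) and origin shift dx.

    phases, F2: dicts mapping integer (h, k, l) tuples to the P1 phase in
    radians and to the squared amplitude. R is an integer 3x3 matrix.
    The result is scaled so that random phases give a value near 1.
    """
    R = np.asarray(R, dtype=int)
    t = np.asarray(t, dtype=float)
    dx = np.asarray(dx, dtype=float)
    num = den = 0.0
    for h, psi in phases.items():
        hv = np.asarray(h, dtype=int)
        hm = tuple(int(v) for v in hv @ R)      # equivalent index h_m = h R
        if hm not in phases:
            continue
        eta = phases[hm] - psi + 2.0 * np.pi * (hv @ t + (hv - np.asarray(hm)) @ dx)
        eta = (eta + np.pi) % (2.0 * np.pi) - np.pi   # wrap into [-pi, pi)
        num += F2[h] * eta ** 2
        den += F2[h]
    return (num / den) / (np.pi ** 2 / 3.0)
```

A full search would repeat this for every operation of every candidate space group and scan Δx over a grid, which is where the FFT and the reduced-dimension searches mentioned above save time.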
4. Assigning chemical elements to the electron-density peaks
Each solution with a reasonable α value is first subject to ten cycles of density modification in the chosen space group after applying the origin shift. This density modification consists only of averaging the phases of equivalent reflections taking the space-group symmetry into account and resetting negative density to zero. A peak search is then performed, and the density inside a sphere (default radius 0.7 Å) about each peak is summed. It is better to use integrated densities rather than peak heights because the atoms may have different atomic displacement parameters. However, these integrated densities are not on an absolute scale, so the problem is how to set the scale so that they correspond to atomic numbers and the elements can be assigned. SHELXT attempts to set the scale as follows, going on to the next test only if the previous tests are negative:
(a) If carbon is specified as one of the elements present, the program searches for peaks with similar integrated densities separated from each other by typical C—C distances (i.e. between 1.25 and 1.65 Å). If enough are found, the scale is set so that they will have average atomic numbers of 6.
(b) If boron is expected, boron cages with distances between 1.65 and 1.8 Å are searched for.
(c) A search is made for oxyanions. The oxygen atoms should have similar integrated densities to each other and similar distances to a central atom.
(d) If the above tests are negative, it is assumed that the heaviest atom expected corresponds to the peak with the highest integrated density. This can run into trouble if, for example, there is an unexpected bromide or iodide ion in the structure and it has not been possible to fix the scale by one of the above methods.
When the density scale has been found, it is used to assign elements to the remaining atoms. If it then appears that there are high-density peaks that cannot be assigned because only light atoms were expected, chlorine, bromine or iodine atoms are added. Some rudimentary checks are made to ensure that the element assignments are chemically reasonable.
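The scaling idea in test (a) can be illustrated with a short Python sketch. It assumes that a set of carbon-like peaks has already been identified, sets the scale so that their mean integrated density corresponds to atomic number 6, and then assigns every peak to the expected element with the nearest atomic number. The function name, the small element table and the nearest-atomic-number rule are illustrative assumptions; the tests (b) to (d) and the chemical checks are not covered.

```python
# Atomic numbers for a few elements; extend as required (illustrative only).
ATOMIC_NUMBER = {"B": 5, "C": 6, "N": 7, "O": 8, "F": 9, "P": 15,
                 "S": 16, "Cl": 17, "Se": 34, "Br": 35, "I": 53}

def assign_elements(densities, carbon_like, expected):
    """Scale integrated densities to atomic numbers and assign elements.

    densities:   integrated density for each peak (arbitrary scale)
    carbon_like: indices of peaks assumed to be carbon atoms
    expected:    element symbols expected to be present
    Returns the scale factor and the list of assigned element symbols.
    """
    mean_carbon = sum(densities[i] for i in carbon_like) / len(carbon_like)
    scale = 6.0 / mean_carbon            # carbon-like peaks average to Z = 6
    assigned = []
    for d in densities:
        z = d * scale                    # estimated atomic number of this peak
        assigned.append(min(expected, key=lambda el: abs(ATOMIC_NUMBER[el] - z)))
    return scale, assigned

scale, elements = assign_elements([3.0, 3.1, 2.9, 4.0, 17.0], [0, 1, 2],
                                  ["C", "O", "Se"])
```

With these made-up densities the scale is 2 and the peaks are assigned as three carbons, one oxygen and one selenium. A peak whose scaled density matched none of the expected elements would be the trigger for adding chlorine, bromine or iodine as described above.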
5. Isotropic refinement and absolute structure determination
After the atoms have been assigned, an isotropic refinement is performed using a conjugate-gradient solution of the least-squares normal equations. This is similar to the CGLS refinement in SHELXL (Sheldrick, 2008, 2015) and is performed in parallel. For non-centrosymmetric space groups this is followed by the determination of the Flack parameter (Flack, 1983) by the quotient method (Parsons et al., 2013) and inversion of the structure if the value of the Flack parameter is greater than 0.5. It is thus very likely that the structure determined by SHELXT will correspond to the correct absolute structure (so far no examples to the contrary have been reported). If α0 is below 0.3 and no atom heavier than scandium is expected, the program stops after finding a plausible centrosymmetric solution. The -a command-line switch may be used to force the program to test all space groups in the assumed Laue group.
6. Building the structure
The following algorithm used to assemble the structure is diabolically simple but almost always builds and clusters the molecules in a way that is instantly recognizable. No covalent radii etc. are used, so the algorithm is independent of the element assignments.
(a) Generate the SDM (shortest-distance matrix). This is a triangular matrix of the shortest distances between unique atoms, taking symmetry into account.
(b) Set a flag to -1 for each unique atom, then change it to +1 for one atom (it does not matter which).
(c) Search the SDM for the shortest distance for which the product of the two flags is -1. If none, exit.
(d) Symmetry transform the atom with flag -1 corresponding to this distance so that it is as near as possible to the atom with flag +1, then set its flag to +1.
(e) Go to (c).
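Steps (a) to (e) translate almost directly into code. The Python sketch below is one possible reading of the algorithm, not the SHELXT implementation: atoms are given in fractional coordinates, symmetry operations as (R, t) pairs, and distances are measured with a metric matrix supplied by the caller. The helper `nearest_image` and the search over neighbouring lattice translations are assumptions made for this example.

```python
import itertools
import numpy as np

def nearest_image(anchor, atom, ops, metric):
    """Symmetry equivalent of `atom` closest to `anchor`, and its distance."""
    best, best_d2 = None, np.inf
    for R, t in ops:
        y = np.asarray(R) @ atom + np.asarray(t)
        y = y - np.round(y - anchor)             # bring close to the anchor
        for shift in itertools.product((-1, 0, 1), repeat=3):
            cand = y + np.array(shift)           # also try neighbouring cells
            d = cand - anchor
            d2 = d @ metric @ d
            if d2 < best_d2:
                best, best_d2 = cand, d2
    return best, float(np.sqrt(best_d2))

def assemble(atoms, ops, metric):
    """Cluster the unique atoms by repeatedly attaching the nearest loose atom."""
    atoms = [np.asarray(a, dtype=float) for a in atoms]
    n = len(atoms)
    # (a) shortest-distance matrix between unique atoms, symmetry included
    sdm = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            sdm[i, j] = sdm[j, i] = nearest_image(atoms[i], atoms[j], ops, metric)[1]
    # (b) all flags -1 except for one arbitrary starting atom
    flag = [-1] * n
    flag[0] = 1
    while True:
        # (c) shortest distance linking a placed (+1) and an unplaced (-1) atom
        pairs = [(sdm[i, j], i, j) for i in range(n) for j in range(n)
                 if flag[i] == 1 and flag[j] == -1]
        if not pairs:
            break
        _, i, j = min(pairs)
        # (d) move the unplaced atom next to the placed one and flag it
        atoms[j], _ = nearest_image(atoms[i], atoms[j], ops, metric)
        flag[j] = 1
    return atoms
```

The shortest distances do not change when an atom is replaced by one of its symmetry equivalents, so the matrix needs to be computed only once. The procedure is essentially the growth of a minimum spanning tree, which is why it needs no covalent radii.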
The next stage is to centre the cluster of molecules optimally in the unit cell. This is complicated, but makes extensive use of the tables of alternative origins for the different space groups given in Chapter 3 of Giacovazzo (2014). For example, one I-centred space group has four alternative origins (0, 0, 0; 0, 0, ½; ½, 0, ¼; ½, 0, ¾¹), whereas another has only two (0, 0, 0; 0, 0, ½). These are combined with the lattice centring (in this case 0, 0, 0; ½, ½, ½). For polar space groups the optimal position along the polar direction(s) (e.g. along the body diagonal of the unit cell for R3 indexed on a primitive rhombohedral lattice) that minimizes the maximum distance of any atom from the centre of the unit cell is determined.
7. Examples
The first example is an organoselenium compound (Clegg et al., 1980) for which an extract from the .lxt listing file from SHELXT is shown in Fig. 2. Four different Patterson superposition vectors were used by default to start four dual-space structure solution attempts in parallel. This was a good choice because the computer had an Intel i7 processor with four cores. On the evidence of the combined figure of merit CFOM, one of the four (try 1) is a good P1 solution. The CC and the chemical figure of merit CHEM clearly indicate the correct solution, but Rweak is less clear. N is the number of peaks used in the density modification, Sig(min) is the height of peak N divided by the r.m.s. (root-mean-square) Fourier map density and Vol/N is the volume per peak in Å³.
The best phase set was then used to search for the space group and three space groups are reported (Fig. 3); the other 11 space groups tested were rejected because one or more figures of merit were too high. The space group P2₁ is clearly indicated by the values of R1, Rweak, α and the Flack parameter, so there can be little doubt that it is correct, and in fact all the atoms are assigned to the correct elements. Note that although α0 is less than 0.3, the non-centrosymmetric space groups were searched as well because an atom (Se) heavier than scandium was specified on the SFAC instruction.
The second example (Müller et al., 2006) involves a reorientation of the unit cell. Since two orientations of Pmn2₁ have the same systematic absences, both (and possibly also the centrosymmetric Pmmn) would have had to be tried for a conventional structure solution. SHELXT finds only one solution and all atoms are correct (Fig. 4). The Flack parameter is still rather approximate but is sufficient to indicate the correct absolute structure; it improves on anisotropic refinement including the hydrogen atoms.
The third example (Walker et al., 1999) contains a bromine atom and so the non-centrosymmetric space group P1 is also tested, despite the good R1 and α values for the centrosymmetric solution (Fig. 5). In fact, this structure is pseudo-centrosymmetric: it contains a mixture that imitates a centre of symmetry. The P1 solution is completely correct. Both solutions have similar figures of merit because the main difference is the position of one carbon atom that appears to be disordered in P1̄ but not in P1, but the Flack parameter strongly indicates P1.
The last example shows what can go wrong. This structure was published by Barkley et al. (2011) in a non-centrosymmetric space group, but there are two warning signs: checkCIF (Spek, 2009) detects an inversion centre (a B alert) and the Flack parameter is dubious: the current SHELXL (Sheldrick, 2015) gives a value of 0.46 (11). Often a value close to 0.5 indicates a centrosymmetric structure. At first glance, SHELXT appears to indicate a different space group because of a significantly lower R1 value. Unfortunately, the Flack parameter cannot be determined by SHELXT for this space group because the deposited data had been merged in a different non-centrosymmetric space group (hence `no Fp' in Fig. 6). However, neither of these space groups is correct! Basically all the solutions are the same structure and the correct space group is the centrosymmetric P6₃/mmc, of which all the other space groups are subgroups. The cause of the debacle is that only for the space group that SHELXT appears to favour were the elements assigned completely correctly, and hence it has a lower R1 value. For the correct space group P6₃/mmc the manganese atom has been incorrectly assigned as calcium. With the correct element assignments all the figures of merit would have been very similar for all the space groups. In such cases the highest-symmetry (centrosymmetric) space group is almost always correct.
8. Program development and distribution
SHELXT is compiled with the Intel ifort Fortran compiler using the statically linked MKL library and is particularly suitable for multi-CPU computers. It is available free to academics for the 32- or 64-bit Windows, 32- or 64-bit Linux and 64-bit Mac OS X operating systems. The program may be downloaded as part of the SHELX system via the SHELX home page (https://shelx.uni-ac.gwdg.de/SHELX/), which also provides documentation and other useful information. Users are recommended to view the `recent changes' section on the home page from time to time.
The initial development of SHELXT was based on a test databank of about 650 structures, mostly determined in Göttingen, covering a wide range of problems. It has also been tested by more than 200 beta-testers for up to three years, in the course of which several thousand structures were solved (and a few not solved). It is difficult to generalize, but the correct space group was identified in about 97% of cases, and for about half of the structures every atom was located and assigned to the correct element. Most of the remaining structures were basically correct, the most common errors being carbon assigned as nitrogen or vice versa. Poor solutions were sometimes obtained when the heavy atoms corresponded to a centrosymmetric space group but the full structure possessed a lower symmetry. It is always essential to check the element assignments, especially if the program has added extra elements, and also to check for the presence of disordered solvent molecules that may have been missed. The biggest danger is that inexperienced users may assume that the program is always right!
Footnotes
¹ Misprinted as ½, 0, ¼ in Giacovazzo (2014).
Acknowledgements
The author is very grateful to the many SHELXT beta-testers for patiently reporting bugs, suggesting improvements and providing interesting data sets for testing. He is particularly grateful to Bruker AXS for their help with the logistics of the three-year beta-test, and for the use of their email list for rapid communication with the beta-testers. He thanks the Volkswagen-Stiftung and the state of Niedersachsen for the award of a Niedersachsen (emeritus) Professorship.
References
Barkley, M. C., Yang, H., Evans, S. H., Downs, R. T. & Origlieri, M. J. (2011). Acta Cryst. E67, i47–i48.
Buerger, M. J. (1959). Vector Space. New York: Wiley.
Burla, M. C., Carrozzini, B., Cascarano, G. L., Giacovazzo, C. & Polidori, G. (2000). J. Appl. Cryst. 33, 307–311.
Burla, M. C., Giacovazzo, C. & Polidori, G. (2013). J. Appl. Cryst. 46, 1592–1602.
Caliandro, R., Carrozzini, B., Cascarano, G. L., De Caro, L., Giacovazzo, C. & Siliqi, D. (2005). Acta Cryst. D61, 556–565.
Clegg, W., Harms, K., Sheldrick, G. M., von Kiedrowski, G. & Tietze, L.-F. (1980). Acta Cryst. B36, 3159–3162.
Flack, H. D. (1983). Acta Cryst. A39, 876–881.
Giacovazzo, C. (2014). Phasing in Crystallography. Oxford: IUCr/Oxford University Press.
Langs, D. A. & Hauptman, H. A. (2011). Acta Cryst. A67, 396–401.
Müller, P., Herbst-Irmer, R., Spek, A. L., Schneider, T. R. & Sawaya, M. R. (2006). Crystal Structure Refinement: a Crystallographer's Guide to SHELXL, pp. 48–50. Oxford: IUCr/Oxford University Press.
Oszlányi, G. & Sütő, A. (2004). Acta Cryst. A60, 134–141.
Palatinus, L. (2013). Acta Cryst. B69, 1–16.
Palatinus, L. & van der Lee, A. (2008). J. Appl. Cryst. 41, 975–984.
Parsons, S., Flack, H. D. & Wagner, T. (2013). Acta Cryst. B69, 249–259.
Sheldrick, G. M. (1997). Methods Enzymol. 276, 628–641.
Sheldrick, G. M. (2008). Acta Cryst. A64, 112–122.
Sheldrick, G. M. (2015). Acta Cryst. C71, 3–8.
Sheldrick, G. M. & Gould, R. O. (1995). Acta Cryst. B51, 423–431.
Sheldrick, G. M., Hauptman, H. A., Weeks, C. M., Miller, R. & Usón, I. (2001). International Tables for Crystallography, Vol. F, edited by E. Arnold and M. Rossmann, pp. 333–345. Dordrecht: Kluwer Academic Publishers.
Shiono, M. & Woolfson, M. M. (1992). Acta Cryst. A48, 451–456.
Spek, A. L. (2009). Acta Cryst. D65, 148–155.
Walker, M., Pohl, E., Herbst-Irmer, R., Gerlitz, M., Rohr, J. & Sheldrick, G. M. (1999). Acta Cryst. B55, 607–616.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.