Application of Patterson-function direct methods to materials characterization

Patterson-function direct methods are chronologically reviewed and their applications to powder and electron diffraction are described.


Introduction
Patterson-function direct methods (PFDM) are those direct methods (DM) extracting the phase information directly from the non-origin part of the experimental Patterson-type function. Although the first method of this category was described quite early by Rius (1993), the fact that PFDM lie halfway between traditional DM and Patterson deconvolution methods is surely one of the reasons why they are not as popular as other structure solution methods. The aim of the present article is to provide a comprehensive description of the advances in PFDM during the last 20 years and, at the same time, to introduce a rational classification and consistent nomenclature for their different variants. This clarification should help to increase their dissemination and to promote their wider use. PFDM are extremely simple both theoretically and computationally, and are especially well suited to such problems where not only the strong but also the weak intensities are accessible from the experiment. This comprises most applications to materials science dealing with crystalline matter. Although the present contribution focuses on the phasing of powder X-ray diffraction (PD) and electron diffraction (ED) data of inorganic materials, most of the results can be applied to any kind of material.

Patterson-function direct methods based on ρ²
Before starting with the description of PFDM, a short introduction to the quantities involved in their definition is in order. In this review, for simplicity, an equal-atom crystal structure belonging to space group P1 with N atoms in the unit cell is assumed. In addition, bold letters denote complex or vector quantities, while standard text indicates the corresponding moduli (amplitudes). The observed quantities are the normalized E amplitudes, i.e. the F structure factor amplitudes corrected for fall-off in sin θ/λ (mainly due to the atomic form factor evolution and to the atomic thermal vibration; θ is half the diffraction angle and λ is the incident wavelength). For an arbitrary H reflection, the corresponding E amplitude is given by

$$E_H = \left( I_H / \langle I \rangle_{\mathrm{shell}} \right)^{1/2} \qquad (1)$$

and can be derived from the measured intensity I and the average intensity in the corresponding reciprocal-space shell, ⟨I⟩_shell. A Fourier synthesis with the E values as Fourier coefficients yields the sharpened electron density distribution (ρ) of the crystal. The E values are complex quantities with their amplitudes known but with their associated phases, φ, lost during the diffraction experiment. Especially relevant for the development of DM was the derivation of the probability distribution of the structure factor (s.f.) amplitudes by Wilson (1949), in which he assumed that the atomic positions were random variables with uniform distribution throughout the unit cell. Written in terms of E, the probability distribution is (for P1 symmetry)

$$P_1(E) = 2E \exp(-E^2) \qquad (2)$$

with the moments of P_1(E) being ⟨E²⟩ = 1 and ⟨E⟩ = π^(1/2)/2 ≈ 0.886, and with associated variance

$$\sigma_E^2 = \langle E^2 \rangle - \langle E \rangle^2 = 1 - \pi/4 \approx 0.215 \qquad (3)$$

The basic assumption made, i.e. that all points in the unit cell have the same probability of hosting an atom, constitutes the 'randomness' condition. An important property of equation (2) is that P_1(E) is independent of the number of atoms N in the unit cell. Physically, the E values represent the amplitudes of a hypothetical unit cell consisting of point atoms with a scattering power equal to 1/N^(1/2). Consequently, the amplitude of the s.f. of ρ², G = G exp(iψ), is given by the simple relationship of equation (4). In view of equation (4), G can also be considered experimentally accessible, so that both E and G can be used interchangeably.
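The normalization of the intensities within a shell and the moments of the Wilson distribution can be checked numerically. The following is a minimal sketch, assuming the acentric (P1) form P_1(E) = 2E exp(−E²); the function names and the sample intensities are hypothetical and only serve as an illustration.

```python
import math


def normalize_shell(intensities):
    """Return E amplitudes for one reciprocal-space shell: E = sqrt(I / <I>_shell)."""
    mean_i = sum(intensities) / len(intensities)
    return [math.sqrt(i / mean_i) for i in intensities]


def wilson_p1_moments():
    """Closed-form moments of P1(E) = 2E exp(-E^2) (assumed acentric case)."""
    mean_e = math.sqrt(math.pi) / 2.0   # <E>
    mean_e2 = 1.0                       # <E^2>
    variance = mean_e2 - mean_e ** 2    # sigma_E^2 = 1 - pi/4
    return mean_e, mean_e2, variance


# Hypothetical intensities belonging to a single resolution shell.
shell = [12.0, 3.5, 0.4, 7.1, 1.0]
e_values = normalize_shell(shell)
mean_e, mean_e2, variance = wilson_p1_moments()
```

By construction the mean of E² over the shell is unity, whatever the scale of the measured intensities.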

The calculated structure factor amplitudes
If Φ represents the subset of refined phases belonging to the h reflections with large E values, then the G(Φ) amplitudes of the squared structure can be expressed in terms of the Fourier coefficients E exp(iφ) by Fourier transforming ρ²(Φ) = ρ(Φ)·ρ(Φ) and subsequent multiplication by exp(iψ_−H), i.e. by means of the summation of equation (5), with the ψ phases given by equation (6). The equal-peak condition is implicit in the squaring operation, whereas positivity is forced if ψ_h and φ_h are equated (only for strong reflections), i.e. by assuming that ρ²(Φ) is directly proportional to ρ(Φ) (Sayre, 1952). However, the randomness condition is not included in the squaring operation and hence will depend on each particular phasing method. Traditional DM procedures were not especially robust regarding this condition, as proved by the frequently occurring uranium-atom solution. For a long time, this solution represented a serious DM limitation; it is characterized by the appearance of one outstandingly strong peak in the Fourier map. Although the equal-peak and positivity conditions are not violated, this solution is clearly wrong.

The origin-free modulus sum function (S_M)
A Fourier synthesis with the G amplitudes as coefficients yields the modulus synthesis (M) of ρ², which is a Patterson-like synthesis with a dominant origin peak. Similarly, a synthesis with coefficients G² yields the true Patterson function, P, of ρ². If the strong origin peak is removed from M, the non-origin peaks become dominant. If M′ denotes M with no origin peak, the phasing residual R_M(Φ) of equation (7) will measure, as a function of Φ, the discrepancy between observed and calculated M′ over the whole unit cell. The Fourier coefficients of M′(Φ) are G_H(Φ) − ⟨G(Φ)⟩, which can be derived from equation (5). By applying Fourier theory, R_M(Φ) can be worked out to give equation (8) (Appendix A), where G_H − ⟨G⟩ are the Fourier coefficients of the observed M′. The first term, K_M, is a phase-independent quantity. The second term, σ²_G(Φ), is the variance of the probability distribution of the G(Φ) amplitudes, which can also be assumed to be phase-independent for Φ sets satisfying the equal-atom and randomness conditions (irrespective of the correctness of Φ). In general, these two assumptions are valid because the equal-atom condition is implicit in the squaring operation and because, by only considering the non-origin peaks of the modulus function (which correspond to interatomic vectors ranging over the whole unit cell), randomness is favoured. Consequently, minimizing R_M is essentially equivalent to maximizing the third term of equation (8), the so-called origin-free modulus sum function of equation (9), which becomes equation (10) after replacing G_−H(Φ) by equation (5). The S_M sum function [originally called Z_R in Rius (1993)] represents one of the last advances in reciprocal-space DM.
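The effect of subtracting the mean amplitude from every Fourier coefficient can be illustrated on a one-dimensional grid. The sketch below is only a toy model under stated assumptions: the amplitudes are hypothetical, the mean is taken over all grid coefficients (including H = 0), and a naive discrete Fourier synthesis replaces a real crystallographic one. Under these assumptions the origin value vanishes exactly and all other grid points are unchanged.

```python
import cmath
import math


def synthesis(coeffs):
    """Fourier synthesis on an n-point grid: f(x) = (1/n) sum_H c_H exp(2*pi*i*H*x/n)."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
        for x in range(n)
    ]


def origin_free(amplitudes):
    """Return the coefficients G_H - <G> of the origin-free synthesis."""
    mean_g = sum(amplitudes) / len(amplitudes)
    return [g - mean_g for g in amplitudes]


# Hypothetical amplitudes, equal for H and -H so that the synthesis is real.
g = [4.0, 1.5, 0.7, 2.2, 0.9, 2.2, 0.7, 1.5]
m = synthesis(g)                      # modulus-type synthesis with a dominant origin peak
m_prime = synthesis(origin_free(g))   # same synthesis with the origin peak removed
```

With experimental data of finite resolution the removal is only approximate in the sense of this toy grid, but the principle is the same: a constant term in reciprocal space corresponds to a peak at the origin in direct space.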
Since the true Φ corresponds to a maximum in S_M(Φ), a simple method for its maximization is needed. For this purpose, the order of the double summation in equation (10) is changed, so that it becomes equation (11), with Q_h(Φ) given by equation (12). If sq_M denotes the Fourier synthesis with coefficients (G_H − ⟨G⟩)exp(iψ_H), equation (12) can also be expressed as the Fourier coefficient of the sq_M·ρ product function [equation (13)].
By following Debaerdemaeker et al. (1985), the maximum of a functional like S_M can be found by solving the condition for an extremum [equation (14)]. If this condition is applied to equation (11), a tangent formula (TF) is obtained which provides the new phase estimates [equation (15)]. Depending on whether Q_h is expressed in terms of the ψ and φ phases [equation (12)] or as a function of sq_M [equation (13)], two different optimization algorithms result: (i) the sequential S_M tangent formula (S-TF algorithm) based on phase invariants, and (ii) the parallel S_M tangent formula (S-FFT algorithm) based on Fourier transforms. For simplicity, the subscripts of S (i.e. M or P) have been omitted from the general algorithm designation.
In equation (9), the experimental quantities are the amplitudes. However, for certain applications it can be desirable to work directly with intensities. That this is feasible can be easily shown by considering the physical meaning of S_M(Φ) in equation (9), which corresponds to the integral of equation (16). It is known that the principal difference between Patterson and modulus functions lies in the relative heights of the origin and non-origin peaks. If the origin peaks are suppressed, the resulting P′ and M′ functions may be regarded as proportional by a factor close to two (Rius, 2012b). Consequently, maximizing equation (16) is equivalent to maximizing the integral of equation (17) or, in terms of the respective Fourier coefficients, the sum function S_P of equation (18), since G_H² − ⟨G²⟩ = (I_H − ⟨I⟩)/N. Notice that the Q_h(Φ) expressions, equations (12) and (13), are also valid for S_P simply by replacing (G_H − ⟨G⟩) by (G_H² − ⟨G²⟩) and sq_M by sq_P, respectively. S_P is particularly useful for powder diffraction (PD), because working with experimental intensities simplifies the manipulation of overlapped intensities.
2.3. Phase refinement algorithms based on S

2.3.1. Sequential application of the tangent formula with phase invariants (S-TF algorithm)

One possibility of maximizing the S_M sum function is by means of the iterative application of the tangent formula of equation (15) with Q_h given in terms of the phases of equation (12). In practice, the H summation (involving all reflections) is split into two separate sums: the K one (strong reflections) and the L one (weak reflections). Only for the K sum is no distinction made between the ψ and φ phases. This causes the three summands having phase invariants φ_−h + φ_K + φ_h−K, φ_K + φ_−h + φ_h−K and φ_−h + φ_h−K + φ_K to have the same Φ3_hK phase sum. Consequently, they can be replaced in the sum by equation (19), with the quantities defined in equation (20), and Q_h becomes equation (21). According to equation (14), phases refined with the tangent formula of equation (15) will lead to an extremum in S_M, i.e. to a large maximum or a large minimum. However, since for strong reflections S-TF makes ψ and φ equal, only positive solutions are possible. In the S-TF algorithm the TF is applied in sequential mode. This means that, once a new φ_h estimate is calculated with equation (15), its value is immediately replaced in Φ. The updated Φ is then used to compute the phase estimate of the next reflection. This process is repeated until all h reflections in Φ have been treated and no significant phase variations are observed. Before starting a new iteration cycle, the ψ phases of the weak reflections are updated using equation (6) (for strong reflections this is done automatically, because ψ and φ are considered equal). This means that the S-TF algorithm is essentially a two-stage process in which the estimates of the φ and ψ phases are updated alternately. S-TF is very effective, easy to apply, and makes no use of any Fourier synthesis. It is ideal for solving small-molecule crystal structures. However, for crystal structures with a large number of atoms in the asymmetric unit (>500 atoms) the total number of terms in the L summation becomes prohibitive. The introduction of a higher cut-off value, E_min, for large E values reduces the number of invariant terms at the cost of lowering the accuracy of the calculated G(Φ). To check the efficiency of the S-TF algorithm, its phasing power was compared with that of the traditional TF (Karle & Hauptman, 1956), strengthened with information on the most reliable negative quartets. For crystal structures with no fixed space group origin, the success rate of S-TF was one order of magnitude higher (Table 1) (Rius et al., 1995; Sheldrick, 1990).
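The difference between the sequential mode of S-TF and a parallel update (the mode used by the S-FFT algorithm) can be made concrete with a small sketch. The estimator used here is a purely hypothetical stand-in for a tangent-formula estimate; only the update order is the point of the example.

```python
def sequential_sweep(phases, estimate):
    """Update the phases one at a time; each new value is used immediately."""
    current = list(phases)
    for h in range(len(current)):
        current[h] = estimate(h, current)
    return current


def parallel_sweep(phases, estimate):
    """Compute every new phase from the same (old) phase set."""
    old = list(phases)
    return [estimate(h, old) for h in range(len(old))]


def toy_estimate(h, phases):
    """Hypothetical estimator: depends on the current phase of the previous reflection."""
    return phases[h - 1] + 1.0


start = [0.0, 0.0, 0.0]
seq = sequential_sweep(start, toy_estimate)   # [1.0, 2.0, 3.0]
par = parallel_sweep(start, toy_estimate)     # [1.0, 1.0, 1.0]
```

In the sequential sweep the second and third estimates already see the freshly updated values, whereas in the parallel sweep all estimates derive from the same starting set.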
In retrospect, one possible explanation for the late discovery of PFDM can be found in the leading role played by the integral in the development of DM (Cochran, 1955 (11). The reason why the latter represents an improvement is explained intuitively in Fig. 1. At the beginning of this article it was stated that the first PFDM was published in 1993. This is only partially true, because the most reliable negative quartets (Schenk, 1973; Giacovazzo, 1976) can also be derived from Patterson-function arguments, i.e. by expressing the integral of equation (24) as a function of the Φ phases. Since the non-origin parts of the modulus and Patterson functions of ρ² can be considered proportional, maximizing the integral of equation (24) is equivalent to maximizing equation (16), so that in this case Q_h takes the form of equation (25) (Rius, 1997), with X_hkl defined in equation (26); h, k, l and −h, −k, −l belong to the subset of strong reflections and X_hkl involves both strong and weak reflections. Notice that X_hkl becomes clearly negative only when the three squared amplitudes in equation (26) correspond to weak reflections. One immediate difference between equations (21) and (25) is that the estimation of a new phase with equation (25) requires the lengthy calculation of a double summation. In addition, manipulation of the mixed terms in X_hkl is far from trivial.

Table 1
Comparison of success rates (%) for two tangent formula algorithms based on phase relationships. (a) S-TF with a varying number of strong reflections. (b) Traditional TF (Karle & Hauptman, 1956), complemented with the most reliable negative quartets (values taken from Sheldrick, 1990). S-TF is clearly superior, especially for space groups having the origin floating in at least one direction.

Figure 1
Since M′ has no origin peak, the peaks of the product function are evenly distributed along the unit cell (all interatomic peaks contribute). If the origin peak were present, it would play a dominant role in the product function, so that a solution of Φ giving rise to a single very strong peak in the E map would be a positive maximum of equation (23).
One year after the introduction of S-TF, the first paper on the Shake-and-Bake phasing method was published (DeTitta et al., 1994). It represented a radical change in DM philosophy, since it combined phase refinement in reciprocal space with Fourier filtering, thus exploiting the considerable computing power already available at that moment. In this way the trend towards the uranium-atom solution of traditional reciprocal DM was compensated by the periodic reintroduction of randomness during the direct-space stage by picking up the N largest Fourier peaks. This new way of preserving randomness did not rely on the weak reflections. This circumstance proved particularly useful in the solution of e.g. anomalous scatterer substructures in macromolecules. However, when weak reflections are available (as in PD or ED applications), PFDM are highly competitive. As will be shown in the next section, PFDM can also be optimized entirely in direct space (S-FFT algorithm), so that if necessary they can also be strengthened by Fourier filtering.
Between the publications of the S-TF and S-FFT algorithms, 14 years elapsed. During this period some new PD applications of S-TF were explored. One example is the crystal structure solution of the layered zeolite-like silicate RUB-15 of the formula TMA8[Si24O52(OH)4]·20H2O from laboratory PD data (TMA = tetramethylammonium cation; Oberhagemann et al., 1996). At that time it was still a common belief that DM needed intensity data at atomic resolution to be successful. However, the determination of RUB-15 demonstrated that, if the electron density of the main building elements can be roughly approximated at moderate resolution to broad spherical peaks (d_min ≈ 2 Å), DM will work. In RUB-15, both the SiO4 and TMA tetrahedra were handled in this way (Fig. 2). This approach allowed the solution of a series of layered zeolite-like compounds at moderate resolution in collaboration with the Institute for Mineralogy of the Ruhr University Bochum (Gies et al., 1998).
Another important result was the demonstration that S-TF can be applied to PD data of hemihedral compounds (Rius et al., 1999). This was confirmed by solving the crystal structures of: (i) the CAH10 binding phase in high-alumina cement (space group P6₃/m; Guirado et al., 1998), and (ii) aerinite, a natural blue pigment employed in some Catalan Romanesque mural paintings (space group P3c1; Rius et al., 2004). Both crystal structures had resisted multiple attempts at solution worldwide.
In the literature there are various methods of combining information from multiple PD patterns, e.g. by making use of the anisotropic thermal expansion of a material (Shankland et al., 1997). In the particular case of zeolites, the information contained in the powder patterns of the as-synthesized and calcined forms can easily be exploited in a two-stage procedure (Rius-Palleiro et al., 2005). In the first stage, the template molecule is located by combining isomorphous replacement with S-TF at very low resolution (d_min ≈ 3.2 Å), whereas in the second stage, the framework atoms are found by again applying the S-TF algorithm but now strengthened with the information coming from the located template molecules (d_min ≈ 2.21 Å). This procedure was applied to the solution of the ITQ-32 zeolite (Cantín et al., 2005).
All the S-TF applications described so far use the resolved reflections exclusively (except for hemihedral symmetries, where the intensities of systematically overlapping reflections were equidistributed and treated as resolved), so that strictly speaking these may be regarded as single-crystal applications. However, the solution of the triclinic crystal structure of tinticite, a partially disordered phosphate mineral, required a more sophisticated S-TF procedure in which not only the phases were refined but also the estimated intensities of the severely overlapped peaks (d_min ≈ 2.3 Å). In the best E map, the broad spherical peaks corresponding to the [Fe(III)O6] octahedra and to the phosphate tetrahedra (the latter with partial occupancies) showed up clearly (Rius, Louër et al., 2000). In spite of this success, the refinement often had stability problems, undoubtedly due to the inaccurate intensity estimation of overlapping reflections from a limited number of invariant terms.

2.3.2. Parallel application of the tangent formula via Fourier transforms (S-FFT algorithm)

The development of the S-FFT algorithm is related to the 23rd European Crystallographic Meeting in Leuven (2006). On the occasion of that meeting, Professor Baerlocher (ETH, Zurich) showed the author the potential of charge flipping when applied to PD (Baerlocher et al., 2007; Palatinus, 2013). Spurred on by this result, the rationale behind charge flipping was sought. During this search it was found that S_M can also be maximized by Fourier methods (Rius et al., 2007). In contrast with the S-TF algorithm, where the new φ_h are estimated sequentially, the S-FFT algorithm determines the new φ_h (via the Fourier transform of the ρ·sq_M product function) in parallel, i.e. from a unique Φ_old. A second important difference between the two algorithms is that in S-FFT the alternating update of the ψ and φ phases is done in completely separate stages (no explicit use is made of the equality between ψ and φ). The two stages of one iteration cycle are (Fig. 3)

Stage 1: φ_initial + observed E → ρ (stored) → ρ² → ψ.
Stage 2: ψ + observed (G − ⟨G⟩) → sq_M → ρ·sq_M → φ_new.

Since the TF refinement leads to an extremum in S_M and the condition ψ_h = φ_h is not applied during the refinement, it can produce either ρ or −ρ as valid solutions when starting from random phases.
The algorithm works quite well with single-crystal data of small- and medium-sized structures at atomic resolution. The stability of the algorithm is reflected in the fact that no electron-density modification is required after each refinement cycle, e.g. there is no need to suppress negative values or to reintroduce randomness periodically in Φ by selecting the N highest peaks in the Fourier map (and subsequently recalculating Φ from these peaks). It is clear that, for small crystal structures, the phase refinement efficiencies of S-TF and S-FFT must be similar. Table 2 compares the respective efficiencies for a selection of representative compounds.
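The two stages of one S-FFT cycle can be sketched on a one-dimensional toy grid. This is a minimal illustration under several assumptions, not the published implementation: a naive discrete Fourier transform replaces the FFT, all reflections are updated (no distinction between strong and weak ones), the G amplitudes are supplied as hypothetical numbers instead of being derived from E through equation (4), the new phase is taken as the phase of the Fourier coefficient of the ρ·sq_M product with the sign and index conventions of the transform used here, and `sfft_cycle` is a hypothetical name.

```python
import cmath
import math


def dft(values):
    """Forward transform: F_h = sum_x f(x) exp(-2*pi*i*h*x/n)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
            for h in range(n)]


def synthesis(coeffs):
    """Real part of the inverse transform (Fourier synthesis on an n-point grid)."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
            for x in range(n)]


def sfft_cycle(e_obs, g_obs, phi):
    """One cycle: returns the new phase estimates for all grid reflections."""
    # Stage 1: phi + observed E -> rho (stored) -> rho^2 -> psi
    rho = synthesis([e * cmath.exp(1j * p) for e, p in zip(e_obs, phi)])
    psi = [cmath.phase(c) for c in dft([r * r for r in rho])]
    # Stage 2: psi + observed (G - <G>) -> sq_M -> rho * sq_M -> phi_new
    g_mean = sum(g_obs) / len(g_obs)
    sq_m = synthesis([(g - g_mean) * cmath.exp(1j * s) for g, s in zip(g_obs, psi)])
    q = dft([r * d for r, d in zip(rho, sq_m)])
    return [cmath.phase(c) for c in q]


# Hypothetical data on an 8-point grid (amplitudes equal for h and -h, phases antisymmetric).
e_obs = [0.0, 1.4, 0.6, 1.1, 0.3, 1.1, 0.6, 1.4]
g_obs = [2.0, 1.2, 0.5, 0.9, 0.4, 0.9, 0.5, 1.2]
phi_old = [0.0, 0.5, -1.2, 2.0, 0.0, -2.0, 1.2, -0.5]
phi_new = sfft_cycle(e_obs, g_obs, phi_old)
```

All new phases are obtained from the same old phase set, and no density modification is applied between the two stages, in line with the description above.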
2.3.3. The S-FFT algorithm extended to non-positive definite ρ

Some first applications of the S-TF algorithm to non-positive definite density functions in difference structures and in reconstructed surfaces (by using in-plane X-ray diffraction data) can be found in Rius et al. (1996) and Pedio et al. (2000), respectively. However, since for these particular applications the S-FFT algorithm is much simpler and more accurate (all phase invariants are implicitly taken into account), only S-FFT is considered here. In all situations so far discussed, it has been assumed that ρ is positive definite, so that G, as given by equation (4), corresponds to the amplitude of the squared structure ρ². However, there are certain situations where positivity of ρ is violated. Neutron diffraction data from compounds with negative scatterers are typical cases (Table 3). In such cases, the corresponding nuclear density function (still designated by ρ) consists of positive and negative scatterers, so that the G values derived from equation (4) are no longer the s.f. amplitudes of ρ² but of the so-called 'squared-shape structure', in which the atomic peaks have the shape they have in ρ² but preserve the signs they have in ρ. As was shown by Rius & Frontera (2009), the S-FFT algorithm can cope with non-positive definite ρ by simply introducing in equation (13) an m mask calculated according to the scheme of equation (27), where t ≈ 2.5 and a is a random value between −1 and 1. The introduction of m into equation (13) yields the extended Q values [equation (28)].

Jordi Rius, Patterson-function direct methods, IUCrJ (2014). 1, 291-304

Table 2
Comparison of the S-FFT and S-TF phase refinement algorithms for different compounds (data resolution limit in the 0.85-1.04 Å d-spacing interval). As expected, both S-FFT and S-TF algorithms yield similar success ratios, although S-TF is much more efficient in computing time (for small molecules).

Figure 3
Iterative S-FFT phase refinement procedure. The initial phase values (φ) are combined with the experimental E amplitudes to give the initial ρ values (upper right corner). The ψ phases are obtained by Fourier transforming ρ². Combination of ψ with the experimental (G − ⟨G⟩) gives the Fourier coefficients of the sq_M synthesis. The new structure factor estimates Q are obtained by Fourier transforming the ρ·sq_M product function.
The viability of the algorithm was checked thoroughly with calculated data sets from organic compounds. Fig. 4 reproduces the Fourier map obtained by processing the intensity data of TVAL (triclinic modification of valinomycin) with the extended S-FFT (Karle, 1975;Smith et al., 1975). From the tests performed on a variety of organic compounds it was concluded that the extended S-FFT algorithm has a lower convergence rate than the unextended S-FFT (approximately two to three times for the studied test cases), so that the number of refinement cycles has to be increased. This is the price that the extended form has to pay for not including the positivity (or, better, the equal-sign) constraint.
For inorganic compounds no significant difference in convergence speed was detected. A perovskite-related compound containing the strong negative neutron scatterer Mn illustrates how the extended S-FFT works (Table 3). The intensities used in the calculations were extracted from the observed powder diffraction pattern by redistributing the global intensities of the overlapping peaks according to the calculated individual intensities (Frontera et al., 2004). The success rate is three from a total of 25 trials (Rius & Frontera, 2008).
Another important situation where negative peaks appear in the Fourier map is in the solution of difference structures. An example of this type of application can be found in Rius & Frontera (2008).

Definition of atomic, experimental and effective resolutions
In contrast with other crystal structure determination methods, the experimental information used by DM is generally limited to the set of measured intensities. This is why it is very important that the data set is almost complete and atomic resolution is reached (only then will the atomic peaks show up clearly separated in the Fourier map). The experimental resolution of a powder pattern is defined by the 2θ value beyond which no more diffraction peaks appear. It is normally expressed in terms of the corresponding d-spacing value (d_min). The experimental resolution depends mainly on the crystallinity of the material, e.g. materials with small domain sizes have broad diffraction peaks, so that peaks and background are difficult to separate at high 2θ.

Table 3
Application of the extended S-FFT algorithm to neutron diffraction data of the perovskite-related compound with unit formula (Bi0.75Sr0.25)MnO3. Crystal data: a = 5.499, b = 7.770, c = 5.542 Å, space group Imma, Z = 4. Phase refinement gives a large negative peak for Mn in the E map, as expected for a negative scatterer (Fermi lengths for Bi, Sr, Mn and O are, respectively, 0.853, 0.702, −0.373 and 0.580).

Figure 4
Application of the extended S-FFT phase refinement algorithm to intensities of TVAL calculated (a) with 50% randomly assigned negative scattering factors and (b) with all atomic scattering factors of one of the two symmetry-independent molecules made negative. Atoms with negative refined densities are depicted in grey. The peak search was performed on the Fourier map computed with the phases from the extended S-FFT.

Also important for the application of DM to PD is the effective resolution concept (Fig. 5). Since traditional DM use only the intensities from resolved reflections (which are highly dependent on the amount of peak overlap), the pattern region with useful information is reduced. The d-spacing corresponding to the upper 2θ limit of this region gives the effective resolution (d_eff). Very often the effective resolution is much lower than the experimental one, which hampers the successful application of DM. However, if DM are modified in such a way that clusters of intensities can be treated, the effective resolution of the pattern increases and d_eff and d_min become more similar. The introduction of 'model-free pattern matching' greatly facilitated the partition of powder patterns into sequences of cluster intensities (Pawley, 1981; Le Bail et al., 1988). The inclusion of the cluster information in the structure determination process yields better resolved peaks in the intermediate Fourier syntheses. Summarizing, in the same way that the Rietveld method allows one to take advantage of the whole experimental resolution of the powder pattern during the refinement, cluster-based DM allow one to increase the effective resolution during the solution process, so that it comes much closer to the experimental one.
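Both resolutions are expressed as d-spacings obtained from a limiting diffraction angle through Bragg's law, d = λ/(2 sin θ). The short sketch below shows the conversion; the two angular limits and the wavelength are hypothetical values chosen only to illustrate that a lower angular limit gives a larger (i.e. poorer) d value.

```python
import math


def d_spacing(two_theta_deg, wavelength):
    """Bragg's law: d = lambda / (2 sin(theta)), with theta half the diffraction angle."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


# Hypothetical pattern measured with a wavelength of 1.0 Angstrom.
d_min = d_spacing(60.0, 1.0)   # last observable peak: experimental resolution
d_eff = d_spacing(40.0, 1.0)   # last resolved reflection: effective resolution
```

In this example d_eff is noticeably larger than d_min, which is the situation that cluster-based DM aim to improve.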

The cluster-based S_P function for PD
When PFDM are applied to powder data, the smallest unit of intensity information is the total intensity of each group of unresolved reflections (cluster). The two quantities that specify an arbitrary cluster j are the total intensity of the cluster and the total number of reflections, n_j = Σ m, where m are the multiplicities of all symmetry-independent reflections in the cluster [equations (29) and (30)]. In view of equations (29) and (30), if H is an arbitrary reflection of this cluster, the equidistributed intensity for H is the total cluster intensity divided by n_j [equation (31)], so that ⟨I⟩, its average taken over all reflections, is equal to ⟨E²⟩ = 1.
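The equidistribution of cluster intensities amounts to a simple division by the number of contributing reflections. The sketch below uses hypothetical cluster data and function names; the optional rescaling step assumes that the intensities are to be put on a scale where the average over all reflections is unity.

```python
def equidistribute(cluster_intensity, multiplicities):
    """Share a cluster's total intensity equally among its n_j reflections."""
    n_j = sum(multiplicities)
    return cluster_intensity / n_j


def normalized_equidistributed(clusters):
    """Equidistributed intensities rescaled so that their mean over all reflections is 1."""
    n_total = sum(sum(mult) for _, mult in clusters)
    i_total = sum(total for total, _ in clusters)
    scale = n_total / i_total
    return [scale * equidistribute(total, mult) for total, mult in clusters]


# Hypothetical clusters: (total intensity, multiplicities of the symmetry-independent reflections).
clusters = [(12.0, [2, 4]), (5.0, [2]), (9.0, [1, 2, 6])]
per_reflection = [equidistribute(total, mult) for total, mult in clusters]   # [2.0, 2.5, 1.0]
scaled = normalized_equidistributed(clusters)
```

Each reflection in a cluster receives the same value, so the cluster total is recovered by multiplying that value by n_j.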
In the cluster-based S_P of equation (18), the observed intensities for overlapping reflections are simply their equidistributed values (Rius, 2011). This is the best approximation to the experimental Patterson function. Notice also that the origin peak can be removed exactly. The refinement of the Φ subset of phases (strong reflections) is achieved by maximizing S_P with the S-FFT algorithm. Φ is updated from cycle to cycle and at the end of each trial the cluster-based figure of merit R_V [equation (32)] is computed. The solution with the smallest R_V value is taken as the correct one. To handle PD data, the following modifications in the S-FFT phase refinement algorithm are necessary (Fig. 6):
(i) The coefficients (G − ⟨G⟩) in stage 2 must be replaced by the (G² − ⟨G²⟩) ones, so that stage 2 in §2.3.2 becomes ψ + observed (G² − ⟨G²⟩) → sq_P → ρ·sq_P → φ_new.
(ii) Those ρ values below tσ(ρ) are made zero.
(iii) The calculation of the Fourier coefficients (Q) of the product function ρ·sq_P is performed either by direct Fourier transformation (FT) or by structure factor calculation (SFC) from the N highest peaks in ρ·sq_P (Fig. 6). The periodic calculation of the structure factors from the N peaks is carried out to ensure the fulfilment of the randomness condition. (With single-crystal data this step is normally not necessary.)
(iv) The intensities of overlapping reflections are redistributed according to equation (33).

Figure 5
Schematic powder pattern divided into clusters of unresolved reflections. In contrast with traditional DM, which only use resolved reflections (low effective resolution), cluster-based DM can process the information of the high-angle region of the pattern. In this way the effective resolution approaches the experimental one.

Figure 6
Iterative S_P-FFT phase refinement procedure for powder data. Only the differences with respect to the already described S-FFT algorithm are indicated (Fig. 3). Initial phase values (upper right corner) are combined with weighted experimental and extrapolated amplitudes to give the initial ρ values (upper right corner). The Fourier coefficients of the sq …

The physical meaning of the cluster-based S_P function can be best understood by writing it in terms of the cluster intensities. In view of the proportionality between the integrals of equations (17) and (24), S_P in equation (18) may be assumed to be proportional to expression (34a), so that by equation (31) expression (34b) follows, and S_P essentially corresponds to the sum of the products of the observed and calculated cluster intensities, divided by the number of reflections contributing to each cluster. As long as Φ fulfils the general properties of the electron-density distribution (positivity, randomness, atomicity), the second sum in equation (34b) can be regarded as constant during the phase refinement.

Examples of application of the cluster-based S-FFT algorithm
Retrospectively, the development of the cluster-based S-FFT algorithm was greatly facilitated by the release of some high-quality PD patterns of organic compounds collected by Dr Gozzo for the Summer School on 'Structure Determination from PD Data' organized at the Swiss Light Source in 2008. These patterns had been measured with the novel Mythen-II microstrip one-dimensional detector (Schmitt et al., 2004). For a detailed study of the S_P function with powder data, the pattern of (S)-(+)-ibuprofen was selected (Freer et al., 1993). The monoclinic unit cell contains two symmetry-independent molecules, giving rise to a cyclic hydrogen-bonded dimer with the formula C26H36O4 (Fig. 7). The intensities were extracted by pattern matching using DAJUST (d_min = 1.10 Å for λ = 1.0 Å). Details of the peak profiles are given in Fig. 8. The extracted cluster intensities (total number of reflections is 1009) were processed by the XLENS_PD6 program, which has the cluster-based S-FFT implemented (downloadable from http://departments.icmab.es/crystallography/software). During the phase refinement, chemical constraints were applied every second refinement cycle. Seven trials out of 25 were successful (50 cycles per trial). All correct solutions developed the complete structural model. Some relevant details of the model extracted from the Fourier map are listed in Table 4 (Rius, 2011).
Owing to the presence of the organic part, synchrotron radiation is preferred for hybrid materials. This normally gives higher experimental resolution (compared with laboratory data), which helps to develop the complete crystal structure model at the end of the phase refinement stage.

The δ recycling method
4.1. The calculated ρ (based on the δ function)

The δ recycling method is an extremely simple phasing method. It is based on a function δ_M, which is the convolution of P′ (of the true structure) with a phase synthesis. Experimentally, δ_M is computed from Fourier syntheses and consists of maxima at the atomic positions and noise in between. According to Rius (2012a), the strength of δ_M at the atomic positions r_k can be approximated for an equal-atom structure. Independently, the variance of δ_M, σ²_M, depends only on the amplitudes. The fact that σ²_M is independent of the phase estimates allows one to fix a threshold value before the structure is solved (Rius, 2012a). In practice, the threshold value tσ_M with t ≈ 2.5 works well for eliminating noise. In this way an m mask can be created, which will be 0 or 1 depending on whether the corresponding δ_M value is below or above the threshold. By multiplying δ_M by this mask and considering equations (36) and (37), the desired approximation to ρ, ρ_C, is obtained [equation (39)], which must always be positive and uses the known E magnitudes (Rius, 2012b).
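The masking step can be sketched in a few lines. This is a minimal illustration under stated assumptions: δ_M is given as a hypothetical list of grid values, σ_M is supplied directly rather than computed from the amplitudes, values exactly at the threshold are treated as noise, and any further scaling needed to obtain the calculated density is left out. The function name is hypothetical.

```python
def delta_mask(delta_m, sigma_m, t=2.5):
    """Return the 0/1 mask m and the masked function m * delta_M."""
    threshold = t * sigma_m
    mask = [1 if value > threshold else 0 for value in delta_m]
    masked = [m * value for m, value in zip(mask, delta_m)]
    return mask, masked


# Hypothetical delta_M values on five grid points, with sigma_M = 1.0.
mask, masked = delta_mask([0.1, 3.0, -4.0, 2.6, 2.4], sigma_m=1.0)
# mask   -> [0, 1, 0, 1, 0]
# masked -> only the values above 2.5 survive; everything else is set to zero
```

Because the mask only takes the values 0 and 1, it satisfies m = m², and the masked function is non-negative by construction.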

The phasing residual and the algorithm
If ρ(r) represents a positive definite density function of the crystal, e.g. the electron density or the electrostatic potential (in this second case only for structure solution purposes), it will be assumed that the condition ρ(r, Φ) = ρ_C(r, Φ) is only fulfilled for the true Φ values. The discrepancy between ρ(r, Φ) and ρ_C(r, Φ) can be measured through the residual R(Φ) = ∫_V (ρ − ρ_C)² dV, extended over the whole unit cell of volume V, where for clarity the r and Φ symbols have been omitted in the integrand. By working out the squared binomial, and since the integral of ρ² over the unit cell is phase-independent [it corresponds to the value of the Patterson function at the origin and is equal to (1/V) Σ_H E_H²], minimizing R is equivalent to maximizing the integral

which in view of equations (37) and (39), and because m = m², reduces, after some algebraic manipulation, to

wherein X corresponds to

Here, the phase synthesis is a Fourier synthesis with the same phases as ρ but with constant amplitudes (in this case unity). In view of this, it follows from equation (44) that the Fourier coefficients of X are

The modulus of X depends linearly on the amplitude E (Fig. 9). By expressing X in equation (44) with

Equation (46) is formally equivalent to equation (11), except for the fact that the summation extends over all H reflections, not just the strongest ones (Rius, 2012b).

Table 4
Bond lengths (Å) obtained by applying the cluster-based S-FFT algorithm to powder diffraction (PD) data and by least-squares refinement from single-crystal (SC) data.

New phase estimates can also be derived by applying a tangent formula, namely

The general scheme of the δ recycling phasing procedure is described in Fig. 10. As indicated by the structure factor calculation (SFC), the E_H^new structure factors are computed from the N largest peaks found in ρ_C [equation (39)]. The new Φ set is then used to update δ_M. This procedure is applied cyclically until convergence is reached.
Convergence is controlled by measuring the correlation Corr between the experimental E and the updated E^new with the expression
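The recycling loop of Fig. 10 and its convergence check can be sketched for a one-dimensional toy cell as follows. This is a minimal illustration under stated assumptions, not the published algorithm: ⟨E⟩ is replaced by the sample mean of the amplitudes, the peak search is a simple local-maximum test on a grid, the SFC uses unit-weight point atoms, and Corr is taken to be an ordinary Pearson correlation between observed and recalculated amplitudes, which may differ from the expression used in the source. The function names synthesis, corr and recycle are hypothetical.

```python
import numpy as np

def synthesis(coeff, phases, H, x):
    """Real 1D Fourier synthesis (unit cell length 1), Friedel mates included."""
    arg = phases[:, None] - 2.0 * np.pi * H[:, None] * x[None, :]
    return 2.0 * (coeff[:, None] * np.cos(arg)).sum(axis=0)

def corr(e_obs, e_new):
    """Pearson-type correlation between two amplitude sets (assumed form)."""
    a = e_obs - e_obs.mean()
    b = e_new - e_new.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def recycle(e_obs, H, n_atoms, n_cycles=50, t=2.5, n_grid=512, seed=0):
    """One trial: random phases -> delta_M -> mask -> peaks -> SFC -> phases."""
    rng = np.random.default_rng(seed)
    x = np.arange(n_grid) / n_grid
    coeff = e_obs - e_obs.mean()               # E - <E> (sample mean: assumption)
    sigma = np.sqrt(2.0 * np.sum(coeff ** 2))  # phase-independent sigma_M
    phases = rng.uniform(-np.pi, np.pi, H.size)
    history = []
    for _ in range(n_cycles):
        d = synthesis(coeff, phases, H, x)
        above = d > t * sigma                  # mask m
        is_peak = above & (d > np.roll(d, 1)) & (d >= np.roll(d, -1))
        peaks = np.flatnonzero(is_peak)
        if peaks.size == 0:
            break                              # nothing above the threshold
        # keep the n_atoms strongest peaks of the masked map
        peaks = peaks[np.argsort(d[peaks])[::-1][:n_atoms]]
        # structure factor calculation (SFC) from the selected peak positions
        e_calc = np.exp(2j * np.pi * np.outer(H, x[peaks])).sum(axis=1)
        phases = np.angle(e_calc)              # new phase set
        history.append(corr(e_obs, np.abs(e_calc)))
    return phases, history

# Toy data: amplitudes of five equal point atoms at random positions.
rng = np.random.default_rng(1)
n_atoms = 5
x_true = rng.random(n_atoms)
H = np.arange(1, 31)
e_obs = np.abs(np.exp(2j * np.pi * np.outer(H, x_true)).sum(axis=1))
e_obs /= np.sqrt(n_atoms)

for seed in range(5):
    phases, history = recycle(e_obs, H, n_atoms, seed=seed)
    last = round(history[-1], 3) if history else None
    print(f"trial {seed}: cycles run = {len(history)}, final Corr = {last}")
```

Several trials with different random starting phases are run, as in the multi-trial procedure described in the text. Whether a given trial converges on this toy problem is not guaranteed.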

Application to ED tomography data
Frequently, natural and synthetic phases only appear as submicrometric crystals, too small for collecting single-crystal X-ray data even with synchrotron radiation. Normally, structural information from these phases is obtained from PD, which combines easy sample preparation (also under non-ambient conditions) with fast acquisition systems and sophisticated analytical methods. Nevertheless, PD suffers from various limitations which may be caused by the sample [(i) sufficient sample must be available; (ii) the sample must be an almost pure phase; (iii) for nanocrystals, peak broadening due to the particle size reduces the effective data resolution range] and/or by the crystal structure itself [(i) indexing of unit cells with long axes is not always trivial; (ii) systematic overlap is present in high-symmetry space groups, especially in cubic ones; (iii) accidental overlap may be severe for low-symmetry space groups]. In addition, identification of the space group for crystalline phases affected by pseudo-symmetry can be problematic even for good PD data [see, for example, Birkel et al. (2010) and Rozhdestvenskaya et al. (2010)]. The main advantage of electron diffraction (ED) is the ability to collect single-crystal data from nanometric volumes. This is possible because electrons can be deflected and focused in quasi-parallel probes with a diameter of 10-30 nm and because the interaction with matter is much stronger for electrons than for X-rays, allowing a good signal-to-noise ratio even for diffraction from nanovolumes of crystalline material. Two of the principal problems of ED, i.e. dynamic effects and incomplete data sets, are minimized by measuring off-zone. This is the basis of the automated diffraction tomography (ADT) data collection strategy (Kolb et al., 2007, 2008).
In ADT, the ED patterns are acquired by rotating around an arbitrary tilt axis (not corresponding to a specific crystallographic orientation) in sequential steps of 1° within the full tilt range of the microscope. The physical limit affecting the sample rotation gives rise to incomplete data sets, i.e. to a missing wedge. The precession ED technique (PED) is used to integrate the intensity between steps. Recently, an alternative technique called rotation ED (RED) has been introduced for this purpose (Zhang et al., 2010). Of course, there are also disadvantages with ED. For certain compounds, radiation damage is still a limiting problem. In general, organic and hybrid materials are more beam-sensitive than inorganic materials. The application of δ recycling to PED/ADT intensities from inorganic materials was recently analyzed by Rius et al. (2013) with some interesting results: (i) scaling with the Wilson plot procedure is accurate; (ii) δ recycling is particularly robust against missing data; (iii) unlike for X-ray data, where Corr [equation (49)] clearly discriminates the correct solution, the final Corr values with PED/ADT data tend to be similar for correct and wrong solutions. To circumvent this difficulty, the δ recycling phasing stage always terminates when a preset number of cycles is reached, and continues with conventional Fourier recycling methods. Convergence during Fourier recycling is controlled by the R_CC residual,

correct solutions; for PED/ADT data, essentially correct solutions are found between 15 and 60, although R_CC values up to 80 can be reached, especially if the data are affected by large thickness variations and/or by residual dynamic scattering, if missing organic parts of the structure are not included in the calculation of the intensities, or if the measured data fail to produce well shaped peaks in the Fourier map. It goes without saying that the possibility of solving crystal structures from phases only detected by TEM is very important in many research fields. For example, it is expected that many new mineralogical species can be found. In a recent collaboration with Kolb's group at the University of Mainz, δ recycling has solved, from PED/ADT data, the crystal structure of a new porous Bi sulfate mineral appearing only as a tiny crystalline fragment (~0.15 × 0.15 × 0.2 µm) displaying no net cleavage planes (Capitani et al., 2014). The crystal structure is hexagonal and the unit-cell content is [Bi8.18Te0.82(OH)6O8(SO4)2]0.91+·0.91S2− with Z = 2 (Fig. 11). Some relevant experimental details are: d_min = 1.0 Å, number of measured (unique) reflections = 2748 (452), R_equiv = 23.57, data completeness = 100%, λ = 0.0197 Å, T = 93 K. The structure model obtained from δ recycling was complete and can be described as a self-assemblage of Bi clusters giving rise to a one-dimensional porous material with the disulfide anions inside the channels.

Figure 9
Linear dependence of X upon E in equation (45) for a P1 equal-atom crystal structure.

Figure 10
Schematic description of the δ_M recycling phasing procedure. Starting random phase values are fed in at the upper right corner. For PED/ADT intensity data, iteration stops when a preset number of cycles is reached. The meaning of the different symbols is explained in the text (SFC = structure factor calculation). For δ_P recycling, the E − ⟨E⟩ coefficients are replaced by E² − ⟨E²⟩ and the M subscript by P.
Figures of merit for the last refinement cycle are R1 = 0.2173 for 332 F_obs > 4σ(F_obs) and 0.2373 for all 452 data, i.e. of the same order as the R value between symmetry-equivalent reflections (SHELX97; Sheldrick, 2008). Finally, it is worth mentioning that the O atoms could be located in the presence of the extremely heavy Bi atoms (Z = 83), a consequence of the slower increase in scattering power with atomic number for electrons compared with X-rays.
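For reference, the conventional R1 residual reported by SHELX-type refinement programs is Σ||F_obs| − |F_calc|| / Σ|F_obs|, optionally restricted to reflections with F_obs > 4σ(F_obs). The following minimal sketch evaluates it on made-up numbers; the function name r1 and the data are hypothetical and are not taken from the refinement described above.

```python
import numpy as np

def r1(f_obs, f_calc, sig_f=None, cutoff=4.0):
    """Conventional R1 = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    If sig_f is given, only reflections with Fo > cutoff * sigma(Fo) are used.
    """
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.abs(np.asarray(f_calc, dtype=float))
    if sig_f is None:
        keep = np.ones(f_obs.shape, dtype=bool)
    else:
        keep = f_obs > cutoff * np.asarray(sig_f, dtype=float)
    return float(np.abs(f_obs[keep] - f_calc[keep]).sum() / f_obs[keep].sum())

# Made-up example data (three reflections).
f_obs = [10.0, 20.0, 30.0]
f_calc = [9.0, 22.0, 30.0]
sig_f = [1.0, 10.0, 1.0]

print(r1(f_obs, f_calc))         # all data: (1 + 2 + 0) / 60
print(r1(f_obs, f_calc, sig_f))  # Fo > 4 sigma only: (1 + 0) / 40
```

The second call drops the middle reflection, whose F_obs does not exceed four times its standard uncertainty. This mirrors the distinction between the 332 observed and the 452 total reflections quoted in the text.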

Conclusions
Currently, the application of DM has reached maturity. This means that, for an ideal single-crystal intensity data set, phasing is a rather straightforward process. However, the situation changes for inaccurate or incomplete data sets, a circumstance which occurs with increasing frequency in materials science, especially when small crystalline volumes are being analyzed. To deal with these situations, DM procedures are required that are not only robust but also simple, and that can process, in a unified manner, partial information coming from different sources, e.g. transmission microdiffraction, electron diffraction, powder diffraction and grazing-incidence diffraction. Due to their simplicity, PFDM are ideal candidates for such applications, which also benefit from rapidly evolving instrumental capabilities.