Tetartohedral twinning could happen to you too

A review of published tetartohedrally twinned macromolecular structures is presented, together with details of the recent structure determination of triclinic tetartohedrally twinned crystals of human complement factor I.


Introduction
When crystal (pseudo)merohedral twinning arises from four twinned crystal domains, the twinning is called tetartohedral. In this manuscript, we review published tetartohedrally twinned macromolecular structures (see Table 1; Rosendal et al., 2004; Barends et al., 2005; Gayathri et al., 2007; Fernández-Millán et al., 2008; Yu et al., 2009; Anand et al., 2007; Leung et al., 2011) and find that this type of twinning is almost always accompanied by pseudosymmetry, with the twinning operators coinciding with the rotational parts of the pseudosymmetry.
To discuss a few of the issues that arise when working with tetartohedrally twinned crystals, we also illustrate the determination of the structure of tetartohedrally twinned triclinic crystals of human complement factor I (Roversi et al., 2011).

Tetartohedral twinning is (pseudo)merohedral twinning with $N_{twins} = 4$
If the crystalline sample exposed to X-rays is made of $N_{twins}$ single crystals, the twinning is described in terms of the relative sizes of the $N_{twins}$ domains (the twin fractions $\alpha_k$) and the set of matrices $T_k$ that represent the twinning operators. When the twinning operators leave the crystal lattice (almost) unchanged, the twinning is called merohedral (pseudomerohedral). The X-ray diffraction spots from all of the twinned domains (almost) overlap and the diffracted intensity can be written

$$I^{twin}(h) = \sum_{k=1}^{N_{twins}} \alpha_k I(T_k^{T} h) \propto \sum_{k=1}^{N_{twins}} \alpha_k F^{*}(T_k^{T} h) \cdot F(T_k^{T} h). \qquad (1)$$

As already mentioned, when $N_{twins} = 4$ the structure is said to be tetartohedrally twinned. As for any type of (pseudo)merohedral twinning, detection of tetartohedral twinning is possible at an early stage, after a set of diffraction intensities has been collected, by performing a number of tests that analyse the crystal intensity statistics (see Yeates, 1997). The formulae needed to estimate twin fractions from tetartohedrally twinned data have been described in Yeates & Yu (2008).
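The twinned-intensity sum written above can be made concrete with a small numerical sketch. The code below is only an illustration on random toy data: the function name `twinned_intensities`, the index range and the random intensities are assumptions introduced here, and the four operators are the 222 set quoted later in the text for the triclinic factor I crystals.

```python
import itertools
import numpy as np

# Twin operators of point group 222 acting on (h, k, l):
# h,k,l; -h,-k,l; h,-k,-l; -h,k,-l.
TWIN_OPS = [np.diag(d) for d in ([1, 1, 1], [-1, -1, 1], [1, -1, -1], [-1, 1, -1])]


def twinned_intensities(hkl, untwinned, ops, fractions):
    """Return sum_k alpha_k * I(T_k^T h) for every row h of hkl (hypothetical helper)."""
    fractions = np.asarray(fractions, dtype=float)
    if len(fractions) != len(ops) or not np.isclose(fractions.sum(), 1.0):
        raise ValueError("need one twin fraction per operator, summing to 1")
    # Map each index triple to its untwinned intensity.
    lookup = {tuple(h): i for h, i in zip(hkl.tolist(), untwinned)}
    total = np.zeros(len(hkl))
    for alpha, op in zip(fractions, ops):
        rotated = hkl @ op  # row form of T^T h, since (T^T h)^T = h^T T
        total += alpha * np.array([lookup[tuple(h)] for h in rotated.tolist()])
    return total


# Toy data: all indices with |h|, |k|, |l| <= 3 and random untwinned intensities.
hkl = np.array(list(itertools.product(range(-3, 4), repeat=3)))
rng = np.random.default_rng(0)
untwinned = rng.exponential(1.0, size=len(hkl))
perfect = twinned_intensities(hkl, untwinned, TWIN_OPS, [0.25] * 4)
```

With four equal fractions the resulting intensities are exactly invariant under every twin operator, which is why a perfectly twinned triclinic data set can merge as if it were orthorhombic.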
If the extent of twinning is small and/or is obscured by noncrystallographic symmetry (NCS), especially when the NCS axes coincide with the directions of twinning (which makes the intensity statistics deviate from those assumed in deriving the twinning tests), twinning may only be confirmed as late as the refinement of the model. Fortunately, at that stage the model allows statistical tests on the calculated intensities, which can help in estimating the twin fractions: intensity statistics in the presence of NCS and twinning have been discussed and illustrated in Lebedev et al. (2006) and Zwart et al. (2008).
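As a rough illustration of how intensity statistics respond to the number of twin domains, the sketch below evaluates the second moment of acentric intensities, which for independent, Wilson-distributed twin domains is 1 + Σ α_k². Both function names are hypothetical, and the independence assumption is exactly what breaks down when NCS or pseudosymmetry axes run parallel to the twin axes, so real data of that kind may look less twinned than this idealized model suggests.

```python
import numpy as np


def expected_second_moment(fractions):
    """<I^2>/<I>^2 for acentric data from independent twin domains: 1 + sum(alpha_k^2)."""
    alpha = np.asarray(fractions, dtype=float)
    return 1.0 + float(np.sum(alpha**2))


def simulated_second_moment(fractions, n_reflections=200_000, seed=0):
    """Monte Carlo check: mix independent exponential (acentric Wilson) intensities."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(fractions, dtype=float)
    domains = rng.exponential(1.0, size=(n_reflections, len(alpha)))
    twinned = domains @ alpha  # weighted sum over twin domains, one row per reflection
    return float(np.mean(twinned**2) / np.mean(twinned) ** 2)


for label, fractions in [("untwinned", [1.0]),
                         ("perfect hemihedral", [0.5, 0.5]),
                         ("perfect tetartohedral", [0.25] * 4)]:
    print(label, expected_second_moment(fractions), simulated_second_moment(fractions))
```

Under these assumptions the ratio falls from 2 (untwinned) through 1.5 (perfect two-domain twin) to 1.25 for four equal domains.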
For a discussion of experimental phasing see Dauter (2003) and for a discussion of molecular replacement in the presence of crystal twinning see Redinbo & Yeates (1993), Breyer et al. (1999) and Jameson et al. (2002). Generally speaking, whenever data sets from several different twinned samples have been measured and estimates of the twinning fractions have been obtained for each sample, one should if possible avoid working with data sets from crystals for which all the twin fractions are close to $1/N_{twins}$ ('perfect twinning'). Of course, the closer the sample is to perfect twinning, the greater the need for accurate estimation of the twinning fractions based on $I^{calc}_h$ from each twin domain, which is only possible if the structure is available. This in turn means that only towards the end of the structure-determination process will the details of each twinned sample be properly understood and the optimal choice of sample/data set be possible.
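One simple way to picture the estimation of twin fractions from calculated intensities is an unweighted linear least-squares fit of the observed intensities against the calculated intensities of the twin-related reflections. The sketch below is an assumption-laden illustration (the function name and the toy data are invented here); it is not the maximum-likelihood procedure used by any of the refinement programs mentioned in this paper.

```python
import numpy as np


def estimate_twin_fractions(i_obs, i_calc_mates):
    """Least-squares twin fractions (hypothetical helper).

    i_obs:        shape (n,), observed (twinned) intensities.
    i_calc_mates: shape (n, N); column k holds I_calc(T_k^T h) for each reflection.
    An overall scale factor is absorbed by normalising the solution to sum to 1.
    """
    design = np.asarray(i_calc_mates, dtype=float)
    target = np.asarray(i_obs, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    coeffs = np.clip(coeffs, 0.0, None)  # twin fractions cannot be negative
    return coeffs / coeffs.sum()


# Toy example: four twin domains, arbitrary overall scale, small noise.
rng = np.random.default_rng(1)
i_calc_mates = rng.exponential(1.0, size=(500, 4))
true_fractions = np.array([0.4, 0.3, 0.2, 0.1])
i_obs = 7.5 * i_calc_mates @ true_fractions + rng.normal(0.0, 0.01, size=500)
estimated = estimate_twin_fractions(i_obs, i_calc_mates)
```

When the model obeys pseudosymmetry whose rotations coincide with the twin operators, the columns of `i_calc_mates` become nearly identical and the fit becomes ill-conditioned, which is one way of seeing why the fractions are then hard to determine.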

Structural refinement against tetartohedrally twinned data
Various strategies are possible when refining against twinned data and tetartohedral twinning is not an exception. The simplest approach would involve detwinning the experimental intensities on the basis of the current estimates for the twin ratios by using the current model and $I^{calc}_h$ from each twin domain. Structural refinement can then be carried out against these intensities, leading to a new model and a new round of estimation of twin ratios and so on, hopefully to convergence (see, for example, the refinement of PDB entry 3eop; Yu et al., 2009). This strategy may suffer from instability and its convergence may be slow.
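To show why detwinning becomes unstable near perfect twinning, the sketch below performs a purely algebraic detwinning of one set of four twin-related intensities for twin operators of point group 222. This is a simpler stand-in for the model-based detwinning described above, not the procedure used for any published structure; the function names are hypothetical and the operator order is assumed to be h,k,l; −h,−k,l; h,−k,−l; −h,k,−l, for which the product of operators m and n is the operator with index m XOR n.

```python
import numpy as np


def twin_matrix(fractions):
    """4x4 matrix M with M[m, n] = alpha_(m XOR n), valid for the 222 operator set."""
    alpha = np.asarray(fractions, dtype=float)
    idx = np.arange(4)
    return alpha[idx[:, None] ^ idx[None, :]]


def detwin(observed, fractions):
    """Solve M @ true = observed for the four twin-related untwinned intensities."""
    return np.linalg.solve(twin_matrix(fractions), np.asarray(observed, dtype=float))


true = np.array([100.0, 40.0, 10.0, 250.0])  # I(T_n^T h), n = 0..3 (toy values)
alpha = [0.50, 0.25, 0.15, 0.10]
observed = twin_matrix(alpha) @ true          # what the twinned crystal would give
recovered = detwin(observed, alpha)
```

The determinant of this matrix vanishes whenever two of the four fractions sum to 1/2, and the matrix collapses to rank 1 for four equal fractions, so small errors in the data or in the fractions are strongly amplified as a sample approaches perfect twinning.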
In a second approach, the refinement target function can be defined taking twinning into account and refinement carried out against the twinned intensities. Ideally, refinement of the twinning ratios should be carried out at the same time as the refinement of the structural parameters (scale factors, atomic coordinates and B factors, occupancies etc.), possibly including joint second derivatives of the refinement target function with respect to twin fractions and other parameters. The least-squares refinement program SHELXL-97 has long allowed joint structural refinement against tetartohedrally twinned diffraction intensities (Herbst-Irmer & Sheldrick, 1998). It refines all parameters in the same conjugate-gradient or matrix-inversion run. If the matrix of the second derivatives of the target function with respect to the parameters is inverted, it is possible to obtain the correlations between the twin fractions and the other parameters of the model and error estimates of the twin fractions.
To make refinement computationally simpler, the twin fractions can be optimized while holding the other parameters fixed and vice versa, alternating cycles of refinement of twin fractions and structural parameters. A protocol to perform refinement of the model against tetartohedrally twinned intensities was included in the supplementary information of Barends et al. (2005). This protocol makes use of the program CNS and it relies on initial estimation of the twin fractions, which are subsequently kept fixed during the least-squares structural refinement. More recently, the refinement program REFMAC5 enabled the initial detection of tetartohedral twin operators, initial estimation of the twin fractions and their maximum-likelihood optimization in between cycles of refinement of structural parameters (Murshudov et al., 2011). Of course, as is the case with all refinements against intensities from merohedrally twinned crystals and/or crystals that possess NCS, special care should be taken in assigning free R flags so that NCS-related and/or twin-related reflections all belong either to the free set or to the working set, i.e. NCS/twin-related reflections should not be distributed across the two sets (Kleywegt & Brünger, 1996). REFMAC5 internally changes free R flags so that twin-related reflections belong to either the free or the working set.

Table 1 (footnotes). Twinning operators: (ii) h, k, l; −h/3, k/3, 4l/3; h/3, −k/3, −4l/3; −2h/3, −k/3, −4l/3; (iii) h, k, l; −h, k, −l; −h, −k, l; h, −k, l. References: Rosendal et al. (2004); Barends et al. (2005); Gayathri et al. (2007); Yu et al.

Table 2. Summary of the September 2009 X-ray diffraction data quality for human complement factor I (PDB entry 2xrc) integrated and scaled in three different space groups. For the present manuscript, all data processing was repeated with the xia2 suite of programs (Winter, 2010) running XDS (Kabsch, 2010a,b) for indexing and integration and SCALA (Evans, 2006) for scaling.
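The requirement that twin-related reflections share a free R flag can be sketched as follows. This is a minimal illustration, assuming the 222 twin operators of the triclinic factor I case and treating Friedel mates as equivalent; the helper names are hypothetical, NCS-related reflections are not handled, and the code is not a description of how REFMAC5 or any other program assigns its flags.

```python
import itertools
import numpy as np

TWIN_OPS = [np.diag(d) for d in ([1, 1, 1], [-1, -1, 1], [1, -1, -1], [-1, 1, -1])]


def orbit_key(h, ops):
    """Canonical label shared by a reflection, its twin mates and their Friedel mates."""
    h = np.asarray(h)
    mates = [tuple((h @ op).tolist()) for op in ops]
    mates += [tuple(-x for x in m) for m in mates]
    return min(mates)


def assign_free_flags(hkl, ops, free_fraction=0.05, seed=0):
    """Draw one random free/work decision per orbit and copy it to every member."""
    rng = np.random.default_rng(seed)
    keys = [orbit_key(h, ops) for h in hkl]
    unique = sorted(set(keys))
    draw = rng.random(len(unique)) < free_fraction
    is_free = dict(zip(unique, draw.tolist()))
    return np.array([is_free[k] for k in keys])


# Toy reflection list: all indices with |h|, |k|, |l| <= 4.
hkl = np.array(list(itertools.product(range(-4, 5), repeat=3)))
free = assign_free_flags(hkl, TWIN_OPS)
```

Drawing the flag per orbit rather than per reflection is what keeps the free set independent of the working set through the twin operators.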

Tetartohedrally twinned structures in the literature
Keyword searches in the Protein Data Bank and the literature (via the PubMed server) returned a number of published crystal structures from tetartohedrally twinned crystals. We summarize them in Table 1.
In many of these structures the noncrystallographic symmetry operators are close to true crystallographic symmetry (pseudosymmetry; Zwart et al., 2008; Appendix A) and the group of the NCS rotations coincides with that of the twinning operators. Interestingly, most of these structures are trigonal, with the merohedral twinning operators and the NCS belonging to point group 222; the twofold axes are aligned along a, a* and c so as to create apparent 622 point symmetry. One structure (PDB entry 2xrc; see below) is pseudomerohedrally twinned, triclinic P1, but with a pseudo-orthorhombic cell and the NCS and the twinning twofolds also aligned with crystal axes. The only published tetartohedrally twinned structure for which the group of the NCS rotations and one of the twinning operators do not coincide is PDB entry 3eop, where the twofold NCS operator and the crystal symmetry together have 321 symmetry, while the twinning has 222 symmetry (the two groups sharing only the twofold along a).

Tetartohedrally twinned crystals of human complement factor I
The crystal structure of human complement factor I (fI) was described in Roversi et al. (2011). The crystals were triclinic and tetartohedrally twinned. In this manuscript, we examine the analysis of the crystal symmetry, the detection of the tetartohedral twinning and the protocol followed for initial phasing, model building and refinement of the structure against the tetartohedrally twinned diffraction data.
The fI crystals appeared to be frayed at the ends, which may indicate several crystalline layers stacking to form each sample, but otherwise had sharp edges, could be grown reproducibly and gave diffraction patterns that could be successfully indexed by invoking a single lattice (Roversi et al., 2011).
Several samples were exposed to X-rays and diffraction data sets were measured, the best diffracting of which (2.4 Å resolution) was collected in September 2009 at 100 K using X-rays of wavelength 0.97630 Å on beamline I03 at the Diamond Light Source, Harwell, England. The data were originally indexed and scaled in a primitive orthorhombic 222 lattice with the unit-cell parameters reported in Table 2. Analysis with POINTLESS and phenix.xtriage (Zwart et al., 2008) suggested a primitive 222 lattice and space group $P2_12_12$.
No problems were initially noticed, apart from the fact that the cumulative intensity distribution (not shown), other overall intensity statistics and the results of the L-test (see Table 3) departed from what would be expected from good to reasonable untwinned data. As there are no (pseudo)merohedral twin laws possible for these orthorhombic crystals, phenix.xtriage concluded that there could be a number of reasons for the departure of the intensity statistics from normality: overmerging of pseudosymmetric or twinned data, intensity-to-amplitude conversion problems or bad data quality. It also suggested that it could be worthwhile considering reprocessing the data.
Had one attempted scaling in a lower symmetry space group, the scaling statistics would have shown only a marginal improvement upon lowering of the symmetry (see Table 2). In agreement with these scaling statistics, the κ = 180° section of the self-rotation function for this crystal shows almost perfect 222 symmetry, with three peaks at 94, 93 and 92% of the origin and along the directions (ω = 89.9°, φ = 0.0°), (ω = 89.9°, φ = 89.9°) and (ω = 0.0°, φ = 0.0°), respectively. Retrospectively, once the structure was solved in P1 and the tetartohedral twin fractions were calculated with REFMAC5 it appeared that this crystal (like all other fI triclinic crystals measured but one) was almost perfectly tetartohedrally twinned, i.e. the four twin fractions were all close to 1/4 (see Table 6, last column), a special case of the condition $\alpha_k + \alpha_{k'} = 1/2$ that makes twinned crystals most problematic ('perfect twin'; Yeates, 1997).
Further clues to the fact that the crystals were not orthorhombic came from molecular-replacement efforts in $P2_12_12$ using Phaser and searching with several models of domains homologous to the serine protease domain (43% of the structure). The searches consistently yielded a pair of placements (with nearly equivalent scores) which shared the rotation-function maximum but differed by a shift of almost 6 Å along c in the translation-function maximum. This could be interpreted as an indication of lower symmetry, but the observation was originally ignored and model building attempted starting from the top-scoring placement in $P2_12_12$, without much success.

Table 3. Summary of intensity statistics for the September 2009 complement factor I triclinic data. The statistics were computed using phenix.xtriage with data between 10 Å and a maximum resolution chosen such that the data with I/σ(I) > 3.00 still give 85% completeness. Expected intensity statistics for untwinned and perfectly twinned crystals were taken from Yu et al. (2009) and Stanley (1955). Column header: Symmetry (Z).
In November 2009, an fI crystal gave a 2.70 Å resolution diffraction data set on beamline I02 at the Diamond Light Source, on analysis of which phenix.xtriage (Zwart et al., 2008) indicated the need to lower the symmetry to $P2_1$ with the monoclinic axis along the longest dimension and a β angle of 90.2°. The scaling statistics also agreed with the data merging better as monoclinic (see Table 4). The 222 symmetry in the κ = 180° section of the self-rotation function computed from these data in P1 is still apparent in Fig. 1, but the self-rotation function maxima are only 80% of the origin and along directions (ω = 90.7°, φ = 90.6°), (ω = 89.7°, φ = 0.0°) and (ω = 0.0°, φ = 0.0°) and thus are neither as intense nor as orthogonal to each other as would be expected for orthorhombic crystals.
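The departure from orthogonality quoted for the self-rotation peaks can be checked with a few lines of arithmetic. The sketch below assumes the common polar-angle convention in which ω is measured from the z axis and φ is the azimuth from x in the xy plane; that convention and the function names are assumptions made here, while the peak directions are those quoted in the text for the two data sets.

```python
import numpy as np


def axis_from_polar(omega_deg, phi_deg):
    """Unit vector for polar angles (omega from z, phi from x in the xy plane)."""
    omega, phi = np.radians([omega_deg, phi_deg])
    return np.array([np.sin(omega) * np.cos(phi),
                     np.sin(omega) * np.sin(phi),
                     np.cos(omega)])


def pairwise_angles(peaks):
    """Angles (degrees) between every pair of rotation axes in the list."""
    axes = [axis_from_polar(*p) for p in peaks]
    return [float(np.degrees(np.arccos(np.clip(axes[i] @ axes[j], -1.0, 1.0))))
            for i in range(len(axes)) for j in range(i + 1, len(axes))]


september = [(89.9, 0.0), (89.9, 89.9), (0.0, 0.0)]
november = [(90.7, 90.6), (89.7, 0.0), (0.0, 0.0)]

for name, peaks in [("September 2009", september), ("November 2009", november)]:
    deviations = [abs(a - 90.0) for a in pairwise_angles(peaks)]
    print(name, [round(d, 2) for d in deviations])
```

Under this convention the September axes are mutually perpendicular to within about 0.1°, whereas the November axes deviate by several tenths of a degree.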
Indeed, reprocessing the data in $P2_1$ improved the value of $R_{meas}$ (a.k.a. $R_{r.i.m.}$; see Table 4). This assumed two molecules per asymmetric unit and pseudomerohedral twinning with operators h, k, l and −h, −k, l, with the two twin fractions estimated at around 0.49-0.5. However, the intensity statistics still indicated problems with the data (see Table 5). After a few more unsuccessful attempts at refining the structure in $P2_1$, the symmetry was eventually lowered to P1, invoking four molecules in the asymmetric unit and tetartohedral twinning along the crystal axes (operators h, k, l; −h, −k, l; h, −k, −l; −h, k, −l).
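The comparison of merging statistics across candidate symmetries can be illustrated with a toy calculation of the redundancy-independent merging R factor, here taken as Σ_h [n/(n−1)]^(1/2) Σ_i |I_i − ⟨I⟩| / Σ_h Σ_i I_i. The sketch is a bare-bones stand-in for what scaling programs report: the function name, the choice of the twofold along c for point group 2 and the four toy observations are all assumptions made for illustration.

```python
from collections import defaultdict

import numpy as np

POINT_GROUP_OPS = {
    "1": [np.diag([1, 1, 1])],
    "2": [np.diag(d) for d in ([1, 1, 1], [-1, -1, 1])],  # twofold along c (assumed)
    "222": [np.diag(d) for d in ([1, 1, 1], [-1, -1, 1], [1, -1, -1], [-1, 1, -1])],
}


def r_meas(hkl, intensities, ops):
    """Merge observations related by ops (and Friedel's law) and return R_meas."""
    groups = defaultdict(list)
    for h, i in zip(hkl, intensities):
        h = np.asarray(h)
        mates = [tuple((h @ op).tolist()) for op in ops]
        mates += [tuple(-x for x in m) for m in mates]
        groups[min(mates)].append(float(i))  # smallest index triple labels the group
    num = den = 0.0
    for obs in groups.values():
        n = len(obs)
        if n < 2:
            continue  # singly measured reflections do not contribute
        obs = np.asarray(obs)
        num += np.sqrt(n / (n - 1)) * np.abs(obs - obs.mean()).sum()
        den += obs.sum()
    return num / den if den > 0 else float("nan")


# Two reflections related by the twofold along c but with different intensities.
hkl = [(1, 2, 3), (1, 2, 3), (-1, -2, 3), (-1, -2, 3)]
intensities = [10.0, 12.0, 20.0, 22.0]
r_p1 = r_meas(hkl, intensities, POINT_GROUP_OPS["1"])
r_222 = r_meas(hkl, intensities, POINT_GROUP_OPS["222"])
```

In this toy case merging in the higher symmetry is clearly worse; for a nearly perfectly twinned crystal the twin-related intensities are almost equal, so the corresponding difference between symmetries becomes marginal.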
In keeping with triclinic symmetry and pointing to the fact that the crystals are not monoclinic $P2_1$, the reflection 050 had nonzero intensity in more than one data set (Fig. 2 shows one such measurement). Although violations of the systematic absences of higher symmetry space groups can be explained for example by multiple scattering (Renninger, 1937) and/or anisotropy of anomalous scattering (Templeton & Templeton, 1980), in the context provided by the merging and intensity statistics, self-rotation function and molecular-replacement hits, the repeated measurements of such a reflection from more than one crystal sample were taken as additional evidence that the fI crystals were indeed triclinic. The structure of the triclinic human complement fI crystals was eventually determined by sequential molecular replacement in P1, searching against the tetartohedrally twinned intensities with Phaser and search models from homologous individual domains. The initial solution was followed by iterative model building in Coot (Emsley et al., 2010) and refinement in REFMAC5 (Roversi et al., 2011). The four copies of the molecule in the cell are arranged in a pseudo-orthorhombic packing which almost follows $P2_12_12$ symmetry except that the two molecules related by the twofold axis along c are also shifted with respect to each other by about 6 Å along the same direction.

Table 4. Summary of the November 2009 X-ray diffraction data quality for human complement factor I (PDB entry 2xrc) integrated and scaled in three different space groups. For the present manuscript, all data processing was repeated with the xia2 suite of programs (Winter, 2010) running XDS (Kabsch, 2010a,b) for indexing and integration and SCALA (Evans, 2006) for scaling.

Figure 2. The reflection 050 appears next to the much stronger 060. After integration, $I_{050}/\sigma(I_{050}) = 294/24$, i.e. the intensity of the reflection is weak but still more than ten times its σ.
Tight NCS restraints were initially used and gradually released during the course of model building and refinement: whenever the current electron density showed surface loops and crystal contacts that differed in the four copies of the molecule these regions were omitted from the part of the structure that was NCS-restrained. In the final model, approximately 35% of the structure had to be excluded from the NCS restraints.
Refinement statistics are reported in Roversi et al. (2011). The REFMAC5 estimates of the twinning fractions at the end of the refinement and building process are reported in Table 6. Table 6 also reports the $R^{obs}_{twin}$ and $R^{calc}_{twin}$ values (Lebedev et al., 2006; for the use of statistical agreement indicators on observed and calculated intensities in order to investigate twinning and NCS, see also Lee et al., 2003). As expected, $R^{obs}_{twin} < R^{calc}_{twin}$ for all twinning operators, placing the factor I crystals in the region of the RvR plot that is characteristic of twinned crystals with rotational pseudosymmetry (RPS; Lebedev et al., 2006).
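The agreement indicator behind the RvR plot can be sketched as below, assuming the definition R_twin = Σ|I(h) − I(Th)| / Σ[I(h) + I(Th)] for a single candidate twin operator T; applying the same function to observed and to calculated intensities would give the two coordinates of the plot. The function name and the toy data are hypothetical, and the exact conventions of the original implementation (resolution cuts, weighting) are not reproduced here.

```python
import itertools

import numpy as np


def r_twin(hkl, intensities, op):
    """R_twin for one candidate twin operator, skipping reflections mapped onto themselves."""
    lookup = {tuple(h): float(i) for h, i in zip(hkl.tolist(), intensities)}
    num = den = 0.0
    for h, i in lookup.items():
        mate = tuple((np.array(h) @ op).tolist())
        if mate == h or mate not in lookup:
            continue
        num += abs(i - lookup[mate])
        den += i + lookup[mate]
    return num / den


twofold_c = np.diag([-1, -1, 1])  # operator -h, -k, l
hkl = np.array(list(itertools.product(range(-6, 7), repeat=3)))
rng = np.random.default_rng(2)

# Unrelated intensities (no twinning, no pseudosymmetry): R_twin is near 0.5.
independent = rng.exponential(1.0, size=len(hkl))

# Intensities made exactly symmetric under the operator: R_twin is 0.
lookup = {tuple(h): i for h, i in zip(hkl.tolist(), independent)}
symmetric = np.array([0.5 * (lookup[tuple(h)] + lookup[tuple((np.array(h) @ twofold_c).tolist())])
                      for h in hkl.tolist()])

print(r_twin(hkl, independent, twofold_c), r_twin(hkl, symmetric, twofold_c))
```

Twinning lowers the value computed from observed intensities only, whereas rotational pseudosymmetry lowers both the observed and the calculated values, which is why comparing the two helps to separate the effects.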

Conclusions
The relatively recent occurrence in the literature of several cases of tetartohedrally twinned structures suggests that this form of twinning is not as infrequent as one might wish it to be (with the additional possibility that further tetartohedrally twinned structures may be lurking in the PDB, having been determined and deposited with the twinning going undetected). Tetartohedral twinning could happen to you too! Fortunately, careful analysis of intensity statistics and rotational symmetry can help to overcome the difficulties associated with this type of twinning, even in the presence of the potentially confusing shared rotational NCS and twinning symmetry. In addition, excellent statistical tools are now available in a number of data-processing/analysis programs, e.g. phenix.xtriage, to detect potential twinning laws and guide the crystallographer towards the correct symmetry, twinning laws and twinning fractions. Once detected and characterized, tetartohedral twinning is also relatively simple to handle thanks to a number of good macromolecular refinement programs, notably CNS, the least-squares program SHELXL-97 and the latest version of the maximum-likelihood refinement program REFMAC5. Tetartohedral twinning is not a fatal disease. Only, to quote Petrus Zwart, 'By now you should be a crystallographic hypochondriac' (Zwart, 2009).

A1. Twinning operators coinciding with the rotational part of pseudosymmetry
A survey of the tetartohedrally twinned structures that have appeared to date in the literature suggested that in most cases the tetartohedral twinning operators are aligned with the rotational parts of the noncrystallographic symmetry operators and that the latter in turn are close to true crystallographic symmetry, a situation known as pseudosymmetry (Zwart et al., 2008). In this Appendix, we derive a formula that illustrates the contributions to the diffracted intensity from the part of the structure that follows the pseudosymmetry and the part that does not and their interplay with the twinning fractions.
Let us write the electron density in the asymmetric unit of the crystal as

$$\rho_{asu}(x) = \rho_{NCS}(x) + \rho_{noNCS}(x), \qquad (2)$$

where $\rho_{NCS}(x)$ is the part of density in the asymmetric unit that follows noncrystallographic symmetry (NCS), i.e. electron density from a reference set of atoms and its NCS-related copies, and $\rho_{noNCS}(x)$ is the electron density for the remaining part of the asymmetric unit, which cannot be described using a reference copy and NCS operators. The portion of the electron density that obeys the NCS can be written using the $J_{NCS}$ operators starting from the electron density for the reference copy, labelled $\rho_1(x)$,

$$\rho_{NCS}(x) = \sum_{j=1}^{J_{NCS}} \rho_1[R_j^{-1}(x - t_j)], \qquad (3)$$

where $R_j$ is the rotation matrix and $t_j$ is the translation vector of the jth noncrystallographic symmetry operator. The structure factor is the Fourier transform of the unit-cell electron density $\rho_{cell}(x)$; following the above notation,

[equation (4)]

where the ith crystallographic operator $G_i$ in the crystal space group G acts as

$$G_i x = S_i x + t_i.$$

In other words, $S_i$ is the rotation matrix and $t_i$ is the translation vector of the ith space-group symmetry operator. $\Lambda$ is the set of crystal lattice translations. As we saw earlier, when the crystals are (pseudo)merohedrally twinned evaluation of the diffraction intensity (1) involves the calculation of the structure factors evaluated at reciprocal-lattice vectors rotated by the twinning operators

[equation (5)]

Let us now assume that (i) the noncrystallographic symmetry operators are close to true crystallographic symmetry, a situation known as pseudosymmetry (Zwart et al., 2008); unlike ordinary NCS operators, which are local, pseudosymmetry operators are global and (within some tolerance; see Lebedev et al., 2006) form a group with the crystallographic operators; (ii) the twinning operators $T_k$ are a subset of the pseudosymmetry rotation operators $R_j$.

Table 5. Summary of intensity statistics for the November 2009 complement factor I triclinic data. The statistics were computed using phenix.xtriage with data between 10 Å and a maximum resolution chosen such that the data with I/σ(I) > 3.00 still give 85% completeness. Expected intensity statistics for untwinned and perfectly twinned crystals were taken from Yu et al. (2009) and Stanley (1955).

Table 6. Human fI: estimation of agreement statistics and twinning fractions. The $R^{obs}_{twin}$, $R^{calc}_{twin}$, Britton α, H and ML statistics were computed with phenix.xtriage using the same subset of data as in Table 5.
Under these hypotheses, we have

[equations (6)]

In formulae (6), the notation {T|t} symbolizes the action of the operator defined by the rotation matrix T and the translational vector t, while {S|s} {T|t} means the result of acting sequentially first with the operator defined by T and t and then with the operator defined by S and s. These equalities show that under the hypotheses stated above and for all choices of space-group symmetry operator i, pseudosymmetry operator j and twinning operator k, there exist operators labelled i′ and j′ that allow a simplification of the effect of a chosen twinning operator k on the symmetry copy i of the NCS copy j. Thus, we can now rewrite (5),

[equation (7)]

Replacing this expression in (1) gives

[equation (8)]

where $I_{noNCS}(h) = F^{*}_{noNCS}(h) \cdot F_{noNCS}(h)$ and $I_1(h) = F_1^{*}(h) \cdot F_1(h)$. The terms within the summation over the twin index k in (8) describe the joint dependency of the observed intensity on the twin fractions and on the structural parameters. The part of the structure that does not follow the pseudosymmetry [$\rho_{noNCS}(x)$ in (2)] contributes to both terms within the summation. The part of the structure that does follow it, besides mixing with the noNCS part within the summation, also makes a contribution to the intensity that does not depend on the twin fractions (the first term on the right-hand side of equation 8).
The larger the proportion of the structure following the pseudosymmetry, the smaller the dependency of the measured intensity on the twinning ratios. Notably, in the limit of no violations of the pseudosymmetry [i.e. $F_{noNCS}(h) = 0$] the diffracted intensity tends to the value computed for an untwinned crystal with pseudosymmetry.
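The structure of this argument can be summarized in a compact sketch. The notation below is introduced here for illustration and is not that of the original equations: $F_{PS}$ collects the contributions of all crystallographic and NCS copies of the reference density $\rho_1$, and $F_{\Delta}$ collects the contributions of all crystallographic copies of $\rho_{noNCS}$. The only ingredients are hypotheses (i) and (ii) and the fact that the twin fractions sum to one.

```latex
% Assumed notation (not from the original text):
%   F_{PS}:     all crystallographic and NCS copies of \rho_1
%   F_{\Delta}: all crystallographic copies of \rho_{noNCS}
\begin{align}
F(\mathbf{h}) &= F_{PS}(\mathbf{h}) + F_{\Delta}(\mathbf{h}), \\
% Each T_k is the rotational part of a (pseudo)symmetry operation of the
% density behind F_{PS}, so applying it changes at most a phase factor:
\left|F_{PS}(T_k^{T}\mathbf{h})\right|^{2} &= \left|F_{PS}(\mathbf{h})\right|^{2}
  \quad \text{for all } k, \\
% Twinned intensity, using \sum_k \alpha_k = 1:
I^{twin}(\mathbf{h}) &\propto \sum_{k} \alpha_k
  \left|F_{PS}(T_k^{T}\mathbf{h}) + F_{\Delta}(T_k^{T}\mathbf{h})\right|^{2} \notag \\
 &= \left|F_{PS}(\mathbf{h})\right|^{2}
  + \sum_{k} \alpha_k \left\{ \left|F_{\Delta}(T_k^{T}\mathbf{h})\right|^{2}
  + 2\,\mathrm{Re}\!\left[F_{PS}^{*}(T_k^{T}\mathbf{h})\,
      F_{\Delta}(T_k^{T}\mathbf{h})\right] \right\}.
\end{align}
```

The first term carries no dependence on the twin fractions, while both terms inside the sum involve the part of the structure that violates the pseudosymmetry; setting $F_{\Delta} = 0$ leaves only the twin-fraction-independent term, in line with the limit described above.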
PR and SJ were funded by the Wellcome Trust (083599) and MRC (G0400775) Project Grants to SML. We thank Garib Murshudov, Dale Tronrud and Jade Li for discussions on aspects of the work and the referees for helpful suggestions on the manuscript.