COSET: a program for deriving and testing merohedral and pseudo-merohedral twin laws
COSET is a program written in ISO C99 with POSIX extensions which uses left coset decompositions to determine possible merohedral and pseudo-merohedral twin laws. In addition to a stand-alone program, the code may be compiled as a Python extension module. The program can create SHELXL instruction files which incorporate the appropriate TWIN and BASF instructions for the possible twin law(s). COSET may also be directed to execute a locally installed copy of the SHELXL binary executable to test the candidate twin laws in trial refinements. This facilitates the quick screening and assessment of possible twin laws.
The widespread deployment of CCD and other area-detector systems has greatly facilitated the recognition and analysis of twinned structures. Derivation of the twin law is an essential step in the process. The complexity of this task can depend on the type of twin and, possibly, on the point group of the crystal. In recent years a number of software tools have been developed to facilitate the determination of twin laws. There have been several programs for indexing non-merohedral twins, such as DIRAX (Duisenberg, 1992), GEMINI (Sparks, 1999) and CELL_NOW (Sheldrick, 2004). There are also programs such as ROTAX (Cooper et al., 2002) and the routine TwinRotMat in PLATON (Spek, 2009) which use differences between $F_o^2$ and $F_c^2$ values to detect previously unnoticed non-merohedral twinning. Other programs such as XPREP (Sheldrick, 2001) and a recent spreadsheet application (Flack & Wörle, 2013) can be used to derive merohedral twin laws.
Software for the convenient derivation of twin laws involving pseudo-merohedry is less well represented. Local versions of the XRAY76 system (Stewart et al., 1976) implemented a left coset decomposition algorithm several decades ago (Flack, 1987), but this program package, renamed the `Gnu Xtal system', does not seem to have been actively developed since 2003 (http://sourceforge.net/projects/xtal/files/ ). TWINLAWS (Schlessman & Litvin, 1995) and the coset decomposition routine on the Bilbao crystallographic server (Aroyo, Kirov et al., 2006; Aroyo, Perez-Mato et al., 2006; Aroyo et al., 2011; http://www.cryst.ehu.es/ ) are two other programs that can be used to derive twin laws for twinning by merohedry or pseudo-merohedry. The present article reports COSET, a small stand-alone program that can be used for deriving and testing twin laws in cases of twinning by merohedry or pseudo-merohedry. COSET implements the left coset decomposition algorithms discussed by Flack (1987) and can be used in conjunction with SHELXL (Sheldrick, 2008) to test candidate twin laws.
Twin laws for twinning by pseudo-merohedry can be derived manually with a similarity transformation (Sands, 1982a),

$$T = M^{-1} S M,$$

where T is the twin law, S is the symmetry operator of a metrically available apparent higher-symmetry lattice and M is the transformation matrix that transforms the lower-symmetry lattice basis vectors to the metrically available higher-symmetry basis. While manual multiplication of 3 × 3 matrices is not difficult, it can be tedious and susceptible to error, especially when there are multiple possible twin laws to be considered. In addition, manually transforming the symmetry matrices to a common basis adds another opportunity for error to be introduced. COSET automates this process of deriving and testing twin laws.
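The manual derivation described above can be sketched in a few lines of C. The sketch assumes that the similarity transformation has the form T = M⁻¹SM; the type `mat3` and the functions `mat3_mul()`, `mat3_inv()` and `twin_law()` are illustrative names and are not taken from the COSET source.

```c
#include <assert.h>

/* 3x3 matrix in row-major order; type and function names are illustrative. */
typedef struct { double m[3][3]; } mat3;

/* Return the matrix product a*b. */
mat3 mat3_mul(mat3 a, mat3 b)
{
    mat3 c;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            c.m[i][j] = 0.0;
            for (int k = 0; k < 3; k++)
                c.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return c;
}

/* Return the inverse of a (adjugate divided by the determinant). */
mat3 mat3_inv(mat3 a)
{
    double det = a.m[0][0] * (a.m[1][1] * a.m[2][2] - a.m[1][2] * a.m[2][1])
               - a.m[0][1] * (a.m[1][0] * a.m[2][2] - a.m[1][2] * a.m[2][0])
               + a.m[0][2] * (a.m[1][0] * a.m[2][1] - a.m[1][1] * a.m[2][0]);
    assert(det > 1e-12 || det < -1e-12);   /* M must be non-singular */

    mat3 inv;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            /* Cyclic indices give the signed cofactor of element (j,i). */
            int r1 = (j + 1) % 3, r2 = (j + 2) % 3;
            int c1 = (i + 1) % 3, c2 = (i + 2) % 3;
            inv.m[i][j] = (a.m[r1][c1] * a.m[r2][c2]
                         - a.m[r1][c2] * a.m[r2][c1]) / det;
        }
    return inv;
}

/* Assumed form of the similarity transformation: T = M^-1 * S * M. */
mat3 twin_law(mat3 M, mat3 S)
{
    return mat3_mul(mat3_mul(mat3_inv(M), S), M);
}
```

As a made-up illustration, M = [110, −110, 001] with S a twofold rotation about the first axis of the higher-symmetry cell, [100, 0−10, 00−1], gives T = [010, 100, 00−1].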
The user can specify which Flack left coset decomposition algorithm is to be used. The program may be used in a strictly geometric mode, which gives all possible twin laws for a given lower-symmetry point group with respect to an apparent higher-symmetry point group. Alternatively, the program may be run in trial mode when crystallographic intensity data are available. In trial mode, the program tests each twin law by modifying an existing SHELXL input file specified by the user. The program inserts the appropriate TWIN and BASF statements into the .ins file which are needed to test the candidate twin laws. Either these SHELXL input files may be used in individual refinements after the COSET program exits, or the program can be directed to execute a series of SHELXL trial refinements automatically. A trial refinement containing an identity twin law (i.e. no twin law) is run simultaneously as a control refinement, against which the other refinements may be directly compared.
The source code was designed to be portable and is written mostly in ISO C99, along with POSIX.1-2008 extensions for the fork() system call and the execl(), symlink() and wait() functions. On a reasonably modern Linux or Unix system, the program will most likely compile with only the following commands typed on the shell's command line:
The program successfully compiles and runs on the 32-bit version of Windows 7 using the MinGW port of the GNU C compiler gcc (http://www.mingw.org ), as well as on Mac OSX 10.6.8 using gcc-4.2.1. Details for compiling on different platforms or as a Python extension module are covered in the README.txt file included with the source code distribution.
The COSET program is designed both to be easy to use and to require a minimum of input. No assumption need be made with regard to the settings of the actual point group of the structure. Typing the program name on the command line without an input file name returns a help message. The input file is a plain text file which contains a number of directives and parameters used to govern the execution of the program. The program can process multiple coset analyses in a given execution, which are designated as `tasks'; the program is not limited in the number of tasks in a given execution. The directives and the parameter requirements are given in Table 1. A `#' character at the beginning of a line denotes a comment and is ignored by the program. The directives are not case sensitive and are explained below:
TITLE must be the first directive for a given task. The parameter for this directive is a short description of fewer than 75 characters.
ALGORITHM takes a single parameter, which must be either `A' or `B' (without quotes).
SUPERGROUP represents the metrically available higher symmetry, and this directive must contain one of the following character strings: -1, 2/m, mmm, 4/mmm, -3m, 6/mmm, m-3m.
SUBGROUP describes the lower (true) point symmetry of the crystal and takes two parameters. The first is a character string designation for the crystal's point group, e.g. -3 or mm2. The second parameter is an integer, which is equal to the number of symmetry operators for the crystal's point group, including the identity operator. The SUBGROUP directive must precede any RMAT directives.
RMAT takes nine numerical parameters, which are the matrix elements for the symmetry operators of the point group. The order of the matrix elements is r11, r12, r13, r21, r22, r23, r31, r32, r33. Each symmetry operator takes a separate RMAT directive. The number of RMAT directives must equal the numerical value given in the SUBGROUP directive. When constructing the RMAT directives, simply take the equivalent positions given in International Tables for Crystallography, Vol. A (Hahn, 2002), and convert them by inspection to the matrix representations after dropping all translational components. For example, if the crystal's space group is $I4_1/a$, drop the lattice-centring symbol and convert the translational symmetry elements to the corresponding non-translational equivalents. Thus, in this case, one would use the equivalent positions for $P4/m$. The identity operator must be included and must be the first RMAT statement in the input file.
TRANS takes nine numerical elements which transform the unit-cell parameters of the crystal to the metrically available supergroup cell. The order of the elements is t11, t12, t13, t21, t22, t23, t31, t32, t33. These elements are normally obtained from a cell-reduction program. If TRANS is omitted the identity matrix is used.
INSFILE takes a single character string, which is the name of the SHELX .ins file for the structure. This file is not altered by the program but provides the basis for new .ins file(s) used to perform trial refinements.
OUTFILE takes a single character string, which is the name of the general output file written by COSET. If this directive is not specified, the program writes these results to stdout (`standard output', i.e. the terminal).
NEWINS takes a single character string, which is the base name for the new set of .ins files that incorporate the SHELX BASF and TWIN instructions for twinned refinement.
EXEC takes a single character string, which is the full pathname of the local system's SHELX(T)L executable. With UNIX systems, symbolic links are created to the actual .hkl file. With systems in which symbolic links are unavailable, the .hkl file is copied to each <new_basename>.hkl file. The output SHELXL normally sends to the terminal is trapped into a so-called `screen' file for convenient post-execution examination by the user.
END takes no parameters and should be the last line of the file.
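The behaviour described for the EXEC directive, in which the refinement program is run as a child process and its terminal output is trapped in a `screen' file, can be sketched with POSIX calls as follows. This is a minimal sketch and not the COSET implementation: the function name `run_trial()`, its parameters and the use of open() and dup2() for the redirection are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "exe basename" as a child process with its standard output
 * redirected to screen_path.  Returns the child's exit status, or -1
 * if the child could not be started or did not exit normally.
 * (Hypothetical helper; not a function from the COSET source.) */
int run_trial(const char *exe, const char *basename, const char *screen_path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: send stdout to the screen file, then replace the image. */
        int fd = open(screen_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0)
                _exit(127);
            close(fd);
        }
        execl(exe, exe, basename, (char *)NULL);
        _exit(127);               /* reached only if execl() failed */
    }

    /* Parent: wait for the single child to finish. */
    int status;
    if (wait(&status) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass the full pathname given in the EXEC directive as `exe` and the base name of one of the new .ins files as `basename`, since SHELXL is conventionally started with the base name of the job as its argument.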
Examples of COSET input files are given in §4. It should be noted that not all directives need to be used. If crystallographic data or SHELXL files are not available, a coset decomposition can still be performed with only the TITLE, SUPERGROUP, SUBGROUP, RMAT and, if necessary, TRANS directives. This will give potential merohedral or pseudo-merohedral twin laws for any subgroup–supergroup relationship.
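The conversion of equivalent positions to RMAT matrix elements is done by inspection, but the rule is simple enough to sketch in code. The helper below is not part of COSET; `triplet_to_matrix()` is a hypothetical function that reads a coordinate triplet in the style of International Tables, ignores any translational component and fills in the nine elements in the order r11 ... r33.

```c
#include <ctype.h>

/* Convert a coordinate triplet such as "-y+3/4, x+1/4, z+1/4" into a
 * 3x3 matrix of -1/0/1 elements, dropping translational components.
 * Returns 0 on success and -1 if the triplet does not have exactly
 * three comma-separated components.  (Illustrative helper only.) */
int triplet_to_matrix(const char *triplet, int r[3][3])
{
    int row = 0, sign = 1;

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            r[i][j] = 0;

    for (const char *p = triplet; *p != '\0'; p++) {
        switch (tolower((unsigned char)*p)) {
        case ',':                      /* next component, next matrix row */
            if (++row > 2)
                return -1;
            sign = 1;
            break;
        case '+': sign = 1;  break;
        case '-': sign = -1; break;
        case 'x': r[row][0] = sign; sign = 1; break;
        case 'y': r[row][1] = sign; sign = 1; break;
        case 'z': r[row][2] = sign; sign = 1; break;
        default:                       /* digits, '/', blanks: ignored */
            break;
        }
    }
    return row == 2 ? 0 : -1;
}
```

For the triplet −y + 3/4, x + 1/4, z + 1/4 the function yields the rows (0, −1, 0), (1, 0, 0) and (0, 0, 1), i.e. the fourfold rotation that would be entered as one RMAT directive.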
COSET is a `free software' (http://www.gnu.org/philosophy/free-sw.html ) program released under the GNU General Public License (http://www.gnu.org/licenses/gpl.html ). It is hoped that users will modify and enhance the program. For this reason, some details regarding the implementation of the program are given.
The input for the program is read using a finite state machine (FSM). The FSM was chosen for its flexibility and modularity. There are very few requirements regarding the order of the directives in the input file. In addition, if a new directive is to be added, it is a simple matter to add the directive and its callback function to the struct keyword_table array declaration in input.c. The struct keyword_table is defined in input.h. The body of the callback function should be located in the input.c file for consistency. Depending on the nature of the new directive, the struct task in task.h may also need to be modified.
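The general idea of a keyword table with callback functions can be sketched as follows. The layout shown is an assumption made for illustration: `struct task_sketch`, `struct keyword_entry_sketch`, the two callbacks and `dispatch_line()` are hypothetical and do not reproduce the declarations in input.h or task.h. Only the table-driven dispatch is shown, not a full state machine.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task data. */
struct task_sketch { char algorithm; int seen_end; };

typedef int (*directive_cb)(struct task_sketch *t, const char *args);

/* Hypothetical table entry: a keyword and the function that handles it. */
struct keyword_entry_sketch { const char *keyword; directive_cb callback; };

static int cb_algorithm(struct task_sketch *t, const char *args)
{
    while (isspace((unsigned char)*args))
        args++;
    int c = toupper((unsigned char)*args);
    if (c != 'A' && c != 'B')
        return -1;
    t->algorithm = (char)c;
    return 0;
}

static int cb_end(struct task_sketch *t, const char *args)
{
    (void)args;                       /* END takes no parameters */
    t->seen_end = 1;
    return 0;
}

/* A new directive is supported by adding one line here. */
static const struct keyword_entry_sketch keyword_table_sketch[] = {
    { "ALGORITHM", cb_algorithm },
    { "END",       cb_end       },
};

/* Case-insensitive match of the first word of line against keyword.
 * Returns a pointer to the text after the word, or NULL on mismatch. */
static const char *match_keyword(const char *line, const char *keyword)
{
    while (*keyword != '\0') {
        if (toupper((unsigned char)*line) != toupper((unsigned char)*keyword))
            return NULL;
        line++;
        keyword++;
    }
    if (*line != '\0' && !isspace((unsigned char)*line))
        return NULL;                  /* "ENDING" must not match "END" */
    return line;
}

/* Returns 0 for a handled directive, blank line or comment; -1 otherwise.
 * Skipping leading blanks before the '#' test is a choice of this sketch. */
int dispatch_line(struct task_sketch *t, const char *line)
{
    assert(t != NULL && line != NULL);
    while (isspace((unsigned char)*line))
        line++;
    if (*line == '\0' || *line == '#')
        return 0;
    size_t n = sizeof keyword_table_sketch / sizeof keyword_table_sketch[0];
    for (size_t i = 0; i < n; i++) {
        const char *args = match_keyword(line, keyword_table_sketch[i].keyword);
        if (args != NULL)
            return keyword_table_sketch[i].callback(t, args);
    }
    return -1;
}
```

With this arrangement the parsing loop never needs to change when a directive is added; only the table and the new callback are touched.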
The code regarding the storing and manipulation of symmetry matrices is given in the files symm_mat.h and symm_mat.c. The key data structure for the symmetry matrices is the C struct symm_op, which is declared as
Of particular note is the structure member bcm which designates a binary (en)coded matrix (BCM). The symmetry matrices in BCM representation are encoded as a single unsigned int. BCMs are used to economize the implementation of the Flack algorithms. With BCMs, two matrices are compared by comparing two unsigned integers, rather than by the element-by-element comparison of nine matrix elements for every matrix comparison.
This encoding takes advantage of the fact that, for crystallographic point-group symmetry matrices, all the matrix elements are either −1, 0 or 1. These values are encoded as two-bit quantities, i.e. as the binary numbers 10, 00 and 01, respectively. The two-bit representations of the matrix elements are bit shifted into an unsigned int using the offsets listed in Table 2. The offsets are stored in a 3 × 3 array of ints in the variable offset_table in symm_mat.c. Symmetry matrices are encoded and decoded with the encode_value() and decode_value() functions found in the symm_mat.c source file.
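The two-bit encoding described above can be illustrated with a short sketch. The bit offsets used here, 2(3i + j) for element (i, j), are an assumption chosen for the example and need not agree with the offsets in offset_table; likewise `bcm_encode()` and `bcm_decode()` are illustrative names and not the signatures of encode_value() and decode_value().

```c
#include <assert.h>

typedef struct { int r[3][3]; } imat3;

/* Hypothetical offsets: element (i,j) starts at bit 2*(3*i + j).
 * The offsets actually used by COSET may differ. */
static const int offset_sketch[3][3] = {
    {  0,  2,  4 },
    {  6,  8, 10 },
    { 12, 14, 16 }
};

/* Pack a matrix of -1/0/1 elements into one unsigned int.
 * -1 -> binary 10, 0 -> binary 00, 1 -> binary 01.
 * Needs 18 bits, so unsigned int is assumed to be at least 32 bits
 * wide, as it is on POSIX systems. */
unsigned int bcm_encode(imat3 a)
{
    unsigned int bcm = 0u;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            assert(a.r[i][j] >= -1 && a.r[i][j] <= 1);
            unsigned int bits = (a.r[i][j] == -1) ? 2u : (unsigned int)a.r[i][j];
            bcm |= bits << offset_sketch[i][j];
        }
    return bcm;
}

/* Unpack a value produced by bcm_encode(). */
imat3 bcm_decode(unsigned int bcm)
{
    imat3 a;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            unsigned int bits = (bcm >> offset_sketch[i][j]) & 3u;
            a.r[i][j] = (bits == 2u) ? -1 : (int)bits;
        }
    return a;
}
```

With these assumed offsets the identity matrix encodes to 0x10101, and testing whether two operators are equal reduces to a single integer comparison, `bcm_encode(a) == bcm_encode(b)`.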
The other notable feature of the COSET program is that the symmetry matrices are stored and the left coset decompositions are performed using reciprocal-space symmetry matrices. This is done because twin laws are defined with regard to reciprocal-space vector components, hkl. Symmetry matrices representing one-, two- and fourfold rotations in direct and reciprocal space are identical. However, in general, a reciprocal-space symmetry matrix is the inverse transpose of the direct-space symmetry matrix (Sands, 1982b). For three- and sixfold axes, the direct- and reciprocal-space matrices are not identical. To achieve rigorous consistency, the COSET program converts all user-input symmetry matrices from direct-space matrices into reciprocal-space matrices.
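The difference between direct- and reciprocal-space matrices for a threefold axis can be checked with a small integer-only sketch. It relies on the fact that a point-group matrix has determinant ±1, so its inverse transpose is simply the determinant multiplied by the cofactor matrix. The names `imat3`, `imat3_det()` and `reciprocal_from_direct()` are illustrative and are not taken from the COSET source.

```c
#include <assert.h>

typedef struct { int r[3][3]; } imat3;

/* Determinant of a 3x3 integer matrix. */
int imat3_det(imat3 a)
{
    return a.r[0][0] * (a.r[1][1] * a.r[2][2] - a.r[1][2] * a.r[2][1])
         - a.r[0][1] * (a.r[1][0] * a.r[2][2] - a.r[1][2] * a.r[2][0])
         + a.r[0][2] * (a.r[1][0] * a.r[2][1] - a.r[1][1] * a.r[2][0]);
}

/* Inverse transpose of a matrix with determinant +1 or -1.
 * For such matrices 1/det == det, so the result is det * cofactor(a). */
imat3 reciprocal_from_direct(imat3 a)
{
    int det = imat3_det(a);
    assert(det == 1 || det == -1);

    imat3 out;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            /* Cyclic indices give the signed cofactor of element (i,j). */
            int i1 = (i + 1) % 3, i2 = (i + 2) % 3;
            int j1 = (j + 1) % 3, j2 = (j + 2) % 3;
            int cof = a.r[i1][j1] * a.r[i2][j2] - a.r[i1][j2] * a.r[i2][j1];
            out.r[i][j] = det * cof;
        }
    return out;
}
```

For the threefold rotation −y, x − y, z in hexagonal axes, with rows (0, −1, 0), (1, −1, 0), (0, 0, 1), the function returns rows (−1, −1, 0), (1, 0, 0), (0, 0, 1), which is a different matrix. For the fourfold rotation −y, x, z the input is returned unchanged.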
3.3. Implementation of left coset algorithms
The implementations of Flack's A and B algorithms are found in the coset.c source file. The user-chosen function is assigned to a function pointer in the struct task found in task.h. The program can be compiled to use an `extended B algorithm'. For acentric structures, this extended algorithm explicitly prints out potential twin laws which are centrically related to potential twin laws selected by the original Flack B algorithm. To incorporate this extension, the program should be compiled with the USE_EXTENDED_B_ALGORITHM conditional compilation directive either defined in the coset.c source file or included in the CFLAGS variable in the Makefile.
When the symmetry matrices for the true point group are input via RMAT directives, they are inverted and transposed to their reciprocal-space representations (in input.c). Before undergoing the coset decomposition process, they are transformed to the metrically available higher-symmetry space. The coset decomposition is performed and the selected symmetry matrices that are the twin law candidates are transformed back to the lower-symmetry space. Both transformations are performed in the task.c source file.
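Once the subgroup operators have been expressed in the higher-symmetry basis, the decomposition itself is a short loop. The sketch below is a generic textbook left coset decomposition and is not a reproduction of Flack's algorithm A or B, whose selection rules for the representatives are not given here. All names (`imat3`, `left_cosets()`, `MAX_OPS`) are illustrative.

```c
#include <assert.h>

typedef struct { int r[3][3]; } imat3;

#define MAX_OPS 48   /* order of the largest crystallographic point group */

imat3 imat3_mul(imat3 a, imat3 b)
{
    imat3 c;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            c.r[i][j] = 0;
            for (int k = 0; k < 3; k++)
                c.r[i][j] += a.r[i][k] * b.r[k][j];
        }
    return c;
}

int imat3_equal(imat3 a, imat3 b)
{
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            if (a.r[i][j] != b.r[i][j])
                return 0;
    return 1;
}

/* Decompose the supergroup G (nG operators) into left cosets gH of the
 * subgroup H (nH operators, all of which must also occur in G).
 * One representative per coset is written to reps[], which must have
 * room for nG entries; the number of cosets is returned.  The first
 * operator of G not yet covered is taken as the next representative. */
int left_cosets(const imat3 *G, int nG, const imat3 *H, int nH, imat3 *reps)
{
    int covered[MAX_OPS] = {0};
    int nreps = 0;

    assert(nG <= MAX_OPS);
    for (int g = 0; g < nG; g++) {
        if (covered[g])
            continue;
        reps[nreps++] = G[g];
        for (int h = 0; h < nH; h++) {
            imat3 gh = imat3_mul(G[g], H[h]);
            for (int k = 0; k < nG; k++)
                if (imat3_equal(gh, G[k]))
                    covered[k] = 1;
        }
    }
    return nreps;
}
```

If the identity is listed first in G, the first coset is the subgroup itself and the remaining representatives are the candidates for twin laws. For G = mmm (eight operators) and H = 2/m with b unique (four operators), the loop returns two cosets, so there is a single candidate.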
The code for setting up the TWIN and BASF instructions is found in the source file shelx.c. For an n-fold axis as a candidate twin law, the program assumes n twin domains and sets the tenth parameter of the TWIN instruction to this value. The BASF instruction incorporates n − 1 fractional volume parameters. Each of these is set to a starting value of 1.0/n.
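The rule described above (n domains for an n-fold candidate, n − 1 BASF parameters, each starting at 1/n) can be sketched as a formatting routine. This is not the code in shelx.c: the function name `format_twin_basf()`, its parameters and the number formats chosen are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Write the TWIN and BASF instructions for a candidate twin law t
 * (nine elements in row-major order) assuming n twin domains.
 * Returns 0 on success and -1 if either buffer is too small.
 * (Hypothetical helper; output formats are a choice of this sketch.) */
int format_twin_basf(const double t[9], int n,
                     char *twin, size_t twin_len,
                     char *basf, size_t basf_len)
{
    assert(n >= 2);

    /* Nine matrix elements followed by the number of domains. */
    int w = snprintf(twin, twin_len, "TWIN %g %g %g %g %g %g %g %g %g %d",
                     t[0], t[1], t[2], t[3], t[4], t[5], t[6], t[7], t[8], n);
    if (w < 0 || (size_t)w >= twin_len)
        return -1;

    /* n - 1 fractional volume parameters, each starting at 1/n. */
    w = snprintf(basf, basf_len, "BASF");
    if (w < 0 || (size_t)w >= basf_len)
        return -1;
    size_t pos = (size_t)w;
    for (int i = 0; i < n - 1; i++) {
        w = snprintf(basf + pos, basf_len - pos, " %.4f", 1.0 / n);
        if (w < 0 || (size_t)w >= basf_len - pos)
            return -1;
        pos += (size_t)w;
    }
    return 0;
}
```

For a twofold candidate this produces a single BASF parameter of 0.5000; for a threefold candidate it produces two parameters of 0.3333.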
COSET has been used to derive twin laws for pseudo-merohedral twins as illustrated by two case studies, 4-(1-allyl-4,5-diphenyl-1H-imidazol-2-yl)-N,N-dimethylaniline (Akkurt et al., 2013) and 2,2′-(piperazine-1,4-diyl)diethanaminium dibenzoate (Cukrowski et al., 2012). In each case, the CIF and the reflection data were downloaded from the publisher's web site (respectively, http://dx.doi.org/10.1107/S1600536813006326 and http://dx.doi.org/10.1107/S1600536812030115 ) and edited into a form suitable for SHELXL refinement. For the coset analysis, an appropriate COSET input file and SHELXL .ins file were constructed. The .ins file included all non-H atoms with isotropic displacement parameters. H atoms from the original structure determination were deleted. The trial refinement was set up to include appropriate HFIX commands, so H atoms would be introduced at idealized positions and allowed to ride on the parent C or N atoms. The trial refinements were run with 12 cycles of least squares each. For the first four of these, the non-H atoms were refined isotropically, and for the last eight cycles non-H atoms were included anisotropically. The weights were set to the default SHELXL weighting scheme.
The structure of 4-(1-allyl-4,5-diphenyl-1H-imidazol-2-yl)-N,N-dimethylaniline was reported recently (Akkurt et al., 2013) and found to be twinned by pseudo-merohedry. The authors successfully derived the twin law [00, 00, 111] and refined the twin fraction to a value of 0.513 (3).
Cell reduction of the reported cell parameters yields a metrically available C monoclinic cell with unit-cell parameters of a = 11.643, b = 88.685, c = 9.426 Å, α = 90.00, β = 123.24, γ = 89.98°. The transformation matrix to the metrically available C monoclinic lattice is [0, , 100]. Rsym for the monoclinic lattice is 0.029. Using the results from the cell reduction and the point group of the crystal allows the following COSET input file to be constructed:
COSET performed the left coset decomposition and recovered the twin law found by Akkurt et al. (2013). The trial refinements gave a clear indication that the twin law was indeed the correct one. The weighted R factor, wR2, of the `no twin' control refinement started at 0.547 and decreased slightly to 0.509 over the course of 12 cycles of least squares. In contrast, wR2 for the trial refinement including the twin law started at 0.370 and decreased to 0.146. Inclusion of the twin law thus gave a marked and immediate improvement to the refinement model. The twin fraction parameter, BASF, converged to a value of 0.514 (1), which compares well with the previously reported value of 0.513 (3).
The second compound, 2,2′-(piperazine-1,4-diyl)diethanaminium dibenzoate, crystallizes as a monoclinic structure, which is twinned by pseudo-merohedry (Cukrowski et al., 2012). The authors report the twin law as [001, 00, 100] and twin fractions of 0.8645 (8) and 0.1355 (8).
Cell reduction of the reported unit-cell parameters yields a metrically available C orthorhombic cell with unit-cell parameters of a = 20.748, b = 33.197, c = 6.669 Å, α = 90.00, β = 90.00, γ = 89.71°. The transformation matrix to the metrically available C orthorhombic lattice is [101, 01, 00]. Rsym for the orthorhombic lattice is 0.318. Using the results from the cell reduction and the point group of the crystal, the following COSET input file was created:
In this case, performing the left coset decomposition with algorithm A finds a twin law [00, 00, 00], which corresponds to a 180° rotation about , while the twin law given by Cukrowski et al. (2012) corresponds to a 180° rotation about . Using algorithm B recovers the twin law given by Cukrowski et al. The twin law obtained from algorithm A was used for the trial refinements.
Unlike the previous example, wR2 for the trial refinement including the twin law starts at a slightly higher value than the `no twin' control refinement. This is due to the disparity between the starting value of the twin fraction (0.50) and the actual value of 0.13. Nevertheless, wR2 for the trial refinement drops smoothly to a value of 0.121 over the course of 12 cycles, while wR2 for the control refinement remains at about 0.41 for all 12 cycles.
COSET is a small stand-alone program useful for deriving merohedral or pseudo-merohedral twin laws. It can be used in conjunction with the SHELXL refinement program to evaluate potential twin laws quickly. The input files for COSET are typically small and easily created using a text editor.
The COSET program can be downloaded from http://xray.chem.uwo.ca/COSET/ . The source code is available as either a ZIP archive or a GZIP compressed tar file. A 32-bit statically compiled binary executable file for Windows is also available on the web site. The Windows executable file was produced using MinGW's gcc compiler on Windows 7 Professional edition.
The author is grateful to Dr Sean Parkin for helpful comments regarding the code and the manuscript, as well as for testing the program on the Mac OSX platform. The author is also grateful to Dr David Watkin and Professor Nicholas Payne for their helpful comments regarding an early draft of the manuscript.
Akkurt, M., Fronczek, F. R., Mohamed, S. K., Talybov, A. H., Marzouk, A. A. E. & Abdelhamid, A. A. (2013). Acta Cryst. E69, o527–o528.
Aroyo, M. I., Kirov, A., Capillas, C., Perez-Mato, J. M. & Wondratschek, H. (2006). Acta Cryst. A62, 115–128.
Aroyo, M. I., Perez-Mato, J. M., Capillas, C., Kroumova, E., Ivantchev, S., Madariaga, G., Kirov, A. & Wondratschek, H. (2006). Z. Kristallogr. 221, 15–27.
Aroyo, M. I., Perez-Mato, J. M., Orobengoa, D., Tasci, E., de la Flor, G. & Kirov, A. (2011). Bulg. Chem. Commun. 43, 183–197.
Cooper, R. I., Gould, R. O., Parsons, S. & Watkin, D. J. (2002). J. Appl. Cryst. 35, 168–174.
Cukrowski, I., Adeyinka, A. S. & Liles, D. C. (2012). Acta Cryst. E68, o2389.
Duisenberg, A. J. M. (1992). J. Appl. Cryst. 25, 92–96.
Flack, H. D. (1987). Acta Cryst. A43, 564–568.
Flack, H. D. & Wörle, M. (2013). J. Appl. Cryst. 46, 248–251.
Hahn, T. (2002). Editor. International Tables for Crystallography, Vol. A. Dordrecht: Kluwer Academic Publishers.
Sands, D. E. (1982a). Vectors & Tensors in Crystallography, p. 97. London: Addison–Wesley.
Sands, D. E. (1982b). Vectors & Tensors in Crystallography, pp. 110–111. London: Addison–Wesley.
Schlessman, J. & Litvin, D. B. (1995). Acta Cryst. A51, 947–949.
Sheldrick, G. M. (2001). XPREP. Bruker AXS Inc., Madison, Wisconsin, USA.
Sheldrick, G. M. (2004). CELL_NOW. University of Göttingen, Germany.
Sheldrick, G. M. (2008). Acta Cryst. A64, 112–122.
Sparks, R. A. (1999). GEMINI. Bruker AXS Inc., Madison, Wisconsin, USA.
Spek, A. L. (2009). Acta Cryst. D65, 148–155.
Stewart, J. M., Machin, P. A., Dickinson, C. W., Ammon, H. L., Heck, H. & Flack, H. (1976). The XRAY76 System. Technical Report TR-446. Computer Science Center, University of Maryland, College Park, Maryland, USA.
© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.