A unified convention for biological assemblies with helical symmetry
aBasic Science Program, SAIC-Frederick Inc., Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, MD 21702, USA, and bSackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
*Correspondence e-mail: email@example.com
Assemblies with helical symmetry can be conveniently formulated in many distinct ways. Here, a new convention is presented which unifies the two most commonly used helical systems for generating helical assemblies from asymmetric units determined by X-ray fibre diffraction and EM imaging. A helical assembly is viewed as being composed of identical repetitive units in a one- or two-dimensional lattice, named 1-D and 2-D helical systems, respectively. The unification suggests that a new helical description with only four parameters [n1, n2, twist, rise], which is called the augmented 1-D helical system, can generate the complete set of helical arrangements, including coverage of helical discontinuities (seams). A unified four-parameter characterization implies similar parameters for similar assemblies, can eliminate errors in reproducing structures of helical assemblies and facilitates the generation of polymorphic ensembles from helical atomic models or EM density maps. Further, guidelines are provided for such a unique description that reflects the structural signature of an assembly, as well as rules for manipulating the helical symmetry presentation.
Under physiological conditions, many biomolecules are either organized in functional tubular forms or aggregated in disease-related filaments. Tubular and filamentous structures grow with a helical symmetry. The determination of helical structures is important because it provides clues to functional regulation and to the mechanisms of polymerization and depolymerization, and can help in figuring out how to prevent unwanted disease-related fibril aggregation. The most famous example is the determination of the Watson–Crick double-helical DNA structure in 1953 (Watson & Crick, 1953), which created a new era in the history of molecular biology. In proteins, even a slight difference in the interactions between molecules is sufficient to create similar filamentous or tubular structures with distinct helical symmetries. For this reason, structural polymorphism is a common characteristic of tubular or fibril entities. Depending on the specificity and rigidity of the interacting molecules, some, such as the amyloidogenic peptide Aβ1–40 (Sachse et al., 2008; Schmidt et al., 2009), can exhibit a broad spectrum of polymorphic assemblies, whereas others only show limited variability, as in the case of microtubules (Sui & Downing, 2010). This underscores the importance of revealing the structural characteristics of helical assemblies directly from a simple helical symmetry description.
Helical symmetry can be formulated in many different ways. Helical transformations can be classified into two categories: one-dimensional (1-D) helical systems and two-dimensional (2-D) helical systems. In structural determination by X-ray fibre diffraction (Klug et al., 1958), a helical structure is described as a set of n 1-D molecular helices related by an n-fold axial symmetry. However, in both systems, if the helically repeating motif has C2 symmetry, the helical structure has an additional dyad symmetry (Klug et al., 1958). Given the asymmetric units, the helical assembly can be constructed by rotohelical transformations which are defined by a specified line group (Damnjanović et al., 2007). Because of the prevalence of structural polymorphism (DeRosier et al., 1999) and helical discontinuities (seams; Kikkawa, 2004) in electron-microscopy (EM) images, a description of helical assemblies by a rolled planar 2-D lattice sheet was devised to solve the EM structure in reciprocal space. In this representation, the framework of a helical structure is viewed as a helical net; that is, a set of equivalent points wrapped around a cylindrical surface. Various 2-D lattice wrappings were defined by a circumference vector c = n1a + n2b, where n1 and n2 are two integer constants and a and b are the 2-D lattice vectors. On the other hand, the reconstruction of helical structures in real space is typically based on rotohelical transformations which are applied iteratively using the single-particle method (Sachse et al., 2007; Egelman, 2007, 2010).
A simple convention for defining the helical symmetry of biological assemblies has been suggested in the remediated Protein Data Bank (PDB; Lawson et al., 2008) and EM Data Bank (Heymann et al., 2005) archives. Both have used a definition of rotohelical transformation that does not fully capture the underlying symmetry properties of helical assemblies. It is therefore not surprising that two very similar tubular structures might be described by very different helical parameters that provide no clue to the fact that they are actually quite similar. This is the case for the two bacteriophage major coat protein helical tubes determined by X-ray fibre diffraction [PDB entries 1hgv (Pederson et al., 2001) and 1ifd (Marvin, 1990)]; the first is presented as a one-start and the second as a five-start helical tube with each helix related by a fivefold rotational symmetry. The discrepancy is understandable because the two PDB structures were the outcome of structure-determination procedures in which the helical symmetry was preset in the minimization procedure. This shortcoming underscores the importance of a standard system that would report helical structures and provide parameters that reflect their structural characteristics. It appears that to date an unambiguous, simple and systematic standard for defining a unique helical specification for constructing helical assemblies from asymmetric units is lacking.
In this paper, we present a new unified convention for the construction of helical assemblies from asymmetric units determined by X-ray fibre diffraction and EM imaging. The unification is made possible by an augmented 1-D helical system (described below) that extends the traditional 1-D helical scheme to adopt the helical symmetry descriptor [n1, n2] which is used in the 2-D helical system. A helical structure can be prepared by rolling a planar sheet composed of identical 2-D unit cells (Stewart, 1988). In order to create a seamless 2-D lattice tube, two integer constants [n1, n2] define the wrapping process: n1 refers to the number of cells that are needed to complete a full round of cylinder wrapping and n2 to the number of cells sliding along the cell edge after the wrapping. The helical symmetry of the tubular structure is explicitly determined by [n1, n2] and the corresponding 2-D wrapping transformations can be found in the literature (Tsai et al., 2006; Kikkawa, 2004).
In a traditional 1-D helical system, a helical structure is depicted as either a one-start or an n-start helical structure (Egelman, 2007; Klug et al., 1958). For a one-start helical structure, the assembly consists of only a single helix with two helical parameters, twist (φ) and rise (δ); these denote the transformation of the 1-D unit cell which is used to build the entire structure. In fibre diffraction, a one-start helix is formulated as u units in v turns with a helical repeat distance of c, which straightforwardly gives φ = 2πv/u and δ = c/u. An n-start helical structure has n helices related by an n-fold axial symmetry (Cn), with the axis coinciding with the helical axis. In the augmented 1-D helical system described below, in addition to the rotational operation of the Cn symmetry there is an extra translational operation along the helical axis. In the 2-D helical system this extra translational operation is implicitly included in the 2-D wrapping transformation; however, it is ignored in the traditional 1-D helical scheme. This prevents the 1-D and 2-D systems from being unified in a common helical symmetry description. In contrast, the helical symmetry in the augmented 1-D helical system with the four parameters [n1, n2, twist, rise] is defined by two consecutive helical (screw) operations: the first helical operation is specified by the two helical parameters [twist, rise] exactly as in the traditional 1-D helical transformation and the second screw operation is defined by the two constants [n1, n2]. Here n1 refers to the n1-fold rotational symmetry exactly as in the traditional 1-D helical transformation and n2 specifies the translational part of the second screw operation. We will illustrate the operational transformations as well as the interconversion between the augmented 1-D helical system and the 2-D helical system below.
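As a small worked illustration of the fibre-diffraction relation above, the following sketch converts "u units in v turns with repeat distance c" into a twist and a rise per unit. The function name and the repeat distance of 52.0 are assumptions made only for this example; the rise is taken as the repeat distance divided by the number of units.

```python
import math

def one_start_parameters(u, v, c):
    """Return (twist in radians, rise) for a one-start helix with
    u units in v turns over a helical repeat distance c."""
    twist = 2.0 * math.pi * v / u   # phi = 2*pi*v/u
    rise = c / u                    # the repeat distance is shared by u units
    return twist, rise

# Illustrative input: 13 units in 4 turns; the repeat distance 52.0 is made up.
twist, rise = one_start_parameters(u=13, v=4, c=52.0)
print(math.degrees(twist), rise)
```

After u steps the accumulated twist is 2πv and the accumulated rise is c, which is the defining property of the true repeat.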
In our definition, a helical structure is composed of repetitive identical units, similar to a single crystal which is built from a three-dimensional (3-D) lattice. The definition of repetitive relates to the entire helical structures, which are built from a unit cell with a specified helical symmetry. The unit cell is either a 2-D lattice or a 1-D line segment. The definition of identical implies that each repetitive unit in the construct has exactly the same environment; that is, the independent variables of a helical structure include only parameters involved in helical transformation and coordinates of asymmetric units within a unit cell. Identical implies that when evaluating the energy of the assembly there is no need to include the interactions between all units but just the interactions between one unit cell and its surrounding cells.
Given a 3-D unit cell, it is straightforward to generate the entire single crystal with fractional coordinates. The newly generated fractional coordinates are the number of cells (na, nb, nc) away from the origin (0, 0, 0) along each edge. The Cartesian coordinates of a new cell can be converted from the fractional coordinates by an orthogonalization matrix (Evans, 2001) computed from the 3-D lattice constants a, b, c, α, β and γ. However, unlike the 3-D crystal system, a specified helical transformation is needed to generate the helical assembly from a given 1-D or 2-D unit cell. In the following, the equations for the 2-D wrapping and the augmented 1-D helical transformation will be derived and explained in detail.
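The crystal-lattice analogy can be made concrete with a short sketch of the fractional-to-Cartesian conversion. It assumes the common convention in which edge a lies along x and edge b lies in the xy plane; the function names and the cell dimensions are illustrative only.

```python
import math

def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """3x3 matrix mapping fractional to Cartesian coordinates.
    Angles are in degrees; a is along x and b lies in the xy plane."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    volume = a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, volume / (a * b * sg)],
    ]

def frac_to_cart(matrix, frac):
    """Apply the matrix to a fractional coordinate triple."""
    return tuple(sum(matrix[i][j] * frac[j] for j in range(3)) for i in range(3))

# Origin of the cell (na, nb, nc) = (1, 2, 3) in a made-up orthorhombic cell.
M = orthogonalization_matrix(10.0, 20.0, 30.0, 90.0, 90.0, 90.0)
origin_of_cell = frac_to_cart(M, (1, 2, 3))
print(origin_of_cell)
```

No such single matrix exists for a helical assembly, which is why an explicit helical transformation is needed in the sections that follow.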
As stated in §1, a planar sheet of 2-D lattice can be wrapped into a tube. If all sheet units are identical, the two constant integers n1 and n2 are sufficient to define all possible distinct tubes obtained by rolling it. Here, n1 is the number of cells along one edge of the 2-D lattice (a) which are required to make a full round of wrapping and n2 refers to the number of cells sliding along the other edge of the lattice (b) after wrapping.
If we place edge a along the x axis of the Cartesian coordinate and a 2-D lattice (a, b, γ) is placed on the xy plane, the wrapping equations for an [n1, n2] tube are
(xw, yw, zw) are the wrapped Cartesian coordinates of the tube and (xc, yc, zc) are the Cartesian coordinates of the associated 2-D sheet. In the wrapping equations, the 2-D lattice sheet is at a distance of the tube radius (r) from the tube axis (tx, ty, 0). The helical transformation is implicitly specified by the helical twist α and the helical rise xc·tx + yc·ty. Fig. 1 provides a graphical summary of the 2-D helical transformation. A more detailed description has been given previously (Tsai et al., 2006). Given asymmetric units in a 2-D lattice and a helical symmetry specified by the 2-D helical system in five parameters [n1, n2, a, b, γ], one can build a complete helical construct based on the 2-D helical transformation equations formulated above.
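The geometry of the wrapping can be sketched as follows. This is a minimal illustration and not the wrapping equations themselves: it assumes the circumference vector c = n1a + n2b, takes the tube axis as the in-plane unit vector perpendicular to c, and returns only the radius, twist and rise of a lattice point. The sign conventions for n2 and for the axis direction are assumptions, and the lattice constants in the example are made up.

```python
import math

def wrap_lattice_point(i, j, n1, n2, a, b, gamma):
    """Return (radius, twist in radians, rise) of the lattice point i*a + j*b
    on an [n1, n2] tube; gamma is the lattice angle in radians."""
    ax, ay = a, 0.0                                  # edge a along the x axis
    bx, by = b * math.cos(gamma), b * math.sin(gamma)
    cx, cy = n1 * ax + n2 * bx, n1 * ay + n2 * by    # circumference vector
    circumference = math.hypot(cx, cy)
    r = circumference / (2.0 * math.pi)
    ux, uy = cx / circumference, cy / circumference  # unit vector around the tube
    tx, ty = -uy, ux                                 # assumed tube-axis direction
    px, py = i * ax + j * bx, i * ay + j * by
    twist = (px * ux + py * uy) / r                  # arc length divided by radius
    rise = px * tx + py * ty                         # xc*tx + yc*ty
    return r, twist, rise

# Hypothetical lattice constants for an [11, 3] tube.
print(wrap_lattice_point(0, 1, 11, 3, 5.0, 4.0, math.radians(80.0)))
```

Under these assumptions the lattice point n1a + n2b maps to a twist of exactly 2π with zero rise, which is the condition for a seamless tube.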
Instead of rolling a planar 2-D sheet, a helical structure can also be expressed by a single helix or n helices, with the n helices related by an n-fold screw axis instead of just a rotational axis. Because the helical assembly must consist of identical subunits, the rotational part of the screw axis must display a Cn rotational symmetry and the translational part should be limited by some discrete numbers. In the augmented 1-D helical system, the four parameters [n1, n2, φ, δ] indicate that there are n1 helices in the assembly, with each individual helix characterized by a unit twist (φ) and a unit rise (δ). Because the helices are also related by an n1-fold screw axis, each helix denoted by m1 = 0, 1, 2, …, n1 − 1 has an additional twist of m1(2π/n1) and a rise of m1(n2/n1)δ. Note that the rise, which is specified by n2 with a quantity of n2/n1δ, was not included in the traditional 1-D helical system. Note also that m1 = n1 refers back to the first helix as specified by (φ, δ), which will give n2δ rise after a complete round of n1 rotations. In the 2-D helical system, this corresponds to the number of cells involved in the helix sliding after a complete wrapping.
A helical structure in the symmetrical construct is identified by the cell coordinates [m1, m2]. The asymmetric units are given in cell [0, 0] and an [m1, m2] cell is located in the m1 helix m2 units away from the cell [m1, 0] along the helix. If the helical axis is parallel to the y axis and passes through the origin (0, 0, 0), the helical transformation equations for an [m1, m2] cell in an [n1, n2] helical construct are
(xw, yw, zw) are the transformed Cartesian coordinates for the cell [m1, m2] and (xc, yc, zc) are the Cartesian coordinates of asymmetric units in cell [0, 0]. hφ and α specify the overall rise and twist for the [m1, m2] cell as specified by an n1-fold screw axis with n2 unit shift. In the case of a helical structure with a single helix, in which n1 = 1 and m1 = 0, the helical transformation above reduces to a simple helical operation defined by [φ, δ] only. A graphical summary of the augmented 1-D helical transformation is given in Fig. 2.
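The bookkeeping of the augmented 1-D system can be sketched as below. This is one self-consistent reading of the prose and not a transcription of the transformation equations. It assumes that helix m1 is displaced by m1·n2/n1 units along the helical path, so that it carries a twist of (m1·n2/n1)φ in addition to the rotation m1(2π/n1); with this choice m1 = n1 lands exactly on cell [0, n2]. The helical axis is along y, as in the text. Function names and numerical inputs are illustrative.

```python
import math

def cell_twist_rise(m1, m2, n1, n2, phi, delta):
    """Cumulative (twist in radians, rise) of cell [m1, m2] relative to cell [0, 0]."""
    h = m2 + m1 * n2 / n1                       # position along the helix in unit steps
    alpha = h * phi + m1 * 2.0 * math.pi / n1   # helical twist plus n1-fold rotation
    return alpha, h * delta

def place(point, alpha, rise):
    """Rotate a point about the y axis by alpha and shift it by rise along y."""
    x, y, z = point
    ca, sa = math.cos(alpha), math.sin(alpha)
    return (ca * x + sa * z, y + rise, -sa * x + ca * z)

# Hypothetical [11, 3] construct with a made-up unit twist and rise.
alpha, rise = cell_twist_rise(2, 5, 11, 3, 0.1, 4.0)
print(place((10.0, 0.0, 0.0), alpha, rise))
```

For a single helix (n1 = 1, m1 = 0) the function reduces to m2 applications of [φ, δ], in line with the reduction noted above.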
There are four ways to convert a helical system from 2-D to 1-D: view the continuation of lattice edge b as a helix, view the continuation of lattice edge a as a helix or view the continuation along the vector of a + b or along the vector of a − b. The first is the most convenient choice. By selecting the vector b as an individual helix, the new 1-D helical system retains the same symmetry notation as the 2-D helical system [n1, n2]. The unit twist φ (in units of radians) and rise δ of the n1-start helices are calculated as
where (tx, ty) are the components of the helical axis of the 2-D helical system and (xc, yc) are the planar Cartesian coordinates of the origin of cell (0, 1). Because the tube axis of the 1-D helical tube (along the y axis) is different from that of the 2-D helical tube (in the xy plane), the Cartesian coordinates referenced in the 2-D system require some transformations in order to correspond to the new 1-D system. This can be performed either directly in wrapped Cartesian coordinates or in native fractional coordinates. In Cartesian coordinates, the transformation is equivalent to aligning the 2-D helical axis in the xy plane back onto the y axis with the z axis (0, 0, 1) as the rotational axis. Under the right-handed rotational system, the angle θ between the old tube axis and the y axis is calculated as atan(−tx/ty) and the transformations are
where xw1, yw1 are the wrapped Cartesian coordinates of the new 1-D helical system and xw2, yw2 are the wrapped Cartesian coordinates of the old 2-D helical system.
The conversion from a 1-D to a 2-D helical system is not as straightforward as the opposite conversion. This is partly because a 2-D lattice is loosely defined by a single helix structure and partly because of the necessity to revert from the wrapped tube coordinates back to 2-D planar Cartesian coordinates. Given a 1-D helical structure, we first calculate the implicit helical radius from the center of mass of the representative units by assuming that the center of the 1-D helical assembly is located at the origin (0, 0, 0). Secondly, we either use the original n1 of the 1-D system or determine a new n1 for the 2-D helical system and define accordingly two wrapped coordinates, (xw1, yw1, zw1) and (xw2, yw2, zw2), from the 1-D helical system to serve as the origin of cells (1, 0) and (0, 1) of the new 2-D helical system, respectively. Thirdly, we reverse the two wrapped coordinates back to unwrapped planar Cartesian coordinates, (xc1, yc1, zc1) and (xc2, yc2, zc2). With the new calculated planar coordinates, it is straightforward to calculate the 2-D lattice constants as follows:
Fourthly, given the newly determined r, a, b, γ and n1, n2, the new 2-D helical tube can be determined by solving the quadratic equation b²x² + (2n1ab cos γ)x + (n1a)² − (2πr)² = 0. Finally, all Cartesian coordinates of the old 1-D system are reversed back to the new 2-D planar coordinates.
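The third and fourth steps can be sketched numerically. The sketch assumes that the origin of cell (0, 0) sits at the planar origin, so that a and b are the lengths of the unwrapped vectors to the origins of cells (1, 0) and (0, 1) and γ is the angle between them. It also reads the quadratic as the condition |n1a + xb| = 2πr, so that a root x is a candidate for n2. Function names and numbers are illustrative.

```python
import math

def lattice_constants(p1, p2):
    """a, b and gamma (radians) from the planar origins of cells (1, 0) and (0, 1)."""
    a = math.hypot(p1[0], p1[1])
    b = math.hypot(p2[0], p2[1])
    cos_gamma = (p1[0] * p2[0] + p1[1] * p2[1]) / (a * b)
    return a, b, math.acos(max(-1.0, min(1.0, cos_gamma)))

def solve_wrapping_quadratic(n1, a, b, gamma, r):
    """Real roots x of b^2 x^2 + 2 n1 a b cos(gamma) x + (n1 a)^2 - (2 pi r)^2 = 0."""
    qa = b * b
    qb = 2.0 * n1 * a * b * math.cos(gamma)
    qc = (n1 * a) ** 2 - (2.0 * math.pi * r) ** 2
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return [(-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)]

# Made-up planar vectors and radius for illustration.
a, b, gamma = lattice_constants((5.0, 0.0), (0.7, 3.9))
print(solve_wrapping_quadratic(11, a, b, gamma, r=10.0))
```

In practice only a root close to an integer gives a seamless tube, so a non-integer result suggests that r, n1 or the chosen cell origins should be revisited.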
We have shown that both the 1-D and 2-D helical systems can be represented by two integers, [n1, n2], and that the helical assembly can be built through the helical transformation with the associated parameters. However, the helical symmetry specified by these two integers can also be interpreted in a way different from the helical systems' definitions. In the traditional helical description, an assembly with [n1, n2] symmetry can be viewed as two sets of n-start helices in which either the arrangement of the n1-start helices is specified by n2 or, vice versa, that of the n2-start helices is specified by n1. The best way to illustrate [n1, n2] helical symmetry is by using a helical net: an unwrapped flattened 2-D net bound by the circumference in one direction and extended to infinity parallel to the helical axis. Figs. 3(a) and 3(b) illustrate an example of wrapping and unwrapping of the helical net with the EM structure of a microtubule with [11, 3] symmetry (Sui & Downing, 2010). The colored circular dots in the helical net represent asymmetric units and a line passing through a set of dots is a helix. The number of intersections (n) between the set of parallel lines with the circumference is exactly the number n of helices that are required to fill the helical assembly. This is the origin of the n-start helices definition. In terms of a helical net description, the helical symmetry can be specified by picking a particular set of two intersecting lines (helices) corresponding to n1-start and n2-start lines. The intersections define the locations of repeating asymmetric units in the helical structure. In addition to the [11, 3] symmetry, two feasible helical nets with symmetries [8, 3] and [14, 3] are also depicted in Fig. 3(c) for the same helical structure. In the special case of a one-start helix, the entire assembly is built from a single helix instead of a set of helices (n-start helices). 
Although n2 is not required in helical symmetry denoted as a one-start helix, it is still represented in the [1, n2] notation.
Based on the augmented 1-D helical system, it is not difficult to realise that a helical symmetry with [n1, n2, twist, rise] is equivalent to [−n1, −n2, twist, rise], [−n1, n2, −twist, −rise] and [n1, −n2, −twist, −rise]. To reduce the redundancy in the helical symmetry representation, we set several simple rules. Firstly, n1 is always positive. For consistency in the interconversion between the 1-D and 2-D helical systems (with the sign of n2 kept unchanged), we choose the rise to also be positive. Secondly, the absolute value of n2 is always smaller than n1. In this way, the handedness of the n1 helices is determined by the sign of twist: if positive the n1 helix is right-handed, otherwise it is left-handed. The sign of n2 gives the handedness of the n2 helices: if negative it is a right-handed helix, otherwise it is left-handed. To calculate the [twist, rise] of the n2-start helices the helical symmetry can be swapped from [n1, n2] to [n2, n1].
Above, the helical symmetry operation has been applied to asymmetric subunits without first assigning a plausible local symmetry. In order to generate all symmetric subunits in a planar 2-D lattice, the local symmetry can be specified by one of the 17 wallpaper groups. However, the local symmetry in the planar 2-D lattice is largely lost by the [n1, n2] helical transformation; therefore, we only describe pseudo-local symmetry. A local C2 symmetry operation is an exception: not only is it maintained between asymmetric subunits within the unit cell, but also in the entire helical construct. In terms of the 1-D helical system, the C2 symmetry is defined as an additional dyad symmetry (with axial C2 along the z axis and the helical axis along the y axis). To include a local C2 symmetry operation, we can assign a wallpaper group p2 before applying the helical transformation.
A helical symmetry can be described by many [n1, n2] combinations. For a particular preset n2 there are a limited number of n1; similarly, a preset n1 will have a limited selection of n2 for the same helical assembly. To understand the manipulation and limitations of changing from one [n1, n2] to another, the helical net is the best reference. For illustration, we use the [11, 3] helical symmetry of the polymorphic helical structure of the microtubule. In terms of the (h, k; n) notation (Toyoshima & Unwin, 1990; Toyoshima, 2000), the set of n-start helices can be specified by the equation n = hn10 − kn01,
where n10 = 11 and n01 = 3 for the [11, 3] symmetry. Note that the (h, k) index has to be confined within the circumference range. Starting with the [11, 3] symmetry and a fixed n2 = 3, the redundant helical symmetry [n1, 3] can have n1 = h 11 − k 3, with h = ±1. On the other hand, given a fixed n1 = 11, we can have many redundant [11, n2] helical symmetries with n2 = h 11 − k 3 and k = ±1. In the case of (h = 1, k = ±1), we have new redundant helical symmetries of [8, 3] and [14, 3] (Fig. 3c), respectively. The complete list of redundant [n1, 3] symmetries with h = 1 are [26, 3], [23, 3], [20, 3], [17,3], [14,3], [11,3], [8, 3], [5, 3], [2, 3], [−1, 3] and [−4, 3]. For [n1, 3] there is an infinite number of helical symmetries in a planar 2-D lattice rather than just the 11 sets restricted by the circumference. In Fig. 3(c), the corresponding redundant sets of helical symmetries are noted next to the helical dots. For the redundant symmetry [−1, 3], we have created an equivalence between 11-start helices [11, 3] and one-start helix [−1, 3] (or [1, −3]) symmetry.
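The list of redundant [n1, 3] symmetries quoted above follows directly from n = hn10 − kn01 with h = 1. A short enumeration, with a function name chosen only for this illustration, reproduces it for k running from −5 to 5.

```python
def start_number(h, k, n10, n01):
    """Start number n of the (h, k) family of helices: n = h*n10 - k*n01."""
    return h * n10 - k * n01

# [11, 3] symmetry with h = 1 and k = -5 ... 5.
redundant_n1 = [start_number(1, k, 11, 3) for k in range(-5, 6)]
print(redundant_n1)   # [26, 23, 20, 17, 14, 11, 8, 5, 2, -1, -4]
```

The enumeration says nothing about which of these candidates fall within the circumference range; that check requires the twist of the corresponding helix.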
In order to check whether the (h, k) index is within the circumference, we convert the 1-D system to its equivalent planar 2-D helical net. It is then straightforward to calculate the helical twist φ of the new n-start helix. If |φ| is less than π then it falls within the circumference range.
In helical structure determination by X-ray fibre diffraction or EM based on the Fourier–Bessel method (in reciprocal space), the first step is indexing the layer-line diffraction pattern to a specified helical symmetry. There are two possible systems for indexing a diffraction pattern. In the first, assuming that a helical assembly can be described by a single (one-start) helix, the 'selection rule' l = tn + um can be utilized to assign (n, l) pairs to layer-lines in which each layer-line is associated with a set of n-start helices. A more general formalism using (n, Zl) instead of (n, l), which removes the requirement for t/u to be a rational number, is more appropriate for fibre diffraction. However, for simplicity, we prefer to use the (n, l) system here. A successful layer-line indexing then gives the helical organization as the selection rule implies: u units require t turns of the one-start helix to complete a true repeat with a rise distance of c. A second more systematic (h, k; n) indexing system (Toyoshima & Unwin, 1990; Toyoshima, 2000) interprets diffraction patterns based on the helical surface lattice. In a planar 2-D lattice, diffraction by a set of lines gives a row of dots in reciprocal space. Therefore, it is straightforward to determine a 2-D planar symmetry from an ideal diffraction pattern. For a helical structure which is obtained by wrapping of a 2-D lattice, a set of lines now becomes a set of helices and the corresponding diffraction dots become layer-lines. To define a surface lattice, two indices, (1, 0; n10) and (0, 1; n01), are first assigned where n is the start number of the associated helices, which can be estimated from its peak position in the layer-line diffraction (Toyoshima & Unwin, 1990; Toyoshima, 2000). If the remaining layer-lines can be indexed and related by the equation n = hn10 − kn01 then the helical symmetry is determined.
Fig. 4 illustrates the relationship between the new [n1, n2] helical scheme and the symmetry in both indexing systems. The nodes in Fig. 4 represent (i) asymmetric units based on a simple helical structure which is described by a one-start helix (t = 4 and u = 13), i.e. with 13 units and four turns completing a true repeat of distance c, and (ii) a simplified diffraction pattern of the same helical structure. However, instead of the layer-line pattern for n-order Bessel diffraction, each dot gives the position of (n, l) diffraction where the layer-line pattern has a maximum diffraction peak at the ∼n + 2 position (Diaz et al., 2010). In the figure, [n1, n2] of the new helical scheme correspond to the choice of n10 and n01 in the (h, k; n) indexing system. The same (t = 4 and u = 13) structure is related by two different helical systems [3, 1] and [3, −2] in Figs. 4(a) and 4(b), respectively, for a simplified diffraction pattern in terms of n and −l. The figure only shows one fourth of the diffraction pattern. For example, the n = 4, l = 3 diffraction is at the left-upper corner of the figure without a label of (h, k; n, l, m).
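The selection rule l = tn + um can be checked numerically for the t = 4, u = 13 helix discussed above. The sketch below lists the Bessel orders n allowed on a given layer-line l, up to a cutoff n_max that is chosen arbitrarily for illustration; the function name is hypothetical.

```python
def allowed_bessel_orders(l, t, u, n_max):
    """Bessel orders n with |n| <= n_max that satisfy l = t*n + u*m for some integer m."""
    return [n for n in range(-n_max, n_max + 1) if (l - t * n) % u == 0]

# One-start helix with 13 units in four turns.
orders_l3 = allowed_bessel_orders(3, t=4, u=13, n_max=13)
print(orders_l3)   # [-9, 4]
```

The n = 4, l = 3 reflection mentioned for Fig. 4 corresponds to m = −1, since 4 × 4 + 13 × (−1) = 3.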
There is no clear-cut advantage in treating a helical assembly as a 1-D or a 2-D system; both systems have pluses and minuses. However, we believe that the augmented 1-D helical system is more suitable than the 2-D system for describing a helical structure, even though the two systems are equivalent and interchangeable. There are two reasons for favoring the 1-D helical system for describing a helical symmetry. Firstly, the 1-D helical system is simpler than the 2-D system, with one fewer parameter. Secondly, the 1-D helical scheme is independent of the helical radius, while the surface lattice parameters (a, b, γ) in the 2-D system will change with different radii. Here, based on a 1-D helical system we suggest general guidelines for helical structure representation. With our guidelines, if the assembly units can be unambiguously defined and follow the helical paths, each helical assembly is expected to provide a unique symmetry [n1, n2, φ, δ] that also explicitly reflects the helical structural characteristics.
In terms of a helical net, a helical structure is composed of a set of n helices in which the individual helix is named an n-start helix. If a helical structure can be expressed by just a single helix, it is a one-start helical structure. In the augmented 1-D helical system, the [n1, n2] representation implies that the organization of the n1 helices is specified by n2, with the individual n1-start helix defined by [φ, δ]. We can also swap the representation to say that there are n2 helices in the structure related by n1. Since there are many [n1, n2] combinations for a particular helical structure, the first and the most important guideline is to define the rule for choosing a unique [n1, n2] specification. In order to reflect the helical structural characteristics, the rule states that only protofilaments will be candidates for the [n1, n2] selection. In our definition, if adjacent asymmetric subunits in an assigned helix are in physical contact, this helix is a protofilament. Therefore, we first sort protofilaments according to the extent of contacts between adjacent asymmetric subunits. Of the best four protofilaments, the one with the twist angle closest to zero is set as the primary protofilament n1 and the next best protofilament is selected as the secondary protofilament n2. Note that n1 is always larger than |n2| under this guideline.
To ensure a unique helical symmetry representation for a helical assembly, redundancy needs to be reduced to singular [n1, n2, φ, δ]. The reduction guideline requires that n1, δ > 0 and n1 > |n2|. In the case of n1 < 0, one can apply the equivalent rule that the new [n1, n2] = [−n1, −n2]. If δ is negative, one can simply apply the equivalent rule [n1, n2, φ, δ] = [n1, −n2, −φ, −δ] to make it a positive value.
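The two reduction rules can be written as a short routine. The function names are hypothetical, and the sample input simply reuses numbers of the kind listed in this paper with the signs flipped for illustration.

```python
def reduce_symmetry(n1, n2, twist, rise):
    """Apply the equivalence rules so that n1 > 0 and rise > 0."""
    if n1 < 0:
        n1, n2 = -n1, -n2                            # [n1, n2] = [-n1, -n2]
    if rise < 0:
        n2, twist, rise = -n2, -twist, -rise         # [n1, -n2, -twist, -rise]
    return n1, n2, twist, rise

def satisfies_guideline(n1, n2, twist, rise):
    """True when n1 > 0, rise > 0 and n1 > |n2|."""
    return n1 > 0 and rise > 0 and n1 > abs(n2)

reduced = reduce_symmetry(-10, 5, 5.5, 32.0)
print(reduced, satisfies_guideline(*reduced))   # (10, -5, 5.5, 32.0) True
```

The routine only removes sign redundancy. Choosing between different [n1, n2] pairs that describe the same lattice still relies on the protofilament guideline.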
There are many helical filaments and tubular structures in the PDB which have been solved either directly by X-ray fibre diffraction or by fitting individual crystal structures into cryo-EM density maps. Similar to X-ray crystal structures where only the coordinates of the asymmetric units are included in the PDB file, most of the helical structures deposited in the PDB also contain only asymmetric units. Therefore, in principle, the entire helical structures should be constructed from the deposited asymmetric units by a specified helical symmetry. In the case of crystal structures, a space group and six lattice constants are defined in the keyword `CRYST1' in the PDB for calculating all symmetric units in the unit cell. However, owing to the lack of a simple, complete and widely accepted system for helical symmetry, no keyword has been set to define the helical symmetry and helical parameters are implicitly stated in the comments. Furthermore, the creation of the entire helical structure relies on a set of translational and rotational matrices which are hard-coded in the PDB.
We have applied the augmented 1-D helical scheme along with the suggested guidelines to all helical structures deposited in the PDB. A small portion of the results are given in Table 1 and a complete list is available on the web at http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html . The newly determined helical parameters [n1, n2, twist, rise] not only directly reflect the helical characteristics but also provide sufficient information for constructing an entire helical structure from given asymmetric units. Here, we propose four helical parameters in a new keyword named HELSYM in the PDB for the specified helical symmetry to avoid using matrices and comments when specifying a helical symmetry.
In the PDB, the axial symmetry of a helical structure is conventionally along the z axis and passes through the origin (0, 0, 0). The helical symmetry specified in the PDB usually follows the rotohelical description, which provides the helical parameters (φ, δ) for a single helix or n helices related by a Cn rotational axis. The manually extracted data, the helical twist φ and the helical rise δ, are first verified against the helical transformation matrices if also given in the PDB file. The corresponding augmented 1-D helical parameters will then be either [1, 0, φ, δ] or [n, 0, φ, δ] for one-start helices or n-start helices, respectively. We then use our graphics tool named PNAS (Protein Nanoscale Architecture by Symmetry), in-house software running both under Linux and Windows, to search for the first four protofilaments with the largest contact between the asymmetric units and determine their n1, n2, φ, δ values accordingly. Next, we select among them the helical protofilament with the lowest absolute value of twist angle as the primary n1 helix and its associated twist and rise are set as the 1-D helical parameters [φ, δ]. Finally, the highest contact protofilament other than the chosen primary helix is assigned as the secondary n2 helix to complete the determination of [n1, n2, φ, δ].
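The selection step can be sketched independently of the PNAS software, whose internals are not described here. The sketch assumes that each candidate helix has already been scored, and represents it as a dictionary with hypothetical keys 'starts', 'contacts', 'twist' and 'rise'. The contact values in the example are invented; only the twist and rise values echo numbers quoted in this paper. The sign of n2 is not determined by this step.

```python
def choose_primary_secondary(candidates):
    """Pick the primary and secondary protofilaments from scored candidates.
    Needs at least two candidates; each is a dict with (hypothetical) keys
    'starts', 'contacts', 'twist' and 'rise'."""
    # Keep the four candidates with the largest contact between adjacent subunits.
    top4 = sorted(candidates, key=lambda c: c["contacts"], reverse=True)[:4]
    # Primary: the smallest absolute twist among those four.
    primary = min(top4, key=lambda c: abs(c["twist"]))
    # Secondary: the highest-contact candidate other than the primary.
    secondary = next(c for c in top4 if c is not primary)
    return primary, secondary

candidates = [
    {"starts": 5, "contacts": 120, "twist": 38.8, "rise": 16.0},   # contacts invented
    {"starts": 10, "contacts": 80, "twist": 5.5, "rise": 32.0},    # contacts invented
    {"starts": 5, "contacts": 0, "twist": -33.2, "rise": 16.0},    # no contact
]
primary, secondary = choose_primary_secondary(candidates)
print(primary["starts"], secondary["starts"])   # 10 5
```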
In Fig. 5, the helical assembly of the bacteriophage major coat protein (PDB entry 1ifd; Marvin, 1990) is assigned into three different 1-D helical symmetries: [5, 0, −33.2, 16.0], [5, 0, 38.8, 16.0] and [10, −5, 5.5, 32.0]. The first symmetry [5, 0] corresponds to the rotohelical assignment in the PDB. Apparently, each individual helix is not a protofilament since no contact between helical subunits (shown in the same color) is observed. Therefore, the helical structural characteristics will not be conveyed clearly from its helical parameters. The second [5, 0] symmetry is based on the first protofilament with the largest number of contacts between helical subunits. However, the guideline suggests using the [10, −5] helical symmetry to represent this structure. This symmetry is advantageous for three reasons: firstly, the [10, −5] symmetry corresponding to the second and first protofilaments in the structure presents the best structural characteristics, unlike the second [5, 0] assignment which only contains information for the first protofilament; secondly, by looking down the helical axis the structure is composed of ten helices, not just five; and thirdly, another inovirus coat protein (PDB entry 1hgv; Pederson et al., 2001) also gives a similar structure with [11, −6] helical symmetry. Here, the primary (11-start) helix is the first protofilament and the secondary (six-start) helix is the second protofilament. These two examples show that the augmented 1-D helical representation not only describes similar helical structures by similar parameters but at the same time also differentiates between similar helical organizations. In Fig. 6, three additional helical structures are depicted in 1-D symmetry. Pictorial descriptions with 1-D helical symmetry for the complete list of known helical structures can be accessed from links on the webpage http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html .
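The three assignments for PDB entry 1ifd are related by simple arithmetic, assuming that screw operations about a common axis compose by adding their twists and rises. Twists are in degrees and the function name is hypothetical. The inputs are the rounded values quoted above, so the 10-start twist comes out near 5.6 rather than the listed 5.5.

```python
def compose(op_a, op_b):
    """Compose two screw operations (twist in degrees, rise) about the same axis."""
    return (op_a[0] + op_b[0], op_a[1] + op_b[1])

deposited = (-33.2, 16.0)   # rotohelical assignment [5, 0, -33.2, 16.0]
c5 = (72.0, 0.0)            # one step of the fivefold rotation

# One deposited step combined with the fivefold rotation: about (38.8, 16.0).
second = compose(deposited, c5)
# Two deposited steps combined with the fivefold rotation: about (5.6, 32.0).
two_steps = compose(compose(deposited, deposited), c5)
print(second, two_steps)
```

The first result matches the second [5, 0] assignment and the second result matches the [twist, rise] of the 10-start helices to within rounding.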
For helical structures deposited in the EMDB (Lawson et al., 2011), only the helical classification is indicated; no helical symmetry is explicitly given in the data bank. However, it is not difficult to deduce the 1-D helical parameters if the helical axis can be determined from the EM density map. The graphics tool PNAS can be utilized to assign 1-D helical symmetry to the EM structure. Firstly, we determine the location of the primary protofilament by visual inspection of the density map when shown in various isosurface presentations. Looking down the EM map along the helical axis, the number of assigned protofilaments can be counted to give the n1 helical parameter. PNAS then determines the position of the helical axis using either the given map center or the calculated coordinates of the center of density. Next, we determine the [φ, δ] pair for the visually assigned primary protofilament by calculating the correlation coefficient (Grubisic et al., 2010) between the original map density and the helically transformed density specified by a pair of manually adjustable parameters [φ, δ]. In this procedure, we follow the guideline to keep the helical twist as close to zero as possible and at the same time change the twist and rise to reach the best match, as guided by visual superimposition between the original and the transformed density map in isosurface presentation. The optimal correlation coefficient should be very close to 1.0. Now, we can use the newly determined helical parameters n1, φ and δ to determine n2: simply try integer numbers between −n1 and n1 and perform the 1-D helical transformation to determine n2 from the result of the superimposition as stated above.
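The n2 search described above can be outlined as a simple scan. The sketch below is an assumption-laden illustration rather than the PNAS procedure: a plain Pearson coefficient stands in for the correlation measure of Grubisic et al. (2010), whose exact definition is not reproduced here, and `score` is a hypothetical callback that would apply the 1-D helical transformation for a trial n2 and return the resulting map correlation. The scan is restricted to |n2| < n1.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation between two equally sized lists of density values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def scan_n2(n1, score):
    """Return the integer n2 with |n2| < n1 that maximises score(n2).

    `score` is a hypothetical callback: it is expected to transform the map
    with the trial n2 and return its correlation with the original map.
    """
    return max(range(-n1 + 1, n1), key=score)


# Toy stand-in for a real map comparison: the score peaks at n2 = 5.
best_n2 = scan_n2(10, lambda n2: -abs(n2 - 5))
```

In practice the callback would sample the original and transformed densities on the same grid and pass the two flattened arrays to `correlation`.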
To illustrate the procedure of 1-D helical symmetry determination, four superimpositions between the original EM map (EMD-1240; bacteriophage fd coat protein B; Wang et al., 2006) and the transformed density maps relating to the four stages are given in Fig. 7. The first two superimpositions illustrate the determination of [φ, δ] for the assigned primary protofilament. The last two superimpositions illustrate the determination of the secondary protofilament n2, giving the 1-D helical symmetry [10, 5, 2.6, 34.8]. The 1-D helical symmetry for each helical EM map deposited in the EMDB has been determined with the graphics tool PNAS. Some of the results are listed in Table 1 and a complete list is reported on the webpage http://protein3d.ncifcrf.gov/helicalSymmetry/table1.html. To highlight the importance of a comprehensive helical scheme, Table 2 provides a comparison between the reported helical symmetries determined in the EM reconstruction and the new helical symmetries for six polymorphic helical structures of the microtubule. The results clearly show that the inherent structural characteristics of the microtubule obtained by the new helical scheme directly reveal polymorphic ensembles. The very similar surface lattice within different helical symmetries implies very similar subunit–subunit interactions, which the microtubule uses to assemble into divergent helical organizations.
A helical description using four parameters [n1, n2, φ, δ], determined according to the augmented 1-D helical symmetry guidelines (in §2), provides the helical signature of the structure. This is because the two sets of defined helices, the n1-start and the n2-start, correspond to the two sets of protofilaments. However, will the guidelines also give a unique [n1, n2] combination for a given helical structure? In most cases the answer is yes, as illustrated by the example below. The docked atomic model of the bacterial flagellar hook (Fujii et al., 2009; PDB entry 3a69) contains an asymmetric subunit with three protein domains spanning the inner, middle and outer layers of the helical cryo-EM map. In terms of individual protein domains, the best protofilament of each domain yields an 11-start, five-start and six-start helix, respectively, from the inner to the outer layers. Even though different helical descriptions of different layers are observed, the guidelines still give an unambiguous helical symmetry of [11, −6, −7.31, 45.32] for this structure. The assignment is based on two clear elements in the structural data: the 11-start helix (the third protofilament in the protein) has a twist angle closest to zero and the six-start helix is the first protofilament. In Fig. 8, each color represents an assigned helix and the assembly illustrates six, five and 11 helices, respectively, for the [6, −1], [5, −1] and [11, −6] symmetries.
Not all helical structures have unambiguous primary protofilaments, especially when the growth mechanism does not follow a helical path. The tubular structure of the HIV-1 capsid protein (CA; Byeon et al., 2009) is such an example. In solution, CA forms a dimer via the association of its C-terminal domain (CTD). The cryo-EM tubular structure (EMD-5136) reveals that the basic unit is a trimer of CA dimers with a pseudo-threefold at the CTD–CTD interfaces and that the CA dimer is shared between two trimers. Following our guideline, we obtain a helical symmetry of [24, 13, 7.39, 165.78] for the CA tubular structure. The unit cell depicted by the [24, 13] symmetry does not correspond to the observed CA hexamer; however, after applying the symmetry-manipulation rules (n1 = n1 − n2, [n1, n2] swapping and n2 = n2 + n1) the new helical symmetry of [13, 2, −11.00, 89.80] gives the cell dimensions of the hexamer. The surface lattices for both helical symmetries are highlighted in red in Fig. 9. The fact that the assigned asymmetric units in both helical symmetries ([24, 13] and [13, 2]) do not correspond to the assembly unit implies that the path of a trimer of CA dimers is not helical. In this case, our guidelines fail to offer an unambiguous helical specification.
The guidelines have two limitations in fulfilling the aim that every helical structure would have a unique [n1, n2, φ, δ] helical symmetry. The first arises when the primary protofilament is ambiguous, as discussed above, and the second is encountered when there is a continuous helical density along a protofilament in the cryo-EM structure rather than a clear boundary between asymmetric units. Under such circumstances, for a determined [n1, n2] symmetry the helical structure can be described by an infinite number of [φ, δ] pairs, which are always related by a constant. The helical structure of the tubular Aβ1–42 amyloid with a hollow core (Miller et al., 2010; Zhang et al., 2009) is an example of this limitation. The cryo-EM structure gives a [2, 0] (or [2, 1]) helical symmetry and the two helical parameters [φ, δ] = [−3.75c, 4.8c], where c is a constant.
The diffraction patterns of helical structures consist of a series of layer-lines. Assuming that the layer-lines do not overlap, each layer-line is the result of diffraction by a set of n-start helices. The position of the peak with the maximum diffraction intensity in each layer-line can be indexed to correspond to a node in the helical net. The relationship between the 1-D helical system [n1, n2] symmetry and the diffraction pattern is detailed in Figs. 4(a) and 4(b). The two figures illustrate the different assignments of n10 and n01, which give different helical symmetries, [3, 1] and [3, −2], for the same structure that has a simple helical symmetry of t = 4 and u = 13. The assignment of n10 with the first peak close to the equator (n) of the diffraction pattern is consistent with the guideline for selecting the n1-start helices with a twist angle closest to zero. The assignment of n01 to the position of the diffraction which is close to the origin is also likely to constitute a main protofilament of the given helical structure.
From the rotohelical transformation, we learnt that a single (one-start) helix description is not always sufficient to generate the entire helical structure from given asymmetric units. The helix may need to be related by a Cn rotational symmetry, which implies that a minimum of n helices are required to cover the entire helical assembly. Given an [n1, n2] symmetry, there are n1 or n2 assigned helices for the entire helical description. To determine the minimal number of helices for complete structural description (or to be correlated with the rotohelical transformation), the [n1, n2] symmetry is reduced to an equivalent symmetry with n2 = 1 or 0. In the case of a reduced [n1, 0] symmetry, the new n1 is the minimal number of helices.
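As a concrete illustration of the rotohelical description referred to above, the following sketch generates copies of one point for an [n, 0, φ, δ] symmetry with the helical axis along z through the origin. It is a minimal sketch under stated assumptions: the function name is hypothetical, a positive twist is taken as a counterclockwise rotation about +z, and the general augmented transformation of Fig. 2 (with n2 ≠ 0) is not reproduced.

```python
from math import cos, sin, radians


def rotohelical_copies(point, n, twist, rise, steps):
    """Copies of `point` under a Cn axis combined with a (twist, rise) helix.

    Assumes the helical axis is the z axis through the origin, the twist is
    in degrees and a positive twist rotates counterclockwise about +z.
    """
    x, y, z = point
    copies = []
    for j in range(n):          # the n helices related by the Cn axis
        for m in range(steps):  # m successive helical operations
            angle = radians(j * 360.0 / n + m * twist)
            copies.append((
                x * cos(angle) - y * sin(angle),
                x * sin(angle) + y * cos(angle),
                z + m * rise,
            ))
    return copies


# Five helices of three subunits each, using the [5, 0, -33.2, 16.0] example.
copies = rotohelical_copies((10.0, 0.0, 0.0), 5, -33.2, 16.0, 3)
```

Each of the n helices contributes `steps` subunits, so the number of helices needed to cover the assembly appears directly as the outer loop count.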
It is straightforward to deduce the minimal number of helices for a given [n1, n2] helical symmetry. If the numbers n1 and n2 do not have a common factor, the symmetry can always be reduced to a one-start helix description by using a combination of the swap and the equivalence rules of n = h n10 − k n01 as described in §2. For example, [7, 3] can be reduced to [1, 3] with h = 1, k = 2, since 1 × 7 − 2 × 3 = 1. Otherwise, the largest common factor of n1 and n2 is the minimal number of helices for a complete helical structure description. For example, [8, 4] can be reduced to [4, 0], where 4 is the largest common factor of 4 and 8.
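The reduction above amounts to computing a greatest common divisor. A short sketch, assuming the relation n = h n10 − k n01 quoted in the text and using hypothetical function names, finds both the minimal number of helices and one pair (h, k) that produces it:

```python
from math import gcd


def minimal_helices(n1, n2):
    """Minimal number of helices: the largest common factor of |n1| and |n2|."""
    return gcd(n1, n2)


def reduction_coefficients(n1, n2):
    """Return (g, h, k) with h*n1 - k*n2 == g == gcd(n1, n2).

    Extended Euclidean algorithm; intended for positive integers n1 and n2.
    """
    old_r, r = n1, n2
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_a, a = a, old_a - q * a
        old_b, b = b, old_b - q * b
    # old_a*n1 + old_b*n2 == old_r, so k is the negative of old_b.
    return old_r, old_a, -old_b


g, h, k = reduction_coefficients(7, 3)  # g == 1 with h == 1, k == 2
```

The pair (h, k) is not unique; any solution of h n1 − k n2 = g describes the same reduced helix family.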
Despite the fact that so many helical structures have been determined, a universal formulation for representing helical symmetry is still lacking. The absence of agreement in the community has been attributed to three main reasons. The first apparent reason is a consequence of the fact that helical symmetries have been formulated in distinct ways to fulfill a particular requirement or convenience in different structure-determination methods. The diversified helical representations can be classified into two commonly adopted helical schemes named the 1-D and 2-D helical systems. In this study, the two helical schemes were unified into a single helical specification by two constants [n1, n2] and we have shown that the two systems are interchangeable and complementary to each other. Because of the simplicity of using one less parameter and the lack of involvement of the axial radius, we suggest using the augmented 1-D helical system with four parameters [n1, n2, twist, rise] for representing a helical structure.
The second hurdle for defining a helical description is that a helical structure can be pictured in many ways, i.e. in many [n1, n2] combinations as two (n1-start and n2-start) sets of helices. However, in principle, the generalized guidelines for describing a helical symmetry are expected to give a unique [n1, n2] specification that reflects the characteristics of the structure, although in a limited number of cases a unique specification is impossible.
The fact that no standard helical symmetry has been accepted so far can be attributed to the last obstacle: a complete coverage of helical description includes the capability of handling helical discontinuity (a seam). However, building an entire helical construct with a seam from given asymmetric units requires no additional modification in our formulation of helical transformation. Instead, a helical structure with a seam is simply reflected in the value of n2. By definition, the helical discontinuity indicates that n2 is no longer an integer but a rational number.
An implicit requirement of the 2-D helical system (§2.1) is that in a seamless helical arrangement [n1, n2] must be specified by integer numbers. By treating two consecutive asymmetric subunits in the primary protofilament as a new single asymmetric subunit, the new augmented 1-D helical symmetry becomes [n1, n2/2, 2φ, 2δ], which is equivalent to the original helical symmetry [n1, n2, φ, δ] except that the asymmetric units are doubled in size. When the tubulin subunit is treated not as a dimer of αβ subunits but as a single subunit by ignoring the small difference between the α and β subunits (Sui & Downing, 2010), we do not encounter the microtubule seam problem. However, when treating the αβ dimer as an asymmetric subunit in the new 1-D helical symmetry, helical structures with an odd number for the n2 symmetry (in single subunit representation) create a seam with a new rational n2.
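The doubling of the asymmetric unit can be written out directly with exact rational arithmetic. In this sketch the function name is hypothetical, and the single-subunit microtubule symmetry [13, 3, 0.0, 40.0] is an assumption obtained by halving the dimer-based value quoted in the text for EMD-5038.

```python
from fractions import Fraction


def merge_pairs(n1, n2, twist, rise):
    """[n1, n2, twist, rise] -> [n1, n2/2, 2*twist, 2*rise].

    Treats two consecutive subunits of the primary protofilament as one
    asymmetric unit; n2 is kept as an exact fraction so that a seam
    (non-integer n2) stays visible.
    """
    return [n1, Fraction(n2) / 2, 2 * twist, 2 * rise]


# Assumed single-subunit description of a 13-protofilament microtubule.
dimer_symmetry = merge_pairs(13, 3, 0.0, 40.0)

# An even n2 stays an integer after merging, so no seam is introduced.
seamless_symmetry = merge_pairs(14, 2, 0.0, 40.0)
```

Keeping n2 as a `Fraction` rather than a float avoids rounding and makes the odd-n2 case show up as a denominator of 2.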
The microtubule EM structure (Cochran et al., 2009; EMD-5038) presents such a helical discontinuity when treating the dimer of αβ subunits as the asymmetric unit. The augmented 1-D helical symmetry in four parameters [13, 3/2, 0.0, 80.0] is sufficient to generate the entire helical structure with a seam, based on the helical transformation matrix summarized in Fig. 2. Owing to the helical discontinuity, the repetitive asymmetric unit is no longer an identical unit. Instead, a complete round of n1 subunits (13 dimers in the microtubule case) now constitutes the identical unit in the helical structure with a seam. Therefore, the subunit coordination index [m1, m2] can no longer have an index with m1 ≥ n1 when applying the helical transformation to generate the repetitive subunits for a helical structure with a seam.
A seam in a helical structure can be classified visually with respect to its helical axis into a strictly vertical seam or a seam that wraps around the helix. The microtubule case above is an example of a vertical seam. Under the restriction that only a rational n2 and integer n1 > |n2| are allowed in the augmented 1-D helical representation, the corresponding helical structure always produces a vertical seam and the handedness of the seam is determined by the sign of n2, with positive indicating a left-handed seam and negative a right-handed seam. In contrast, a seam described by a rational n1 > |n2| and integer n2 should correspond to the type of seam that wraps around the helix.
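The classification rules stated above can be collected into a small lookup. This is only a sketch of the rules as worded in the text; the function name and the returned labels are hypothetical, and the case in which both n1 and n2 are non-integer is not discussed in the text, so it is reported as unclassified.

```python
from fractions import Fraction


def classify_seam(n1, n2):
    """Classify an [n1, n2] pair following the rules quoted in the text."""
    n1, n2 = Fraction(n1), Fraction(n2)
    if n1 <= abs(n2):
        raise ValueError("the rules assume n1 > |n2|")
    n1_is_integer = n1.denominator == 1
    n2_is_integer = n2.denominator == 1
    if n1_is_integer and n2_is_integer:
        return "seamless"
    if n1_is_integer:
        # Integer n1 with rational n2: vertical seam, handedness from sign.
        handedness = "left-handed" if n2 > 0 else "right-handed"
        return "vertical seam, " + handedness
    if n2_is_integer:
        # Rational n1 with integer n2: seam that wraps around the helix.
        return "wrapping seam"
    return "unclassified"


microtubule = classify_seam(13, Fraction(3, 2))
```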
Both the 1-D and 2-D helical systems are designed to create helical assemblies from asymmetric subunits with specified helical parameters. The conformational heterogeneity of molecular assemblies is known to set limits on solving cryo-EM structures at high resolution. Polymorphism is particularly problematic in the determination of structures with helical symmetry, since even a slight deviation in the interactions between two asymmetric subunits will create distinct structures with different symmetries. We have seen such an example in Table 2 for the microtubule structure. The question is whether all such polymorphic structures can be generated from a single helical structure given as an atomic model or an EM map. The answer is yes, because the interactions between the asymmetric subunits are preserved in the definition of the 2-D helical system. Thus, to create distinct polymorphic structures with almost the same subunit–subunit interactions we only need to change the specific [n1, n2] helical symmetry.
In this paper, we give two helical formulations (augmented 1-D and 2-D) to describe a helical structure. Unlike the rotohelical transformation (1-D formulation) with a helical plus an additional rotational operation, a new augmented 1-D formulation with two consecutive helical operations enables unification with the widely adopted 2-D formulation, giving a common helical symmetry descriptor with two integers [n1, n2]. The new formulation requires only four parameters [n1, n2, twist, rise] for the augmented 1-D helical system and five parameters [n1, n2, a, b, γ] for a 2-D helical system to generate the entire structural assembly from given asymmetric units. We propose using the augmented 1-D helical system with four parameters to describe a helical structure owing to its simplicity and independence from the helical radius compared with the 2-D helical system.
In terms of a helical net representation, a helical structure with an [n1, n2] symmetry indicates that its organization is specified by two sets of helices (n1-start and n2-start). Because many different [n1, n2] combinations exist for the same structure, we suggest general guidelines for selecting a unique [n1, n2] symmetry which reflects the structural characteristics of a given helical structure. We provide a computational graphics tool for this purpose which can be used for any helical structure determined by X-ray fibre diffraction or EM imaging.
While there are multiple ways to construct equations that generate the same helical structure, an [n1, n2, twist, rise] description provides the following advantages: firstly, it provides full helical coverage, including a helical discontinuity (seam) which is indicated by a rational n2; secondly, it reflects the structural characteristics of the assembly (formation mechanism) directly by four helical parameters; that is, similar structures give similar parameters; thirdly, the unnecessary error in reproducing the entire helical structures, such as editing wrong transformation matrices in the PDB or in the deposited EM parameters in the EMDB, will be prevented; and lastly, the new helical symmetry is expected to be useful for maintaining a pre-determined helical symmetry in structural refinement as well as for the generation of all `meaningful' polymorphic structural assemblies from a given helical atomic model or EM density map.
We would like to thank Dr Edward Egelman for discussions and in particular for his insightful comments, which helped us in improving the paper. This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health under contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. This research was supported (in part) by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
Byeon, I.-J., Meng, X., Jung, J., Zhao, G., Yang, R., Ahn, J., Shi, J., Concel, J., Aiken, C., Zhang, P. & Gronenborn, A. M. (2009). Cell, 139, 780–790.
Cochran, J. C., Sindelar, C. V., Mulko, N. K., Collins, K. A., Kong, S. E., Hawley, R. S. & Kull, F. J. (2009). Cell, 136, 110–122.
Damnjanović, M., Nikolić, B. & Milošević, I. (2007). Phys. Rev. B, 033403.
DeRosier, D., Stokes, D. L. & Darst, S. A. (1999). J. Mol. Biol. 289, 159–165.
Diaz, R., Rice, W. J. & Stokes, D. L. (2010). Methods Enzymol. 482, 131–165.
Egelman, E. H. (2007). J. Struct. Biol. 157, 83–94.
Egelman, E. H. (2010). Methods Enzymol. 482, 167–183.
Evans, P. R. (2001). Acta Cryst. D57, 1355–1359.
Fujii, T., Kato, T. & Namba, K. (2009). Structure, 17, 1485–1493.
Grubisic, I., Shokhirev, M. N., Orzechowski, M., Miyashita, O. & Tama, F. (2010). J. Struct. Biol. 169, 95–105.
Heymann, J. B., Chagoyen, M. & Belnap, D. M. (2005). J. Struct. Biol. 151, 196–207.
Kikkawa, M. (2004). J. Mol. Biol. 343, 943–955.
Klug, A., Crick, F. H. C. & Wyckoff, H. W. (1958). Acta Cryst. 11, 199–213.
Lawson, C. L. et al. (2011). Nucleic Acids Res. 39, D456–D464.
Lawson, C. L., Dutta, S., Westbrook, J. D., Henrick, K. & Berman, H. M. (2008). Acta Cryst. D64, 874–882.
Marvin, D. A. (1990). Int. J. Biol. Macromol. 12, 125–138.
Miller, Y., Ma, B., Tsai, C.-J. & Nussinov, R. (2010). Proc. Natl Acad. Sci. USA, 107, 14128–14133.
Pederson, D. M., Welsh, L. C., Marvin, D. A., Sampson, M., Perham, R. N., Yu, M. & Slater, M. R. (2001). J. Mol. Biol. 309, 401–421.
Sachse, C., Chen, J. Z., Coureux, P. D., Stroupe, M. E., Fändrich, M. & Grigorieff, N. (2007). J. Mol. Biol. 371, 812–835.
Sachse, C., Fändrich, M. & Grigorieff, N. (2008). Proc. Natl Acad. Sci. USA, 105, 7462–7466.
Schmidt, M., Sachse, C., Richter, W., Xu, C., Fändrich, M. & Grigorieff, N. (2009). Proc. Natl Acad. Sci. USA, 106, 19813–19818.
Stewart, M. (1988). J. Electron Microsc. Tech. 9, 325–358.
Sui, H. & Downing, K. H. (2010). Structure, 18, 1022–1031.
Toyoshima, C. (2000). Ultramicroscopy, 84, 1–14.
Toyoshima, C. & Unwin, N. (1990). J. Cell Biol. 111, 2623–2635.
Tsai, C.-J., Zheng, J. & Nussinov, R. (2006). PLoS Comput. Biol. 2, 311–319.
Wang, Y. A., Yu, X., Overman, S., Tsuboi, M., Thomas, G. J. Jr & Egelman, E. H. (2006). J. Mol. Biol. 361, 209–215.
Watson, J. D. & Crick, F. H. (1953). Nature (London), 171, 737–738.
Zhang, R., Hu, X., Khant, H., Ludtke, S. J., Chiu, W., Schmid, M. F., Frieden, C. & Lee, J.-M. (2009). Proc. Natl Acad. Sci. USA, 106, 4653–4658.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.