research papers
a cautionary tale
(a) The Eppley Institute for Research in Cancer and Allied Diseases, 987696 Nebraska Medical Center, Omaha, NE 68198-7696, USA, (b) Structures and Bonding, Institute of Physics, Academy of Sciences of the Czech Republic, Cukrovarnická 10, 162 53 Praha 6, Czech Republic, and (c) MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, England
*Correspondence e-mail: gborgstahl@unmc.edu
Theoretically, crystals with supercells exist at a unique crossroads where they can be considered as either a large unit cell with closely spaced reflections in reciprocal space or a higher dimensional superspace with a modulation that is commensurate with the supercell. In the latter case, the structure would be defined as an average structure with functions representing a modulation to determine the atomic location in 3D space. Here, a model protein structure and simulated diffraction data were used to investigate the possibility of solving a real incommensurately modulated protein crystal using a supercell approximation. In this way, the answer was known and the refinement method could be tested. Firstly, an average structure was solved by using the `main' reflections, which represent the subset of the reflections that belong to the subcell and in general are more intense than the `satellite' reflections. The average structure was then expanded to create a supercell and refined using all of the reflections. Surprisingly, the refined solution did not match the expected solution, even though the statistics were excellent. Interestingly, the corresponding superspace group had multiple 3D daughter space groups as possibilities, and it was one of the alternate daughter space groups that the refinement locked in on. The lessons learned here will be applied to a real incommensurately modulated profilin–actin crystal that has the same superspace group.
Keywords: aperiodic crystallography; supercell crystallographic refinement; modulated protein crystals; superspace group.
1. Introduction
On occasion, a diffraction pattern is observed that consists of many intense reflections interspersed with many weaker `satellite' reflections. Indexing software may find a good fit for the main (most intense) reflections and may not index the weaker reflections. In some of these cases, this unit cell can be extended in integer multiples along one or more of its dimensions, forming a supercell so that all of the reflections can be correctly indexed (Fig. 1). When the satellites can be indexed with the main reflections in this manner the diffraction data are called `commensurately modulated'. If not indexable, the data are `incommensurate'. A view of this process is that the unit cell describes an average of what is occurring in the structure if only the main reflections are used, and the other weaker reflections describe the more complex displacement that occurs in each unit cell over the supercell. One approach to solving this type of problem is to use the main reflections to arrive at an average solution, and then extend this average solution into a supercell and refine the resulting supercell against all of the reflections. While testing this approach with simulated commensurate data, the expected outcome was that the refined positions, starting from the average position, would match the correct positions that were used to create the reflections for the simulation. After verifying that the approach would work with commensurate data, the plan was then to move on to investigate how commensurate approximations of incommensurate data refine and to see how closely the modulation functions match the incommensurate functions (example described below). The refinement had excellent statistics, and initially it was thought to have worked; however, this was not the case. On closer inspection, the expected positions (circles in Fig. 2) did not match the refined positions (crosses in Fig. 2). The result appears to be shifted in some way, and this is where the mystery began. Until this issue was resolved, it held up the application of this approach to the `real-world' incommensurate case that we aimed to solve.
1.1. Why?
Why perform the refinement this way? Our research group has been focused on developing approaches to solve modulated protein structures because of an ongoing effort to solve an incommensurately modulated crystal form of profilin–actin (Lovelace et al., 2008). Progress has been made towards this goal (Porta et al., 2011, 2017), but the structure solution has remained problematic. Incommensurate modulation occurs when there is a periodic structural change of some kind overlaid on the lattice, the wavelength of which is not an integer multiple of the unit cell that makes up the crystal. This phenomenon has been described in several of our earlier publications (see Fig. 2 in Lovelace et al., 2013). A characteristic trait of modulated diffraction is the appearance of weaker satellite reflections around the main reflections. The simplest case is a displacement modulation in which the atomic positions are displaced from the average position by a periodic atomic modulation function (AMF) in superspace. Details of the superspace theory and its application can be found elsewhere (Janner & Janssen, 1977, 1980; Janssen et al., 1999; Smaalen, 2007). Superspace theory is a very powerful tool. As an example, a single properly chosen superspace group can describe the diversity of crystal forms observed in the solid-matter phase space for a small molecule (Dusek et al., 2003).
A schematic diagram of the relationship between superspace and supercells is helpful for understanding the supercell approximation method (Fig. 3). The atoms (black filled circles) of a supercell appear to be moving randomly in the subcells (A–G). Note that R includes all three coordinates of 3D real space (x1, x2 and x3). In higher dimensional space the convention is to represent the directions as x1, x2 … xn as opposed to x, y, z or a, b, c because there can potentially be n dimensions (many more than are available in normal 3D space). The apparently random motion along R can be described as a periodic displacement from an average position (dotted black vertical lines in Fig. 3) by an AMF in 4D superspace that traverses two periods along the a_s4 direction for every seven subcells. Distances along a_s4 can be represented as t, in fractional units of x4. There are two common parallel constructions related to a_s4. Lines running parallel to a_s1 have a constant value of t and are useful for determining the position that an atom occupies in a unit cell in real space (light gray dotted lines in Fig. 3). The second projection runs parallel to R. These projections are useful for determining atomic distances between pairs of atoms (solid black horizontal lines in Fig. 3). The AMFs are periodic, and this means that there are multiple equivalent positions. Two ways to translate to equivalent positions are to move to a new x4 value by transitioning along constructions running parallel to a_s1 (black circles along gray lines to gray circles in Fig. 3) or to phase shift along a_s4 by moving integer values of t (gray circles between t = 1 and t = 2 to gray dotted circles between t = 0 and t = 1 in Fig. 3).
Additionally, through the use of equivalent positions and projections, all of the possible positions of an atom in any unit cell in the crystal can be represented within a single period of the AMF (enlarged area in Fig. 3). Also, it is important to note that states close together in superspace may not be next to each other in real space (1–7 versus A–G in the enlarged portion of Fig. 3). To avoid further confusion, we also wish to explain that `t' is the continuous phase space along x4, while `T0' is a shift of where the origin for real space intersects with x4.
The AMF can be inferred as periodic, as opposed to random, because of the appearance of satellite reflections around the main reflections in the diffraction pattern. For incommensurate cases, normal indexing software can usually index the main reflections but can have difficulty or be unable to index the satellite reflections. In higher dimensional space, satellite reflections are indexed with a q vector (Fig. 1) that describes the direction of the modulation through the crystal as well as its overall frequency (fractional space between the main reflection and its first-order satellite). In special cases, where the modulation becomes commensurate, it is possible to describe the remaining reflections (satellites) by increasing the size of the main or basic unit cell in integer multiples along one or more of the dimensions (Fig. 3, top). This supercell can then be used with molecular replacement for structure solution, taking care to take translational noncrystallographic symmetry into account (Sliwiak et al., 2014, 2015; Campeotto et al., 2018). For an incommensurate structure, a commensurate approximation (which may also be referred to as a commensurate approximate in the literature) could be used as a way of using the traditional 3D programs to refine the structure by formulating the problem as a supercell. Our hope was that the commensurate approximation to the incommensurate structure would allow us to fit initial AMFs to the atoms and bootstrap the refinement in superspace.
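The relationship between subcell order in real space and order along t can be illustrated with a few lines of arithmetic. The sketch below is not taken from the original work: it assumes that subcell n (counting A = 0 to G = 6) samples the AMF at a phase of n times 2/7, modulo 1, and the function names `subcell_phase` and `order_by_phase` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction

# Assumption: the modulation completes two periods every seven subcells,
# so the component of q along the stacking direction is 2/7.
Q = Fraction(2, 7)
SUBCELLS = "ABCDEFG"  # seven subcells stacked along the modulated direction


def subcell_phase(n, t0=Fraction(0)):
    """Phase along t (modulo 1) sampled by subcell n for a starting phase t0."""
    return (n * Q + t0) % 1


def order_by_phase(t0=Fraction(0)):
    """Subcell labels sorted by the phase that each subcell samples."""
    return sorted(SUBCELLS, key=lambda s: subcell_phase(SUBCELLS.index(s), t0))


if __name__ == "__main__":
    for n, label in enumerate(SUBCELLS):
        print(label, subcell_phase(n))
    # Neighbours in t are not neighbours in real space.
    print("order along t:", "".join(order_by_phase()))
```

Under these assumptions, subcells that are adjacent along t are several subcells apart in real space, which is why states that are close together in superspace need not be next to each other in the crystal.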
As we had done in the past (Lovelace et al., 2013), a modulated protein model in a supercell and corresponding 1.0 Å resolution calculated diffraction data were simulated. The standard crystallographic `Table 1' for the simulation was published in Lovelace et al. (2013). These calculations were performed using a combination of Matlab (The Mathworks Inc.) and CCP4 tools (Winn et al., 2011). The only portion that used superspace concepts was the calculation of the modulation for each of the unit cells. It was desired to make the data behave more like an actual data set in which the model and observed values never perfectly match up, leading to R values that were not zero. This was accomplished by trimming the AMFs at second-order Fourier coefficients and trimming the reflections to only include up to second-order satellites. These changes give final R values of a few percent instead of zero. In the current work, the simulated diffraction data were important as a first step to study the possibility of using a supercell approximation to solve an incommensurately modulated protein and to learn of any potential problems. Owing to software limitations, we were limited to working with commensurate modulations. Simulated data were used so that the focus of the analysis could be on how the refinement approached a known answer as opposed to juggling with other unknowns. We are hopeful that incorporating the results discussed here will lead to a successful pathway to solve the incommensurately modulated profilin–actin complex as well as to improve approaches to refining other macromolecular structures. For those interested in reading further, Wagner & Schönleber (2009) provide an excellent example of the solution of an incommensurately modulated small molecule using both a commensurate approximation (supercell) and the superspace method to solve the structure.
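For reference, a displacement modulation with an AMF truncated at second-order Fourier terms can be written as follows. This uses the notation commonly found in the superspace literature cited above rather than equations from the original work, and the coefficient symbols A_n and B_n are labels chosen here for illustration.

```latex
\begin{aligned}
\bar{x}_{s4} &= t + \mathbf{q}\cdot\bar{\mathbf{x}}, \\
\mathbf{x} &= \bar{\mathbf{x}} + \mathbf{u}(\bar{x}_{s4}), \\
\mathbf{u}(\bar{x}_{s4}) &= \sum_{n=1}^{2}\left[\mathbf{A}_n \sin\!\left(2\pi n\,\bar{x}_{s4}\right) + \mathbf{B}_n \cos\!\left(2\pi n\,\bar{x}_{s4}\right)\right].
\end{aligned}
```

Here the barred quantities denote the average position, u is the AMF, and the upper summation limit of 2 corresponds to the second-order truncation described for the simulated data.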
2. Methods
Test structures and simulated diffraction data were made to allow researchers to study refinement strategies for modulated data sets in a controlled setting (Lovelace et al., 2013). The test data were created from a modified form of the ToxD structure [PDB entry 1dtx; Fig. 4(a)]. The ToxD monomer was broken into three chains. Chain B (residues 31–38 of the original molecule), which was located against the solvent channel, was renumbered and translated out into the solvent channel and, to avoid collisions, residues 1, 4, 7 and 8 were mutated to alanines using Coot (Emsley et al., 2010). The coordinates were extended to a 7× supercell and chain B was modulated rotationally around an axis defined by the Cα atom in the second residue in chain B and the Cβ atom in the eighth residue in chain B [Fig. 4(b)]. The amount of modulation was determined by the location along the y direction of the center of mass of chain B with a maximum rotation of ±15° [Figs. 4(c) and 4(d)]. The modulating rotation was carried out using Matlab. The starting structure for refinement was a 7× supercell expansion of the average structure. The modulation vector for the test diffraction data set was set to q = (2/7)b*; that is, there were two modulation waves every seven unit cells. In other words, each of the unit cells had chain B in a different rotated orientation based on its position within the supercell. The average structure was found using Phaser (McCoy et al., 2007) to place chains A, B and C into the unit cell using only the main reflections. The supercell was refined with REFMAC (Murshudov et al., 2011) using the following settings: refinement for 40 cycles with jelly body enabled and set to 0.020. A zip archive file containing all of the starting models, reflections (mtz) and refined models is available as supporting information and can also be obtained by contacting the corresponding author.
3. Results and discussion
The structure solution was performed in two stages. Firstly, the average structure was solved by using only the main reflections and performing molecular replacement with Phaser (McCoy et al., 2007) and was refined with REFMAC on the corresponding unit cell. The second step was to expand the average solution into a supercell (7× in y in this case) and then refine against all reflections that were indexed as a supercell. This approach was taken because it more closely mirrors the formulation of the superspace theory, in which atoms are described mathematically as having an average position that is perturbed by an AMF, as opposed to directly performing molecular replacement against the entire supercell. When the atoms of a modulated structure in a supercell are plotted as a displacement from their average position as a function of their t value in superspace, the resulting seven points (for the supercell used in this paper) on the graph provide an approximation of the AMF [black line in Fig. 5(a)]. For all graphs (Figs. 2, 5 and 6), the lines represent the AMFs and the circles represent the correct positions that the atoms in the supercell occupy on the AMF. The initial starting state for the refinement [crosses in Fig. 5(a)] has all atoms on a flat line along x1 at zero displacement because the average structure is in the same position in each unit cell of the initial structure.
Initially, we reviewed the refinement results by animating the solution with the subcells of the supercell overlaid in order (A, E, B, F, C, G, D; enlarged region of Fig. 2). In these animations, the displacements for correct refinements show chain B rotating back and forth [Fig. 4(d)]. An example can be found in the supporting information (result.gif), which looks the same as the correct solution (correct.gif in the supporting information). Additionally, the statistics were good, with R and Rfree of 2.2% and 2.4%, respectively. Given the observed motion and good statistics, we believed that the refinement was successful. Refinement results [Fig. 5(b)] can also be viewed by a plot of t versus displacement where the crosses, in this case, represent the refined positions, and it is clear that they do not line up with the expected positions. The refined solution was shifted half a wavelength in superspace and then plotted [Fig. 5(c)]. From analysis of the shifted plot, it is clear that these seven new states are just a different sampling of the continuum of states available along the AMF. When the other two directions are added to the plot [x2 and x3; Fig. 6(a)], the case for a phase shift of 0.5 in t is made stronger. This same shift is shown plotted for a couple more of the modulated atoms [Figs. 6(b)–6(d)]. For all cases, simply shifting the results by 0.5 in t causes the refined values to match the expected AMFs nicely.
3.1. Superspace provides an answer
What happened? If we are just limited to 3D thinking, the result does not make sense; however, if we look at the results within the higher dimensional framework there is a reasonable answer. In this case, the superspace group [19.1 or P212121(0β0)] has two P212121 daughter groups in 3D space. For the first P212121 daughter group the starting phase of the AMFs (T0) can be selected from one of seven equally spaced positions along t where T0 = n/7 and n is an integer. The second P212121 daughter group has the starting phase at T0 = n/7 + 1/14. For both of these options there are seven choices for the starting value of n = 0, 1, …, 6 because of equivalent locations; integer values for n > 6 will result in identical positions to n = 0, 1, …, 6. For the first daughter group, only one of the choices for T0, where n = 0, results in a 3D cell with no origin shift. For the second daughter group only n = 3, which has a t offset of 0.5, results in a 3D cell with no origin shift. This second option matches what was observed in refinement.
The most popular software for refining incommensurate structures of small molecules is Jana2006 (Petricek et al., 2006); unfortunately, there is currently no equivalent package for proteins. Jana2006 offers a wide range of tools beyond refinement. One of these tools allows the user to explore commensurate approximations (supercells). The daughter 3D cells are derived by Jana2006 from the superspace group and this option can be found under the `edit m50' option in the `Cell' tab (Fig. 7). Jana2006 initially shows the available daughter groups [Fig. 7(a)], then the options for T0 [Fig. 7(b)] and finally the origin shifts and other changes that may occur to the 3D daughter space group based on the T0 setting [Fig. 7(c)]. Alternatively, there is an online tool called Finder (https://it.iucr.org/resources/finder/; Orlov et al., 2008) which can be used to investigate the available 3D daughter groups of a superspace group as well as to work backwards and investigate common superspace groups for a collection of 3D groups.
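The T0 arithmetic for the two daughter groups can be checked with a short calculation. The sketch below is only an illustration of the fractions quoted in the text (T0 = n/7 and T0 = n/7 + 1/14 for a sevenfold supercell); it does not reproduce how Jana2006 or Finder derive daughter groups, and the helper names `t0_options` and `sampled_phases` are hypothetical.

```python
from fractions import Fraction

N_CELLS = 7  # sevenfold supercell, as in the test case described in the text


def t0_options(daughter):
    """Candidate starting phases T0 for daughter group 1 (n/7) or 2 (n/7 + 1/14)."""
    offset = Fraction(0) if daughter == 1 else Fraction(1, 2 * N_CELLS)
    return [Fraction(n, N_CELLS) + offset for n in range(N_CELLS)]


def sampled_phases(t0):
    """Phases (modulo 1) at which the seven subcells sample the AMF for a given T0.

    Assumption: with two modulation waves per seven cells, the subcells
    together sample every multiple of 1/7, offset by T0.
    """
    return sorted((t0 + Fraction(k, N_CELLS)) % 1 for k in range(N_CELLS))


if __name__ == "__main__":
    print("daughter 1 T0:", [str(t) for t in t0_options(1)])
    print("daughter 2 T0:", [str(t) for t in t0_options(2)])
    # n = 3 in the second daughter group corresponds to a shift of one half in t.
    print("daughter 2, n = 3:", t0_options(2)[3])
    print("phases sampled at T0 = 0:  ", [str(p) for p in sampled_phases(Fraction(0))])
    print("phases sampled at T0 = 1/2:", [str(p) for p in sampled_phases(Fraction(1, 2))])
```

With these assumptions the two daughter groups sample two interleaved sets of seven points on the same AMF, which is consistent with the refined positions falling on the expected AMF once they are shifted by 0.5 in t.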
The next question might be: just how sensitive is the refinement to the starting position in this example? Are there any starting positions that will result in the expected refinement? The sensitivity of the result as a function of the starting position was investigated by pushing the starting position towards one of the two solutions: atoms modulated slightly towards T0 = 0, the expected solution, or slightly modulated towards T0 = 1/2, the out-of-phase solution (which lines up with the AMF when the t positions of the atoms in the refined model are adjusted to t + 1/2). Even a small amount of initial movement towards the expected solution (T0 = 0) will cause the refined solution to converge appropriately (Table 1). Also, the correct solution does have slightly better statistics. The difference between the two sets, however, is so small that in normal protein refinements (with larger R values) these differences might not be interpreted as significant. Although it appears as though the cutoff to converge to the correct solution would be a bias of more than about 0.01% towards the expected solution, this is a rounding limit of the PDB format: in this case changes to the starting position of 0.01% were indistinguishable from the 0.00% case. It is most likely that in error space the minima describing both structural solution states are equidistant from the initial condition, which would be close to the average position. As the T0 = 1/2 state results in different reflection intensities, its error well will be both shallower and broader than that of the correct T0 = 0 state, and when these states interact in error space there will be a slight tendency toward the T0 = 1/2 state when starting from near the average position (Fig. 8). In an effort to verify this model, we plotted initial R values (one cycle of refinement) as a function of bias towards one of the two solutions (Supplementary Fig. S1). The graphs demonstrate a very slight tendency towards the T0 = 1/2 solution around the average position and the T0 = 0 solution as a global minimum.
For the ToxD case, the modulations were smoothly varying, which makes it easy to detect something strange in the t plot. In early refinements, there were examples where different parts of the modulated chain converged to different solutions, leaving some atoms caught in the middle between these two opposing solutions (data not shown) and giving noisy, as opposed to smooth, t plots of the positions. At the time it was thought that jelly-body refinement corrected this issue, but what happened was that jelly-body forced all of the atoms down one of the two available solutions from superspace and because, as stated earlier, we analyzed the results using only animations, it was not clear that there was an issue. To avoid local minima, Jana2006 always performs multiple refinements for supercell approximations by adding small random perturbations to the atomic positions in the hope that this will result in at least one of these refinements finding the global minimum and not just a local minimum. In a commensurate approximation for incommensurately modulated data, the problem is exacerbated. Here, the actual difference in error between different daughter groups will be much smaller and possibly indistinguishable. For an incommensurate case the integration along the entire period of the AMF contributes to reflection intensities, whereas for a commensurate case only a select number of discrete points along the AMF contribute to reflection intensities. The resulting conclusion is that for an incommensurate modulation the (3+1)D description will provide the more accurate picture of what is occurring in the crystal, and this is exactly the conclusion that Wagner & Schönleber (2009) arrive at after comparing their supercell with their superspace solution.
if an atom undergoes a rapid change in position on the
4. Conclusions
In conclusion, we have revealed that the refined supercell model may not end up in the true atomic positions of the modulated structure owing to the availability of multiple 3D daughter space groups. Using the refined positions of the supercell to fit AMFs should result in approximate AMFs of good enough quality to test whether phase-shifting the atomic positions of the supercell provides a better structural solution. Software tools such as Jana2006 or the Finder website can be used to find the appropriate (3+1)D to 3D daughter options for testing phase shifts in refinement. For supercell structures, it may be useful to study the atomic positions as plotted in t plots to gain more insight into the underlying mechanisms of the displacement. Additionally, for supercells, the jelly-body option (or any option like jelly-body in your software of choice) should always be enabled to prevent the model from attempting to refine two solutions simultaneously. In future work, we will employ these methods and observations in the refinement of incommensurately modulated profilin–actin (Lovelace et al., 2008).
Supporting information
Animated GIF correct.gif: what the correct solution looks like. DOI: https://doi.org/10.1107/S2059798319011082/rr5176sup1.gif
Animated GIF result.gif: what the refined result looks like. DOI: https://doi.org/10.1107/S2059798319011082/rr5176sup2.gif
Data files. DOI: https://doi.org/10.1107/S2059798319011082/rr5176sup3.zip
Supplementary Figure S1. DOI: https://doi.org/10.1107/S2059798319011082/rr5176sup4.pdf
Acknowledgements
We would like to thank Sander van Smaalen from Universität Bayreuth and Ron Lifshitz from Tel Aviv University for valuable discussions. We appreciate the work of the anonymous Acta Crystallographica Section D reviewers for dramatically improving the presentation of our analysis by providing a pathway to an alternate and easier-to-understand interpretation of our results. The UNMC Structural Biology Facility was supported by National Cancer Institute award No. P30 CA036727 to the Fred and Pamela Buffett Cancer Center.
Funding information
Funding for this research was provided by: Project No. 18-10504S of the Czech Science Foundation (Václav Petrícek) and the National Science Foundation, Division of Molecular and Cellular Biosciences (grant No. 1518145 to Gloria E. O. Borgstahl).
References
Campeotto, I., Lebedev, A., Schreurs, A. M. M., Kroon-Batenburg, L. M. J., Lowe, E., Phillips, S. E. V., Murshudov, G. N. & Pearson, A. R. (2018). Sci. Rep. 8, 14876.
Dusek, M., Chapuis, G., Meyer, M. & Petricek, V. (2003). Acta Cryst. B59, 337–352.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Janner, A. & Janssen, T. (1977). Phys. Rev. B, 15, 643–658.
Janner, A. & Janssen, T. (1980). Acta Cryst. A36, 408–415.
Janssen, T., Janner, A., Looijenga-Vos, A. & Wolff, P. M. D. (1999). International Tables for Crystallography, Vol. C, edited by A. J. C. Wilson & E. Prince, pp. 899–947. Dordrecht: Kluwer Academic Publishers.
Lovelace, J. J., Murphy, C. R., Daniels, L., Narayan, K., Schutt, C. E., Lindberg, U., Svensson, C. & Borgstahl, G. E. O. (2008). J. Appl. Cryst. 41, 600–605.
Lovelace, J. J., Simone, P. D., Petříček, V. & Borgstahl, G. E. O. (2013). Acta Cryst. D69, 1062–1072.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Orlov, I., Palatinus, L. & Chapuis, G. (2008). J. Appl. Cryst. 41, 1182–1186.
Petricek, V., Dusek, M. & Palatinus, L. (2006). Z. Kristallogr. 229, 345–352.
Porta, J., Lovelace, J. & Borgstahl, G. E. O. (2017). J. Appl. Cryst. 50, 1200–1207.
Porta, J., Lovelace, J. J., Schreurs, A. M. M., Kroon-Batenburg, L. M. J. & Borgstahl, G. E. O. (2011). Acta Cryst. D67, 628–638.
Sliwiak, J., Dauter, Z., Kowiel, M., McCoy, A. J., Read, R. J. & Jaskolski, M. (2015). Acta Cryst. D71, 829–843.
Sliwiak, J., Jaskolski, M., Dauter, Z., McCoy, A. J. & Read, R. J. (2014). Acta Cryst. D70, 471–480.
Smaalen, S. van (2007). Incommensurate Crystallography. Oxford University Press.
Wagner, T. & Schönleber, A. (2009). Acta Cryst. B65, 249–268.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.