essays
Notes of a protein crystallographer: the advantages of combining new integrated methods of structure solution with traditional data visuals
Institute of Tuberculosis Research and Center for Biomolecular Sciences, Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois at Chicago, Chicago, IL 60607, USA
*Correspondence e-mail: caz@uic.edu
Dedicated to all crystallographic software developers, and the younger generations of structural biologists and macromolecular crystallographers.
The suggestion is made that combining analysis using the most advanced crystallographic software with the integrated visual tools of the field will result in more knowledgeable and better trained future generations of structural biologists. The use of integrated visuals could also expedite the structure solution of some recalcitrant and complex macromolecular crystal structures that resist automatic workflows.
Keywords: teaching; visual aids; pseudo-precession photographs; stereographic projections; crystallography workflows.
The recent CCP4/APS School in Macromolecular Crystallography workshop at the Advanced Photon Source (https://www.ccp4.ac.uk/schools/APS-2021/) entitled `From data collection to structure and beyond' brought together in a global electronic meeting some of the most prestigious senior crystallographers, premier method developers and the new generations of aspiring structural biologists and crystallographers. It was a virtual school that also included hands-on data-collection sessions at the Advanced Photon Source and comprised two full weeks of intense immersion in crystallographic methods for the training of younger crystallographers. I decided to register and participate in order to get a refresher course on the newest methods and software tools. I was certainly not disappointed.
Lectures on diffraction theory, data-collection strategies and various methods such as MIR, MR, SIR, SAD etc. were presented at a very fast pace. The field is certainly vibrant with computational innovations that will facilitate the solution and refinement of ever more complex structures by conventional crystallography or by the promising youngest branch of the field, cryo-electron microscopy.
However, in my view, the introductory lectures on diffraction theory failed to take advantage of the traditional visualization tools that so well illustrate the theory and the basic principles of crystallography such as diffraction theory, the reciprocal lattice and the corresponding symmetry of the different projections. The basic principles were presented (for example the Ewald sphere), but the implications for the visualization of the reciprocal lattice were not adequately illustrated. In particular, the presenters failed to demonstrate the value of the images of the intensity-weighted reciprocal-lattice planes in the different space groups. I noticed the absence of highly integrated visual tools along the path to the structure solution of complex structures, including the existence of any noncrystallographic symmetry (NCS) if present (Rossmann & Blow, 1962). I am referring to precession photos and stereographic projections of the self-rotation function (SRF). Am I a dinosaur? Am I ossified to the point of no redemption? Am I totally out of touch with the new techniques and programs that can solve structures automatically using specific and ultrafast pipelines and workflows? Possibly, but I do not think so.
I must add that the writing of this article was not motivated by nostalgia for the `good old days' of tedious data collection and long, interminable and uncertain times for structure solution. My interest is in conveying to software developers, beamline scientists and future generations that the crystallographic community at large will benefit when the superb tools currently available are combined with the more traditional visual guides used in the past.
In this brief essay, I will illustrate this statement with some examples for three reasons. Firstly, although it has been said millions of times and is a mantra in many human endeavors, it is worth mentioning it again: a picture is worth a thousand words. Secondly, because having the younger generations recognize, understand and interpret these images will make them better crystallographers and structural biologists and not just `program-running wizards'. Thirdly, and not least, because it will help in solving structures more easily, particularly those complex structures where the automatic workflows cannot find the correct solution for a variety of reasons: low resolution and/or incomplete data, complex NCS, twinning and/or ambiguous space-group assignment, among others.
The precession method was introduced in the 1930s in the laboratory of Professor Martin Buerger at the Massachusetts Institute of Technology. A complete discussion of the method was published by Buerger thirty years later (Buerger, 1964). It provided a simple and direct way of establishing the unit cell and space group of a crystal by providing undistorted views of the planes of the reciprocal lattice (Fig. 1a, left and right). The instrument (the precession camera) soon became commercially available and was widely used in crystallography laboratories.
For younger macromolecular crystallographers, it is worth mentioning that the native and heavy-atom derivative data for the determination of the structure of myoglobin at 2 Å resolution were collected by precession photography (Kendrew et al., 1960). The key advantage was that the successive photographs containing individual planes of the reciprocal lattice comprising the `data set' were very easy to index. The intensities of the corresponding reflections were measured readily using the flatbed densitometers available at the time. The undistorted view of the weighted reciprocal lattice allows immediate indexing as the reflections are arranged regularly along the three main reciprocal axes (Fig. 1a, left and right). The data for the structure of hemoglobin at 5.5 Å resolution (Perutz et al., 1960) were collected by precession methods and using an early counter spectrometer (Arndt & Phillips, 1961), measuring the intensity of one reflection at a time.
Nowadays, you can only see precession cameras in small (museum-like) displays near well established crystallography laboratories: they are a relic of the past. However, current integrated visual software may display the intensity-weighted reciprocal lattice in almost any possible way with data collected using any hardware or available method.
Younger generations of macromolecular crystallographers are more familiar with the screenless oscillation methods used at synchrotron sources. These methods were first used with conventional in-house rotating-anode sources and record the reflections from several reciprocal planes simultaneously, which the early software was able to index (Arndt & Wonacott, 1977). The pre-alignment of the crystal axes with respect to the X-ray beam was important to facilitate the indexing and integration (Fig. 1b). The autoindexing software using Fourier methods that was introduced later (Otwinowski & Minor, 1997) enormously facilitated the indexing of the initial diffraction patterns and the subsequent data collection and processing for randomly oriented crystals.
Obtaining precession photographs of the main projections of the reciprocal lattice (i.e. h0l, 0kl, hk0), often including the upper levels (i.e. h1l, 1kl, hk1), was a crucial (and often time-consuming) step in any successful structure determination. This step relied on the careful analysis and visual inspection of these images (Figs. 2a–2c, 3a and 3b). I dare say that the majority of younger crystallographers will not even know what I am talking about. Yet all of these data, planes and projections are `buried' in the well known `mtz' files produced by the processing programs, namely MOSFLM/iMOSFLM (Winn et al., 2011), HKL-2000/HKL-3000 (Otwinowski & Minor, 1997) and XDS (Kabsch, 2010). They can be examined using the VIEWHKL program available in various software packages (CCP4 suite; Winn et al., 2011), as well as the abovementioned HKL-3000 (Minor et al., 2006) and Phenix suite (Liebschner et al., 2019). The precession photographs allow the immediate calculation of the unit-cell parameters, and by looking at the pattern of intensities of certain sets of reflections (for example the axial reflections h00, 0k0 and 00l) the presence of screw axes of symmetry can be determined, as well as any lattice centering by examining the reciprocal lattice (Figs. 2a–2c, 3a and 3b).
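The two operations described above (pulling one reciprocal-lattice plane out of a reflection list, and inspecting the axial reflections for the alternation expected from a screw axis) can be sketched in a few lines. This is only an illustration under stated assumptions: the reflection list is a hypothetical in-memory dictionary with invented intensities, not the output of any of the programs cited, and the function names `zone` and `axial_signal` are made up for this sketch.

```python
from statistics import mean

def zone(reflections, axis, level=0):
    """Return the reflections lying in one reciprocal-lattice plane.

    axis is 0, 1 or 2 for h, k or l; level=0 selects the zero layer
    (hk0 for axis=2) and level=1 the first upper level (hk1).
    """
    return {hkl: obs for hkl, obs in reflections.items() if hkl[axis] == level}

def axial_signal(reflections, axis, period):
    """Mean I/sigma of axial reflections that obey, and that violate,
    the condition index = period * n expected for a screw axis
    (period=2 for a 2_1 axis, period=4 for 4_1 or 4_3, and so on).
    """
    allowed, forbidden = [], []
    for hkl, (intensity, sigma) in reflections.items():
        off_axis = [hkl[i] for i in range(3) if i != axis]
        if any(off_axis) or hkl[axis] == 0:
            continue  # not an axial reflection along this axis
        ratio = intensity / sigma
        (allowed if hkl[axis] % period == 0 else forbidden).append(ratio)
    return (mean(allowed) if allowed else None,
            mean(forbidden) if forbidden else None)

# Hypothetical data: {(h, k, l): (intensity, sigma)}; values are invented.
toy = {
    (0, 1, 0): (2.0, 4.0),
    (0, 2, 0): (900.0, 30.0),
    (0, 3, 0): (3.0, 3.0),
    (0, 4, 0): (400.0, 20.0),
    (1, 0, 0): (250.0, 10.0),
    (1, 2, 0): (100.0, 10.0),
    (1, 1, 1): (50.0, 5.0),
}

hk0_plane = zone(toy, axis=2, level=0)            # the hk0 projection
b_axis = axial_signal(toy, axis=1, period=2)      # test for a 2_1 axis along b
print(len(hk0_plane), b_axis)                     # prints: 6 (25.0, 0.75)
```

In this toy list the 0k0 reflections with k odd are essentially unobserved while those with k even are strong, which is the pattern one would look for along the corresponding row of a pseudo-precession image. Real programs apply proper statistical tests rather than a simple comparison of means.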
This is what the current crystallographic programs do; it is not magic. In addition, the programs compare the intensities of what they think are symmetry-equivalent reflections and provide reasonable suggestions for the full symmetry of the reciprocal lattice based on some statistical inferences. Except for enantiomorphic space groups (for example P61 versus P65), this is how the software proposes a space group and will merge the integrated reflections if the statistical parameters are robust enough. All of this is important and valuable but, in my experience, it always helps to look at these `pseudo' precession photographs to confirm what the programs are suggesting (Figs. 2a–2c, 3a and 3b).
Nowadays, view-enabling programs such as HKL-3000 and Phenix can also be used to display various views, regions and planes of the reciprocal lattice (2D or 3D viewers in glowing colors!), including the completeness of the reciprocal data collected in three dimensions. This is very useful to identify certain regions of reciprocal space where significant `gaps' exist due to the crystal orientation during data collection. These gaps or wedges can lead to incorrect space-group determination, particularly if the missing data are along the axes. This is something that the old precession photographs would not allow us to do (Fig. 2c).
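The point about missing data along the axes can be made concrete with a small check of how many axial reflections were actually recorded. The sketch below is deliberately simplified and rests on assumptions that are not part of any program mentioned here: an orthogonal unit cell (so that the n-th order axial reflection has spacing equal to the cell length divided by n), a hypothetical set of measured indices, and no allowance for Friedel mates or for systematic absences. The function name `axial_completeness` is invented for the illustration.

```python
import math

def axial_completeness(measured, cell_lengths, d_min):
    """Fraction of axial reflections (h00, 0k0, 00l) found in `measured`.

    measured     : set of (h, k, l) tuples that were recorded
    cell_lengths : (a, b, c) in angstroms, orthogonal cell assumed
    d_min        : high-resolution limit in angstroms
    """
    result = {}
    for axis, length in enumerate(cell_lengths):
        n_max = math.floor(length / d_min)    # highest order with d >= d_min
        expected = []
        for n in range(1, n_max + 1):
            hkl = [0, 0, 0]
            hkl[axis] = n
            expected.append(tuple(hkl))
        found = sum(1 for hkl in expected if hkl in measured)
        result["hkl"[axis]] = found / len(expected) if expected else None
    return result

# Hypothetical low-resolution example; all numbers are invented.
measured = {
    (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0),   # a* row complete
    (0, 1, 0), (0, 2, 0),                         # b* row mostly missing
    (0, 0, 1), (0, 0, 2), (0, 0, 3),              # c* row half missing
    (1, 1, 0), (1, 2, 1), (2, 1, 3),              # some general reflections
}
axes = axial_completeness(measured, cell_lengths=(40.0, 50.0, 60.0), d_min=10.0)
print(axes)
```

A low fraction along one axis is a warning that any conclusion about a screw axis in that direction rests on very few observations, which is exactly the situation in which a wedge of missing data can mislead the space-group assignment.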
The article by Rossmann and Blow defining the `self-rotation function' (SRF) and its usage for `the detection of sub-units within the crystallographic asymmetric unit' is a classic (Rossmann & Blow, 1962). The final sentence of the abstract establishes what was to become the most common application of this concept: `application of the R function to horse haemoglobin gives a dominant peak that corresponds accurately to the relative orientation of the α and β chains'. Attempts to solve other protein structures in the years that followed often encountered the problem of having more than one molecule in the asymmetric unit with or without any point-group symmetry. Solving this problem was a key step in unraveling the heavy-atom sites and subsequent phase calculations and refinement. One of the most relevant examples of these years was the use of the rotation function for the discovery of a noncrystallographic twofold axis of symmetry in rhombohedral insulin (Dodson et al., 1966). This illustration demonstrated the value of calculating and interpreting the SRF in the early stages of structure determination. It was used routinely in the 1970s and beyond to establish the orientation of the symmetry elements in icosahedral viruses (point group 532), where it was crucial for the averaging of the initial electron-density maps and phase improvement. A time perspective on the molecular-replacement method and its applications has recently been published (Dodson, 2021).
Certainly, the new software packages do calculate the SRF to consider the presence of NCS and they routinely list the various NCS axes, typically using Eulerian angles, but the relative relationship of the NCS peaks is difficult to visualize using the Euler angles (α, β, γ or θ1, θ2, θ3). It is certainly easier using the polar coordinates of the axial directions (φ, ψ and the angle of rotation κ or χ). I routinely use the option in MOLREP (Vagin & Teplyakov, 1997, 2010; Winn et al., 2011) to calculate the SRF and plot it on a stereographic projection (Wulff net), which provides a superb overview of the crystallographic and NCS elements and their relative orientations with respect to the crystallographic axes. In my experience, these stereographic projections are very valuable to (i) characterize the NCS elements and their relative orientations with respect to the crystallographic axes (Fig. 4); (ii) discern issues with the quality of the crystallographic data, such as the degree of twinning (Fig. 5); (iii) identify the correct symmetry of the lattice (Fig. 6) and (iv) unravel complex NCS symmetry in low-symmetry space groups, including the presence of pseudo-symmetry, to facilitate structure solution (Fig. 7). I will illustrate these four cases with examples.
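To show why the polar description is easier to read than a list of Euler angles, the following sketch converts a rotation given as Euler angles into a rotation axis and a rotation angle κ, and then places the axis on a stereographic projection. It is a minimal illustration and not the procedure of MOLREP or any other program: the Euler convention used here (rotations about z, then y, then z, composed as Rz(α)·Ry(β)·Rz(γ)) is an assumption, the orthogonal frame is arbitrary, and angle conventions differ between packages and must be checked in their documentation. All function names are invented for the example.

```python
import math

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(alpha, beta, gamma):
    """Rotation matrix Rz(alpha) * Ry(beta) * Rz(gamma); angles in degrees.
    The ZYZ convention is an assumption made for this sketch."""
    a, b, g = (math.radians(x) for x in (alpha, beta, gamma))
    return matmul(rot_z(a), matmul(rot_y(b), rot_z(g)))

def axis_and_kappa(r):
    """Unit rotation axis (upper hemisphere) and rotation angle kappa in degrees."""
    trace = r[0][0] + r[1][1] + r[2][2]
    kappa = math.degrees(math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0))))
    # The antisymmetric part of R equals 2*sin(kappa) times the axis.
    v = [r[2][1] - r[1][2], r[0][2] - r[2][0], r[1][0] - r[0][1]]
    norm = math.sqrt(sum(x * x for x in v))
    if norm < 1e-6:
        if kappa < 90.0:
            raise ValueError("identity rotation: the axis is undefined")
        # Twofold case: (R + I)/2 = n n^T, so any non-zero column is parallel to n.
        m = [[(r[i][j] + (i == j)) / 2.0 for j in range(3)] for i in range(3)]
        col = max(range(3), key=lambda i: m[i][i])
        v = [m[i][col] for i in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
    axis = [x / norm for x in v]
    if axis[2] < -1e-12:
        # Plot on the upper hemisphere; flipping the axis reverses the sense
        # of rotation, which does not matter for locating the peak.
        axis = [-x for x in axis]
    return axis, kappa

def stereographic(axis):
    """Project a unit vector with z >= 0 onto the equatorial plane."""
    x, y, z = axis
    return x / (1.0 + z), y / (1.0 + z)

# Example: Euler angles (0, 180, 0) describe a twofold rotation about y.
twofold_axis, twofold_kappa = axis_and_kappa(euler_zyz(0.0, 180.0, 0.0))
twofold_xy = stereographic(twofold_axis)
print(twofold_axis, twofold_kappa, twofold_xy)
```

In this picture all twofold axes appear together on the κ = 180° section of the projection, so a 222 arrangement shows up as three mutually perpendicular points, something that is far from obvious in a table of Euler angles.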
The orientation of the NCS symmetry elements in the lattice is the most traditional use, and in simple cases it is very easy to interpret using the direct listing from the programs. However, even the common presence of a 222 tetramer in the asymmetric unit can be challenging when none of the three twofold axes are aligned with the crystallographic axes. Fig. 4 illustrates the SRF of a crystal of the class II fructose 1,6-bisphosphatase (FBPase) from the pathogenic bacterium Francisella tularensis. The protein is present in solution as a tetramer of identical subunits and quite often the entire tetramer is present in the asymmetric unit of crystals belonging to space group P1. In the illustration, there are actually two tetramers with 222 symmetry within the asymmetric unit in slightly different orientations and separated by about 70 Å along the c axis.
I have often found that obtaining the SRF immediately after the crystallographic data have been reduced helps to recognize `red flags' regarding the quality of the data; this can be a significant time saver for a project. A lack of distinct features and the absence of the symmetry elements expected from the data reduction is often a reflection of disordered crystals and/or certain types of twinning. In certain cases, it is possible to select the `best crystal' data set to proceed based on inspection of the SRF. Fig. 5 illustrates how different degrees of twinning in two different data sets of P21 crystals grown in the presence of Mg2+ (not the native divalent cation) are reflected in the appearance and interpretation of the SRF.
The most unexpected and revealing usage of the SRF has been to show the symmetry elements of the crystallographic data without any preconceived idea of the lattice. Of course, the auto-indexing programs provide reasonable guesses of the lattice based on the values of the unit-cell parameters (dimensions and angles) and these are critical constraints. However, I encountered an example where the auto-indexing and subsequent data processing indicated a tetragonal P4122 (or enantiomer) lattice and yet I could not solve the structure, even though it was a rather simple MR solution of a protein N-terminal domain of about 150 amino acids. A rather straightforward calculation and plotting of the SRF immediately showed an unambiguous threefold axis of symmetry along the approximate body diagonal of the cell, directly pointing to a cubic lattice (Fig. 6). The structure was solved immediately after reprocessing the data in P4132. In cases of space-group ambiguity, I suggest that processing the initial crystallographic data in lower symmetry space groups, within the same or different crystal systems, and examination of the SRF could point to the correct space-group assignment, in conjunction with the suggested solutions provided by the auto-indexing programs.
The dramatic impact of the use of the SRF to unravel the NCS of complex asymmetric units is presented in Fig. 7. The crystallographic data could be reduced either in P21 (unit-cell parameters a = 80.1, b = 130.3, c = 112.6 Å, β = 90.07°) or in P2221, with the longest c axis coinciding with the 21 screw axis. The SRF revealed significant peaks for twofold, threefold and even sixfold symmetries inside the asymmetric unit. The solution by MR proceeded in a stepwise fashion, as reported previously (Wolf et al., 2020; PDB entry 6pbs). It was particularly intriguing to see strong evidence for a threefold axis, as only dimers had been hinted at before. Finally, the solution was obtained and refined by solving for a 32 symmetrical cluster (six chains) in P2221 indexing and searching for two such clusters in the diffraction data reduced as P21 (b unique). The biological surprise appeared when each polypeptide chain was bound to two `ligand' molecules, making an asymmetric unit of 12 (ClpC1 NTD chains) and 24 (ecumicin) molecules (1:2 stoichiometry; Wolf et al., 2020).
Concurrent with the previous examples, it is worth presenting the following argument. The issues related to how to teach crystallography have been brought up for discussion many times. There are fewer formal courses of crystallography in the curricula that are important for training future students, even in the more traditional areas of biophysics, biophysical and analytical methods of physical chemistry and others. The most common training occurs when graduate students encounter a need to solve a biological problem related to a macromolecule (protein/nucleic acid) of biological interest. The common theme is that solving the structure of the target molecule will provide insights into its function, whether catalytic, regulatory or other. Thus, the student is asked to attempt to clone, express, purify, crystallize and solve the structure of the macromolecule of interest. Often, there is no expertise in the field in the laboratory, and once the students manage to obtain crystals they are supposed to solve the structure with the `powerful software' tools available. The assumption is that once the data have been collected and processed (possibly even by beamline scientists), the structure solution comes from running a series of programs. The focus is on solving the structure, and `learning' crystallography is often equated to being able to run a series of programs, often as streamlined pipelines, and ending with the structure ready for `interpretation' and rationalizing the biological function(s). Unfortunately, this approach does not teach crystallography, and this is the reason why workshops and crystallography schools are so useful. However, in addition to what those schools do very well, I suggest that they also include use of the very valuable visual tools of the past to illustrate and explain more fully the crystallographic concepts underlying the amazing software tools that they are providing to younger generations. This suggestion should also be adopted by the courteous beamline scientists who often process the collected data for inexperienced users.
In summary, images are more valuable than words and various lists of copious numbers. They are invaluable as a complement to the extensive computer outputs.
Certain `traditional' crystallographic images are more informative than the various Cartesian plots produced by most crystallographic programs. I would ask software developers that, in addition to the extensive output of their workflows, additional graphic files are output to allow the users the option to examine the three main projections of the reciprocal lattice and an image of the stereographic projection of the SRF using the integrated intensities. I would also encourage the beamline personnel and younger generations to use the integrated visualization tools available to `cross-examine' the answers inferred by the programs. This visual information would perfectly complement the numerical and statistical output of these programs and would probably aid in space-group determination and in the better characterization of NCS if present. Together, these additions will be a more reliable and informative path to the satisfying experience of solving the structure, with a complete understanding of the process and a fuller appreciation of the final result.
Acknowledgements
The insightful comments of the anonymous reviewers and the editor are appreciated. The author recognizes the careful editing of the manuscript by Drs Guido Pauli and Marvin Hackert. This work has been supported indirectly by the Institute of Tuberculosis Research at the University of Illinois at Chicago, Dr. Scott Franzblau, Director.
References
Arndt, U. W. & Phillips, D. C. (1961). Acta Cryst. 14, 807–818.
Arndt, U. W. & Wonacott, A. J. (1977). The Rotation Method in Crystallography. Amsterdam: North-Holland.
Buerger, M. (1964). The Precession Method. New York: John Wiley & Sons.
Dodson, E. (2021). Acta Cryst. D77, 867–879.
Dodson, E., Harding, M. M., Hodgkin, D. C. & Rossmann, M. G. (1966). J. Mol. Biol. 16, 227–241.
Hackert, M. L., Abad-Zapatero, C., Stevens, S. E. Jr & Fox, J. L. (1977). J. Mol. Biol. 111, 365–369.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Kendrew, J. C., Dickerson, R. E., Strandberg, B. E., Hart, R. G., Davies, D. R., Phillips, D. C. & Shore, V. C. (1960). Nature, 185, 422–427.
Liebschner, D., Afonine, P. V., Baker, M. L., Bunkóczi, G., Chen, V. B., Croll, T. I., Hintze, B., Hung, L.-W., Jain, S., McCoy, A. J., Moriarty, N. W., Oeffner, R. D., Poon, B. K., Prisant, M. G., Read, R. J., Richardson, J. S., Richardson, D. C., Sammito, M. D., Sobolev, O. V., Stockwell, D. H., Terwilliger, T. C., Urzhumtsev, A. G., Videau, L. L., Williams, C. J. & Adams, P. D. (2019). Acta Cryst. D75, 861–877.
Minor, W., Cymborowski, M., Otwinowski, Z. & Chruszcz, M. (2006). Acta Cryst. D62, 859–866.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Otwinowski, Z. & Minor, W. (1997). Methods Enzymol. 276, 307–326.
Perutz, M. F., Rossmann, M. G., Cullis, A. F., Muirhead, H., Will, G. & North, A. C. T. (1960). Nature, 185, 416–422.
Rossmann, M. G. & Blow, D. M. (1962). Acta Cryst. 15, 24–31.
Selezneva, A. I., Gutka, H. J., Wolf, N. M., Qurratulain, F., Movahedzadeh, F. & Abad-Zapatero, C. (2020). Acta Cryst. F76, 524–535.
Vagin, A. & Teplyakov, A. (1997). J. Appl. Cryst. 30, 1022–1025.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
Wolf, N. M., Lee, H., Zagal, D., Nam, J.-W., Oh, D.-C., Lee, H., Suh, J.-W., Pauli, G. F., Cho, S. & Abad-Zapatero, C. (2020). Acta Cryst. D76, 458–471.
Wolf, N. M., Gutka, H. J., Movahedzadeh, F. & Abad-Zapatero, C. (2018). Acta Cryst. D74, 321–331.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.