research communications

STRUCTURAL BIOLOGY COMMUNICATIONS
ISSN: 2053-230X

Serendipitous high-resolution structure of Escherichia coli carbonic anhydrase 2

ᵃDepartment of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA, and ᵇLife Sciences Institute, University of Michigan, Ann Arbor, MI 48109, USA
*Correspondence e-mail: rankinmi@umich.edu

Edited by R. L. Stanfield, The Scripps Research Institute, USA (Received 14 November 2024; accepted 4 January 2025; online 15 January 2025)

X-ray crystallography remains the dominant method of determining the three-dimensional structure of proteins. Nevertheless, this resource-intensive process may be hindered by the unintended crystallization of contaminant proteins from the expression source. Here, the serendipitous discovery of two novel crystal forms and one new, high-resolution structure of carbonic anhydrase 2 (CA2) from Escherichia coli that arose during a crystallization campaign for an unrelated target is reported. By comparing unit-cell parameters with those in the PDB, contaminants such as CA2 can be identified, preventing futile molecular-replacement attempts. Crystallographers can use these new lattice parameters to diagnose CA2 contamination in similar experiments.

1. Introduction

The process of obtaining a three-dimensional structure of a target protein can be time-consuming and expensive, regardless of the technique used. Each stage of the gene-to-structure pipeline has potential for failure, yet the most frustrating and expensive errors may arise at the very end during the analysis of diffraction data. Efforts to solve the structure through molecular replacement or experimental phasing may result in the unfortunate discovery that the crystallized protein was not the target protein.

The PDB contains many accidental structures of contaminants that arose during purification (Niedzialkowska et al., 2016; Grzechowiak et al., 2021). Typical purification schemes involve the addition of exogenous proteins such as lysozyme (Falgenhauer et al., 2021), Tobacco etch virus (TEV) protease (Tropea et al., 2009) or deoxyribonuclease (DNase) I (Funakoshi et al., 1980). Any of these proteins has the potential to persist through the purification and crystallize in lieu of the target protein. Genetically encoded fusion proteins such as maltose-binding protein (MBP; Lebendiker & Danieli, 2017) or glutathione S-transferase (GST; Harper & Speicher, 2011) may also remain in small quantities after cleavage and counterselection.

More commonly, contaminating proteins from the expression source lead to unintended structures. The nickel resin used in immobilized metal-affinity chromatography (IMAC), the most common method used to obtain large quantities of recombinant protein, has the potential to bind proteins other than the polyhistidine-tagged target. Many such contaminants from a nickel-affinity-based Escherichia coli purification strategy have been reported (Niedzialkowska et al., 2016; Grzechowiak et al., 2021; Bolanos-Garcia & Davies, 2006). These proteins may bind nickel resin or interact nonspecifically with the protein of interest and thus be retained through the final purification step. Common endogenous E. coli contaminants that have been reported to co-elute during nickel-affinity purification include ArnA (Andersen et al., 2013; Robichon et al., 2011), SlyD (Andersen et al., 2013; Robichon et al., 2011; Parsy et al., 2007), Hsp60 (GroEL; Bolanos-Garcia & Davies, 2006), YodA (David et al., 2003) and Can/YadF (carbonic anhydrase; Chai et al., 2021). Frequent contaminants are listed in the ContaBase database (Hungler et al., 2016).

An endogenous carbonic anhydrase frequently contaminates recombinant proteins from E. coli expression systems (Robichon et al., 2011; Chai et al., 2021; Cronk et al., 2001; Merlin et al., 2003). Carbonic anhydrase (EC 4.2.1.1) is a zinc-dependent metalloenzyme that forms carbonic acid from CO₂, a byproduct of carbohydrate and fat catabolism. In humans, carbonic anhydrases in red blood cells reversibly solubilize CO₂ as carbonic acid, allowing it to reach the lungs to be exhaled (Doyle & Cooper, 2024). E. coli contains two carbonic anhydrase genes. The essential can gene (previously yadF; UniProt P61517) encodes carbonic anhydrase 2 (CA2), a β-class CA enzyme. CynT (UniProt P0ABE9) is a paralog of CA2 (33% sequence identity) that can complement disruption of can (Merlin et al., 2003). The PDB contains several structures of E. coli CA2, but none of CynT (Table 1).

Table 1
Current and new structures of E. coli carbonic anhydrase 2 in the PDB

| PDB code | Space group | a, b, c (Å) | d_min (Å) | Form | Reference |
|---|---|---|---|---|---|
| 1i6o | P4₃22 | 81.2, 81.2, 162.1 | 2.20 | 1 | Cronk et al. (2001) |
| 2esf | P4₃22 | 82.9, 82.9, 162.2 | 2.25 | 1 | Cronk et al. (2006) |
| 1i6p | P4₂2₁2 | 68.5, 68.5, 85.9 | 2.00 | 2 | Cronk et al. (2001) |
| 4znz | P4₂2₁2 | 67.9, 67.9, 84.9 | 2.70 | 2 | Niedzialkowska et al. (2016) |
| 7sev | P4₂2₁2 | 67.5, 67.5, 85.2 | 2.30 | 2 | Chai et al. (2021) |
| 9eat | P4₂2₁2 | 67.5, 67.5, 85.1 | 1.43 | 2 | This work |
| 1t75 | P4₃2₁2 | 110.4, 110.4, 162.5 | 2.50 | 3 | |
| 9eaw | P2₁2₁2 | 78.2, 104.5, 48.3 | 2.26 | 4 | This work |
| 9ebz | C222₁ | 113.2, 119.1, 161.0 | 2.66 | 5 | This work |

Here, we report a case of persistent CA2 contamination that crystallized in three forms. Two are new crystal forms and the third yielded a high-resolution (1.43 Å) structure of a common CA2 crystal form.

2. Materials and methods

2.1. Protein production

The gene encoding a natural product biosynthetic protein of interest was cloned into the vector pMCSG7 using a ligation-independent cloning strategy (Stols et al., 2002). To facilitate phosphopantetheinylation of the target protein, the plasmid was transformed into the E. coli BL21(DE3) BAP1 cell line (Pfeifer et al., 2001), which constitutively expresses sfp, encoding a nonspecific phosphopantetheinyl transferase (Quadri et al., 1998). The expression strain also contained the pRare2-CDF plasmid (Whicher et al., 2013). These cells were made competent using the Mix & Go! E. coli Transformation Kit (Zymo Research). Terrific Broth (TB) cultures containing 100 µg ml⁻¹ ampicillin and 50 µg ml⁻¹ spectinomycin were grown at 37°C with shaking at 225 rev min⁻¹ until an OD₆₀₀ of 1.0 was reached. The cultures were cooled to 20°C for 1 h, induced with 200 µM isopropyl β-D-1-thiogalactopyranoside (IPTG) and 2 g l⁻¹ L-arabinose, grown for 18 h and harvested by centrifugation at 12 000g.

The cell pellet from a 1 l culture was resuspended in 70 ml lysis buffer [50 mM HEPES pH 7.8, 300 mM NaCl, 10%(v/v) glycerol, 20 mM imidazole pH 7.8], augmented with 1 mg ml⁻¹ chicken lysozyme (Sigma), 50 µg ml⁻¹ bovine DNase I (Sigma) and 2 mM MgCl₂, and then incubated for 30 min at room temperature with agitation. Complete lysis was achieved via sonication (Branson Sonifier 450). Following centrifugation at 30 000g, the soluble fraction was collected, filtered (0.45 µm Millex-HP PES membrane filter unit, Millipore), incubated for 2 h with 5 ml packed Ni–NTA agarose beads (Qiagen) and loaded onto a glass chromatography column (Bio-Rad). The beads were washed with 100 ml lysis buffer before the protein was eluted in 40 ml elution buffer [50 mM HEPES pH 7.8, 300 mM NaCl, 10%(v/v) glycerol, 400 mM imidazole pH 7.8].

The eluate was then concentrated to 15 ml using a centrifugal filter unit (Amicon) with a 30 kDa molecular-weight cutoff (MWCO) before being diluted to 50 ml in gel-filtration buffer [50 mM HEPES pH 7, 50 mM NaCl, 10%(v/v) glycerol]. The diluted protein solution was passed through a 5 ml HiTrap Q HP anion-exchange column (Cytiva) at a flow rate of 3 ml min⁻¹. Proteins were fractionated by a NaCl gradient (50–400 mM over 125 ml).
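
As a rough worked example of the buffer arithmetic in this step, the following sketch estimates the composition of the sample loaded onto the anion-exchange column and the slope of the NaCl gradient. It assumes, which the text does not state, that NaCl and imidazole pass freely through the 30 kDa membrane (so the 15 ml concentrate keeps the elution-buffer composition) and that volumes are additive. The helper name `mix` is purely illustrative.

```python
def mix(conc_a, vol_a, conc_b, vol_b):
    """Concentration after combining two solutions, assuming additive volumes."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)

# 15 ml concentrate (assumed to retain elution-buffer composition)
# diluted to 50 ml, i.e. with 35 ml of gel-filtration buffer.
concentrate_ml = 15.0
diluent_ml = 50.0 - concentrate_ml

nacl_mM = mix(300.0, concentrate_ml, 50.0, diluent_ml)      # NaCl in both buffers
imidazole_mM = mix(400.0, concentrate_ml, 0.0, diluent_ml)  # none in the diluent

# Gradient from 50 to 400 mM NaCl over 125 ml.
gradient_slope_mM_per_ml = (400.0 - 50.0) / 125.0

print(f"NaCl at loading:      {nacl_mM:.0f} mM")
print(f"Imidazole at loading: {imidazole_mM:.0f} mM")
print(f"Gradient slope:       {gradient_slope_mM_per_ml:.1f} mM NaCl per ml")
```

Under these assumptions the loaded sample would contain about 125 mM NaCl and 120 mM imidazole, and the gradient rises by 2.8 mM NaCl per millilitre.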

For a final purification step by gel filtration, proteins were concentrated to 5 ml and injected onto a Superdex 200 HiLoad 16/60 prep-grade gel-filtration column (GE Healthcare) that had been pre-equilibrated with gel-filtration buffer. Eluates were assessed by SDS–PAGE (Fig. 1). The target protein was obtained with an estimated purity of >95% and a CA2 fraction of <1%. Target fractions were pooled, concentrated, flash-frozen in liquid nitrogen and stored at −80°C.

Figure 1
Assessment of protein purification. (a) Size-exclusion chromatography indicates that the target protein (72.6 kDa) is monomeric. Masses of molecular-weight standards are indicated at the top. Red lines indicate the pooled fractions, while the four circles correspond to the elution fractions shown in (b). (b) SDS–PAGE of peak Superdex 200 fractions. Several co-purified contaminants are present, including CA2 (25.0 kDa).

2.2. Protein crystallization

The protein sample prepared above was thawed on ice and then dialyzed overnight at 4°C into a buffer consisting of 10 mM HEPES pH 7.8, 25 mM NaCl using Slide-A-Lyzer MINI dialysis cups with a 10 kDa MWCO. The protein was then concentrated to 8.2 mg ml⁻¹ and broad screening with the MCSG suite (Microlytic) was performed using a Gryphon crystallization robot. Within three days, crystals of diverse morphology grew in many conditions (Table 2). Crystals were harvested directly from the growth conditions and flash-cooled by plunging them into liquid nitrogen. Once it was discovered that the crystals from the initial broad screen did not contain the protein of interest, they were not optimized further. This expression construct was abandoned in favour of a strategy that yielded a sample with higher purity.

Table 2
Crystallization conditions

| | PDB entry 9eat (form 2) | PDB entry 9eaw (form 4) | PDB entry 9ebz (form 5) |
|---|---|---|---|
| Method | Sitting-drop vapour diffusion | Sitting-drop vapour diffusion | Sitting-drop vapour diffusion |
| Plate type | Intelli-Plate 96-3 LVR | Intelli-Plate 96-3 LVR | Intelli-Plate 96-3 LVR |
| Temperature (K) | 293.15 | 293.15 | 293.15 |
| Protein concentration (mg ml⁻¹) | 8.2 | 8.2 | 8.2 |
| Buffer composition of protein solution | 10 mM HEPES pH 7.8, 25 mM NaCl | 10 mM HEPES pH 7.8, 25 mM NaCl | 10 mM HEPES pH 7.8, 25 mM NaCl |
| Composition of reservoir solution | 0.1 M HEPES pH 7.5, 0.2 M lithium sulfate, 25% PEG 3350 | 0.2 M magnesium acetate, 20% PEG 3350 | 0.2 M ammonium tartrate dibasic, 20% PEG 3350 |
| Volume and ratio of drop | 0.75 µl, 2:1 | 0.75 µl, 2:1 | 0.5 µl, 1:1 |
| Volume of reservoir (µl) | 50 | 50 | 50 |

2.3. Data collection and processing

Data were reduced and scaled using XDS (Kabsch, 2010; Table 3).

Table 3
Data collection and processing

Values in parentheses are for the outer shell.

| | PDB entry 9eat (form 2) | PDB entry 9eaw (form 4) | PDB entry 9ebz (form 5) |
|---|---|---|---|
| Diffraction source | 23-ID-B, APS | 23-ID-D, APS | 23-ID-D, APS |
| Wavelength (Å) | 1.0332 | 1.0332 | 1.0332 |
| Temperature (K) | 100 | 100 | 100 |
| Detector | EIGER 16M | PILATUS3 6M | PILATUS3 6M |
| Crystal-to-detector distance (mm) | 200 | 400 | 400 |
| Rotation range per image (°) | 0.2 | 0.2 | 0.2 |
| Total rotation range (°) | 180 | 166 | 180 |
| Exposure time per image (s) | 0.2 | 0.2 | 0.2 |
| Space group | P4₂2₁2 | P2₁2₁2 | C222₁ |
| a, b, c (Å) | 67.524, 67.524, 85.076 | 78.233, 104.516, 48.256 | 113.227, 119.145, 161.01 |
| α, β, γ (°) | 90, 90, 90 | 90, 90, 90 | 90, 90, 90 |
| Mosaicity (°) | 0.094 | 0.187 | 0.140 |
| Resolution range (Å) | 47.77–1.43 | 43.82–2.26 | 47.89–2.66 |
| Total No. of reflections | 442908 (23724) | 113168 (11180) | 205437 (21188) |
| No. of unique reflections | 69116 (6625) | 19166 (1873) | 31559 (3099) |
| Completeness (%) | 99.48 (95.50) | 99.80 (99.9) | 99.9 (99.8) |
| Multiplicity | 6.41 (3.58) | 5.9 (6.0) | 6.5 (6.8) |
| ⟨I/σ(I)⟩ | 13.1 (1.4) | 7.6 (2.3) | 6.0 (0.9) |
| Rmeas | 0.108 (1.106) | 0.307 (1.391) | 0.222 (2.633) |
| Inner shell Rmeas | 0.051 | 0.120 | 0.098 |
| CC1/2 | 0.998 (0.415) | 0.988 (0.515) | 0.994 (0.584) |
| Overall B factor from Wilson plot (Å²) | 14.8 | 26.9 | 63.7 |
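
A quick Matthews-coefficient estimate is one way to judge whether an indexed cell could plausibly hold the intended target. The sketch below applies V_M = V/(Z·n·M) to the form 2 cell from Table 3, using the molecular masses quoted in the Fig. 1 caption (72.6 kDa for the target, 25.0 kDa for CA2). The value Z = 8 for P4₂2₁2 and the 1.23 Å³ Da⁻¹ constant of the usual solvent-content approximation are standard; the function names are illustrative and not part of any processing package.

```python
PROTEIN_SPECIFIC_VOLUME = 1.23  # Å^3/Da, standard constant in the Matthews approximation

def orthogonal_cell_volume(a, b, c):
    """Cell volume in Å^3 when all cell angles are 90 degrees."""
    return a * b * c

def matthews_coefficient(volume, z, mass_da, copies=1):
    """V_M in Å^3/Da for `copies` molecules of mass `mass_da` per asymmetric unit."""
    return volume / (z * copies * mass_da)

def solvent_fraction(v_m):
    """Approximate solvent fraction; negative values mean the molecule cannot fit."""
    return 1.0 - PROTEIN_SPECIFIC_VOLUME / v_m

# Form 2 cell (Table 3); P4(2)2(1)2 has 8 asymmetric units per cell.
volume = orthogonal_cell_volume(67.524, 67.524, 85.076)
Z = 8

vm_ca2 = matthews_coefficient(volume, Z, 25_000)     # one CA2 subunit
vm_target = matthews_coefficient(volume, Z, 72_600)  # one copy of the target

for label, vm in (("CA2", vm_ca2), ("target", vm_target)):
    print(f"{label}: V_M = {vm:.2f} Å^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

For this cell, a single 25.0 kDa CA2 subunit gives a V_M of about 1.94 Å³ Da⁻¹ (roughly 37% solvent), whereas one copy of the 72.6 kDa target would give about 0.67 Å³ Da⁻¹, which is below the value for solvent-free protein and therefore physically impossible.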

2.4. Structure solution and refinement

Molecular replacement (MR) in Phaser (McCoy et al., 2007) using a homolog of the target protein failed. We then searched the PDB for matching lattice parameters and identified CA2 (PDB entry 1i6p; Cronk et al., 2001) as a match for CA2 crystal form 2. MR via Phaser was then carried out using PDB entry 1i6p as a search model. At this point, CA2 contamination was suspected in the other crystals, so the high-resolution structure (PDB entry 9eat) was used as an MR search model for the CA2 samples in crystal forms 4 and 5. Refinement of all models was performed using iterative rounds of phenix.refine (Afonine et al., 2012) and manual model building in Coot (Emsley et al., 2010). The data from crystal form 2 exhibited a strong anomalous signal, presumably due to the tightly bound Zn²⁺ ion, so the f′ and f′′ contributions of Zn²⁺ were refined for this data set. All structural figures were created using PyMOL (Schrödinger). Structure validation was performed with MolProbity (Chen et al., 2010). Refinement statistics are summarized in Table 4.

Table 4
Structure solution and refinement

Values in parentheses are for the outer shell.

| | PDB entry 9eat (form 2) | PDB entry 9eaw (form 4) | PDB entry 9ebz (form 5) |
|---|---|---|---|
| Resolution range (Å) | 47.75–1.43 (1.48–1.43) | 43.82–2.26 (2.32–2.26) | 47.89–2.66 (2.73–2.66) |
| Completeness (%) | 99.47 (95.49) | 99.80 (99.90) | 99.90 (99.80) |
| σ Cutoff | 0 | 0 | 0 |
| No. of reflections, working set | 65375 (6260) | 17239 (1206) | 312882 (2077) |
| No. of reflections, test set | 3470 (364) | 1916 (134) | 2005 (144) |
| Final Rwork | 0.157 (0.303) | 0.197 (0.279) | 0.261 (0.476) |
| Final Rfree | 0.168 (0.313) | 0.262 (0.339) | 0.287 (0.498) |
| No. of non-H atoms | | | |
| Total | 1951 | 3536 | 6831 |
| Protein | 1716 | 3429 | 6824 |
| Ion | 1 | 2 | 4 |
| Water | 235 | 105 | 3 |
| Protein residues | 212 | 424 | 844 |
| R.m.s. deviations | | | |
| Bond lengths (Å) | 0.009 | 0.008 | 0.004 |
| Angles (°) | 1.04 | 0.95 | 0.650 |
| Average B factors (Å²) | | | |
| Overall | 20.0 | 37.3 | 82.2 |
| Protein | 18.8 | 37.3 | 82.2 |
| Ion | 12.1 | 28.5 | 83.6 |
| Water | 29.24 | 36.3 | 70.4 |
| Ramachandran plot | | | |
| Most favoured (%) | 98.57 | 95.51 | 97.27 |
| Allowed (%) | 1.43 | 4.49 | 2.73 |
| Outliers (%) | 0.00 | 0.00 | 0.00 |

3. Results and discussion

We report two novel crystal forms of E. coli CA2, which was obtained as a purification contaminant. These lattice parameters can be added to the list of known CA2 crystal forms, saving time in future cases of contamination. In addition to the two novel crystal forms, we present a new, high-resolution (1.43 Å) view of CA2 (Fig. 2). The presumed Zn²⁺ ion is coordinated with tetrahedral geometry by Cys42, Asp44, His98 and Cys101. The coordinate bond lengths, as seen in the high-resolution structure, are Zn—SG(Cys42) at 2.2 Å (range 2.2–2.2 Å), Zn—OD2(Asp44) at 2.1 Å (range 1.9–2.1 Å), Zn—NE2(His98) at 2.1 Å (range 2.0–2.1 Å) and Zn—SG(Cys101) at 2.3 Å (range 2.2–2.3 Å), with the ranges reflecting observations from all models reported in this study.

Figure 2
Carbonic anhydrase 2 in crystal form 2. (a) The P4₂2₁2 unit cell contains one subunit per asymmetric unit, with tetramers formed at the 222 centers. The protomer in the asymmetric unit is coloured from the N-terminus (blue) to the C-terminus (red). The bound metal, presumably Zn²⁺, is shown as a grey sphere. (b) View of the active site. Polder (Liebschner et al., 2017) density (green, 7σ) is shown for the presumed Zn²⁺ ion and its coordinating residues. Anomalous difference density is in magenta (20σ).

The identity of a protein in a crystal is technically uncertain until the structure is solved. When structure solution fails despite high-quality data with no obvious pathologies, the contents of the crystal should be considered. This study highlights the unfortunate reality that even off-target macromolecules with low (<1%) abundance may readily crystallize. It is always useful to search the PDB for a unit cell with symmetry and cell constants that match the indexed data.
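
The lattice comparison described above can be sketched in a few lines. The example below checks an indexed cell against the previously deposited CA2 lattices from Table 1; it is a simplified illustration, not the search procedure used in this work. The 3% relative tolerance, the ASCII space-group strings and the names `Lattice`, `KNOWN_CA2` and `matching_entries` are assumptions made for the example, and a real search would cover the whole PDB and compare reduced cells rather than axes in a fixed order.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lattice:
    pdb_code: str
    space_group: str
    a: float
    b: float
    c: float

# Previously deposited E. coli CA2 lattices, copied from Table 1 (all angles 90 degrees).
KNOWN_CA2 = [
    Lattice("1i6o", "P 43 2 2", 81.2, 81.2, 162.1),
    Lattice("2esf", "P 43 2 2", 82.9, 82.9, 162.2),
    Lattice("1i6p", "P 42 21 2", 68.5, 68.5, 85.9),
    Lattice("4znz", "P 42 21 2", 67.9, 67.9, 84.9),
    Lattice("7sev", "P 42 21 2", 67.5, 67.5, 85.2),
    Lattice("1t75", "P 43 21 2", 110.4, 110.4, 162.5),
]

def matching_entries(space_group, a, b, c, rel_tol=0.03, known=KNOWN_CA2):
    """Return PDB codes with the same space group and cell edges within rel_tol."""
    hits = []
    for entry in known:
        if entry.space_group != space_group:
            continue
        pairs = zip((a, b, c), (entry.a, entry.b, entry.c))
        if all(abs(obs - ref) <= rel_tol * ref for obs, ref in pairs):
            hits.append(entry.pdb_code)
    return hits

# Cells as indexed in this study (Table 3).
form2_hits = matching_entries("P 42 21 2", 67.524, 67.524, 85.076)
form4_hits = matching_entries("P 21 21 2", 78.233, 104.516, 48.256)

print("form 2 matches:", form2_hits)
print("form 4 matches:", form4_hits)
```

With these inputs the form 2 cell matches the three earlier form 2 entries (1i6p, 4znz and 7sev), while the form 4 cell matches nothing in the list.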

Sometimes, as was the case for the data sets resulting from CA2 crystal forms 4 and 5, there is no match for the cell and symmetry in the PDB. The fortuitous discovery of a CA2 crystal in the previously characterized form 2 from the same protein sample revealed the contaminant (Table 1). This observation led to successful MR structure determinations for the other two data sets with CA2 as a search model, yielding CA2 structures in crystal forms 4 and 5.

When working with a new data set that has no matches in the PDB, alternative diagnostic approaches are available. If the amount of crystalline material permits, the size of the crystallized macromolecule may be estimated by SDS–PAGE, or more accurately assessed by mass spectrometry. While researchers have had success with an exhaustive, iterative MR campaign using the full PDB as search models (Keegan et al., 2016), tools have been developed to solve contaminant structures by using MR more efficiently. MarathonMR uses a subset of the PDB based on fold families (Hatti et al., 2017). For common contaminants, ContaMiner performs automated MR against common suspects in the ContaBase database (Hungler et al., 2016). SIMBAD combines several strategies by first searching unit-cell parameters and then screening for common contaminants, before finally performing a brute-force search of a nonredundant subset of the PDB (Simpkin et al., 2018).

Sometimes the contaminating protein comes not from the expression host but from contaminating cells. Serratia proteamaculans was suspected to have contaminated Trichoplusia ni cells, as the cyanate hydratase CynS co-purified with the target protein and formed well diffracting crystals (Butryn et al., 2015). Mass spectrometry identified CynS, and MR was successful. The Serratia genus appears to be notorious for cell contamination, as different laboratories have reported contamination with Serratia CynS (Pederzoli et al., 2020) and glycerol dehydrogenase (Musille & Ortlund, 2014) when expressing targets in E. coli.

Although the interference of contaminating proteins in a structural biology project is frustrating, it may sometimes lead to exciting results. Trace lysozyme added to cells during lysis formed a heterotrimeric complex that facilitated crystallization of the cortactin–Arg complex (Liu et al., 2012). Crystallographic analysis of co-purified contaminating proteins has also yielded novel structures. Examples include the yeast nicotinamidase Pnc1p (Hu et al., 2007), the putative cysteine hydrolase YcaC from Pseudomonas aeruginosa (Grøftehauge et al., 2015) and the Achromobacter sp. bacterioferritin Dh1f (Dwivedy et al., 2018).

Engineering approaches may also minimize the chances of co-eluting proteins when using a nickel-affinity purification strategy. Cell lines such as E. coli LOBSTR (low background strain) have been developed by modifying the arnA and slyD genes of E. coli BL21(DE3) such that the encoded proteins exhibit weaker binding to Ni–NTA resin (Andersen et al., 2013). Similarly, in the engineered NiCo21(DE3) E. coli strain, DNA encoding a chitin-binding domain is appended to the 3′ ends of slyD, can and arnA, allowing chitin-resin depletion of the corresponding problematic proteins. In this strain, glmS has also been altered to produce a protein that binds nickel resin with lower affinity.

While usually unwelcome, crystals of unintended targets may yield new results. We present a high-quality carbonic anhydrase 2 structure that may serve as a new standard for structural studies. We also report two CA2 structures in new crystal forms, which may save time when others encounter the same problem.

Acknowledgements

We thank the beamline staff at the GM/CA sector of the Advanced Photon Source (APS) for their support. GM/CA@APS has been funded by the National Cancer Institute (ACB-12002) and the National Institute of General Medical Sciences (AGM-12006, P30 GM138396). This research used resources of the Advanced Photon Source, a US Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357.

Conflict of interest

The authors declare no conflicts of interest.

Funding information

The following funding is acknowledged: National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases (grant No. R01 DK042303 to Janet L. Smith); National Institutes of Health, National Cancer Institute (grant No. F31 CA265082 to Michael R. Rankin); National Institutes of Health, National Institute of General Medical Sciences (grant No. T32 GM145304).

References

Afonine, P. V., Grosse-Kunstleve, R. W., Echols, N., Headd, J. J., Moriarty, N. W., Mustyakimov, M., Terwilliger, T. C., Urzhumtsev, A., Zwart, P. H. & Adams, P. D. (2012). Acta Cryst. D68, 352–367.
Andersen, K. R., Leksa, N. C. & Schwartz, T. U. (2013). Proteins, 81, 1857–1861.
Bolanos-Garcia, V. M. & Davies, O. R. (2006). Biochim. Biophys. Acta, 1760, 1304–1313.
Butryn, A., Stoehr, G., Linke-Winnebeck, C. & Hopfner, K.-P. (2015). Acta Cryst. F71, 471–476.
Chai, L., Zhu, P., Chai, J., Pang, C., Andi, B., McSweeney, S., Shanklin, J. & Liu, Q. (2021). Crystals, 11, 1227.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
Cronk, J. D., Endrizzi, J. A., Cronk, M. R., O'Neill, J. W. & Zhang, K. Y. J. (2001). Protein Sci. 10, 911–922.
Cronk, J. D., Rowlett, R. S., Zhang, K. Y. J., Tu, C., Endrizzi, J. A., Lee, J., Gareiss, P. C. & Preiss, J. R. (2006). Biochemistry, 45, 4351–4361.
David, G., Blondeau, K., Schiltz, M., Penel, S. & Lewit-Bentley, A. (2003). J. Biol. Chem. 278, 43728–43735.
Doyle, J. & Cooper, J. S. (2024). StatPearls. Treasure Island: StatPearls Publishing.
Dwivedy, A., Jha, B., Singh, K. H., Ahmad, M., Ashraf, A., Kumar, D. & Biswal, B. K. (2018). Acta Cryst. F74, 558–566.
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
Falgenhauer, E., von Schönberg, S., Meng, C., Mückl, A., Vogele, K., Emslander, Q., Ludwig, C. & Simmel, F. C. (2021). ChemBioChem, 22, 2805–2813.
Funakoshi, A., Tsubota, Y., Fujii, K., Ibayashi, H. & Takagi, Y. (1980). J. Biochem. 88, 1113–1118.
Grøftehauge, M. K., Truan, D., Vasil, A., Denny, P. W., Vasil, M. L. & Pohl, E. (2015). Int. J. Mol. Sci. 16, 15971–15984.
Grzechowiak, M., Sekula, B., Jaskolski, M. & Ruszkowski, M. (2021). Acta Biochim. Pol. 68, 29–31.
Harper, S. & Speicher, D. W. (2011). Methods Mol. Biol. 681, 259–280.
Hatti, K., Biswas, A., Chaudhary, S., Dadireddy, V., Sekar, K., Srinivasan, N. & Murthy, M. R. N. (2017). J. Struct. Biol. 197, 372–378.
Hu, G., Taylor, A. B., McAlister-Henn, L. & Hart, P. J. (2007). Arch. Biochem. Biophys. 461, 66–75.
Hungler, A., Momin, A., Diederichs, K. & Arold, S. T. (2016). J. Appl. Cryst. 49, 2252–2258.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Keegan, R., Waterman, D. G., Hopper, D. J., Coates, L., Taylor, G., Guo, J., Coker, A. R., Erskine, P. T., Wood, S. P. & Cooper, J. B. (2016). Acta Cryst. D72, 933–943.
Lebendiker, M. & Danieli, T. (2017). Methods Mol. Biol. 1485, 257–273.
Liebschner, D., Afonine, P. V., Moriarty, N. W., Poon, B. K., Sobolev, O. V., Terwilliger, T. C. & Adams, P. D. (2017). Acta Cryst. D73, 148–157.
Liu, W., MacGrath, S. M., Koleske, A. J. & Boggon, T. J. (2012). Acta Cryst. F68, 154–158.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
Merlin, C., Masters, M., McAteer, S. & Coulson, A. (2003). J. Bacteriol. 185, 6415–6424.
Musille, P. & Ortlund, E. (2014). Acta Cryst. F70, 166–172.
Niedzialkowska, E., Gasiorowska, O., Handing, K. B., Majorek, K. A., Porebski, P. J., Shabalin, I. G., Zasadzinska, E., Cymborowski, M. & Minor, W. (2016). Protein Sci. 25, 720–733.
Parsy, C. B., Chapman, C. J., Barnes, A. C., Robertson, J. F. & Murray, A. (2007). J. Chromatogr. B, 853, 314–319.
Pederzoli, R., Tarantino, D., Gourlay, L. J., Chaves-Sanjuan, A. & Bolognesi, M. (2020). Acta Cryst. F76, 392–397.
Pfeifer, B. A., Admiraal, S. J., Gramajo, H., Cane, D. E. & Khosla, C. (2001). Science, 291, 1790–1792.
Quadri, L. E. N., Weinreb, P. H., Lei, M., Nakano, M. M., Zuber, P. & Walsh, C. T. (1998). Biochemistry, 37, 1585–1595.
Robichon, C., Luo, J., Causey, T. B., Benner, J. S. & Samuelson, J. C. (2011). Appl. Environ. Microbiol. 77, 4634–4646.
Simpkin, A. J., Simkovic, F., Thomas, J. M. H., Savko, M., Lebedev, A., Uski, V., Ballard, C., Wojdyr, M., Wu, R., Sanishvili, R., Xu, Y., Lisa, M.-N., Buschiazzo, A., Shepard, W., Rigden, D. J. & Keegan, R. M. (2018). Acta Cryst. D74, 595–605.
Stols, L., Gu, M., Dieckman, L., Raffen, R., Collart, F. R. & Donnelly, M. I. (2002). Protein Expr. Purif. 25, 8–15.
Tropea, J. E., Cherry, S. & Waugh, D. S. (2009). Methods Mol. Biol. 498, 297–307.
Whicher, J. R., Smaga, S. S., Hansen, D. A., Brown, W. C., Gerwick, W. H., Sherman, D. H. & Smith, J. L. (2013). Chem. Biol. 20, 1340–1351.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
