research communications
Cloning, expression, purification, crystallization and X-ray crystallographic analysis of recombinant human C1ORF123 protein
aInstitute of Systems Biology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia, bSchool of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia, cDepartment of Pathology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia, dCentre for Chemical Biology, Universiti Sains Malaysia, 11900 Bayan Lepas, Penang, Malaysia, and eDiamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, England
*Correspondence e-mail: clng@ukm.edu.my
C1ORF123 is a human hypothetical protein found in open reading frame 123 of chromosome 1. The protein belongs to the DUF866 protein family comprising eukaryote-conserved proteins with unknown function. Recent proteomic and bioinformatic analyses identified the presence of C1ORF123 in brain, frontal cortex and synapses, as well as its involvement in endocrine function and polycystic ovary syndrome (PCOS), indicating the importance of its biological role. In order to provide a better understanding of the biological function of the human C1ORF123 protein, the characterization and analysis of recombinant C1ORF123 (rC1ORF123), including overexpression and purification, verification by mass spectrometry and a Western blot using anti-C1ORF123 antibodies, crystallization and X-ray diffraction analysis of the protein crystals, are reported here. The rC1ORF123 protein was crystallized by the hanging-drop vapor-diffusion method with a reservoir solution comprised of 20% PEG 3350, 0.2 M magnesium chloride hexahydrate, 0.1 M sodium citrate pH 6.5. The crystals diffracted to 1.9 Å resolution and belonged to an orthorhombic space group with unit-cell parameters a = 59.32, b = 65.35, c = 95.05 Å. The calculated Matthews coefficient (VM) value of 2.27 Å3 Da−1 suggests that there are two molecules per asymmetric unit with an estimated solvent content of 45.7%.
Keywords: C1ORF123; hypothetical protein; DUF866; polycystic ovary syndrome; bioinformatic analysis.
1. Introduction
Open reading frame 123, which is located in the short arm of human chromosome 1, encodes a hypothetical protein known as C1ORF123 (Selvarajan & Shanmughavel, 2014). C1ORF123 consists of 160 amino acids with a calculated molecular weight of approximately 18 kDa. The C1ORF123 protein is exclusively found in eukaryotic cells and belongs to the DUF866 family of proteins of unknown function. To date, the only known protein structure from the DUF866 family is that of the Plasmodium falciparum homologue MAL13P1.257, which shares 26% sequence identity with C1ORF123 (Holmes et al., 2006). No functional studies have yet been reported for MAL13P1.257.
In humans, C1ORF123 is expressed in various anatomical regions, including the brain, skeletal muscles and ovary. Bioinformatics analysis has annotated the C1ORF123 protein as a cellular protein (Selvarajan & Shanmughavel, 2014) and suggests that it is involved in the pathway related to an abnormality known as polycystic ovary syndrome (PCOS; Mohamed-Hussein & Harun, 2009). PCOS is a heterogeneous endocrine disorder that causes ∼10% of infertility in women (Diamanti-Kandarakis, 2008). Interestingly, a proteomic analysis of goat adipose tissue also identified the C1ORF123 homologue as one of the adipokines that may be involved in endocrine function (Restelli et al., 2014). Other proteomics studies have found that the C1ORF123 protein is largely expressed in the hippocampus of people suffering from schizophrenia, bipolar disorder and methamphetamine-induced sensitization of the prefrontal cortex, as well as being a unique protein in the frontal cortex of aged rats associated with slow-wave sleep (SWS) (Schubert et al., 2015; Wearne et al., 2015; Vazquez et al., 2009). This indicates the involvement of C1ORF123 in psychotic diseases or in age-related changes in brain function. A homologue of C1ORF123 has also been identified in the electric organ of the Pacific electric ray Torpedo californica along with many neuromuscular junctions and presynaptic proteins, suggesting its role in synapse structure and maintenance (Mate et al., 2011). C1ORF123 has also been identified as an O-GlcNAc transferase (OGT) interactor, indicating its possible role in the post-translational O-GlcNAcylation of proteins, which is important in many biological processes (Deng et al., 2014). The network-based approach of the STRING database (Franceschini et al., 2013) further deciphers the potential function of C1ORF123 and its homologues by the identification of interacting partners (Table 1).
To better understand its biological function, we are working towards structural analysis of the human C1ORF123 protein. Here, we report the cloning, overexpression, purification, protein characterization and crystallization together with the initial X-ray crystallographic analysis of recombinant C1ORF123 (rC1ORF123).
2. Materials and methods
2.1. Protein production
The 492 bp coding sequence for human C1ORF123 (Gene ID 54987) was synthesized and cloned between the NdeI and HindIII restriction-endonuclease sites of the pUC57 cloning vector (GenScript, USA). The C1ORF123 gene was subcloned into the pET-28b vector using the same restriction sites to produce the pET-28b-C1ORF123 construct, which includes a 6×His fusion tag at the N-terminus of the recombinant protein. The calculated molecular weight of rC1ORF123, which contains 20 amino acids as a fusion tag at the N-terminus (Table 2), is approximately 20 kDa using ProtParam (Gasteiger et al., 2005). Subsequently, the pET-28b-C1ORF123 construct was transformed into Escherichia coli strain BL21 Rosetta-gami (DE3) cells. A single colony of transformant was inoculated into 6 ml Luria–Bertani (LB) broth containing 50 mg ml−1 kanamycin and agitated overnight in an incubator shaker at 250 rev min−1 and 310 K. The bacterial culture was then inoculated into 1 l LB broth supplemented with 50 mg ml−1 kanamycin and grown at 250 rev min−1 at 310 K. After the OD600 had reached 0.5–0.6, expression of the recombinant protein was induced by adding 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). The culture was grown for a further 3–4 h at 310 K before the cells were harvested by centrifugation at 17 968g.
The bacterial pellet was resuspended in lysis buffer (10 ml per gram of cell pellet) consisting of 25 mM Tris–HCl pH 7.5, 100 mM NaCl, 20 mM β-mercaptoethanol, 20 mM imidazole before being lysed by sonication (Qsonica, 30 cycles of 38% amplitude for 30 s each). The cell lysate was centrifuged at 17 968g at 277 K for 30 min to separate the soluble proteins from the cell debris. The supernatant was filter-sterilized with a 0.22 µm PVDF membrane filter before being loaded onto an Ni–NTA-coupled HisTrap HP 5 ml column (GE Healthcare) which had been pre-equilibrated with binding buffer consisting of 25 mM Tris–HCl pH 7.5, 100 mM NaCl, 20 mM β-mercaptoethanol, 50 mM imidazole. The rC1ORF123 protein was eluted using a linear gradient of washing buffer consisting of 25 mM Tris–HCl pH 7.5, 100 mM NaCl, 20 mM β-mercaptoethanol, 500 mM imidazole. The protein eluted at an imidazole concentration of 79 mM. Fractions containing the rC1ORF123 protein were pooled and concentrated using Vivaspin concentrators fitted with a 3 kDa molecular-weight cutoff filter (Sartorius, Germany). The concentrated rC1ORF123 protein was further purified by size-exclusion chromatography (SEC) using a HiLoad 16/600 Superdex 75 pg gel-filtration column (GE Healthcare, USA) pre-equilibrated with size-exclusion buffer consisting of 25 mM Tris–HCl pH 7.5, 100 mM NaCl, 20 mM β-mercaptoethanol. The purity of the rC1ORF123 protein was verified using 12% SDS–PAGE (Fig. 1). The protein concentration of rC1ORF123 was assessed using the Bradford assay (Bio-Rad, USA). Fractions containing the rC1ORF123 protein purified by SEC were also pooled and concentrated to 8.0 mg ml−1.
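The elution point quoted above can be related to the gradient position with simple linear-mixing arithmetic. The sketch below assumes that the gradient runs from the binding buffer (50 mM imidazole, taken here as buffer A) to the 500 mM imidazole buffer (taken here as buffer B); the A/B labels and the function name are illustrative and are not taken from the original protocol.

```python
def percent_b(target_mM: float, a_mM: float, b_mM: float) -> float:
    """Percentage of buffer B in a linear A/B mix that gives the target concentration."""
    if a_mM == b_mM:
        raise ValueError("buffers A and B must differ in concentration")
    if not min(a_mM, b_mM) <= target_mM <= max(a_mM, b_mM):
        raise ValueError("target lies outside the range spanned by A and B")
    return 100.0 * (target_mM - a_mM) / (b_mM - a_mM)


# Assumed gradient end points: 50 mM (binding buffer) and 500 mM imidazole.
elution_percent_b = percent_b(79.0, 50.0, 500.0)
print(f"79 mM imidazole corresponds to about {elution_percent_b:.1f}% B")
```

Under these assumptions the protein elutes very early in the gradient, at roughly 6% B.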
2.2. Verification of recombinant C1ORF123 protein using a Western blot
A Western blot was performed according to the protocol described by Mahmood & Yang (2012). rC1ORF123 proteins were run on a 12% SDS–PAGE gel and transferred to a nitrocellulose membrane. After blocking with bovine serum albumin (BSA) for 1 h at room temperature, the membrane was probed with rabbit anti-human C1ORF123 antibodies (Sigma–Aldrich) for 2 h at room temperature followed by two 10 min PBST washes with agitation. The membrane was then incubated at room temperature for 1 h with a goat anti-rabbit IgG secondary antibody conjugated to horseradish peroxidase (HRP) (Sigma–Aldrich). The membrane was washed twice with PBST for 10 min and then soaked in SuperSignal West Pico chemiluminescent substrate (Thermo Scientific) for ∼5 min. The results were obtained using a chemiluminescent imaging system.
2.3. Mass-spectrometry
To confirm the sequence and the molecular weight of the expressed rC1ORF123 protein, a single protein band with a molecular weight of approximately 20 kDa was excised from the 12% SDS–PAGE gel (Fig. 1a) and used for protein identification by mass spectrometry. Peptides obtained after trypsin digestion of rC1ORF123 were extracted and analyzed by matrix-assisted laser desorption/ionization time-of-flight/time-of-flight mass spectrometry (MALDI-TOF/TOF MS) using a 5800 Proteomics Analyzer (Applied Biosystems/SCIEX; Bringans et al., 2008) by Proteomics International Pty Ltd (Australia). Protein identification was carried out using the Mascot sequence-matching software (Matrix Science) based on the Ludwig NR database.
2.4. Crystallization
The rC1ORF123 protein was purified to homogeneity in a buffer consisting of 25 mM Tris–HCl pH 7.5, 100 mM NaCl, 20 mM β-mercaptoethanol. It was then concentrated to 8 mg ml−1 and used for initial crystallization screening with the commercially available crystallization screening kits Index, Crystal Screen and Crystal Screen 2 (Hampton Research). Screening was performed using the sitting-drop vapor-diffusion method in standard 96-well MRC crystallization plates (Molecular Dimensions). Drops consisting of 0.5 µl rC1ORF123 protein and 0.5 µl reservoir solution were equilibrated against 80 µl reservoir solution at 293 K. Initial crystal hits were obtained from several crystallization conditions. The crystallization conditions were further optimized using the hanging-drop vapor-diffusion method in 24-well plates with crystallization drops consisting of 1 µl protein solution (concentrated to 7.8 mg ml−1) and 1 µl reservoir solution. Single crystals were obtained after 5 d from drops equilibrated against a reservoir solution consisting of 0.2 M magnesium chloride hexahydrate, 0.1 M sodium citrate tribasic pH 6.5, 20% PEG 3350. The best crystals, with typical dimensions of ∼400 × 100 × 50 µm (Fig. 2a), were selected for X-ray diffraction analysis. The crystallization of rC1ORF123 is summarized in Table 3.
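As a rough illustration of the optimized hanging-drop setup, the starting composition of a 1 µl + 1 µl drop can be estimated by simple dilution. This is a minimal sketch: it ignores the Tris, NaCl and β-mercaptoethanol contributed by the protein buffer, and the function and dictionary keys are illustrative names rather than part of the original protocol.

```python
def initial_drop_concentrations(protein_mg_ml, protein_ul, reservoir, reservoir_ul):
    """Concentrations in a freshly mixed drop, before any vapor diffusion."""
    total_ul = protein_ul + reservoir_ul
    drop = {"protein (mg/ml)": protein_mg_ml * protein_ul / total_ul}
    for component, concentration in reservoir.items():
        # Each reservoir component is diluted by the added protein volume.
        drop[component] = concentration * reservoir_ul / total_ul
    return drop


# Reservoir composition as reported for the optimized condition.
reservoir = {
    "magnesium chloride (M)": 0.2,
    "sodium citrate pH 6.5 (M)": 0.1,
    "PEG 3350 (%)": 20.0,
}
drop = initial_drop_concentrations(7.8, 1.0, reservoir, 1.0)
for component, value in drop.items():
    print(f"{component}: {value:.2f}")
```

With equal volumes every component starts at half its stock concentration; during equilibration the drop loses water to the reservoir and the concentrations rise again towards the reservoir values.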
2.5. Data collection and processing
Prior to flash-cooling in liquid nitrogen, the rC1ORF123 crystals were immersed for 5 min at 293 K in cryoprotectant solution consisting of 0.1 M sodium citrate tribasic pH 6.5, 0.2 M magnesium chloride, 22% PEG 3350, 20% glycerol. X-ray diffraction data were collected on the I02 beamline at Diamond Light Source, UK at 100 K in a nitrogen-gas stream and at a wavelength of 0.9797 Å. A total of 900 images were collected with 0.2° rotation range per image using a Pilatus 6M detector. The data were indexed and integrated using MOSFLM (Leslie & Powell, 2007) via the iMosflm interface (v.7.1.1) (Battye et al., 2011). The crystal system is orthorhombic, and POINTLESS (Evans, 2006) suggests that the crystal is most likely to belong to either space group P21212 or P212121. For further analysis, the data were scaled and merged with AIMLESS (Evans & Murshudov, 2013) in P222. The data-collection and processing statistics are summarized in Table 4.
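Two quantities follow directly from the data-collection parameters given above: the total rotation covered by the 900 images, and the scattering angle at which 1.9 Å reflections appear at a wavelength of 0.9797 Å (Bragg's law, λ = 2d sin θ). The sketch below is illustrative only; the function names are not taken from any of the processing programs cited.

```python
import math


def total_rotation_deg(n_images: int, oscillation_deg: float) -> float:
    """Total crystal rotation covered by a contiguous sweep."""
    return n_images * oscillation_deg


def two_theta_deg(wavelength_A: float, d_spacing_A: float) -> float:
    """Scattering angle 2θ from Bragg's law, λ = 2 d sin θ."""
    sin_theta = wavelength_A / (2.0 * d_spacing_A)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


sweep = total_rotation_deg(900, 0.2)
angle = two_theta_deg(0.9797, 1.9)
print(f"total rotation: {sweep:.0f} degrees")
print(f"2-theta at 1.9 A: {angle:.1f} degrees")
```

A 180° sweep is more than sufficient for complete data in an orthorhombic crystal system, and reflections at the 1.9 Å limit are recorded at a scattering angle of close to 30°.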
3. Results and discussion
The rC1ORF123 protein with an N-terminal 6×His tag (rC1ORF123) was successfully overexpressed and purified to homogeneity using affinity chromatography (Ni–NTA) and size-exclusion chromatography (Superdex 75). rC1ORF123 migrated as a single protein band on SDS–PAGE with a molecular weight of approximately 20 kDa (Fig. 1a) according to standard molecular-weight markers. Size-exclusion chromatography (SEC) showed the elution of a single peak containing rC1ORF123 at a position indicating that the molecular weight of the rC1ORF123 protein lies between those of myoglobin (17 kDa) and ovalbumin (44 kDa) (Fig. 1b). The SEC further indicates the molecular weight of rC1ORF123 to be ∼28 kDa (Fig. 1c), which implies that the protein is most likely to exist as a monomer in solution. For protein validation, a Western blot was performed using anti-human C1ORF123 antibody against the rC1ORF123 protein. rC1ORF123 was positively detected by the antibody, which confirmed that the rC1ORF123 protein is similar to human C1ORF123 (Fig. 1d). To further verify the identity of rC1ORF123, the protein was validated by MALDI-TOF/TOF MS (Applied Biosystems/SCIEX). There were 62 peptides that matched 38% of the protein sequence of the human C1ORF123 protein (Fig. 3). Both the Western blot and the MALDI-TOF/TOF results confirmed that rC1ORF123 is identical to the human C1ORF123 protein.
The purified rC1ORF123 was concentrated to 8.0 mg ml−1 and used for crystallization screening. Initial crystal hits were obtained from several crystallization conditions that contained 0.2 M magnesium or calcium ions, medium-size polyethylene glycol (PEG 3350 or PEG 8000) and buffers (sodium cacodylate, bis-tris or HEPES) with a pH in the range between 5.5 and 7.5. Crystals suitable for X-ray diffraction analysis were obtained after optimization from conditions that consisted of 0.2 M magnesium chloride, 0.1 M sodium citrate pH 6.5, 20%(w/v) PEG 3350 (Fig. 2a). The crystals were flash-cooled in liquid nitrogen after the addition of 20% glycerol to the solution as a cryoprotectant. The best rC1ORF123 crystal, with dimensions of ∼400 × 100 × 50 µm, diffracted to 1.9 Å resolution (Fig. 2b). Diffraction data were collected with 98.1% completeness on the I02 beamline at Diamond Light Source. The data were indexed and integrated using MOSFLM (Leslie & Powell, 2007) and were scaled and merged with AIMLESS (Evans & Murshudov, 2013). Indexing indicates that the crystal system is orthorhombic, with unit-cell parameters a = 59.32, b = 65.35, c = 95.05 Å. POINTLESS (Evans, 2006) suggests that the space group is either P212121 or P21212; the choice has yet to be confirmed by further structural analysis. The crystallographic parameters and data-collection statistics are shown in Table 4. The calculated Matthews coefficient (VM; Matthews, 1968) value of 2.28 Å3 Da−1 implies that the crystal consists of two rC1ORF123 molecules per asymmetric unit with an estimated solvent content of 46.1%. The self-rotation function (Crowther, 1972) was calculated from the rC1ORF123 data using MOLREP (Vagin & Teplyakov, 2010). The self-rotation function map shows three peaks in the κ = 180° section that correspond to the three perpendicular crystallographic twofold axes of the point group (222; Fig. 2c). The map also shows two pairs of noncrystallographic symmetry (NCS) twofold axes which are almost parallel to the crystallographic x and y axes. The corresponding self-rotation function peaks are at approximately φ = ±10° and φ = ±80°, suggesting that rC1ORF123 forms a dimer in the crystal. This is in good agreement with Holmes et al. (2006), who suggest that the P. falciparum homologue MAL13P1.257 might also form a weak dimer based on its crystal structure.
Secondary-structure prediction using Phyre2 (Kelley et al., 2015) suggests that C1ORF123 forms 15 β-strands (61% sequence coverage) and one α-helix (2% sequence coverage) and contains 14 loop regions (Fig. 3). This is similar to the P. falciparum homologue MAL13P1.257 (Holmes et al., 2006), which shares only 26% sequence identity with C1ORF123. Currently, we are working towards the structure determination of rC1ORF123. Comparison of human rC1ORF123 with its homologue from a protozoan parasite along with analysis of their conserved regions and structural differences may help us to understand the biological function of the DUF866 protein family.
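The Matthews analysis reported above can be reproduced approximately from the unit-cell parameters. The sketch below assumes a molecular weight of 20 000 Da (the text gives only "approximately 20 kDa", so the exact value used by the authors is unknown) and four symmetry operators per unit cell, which holds for both candidate space groups P212121 and P21212. The solvent fraction uses the common approximation 1 − 1.23/VM; all function names are illustrative.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in Å^3 for an orthorhombic cell (all angles 90°)."""
    return a * b * c


def matthews_coefficient(volume_A3, n_symops, n_per_asu, mw_da):
    """VM in Å^3 per Da: cell volume divided by the total protein mass in the cell."""
    return volume_A3 / (n_symops * n_per_asu * mw_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction, 1 - 1.23 / VM."""
    return 1.0 - 1.23 / vm


volume = orthorhombic_cell_volume(59.32, 65.35, 95.05)
ASSUMED_MW_DA = 20000.0  # assumption: rounded from "approximately 20 kDa"
for n_molecules in (1, 2, 3):
    vm = matthews_coefficient(volume, 4, n_molecules, ASSUMED_MW_DA)
    print(f"{n_molecules} per ASU: VM = {vm:.2f}, solvent = {100 * solvent_fraction(vm):.1f}%")

vm_two = matthews_coefficient(volume, 4, 2, ASSUMED_MW_DA)
```

With the rounded molecular weight, two molecules per asymmetric unit give a VM of about 2.3 Å3 Da−1 and a solvent content of roughly 47%, close to the reported values; one or three molecules give values that are far less typical of protein crystals.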
Acknowledgements
The authors would like to acknowledge the Ministry of Science, Technology and Innovation (MOSTI), Malaysia for financial support through the ScienceFund grant (02-01-02-SF0993).
References
Batisse, J., Batisse, C., Budd, A., Böttcher, B. & Hurt, E. (2009). J. Biol. Chem. 284, 34911–34917.
Battye, T. G. G., Kontogiannis, L., Johnson, O., Powell, H. R. & Leslie, A. G. W. (2011). Acta Cryst. D67, 271–281.
Bringans, S. D., Kendrick, T., Lui, J. & Lipscombe, R. J. (2008). Rapid Commun. Mass Spectrom. 22, 3450–3454.
Crowther, R. A. (1972). The Molecular Replacement Method, edited by M. G. Rossmann, pp. 173–178. New York: Gordon & Breach.
Danielsen, J. M. R., Sylvestersen, K. B., Bekker-Jensen, S., Szklarczyk, D., Poulsen, J. W., Horn, H., Jensen, L. J., Mailand, N. & Nielsen, M. L. (2011). Mol. Cell. Proteomics, 10, M110.003590.
Deng, R.-P., He, X., Guo, S.-J., Liu, W.-F., Tao, Y. & Tao, S.-C. (2014). Proteomics, 14, 1020–1030.
Diamanti-Kandarakis, E. (2008). Expert Rev. Mol. Med. 10, e3.
Evans, P. (2006). Acta Cryst. D62, 72–82.
Evans, P. R. & Murshudov, G. N. (2013). Acta Cryst. D69, 1204–1214.
Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J., Minguez, P., Bork, P., von Mering, C. & Jensen, L. J. (2013). Nucleic Acids Res. 41, D808–D815.
Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M. R., Appel, R. D. & Bairoch, A. (2005). The Proteomics Protocols Handbook, edited by J. M. Walker, pp. 571–607. Totowa: Humana Press.
Giot, L. et al. (2003). Science, 302, 1727–1736.
Holmes, M. A., Buckner, F. S., Van Voorhis, W. C., Mehlin, C., Boni, E., Earnest, T. N., DeTitta, G., Luft, J., Lauricella, A., Anderson, L., Kalyuzhniy, O., Zucker, F., Schoenfeld, L. W., Hol, W. G. J. & Merritt, E. A. (2006). Acta Cryst. F62, 180–185.
Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N. & Sternberg, M. J. E. (2015). Nature Protoc. 10, 845–858.
Krogan, N. et al. (2006). Nature (London), 440, 637–643.
Leslie, A. G. W. & Powell, H. R. (2007). Evolving Methods for Macromolecular Crystallography, edited by R. J. Read & J. L. Sussman, pp. 41–51. Dordrecht: Springer.
Mahmood, T. & Yang, P.-C. (2012). N. Am. J. Med. Sci. 4, 429–434.
Mate, S. E., Brown, K. J. & Hoffman, E. P. (2011). Skelet. Muscle, 1, 20.
Matthews, B. W. (1968). J. Mol. Biol. 33, 491–497.
Mohamed-Hussein, Z. A. & Harun, S. (2009). Theor. Biol. Med. Model. 6, 18.
Restelli, L., Codrea, M. C., Savoini, G., Ceciliani, F. & Bendixen, E. (2014). J. Proteomics, 108, 295–305.
Schenk, L., Meinel, D., Strässer, K. & Gerber, A. (2012). RNA, 18, 449–461.
Schubert, K. O., Föcking, M. & Cotter, D. R. (2015). Schizophr. Res. 167, 64–72.
Selvarajan, S. & Shanmughavel, P. (2014). Eur. J. Appl. Sci. Technol. 1, 43–49.
Stelzl, U. et al. (2005). Cell, 122, 957–968.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Vazquez, J., Hall, S. C. & Greco, M. A. (2009). Brain Res. 1298, 37–45.
Vinayagam, A., Stelzl, U., Foulle, R., Plassmann, S., Zenkner, M., Timm, J., Assmus, H. E., Andrade-Navarro, M. A. & Wanker, E. E. (2011). Sci. Signal. 4, rs8.
Wearne, T. A., Mirzaei, M., Franklin, J. L., Goodchild, A. K., Haynes, P. A. & Cornish, J. L. (2015). J. Proteome Res. 14, 397–410.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.