Roles of the hydroxy group of tyrosine in crystal structures of Sulfurisphaera tokodaii O⁶-methylguanine-DNA methyltransferase

Structural analyses of O⁶-methylguanine-DNA methyltransferases (MGMTs) and their mutants suggest that the highly conserved tyrosine at the N-terminus of the helix–turn–helix motif may play a protective role in MGMTs by preventing oxidants from entering the active site.


Introduction
The genomes of all organisms are always at risk. Fortunately, cells can detect and react to any damage to genomic integrity through a variety of DNA-repair mechanisms, each with its own target and mechanism. One of the most important DNA-repair mechanisms involves the enzyme O⁶-methylguanine-DNA methyltransferase (also known as MGMT, AGT or OGT; EC 2.1.1.63; Kaina et al., 2007). Its primary function is to remove highly cytotoxic O⁶-alkyl adducts on the guanine base (O⁶-alkyl G), protecting the cell from adverse biological effects induced by alkylating agents (Jacinto & Esteller, 2007; Zhong et al., 2010).
MGMT is highly conserved in all organisms ranging from bacteria to mammals. It transfers the alkyl adduct, via a direct damage-reversal pathway, from the O⁶ oxygen of O⁶-alkyl G to a cysteine residue in the catalytic pocket. This unconventional mechanism repairs the DNA but irreversibly inactivates MGMT, which is why it is described as a 'suicide' enzyme (Pegg, 2011). After the introduction of the alkyl group, the alkylated MGMT protein appears to be rapidly ubiquitinated and is susceptible to degradation by the proteasome, rather than being demethylated and reconstituted to reactivate the enzyme. In other words, one MGMT molecule can repair only one alkyl adduct.
The Sulfurisphaera tokodaii MGMT enzyme (StoMGMT) exhibits a typical MGMT protein structure consisting of two domains: a highly conserved C-terminal domain (CTD) that superimposes remarkably well on all available MGMT structures and an N-terminal domain (NTD) which, in contrast, differs greatly among MGMTs. The CTD contains a DNA-binding helix-turn-helix (HTH) motif, which is followed by an Asn hinge. This hinge precedes the -(V/I)PCHRV(V/I)- amino-acid sequence, which contains a conserved catalytic cysteine and an active-site loop, and is involved in substrate specificity. The catalytic network consists of cysteine, water, histidine and glutamic acid, similar to the catalytic triad of serine proteases.
In addition to the active-site loop sequence, there are 5-6 conserved amino acids in MGMTs. Among them, the tyrosine residue located at the entrance to the active pocket is said to play a role in promoting the reaction by rotating the target base of damaged DNA through steric and electrostatic effects (Hu et al., 2008). Although there are reports of computational studies on the activity of the enzyme with mutations at tyrosine residues (Thirumal Kumar et al., 2019), no studies of the crystal structures of its mutants have been found. In this study, we mutated two conserved amino acids in StoMGMT. Firstly, we mutated Tyr91, which is conserved in the vicinity of the active-site loop in StoMGMT, to phenylalanine to produce the Y91F mutant. Secondly, we mutated the cysteine which is responsible for receiving the methyl group in the active site to serine (C120S mutant). In addition, we created a double mutant that included both mutations (Y91F/C120S double mutant). We decided to investigate the function of Tyr91 in detail by comparing the crystal structures of these mutants and their complexes with substrate analogs.
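The residue numbering used for the mutants can be checked against the construct sequence listed in Table 1. The following is a minimal sketch, not part of the original study: the helper names `apply_mutation` and `make_mutant` are hypothetical, and the sequence is copied from Table 1 with 1-based numbering assumed.

```python
# Construct sequence as listed in Table 1 (1-based residue numbering assumed).
WILD_TYPE = (
    "MIVYGLYKSPFGPITVAKNEKGFVMLDFCD"
    "CAERSSLDNDYFTDFFYKLDLYFEGKKV"
    "DLTEPVDFKPFNEFRIRVFKEVMRIKWG"
    "EVRTYKQVADAVKTSPRAVGTALSKNNV"
    "LLIIPCHRVIGEKSLGGYSRGVELKRKL"
    "LELEGIDVAKFIEK"
)


def apply_mutation(sequence: str, mutation: str) -> str:
    """Apply one point mutation written as e.g. 'Y91F' (hypothetical helper)."""
    old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - 1  # convert 1-based residue number to 0-based index
    if sequence[index] != old:
        raise ValueError(
            f"expected {old} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + new + sequence[index + 1:]


def make_mutant(sequence: str, mutations: list[str]) -> str:
    """Apply several point mutations in turn (hypothetical helper)."""
    for mutation in mutations:
        sequence = apply_mutation(sequence, mutation)
    return sequence


Y91F = make_mutant(WILD_TYPE, ["Y91F"])
C120S = make_mutant(WILD_TYPE, ["C120S"])
DOUBLE = make_mutant(WILD_TYPE, ["Y91F", "C120S"])
```

Because `apply_mutation` refuses to substitute a residue that does not match the stated wild-type amino acid, a numbering mistake (for example an off-by-one position) raises an error instead of silently producing the wrong construct.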

Macromolecule production
The STK_RS05355 gene encoding the MGMT protein from S. tokodaii strain 7T was amplified from the genomic DNA of the strain (NBRC 100140G; obtained from NBRC, NITE, Kisarazu, Japan) using polymerase chain reaction (PCR). The PCR fragment was digested with NdeI and BamHI and cloned into a pET-11a expression vector. The site-directed mutants Y91F, C120S and Y91F/C120S were generated using the QuikChange Site-Directed Mutagenesis Kit (Agilent Technologies Japan, Tokyo, Japan). Each clone was transformed into Escherichia coli Rosetta-gami (DE3) cells and plated on Luria-Bertani (LB) agar plates containing ampicillin and chloramphenicol. A single colony from the LB agar plate was used to inoculate the primary culture in 15 ml LB medium supplemented with 15 µl ampicillin (50 mg ml⁻¹) and 15 µl chloramphenicol (34 mg ml⁻¹), which was allowed to grow overnight at 310 K. A secondary culture was set up by adding an inoculum from the primary culture to 1 l LB medium supplemented with ampicillin (50 mg ml⁻¹) and chloramphenicol (34 mg ml⁻¹) to a final concentration of 0.1%. The culture was incubated for 12 h at 310 K. The cells were harvested by centrifuging the culture at 8000g for 15 min at 277 K. The pellet was resuspended in twice its volume of lysis buffer (20 mM Tris-HCl pH 8.0, 50 mM NaCl) and sonicated on ice using an ultrasonicator (UD-201; Tomy Seiko Co., Tokyo, Japan) seven times using a 30 s on/60 s off cycle. The proteins were purified by heating the supernatant to 343 K for 30 min, followed by removal of the precipitated biomolecules by centrifugation at 8000g for 15 min. As analyzed by SDS-PAGE, this treatment precipitated most of the proteins of the E. coli expression host, while the thermostable proteins remained soluble. The supernatant was brought to 50% saturation using ammonium sulfate, clarified by centrifugation and precipitated with 70% saturated ammonium sulfate.
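As a quick arithmetic check on the culture supplements, the sketch below converts a stock concentration and a volume fraction into a working concentration. It assumes that the stated 0.1% refers to the volume fraction of each antibiotic stock (rather than, for example, the inoculum), which the text does not state explicitly; the function names are hypothetical.

```python
def working_concentration_ug_per_ml(stock_mg_per_ml: float,
                                    volume_fraction: float) -> float:
    """Working concentration (µg/ml) after diluting a stock by a volume fraction."""
    return stock_mg_per_ml * 1000.0 * volume_fraction


def stock_volume_ml(culture_volume_ml: float, volume_fraction: float) -> float:
    """Volume of stock (ml) needed for a culture of the given volume."""
    return culture_volume_ml * volume_fraction


# Assumption: 0.1% (v/v) of each stock in the 1 l secondary culture.
FRACTION = 0.001
ampicillin = working_concentration_ug_per_ml(50.0, FRACTION)       # about 50 µg/ml
chloramphenicol = working_concentration_ug_per_ml(34.0, FRACTION)  # about 34 µg/ml
stock_needed = stock_volume_ml(1000.0, FRACTION)                   # about 1 ml per litre
```

Under this assumption the working concentrations correspond to roughly 50 µg ml⁻¹ ampicillin and 34 µg ml⁻¹ chloramphenicol.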
The resuspended protein was dialyzed overnight against 1 l dialysis buffer [50 mM potassium phosphate pH 6.5, 50 mM potassium chloride, 0.1 mM ethylenediaminetetraacetic acid (EDTA)] at 277 K using a 10 kDa molecular-weight cutoff membrane. The protein sample was applied onto a 5 ml HiTrap SP HP column (Cytiva, Marlborough, Massachusetts, USA) pre-equilibrated in dialysis buffer. The loaded column was washed with dialysis buffer and eluted using an increasing linear gradient of elution buffer (50 mM potassium phosphate pH 6.5, 550 mM potassium chloride, 0.1 mM EDTA) over 30 column volumes. The eluted protein was desalted against dialysis buffer using Amicon Ultra centrifugal filter units (Merck KGaA, Darmstadt, Germany). The production of the macromolecule is summarized in Table 1.
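The linear gradient described above runs from 50 mM KCl (dialysis buffer) to 550 mM KCl (elution buffer) over 30 column volumes of a 5 ml column. The following sketch, with hypothetical function names and assuming ideal linear mixing with no delay volume, estimates the nominal KCl concentration at a given point in the gradient.

```python
def kcl_concentration_mM(column_volumes: float,
                         start_mM: float = 50.0,
                         end_mM: float = 550.0,
                         gradient_cv: float = 30.0) -> float:
    """Nominal KCl concentration (mM) after a number of column volumes."""
    if not 0.0 <= column_volumes <= gradient_cv:
        raise ValueError("column_volumes must lie within the gradient")
    fraction = column_volumes / gradient_cv  # fraction of the gradient completed
    return start_mM + fraction * (end_mM - start_mM)


def gradient_volume_ml(column_volumes: float, column_ml: float = 5.0) -> float:
    """Buffer volume (ml) corresponding to a number of column volumes."""
    return column_volumes * column_ml


midpoint_mM = kcl_concentration_mM(15.0)    # halfway through the gradient
total_gradient_ml = gradient_volume_ml(30.0)  # full gradient on a 5 ml column
```

In practice the concentration at which a protein elutes also depends on system dead volume and mixing, so these values are nominal only.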

Crystallization
Purified StoMGMT was concentrated to 10 mg ml⁻¹. The crystallization conditions were screened using the hanging-drop vapor-diffusion method. The experiments were performed at 293 K using several commercial screens. Diffraction-quality crystals were obtained from Index screen conditions 56, 63 and 69 (Hampton Research, Aliso Viejo, California, USA). Prior to data collection, the crystals were soaked in reservoir solution with 1 mM O⁶-methyl-2′-deoxyguanosine (hereafter referred to as O⁶-mdG; Berry & Associates, Dexter, Michigan, USA) for 0.5-2 h. Crystallization conditions are given in Table 2.

Table 1
Macromolecule-production information.

Source organism: Sulfurisphaera tokodaii strain 7T
Expression vector: pET-11a
Expression host: E. coli
Complete amino-acid sequence of the construct produced†: MIVYGLYKSPFGPITVAKNEKGFVMLDFCDCAERSSLDNDYFTDFFYKLDLYFEGKKVDLTEPVDFKPFNEFRIRVFKEVMRIKWGEVRTYKQVADAVKTSPRAVGTALSKNNVLLIIPCHRVIGEKSLGGYSRGVELKRKLLELEGIDVAKFIEK
† The positions of the site-directed mutations are underlined.

Data collection and processing
The crystals were cryoprotected by transferring them into the perfluoropolyether oil Fomblin Y (Merck KGaA, Darmstadt, Germany) and were flash-cooled in a cold stream of nitrogen gas at 100 K immediately before data collection. The diffraction data for StoMGMT were collected on the macromolecular crystallography beamlines BL-1A and BL-5A at the Photon Factory (PF), Tsukuba, Japan. Images were indexed and integrated using XDS (Kabsch, 2010a,b) followed by data reduction and scaling using AIMLESS (Evans & Murshudov, 2013). All calculations were carried out within the CCP4i interface (Potterton et al., 2018) using the CCP4 software suite. Data-collection statistics are summarized in Table 3.

Results and discussion
The crystal structures of StoMGMT were determined at resolutions of 1.13-2.60 Å. Almost all amino-acid residues of StoMGMT could be assigned in the final models, except for some residues at the C-terminus (Val150-Lys156).
StoMGMT shares a high percentage of similarity with MGMTs from human (43%; Wibley et al., 2000; Daniels et al., 2004; Duguid et al., 2005), Pyrococcus kodakaraensis (41%; Hashimoto et al., 1999), E. coli (31%; Moore et al., 1994), Mycobacterium tuberculosis (35%; Miggiano et al., 2013, 2016) and Saccharolobus solfataricus (68%; Perugino et al., 2015; Morrone et al., 2017; Rossi et al., 2018). The CTDs of the MGMTs are highly similar across species, whereas the NTDs are quite variable (Fig. 1). The NTD of StoMGMT was composed of an antiparallel β-sheet consisting of three β-strands and an interconnected helical region (h1-H3, consisting of two 3₁₀-helices and one α-helix). The β-sheet was connected to h1 through a loop between Asp27 of β3 and Glu33 of h1. Within this loop region, there were two cysteine residues, Cys29 and Cys31, which formed a disulfide bond and showed two different conformations in all of the crystals except for the Y91F/C120S-O⁶-mdG crystal (Fig. 1). This disulfide bond, which is also present in S. solfataricus MGMT, is a characteristic feature of thermophilic proteins and is necessary for thermal stability (Perugino et al., 2015). Interestingly, multiple conformations of this disulfide bond have so far been observed only in the S. tokodaii MGMT, although the reason for this is unclear. Even the most similar ortholog of StoMGMT, S. solfataricus MGMT, does not show such multiple conformations of the disulfide bond. The solvent-exposed H3 helix was located along the β1 strand; the CTD was connected to the NTD through a connecting loop consisting of the region between Lys56 and Phe69. Helices H5 and H6 form the HTH motif, which binds to the minor groove of DNA. The short H7 helix in the middle of the CTD contained the catalytic Cys120 residue in the conserved PCHRV motif. The last H9 helix of the CTD was not visible in any structure except for those of the Y91F and C120S mutants.
One or two sulfate ions were found in all structures except for the Y91F/C120S-O⁶-mdG structure. The first was found in a positively charged saddle consisting of the N atom of Met1 and the side chains of Arg75 and Lys99 from the neighboring molecule (Fig. 2a). The side chains of these two basic amino acids are thought to be involved in DNA binding based on comparisons with MGMTs from other species (Perugino et al., 2015). The sulfate ion that connects the two molecules may be an artifact that maintains the crystal structure. The second, found only in the wild type (Wild), was bound to the main-chain N atoms of Glu126 and Lys127 through hydrogen bonds (Fig. 2b). This site is adjacent to the active pocket for the substrate and mimics the position of the phosphate group in the DNA. The structures of the mutant crystals were as expected: in the Y91F mutant electron density for the hydroxy group of Tyr91 was not observed (Fig. 3b), in the C120S mutant the S atom of Cys120 was replaced by an O atom, resulting in serine (Fig. 3c), and in the Y91F/C120S mutant the two mutations occurred simultaneously (Fig. 3e). The root-mean-square displacements (r.m.s.d.s) between the individual structures of Wild and the new mutants averaged 0.24 (±0.14) Å for all main-chain atoms and 0.93 (±0.26) Å for all protein atoms. These crystal structures showed almost no structural changes except in the mutated part.
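For readers unfamiliar with how such r.m.s.d. values are obtained and averaged, the following minimal sketch computes the r.m.s.d. between two matched sets of atomic positions and summarizes a list of such values as a mean and standard deviation. It is an illustration only: it assumes the structures have already been superposed and that atoms are matched one-to-one, it uses the sample standard deviation (the text does not state which estimator was used), and the coordinates are toy values, not taken from the structures reported here.

```python
import math
from statistics import mean, stdev


def rmsd(coords_a, coords_b):
    """R.m.s.d. (Å) between two equal-length lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def mean_and_spread(values):
    """Mean and sample standard deviation of a list of r.m.s.d. values."""
    return mean(values), stdev(values)


# Toy coordinates (hypothetical; already superposed, atoms in matching order).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (1.5, 1.5, 0.2)]
pair_value = rmsd(reference, shifted)
```

A full comparison would first superpose each pair of models (for example by least-squares fitting) before applying `rmsd`, and would then pass the resulting pairwise values to `mean_and_spread`.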
The Wild m, C120S-O⁶-mdG and Y91F/C120S-O⁶-mdG structures were obtained from crystals that had been soaked in reservoir solution containing O⁶-mdG. The crystal structure of Wild m had a methyl group attached to the S atom at the γ-position of the Cys120 residue in the active site (Fig. 3a). However, electron density for O⁶-mdG or 2′-deoxyguanosine (dG) was not observed in this crystal structure. In contrast, in the crystal structure of C120S-O⁶-mdG, electron density for O⁶-mdG was observed and the methyl group had not been transferred to the protein (Fig. 3d). The N atom at position 3 of the base moiety of O⁶-mdG was hydrogen-bonded to the hydroxy group of Tyr91. The crystal structure of Y91F/C120S-O⁶-mdG also showed electron density for O⁶-mdG in which the methyl group had not been transferred to the protein (Fig. 3f). However, electron density for O⁶-mdG or dG was not found in the Y91F mutant crystal structure, regardless of whether the crystal had been soaked in O⁶-mdG solution.
Figure 1
Overall structure of wild-type StoMGMT. The protein is shown as a ribbon diagram with the NTD (residues 1-55) colored cyan, the connecting loop colored purple and the CTD (residues 70-156) colored green. β indicates β-strand, while h and H indicate 3₁₀-helix and α-helix, respectively. The Cys29-Cys31 disulfide bond is found in two different conformations. The sulfate ions are shown using a ball-and-stick model. H9 is ordered in the structures of the C120S and Y91F mutants.

Figure 2
Electrostatic surfaces of Wild binding to two sulfate ions, which are bound to the NTD (a) and CTD (b). The positively charged surface is colored blue while the negatively charged surface is colored red.

The position of the guanine base moiety in the crystal structures of C120S-O⁶-mdG and Y91F/C120S-O⁶-mdG appears to be the same as that of Cys119 and substrate DNA (Perugino et al., 2015), suggesting that the O⁶-mdG in this study can be regarded as having reproduced the position of the substrate. We initially thought that the hydroxy group of Tyr91 was necessary for positioning the methylated base portion of DNA in the active site. In the crystal structure of C120S-O⁶-mdG, a hydrogen bond between the hydroxy group of Tyr91 and N3 of the purine base was observed. In the crystal structure of Y91F/C120S-O⁶-mdG, the electron density for bound O⁶-mdG is ambiguous and the B factor of the O⁶-mdG is high. Furthermore, the resolution is lower than that of the other structures. Therefore, the hydroxy group of Tyr91 might contribute to stabilizing the binding of the substrate. However, the crystal structure of Y91F/C120S-O⁶-mdG suggests that the Tyr91 hydroxy group is not essential for the substrate base to be able to enter the binding pocket. The lack of electron density in the Wild m crystal structure for dG that has lost its methyl group implies that the demethylated product is rapidly released from the enzyme. A previous study reported that the hydroxy group of Tyr91 may stabilize the repaired guanine by reducing its negative charge (Daniels et al., 2004). The S atom of cysteine is more nucleophilic than the O atom at position 6 of guanine, and the methyl group is readily transferred to the cysteine by an SN2 reaction (Moore et al., 1994). In the crystal structure of the Y91F mutant, the thiol group of Cys120 was converted to a sulfo group (Cys-SO₃H) with three O atoms attached to it (Fig. 3b), even though no additional manipulation was performed. This means that the cysteine in the expressed protein was oxidized by an unidentified oxidant.
Thus, the enzyme would be inactive because oxidized cysteine would not be able to function as a nucleophile in the reaction. Because no electron density for dG or a methyl group was observed in the structure of Y91F mutant crystals soaked in O⁶-mdG, and because Cys120 was not oxidized in Wild, we speculate that the hydroxy group of Tyr91 may prevent the oxidant from entering the active site owing to its size and charge. This suggests that the role of this tyrosine, which is highly conserved at the N-terminus of the HTH motif across species, may be to protect the active site of MGMT, which is a suicide enzyme that can only work once. However, because the physiological context is critical in determining whether or not a specific mutation in a protein affects its function, definitive proof of the effects of these StoMGMT mutations and of the role of Tyr91 in protecting Cys120 from oxidation will only be possible with in vivo studies.
In conclusion, the crystal structures of the wild type and the Y91F, C120S and Y91F/C120S mutants of the MGMT enzyme derived from S. tokodaii revealed that the hydroxy group of tyrosine may play a protective role in MGMTs by preventing oxidants from entering their active sites. Overall, our results may provide a framework for directing future studies aimed at understanding the molecular mechanisms by which highly conserved amino acids help to ensure the integrity of suicide enzymes, in addition to promoting enzyme activity.