Diffraction Structural Biology Synchrotron Radiation Crystal Structure of UDP-glucose:anthocyanidin 3-O-glucosyltransferase from Clitoria ternatea

Flowers of the butterfly pea (Clitoria ternatea) accumulate a group of polyacylated anthocyanins, named ternatins, in their petals. The first step in ternatin biosynthesis is the transfer of glucose from UDP-glucose to anthocyanidins such as delphinidin, a reaction catalyzed in C. ternatea by UDP-glucose:anthocyanidin 3-O-glucosyltransferase (Ct3GT-A; AB185904). To elucidate the structure–function relationship of Ct3GT-A, recombinant Ct3GT-A was expressed in Escherichia coli and its tertiary structure was determined to 1.85 Å resolution by X-ray crystallography. The structure of Ct3GT-A shows a common folding topology, the GT-B fold, comprising two Rossmann-like β/α/β domains and a cleft located between the N- and C-domains containing two cavities that are used as binding sites for the donor (UDP-Glc) and acceptor substrates. Comparison of the structure of Ct3GT-A with that of the flavonoid glycosyltransferase VvGT1 from red grape (Vitis vinifera) in complex with UDP-2-deoxy-2-fluoro glucose and kaempferol showed that the locations of the catalytic His-Asp dyad and of the residues involved in recognizing UDP-2-deoxy-2-fluoro glucose are essentially identical in Ct3GT-A, whereas certain residues of VvGT1 involved in binding kaempferol are substituted in Ct3GT-A. These findings are important for understanding the differentiation of acceptor-substrate recognition in these two enzymes.


Introduction
Many small lipophilic compounds in living cells are modified by glycosylation, a process that can regulate the bioactivity of those compounds, their intracellular localization and their metabolism (Lim & Bowles, 2004). One of the most significant and representative glycosylation reactions in plants is the formation of anthocyanins, a class of flavonoids. Anthocyanins are water-soluble compounds based on a tricyclic flavonoid core and are known to function as pigments involved in determining the color of flowers, leaves, seeds and fruits (Offen et al., 2006; Tanaka et al., 2008).
The blue flower pigmentation of Clitoria ternatea results from the accumulation in the petal of polyacylated anthocyanins referred to as ternatins (Honda & Saito, 2002). Ternatins are delphinidin 3-O-(6″-O-malonyl)-β-glucoside derivatives that have a 3′,5′-di-O-β-glucoside structure in their B-ring, in which both glucosyl residues are repeatedly and alternately acylated with p-coumaroyl groups and glucosylated (Kazuma et al., 2003, 2004). Studies on ternatin biosynthesis in C. ternatea revealed that delphinidin is not directly glucosylated at the 3′- or 5′-hydroxyl group, but that glucosylation of delphinidin occurs only when it has a 6″-O-malonyl-β-glucoside at the 3-position. Thus, glucosylation of delphinidin at the 3-hydroxyl group was proposed to be the first key step of ternatin biosynthesis (Kogawa et al., 2007).
Ct3GT-A was identified in C. ternatea as a UDP-glucose:anthocyanidin 3-O-glucosyltransferase (GenBank accession No. AB185904) that catalyzes glucosyl transfer from UDP-glucose to anthocyanidins such as delphinidin (Fig. 1a). The putative amino acid sequence of Ct3GT-A is 45% identical to that of the enzyme VvGT1 from red grape (Vitis vinifera), which is a representative uridine diphosphate glycosyltransferase (UGT) with similar acceptor-substrate specificity. VvGT1 is a cyanidin 3-O-glycosyltransferase involved in the formation of anthocyanins, with a minor activity toward flavonols such as kaempferol (Fig. 1b) (Offen et al., 2006). The crystal structure of VvGT1, in complex with the non-transferable sugar donor UDP-2-deoxy-2-fluoro glucose (UDP-2FGlc) and the sugar acceptor kaempferol, provided the initial structural basis for understanding the catalytic mechanism and substrate recognition of this enzyme. In addition, the crystal structures of four other plant UGTs have been reported so far: Medicago truncatula UGT71G1, a triterpene/flavonoid glycosyltransferase involved in saponin biosynthesis (Shao et al., 2005), UGT85H2, an (iso)flavonoid glycosyltransferase involved in the biosynthesis of secondary metabolites (Li et al., 2007), UGT78G1, an (iso)flavonoid glycosyltransferase that functions in anthocyanin biosynthesis (Modolo et al., 2009) and Arabidopsis thaliana UGT72B1, a chloroaniline/chlorophenol glucosyltransferase in the metabolism of xenobiotics (Brazier-Hicks et al., 2007). These plant UGTs all have the GT-B fold, one of two general folds found in the UGT superfamily of enzymes (Coutinho et al., 2003; Breton et al., 2012), and they possess N- and C-terminal domains with similar Rossmann-like folds (Wang, 2009).
They also have in common a signature motif known as the putative secondary plant glycosyltransferase (PSPG) box near the C-terminus, which is thought to be involved in binding the UDP moiety of the sugar-donor substrate (Lairson et al., 2008). However, the relationship between the primary structures of these enzymes and their substrate specificity, including regioselective glycosylation, remains to be elucidated. Although the crystal structures of several UGTs have been determined, it is still unclear how UGTs distinguish between a large variety of sugar acceptors (e.g. anthocyanidins, flavonols and isoflavones) and synthesize many kinds of products.
Here, we present the three-dimensional structure of Ct3GT-A determined at a resolution of 1.85 Å using synchrotron radiation. The structure of Ct3GT-A shows the typical GT-B fold conserved in plant UGTs, but the structural features of the acceptor-substrate-binding site in Ct3GT-A differ in part from those of other UGTs. These findings offer insight into the structure-function relationship of Ct3GT-A.

Protein expression and purification
The gene encoding Ct3GT-A (GenBank accession No. AB185904) was PCR-amplified using the sense primer 5′-GACGACGACAAGATGAAAAACAAGCAGCATGTTGC-3′ and the antisense primer 5′-GAGGAGAAGCCCGGTTTAGCTAGAGGAAATCACTTC-3′, and the obtained product was ligated into the pET-30 Ek/LIC vector (Novagen). The Ct3GT-A cDNA fragment with an enterokinase cleavage site was isolated from the resultant plasmid by digestion with BglII and XhoI, and subcloned into the BamHI/SalI-digested pQE31 vector (Qiagen). The recombinant protein was overexpressed in Escherichia coli XL1 Blue cells (Stratagene) by adding isopropyl-β-D-galactoside to a final concentration of 1 mM and inducing the cells for 20 h at 298 K. The cells were harvested by centrifugation and resuspended in a buffer containing 50 mM Tris-HCl (pH 8.0), 500 mM NaCl, 20 mM imidazole, 1 mM dithiothreitol and 0.5 mM phenylmethylsulfonyl fluoride. After disrupting the cells by sonication, the cell debris was removed by centrifugation, and the supernatant was applied to a Ni-Sepharose column (GE Healthcare). The eluted fraction containing Ct3GT-A was dialyzed against 20 mM Tris-HCl (pH 7.4), 200 mM NaCl and 2 mM CaCl₂, and the N-terminal His-tag was removed by digestion with recombinant enterokinase (Novagen). Cation-exchange chromatography was then carried out on an SP-5PW column (Tosoh, Japan) to purify the enzyme to homogeneity.
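The primer design described above can be sanity-checked with a few lines of Python. The following is a minimal sketch, not part of the original protocol: it assumes that the leading bases named in `SENSE_OVERHANG` and `ANTISENSE_OVERHANG` are the vector-specific ligation-independent cloning overhangs (the text does not state where the overhang ends), and it checks only that the gene-specific parts begin with a start codon and end, on the coding strand, with a stop codon.

```python
# Primer sequences as given in the text (5'->3').
SENSE = "GACGACGACAAGATGAAAAACAAGCAGCATGTTGC"
ANTISENSE = "GAGGAGAAGCCCGGTTTAGCTAGAGGAAATCACTTC"

# ASSUMPTION: these prefixes are the vector-specific LIC overhangs.
# The split point is not stated in the text and is chosen for illustration.
SENSE_OVERHANG = "GACGACGACAAG"
ANTISENSE_OVERHANG = "GAGGAGAAGCCCGGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gene_specific(primer: str, overhang: str) -> str:
    """Strip the assumed overhang and return the gene-specific part."""
    if not primer.startswith(overhang):
        raise ValueError("primer does not start with the assumed overhang")
    return primer[len(overhang):]


sense_gene = gene_specific(SENSE, SENSE_OVERHANG)
antisense_gene = gene_specific(ANTISENSE, ANTISENSE_OVERHANG)

# The antisense primer anneals to the coding strand, so its reverse
# complement gives the 3' end of the coding sequence.
coding_3prime = reverse_complement(antisense_gene)

print("first codon of insert:", sense_gene[:3])      # ATG
print("last codon of insert: ", coding_3prime[-3:])  # TAA
```

Under these assumptions the insert starts with ATG and ends with the stop codon TAA, as expected for a full-length open reading frame.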

Crystallization and data collection
Single crystals of Ct3GT-A were obtained using the hanging-drop vapor-diffusion method. After mixing equal volumes of the protein solution (20 mg ml⁻¹) and the reservoir solution containing 0.1 M sodium citrate tribasic dihydrate (pH 5.6), 0.2 M ammonium acetate and 26% (w/v) polyethylene glycol 4000, the solution was equilibrated against the reservoir solution at 293 K. The crystals, which grew up to 0.05 × 0.05 × 0.5 mm in size, were soaked in a cryoprotectant solution containing 25% (v/v) glycerol in addition to the reservoir solution before measurement.
X-ray diffraction data were collected under a liquid-nitrogen stream (100 K) at beamline BL6A at the Photon Factory (Tsukuba, Japan). The dataset was indexed and processed with HKL2000 (Otwinowski & Minor, 1997). The diffraction data statistics are summarized in Table 1. All graphic images of the molecular structure were generated using the program PyMOL (DeLano, 2002). The atomic coordinates of recombinant wild-type Ct3GT-A have been deposited in the RCSB Protein Data Bank (PDB) under the accession code 3wc4.

Structure determination
The crystals of recombinant wild-type Ct3GT-A belong to the space group P2₁, with cell dimensions of a = 50.2 Å, b = 55.2 Å, c = 86.2 Å and β = 105.1°. There was one molecule per crystallographic asymmetric unit with a solvent content of 48% (v/v) based on a Matthews coefficient (V_M) of 2.4 Å³ Da⁻¹. Initial phases were obtained by molecular replacement using the coordinates of the homologous glycosyltransferase VvGT1 from V. vinifera (PDB ID: 2c1z) as a search model.
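The relationship between the cell dimensions, the Matthews coefficient and the solvent content quoted above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the molecular weight `ASSUMED_MW_DA` is a hypothetical round value chosen for illustration (the text does not give one), the factor of two asymmetric units per cell follows from space group P2₁, and the solvent fraction uses the standard approximation 1 − 1.23/V_M.

```python
import math


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Unit-cell volume (Å^3) of a monoclinic cell with unique axis b."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume: float, asu_per_cell: int,
                         molecules_per_asu: int, mol_weight_da: float) -> float:
    """V_M in Å^3 per Da: cell volume divided by total protein mass in the cell."""
    return cell_volume / (asu_per_cell * molecules_per_asu * mol_weight_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction from V_M (standard 1.23 constant)."""
    return 1.0 - 1.23 / v_m


# Cell dimensions reported in the text.
volume = monoclinic_cell_volume(50.2, 55.2, 86.2, 105.1)

# ASSUMPTION: hypothetical molecular weight, not taken from the text.
ASSUMED_MW_DA = 48_000

# P2_1 has two asymmetric units per unit cell; one molecule per asymmetric unit.
v_m = matthews_coefficient(volume, asu_per_cell=2, molecules_per_asu=1,
                           mol_weight_da=ASSUMED_MW_DA)
solvent = solvent_fraction(v_m)

print(f"cell volume   = {volume:.0f} Å^3")
print(f"V_M           = {v_m:.2f} Å^3/Da")
print(f"solvent frac. = {solvent:.2f}")
```

Independently of the assumed molecular weight, the two reported figures agree with each other: a V_M of 2.4 Å³ Da⁻¹ gives 1 − 1.23/2.4 ≈ 0.49, matching the stated solvent content of about 48%.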
An initial model of Ct3GT-A was built manually using COOT (Emsley & Cowtan, 2004) and subsequently refined to 1.85 Å resolution, with Rwork/Rfree of 17.0%/21.1%, using REFMAC5 in the CCP4 program suite (Collaborative Computational Project, Number 4, 1994). All main-chain angles were in the allowed regions of a Ramachandran plot, with 98.4% of the residues in the most-favored regions. Residual electron density observed in the protein interior was interpreted as one acetate ion and one glycerol molecule, both of which were present in the cryoprotectant solution. The refinement statistics are summarized in Table 1. The asymmetric unit contained one molecule, which corresponds to the physiological monomeric form of Ct3GT-A (Fig. 2a).
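For readers unfamiliar with the refinement statistics quoted above, the crystallographic R factor is the summed absolute difference between observed and calculated structure-factor amplitudes, divided by the summed observed amplitudes. Rwork is computed over the reflections used in refinement and Rfree over a held-out test set. The sketch below is illustrative only: it omits the scale factor normally applied between observed and calculated amplitudes, and the amplitude lists are made-up toy numbers, not data from this study.

```python
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over matched reflections."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy amplitudes (hypothetical, for illustration only).
toy_obs = [100.0, 50.0, 25.0, 10.0]
toy_calc = [90.0, 55.0, 25.0, 12.0]

toy_r = r_factor(toy_obs, toy_calc)
print(f"toy R factor = {toy_r:.3f}")
```

The same function would be applied separately to the working and test sets to obtain Rwork and Rfree. A gap of a few percentage points between the two, as reported here, is typical of a model that is not overfitted.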

Overall structure of Ct3GT-A
Ct3GT-A possesses a typical GT-B fold structure comprising two Rossmann-like β/α/β domains (Fig. 2a), which are conserved in plant UGTs (Breton et al., 2012). The C-terminal domain (C-domain) is composed of a twisted β-sheet with six strands accompanied by ten α-helices on its two sides. There is a cleft located between the N- and C-domains (Fig. 2b). The cleft is further divided into two cavities that are used as binding sites for the donor (UDP-Glc) and the acceptor substrates (Wang, 2009). The N- and C-domains are connected by a loop region comprising residues 246-251, which is highly flexible, with temperature factors above 41 Å² (Fig. 2c). The donor-binding site, conserved as the UGT signature 'PSPG' motif, is located in the C-domain of Ct3GT-A, and the C-terminal helix comprising residues 431-445 participates in forming the N-domain after crossing the cleft (Fig. 2a). Structural homology searches performed using the Dali server (Holm & Sander, 1993) indicated that Ct3GT-A was similar to the plant UGTs VvGT1 from V. vinifera (PDB ID: 2c1z) and UGT78G1 from M. truncatula (PDB ID: 3hbf), with root-mean-square deviations (RMSDs) of 1.9 Å for 432 Cα atoms (Dali Z-score of 49.9) and 2.0 Å for 437 Cα atoms (Dali Z-score of 48.7), respectively. VvGT1 is an enzyme that preferentially glucosylates cyanidin to yield cyanidin 3-O-glucoside in red grape, and its crystal structure has been determined as a Michaelis complex with the non-transferable UDP-2FGlc donor and the flavonol kaempferol (Offen et al., 2006). UGT78G1 was identified as a multifunctional (iso)flavonoid glycosyltransferase that catalyzes the 3-O-glycosylation of formononetin in addition to that of flavonols (Modolo et al., 2009).
Structural comparison indicated that Ct3GT-A and VvGT1 share a common backbone architecture (Fig. 2d). The positions of the donor- and acceptor-binding sites in VvGT1 correspond to those of the two cavities in Ct3GT-A. The coordinates of UDP-2FGlc and kaempferol in the VvGT1 structure fit well and without any steric hindrance within the cleft of Ct3GT-A. When the two enzymes were superimposed using the program LSQKAB (CCP4, 1994), significant displacements (>5 Å) were detected at four loop regions of the N-domain (residues 51-54, 75-78, 153-158 and 184-188) (Fig. 2c). Because the loop region containing residues 75-78 is located above the acceptor-binding site, the structural difference may contribute to the differentiation of acceptor-substrate recognition between Ct3GT-A and VvGT1.
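The way displaced loop regions are picked out after superposition can be sketched as follows. This is an illustrative outline, not the procedure used by LSQKAB: it assumes that the two structures have already been superimposed, that equivalent residues have been mapped to the same key (the two enzymes are numbered differently, so a real analysis needs a structure-based alignment first), and that one representative atom position per residue is supplied. The function name `displaced_segments` and the toy coordinates are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def displaced_segments(coords_a: Dict[int, Coord], coords_b: Dict[int, Coord],
                       threshold: float = 5.0) -> List[Tuple[int, int]]:
    """Return (start, end) runs of consecutive residues displaced by > threshold Å.

    Both inputs map residue numbers to coordinates of already-superimposed,
    already-aligned structures; residues missing from either are skipped.
    """
    segments: List[Tuple[int, int]] = []
    start = prev = None
    for res in sorted(set(coords_a) & set(coords_b)):
        if math.dist(coords_a[res], coords_b[res]) > threshold:
            if start is None:
                start = res
            elif res != prev + 1:
                # A numbering gap ends the current run and starts a new one.
                segments.append((start, prev))
                start = res
            prev = res
        elif start is not None:
            segments.append((start, prev))
            start = None
    if start is not None:
        segments.append((start, prev))
    return segments


# Toy example (hypothetical coordinates): residues 4-6 and 9 are displaced.
toy_a = {i: (float(i), 0.0, 0.0) for i in range(1, 11)}
toy_b = dict(toy_a)
for res in (4, 5, 6):
    toy_b[res] = (float(res), 6.0, 0.0)
toy_b[9] = (9.0, 0.0, 5.5)

print(displaced_segments(toy_a, toy_b))  # [(4, 6), (9, 9)]
```

Applied to real superimposed coordinates with a 5 Å threshold, a routine of this kind would report contiguous residue ranges comparable to the four loop regions listed above.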

Structural characteristics for the function of Ct3GT-A
To understand the molecular characteristics of Ct3GT-A, the electrostatic potential of the protein surface was calculated using APBS (Baker et al., 2001), as shown in Figs. 3(a) and 3(b). The donor-binding site located at the surface of Ct3GT-A is formed mainly by residues from the PSPG motif, which is highly conserved among plant UGTs and rich in positive charges (Fig. 3a). The residues involved in recognizing UDP-2FGlc are almost identical in Ct3GT-A and VvGT1, which is consistent with the fact that these enzymes use the same donor substrate. The donor-binding site is further connected to another cavity for binding acceptor substrates (acceptor-binding site), as shown in Figs. 3(c) and 3(d).
The acceptor-binding site is formed mostly by residues from the N-domain. Besides the hydrophobic residues Phe12, Phe116, Trp135, Tyr145, Phe192 and Leu196, the hydrophilic residues Asn137, Asp181 and Asp367 are arranged to form the acceptor-binding site (Fig. 3d). The acceptor-binding site can be accessed from the solvent through two openings, 1 and 2 (Fig. 3). Opening 1 is elliptical, with a major diameter of 11 Å (between the C atoms of Gly15 and Pro78) and a minor diameter of 8 Å (between the C of Phe14 and Cγ2 of Val274), and is formed by the hydrophobic residues Phe14, Gly15, Pro78 and Leu82 from the N-domain and Val274 from the C-domain (Fig. 3c). Opening 2 is formed by the side chains of Ile79, Asp181 and Phe365 and the main chain of Gly366 (Fig. 3c), and the size of this elliptical opening is similar to that of opening 1; the major diameter of opening 2 is 11 Å (between the Oδ2 of Asp181 and Cγ1 of Val274) and the minor diameter is 7 Å (between the C1 of Ile79 and C of Phe365). The presence of a hydrophilic residue (Asp181) at opening 2 might facilitate passage of the hydrophilic part of the substrate.
Figure 3. (a) Electrostatic surface potential of Ct3GT-A viewed from the same direction as in Fig. 2(a). The surfaces are colored by electrostatic potential isocontours from +5 kT e⁻¹ (blue) to −5 kT e⁻¹ (red). (b) Electrostatic surface potential of Ct3GT-A after rotating 90° around the vertical axis. (c) Close-up view of the two openings leading to the acceptor-binding site. The residues involved in forming the openings are shown as stick models. The distances showing the apparent size of the openings are indicated with dashed lines. (d) Cross-section view of the acceptor-binding site after rotating approximately 45° with respect to the figure (along the line) in (c). The residues involved in forming the acceptor-binding site are shown as stick models. The conserved catalytic residues, His17 and Asp114, are labeled in red.
The residues His17 and Asp114, located at the bottom of the acceptor-binding site in Ct3GT-A, are conserved as the catalytic dyad His20-Asp119 in VvGT1 (Fig. 3d), suggesting that Ct3GT-A adopts a catalytic mechanism similar to that proposed for VvGT1: the conserved histidine residue acts as a general base to help deprotonation of the 3-hydroxyl group of the acceptor substrate, after which the generated nucleophile attacks the anomeric carbon of the glucose moiety (Breton et al., 2012). The carboxyl side chain of Asp119 is thought to increase the proton-accepting ability of the imidazole ring as seen in the catalytic mechanism of serine proteases, which have a catalytic triad of Ser-His-Asp with a similar geometry (Wharton, 1998).
In the acceptor-binding site of VvGT1, the side chains of Ser18, Gln84 and His150 form hydrogen bonds with the flavonol acceptor kaempferol; these residues are substituted with Gly15, Ile79 and Tyr145 in Ct3GT-A (Fig. 3d). Because the hydrogen bonds with the acceptor substrate are critical for determining molecular orientation within the binding site of VvGT1 (Offen et al., 2006), the substitutions found in Ct3GT-A may enable the differentiation of the acceptor substrate.
Although several crystal structures of acceptor-substrate complexes have been determined, including flavonol-bound forms of VvGT1 with kaempferol or quercetin and of UGT78G1 with myricetin, no structural information is available on the recognition of anthocyanidins, presumably because anthocyanidins, unlike flavonols, are unstable. Structural studies of Ct3GT-A complexes with anthocyanidins are in progress to further understand acceptor-substrate recognition in Ct3GT-A.