1.65 Å resolution structure of the AraC-family transcriptional activator ToxT from Vibrio cholerae

A crystal structure of ToxT at 1.65 Å resolution is reported; its overall structure is similar to that of the previously determined structure. A region that extends from Asp101 to Glu110, which can influence ToxT activity but was disordered in the previous structure, can be traced entirely in the current structure.


Introduction
The AraC family of transcriptional activators, with members present in over 70% of sequenced bacterial genomes, is defined by a DNA-binding domain containing two helix-turn-helix motifs (Ramos et al., 1990; Gallegos et al., 1993, 1997; Egan, 2002; Ibarra et al., 2008). Many AraC-family proteins have a second domain, the sequence of which is similar within subsets of the family but not across the entire family (Gallegos et al., 1997). The most common roles of the non-DNA-binding domain are effector binding and/or dimerization. AraC-family members typically activate the expression of genes involved in carbon metabolism, stress responses or virulence (Gallegos et al., 1993, 1997; Egan, 2002; Tobes & Ramos, 2002; Ibarra et al., 2008). ToxT is an AraC-family transcriptional activator of Vibrio cholerae virulence-gene expression, with a C-terminal DNA-binding domain and an N-terminal domain involved in dimerization and effector binding (Lowden et al., 2010). ToxT directly activates the expression of the genes encoding the toxin-coregulated pilus (TCP), which is essential for colonization of the human intestine, and the cholera toxin (CT), the cause of the diarrheal disease that is characteristic of cholera (Champion et al., 1997; DiRita et al., 1991; Higgins et al., 1992). ToxT has also been shown to positively regulate its own expression from the tcp promoter (Brown & Taylor, 1995; Yu & DiRita, 1999). In V. cholerae, ToxT-dependent gene activation is inhibited by both bile and individual unsaturated fatty acids found in bile (Schuhmacher & Klose, 1999; Chatterjee et al., 2007). The full-length structure of ToxT determined by Lowden et al. (2010) has the fatty acid cis-palmitoleic acid (PAM) bound to the N-terminal domain.
ISSN 2053-230X
Although oleic acid is likely to be the physiological effector of ToxT given its high concentration in bile, both PAM and oleic acid have been shown to reduce the expression of tcp and ctx in vivo and to reduce DNA binding by ToxT in vitro (Lowden et al., 2010). Therefore, the structure obtained by Lowden et al. (2010) is expected to represent the non-activating state of ToxT, where its ability to bind DNA and activate transcription is reduced compared with its activating conformation without effector bound.
The previously determined 1.9 Å resolution ToxT crystal structure (PDB entry 3gbg; Lowden et al., 2010) shows that ToxT has the same overall domain architecture as that predicted for the AraC protein: each of the ToxT monomers comprises an N-terminal effector-binding and dimerization domain that shares sequence similarity with the AraC N-terminal domain, and a C-terminal DNA-binding domain. ToxT was the first AraC-family protein from the same subset of the family as AraC to have its full-length structure determined at high resolution. However, the structure determined by Lowden et al. (2010) contains a disordered region between residues Asp101 and Glu110 within the N-terminal domain. Childers et al. (2007) have shown that alanine substitutions at Met103, Arg105 and Asn106, residues within the region that is disordered in the 3gbg structure, increase the activation of the ctxA promoter by threefold to fourfold compared with wild-type ToxT, indicating that this region is important for proper ToxT activation. Hung et al. (2005) have shown that replacing the leucine at residue 114 with a proline confers resistance to virstatin, a small-molecule inhibitor of ToxT, suggesting that the nearby disordered region may also be important for inhibition by virstatin (Hung et al., 2005; Shakhnovich et al., 2007). Here, we report a crystal structure of ToxT at 1.65 Å resolution (PDB entry 4mlo) in which the region spanning Asp101-Glu110 could be modeled.

Protein purification and crystallization
The expression and purification of ToxT was performed as described previously (Lowden et al., 2010), with a few exceptions. Briefly, ToxT was overexpressed as a ToxT-intein-chitin-binding domain fusion from plasmid pTXB1 (New England Biolabs), the same construct as used by Lowden et al. (2010), by autoinduction in ZYM-5052 medium (Studier, 2005) with 200 µg ml⁻¹ ampicillin using Escherichia coli strain BL21 (DE3) (New England Biolabs). This strain differed from the BL21-CodonPlus (DE3)-RIL (Stratagene) strain used by Lowden et al. (2010) as we found that ToxT was highly overexpressed in the basic BL21 (DE3) strain. The initial purification was carried out using a chitin-affinity column (New England Biolabs) with gravity flow. ToxT was cleaved from the intein-chitin-binding domain by the addition of 100 mM dithiothreitol (DTT) to cleavage buffer (20 mM Tris pH 8.0, 1 mM EDTA, 150 mM NaCl) and incubation for 16 h at 4 °C.
ToxT was eluted from the column, and the eluate, which contained untagged ToxT, was loaded onto a HiTrap SP Sepharose Fast Flow cation-exchange column (GE Healthcare) in buffer consisting of 20 mM Tris-HCl pH 6.8, 33.3 mM DTT, 50 mM NaCl. ToxT was eluted using a gradient from 100% buffer A (0.05 M NaCl) to 100% buffer B (1 M NaCl), with the protein peak corresponding to ToxT eluting at 88% buffer B. The fractions containing purified ToxT protein were combined and then concentrated to 1.65 mg ml⁻¹ for crystallization screening using an Amicon ultracentrifugal filter unit (Millipore) with a molecular-weight cutoff of 10 kDa. All crystallization screening was conducted in Compact Jr or Clover Jr (Rigaku Reagents) sitting-drop vapor-diffusion plates incubated at 293 K using 0.75 µl protein solution and 0.75 µl crystallization solution equilibrated against 75 µl of the latter. Crystals displaying needle (~100 × 10 µm) or plate (~60 × 20 µm) morphology formed overnight from various screens. The plate-shaped crystals used for data collection were obtained using condition H10 [5%(w/v) PEG 4000, 10%(v/v) 2-propanol, 0.1 M MES pH 6.5, 200 mM MgCl₂] from the ProPlex HT screen (Molecular Dimensions), a condition that differed substantially from the crystallization condition identified by Lowden et al. (2010). Crystals were transferred into a fresh drop composed of 80% crystallization solution and 20% ethylene glycol and stored in liquid nitrogen.

Table 1. Data-collection and refinement statistics for the ToxT structure.
Values in parentheses are for the highest resolution shell.
† In the definition of Rmerge, $I_i(hkl)$ is the intensity measured for the ith reflection and $\langle I(hkl)\rangle$ is the average intensity of all reflections with indices hkl.
‡ Rmeas is the redundancy-independent (multiplicity-weighted) Rmerge (Evans, 2006, 2012). Rp.i.m. is the precision-indicating (multiplicity-weighted) Rmerge (Diederichs & Karplus, 1997; Weiss, 2001).
§ CC1/2 is the correlation coefficient of the mean intensities between two random half-sets of data (Karplus & Diederichs, 2012; Evans, 2012).
¶ $R_{\mathrm{factor}} = \sum_{hkl} \bigl| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \bigr| / \sum_{hkl} |F_{\mathrm{obs}}|$; Rfree is calculated in an identical manner using a randomly selected 5% of the reflections, which were not included in the refinement.
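The merging statistics named in the footnotes to Table 1 can be illustrated with a minimal Python sketch that uses the commonly quoted definitions of Rmerge, Rmeas and Rp.i.m.. This is not the AIMLESS implementation: the function name `merging_stats` and the toy intensities are hypothetical, and skipping reflections that were measured only once is an assumption of this sketch.

```python
from math import sqrt


def merging_stats(observations):
    """Return R_merge, R_meas and R_pim for repeated intensity measurements.

    `observations` maps a reflection index (h, k, l) to the list of
    intensities measured for that reflection (hypothetical input format).
    """
    num_merge = num_meas = num_pim = denom = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            # Assumption: singly measured reflections are left out.
            continue
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev                      # unweighted deviations
        num_meas += sqrt(n / (n - 1)) * abs_dev   # multiplicity-independent
        num_pim += sqrt(1 / (n - 1)) * abs_dev    # precision of the mean
        denom += sum(intensities)
    return {
        "R_merge": num_merge / denom,
        "R_meas": num_meas / denom,
        "R_pim": num_pim / denom,
    }


if __name__ == "__main__":
    # Toy data only; these are not measurements from the ToxT data set.
    toy = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0, 56.0]}
    print(merging_stats(toy))
```

With more than two measurements per reflection, Rmeas is larger than Rmerge and Rp.i.m. is smaller, which is why the three values are usually reported side by side.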

Data collection and structure refinement
X-ray diffraction data were collected on beamline 17-ID at the Advanced Photon Source using a Dectris PILATUS 6M pixel-array detector. Intensities were integrated using XDS (Kabsch, 1988), and Laue class analysis and data scaling were performed with AIMLESS (Evans & Murshudov, 2013), which suggested that the highest probability Laue class was 2/m with space group P2₁. The Matthews coefficient (Matthews, 1968; VM = 2.3 Å³ Da⁻¹, 46.8% solvent content) suggested that the asymmetric unit contained a single molecule. Structure solution was conducted by molecular replacement with Phaser [...] Table 1. Refined atomic coordinates and experimental structure factors have been deposited in the Protein Data Bank (PDB entry 4mlo).

Figure 1. (a) Asymmetric unit of ToxT (PDB entry 4mlo) colored by secondary structure. The N- and C-terminal residues (Lys5 and Gly272) of the model are indicated along with the disordered region between Asn132 and Phe134. The 3₁₀-helix spanning Leu99-Asp101 is colored blue. The PAM molecule and chloride ions are shown as cylinders and gold spheres, respectively. (b) Fo − Fc OMIT map contoured at 3σ (green mesh) for PAM and associated hydrogen bonds (dashed lines) to ToxT residues. (c) Enlarged view of the region from Ser87 to Glu110. Helix α1 spans Ser87-Ile98 and contains a kink at Leu94. This is followed by a 3₁₀-helix spanning Leu99-Asp101 and a shorter helix from Leu102 to Leu107 referred to as α1′.
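The relationship between the Matthews coefficient and the solvent content quoted above can be sketched in a few lines of Python. The sketch assumes the usual approximation that the protein fraction of the cell is 1.23/VM (which corresponds to a typical protein partial specific volume); the function names and the example cell volume, multiplicity and molecular weight are hypothetical and are not taken from Table 1.

```python
def matthews_coefficient(cell_volume_A3, asu_per_cell, mol_weight_da, mol_per_asu=1):
    """V_M in cubic angstroms per dalton: cell volume per unit of protein mass."""
    return cell_volume_A3 / (asu_per_cell * mol_per_asu * mol_weight_da)


def solvent_fraction(v_m, protein_factor=1.23):
    """Approximate solvent fraction from V_M.

    Assumption: the protein occupies a fraction protein_factor / V_M of the
    cell, with protein_factor = 1.23 for a typical protein.
    """
    return 1.0 - protein_factor / v_m


if __name__ == "__main__":
    # V_M as quoted in the text (rounded to one decimal place).
    print(round(100 * solvent_fraction(2.3), 1))
    # Hypothetical inputs, for illustration only.
    print(matthews_coefficient(150000.0, 2, 32000.0))
```

Using the rounded value VM = 2.3 gives a solvent content of about 46.5%, close to the 46.8% reported; the small difference is expected if the reported figure was computed from an unrounded VM.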

Results and discussion
The final model of ToxT could be traced in the electron-density map from Lys5 to Gly272, except for the disordered Gly133, which is located in a loop connecting helix α2 to helix α3 (Fig. 1a). Electron density consistent with PAM was also present (Fig. 1b), as was observed in the original ToxT structure (PDB entry 3gbg; Lowden et al., 2010), although PAM was not added in either case but was acquired from the expression host. Interestingly, Asp101-Glu110 could be modeled in this structure, which included helix α1 and a loop region that connects this helix to the β9 sheet. The helix in our structure can be thought of as containing two segments, which we refer to as α1 and α1′ to be consistent with the prior secondary-structure assignment for PDB entry 3gbg (Lowden et al., 2010; Fig. 1c). In addition, three chloride ions were modeled in the C-terminal region of ToxT, which were assigned based on the coordination distances (~3.1-3.3 Å) to neighboring residues and water molecules. When water molecules were initially assigned to the chloride sites, positive electron density was observed following refinement, indicating an underestimation of electrons. Therefore, the modeling of chloride ions at these sites was consistent with the observed electron density and coordination.
The overall structure is similar to PDB entry 3gbg reported by Lowden et al. (2010), with an r.m.s.d. between Cα atoms of 1.00 Å (Lys5-Gly272) as determined using the Secondary Structure Matching (SSM; Krissinel & Henrick, 2004) algorithm with SUPERPOSE via the CCP4 interface (Winn et al., 2011). However, there are also differences between the two structures, as shown in the per-residue r.m.s.d. plot in Fig. 2(a) and the superimposed structures in Fig. 2(b). Specifically, the region between α1 and β9, which was disordered in PDB entry 3gbg (Lowden et al., 2010) from Asp101 to Glu110, could be fully traced in the current structure (Fig. 3a). In this region, helix α1 spans Ser87-Ile98 and contains a kink at Leu94. This is followed by a 3₁₀-helix spanning Leu99-Asp101 that continues into a shorter helix from Leu102 to Leu107 (α1′). Tyr108-Asp113 form a connecting loop between α1′ and β9. This region appears to be stabilized by residue Glu156, in helix α3, through a salt bridge with residue Arg105. Additionally, this region is stabilized by Asn160 and Ile162, from a loop connecting α3 and α4, through hydrogen-bonding interactions with Ser109 of the loop region (Fig. 3b). The loop region connecting helices α3 and α4 also shows conformational differences relative to PDB entry 3gbg (Lowden et al., 2010), as depicted in Fig. 4(a), potentially owing to interactions between residues in the previously disordered region and residues in helix α3.

Figure 3. Loop region between α1′ and β9. (a) 2Fo − Fc map contoured at 1σ (blue mesh) for residues Gly100-Asn111, which were disordered in PDB entry 3gbg (Lowden et al., 2010). (b) Interactions between α1′ and α3. Residues within the α1′ (Arg105) and α3 (Glu156) helices are colored cyan. The residues in the loop regions of these helices (Ser109, Asn160 and Ile162) are colored gray.

Figure 4. (a) Comparison of the regions connecting helices α2 and α3 and helices [...]
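The overall and per-residue r.m.s.d. values discussed above can be illustrated with a short Python sketch. It assumes that the two models have already been superposed (the SSM superposition itself is not reproduced here) and that Cα coordinates are available as dictionaries keyed by residue number; the function name `ca_deviations` and the toy coordinates are hypothetical.

```python
from math import sqrt


def ca_deviations(coords_a, coords_b):
    """Per-residue C-alpha distances and the overall r.m.s.d.

    coords_a, coords_b: dicts mapping residue number -> (x, y, z) in angstroms,
    assumed to be in the same frame (that is, already superposed).
    Residues missing from either model, such as a disordered segment,
    are skipped.
    """
    common = sorted(set(coords_a) & set(coords_b))
    if not common:
        raise ValueError("no residues in common")
    per_residue = {}
    for res in common:
        per_residue[res] = sqrt(
            sum((p - q) ** 2 for p, q in zip(coords_a[res], coords_b[res]))
        )
    rmsd = sqrt(sum(d * d for d in per_residue.values()) / len(per_residue))
    return per_residue, rmsd


if __name__ == "__main__":
    # Toy coordinates only; residue 101 is absent from the second model,
    # mimicking a region that is disordered in one structure.
    model_a = {5: (0.0, 0.0, 0.0), 6: (1.0, 0.0, 0.0), 101: (2.0, 0.0, 0.0)}
    model_b = {5: (0.0, 0.0, 0.0), 6: (1.0, 2.0, 0.0)}
    print(ca_deviations(model_a, model_b))
```

Because only residues present in both models contribute, a segment that is disordered in one structure does not enter the overall r.m.s.d., so a low global value can coexist with substantial local differences.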
Interestingly, the ToxT region between α1 and β9 (residues Asp101-Glu110) is folded over a loop that is located sequentially after it: the loop that connects helices α3 and α4, spanning residues Lys158-Ala170. A very similar arrangement can be observed in the structure of the regulatory domain of ExsA, where the loop connecting α1 and β9 folds over helix α4 (PDB entry 4zua; Shrestha et al., 2015). ExsA is an AraC-family transcriptional activator that regulates type 3 secretion-system genes in Pseudomonas aeruginosa (Shrestha et al., 2015; Urbanowski et al., 2005).
Our observation that Arg105 forms a salt bridge with Glu156 may help to explain the prior finding that alanine substitutions of residues Met103, Arg105 and Asn106, within the region that was disordered in PDB entry 3gbg (Lowden et al., 2010), resulted in threefold to fourfold elevated activity at the ctxA promoter (Childers et al., 2007). Our structure suggests the possibility that Arg105 holds Glu156 in a position that somewhat attenuates ToxT activity. Glu156 is located in helix α3, which is likely to be involved in dimerization to facilitate transcriptional activation (Lowden et al., 2010). Thus, Arg105 may maintain the activity of ToxT at its wild-type level by suppressing dimerization somewhat (relative to the Arg105Ala substitution). However, other than their potential effects on Arg105, the structure does not provide potential explanations for how alanine substitutions at residues Met103 or Asn106 also increase ToxT activity.
Further analysis was conducted to gauge the quality of fit of the models to the electron density. Analysis of the map-model correlation coefficients via PHENIX revealed several regions in PDB entry 3gbg that display low correlation to the 2Fo − Fc map, including the α3-α4 (Lys158-Ala170) loop (interdomain linker), as shown in Fig. 4(b). Although the Lys158-Ala170 loop region was modeled in PDB entry 3gbg, it was poorly defined, making it difficult to discern the exact positions of the residues in this region. By contrast, the electron density in the current structure was clearly traceable in this region, which is reflected by the high correlation coefficient. It should be noted that none of the residues in this loop form hydrogen-bond contacts with symmetry-related molecules, which suggests that crystal packing was not a factor in the conformational differences relative to PDB entry 3gbg. Additional differences between the two structures were observed in the loop connecting helices α2 and α3 (Asn132-Asp141) and in part of helix α2 (Glu120-Val126) (Fig. 4b). Gly133 in PDB entry 3gbg was ordered, and was stabilized by Lys4 through a hydrogen-bonding interaction. However, both Gly133 and Lys4 were missing from the current structure. It is likely that the slight conformational change in the connecting-loop region (Asn132-Asp141) disrupted the hydrogen-bonding interaction between Gly133 and Lys4, causing both residues to become flexible and untraceable in the current structure. An alanine substitution of Gly133 had wild-type activity at the ctxA promoter (Childers et al., 2007), suggesting that this residue may not play a key role in the activity of ToxT.
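The idea behind a map-model correlation coefficient can be illustrated with a generic Pearson correlation between two sets of density values sampled at the same grid points around a residue. This is a minimal sketch under that assumption and does not reproduce how PHENIX selects grid points or computes its coefficients; the function name `density_correlation` and the toy values are hypothetical.

```python
from math import sqrt


def density_correlation(map_values, model_values):
    """Pearson correlation between map density and model-calculated density.

    Both inputs are equal-length sequences of density values sampled at the
    same grid points (hypothetical input format).
    """
    if len(map_values) != len(model_values) or len(map_values) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(map_values)
    mean_map = sum(map_values) / n
    mean_model = sum(model_values) / n
    cov = sum((a - mean_map) * (b - mean_model)
              for a, b in zip(map_values, model_values))
    var_map = sum((a - mean_map) ** 2 for a in map_values)
    var_model = sum((b - mean_model) ** 2 for b in model_values)
    if var_map == 0.0 or var_model == 0.0:
        raise ValueError("correlation is undefined for constant density")
    return cov / sqrt(var_map * var_model)


if __name__ == "__main__":
    # Toy values only: a well-fitting region and a poorly fitting one.
    print(density_correlation([0.2, 0.9, 1.5, 0.7], [0.25, 1.0, 1.4, 0.6]))
    print(density_correlation([0.2, 0.9, 1.5, 0.7], [1.1, 0.3, 0.4, 1.2]))
```

Values close to 1 indicate that the modeled atoms account well for the local density, whereas low values flag regions, such as a poorly defined loop, where the model is weakly supported by the map.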
Virstatin, a small-molecule inhibitor of ToxT identified by Hung et al. (2005), blocks ToxT dimerization and thus its ability to activate transcription from the tcp and ctx promoters (Shakhnovich et al., 2007). Shakhnovich et al. (2007) also demonstrated that a ToxT variant, Leu114Pro, is resistant to virstatin and suggested that the Leu114Pro mutation may result in a conformational change in ToxT that allows the protein to dimerize more efficiently. Lowden et al. (2010) suggested that the previously disordered region from Asp101 to Glu110 might be involved in the virstatin resistance of the Leu114Pro variant owing to its proximity; however, there are no obvious interactions between Leu114 and any of the residues in the 101-110 region that would suggest involvement of this region in the mechanism of virstatin resistance of ToxT Leu114Pro.
Overall, the new 1.65 Å resolution crystal structure of ToxT (PDB entry 4mlo) reveals the structure of the previously unresolved region (residues 101-110), including the presence of a previously unidentified helix (α1′), as well as interactions between the residue 101-110 region and surrounding residues. This region is of importance as substitutions within it have been shown to affect activation of the ctxA promoter (Childers et al., 2007). There are several additional structural differences between the previously reported structure (PDB entry 3gbg; Lowden et al., 2010) and the new structure (PDB entry 4mlo). Overall, the new structure provides more complete, detailed and higher quality structural information for ToxT than the previously determined ToxT structure.