Crystallization of hepatocyte nuclear factor 4α (HNF4α) in complex with the HNF1α promoter element

Sample preparation, characterization, crystallization and preliminary X-ray analysis are reported for the HNF4α–DNA binary complex.


Introduction
Hepatocyte nuclear factor 4 (HNF4) is a tissue-specific transcription factor that plays an essential role in early vertebrate development and embryonic survival. It regulates the expression of a wide variety of essential genes, including those involved in liver and pancreatic cell differentiation (Li et al., 2000; Odom et al., 2004), embryogenesis and early development (Duncan et al., 1994; Lausen et al., 2000), glucose metabolism (Stoffel & Duncan, 1997), lipid homeostasis (Hayhurst et al., 2001) and amino-acid metabolism (Schrem et al., 2002). Mutations in HNF4 cause a dominantly inherited form of diabetes known as maturity-onset diabetes of the young (MODY; Yamagata et al., 1996). These mutations cause the loss of function of the gene product (Lausen et al., 2000), which leads to impaired insulin secretion and defects in metabolic pathways (Miura et al., 2006).
HNF4 is a prototypical member of a unique nuclear receptor superfamily (NR2A1; Nuclear Receptors Nomenclature Committee, 1999) and functions exclusively as a homodimer (Jiang et al., 1995), despite its sequence homology to retinoid X receptor (RXR), which can readily heterodimerize with a related nuclear receptor (Szanto et al., 2004). HNF4 consists of distinctive functional domains including a DNA-binding domain (DBD), a ligand-binding domain (LBD) and additional domains with transcription-activation functions (AF; Schrem et al., 2002). However, the identity of its bona fide ligand is still under dispute (Hertz et al., 1998; Petrescu et al., 2005), even though its apparent ligand has been identified from structural studies (Dhe-Paganon et al., 2002; Wisely et al., 2002). HNF4-DBD contains two zinc-finger motifs that specifically recognize and bind as a homodimer to a direct repeat of two hexameric half-sites separated by one (DR1; in the majority) or two nucleotides (DR2) (Jiang et al., 1995; Rajas et al., 2002). Five MODY1 missense mutations (on four different residues) are found within the region of our HNF4-DBD construct (Fig. 1a) and an additional MODY mutation is found in the HNF4-binding site within the promoter of another MODY (MODY3) culprit gene, HNF1 (Fig. 1b; Gragnoli et al., 1997). Analysis of the structural consequences of each amino-acid substitution should be instructive as to the functional role of each residue. To elucidate the molecular basis of HNF4 function and the monogenic causes of diabetes, we have prepared and crystallized the human HNF4 DNA-binding domain in complex with a high-affinity HNF1 promoter element containing the HNF4 recognition sequence.

Construction, expression and purification of HNF4α DNA-binding domain
The cDNA harboring the full-length human HNF-4B splice variant (Kritis et al., 1996) was a kind gift from Dr Steve Shoelson of Joslin Diabetes Center. A fragment of human HNF4 cDNA (amino acids 46-126) was subcloned by standard PCR into a pET41a vector (GE Healthcare). HNF4 was overexpressed in Escherichia coli BL21-Gold (Novagen) with induction by 0.5 mM IPTG at an OD600 of 0.8-1.0 at 310 K and harvested after culturing for an additional 3-4 h. No zinc solution was added during the purification since there should be a sufficient amount of Zn atoms in the medium to be incorporated into the protein. The cells were lysed by sonication and the expressed GST-fusion proteins were isolated in bulk on glutathione-agarose beads (Invitrogen), with washing in the presence of 0.6 M NaCl to prevent nonspecific binding to bacterial DNA. HNF4 was released from the resin by thrombin digestion after overnight incubation at 277 K and was further purified by ion-exchange chromatography (Mono-S FPLC). Thrombin digestion produced a two-residue remnant (Gly-Ser) at the N-terminal end (Fig. 1a). The purified protein was estimated to be at least 98% pure as judged by Coomassie staining of an 8-25% gradient SDS-PAGE gel (Fig. 2). Fractions were pooled and stored at 193 K as a 10%(v/v) glycerol stock.

Gel filtration of HNF4α DNA-binding domain
Gel filtration was performed on a Superdex 75 HR 10/30 column (GE Healthcare) equilibrated with running buffer containing 20 mM Tris pH 7.5, 200 mM NaCl, 1 mM EDTA and 10 mM 2-mercaptoethanol. Elution was performed at a flow rate of 0.5 ml min⁻¹. The apparent molecular weight of HNF4-DBD was determined using the same column calibrated previously with a range of reference proteins (Bio-Rad): thyroglobulin (670 kDa), bovine γ-globulin (158 kDa), chicken ovalbumin (44 kDa), equine myoglobin (17 kDa) and vitamin B12 (1.4 kDa). Blue dextran was used to determine the void volume of the column.
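A column calibration of this kind is commonly evaluated by fitting log10(molecular weight) against elution volume for the standards and reading the unknown off the fitted line. The sketch below illustrates that calculation only. The elution volumes are placeholders invented for illustration (the measured volumes are not given in the text), the function names are hypothetical, and only the standard masses are taken from the list above. Standards that fall outside the fractionation range of the column would normally be left out of the fit.

```python
import math


def fit_calibration(standards):
    """Least-squares line log10(MW) = slope * Ve + intercept.

    standards: list of (molecular_weight_kDa, elution_volume_ml) pairs.
    Returns (slope, intercept).
    """
    xs = [ve for _, ve in standards]
    ys = [math.log10(mw) for mw, _ in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mw(elution_volume_ml, slope, intercept):
    """Apparent molecular weight (kDa) at a given elution volume."""
    return 10 ** (slope * elution_volume_ml + intercept)


# Masses are from the text; the volumes are PLACEHOLDERS, not measured data.
EXAMPLE_STANDARDS = [
    (44.0, 10.0),  # chicken ovalbumin
    (17.0, 12.0),  # equine myoglobin
    (1.4, 17.0),   # vitamin B12
]

slope, intercept = fit_calibration(EXAMPLE_STANDARDS)
# 13.0 ml is likewise a placeholder elution volume for the unknown.
print(f"apparent MW: {estimate_mw(13.0, slope, intercept):.1f} kDa")
```

With real data, the comparison of interest is between the apparent mass and the mass calculated from the sequence of the construct: a value near the calculated monomer mass supports a monomer in solution.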

Preparation of DNA oligomers
Tritylated oligonucleotides were purchased from the Midland Certified Reagent Company (Midland, Texas, USA) and further purified by reverse-phase HPLC on a C8 XTerra prep column (Waters) using a linear 5-50%(v/v) acetonitrile gradient in 50 mM triethylamine acetate buffer pH 7.0. Excess mobile phase containing acetonitrile was removed using HiTrapQ (GE Healthcare) and the trityl groups were removed with 80%(v/v) acetic acid. The deprotected oligonucleotides were precipitated with 75%(v/v) ethanol, dissolved in water for concentration measurement by A260 and lyophilized before storage at 193 K. Double-stranded DNAs were generated for crystallization by heating equimolar amounts of complementary oligonucleotides to 358 K for 10 min and slowly cooling to 277 K. The annealing buffer condition was 20 mM Tris pH 8.0, 200 mM NaCl and 1 mM EDTA.

Electrophoretic mobility-shift assay (EMSA)
Single-stranded oligonucleotides 1 and 2 (Fig. 1b) were dissolved in 10 mM Tris (pH 7.5 at 293 K) and 1 mM EDTA. Oligonucleotide 1 was 5′-end labeled with ³²P as described by Maxam & Gilbert (1977). Labeled oligonucleotide 1 was mixed with a 1.1-fold molar excess of oligonucleotide 2 and the samples were heated to 363 K and cooled slowly to 293 K. DNA was transferred by dialysis into binding buffer [10 mM Tris pH 8.0, 1 mM EDTA, 100 mM NaCl, 1 mM MgCl2, 1 mM DTT, 4%(v/v) glycerol]. DNA concentrations were measured by absorbance using a molar extinction coefficient ε260 (1 cm path length) of 2.57 × 10⁵. Samples were stored at 253 K until use. EMSAs were carried out as described by Hellman & Fried (2007) using 10%(w/v) polyacrylamide gels cast and run in 45 mM Tris-borate, 2.5 mM EDTA (pH 8.3 at 293 K). Autoradiographic images were captured on storage phosphor screens (type GP, GE Healthcare), detected with a Typhoon phosphorimager and quantitated with ImageQuant software (GE Healthcare). [...] (1) and (2) results in convergence on self-consistent values of n and Kobs (Adams & Fried, 2007; Fried & Crothers, 1984).

Dynamic light-scattering measurement
The effective molecular radius and the homogeneity/monodispersity of the complex in various buffer conditions were measured using the Solubility Screening Kit (Jena Biosciences) in conjunction with a Dynapro-99 dynamic light-scattering instrument (Proterion Corporation) and a DynaPro-MSTC200 microsampler (Protein Solutions). The results were analyzed using DYNAMICS v.5.26.60 (Protein Solutions). 20 µl of sample was inserted into the cuvette with the temperature control set to 293 K. The light-scattering signal was collected at a wavelength of 830.7 nm. Protein concentrations were about 2 mg ml⁻¹ in each buffer and an average of 15 readings was recorded for each measurement.

Crystallization and optimization
Protein-DNA complexes were dialyzed in 20 mM Tris pH 7.5, 75 mM NaCl and 1 mM DTT at 277 K for 2.5 h and concentrated to at least 10 mg ml⁻¹. The initial crystallization trials were carried out at 295 K in 24-well plates using the hanging-drop vapor-diffusion method with a sparse-matrix screen (Jancarik & Kim, 1991) [...] in length and the nature of the ending (blunt end versus overhang) were screened, diffraction-quality crystals were only reproducibly obtained using the overhang 21-mer shown in Fig. 1(b) (the two HNF4 direct-repeat recognition sites are indicated by the boxes). Conditions yielding small crystals were further optimized by variation of the crystallization parameters and additives. The final condition, which produced somewhat flat bipyramidal crystals at 295 K, contained 26%(v/v) PEG 4000, 80 mM magnesium acetate and 50 mM sodium citrate pH 4.8.

Figure 2
Gel filtration of HNF4-DBD with the molecular-weight standard samples (labelled in kDa). HNF4-DBD is homogeneous and appears to be a monomer in solution. The SDS-PAGE of purified HNF4-DBD along with the molecular-weight standard is shown in the inset.

Figure 3
Binding of HNF4-DBD to duplex 21-mer target DNA detected by EMSA. Reactions were carried out in 10 mM Tris (pH 8.0 at 293 K), 1 mM EDTA, 100 mM NaCl, 1 mM MgCl2, 1 mM DTT, 4% glycerol, 100 µg ml⁻¹ BSA. (a) Forward titration. All samples contained 0.21 µM duplex 21-mer. Samples contained HNF4-DBD protein, from the second lane, at 0.41, 0.82, 1.23, 2.05, 2.87, 3.69, 4.51, 5.33, 6.56, 8.2, 12.0 and 20.5 [...]

Data collection and processing
The crystals were transferred into mother liquor containing an additional 15%(v/v) glycerol as a cryoprotectant before being directly plunged into liquid nitrogen and stored for data collection. The native data were collected at 100 K at APS (SER-CAT 22BM) using a MAR 225 CCD detector and an oscillation angle of 1° with 2 s exposure and were processed using HKL-2000 (Otwinowski & Minor, 1997). The wavelength used was 0.92017 Å.
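As a quick check on the data-collection geometry, Bragg's law (λ = 2d sin θ) gives the scattering angle at which a given resolution is recorded for a given wavelength. The sketch below evaluates it for the wavelength stated here and the 2.0 Å resolution limit reported in the Results. The function name is hypothetical, and detector size and crystal-to-detector distance, which are not stated in the text, are not considered.

```python
import math


def two_theta_deg(wavelength_A, d_spacing_A):
    """Scattering angle 2θ in degrees from Bragg's law, λ = 2 d sin θ."""
    s = wavelength_A / (2.0 * d_spacing_A)
    if not 0.0 < s <= 1.0:
        raise ValueError("d-spacing is not reachable at this wavelength")
    return 2.0 * math.degrees(math.asin(s))


angle = two_theta_deg(0.92017, 2.0)
print(f"2-theta at 2.0 A resolution: {angle:.1f} degrees")
```

For these values the angle comes out close to 26.6°, so reflections at the resolution limit lie well away from the direct beam.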

Results and discussion
Recombinant HNF4-DBD (amino acids 46-126; Fig. 1a) was purified to homogeneity and mixed with pure DNA for subsequent studies. Gel-filtration experiments showed that the HNF4-DBD protein existed as a monomer in solution (Fig. 2). Purified HNF4-DBD protein forms a single complex with DNA containing its target sequence (Figs. 3a and 3b). Serial dilution analysis (Figs. 3b and 3c) revealed that the stoichiometry of the complex was 2:1 HNF4:dsDNA, with an association constant Kobs of (8.48 ± 0.67) × 10¹⁰ M⁻². The corresponding monomer-equivalent dissociation constant was (3.43 ± 0.13) × 10⁻⁶ M. The formation of a 2:1 complex without the accumulation of detectable levels of the 1:1 intermediate indicates that binding is cooperative. These features will serve as a reference when we study the effects of MODY mutations on DNA binding in the near future.
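The monomer-equivalent dissociation constant quoted above is numerically consistent with taking the n-th root of the reciprocal of Kobs for n = 2, that is Kd = Kobs^(-1/2). The sketch below reproduces that arithmetic and propagates the uncertainty to first order. That the reported value was derived in exactly this way is an assumption, and the function name is hypothetical.

```python
def monomer_equivalent_kd(k_obs, k_obs_err, n=2):
    """Convert an overall association constant (M^-n) to a per-monomer Kd (M).

    Kd = k_obs ** (-1/n). The relative uncertainty of Kd is the relative
    uncertainty of k_obs divided by n (first-order error propagation).
    """
    kd = k_obs ** (-1.0 / n)
    kd_err = kd * (k_obs_err / k_obs) / n
    return kd, kd_err


# Values reported in the text for the 2:1 HNF4:dsDNA complex.
kd, kd_err = monomer_equivalent_kd(8.48e10, 0.67e10)
print(f"Kd = {kd:.2e} M, uncertainty = {kd_err:.1e} M")
```

The result agrees with the reported 3.43 × 10⁻⁶ M, and the propagated uncertainty is of the same size as the quoted ±0.13 × 10⁻⁶ M.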
Dynamic light scattering (DLS) is a useful tool to monitor protein solubility behavior and to predict favorable crystallization conditions (Wilson, 2003). We used the Solubility Screening Kit (Jena Biosciences) in conjunction with DLS (Jancarik et al., 2004) in order to identify the optimal buffer conditions for complex formation and crystallization. The best polydispersity value of 0.06 was obtained with a buffer containing 20 mM Tris pH 7.5 and 75 mM NaCl and this optimal buffer was used for subsequent crystallization.
Figure 4
Typical crystals of the HNF4-DNA complex.

Figure 5
A typical X-ray diffraction pattern from a crystal of the HNF4-DNA complex. A small section near the water ring is enlarged and shown in the inset. The overall mosaicity of the crystal was 0.3°.

For crystallization, purified HNF4(46-126) and various DNAs were simply mixed in a 2:1.2 molar ratio, dialyzed against the optimal binding buffer (20 mM Tris pH 7.5, 75 mM NaCl) and concentrated using 10 kDa molecular-weight cutoff concentrators (Millipore). The protein-DNA concentration was 10 mg ml⁻¹ for initial screenings and 20 mg ml⁻¹ for final optimization. Crystals with the overhang 21-mer DNA (Fig. 1b) [...] vapor-diffusion method and the presence of the HNF4-DNA complex in the crystals was confirmed by running SDS-PAGE and 0.5%(w/v) agarose gels (data not shown). Crystals initially appeared within 2 d and continued to grow until they reached average dimensions of 0.05 × 0.1 × 0.2 mm (Fig. 4). A range of solution conditions varying the pH, temperature and concentrations of additives such as organic solvents, divalent cations and polyamines were used to attempt to improve the crystal quality. The final optimized condition contains 26%(v/v) PEG 4000, 80 mM magnesium acetate and 50 mM sodium citrate pH 4.8. The best crystal diffracted to 2.0 Å at the synchrotron source and belongs to space group C2, with unit-cell parameters a = 121.63, b = 35.43, c = 70.99 Å, β = 119.36° (Fig. 5). The value of the Matthews coefficient (Matthews, 1968) is 2.12 Å³ Da⁻¹ for one complex (two HNF4 and one dsDNA) in the asymmetric unit and the estimated solvent content is 41.6% based on a protein specific density of 1.34. Final native data-collection statistics are summarized in Table 1.
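The Matthews coefficient and solvent content can be cross-checked from the reported unit-cell parameters. The sketch below computes the monoclinic cell volume, back-calculates the mass per asymmetric unit implied by the reported coefficient, and estimates the solvent fraction. It makes several assumptions: the molecular mass of the complex is not stated in the text, so it is inferred rather than supplied; the value 1.34 is interpreted as a density in g cm⁻³; Z = 4 is the number of general positions in C2; and the function names are hypothetical.

```python
import math

# 1 / (0.6022 Da per cubic angstrom at 1 g cm^-3); converts density to Da units.
DA_PER_A3_FACTOR = 1.6605


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume (cubic angstroms) for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def solvent_fraction(matthews_coeff, density_g_cm3):
    """Solvent fraction = 1 - 1.6605 / (density * Vm)."""
    return 1.0 - DA_PER_A3_FACTOR / (density_g_cm3 * matthews_coeff)


volume = monoclinic_cell_volume(121.63, 35.43, 70.99, 119.36)
Z = 4  # asymmetric units per cell in C2 (assumes one complex per asymmetric unit)
implied_mass_da = volume / (Z * 2.12)  # mass consistent with the reported Vm
solvent = solvent_fraction(2.12, 1.34)

print(f"cell volume: {volume:.0f} A^3")
print(f"implied mass per asymmetric unit: {implied_mass_da:.0f} Da")
print(f"solvent content: {100 * solvent:.1f} %")
```

Under these assumptions the solvent content comes out close to the reported 41.6%, and the implied mass of roughly 31 kDa is of the size expected for two copies of an 83-residue domain bound to a 21-mer duplex.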
The structure was determined by molecular replacement using the RXR-RAR-DNA complex structure (PDB code 1dsz) as a search model and the program MOLREP (Vagin & Teplyakov, 1997) from the CCP4 suite (Winn, 2003). An unambiguous solution was found that gave an initial R value of 51.4% and a correlation coefficient of 0.38 using data in the resolution range 15-3.0 Å. The subsequent σA-weighted 2Fo − Fc map after rigid-body refinement clearly revealed density corresponding to the structural differences between the search model and the HNF4-DNA complex. Model improvement and refinement of the structure are in progress.