Expression, purification and preliminary crystallographic studies of a single-point mutant of Mos1 mariner transposase

© 2004 International Union of Crystallography. Printed in Denmark – all rights reserved.

A soluble single-point mutant of full-length Mos1 mariner transposase (MW = 40.7 kDa) has been overexpressed in Escherichia coli, purified to 95% homogeneity and crystallized. This provides the first example of the crystallization of a eukaryotic transposase. The native crystals diffract to 2.5 Å resolution and show tetragonal symmetry, with unit-cell parameters a = b = 44.5, c = 205.6 Å. Multiple-wavelength anomalous data from a selenomethionyl form of the protein and data from a heavy-atom derivative have been collected.

Received 20 January 2004; accepted 17 February 2004.


Introduction
The mariner element Mos1 (first isolated from Drosophila mauritiana) is a member of the mariner/Tc1 superfamily of transposable elements (Robertson, 1995). It encodes a 40.7 kDa transposase (containing 345 residues in the sequence shown in Fig. 1) that is the sole requirement for transposition of the element in vitro. This enzyme catalyses a set of DNA-cleavage and DNA-insertion reactions in a cut-and-paste mechanism. In the first step of this reaction, the transposase specifically recognizes and binds to the terminal inverted repeats of its own transposon via its N-terminal domain, which is formed of the first 118 amino acids (Zhang et al., 2001). The catalytic C-terminal domain mediates the subsequent excision of the transposon from the flanking DNA and its insertion into a target sequence. After disengagement of the protein from the DNA, the 5 bp single-strand gaps at each end are repaired. Mos1 transposase contains a variant, DD(34)D, of the catalytic DD(35)E motif (van Luenen et al., 1994) that is required for divalent metal (Mg²⁺) binding and is conserved among polynucleotide phosphotransferases. These include integrases, which are encoded by retroviruses and retrotransposons, and transposases of the bacterial elements Tn5 and Tn10 and other members of the mariner/Tc1 family.
The mechanism of transposition has been studied in detail for a number of prokaryotic elements, including Tn5 (Reznikoff, 2003) and Tn10 (Kennedy et al., 1998). The crystal structure of Tn5 transposase in a synaptic complex provided the first structural insight into a transposase in its active DNA-bound state (Davies et al., 2000). Other studies have revealed the structures of domains of bacterial transposases (Rice & Mizuuchi, 1995; Schumacher et al., 1997) and related integrases (Chen et al., 2000; Wang et al., 2001). In contrast, less is known about the detailed mechanisms of eukaryotic transposition or the structures of the transposases involved. A recent biochemical analysis of the excision of DNA by Mos1 transposase (Dawson & Finnegan, 2003) highlighted differences in the mechanism of DNA cleavage by the enzyme compared with that by prokaryotic transposases. Moreover, similarities were found with the RAG1/2-mediated assembly of immunoglobulin and T-cell receptor genes in V(D)J recombination, consistent with the proposal that the modern immune system has evolved from a transposon more like Mos1 than bacterial transposons such as Tn5.

Figure 1. Sequence of Mos1 transposase. The N-terminal domain (residues 1–118) is coloured red and the C-terminal domain (residues 119–345) is coloured blue. An orange circle highlights the position of the single mutation T216A. Purple stars indicate the three active-site aspartic acid residues. Methionine residues are coloured green. The secondary structure predicted by the PredictProtein server is also shown. This diagram was created using the program ESPript.
Structural analysis of Mos1 transposase has been precluded for many years owing to the insoluble nature of the protein when expressed in Escherichia coli. Here, we report the cloning, overexpression, purification and crystallization of a soluble single-point mutant of full-length Mos1 transposase.

Protein cloning and overexpression
The single-point mutation (T216A) was isolated in a yeast two-hybrid screen designed to identify transposase derivatives that are able to interact more strongly with the wild-type protein. The sequence coding for the mutant protein was amplified by PCR and inserted into the E. coli expression vector pBCP378 (Velterop et al., 1995) using NdeI and BamHI restriction sites in the PCR primers. DNA sequencing confirmed the integrity of the cloned fragment. The recombinant plasmid was transformed into strain BL21(DE3) (Novagen) and plated out on LB agar containing carbenicillin (200 µg ml⁻¹). Single colonies were used to inoculate 50 ml of LB medium with carbenicillin. After overnight incubation at 303 K, these cultures were diluted into 500 ml LB medium (with 200 µg ml⁻¹ carbenicillin). Expression of the transposase was induced by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG) when the cell culture grown at 310 K had attained late log phase (OD₆₀₀ = 0.6). The mutant protein was shown by SDS–PAGE to be present in the soluble fraction of the total cell lysate. Expression conditions were optimized and the best level of expression was achieved using 1.0 mM IPTG for 4 h at 303 K.
For the production of selenomethionyl protein, the expression construct was transformed into the methionine-deficient E. coli strain B834(DE3). Single colonies grown on agar plates (with 200 µg ml⁻¹ carbenicillin) were used to inoculate 50 ml overnight cultures of LB medium. These were pelleted and used to inoculate (to OD₆₀₀ = 0.2) a minimal medium containing selenomethionine, along with the other 19 amino acids, and protein production was induced with IPTG as before. The subsequent extraction and purification of selenomethionyl protein proceeded as described below, except that 5 mM DTT was included in all buffers.

Protein purification and analysis
Cells collected from 1 l cell culture by centrifugation (2500g for 20 min at 277 K) were frozen overnight and then suspended in 20 ml extraction buffer (buffer A; 25 mM NaH₂PO₄, 500 mM NaCl, 5 mM MgCl₂, 1 mM DTT pH 7.5) containing Complete protease-inhibitor mix (Roche). Lysozyme (100 µl at 100 mg ml⁻¹ in buffer A) was added and the mixture was incubated on ice for 15 min and then sonicated. DNaseI (200 U, Sigma) was added and the suspension was incubated on ice for a further 30 min. The cell lysate was centrifuged at 2500g for 30 min at 277 K and the precipitate was discarded. The crude extract was first applied onto a Poros HS cation-exchange column on a BioCAD Sprint FPLC system (Applied Biosystems) and eluted with a salt gradient (0.2–1 M NaCl). Fractions eluting at approximately 0.6 M NaCl were shown by SDS–PAGE to contain Mos1 transposase. These fractions were pooled, concentrated and exchanged into buffer B (25 mM Tris, 0.2 M NaCl, 1 mM DTT pH 7.5) in a Vivaspin 20 ml centrifugal concentrator with a 10 kDa cutoff. Subsequently, the protein was passed through a Superdex 200 HR10/30 gel-filtration column on an ÄKTA FPLC system (Amersham Pharmacia Biotech) in buffer B at 0.2 ml min⁻¹. In this way Mos1 transposase was purified to 95% homogeneity, as confirmed by SDS–PAGE, with a typical yield of 3 mg per litre of LB medium.
The mutant protein was shown in transposition assays and gel-retardation experiments to have identical activity to wild-type protein (Dawson & Zhang, personal communication). The molecular weight of purified transposase was found to be 40 662 Da by MALDI–TOF mass spectrometry, compared with the predicted weight of 40 665 Da taking into account the loss of the first methionine residue. The molecular weight of the selenomethionine-substituted protein was 40 925 Da. The difference in weight of 259 Da indicated 100% incorporation of five Se atoms (47 Da difference per selenium) plus one Mg atom (24 Da).
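The mass bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and the per-atom shifts are the rounded values quoted in the text. The quoted 259 Da matches five selenium substitutions plus one magnesium atom. Subtracting the two measured masses as printed gives a slightly larger value, so the comparison should be read as approximate.

```python
# Masses reported in the text (Da), from MALDI-TOF mass spectrometry.
native_mass = 40662
semet_mass = 40925

# Rounded per-atom shifts quoted in the text.
SE_MINUS_S = 47  # Da gained per Met -> SeMet substitution
MG_MASS = 24     # Da for one bound Mg atom

# Number of selenium sites assumed in the text (first methionine lost).
n_selenium = 5

# Shift expected for full incorporation plus one Mg atom.
expected_shift = n_selenium * SE_MINUS_S + MG_MASS

# Shift obtained by subtracting the two measured masses as printed.
observed_shift = semet_mass - native_mass

print(f"expected shift: {expected_shift} Da")
print(f"difference between the two reported masses: {observed_shift} Da")
```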

Protein crystallization and preparation of heavy-atom derivatives
The hanging-drop vapour-diffusion method was used to grow crystals in 24-well Linbro plates. Plate-like crystals (shown in Fig. 2) with dimensions of up to 0.1 × 0.1 × 0.02 mm grew after three weeks at 290 K under the following conditions: the well solution consisted of 22–26%(w/v) PEG 4000, 0.1 M Tris buffer pH 7.5 and 5 mM MgCl₂ and the drop consisted of 2 µl protein solution at 13 mg ml⁻¹ in buffer B plus 2 µl well solution. Selenomethionine-protein crystals were obtained by streak-seeding with a cat's whisker from a drop containing a crushed native crystal. A mercury derivative was obtained by soaking native crystals for 10 min in well solution containing 10 mM KHgI₃ followed by back-soaking into well solution. Prior to data collection, all crystals were briefly immersed in a cryoprotectant solution comprising 20%(v/v) glycerol in mother liquor and flash-cooled by plunging into liquid nitrogen. All diffraction data were collected at 100 K.

Biochemical analysis of the crystals
Native crystals were extracted from the hanging drops and washed in water before being dissolved in a 1:1 water:acetonitrile solution for N-terminal sequence analysis or 2%(w/v) SDS for analysis by PAGE and mass spectrometry. N-terminal sequence analysis of the native crystal (Procise sequencer, Applied Biosystems) predicted a sequence containing the residues (F/W)VPN. This could correspond to residues 4–7 with sequence FVPN and/or to residues 119–122 with sequence WVPH (see Fig. 1). MALDI–TOF mass spectrometry and SDS–PAGE (gel shown in Fig. 3)

Pro184–Glu345 (19.273 kDa) and fragment D Trp119–Arg183 (7.87 kDa) or Trp119–Arg167 (6.186 kDa). The in vitro cleavage of Mos1 adjacent to Arg119, Arg167 and Arg183 is consistent with the action of a protease and may reflect a role for protein degradation in vivo. For Tn5 it has been suggested that disengagement of the protein from DNA after transposition may be facilitated in vivo by proteolytic cleavage at Lys40 (Twining et al., 2001).

X-ray data collection
Data were collected at the Daresbury SRS (station 14.2, various wavelengths) with an ADSC Q4 CCD detector and at the ESRF, Grenoble (station ID29, various wavelengths) with an ADSC Q210 CCD detector. In order to achieve the optimal alignment of the long c axis of the crystal with the spindle axis, crystals were either mounted in bent loops or positioned on a swing-arc goniometer head (Hampton Research). For MAD data, fluorescence spectra were measured to determine the absorption edge of selenium and data sets were measured at three wavelengths: the inflection point (0.98067 Å, λ₁) and the peak (0.98017 Å, λ₂) of the edge plus a low-energy remote wavelength (1.000 Å, λ₃). The mercury-derivative data were collected at a wavelength of 1.005 Å. All data were collected using a φ scan with a step size of 1.0°, and indexing and scaling of data were performed using DENZO and SCALEPACK (Otwinowski & Minor, 1997).
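The three MAD wavelengths can be converted to photon energies to make the ordering around the selenium edge explicit. The following is a minimal sketch using E = hc/λ with hc ≈ 12398.42 eV Å. The function and dictionary names are illustrative and are not part of any data-collection software.

```python
# hc expressed in eV * Angstrom (rounded physical constant).
HC_EV_ANGSTROM = 12398.42


def wavelength_to_energy_ev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Angstrom to a photon energy in eV."""
    return HC_EV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted in the text (Angstrom).
mad_wavelengths = {
    "inflection": 0.98067,
    "peak": 0.98017,
    "remote": 1.000,
}

# Photon energy for each data set.
mad_energies = {
    name: wavelength_to_energy_ev(wl) for name, wl in mad_wavelengths.items()
}

for name, energy in mad_energies.items():
    print(f"{name:>10}: {mad_wavelengths[name]:.5f} A -> {energy:.1f} eV")
```

The peak lies a few eV above the inflection point, and the remote data set, at the longest wavelength, has the lowest energy, in keeping with its description as a low-energy remote.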

Unit-cell determination
The diffraction data (see Table 1) were consistent with space group P4₁2₁2 or P4₃2₁2, with unit-cell parameters a = b = 44.5, c = 205.6 Å. In this space group, with one molecule of intact protein in the asymmetric unit (eight molecules per unit cell), the value of V_M is 1.3 Å³ Da⁻¹ (1.2% solvent), which is physically unreasonable. An alternative explanation is that the space group is P4₁ or P4₃ with merohedral twinning (with twinning operator (h, k, l) → (k, h, −l)), giving rise to the apparent higher symmetry of the diffraction data. If one molecule of intact protein is present in this lower symmetry space group the value of V_M is 2.5 Å³ Da⁻¹ (50.6% solvent). Other forms of crystal twinning with lower symmetry monoclinic or triclinic space groups, possibly containing truncated protein, cannot be completely discounted until the structure has been fully refined.
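The packing argument can be reproduced from the unit-cell parameters. The sketch below computes the Matthews coefficient V_M = V / (Z × MW) for eight and for four molecules per cell. Two assumptions are ours: it uses the predicted molecular weight of 40 665 Da, and it estimates solvent content as 1 − 1.23/V_M with the conventional constant 1.23. The percentages quoted in the text appear to come from a slightly different constant, so the solvent values agree only approximately.

```python
# Unit-cell parameters from the text (Angstrom); tetragonal, all angles 90 deg.
a = b = 44.5
c = 205.6

# Predicted molecular weight of the protein (Da), as quoted in the text.
molecular_weight = 40665

cell_volume = a * b * c  # Angstrom^3


def matthews_coefficient(volume: float, n_molecules: int, mw: float) -> float:
    """V_M in Angstrom^3 per Da for n_molecules of weight mw per unit cell."""
    return volume / (n_molecules * mw)


def solvent_fraction(v_m: float, constant: float = 1.23) -> float:
    """Approximate solvent fraction; 1.23 is the conventional constant."""
    return 1.0 - constant / v_m


# P4(1)2(1)2 / P4(3)2(1)2: eight asymmetric units per cell.
vm_422 = matthews_coefficient(cell_volume, 8, molecular_weight)
# P4(1) / P4(3): four asymmetric units per cell.
vm_4 = matthews_coefficient(cell_volume, 4, molecular_weight)

for label, vm in (("8 molecules per cell", vm_422), ("4 molecules per cell", vm_4)):
    print(f"{label}: V_M = {vm:.2f} A^3/Da, solvent ~ {100 * solvent_fraction(vm):.1f}%")
```

Halving the number of molecules per cell doubles V_M. This is why the lower-symmetry space groups give a physically reasonable solvent content whereas the higher-symmetry ones leave almost no room for solvent.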

Twinning tests
Analysis of the native data set for partial twinning with CNS suggested a twinning fraction of 0.46 and the Yeates twinning server suggested a value of 0.47. The cumulative intensity statistics for the acentric reflections followed typical patterns for non-twinned diffraction data. However, it is recognized that pseudo-centring or strong anisotropic diffraction, such as could occur from a plate-like crystal or from a crystal with one unit-cell edge much longer than the others, can affect these statistics.
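One way to see how a twin fraction can be estimated from twin-related intensity pairs is the H statistic, H = |I₁ − I₂| / (I₁ + I₂), whose mean for acentric reflections is 1/2 − α. The sketch below is a toy simulation on synthetic data and is not the procedure implemented in CNS or in the Yeates server. It assumes ideal acentric (exponentially distributed) true intensities with no measurement error, and all function names are illustrative.

```python
import random


def simulate_twin_pairs(alpha: float, n_pairs: int, seed: int = 0):
    """Simulate observed intensities for twin-related acentric reflections.

    True intensities are drawn from an exponential distribution and mixed
    with twin fraction alpha.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        j1 = rng.expovariate(1.0)
        j2 = rng.expovariate(1.0)
        i1 = (1.0 - alpha) * j1 + alpha * j2
        i2 = alpha * j1 + (1.0 - alpha) * j2
        pairs.append((i1, i2))
    return pairs


def estimate_twin_fraction(pairs) -> float:
    """Estimate alpha from the mean of H = |I1 - I2| / (I1 + I2)."""
    h_values = [abs(i1 - i2) / (i1 + i2) for i1, i2 in pairs]
    mean_h = sum(h_values) / len(h_values)
    return 0.5 - mean_h


# Synthetic example with a twin fraction close to the values quoted above.
pairs = simulate_twin_pairs(alpha=0.46, n_pairs=20000)
alpha_estimate = estimate_twin_fraction(pairs)
print(f"estimated twin fraction: {alpha_estimate:.3f}")
```

As α approaches 0.5, the twin-related intensities become nearly equal and H approaches zero. In that regime, partial-twinning tests become sensitive to measurement error and to effects such as anisotropy.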

Preliminary phasing
By combining the selenomethionine MAD data with the mercury-derivative data, four of the five possible selenium sites and two mercury sites were located with the program SOLVE (Terwilliger & Berendzen, 1999), assuming the space group to be P4₁ or P4₃. Whilst both these space groups are consistent with the crystal data and the positions found, only P4₁ provided an electron-density map with interpretable features.