Zymogenic latency in an ∼250-million-year-old astacin metallopeptidase
aProteolysis Laboratory, Department of Structural Biology, Molecular Biology Institute of Barcelona (IBMB), Higher Scientific Research Council (CSIC), Barcelona Science Park, Baldiri Reixac 15–21, Helix Building, 08028 Barcelona, Catalonia, Spain, bInstitut für Molekulare Physiologie (IMP), Johannes-Gutenberg Universität Mainz (JGU), Johann-Joachim-Becher-Weg 7, 55128 Mainz, Germany, and cBiochemical Institute, Christian-Albrechts-Universität zu Kiel, Otto-Hahn-Platz 9, 24118 Kiel, Germany
*Correspondence e-mail: firstname.lastname@example.org
The horseshoe crab Limulus polyphemus is one of few extant Limulus species, which date back to ∼250 million years ago under the conservation of a common Bauplan documented by fossil records. It possesses the only proteolytic blood-coagulation and innate immunity system outside vertebrates and is a model organism for the study of the evolution and function of peptidases. The astacins are a family of metallopeptidases that share a central ∼200-residue catalytic domain (CD), which is found in >1000 species across holozoans and, sporadically, bacteria. Here, the zymogen of an astacin from L. polyphemus was crystallized and its structure was solved. A 34-residue, mostly unstructured pro-peptide (PP) traverses, and thus blocks, the active-site cleft of the CD in the opposite direction to a substrate. A central `PP motif' (F35-E-G-D-I39) adopts a loop structure which positions Asp38 to bind the catalytic metal, replacing the solvent molecule required for catalysis in the mature enzyme according to an `aspartate-switch' mechanism. Maturation cleavage of the PP liberates the cleft and causes the rearrangement of an `activation segment'. Moreover, the mature N-terminus is repositioned to penetrate the CD moiety and is anchored to a buried `family-specific' glutamate. Overall, this mechanism of latency is reminiscent of that of the other three astacins with known zymogenic and mature structures, namely crayfish astacin, human meprin β and bacterial myroilysin, but each shows specific structural characteristics. Remarkably, myroilysin lacks the PP motif and employs a cysteine instead of the aspartate to block the catalytic metal.
The Atlantic horseshoe crab Limulus polyphemus (Linnaeus, 1758) is a unique marine merostomatous arthropod that is endemic to North America (Shuster, 1982; Walls et al., 2002). It is one of four closely related extant species of horseshoe crabs together with Tachypleus tridentatus, Tachypleus gigas and Carcinoscorpius rotundicauda, which are found in Asia (Sekiguchi & Shuster, 2009). They are the only survivors of the order Xiphosurida (Bicknell & Pates, 2020) and are the closest living relatives of trilobites (Shuster, 1982). Indeed, Limulus spp. go back to ∼250 million years ago (Mya) (Bicknell & Pates, 2020) and the Limulidae family has existed since the Carboniferous period (∼360 Mya; Bicknell & Pates, 2019, 2020). Xiphosurida, which share a highly conserved horseshoe-crab-like Bauplan as inferred from an exceptionally extensive fossil record (Bicknell & Pates, 2020), date as far back as the Late Ordovician (∼445 Mya; Rudkin et al., 2008; Bicknell & Pates, 2020) or Cambrian (∼540 Mya; Størmer, 1952). Thus, these animals have survived all five great mass extinctions and are sometimes considered to be `living fossils', a term introduced by Charles Darwin (Darwin, 1859), or `stabilomorphs' (Kin & Błażejowski, 2014) and are an example of `evolutionary stasis' (Rudkin et al., 2008).
Despite their name, horseshoe crabs are actually not crustaceans but chelicerates that are phylogenetically closer to spiders, ticks and scorpions than to crabs (Lankester, 1881; Ballesteros & Sharma, 2019). L. polyphemus has a remarkable estimated life expectancy of up to 20 years (Walls et al., 2002) and is frequently used as a laboratory animal model to study its compound eyes, its simple nervous system and marine invertebrate embryology in general (Smith, 2022). Moreover, it possesses an ancient and primitive proteolytic blood-coagulation and innate immunity system, which is the only one found outside vertebrates (Rowley et al., 1984; Doolittle, 2010; Schmid et al., 2019; Winter et al., 2020; Eleftherianos et al., 2021). Thus, L. polyphemus is an important organism for study of the evolution and function of peptidases (Becker-Pauly et al., 2009).
The astacins are a family of zinc-dependent metallopeptidases (MPs; Stöcker et al., 1993; Gomis-Rüth, Trillo-Muyo et al., 2012; Stöcker & Gomis-Rüth, 2013; Bond, 2019) named after the archetypal digestive enzyme astacin from the European freshwater crayfish Astacus astacus L., which was first described in 1967 (Pfleiderer et al., 1967; Stöcker et al., 1988, 1992; Stöcker & Yiallouros, 2013). Astacins are characterized by a central ∼200-residue zinc-dependent catalytic domain (CD), which occurs in >12 000 sequences from >1000 species of identified and putative family members grouped into family PF01400 within the PFAM database (Mistry et al., 2021). Sequences are found consistently with Darwinian vertical descent throughout metazoans and, sporadically, up to the root of holozoans. They are absent from plants and viruses (Semenova & Rudenskaia, 2008) and are found to be scattered across bacteria, which suggests that they are xenologues resulting from horizontal gene transfer from eukaryotes (Koonin et al., 2001; Keeling & Palmer, 2008). The structural characteristics of astacin CDs further place the family within the metzincin clan of MPs (Bode et al., 1993; Stöcker et al., 1995; Gomis-Rüth, Trillo-Muyo et al., 2012; Cerdà-Costa & Gomis-Rüth, 2014) and family M12A of the MEROPS database (Rawlings & Bateman, 2021).
Astacins share a basic domain architecture consisting of an N-terminal signal peptide for secretion, a pro-peptide (PP) of variable length (from 34 residues in astacin to 486 residues in Drosophila melanogaster tolkin; Finelli et al., 1995; Gomis-Rüth, Trillo-Muyo et al., 2012; Arolas et al., 2018) for zymogenic latency and the CD (Gomis-Rüth, Trillo-Muyo et al., 2012). This core may be C-terminally extended by disparate modules, among which are linkers (LNK), CUB domains (found in the complement component C1r/1s, the embryonic sea urchin Uegf and bone morphogenetic protein 1; Bork & Beckmann, 1993; PF00431) and MAM domains (common to meprins, A5 receptor protein and tyrosine phosphatase μ; Cismasiu et al., 2004; PF00629). Two astacins, namely a short 240-residue protein (astl gene; UniProt accession B4F319) and a long 403-residue protein (astl-mam gene; UniProt B4F320), were identified in L. polyphemus, recombinantly expressed and biochemically characterized (Becker-Pauly et al., 2009, 2011). The short form was predominantly found in the eyes and the brain, which suggests a function in the nervous system, while the long form was ubiquitous (Becker-Pauly et al., 2009). The short paralogue has the basic domain architecture of the family, while the long paralogue further contains an LNK and an MAM domain. Both astacins share 46% sequence identity within the PP and the CD, and their trypsin-activated forms showed proteolytic activity in gelatin zymography and in solution against azocasein and the extracellular matrix proteins fibronectin, type IV collagen, gelatin and laminin, but not triple-helical collagen (Becker-Pauly et al., 2009). Finally, consistent with the horseshoe crab being a chelicerate, these astacins were found to be closer to an orthologue from the brown spider Loxosceles intermedia in a phylogenetic analysis than to the crustacean orthologues from the crayfish A. astacus and the shrimp Penaeus vannamei (Becker-Pauly et al., 2009).
Here, we crystallized the zymogen of the long paralogue, hereafter referred to as pLAST-MAM, and solved its crystal structure. Our results provide structural and molecular insight into the latency mechanism of the currently evolutionarily oldest holozoan astacin.
The pLAST-MAM zymogen was obtained by recombinant expression in Trichoplusia ni High Five insect cells, purified as described in Becker-Pauly et al. (2009) and subsequently concentrated in a Vivaspin device using a polyethersulfone membrane with 10 kDa cutoff (Vivaproducts). We screened for crystallization conditions using the sitting-drop vapour-diffusion method at the joint IBMB/IRB Automated Crystallography Platform (https://www.ibmb.csic.es/en/facilities/automated-crystallographic-platform). Reservoir solutions were prepared using a Tecan Freedom EVO robot and were dispensed into 96 × 2-well MRC plates (Innovadyne Technologies). A Phoenix/RE robot (Art Robbins) administered crystallization nanodrops consisting of 100 nl each of protein and reservoir solution. Crystallization plates were subsequently incubated at 4 or 20°C in Bruker steady-temperature crystal farms. Successful initial conditions were refined and scaled up to the microlitre range in 24-well Cryschem crystallization dishes (Hampton Research) whenever possible. Optimal crystals of the protein at ∼7 mg ml−1 in 50 mM HEPES pH 7.0 were obtained at 20°C using 0.1 M bicine pH 9.0, 10% polyethylene glycol (PEG) 40 000, 2% dioxane as the reservoir solution. Crystals were thin and fragile rectangular plates, which were harvested using cryo-loops (Molecular Dimensions), rapidly passed through a cryo-buffer consisting of reservoir solution plus 20%(v/v) glycerol and flash-vitrified in liquid nitrogen for transport and data collection.
X-ray diffraction data were collected on 18 April 2010 using an ADSC Quantum 315r detector on beamline ID29 of the ESRF synchrotron, Grenoble, France. Diffraction data were processed using XDS (Kabsch, 2010) and XSCALE, and were transformed to MTZ format using XDSCONV for use with the Phenix (Liebschner et al., 2019) and CCP4 (Winn et al., 2011) suites. Analysis with phenix.xtriage within Phenix revealed an absence of translational noncrystallographic symmetry (NCS) and no significant twinning according to the L-test. The crystals contained two monomers in the asymmetric unit and Table 1 provides essential statistics on data collection and processing.
‡Average intensity is the 〈I/σ(I)〉 of unique reflections after merging according to XSCALE (Kabsch, 2010).
§According to the wwPDB Validation Service (https://wwpdb-validation.wwpdb.org/validservice).
The structure of pLAST-MAM was solved by molecular replacement using the Phaser crystallographic software (McCoy et al., 2007) and a homology model for the CD and MAM domain predicted with AlphaFold (Jumper et al., 2021). After several trials, we could only obtain correct solutions by searching with the domains separately, i.e. two for the CD but only one for the MAM domain. Those for the CD corresponded to Eulerian angles of α = 54.2, β = 54.0, γ = 116.3 and cell-fraction translation values of x = 0.106, y = 0.002, z = 0.210 for one protomer and α = 261.5, β = 125.5, γ = 297.0, x = 0.419, y = 0.884, z = 0.303 for the second protomer. The corresponding values for the MAM moiety were α = 64.8, β = 106.0, γ = 172.9, x = 0.285, y = 0.751, z = 0.991. These solutions had a final translation-function Z-score of 17.1 and a global log-likelihood gain after refinement of 782.
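To illustrate how a rotation-function solution of this kind maps onto a rotation matrix, the following Python sketch converts three Eulerian angles into a 3 × 3 matrix. It assumes the ZYZ convention, R = Rz(α)·Ry(β)·Rz(γ), that is customary in CCP4-style programs; the function name is introduced here for illustration only, and the exact convention applied by a given Phaser version should be checked against its documentation.

```python
import math

def euler_zyz_to_matrix(alpha_deg, beta_deg, gamma_deg):
    """Return R = Rz(alpha) * Ry(beta) * Rz(gamma) as a 3x3 nested list.

    Assumption: ZYZ Euler convention with angles given in degrees.
    """
    a, b, g = (math.radians(x) for x in (alpha_deg, beta_deg, gamma_deg))
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [
        [ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
        [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
        [-sb * cg, sb * sg, cb],
    ]

# Angles reported above for the first CD protomer.
R_cd1 = euler_zyz_to_matrix(54.2, 54.0, 116.3)
```

A proper rotation matrix is orthonormal with determinant +1, which provides a quick sanity check on any convention chosen.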
The suitably rotated and translated molecules were subjected to the phenix.autobuild protocol (Terwilliger et al., 2008) within Phenix, which yielded a greatly improved Fourier map for manual model building with Coot (Casañal et al., 2020). Model building alternated with crystallographic refinement using the phenix.refine protocol (van Zundert et al., 2021) and BUSTER (Smart et al., 2012), which both included translation/libration/screw motion and NCS restraints, until completion of the model. The final model comprised residues Glu22–Cys403 of protomer A and Glu22–Gly246 of protomer B, each with a catalytic zinc ion plus one tentatively assigned magnesium cation, one diethylene glycol molecule, one triethylene glycol molecule, two glycerol molecules and 229 solvent molecules. The occupancy of LNK and MAM of protomer A refined to 87%. Table 1 provides essential statistics on the final refined model, which was validated through the wwPDB validation service (https://validate-rcsb-1.wwpdb.org/validservice). The coordinates can be retrieved from the Protein Data Bank (https://www.wwpdb.org/) as entry 8a28.
Structure superpositions were performed with SSM (Krissinel & Henrick, 2004) within Coot. Figures were prepared using UCSF Chimera (Goddard et al., 2018). Protein interfaces and intermolecular interactions were analysed using PDBePISA (https://www.ebi.ac.uk/pdbe/pisa; Krissinel & Henrick, 2007) and verified by visual inspection. For this, the interacting surface of a complex was taken as half of the sum of the buried surface areas of either molecule.
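The interface-area convention stated above can be written out as a one-line calculation. The Python sketch below is only an illustration; the function name and the input values are hypothetical and do not correspond to any measurement in this study.

```python
def interface_area(buried_area_a, buried_area_b):
    """Interface area taken as half of the summed buried surface areas (in Å^2)."""
    return 0.5 * (buried_area_a + buried_area_b)

# Hypothetical buried surface areas for two interacting molecules.
example_area = interface_area(1000.0, 1100.0)
```

Averaging over both partners avoids counting the same contact surface twice when each molecule buries a slightly different area.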
To prevent autolysis, pLAST-MAM was recombinantly expressed in insect cells as a point mutant in which the general base/acid glutamate for catalysis (Arolas et al., 2018; E140; residues are given as single-letter codes with numbering in superscript according to UniProt B4F320; other proteins are numbered in subscript) was replaced by alanine to create a catalytically impaired variant. This strategy has often been employed in the past to prevent autolysis when crystallizing MP zymogens (see Table 1 in Arolas et al., 2018). pLAST-MAM crystals with two protomers (A and B) in the crystallographic asymmetric unit were obtained in 2010 (Table 1) but the structure was only solved very recently using a homology model predicted by AlphaFold (Jumper et al., 2021) for molecular replacement. After extensive calculations with the whole molecule and separate domains, the two CDs (N49–C244) could confidently be placed, rebuilt and refined, as well as the respective PPs (defined for E22–K48). In contrast, the LNK (F245–D257) and MAM (F258–C403) moieties were flexible and only those of protomer A could be placed in the structure. Moreover, crystallographic refinement revealed that the final Fourier map was discontinuous in several places in the MAM domain owing to this flexibility (Fig. 1a). Indeed, while the CDs showed average thermal displacement parameters (B factors) of 60 and 73 Å² for protomers A and B, respectively, the segment spanning LNK and MAM of protomer A had an average B factor of 116 Å² after occupancy refinement to 87%.
Inspection of the crystal packing revealed that the two CDs form tight layers parallel to the xy plane of the crystal with their respective crystallographic symmetry mates (1 and 2 in Figs. 1b and 1c). They are in a relative upside-down conformation, so that the C-termini protrude either above or below the CD layer. In the case of the A protomers, LNK and MAM project into the space between CD sections and make interactions with symmetric MAM and LNK moieties from the CD layer beneath, respectively, which are required to form the crystal (Figs. 1b and 1c). In contrast, the space between CD sections into which the C-termini of the B-protomer CDs point (sections 2 and 3 in Fig. 1d) does not contain any atoms and thus lacks crystal contacts owing to the missing LNKs and MAMs. However, when superposing the full-length protomer A on protomer B by their respective CDs, the LNK and MAM moieties adopt a very similar arrangement in the space between the two CD layers to that seen in the A protomers (sections 2 and 3 in Fig. 1e). Thus, LNK and MAM of the B protomers must also be present in the crystal to establish the intermolecular contacts necessary to build the crystal. Overall, we conclude that while both LNK–MAM moieties are very flexible and adopt several slightly different orientations that are able to assemble the crystal, those of protomer A are somewhat more rigid, so they are grossly defined in the final Fourier maps. In contrast, those of protomer B are so flexible that the density is too poor to confidently place them.
Thus, given the poor definition of the MAM domains, we will concentrate the discussion hereafter on the PP and CD moieties of the zymogen (referred to here as pLAST) and the mature CD (LAST) of protomer A, and the mechanism of latency in the context of other structurally characterized astacin zymogens. Suffice to say that the predicted structure of the MAM domain of pLAST-MAM is very similar to that of the human astacin-family member meprin β except for some loops (Fig. 1f). For a discussion of the architecture and features of these domains, please refer to Cismasiu et al. (2004), Aricescu et al. (2006, 2007), Arolas et al. (2012), Yelland & Djordjevic (2016) and Eckhard et al. (2021).
The pLAST moiety subdivides into three segments when viewed in the standard orientation of MPs (Gomis-Rüth, Botelho et al., 2012): the N-terminal PP (E22–K48), an upper N-terminal subdomain (NTS) of the CD (N49–G146) and a lower C-terminal subdomain (CTS) of the CD (F147–C244) (Fig. 2a). The PP runs along the front surface of pLAST from right to left and features helix α1 on the primed side of the cleft (substrate and active-site subsite terminology based on Schechter & Berger, 1967; Gomis-Rüth, Botelho et al., 2012). It adopts a wide loop structure protruding from the cleft between L29 and D38 (Fig. 3a), which is stabilized by two intra-main-chain hydrogen bonds (H31 N–G37 O and F35 O–I39 N). The intervening residues are included in a `PP motif' found in astacins (F-E/Q-G-D-I; Gomis-Rüth, Trillo-Muyo et al., 2012), F35-E-G-D-I39 in pLAST (Becker-Pauly et al., 2009). For I39–G41, the polypeptide adopts an extended conformation along the nonprimed side of the cleft before turning 90° downwards for V42–Y45 and then leftwards for Y45–D47. Thereafter, the peptide containing the primary activation cleavage site (K48–N49) enters into the CD, which adopts a helical conformation for K48–H54 (α2; Figs. 2a and 3a).
As in other astacins, the 196-residue CD divides into an NTS and a CTS of approximately equal size (Fig. 2a). The NTS is rich in regular secondary structure and consists of a five-stranded arched and twisted β-sheet (β1–β5), the strands of which parallel the active-site cleft except for the lowermost (β4), which is antiparallel and frames the upper rim of the cleft. The concave face of the sheet accommodates three helices (α3–α5), among which are a `backing helix' (α4) and an `active-site helix' (α5) that are characteristic of astacins and metzincins in general (Bode et al., 1993; Stöcker et al., 1993; Stöcker & Bode, 1995; Gomis-Rüth, 2009; Gomis-Rüth, Trillo-Muyo et al., 2012; Cerdà-Costa & Gomis-Rüth, 2014; Arolas et al., 2018). The active-site helix encompasses the first two-thirds of a conserved zinc-binding motif (H139-E-X-X-H-X-X-G-X-X-H149 in pLAST) found in astacins and other metzincins, which features three metal-binding histidines and the general base/acid glutamate, here replaced with an alanine (see above and Fig. 2c). At the glycine of the motif (G146), the polypeptide undergoes a sharp downwards turn to enter the CTS, which in contrast to the NTS is more irregular. It contains two short helices (α6 and α7) and the short β-ribbon β6β7 in addition to a `C-terminal helix' (α8), which again is characteristic of metzincins. Of note is another conserved structural element of metzincins, the `Met-turn', which is a tight 1,4-turn (S194–L197) encompassing the strictly conserved M196 (Fig. 2c). Its side chain provides a hydrophobic pillow for the metal-binding site that is essential for the stability and function of metzincins (Tallant, García-Castellanos et al., 2010). Immediately downstream of this methionine, Y198 provides the fourth zinc ligand of the CD through its somewhat more distant Oη atom.
In other astacins, this residue is swung out upon substrate binding following a `tyrosine switch' and its Oη atom participates in stabilization of the reaction intermediate during catalysis (Stöcker & Yiallouros, 2013). Finally, a disulfide bond links the back of the NTS with the C-terminal helix α8 of the CTS (C90–C244) and a second one links strand β4 with the loop connecting β5 and α5 (Lβ5α5) (C112–C131) (Fig. 2a).
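The two sequence signatures mentioned in the text, the PP motif (F-E/Q-G-D-I) and the zinc-binding consensus (H-E-X-X-H-X-X-G-X-X-H), can be located in a protein sequence with simple regular expressions. The Python sketch below is a minimal illustration: the helper name is introduced here, and the test sequence is an invented toy string, not the UniProt B4F320 sequence. Note that the crystallized variant carries an alanine in place of the glutamate of the zinc-binding motif, so the wild-type pattern would not match its sequence.

```python
import re

# Zinc-binding consensus as quoted in the text: H-E-X-X-H-X-X-G-X-X-H
ZINC_MOTIF = re.compile(r"HE..H..G..H")
# Pro-peptide motif as quoted in the text: F-E/Q-G-D-I
PP_MOTIF = re.compile(r"F[EQ]GDI")

def find_motifs(sequence, pattern):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]

# Invented toy sequence for illustration only (not a real astacin sequence).
toy_sequence = "MKLLAFEGDIAGKNAWSTHELMHAIGFYHEQSR"

pp_hits = find_motifs(toy_sequence, PP_MOTIF)
zinc_hits = find_motifs(toy_sequence, ZINC_MOTIF)
```

Positions are reported with 1-based numbering to match the residue-numbering style used for protein sequences.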
Latency is achieved in pLAST by blocking access of substrates through the PP, which runs across the active-site cleft of the CD moiety in the opposite direction to a substrate (Figs. 2a, 2b and 3a). This is a strategy to prevent untimely autolytic cleavage in cis (Khan & James, 1998; Arolas et al., 2018). In addition, the polypeptide chain does not adopt an extended conformation as required for substrates to be cleaved (Tyndall et al., 2005) but rather the aforementioned loop structure protrudes from the cleft (Fig. 2b). This prevents a scissile bond from extending across cleft subsites S1 and S1′ (Fig. 3a), which is another mechanism to prevent undesired cleavage (Arolas et al., 2018). The surface occluded by the PP–CD interaction spans 1207 Å², which is in the range reported for protein–protein complexes (∼380–3390 Å²; Chen et al., 2013), and has a solvation free-energy gain upon interface formation (ΔiG) of −16.9 kcal mol−1 (Krissinel & Henrick, 2007), indicating a strong interaction. Participating structural elements include the entire PP and segments N49–V52, D110–V116, Y129–H143, W148–N152, S170–M178, Y198–T208 and P223–K226 of the CD, with the establishment of 20 electrostatic interactions and hydrophobic contacts between 17 pairs of residues of either moiety (Table 2).
The primary activation site of pLAST (K48–N49) is inserted within short helix α2 and buried in the zymogen, thus preventing access by activating enzymes in a similar fashion as found in pro-astacin (Guevara et al., 2010). Moreover, K48 Nζ makes strong interactions with Y173 O (2.7 Å apart) and N176 O (3.1 Å) of the CD and with E36 Oɛ2 (2.7 Å) of the PP motif, which likewise hinder activation. The latter interaction is reminiscent of the double salt bridge between an arginine and an aspartate in a PP motif found in matrix metallopeptidase (MMP) zymogens (P-R-C-G-X-P-D; van Wart & Birkedal-Hansen, 1990; Springman et al., 1990; Tallant, Marrero et al., 2010; Arolas et al., 2018). Moreover, the activation-scissile-bond N atom is bound to D47 Oδ2 (2.8 Å) within the PP, so the activation site is additionally protected in the zymogen. All of these findings support the maturation of pLAST requiring partial unfolding of the segment flanking the activation site and/or preliminary cleavages, as described for crayfish astacin (Yiallouros et al., 2002; Guevara et al., 2010).
The most relevant element for latency is D38, which binds the catalytic zinc in a bidentate manner through its Oδ1 (2.2 Å) and Oδ2 (2.4 Å) atoms (Fig. 2c), thus replacing the catalytic solvent required for catalysis in mature MPs (Arolas et al., 2018). This aspartate is embedded in the PP motif and contributes to a distorted octahedral metal coordination sphere together with H139 Nɛ2 (2.1 Å) and H149 Nɛ2 (2.1 Å) in plane with the cation and with H143 Nɛ2 (2.1 Å) and Y198 Oη (3.3 Å) in the apical positions. Thus, D38 functions as an `aspartate switch' for latency maintenance as described previously for crayfish astacin (Guevara et al., 2010) and human meprin β (Arolas et al., 2012) within the astacins (see below) and for fragilysin-3 (Goulas et al., 2011) and the bacterial MMP karilysin (Cerdà-Costa et al., 2011) within other metzincins (Arolas et al., 2018).
The archetypal astacin from crayfish, which like the horseshoe crab is an arthropod, represents the evolutionarily closest orthologue of LAST with a known mature structure (Bode et al., 1992). Indeed, 157 Cα atoms from these proteins superpose with a core root-mean-square deviation (r.m.s.d.) of 1.3 Å (38% sequence identity). Moreover, a predicted homology model of LAST was obtained with AlphaFold (Jumper et al., 2021), which showed most of the common features in relevant segments described for mature astacin. It had an average predicted local distance difference test (pLDDT) value of >97, which is indicative of high reliability (Tunyasuvunakool et al., 2021). Thus, this model is taken hereafter as a working model of mature Limulus astacin.
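For readers unfamiliar with the r.m.s.d. figure quoted for such superpositions, the following Python sketch shows how it is computed once two coordinate sets have been superposed and their atoms paired. It is a minimal illustration with a function name introduced here; it does not perform the superposition itself, which in this work was carried out with SSM within Coot.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumes both sets are already superposed and listed in matching atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))
```

Because the value depends on which atoms are paired, an r.m.s.d. is only meaningful together with the number of atoms included, as given in the text.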
Superposition of the pLAST structure and the LAST model (Fig. 3b) reveals that the CD moieties mostly coincide. In particular, the NTSs match best, with an r.m.s.d. of 0.93 Å for all 746 atoms of segment L57–H149. The metal-binding site and most of the active-site cleft would largely be preformed in the zymogen, as observed for other MP zymogens (Arolas et al., 2018). Within the CTS, good agreement is observed for the segment E188–G199, which includes the Met-turn, and the entire C-terminal stretch from G207 to C244. Loop G199–D206, which frames the lower rim of the cleft, slightly deviates, with a maximal displacement of ∼2 Å that closes the cleft on the primed side upon activation. On the bottom of the nonprimed side of the cleft, E150–E179 would additionally undergo a closing motion of maximally ∼3 Å facilitated by a ∼10° rotation around W148. The largest deviation, however, is observed for the segment P180–N187, which conforms to a flexible `activation domain' and would become significantly rearranged (Fig. 3b), as described for other astacins (Guevara et al., 2010) and the otherwise unrelated trypsin-like serine endopeptidases (Huber & Bode, 1978). This rearrangement would result from the displacement of N49–L56, which upon maturation cleavage at K48–N49 would become rotated outwards around the Cα—C bond of L56. In this way, the seven preceding residues would be amply repositioned by up to ∼11 Å and penetrate the mature enzyme moiety, so the first three residues (N49-A50-I51) would be completely inaccessible to solvent, as reported for meprin β (see Section 3.5). Next, N49 would bind the `family-specific residue' immediately after the third zinc-binding histidine (E150; Bode et al., 1993; Gomis-Rüth, 2003), which in turn is held in place by internal salt bridges with R237 and R153 in the zymogen. This interaction could occur directly through the N49 Nδ2 atom, as observed in meprin β (Arolas et al., 2012).
An alternative interaction through the α-amino group (N49 N) mediated by a solvent molecule, as observed in crayfish astacin (Bode et al., 1992), is also conceivable. Moreover, the N49 Oδ1 atom might also bind the R237 side chain. Overall, this scenario of a deeply buried mature N-terminus is very similar to that found in other astacins, in which the maturation mechanism has been structurally verified (see Section 3.5). This, in turn, provides confidence in the reliability of the LAST homology model.
To date, the crystal structures of crayfish pro-astacin (PDB entry 3lq0; Guevara et al., 2010), human pro-meprin β (PDB entry 4gwm; Arolas et al., 2012) and pro-myroilysin from two closely related bacterial species, Myroides profundi (PDB entry 5czw; Xu et al., 2017) and Myroides sp. CSLB8 (PDB entry 5gwd; Xu et al., 2017), have been reported, as well as their respective mature forms astacin (PDB entry 1ast; Bode et al., 1992; Gomis-Rüth et al., 1993), meprin β (PDB entry 4gwn; Arolas et al., 2012) and myroilysin from Myroides sp. CSLB8 (PDB entry 5zjk; Ran et al., 2020). The two proteins from Myroides are 99.6% identical, so only that from Myroides sp. CSLB8 will be discussed here. Of all these structures, only pro-meprin β spans additional domains downstream of the CD, namely an MAM and a TRAF domain (Arolas et al., 2012). Pictures of the three zymogens superposed onto the mature forms, together with those of the pLAST structure and the LAST model, are provided in Figs. 4(a)–4(d).
In all cases, the mature N-terminus is buried inside the catalytic moiety and is bound to the family-specific glutamate of astacins either directly through an N-terminal asparagine (LAST and meprin β) or glycine (myroilysin) or mediated by a solvent molecule because the N-terminal segment is one residue shorter (astacin). The position of the new N-terminus in the zymogen and the mature moiety is very close in astacin (∼2 Å; Fig. 4b), quite close in meprin β (∼6 Å; Fig. 4c), farther apart in LAST (∼11 Å; Fig. 4a) and farthest in myroilysin (∼17 Å; Fig. 4d).
Detailed analysis of the four zymogen–mature enzyme pairs reveals that in all cases the PP is poor in regular secondary structure and adopts a mostly extended conformation that traverses the active-site cleft in the opposite direction to a substrate. In pro-myroilysin it is additionally elongated at the N-terminus and further extends along the front surface of the NTS (Fig. 4d), while in pro-meprin β (Fig. 4c) it runs in an extended conformation along a neighbouring TRAF domain on the right of the CD (not shown). In all cases, CTS regions framing the bottom of the active-site cleft on its nonprimed side constitute activation segments that undergo rearrangement upon maturation cleavage and repositioning of the new N-terminus. In astacin, only this activation segment (I130–E139, mature enzyme numbering according to PDB entry 1ast; add 49 for full-gene numbering; see UniProt P07584) is reorganized, while the rest of the molecule is preformed in the zymogen (Guevara et al., 2010; Fig. 4b). Next, LAST is most likely to undergo slight rearrangement of two segments (G199–D206 and E150–E179) in addition to the major movement of the activation segment (P180–N187; see Section 3.4 and Fig. 4a). Meprin β, in turn, repositions most of its CTS (Q164–Y211 and L199–D233 according to UniProt Q16820; segment D194–L199 is disordered in the zymogen structure) in a concerted hinge motion that entirely closes the cleft at its bottom in response to maturation (Fig. 4c). Finally, the largest deviation is observed in myroilysin, which rearranges its entire CTS except for the Met-turn and the C-terminal helix (Fig. 4d). The segments affected are Q155–A201 and Y210–N225 (myroilysin numbering according to PDB entry 5gwd; see also UniProt A0A0P0DZ84). A large flap (N160–S193), which encompasses two helices, is folded back on top of the active-site cleft and traps the PP in the zymogen. 
Upon maturation, this flap is rotated to the right with a maximal displacement of ∼17 Å (measured at P176), thus liberating access to the cleft (Fig. 4d).
Differences are also found in the residues blocking the zinc ion in the zymogen. The three metazoan proteins contain an aspartate within the PP motif, which is structurally conserved (Fig. 4e), acting as an aspartate switch. In contrast, the bacterial enzyme lacks the PP motif and instead features a cysteine, which blocks the zinc according to a `cysteine-switch' mechanism (Ran et al., 2020; Xu et al., 2017). Moreover, the polypeptide chain flanking the cysteine is in a canonical, extended conformation and does not adopt the loop of the PP motif. Overall, this is inversely reminiscent of MMPs, in which canonical vertebrate orthologues regulate latency according to a cysteine-switch mechanism (Springman et al., 1990; van Wart & Birkedal-Hansen, 1990; Rosenblum et al., 2007; Tallant, Marrero et al., 2010; Arolas et al., 2018), while the bacterial orthologue karilysin from the periodontopathogen Tannerella forsythia instead operates according to an aspartate switch. As in astacins, MMPs are only found dispersedly outside animals, and it has been proposed that karilysin is a xenologue co-opted from a mammalian host through horizontal gene transfer facilitated by intimate interaction between the host and the colonizing bacterium (Cerdà-Costa et al., 2011). A similar origin is conceivable for myroilysin within astacins given that Myroides spp. have been reported in several human body fluids and can cause soft-tissue infections in humans (Maraki et al., 2012) and bacteraemia in a diabetic patient (Endicott-Yazdani et al., 2015). Thus, as in MMPs, the latency mechanisms of holozoan orthologues and bacterial xenologues would also diverge in astacins.
All data and reagents are freely available from the authors upon reasonable request.
‡These authors share first authorship.
We are grateful to Joan Pous from the joint IBMB/IRB Automated Crystallography Platform for assistance. Author contributions were as follows. FXG-R, CB-P and WS conceived and supervised the project, CB-P purified the protein, TG and AR-B performed experiments, FXG-R performed calculations and FXG-R wrote the manuscript with contributions from all authors. The authors declare no financial or nonfinancial conflicts of interest.
This study was supported in part by grants from Spanish and Catalan public and private bodies (grant/fellowship references PID2019-107725RG-I00 from MICIN/AEI/10.13039/501100011033 to FXG-R, TG and AR-B, 2017SGR3 and Fundació La Marató de TV3 201815 to FXG-R, TG and AR-B). Further support was obtained from German funding bodies (grant SFB877, Project A9: `Proteolysis as a regulatory Event in Pathophysiology' from the Deutsche Forschungsgemeinschaft to CB-P).
Aricescu, A. R., Hon, W. C., Siebold, C., Lu, W., van der Merwe, P. A. & Jones, E. Y. (2006). EMBO J. 25, 701–712.
Aricescu, A. R., Siebold, C., Choudhuri, K., Chang, V. T., Lu, W., Davis, S. J., van der Merwe, P. A. & Jones, E. Y. (2007). Science, 317, 1217–1220.
Arolas, J. L., Broder, C., Jefferson, T., Guevara, T., Sterchi, E. E., Bode, W., Stöcker, W., Becker-Pauly, C. & Gomis-Rüth, F. X. (2012). Proc. Natl Acad. Sci. USA, 109, 16131–16136.
Arolas, J. L., Goulas, T., Cuppari, A. & Gomis-Rüth, F. X. (2018). Chem. Rev. 118, 5581–5597.
Ballesteros, J. A. & Sharma, P. P. (2019). Syst. Biol. 68, 896–917.
Becker-Pauly, C., Barré, O., Schilling, O., auf dem Keller, U., Ohler, A., Broder, C., Schütte, A., Kappelhoff, R., Stöcker, W. & Overall, C. M. (2011). Mol. Cell. Proteomics, 10, M111.009233.
Becker-Pauly, C., Bruns, B. C., Damm, O., Schütte, A., Hammouti, K., Burmester, T. & Stöcker, W. (2009). J. Mol. Biol. 385, 236–248.
Bicknell, R. D. C. & Pates, S. (2019). Sci. Rep. 9, 17102.
Bicknell, R. D. C. & Pates, S. (2020). Front. Earth Sci. 8, 98.
Bode, W., Gomis-Rüth, F. X., Huber, R., Zwilling, R. & Stöcker, W. (1992). Nature, 358, 164–167.
Bode, W., Gomis-Rüth, F. X. & Stöcker, W. (1993). FEBS Lett. 331, 134–140.
Bond, J. S. (2019). J. Biol. Chem. 294, 1643–1651.
Bork, P. & Beckmann, G. (1993). J. Mol. Biol. 231, 539–545.
Casañal, A., Lohkamp, B. & Emsley, P. (2020). Protein Sci. 29, 1069–1078.
Cerdà-Costa, N., Guevara, T., Karim, A. Y., Ksiazek, M., Nguyen, K. A., Arolas, J. L., Potempa, J. & Gomis-Rüth, F. X. (2011). Mol. Microbiol. 79, 119–132.
Cerdà-Costa, N. & Gomis-Rüth, F. X. (2014). Protein Sci. 23, 123–144.
Chen, J., Sawyer, N. & Regan, L. (2013). Protein Sci. 22, 510–515.
Cismasiu, V. B., Denes, S. A., Reiländer, H., Michel, H. & Szedlacsek, S. E. (2004). J. Biol. Chem. 279, 26922–26931.
Darwin, C. R. (1859). On the Origin of Species by Means of Natural Selection, 1st ed, p. 107. London: John Murray.
Doolittle, R. F. (2010). J. Innate Immun. 3, 9–16.
Eckhard, U., Körschgen, H., von Wiegen, N., Stöcker, W. & Gomis-Rüth, F. X. (2021). Proc. Natl Acad. Sci. USA, 118, e2023839118.
Einspahr, H. M. & Weiss, M. S. (2012). International Tables for Crystallography, Vol. F, 2nd ed., edited by E. Arnold, D. M. Himmel & M. G. Rossmann, pp. 64–74. Chichester: John Wiley & Sons.
Eleftherianos, I., Heryanto, C., Bassal, T., Zhang, W., Tettamanti, G. & Mohamed, A. (2021). Immunology, 164, 401–432.
Endicott-Yazdani, T. R., Dhiman, N., Benavides, R. & Spak, C. W. (2015). Bayl. Univ. Med. Cent. Proc. 28, 342–343.
Finelli, A. L., Xie, T., Bossie, C. A., Blackman, R. K. & Padgett, R. W. (1995). Genetics, 141, 271–281.
Goddard, T. D., Huang, C. C., Meng, E. C., Pettersen, E. F., Couch, G. S., Morris, J. H. & Ferrin, T. E. (2018). Protein Sci. 27, 14–25.
Gomis-Rüth, F. X. (2003). Mol. Biotechnol. 24, 157–202.
Gomis-Rüth, F. X. (2009). J. Biol. Chem. 284, 15353–15357.
Gomis-Rüth, F. X., Botelho, T. O. & Bode, W. (2012). Biochim. Biophys. Acta, 1824, 157–163.
Gomis-Rüth, F. X., Stöcker, W., Huber, R., Zwilling, R. & Bode, W. (1993). J. Mol. Biol. 229, 945–968.
Gomis-Rüth, F. X., Trillo-Muyo, S. & Stöcker, W. (2012). Biol. Chem. 393, 1027–1041.
Goulas, T., Arolas, J. L. & Gomis-Rüth, F. X. (2011). Proc. Natl Acad. Sci. USA, 108, 1856–1861.
Guevara, T., Yiallouros, I., Kappelhoff, R., Bissdorf, S., Stöcker, W. & Gomis-Rüth, F. X. (2010). J. Biol. Chem. 285, 13958–13965.
Huber, R. & Bode, W. (1978). Acc. Chem. Res. 11, 114–122.
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P. & Hassabis, D. (2021). Nature, 596, 583–589.
Kabsch, W. (2010). Acta Cryst. D66, 125–132.
Keeling, P. J. & Palmer, J. D. (2008). Nat. Rev. Genet. 9, 605–618.
Khan, A. R. & James, M. N. (1998). Protein Sci. 7, 815–836.
Kin, A. & Błażejowski, B. (2014). PLoS One, 9, e108036.
Koonin, E. V., Makarova, K. S. & Aravind, L. (2001). Annu. Rev. Microbiol. 55, 709–742.
Krissinel, E. & Henrick, K. (2004). Acta Cryst. D60, 2256–2268.
Krissinel, E. & Henrick, K. (2007). J. Mol. Biol. 372, 774–797.
Lankester, E. R. (1881). Q. J. Microsc. Sci. 21, 504–548.
Liebschner, D., Afonine, P. V., Baker, M. L., Bunkóczi, G., Chen, V. B., Croll, T. I., Hintze, B., Hung, L.-W., Jain, S., McCoy, A. J., Moriarty, N. W., Oeffner, R. D., Poon, B. K., Prisant, M. G., Read, R. J., Richardson, J. S., Richardson, D. C., Sammito, M. D., Sobolev, O. V., Stockwell, D. H., Terwilliger, T. C., Urzhumtsev, A. G., Videau, L. L., Williams, C. J. & Adams, P. D. (2019). Acta Cryst. D75, 861–877.
Linnaeus, C. (1758). Systema Naturae Per Regna Tria Naturae: Secundum Classes, Ordines, Genera, Species, Cum Characteribus, Differentiis, Synonymis, Locis, 10th ed. Stockholm: Laurentius Salvius.
Maraki, S., Sarchianaki, E. & Barbagadakis, S. (2012). Braz. J. Infect. Dis. 16, 390–392.
McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C. & Read, R. J. (2007). J. Appl. Cryst. 40, 658–674.
Mistry, J., Chuguransky, S., Williams, L., Qureshi, M., Salazar, G. A., Sonnhammer, E. L. L., Tosatto, S. C. E., Paladin, L., Raj, S., Richardson, L. J., Finn, R. D. & Bateman, A. (2021). Nucleic Acids Res. 49, D412–D419.
Pfleiderer, G., Zwilling, R. & Sonneborn, H. H. (1967). Hoppe Seylers Z. Physiol. Chem. 348, 1319–1331.
Ran, T., Li, W., Sun, B., Xu, M., Qiu, S., Xu, D. Q., He, J. & Wang, W. (2020). Int. J. Biol. Macromol. 156, 1556–1564.
Rawlings, N. D. & Bateman, A. (2021). Protein Sci. 30, 83–92.
Rosenblum, G., Meroueh, S., Toth, M., Fisher, J. F., Fridman, R., Mobashery, S. & Sagi, I. (2007). J. Am. Chem. Soc. 129, 13566–13574.
Rowley, A. F., Rhodes, C. P. & Ratcliffe, N. A. (1984). Zool. J. Linn. Soc. 80, 283–295.
Rudkin, D. M., Young, G. A. & Nowlan, G. S. (2008). Paleontology, 51, 1–9.
Schechter, I. & Berger, A. (1967). Biochem. Biophys. Res. Commun. 27, 157–162.
Schmid, M. R., Dziedziech, A., Arefin, B., Kienzle, T., Wang, Z., Akhter, M., Berka, J. & Theopold, U. (2019). Insect Biochem. Mol. Biol. 109, 63–71.
Sekiguchi, K. & Shuster, C. N. Jr (2009). Biology and Conservation of Horseshoe Crabs, edited by J. T. Tanacredi, M. L. Botton & D. R. Smith, pp. 5–24. Dordrecht: Springer.
Semenova, S. A. & Rudenskaia, G. N. (2008). Biomed. Khim. 54, 531–554.
Shuster, C. N. Jr (1982). Physiology and Biology of Horseshoe Crabs: Studies on Normal and Environmentally Stressed Animals, edited by J. Bonaventura, C. Bonaventura & S. Tesh, pp. 1–52. New York: Alan R. Liss.
Smart, O. S., Womack, T. O., Flensburg, C., Keller, P., Paciorek, W., Sharff, A., Vonrhein, C. & Bricogne, G. (2012). Acta Cryst. D68, 368–380.
Smith, S. A. (2022). Invertebrate Medicine, 3rd ed., edited by G. A. Lewbart, pp. 283–300. Hoboken: John Wiley & Sons.
Springman, E. B., Angleton, E. L., Birkedal-Hansen, H. & Van Wart, H. E. (1990). Proc. Natl Acad. Sci. USA, 87, 364–368.
Stöcker, W. & Bode, W. (1995). Curr. Opin. Struct. Biol. 5, 383–390.
Stöcker, W. & Gomis-Rüth, F. X. (2013). Proteases: Structure and Function, edited by K. Brix & W. Stöcker, pp. 235–263. Vienna: Springer Verlag.
Stöcker, W., Gomis-Rüth, F. X., Bode, W. & Zwilling, R. (1993). Eur. J. Biochem. 214, 215–231.
Stöcker, W., Gomis-Rüth, F. X., Huber, R., Zwilling, R. & Bode, W. (1992). Biol. Chem. Hoppe Seyler, 373, 654.
Stöcker, W., Grams, F., Reinemer, P., Bode, W., Baumann, U., Gomis-Rüth, F. X. & McKay, D. B. (1995). Protein Sci. 4, 823–840.
Stöcker, W. & Yiallouros, I. (2013). Handbook of Proteolytic Enzymes, 3rd ed., edited by N. D. Rawlings & G. S. Salvesen, pp. 895–900. Oxford: Academic Press.
Stöcker, W., Wolz, R. L., Zwilling, R., Strydom, R. J. & Auld, D. S. (1988). Biochemistry, 27, 5026–5032.
Størmer, L. (1952). J. Paleontol. 26, 630–639.
Tallant, C., García-Castellanos, R., Baumann, U. & Gomis-Rüth, F. X. (2010). J. Biol. Chem. 285, 13951–13957.
Tallant, C., Marrero, A. & Gomis-Rüth, F. X. (2010). Biochim. Biophys. Acta, 1803, 20–28.
Terwilliger, T. C., Grosse-Kunstleve, R. W., Afonine, P. V., Moriarty, N. W., Zwart, P. H., Hung, L.-W., Read, R. J. & Adams, P. D. (2008). Acta Cryst. D64, 61–69.
Tunyasuvunakool, K., Adler, J., Wu, Z., Green, T., Zielinski, M., Žídek, A., Bridgland, A., Cowie, A., Meyer, C., Laydon, A., Velankar, S., Kleywegt, G. J., Bateman, A., Evans, R., Pritzel, A., Figurnov, M., Ronneberger, O., Bates, R., Kohl, S. A. A., Potapenko, A., Ballard, A. J., Romera-Paredes, B., Nikolov, S., Jain, R., Clancy, E., Reiman, D., Petersen, S., Senior, A. W., Kavukcuoglu, K., Birney, E., Kohli, P., Jumper, J. & Hassabis, D. (2021). Nature, 596, 590–596.
Tyndall, J. D. A., Nall, T. & Fairlie, D. P. (2005). Chem. Rev. 105, 973–1000.
Van Wart, H. E. & Birkedal-Hansen, H. (1990). Proc. Natl Acad. Sci. USA, 87, 5578–5582.
Walls, E. A., Berkson, J. & Smith, S. A. (2002). Rev. Fish. Sci. 10, 39–73.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
Winter, W. E., Greene, D. N., Beal, S. G., Isom, J. A., Manning, H., Wilkerson, G. & Harris, N. (2020). Adv. Clin. Chem. 94, 31–84.
Xu, D., Zhou, J., Lou, X., He, J., Ran, T. & Wang, W. (2017). J. Biol. Chem. 292, 5195–5206.
Yelland, T. & Djordjevic, S. (2016). Structure, 24, 2008–2015.
Yiallouros, I., Kappelhoff, R., Schilling, O., Wegmann, F., Helms, M. W., Auge, A., Brachtendorf, G., Berkhoff, E. G., Beermann, B., Hinz, H. J., König, S., Peter-Katalinic, J. & Stöcker, W. (2002). J. Mol. Biol. 324, 237–246.
Zundert, G. C. P. van, Moriarty, N. W., Sobolev, O. V., Adams, P. D. & Borrelli, K. W. (2021). Structure, 29, 913–921.
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.