Structure-based mechanism of cysteine-switch latency and of catalysis by pappalysin-family metallopeptidases

The members of the unicellular pappalysin family of metallopeptidases are secreted as zymogens with a short N-terminal pro-segment, which blocks the catalytic zinc cation by a cysteine-switch mechanism, as shown structurally for bacterial mirolysin. This is a mechanism to prevent activity in the absence of the adequate temporal and spatial requisites. In addition, the complex of mirolysin with a large peptide reveals the structural basis of its catalytic mechanism.

Tannerella forsythia is an oral dysbiotic periodontopathogen involved in severe human periodontal disease. As part of its virulence factor armamentarium, at the site of colonization it secretes mirolysin, a metallopeptidase of the unicellular pappalysin family, as a zymogen that is proteolytically auto-activated extracellularly at the Ser54-Arg55 bond. Crystal structures of the catalytically impaired promirolysin point mutant E225A at 1.4 and 1.6 Å revealed that latency is exerted by an N-terminal 34-residue pro-segment that shields the front surface of the 274-residue catalytic domain, thus preventing substrate access. The catalytic domain conforms to the metzincin clan of metallopeptidases and contains a double calcium site, which acts as a calcium switch for activity. The pro-segment traverses the active-site cleft in the opposite direction to the substrate, which precludes its cleavage. It is anchored to the mature enzyme through residue Arg21, which intrudes into the specificity pocket in cleft sub-site S1′. Moreover, residue Cys23 within a conserved cysteine-glycine motif blocks the catalytic zinc ion by a cysteine-switch mechanism, first described for mammalian matrix metallopeptidases. In addition, a 1.5 Å structure was obtained for a complex of mature mirolysin and a tetradecapeptide, which filled the cleft from sub-site S1′ to S6′. A citrate molecule in S1 completed a product-complex mimic that unveiled the mechanism of substrate binding and cleavage by mirolysin, the catalytic domain of which was already preformed in the zymogen. These results, including a preference for cleavage before basic residues, are likely to be valid for other unicellular pappalysins derived from archaea, bacteria, cyanobacteria, algae and fungi, including archetypal ulilysin from Methanosarcina acetivorans. They may further apply, at least in part, to the multi-domain orthologues of higher organisms.

Introduction
Tannerella forsythia is a Gram-negative bacterium, which was first isolated from patients by Anne Tanner at The Forsyth Institute in the mid-1970s (Tanner et al., 1979) and later named after her (Tanner & Izard, 2006; Tindall et al., 2008). It is a member of the dysbiotic oral microbiome responsible for severe periodontal disease (PD), which is the sixth most prevalent disabling health condition and affects an estimated 750 million people worldwide (Kassebaum et al., 2014; Hajishengallis, 2015). PD is a polymicrobial synergistic inflammatory disease in which a major role is exerted by the red complex, a bacterial consortium of T. forsythia, Porphyromonas gingivalis and Treponema denticola that colonizes the gingival crevice and forms dental plaque biofilms (Socransky et al., 1998; Holt & Ebersole, 2005). T. forsythia is strongly associated with destructive inflammatory host responses (Hajishengallis, 2014; Lamont & Hajishengallis, 2015). Outside the oral cavity it is linked to accelerated progression of atherosclerotic lesions in mice and an increased risk of esophageal adenocarcinoma in humans (Peters et al., 2017). In addition, it has been associated with skin abscesses in animal models (Takemoto et al., 1997; Bird et al., 2001) and has been isolated from women with bacterial vaginosis (Cassini et al., 2013). These findings underpin the capacity of the bacterium to colonize niches distal from the gingival crevice, with systemic implications probably similar to those for P. gingivalis (Seymour et al., 2007; Olsen & Yilmaz, 2019).
Here, we determine the mechanisms of latency and catalysis of unicellular pappalysins by high-resolution crystal structure analysis of the mirolysin zymogen and a product complex.

Protein production and purification
The coding sequence of T. forsythia strain ATCC 43037 promirolysin, without the signal peptide and with or without the E225A mutation, was cloned into a vector for overexpression in Escherichia coli BL21 (DE3) cells as a fusion construct with N-terminal glutathione S-transferase and a PreScission endopeptidase target sequence as previously described (Koneru et al., 2017). The recombinant promirolysin variants comprised residues Gln20-Ser331 preceded by a glycine-proline dipeptide, because of the cloning strategy, and were purified by glutathione Sepharose affinity and size-exclusion chromatography. A variant of the protein in which methionine was replaced with selenomethionine was obtained in the same way except that the modified amino acid was used instead of the natural residue in minimal cell culture medium. Recombinant mature wild-type mirolysin was prepared as previously described (Koneru et al., 2017) and spanned residues Arg55-Ser331. Mirolysin was incubated at a 1:2 molar ratio with the small lipoprotein BFO_2662 (UniProt code G8ULV2) during crystallization studies (see Section 2.2) and cleaved the lipoprotein. The C-terminal segment of the lipoprotein remained bound to mirolysin in a product complex.

Crystallization and diffraction data collection
Proteins were crystallized by the sitting-drop vapour diffusion method. Reservoir solutions were mixed in plates of 96 × 2 ml deep wells with a Tecan robot. A Phoenix robot (Art Robbins) dispensed nanodrops of protein and reservoir solutions into MRC plates of 96 × 2 ml wells (Innovadyne). Several hundred conditions from multiple screenings were assayed at the joint IBMB/IRB Automated Crystallography Platform of Barcelona Science Park. Plates were stored at 4 or 20 °C in Bruker steady-temperature crystal farms. The best native and selenomethionine-derivatized promirolysin crystals were obtained at 20 °C from drops with 200 nl of protein solution at ~0.6 mg ml⁻¹ in 5 mM Tris HCl, 50 mM sodium chloride, pH 8.0 and 100 nl of reservoir solution comprising 25% polyethylene glycol (PEG) 1500, 0.1 M MIB buffer (malonic acid, imidazole and boric acid at a 2:3:3 molar ratio) at pH 6.0. The best crystals of the mirolysin product complex were obtained at 4 °C with protein solution at ~12 mg ml⁻¹ in 5 mM Tris HCl, 50 mM sodium chloride, 5 mM calcium chloride, pH 8.0 from drops containing 200 nl of protein solution and 100 nl of a reservoir solution of 40% ethanol, 5% PEG 1000 and 0.1 M phosphate-citrate buffer at pH 4.2.
research papers IUCrJ (2020). 7, 18-29
Crystals were cryo-protected by rapid passage through drops containing reservoir solution plus 10-15% glycerol (v/v) and flash vitrified in liquid nitrogen before transport to the ALBA synchrotron in Cerdanyola (Catalonia, Spain). Diffraction data were collected at the zinc absorption edge from cryo-cooled crystals on a PILATUS 6M pixel detector (Dectris) at beamline XALOC (Juanhuix et al., 2014). The data were indexed, integrated and merged by programs XDS (Kabsch, 2010a) and XSCALE (Kabsch, 2010b). Data were transformed with XDSCONV to MTZ format for structure solution and refinement. The native promirolysin crystals belonged to space group P2₁2₁2₁, contained one molecule per asymmetric unit and were processed to 1.4 Å resolution. The selenomethionine-containing promirolysin crystals belonged to space group P2₁, had two protein molecules (A and B) per asymmetric unit and were processed to 1.6 Å resolution. Finally, the crystals of mirolysin in a product complex belonged to space group P2₁2₁2₁, contained one complex per asymmetric unit and were processed to 1.5 Å resolution. Table 1 provides a summary of the data-processing statistics.
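As a rough guide to the drop compositions quoted above, the following Python sketch estimates the starting concentrations immediately after mixing protein and reservoir solutions, before vapour diffusion changes them. The function name `drop_concentration` is illustrative and not part of the reported protocol; the volumes and stock concentrations are those given for the promirolysin crystals.

```python
def drop_concentration(stock, stock_volume_nl, total_volume_nl):
    """Concentration of one component right after mixing (simple dilution)."""
    if total_volume_nl <= 0:
        raise ValueError("total volume must be positive")
    return stock * stock_volume_nl / total_volume_nl


# Promirolysin drops: 200 nl protein (~0.6 mg/ml) plus 100 nl reservoir (25% PEG 1500).
protein_nl, reservoir_nl = 200.0, 100.0
total_nl = protein_nl + reservoir_nl

protein_start = drop_concentration(0.6, protein_nl, total_nl)   # mg/ml
peg_start = drop_concentration(25.0, reservoir_nl, total_nl)    # % PEG 1500

print(f"protein: {protein_start:.2f} mg/ml, PEG 1500: {peg_start:.1f}%")
```

The same function applies to the product-complex drops by substituting ~12 mg/ml protein and the ethanol or PEG 1000 percentages of that reservoir.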

Structure solution and refinement
The structure of selenomethionine-derivatized promirolysin was solved first by a combination of single-wavelength anomalous diffraction with the Autosol routine (Terwilliger et al., 2009) of the PHENIX program suite (Adams et al., 2010) and maximum-likelihood-scored molecular replacement with the Phaser program (McCoy et al., 2007). For these calculations, we used a dataset collected at the zinc absorption peak processed with separate Friedel pairs (see Table 1) and the coordinates of the protein part of M. acetivorans mature ulilysin [Arg61-Ala322, PDB entry 2cki, Tallant et al. (2006)], which had been pruned with the CHAINSAW program (Stein, 2008) according to a sequence alignment with mirolysin performed with MultAlin (Corpet, 1988) [...] (Winn et al., 2011), which revealed the position of the two catalytic zinc ions. These positions, the protein coordinates and the zinc-edge dataset were fed into phenix.autosol, which produced a Fourier map and a model that was completed in subsequent cycles of manual model building with the Coot program (Emsley et al., 2010) and crystallographic refinement. The latter was carried out with PHENIX (Afonine et al., 2012) and BUSTER/TNT (Smart et al., 2012) against data processed with merged Friedel mates (Table 1). Calculations included translation/libration/screw-rotation refinement and, initially, non-crystallographic symmetry restraints. Anisotropic B-factor refinement was assayed with PHENIX but it did not produce better statistics and maps than those from isotropic refinement (R factor/free R factor of 15.3/19.8 versus 16.1/18.8, respectively), so this approach was not pursued. The incorporation of selenomethionine instead of methionine was only partial, as revealed by an occupancy refinement step with all selenium atoms grouped (75% on average). The final refined model comprised residues Arg21-Pro327 from molecule A and Arg21-Leu328 from molecule B, plus two calcium ions and one zinc ion each. Four glycerols, one boric acid and 445 solvent molecules completed the model. Residues Asn164 and Gly256 of either protein molecule were Ramachandran outliers but unambiguously resolved in the final Fourier map. Moreover, respective residue Cys23 was oxidized to S-oxocysteine, and segments Lys51-His53 of molecule A and Gly49-His53 of molecule B were partially flexible and traced based on weak Fourier map density. Table 1 provides statistics of the final refinement.
The structure of native promirolysin was solved by molecular replacement as above using the partially refined coordinates [...] Table 1 for final-refinement statistics. Finally, the structure of the mirolysin product complex was solved by molecular replacement using the coordinates of the mature part of native promirolysin (Arg55-Pro327) and the two calcium ions, which provided a solution at α, β, γ, x, y, z values of 334.0, 156.6, 346.0°, 0.199, −0.527 and 0.147. This solution had initial rotation and translation function Z scores of 7.3 and 10.4, respectively, and a final log-likelihood gain of 7437. The presence of a strong peak (>19σ) at the omitted zinc site confirmed the correctness of the solution. Autotracing, model building and refinement proceeded as with native promirolysin. The final refined model comprised protein residues Pro58-Pro327 (molecule A) and peptide residues Lys1-Lys14 plus a citrate (CIT−1) constituting molecule B (residues and numbers in italics), in addition to one zinc and two calcium ions. The citrate and the sequence of the first seven residues of the peptide could be unambiguously assigned owing to the very high resolution and quality of the Fourier map. This demonstrated that the peptide corresponded to segment K110-RDPVYFIKLSTI-K123 of protein BFO_2662. Two ethanol and 348 solvent molecules completed the model. Residue Asn164 was a Ramachandran outlier that was unambiguously resolved in the final Fourier map. Table 1 provides statistics of the final refinement. In all structures, disulfides linked Cys243 with Cys271 and Cys262 with Cys291. The peptide bonds preceding Pro215, Pro266 and Pro276 were in a cis conformation.
Figure 1. Unicellular pappalysin family members. Structure-assisted sequence alignment of selected pappalysins from prokaryotes and lower eukaryotes depicting the respective (potential) CDs and upstream PSs. The organism, the UniProt code plus the sequence identity with ulilysin in parentheses, and the organism category are displayed at the beginning of each sequence block, respectively. Very high, high and middle sequence similarities are characterized by magenta, green and yellow backgrounds, respectively. Regular secondary-structure elements (helices and strands as orange and blue bars, respectively) below and above the alignment correspond to ulilysin and (pro)mirolysin, respectively. Their numbering is consistent with that of ulilysin, see Tallant et al. (2006). The conserved CG-motif responsible for latency in promirolysin is shown in bold. The number of additional N- and C-terminal residues is shown in parentheses. Residues not present in the structure of native promirolysin (this work; PDB entry 6r7v) and mature ulilysin (PDB entry 2cki) are denoted by grey bars above and below the alignment, respectively. The disulfides found in both ulilysin and mirolysin are shown as purple handles. Red scissors indicate autolytic activation points (P1′ residues) of ulilysin (Tallant et al., 2006) and mirolysin (Koneru et al., 2017).
Guevara et al. Catalytic mechanism of pappalysin metallopeptidases
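The comparison between anisotropic and isotropic B-factor refinement of selenomethionine-derivatized promirolysin can be made explicit with a short Python sketch. It uses the R factor/free R factor pairs quoted in the text; the selection rule (lowest free R factor) is an assumed, simplified criterion for illustration and does not reproduce the map inspection that also informed the decision.

```python
# R factor / free R factor pairs (in %) quoted in the text.
REFINEMENTS = {
    "anisotropic": (15.3, 19.8),
    "isotropic": (16.1, 18.8),
}


def r_gap(r_work, r_free):
    """Difference between free and working R factor, in percentage points."""
    return r_free - r_work


def preferred(refinements):
    """Return the protocol with the lowest free R factor (assumed criterion)."""
    return min(refinements, key=lambda name: refinements[name][1])


for name, (r_work, r_free) in REFINEMENTS.items():
    gap = r_gap(r_work, r_free)
    print(f"{name}: R = {r_work}, Rfree = {r_free}, gap = {gap:.1f}")
print("preferred:", preferred(REFINEMENTS))
```

The larger gap for the anisotropic model is the usual sign that the extra parameters lowered the working R factor without improving the fit to the test reflections.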

Crystallization of mirolysin variants
Recombinant promirolysin undergoes zinc- and calcium-dependent step-wise autolytic processing and activation to mature 31 kDa mirolysin through truncations at both the N- and C-terminus (Koneru et al., 2017), as also reported for ulilysin (Tallant et al., 2006). A variant, in which the general base/acid Glu225 was replaced with alanine (E225A), lacked activity (Koneru et al., 2017). This variant was used to obtain the intact zymogen for structural studies. Previously, this strategy has proven successful for other MP zymogens (Guevara et al., 2010; Goulas et al., 2011; Arolas et al., 2012, 2016; López-Pelegrín et al., 2015). We obtained two different crystal forms for native and selenomethionine-derivatized promirolysin E225A, which contained one and two protomers per asymmetric unit and diffracted to resolutions of 1.4 and 1.6 Å, respectively (Table 1). These structures contained residues Arg21-Pro328, and superposition of the native promirolysin onto the two selenomethionine-derivatized protomers revealed close similarity and r.m.s.d. values of 0.48 and 0.42 Å for the common Cα atoms, so they are hereafter considered equivalent. The only significant deviation was observed for segment Lys51-Arg55, which contained the final activation cleavage point (Ser54-Arg55) (Koneru et al., 2017) and was flexible. Jointly, these structures provided the molecular determinants of promirolysin latency (see Sections 3.2 and 3.3), which was compared with that of other MPs (see Section 3.4).
Mature wild-type mirolysin was incubated with small lipoprotein BFO_2662 (UniProt code G8ULV2), whose coding gene is immediately upstream of mirolysin in the genome of T. forsythia (Ksiazek, Mizgalska et al., 2015), and the mixture was set up for crystallization. We obtained a structure at 1.5 Å resolution (Table 1) with one copy of mirolysin per asymmetric unit (Arg55-Pro327). With minor exceptions, this fourth protomer was very similar to the CD of promirolysin (see Section 3.5). However, a tetradecapeptide from the lipoprotein was found covering sub-sites S1′ to S6′ and more of the primed side of the active-site cleft [for sub-site and peptide substrate nomenclature, see Schechter & Berger (1967) and Gomis-Rüth, Botelho et al. (2012)]. Moreover, a citrate anion was attached to S1 on the non-primed side. Thus, we serendipitously trapped a product complex, which revealed the molecular determinants of substrate binding and catalysis of mirolysin (see Section 3.5).

The promirolysin structure
Promirolysin E225A has a compact structure of approximate dimensions 60 × 45 × 45 Å, and it splits into an N-terminal 34-residue PS (Arg21-Ser54) and a downstream 274-residue metzincin-type CD (Arg55-Leu328). The latter subdivides into an upper N-terminal sub-domain (NTSD; Arg55-Asp231 + Leu306-Leu328) and a lower C-terminal sub-domain (CTSD; Leu232-Ser305) separated by a horizontal active-site cleft (Fig. 2). The PS consists of two contiguous perpendicular extended segments, I (Arg21-Gly24) and II (Ser25-Asn28), followed by helices 1p (Met29-Thr35) and 2p (Pro37-Leu52), which are rotated by ~60° relative to each other and connected by linker residue Glu36 (for secondary-structure nomenclature, see Fig. 1) [...] 1p and Tyr40 from 2p provide a stacking interaction at the protein surface. The NTSD contains a strongly twisted and arched five-stranded β-sheet arranged top to bottom [Fig. 2(a)], in which the top strand is split into two (strands 2 and 3) by a protruding loop (L23) called the LNR-like loop (Tallant et al., 2006). The remaining strands (1, 4, 8 and 7) are continuous and parallel to the top strand, except for the lowermost strand 7, which is antiparallel and forms the upper rim of the active-site cleft. Two roughly perpendicular helices, the backing-helix 1 and the active-site helix 4, nestle on the concave face of the sheet, and a third one, the second C-terminal helix 6, is attached to the convex face of the sheet near strands 1 and 2 [Fig. 2(a)]. Loop L12 extends down to the CTSD and includes two short helices, 2 and 3. The active-site helix contains the first residues of the zinc-binding motif of metzincins and includes His224 and His228, which coordinate the catalytic zinc (Zn999) at the bottom of the active-site cleft, as well as glutamate-replacing Ala225. The NTSD finishes at Asp231, which is normally a glycine in metzincins (Bode et al., 1993; Cerdà-Costa & Gomis-Rüth, 2014), with a sharp turn downward and enters the CTSD.
This sub-domain contains little regular secondary structure except for helices 2, 3 and the C-terminal helix 5, and it provides the third zinc ligand of the metzincin motif, His234. The three histidines bind the metal through their respective Nε2 atoms at distances spanning 1.95-2.07 Å in the four protomers, which are typical values for Zn-N bonds (2.03 Å on average) (Harding, 2006). Another characteristic element of metzincin CTSDs is the Met-turn (Asn282-Asp285), which forms a hydrophobic base for the zinc site (Tallant, García-Castellanos et al., 2010). Moreover, atom Oη of the downstream residue Tyr286 is close to the zinc but too far apart for coordination (4.62 Å). Tyrosines in similar positions are zinc ligands in unbound members of the astacin and serralysin families of metzincins, a function that was also proposed for ulilysin. They are swung out upon substrate binding in a motion referred to as the tyrosine switch (Tallant et al., 2006; Gomis-Rüth, Trillo-Muyo et al., 2012).
Structural cohesion of the CD is provided by two internal disulfides, Cys243-Cys271 and Cys262-Cys291, and a double structural calcium site, which further explains the calcium dependence of the enzyme (Koneru et al., 2017). The two cations are liganded by residues from segment Trp236-Tyr258, which adopts a double S-loop structure, and by solvent molecules [Fig. 2(b)]. Ca997 is bound in an octahedral plus one coordination by seven oxygens at distances spanning 2.28-2.49 Å, which are typical values for Ca-O bonds (2.36-2.39 Å on average) (Harding, 2006) [...] [Fig. 2(b)].
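The metal-ligand distances reported in this section can be compared with the typical values quoted from Harding (2006) in a few lines of Python. This is a minimal sketch: the reference values are those cited in the text, whereas the 3.0 Å first-shell cutoff and the helper names `deviation` and `in_first_shell` are assumptions introduced only for illustration.

```python
# Typical bond-length ranges (Å) as quoted in the text from Harding (2006).
TYPICAL_RANGE = {"Zn-N": (2.03, 2.03), "Ca-O": (2.36, 2.39)}

# Assumed cutoff (Å) beyond which an atom is not treated as a direct ligand.
FIRST_SHELL_CUTOFF = 3.0


def deviation(bond, distance):
    """Distance (Å) from the observed value to the nearest typical value."""
    low, high = TYPICAL_RANGE[bond]
    if distance < low:
        return low - distance
    if distance > high:
        return distance - high
    return 0.0


def in_first_shell(distance, cutoff=FIRST_SHELL_CUTOFF):
    """True if the atom is close enough to count as a metal ligand."""
    return distance <= cutoff


# Extremes of the ranges reported for promirolysin and mirolysin.
observed = [("Zn-N", 1.95), ("Zn-N", 2.07), ("Ca-O", 2.28), ("Ca-O", 2.49)]
for bond, dist in observed:
    print(f"{bond} {dist:.2f} Å: deviation {deviation(bond, dist):.2f} Å, "
          f"ligand: {in_first_shell(dist)}")

# Tyr286 Oη sits 4.62 Å from the zinc, outside the assumed first shell.
print("Tyr286 Oη is a zinc ligand:", in_first_shell(4.62))
```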

Mechanism of latency
The PS traverses the active-site cleft of mirolysin in the opposite direction of the substrate [Figs. 2(a), 2(c) and 2(d)]. This is a mechanism previously described for other MP zymogens that prevents cleavage as the Michaelis complex required for catalysis cannot be formed [see Section 3.4 and Arolas et al. (2018)]. Analysis of the PS-CD interaction surface revealed an associated calculated solvation free-energy gain upon formation of the interface of −11.1 kcal mol⁻¹ according to Krissinel & Henrick (2007) and an interface of 1151 Å², which corresponds to a buried surface area of 2302 Å². These values account for a strong interaction that is larger than average for buried surfaces of protein-protein complexes (1910 Å²) (Janin et al., 2008) and remarkable given the small size of the PS. Indeed, 144 atoms from 44 residues of the CD and 105 atoms from 20 residues of the PS participate in the interface, which includes 17 hydrogen bonds, three salt bridges and one metallorganic bond [see Table 2 and Figs. 2(c) and 2(d)]. Participating segments are Arg55-Val57, Met147, Asp179-Thr192, Tyr216-Gly240, Ser255, Asn248, Tyr258-Glu269, Asp285-Met292, and segment Arg302-Ile313 from the CD; and Arg21-Glu30 plus Lys39-Ser54 from the PS.
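The relationship between the interface area and the buried surface area stated above is simple arithmetic, sketched here in Python. The numbers are those quoted in the text; the function name `buried_surface_area` is illustrative, and the factor of two reflects the convention that the interface area is the surface buried on one partner, so that both partners together bury twice that amount.

```python
# Average buried surface area (Å²) of protein-protein complexes,
# as cited in the text from Janin et al. (2008).
AVERAGE_BURIED_AREA = 1910.0


def buried_surface_area(interface_area):
    """Total surface buried on both partners, given the interface area (Å²)."""
    return 2.0 * interface_area


interface = 1151.0                      # Å², PS-CD interface of promirolysin
buried = buried_surface_area(interface)
ratio = buried / AVERAGE_BURIED_AREA

print(f"buried surface area: {buried:.0f} Å²")
print(f"relative to the average complex: {ratio:.2f}")
```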
Table 2. Electrostatic interactions of promirolysin at the PS-mature enzyme interface.
The first residue/atom belongs to the PS, the second to the CD. The distances are from the native promirolysin structure (PDB entry 6r7v).
Salt bridges (Å)
Metallorganic interactions (Å): Cys23 Sγ-Zn999, 2.22
Hydrogen bonds (Å)
A series of interactions are performed by Arg21, which belongs to the extended segment I and occupies the S1′ site of the cleft [Fig. 2(c)]. Its α-amino group hydrogen-bonds Tyr216 O and Tyr286 O, while its side chain fixes Thr221 O plus Thr287 O, and salt-bridges Asp289 Oδ1. This aspartate is key for substrate specificity [see Section 3.5 and Koneru et al. (2017)]. Tyrosine-switch Tyr286 pinches the extended segment I together with the upper-rim-strand segment Leu181-Tyr183 of the CD. In the capital interaction for latency, downstream Cys23 Sγ binds the catalytic zinc at 2.11-2.22 Å in the different structures, which is closer than typical Zn-S distances (2.31 Å) (Harding, 2006), and contributes together with the three histidine ligands to a tetrahedral zinc coordination sphere [Figs. 2(c) and 2(d)]. Downstream residue Ser25 from the extended segment II binds the upper-rim strand through its side chain. Its carbonyl contacts Met147 Sδ, which is found in two conformations. Residue Glu26 tightly binds tyrosine-switch Tyr286 Oη through its Oε1 atom, thus fixing the swung-out conformation of the aromatic ring, with the main chain of Leu27 fixed by Asp238. The hydrophobic core of the PS (see Section 3.2) is expanded through CD residues Phe186, Phe188 and Arg233, which glue the PS to the CD through hydrophobic forces [Fig. 2(c)]. This hydrophobic core is delimited in the back by Arg302 and Glu47, which are engaged in a double salt bridge. The core is further extended to the left by Trp46, which is buried in a hydrophobic pocket created by the CD residues Pro187, Phe188, Leu306 and Ile313, as well as Ile50 from the PS [Fig. 2(d)]. In addition, Trp46 Nε1 is fixed by the Asp231 side chain, which maintains the side chain of Arg302 in a competent conformation for Glu47 binding. This contribution to the PS-CD interface explains why an aspartate replaces the glycine normally found here in metzincins as part of the zinc-binding motif.
Finally, the primary activation site Ser54-Arg55 is accessible, and Arg55 protrudes from the molecular surface by virtue of a hydrogen bond between Arg55 N and Asp308 Oδ2. This residue further fixes the downstream segment of the CD through a second hydrogen bond (Asp308 Oδ2···Ser56 N).
Superposition of mature ulilysin onto the CD of promirolysin gave an r.m.s.d. of 0.98 Å for 250 common Cα atoms, which reflects close structural similarity that is consistent with the 50% sequence identity observed (Fig. 1). Furthermore, ulilysin contains a cysteine-glycine motif at the beginning of the zymogen sequence with the same number of PS residues, as well as many of the aforementioned structural features. Thus, the zymogenic structure and mechanism derived here for mirolysin are probably valid for ulilysin and other unicellular pappalysins (Fig. 1).
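The r.m.s.d. values quoted for superposed structures follow from the pairwise distances between equivalent Cα atoms. The Python sketch below shows this calculation for coordinates that have already been superposed; it does not perform the optimal superposition itself, and the toy coordinates are hypothetical rather than taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired, pre-superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα positions (Å); every atom of model_b is shifted by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(f"r.m.s.d. = {rmsd(model_a, model_b):.2f} Å")
```

In practice the pairing of atoms comes from a structure-based alignment, which is why the text reports the number of common Cα atoms alongside each value.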

Promirolysin latency in the context of other MPs
Zymogenicity in MPs was first structurally analysed in the 1990s for the funnelin metallocarboxypeptidases (Coll et al., 1991; Gomis-Rüth et al., 1995; Gomis-Rüth, 2008) and for mammalian MMPs (Becker et al., 1995; Morgunova et al., 1999; Tallant, Marrero et al., 2010). MMPs are found, often in several copies, in animals, plants, fungi, archaea, bacteria and viruses (Marino-Puertas et al., 2017), with 23 paralogs in humans. Mammalian MMP zymogens are inhibited by 70-90-residue PSs upstream of the CD through a cysteine within a conserved motif, PRCGXPD. This residue binds the catalytic zinc and is engaged in a cysteine-switch or velcro mechanism (Springman et al., 1990; Massova et al., 1998; Rosenblum et al., 2007; Tallant, Marrero et al., 2010; Arolas et al., 2018) [...] substantial rearrangement, i.e. the CD and the active site are already preformed in the zymogen and just shielded by the PS. Thus, the results herein indicate that latency for promirolysin and unicellular pappalysins probably operates based on a cysteine switch featuring a cysteine embedded here in a conserved cysteine-glycine motif (Fig. 1).

A mirolysin product complex
Figure 4. A product complex of mature mirolysin. (a) Superposition of the Cα plots of promirolysin (PS in pink, CD in light blue) and mature mirolysin (purple) in the orientation of Fig. 2(a). Significantly deviating regions are pinpointed by green arrows. The catalytic zinc and the structural calcium cations are shown as magenta and blue spheres, respectively. (b) Detail of the initial Fourier omit map to 1.5 Å of the product complex around the citrate (CIT−1) and the tetradecapeptide (Lys1-Lys14), both as stick models with green carbons and labels. The map (in orange) is contoured at 0.6σ above threshold and is clear for CIT−1 and the main and side chains of Lys1-Ile8 and Ser11-Thr12, as well as for the main chains of Lys9, Leu10, Ile13 and Lys14. The view results from an ~45° rotation downward from the standard orientation of Fig. 2(a). (c) Close-up view of mature mirolysin (carbons in plum) and the product (carbons in green) resulting from the view in (a) after a vertical 90° rotation to the left. Selected residues are labelled with their residue numbers in purple and dark green, respectively.
We next obtained a product complex of mature mirolysin with a tetradecapeptide (Lys1-Lys14) occupying S1′ and further primed sub-sites of the cleft plus a citrate in S1 (CIT−1). Superposition onto promirolysin revealed a core r.m.s.d. of 0.44 Å upon alignment of 269 out of the 270 protein residues of mature mirolysin and 307 residues of promirolysin [Figs. 4(a), 4(b) and 4(c)]. Thus, no major overall structural rearrangement occurred upon activation [Fig. 4(a)] [...] families including MMPs and others (Tallant, Marrero et al., 2010; López-Pelegrín et al., 2015; Arolas et al., 2018). A subtle rotation of the NTSD of ~3° was detected, as well as rearrangement of the mature N-terminus, which protruded from the molecular surface and was disordered for its first three residues.
In addition, segment Gly177-Asp179 underwent a downward motion (a maximal deviation of 2.45 Å at Asp178 Cα) for substrate binding mediated by the flip of the peptide bond Leu176-Gly177. Upon removal of the PS, Asp231, which plays a key role in the zymogen (see Section 3.3), salt-bridged Arg233, whose side chain was rotated to meet the aspartate. Similarly, Arg302 was rearranged in the absence of its zymogenic salt-bridge partner Glu47 and contacted Asp247 (Arg302 Nη1-Asp247 O, 3.08 Å). In addition, Thr311 was slightly shifted downwards because of the absence of the PS around His53.
Citrate CIT−1 mimics an amino acid in S1 after catalysis. Its central quaternary carbon resembles the Cα atom. It is bound to a hydroxyl (O7), an α-carboxylate similar to that found after catalysis (with oxygen atoms O5 and O6), a β-carboxylate as from an aspartate side chain (oxygens O3 and O4) and a second β-carboxylate (oxygens O1 and O2). The latter mimics substrate atoms upstream of the Cα in P1, and O1 strongly binds general-base atom Glu225 Oε1 [Table 3 and Fig. 4(c)], indicating that either oxygen must be protonated, while O2 contacts upper-rim atom Ala184 N. Atoms O5 and O6 bind the catalytic zinc in a distorted bidentate fashion. Moreover, O5 weakly binds tyrosine-switch residue Tyr286 Oη, thus suggesting a role for this residue in the stabilization of the tetrahedral reaction intermediate and/or product, as well as in zinc binding to the unbound enzyme. Finally, O7 hydrogen-bonds the α-amino group of Lys1 in sub-site S1′. This nitrogen further binds CIT−1 O1, general base Glu225 Oε2 and the upper-rim main-chain carbonyl of Gly182, but not the catalytic zinc [Fig. 4(c)]. The side chain of Lys1 intrudes into the S1′ specificity pocket and binds the main-chain carbonyl of Thr287 at the pocket bottom. Lys1 Nζ is linked to Asp289 Oδ1 through an internal solvent-mediated salt bridge, which explains the preference for basic residues in S1′ (Koneru et al., 2017). An arginine, which contains two extra non-hydrogen side-chain atoms, would be directly bound by Asp289. The strong conservation of Asp289, which plays a major role in latency (see Section 3.3), across pappalysins [see Fig. 1 and Fig. 1 in the work by Tallant et al. (2006)] indicates that the specificity for basic residues in S1′ should be common for this family, as further shown for archaeal ulilysin (Tallant et al., 2006) and human PAPP-A (Laursen et al., 2002) and PAPP-A2 (Overgaard et al., 2001). The carbonyl of Lys1 binds the upper-rim main chain at Leu181.
Arg2 is in S2′ and thus points to bulk solvent. Its side chain is fixed by CIT−1 O3 and the side chain of Asp179, which is further engaged in binding the main-chain nitrogen of residue Asp3 in S3′ through its main-chain carbonyl. The Asp3 side chain contacts Tyr216 Oη, and Pro4 in S4′ weakly interacts with the Tyr258, Tyr286 and Glu260 side chains. Downstream Val5 is on the surface of the enzyme and the peptide chain turns upward so that the side chain of Tyr6 in S6′ sticks to the molecular surface around Asp178-Asp179. From Tyr6 onwards, the peptide adopts a helical structure until Ile13. From Phe7 onwards, it does not interact with mirolysin [Fig. 4(b)]; instead, the peptide is fixed until Lys14 by crystal contacts.
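The correspondence between the peptide numbering used above, the primed positions of the Schechter & Berger nomenclature and the residue numbers of the parent lipoprotein can be tabulated with a short Python sketch. The peptide sequence and the K110-K123 range are taken from the text; the function name `primed_positions` is hypothetical, and the labels beyond P6′ only count residues downstream of the scissile bond without implying contact with the enzyme.

```python
# Segment K110-RDPVYFIKLSTI-K123 of BFO_2662, as identified in the product complex.
PEPTIDE = "KRDPVYFIKLSTIK"
FIRST_PARENT_RESIDUE = 110


def primed_positions(peptide, first_parent_residue):
    """List (peptide number, residue, primed position, parent-protein number)."""
    rows = []
    for index, residue in enumerate(peptide, start=1):
        parent_number = first_parent_residue + index - 1
        rows.append((index, residue, f"P{index}'", parent_number))
    return rows


table = primed_positions(PEPTIDE, FIRST_PARENT_RESIDUE)
for number, residue, position, parent_number in table:
    print(f"{residue}{number:<3} {position:<5} BFO_2662 residue {parent_number}")
```

A residue in position Pn′ occupies enzyme sub-site Sn′, so Lys1 sits in S1′ and Tyr6 in S6′, as described above.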

Conclusions
Mirolysin, ulilysin and other unicellular pappalysins, which are present in archaea, bacteria, cyanobacteria, algae and fungi, most likely bind substrates in extended conformations from left (residues upstream of the scissile bond) to right (downstream residues) following the commonly accepted dogma (Madala et al., 2010). The specificity of these and most MPs (Gomis-Rüth, Botelho et al., 2012) is exerted by the S1′ specificity pocket, which in pappalysins accommodates substrate lysines and arginines because of the presence of a conserved aspartate at the bottom of the pocket. Mirolysin and most likely other unicellular pappalysins utilize a zymogenic cysteine-switch mechanism exerted by a cysteine in a conserved cysteine-glycine dipeptide within the PS, which runs in the opposite direction to the substrate along the cleft, preventing cleavage and shielding the preformed competent CD. This is reminiscent of other metzincin families with short N-terminal PSs, e.g. the MMPs and astacins. In these families, aspartates may replace the cysteine in some but not all family members, and the conformations of the PSs vary largely. Overall, the results herein support the hypothesis that latency mechanisms are less conserved than the structure and mechanisms of the mature CDs.
Finally, the structural studies reported herein demonstrate substrate binding and zymogenicity for mirolysin, providing molecular mechanisms for biochemical reactions and latency of the pappalysin family of MPs within the metzincin clan. These data have practical implications in that PSs and bound substrates are templates for the design of specific and potent [...]
Table 3. Electrostatic interactions of mirolysin at the product-CD interface.