Computing the bridge length: the key ingredient in a continuous isometry classification of periodic point sets

McManus, J.; Kurlin, V.

doi:10.1107/S2053273325008253

research papers

FOUNDATIONS
ADVANCES

ISSN: 2053-2733

Volume 81| Part 6| November 2025| Pages 427-437

https://doi.org/10.1107/S2053273325008253

Open access

Jonathan McManus ^a ^* and Vitaliy Kurlin ^a ^*

^aComputer Science Department and Materials Innovation Factory, University of Liverpool, Liverpool L69 3BX, UK
^*Correspondence e-mail: [email protected], [email protected]

Edited by A. Singer, Princeton University, USA (Received 27 June 2025; accepted 18 September 2025; online 17 October 2025)

The fundamental model of any periodic crystal is a periodic set of points at all atomic centres. Since crystal structures are determined in a rigid form, their strongest equivalence is rigid motion (composition of translations and rotations) or isometry (also including reflections). The recent classification of periodic point sets under rigid motion used a complete invariant isoset whose size essentially depends on the bridge length, defined as the minimum `jump' that suffices to connect any points in the given set. We propose a practical algorithm to compute the bridge length of any periodic point set given by a motif of points in a periodically translated unit cell. The algorithm has been tested on a large crystal dataset and is required for an efficient continuous classification of all periodic crystals. The exact computation of the bridge length is a key step to realizing the inverse design of materials from new invariant values.

Keywords: periodic point sets; labelled quotient graphs; isometry invariant.

1. Introduction: practical motivations and the problem statement

All solid crystalline materials can be modelled at the atomic level as periodic sets of points (with the chemical attributes if desired) at all atomic centres, defined below.

Definition 1 (lattice, unit cell, motif, periodic point set)

Any vectors $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ that form a linear basis of $[{\bb R}^{n}]$ generate the lattice $[\Lambda = \{\sum_{i = 1}^{n}c_{i}{\bf v}_{i}\mid c_{i}\in{\bb Z}\}]$ and the unit cell $[U = \{\sum_{i = 1}^{n}t_{i} {\bf v}_{i}\mid 0\leq t_{i}\,\lt\,1\}]$ . A motif is any finite set of points $[M\subset U]$ , which can represent centres of atoms in a real crystal. The motif size is the number of points in M. A periodic point set S = $[\Lambda+M]$ = $[\{{\bf v}+p\mid{\bf v}\in\Lambda,p\in M\}]$ is a union of lattices whose origins are shifted to all points p of the motif M [see Fig. 1 (left)].
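As a small illustration of Definition 1, the following Python sketch lists the points of $[S = \Lambda+M]$ that fall in a finite block of cells. The function name `periodic_points`, its argument layout and the use of fractional motif coordinates are assumptions made for this example only.

```python
import itertools

def periodic_points(basis, motif_frac, k):
    """Points of S = Lambda + M lying in a block of k cells per direction.

    basis      : list of n basis vectors v_1..v_n (each a sequence of n numbers)
    motif_frac : motif points given by coefficients t_i in [0, 1) in this basis
    k          : number of cells along each basis direction (k >= 1)
    """
    n = len(basis)
    points = []
    for shift in itertools.product(range(k), repeat=n):
        for t in motif_frac:
            # Cartesian point sum_i (c_i + t_i) v_i
            coeffs = [c + ti for c, ti in zip(shift, t)]
            points.append(tuple(
                sum(coeffs[i] * basis[i][x] for i in range(n)) for x in range(n)
            ))
    return points

# Square lattice with a two-point motif, repeated over 2 x 2 cells.
pts = periodic_points([[1, 0], [0, 1]], [(0, 0), (0.5, 0.5)], 2)
```

Each lattice shift contributes one copy of the motif, so the block of $[k^{n}]$ cells contains $[k^{n}]$ times as many points as the motif.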

Figure 1
Left: the orthonormal basis $[{\bf v}_{1},{\bf v}_{2}]$ generates the green lattice Λ and the unit cell U containing the blue motif M of three points. The periodic point set $[S = \Lambda+M]$ is obtained by periodically repeating M along all vectors of Λ. Right: different motifs $[M,M^{\prime}]$ in the same cell generate periodic sets that differ only by translation.

Any unit cell U is a parallelepiped on basis vectors $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ . If we translate the unit cell U by all vectors $[{\bf v}\in\Lambda]$ , the resulting cells tile $[{\bb R}^{n}]$ without overlaps. The same lattice can be generated by infinitely many different bases that are all related under multiplication by $[n\times n]$ matrices with integer elements and determinant $[\pm 1]$ . Even if we fix a basis of $[{\bb R}^{n}]$ and hence a unit cell U, different motifs in U can define periodic point sets that differ only by Euclidean isometry, defined as any distance-preserving transformation of $[{\bb R}^{n}]$ [see Fig. 1 (right)].

Since crystal structures are determined in a rigid form, their slightly stronger equivalence is rigid motion defined as any orientation-preserving isometry without reflections or as a composition of translations and rotations. After many years of discussing definitions of a `crystal' (Brock, 2021 ), a crystal structure was recently described in the periodic case as a class of periodic sets under rigid motion (Anosova et al., 2024 ).

Any such class consists of all (infinitely many) periodic point sets that are equivalent to each other under some rigid motions. However, almost any perturbation of atoms disturbs some interatomic distances and hence the isometry class with all cell-based descriptors, such as symmetry groups. Even in dimension 1, for any integer $[m\,\gt\,0]$ and a small threshold $[\epsilon\,\gt\,0]$ , the sequence $[{\bb Z}]$ with period 1 is pointwise ε-close to the sequence with the motif $[M = \{0,1+\epsilon,\ldots,m+\epsilon\}]$ and arbitrarily large period m+1.

This inherent discontinuity of all cell-based descriptors was resolved by pointwise distance distributions (PDDs) (Widdowson et al., 2022 ; Widdowson & Kurlin, 2021 ; Widdowson & Kurlin, 2022 ), which defined geographic-style coordinates on the Cambridge Structural Database (CSD) (Widdowson & Kurlin, 2024 ). Though PDDs distinguish all periodic crystals in the CSD within minutes on a modest desktop, the only known theoretically complete and continuous invariant that uniquely identifies any periodic point set under isometry in $[{\bb R}^{n}]$ in polynomial time of the motif size (for a fixed dimension) is the isoset (Anosova & Kurlin, 2021 ; Anosova et al., 2025 ).

The invariant isoset requires the bridge length whose definition is recalled below.

Definition 2 [bridge length β(S)]

For any finite or periodic set of points $[S\subset{\bb R}^{n}]$ , the bridge length $[\beta(S)]$ is the minimum distance such that any points $[p,q\in S]$ can be connected by a finite sequence of points $[p = p_{1},p_{2},\ldots,p_{k} = q]$ in S, such that every Euclidean distance has the upper bound $[|p_{i}-p_{i+1}|\leq\beta(S)]$ for all $[i = 1,\ldots,k-1]$ .

Equivalently, the bridge length $[\beta(S)]$ is the minimum double radius such that the union of the closed balls of the radius $[({1} / {2})\beta(S)]$ around all points of S is connected. The lattice $[\Lambda = {\bb Z}^{3}]$ of all points with integer coordinates has $[\beta(\Lambda) = 1]$ . If we add to $[{\bb Z}^{3}]$ all points whose coordinates are all half-integer, the resulting b.c.c. (body-centred cubic) periodic point set has $[\beta = {{\sqrt{3}} / {2}}]$ equal to the half-diagonal of the unit cube in $[{\bb R}^{3}]$ .

Expanding Delone's local theory (Delone et al., 1976 ; Dolbilin, 1976 ; Dolbilin et al., 1998 ; Dolbilin, 2015 ; Dolbilin, 2018 ), Dolbilin and Bouniaev studied more general t-bonded Delone sets, where t is an upper bound of the bridge length $[\beta(S)]$ for any periodic point set $[S\subset{\bb R}^{n}]$ (Bouniaev & Dolbilin, 2017 ; Dolbilin & Bouniaev, 2019 ). The main problem below is how to create an efficient algorithm to exactly compute $[\beta(S)]$ .

Problem 3

Design an algorithm to compute the bridge length $[\beta(S)]$ in polynomial time of the motif size for any periodic point set S with a fixed unit cell in $[{\bb R}^{n}]$ .

The bridge length of a finite set can be computed via a minimum spanning tree but the periodic case does not easily reduce to a finite one, as shown in Fig. 2.

Figure 2
All minimum spanning trees on extended motifs of a periodic point set S have the longest edge (in blue) of length 3, which could be made arbitrarily long, relative to a preserved minimum inter-point distance of 1 and bridge length $[\beta(S) = 2]$ due to shorter edges from the top-right point in every cell across a cell boundary.

Definition 4 (minimum spanning tree)

For any finite set M of points in $[{\bb R}^{n}]$ , a Minimum Spanning Tree $[{\rm MST}(M)]$ is a tree that has the vertex set M and a minimum total length of straight-line edges with lengths measured by Euclidean distance.

$[{\rm MST}(M)]$ is uniquely defined if all distances between points of M are distinct [see Section 4.3 of Sedgewick & Wayne (2011 )]. By Definition 2, the bridge length $[\beta(M)]$ of any finite set $[M\subset{\bb R}^{n}]$ equals the length of the longest edge of $[{\rm MST}(M)]$ .
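For a finite set, the statement above can be sketched in a few lines of Python. The block below is a minimal illustration using Prim's algorithm on the complete Euclidean graph; the function name `bridge_length_finite` is hypothetical and the code is not taken from the authors' implementation.

```python
import math

def bridge_length_finite(points):
    """Longest edge of a Euclidean minimum spanning tree (Prim's algorithm, O(m^2))."""
    m = len(points)
    if m < 2:
        return 0.0
    in_tree = [False] * m
    best = [math.inf] * m   # best[i]: distance from point i to the current tree
    best[0] = 0.0
    longest = 0.0
    for _ in range(m):
        # pick the point closest to the tree among those not yet in it
        i = min((j for j in range(m) if not in_tree[j]), key=best.__getitem__)
        in_tree[i] = True
        longest = max(longest, best[i])
        for j in range(m):
            if not in_tree[j]:
                best[j] = min(best[j], math.dist(points[i], points[j]))
    return longest
```

For the three collinear points (0,0), (1,0), (3,0), the tree consists of the edges of lengths 1 and 2, so the function returns 2.0. Applied to an extended motif of a periodic set, the same computation gives only an upper bound for the bridge length, as Fig. 2 shows.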

For any periodic point set S with a unit cell U on a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ in $[{\bb R}^{n}]$ , one can consider the extended motifs $[M_{k} = S\cap U_{k}]$ , where the extended cell U_k is defined by the basis $[k{\bf v}_{1},\ldots,k{\bf v}_{n}]$ for any integer $[k\,\gt\,1]$ . The minimum spanning trees provide the upper bounds $[\beta(S)\leq\beta(M_{k})]$ for $[k\,\gt\,1]$ , which can be unnecessarily high (see Fig. 2), so Problem 3 is much harder for periodic sets than for finite sets of points.

Definition 5 [parameters r(U), R(S), a(U)]

Let $[S\subset{\bb R}^{n}]$ be a periodic point set whose unit cell U has a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ . Set $[r(U) = \max\{b,({d} / {2})\}]$ , where b = $[\max_{i = 1,\ldots,n}|{\bf v}_{i}|]$ and d = $[\sqrt{v_{1}^{2}+\ldots+v_{n}^{2}}]$ . The covering radius R(S) is the smallest radius R such that the union of closed balls of radius R around all $[p\in S]$ covers $[{\bb R}^{n}]$ . The height is h(U) = $[ {\rm vol}(U)/\max_{i = 1,\ldots,n}{\rm vol}(U_{i})]$ , where U_i is the subcell of U spanned by all basis vectors except $[{\bf v}_{i}]$ . The aspect ratio of the cell U is defined as .

For any periodic set $[S\subset{\bb R}^{n}]$ , Theorem 2 of Delone et al. (1973 ) and Lemma 3.6(a) of Anosova et al. (2025) imply the upper bound $[\beta(S)\leq\min\{r(U),2R(S)\}]$ , which is too high in practice (see Section 5). Main Theorem 6 guarantees an exact computation of $[\beta(S)]$ in a time that only quadratically depends on the motif size m of S.

Theorem 6

For any periodic point set $[S\subset{\bb R}^{n}]$ with a motif of m points in a unit cell U, the bridge length $[\beta(S)]$ can be computed in time $[O(m^{2}a(U)^{n}N)]$ , where N is the time complexity of computing the Smith normal form and a(U) is the aspect ratio from Definition 5.

As the time complexity is proportional to the aspect ratio a(U) of a cell U, an initial reduction of U to a smaller cell will speed up the computation of the bridge length by minimizing further cell extensions, namely supercell_size in Algorithm 16.

Section 2 introduces the key concepts. Section 3 describes the main algorithm for $[\beta(S)]$ . Section 4 proves Theorem 6. Section 5 presents experiments on crystals.

2. Auxiliary concepts of graph theory for the bridge length algorithm

This section introduces a few auxiliary concepts to describe the exact algorithm for the bridge length $[\beta(S)]$ in Section 3 and to prove Theorem 6 at the end of Section 4.

Definition 7 (periodic Euclidean graph $[G\subset{\bb R}^{n}]$ )

Let $[S\subset{\bb R}^{n}]$ be a periodic point set with a lattice Λ. A periodic Euclidean graph $[G\subset{\bb R}^{n}]$ is an infinite graph with the vertex set S and straight-line edges such that the translation by any vector $[{\bf v}\in\Lambda]$ defines an automorphism of G, which is a bijection $[S\to S]$ that also induces a bijection on the edges of G (see Fig. 3).

Figure 3
Left: the periodic point set S with the basis vectors $[{\bf v}_{1} = (5,0)]$ , $[{\bf v}_{2} = (0,5)]$ and a motif of two points p, q. Middle: the periodic Euclidean graph $[G\subset{\bb R}^{2}]$ with three types of straight-line edges: green, blue, orange of lengths $[\sqrt{5},\sqrt{10},\sqrt{20}]$ , respectively. Right: the labelled quotient graph Q has directed edges e_g,e_b,e_o with translational vectors indicating integer shifts of cells (see Definitions 7, 8, 9).

If straight-line edges meet at interior points, they are not considered vertices of G.

Fig. 3 shows a connected periodic graph G but G can also be disconnected. For example, let S be the square lattice $[{\bb Z}^{2}]$ , then the graph G consisting of all horizontal edges connecting the points (m,n) and (m+1,n) for $[m,n\in{\bb Z}]$ is periodic but not connected. If we add to G all vertical edges connecting (m,n) and (m,n+1) for $[m,n\in{\bb Z}]$ , the resulting infinite square grid is a connected periodic graph on $[{\bb Z}^{2}]$ .

Definition 8 (quotient graph)

Let G be a periodic graph on a periodic point set S with a lattice Λ in $[{\bb R}^{n}]$ . Two points of S (also vertices or edges of G) are called Λ-equivalent if they are related by a translation along a vector $[{\bf v}\in\Lambda]$ . The quotient graph $[G/\Lambda]$ is an abstract undirected graph obtained as the quotient of G under the Λ-equivalence. Then G is called a lifted graph of $[G/\Lambda]$ . Any vertex of $[G/\Lambda]$ is a Λ-equivalence class $[p+\Lambda]$ represented by a point $[p\in S]$ . Any edge e of the quotient graph $[G/\Lambda]$ is a Λ-equivalence class $[[p,q]+\Lambda]$ represented by a straight-line edge [p,q] of G.

The quotient graph $[G/\Lambda]$ can have multiple edges between the same pair of vertices, as shown in Fig. 3, which can all be distinguished by the labels defined below.

Definition 9 (labelled quotient graph)

Let $[S\subset{\bb R}^{n}]$ be a periodic point set with a lattice Λ defined by a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ . Let G be a periodic graph on S. For an edge e of the quotient graph $[G/\Lambda]$ , choose any of two directions and a representative edge [p,q] in the lifted graph G. Let U(p),U(q) be the unit cells containing p,q, respectively. There is a unique vector $[{\bf v} = \sum_{i = 1}^{n}c_{i}{\bf v}_{i}\in\Lambda]$ with $[c_{i}\in{\bb Z}]$ such that $[U(q) = U(p)+{\bf v}]$ , and we define the length of the edge e in $[G/\Lambda]$ as the Euclidean distance $[|q-p|]$ between the endpoints of the representative edge [p,q]. This `length' of an edge e is considered an attribute to ease later calculations, and does not change the abstract nature of the quotient graph $[G/\Lambda]$ .

A labelled quotient graph (LQG) is $[G/\Lambda]$ whose every edge e has a direction (say, from the Λ-equivalence class of p to the Λ-equivalence class of q) and the translational vector $[{\bf v}(e) = (c_{1},\ldots,c_{n})\in{\bb Z}^{n}]$ (see Fig. 3). Changing the direction of e multiplies each coordinate of $[{\bf v}(e)]$ by $[-1]$ . An equivalence of LQGs is a composition of a graph isomorphism and changes in edge directions that match all translational vectors.

Translational vectors $[{\bf v}(e)]$ are also called voltages if $[G/\Lambda]$ is considered a voltage graph or a gain graph in topological graph theory. In crystallography, LQGs have been studied by many authors. Chung et al. (1984 , Section 6) generated 3-periodic nets by considering LQGs whose translational vectors have entries from $[\{-1,0,1\}]$ . Cohen & Megiddo (1990 , Section 2) described an algorithm to find connected components of a fixed periodic graph in terms of its LQG. Eon (2011 , Proposition 5.1) showed how to reconstruct a periodic graph up to translations from a LQG and a lattice basis, which we also prove in Lemma 10 in our notations for completeness. Eon (2016a , Section 3) described surgeries on building units of LQGs. Eon (2016b , Theorem 6.1) characterized 3-connected minimal periodic graphs (with a slightly different definition of `minimal'). McColm (2024 ) initiated a search for systematic periodic graphs realizable by real crystal nets (see also Edelsbrunner & Heiss, 2024 ).

The LQG $[G/\Lambda]$ in Fig. 3 has two vertices p,q. If we orient the three edges of $[Q = G/\Lambda]$ from p to q, the translational vector (0,0) of the blue edge e_b in $[G/\Lambda]$ means that the corresponding straight-line blue edge in the lifted graph $[G\subset{\bb R}^{2}]$ connects points of S within the same unit cell U with the basis $[{\bf v}_{1},{\bf v}_{2}]$ . The orange edge with the translational vector (1,1) means that each of its infinitely many liftings in $[G\subset{\bb R}^{2}]$ joins a point in a cell U to another point in the cell $[U+{\bf v}_{1}+{\bf v}_{2}]$ .

Lemma 10 (lifting)

Let G be a periodic Euclidean graph on a periodic point set S with a motif M in a unit cell U defined by a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ in $[{\bb R}^{n}]$ . Let Q be a LQG of G. Then $[G\subset{\bb R}^{n}]$ can be reconstructed from Q, the basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ , and a bijection between all vertices of Q and all points of the motif $[M\subset U]$ .

Proof

The basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ is needed to define a unit cell U with the given points of M, which are in 1–1 correspondence with all vertices of Q. The periodic point set S, which is the vertex set of the periodic graph G, is obtained from M by translations along the vectors $[\sum_{i = 1}^{n}c_{i}{\bf v}_{i}]$ for all $[c_{i}\in{\bb Z}]$ . By Definitions 8 and 9, every edge e of the LQG Q has a translational vector $[{\bf v}(e) = (c_{1},\ldots,c_{n})]$ and is a Λ-equivalence class $[[p,q]+\Lambda]$ for some $[p,q\in S]$ whose unit cells U(p),U(q) are related by the translation along $[\sum_{i = 1}^{n}c_{i}{\bf v}_{i}]$ . Then we can lift the edge e to the periodically translated straight-line edges $[[p+{\bf v},q+{\bf v}+\sum_{i = 1}^{n}c_{i}{\bf v}_{i}]]$ in the periodic graph G for all $[{\bf v}\in\Lambda]$ .

□
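The lifting in the proof of Lemma 10 can be sketched as follows. This is an illustrative Python fragment, not the authors' code: the name `lift_edge` and the tuple layout `(i, j, c)` are assumptions, and the motif coordinates (2,4) and (1,1) in the usage lines are hypothetical values chosen only because they reproduce the edge lengths and translational vectors stated for Fig. 3.

```python
def lift_edge(basis, motif, edge, cell=None):
    """Endpoints of one straight-line lifting of a labelled quotient graph edge.

    edge = (i, j, c): directed from motif[i] to motif[j], with translational vector c.
    cell : integer shift of the starting cell (default: the unit cell U itself).
    """
    i, j, c = edge
    n = len(basis)
    cell = cell or (0,) * n

    def shifted(point, coeffs):
        # point + sum_k coeffs[k] * v_k in Cartesian coordinates
        return tuple(
            point[x] + sum(coeffs[k] * basis[k][x] for k in range(n))
            for x in range(n)
        )

    start = shifted(motif[i], cell)
    end = shifted(motif[j], tuple(a + b for a, b in zip(cell, c)))
    return start, end

# Hypothetical motif consistent with Fig. 3: p = (2, 4), q = (1, 1).
basis = [(5, 0), (0, 5)]
motif = [(2, 4), (1, 1)]
orange = lift_edge(basis, motif, (0, 1, (1, 1)))          # starts in the cell U
orange_next = lift_edge(basis, motif, (0, 1, (1, 1)), cell=(1, 0))
```

Calling the function for all integer values of `cell` enumerates the infinitely many liftings of one edge of Q, which are translates of each other.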

Definition 11 (path/cycle sum)

For a path (sequence of consecutive edges) in a LQG Q, we make all directions of edges consistent in the sequence and define the path sum in $[{\bb Z}^{n}]$ as the sum of the resulting translational vectors along the path. If the path is a closed cycle, the path sum is called the cycle sum.

In the language of voltage graphs, a path sum may equivalently be referred to as the net voltage over the path. In the LQG in Fig. 3, the upper cycle consisting of the directed orange edge (from p to q) and the inverted green edge (from q to p) has the cycle sum $[(1,1)-(0,1) = (1,0)]$ . This cycle sum means that a lifting of the cycle to the periodic graph G in $[{\bb R}^{2}]$ produces a polygonal path connecting a point to its translate by the basis vector $[{\bf v}_{1} = (5,0)]$ in the next cell to the right.
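Definition 11 amounts to a signed sum of integer vectors. The Python sketch below recomputes the cycle sum of the upper cycle in Fig. 3 from the translational vectors (1,1) of the orange edge and (0,1) of the green edge, as given in Example 17; the function name `path_sum` and the `(vector, forward)` encoding of a path are assumptions of this illustration.

```python
def path_sum(steps):
    """Sum of translational vectors along a path in a labelled quotient graph.

    steps: list of (vector, forward) pairs; forward=False means that the edge is
    traversed against its stored direction, so its vector is negated.
    """
    n = len(steps[0][0])
    total = [0] * n
    for vector, forward in steps:
        sign = 1 if forward else -1
        for k in range(n):
            total[k] += sign * vector[k]
    return tuple(total)

# Orange edge from p to q, then the green edge traversed backwards from q to p.
upper_cycle = path_sum([((1, 1), True), ((0, 1), False)])
```

Here `upper_cycle` equals (1, 0), one cell shift along the first basis vector.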

Definition 12 [minimal tree MST(S/Λ)]

For a periodic point set $[S\subset{\bb R}^{n}]$ with a lattice Λ, a minimal tree is a minimum spanning tree $[{\rm MST}(S/\Lambda)]$ (Definition 4) on the set $[S/\Lambda]$ of Λ-equivalence classes of points, where the distance between any classes in $[S/\Lambda]$ is the minimum Euclidean distance between their representatives in the set S.

In Fig. 3, a minimal tree $[{\rm MST}(S/\Lambda)]$ consists of one shortest green edge in $[G/\Lambda]$ .

3. Algorithm for the bridge length of a periodic point set

This section will describe the main Algorithm 16 for solving Problem 3, which will call auxiliary Algorithm 13 (Fig. 4) several times. Algorithm 13 starts from a conventional representation of a periodic set $[S\subset{\bb R}^{n}]$ with a motif M of points given by coordinates in a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ of a lattice Λ, as in a crystallographic information file (CIF).

Figure 4
Algorithm 13.

At every call, Algorithm 13 returns the next shortest edge e between points of S in increasing order of length. Although S is a set of points rather than a graph, we will use the term `edge', because e can be considered an edge from a complete graph with the vertex set S and with the `next shortest edge' being up to Λ-equivalence.

Any edge e between points of S will be represented by an ordered pair of points $[p,q\in M]$ and a translational vector $[(c_{1},\ldots,c_{n})\in{\bb Z}^{n}]$ so that the actual straight-line edge in the lifted periodic graph $[G\subset{\bb R}^{n}]$ is from p to the point $[q+\sum_{i = 1}^{n}c_{i}{\bf v}_{i}]$ . For convenience, we record the Euclidean distance $[d = |q-p+\sum_{i = 1}^{n}c_{i}{\bf v}_{i}|]$ between these endpoints. Then Algorithm 13 outputs any edge e as a tuple $[(p,q\semi c_{1},\ldots,c_{n}\semi d)]$ .

Algorithm 13 maintains the list of already found edges in increasing order of length. If the next required edge e is already in the list, Algorithm 13 simply returns e. This shortcut is implemented in Python with the keyword `yield' (see the documentation at https://docs.python.org/3/glossary.html#term-generator-iterator). Rather than restarting from line 1 every time Algorithm 13 is called, each statement `yield e' returns an edge e and then temporarily suspends processing, remembering the execution state, including all local variables. When the next edge is requested, Algorithm 13 picks up where it left off, in contrast to functions that start afresh on every invocation.

If the next edge e is not yet found, Algorithm 13 adds more points from a shell of unit cells surrounding the previously considered cells. This shell contains the extended motif M_k without the smaller motif $[M_{k-1}]$ for $[k\,\gt\,1]$ (see Fig. 2). For any new point p, it suffices to consider only edges to points $[q\in M\subset U]$ because any edge e can be periodically translated by $[{\bf v}\in\Lambda]$ so that one of the endpoints of e belongs to U. In Algorithm 13, the Chebyshev distance $[\ell_{\infty}]$ in line 3 is the maximum absolute difference of corresponding coordinates, while d in line 7 is the usual Euclidean distance.
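The shell-by-shell search and the use of `yield` can be sketched in Python for dimension 2. This is a simplified illustration in the spirit of Algorithm 13, not a transcription of Fig. 4: the stopping bound `k * height` (any edge reaching beyond the k-th shell is longer than k times the smallest distance between opposite sides of the cell) is an assumption of this sketch and stands in for next_batch_min_len, and the motif is assumed to be given in Cartesian coordinates inside the unit cell.

```python
import itertools
import math

def next_edge(basis, motif):
    """Lazily yield edges (i, j, c, d) of a 2D periodic point set by length d.

    The edge joins motif[i] to motif[j] + c[0]*v1 + c[1]*v2, where basis = (v1, v2).
    """
    (x1, y1), (x2, y2) = basis
    area = abs(x1 * y2 - x2 * y1)
    # smallest distance between opposite sides of the unit cell
    height = min(area / math.hypot(x1, y1), area / math.hypot(x2, y2))
    m = len(motif)
    pending = []   # edges already found but not yet safe to output
    k = 0          # current supercell size (Chebyshev radius of cell shifts)
    while True:
        k += 1
        for c in itertools.product(range(-k, k + 1), repeat=2):
            if max(abs(c[0]), abs(c[1])) != k and not (k == 1 and c == (0, 0)):
                continue  # this cell was handled in an earlier shell
            for i in range(m):
                for j in range(i, m):
                    if i == j and c <= (0, 0):
                        continue  # skip zero-length and mirrored self-edges
                    dx = motif[j][0] - motif[i][0] + c[0] * x1 + c[1] * x2
                    dy = motif[j][1] - motif[i][1] + c[0] * y1 + c[1] * y2
                    pending.append((math.hypot(dx, dy), i, j, c))
        pending.sort()
        bound = k * height   # edges from farther shells are longer than this
        while pending and pending[0][0] <= bound:
            d, i, j, c = pending.pop(0)
            yield (i, j, c, d)   # execution resumes here on the next request

# Hypothetical motif p = (2, 4), q = (1, 1), consistent with the edges of Fig. 3.
edges = next_edge(((5, 0), (0, 5)), [(2, 4), (1, 1)])
first_three = [next(edges) for _ in range(3)]
```

Because the generator keeps `pending` and `k` between requests, asking for a further edge does not repeat the work done for the earlier shells.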

There is a faster way of checking a condition equivalent to next_batch_min_len by using the cell geometry, described in Remark 14. In the vast majority of cases, the algorithm can then stop at a supercell one size smaller, which substantially speeds up the calculation. Because this is not guaranteed (in the remaining cases the algorithm falls back to the same supercell size), we keep the simpler version in Algorithm 13 and use it for the time complexity calculations.

Remark 14 (a faster way to compute next_batch_min_len in Algorithm 13)

For a unit cell with a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ , let $[{\bf a}_{i}]$ and $[{\bf b}_{i}]$ be the shortest vectors parallel and antiparallel to $[{\bf v}_{i}]$ from any point of a motif $[M\subset U]$ to the opposite boundary faces of the unit cell U. Then the faster alternative for next_batch_min_len is

$[\min\limits_{i = 1,\ldots,n}(|{\bf a}_{i}|+|{\bf b}_{i}|+supercell\_size \times |{\bf v}_ {i}|).]$

As all the vector lengths $[|{\bf a}_{i}|,|{\bf b}_{i}|]$ , $[i = 1,\ldots,n]$ can be pre-computed, we get a massive improvement over the calculation of next_batch_min_len in Algorithm 13.

Algorithm 16 will be building a LQG Q by adding (or ignoring) edges found by Algorithm 13 and monitoring the connectivity of the growing lifted graph G whose quotient $[G/\Lambda]$ is Q. For a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ of a unit cell U of the lattice Λ of S, the edge e between points p and $[q+\sum_{i = 1}^{n}c_{i}{\bf v}_{i}\in S]$ is added to Q as the edge between the Λ-equivalence classes of p and q, with the translational vector $[{\bf v}(e) = (c_{1},\ldots,c_{n})\in{\bb Z}^{n}]$ . As soon as G becomes connected, the length of the last added edge is the bridge length $[\beta(S)]$ , which will be proved in Theorem 26 later.

In comparison with a MST of a finite set of points, verifying the connectivity of the lifted periodic graph requires a more complicated check that the translational vectors with integer coordinates generate $[{\bb Z}^{n}]$ over the integers (not merely span $[{\bb R}^{n}]$ ). A minimal generating set of this kind, called a basis of $[{\bb Z}^{n}]$ here, can include more than n vectors. Fig. 5 shows such a basis of $[{\bb Z}^{2}]$ consisting of three vectors, where no vector can be dropped without losing the connectivity of all integer points in $[{\bb Z}^{2}]$ .

Figure 5
Left: the three vectors $[{\bf v}_{1} = (0,1)]$ , $[{\bf v}_{2} = (2,0)]$ , $[{\bf v}_{3} = (3,0)]$ form a basis of $[{\bb Z}^{2}]$ . Other images: none of the three pairs $[({\bf v}_{2},{\bf v}_{3})]$ , $[({\bf v}_{1},{\bf v}_{2})]$ , $[({\bf v}_{1},{\bf v}_{3})]$ form a basis (insufficient for full connectedness) of $[{\bb Z}^{2}]$ . Some straight edges are shown curved for better visibility.

Algorithm 16 will use the Smith normal form ( $[{\rm SNF}]$ ) of a matrix of vectors $[(c_{1},\ldots,c_{n})]$ in $[{\bb Z}^{n}]$ (Newman, 1972 , see p. 26; Cohn, 1985 ; Van der Waerden, 2003 , ch. 3.6) for finitely generated modules over a principal ideal domain (PID).

Definition 15 (SNF and invariant factors)

For integers $[m,n\geq 1]$ , let A be a non-zero $[n\times m]$ matrix over a PID P, for example, $[P = {\bb Z}]$ . Then there exist invertible $[n\times n]$ and $[m\times m]$ matrices L, R, respectively, with coefficients in P, such that the product LAR is an $[n\times m]$ matrix whose only non-zero entries are diagonal elements a_i such that a_i divides $[a_{i+1}]$ for $[i = 1,\ldots,j-1]$ , and $[a_{i} = 0]$ for $[i = j+1,\ldots,n]$ , for some $[1\leq j\leq n]$ . This diagonal matrix LAR is the Smith normal form $[{\rm SNF}(A)]$ . The diagonal elements a_i are called the invariant factors of A.

Let 1 denote the unit element of a PID P. If $[P = {\bb Z}]$ , then 1 is the usual integer 1. The simplest SNF has all invariant factors equal to 1, which happens if and only if the last factor $[a_{n} = 1]$ because all previous factors a_i divide a_n.
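For small integer matrices, the condition that all n invariant factors equal 1 can be tested without computing the SNF itself. The Python sketch below swaps in a different but classical technique: the product of the n invariant factors equals, up to sign, the greatest common divisor of all $[n\times n]$ minors, so the condition holds exactly when this gcd is 1. The function names are hypothetical, and the enumeration of minors is practical only for few columns.

```python
import itertools
import math

def det(rows):
    """Integer determinant by Laplace expansion along the first row (small n only)."""
    if len(rows) == 1:
        return rows[0][0]
    return sum(
        (-1) ** j * rows[0][j] * det([r[:j] + r[j + 1:] for r in rows[1:]])
        for j in range(len(rows))
    )

def has_n_unit_invariant_factors(columns, n):
    """True if the integer matrix with these columns has n invariant factors equal to 1."""
    g = 0
    for chosen in itertools.combinations(columns, n):
        # minor on the chosen columns (transposing does not change a determinant)
        g = math.gcd(g, det([list(col) for col in chosen]))
        if g == 1:
            return True
    return False

# The three vectors of Fig. 5 and its three pairs.
v1, v2, v3 = (0, 1), (2, 0), (3, 0)
all_three = has_n_unit_invariant_factors([v1, v2, v3], 2)
pairs = [has_n_unit_invariant_factors(list(p), 2)
         for p in itertools.combinations([v1, v2, v3], 2)]
```

For the vectors of Fig. 5, the three minors are -2, -3 and 0 with gcd 1, so `all_three` is true, while each pair alone gives a single minor different from ±1 and fails the test.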

Algorithm 16 [finding the bridge length β(S) of any periodic point set $[S\subset{\bb R}^{n}]$ ]

Initialization. A LQG Q and a forest $[F\subset Q]$ initially consist of m isolated vertices, each representing a Λ-equivalence class of a point of the motif of S. We will build a translational matrix A with columns in $[{\bb Z}^{n}]$ , which is initially empty.

Loop stage. Consider the next edge $[e = next\_edge()]$ found by Algorithm 13.

Case 1. If adding the edge e to the current forest F would not form a closed cycle (ignoring all edge directions), then add e to F and Q as an edge with an arbitrarily chosen direction and corresponding translational vector $[{\bf v}(e)]$ found by Algorithm 13.

Case 2. If adding the edge e to F does form a cycle, find its cycle sum $[c\in{\bb Z}^{n}]$ from Definition 11. If c is not $[0\in{\bb Z}^{n}]$ and cannot be expressed as an integer linear combination of the columns from the current translational matrix A, then add e to Q as in Case 1 (but not to the forest F) and add the vector c as a new column to A.

Termination. Stop if both conditions below hold, otherwise continue the loop.

(1) The LQG Q (hence the forest F) becomes connected; and

(2) The translational matrix A (whose columns are cycle sums of cycles created by adding edges) has n invariant factors equal to 1 (see Definition 15).

The necessity of termination condition (1) in Algorithm 16 means that if the lifted periodic graph G is connected, then so is its quotient $[Q = G/\Lambda]$ . The inverse implication (sufficiency) may not hold. For example, in Fig. 3, the minimal tree $[{\rm MST}(S/\Lambda)]$ is a single green edge e_g, whose preimage under the quotient map $[G\to G/\Lambda]$ is the disconnected set of all green straight-line edges in the periodic graph $[G\subset{\bb R}^{2}]$ .

Example 17 (running Algorithm 16 on the periodic point set S in Fig. 3). The first addition to the quotient graph Q and forest F, which initially had two isolated vertices p,q, is the shortest green edge e_g from p to q (case 1 in the loop stage) with the translational vector $[c(e_{g}) = (0,1)\in{\bb Z}^{2}]$ . The translational matrix A remains empty.

Adding the next (by length) blue edge e_b with $[c(e_{b}) = (0,0)]$ to $[F = \{e_{g}\}]$ creates a cycle with the cycle sum $[c = c(e_{g})-c(e_{b}) = (0,1)]$ . According to case 2 in the loop stage, the quotient graph Q becomes the cycle of two edges $[e_{g}\cup e_{b}]$ but the forest remains $[F = \{e_{g}\}]$ . The translational matrix A becomes one column

$[\left(\matrix{0\cr 1}\right)]$

and does not yet have two invariant factors 1. The second termination condition is not yet satisfied, and the current lifted graph consisting of all green and blue segments is still disconnected.

Adding the orange edge e_o with $[c(e_{o}) = (1,1)]$ to F creates another cycle with the cycle sum $[c^{\prime} = c(e_{g})-c(e_{o}) = (-1,0)]$ . The quotient graph $[Q = e_{g}\cup e_{b}\cup e_{o}]$ is now full but $[F = \{e_{g}\}]$ is still one edge. The matrix A becomes

$[\left(\matrix{0&-1\cr 1&0}\right)]$

whose SNF is

$[\left(\matrix{1&0\cr 0&1}\right),]$

which shows that A has two invariant factors equal to 1. Both termination conditions hold and the lifted periodic graph $[G\subset{\bb R}^{2}]$ of all green, blue and orange edges is connected. The bridge length $[\beta(S) = 2\sqrt{5}]$ equals the length of the last (orange) edge as expected.
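Example 17 can be replayed with a compact Python sketch of the loop in Algorithm 16. Several simplifications are assumptions of this illustration: edges are passed in as a pre-sorted list instead of being requested from Algorithm 13; every new non-zero cycle sum is stored (rather than only those outside the integer span of the current columns), which generates the same subgroup of $[{\bb Z}^{n}]$ ; and termination condition (2) is tested by the gcd of all $[n\times n]$ minors instead of an SNF computation. The names `bridge_length` and `generates_Zn` are hypothetical.

```python
import itertools
import math

def det(rows):
    """Integer determinant by Laplace expansion (small n only)."""
    if len(rows) == 1:
        return rows[0][0]
    return sum((-1) ** j * rows[0][j] * det([r[:j] + r[j + 1:] for r in rows[1:]])
               for j in range(len(rows)))

def generates_Zn(columns, n):
    """True if the gcd of all n x n minors is 1, i.e. the columns generate Z^n."""
    g = 0
    for chosen in itertools.combinations(columns, n):
        g = math.gcd(g, det([list(col) for col in chosen]))
    return g == 1

def bridge_length(m, n, sorted_edges):
    """sorted_edges: tuples (i, j, c, d) in increasing order of the length d."""
    parent = list(range(m))     # forest F stored as a union-find structure
    offset = [(0,) * n] * m     # cell shift of each vertex relative to its parent
    components = m
    columns = []                # non-zero cycle sums (columns of the matrix A)

    def find(v):
        """Return the root of v and the cell shift of v relative to that root."""
        shift = (0,) * n
        while parent[v] != v:
            shift = tuple(a + b for a, b in zip(shift, offset[v]))
            v = parent[v]
        return v, shift

    for i, j, c, d in sorted_edges:
        ri, si = find(i)
        rj, sj = find(j)
        # cell shift forced by this edge between the lifted trees of i and j
        cycle = tuple(a + b - e for a, b, e in zip(si, c, sj))
        if ri != rj:                                  # Case 1: the forest grows
            parent[rj], offset[rj] = ri, cycle
            components -= 1
        elif any(cycle) and cycle not in columns:     # Case 2: new cycle sum
            columns.append(cycle)
        else:
            continue                                  # edge ignored
        if components == 1 and generates_Zn(columns, n):
            return d                                  # both conditions hold
    return None

# Edges of Fig. 3 between p (index 0) and q (index 1), sorted by length.
fig3 = [(0, 1, (0, 1), math.sqrt(5)),
        (0, 1, (0, 0), math.sqrt(10)),
        (0, 1, (1, 1), math.sqrt(20))]
beta = bridge_length(2, 2, fig3)
```

The cycle sums stored by this sketch are (0,-1) and (1,0), which differ from those in Example 17 only by sign because the cycles are traversed in the opposite direction; the returned value is the length $[2\sqrt{5}]$ of the orange edge.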

4. Correctness and time complexity of the bridge length algorithm

This section proves the correctness of Algorithm 16 in Theorem 26 about the bridge length and main Theorem 6 about its time complexity. Lemmas 20 and 21 will prove the necessity of termination condition (2) in Algorithm 16. Both conditions 1 and 2 will guarantee the connectedness of the lifted periodic graph G due to Lemma 23.

Lemma 18 is a partial case of the splitting lemma on p. 147 of Hatcher (2002 ).

Lemma 18 (splitting)

A short sequence of linear maps $[0\to{\bb Z}^{m-n} \buildrel f\over\rightarrow {\bb Z}^{m}\buildrel g\over\rightarrow {\bb Z}^{n}\to 0]$ is called exact if the image of each map coincides with the kernel (subspace mapping to 0) of the next map, i.e. $[{\rm Ker}(f) = 0]$ , $[{\rm Im}(f) = {\rm Ker}(g)]$ , $[{\rm Im}(g) = {\bb Z}^{n}]$ . If there is a map $[h:{\bb Z}^{n}\to{\bb Z}^{m}]$ , such that $[g\circ h]$ is the identity on $[{\bb Z}^{n}]$ , then $[{\bb Z}^{m}\cong f({\bb Z}^{m-n})\oplus h({\bb Z}^{n})]$ , where $[f({\bb Z}^{m-n})]$ and $[h({\bb Z}^{n})]$ are linearly independent subspaces of $[{\bb Z}^{m}]$ for $[m\geq n]$ .

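Note that, if f is a linear map and $[{\rm Ker}(f) = \{{\bf 0}\}]$ , then f is injective, because $[f({\bf x}) = f({\bf y})]$ implies that $[{\bf 0} = f({\bf x})-f({\bf y}) = f({\bf x}-{\bf y})]$ , so $[{\bf x}-{\bf y}\in{\rm Ker}(f) = \{{\bf 0}\}]$ and $[{\bf x} = {\bf y}]$ .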

Example 19 (finding a SNF). In the notations of Lemma 18, Fig. 5 defines $[g:{\bb Z}^{3}\to{\bb Z}^{2}]$ given by the matrix

$[A = \pmatrix{1&0&0\cr 0&2&3}]$

whose three columns generate $[{\bb Z}^{2}]$ . The kernel $[{\rm Ker}(g)\subset{\bb Z}^{3}]$ consists of all vectors

$[f(k) = k\left[\matrix{0\cr 3\cr -2}\right]]$

for $[k\in{\bb Z}]$ , which defines $[f:{\bb Z}\to{\bb Z}^{3}]$ with $[{\rm Ker}(f) = 0]$ and $[{\rm Im}(f) = {\rm Ker}(g)]$ as required in Lemma 18. Since $[g:{\bb Z}^{3}\to{\bb Z}^{2}]$ is surjective, we can find a map $[h:{\bb Z}^{2}\to{\bb Z}^{3}]$ satisfying $[g\circ h = {\rm id}]$ , e.g. h can be given by

$[M = \pmatrix{1&0\cr 0&-1\cr 0&1},]$

then

$[AM = \pmatrix{1&0&0\cr 0&2&3}\pmatrix{1&0\cr 0&-1\cr 0&1} = \pmatrix{1&0\cr 0&1},]$

denoted by I₂. After extending the $[3\times 2]$ matrix M by the extra column with a basis vector of $[{\rm Im}(f)]$ , we get the matrix

$[R = \pmatrix{1&0&0\cr 0&-1&3\cr 0&1&-2}]$

such that

$[AR = \pmatrix{1&0&0\cr 0&1&0}.]$

Lemma 18 implies that the constituent blocks of R are linearly independent of each other, so all columns of R are linearly independent and R is invertible over $[{\bb Z}]$ (indeed, $[\det R = -1]$ ). Hence, I₂AR is a SNF of A with invariant factors equal to 1 by Definition 15.
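The matrix identities of Example 19 are easy to confirm numerically. The following Python lines are a plain check using the matrices A, M and R exactly as written above; the helper `matmul` is introduced only for this illustration.

```python
def matmul(X, Y):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 0, 0], [0, 2, 3]]
M = [[1, 0], [0, -1], [0, 1]]
R = [[1, 0, 0], [0, -1, 3], [0, 1, -2]]

AM = matmul(A, M)   # the 2 x 2 identity matrix
AR = matmul(A, R)   # the identity extended by one zero column

# Cofactor expansion along the first row; a value of +1 or -1 means that
# R is invertible over the integers.
det_R = (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
         - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
         + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))
```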

Lemma 20 (matrix generating $[{\bb Z}^{n}]$ $[\Leftrightarrow]$ n invariant factors equal to 1)

The columns of any $[n\times m]$ matrix A generate $[{\bb Z}^{n}]$ if and only if A has n invariant factors equal to 1.

Proof

Let the m columns of A generate $[{\bb Z}^{n}]$ . Then A defines the surjection $[g:{\bb Z}^{m}\to{\bb Z}^{n}]$ whose kernel $[{\rm Ker}(g)]$ can be obtained as the image of a map $[f:{\bb Z}^{m-n}\to{\bb Z}^{m}]$ , chosen such that $[{\rm Ker}(g)]$ is generated by $[f({\bf e}_{1}),\ldots,f({\bf e}_{m-n})]$ , where $[{\bf e}_{1},\ldots,{\bf e}_{m-n}]$ denote the standard basis of $[{\bb Z}^{m-n}]$ . Since $[g:{\bb Z}^{m}\to{\bb Z}^{n}]$ is surjective, the standard basis vectors $[{\bf u}_{1},\ldots,{\bf u}_{n}]$ of $[{\bb Z}^{n}]$ are images $[g({\bf v}_{1}),\ldots,g({\bf v}_{n})]$ , respectively, of some vectors $[{\bf v}_{1},\ldots,{\bf v}_{n}\in{\bb Z}^{m}]$ . We can define the linear map $[h:{\bb Z}^{n}\to{\bb Z}^{m}]$ , $[h({\bf u}_{i}) = {\bf v}_{i}]$ for $[i = 1,\ldots,n]$ , so that $[g\circ h = {\rm id}]$ on $[{\bb Z}^{n}]$ . Then h has the $[m\times n]$ matrix M such that $[AM = I_{n}]$ , where I_n is the $[n\times n]$ identity matrix. Extending M by the columns $[f({\bf e}_{1}),\ldots,f({\bf e}_{m-n})]$ gives the $[m\times m]$ matrix R such that AR equals the $[n\times m]$ matrix obtained by extending I_n with zero columns. By Lemma 18, R is an invertible matrix over $[{\bb Z}]$ , so $[I_{n}AR = AR]$ is the SNF of A with all invariant factors equal to 1 by Definition 15.

Conversely, let the Smith normal form $[{\rm SNF} = LAR]$ of the matrix A in Definition 15 have all invariant factors equal to 1, so the $[n\times m]$ matrix LAR has the first n columns $[{\bf u}_{1},\ldots,{\bf u}_{n}]$ , which form a standard basis of $[{\bb Z}^{n}]$ , and zero columns. Then each $[{\bf u}_{i}]$ is a linear combination of the m columns of AR with coefficients from L, so these m columns generate $[{\bb Z}^{n}]$ . Since R is invertible, the m columns of A also generate $[{\bb Z}^{n}]$ .

□
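Lemma 20 can be checked numerically on small matrices without computing a full SNF. The sketch below is not the method used in this paper: it relies on the standard fact that the product of the first k invariant factors equals the greatest common divisor of all k×k minors, so the columns of A generate $[{\bb Z}^{n}]$ exactly when the n×n minors have gcd 1. The function names are illustrative, and enumerating all minors is exponential in general, which is tolerable only for small n and m.

```python
from itertools import combinations
from math import gcd

def int_det(rows):
    """Exact determinant of a square integer matrix (fraction-free Bareiss elimination)."""
    a = [list(r) for r in rows]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # look for a row below with a non-zero entry in column k and swap it up
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # the division by the previous pivot is always exact
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]

def columns_generate_Zn(A):
    """True if the columns of the n x m integer matrix A generate Z^n."""
    n, m = len(A), len(A[0])
    g = 0
    for cols in combinations(range(m), n):
        minor = [[A[i][j] for j in cols] for i in range(n)]
        g = gcd(g, int_det(minor))
        if g == 1:
            return True
    return False

# The columns (2,0), (3,0), (0,1) generate Z^2, although no two of them alone do.
print(columns_generate_Zn([[2, 3, 0], [0, 0, 1]]))
# The columns (2,0), (0,1) span only an index-2 sublattice of Z^2.
print(columns_generate_Zn([[2, 0], [0, 1]]))
```

As a side check, `int_det` applied to the matrix R displayed before Lemma 20 returns −1, which agrees with R being invertible over $[{\bb Z}]$ .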

Lemma 21 (connected periodic graph $[G\subset{\bb R}^{n}]$ $[\Rightarrow]$ n invariant factors equal to 1)

In Algorithm 16, if the lifted periodic graph $[G\subset{\bb R}^{n}]$ whose vertices form a periodic set S becomes connected, then the translational matrix A has n invariant factors equal to 1.

Proof

By Lemma 20 it suffices to show that any vector $[{\bf v}\in{\bb Z}^{n}]$ is an integer linear combination of columns of A. Let Λ be the lattice of S in Algorithm 16 and $[p\in S]$ be any point. The points p and $[p+{\bf v}]$ are connected in the lifted graph $[G\subset{\bb R}^{n}]$ by a polygonal path of straight-line edges. Under $[G\to G/\Lambda]$ , this path projects to a closed cycle C at the vertex (Λ-equivalence class) $[p+\Lambda]$ in the quotient graph $[Q = G/\Lambda]$ .

Let the cycle C pass through edges $[e_{1},\ldots,e_{k}]$ (with integer multiplicities) in the complement of the forest F in the quotient graph Q. These edges were added only to Q in case 2 of the loop stage. When we tried to add every edge e_j to F, the edge e_j created a cycle C_j whose cycle sum appeared as a column in the translational matrix A (if this cycle sum was not yet an integer combination of the previous columns). Then the vector $[{\bf v}]$ equals the sum of the cycle sums of all the cycles C_j for $[j = 1,\ldots,k]$ , which is an integer combination of the columns of A as required.

□

Lemma 22 (connected quotient graph $[G/\Lambda]$ $[\Rightarrow\exists]$ a tree of representatives $[T\subset G]$ )

If a LQG $[Q = G/\Lambda]$ is connected, its lifted graph $[G\subset{\bb R}^{n}]$ on a periodic point set S with a motif of m points and a lattice Λ includes a straight-line tree of representatives $[T\subset G]$ with m vertices that are not Λ-equivalent to each other.

Proof

Since Q is connected, we can choose a spanning tree $[F\subset Q]$ on the m vertices of Q. A required tree $[T\subset G]$ will be a connected union of straight-line edges of G that map 1–1 to all edges of F under the quotient $[G\to Q]$ . Start from any point $[p\in S]$ and take any edge e at the vertex (Λ-equivalence class) $[p+\Lambda]$ of $[F\subset Q]$ . The preimage of e under $[G\to Q]$ contains a unique straight-line edge $[[p,q]\subset G]$ , which we add to T. After adding to T all edges at p that project to all edges of F at the vertex $[p+\Lambda]$ , choose another point $[p^{\prime}\in T]$ such that the vertex $[p^{\prime}+\Lambda]$ has an edge of F not yet covered by T under $[G\to Q]$ . We continue adding edges to T by using their projections in $[F\subset Q]$ until we get a tree $[T\subset G]$ that spans m points of S that are not Λ-equivalent. The final T has no cycle, else this cycle projects under $[G\to Q]$ to a cycle in a forest F.

□
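The construction in this proof can be sketched as a breadth-first traversal of the quotient graph that records, for every vertex, the integer cell offset of its chosen representative. The encoding below is an assumption made for illustration only: vertices are numbered 0,…,m−1, and a labelled edge (u, v, t) means that the point u in a cell c is joined to the point v in the cell c+t. The function name is hypothetical.

```python
from collections import deque

def tree_of_representatives(num_vertices, edges, root=0):
    """Lift a spanning tree of a connected labelled quotient graph.

    edges: list of (u, v, t) with t an integer translation vector (assumed convention).
    Returns (cell, tree): cell[v] is the cell offset of the representative of v,
    and tree lists the lifted straight-line edges as pairs of (vertex, cell).
    """
    dim = len(edges[0][2]) if edges else 0
    adj = {u: [] for u in range(num_vertices)}
    for u, v, t in edges:
        adj[u].append((v, tuple(t)))
        adj[v].append((u, tuple(-x for x in t)))  # the same edge traversed backwards
    cell = {root: (0,) * dim}
    tree = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, t in adj[u]:
            if v not in cell:  # first visit: this edge joins the tree
                cell[v] = tuple(a + b for a, b in zip(cell[u], t))
                tree.append(((u, cell[u]), (v, cell[v])))
                queue.append(v)
    if len(cell) < num_vertices:
        raise ValueError("the quotient graph is not connected")
    return cell, tree

# A triangle of three motif points in the plane with labels (0,0), (1,0), (0,1).
cell, tree = tree_of_representatives(
    3, [(0, 1, (0, 0)), (1, 2, (1, 0)), (2, 0, (0, 1))]
)
print(cell)
print(len(tree))
```

The returned tree has m−1 edges and touches each Λ-equivalence class exactly once, which is the property used in the proof of Lemma 23.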

Lemma 23 (termination conditions in Algorithm 16 $[\Rightarrow]$ connected graph $[G\subset{\bb R}^{n}]$ )

Let Q be a LQG with a translational matrix A and a lifted graph G on a periodic point set $[S\subset{\bb R}^{n}]$ with a lattice Λ. If Q is connected and the matrix A has n invariant factors equal to 1, then the lifted periodic graph $[G\subset{\bb R}^{n}]$ is connected.

Proof

For any points $[p,q\in S]$ , we will find a path of straight-line edges in G as follows. By Lemma 22 the connectedness of the quotient graph $[Q = G/\Lambda]$ guarantees the existence of a tree $[T\subset G]$ whose vertices represent all Λ-equivalence classes of points of S. Let $[p^{\prime},q^{\prime}]$ be the vertices of T that are Λ-equivalent to p,q, respectively.

Since $[p^{\prime},q^{\prime}]$ are connected by a path in T, it suffices to find a path from p to its Λ-translate $[p^{\prime} = p+{\bf v}]$ (then similarly from q to $[q^{\prime}]$ ) in the graph G for any $[{\bf v}\in\Lambda]$ . By Lemma 20 the columns of A generate $[{\bb Z}^{n}]$ , so $[{\bf v}]$ is an integer combination of these columns. It suffices to find a path in G by assuming that $[{\bf v}]$ is one column of A because a path for any sum $[\sum_{i}{\bf v}_{i}]$ can be obtained by concatenating paths for $[{\bf v}_{i}]$ (a path for $[-{\bf v}_{i}]$ is a translate of the reversed path for $[{\bf v}_{i}]$ ). A column $[{\bf v}]$ can appear in A only in case 2 of the loop stage in Algorithm 16 as a cycle sum of a cycle $[C\subset Q]$ that was created by trying to add an edge e from Algorithm 13 to a forest $[F\subset Q]$ . If we order all edges of C from the vertex $[p+\Lambda]$ as $[e_{1},\ldots,e_{k}]$ , the sum of their translation vectors equals $[{\bf v}]$ . We build a path from p to $[p+{\bf v}]$ in G by finding a unique edge $[[p,p_{1}]\subset G]$ that projects to e₁, then a unique edge $[[p_{1},p_{2}]\subset G]$ that projects to e₂ and so on until we cover all $[e_{1},\ldots,e_{k}]$ and arrive at $[p+{\bf v}]$ .

□
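The last step of this proof, lifting a cycle of the quotient graph edge by edge, can be illustrated by a few lines of code. This is only a sketch under an assumed encoding: a directed step (u, v, t) means that the point u in a cell c is joined to the point v in the cell c+t, and the function name is hypothetical.

```python
def lift_walk(start_vertex, start_cell, walk):
    """Lift a walk in the quotient graph to a path in the periodic graph.

    walk: list of directed steps (u, v, t) under the assumed edge convention.
    Returns the visited points as pairs (vertex, cell offset).
    """
    vertex, cell = start_vertex, tuple(start_cell)
    path = [(vertex, cell)]
    for u, v, t in walk:
        if u != vertex:
            raise ValueError("steps of the walk are not consecutive")
        vertex = v
        cell = tuple(a + b for a, b in zip(cell, t))  # accumulate the translations
        path.append((vertex, cell))
    return path

# A cycle through three vertices whose cycle sum is (1, 1).
cycle = [(0, 1, (0, 0)), (1, 2, (1, 0)), (2, 0, (0, 1))]
path = lift_walk(0, (0, 0), cycle)
print(path[-1])                 # endpoint: vertex 0 shifted by the cycle sum
print(lift_walk(0, (0, 0), cycle + cycle)[-1])  # concatenation doubles the shift
```

The lifted path ends at the translate of the starting point by the cycle sum, and concatenating cycles adds their cycle sums, which is how a path is obtained for any integer combination of the columns of A.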

Remark 24

Onus & Robins (2022) discuss connected components of a periodic graph K in terms of homology: their Theorem 1(1) proves that H₀(K) has a basis of $[\sum_{i = 1}^{N}[{\bb Z}^{d}:W_{Q_{i}}]]$ elements (see details in their Section 3.1), but they do not describe an algorithm for finding such a basis. Our results complement their approach by proving the time complexity for checking the connectivity of a dynamic periodic Euclidean graph in Theorem 6 whilst keeping track of its connected components.

Lemma 25 (ignored edges)

Let an edge e be a Λ-equivalence class of a straight-line edge $[[p,q]+\Lambda]$ in a lifted periodic graph G for some points $[p,q\in S]$ . If Algorithm 16 does not add the edge e to a LQG Q, then the points p,q are already connected by a path in the graph $[G\subset{\bb R}^{n}]$ lifted from Q by Lemma 10.

Proof

The loop stage in Algorithm 16 ignores an edge e in the cases below.

Case 1. The edge e forms a cycle in Q whose cycle sum is the zero vector in $[{\bb Z}^{n}]$ .

Case 2. The edge e forms a cycle whose cycle sum equals an integer linear combination of pre-existing cycle sums from the translational set B.

In both cases, we have either one cycle (in case 1) containing e, whose cycle sum is $[0\in{\bb Z}^{n}]$ , or several cycles (in case 2), one (up to multiplicity) of which contains e, whose total sum of translational vectors is $[0\in{\bb Z}^{n}]$ . By Definition 9 each edge of Q involved in this zero sum can be lifted to a straight-line edge in the graph $[G\subset{\bb R}^{n}]$ .

If we start from the given point $[p\in S]$ , a cycle in Q and its sum 0 of translational vectors guarantees that the sequence of the lifted edges in G finishes at the same point p and hence forms a cycle C. This cycle C has the edge [p,q] whose exclusion keeps the points $[p,q\in S]$ connected by the path in C that is complementary to [p,q].

□

Theorem 26

Algorithm 16 finds the bridge length $[\beta(S)]$ from Definition 2 for any periodic point set $[S\subset{\bb R}^{n}]$ with a motif M of points given in a basis $[{\bf v}_{1},\ldots,{\bf v}_{n}]$ .

Proof

Within Algorithm 16, let d be the length of the last added edge e after which both termination conditions finally hold. By Lemma 25 all ignored edges do not create extra connections in the graph G. By Lemmas 21 and 22 the graph G obtained before adding the last edge e is disconnected. Lemma 23 guarantees that, when e is added, the graph G becomes connected. Because Algorithm 13 yields edges in increasing order, e is the shortest edge that could have this property, so the bridge length is $[\beta(S) = d]$ .

□

Theorem 6 has a rough upper bound assuming that the SNF $[{\rm SNF}(A)]$ of an integer $[n\times m]$ matrix A is re-computed at every iteration in time O(N). This time was estimated by Giesbrecht (1995) as $[O^{\sim}(n^{\omega-1}m\times M(n\log||A||))]$ , where $[||A|| = \max_{i,j}|A_{i,j}|]$ and $[A_{i,j}]$ denotes the element of A in row i and column j. Here M(t) bounds the cost of multiplying two t-bit integers, and $[\omega\leq 2.372]$ is the exponent for matrix multiplication: two $[n\times n]$ matrices can be multiplied in time $[O(n^{\omega})]$ (see Williams et al., 2024). The `soft-Oh' notation suppresses logarithmic factors, so $[f = O^{\sim}(g)]$ if and only if $[f = O(g\log^{c}g)]$ for a constant $[c\,\gt\,0]$ .

To speed up Algorithm 16 in practice, the SNF can be updated at every iteration instead of re-computing from scratch (see details in Appendix A).

Proof of Theorem 6

Algorithm 16 solves Problem 3 by Theorem 26. It remains to show that the time complexity of Algorithm 16 is O(m²a(U)ⁿN). Algorithm 16 has the initialization of a constant time O(1) and the loop stage. We will multiply an upper bound for the number of loops by the time complexity of each loop.

One loop in Algorithm 16 contains at most the following checks:

(Cycle) Does adding an edge e to a forest F create a cycle?

(Combination) Is the cycle sum an integer combination of previous cycle sums?

(Termination) After appending a cycle sum c to the translational matrix A and calculating $[{\rm SNF}(A)]$ , does A have n invariant factors equal to 1?

The condition Cycle is checked by a depth-first search in time O(m) (see Sedgewick & Wayne, 2011, Section 4.1). The condition Combination is equivalent to `Has $[{\rm SNF}(A)]$ changed?', and Termination is equivalent to `Is the product of the n invariant factors of A equal to 1?'. So both conditions can be jointly checked in the time O(N) needed to compute $[{\rm SNF}(A)]$ .

The time complexity of $[{\rm SNF}(A)]$ dominates all other steps in Algorithm 16, so we will use O(N) to represent the complexity of a single loop iteration of Algorithm 16.
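The Combination and Termination checks can be made concrete with a small amount of code. The sketch below computes the invariant factors by textbook elimination with elementary row and column operations; it is not the algorithm of Giesbrecht (1995) and makes no attempt to match its complexity. All function names are illustrative, and the matrix A is assumed to have at least one column already.

```python
def invariant_factors(A):
    """Non-zero invariant factors of an integer matrix (list of rows)."""
    a = [list(row) for row in A]
    n, m = len(a), len(a[0])
    factors = []
    for t in range(min(n, m)):
        while True:
            nonzero = [(abs(a[i][j]), i, j)
                       for i in range(t, n) for j in range(t, m) if a[i][j]]
            if not nonzero:
                return factors            # the remaining block is zero
            _, pi, pj = min(nonzero)      # pivot of smallest absolute value
            a[t], a[pi] = a[pi], a[t]
            for row in a:
                row[t], row[pj] = row[pj], row[t]
            p = a[t][t]
            for i in range(t + 1, n):     # row operations reduce column t
                q = a[i][t] // p
                a[i] = [x - q * y for x, y in zip(a[i], a[t])]
            for j in range(t + 1, m):     # column operations reduce row t
                q = a[t][j] // p
                for row in a:
                    row[j] -= q * row[t]
            if any(a[i][t] for i in range(t + 1, n)) or \
               any(a[t][j] for j in range(t + 1, m)):
                continue                  # remainders left: a smaller pivot exists
            bad = next((i for i in range(t + 1, n)
                        for j in range(t + 1, m) if a[i][j] % p), None)
            if bad is None:               # the pivot divides the remaining block
                factors.append(abs(p))
                break
            a[t] = [x + y for x, y in zip(a[t], a[bad])]
    return factors

def changes_snf(A, c):
    """Combination check: does appending the column c change SNF(A)?"""
    extended = [row + [x] for row, x in zip(A, c)]
    return invariant_factors(extended) != invariant_factors(A)

def termination_holds(A, n):
    """Termination check: does A have n invariant factors equal to 1?"""
    f = invariant_factors(A)
    return len(f) == n and all(x == 1 for x in f)

A = [[2], [0]]                            # one cycle sum (2, 0)
print(changes_snf(A, [4, 0]))             # (4, 0) = 2 * (2, 0)
print(changes_snf(A, [3, 0]))             # (3, 0) is not a multiple of (2, 0)
print(termination_holds([[2, 3, 0], [0, 0, 1]], 2))
```

Comparing the lists of invariant factors before and after appending a column is the literal form of the question `Has $[{\rm SNF}(A)]$ changed?'.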

Every loop iteration calls Algorithm 13. If we consider all calls to Algorithm 13 as running sequentially, the main loop will run at most a(U)+1 times, where a(U) is the aspect ratio from Definition 5. By Definition 5, `supercell_size' must reach at least a(U)+1 to ensure that we yield all potential edges up to and including $[\beta(S)]$ , i.e. $[supercell\_size \times h(U)\,\gt\,\beta(S)]$ . Each loop runs through the unit cells that are `supercell_size' away from the central cell U₁. By the end, we will have run through and yielded at most (a(U)+1)ⁿ unit cells. For each unit cell U_i, we find all distances between m points in U_i and m points in the central cell. The required time is O(m²) between any two cells and hence O(m²a(U)ⁿ) for all cells.

Algorithm 16 does not run for every edge found by Algorithm 13, but we assume this for simplicity.

The worst-case complexity of this implementation of Algorithm 16 is O(m²a(U)ⁿN).

□
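The cell-by-cell edge enumeration counted in this proof can be sketched as follows. Algorithm 13 is not reproduced here, so the code makes two assumptions that should be checked against it: the cells that are `supercell_size' away are taken to be those whose integer offset has maximum absolute coordinate exactly k, and the motif is given in Cartesian coordinates with the basis vectors as rows. The function names are hypothetical.

```python
from itertools import product
from math import dist

def shell_cells(n, k):
    """Integer offsets of the cells whose largest absolute coordinate equals k."""
    return [c for c in product(range(-k, k + 1), repeat=n)
            if max(map(abs, c)) == k]

def candidate_edges(motif, basis, k):
    """Sorted edges (length, i, j, cell) between the central cell and the k-th shell.

    Costs O(m^2) distance evaluations per cell for a motif of m points.
    """
    n = len(basis)
    edges = []
    for c in shell_cells(n, k):
        # Cartesian translation of the cell with integer offset c
        shift = [sum(ci * b[d] for ci, b in zip(c, basis)) for d in range(n)]
        for i, p in enumerate(motif):
            for j, q in enumerate(motif):
                q_shifted = [x + s for x, s in zip(q, shift)]
                length = dist(p, q_shifted)
                if length > 0:            # skip the zero-length pair in the central cell
                    edges.append((length, i, j, c))
    return sorted(edges)

# Square lattice with a one-point motif: the first shell has 8 cells.
edges = candidate_edges([(0.0, 0.0)], [(1.0, 0.0), (0.0, 1.0)], 1)
print(len(shell_cells(2, 1)), len(edges), edges[0][0])
```

Under this reading, the shells 0,…,k together contain (2k+1)ⁿ cells, which for a fixed dimension n is of order kⁿ, in line with the factor a(U)ⁿ in Theorem 6.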

5. Experiments on real and simulated crystals, and a discussion

This section discusses experiments computing the exact bridge length $[\beta(S)]$ for 5679 simulated and five real nanoporous crystals in Fig. 6 reported by Pulido et al. (2017).

Figure 6
The T2 molecule and five crystals synthesized from T2. The first four, T2-α, T2-β, T2-γ and T2-δ, were reported by Pulido et al. (2017); the last, T2-ε, by Zhu et al. (2022).

Table 1 contains the bridge lengths computed by Algorithm 16 on the crystals from Fig. 6 given by their codes in the CSD. The names of the T2 polymorphs refer to the crystalline forms α, β, γ, δ, ε based on the same molecule T2. The crystal IDs in the first column of Table 1 refer to the CSD six-letter refcodes (Taylor & Wood, 2019).

Table 1
The exact bridge length $[\beta(S)]$ computed by Algorithm 16 and its upper bounds for the nine experimental and 5679 simulated T2 crystals reported by Pulido et al. (2017)

CSD refcodes of experimental and simulated crystals	No. of atoms in a cell	Bridge length $[\beta(S)]$ (Å)	Upper bound r(U) (Å)	Upper bound 2R(S) (Å)	Ratio of the best upper bound min{r(U),2R(S)} to the exact $[\beta(S)]$	Run time (s)
T2-α NAVXUG	184	2.028	22.325	15.609	7.695	4.337
T2-β DEBXIT05	92	3.163	20.665	12.906	4.080	0.664
T2-β DEBXIT06	92	3.188	20.694	12.884	4.042	0.657
T2-γ DEBXIT01	92	1.879	23.224	23.366	12.358	0.706
T2-γ DEBXIT02	92	1.926	23.226	23.375	12.061	0.636
T2-γ DEBXIT03	92	1.902	23.230	23.373	12.216	0.653
T2-γ DEBXIT04	92	1.970	23.290	23.448	11.824	0.649
T2-δ SEMDIA	92	2.713	14.401	8.350	3.077	0.671
T2-ε DEBXIT07	92	2.062	12.608	5.707	2.768	0.641

Average for all 5679 simulated T2 crystals	295.8	2.293	23.306	9.110	4.064	31.653
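The second-to-last column of Table 1 can be cross-checked against the other columns. The snippet below assumes that this column is the ratio of the smaller upper bound min{r(U),2R(S)} to the exact bridge length; the values are copied from the nine experimental rows, and small deviations are expected because the table entries are rounded to three decimals.

```python
# (refcode, bridge length, r(U), 2R(S), tabulated ratio), all lengths in angstroms
rows = [
    ("NAVXUG",   2.028, 22.325, 15.609,  7.695),
    ("DEBXIT05", 3.163, 20.665, 12.906,  4.080),
    ("DEBXIT06", 3.188, 20.694, 12.884,  4.042),
    ("DEBXIT01", 1.879, 23.224, 23.366, 12.358),
    ("DEBXIT02", 1.926, 23.226, 23.375, 12.061),
    ("DEBXIT03", 1.902, 23.230, 23.373, 12.216),
    ("DEBXIT04", 1.970, 23.290, 23.448, 11.824),
    ("SEMDIA",   2.713, 14.401,  8.350,  3.077),
    ("DEBXIT07", 2.062, 12.608,  5.707,  2.768),
]

def best_bound_ratio(beta, r_U, two_R):
    """Ratio of the smaller upper bound to the exact bridge length (assumed meaning)."""
    return min(r_U, two_R) / beta

deviations = {
    code: abs(best_bound_ratio(beta, r_U, two_R) - tabulated)
    for code, beta, r_U, two_R, tabulated in rows
}
for code, dev in deviations.items():
    print(f"{code}: deviation {dev:.4f}")
```

All nine recomputed ratios agree with the tabulated ones to within 0.005, which supports the assumed reading of the column. The same check does not apply to the final row, since an average of ratios is not the ratio of the averages.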

Note that there are four slightly different versions of the polymorph T2-γ in the CSD (DEBXIT01…04) because their crystal structures were determined at different temperatures. The seven versions DEBXIT01…07 with the same six-letter code may look similar, even to experts. The polymorph T2-δ (SEMDIA) was deposited later than others because even the original authors confused this polymorph with earlier crystals. This confusion was detected by density functions from the work of Edelsbrunner et al. (2021), computed by Smith & Kurlin (2022). The underlying density invariants turned out to be incomplete by Example 11 of Anosova & Kurlin (2022), but were explicitly described for all periodic sequences of intervals in $[{\bb R}]$ (Anosova & Kurlin, 2023).

Table 1 includes the upper bounds $[\beta(S)\leq\min\{r(U),2R(S)\}]$ from Lemma 3.6(a) of Anosova et al. (2025) [see r(U) and R(S) in Definition 5]. The run times in Table 1 were recorded on a laptop with an Intel i5 processor, one 1 GHz core and 8 GB RAM.

The final row contains the averages for 5679 simulated T2 crystals, which are publicly available in the supplementary materials of Pulido et al. (2017) and were used for predicting the five experimental polymorphs represented by nine entries in the CSD. For all crystals in Table 1, the translational matrix size never exceeded three columns.

The real T2 crystals in the CSD have smaller motifs consisting of only two or four T2 molecules, while simulated T2 crystals contain up to 32 molecules, which makes the run times slower in comparison with the real ones (see the last column in Table 1).

More importantly, the exact bridge length $[\beta(S)]$ is four times smaller (on average) than its upper bound min{r(U),2R(S)}. The bridge length $[\beta(S)]$ provides the upper bound $[\beta(S)+2R(S)\,\gt\,\alpha(S)]$ in Lemma 3.6(b) of Anosova et al. (2025) for a stable radius α of atomic clouds that suffices for a complete and continuous isoset invariant of S.

This isoset uniquely identifies any periodic crystal S under rigid motion and has a continuous distance metric that has detected thousands of near-duplicate crystals. Decreasing the upper bound of $[\alpha(S)]$ from 4R(S) to the smaller value $[\beta(S)+2R(S)]$ by a factor of about 2 decreases the size m of atomic clouds by a factor of $[2^{3} = 8]$ in $[{\bb R}^{3}]$ . This size reduction speeds up by several orders of magnitude the algorithms for isosets and their distance metric, which have complexity $[O(m^{3}\log m)]$ and O(m⁶) in $[{\bb R}^{3}]$ , respectively; see the conclusions of Section 5 of Anosova et al. (2025).

The next open problem is an exact computation of the minimal stable radius $[\alpha(S)]$ . A closely related problem is to compute the regularity radius ρ, the minimum radius with the property that any Delone set with mutually equivalent clusters of radius ρ is regular (periodic with a 1-point asymmetric set) (see Baburin et al., 2018).

In conclusion, this paper contributes an exact algorithm for computing the bridge length, a key ingredient for solving the geo-mapping problem for periodic point sets within the emerging area of geometric data science (Kurlin, 2025). This problem has been solved for 2D lattices (Kurlin, 2024; Bright et al., 2023a; Bright et al., 2023b), while the 3D case is being finalized (Kurlin, 2022; Bright et al., 2021).

APPENDIX A

A faster `online' algorithm for the SNF

A different way of checking the Termination condition is to append columns to A in an `online' fashion. This avoids the need to calculate the SNF from scratch every time (or often at all), and reduces the complexity to a time close to $[O(m^{\omega}\times E\times n)]$ , where $[\omega\leq 2.372]$ is the exponent for matrix multiplication (Williams et al., 2024), and O(E) is the complexity of the extended Euclidean algorithm (Baladi & Vallée, 2005). As this reduction in complexity is dominated by the price of populating the edges with Algorithm 13, it will be irrelevant for most use cases (and is not used in the experiments shown here). If a use case involves, say, a larger or higher-dimensional pre-populated set of edges, this algorithm becomes more useful.

Recall from Definition 15 that the diagonal of the Smith normal form $[{\rm SNF}(A) = LAR]$ is made up of the invariant factors of A. To progressively calculate $[{\rm SNF}(A)]$ , we need only keep track of the right-multiplying unimodular matrix R and the invariant factors themselves, which form a vector $[{\bf f} = (f_{1},\ldots,f_{n})\in{\bb Z}^{n}]$ . To run the main algorithm here, we do have to begin with a matrix with n integer linearly independent rows. `Adding' a vector $[{\bf v}]$ to $[{\bf f}]$ is where the process changes. We treat R and $[{\bf f}]$ as mutable, meaning each value is not necessarily fixed to its original assignment. The first step is to define $[x: = {\bf v}\times R]$ , then we find $[g_{i} = {\rm gcd}(x_{i},f_{i})]$ . If $[f_{i} = g_{i}]$ (i.e. f_i divides x_i), we can continue with the next index, with no need to change R as it only keeps track of columns (for context, if we were keeping track of L too, we would have to subtract the ith row from the last row x_i/f_i times).

If f_i divides x_i for all i, we would know that including the vector changes nothing; therefore, the relative edge is also irrelevant and can be discarded [this reduces the complexity of most of the Termination condition from O(N) to $[O(n^{\omega}+\log^{2}(n))]$ ].

However, if $[g_{i}\,\lt\,f_{i}]$ , then f_i not only becomes g_i, but we also know that $[{\rm SNF}(A)]$ will change and that we must add the edge relative to $[{\bf v}]$ . We must also alter R, accounting for the fact that $[{\bf f}]$ represents the diagonal of a matrix. We can do this by any process typical of `changing the pivot' in the $[{\rm SNF}]$ algorithm, ensuring that we update R in tandem. As accounting for the previous values of i is trivial, the worst case is equivalent to calculating the $[{\rm SNF}]$ of an $[(n-i)\times(n-i)]$ matrix in time $[O(N_{n-i})]$ , which improves upon the naive calculation of $[{\rm SNF}]$ from scratch upon every alteration of A.
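The cheap part of this update, deciding whether a new vector leaves the SNF unchanged, can be sketched in a few lines. The code follows the convention of this appendix that vectors are rows and $[x = {\bf v}\times R]$ , and it assumes that all entries of $[{\bf f}]$ are non-zero (the matrix has n linearly independent rows). The function name and the small example are illustrative only; the pivot-changing step for the case $[g_{i}\,\lt\,f_{i}]$ is not implemented.

```python
def leaves_snf_unchanged(v, R, f):
    """True if f_i divides x_i for every i, where x = v R (row vector times matrix).

    v: new integer row vector, R: right unimodular matrix as a list of rows,
    f: non-zero invariant factors (the diagonal of L A R).
    """
    x = [sum(v[k] * R[k][i] for k in range(len(v))) for i in range(len(f))]
    return all(xi % fi == 0 for xi, fi in zip(x, f))

# Rows of A are (1, -1) and (0, 3); with L = I and R below, A R = diag(1, 3).
R = [[1, 1],
     [0, 1]]
f = [1, 3]
print(leaves_snf_unchanged([2, 1], R, f))   # (2, 1) = 2*(1, -1) + 1*(0, 3)
print(leaves_snf_unchanged([1, 0], R, f))   # (1, 0) is not in the row lattice of A
```

In this example the row lattice of A consists of the integer vectors whose coordinates sum to a multiple of 3, so the test accepts (2, 1) and rejects (1, 0), in agreement with the divisibility criterion.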

Lemma 27

Updating the SNF as above preserves its properties.

Proof

As we only apply elementary row and column operations, the SNF is preserved. Multiplying the to-be-added row $[{\bf v}]$ by R before concatenating it as a new row to F is the same as performing those same elementary column operations upon a new matrix: $[[{\bf A}_{0},\ldots,{\bf A}_{n},{\bf v}]]$ (i.e. $[{\bf v}]$ concatenated as a row onto A).

We then continue to perform only elementary row and column operations, and we end up with a matrix that satisfies the conditions of an $[{\rm SNF}]$ noted in Definition 15.

□

Further discussion of this process is beyond the scope of this paper; however, there are still some small tricks that take advantage of the way the `new' rows for consideration are intrinsically related to $[{\bf v}]$ , and how $[f_{i+1}]$ divides $[f_{i}]$ .

Acknowledgements

We thank Jean-Guillaume Eon and Gregory McColm for their helpful advice on the early draft of this paper.

Funding information

This work was supported by the Royal Society (APEX fellowship award No. APX/R1/231152) and EPSRC New Horizons (grant No. EP/X018474/1).

References

Anosova, O. & Kurlin, V. (2021). Discrete geometry and mathematical morphology, DGMM 2021, edited by J. Lindblad, F. Malmberg and N. Sladoje. Lecture Notes in Computer Science, Vol. 12708, pp. 229–241. Springer.
Anosova, O. & Kurlin, V. (2022). Discrete geometry and mathematical morphology, DGMM 2022, edited by É. Baudrier et al. Lecture Notes in Computer Science, Vol. 13493, pp. 395–408. Springer.
Anosova, O. & Kurlin, V. (2023). J. Math. Imaging Vis. 65, 689–701.
Anosova, O., Kurlin, V. & Senechal, M. (2024). IUCrJ 11, 453–463.
Anosova, O., Widdowson, D. & Kurlin, V. (2025). Pattern Recognit. 171, 112108.
Baburin, I. A., Bouniaev, M., Dolbilin, N., Erokhovets, N. Y., Garber, A., Krivovichev, S. V. & Schulte, E. (2018). Acta Cryst. A74, 616–629.
Baladi, V. & Vallée, B. (2005). J. Number Theory 110, 331–386.
Bouniaev, M. & Dolbilin, N. (2017). J. Inf. Process. 25, 735–740.
Bright, M., Cooper, A. & Kurlin, V. (2021). arXiv:2109.11538.
Bright, M., Cooper, A. & Kurlin, V. (2023a). Chirality 35, 920–936.
Bright, M., Cooper, A. I. & Kurlin, V. (2023b). Acta Cryst. A79, 1–13.
Brock, C. P. (2021). Change to the definition of `crystal' in the IUCr Online Dictionary of Crystallography. IUCr Newsl. Vol. 29, No. 2, https://www.iucr.org/news/newsletter/etc/articles?issue=151351&result_138339_result_page=17.
Chung, S. J., Hahn, Th. & Klee, W. E. (1984). Acta Cryst. A40, 42–50.
Cohen, E. & Megiddo, N. (1990). Applied geometry and discrete mathematics, pp. 135–146. ACS, DIMACS and Association for Computing Machinery.
Cohn, P. M. (1985). Free rings and their relations. Academic Press.
Delone, B. N., Dolbilin, N. P., Shtogrin, M. I. & Galiulin, R. V. (1976). Dokl. Akad. Nauk SSSR 227, 19–21.
Delone, B. N., Galiulin, R. V., Dolbilin, N. P., Zalgaller, V. A. & Shtogrin, M. I. (1973). Dokl. Akad. Nauk 209, 25–28.
Dolbilin, N. (2015). Geometry and symmetry conference, pp. 109–125. Springer.
Dolbilin, N. (2018). Proc. Steklov Inst. Math. 302, 161–185.
Dolbilin, N. & Bouniaev, M. (2019). Eur. J. Combin. 80, 89–101.
Dolbilin, N., Lagarias, J. & Senechal, M. (1998). Discrete Comput. Geom. 20, 477–498.
Dolbilin, N. P. (1976). Dokl. Akad. Nauk 230, 516–519.
Edelsbrunner, H. & Heiss, T. (2024). arXiv:2408.16575.
Edelsbrunner, H., Heiss, T., Kurlin, V., Smith, P. & Wintraecken, M. (2021). 37th International symposium on computational geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Vol. 189, pp. 32:1–32:16. Schloss Dagstuhl – Leibniz Zentrum für Informatik.
Eon, J.-G. (2011). Acta Cryst. A67, 68–86.
Eon, J.-G. (2016a). Acta Cryst. A72, 268–293.
Eon, J.-G. (2016b). Acta Cryst. A72, 376–384.
Giesbrecht, M. (1995). Proceedings of the 1995 international symposium on symbolic and algebraic computation, pp. 110–118. Association for Computing Machinery.
Hatcher, A. (2002). Algebraic topology. Cambridge: Cambridge University Press.
Kurlin, V. (2022). arXiv:2201.10543.
Kurlin, V. (2024). Found. Comput. Math. 24, 805–863.
Kurlin, V. (2025). SIAM J. Math. Data Sci. 7, https://doi.org/10.1137/25M1733574.
McColm, G. (2024). Acta Cryst. A80, 18–32.
Newman, M. (1972). Integral matrices. Academic Press.
Onus, A. & Robins, V. (2022). arXiv:2208.09223.
Pulido, A., Chen, L., Kaczorowski, T., Holden, D., Little, M. A., Chong, S. Y., Slater, B. J., McMahon, D. P., Bonillo, B., Stackhouse, C. J., Stephenson, A., Kane, C. M., Clowes, R., Hasell, T., Cooper, A. I. & Day, G. M. (2017). Nature 543, 657–664.
Sedgewick, R. & Wayne, K. (2011). Algorithms. Addison-Wesley Professional.
Smith, P. & Kurlin, V. (2022). Advances in visual computing, ISVC2022, edited by G. Bebis et al. Lecture Notes in Computer Science, Vol. 13599, pp. 377–391. Springer.
Taylor, R. & Wood, P. A. (2019). Chem. Rev. 119, 9427–9477.
Van der Waerden, B. L. (2003). Algebra, Vol. 1. Springer Science & Business Media.
Widdowson, D. & Kurlin, V. (2021). arXiv:2108.04798v3.
Widdowson, D. & Kurlin, V. (2022). NeurIPS 35, 24625–24638.
Widdowson, D. & Kurlin, V. (2024). Cryst. Growth Des. 24, 5627–5636.
Widdowson, D., Mosca, M. M., Pulido, A., Cooper, A. I. & Kurlin, V. (2022). MATCH Commun. Math. Comput. Chem. 87, 529–559.
Williams, V. V., Xu, Y., Xu, Z. & Zhou, R. (2024). Proceedings of the symposium on discrete algorithms, edited by D. P. Woodruff, pp. 3792–3835. Society for Industrial and Applied Mathematics.
Zhu, Q., Johal, J., Widdowson, D. E., Pang, Z., Li, B., Kane, C. M., Kurlin, V., Day, G. M., Little, M. A. & Cooper, A. I. (2022). J. Am. Chem. Soc. 144, 9893–9901.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
