Introduction
Contributions
- We prove sufficient conditions on S and the norm ∥·∥ under which (2) is a pseudometric, i.e., a metric over a quotient space defined by the equivalence relation dS(A,B)=0. In particular, we show that dS is a pseudometric when:
- (i) \(S=\mathbb {P}^{n}\) and ∥·∥ is any entry-wise or operator norm;
- (ii) \(S=\mathbb {W}^{n}\), the set of doubly stochastic matrices, ∥·∥ is an arbitrary entry-wise norm, and A and B are symmetric; a modification of dS extends this result to operator norms as well as arbitrary matrices (capturing, e.g., directed graphs); and
- (iii) \(S=\mathbb {O}^{n}\), the set of orthogonal matrices, and ∥·∥ is the operator or entry-wise 2-norm.
We also characterize the corresponding equivalence classes (see “Main results” section). Relaxations (ii) and (iii) are important from a practical standpoint. For all matrix norms, computing (2) with \(S={\mathbb {W}}^{n}\) is tractable, as it is a convex optimization problem. For \(S={\mathbb {O}}^{n}\), (2) is non-convex but still tractable, as it reduces to a spectral decomposition. This was known for the Frobenius norm (Umeyama 1988); we prove that it also holds for the operator 2-norm.
- We incorporate node attributes in a natural way into the definition of dS, as either soft constraints (i.e., penalties in the objective) or hard constraints in Eq. (2). Crucially, we do so without affecting the pseudometric property or tractability. This allows us to explore label- or feature-preserving permutations that incorporate both (a) exogenous node attributes, e.g., user age or gender in a social network, and (b) endogenous, structural features of each node, such as its degree or the number of triangles passing through it. We show numerically that adding these constraints can speed up the computation of dS.
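To make the convexity of the \(\mathbb {W}^{n}\) relaxation concrete: with the entry-wise 1-norm, \(\min _{W\in \mathbb {W}^{n}}\|AW-WB\|_{1}\) is in fact a linear program. The sketch below is illustrative only (it uses scipy rather than the CVX/C/Spark implementations used in the experiments, and the helper name `d_ds_l1` is ours, not from the paper); it introduces auxiliary variables T ≥ |AW − WB| entry-wise.

```python
import numpy as np
from scipy.optimize import linprog

def d_ds_l1(A, B):
    """Sketch of d_W(A, B) with the entry-wise 1-norm:
    min ||AW - WB||_1 over doubly stochastic W, cast as an LP
    with auxiliary variables t >= |(AW - WB)_ij|."""
    n = A.shape[0]
    N = n * n
    I = np.eye(n)
    # Column-major vec identity: vec(AW - WB) = (I (x) A - B^T (x) I) vec(W)
    M = np.kron(I, A) - np.kron(B.T, I)
    # Doubly stochastic constraints on vec(W): row sums and column sums equal 1
    R = np.kron(np.ones((1, n)), I)   # row sums
    C = np.kron(I, np.ones((1, n)))   # column sums
    A_eq = np.block([[R, np.zeros((n, N))],
                     [C, np.zeros((n, N))]])
    b_eq = np.ones(2 * n)
    # |M w| <= t  <=>  M w - t <= 0  and  -M w - t <= 0
    A_ub = np.block([[ M, -np.eye(N)],
                     [-M, -np.eye(N)]])
    b_ub = np.zeros(2 * N)
    c = np.concatenate([np.zeros(N), np.ones(N)])  # minimize sum of t
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (2 * N))
    return res.fun
```

For isomorphic graphs (B = PAPᵀ for a permutation P), the optimum is 0, attained at W = Pᵀ; the dense LP above is only suitable for small n.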
Related Work
Notation and preliminaries
Symbol | Description |
---|---|
[n] | The set {1,…,n}. |
\(\mathbb {R}^{n\times n} \) | The set of real n×n matrices. |
\(\mathbb {S}^{n}\) | The set of real, symmetric n×n matrices. |
I | The identity matrix of size n×n. |
1 | The n-dimensional vector whose entries are all equal to 1. |
σmax(·) | The largest singular value of a matrix. |
\(\mathop {\mathsf {tr}}(\cdot)\) | The trace of a matrix. |
conv(·) | The convex hull of a set. |
G(V,E) | Graph with vertex set V and edge set E. |
A,B | Matrices [ai,j]i,j∈[n] and [bi,j]i,j∈[n]. |
∥·∥p | Operator or entry-wise p-norm. |
∥·∥F | Frobenius norm. |
\(\mathbb {P}^{n} \) | The set of n×n permutation matrices. |
\(\mathbb {W}^{n} \) | The set of n×n doubly stochastic matrices. |
\(\mathbb {O}^{n} \) | The set of n×n orthogonal matrices. |
Ω, \(\tilde {\Omega }\) | Sets over which a metric is defined. |
d(x,y) | A metric over space Ω. |
\(\bar {d}(x,y)\) | The symmetric extension of d(x,y). |
(Ω,d) | A metric space. |
GA,GB | Graphs with adjacency matrices A and B. |
P,W,O | n×n matrices. |
S | A closed and bounded subset of \(\mathbb {R}^{n\times n}\). |
dS(A,B) | A class of distance scores defined by minimization (2) over set S. |
\(d_{\mathbb {P}^{n} }\) | The pseudometric dS where S is the set of permutation matrices. |
\(d_{\mathbb {W}^{n} }\) | The pseudometric dS where S is the set of doubly stochastic matrices. |
\(d_{\mathbb {O}^{n} }\) | The pseudometric dS where S is the set of orthogonal matrices. |
\(\Psi ^{n}_{\tilde {\Omega }}\) | The set of all embeddings from \([n]\) to \(\tilde {\Omega }\), where \((\tilde {\Omega },\tilde {d})\) is a metric space. |
ψA,ψB | Embeddings in \(\Psi ^{n}_{\tilde {\Omega }}\) of the nodes of graphs GA and GB, respectively. |
\(D_{\psi _{A},\psi _{B}}\) | The n×n matrix of all pairwise distances between images of nodes in GA and GB under embeddings ψA and ψB. |
Main results
A family of graph metrics
Incorporating metric embeddings
Proofs of Main results
Proof of Theorems 1–3
Proof of Theorems 4 and 5
Metric computation over the Stiefel manifold
Graphs of different sizes
Experiments
Experimental setup
(Non-metric) Distance Score Algorithms | |
---|---|
NetAlignBP | Network Alignment using Belief Propagation (Bayati et al. 2009) |
IsoRank | Neighborhood Topology Isomorphism using PageRank (Singh et al. 2007)
SparseIsoRank | Neighborhood Topology Sparse Isomorphism using PageRank (Bayati et al. 2009)
InnerPerm | Inner Product Matching with Permutations (Lyzinski et al. 2016) |
InnerDSL1 | Inner Product Matching with Matrices in \(\mathbb {W}^{n}\) and entry-wise 1-norm (Lyzinski et al. 2016) |
InnerDSL2 | Inner Product Matching with Matrices in \(\mathbb {W}^{n}\) and Frobenius norm (Lyzinski et al. 2016) |
NetAlignMR | Iterative Matching Relaxation (Klau 2009) |
Natalie (V2.0) | Improved Iterative Matching Relaxation (El-Kebir et al. 2015) |
Metrics from our Family (2) | |
EXACT | Chemical Distance via brute-force search on a GPU
DSL1 | Doubly Stochastic Chemical Distance \(d_{\mathbb {W}^{n}}\) with entry-wise 1-norm |
DSL2 | Doubly Stochastic Chemical Distance \(d_{\mathbb {W}^{n}}\) with Frobenius norm |
ORTHOP | Orthogonal Relaxation of Chemical Distance \(d_{\mathbb {O}^{n}}\) with operator 2-norm |
ORTHFR | Orthogonal Relaxation of Chemical Distance \(d_{\mathbb {O}^{n}}\) with Frobenius norm |
- NetAlignBP, IsoRank, SparseIsoRank, and NetAlignMR are described in (Bayati et al. 2009); Natalie is described in (El-Kebir et al. 2015). All five algorithms output \(P \in {\mathbb {P}}^{n}\).
- The algorithm in (Lyzinski et al. 2016) outputs one \(P \in \mathbb {P}^{n}\) and one \(P' \in \mathbb {W}^{n}\). We use \(P\) to compute ∥AP−PB∥1 and call this InnerPerm. We use \(P'\) to compute ∥AP′−P′B∥1 and ∥AP′−P′B∥F and call these algorithms InnerDSL1 and InnerDSL2, respectively. We use our own CVX-based projected gradient descent solver for the non-convex optimization problem the authors propose.
- DSL1 and DSL2 denote dS(A,B) when \(S = \mathbb {W}^{n}\) and ∥·∥ is ∥·∥1 (entry-wise) and ∥·∥F, respectively. We implement them in Matlab (using CVX) as well as in C, aimed at medium-size graphs and multi-core use. We also implemented a distributed version in Apache Spark (Zaharia et al. 2010), based on the Alternating Direction Method of Multipliers (Boyd et al. 2011), that scales to very large graphs over multiple machines.
- ORTHOP and ORTHFR denote dS(A,B) when \(S = \mathbb {O}^{n}\) and ∥·∥ is ∥·∥2 (operator norm) and ∥·∥F, respectively. We compute both via an eigendecomposition.
- For small graphs, we compute \(d_{\mathbb {P}^{n}}(A,B)\) using our brute-force GPU-based code. For a single pair of graphs with n≥15 nodes, EXACT already takes several days to finish. For ∥·∥=∥·∥1 in dS (entry-wise or operator norm), we also implemented the chemical distance as an integer linear program and solved it using branch-and-cut; this too did not scale well for n≥15.
- We implemented the WL algorithm over Spark to run, multithreaded, on a machine with 40 CPUs.
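The eigendecomposition behind ORTHFR (the Frobenius-norm case, known since Umeyama 1988) can be sketched in a few lines of numpy for symmetric A and B: sorting both spectra in the same order and aligning the eigenbases gives the optimal orthogonal matrix. This sketch is ours (the helper name `d_orth_fro` is illustrative, not the authors' code), and it does not cover the operator 2-norm case proved in the paper.

```python
import numpy as np

def d_orth_fro(A, B):
    """Sketch of the orthogonal relaxation with the Frobenius norm,
    for symmetric A, B:
        min_{O orthogonal} ||AO - OB||_F = ||lambda_sorted(A) - lambda_sorted(B)||_2,
    attained at O = U_A U_B^T when eigenvalues are sorted consistently
    (a Hoffman-Wielandt-type argument)."""
    lam_a, U_a = np.linalg.eigh(A)   # eigh returns eigenvalues in ascending order
    lam_b, U_b = np.linalg.eigh(B)
    O = U_a @ U_b.T                  # optimal orthogonal alignment of eigenbases
    d = np.linalg.norm(lam_a - lam_b)
    return d, O
```

Since A O − O B = U_A (Λ_A − Λ_B) U_Bᵀ for this choice of O, the attained objective equals the ℓ2 distance of the sorted spectra; in particular, the distance vanishes whenever B is a relabeling PAPᵀ of A.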
Experimental results
Model | Description |
---|---|
Bd | Barabási-Albert graph of degree d (Albert and Barabási 2002)
Ep | Erdős-Rényi with probability p (Erdös and Rényi 1959) |
P | Power Law Tree (Mahmoud et al. 1993) |
Rd | Regular Graph of degree d (Bollobás 1998) |
S | Small World (Kleinberg 2000) |
Wd | Watts-Strogatz graph of degree d (Watts and Strogatz 1998)
k | ∥P∥0 | ∥ AP − PA ∥0 | τ |
---|---|---|---|
1 | 3,747,960 | 100,569 | 133s |
2 | 239,048 | 3004 | 104s |
3 | 182,474 | 2036 | 136s |
4 | 182,016 | 2030 | 169s |
5 | 182,006 | 2030 | 200s |