Analysis on single graph
In some applications, persistent homology is used to detect global structural features of a single network such as complexity and distributions of strongly connected regions. While some applications analyze the evaluation of a single graph according to edge weights, others analyze the evaluation of the graph over time.
In many studies, Betti numbers are used as the complexity measure for different networks. Benzekry et al. (
2015) propose that cancer therapy can be guided by changes in the complexity of protein-protein interaction (PPI) networks. They analyze 11 cancer interaction networks and find out that there is a correlation between 1-dimensional Betti number and survival of cancer patients. They compute Betti numbers using the power filtration (
Power filtration (POW) section). To examine the effect of a node on the network complexity, each node in the network is removed and the change in Betti number is recorded. They consider the drop of the Betti number as the drop of the complexity. Therefore, if the removal of a node results in the largest drop in Betti number, it also results in the largest drop in complexity and is potentially a good drug target.
Similar to this, Rucco et al. (
2016) use Betti number as persistent entropy to measure the graph complexity. They study the behavior of the idiotypic network of the mammal immune system. Their main goal is to detect the behavior of the immune system reaction to an external stimulus in terms of phase transitions. In addition to the persistent entropy, they use 2 other graph complexity measures, which are the connectivity entropy and the approximate von Neumann entropy (
Petz 2001). While connectivity entropy is used to analyze the structural properties and to identify the set of key players of the idiotypic network, approximate von Neumann entropy is used to distinguish graphs corresponding to the same system but in different conditions. For persistent entropy, they use persistent barcodes constructed with the Vietoris-Rips filtration (
Vietoris-Rips filtration (VR) section). In their experiment, they create the simulation of the idiotypic network and obtain a weighted idiotypic network using the coexistence function as a weight function between antibodies. After computing the 3 different entropy measures on this network, they identify that peak on entropy corresponds to the activation of the immune response. While the connectivity entropy does not distinguish between the activation and the immune memory states, both the approximated von Neumann entropy and the persistent entropy are able to recognize the activation of the immune system. The analysis of the Betti numbers reveals that there is a subset of antibodies arranged in a 1-dimensional hole that is present both in the activation state and in the memory state.
Cliques and cycles are important structural features of complex networks to describe their cohesive structures.
Rieck et al. (2018) use persistent homology to detect clique communities and their evolution in weighted networks. Persistent diagram is created using the vertex-based clique filtration (
Vertex-based clique filtration (VBCL) section) and the
k-clique filtration (
k-clique filtration (kCL) section). They analyze the connectivity relations for all clique degrees and all weight thresholds. Various networks are studied including co-occurrence network, brain network, and collaboration network. An interactive visualization tool is created that is capable of detecting and tracking the evolution of networks’ clique communities for different thresholds and clique degrees.
Persistent homology is also used to analyze the brain networks by computing distributions of cliques (brain regions) and cycles (strongly connected regions) in them. In
Giusti et al. (2016), the authors review the underlying mathematical background of using simplicial complex in neural data, specifically brain networks. They list different types of simplicial complexes for encoding neural data such as networks, clique complex, independence complex, and concurrence complex. They also elaborate on using persistent homology to measure the global structure of simplicial complexes and the strength of neural connections using the weighted simplex filtration (
Weighted simplex filtration (WS) section) to generate persistent diagrams.
In
Sizemore et al. (2018), the authors test the hypothesis that the spatial distributions of cliques and cycles will differ in their anatomical locations. They construct 1- and 2-dimensional persistent diagrams of brain networks using the Vietoris-Rips filtration (
Vietoris-Rips filtration (VR) section). The structural brain networks of eight volunteers is extracted using diffusion spectrum imaging. The undirected and weighted network consists of 83 nodes representing different brain regions and edges that refer to the density of white matter between the nodes. Weak and strong connections between cliques are assessed by observing the difference between birth and death times of
k-cliques in persistent diagrams.
Additionally, in
Chung et al. (2013), persistent homology is also used to analyze the brain networks with the aim of examining the abnormal white matter in maltreated children. Networks are obtained by thresholding (based on the sample covariance) sparse correlations for the Jacobian determinant from magnetic resonance imaging (MRI) and fractional anisotropy from diffusion tensor imaging (DTI) at different threshold values. The collection of the thresholded graphs forms a Vietoris-Rips filtration (
Vietoris-Rips filtration (VR) section).
Moreover, in
Khalid et al. (2014), the authors demonstrate that persistent homology is useful in analyzing functional brain connectivity. The application involves electroencephalography (EEG) data from eight cortical regions of corticosterone (CORT) induced depression mouse and control models. After the EEG measurement is obtained, the square root of (1-correlation) distance metric is used to create a binary network. Next, the Vietoris-Rips filtration (
Vietoris-Rips filtration (VR) section) is applied and used to visualize topological changes by 0-dimensional barcodes which are then used to construct single-linkage dendrograms (SLD). Finally, single-linkage distance is computed using the generated SLDs. The results show that CORT model is characterized by an increased local connectivity and by a decreased global connectivity.
Besides its utility on brain networks, persistent homology is also used to analyze word co-occurrence, remittance, and migration networks. In
Salnikov et al. (2018), the authors study the word co-occurrence networks to explore the conceptual landscape of mathematical research. They first create the network using 54177 articles in arXiv from 01/1994 to 03/2007. Then they parse a concept list from Wikipedia that includes 1612 equations, theorems, and lemmas. Next, they combine these two datasets by checking 1612 concepts’ appearance in the articles and find that 1067 of them match in at least one article and 35018 articles contain at least one of the concepts. They first take 1067 concepts as nodes and include a (
n−1)-simplex for each article containing
n-concepts. Furthermore, whenever the concept sets of two articles intersect at
n concepts, their corresponding simplices share a face of dimension (
n−1). In total, this construction results in 32707 unweighted edges. They use the temporal filtration (
Temporal filtration (TMP) section) using article dates. They create the 1- and 2-dimensional persistent diagrams, i.e. they just look at the 2-dimensional holes bounded by edges and 3-dimensional holes bounded by triangles respectively. They interpret these holes to explain the intrinsic characteristics of how research evolves in mathematics. They also explore the authors’ conceptual profile using the holes and their attributes to the holes.
Ignacio and Darcy (2019) analyze the patterns and shapes in remittance and migration networks as a directed weighted network via persistent homology to identify flow patterns between multiple countries. They detect both local and global patterns that highlight simultaneous interactions among multiple nodes. They extend the Vietoris-Rips filtration (
Vietoris-Rips filtration (VR)) to detect topological features such as persistent cycles in directed networks using the weight of the edges and create persistence barcodes. They use 0-, 1- and 2-dimensional barcodes to analyze the cycles in networks. As a modification on 1 and 2-dimensional barcodes, to encode additional information, they color the bars in barcodes according to the standard deviation of the weights in the cycles they represent. They create the 2015 Asian net migration and remittance networks which include 50 countries and states to perform their analysis on. They define the weight of a directed edge (
a,
b) as the profit country
b gains from exchanging remittances with the country
a for remittance networks and define it in a similar manner for net migration networks.
One of the challenges for most graphing methods is the inability of visualizing the global structure of graphs as a result of the absence of interactive exploration mechanisms. Persistent homology is used to address this challenge (
Suh et al. 2017). They use 0-dimensional PH features to control and modify force-directed layouts of a graph. The 0-dimensional barcode, obtained by the power filtration (
Power filtration (POW) section), enables the visualization of contraction and repulsion events in the network. More forces are added to the graph layout based on the selected number of barcodes. They have three case studies to show the effectiveness of their method on 3 different real-world networks. One of the networks is “Les Miserables” which contains 77 nodes (characters) and 254 edges, weighted by how many scenes two characters share during any chapter of the novel. Some of the key characters featured in the book can be identified on the force-directed layout modified with PH features. They are also able to extract major important nodes in the Madrid Train Bombing network and US Senate 2007 and 2008 Co- and Anti-voting network using their method.
Multiple graphs analysis
Graph comparison is an important task for many graph applications such as classification and matching. On the other hand, it is a computationally complex problem where we need to compute the similarity between 2 networks (
Conte et al. 2004). It has been studied for many years and defined as either exact matches (e.g. graph isomorphism (
Cordella et al. 2004)) or some measures of structural similarity (e.g. graph edit distance (
Gao et al. 2010)). Graph kernels are also used to capture the graph similarity (
Shervashidze et al. 2011). Recent years, persistent homology is used to extract topological features of networks to compare them.
While most existing metrics for network structure rely on local features of vertices such as node degrees, correlation of neighborhood nodes, they do not capture the precise mesoscopic structure of complex networks. Sizemore et al. (
Sizemore et al. 2016) extract mesoscale homological features as 0-3 dimensional Betti numbers. They use the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration to compute the homology and record the maximal clique distribution and Betti sequence. Extracted features are used to classify 14 commonly studied weighted network models into four groups or classes with agglomerative hierarchical clustering to use for graph classification. Betti values and parameters from the maximal clique distribution are used to determine the structural similarities between networks. After classifying networks into groups, they analyze the structural patterns in each group of networks.
In Petri et al. (
2013a, b; Binchi et al.
2013b), persistent homology is used to detect particular non-local structural features of networks. After creating the barcodes with the inverse Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration based on edge weights, statistical distributions of 1-dimensional barcodes are computed. They classify real-world networks into 2 classes according to the similarity of their cycle distribution with randomized version. In Class I, cycle distributions are markedly different from the randomized versions and in Class II, cycle distributions are very close to their random versions. The authors study different network datasets, such as US air passenger networks, C. Elegans’s neuronal network (
Watts and Strogatz 1998), the online messages network (
Opsahl and Panzarasa 2009), gene network, network of mentions and re-tweet between Twitter users, school face-to-face contact network, co-authorship networks. While the gene network and airport network are in class 1, co-authorship networks and twitter network are in class 2.
In (
Carrière et al. 2019), the authors propose to use persistence diagrams for graph classification problem for undirected weighted graphs. They first define a graph kernel function, namely heat kernel signatures (
Hu et al. 2014), on networks and use the sublevel and superlevel
Vertex function based filtration (VFB) section filtration on each network to generate PDs. Then they employ two layers neural network architecture to process the PDs and classify the graphs. They evaluate their classification model on social networks, medical and biological networks. They also compare their results with four different state-of-the-art graph classification methods and show that their method has comparable results despite being much simpler than other methods.
Moreover, persistent homology is used to analyze the structure of weighted networks. In
Carstens and Horadam (2013), the authors consider the collaboration networks as weighted network. They use the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration to generate the persistence barcodes of networks. They employ the Betti numbers of 0, 1, and 2 dimensions and use them to distinguish collaboration networks from random networks. They conclude that the first and second Betti numbers give us richer information about weighted networks.
Pal et al. (2017) study the growing collaboration network with a temporal parametrization and characterize the temporal changes in its topological features. In a collaboration network, each person in a paper or a movie is represented as a vertex, and each collaborative act (and each of its subsets) is represented as a simplex of vertices comprising it. They define a temporal filtration (
Temporal filtration (TMP) section) from growing collaboration networks, with adding new collaborations occurred in each year. In addition, they introduce a new distance measure between a growing network which captures the difference in the rate of growth of cycles in the networks being compared. They use DBLP (Digital bibliography & library project) and IMDB (Internet movie database) data sets from 1950-2008 considering 10-year windows. They study the topological properties of networks as the growth in the cyclicity, with respect to the time corresponding to the 10-year windows, and size of the largest connected component.
In
Schauf et al. (2016), the authors consider the national input-output networks of domestic products as a weighted network and use persistent homology to identify dissimilarities between them. The nodes are available sectors in an economy and edge weights are the monetary flow measuring the magnitude of the economic relationship between two sectors. They generate persistence diagrams for dimensions 0, 1, and 2 with the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration. Using 0-dimensional diagrams, they distinguish economies with high GDP, large population, and small import/export percentages of GDP from those with lower GDP, small population, and larger import/export percentages. They also discuss the potential for applying higher-dimensional persistent homology to study these networks.
Similarly, financial networks are considered as weighted networks and persistent homology is used to detect early signs of critical transitions of financial crisis in
Gidea (2017). The vertices correspond to the stocks, each pair of distinct nodes is connected by an edge and each edge is assigned a weight using the Pearson correlation coefficient. For each time frame, they generate 0- and 1-dimensional persistent diagrams of the network using the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration. Then, the distance between them is measured via Wasserstein distance. They show that the persistent diagrams and the distances between them have significant changes prior to the 2007-2008 financial crisis.
Furthermore, in
Keil and Aktas (2018), the undirected attributed networks are considered as weighted networks. They first assign weights on edges using the vertex attributes. Then, they extract the ego-networks of each vertex and define a graph kernel function, namely the diffusion Fréchet function (
Martínez et al. 2018), on each ego network that takes both the network topology and edge weights into consideration. Next, they generate the persistence diagrams of each ego network using the sublevel and the superlevel “
Vertex function based filtration (VFB)” filtration and obtain the distance matrix between each vertex computing the Wasserstein distance between their persistence diagrams. Finally, they cluster the network using the
k-means clustering algorithm.
Beside previous weighted networks, brain networks are considered as sparse weighted networks and persistent homology is also used to analyze them (
Chung et al. 2015). They obtain the topological structure of a graph induced by sparse correlation. They first transform MRI and DTI data to weighted networks where they employ the sparse Pearson correlation to obtain the edge weights. They generate the 0-dimensional Betti plots for the brain networks using the Vietoris-Rips filtration (
Vietoris-Rips filtration (VR) section). They also generate Betti plots using sparse covariance. They show that the sparse correlation method gets a huge group separation between normal and stress-exposed children visually. This method is also less computationally expensive than the sparse covariance method.
In (
Knyazeva et al. 2018), the authors study dynamical connectome state analysis on brain networks using three different methods:
k-means clustering, modularity based clustering and topological feature based clustering. They consider brain networks as weighted networks. In topological feature based clustering, they use the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration. They first split the correlation matrix to the two matrices with positive and negative correlations. Then, they create
Vietoris-Rips filtration (VR) section filtrations for both matrices. In their clustering, different type of connections describes different processes in the brain, so they compute persistent homology with annotated intervals collection. After getting the intervals, they compute different statistics for each homology group and for types of interactions. Then, finally, they perform hierarchical clustering based on these topological features. They show that topological feature based clustering is more informative than the other two clustering methods.
In addition to this, in (
Chowdhury and Mémoli 2016;
2018), the authors classify the brain (hippocampal) networks using persistence diagrams. They consider five different environments with 4 holes, 3 holes, 2 holes, 1 hole and no hole and for each environment, 20 simulated brain networks are created. Persistence diagrams of these 100 networks are computed with the Dowker filtration (
Dowker sink and source filtration (DSS) section). They use the bottleneck distance between the 1-dimensional diagrams of networks to compare them. Finally, they classify the networks using the single linkage dendrogram algorithm and show that Dowker filtration is successful in capturing the differences between the five classes of networks. The authors also work on the same problem and dataset using the zigzag simplicial filtration (
Zigzag simplicial filtration (ZSF) section) in
Chowdhury et al. (2018). They create 1-dimensional zigzag persistent diagrams to perform persistent homology computations on dynamic simplicial complexes resulted from these brain networks.
In
Yoo et al. (2016), the authors show that persistent homology, or more precisely persistence vineyard, is a robust approach to estimate functional connectivity in the resting and gaming stages of the brain networks. They conduct an experiment with 26 male college students aged 19-29 years old from two universities located in Seoul, Republic of Korea. They undergo all the 26 healthy subjects resting and gaming experiments. Each stage was recorded for five minutes separately. They segment their data using 30s window lengths and 2s step size. For each window, they compute the persistence diagram using Pearson correlation between brain channels employing the weighted simplex filtration (
Weighted simplex filtration (WS) section). Then, they compute the 0-dimensional persistence vineyard to analyze the dynamic brain connectivity. In a brief, a persistent vineyard is a
p dimension persistent diagram with a time dimension added, tracking the birth and death of
p dimension diagrams in a time-varying topological space (
Edelsbrunner and Harer 2010). Their results show that persistence vineyard is successful to determine the temporarily dynamic properties of the brain in a robust and threshold-free way. They also show that persistent vineyard is more effective than the principal component analysis (PCA) and standard graph theoretical methods.
Petri et al. (2014) compares resting state functional brain activity in 15 healthy volunteers after intravenous infusion of placebo and psilocybin using persistent homology and other statistical methods-density function. First, the raw data from fMRI (functional magnetic resonance imaging) dataset is transformed into a functional network. They create the 1-dimensional persistence diagram using the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration. Later, they define two different homological scaffolds depending on how frequently edges are part of the generators of the persistent homology groups and how persistent are the generators to which they belong to. The results show that the homological structure of the brain’s functional patterns undergoes a dramatic change post-psilocybin, characterized by the appearance of many transient structures of low stability and of a small number of persistent ones that are not observed in the case of placebo.
In
Horak et al. (2009), the authors first study random graphs using the clique complex (
Clique complex filtration (CCL) section) filtration. Using different probabilities, they generate random networks and compute their barcodes. They show that the results on these barcodes are in agreement with the theoretical studies on these complexes. As another application, they study an email network. They create barcodes and show that higher dimensional barcodes, which do not exist for random networks, correspond to more dense communications among certain groups. They also apply their methods on scale-free networks with a modular structure. They use three different parameters to generate three types of scale networks: Clustered modular networks, clustered non-modular networks and non-clustered modular networks. They show that both clustered modular and clustered non-modular networks have more bars in their 3-dimensional and 4-dimensional barcodes than the non-clustered modular network.
Moreover, persistent homology is used for metric graph comparison. In (
Dey et al. 2015), the authors first introduce the functional metric graph filtration (
Functional metric graph filtration (FMG) section) on metric graphs. Then, they define the persistence distortion distance between two finite metric graphs using the persistence diagrams from FMG filtration. In their experiment, they show the stability of the proposed distance measure on the Athen’s road network as a metric graph and generate its noisy sample using a noise level
ε. The results show that the persistence distortion distance between the original graph and its noisy sample grows roughly proportionally to
ε. They also use proposed persistence distortion distance to compare surface meshes of different geometric models. Models from the same group have very smaller persistence distortion distances among them than those between the dissimilar group, which shows the that proposed distance is able to differentiate surface models.
In (
Hajij et al. 2018), persistent homology is employed to quantify structural changes in time-varying (dynamic) graphs. Their objective is to transform each instance of the time-varying graph into a metric space, extract topological features using persistent homology, and compare those features over time by means of bottleneck or Wasserstein distance between their corresponding persistence diagrams. Finally, several case studies on real-world networks, such as high school communication network, show how this method can find cyclic patterns, deviations from those patterns, and one-time events in time-varying graphs. In particular, 0- and 1-dimensional PH are utilized to detect the components and tunnels respectively. Each graph constitutes a distinct metric space for which the Vietoris-Rips (
Vietoris-Rips filtration (VR) section) filtration is implemented to compute the corresponding Betti numbers.
High order networks are weighted complete hypergraphs collecting relationships between elements of tuples. Computing distance between high order network is difficult when the number of nodes is large. In
Huang and Ribeiro (2017), the authors use persistent homology to derive distance approximations of networks. They compute the bottleneck distance between persistence diagrams of networks to evaluate the differences between networks. They first define a relationship function between a set of nodes to represent a measure of similarity or dissimilarity for members of the group. They use this function to assign weight on each simplex. Using these weights and the weighted simplex filtration (
Weighted simplex filtration (WS) section), they generate 0-, 1- and 2-dimensional persistence diagrams. They show that they can lower bound distance between two higher order networks, which is in general computationally expensive, with a computationally less expensive distance between their persistence diagrams. They apply their method to the coauthorship networks. They first create the networks using 5 journals from the mathematics community and 6 journals from the engineering community. They use the lower bounds to classify the networks, distinguish the collaboration patterns of engineering and mathematics community and also discriminate engineering communities with different research interests.
To answer the question of whether the existing anonymization mechanisms for preserving privacy truly keep the graph utility, Gao et al. (
Gao and Li 2018) employ persistent homology to analyze and evaluate four anonymization mechanisms. They study online social networks (OSN). They define the distance between two nodes as the number of hopes on the shortest path between these nodes and create 0-,1- and 2-dimensional persistence barcodes using this distance in the power filtration (
Power filtration (POW) section). They analyze the original and anonymized OSNs using the barcodes. The results show that original OSN graphs have stable structures. Furthermore, the 0-dimensional barcodes they obtain show that most anonymized OSNs are more closely connected than the original graph. All anonymized graphs are not as stable as the original graph, because they have more 2-dimensional holes or larger holes. They also compare their results with traditional graph metrics.
In
Kim and Memoli (2018), the authors study flocking/swarming behaviors in animals. They first create dynamic graphs and simplicial complexes using the Vietoris-Rips complex for a fixed scale parameter. Then, they construct the zigzag simplicial filtration (
Zigzag simplicial filtration (ZSF) section) and obtain 0-dimensional zigzag persistent diagrams to classify the four different type of flocking behavior in animals. Finally, using the bottleneck distance, the single linkage hierarchical clustering, and MDS, they distinguish the 4 behaviors very well.