
Open Access 05-11-2024 | Research

Inductive Subgraph Embedding for Link Prediction

Authors: Jin Si, Chenxuan Xie, Jiajun Zhou, Shanqing Yu, Lina Chen, Qi Xuan, Chunyu Miao

Published in: Mobile Networks and Applications


Abstract

Link prediction, which aims to infer missing edges or predict future edges based on currently observed graph connections, has emerged as a powerful technique for diverse applications such as recommendation, relation completion, etc. While there is rich literature on link prediction based on node representation learning, direct link embedding is relatively less studied and less understood. One common practice in previous work characterizes a link by manipulating the embeddings of its incident node pair, which is not capable of capturing effective link features. Moreover, common link prediction methods such as random walks and graph auto-encoders usually rely on full-graph training, suffering from poor scalability and high resource consumption on large-scale graphs. In this paper, we propose Inductive Subgraph Embedding for Link Prediction (SE4LP), an end-to-end scalable representation learning framework for link prediction, which utilizes the strong correlation between central links and their neighborhood subgraphs to characterize links. We sample "link-centric induced subgraphs" as input, with subgraph-level contrastive discrimination as a pretext task, to learn intrinsic and structural link features via subgraph classification. Extensive experiments on five datasets demonstrate that SE4LP has significant superiority in link prediction in terms of performance and scalability when compared with state-of-the-art methods. Moreover, further analysis demonstrates that introducing self-supervision in link prediction can significantly reduce the dependence on training data and improve the generalization and scalability of the model.
Notes
Jin Si and Chenxuan Xie contributed equally to this work.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Thanks to the widespread application of graphs in many real scenarios, Graph Representation Learning (GRL) has attracted considerable attention recently. The main goal of GRL is to convert discrete graph structure into a low-dimensional space, in which the vital structural information and properties are maximally preserved as continuous vector representations, known as embeddings. Different types of embeddings can be task-driven and facilitate different downstream applications. For instance, node embeddings can potentially benefit a wide variety of node-level analytics tasks such as node classification [1, 2] and node clustering [3, 4], while whole-graph embeddings are instrumental in graph-level tasks like graph classification [5–7]. So far, GRL has spurred advancements in several domains, including social networks [8–10], biochemical analysis [11], knowledge graphs [12], anomaly detection [13], etc.
Links are important components of a graph, and there exist plenty of link-driven downstream applications such as social recommendation [14–16] and knowledge graph reasoning [17, 18], which heavily depend on the information flow in graphs. While there is rich literature on node and whole-graph representation learning, GRL for links is relatively less studied and less understood. One common practice in previous works [19, 20] is two-stage: a link representation is obtained by aggregating the representations of its incident nodes generated in a previous phase. For instance, Node2Vec [19] extends random walks to node pairs and uses a bootstrapping approach over the individual node embeddings to obtain link representations. Graph Auto-Encoder (GAE) [20, 21] first computes node representations through a graph neural network (GNN), and scores the existence of links between pairwise nodes by an inner product operation between node latent representations. We refer to such a two-stage link embedding mechanism as the "node2link" pattern, which suffers from several drawbacks. First, the indirect generation of link representations ignores the fact that different types of objects (e.g., nodes, links) do not share the same embedding space. Node embedding methods mainly focus on nodes and their properties (e.g., node proximity), while link embedding methods should pay more attention to links (or pairwise nodes) and explicitly capture the structural interactions between nodes. Thus, the "node2link" pattern cannot preserve the complete properties of links, resulting in a failure to achieve the best performance in downstream edge analytics tasks such as link prediction. Second, commonly used graph representation learning methods usually suffer from poor scalability and high resource consumption on large-scale graphs, resulting in inefficient generation of link representations. For instance, random-walk-based embedding techniques such as DeepWalk [8] and Node2Vec [19] rely on a large number of CPU cores and much time to generate a sufficient number of walks. Matrix-factorization-based embedding techniques such as NetMF [22] and GraRep [23] require constructing an enormous objective matrix (usually much denser than the adjacency matrix) on which matrix factorization is performed. Deep GRL methods such as the graph convolutional network (GCN) and graph attention network (GAT) generally accept entire graphs as input and perform full-graph training to learn graph representations, which restricts their scalability on large-scale graphs.
In view of (1) the limited expressiveness of link representations generated via the "node2link" pattern and (2) the poor scalability of general link prediction methods based on graph representation learning, an immediate question arises: how can we design a universal and scalable link representation learning framework for effective link prediction?
As is known, links (interactions, relations) inherently indicate correlations between nodes, and leveraging primary interaction patterns between node pairs is crucial for developing expressive link representations. When considering two nodes connected by a link in a graph, message propagation between them can occur through both direct (i.e., the edge) and indirect interactions (e.g., common neighbors, shortest paths), all encapsulated within a subgraph surrounding this link. In this work, we introduce the "subgraph2link" pattern following two facts: (1) the subgraph consisting of the target link and its local neighborhood is informative and plays a critical role in providing structural context for link representation learning; (2) the subgraph serves as the receptive field of the central link and is significantly smaller than the entire graph, thereby presenting a viable strategy to eliminate the necessity for full-graph training.
Inspired by the existing research in structural link learning [24] and scalable graph learning [25, 26], we propose Inductive Subgraph Embedding for Link Prediction (SE4LP). In SE4LP framework, we first sample subgraphs around target links, then design a self-supervised subgraph contrast task to better characterize the similarities and differences of interaction patterns around different links, and finally predict the existence of central links via subgraph classification. Our SE4LP jointly trains subgraph contrast and subgraph classification tasks in an end-to-end manner, further achieving high-performance link prediction. The major contributions of our work are summarized as follows:
  • SE4LP: We propose an end-to-end scalable link representation learning framework via subgraph contrast, which utilizes informative local subgraphs surrounding links to learn highly expressive link representations.
  • Scalability: We take the receptive field subgraphs extracted from a batch of links as the input during each training step, so as to learn link representation efficiently and make our SE4LP scale well on large-scale graphs.
  • Effectiveness: Extensive experiments demonstrate the superiority of our framework in terms of performance and scalability on link prediction. Furthermore, introducing self-supervised learning to link prediction helps to learn effective link representations with fewer training samples.

2.1 Preliminaries

Let \(\mathcal {G} =(\mathcal {V} ,\mathcal {E})\) be an undirected and unweighted graph, where \(\mathcal {V} =\{v_i \mid i=1,2,\cdots ,N\}\) and \(\mathcal {E} \subseteq \mathcal {V} \times \mathcal {V} \) represent the sets of nodes and edges respectively. We regard an edge in graph \((v_i,v_j)\in \mathcal {E}\) as a positive link while those nonexistent edges \((v_i,v_j)\notin \mathcal {E}\) are treated as negative links. Generally, graph structural data consist of two features: a node attribute matrix \(\varvec{X} \in \mathbb {R}^{N\times F}\) and an adjacency matrix of graph topology \(\varvec{A} \in \mathbb {R}^{N\times N}\), where \(\varvec{x}_i\in \mathbb {R}^F\) is the F-dimensional feature vector of node \(v_i\), \(\varvec{A} _{ij} =1\) if \((v_i,v_j) \in \mathcal {E} \) and 0 otherwise. We use a diagonal degree matrix \(\varvec{D} \in \mathbb {R}^{N\times N}\) to define the degree distribution of \(\varvec{A} \), and \(\varvec{D}_{ii} = \sum _{j=0}^{N-1} \varvec{A}_{ij}\).
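As a concrete illustration of this notation, the following minimal NumPy sketch (the toy graph and variable names are our own, not from the paper) builds the adjacency matrix \(\varvec{A}\), a node attribute matrix \(\varvec{X}\), and the diagonal degree matrix \(\varvec{D}\).

```python
import numpy as np

# Toy undirected, unweighted graph with N = 4 nodes and edges (0,1), (1,2), (2,3)
N, F = 4, 3
edges = [(0, 1), (1, 2), (2, 3)]

A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0          # A_ij = 1 iff (v_i, v_j) is in E, 0 otherwise

X = np.random.rand(N, F)             # node attribute matrix, one F-dimensional row per node
D = np.diag(A.sum(axis=1))           # diagonal degree matrix, D_ii = sum_j A_ij
```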
The problem we consider in this work is link prediction via subgraph contrastive representation learning. Given a graph \(\mathcal {G} =(\mathcal {V}, \mathcal {E}, \varvec{X})\), our goal is to learn an encoder \(\varvec{h} = f_\theta (\varvec{X}, \varvec{A})\) which maps a subgraph centered on target link to a vector as the link representation, following the “subgraph2link” pattern.
The link prediction task aims to infer the behavior of the network link formation process by predicting missing or future relationships based on currently observed connections [27, 28]. In such a task scenario, graph \(\mathcal {G}\) is processed into a partially observed graph, \(\textbf{A} \in \{1,0, \textsf{UNK}\}^{N\times N}\), where 1 denotes a known positive link, 0 denotes a known negative link, and \(\textsf{UNK}\) denotes a link of unknown status (missing or unobserved) waiting to be predicted. We briefly summarize the related work on link prediction based on graph representation learning below.

2.2.1 Random walks

DeepWalk [8] and Node2Vec [19] transform discrete graphs into node sequences through a stochastic process of sampling nodes along the connections in graphs, and use language models to learn node representations. Since random walks are naturally based on the connectivity structure between nodes in graphs, they can be extended to learning edge features using a bootstrapping approach over the feature representations of the individual nodes [19].

2.2.2 Graph auto-encoders (GAEs)

GAEs [20] train encoders that impose the topological proximity of nodes in the graph structure on the latent space by reconstructing adjacency information. They use a graph convolutional network (GCN) as the encoder to generate latent representations of nodes and score the existence of links between pairwise nodes with an inner product decoder. Adversarially Regularized Graph Autoencoder (ARGA) and its variant (ARGVA) [21] enforce the latent representation generated by the graph encoder to match a prior distribution via an adversarial training scheme for robust node embedding. Both random walks and graph auto-encoders over-emphasize proximity information at the expense of structural information [29].

2.2.3 Supervised learning

SEAL [24] is a state-of-the-art link prediction framework which learns heuristics from local enclosing subgraphs using a graph neural network. This framework introduces a \(\gamma \)-decaying heuristic theory, which shows that local enclosing subgraphs preserve rich information for link existence prediction. It contains three stages: 1) extracting an h-hop local subgraph, either for a training or a testing link; 2) constructing the node information matrix for each link via a Node Labeling strategy; and 3) learning with a graph neural network.

2.2.4 Contrastive methods

Existing contrastive learning methods [25, 29, 30] are built on the idea of mutual information maximization, which learns by predicting the agreement between two augmented instances. Deep Graph Infomax (DGI) [29] and Subg-Con [25] learn node representations by contrasting node and graph encodings, achieving state-of-the-art results on node classification benchmarks. GraphCL [30] learns graph-level representations by contrasting two different augmented views of original graphs, outperforming previous models on unsupervised graph classification.

3 Methodology

In this section, we give the details of the proposed framework SE4LP, as schematically depicted in Fig. 1. Our framework is mainly composed of the following components: (1) a subgraph extractor, which captures the subgraphs centered on target links from the graph topology; (2) a subgraph augmentor, which generates a series of variant graph views using various transformations on the attributes and topology of subgraphs; (3) a GNN-based encoder, which learns graph-level representations for the generated graph views; (4) a subgraph-level contrast module, which maximizes the consistency between two augmented views of the same subgraph; and (5) a subgraph predictor, which maps subgraph representations to labels reflecting link existence. Next, we describe the details of each component.

3.1 Subgraph extraction and sampling

Generic GNNs like GCN and GAT generally accept the whole graph as input and rely on full-graph training, which restricts their scalability on large-scale graphs for node representation learning. Existing improved methods use layer sampling [26, 31] or subgraph sampling [32, 33] to alleviate these limitations. From the "neighborhood aggregation" mechanism of GNNs, we know that to characterize and classify a target node, a GNN actually aggregates the node features in the subgraph centered on it. This node-centric induced subgraph is the receptive field of the center node, as shown in Fig. 2, and is generally much smaller than the whole graph. Thus, extracting such subgraphs to form a batch as the input of GNNs can significantly reduce the resource consumption during model training. In this work, we consider subgraph extraction and sampling in the link prediction scenario, similar to the approach adopted in [24].

3.1.1 Subgraph extraction

We name the subgraph centered on a link the link-centric induced subgraph (abbreviated "LSg" in this paper) and give its definition below.
Definition 1
(Link-centric Induced Subgraph, LSg). For a graph \(\mathcal {G} =(\mathcal {V} ,\mathcal {E})\), given a target link \(l = (v_i, v_j)\) where \(v_i, v_j \in \mathcal {V} \), the h-hop link-centric induced subgraph for link l is the subgraph \({g}_{l}^h\) induced from \(\mathcal {G}\) by the set of nodes \(\cup _{v\in (v_i,v_j)} \{v_k \mid d(v_k, v) \le h\}\).
The LSg \({g}_{l}^h\) of link \(l = (v_i, v_j)\) contains all the h-hop neighbors of \(v_i\) and \(v_j\). Figure 2 illustrates the 2-hop LSgs for links \((v_a,v_b)\) and \((v_c,v_d)\). The informative subgraph patterns can effectively reflect the existence of links between center node pairs, which helps to characterize the link structure.
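A minimal sketch of h-hop LSg extraction with NetworkX is given below; the function name and the breadth-first distance computation are our own illustration of Definition 1, not the authors' implementation.

```python
import networkx as nx

def extract_lsg(G: nx.Graph, link: tuple, h: int) -> nx.Graph:
    """Return the h-hop link-centric induced subgraph (LSg) for link = (v_i, v_j)."""
    v_i, v_j = link
    nodes = set()
    for v in (v_i, v_j):
        # all nodes within distance h of v, including v itself
        nodes.update(nx.single_source_shortest_path_length(G, v, cutoff=h).keys())
    return G.subgraph(nodes).copy()

# Example: 2-hop LSg around the node pair (0, 3) in a small path graph
G = nx.path_graph(6)
g_link = extract_lsg(G, (0, 3), h=2)
print(sorted(g_link.nodes()))   # nodes within 2 hops of node 0 or node 3
```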

3.1.2 Subgraph sampling

Furthermore, we know that an h-layer GNN expands the receptive field by one hop during each iteration, and after h iterations the features of nodes within h hops will be aggregated. For a deeper model, the size (i.e., the number of nodes) of the receptive-field subgraph grows exponentially with the number of layers, which results in the neighborhood explosion problem. In this work, we use subgraph sampling to control the size of the LSg, even though it can only alleviate this problem to some extent. Specifically, for an h-hop LSg of link \(l = (v_i,v_j)\), we select the top-\(K_1\) most important 1-hop neighbors (ranked by node degree) for \(v_i\) and \(v_j\) respectively, then select the top-\(K_2\) most important 1-hop neighbors for each node selected at hop 1, and so on recursively for the subsequent hops. The recursive sampling can be formulated as follows:
$$\begin{aligned} \mathcal {V}_{t} = \bigcup _{v\in \mathcal {V}_{t-1}} \ \textsf{topK}(\mathcal {N}_v, K_{t}, \textbf{D}[\mathcal {N}_v,\mathcal {N}_v]), \end{aligned}$$
(1)
where \(\mathcal {V}_{t}\) is the set of nodes sampled at hop t, \(\mathcal {N}_v\) is the 1-hop neighbor set of node v, \(K_{t}\) is the sampling number at hop t, \(\textbf{D} [\mathcal {N}_v,\mathcal {N}_v]\) is the degree sequence of nodes in \(\mathcal {N}_v\), and \(\textsf{topK}\) is the function that returns the \(K_t\) nodes with the largest degree values. \(\mathcal {V}_0\) is initialized as \(\{v_i,v_j\}\). After h iterations, the set of nodes sampled from the h-hop LSg is \(\mathcal {V}_l = \cup _{t=0}^{h} \mathcal {V}_t\). The sampled LSg (abbreviated "sLSg" in this paper) \(g_l = (\textbf{A}_l, \textbf{X}_l)\) is induced by \(\mathcal {V}_l\), and its adjacency matrix \(\textbf{A}_l\) and feature matrix \(\textbf{X}_l\) are given respectively by
$$\begin{aligned} \textbf{X}_l = \textbf{X}[\mathcal {V}_l, :] \ , \quad \textbf{A}_l = \textbf{A}[\mathcal {V}_l, \mathcal {V}_l] \ . \end{aligned}$$
(2)
Note that we remove the edge between \(v_i\) and \(v_j\) if \(l=(v_i,v_j)\) is a positive link. Finally, we anonymize the sLSg \(g_{l}\) by relabeling its nodes as \(\{1,2, \cdots , |\mathcal {V}_l|\}\), in arbitrary order. For a set of links \(L_{emb}\) to be embedded, we obtain the sLSg for each link in \(L_{emb}\), and all the sLSgs form a set: \(\mathcal {D} = \{g_{i} \mid i=1,2,\cdots , |L_{emb}|\}\).
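A simplified NumPy sketch of this sampling procedure (Eqs. 1–2) follows; the function name, the per-hop budget list K, and the dense adjacency-matrix representation are our own assumptions, not the authors' code.

```python
import numpy as np

def sample_lsg(A: np.ndarray, link: tuple, K: list) -> np.ndarray:
    """Sample an sLSg around link = (v_i, v_j): keep the top-K_t highest-degree neighbors per hop (Eq. 1)."""
    deg = A.sum(axis=1)
    sampled = set(link)
    frontier = set(link)                                  # V_0 = {v_i, v_j}
    for K_t in K:                                         # one iteration per hop t = 1 .. h
        next_nodes = set()
        for v in frontier:
            nbrs = np.flatnonzero(A[v])                   # 1-hop neighborhood N_v
            top = nbrs[np.argsort(-deg[nbrs])[:K_t]]      # topK(N_v, K_t, degree)
            next_nodes.update(top.tolist())
        frontier = next_nodes - sampled
        sampled |= next_nodes
    nodes = sorted(sampled)                               # relabel nodes as 0 .. |V_l|-1 (anonymization)
    A_l = A[np.ix_(nodes, nodes)].copy()                  # Eq. 2: A_l = A[V_l, V_l]; X_l = X[V_l, :] analogously
    i, j = nodes.index(link[0]), nodes.index(link[1])
    A_l[i, j] = A_l[j, i] = 0.0                           # drop the target edge if it is a positive link
    return A_l
```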

3.2 Subgraph contrastive learning for link representation

To learn highly expressive link representations, SE4LP utilizes subgraph contrast as a pretext task to jointly train a GNN encoder \(f_\theta \). During subgraph contrast, for each sLSg \(g_{i}\), two correlated views \(\hat{g}_i^1\) and \(\hat{g}_i^2\) are generated by applying two augmentation operators \(t_1\) and \(t_2\), where \(\hat{g}_i^1 = t_1(g_{i})\) and \(\hat{g}_i^2 = t_2(g_{i})\). The correlated augmented views are fed into a GNN encoder \(f_\theta \) with a pooling layer, producing whole-subgraph representations \(\varvec{h}_i^1\) and \(\varvec{h}_i^2\), which are then mapped into a contrast space via a projection head \(f_\phi \), yielding \(\varvec{z}_i^1\) and \(\varvec{z}_i^2\). Note that \(\theta \) and \(\phi \) are the parameters of the graph encoder and projection head, respectively. The representation of an sLSg, \(\varvec{h}\), is treated as the representation of its central link, following the "subgraph2link" pattern. Finally, the goal of subgraph-level contrast is to maximize the consistency between the two correlated augmented views of each subgraph in the contrast space:
$$\begin{aligned} \mathcal {L}_\textit{self} = \frac{1}{n} \sum _{i=1}^n \mathcal {L}_i , \end{aligned}$$
(3)
where n is the number of subgraphs in a batch (i.e., batch size). The loss for each subgraph \(\mathcal {L}_i\) can be computed as:
$$\begin{aligned} \mathcal {L}_i = -\log \frac{e^{\textrm{s}\left( \varvec{z}_{i}^{1}, \varvec{z}_{i}^{2}\right) / \tau }}{\sum _{j=1, j \ne i}^{n} e^{\textrm{s}\left( \varvec{z}_{i}^{1}, \varvec{z}_{j}^{2}\right) / \tau }}, \end{aligned}$$
(4)
where \(\textrm{s}(\cdot , \cdot )\) is the cosine similarity function with \(\textrm{s}(\varvec{z}_{i}^{1}, \varvec{z}_{i}^{2}) = {\varvec{z}_{i}^{1}}^\top \cdot \varvec{z}_{i}^{2} / \Vert \varvec{z}_{i}^{1}\Vert \Vert \varvec{z}_{i}^{2}\Vert \), and \(\tau \) is the temperature parameter. The two correlated views \(\varvec{z}_i^1\) and \(\varvec{z}_i^2\) of sLSg \(g_{i}\) are treated as a positive pair, while the remaining view pairs in the batch are treated as negative pairs. The objective aims to maximize the consistency of positive pairs as opposed to negative ones. Note that we use an asymmetric and simplified loss compared to the SimCLR loss [34], i.e., we generate negative pairs by only treating view 1 (\(\varvec{z}_i^1\)) as the anchor and contrasting with view 2 (\(\varvec{z}_j^2\)) of all other subgraphs, as shown in Eq. 4.
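The asymmetric contrastive objective of Eqs. 3–4 can be sketched in PyTorch as follows; the tensor names and explicit normalization step are our own, so this is an illustrative sketch rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def subgraph_contrast_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """Asymmetric InfoNCE over a batch: z1, z2 are [n, d] projections of the two views (Eqs. 3-4)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)   # so that z1 @ z2.T gives cosine similarities
    sim = z1 @ z2.t() / tau                                    # s(z_i^1, z_j^2) / tau for all pairs (i, j)
    pos = sim.diag()                                           # positive pairs: two views of the same subgraph
    neg = torch.exp(sim).sum(dim=1) - torch.exp(pos)           # denominator of Eq. 4 runs over j != i only
    return (torch.log(neg) - pos).mean()                       # mean of L_i over the batch (Eq. 3)
```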

3.3 Graph augmentation

Contrastive learning relies heavily on well-designed data augmentation strategies for view generation. So far, widely used graph augmentation techniques concentrate on structure-level and attribute-level augmentation [30, 3538]. In this paper, we use two existing augmentation methods, Attribute Masking and Edge Removing, and design two novel augmentation techniques, Attribute Similarity and KNN Graph.

3.3.1 Attribute masking

This augmentor was proposed in [39]. Given the attribute matrix \(\textbf{X}\), we randomly set a fraction of dimensions to zero in the node attributes. Formally, we first sample a mask vector \({\textbf {m}}^X \in \{0,1\}^F\), where each dimension is independently drawn from a probability distribution \({\textbf {m}}^X_i \sim \textsf{Bernoulli}(1-p_X)\), with \(p_X\) being the probability of masking an arbitrary dimension of the node attributes. The resulting attribute matrix \(\hat{\textbf{X}}\) can be computed as
$$\begin{aligned} \hat{\textbf{X}} = t_{AM}(\textbf{X} ) = [\textbf{x}_1\circ {\textbf {m}}^X; \textbf{x}_2\circ {\textbf {m}}^X;\cdots ; \textbf{x}_N\circ {\textbf {m}}^X]^\top , \end{aligned}$$
(5)
where \(\circ \) is the Hadamard product and \([\cdot ;\cdot ]\) is the concatenation operation.
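A PyTorch sketch of this augmentor (Eq. 5) is given below; the function name is our own, and the mask is shared across all nodes as described.

```python
import torch

def attribute_masking(X: torch.Tensor, p_X: float = 0.2) -> torch.Tensor:
    """Zero out a random fraction p_X of feature dimensions, shared across all nodes (Eq. 5)."""
    m = torch.bernoulli(torch.full((X.size(1),), 1.0 - p_X))   # m^X_i ~ Bernoulli(1 - p_X)
    return X * m                                               # row-wise Hadamard product via broadcasting
```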

3.3.2 Edge removing

This augmentor was used in [30, 40] and belongs to structure-level augmentation. Given the adjacency matrix \(\textbf{A}\), we randomly remove a portion of edges from the graph. Formally, we first sample a mask matrix \(\textbf{M}^A \in \{0,1\}^{N\times N}\), whose entries are independently drawn from a probability distribution \({\textbf {M}}^A_{ij} \sim \textsf{Bernoulli}(1-p_A)\), where \(p_A\) is the probability of removing an arbitrary edge from the graph. The resulting adjacency matrix \(\hat{\textbf{A}}\) can be computed as
$$\begin{aligned} \hat{\textbf{A}} = t_{ER}(\textbf{A}) = \textbf{A} \circ {\textbf {M}}^A. \end{aligned}$$
(6)
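A corresponding PyTorch sketch of Eq. 6 follows; symmetrizing the mask is our own addition to keep the perturbed graph undirected, and the function name is ours.

```python
import torch

def edge_removing(A: torch.Tensor, p_A: float = 0.2) -> torch.Tensor:
    """Randomly drop edges with an entry-wise Bernoulli mask on the adjacency matrix (Eq. 6)."""
    M = torch.bernoulli(torch.full_like(A, 1.0 - p_A))   # M^A_ij ~ Bernoulli(1 - p_A)
    M = torch.triu(M, diagonal=1)                        # sample the upper triangle only ...
    M = M + M.t()                                        # ... and mirror it so (i, j) and (j, i) are masked together
    return A * M                                         # Hadamard product of A and M^A
```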

3.3.3 Attribute similarity

This augmentor builds new node features based on node similarity. Specifically, the new node feature matrix is the node similarity matrix \(\textbf{S}\in \mathbb {R}^{N\times N}\), in which each entry \(\textbf{S}_{ij}\) represents the similarity between nodes \(v_i\) and \(v_j\) and is calculated as \(\textbf{S}_{ij} = \textbf{x}_i^\top \textbf{x}_j\). Note that the similarity matrix generated by the dot product between \(\textbf{X}\) and its transpose is deterministic, so we further apply Attribute Masking on the similarity matrix to generate diverse views. The resulting attribute matrix \(\hat{\textbf{X}}\) is computed as
$$\begin{aligned} \hat{\textbf{X}} = t_{AM}(\textbf{S} ) = t_{AM}(\textbf{X}\textbf{X}^\top ). \end{aligned}$$
(7)
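This augmentor (Eq. 7) can be sketched by re-using the attribute_masking function from the sketch above (again, the naming is our own).

```python
import torch

def attribute_similarity(X: torch.Tensor, p_X: float = 0.2) -> torch.Tensor:
    """Replace node features by a masked node-similarity matrix S = X X^T (Eq. 7)."""
    S = X @ X.t()                          # S_ij = x_i^T x_j
    return attribute_masking(S, p_X)       # apply Attribute Masking on S for view diversity
```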

3.3.4 KNN graph

This augmentor builds a new adjacency matrix based on feature similarity. Regarding an arbitrary node \(v_i\) in the graph as an independent sample with features \(\textbf{x}_i\), we first calculate the feature similarity matrix by dot product, \(\textbf{S} =\textbf{X} \textbf{X}^\top \). Then for each sample, we find its top-\(K_T\) most similar samples as neighbors and add edges connecting it to these neighbors, formulated as \(\hat{\textbf{a} }_i = \textsf{bT}(\textbf{S}_i)\), where \(\textsf{bT}\) is the function that binarizes a vector by setting its largest \(K_T\) elements to 1 and the other elements to 0. The resulting adjacency matrix \(\hat{\textbf{A}}\) can be computed as
$$\begin{aligned} \hat{\textbf{A}} =t_{KG}(\textbf{S}) =[\textsf{bT}(\textbf{S}_1); \cdots ; \textsf{bT}(\textbf{S}_N)]^\top . \end{aligned}$$
(8)
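A PyTorch sketch of the KNN Graph augmentor (Eq. 8); the row-wise top-K binarization below is our own rendering of the bT function.

```python
import torch

def knn_graph(X: torch.Tensor, K_T: int) -> torch.Tensor:
    """Build a new adjacency from feature similarity: connect each node to its K_T most similar nodes (Eq. 8)."""
    S = X @ X.t()                              # pairwise dot-product similarity
    idx = S.topk(K_T, dim=1).indices           # indices of the K_T largest entries in each row S_i
    A_hat = torch.zeros_like(S)
    A_hat.scatter_(1, idx, 1.0)                # bT(S_i): set the selected positions to 1
    return A_hat
```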

3.4 Model training

We achieve link prediction by a subgraph label predictor \(f_\psi \), which maps subgraph representations to labels reflecting link existence, yielding a classification loss:
$$\begin{aligned} \mathcal {L}_\textit{pred} = -\frac{1}{n}\sum _{i=1}^n y_i \cdot \log (f_\psi (\varvec{h}_i)), \end{aligned}$$
(9)
where \(\mathcal {L}_\textit{pred}\) is the cross-entropy loss. The subgraph contrast is treated as a pretext task, and the encoder in SE4LP is jointly trained on the pretext and subgraph classification tasks. The overall loss function combines the self-supervised and classification losses, as formulated below:
$$\begin{aligned} \mathcal {L} =\mathcal {L}_\textit{pred} + \lambda \cdot \mathcal {L}_\textit{self} \ , \end{aligned}$$
(10)
where \(\lambda \) controls the contribution of self-supervision term. The pseudocode of our SE4LP is presented in Algorithm 1.
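Complementing Algorithm 1, a condensed PyTorch sketch of one joint training step (Eqs. 9–10) is given below; the encoder, projection head and predictor are placeholder torch.nn modules of our own choosing, and classifying from the first view's representation is one of several reasonable design choices rather than a detail fixed by the paper.

```python
import torch
import torch.nn.functional as F

def train_step(encoder, proj_head, predictor, views1, views2, y, optimizer, lam=0.1, tau=0.2):
    """One SE4LP-style step: subgraph contrast (pretext) jointly with link-existence classification."""
    optimizer.zero_grad()
    h1, h2 = encoder(views1), encoder(views2)          # pooled subgraph-level representations of the two views
    z1, z2 = proj_head(h1), proj_head(h2)              # projections into the contrast space
    loss_self = subgraph_contrast_loss(z1, z2, tau)    # Eqs. 3-4 (sketched in Section 3.2)
    logits = predictor(h1)                             # subgraph label = existence of the central link
    loss_pred = F.cross_entropy(logits, y)             # Eq. 9
    loss = loss_pred + lam * loss_self                 # Eq. 10
    loss.backward()
    optimizer.step()
    return loss.item()
```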

4 Experiments

Dataset To assess how well SE4LP can learn highly expressive link representations while maintaining high scalability on large-scale graphs, we evaluate it on five publicly available real-world datasets: Cora, Citeseer, Pubmed, Facebook and Github, detailed as follows:
  • Citation benchmarks: Cora, Citeseer and Pubmed [41], where nodes represent papers with corresponding bag-of-words features, and edges represent citation relationships between papers.
  • Facebook: a page-page web graph of verified Facebook sites. Nodes represent official Facebook pages, and links are mutual likes between sites. Node features are extracted from the site descriptions created by the page owners to summarize the purpose of the site.
  • Github: a large social network where nodes are GitHub developers who have starred at least 10 repositories and edges are mutual follower relationships between them. Node features are extracted based on the location, repositories starred, employer and e-mail address.
The specifications of the datasets are given in Table 1. For the two large datasets (Facebook and Github), the initial features are not aligned. We create tagged documents from feature hashes and further process them into 128-dimensional initial features with the Doc2Vec algorithm. Both the Facebook and Github datasets are available online.\(^{1}\)
Table 1
Dataset properties

| Scale | Dataset | Type | Nodes | Edges | Features |
| --- | --- | --- | --- | --- | --- |
| Small-scale | Cora | Citation network | 2708 | 5429 | 1433 |
| Small-scale | Citeseer | Citation network | 3327 | 4732 | 3703 |
| Small-scale | Pubmed | Citation network | 19717 | 44338 | 500 |
| Large-scale | Facebook | Web network | 22470 | 171002 | 128\(*\) |
| Large-scale | Github | Social network | 37700 | 289003 | 128\(*\) |

\(*\) The original node features are processed into 128-dimensional initial features by Doc2Vec [42]
Comparison Methods We evaluate SE4LP on the link prediction task and compare it with three broad categories of methods: heuristics, unsupervised methods and supervised methods. For link prediction heuristics, we compare with four well-known node-similarity measures: Common Neighbors (CN), Salton Index, Adamic-Adar Index (AA) and Resource Allocation Index (RA) [43]. For unsupervised methods, we consider random walks, GAEs and contrastive methods. For random walks, we compare with DeepWalk [8] and Node2Vec [19]. For GAEs, we compare with Graph Auto-Encoder (GAE), Variational Graph Auto-Encoder (VGAE) [20], Adversarially Regularized Graph Autoencoder (ARGA) and its variant (ARGVA) [21]. For contrastive methods, we compare with Deep Graph Infomax (DGI) [29]. Note that the random walk methods, graph auto-encoder methods and the contrastive method achieve link prediction by first generating node representations and then calculating existence scores for links, following the "node2link" pattern. For supervised methods, we compare with the state-of-the-art link prediction framework SEAL [24], which also characterizes links using local subgraphs and achieves link prediction in an end-to-end manner.

4.1 Experimental setting

Data Preparation For the methods with the "subgraph2link" pattern, SEAL and SE4LP, we sample the same number of positive and negative links from each dataset, with proportions of {40%, 40%, 10%, 10%, 20%} for {Cora, Citeseer, Pubmed, Facebook, Github}. All links are split into training, validation and testing sets with a proportion of 8:1:1, and we further extract an sLSg for each link. For the heuristics (CN, Salton, AA and RA), random walks (DeepWalk and Node2Vec), GAEs (GAE, VGAE, ARGA and ARGVA) and the contrastive method (DGI), which rely on full-graph training, we randomly sample a certain number of positive links as well as an equal number of negative links as testing data, and the remaining partially observed graph is used for training. We repeat 10-fold cross validation 5 times and report the average Area Under Curve (AUC) and Average Precision (AP) measures as well as their standard deviations.
Parameter Configuration For SE4LP, we choose the number of hops from \(\{1,2\}\) and set both probabilities (\(p_A\), \(p_X\)) associated with data augmentation to 20%. We set the default GNN encoder of SE4LP to the graph isomorphism network (GIN) [44], a 3-layer graph convolutional network with a hidden dimension of 128, jumping connections, and PReLU activation functions. We obtain subgraph representations by max pooling. The batch size n, learning rate, temperature parameter \(\tau \) and trade-off coefficient \(\lambda \) are set to 512, 0.01, 0.2 and 0.1, respectively. We use early stopping with a patience of 20.
For random walks, we set the walk length to 30, the number of walks to 200, and the context size to 10. For Node2Vec, we set the return parameter p and in-out parameter q to 4 and 1, respectively. For all GAEs, we use a 2-layer GCN as the encoder. For GAE and VGAE, we construct encoders with a 128-dimensional hidden layer and a 128-dimensional embedding layer in all experiments. For ARGA and ARGVA, we construct encoders with a 256-dimensional hidden layer and a 128-dimensional embedding layer in all experiments, and all discriminators are built with 2 hidden layers (16 and 64 dimensions). The learning rates are set to {0.01, 0.01, 0.005, 0.005} for GAE, VGAE, ARGA and ARGVA, respectively. For DGI, we construct the encoder with a 512-dimensional hidden layer and a 512-dimensional embedding layer in all experiments. For other parameters, we use the default settings following [29]. For SEAL, we choose the number of hops from \(\{1,2\}\); for other parameters, we use the default settings following [24].
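For reference, the SE4LP hyperparameters above can be collected into a single configuration, e.g. as a plain Python dictionary (the key names are our own):

```python
se4lp_config = {
    "hops": 1,                      # number of hops h, chosen from {1, 2}
    "p_A": 0.2,                     # edge-removing probability
    "p_X": 0.2,                     # attribute-masking probability
    "encoder": "GIN",               # 3-layer GIN, hidden dim 128, jumping connections, PReLU
    "hidden_dim": 128,
    "pooling": "max",
    "batch_size": 512,
    "learning_rate": 0.01,
    "tau": 0.2,                     # contrast temperature
    "lambda": 0.1,                  # trade-off coefficient for the self-supervised loss
    "early_stopping_patience": 20,
}
```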
Table 2
Performance on link prediction task reported in Area Under Curve (AUC) and Average Precision (AP) measures

| Method | Cora AUC (%) | Cora AP (%) | Citeseer AUC (%) | Citeseer AP (%) | Facebook AUC (%) | Facebook AP (%) |
| --- | --- | --- | --- | --- | --- | --- |
| CN | 56.19±0.099 | 63.08±0.059 | 58.76±0.095 | 65.31±0.060 | 88.70±0.076 | 91.72±0.045 |
| Salton | 56.85±0.094 | 61.73±0.065 | 59.32±0.090 | 63.79±0.065 | 88.61±0.077 | 91.07±0.054 |
| AA | 57.33±0.087 | 59.36±0.081 | 58.97±0.089 | 60.90±0.083 | 87.24±0.093 | 91.18±0.060 |
| RA | 57.77±0.090 | 64.75±0.055 | 59.11±0.091 | 65.88±0.058 | 85.08±0.101 | 90.43±0.057 |
| \(\diamond \) DeepWalk | 88.14±0.055 | 87.87±0.045 | 85.67±0.052 | 86.11±0.044 | 87.65±0.012 | 85.41±0.011 |
| \(\diamond \) Node2Vec | 88.65±0.058 | 89.62±0.049 | 87.36±0.065 | 88.15±0.059 | 85.64±0.022 | 85.25±0.028 |
| \(\diamond \) GAE | 93.79±0.038 | 93.43±0.038 | 92.63±0.013 | 93.50±0.016 | OOM | OOM |
| \(\diamond \) VGAE | 94.30±0.006 | 94.60±0.082 | 93.78±0.046 | 94.55±0.051 | OOM | OOM |
| \(\diamond \) ARGA | 90.27±0.067 | 90.01±0.069 | 89.00±0.040 | 89.60±0.039 | OOM | OOM |
| \(\diamond \) ARGVA | 93.26±0.041 | 93.61±0.045 | 94.03±0.013 | 94.30±0.014 | OOM | OOM |
| \(\diamond \) DGI | 93.15±0.023 | 92.70±0.030 | 91.84±0.014 | 92.31±0.014 | OOM | OOM |
| \(\star \) SEAL(\(h=1\)) | 95.46±0.007 | 95.84±0.010 | 91.20±0.010 | 93.11±0.008 | 97.83±0.018 | 97.29±0.023 |
| \(\star \) SEAL(\(h=2\)) | \(\underline{95.88{\pm }0.007}\) | \(\underline{96.14{\pm }0.008}\) | 91.35±0.012 | 93.10±0.008 | 98.58±0.001 | 98.71±0.001 |
| \(\star \) SE4LP(\(h=1\)) | 96.05±0.007 | 96.28±0.010 | 94.74±0.007 | 95.24±0.008 | \(\underline{97.94{\pm }0.002}\) | \(\underline{97.64{\pm }0.001}\) |
| \(\star \) SE4LP(\(h=2\)) | 94.33±0.008 | 94.03±0.012 | \(\underline{93.38{\pm }0.006}\) | \(\underline{93.67{\pm }0.008}\) | 97.53±0.001 | 97.24±0.001 |

The best and second best results are marked in bold and underlined, respectively
Table 3
Performance on link prediction task reported in Area Under Curve (AUC) and Average Precision (AP) measures

| Method | Pubmed AUC (%) | Pubmed AP (%) | Github AUC (%) | Github AP (%) |
| --- | --- | --- | --- | --- |
| CN | 65.53±0.050 | 67.54±0.034 | 67.87±0.105 | 73.26±0.069 |
| Salton | 64.78±0.057 | 66.12±0.047 | 62.84±0.085 | 60.69±0.054 |
| AA | 65.26±0.049 | 67.49±0.035 | 67.69±0.108 | 75.23±0.085 |
| RA | 65.27±0.047 | 67.78±0.032 | 67.56±0.078 | 77.14±0.050 |
| \(\diamond \) DeepWalk | 90.88±0.021 | 87.48±0.026 | 81.25±0.013 | 80.25±0.012 |
| \(\diamond \) Node2Vec | 90.60±0.021 | 89.29±0.027 | 80.49±0.019 | 79.62±0.016 |
| \(\diamond \) GAE | 91.93±0.051 | 91.84±0.051 | OOM | OOM |
| \(\diamond \) VGAE | 89.36±0.056 | 89.37±0.056 | OOM | OOM |
| \(\diamond \) ARGA | 88.00±0.049 | 88.33±0.045 | OOM | OOM |
| \(\diamond \) ARGVA | 90.48±0.036 | 90.32±0.035 | OOM | OOM |
| \(\diamond \) DGI | 91.45±0.004 | 90.87±0.005 | OOM | OOM |
| \(\star \) SEAL(\(h=1\)) | 94.24±0.015 | 92.56±0.021 | 96.27±0.019 | 96.01±0.017 |
| \(\star \) SEAL(\(h=2\)) | 96.82±0.009 | 98.11±0.011 | 97.11±0.014 | 97.02±0.011 |
| \(\star \) SE4LP(\(h=1\)) | 98.36±0.001 | 98.25±0.002 | \(\underline{96.43{\pm }0.005}\) | \(\underline{96.11{\pm }0.005}\) |
| \(\star \) SE4LP(\(h=2\)) | \(\underline{98.35{\pm }0.004}\) | \(\underline{98.18{\pm }0.005}\) | 96.12±0.010 | 95.86±0.015 |

The best and second best results are marked in bold and underlined, respectively
Tables 2 and 3 report the results on link prediction, from which we can see that SE4LP achieves state-of-the-art results with respect to the baselines. Specifically, SE4LP significantly outperforms the heuristic and random walk baselines across all datasets, indicating that the learned subgraph patterns capture link properties better than manual features or shallow topology features. When compared to GAEs, SE4LP surpasses strong baselines: on the three citation benchmarks we observe 1.9%, 0.8% and 7.0% relative improvement over the best GAE in terms of AUC, respectively. We also compare with the state-of-the-art supervised model, SEAL, under a variety of hyper-parameter settings. The results in Tables 2 and 3 suggest that our model outperforms SEAL (with the same hyper-parameter settings) on the three citation benchmarks in most cases, and achieves competitive results on the Facebook and Github datasets. Specifically, SE4LP performs better with 1-hop sLSgs, while SEAL achieves better performance with 2-hop sLSgs. SE4LP beats SEAL when both accept subgraphs of smaller size (\(h=1\)). When compared to SEAL accepting larger subgraphs (\(h=2\)), SE4LP still beats SEAL on 2 out of 5 datasets. These results suggest that SEAL relies on enclosing subgraphs of larger size to learn high-order link features, while SE4LP prefers to find link patterns in small subgraphs. Actually, SEAL uses the Node Labeling trick to highlight the structural information of subgraphs, and our SE4LP outperforms SEAL on these benchmarks without Node Labeling, indicating the effectiveness of combining the "subgraph2link" pattern with self-supervised learning for link prediction.
subgraph2link vs. node2link: When comparing the two categories of methods, the results suggest that methods with the "subgraph2link" pattern generally perform on par with or better than strong baselines with the "node2link" pattern. Moreover, "subgraph2link" models are trained on sLSgs, while "node2link" models are trained on the partially observed graph in which only the testing links are masked. In summary, compared with "node2link" methods, "subgraph2link" methods use relatively less training data yet achieve superior performance and generalizability, validating our proposal that local subgraphs provide informative structural context for link representation learning.

4.3 Analysis of augmentation views

We investigate the four augmentation techniques by applying various pairs of augmentation types to the three categories of graph datasets, as illustrated in Fig. 3. As we can see, for citation networks, using either "AttrSimilar" or "KnnGraph" as one of the augmented views can significantly benefit the downstream performance when compared to other augmentation pairs. For the web and social networks, the best performance is still achieved by using "AttrSimilar" as an augmented view: 0.37% for Facebook and 2.33% for Github. This observation meets our intuition. In the link prediction scenario, a larger similarity between nodes implies a higher likelihood of a link between them. Based on this, the new node features and adjacency information generated by "AttrSimilar" and "KnnGraph", respectively, can effectively characterize the correlation between nodes. Therefore, contrasting the new graph features with the original ("Identical") or perturbed (generated by "AttrMask" or "EdgeRemove") ones avoids exploiting shortcuts that could greatly decrease representation quality, and learns more generalizable features, as conjectured in [34].
Table 4
Efficiency analysis on five datasets

| Item | Method | Cora | Citeseer | Pubmed | Facebook | Github |
| --- | --- | --- | --- | --- | --- | --- |
| Training Time (s) | DeepWalk | 609.2 | 580.7 | 6023.0 | 13531.5 | 21948.7 |
| Training Time (s) | GAE | 0.0 | 0.0 | 0.0 | OOM | OOM |
| Training Time (s) | SEAL | 1.3 | 0.8 | 2.2 | 8.5 | 22.8 |
| Training Time (s) | SE4LP | 1.6 | 1.2 | 5.1 | 8.7 | 13.3 |
| Memory (MB) | DeepWalk | 346 | 414 | 1679 | 3082 | 3099 |
| Memory (MB) | GAE | 4935 | 4932 | 4994 | OOM | OOM |
| Memory (MB) | SEAL | 2709 | 3079 | 2820 | 2724 | 3039 |
| Memory (MB) | SE4LP | 5698 | 6000 | 5589 | 6044 | 8740 |

4.4 More analysis

Impact of Subgraph Size We further investigate the impact of subgraph size in SE4LP on the citation benchmarks. We fix \(h=1\), adjust \(K_1\) from 2 to 10, and report the results in Fig. 4. We observe that the performance of SE4LP increases slightly as the size of the subgraph increases, indicating that SE4LP is fairly robust to variations in subgraph scale. Moreover, our method can adapt to subgraphs of different sizes and capture the key pattern features that reflect link existence. As a result, the subgraph sampling module can improve the scalability of SE4LP on large-scale graphs by accepting small subgraphs while still achieving highly effective link prediction.
Impact of Self-supervised Learning in Link Prediction We further investigate the impact of contrastive self-supervision in our framework, which can be measured through the amount of training data required. Specifically, we vary the proportion of training data from 10% to 90% and observe the performance of SE4LP on link prediction, as shown in Fig. 4. As we can see, SE4LP maintains stable performance as the training data is reduced on the Citeseer and Pubmed datasets. Moreover, in our experiments, SE4LP uses 10% \(\sim \) 40% of positive links for training, while GAEs and random walks use the "partially observed graph" (80% of positive links) for training. As a result, SE4LP can learn effective link representations with less training data and achieve competitive or even better performance in link prediction. Generally, link prediction tasks have no shortage of training data because the number of edges in a graph is usually large. We therefore conclude that introducing self-supervision into link representation learning can significantly reduce the dependence on training data and improve generalization and scalability.
Impact of Trade-off Coefficient We further investigate the impact of the trade-off coefficient \(\lambda \) on balancing the subgraph contrast and classification tasks. As we can see from Fig. 4, our model achieves relatively stable performance when \(\lambda \) is between 0.1 and 1, which matches our intuition. We consider subgraph contrast a pretext task, or rather a regularization of the subgraph classification task, so \(\lambda \) should take a smaller value.

4.5 Computational efficiency

Finally, we investigate the computational efficiency of SE4LP, as reported in Table 4, by summarizing the training time and memory consumption on all five datasets. The training time refers to the time for training the encoder (excluding validation) per epoch, and the memory refers to the total memory cost of model parameters and all hidden representations. We observe that the two comparison methods relying on full-graph training (DeepWalk and GAE) become less efficient as the scale of the graph grows. Compared to DeepWalk, a random walk method, SE4LP can be trained much faster on datasets of various scales. Compared to GAE, a graph auto-encoder, SE4LP achieves comparable efficiency on small-scale graphs and beats GAE on large-scale graphs via subgraph mini-batch training. Compared to SEAL, both methods follow the same "subgraph2link" pattern and therefore have similar training time. However, SE4LP uses graph augmentation to generate two contrasting views, so its memory cost is about twice that of SEAL. In summary, SE4LP balances training speed and memory consumption, scaling well on large-scale graphs.

5 Conclusions

Graph representation learning for links is relatively less studied and less understood. In this paper, we study the "subgraph2link" pattern and propose an end-to-end joint learning framework for learning link representations by contrasting encodings from different views of subgraphs centered on links. Experiments conducted on five datasets demonstrate that our framework achieves new state-of-the-art performance in link prediction and has satisfactory scalability on large-scale datasets.

Acknowledgements

This work was supported by the Key Project of Regional Innovation and Development Joint Fund of National Natural Science Foundation of China under Grant U22A2025.

Declarations

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.



Literature
1. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th international conference on learning representations
2. Gong S, Zhou J, Xie C, Xuan Q (2023) Neighborhood homophily-based graph convolutional network. In: Proceedings of the 32nd ACM international conference on information and knowledge management, pp 3908–3912
3. Bo D, Wang X, Shi C, Zhu M, Lu E, Cui P (2020) Structural deep clustering network. In: Proceedings of the web conference 2020, pp 1400–1410
4. Zhou J, Chen Z, Du M, Chen L, Yu S, Chen G, Xuan Q (2021) Robustecd: enhancement of network structure for robust community detection. IEEE Trans Knowl Data Eng 35(1):842–856
5. Dai H, Dai B, Song L (2016) Discriminative embeddings of latent variable models for structured data. In: International conference on machine learning. PMLR, pp 2702–2711
6. Zhou J, Hu C, Chi J, Wu J, Shen M, Xuan Q (2022) Behavior-aware account deanonymization on ethereum interaction graph. IEEE Trans Inf Forensics Secur 17:3433–3448
7. Pasa L, Navarin N, Sperduti A (2022) Polynomial-based graph convolutional neural networks for graph classification. Mach Learn 111(4):1205–1237
8. Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 701–710
9. Dileo M, Zignani M, Gaito S (2023) Temporal graph learning for dynamic link prediction with text in online social networks. Mach Learn, 1–20
10. Moradan A, Draganov A, Mottin D, Assent I (2023) Ucode: unified community detection with graph convolutional networks. Mach Learn 112(12):5057–5080
11. Subramonian A (2021) Motif-driven contrastive learning of graph representations. In: Proceedings of the AAAI conference on artificial intelligence, vol 35, pp 15980–15981
12. Hao J, Chen M, Yu W, Sun Y, Wang W (2019) Universal representation learning of knowledge bases by jointly embedding instances and ontological concepts. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1709–1719
13. Pei Y, Huang T, Ipenburg W, Pechenizkiy M (2022) Resgcn: attention-based deep residual modeling for anomaly detection on attributed networks. Mach Learn 111(2):519–541
14. Adamic LA, Adar E (2003) Friends and neighbors on the web. Soc Netw 25(3):211–230
15. Fu C, Zhao M, Fan L, Chen X, Chen J, Wu Z, Xia Y, Xuan Q (2018) Link weight prediction using supervised learning methods and its application to yelp layered network. IEEE Trans Knowl Data Eng 30(8):1507–1518
16. Yu S, Zhao M, Fu C, Zheng J, Huang H, Shu X, Xuan Q, Chen G (2019) Target defense against link-prediction-based attacks via evolutionary perturbations. IEEE Trans Knowl Data Eng 33(2):754–767
17. Stoica G, Stretcu O, Platanios EA, Mitchell T, Póczos B (2020) Contextual parameter generation for knowledge graph link prediction. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 3000–3008
18. Yu S, Wu Y, Gan R, Zhou J, Zheng Z, Xuan Q (2022) Discover important paths in the knowledge graph based on dynamic relation confidence. In: China national conference on big data and social computing. Springer, pp 341–358
19. Grover A, Leskovec J (2016) node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 855–864
20. Kipf TN, Welling M (2016) Variational graph auto-encoders. NIPS workshop on Bayesian deep learning
21. Pan S, Hu R, Long G, Jiang J, Yao L, Zhang C (2018) Adversarially regularized graph autoencoder for graph embedding. In: IJCAI international joint conference on artificial intelligence
22. Qiu J, Dong Y, Ma H, Li J, Wang K, Tang J (2018) Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In: Proceedings of the 11th ACM international conference on web search and data mining, pp 459–467
23. Cao S, Lu W, Xu Q (2015) Grarep: learning graph representations with global structural information. In: Proceedings of the 24th ACM international conference on information and knowledge management, pp 891–900
24. Zhang M, Chen Y (2018) Link prediction based on graph neural networks. In: Proceedings of the 32nd international conference on neural information processing systems, vol 31, pp 5165–5175
25. Jiao Y, Xiong Y, Zhang J, Zhang Y, Zhang T, Zhu Y (2020) Sub-graph contrast for scalable self-supervised graph representation learning. In: 2020 IEEE international conference on data mining (ICDM). IEEE, pp 222–231
26. Hamilton WL, Ying R, Leskovec J (2017) Inductive representation learning on large graphs. In: Proceedings of the 31st international conference on neural information processing systems, pp 1025–1035
27. Martínez V, Berzal F, Cubero J-C (2016) A survey of link prediction in complex networks. ACM Comput Surv (CSUR) 49(4):1–33
28. Sun H, Tian P, Xiong Y, Zhang Y, Xiang Y, Jia X, Wang H (2024) Dynamise: dynamic signed network embedding for link prediction. Mach Learn, 1–17
29. Veličković P, Fedus W, Hamilton WL, Liò P, Bengio Y, Hjelm RD (2019) Deep graph infomax. In: Proceedings of the 7th international conference on learning representations
30. You Y, Chen T, Sui Y, Chen T, Wang Z, Shen Y (2020) Graph contrastive learning with augmentations, vol 33, pp 5812–5823
31. Chen J, Ma T, Xiao C (2018) FastGCN: fast learning with graph convolutional networks via importance sampling. In: Proceedings of the 6th international conference on learning representations
32. Chiang W-L, Liu X, Si S, Li Y, Bengio S, Hsieh C-J (2019) Cluster-gcn: an efficient algorithm for training deep and large graph convolutional networks. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, pp 257–266
33. Zeng H, Zhou H, Srivastava A, Kannan R, Prasanna V (2020) Graphsaint: graph sampling based inductive learning method. In: Proceedings of the 8th international conference on learning representations
34. Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning. PMLR, pp 1597–1607
35. Wang Y, Wang W, Liang Y, Cai Y, Liu J, Hooi B (2020) Nodeaug: semi-supervised node classification with data augmentation. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery and data mining, pp 207–217
36. Zhou J, Shen J, Yu S, Chen G, Xuan Q (2020) M-evolve: structural-mapping-based data augmentation for graph classification. IEEE Trans Netw Sci Eng 8(1):190–200
37. Zhou J, Shen J, Xuan Q (2020) Data augmentation for graph classification. In: Proceedings of the 29th ACM international conference on information & knowledge management, pp 2341–2344
39.
40. Zhu Y, Xu Y, Yu F, Liu Q, Wu S, Wang L (2021) Graph contrastive learning with adaptive augmentation. In: Proceedings of the web conference 2021, pp 2069–2080
41. Sen P, Namata G, Bilgic M, Getoor L, Galligher B, Eliassi-Rad T (2008) Collective classification in network data. AI Mag 29(3):93–93
42. Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning. PMLR, pp 1188–1196
43. Lü L, Zhou T (2011) Link prediction in complex networks: a survey. Physica A: Stat Mech Appl 390(6):1150–1170
44. Xu K, Hu W, Leskovec J, Jegelka S (2018) How powerful are graph neural networks? In: Proceedings of the 6th international conference on learning representations
Metadata
Title
Inductive Subgraph Embedding for Link Prediction
Authors
Jin Si
Chenxuan Xie
Jiajun Zhou
Shanqing Yu
Lina Chen
Qi Xuan
Chunyu Miao
Publication date
05-11-2024
Publisher
Springer US
Published in
Mobile Networks and Applications
Print ISSN: 1383-469X
Electronic ISSN: 1572-8153
DOI
https://doi.org/10.1007/s11036-024-02339-3