Introduction
Objective
Approach
Findings
Contributions
Communities and Homophily in GNNs
Preliminaries
Notations
Graph neural networks
Semi-supervised learning on graphs
Homophily
Mutual interplay
Homophily and GNNs
Cluster assumption
Methods
Contributing factors
Homophily
Community structure
Structure manipulation
Increasing homophily (\(Hom^+\))
Decreasing homophily (\(Hom^-\))
Increasing mixing (\(Mix^+\))
Decreasing mixing (\(Mix^-\))
Manipulation | Homophily | Mixing | Side-effects |
---|---|---|---|
\(Hom^+\) | Increased | Preserved | Binomial degree distribution |
\(Hom^-\) | Decreased | Increased | Destroys community structure |
\(Mix^-\) | Preserved(*) | Decreased | Binomial degree distribution |
\(Mix^+\) | Preserved | Increased | Destroys sub-communities |
Evaluation
Datasets
Group | Dataset | Labels | Features | Nodes | Edges | Homophily | Mixing |
---|---|---|---|---|---|---|---|
Homophilic | CORA-ML | 7 | 1433 | 2485 | 5209 | 0.81 | 0.09 |
Homophilic | CiteSeer | 6 | 3703 | 2110 | 3705 | 0.74 | 0.06 |
Homophilic | PubMed | 3 | 500 | 19,717 | 44,335 | 0.80 | 0.09 |
Homophilic | CORA-Full | 67 | 8710 | 18,703 | 64,259 | 0.57 | 0.10 |
Non-homophilic | Squirrel | 5 | 2089 | 5201 | 216,933 | 0.22 | 0.22 |
Non-homophilic | Actor | 5 | 932 | 7600 | 29,926 | 0.22 | 0.21 |
Non-homophilic | Texas | 4 | 1703 | 182 | 307 | 0.06 | 0.16 |
Non-homophilic | Wisconsin | 5 | 1703 | 251 | 499 | 0.17 | 0.16 |
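
As a rough illustration of how a value like those in the Homophily column could be obtained, the sketch below computes the edge homophily ratio, i.e. the fraction of edges whose two endpoints share a label. This is an assumption: the table does not state which homophily formula was used, and the function name `edge_homophily` and the toy graph are hypothetical, not taken from the source.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose endpoints carry the same label.

    Assumed definition (edge homophily ratio); the source may use a
    different measure.
    """
    if not edges:
        raise ValueError("edge list must not be empty")
    same_label = sum(1 for u, v in edges if labels[u] == labels[v])
    return same_label / len(edges)


# Hypothetical toy graph: a 4-cycle with two labels.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_labels = {0: "a", 1: "a", 2: "b", 3: "b"}

# Edges (0, 1) and (2, 3) join same-label nodes; the other two do not.
toy_homophily = edge_homophily(toy_edges, toy_labels)
print(toy_homophily)
```

Under this definition, a value near 1 indicates that most edges stay within a class, while a value near 0 indicates that most edges cross class boundaries.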
Homophilic datasets
Non-homophilic datasets
GNN models
Evaluation setup
Results
Manipulation impact on graphs
GNN performance
Original graphs
Manipulated graphs
GNN models comparison
Impact of homophily and mixing on GNN performance
Destroying sub-community structure
Dataset | CORA-ML | CiteSeer | PubMed | CORA-Full | Squirrel | Actor | Texas | Wisconsin |
---|---|---|---|---|---|---|---|---|
Original | 0.188 | 0.147 | 0.141 | −0.048 | −0.119 | −0.039 | 0.102 | −0.027 |
\(Mix^+\) | 0.266 | 0.227 | 0.226 | −0.014 | −0.092 | −0.037 | 0.102 | −0.016 |
Silhouette \(\Delta\) | 0.078 | 0.080 | 0.085 | 0.034 | 0.027 | 0.002 | 0 | 0.011 |
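
The Silhouette \(\Delta\) row appears to be the \(Mix^+\) row minus the Original row, dataset by dataset. The short sketch below reproduces that arithmetic from the tabulated values; the variable and function names are illustrative, and reading \(\Delta\) as a simple difference is an assumption based on the numbers themselves.

```python
datasets = ["CORA-ML", "CiteSeer", "PubMed", "CORA-Full",
            "Squirrel", "Actor", "Texas", "Wisconsin"]

# Silhouette scores copied from the table above.
original = [0.188, 0.147, 0.141, -0.048, -0.119, -0.039, 0.102, -0.027]
mix_plus = [0.266, 0.227, 0.226, -0.014, -0.092, -0.037, 0.102, -0.016]


def silhouette_delta(before, after):
    """Per-dataset change in silhouette score, rounded to 3 decimals."""
    return [round(a - b, 3) for b, a in zip(before, after)]


delta = silhouette_delta(original, mix_plus)
for name, change in zip(datasets, delta):
    print(f"{name}: {change:+.3f}")
```

Every difference is non-negative, so in these numbers \(Mix^+\) never lowers the silhouette score, and Texas is the only dataset where it stays unchanged.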