1 Introduction
2 Related works
2.1 Label propagation approaches
2.2 Modularity optimization approaches
2.3 The probabilistic model estimation approach
2.4 Clique percolation and motif-based approaches
3 Proposed model frameworks
-
Using the probabilistic method to estimate triangle motif
-
Conceptual connection of community detection with the probability presence or absence triangular motif
-
Using evolutionary methods and maximum likelihood estimation in calculations
-
In a community, a triangle motif can exist between two pairs of nodes (one node and two neighbors of that node).
-
The probability of the existence of a triangle motif increases when the two pairs of nodes are observed in multiple communities.
-
Communities can overlap; communities that overlap have a higher density of triangle motifs.
4 Community detection by PCDMS model
4.1 Updating the parameter
4.2 WSCD algorithm
4.3 Computational complexity
4.4 Initialization
5 Experiments
Method Name | Description |
---|---|
Louvain (Blondel et al., 2008) | Louvain maximizes a modularity score for each community |
Leiden (Traag et al., 2019) | The Leiden algorithm is an improvement of the Louvain |
The probabilistic community detection method that scales to large networks | |
CPM (Palla et al., 2005) | Find k-clique communities in a graph using the percolation method |
LPA (Gregory, 2010) | The label propagation algorithm detects communities by network structure |
SLPA (Xie et al., 2011) | SLPA is an overlapping community discovery that extends the LPA |
5.1 Evaluation metrics
5.2 Real-world datasets
Dataset Name | N (Nodes) | E (Edges) | K (Ground truth) |
---|---|---|---|
Karate | 34 | 78 | 2 |
Dolphin | 62 | 159 | 2 |
Pol-Book | 105 | 441 | 3 |
Football | 115 | 613 | 2 |
Email-EU | 1005 | 25571 | 42 |
Wiki-Vote | 879 | 2914 | NA |
Twitter | 1536 | 30596 | NA |
5.2.1 Experimental results on real-world datasets
PCDMS | Louvain | Leiden | Bigclam | CPM | LPA | SLPA | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Datasets | Q | CN | Q | CN | Q | CN | Q | CN | Q | CN | Q | CN | Q | CN |
Karate | 0.397 | 3 | 0.415 | 4 | 0.116 | 5 | 0.204 | 4 | 0.215 | 4 | 0.354 | 3 | 0.371 | 3 |
Dolphin | 0.522 | 3 | 0.518 | 5 | 0.134 | 7 | 0.185 | 6 | 0.321 | 5 | 0.456 | 4 | 0.470 | 3 |
Pol-Book | 0.543 | 4 | 0.526 | 4 | 0.279 | 9 | 0.347 | 8 | 0.271 | 9 | 0.481 | 5 | 0.493 | 5 |
Football | 0.632 | 11 | 0.604 | 10 | 0.257 | 19 | 0.381 | 16 | 0.283 | 18 | 0.552 | 14 | 0.596 | 13 |
Email-EU | 0.507 | 46 | 0.432 | 27 | 0.226 | 58 | 0.207 | 65 | 0.162 | 74 | 0.274 | 55 | 0.303 | 52 |
Wiki-Vote | 0.489 | 14 | 0.432 | 13 | 0.126 | 18 | 0.301 | 10 | 0.189 | 16 | 0.345 | 11 | 0.396 | 12 |
Twitter | 0.306 | 37 | 0.296 | 34 | 0.089 | 49 | 0.186 | 25 | 0.152 | 24 | 0.256 | 29 | 0.274 | 27 |
5.3 Synthetic datasets
Parameter | Description |
---|---|
N | Number of nodes |
K | Average degree |
Min K | Minimum degree of nodes |
Max K | Maximum degree of nodes |
\( \mu \) | Mixing parameter for the topology |
Min C | Minimum for the community sizes |
Max C | Maximum for the community sizes |
tau1(\( \gamma \)) | The degree distribution |
tau2(\( \beta \)) | The community size distribution |
Graph | N | k | \(\gamma \) | \( \beta \) | \( \mu \) | \( Q_{GT} \) |
---|---|---|---|---|---|---|
LFR-1 | 1000 | 20 | 3 | 1.5 | 0.05 | 0.895 |
LFR-2 | 1000 | 20 | 3 | 1.5 | 0.10 | 0.844 |
LFR-3 | 1000 | 20 | 3 | 1.5 | 0.15 | 0.800 |
LFR-4 | 1000 | 20 | 3 | 1.5 | 0.20 | 0.739 |
LFR-5 | 1000 | 20 | 3 | 1.5 | 0.25 | 0.699 |
LFR-6 | 1000 | 20 | 3 | 1.5 | 0.30 | 0.647 |
LFR-7 | 1000 | 20 | 3 | 1.5 | 0.35 | 0.603 |
LFR-8 | 1000 | 20 | 3 | 1.5 | 0.40 | 0.560 |
LFR-9 | 1000 | 20 | 3 | 1.5 | 0.45 | 0.504 |
LFR-10 | 1000 | 20 | 3 | 1.5 | 0.50 | 0.460 |
LFR-11 | 1000 | 20 | 3 | 1.5 | 0.55 | 0.407 |
LFR-12 | 1000 | 20 | 3 | 1.5 | 0.60 | 0.364 |
LFR-13 | 1000 | 20 | 3 | 1.5 | 0.65 | 0.321 |
LFR-14 | 1000 | 20 | 3 | 1.5 | 0.70 | 0.275 |
LFR-15 | 1000 | 20 | 3 | 1.5 | 0.75 | 0.229 |
LFR-16 | 1000 | 20 | 3 | 1.5 | 0.80 | 0.182 |
5.3.1 Experimental results on synthetic datasets
\( \mu \) | Louvain | Leiden | Bigclam | CPM | LPA | SLPA | PCDMS |
---|---|---|---|---|---|---|---|
0.05 | 1.00 | 0.64 | 0.89 | 0.84 | 0.99 | 1.00 | 1.00 |
0.10 | 0.99 | 0.51 | 0.86 | 0.81 | 0.97 | 0.98 | 0.98 |
0.15 | 0.96 | 0.42 | 0.77 | 0.72 | 0.93 | 0.95 | 0.97 |
0.20 | 0.93 | 0.41 | 0.73 | 0.66 | 0.88 | 0.87 | 0.92 |
0.25 | 0.89 | 0.37 | 0.69 | 0.57 | 0.83 | 0.85 | 0.90 |
0.30 | 0.86 | 0.23 | 0.59 | 0.53 | 0.76 | 0.79 | 0.83 |
0.35 | 0.82 | 0.22 | 0.57 | 0.43 | 0.69 | 0.72 | 0.80 |
0.40 | 0.79 | 0.19 | 0.47 | 0.31 | 0.52 | 0.64 | 0.78 |
0.45 | 0.70 | 0.14 | 0.31 | 0.29 | 0.43 | 0.56 | 0.75 |
0.50 | 0.53 | 0.12 | 0.25 | 0.24 | 0.41 | 0.48 | 0.62 |
0.55 | 0.50 | 0.09 | 0.19 | 0.22 | 0.36 | 0.33 | 0.58 |
0.60 | 0.44 | 0.05 | 0.12 | 0.14 | 0.28 | 0.29 | 0.42 |
0.65 | 0.37 | 0.04 | 0.08 | 0.07 | 0.24 | 0.18 | 0.39 |
0.70 | 0.29 | 0.01 | 0.05 | 0.03 | 0.15 | 0.14 | 0.33 |
0.75 | 0.21 | 0.00 | 0.03 | 0.01 | 0.11 | 0.09 | 0.29 |
0.80 | 0.15 | 0.00 | 0.01 | 0.00 | 0.08 | 0.05 | 0.17 |