1 Introduction
- A user’s decision to forward a message depends on three factors: the message content, the influence of active friends, and the susceptibility of the user. Intuitively, a user’s influence measures his/her ability to convince another user to share a message, while susceptibility measures how likely the user is to be influenced by other users (Panagopoulos et al. 2020; Wang et al. 2015).
- Users’ influences and susceptibilities are not only user-specific but also topic-specific, a phenomenon that has not been discussed before. Social media users, especially on microblogging platforms such as Twitter and Sina, usually have multiple topics of interest and different sharing patterns. Consider a sports news reporter whose hobby is pop music. He will be more influential on sports-related tweets than on music-related ones. As an information receiver, however, the reporter will be more cautious about spreading sports news than music-related tweets.
- Influences and susceptibilities are context-dependent (Wang et al. 2015). In other words, they spread through social relations during diffusion processes. A user becomes more susceptible to a message when he/she sees it shared by a larger number of users. Similarly, when more users have adopted a message that a user shared, the user becomes more influential to his/her friends due to the accumulated trust in the message.
2 Related works
2.1 Diffusion model-based methods
2.2 Generative methods
2.3 Cascade representation-based models
3 Problem definition
3.1 Popularity prediction
3.2 Final adopter prediction
4 Topic-specific susceptibility and influence
4.1 Twitter data collection
4.2 Users’ topic preferences
4.2.1 Topic modelling
4.2.2 User topic preference
4.3 Topic-specific susceptibility and influence
4.3.1 Measuring topic-specific susceptibility
4.3.2 Measuring topic-specific influence
4.3.3 Experimental analysis
5 Our CasSIM model
5.1 Influence and susceptibility update
5.2 Calculating topic-specific influence and susceptibility
5.3 User state update
5.4 User profiling
5.5 Model training
6 Experimental evaluation
6.1 Datasets
6.2 Baselines
6.3 Experimental settings
6.3.1 Evaluation measurements
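The popularity tables in Sect. 6.4.1 report MSLE, MAPE and WroPerc. The following is a minimal sketch of how such measurements could be computed, assuming definitions commonly used in cascade popularity prediction rather than definitions stated in this section. The `log2(x + 1)` transform, the relative-error tolerance `eps = 0.5`, and all function names are assumptions made for illustration.

```python
import math


def _log(x):
    # Assumed transform: log2(x + 1), so that a popularity of 0 is still defined.
    return math.log2(x + 1)


def msle(pred, true):
    """Mean squared error between log-transformed predicted and true popularity."""
    return sum((_log(p) - _log(t)) ** 2 for p, t in zip(pred, true)) / len(true)


def mape(pred, true):
    """Mean absolute percentage error on the log scale (requires true > 0)."""
    return sum(abs(_log(p) - _log(t)) / _log(t) for p, t in zip(pred, true)) / len(true)


def wroperc(pred, true, eps=0.5):
    """Percentage of cascades whose relative error is at least eps (requires true > 0)."""
    wrong = sum(1 for p, t in zip(pred, true) if abs(p - t) / t >= eps)
    return 100.0 * wrong / len(true)


if __name__ == "__main__":
    predicted = [10, 3]  # hypothetical predicted popularity of two cascades
    observed = [10, 7]   # hypothetical true popularity
    print(msle(predicted, observed), mape(predicted, observed), wroperc(predicted, observed))
```

All three measurements are errors, so lower values indicate better predictions.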
6.3.2 Hyperparameter settings
6.4 Overall prediction performance
6.4.1 Popularity prediction
Model | MSLE (1 hour) | MAPE (1 hour) | WroPerc % (1 hour) | MSLE (2 hours) | MAPE (2 hours) | WroPerc % (2 hours) | MSLE (3 hours) | MAPE (3 hours) | WroPerc % (3 hours) |
---|---|---|---|---|---|---|---|---|---|
SEISMIC | 5.774 | – | 50.93 | 5.688 | – | 47.84 | 5.223 | – | 42.00 |
Feature-based | 4.672 | 0.359 | 41.53 | 4.165 | 0.315 | 38.48 | 4.052 | 0.308 | 31.96 |
DeepCas | 3.578 | 0.291 | 32.26 | 3.421 | 0.288 | 28.74 | 3.139 | 0.270 | 18.58 |
DeepHawkes | 2.894 | 0.289 | 26.21 | 2.551 | 0.280 | 25.89 | 2.240 | 0.268 | 17.57 |
CasCN | 2.749 | 0.285 | 27.36 | 2.442 | 0.283 | 25.56 | 2.181 | 0.279 | 17.23 |
CoupledGNN | 2.289 | 0.242 | 23.60 | 2.254 | 0.236 | 17.96 | 2.037 | 0.223 | 14.27 |
CasSeqGCN | 2.281 | 0.252 | 23.96 | 2.282 | 0.239 | 18.43 | 2.048 | 0.224 | 13.54 |
FOREST | 2.156 | 0.238 | 20.05 | 2.136 | 0.235 | 18.14 | 1.995 | 0.230 | 13.49 |
CasFlow | 2.248 | 0.239 | 20.68 | 2.195 | 0.221 | 16.79 | 1.982 | 0.215 | 12.10 |
TempCas | 2.290 | 0.226 | 18.23 | 2.208 | 0.229 | 14.73 | 1.960 | 0.209 | 11.26 |
CasSIM | 2.148 | 0.221 | 19.46 | 2.126 | 0.217 | 14.94 | 1.919 | 0.202 | 11.04 |

Model | MSLE (1 year) | MAPE (1 year) | WroPerc % (1 year) | MSLE (2 years) | MAPE (2 years) | WroPerc % (2 years) | MSLE (3 years) | MAPE (3 years) | WroPerc % (3 years) |
---|---|---|---|---|---|---|---|---|---|
SEISMIC | 5.496 | – | 48.24 | 5.132 | – | 41.68 | 4.720 | – | 32.88 |
Feature-based | 4.069 | 0.485 | 37.76 | 4.004 | 0.426 | 32.30 | 3.523 | 0.353 | 28.71 |
DeepCas | 2.031 | 0.293 | 28.33 | 1.916 | 0.260 | 22.69 | 1.908 | 0.227 | 21.39 |
DeepHawkes | 2.400 | 0.294 | 27.42 | 1.148 | 0.252 | 22.47 | 1.735 | 0.191 | 20.73 |
CasCN | 2.007 | 0.285 | 27.49 | 1.959 | 0.283 | 20.28 | 1.876 | 0.183 | 20.99 |
CoupledGNN | 1.970 | 0.288 | 25.90 | 1.798 | 0.282 | 20.16 | 1.430 | 0.165 | 19.63 |
CasSeqGCN | 1.953 | 0.285 | 25.32 | 1.773 | 0.306 | 20.84 | 1.458 | 0.168 | 19.43 |
FOREST | 1.359 | 0.293 | 25.11 | 1.175 | 0.298 | 19.40 | 1.495 | 0.154 | 18.88 |
CasFlow | 1.822 | 0.256 | 26.44 | 1.086 | 0.233 | 17.01 | 1.416 | 0.136 | 14.83 |
TempCas | 1.308 | 0.242 | 24.66 | 1.073 | 0.236 | 16.87 | 1.384 | 0.130 | 14.84 |
CasSIM | 1.272 | 0.231 | 24.51 | 1.063 | 0.225 | 16.26 | 1.376 | 0.126 | 14.09 |

Model | MSLE (1 hour) | MAPE (1 hour) | WroPerc % (1 hour) | MSLE (2 hours) | MAPE (2 hours) | WroPerc % (2 hours) | MSLE (3 hours) | MAPE (3 hours) | WroPerc % (3 hours) |
---|---|---|---|---|---|---|---|---|---|
SEISMIC | 14.394 | – | 45.17 | 13.353 | – | 39.35 | 12.631 | – | 33.33 |
Feature-based | 13.440 | 0.642 | 41.78 | 12.110 | 0.586 | 37.72 | 11.461 | 0.557 | 33.90 |
DeepCas | 12.897 | 0.614 | 39.73 | 11.145 | 0.579 | 36.02 | 11.677 | 0.547 | 30.13 |
DeepHawkes | 10.705 | 0.623 | 36.25 | 10.499 | 0.617 | 35.83 | 9.188 | 0.553 | 25.28 |
CasCN | 10.640 | 0.592 | 35.81 | 9.207 | 0.552 | 34.63 | 9.048 | 0.550 | 25.62 |
CoupledGNN | 9.400 | 0.497 | 34.49 | 9.122 | 0.477 | 32.86 | 9.045 | 0.452 | 22.55 |
CasSeqGCN | 9.320 | 0.494 | 34.82 | 9.127 | 0.489 | 32.98 | 8.928 | 0.453 | 22.43 |
FOREST | 8.799 | 0.489 | 33.01 | 8.469 | 0.463 | 30.25 | 8.147 | 0.454 | 21.46 |
CasFlow | 8.916 | 0.478 | 31.59 | 8.114 | 0.458 | 28.94 | 8.081 | 0.446 | 16.33 |
TempCas | 8.756 | 0.461 | 28.25 | 8.251 | 0.442 | 26.67 | 8.070 | 0.426 | 15.38 |
CasSIM | 8.569 | 0.443 | 27.53 | 8.046 | 0.437 | 25.43 | 8.032 | 0.422 | 14.76 |

Model | Twitter2012 Recall | Twitter2012 Precision | Twitter2012 F1 | Sina Recall | Sina Precision | Sina F1 | AMINER Recall | AMINER Precision | AMINER F1 | Twitter2020 Recall | Twitter2020 Precision | Twitter2020 F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
DeepDiffuse | 0.306 | 0.306 | 0.306 | 0.317 | 0.297 | 0.307 | 0.328 | 0.317 | 0.321 | 0.348 | 0.364 | 0.352 |
TopoLSTM | 0.314 | 0.312 | 0.313 | 0.336 | 0.342 | 0.341 | 0.337 | 0.342 | 0.340 | 0.355 | 0.351 | 0.354 |
SNIDSA | 0.403 | 0.408 | 0.406 | 0.428 | 0.377 | 0.386 | 0.354 | 0.339 | 0.346 | 0.363 | 0.374 | 0.371 |
FOREST | 0.416 | 0.408 | 0.411 | 0.436 | 0.379 | 0.393 | 0.398 | 0.344 | 0.392 | 0.404 | 0.416 | 0.413 |
DyHGCN | 0.454 | 0.441 | 0.443 | 0.449 | 0.393 | 0.394 | 0.413 | 0.388 | 0.410 | 0.441 | 0.448 | 0.442 |
CoupledGNN | 0.370 | 0.366 | 0.367 | 0.371 | 0.323 | 0.332 | 0.318 | 0.297 | 0.262 | 0.315 | 0.300 | 0.302 |
CoupledGNN + 10% | 0.382 | 0.338 | 0.346 | 0.392 | 0.348 | 0.352 | 0.325 | 0.307 | 0.319 | 0.335 | 0.324 | 0.329 |
CoupledGNN + 20% | 0.393 | 0.360 | 0.357 | 0.418 | 0.361 | 0.365 | 0.334 | 0.311 | 0.327 | 0.356 | 0.335 | 0.342 |
CoupledGNN + 30% | 0.400 | 0.367 | 0.372 | 0.431 | 0.372 | 0.378 | 0.370 | 0.359 | 0.364 | 0.371 | 0.387 | 0.385 |
CoupledGNN + 40% | 0.418 | 0.387 | 0.397 | 0.437 | 0.379 | 0.381 | 0.376 | 0.364 | 0.370 | 0.380 | 0.391 | 0.387 |
CoupledGNN + 50% | 0.411 | 0.381 | 0.388 | 0.422 | 0.364 | 0.370 | 0.366 | 0.341 | 0.351 | 0.350 | 0.332 | 0.341 |
CasSIM | 0.423 | 0.428 | 0.425 | 0.443 | 0.390 | 0.394 | 0.409 | 0.397 | 0.412 | 0.417 | 0.436 | 0.425 |
CasSIM + 10% | 0.431 | 0.429 | 0.430 | 0.447 | 0.404 | 0.405 | 0.412 | 0.408 | 0.406 | 0.421 | 0.437 | 0.423 |
CasSIM + 20% | 0.448 | 0.446 | 0.447 | 0.452 | 0.411 | 0.414 | 0.422 | 0.416 | 0.420 | 0.432 | 0.439 | 0.433 |
CasSIM + 30% | 0.465 | 0.468 | 0.467 | 0.474 | 0.423 | 0.428 | 0.437 | 0.433 | 0.435 | 0.440 | 0.442 | 0.441 |
CasSIM + 40% | 0.466 | 0.463 | 0.465 | 0.477 | 0.441 | 0.451 | 0.439 | 0.435 | 0.436 | 0.443 | 0.446 | 0.444 |
CasSIM + 50% | 0.451 | 0.453 | 0.452 | 0.449 | 0.431 | 0.440 | 0.423 | 0.420 | 0.421 | 0.429 | 0.432 | 0.431 |
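
The recall, precision and F1 columns above compare a predicted set of final adopters with the observed one. The following is a minimal sketch of the standard set-based computation for a single cascade; the function name and the example user identifiers are hypothetical, and how the scores are aggregated over cascades is not specified here.

```python
def precision_recall_f1(predicted, actual):
    """Set-based precision, recall and F1 for one cascade's final adopters."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)  # correctly predicted adopters
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical user identifiers for illustration only.
    p, r, f = precision_recall_f1({"a", "b", "c", "d"}, {"b", "c", "e"})
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

If the scores are averaged per cascade, the F1 in a table row need not equal the harmonic mean of that row's averaged precision and recall.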
6.4.2 Discussion
Dataset | Model | MSLE, 1 h/1 year (AMINER) | MAPE, 1 h/1 year (AMINER) | WroPerc %, 1 h/1 year (AMINER) | MSLE, 2 h/2 years (AMINER) | MAPE, 2 h/2 years (AMINER) | WroPerc %, 2 h/2 years (AMINER) | MSLE, 3 h/3 years (AMINER) | MAPE, 3 h/3 years (AMINER) | WroPerc %, 3 h/3 years (AMINER) |
---|---|---|---|---|---|---|---|---|---|---|
Sina | CasSIM | 2.148 | 0.221 | 19.46 | 2.126 | 0.217 | 14.94 | 1.919 | 0.202 | 11.04 |
Sina | CasSIM-h/r | 2.323 | 0.241 | 23.09 | 2.243 | 0.228 | 16.73 | 1.992 | 0.216 | 13.73 |
Sina | CasSIM-up | 2.253 | 0.234 | 21.81 | 2.223 | 0.224 | 15.04 | 1.939 | 0.210 | 12.97 |
Sina | CasSIM-x | 2.214 | 0.224 | 21.74 | 2.230 | 0.229 | 16.43 | 1.973 | 0.213 | 13.36 |
AMINER | CasSIM | 1.272 | 0.231 | 24.51 | 1.063 | 0.225 | 26.26 | 1.376 | 0.126 | 14.09 |
AMINER | CasSIM-h/r | 1.736 | 0.284 | 24.61 | 1.403 | 0.278 | 29.63 | 1.466 | 0.139 | 15.82 |
AMINER | CasSIM-up | 1.337 | 0.247 | 22.46 | 1.370 | 0.231 | 27.17 | 1.409 | 0.139 | 16.18 |
AMINER | CasSIM-x | 1.585 | 0.259 | 24.95 | 1.355 | 0.246 | 28.12 | 1.527 | 0.140 | 15.74 |
Twitter2012 | CasSIM | 6.440 | 0.419 | 22.26 | 4.739 | 0.388 | 22.64 | 3.903 | 0.309 | 13.57 |
Twitter2012 | CasSIM-h/r | 6.717 | 0.448 | 23.93 | 4.948 | 0.417 | 25.54 | 4.186 | 0.339 | 14.90 |
Twitter2012 | CasSIM-up | 6.468 | 0.425 | 22.71 | 4.748 | 0.395 | 22.76 | 3.916 | 0.313 | 13.88 |
Twitter2012 | CasSIM-x | 6.657 | 0.436 | 23.55 | 4.883 | 0.406 | 23.57 | 3.985 | 0.320 | 14.15 |
Twitter2020 | CasSIM | 8.569 | 0.443 | 27.53 | 8.046 | 0.437 | 25.43 | 8.032 | 0.422 | 14.76 |
Twitter2020 | CasSIM-h/r | 9.283 | 0.488 | 33.64 | 9.040 | 0.473 | 30.80 | 8.577 | 0.448 | 21.07 |
Twitter2020 | CasSIM-up | 8.891 | 0.473 | 28.80 | 8.490 | 0.451 | 27.85 | 8.267 | 0.436 | 18.54 |
Twitter2020 | CasSIM-x | 8.907 | 0.478 | 29.46 | 9.160 | 0.464 | 28.85 | 8.352 | 0.452 | 19.79 |
6.5 Ablation study
- CasSIM-h/r: We do not distinguish users’ dual roles in diffusion and use the same vectors for users’ susceptibilities and influences.
- CasSIM-up: We remove the pre-training process for the initial user profiles and use random assignments instead.
- CasSIM-x: We remove users’ topic preference vectors, i.e., \(\textbf{p}_v\), and do not consider the content of messages under diffusion, i.e., \(\textbf{x}_m\).
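
One way to read the ablation table is to express each variant's error as a relative increase over the full model. The sketch below does this for the 1-hour MSLE values taken from the Sina rows of the ablation table; the helper name `relative_increase` is hypothetical and the choice of metric and dataset is only an example.

```python
def relative_increase(full, variant):
    """Relative error increase (in percent) of an ablated variant over the full model."""
    return 100.0 * (variant - full) / full


# 1-hour MSLE on Sina, copied from the ablation table.
full_msle = 2.148
variants = {"CasSIM-h/r": 2.323, "CasSIM-up": 2.253, "CasSIM-x": 2.214}

if __name__ == "__main__":
    for name, value in variants.items():
        print(f"{name}: +{relative_increase(full_msle, value):.1f}% MSLE")
```

For these values, the increases are roughly 8.1% for CasSIM-h/r, 4.9% for CasSIM-up and 3.1% for CasSIM-x.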