Skip to main content
Top
Published in:

01-12-2023 | Original Article

Constrained expectation-maximisation for inference of social graphs explaining online user–user interactions

Authors: Effrosyni Papanastasiou, Anastasios Giovanidis

Published in: Social Network Analysis and Mining | Issue 1/2023

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Current network inference algorithms fail to generate graphs with edges that can explain whole sequences of node interactions in a given dataset or trace. To quantify how well an inferred graph can explain a trace, we introduce feasibility, a novel quality criterion, and suggest that it is linked to the result’s accuracy. In addition, we propose CEM-*, a network inference method that guarantees 100% feasibility given online social media traces, which are a non-trivial extension of the Expectation-Maximization algorithm developed by Newman (Nature Phys 14:67–75, 2018). We propose a set of linear optimization updates that incorporate a set of auxiliary variables and a set of feasibility constraints; the latter takes into consideration all the hidden paths that are possible between users based on their timestamps of interaction and guides the inference toward feasibility. We provide two CEM-* variations, that assume either an Erdős–Rényi (ER) or a Stochastic Block Model (SBM) prior for the underlying graph’s unknown distribution. Extensive experiments on one synthetic and one real-world Twitter dataset show that for both priors CEM-* can generate a posterior distribution of graphs that explains the whole trace while being closer to the ground truth. As an additional benefit, the use of the SBM prior infers and clusters users simultaneously during optimization. CEM-* outperforms baseline and state-of-the-art methods in terms of feasibility, run-time, and precision of the inferred graph and communities. Finally, we propose a heuristic to adapt the inference to lower feasibility requirements and show how it can affect the precision of the result.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
1
According to the Twitter API documentation of a Tweet Object, the “retweets of retweets do not show representations of the intermediary retweet, but only the original Tweet.” https://​developer.​twitter.​com/​en/​docs/​twitter-api/​v1/​data-dictionary/​object-model/​tweet.
 
4
The highest value is marked with boldface and the second highest value is underlined. max scc: maximum strongly connected component.
 
5
N/A in the Tables refers to results not being available after 48 h.
 
Literature
go back to reference Blondel V, Guillaum JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 10:P10008CrossRef Blondel V, Guillaum JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 10:P10008CrossRef
go back to reference Bourigault S, Lamprier S, Gallinari P (2016) Representation learning for information diffusion through social networks: an embedded cascade model. In: Proceedings of the 9th ACM international conference on web search and data mining, pp 573–582 Bourigault S, Lamprier S, Gallinari P (2016) Representation learning for information diffusion through social networks: an embedded cascade model. In: Proceedings of the 9th ACM international conference on web search and data mining, pp 573–582
go back to reference Daley DJ, Gani J (1999) Epidemic modelling: an introduction. Cambridge University Press, Cambridge Daley DJ, Gani J (1999) Epidemic modelling: an introduction. Cambridge University Press, Cambridge
go back to reference Daneshmand H, Gomez-Rodriguez M, Song L, Schoelkopf B (2014) Estimating diffusion network structures: recovery conditions, sample complexity and softthresholding algorithm. In: Proceedings of the 31st international conference on machine learning (ICML) Daneshmand H, Gomez-Rodriguez M, Song L, Schoelkopf B (2014) Estimating diffusion network structures: recovery conditions, sample complexity and softthresholding algorithm. In: Proceedings of the 31st international conference on machine learning (ICML)
go back to reference Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc 39(1):1–22MathSciNet Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc 39(1):1–22MathSciNet
go back to reference Firestone SM, Hayama Y, Lau MSY, Yamamoto T, Nishi T, Bradhurst RA, Demirhan H, Stevenson MA, Tsutsui T (2020) Transmission network reconstruction for foot-and-mouth disease outbreaks incorporating farm-level covariates. PLoS ONE 15(7):e0235660CrossRef Firestone SM, Hayama Y, Lau MSY, Yamamoto T, Nishi T, Bradhurst RA, Demirhan H, Stevenson MA, Tsutsui T (2020) Transmission network reconstruction for foot-and-mouth disease outbreaks incorporating farm-level covariates. PLoS ONE 15(7):e0235660CrossRef
go back to reference Fraisier O, Cabanac O, Pitarch Y, Besançon R, Boughanem M (2018) #Elysee2017fr: the 2017 French presidential campaign on Twitter. In: Proceedings of the 12th international AAAI conference on web and social media Fraisier O, Cabanac O, Pitarch Y, Besançon R, Boughanem M (2018) #Elysee2017fr: the 2017 French presidential campaign on Twitter. In: Proceedings of the 12th international AAAI conference on web and social media
go back to reference Friedman N, Linial M, Nachman I, Pe’er D (2000) Using Bayesian networks to analyze expression data. J Comput Biol 7:601–620CrossRef Friedman N, Linial M, Nachman I, Pe’er D (2000) Using Bayesian networks to analyze expression data. J Comput Biol 7:601–620CrossRef
go back to reference Giovanidis A, Baynat B, Magnien C, Vendeville A (2021) Ranking online social users by their influence. IEEE/ACM Trans Netw 29(5):2198–2214CrossRef Giovanidis A, Baynat B, Magnien C, Vendeville A (2021) Ranking online social users by their influence. IEEE/ACM Trans Netw 29(5):2198–2214CrossRef
go back to reference Gomez-Rodriguez M, Leskovec J, Krause A (2012) Inferring networks of diffusion and influence. ACM Trans Knowl Discov Data (TKDD) 5(4):1–37CrossRef Gomez-Rodriguez M, Leskovec J, Krause A (2012) Inferring networks of diffusion and influence. ACM Trans Knowl Discov Data (TKDD) 5(4):1–37CrossRef
go back to reference Goyal A, Bonchi F, Lakshmanan LV (2010) Learning influence probabilities in social networks. In: Proceedings of the 3rd ACM international conference on Web search and data mining, pp 241–250 Goyal A, Bonchi F, Lakshmanan LV (2010) Learning influence probabilities in social networks. In: Proceedings of the 3rd ACM international conference on Web search and data mining, pp 241–250
go back to reference Harris JW, Stöcker H (1998) Handbook of mathematics and computational science. Springer, BerlinCrossRef Harris JW, Stöcker H (1998) Handbook of mathematics and computational science. Springer, BerlinCrossRef
go back to reference He X, Liu Y (2017) Not enough data? Joint inferring multiple diffusion networks via network generation priors. In: Proceedings of the 10th ACM international conference on web search and data mining, pp 465–474 He X, Liu Y (2017) Not enough data? Joint inferring multiple diffusion networks via network generation priors. In: Proceedings of the 10th ACM international conference on web search and data mining, pp 465–474
go back to reference Holland PW, Laskey KB, Leinhardt S (1983) Stochastic blockmodels: First steps. Social networks, pp 109–137 Holland PW, Laskey KB, Leinhardt S (1983) Stochastic blockmodels: First steps. Social networks, pp 109–137
go back to reference Jin W, Qu M, Jin X, Ren X (2019) Recurrent event network: autoregressive structure inference over temporal knowledge graphs Jin W, Qu M, Jin X, Ren X (2019) Recurrent event network: autoregressive structure inference over temporal knowledge graphs
go back to reference Lagnier C, Denoyer L, Gaussier E, Gallinari P (2013) Predicting information diffusion in social networks using content and user’s profiles. In: European conference on information retrieval, pp 74–85 Lagnier C, Denoyer L, Gaussier E, Gallinari P (2013) Predicting information diffusion in social networks using content and user’s profiles. In: European conference on information retrieval, pp 74–85
go back to reference Le CM, Levin K, Levina E (2018) Estimating a network from multiple noisy realizations. Electron J Stat 12:4697–4740MathSciNetCrossRef Le CM, Levin K, Levina E (2018) Estimating a network from multiple noisy realizations. Electron J Stat 12:4697–4740MathSciNetCrossRef
go back to reference Lokhov A (2016) Reconstructing parameters of spreading models from partial observations. Adv Neural Inf Process Syst 29:3467–3475 Lokhov A (2016) Reconstructing parameters of spreading models from partial observations. Adv Neural Inf Process Syst 29:3467–3475
go back to reference Newman ME (2018) Network structure from rich but noisy data. Nature Phys 14:67–75CrossRef Newman ME (2018) Network structure from rich but noisy data. Nature Phys 14:67–75CrossRef
go back to reference Newman ME (2018) Estimating network structure from unreliable measurements. Phys Rev E 98(6):062321CrossRef Newman ME (2018) Estimating network structure from unreliable measurements. Phys Rev E 98(6):062321CrossRef
go back to reference Papanastasiou E, Giovanidis A (2021) Bayesian inference of a social graph with trace feasibility guarantees. In: Proceedings of the 2021 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM ’21) Papanastasiou E, Giovanidis A (2021) Bayesian inference of a social graph with trace feasibility guarantees. In: Proceedings of the 2021 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM ’21)
go back to reference Peel L, Peixoto TP, De Domenico M (2022) Statistical inference links data and theory in network science. Nature Commun 13(1):6794CrossRef Peel L, Peixoto TP, De Domenico M (2022) Statistical inference links data and theory in network science. Nature Commun 13(1):6794CrossRef
go back to reference Peixoto TP (2019) Network reconstruction and community detection from dynamics. Phys Rev Lett 123(12):128301CrossRef Peixoto TP (2019) Network reconstruction and community detection from dynamics. Phys Rev Lett 123(12):128301CrossRef
go back to reference Saito K, Nakano R, Kimura M (2008) Prediction of information diffusion probabilities for independent cascade model. In: International conference on knowledge-based and intelligent information and engineering systems, pp 67–75 Saito K, Nakano R, Kimura M (2008) Prediction of information diffusion probabilities for independent cascade model. In: International conference on knowledge-based and intelligent information and engineering systems, pp 67–75
go back to reference Wang Z, Chen C, Li W (2019) Information diffusion prediction with network regularized role-based user representation learning. ACM Trans Knowl Discov Data (TKDD) 13:1–23CrossRef Wang Z, Chen C, Li W (2019) Information diffusion prediction with network regularized role-based user representation learning. ACM Trans Knowl Discov Data (TKDD) 13:1–23CrossRef
go back to reference Wu J, Xia J, Gou F (2022) Information transmission mode and IoT community reconstruction based on user influence in opportunistic social networks. Peer-to-Peer Netw Appl 15:1398–1416CrossRef Wu J, Xia J, Gou F (2022) Information transmission mode and IoT community reconstruction based on user influence in opportunistic social networks. Peer-to-Peer Netw Appl 15:1398–1416CrossRef
go back to reference Wu X, Kumar A, Sheldon D, Zilberstein S (2013) Parameter learning for latent network diffusion. In: Proceedings of the 23rd international joint conference on artificial intelligence, pp 2923–2930 Wu X, Kumar A, Sheldon D, Zilberstein S (2013) Parameter learning for latent network diffusion. In: Proceedings of the 23rd international joint conference on artificial intelligence, pp 2923–2930
go back to reference Zhang X, Zhang ZK, Wang W, Hou D, Xu J, Ye X, Li S (2021) Multiplex network reconstruction for the coupled spatial diffusion of infodemic and pandemic of COVID-19. Int J Digit Earth 4:401–423CrossRef Zhang X, Zhang ZK, Wang W, Hou D, Xu J, Ye X, Li S (2021) Multiplex network reconstruction for the coupled spatial diffusion of infodemic and pandemic of COVID-19. Int J Digit Earth 4:401–423CrossRef
go back to reference Zhang Y, Lyu T, Zhang Y (2018) Cosine: community-preserving social network embedding from information diffusion cascades. In: Proceedings of the AAAI conference on artificial intelligence, vol 32 Zhang Y, Lyu T, Zhang Y (2018) Cosine: community-preserving social network embedding from information diffusion cascades. In: Proceedings of the AAAI conference on artificial intelligence, vol 32
Metadata
Title
Constrained expectation-maximisation for inference of social graphs explaining online user–user interactions
Authors
Effrosyni Papanastasiou
Anastasios Giovanidis
Publication date
01-12-2023
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2023
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-023-01037-4

Premium Partner