Skip to main content
Top
Published in: Knowledge and Information Systems 11/2020

17-07-2020 | Regular Paper

A joint optimization framework for better community detection based on link prediction in social networks

Authors: Shu-Kai Zhang, Cheng-Te Li, Shou-De Lin

Published in: Knowledge and Information Systems | Issue 11/2020

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Real-world network data can be incomplete (e.g., the social connections are partially observed) due to reasons such as graph sampling and privacy concerns. Consequently, communities detected based on such incomplete network information could be not as reliable as the ones identified based on the fully observed network. While existing studies first predict missing links and then detect communities, in this paper, a joint optimization framework, Communities detected on Predicted Edges, is proposed. Our goal to improve the quality of community detection through learning the probability of unseen links and the probability of community affiliation of nodes simultaneously. Link prediction and community detection are mutually reinforced to generate better results of both tasks. Experiments conducted on a number of well-known network data show that the proposed COPE stably outperforms several state-of-the-art community detection algorithms.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9(Sep):1981–2014MATH Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9(Sep):1981–2014MATH
2.
go back to reference Al Hasan M, Chaoji V, Salem S, Zaki M (2006) Link prediction using supervised learning. In: SDM06: workshop on link analysis, counter-terrorism and security Al Hasan M, Chaoji V, Salem S, Zaki M (2006) Link prediction using supervised learning. In: SDM06: workshop on link analysis, counter-terrorism and security
3.
go back to reference Arthur D, Vassilvitskii S (2007) k-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms. Society for Industrial and Applied Mathematics, pp 1027–1035 Arthur D, Vassilvitskii S (2007) k-means++: the advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms. Society for Industrial and Applied Mathematics, pp 1027–1035
4.
go back to reference Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 10:P10008MATHCrossRef Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 10:P10008MATHCrossRef
6.
go back to reference Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010. Springer, pp 177–186 Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT’2010. Springer, pp 177–186
7.
go back to reference Burgess M, Adar E, Cafarella M (2016) Link-prediction enhanced consensus clustering for complex networks. PLoS ONE 11(5):e0153384CrossRef Burgess M, Adar E, Cafarella M (2016) Link-prediction enhanced consensus clustering for complex networks. PLoS ONE 11(5):e0153384CrossRef
8.
go back to reference Chakraborty T, Srinivasan S, Ganguly N, Mukherjee A, Bhowmick S (2014) On the permanence of vertices in network communities. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1396–1405 Chakraborty T, Srinivasan S, Ganguly N, Mukherjee A, Bhowmick S (2014) On the permanence of vertices in network communities. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 1396–1405
9.
go back to reference Chen W, Liu Z, Sun X, Wang Y (2016) Community detection in social networks through community formation games. In: Proceedings of the twenty-second international joint conference on artificial intelligence (IJCAI-16). pp 2576–2581 Chen W, Liu Z, Sun X, Wang Y (2016) Community detection in social networks through community formation games. In: Proceedings of the twenty-second international joint conference on artificial intelligence (IJCAI-16). pp 2576–2581
10.
go back to reference Ding C, He X, Simon H.D (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: Proceedings of the 2005 SIAM international conference on data mining. SIAM, pp 606–610 Ding C, He X, Simon H.D (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: Proceedings of the 2005 SIAM international conference on data mining. SIAM, pp 606–610
11.
go back to reference Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56(1):9–33MATHCrossRef Drineas P, Frieze A, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56(1):9–33MATHCrossRef
12.
go back to reference Erhan D, Bengio Y, Courville A, Manzagol PA, Vincent P, Bengio S (2010) Why does unsupervised pre-training help deep learning? J Mach Learn Res 11(Feb):625–660MathSciNetMATH Erhan D, Bengio Y, Courville A, Manzagol PA, Vincent P, Bengio S (2010) Why does unsupervised pre-training help deep learning? J Mach Learn Res 11(Feb):625–660MathSciNetMATH
13.
go back to reference Feng MH, Hsu CC, Li CT, Yeh MY, Lin SD (2019) Marine: multi-relational network embeddings with relational proximity and node attributes. In: The world wide web conference. WWW ’19, pp 470–479 Feng MH, Hsu CC, Li CT, Yeh MY, Lin SD (2019) Marine: multi-relational network embeddings with relational proximity and node attributes. In: The world wide web conference. WWW ’19, pp 470–479
15.
go back to reference Génois M, Vestergaard CL, Fournet J, Panisson A, Bonmarin I, Barrat A (2015) Data on face-to-face contacts in an office building suggest a low-cost vaccination strategy based on community linkers. Netw Sci 3(3):326–347CrossRef Génois M, Vestergaard CL, Fournet J, Panisson A, Bonmarin I, Barrat A (2015) Data on face-to-face contacts in an office building suggest a low-cost vaccination strategy based on community linkers. Netw Sci 3(3):326–347CrossRef
16.
17.
go back to reference Grover A, Leskovec J (2016) node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 855–864 Grover A, Leskovec J (2016) node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 855–864
18.
go back to reference Grus J (2015) Data science from scratch: first principles with python. O’Reilly Media Inc, Sebastopol Grus J (2015) Data science from scratch: first principles with python. O’Reilly Media Inc, Sebastopol
19.
go back to reference Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. Wiley, HobokenMATHCrossRef Hosmer DW Jr, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, vol 398. Wiley, HobokenMATHCrossRef
20.
go back to reference Huang L, Wang CD, Chao HY (2019) Higher-order multi-layer community detection. In: Proceedings the thirty-third AAAI conference on artificial intelligence (AAAI-19). pp 9945–9946 Huang L, Wang CD, Chao HY (2019) Higher-order multi-layer community detection. In: Proceedings the thirty-third AAAI conference on artificial intelligence (AAAI-19). pp 9945–9946
21.
go back to reference Jiang JY, Li CT, Chen Y, Wang W (2018) Identifying users behind shared accounts in online streaming services. In: The 41st international ACM SIGIR conference on research and development in information retrieval, SIGIR ’18. pp 65–74 Jiang JY, Li CT, Chen Y, Wang W (2018) Identifying users behind shared accounts in online streaming services. In: The 41st international ACM SIGIR conference on research and development in information retrieval, SIGIR ’18. pp 65–74
22.
go back to reference Jin D, You X, Li W, He D, Cui P, Fogelman-Soulie F, Chakraborty T (2019) Incorporating network embedding into markov random field for better community detection. In: Proceedings the thirty-third AAAI conference on artificial intelligence (AAAI-19), pp 160–167 Jin D, You X, Li W, He D, Cui P, Fogelman-Soulie F, Chakraborty T (2019) Incorporating network embedding into markov random field for better community detection. In: Proceedings the thirty-third AAAI conference on artificial intelligence (AAAI-19), pp 160–167
23.
24.
go back to reference Jonnalagadda A, Kuppusamy L (2018) Overlapping community detection in social networks using coalitional games. Knowl Inf Syst 56(3):637–661CrossRef Jonnalagadda A, Kuppusamy L (2018) Overlapping community detection in social networks using coalitional games. Knowl Inf Syst 56(3):637–661CrossRef
25.
go back to reference Lancichinetti A, Radicchi F, Ramasco JJ, Fortunato S (2011) Finding statistically significant communities in networks. PLoS ONE 6(4):e18961CrossRef Lancichinetti A, Radicchi F, Ramasco JJ, Fortunato S (2011) Finding statistically significant communities in networks. PLoS ONE 6(4):e18961CrossRef
26.
go back to reference Li CT, Lin YJ, Yeh MY (2018) Forecasting participants of information diffusion on social networks with its applications. Inf Sci 422:432–446CrossRef Li CT, Lin YJ, Yeh MY (2018) Forecasting participants of information diffusion on social networks with its applications. Inf Sci 422:432–446CrossRef
27.
go back to reference Li CT, Shan MK, Lin SD (2011) Context-based people search in labeled social networks. In: Proceedings of the 20th ACM international conference on information and knowledge management, CIKM ’11, pp 1607–1612 Li CT, Shan MK, Lin SD (2011) Context-based people search in labeled social networks. In: Proceedings of the 20th ACM international conference on information and knowledge management, CIKM ’11, pp 1607–1612
28.
go back to reference Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Assoc Inf Sci Technol 58(7):1019–1031CrossRef Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Assoc Inf Sci Technol 58(7):1019–1031CrossRef
30.
go back to reference Liu P, Fu J, Dong Y, Qiu X, Cheung JCK (2019) Learning multi-task communication with message passing for sequence learning. In: Proceedings the thirty-third AAAI conference on artificial intelligence (AAAI-19), pp. 4360–4367 Liu P, Fu J, Dong Y, Qiu X, Cheung JCK (2019) Learning multi-task communication with message passing for sequence learning. In: Proceedings the thirty-third AAAI conference on artificial intelligence (AAAI-19), pp. 4360–4367
31.
go back to reference Liu X, Cheng H, Zhang Z (2019) Evaluation of community detection methods. IEEE Trans Knowl Data Eng, pp 1–1 Liu X, Cheng H, Zhang Z (2019) Evaluation of community detection methods. IEEE Trans Knowl Data Eng, pp 1–1
32.
go back to reference Lobo JM, Jiménez-Valverde A, Real R (2008) AUC: a misleading measure of the performance of predictive distribution models. Glob Ecol Biogeogr 17(2):145–151CrossRef Lobo JM, Jiménez-Valverde A, Real R (2008) AUC: a misleading measure of the performance of predictive distribution models. Glob Ecol Biogeogr 17(2):145–151CrossRef
33.
go back to reference Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003) The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54(4):396–405CrossRef Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003) The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54(4):396–405CrossRef
34.
35.
go back to reference Pan S, Wu J, Zhu X, Long G, Zhang C (2017) Task sensitive feature exploration and learning for multitask graph classification. IEEE Trans Cybern 47(3):744–758CrossRef Pan S, Wu J, Zhu X, Long G, Zhang C (2017) Task sensitive feature exploration and learning for multitask graph classification. IEEE Trans Cybern 47(3):744–758CrossRef
36.
go back to reference Prat-Pérez A, Dominguez-Sal D, Larriba-Pey JL (2014) High quality, scalable and parallel community detection for large real graphs. In: Proceedings of the 23rd international conference on World wide web. ACM, pp 225–236 Prat-Pérez A, Dominguez-Sal D, Larriba-Pey JL (2014) High quality, scalable and parallel community detection for large real graphs. In: Proceedings of the 23rd international conference on World wide web. ACM, pp 225–236
37.
go back to reference Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D (2004) Defining and identifying communities in networks. Proc Natl Acad Sci USA 101(9):2658–2663CrossRef Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D (2004) Defining and identifying communities in networks. Proc Natl Acad Sci USA 101(9):2658–2663CrossRef
38.
go back to reference Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2009) BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, pp 452–461 Rendle S, Freudenthaler C, Gantner Z, Schmidt-Thieme L (2009) BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, pp 452–461
39.
go back to reference Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123CrossRef Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123CrossRef
41.
go back to reference Soundarajan S, Hopcroft J (2012) Using community information to improve the precision of link prediction methods. In: Proceedings of the 21st international conference on world wide web. ACM, pp 607–608 Soundarajan S, Hopcroft J (2012) Using community information to improve the precision of link prediction methods. In: Proceedings of the 21st international conference on world wide web. ACM, pp 607–608
42.
go back to reference Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) Line: Large-scale information network embedding. In: Proceedings of the 24th international conference on world wide web. International world wide web conferences steering committee, pp 1067–1077 Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) Line: Large-scale information network embedding. In: Proceedings of the 24th international conference on world wide web. International world wide web conferences steering committee, pp 1067–1077
43.
go back to reference Tran P.V (2018) Learning to make predictions on graphs with autoencoders. In: 2018 IEEE 5th international conference on data science and advanced analytics (DSAA), pp 237–245 Tran P.V (2018) Learning to make predictions on graphs with autoencoders. In: 2018 IEEE 5th international conference on data science and advanced analytics (DSAA), pp 237–245
44.
go back to reference Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11(Oct):2837–2854MathSciNetMATH Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res 11(Oct):2837–2854MathSciNetMATH
45.
go back to reference Wu J, He J, Xu J (2019) Demo-net: degree-specific graph neural networks for node and graph classification. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’19, pp 406–415 Wu J, He J, Xu J (2019) Demo-net: degree-specific graph neural networks for node and graph classification. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’19, pp 406–415
46.
go back to reference Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput Surv (CSUR) 45(4):43MATHCrossRef Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput Surv (CSUR) 45(4):43MATHCrossRef
47.
go back to reference Yang J, Leskovec J (2013) Overlapping community detection at scale: a nonnegative matrix factorization approach. In: Proceedings of the sixth ACM international conference on web search and data mining. ACM, pp 587–596 Yang J, Leskovec J (2013) Overlapping community detection at scale: a nonnegative matrix factorization approach. In: Proceedings of the sixth ACM international conference on web search and data mining. ACM, pp 587–596
48.
go back to reference Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181–213CrossRef Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181–213CrossRef
49.
go back to reference Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473CrossRef Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473CrossRef
Metadata
Title
A joint optimization framework for better community detection based on link prediction in social networks
Authors
Shu-Kai Zhang
Cheng-Te Li
Shou-De Lin
Publication date
17-07-2020
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 11/2020
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-020-01490-z

Other articles of this Issue 11/2020

Knowledge and Information Systems 11/2020 Go to the issue

Premium Partner