Published in: Social Network Analysis and Mining 1/2019

1 December 2019 | Original Article

Spark’s GraphX-based link prediction for social communication using triangle counting

Authors: Ramesh Dharavath, Navaljeet Singh Arora

Abstract

Link prediction on a given snapshot of a network topology is a crucial task for extracting and inspecting the evolution of social networks. It predicts missing links in existing networks and new or disappearing links in future networks, and it has attracted much attention in many fields. Over the past decade, many methods have been proposed to predict likely links in a given social network. Applying link prediction methods is difficult when the network is very complex, because the computational cost becomes prohibitive. Predicting missing links efficiently and accurately in an incomplete complex network therefore remains a challenging task. A common assumption is that nodes sharing a large number of common neighbors are likely to be connected. Numerous similarity indices have achieved considerable accuracy and efficiency on this task. In this paper, we propose one such index, the Clustering Coefficient Index, which uses triangle counting implemented on Apache Spark’s GraphX component. The proposed index exploits the formation of triangles in the given network topology together with clustering coefficients. Experimental results show that the proposed method outperforms other existing methods at predicting suitable links.
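
The abstract does not give the formula for the Clustering Coefficient Index, so the following is only a minimal sketch of the general idea: count triangles per node, derive local clustering coefficients, and score each unconnected pair by its common neighbors. Scoring a pair by summing the clustering coefficients of its common neighbors is an assumption made here for illustration, not the paper's confirmed definition. The sketch is plain single-machine Python rather than Spark; in GraphX the per-node triangle counts would come from the built-in `triangleCount` operator. The function names and the toy edge list are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_counts(adj):
    """Count, for each node, the triangles that pass through it."""
    counts = {}
    for node, nbrs in adj.items():
        # A triangle through `node` is a pair of its neighbors that are linked.
        counts[node] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return counts


def clustering_coefficients(adj):
    """Local clustering coefficient: triangles / possible neighbor pairs."""
    tri = triangle_counts(adj)
    cc = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        cc[node] = 2.0 * tri[node] / (k * (k - 1)) if k > 1 else 0.0
    return cc


def score_pairs(adj):
    """Score unconnected pairs that share at least one neighbor.

    ASSUMPTION: the score is the sum of the clustering coefficients of
    the common neighbors; the paper's exact index may differ.
    """
    cc = clustering_coefficients(adj)
    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already linked, nothing to predict
        common = adj[x] & adj[y]
        if common:
            scores[(x, y)] = sum(cc[z] for z in common)
    return scores


# Hypothetical toy graph: one triangle (a, b, c) with a tail c - d - e.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
ADJ = build_adjacency(EDGES)
SCORES = score_pairs(ADJ)

if __name__ == "__main__":
    for pair, score in sorted(SCORES.items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 3))
```

Pairs with higher scores are the stronger link candidates under this assumed index. Enumerating all node pairs is quadratic and only suits small graphs, which is one reason a distributed engine such as GraphX is attractive for large networks.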

Metadata
Title
Spark’s GraphX-based link prediction for social communication using triangle counting
Authors
Ramesh Dharavath
Navaljeet Singh Arora
Publication date
1 December 2019
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2019
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-019-0573-y
