2018 | OriginalPaper | Chapter

Enhance Link Prediction in Online Social Networks Using Similarity Metrics, Sampling, and Classification

Authors: Pham Minh Chuan, Cu Nguyen Giap, Le Hoang Son, Chintan Bhatt, Tran Dinh Khang

Published in: Information Systems Design and Intelligent Applications

Publisher: Springer Singapore

Abstract

Link prediction in an online social network aims to identify new interactions among its members that are likely to arise in the near future. Previous research approached the task by first calculating similarity scores between nodes in the link graph and then predicting new links by applying a supervised method to those scores. However, real-world network data are often sparse and imbalanced, which makes new links difficult to predict. Selecting an appropriate classification method is therefore important. First, this paper proposes several extended metrics for calculating the similarity scores between nodes. Then, we design a new sampling method that builds the training and testing data from the data produced by the extended metrics. Lastly, we assess several well-known classification methods, namely J48, Weighted SVM, Gboost, Naïve Bayes, Random Forest, Logistic Regression, and XGBoost, in order to choose the best method and the corresponding environment for the link prediction problem. A number of open directions for the problem are also suggested.
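
To make the first step of this pipeline concrete, the following is a minimal sketch of how similarity scores between unconnected node pairs could be turned into feature rows for a supervised classifier. It uses three standard neighborhood-based metrics (common neighbors, Jaccard, and Adamic-Adar), not the extended metrics proposed in the chapter, which the abstract does not specify. The toy graph and all function names are assumptions made for illustration only.

```python
import math
from itertools import combinations

# Hypothetical toy friendship graph: node -> set of neighbors (undirected, symmetric).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def common_neighbors(g, u, v):
    """Number of neighbors that u and v share."""
    return len(g[u] & g[v])


def jaccard(g, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def adamic_adar(g, u, v):
    """Sum of 1 / log(degree) over shared neighbors.

    In a symmetric graph a shared neighbor has degree >= 2, so the log is positive.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[u] & g[v])


def candidate_features(g):
    """Score every currently unconnected pair; each row is one classifier input."""
    rows = {}
    for u, v in combinations(sorted(g), 2):
        if v not in g[u]:
            rows[(u, v)] = (
                common_neighbors(g, u, v),
                jaccard(g, u, v),
                adamic_adar(g, u, v),
            )
    return rows


features = candidate_features(graph)
for pair, scores in features.items():
    print(pair, scores)
```

In a supervised setting, each row would additionally receive a label indicating whether the link appears in a later snapshot of the network. Because non-links usually far outnumber links, the resulting table is typically imbalanced, which is the situation the sampling step in the abstract addresses.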

Metadata
Title
Enhance Link Prediction in Online Social Networks Using Similarity Metrics, Sampling, and Classification
Authors
Pham Minh Chuan
Cu Nguyen Giap
Le Hoang Son
Chintan Bhatt
Tran Dinh Khang
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7512-4_81
