2018 | Original Paper | Book Chapter

Enhance Link Prediction in Online Social Networks Using Similarity Metrics, Sampling, and Classification

Written by: Pham Minh Chuan, Cu Nguyen Giap, Le Hoang Son, Chintan Bhatt, Tran Dinh Khang

Published in: Information Systems Design and Intelligent Applications

Publisher: Springer Singapore

Abstract

Link prediction in an online social network aims to identify new interactions among its members that are likely to arise in the near future. Previous research approached the task by first calculating similarity scores between nodes in the link graph and then predicting new links with a supervised method trained on those scores. However, real-world network data are often sparse and imbalanced, which makes new links difficult to predict, so selecting an appropriate classification method is important. First, this paper proposes several extended metrics for calculating the similarity scores between nodes. Then, we design a new sampling method to build the training and testing data from the data produced by the extended metrics. Lastly, we assess several well-known classification methods, namely J48, Weighted SVM, Gboost, Naïve Bayes, Random Forest, Logistic Regression, and XGBoost, in order to choose the best method and the corresponding environments for the link prediction problem. A number of open directions for the problem are also suggested.
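The pipeline outlined in the abstract (similarity scores between node pairs, then sampling to counter class imbalance, then classification) can be illustrated with a minimal sketch. The abstract does not define the paper's extended metrics or its sampling method, so the code below substitutes four classic neighborhood-based scores (common neighbors, Jaccard, Adamic-Adar, preferential attachment) and plain random undersampling of negative pairs. The function names `build_adjacency`, `similarity_scores`, and `balanced_pairs`, and the toy graph, are hypothetical and chosen only for illustration.

```python
import math
import random
from itertools import combinations


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def similarity_scores(adj, u, v):
    """Classic neighborhood-based scores for the distinct node pair (u, v).

    These are standard metrics, not the extended metrics of the paper.
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: shared neighbors with few links count more.
        # A common neighbor has degree >= 2, so the logarithm is positive.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "preferential_attachment": len(nu) * len(nv),
    }


def balanced_pairs(adj, future_edges, seed=0):
    """Label currently unlinked pairs and undersample the negatives.

    A pair is positive if it appears in future_edges. Negatives are
    randomly reduced to the number of positives (an assumed, simple
    stand-in for the paper's sampling method).
    """
    future = {frozenset(e) for e in future_edges}
    positives, negatives = [], []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, so not a prediction candidate
        target = positives if frozenset((u, v)) in future else negatives
        target.append((u, v))
    rng = random.Random(seed)
    negatives = rng.sample(negatives, min(len(positives), len(negatives)))
    return positives, negatives


# Toy snapshot of a network and one link that appears later.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
positives, negatives = balanced_pairs(adj, future_edges=[("a", "d")])
features = [similarity_scores(adj, u, v) for u, v in positives + negatives]
labels = [1] * len(positives) + [0] * len(negatives)
```

The resulting `features` and `labels` could then be passed to any of the classifiers named in the abstract. Undersampling is only one way to handle imbalance; a class-weighted learner such as a weighted SVM addresses the same problem without discarding negative pairs.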

Metadata
Title
Enhance Link Prediction in Online Social Networks Using Similarity Metrics, Sampling, and Classification
Written by
Pham Minh Chuan
Cu Nguyen Giap
Le Hoang Son
Chintan Bhatt
Tran Dinh Khang
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7512-4_81