2017 | OriginalPaper | Book chapter

Combining Node Identifier Features and Community Priors for Within-Network Classification

Authors: Qi Ye, Changlei Zhu, Gang Li, Feng Wang

Published in: Web and Big Data

Publisher: Springer International Publishing

Abstract

With large-scale network data now widely available, one hot topic is how to adapt traditional classification algorithms to predict the most probable labels of nodes in a partially labeled network. In this paper, we propose a new algorithm called the identifier-based relational neighbor classifier (IDRN) to solve the within-network multi-label classification problem. We use the node identifiers in the egocentric networks as features and propose a within-network classification model that incorporates community structure information to predict the most probable classes for unlabeled nodes. We demonstrate the effectiveness of our approach on several publicly available datasets. On average, our approach achieves Hamming, Micro-\(\text{F}_1\) and Macro-\(\text{F}_1\) scores up to 14%, 21% and 14% higher, respectively, than competing methods in sparsely labeled networks. The experimental results show that our approach is efficient and suitable for large-scale real-world classification tasks.
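
The idea of using node identifiers in egocentric networks as features can be illustrated with a minimal sketch. This is not the authors' IDRN implementation. It assumes an undirected graph and assumes that a node's egocentric network is the node itself plus its direct neighbours. The function names `ego_identifier_features` and `to_sparse_rows` and the toy edge list are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def ego_identifier_features(edges):
    """Map each node to the sorted identifiers found in its ego network.

    Assumption (not from the paper): the ego network of a node is the
    node itself plus its direct neighbours in an undirected graph.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {node: sorted(adj | {node}) for node, adj in neighbours.items()}


def to_sparse_rows(features):
    """Encode each identifier list as a sparse binary row {column: 1}."""
    # One column per node identifier, in sorted order.
    index = {node: i for i, node in enumerate(sorted(features))}
    rows = {
        node: {index[n]: 1 for n in ids} for node, ids in features.items()
    }
    return rows, index


# Hypothetical toy graph: a triangle a-b-c with a pendant node d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
features = ego_identifier_features(edges)
rows, index = to_sparse_rows(features)

print(features["d"])  # identifiers in d's ego network
print(rows["d"])      # the same information as a sparse binary row
```

Rows of this kind could be passed to any sparse linear classifier. How the community priors described in the abstract enter the model is not specified there, so the sketch leaves them out.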

Metadata
Title
Combining Node Identifier Features and Community Priors for Within-Network Classification
Authors
Qi Ye
Changlei Zhu
Gang Li
Feng Wang
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-63564-4_1