2018 | OriginalPaper | Chapter

Linked Document Classification by Network Representation Learning


Abstract

Network Representation Learning (NRL) learns a latent-space representation of each vertex in a network so that the representation reflects link information. Recently, NRL algorithms have been applied to obtain document embeddings in linked document networks, such as citation websites. However, most existing NRL-based document representation methods are unsupervised and cannot be combined with a concrete, task-specific NLP objective. In this paper, we propose a unified, end-to-end hybrid Linked Document Classification (LDC) model that captures both the semantic features and the topological structure of documents to improve document classification performance. In addition, we investigate a more flexible strategy for capturing structural similarity, in place of the traditional rigid extraction of linked-document topology. The experimental results suggest that the proposed model outperforms other document classification methods, especially when little training data is available.
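
As a rough illustration of how link structure can be turned into training signal for vertex representations, the sketch below samples uniform random walks over a small document graph, which is the sampling step used by DeepWalk-style NRL methods. It is not the LDC model described in this chapter, whose details are not given in the abstract. The function `random_walks`, its parameters, and the toy `citations` graph are all hypothetical names introduced here for illustration.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample uniform random walks over a document link graph.

    adjacency maps each document id to a list of linked document ids.
    This is an illustrative sketch, not the chapter's method.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(adjacency):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # isolated document: stop the walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy citation network (undirected, stored as adjacency lists).
citations = {
    "d1": ["d2", "d3"],
    "d2": ["d1", "d3"],
    "d3": ["d1", "d2", "d4"],
    "d4": ["d3"],
}

walks = random_walks(citations, walk_length=5, walks_per_node=2)
```

Each walk is a sequence of document ids that can be treated like a sentence and fed to a skip-gram style model, so that documents appearing in similar link contexts receive similar embeddings. A more flexible walk strategy, such as the biased walks of node2vec, would change only how the next neighbor is chosen.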


Metadata
Title
Linked Document Classification by Network Representation Learning
Authors
Yue Zhang
Liying Zhang
Yao Liu
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-01716-3_25
