Published in: Data Mining and Knowledge Discovery 1/2020

24.10.2019

Topical network embedding

Authors: Min Shi, Yufei Tang, Xingquan Zhu, Jianxun Liu, Haibo He


Abstract

Networked data carry information from several channels, including topology, node content, and node labels, where structure and content are often correlated but not always consistent. A typical case is citation in scholarly publications: a paper is cited by others not because they have the same content, but because they share one or more subject matters. Many network embedding methods already take node content into account, but they all treat it as a flat set of words or attributes and assume that linked nodes depend on each other with respect to all of those words or attributes. In this paper, we argue that modeling topic-level semantic interactions between nodes is crucial for learning discriminative node embedding vectors. To model pairwise topic relevance between linked text nodes, we propose topical network embedding, in which interactions between nodes are built on their shared latent topics. Accordingly, we propose a unified optimization framework that simultaneously learns topic representations from the network's text content and node representations from its structure. The structure modeling takes the learned topic representations as conditional context, under the principle that two nodes can infer each other contingent on their shared latent topics. Experiments on three real-world datasets demonstrate that our approach learns significantly better network representations, e.g., a 4.1% improvement in Micro-F1 over state-of-the-art methods on the Cora dataset. (The source code of the proposed method is available on GitHub: https://github.com/codeshareabc/TopicalNE.)
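
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how Micro-F1 is commonly computed for single-label node classification. It is not taken from the authors' code; the function name `micro_f1` and the example labels are hypothetical and chosen only for illustration.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once.

    Illustrative helper (hypothetical name), assuming one label per node.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp += 1          # correct prediction: true positive for class t
        else:
            fp += 1          # predicted class p gains a false positive
            fn += 1          # true class t gains a false negative
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for five nodes (not data from the paper).
y_true = ["ML", "DB", "ML", "IR", "DB"]
y_pred = ["ML", "ML", "ML", "IR", "DB"]
score = micro_f1(y_true, y_pred)  # 4 correct, 1 wrong: 2*4 / (2*4 + 1 + 1) = 0.8
```

Because every misclassified node contributes exactly one false positive and one false negative, Micro-F1 in this single-label setting coincides with plain accuracy.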

Metadata
Title
Topical network embedding
Authors
Min Shi
Yufei Tang
Xingquan Zhu
Jianxun Liu
Haibo He
Publication date
24.10.2019
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 1/2020
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-019-00659-7