Published in: Knowledge and Information Systems 3/2015

1 September 2015 | Regular Paper

Constructing topical hierarchies in heterogeneous information networks

Written by: Chi Wang, Jialu Liu, Nihit Desai, Marina Danilevsky, Jiawei Han

Abstract

Many digital documentary data collections (e.g., scientific publications, enterprise reports, news articles, and social media) can be modeled as a heterogeneous information network, linking text with multiple types of entities. Constructing high-quality hierarchies that can represent topics at multiple granularities benefits tasks such as search, information browsing, and pattern mining. In this work, we present an algorithm for recursively constructing multi-typed topical hierarchies. Contrary to traditional text-based topic modeling, our approach handles both textual phrases and multiple types of entities by a newly designed clustering and ranking algorithm for heterogeneous network data, as well as mining and ranking topical patterns of different types. Our experiments on datasets from two different domains demonstrate that our algorithm yields high-quality, multi-typed topical hierarchies.
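
As a rough illustration of what modeling a document collection as a heterogeneous information network can look like, the sketch below counts typed co-occurrence links between the terms, authors, and venues attached to each document. It is a minimal sketch under stated assumptions, not the authors' implementation: the record layout, the field names, the toy data, and the helper `build_typed_links` are all hypothetical, and the text here does not confirm that link weights are defined this way.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical toy records; the field names and values are assumptions.
docs = [
    {"term": ["topic", "model"], "author": ["A. Smith", "B. Lee"], "venue": ["KDD"]},
    {"term": ["topic", "hierarchy"], "author": ["A. Smith"], "venue": ["ICDM"]},
]

def build_typed_links(docs):
    """Count co-occurrence links between nodes, keyed by (type_x, type_y)."""
    links = {}
    for doc in docs:
        types = sorted(doc)
        for i, tx in enumerate(types):
            for ty in types[i:]:
                if tx == ty:
                    # Same-type links: unordered pairs of distinct nodes.
                    pairs = combinations(sorted(set(doc[tx])), 2)
                else:
                    # Cross-type links: every node of one type with every node of the other.
                    pairs = product(set(doc[tx]), set(doc[ty]))
                for u, v in pairs:
                    links.setdefault((tx, ty), Counter())[(u, v)] += 1
    return links

links = build_typed_links(docs)
for link_type, weights in sorted(links.items()):
    print(link_type, dict(weights))
```

Because each toy document carries exactly one venue, no `("venue", "venue")` entry is ever created, while the term-term, author-term, and author-venue link types all receive weights.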

Footnotes
1
We chose papers published in 20 conferences related to the areas of Artificial Intelligence, Databases, Data Mining, Information Retrieval, Machine Learning, and Natural Language Processing from http://www.dblp.org/.
 
2
As a paper is always published in exactly one venue, there can naturally be no venue–venue links.
 
3
The 16 topics chosen were: Bill Clinton, Boston Marathon, Earthquake, Egypt, Gaza, Iran, Israel, Joe Biden, Microsoft, Mitt Romney, Nuclear power, Steve Jobs, Sudan, Syria, Unemployment, US Crime.
 
4
The one exception is venues, as there are only 20 venues in the DBLP dataset, so we set \(K=3\) in this case.
 
Metadata
Title
Constructing topical hierarchies in heterogeneous information networks
Written by
Chi Wang
Jialu Liu
Nihit Desai
Marina Danilevsky
Jiawei Han
Publication date
1 September 2015
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2015
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-014-0777-4
