Published in: Knowledge and Information Systems 1/2013

01.07.2013 | Regular Paper

Exploring heterogeneous information networks and random walk with restart for academic search

Written by: Meng-Fen Chiang, Jiun-Jiue Liou, Jen-Liang Wang, Wen-Chih Peng, Man-Kwan Shan

Abstract

In this paper, we explore heterogeneous information networks, in which each vertex represents one entity and the edges reflect linkage relationships. Heterogeneous information networks contain vertices of several entity types, such as papers, authors and terms, and hence can reflect multiple linkage relationships among different entities. Such a heterogeneous information network is similar to a mixed media graph (MMG). By representing a bibliographic dataset as an MMG, the performance obtained when searching for relevant entities (e.g., papers) can be improved. Furthermore, our academic search enables multiple-entity search, in which a single query returns several types of entity results, such as relevant papers, authors and conferences. Specifically, given a bibliographic dataset, we propose Global-MMG, in which a global heterogeneous information network is built. When a user submits a query keyword, we perform a random walk with restart (RWR) to retrieve papers or other types of entity objects. To reduce the query response time, we develop the algorithm Net-MMG (standing for NetClus-based MMG). Net-MMG first divides a heterogeneous information network into a collection of sub-networks and then performs an RWR on a set of selected relevant sub-networks. We implemented our academic search and conducted extensive experiments using the ACM Digital Library. The experimental results show that, by exploring heterogeneous information networks and RWR, both Global-MMG and Net-MMG achieve better search quality than existing academic search services. In addition, Net-MMG has a shorter query response time while still maintaining good search quality.
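The abstract does not give the exact RWR formulation, so the following is only a minimal sketch of a generic random walk with restart on a toy heterogeneous graph. The edge list, the "type:name" vertex labels, the restart probability of 0.15 and the helper names (build_graph, random_walk_with_restart, top_entities) are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def build_graph(edges):
    """Build undirected adjacency sets from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def random_walk_with_restart(adj, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Iterate r <- (1 - c) * W r + c * e_query until the L1 change is below tol.

    W spreads each vertex's score evenly over its neighbours; c is the
    restart probability (an assumed value, not one reported in the paper).
    """
    nodes = list(adj)
    score = {n: 0.0 for n in nodes}
    score[query] = 1.0
    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        nxt[query] += restart  # jump back to the query vertex
        delta = sum(abs(nxt[n] - score[n]) for n in nodes)
        score = nxt
        if delta < tol:
            break
    return score


def top_entities(score, prefix, k=3):
    """Rank vertices of one entity type (selected by label prefix) by score."""
    candidates = [n for n in score if n.startswith(prefix)]
    return sorted(candidates, key=lambda n: (-score[n], n))[:k]


# Hypothetical toy bibliographic graph: terms, papers and authors as vertices.
edges = [
    ("term:graph", "paper:P1"),
    ("term:graph", "paper:P2"),
    ("term:index", "paper:P3"),
    ("author:A1", "paper:P1"),
    ("author:A1", "paper:P2"),
    ("author:A2", "paper:P2"),
    ("author:A2", "paper:P3"),
]

scores = random_walk_with_restart(build_graph(edges), query="term:graph")
print(top_entities(scores, "paper:"))   # papers ranked for the query term
print(top_entities(scores, "author:"))  # authors ranked by the same walk
```

One walk yields a score for every vertex, which is why a single query can rank papers and authors at once. Running the same function on the edges of a few selected sub-networks only would be one simple way to imitate the idea behind Net-MMG, although the paper's actual sub-network selection is not described in the abstract.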

Metadata
Title
Exploring heterogeneous information networks and random walk with restart for academic search
Written by
Meng-Fen Chiang
Jiun-Jiue Liou
Jen-Liang Wang
Wen-Chih Peng
Man-Kwan Shan
Publication date
01.07.2013
Publisher
Springer-Verlag
Published in
Knowledge and Information Systems / Issue 1/2013
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-012-0523-8
