Published in: Knowledge and Information Systems 1/2013

01-07-2013 | Regular Paper

Exploring heterogeneous information networks and random walk with restart for academic search

Authors: Meng-Fen Chiang, Jiun-Jiue Liou, Jen-Liang Wang, Wen-Chih Peng, Man-Kwan Shan


Abstract

In this paper, we explore heterogeneous information networks, in which each vertex represents one entity and the edges reflect linkage relationships. Heterogeneous information networks contain vertices of several entity types, such as papers, authors and terms, and hence can fully reflect multiple linkage relationships among different entities. Such a heterogeneous information network is similar to a mixed media graph (MMG). Representing a bibliographic dataset as an MMG improves the performance of searching for relevant entities (e.g., papers). Furthermore, our academic search supports multiple-entity search: a single query returns several types of entity results, such as relevant papers, authors and conferences. Specifically, given a bibliographic dataset, we propose Global-MMG, in which a global heterogeneous information network is built. When a user submits a query keyword, we perform a random walk with restart (RWR) to retrieve papers or other types of entity objects. To reduce the query response time, we develop the algorithm Net-MMG (short for NetClus-based MMG). Net-MMG first divides a heterogeneous information network into a collection of sub-networks and then performs an RWR on a set of selected relevant sub-networks. We implemented our academic search and conducted extensive experiments using the ACM Digital Library. The experimental results show that, by exploring heterogeneous information networks and RWR, both Global-MMG and Net-MMG achieve better search quality than existing academic search services. In addition, Net-MMG has a shorter query response time while still guaranteeing good quality in search results.
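To make the RWR-based retrieval described in the abstract concrete, the following is a minimal sketch of a random walk with restart on a tiny heterogeneous graph. It is not the authors' implementation: the toy edge list, the `type:name` vertex labels, the restart probability of 0.15, and the function names `random_walk_with_restart` and `rank_by_type` are all assumptions made for illustration.

```python
from collections import defaultdict


def random_walk_with_restart(edges, seeds, restart=0.15, tol=1e-12, max_iter=1000):
    """Iterate r = (1 - restart) * W r + restart * e on an undirected graph.

    W spreads a vertex's score evenly over its neighbours; e is uniform
    over the seed vertices (the query). Returns {vertex: score}.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seeds = [s for s in set(seeds) if s in neighbours]
    if not seeds:
        raise ValueError("no seed vertex occurs in the graph")

    restart_vec = {n: 0.0 for n in neighbours}
    for s in seeds:
        restart_vec[s] = 1.0 / len(seeds)

    scores = dict(restart_vec)
    for _ in range(max_iter):
        # Restart mass first, then the mass propagated along edges.
        nxt = {n: restart * restart_vec[n] for n in neighbours}
        for u, nbrs in neighbours.items():
            share = (1.0 - restart) * scores[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        change = sum(abs(nxt[n] - scores[n]) for n in neighbours)
        scores = nxt
        if change < tol:
            break
    return scores


def rank_by_type(scores, prefix):
    """Return (vertex, score) pairs of one entity type, best first."""
    hits = [(n, s) for n, s in scores.items() if n.startswith(prefix)]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical toy bibliography: terms and authors link to papers.
EDGES = [
    ("term:random walk", "paper:P1"),
    ("term:random walk", "paper:P2"),
    ("term:clustering", "paper:P3"),
    ("author:A1", "paper:P1"),
    ("author:A1", "paper:P2"),
    ("author:A2", "paper:P2"),
    ("author:A2", "paper:P3"),
]

if __name__ == "__main__":
    scores = random_walk_with_restart(EDGES, ["term:random walk"])
    for prefix in ("paper:", "author:"):
        print(prefix, rank_by_type(scores, prefix))
```

A single score vector ranks every entity type at once, which is one way to read the "multiple-entity search via a one-time query" idea: filtering the same scores by `paper:` or `author:` gives the per-type result lists. Restricting `EDGES` to a few relevant sub-networks before calling the function would mimic, very loosely, the Net-MMG strategy of walking only on selected sub-networks.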


Metadata
Title
Exploring heterogeneous information networks and random walk with restart for academic search
Authors
Meng-Fen Chiang
Jiun-Jiue Liou
Jen-Liang Wang
Wen-Chih Peng
Man-Kwan Shan
Publication date
01-07-2013
Publisher
Springer-Verlag
Published in
Knowledge and Information Systems / Issue 1/2013
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-012-0523-8
