Published in: Knowledge and Information Systems 8/2020

18-02-2020 | Regular Paper

Random walk-based entity representation learning and re-ranking for entity search

Author: Takahiro Komamizu



Abstract

Linked Data (LD) has become a valuable source of factual records, and entity search is a fundamental task in LD. Given a query consisting of a set of keywords, the task is to retrieve a set of relevant entities in LD. State-of-the-art approaches to entity search are based on information retrieval techniques. This paper first examines these approaches with a traditional evaluation metric, recall@k, to reveal their potential for improvement. To obtain evidence for this potential, it investigates the relationship between queries and answer entities in terms of path lengths on the LD graph. On the basis of this investigation, the paper addresses the learning of entity representations. Existing entity search methods rely on heuristics that determine which fields (i.e., predicates and related entities) constitute an entity representation. Since these heuristics require burdensome human decisions, this paper aims to remove that burden by using a graph proximity measure. To this end, it proposes RWRDoc, a representation learning method based on random walk with restart (RWR) that learns the representation of an entity as a weighted combination of the representations of the entities reachable under RWR. RWRDoc is designed mainly to improve recall; therefore, as the experiments show, it lacks ranking capability. To improve ranking quality, the paper also proposes PPRSD (Personalized PageRank-based Score Distribution), a re-ranking method for the retrieved results that distributes the relevance scores calculated by text-based entity search methods in a personalized PageRank manner. Experimental evaluations show that RWRDoc improves search quality in terms of recall@1000 and that PPRSD compensates for RWRDoc’s insufficient ranking capability.
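
The two ideas in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the function names, the restart probability of 0.15, the self-loops added to nodes without outgoing edges, and the dense matrix inverse are all assumptions chosen to keep the example short. It shows an RWR proximity matrix being used in two ways: to blend each entity's own term vector with those of reachable entities (in the spirit of RWRDoc), and to spread text-based relevance scores over the graph with the scores as the restart distribution (in the spirit of PPRSD). A recall@k helper is included for completeness.

```python
import numpy as np

def transition_matrix(adj):
    """Row-normalise an adjacency matrix (adj[i, j] = 1 for an edge i -> j).
    Assumption: nodes without outgoing edges get a self-loop."""
    adj = np.asarray(adj, dtype=float).copy()
    idx = np.flatnonzero(adj.sum(axis=1) == 0)
    adj[idx, idx] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)

def rwr_matrix(adj, restart=0.15):
    """Closed-form RWR: column i is the stationary distribution of a walk
    that restarts at node i with probability `restart`."""
    P = transition_matrix(adj)
    n = P.shape[0]
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P.T)

def rwr_representations(adj, base, restart=0.15):
    """Represent each entity as the RWR-weighted sum of the base
    (e.g. term-count) vectors of the entities reachable from it."""
    return rwr_matrix(adj, restart).T @ np.asarray(base, dtype=float)

def ppr_redistribute(adj, scores, restart=0.15):
    """Personalized PageRank whose restart distribution is the normalised
    vector of text-based relevance scores."""
    s = np.asarray(scores, dtype=float)
    return rwr_matrix(adj, restart) @ (s / s.sum())

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant entities found in the top-k results."""
    relevant = set(relevant)
    return len(set(ranked[:k]) & relevant) / len(relevant)

# Toy graph: a chain 0 -> 1 -> 2 -> 3, one distinct term per entity.
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
base = np.eye(4)

reps = rwr_representations(adj, base)
reranked = ppr_redistribute(adj, scores=[1.0, 0.0, 0.0, 0.0])
print(np.round(reps, 3))
print(np.round(reranked, 3))
```

In this toy graph, entity 0 matches only its own term before the blending step, while afterwards its representation also carries weight for the terms of entities 1 to 3. This is the mechanism by which a proximity-weighted representation can raise recall. The dense inverse is practical only for small graphs; a real LD graph would need an iterative or approximate RWR solver.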


Metadata
Title
Random walk-based entity representation learning and re-ranking for entity search
Author
Takahiro Komamizu
Publication date
18-02-2020
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 8/2020
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-020-01445-4
