2016 | OriginalPaper | Chapter

Leveraging Semantic Annotations to Link Wikipedia and News Archives

Authors: Arunav Mishra, Klaus Berberich

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

The overwhelming amount of information available online has made it difficult to look back on past events. We propose a novel linking problem: connecting Wikipedia excerpts that summarize events to online news articles that elaborate on them. We cast this problem as an information retrieval task, treating a given excerpt as a user query with the goal of retrieving a ranked list of relevant news articles. We find that the textual descriptions of Wikipedia excerpts often carry additional semantics representing the time, geolocations, and named entities involved in the event. Our retrieval model treats text and semantic annotations as different dimensions of an event and estimates an independent query model for each dimension to rank documents. In experiments on two datasets, we compare methods that consider different combinations of dimensions and find that the approach leveraging all dimensions suits our problem best.
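
The idea of ranking with one query model per dimension can be illustrated with a minimal sketch. It is not the authors' model: it assumes that every dimension (text, time, geolocation, entity) is reduced to a bag of discrete tokens, that each model is a unigram distribution with additive smoothing, and that a document's score is the sum of negative KL divergences across dimensions. The names `DIMENSIONS`, `smoothed_model`, `neg_kl`, `rank`, and the sample data are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical dimension names; the abstract lists text, time,
# geolocations, and named entities.
DIMENSIONS = ("text", "time", "geo", "entity")


def smoothed_model(tokens, vocab, alpha=1.0):
    """Unigram distribution over vocab with additive smoothing (assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {term: (counts[term] + alpha) / total for term in vocab}


def neg_kl(query_model, doc_model):
    """Negative KL divergence KL(query || doc); higher means more similar."""
    return -sum(p * math.log(p / doc_model[term])
                for term, p in query_model.items())


def rank(query, docs, dimensions=DIMENSIONS):
    """Rank doc ids by the sum of per-dimension negative KL divergences."""
    scores = {doc_id: 0.0 for doc_id in docs}
    for dim in dimensions:
        q_tokens = query.get(dim, [])
        if not q_tokens:
            continue  # the excerpt carries no annotation in this dimension
        # Shared vocabulary so that all documents are scored on the same support.
        vocab = set(q_tokens)
        for doc in docs.values():
            vocab.update(doc.get(dim, []))
        q_model = smoothed_model(q_tokens, vocab)
        for doc_id, doc in docs.items():
            d_model = smoothed_model(doc.get(dim, []), vocab)
            scores[doc_id] += neg_kl(q_model, d_model)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up excerpt and news articles, for illustration only.
QUERY = {"text": ["earthquake", "haiti"], "time": ["2010-01"],
         "geo": ["haiti"], "entity": ["Red_Cross"]}
DOCS = {
    "d1": {"text": ["earthquake", "haiti", "relief"], "time": ["2010-01"],
           "geo": ["haiti"], "entity": ["Red_Cross"]},
    "d2": {"text": ["earthquake", "chile"], "time": ["2010-02"],
           "geo": ["chile"], "entity": ["Red_Cross"]},
    "d3": {"text": ["election", "uk"], "time": ["2010-05"],
           "geo": ["uk"], "entity": ["Labour_Party"]},
}

if __name__ == "__main__":
    print(rank(QUERY, DOCS))                        # all dimensions
    print(rank(QUERY, DOCS, dimensions=("text",)))  # text only
```

Passing a subset via `dimensions` mimics comparing methods that use different combinations of dimensions. Summing the per-dimension scores with equal weight is a simplification chosen for this sketch.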

Metadata

Title: Leveraging Semantic Annotations to Link Wikipedia and News Archives
Authors: Arunav Mishra, Klaus Berberich
Copyright Year: 2016
Publisher: Springer International Publishing
DOI: https://doi.org/10.1007/978-3-319-30671-1_3