
2016 | OriginalPaper | Chapter

Towards Monitoring of Novel Statements in the News

Authors: Michael Färber, Achim Rettinger, Andreas Harth

Published in: The Semantic Web. Latest Advances and New Domains

Publisher: Springer International Publishing


Abstract

In media monitoring, users have a clearly defined information need: finding previously unknown statements about certain entities or relations mentioned in natural-language text. However, commonly used keyword-based search technologies focus on finding relevant documents and cannot judge the novelty of the statements contained in the text. In this work, we propose a new semantic novelty measure that makes it possible to retrieve statements that are both novel and relevant from natural-language sentences in news articles. Relevance is defined by a semantic query of the user, while novelty is ensured by checking whether the extracted statements are related to, but not yet contained in, a knowledge base of the currently known facts. Our evaluation, performed on English news texts with CrunchBase as the knowledge base, demonstrates the effectiveness, unique capabilities, and future challenges of this novel approach to novelty.
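
The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Triple`, `Query`, `is_relevant`, `is_related`, and `novel_and_relevant` are hypothetical, the query is simplified to a triple pattern with wildcards, and "related" is assumed here to mean that both entities of a statement already occur in the knowledge base. The company names are invented toy data.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Triple(NamedTuple):
    """A statement (subject, predicate, object); objects are entities, not literals."""
    subject: str
    predicate: str
    obj: str


class Query(NamedTuple):
    """Hypothetical triple pattern; None acts as a wildcard."""
    subject: Optional[str] = None
    predicate: Optional[str] = None
    obj: Optional[str] = None


def is_relevant(t: Triple, q: Query) -> bool:
    # A statement is relevant if every non-wildcard slot of the query matches.
    return all(want is None or want == got for want, got in zip(q, t))


def is_related(t: Triple, kb: Set[Triple]) -> bool:
    # Assumption: "related" means both entities already occur somewhere in the KB.
    entities = {e for k in kb for e in (k.subject, k.obj)}
    return t.subject in entities and t.obj in entities


def novel_and_relevant(extracted: Iterable[Triple], kb: Set[Triple], q: Query) -> List[Triple]:
    # Keep statements that match the query, are absent from the KB,
    # and are still connected to entities the KB knows about.
    return [t for t in extracted
            if is_relevant(t, q) and t not in kb and is_related(t, kb)]


# Toy knowledge base and toy extraction output (all names invented).
kb = {
    Triple("AcmeCorp", "acquired", "Gadgetron"),
    Triple("Widgetly", "competitor", "Gadgetron"),
}
extracted = [
    Triple("AcmeCorp", "acquired", "Gadgetron"),   # already known
    Triple("AcmeCorp", "acquired", "Widgetly"),    # novel and relevant
    Triple("AcmeCorp", "acquired", "UnknownCo"),   # object not in the KB
    Triple("AcmeCorp", "competitor", "Widgetly"),  # does not match the query
]
result = novel_and_relevant(extracted, kb, Query(predicate="acquired"))
print(result)
```

Under these assumptions, only the statement that AcmeCorp acquired Widgetly is returned. A real system would additionally need entity linking and relation mapping between the text and the knowledge base, which this sketch leaves out.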

Footnotes
1
We use the words “statements”, “facts”, and “triples” interchangeably in this paper.
 
2
We do not consider triples where the object is a literal.
 
3
See http://dbpedia.org, requested on Mar 7, 2016. DBpedia is widely used for entity linking in general-domain settings. However, other KBs can also be used, provided that a suitable entity linking component is available.
 
4
We avoid the DBpedia namespaces for better readability.
 
5
For our example sentence (see Fig. 3), this would be “sixth generation iPhone”, “Apple”, and “touchscreen display”.
 
6
The data sets and evaluation results are available at http://people.aifb.kit.edu/mfa/novel-triple-extraction/.
 
7
See http://crunchbase.com, requested on Mar 7, 2016.
 
8
See http://dbpedia.org, requested on Mar 7, 2016.
 
9
See http://newsfeed.ijs.si, requested on Mar 7, 2016.
 
10
Goals 1 and 3 are not considered here, since facts with the chosen KB properties do not occur often.
 
11
This analysis was performed by evaluating all sentences that contain the labels of both entities of a missed acquisition together with a phrase such as “acquire”, “acquisition”, “buy”, or “purchase”.
 
12
For instance, for the topic “Diana Car Accident”, the task was to find novel information about where the accident happened, who was killed, the extent of injuries, how it happened, and who else was involved.
 
13
See http://trec-kba.org, requested on Mar 7, 2016.
 
Metadata
Title
Towards Monitoring of Novel Statements in the News
Authors
Michael Färber
Achim Rettinger
Andreas Harth
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-34129-3_18