2021 | OriginalPaper | Chapter

Exploding TV Sets and Disappointing Laptops: Suggesting Interesting Content in News Archives Based on Surprise Estimation

Authors: Adam Jatowt, I-Chen Hung, Michael Färber, Ricardo Campos, Masatoshi Yoshikawa

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

Many archival collections have recently been digitized and made available to the wider public. The documents they contain, however, tend to have limited appeal for ordinary users, since the content may appear obsolete and uninteresting. Archival document collections can become more attractive if suitable content is recommended to users. This research proposes a new research direction, Archival Content Suggestion, which aims to discover interesting content in long-term document archives that preserve information on society's history and heritage. To realize this objective, we propose two unsupervised approaches for automatically discovering interesting sentences in news article archives. Our methods detect interesting content by comparing information written in the past with information created in the present, making use of a surprise effect. Experiments on the New York Times corpus show that our approaches effectively retrieve interesting content.

Footnotes
5
One could imagine a service that automatically detects interesting sentences or headlines for broad topics and publishes them daily on web portals of underlying document archives.
 
10
We have also experimented with embedding models but they did not perform better.
 
12
We set n=5 as the number of top sentences returned for every top-ranked topic in the Topic-based MRRW method, and for every top-ranked topic pair in the Topic Pair-based MRRW and Topic co-occurrence methods.
 
13
Anecdotally, this particular example triggered one author's childhood memories. His grandparents owned a USSR-produced TV set and often warned him not to sit close to it when he visited their home. Only now could he understand that his relatives' fears were not without substance. On a more general note, exploring news archives offers opportunities to learn about history, and may sometimes even lead to serendipitous discoveries and recollections, as this example demonstrates.
 
Metadata
Title
Exploding TV Sets and Disappointing Laptops: Suggesting Interesting Content in News Archives Based on Surprise Estimation
Authors
Adam Jatowt
I-Chen Hung
Michael Färber
Ricardo Campos
Masatoshi Yoshikawa
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-72113-8_17