
2021 | Original Paper | Book Chapter

Exploding TV Sets and Disappointing Laptops: Suggesting Interesting Content in News Archives Based on Surprise Estimation

Authors: Adam Jatowt, I-Chen Hung, Michael Färber, Ricardo Campos, Masatoshi Yoshikawa

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing


Abstract

Many archival collections have recently been digitized and made available to the wider public. The documents they contain, however, tend to have limited appeal for ordinary users, since their content may appear obsolete and uninteresting. Archival document collections can become more attractive if suitable content is recommended to users. This research proposes a new research direction, Archival Content Suggestion, which aims to discover interesting content in long-term document archives that preserve information on society's history and heritage. To this end, we propose two unsupervised approaches for automatically discovering interesting sentences in news article archives. Our methods detect interesting content by comparing information written in the past with information created in the present, exploiting a surprise effect. Experiments on the New York Times corpus show that our approaches effectively retrieve interesting content.
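The past-versus-present comparison can be illustrated with a toy sketch. This is not the chapter's method, whose details are not given on this page. It is a simplified stand-in which assumes that a sentence from the archive is surprising when its word pairs rarely share a sentence in a present-day corpus. The function names (`tokenize`, `pair_counts`, `surprise`), the smoothing, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def pair_counts(sentences):
    """Count in how many sentences each unordered word pair co-occurs."""
    counts = Counter()
    for sentence in sentences:
        words = sorted(set(tokenize(sentence)))
        counts.update(combinations(words, 2))
    return counts


def surprise(sentence, present_counts, n_present):
    """Mean negative log of the smoothed present-day co-occurrence
    frequency of the sentence's word pairs (hypothetical score)."""
    words = sorted(set(tokenize(sentence)))
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    total = sum(
        -math.log((present_counts[p] + 1) / (n_present + 1)) for p in pairs
    )
    return total / len(pairs)


# Made-up sentences standing in for a present-day corpus and an archive.
present = [
    "the tv set shows the news",
    "a new laptop is fast",
    "the laptop shows the news",
]
past = [
    "the tv set shows the news",
    "the tv set may explode",
]

counts = pair_counts(present)
ranked = sorted(
    past, key=lambda s: surprise(s, counts, len(present)), reverse=True
)
print(ranked[0])
```

In this toy data, the sentence about an exploding TV set ranks first, because most of its word pairs never co-occur in the present-day sentences.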


Footnotes
5
One could imagine a service that automatically detects interesting sentences or headlines for broad topics and publishes them daily on web portals of underlying document archives.
 
10
We have also experimented with embedding models but they did not perform better.
 
12
We set n=5 as the number of top sentences returned for every top-ranked topic in the Topic-based MRRW method, and for each top-ranked topic pair in the Topic Pair-based MRRW method and the Topic co-occurrence methods.
 
13
Anecdotally, this particular example triggered childhood memories for one of the authors. His grandparents owned a USSR-produced TV set and often warned him not to sit close to it when he visited their home. Only now could he understand that his relatives' fears were not without substance. On a more general note, exploring news archives offers opportunities to learn about history, and may sometimes even lead to serendipitous discoveries and recollections, as this example demonstrates.
 
Metadata
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-72113-8_17
