Published in: Social Network Analysis and Mining 1/2021

01.12.2021 | Original Article

Tweet contextualization: combining sentence extraction, sentence aggregation and sentence reordering to enhance informativeness and readability

Authors: Amira Dhokar, Lobna Hlaoua, Lotfi Ben Romdhane

Abstract

Social media are now very popular, and Twitter is one of the best-known social networks. It is a micro-blog that lets its users send short messages called tweets. A tweet is a message of at most 280 characters and is rarely self-contained, so additional information is needed to make it easier to understand. This task, known as tweet contextualization, has recently attracted a great deal of attention. Given a tweet, the aim of tweet contextualization is to produce an informative paragraph, called a context, from a set of documents that address the topics raised in the tweet. Besides being informative, a summary should be coherent, i.e., well written so that it is readable and grammatically compact. Coherence is therefore an essential property of comprehensible texts. In this paper, we propose a new graph-based approach to tweet contextualization that combines sentence extraction, sentence aggregation and sentence reordering to enhance informativeness and readability and so build a relevant and coherent context. The main idea of the proposed method is to select relevant, informative, coherent and semantically related sentences from the document that best describes the themes expressed by the tweet, and to aggregate relevant phrases in the same graph in order to filter out the more informative ones. We propose a novel algorithm, called the CSA algorithm, to achieve this aim and to construct a concise extract. We also propose a reordering phase to improve the coherence of the resulting context.
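The three stages named in the abstract (sentence extraction, aggregation in a graph, and reordering) can be illustrated with a minimal Python sketch. This is not the CSA algorithm, which the abstract does not specify. The word-overlap (Jaccard) similarity, the `edge_threshold` and `k` parameters, the connectivity bonus of 0.1, and the function names are all assumptions made for illustration, and "reordering" is reduced here to restoring source order.

```python
import re


def tokens(text):
    """Lowercased word set; a crude stand-in for real preprocessing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def contextualize(tweet, sentences, k=2, edge_threshold=0.1):
    """Pick k sentences related to the tweet (hypothetical scoring, not CSA)."""
    toks = [tokens(s) for s in sentences]
    tweet_toks = tokens(tweet)
    n = len(sentences)

    # Aggregation: undirected sentence graph, with an edge between two
    # sentences whose word overlap reaches the (assumed) threshold.
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if jaccard(toks[i], toks[j]) >= edge_threshold:
                neighbours[i].add(j)
                neighbours[j].add(i)

    # Extraction: relevance to the tweet plus a small bonus for sentences
    # that are well connected in the graph (the 0.1 weight is arbitrary).
    def score(i):
        degree = len(neighbours[i]) / max(n - 1, 1)
        return jaccard(tweet_toks, toks[i]) + 0.1 * degree

    ranked = sorted(range(n), key=score, reverse=True)

    # Reordering: restore the original document order of the chosen sentences.
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


tweet = "Twitter raises tweet limit to 280 characters"
sentences = [
    "Bananas are rich in potassium.",
    "Twitter is a micro-blog service.",
    "A tweet on Twitter is limited to 280 characters.",
    "The weather was sunny.",
]
context = contextualize(tweet, sentences, k=2)
print(" ".join(context))
```

In this toy input the two Twitter-related sentences are selected and emitted in their source order. A real system would replace the word-overlap score with a proper relevance and coherence model.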

Footnotes
2
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases.
 
Metadata
Title
Tweet contextualization: combining sentence extraction, sentence aggregation and sentence reordering to enhance informativeness and readability
Authors
Amira Dhokar
Lobna Hlaoua
Lotfi Ben Romdhane
Publication date
01.12.2021
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2021
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-021-00724-4
