Published in: Social Network Analysis and Mining 1/2019

01.12.2019 | Original Article

A lightweight and multilingual framework for crisis information extraction from Twitter data

Authors: Roberto Interdonato, Jean-Loup Guillaume, Antoine Doucet

Abstract

Obtaining relevant and timely information during crisis events is a challenging task that can be fundamental to handling the consequences of both unexpected events (e.g., terrorist attacks) and partially predictable ones (e.g., natural disasters). Even though microblogging-based online social networks (e.g., Twitter) have become an attractive data source in these emergency situations, overcoming the information overload deriving from mass events is not trivial. The aim of this work was to enable unsupervised extraction of relevant information from Twitter data during a crisis event, offering a lightweight alternative to learning-based approaches. The proposed lightweight crisis management framework integrates natural language processing and clustering techniques to produce a ranking of tweets relevant to a crisis situation based on their informativeness. Experiments carried out on six Twitter collections in two languages (English and French) demonstrated the significance and the flexibility of our approach.
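
To make the general idea of an unsupervised "represent, cluster, rank" pipeline concrete, the following is a minimal sketch and not the authors' framework. Everything in it is an assumption made for illustration: the TF-IDF representation, the spherical k-means variant with deterministic seeding, the scoring rule (similarity to the cluster centroid, weighted by relative cluster size) used as a stand-in for informativeness, and all function names.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of Unicode letters (covers English and French)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def tfidf_vectors(docs):
    """Return one unit-length sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency (an illustrative choice).
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Dot product of two sparse vectors (equals cosine for unit vectors)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Sum of sparse vectors, rescaled to unit length."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def kmeans(vectors, k, iterations=10):
    """Spherical k-means seeded with the first k vectors (deterministic)."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute the centroid of every non-empty cluster.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = centroid(members)
    return labels, centroids


def rank_tweets(tweets, k=2):
    """Return (score, tweet) pairs sorted by the assumed informativeness proxy."""
    vectors = tfidf_vectors(tweets)
    labels, centroids = kmeans(vectors, k)
    sizes = Counter(labels)
    scored = [
        (cosine(vec, centroids[lab]) * sizes[lab] / len(tweets), tweet)
        for tweet, vec, lab in zip(tweets, vectors, labels)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample data, not taken from the paper's collections.
sample = [
    "Explosion reported near the central station, police on site",
    "Police confirm explosion at central station, avoid the area",
    "Explosion à la gare centrale, la police est sur place",
    "Thoughts and prayers for everyone tonight",
]
for score, tweet in rank_tweets(sample, k=2):
    print(f"{score:.3f}  {tweet}")
```

Because the sketch needs no labelled data or pretrained model, it reflects the "lightweight" and "unsupervised" constraints stated above. The letter-based tokenizer is language-agnostic at the character level, but a real multilingual setting would likely need per-language stop-word handling.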

Metadata
Title
A lightweight and multilingual framework for crisis information extraction from Twitter data
Authors
Roberto Interdonato
Jean-Loup Guillaume
Antoine Doucet
Publication date
01.12.2019
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2019
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-019-0608-4
