Published in: Information Systems Frontiers 5/2018

20 March 2018

Classifying and Summarizing Information from Microblogs During Epidemics

Authors: Koustav Rudra, Ashish Sharma, Niloy Ganguly, Muhammad Imran

Abstract

During a new disease outbreak, frustration and uncertainty among affected and vulnerable populations increase. Affected communities look for known symptoms, prevention measures, and treatment strategies. Health organizations, on the other hand, try to get situational updates to assess the severity of the outbreak, known affected cases, and other details. The recent emergence of social media platforms such as Twitter provides fast and convenient ways to disseminate information to, and consume information from, a wider audience. Research studies have shown the potential of this online information to address the information needs of concerned authorities during outbreaks, epidemics, and pandemics. In this work, we target three types of end-users: (i) the vulnerable population, i.e., people who are not yet affected and are looking for prevention-related information; (ii) the affected population, i.e., people who are affected and are looking for treatment-related information; and (iii) health organizations, such as the WHO, that are interested in gaining situational awareness to make timely decisions. We use Twitter data from two recent outbreaks (Ebola and MERS) to build an automatic classification approach that categorizes tweets into different disease-related categories. The classified messages are then used to generate different kinds of summaries useful for affected and vulnerable communities as well as health organizations. Results obtained from extensive experimentation show the effectiveness of the proposed approach.


Metadata
Title
Classifying and Summarizing Information from Microblogs During Epidemics
Authors
Koustav Rudra
Ashish Sharma
Niloy Ganguly
Muhammad Imran
Publication date
20 March 2018
Publisher
Springer US
Published in
Information Systems Frontiers / Issue 5/2018
Print ISSN: 1387-3326
Electronic ISSN: 1572-9419
DOI
https://doi.org/10.1007/s10796-018-9844-9