Published in: Social Network Analysis and Mining 1/2015

01-12-2015 | Original Article

Health-related hypothesis generation using social media data

Authors: Jon Parker, Andrew Yates, Nazli Goharian, Ophir Frieder


Abstract

Traditional public health surveillance, also known as syndromic surveillance, is expensive and burdensome because it relies on clinical reports that health professionals author with considerable time and effort. Because of its prohibitive cost, syndromic surveillance is typically performed only for high-risk concerns such as influenza. A health surveillance system that works for numerous health concerns simultaneously would therefore be of great practical use. We present a framework that processes a stream of time-stamped social media messages. The framework produces “interest curves” that permit the generation of hypotheses regarding which health-related conditions or topics may be increasing in prevalence. We do not claim to detect an actual outbreak of a health-related condition, because the framework has access only to social media messages and not to a harder data source such as patient records. This approach differs from prior approaches in that it is not customized to detect one particular illness (e.g., influenza), as is commonly done. The inner workings of the framework can be interpreted as a transformation that converts a signal deeply embedded in the “stream of raw tweets” domain into a signal in the “health-related topics” domain. The framework’s capability is demonstrated by examining multiple interest curves related to seasonal influenza and allergies.
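To make the idea of an interest curve concrete, the following is a minimal sketch and not the authors' method: the abstract does not say how curves are computed. It assumes that a topic can be approximated by a hand-picked keyword set and that the curve value for a day is the share of that day's messages mentioning the topic. The names `TOPIC_KEYWORDS` and `interest_curves` and the keyword lists are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical keyword sets; the abstract does not specify how topics are defined.
TOPIC_KEYWORDS = {
    "influenza": {"flu", "influenza", "fever"},
    "allergies": {"allergy", "allergies", "pollen"},
}


def interest_curves(messages, topic_keywords=TOPIC_KEYWORDS):
    """Map each topic to a list of (day, share) points sorted by day.

    messages: iterable of (iso_timestamp, text) pairs.
    share: fraction of that day's messages containing at least one
           keyword of the topic (an assumed, simplistic definition).
    """
    totals = Counter()           # messages seen per day
    hits = defaultdict(Counter)  # topic -> matching messages per day
    for stamp, text in messages:
        day = datetime.fromisoformat(stamp).date()
        words = set(re.findall(r"[a-z]+", text.lower()))
        totals[day] += 1
        for topic, keywords in topic_keywords.items():
            if words & keywords:
                hits[topic][day] += 1
    return {
        topic: [(day, hits[topic][day] / totals[day]) for day in sorted(totals)]
        for topic in topic_keywords
    }


if __name__ == "__main__":
    sample = [
        ("2013-01-01T08:00:00", "I have the flu"),
        ("2013-01-01T09:00:00", "nice weather"),
        ("2013-01-02T10:00:00", "Pollen is awful today"),
        ("2013-01-02T11:00:00", "flu season again, fever too"),
    ]
    for topic, curve in interest_curves(sample).items():
        print(topic, curve)
```

Normalizing by the daily message count, rather than using raw counts, keeps a rise in overall posting volume from being read as a rise in interest. A rising curve under this sketch would only suggest a hypothesis to examine further, in line with the abstract's caution about not claiming to detect actual outbreaks.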


Metadata
Title
Health-related hypothesis generation using social media data
Authors
Jon Parker
Andrew Yates
Nazli Goharian
Ophir Frieder
Publication date
01-12-2015
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2015
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-014-0239-8
