Mining Health Social Media with Sentiment Analysis

  • Patient Facing Systems
Journal of Medical Systems

Abstract

With the rapid growth of the Internet, more and more users turn to online health communities (forums) to find health-related information, share their medical stories and experiences, and interact with other community members. In this paper, we propose a framework for analyzing the user-generated content of a health community. The framework has three phases. First, we extract medical terms, including conditions, symptoms, treatments, effectiveness, and side effects, to form a virtual document for each question in the community. Next, we modify Latent Dirichlet Allocation (LDA) by adding a weighting scheme; the resulting model, called conLDA, clusters virtual documents with similar medical term distributions into a conditional topic (C-topic). Finally, we analyze the clustered C-topics by sentiment polarity and by physiological and psychological sentiment. The experimental results show that conLDA outperforms the original LDA and clusters relevant medical terms and relevant questions together. The C-topics produced by conLDA are more thematic than those produced by the original LDA. The results of the sentiment analysis may provide a quick reference and valuable insights for patients, caregivers, and doctors.
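
To make the first and third phases concrete, the following is a minimal Python sketch and not the authors' implementation. The term dictionary `MEDICAL_TERMS`, the per-category weights `CATEGORY_WEIGHTS`, and the valence lexicon `VALENCE` are hypothetical toy values invented for illustration; the abstract does not specify the actual term sources, the conLDA weighting scheme, or the sentiment lexicon. The sketch only shows how a question might be reduced to a virtual document of medical terms, how category weights could scale term counts before topic modeling, and how a lexicon-based polarity could be computed.

```python
from collections import Counter

# Hypothetical term dictionary (surface term -> category); illustrative only.
MEDICAL_TERMS = {
    "migraine": "condition",
    "nausea": "symptom",
    "headache": "symptom",
    "ibuprofen": "treatment",
    "helped": "effectiveness",
    "drowsiness": "side_effect",
}

# Hypothetical per-category weights; the real conLDA weights are not
# given in the abstract, so these numbers are assumptions.
CATEGORY_WEIGHTS = {
    "condition": 3.0,
    "symptom": 1.0,
    "treatment": 1.0,
    "effectiveness": 1.0,
    "side_effect": 1.0,
}

# Hypothetical valence lexicon (word -> signed score); illustrative only.
VALENCE = {"helped": 2, "better": 2, "worse": -3, "pain": -2, "awful": -3}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (tok.strip(".,!?;:()\"'").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def virtual_document(question):
    """Keep only the tokens that are known medical terms, in order."""
    return [tok for tok in tokenize(question) if tok in MEDICAL_TERMS]


def weighted_counts(vdoc):
    """Scale each term count by the weight of the term's category."""
    counts = Counter(vdoc)
    return {
        term: n * CATEGORY_WEIGHTS[MEDICAL_TERMS[term]]
        for term, n in counts.items()
    }


def polarity(text):
    """Sum lexicon scores over tokens and map the sign to a label."""
    score = sum(VALENCE.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In a fuller pipeline, the weighted counts would feed a topic model in place of raw term frequencies, and polarity would be aggregated over the questions assigned to each C-topic; both steps are outside the scope of this sketch.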

Acknowledgments

The authors are grateful to the anonymous referees for their helpful comments and suggestions. This research was supported in part by the Ministry of Science and Technology, Republic of China under Grant No. MOST 103-2410-H-002-109-MY3.

Author information

Corresponding author

Correspondence to Anthony J.T. Lee.

Additional information

This article is part of the Topical Collection on Patient Facing Systems

About this article

Cite this article

Yang, FC., Lee, A.J. & Kuo, SC. Mining Health Social Media with Sentiment Analysis. J Med Syst 40, 236 (2016). https://doi.org/10.1007/s10916-016-0604-4

  • DOI: https://doi.org/10.1007/s10916-016-0604-4
