Published in: International Journal of Data Science and Analytics 4/2019

10 May 2018 | Regular Paper

Accurate classification of socially generated medical discourse

Authors: Rana Alnashwan, Humphrey Sorensen, Adrian O’Riordan, Cathal Hoare

Abstract

The growth of online health communities, particularly those involving socially generated content, can provide considerable value for society. Participants can gain medical knowledge or interact with peers on medical forum platforms. Analysing the sentiment expressed by members of a health community in medical forum discourse can be of significant value, for example by identifying a particular aspect of an information space, determining the themes that predominate in a large data set, and allowing people to summarize topics within a large data set. In this paper, we identify sentiments expressed in online medical forums that discuss Lyme disease. Our research has two goals: first, to identify a complete and relevant set of categories that can characterize Lyme disease discourse; and second, to test and investigate strategies, both individually and collectively, for automating the classification of medical forum posts into those categories. We present a feature-based model that consists of three feature sets: content-free, content-specific and meta-level features. Employing inductive learning algorithms to build a feature-based classification model, we assess the feasibility and accuracy of our automated classification. We further evaluate our model by assessing its ability to adapt to an online medical forum discussing lupus. The experimental results demonstrate the effectiveness of our approach.
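
The abstract names three feature sets but does not list the individual features, so the following is only a minimal sketch of how such a combined feature vector might be assembled for one forum post. It assumes one common reading of the terms: content-free features as stylistic cues, content-specific features as topic keyword counts, and meta-level features as scores derived from sentiment lexicons. The word lists, feature names and function names are all hypothetical and are not taken from the paper.

```python
import re
import string

# Hypothetical tiny word lists, for illustration only; the paper's actual
# lexicons and keyword lists are not given in the abstract.
POSITIVE_WORDS = {"better", "hope", "thanks", "relief", "improved"}
NEGATIVE_WORDS = {"pain", "worse", "tired", "afraid", "fatigue"}
TOPIC_KEYWORDS = ("lyme", "antibiotics", "tick", "doctor", "symptoms")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def content_free_features(text):
    """Stylistic cues that do not depend on what the words mean (assumed)."""
    tokens = tokenize(text)
    n_punct = sum(ch in string.punctuation for ch in text)
    return {
        "n_tokens": len(tokens),
        "avg_token_len": sum(map(len, tokens)) / len(tokens) if tokens else 0.0,
        "punct_ratio": n_punct / len(text) if text else 0.0,
        "n_question_marks": text.count("?"),
    }


def content_specific_features(text):
    """Counts of topic keywords (assumed stand-in for n-gram features)."""
    tokens = tokenize(text)
    return {f"kw_{kw}": tokens.count(kw) for kw in TOPIC_KEYWORDS}


def meta_level_features(text):
    """Aggregate scores derived from sentiment lexicons (assumed)."""
    tokens = tokenize(text)
    pos = sum(tok in POSITIVE_WORDS for tok in tokens)
    neg = sum(tok in NEGATIVE_WORDS for tok in tokens)
    return {"lex_pos": pos, "lex_neg": neg, "lex_polarity": pos - neg}


def extract_features(text):
    """Merge the three feature groups into one flat feature dictionary."""
    features = {}
    features.update(content_free_features(text))
    features.update(content_specific_features(text))
    features.update(meta_level_features(text))
    return features


post = "Is the pain from Lyme getting worse? My doctor changed my antibiotics."
vector = extract_features(post)
```

A dictionary like `vector` could then be passed to any inductive learner that accepts tabular features. Keeping the three groups in separate functions makes it straightforward to evaluate each feature set individually and in combination, which is the comparison the abstract describes.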

Footnotes
1
The terms sentiment and affect are used interchangeably in the literature; both refer to the extraction of opinions, emotions or views expressed in a text.
 
Metadata
Title
Accurate classification of socially generated medical discourse
Authors
Rana Alnashwan
Humphrey Sorensen
Adrian O’Riordan
Cathal Hoare
Publication date
10 May 2018
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 4/2019
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-018-0128-8
