11-07-2022 | Regular Paper

Domain-specific text dictionaries for text analytics

Authors: Andrea Villanes, Christopher G. Healey

Published in: International Journal of Data Science and Analytics

Abstract

We investigate the use of sentiment dictionaries to estimate sentiment for large document collections. Our goal in this paper is a semiautomatic method for extending a general sentiment dictionary for a specific target domain in a way that minimizes manual effort. General sentiment dictionaries may not contain terms important to the target domain or may score terms in ways that are inappropriate for the target domain. We combine statistical term identification and term evaluation using Amazon Mechanical Turk to extend the EmoLex sentiment dictionary to a domain-specific study of dengue fever. The same approach can be applied to any term-based sentiment dictionary or target domain. We explain how terms are identified for inclusion or re-evaluation and how Mechanical Turk generates scores for the identified terms. Examples are provided that compare EmoLex sentiment estimates before and after it is extended. We conclude by describing how our sentiment estimates can be integrated into an epidemiology surveillance system that includes sentiment visualization and discussing the strengths and limitations of our work.
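As a rough illustration of the idea summarized above, the following sketch shows how a term-based sentiment estimate can shift once a general dictionary is extended with domain terms. The term scores, the simple averaging rule, and the names extend_dictionary and estimate_sentiment are assumptions made for this example; they are not EmoLex entries and not the authors' scoring procedure.

```python
import re
from statistics import mean

# Hypothetical term scores on a -1 (negative) to +1 (positive) scale.
# Illustrative values only; these are NOT taken from EmoLex.
GENERAL_DICT = {"fever": -0.4, "death": -0.9, "recover": 0.6, "outbreak": -0.5}

# Hypothetical domain extension: two new terms plus one re-scored term ("fever").
DOMAIN_TERMS = {"dengue": -0.7, "mosquito": -0.5, "fever": -0.7}


def extend_dictionary(general, domain):
    """Return a new dictionary in which domain scores add to or override general ones."""
    extended = dict(general)
    extended.update(domain)
    return extended


def estimate_sentiment(text, dictionary):
    """Average the scores of tokens found in the dictionary; None if nothing matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = [dictionary[t] for t in tokens if t in dictionary]
    return mean(scores) if scores else None


if __name__ == "__main__":
    text = "Dengue fever outbreak spreads as mosquito numbers rise"
    extended = extend_dictionary(GENERAL_DICT, DOMAIN_TERMS)
    print("general :", estimate_sentiment(text, GENERAL_DICT))
    print("extended:", estimate_sentiment(text, extended))
```

With the general dictionary only two of the tokens are scored; after extension four are, so the estimate reflects terms that matter in the target domain. A real pipeline would also need stemming or lemmatization and a crowdsourced scoring step, which this sketch leaves out.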
Footnotes
1
https://www.brandwatch.com, formerly Crimson Hexagon, a subscription service that provides “insights from 100 million sources and 1.4 trillion posts”
 
2
The odds a particular outcome occurs given a particular exposure, versus the odds of the outcome absent the exposure.
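
The odds ratio defined in this footnote can be computed from a 2x2 table of counts. The sketch below is a minimal example under that standard definition; the function name, argument names, and the counts in the usage lines are hypothetical and do not come from the paper.

```python
def odds_ratio(exposed_with, exposed_without, unexposed_with, unexposed_without):
    """Odds of the outcome given exposure, divided by the odds absent exposure.

    Arguments are counts from a 2x2 contingency table:
      exposed_with      exposed cases where the outcome occurred
      exposed_without   exposed cases where it did not
      unexposed_with    unexposed cases where the outcome occurred
      unexposed_without unexposed cases where it did not
    """
    if exposed_without == 0 or unexposed_with == 0 or unexposed_without == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    odds_exposed = exposed_with / exposed_without
    odds_unexposed = unexposed_with / unexposed_without
    return odds_exposed / odds_unexposed


if __name__ == "__main__":
    # Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed show the outcome.
    print(odds_ratio(30, 70, 10, 90))
```

A value above 1 indicates the outcome is more likely with the exposure than without it; a value near 1 indicates little association.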
 
Metadata
Title
Domain-specific text dictionaries for text analytics
Authors
Andrea Villanes
Christopher G. Healey
Publication date
11-07-2022
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-022-00344-x
