Published in: The Journal of Supercomputing 5/2020

10-01-2019

Developing a supervised learning-based social media business sentiment index

Authors: Hyeonseo Lee, Nakyeong Lee, Harim Seo, Min Song

Abstract

The rapid growth of digital data generation has ushered in the era of big data, which becomes particularly valuable because approximately 70% of the data collected in the world comes from social media. Investigating online social network services is therefore of paramount importance. In this paper, we use sentiment analysis, which detects attitudes and emotions toward social issues posted on social media, to understand the actual economic situation. To this end, we propose two steps. In the first step, we train sentiment classifiers on several large social media datasets and consider three types of feature sets: feature vectors, sequence vectors, and a combination of dictionary-based features and sequence vectors. We then assess the performance of six classifiers: MaxEnt-L1, C4.5 decision tree, SVM-kernel, AdaBoost, Naïve Bayes and MaxEnt. In the second step, we collect datasets relevant to several economic words that the public uses to express opinions explicitly. Finally, we use vector auto-regression analysis to test our hypothesis. The results show a statistically significant relationship between public sentiment and economic performance. Specifically, “depression” and “unemployment” lead the KOSPI (the Korea Composite Stock Price Index). The results also show that keywords extracted from the sentiment analysis, such as “price,” “year-end-tax” and “budget deficit,” cause changes in exchange rates.
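The lead-lag test in the final step can be illustrated with a small sketch. It is not the authors' implementation: it reduces the vector auto-regression to a two-variable, one-lag Granger-style F-test fitted by ordinary least squares, and it runs on synthetic data. The function name `granger_f_stat`, the series names `sentiment` and `index`, and all coefficients are assumptions made for illustration only.

```python
import numpy as np


def granger_f_stat(cause, effect, lags=1):
    """F statistic for the hypothesis that lags of `cause` add no
    predictive power for `effect` beyond the lags of `effect` itself."""
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    n = len(effect) - lags
    target = effect[lags:]

    # Column k holds the series shifted back by k steps (k = 1..lags).
    own_lags = np.column_stack(
        [effect[lags - k: len(effect) - k] for k in range(1, lags + 1)]
    )
    cross_lags = np.column_stack(
        [cause[lags - k: len(cause) - k] for k in range(1, lags + 1)]
    )
    const = np.ones((n, 1))
    restricted = np.hstack([const, own_lags])                # own history only
    unrestricted = np.hstack([const, own_lags, cross_lags])  # plus the other series

    def rss(design):
        # Residual sum of squares of an OLS fit of `target` on `design`.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = n - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic example (hypothetical data): the index follows lagged sentiment.
rng = np.random.default_rng(0)
T = 300
sentiment = rng.normal(size=T)
index = np.zeros(T)
for t in range(1, T):
    index[t] = 0.5 * index[t - 1] + 0.8 * sentiment[t - 1] + 0.1 * rng.normal()

f_forward = granger_f_stat(sentiment, index)  # sentiment -> index
f_reverse = granger_f_stat(index, sentiment)  # index -> sentiment
print(f"sentiment -> index F = {f_forward:.1f}")
print(f"index -> sentiment F = {f_reverse:.1f}")
```

A large F statistic in one direction and a small one in the other is the pattern that would be read as the sentiment series leading the economic series. A full analysis would also select the lag order, check stationarity, and convert the statistic to a p-value.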

Metadata
Title
Developing a supervised learning-based social media business sentiment index
Authors
Hyeonseo Lee
Nakyeong Lee
Harim Seo
Min Song
Publication date
10-01-2019
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 5/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-02737-x
