Published in: Cluster Computing, Special Issue 3/2019

4 August 2017

Opinion mining on large scale data using sentiment analysis and k-means clustering

Authors: Sumbal Riaz, Mehvish Fatima, M. Kamran, M. Wasif Nisar

Abstract

With the rapid growth of web technology and easy access to the internet, online shopping has increased. People now express their opinions and share their experiences online, which greatly influences new buyers' purchasing decisions and generates large data sets. This data is very helpful for analyzing customer preferences, needs, and behavior toward a product. Companies face the challenge of analyzing this sheer volume of data to extract customer opinion. To address this challenge, in this paper we performed phrase-level sentiment analysis on real-world customer review data to determine customer preference by analyzing subjective expressions. We then calculated the strength of each sentiment word to find the intensity of each expression, and applied k-means clustering to place the words in clusters based on their intensity. We also compared the results of our technique with the star rankings given on the same dataset and found a drastic difference in the results. Finally, we provide a visual representation of our results to give clear insight into customer preference and behavior and to support decision makers.
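
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula or the lexicon, so the word strengths in WORD_STRENGTH are made-up values, and the helper names kmeans_1d and cluster_words are hypothetical. The sketch only shows how plain k-means (Lloyd's algorithm) could group sentiment words by a scalar intensity score.

```python
from statistics import mean

# Hypothetical sentiment-strength scores in [-1, 1]; the paper's actual
# lexicon and strength calculation are not shown in the abstract.
WORD_STRENGTH = {
    "terrible": -0.9,
    "bad": -0.6,
    "poor": -0.5,
    "okay": 0.1,
    "fine": 0.2,
    "good": 0.6,
    "great": 0.8,
    "excellent": 0.95,
}


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation (an assumption of this sketch):
    # spread the starting centroids evenly across the sorted values.
    if k == 1:
        centroids = [mean(ordered)]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def cluster_words(strengths, k=3):
    """Group words into k clusters according to their strength score."""
    words = list(strengths)
    centroids, labels = kmeans_1d([strengths[w] for w in words], k)
    clusters = {c: [] for c in range(k)}
    for word, label in zip(words, labels):
        clusters[label].append(word)
    return centroids, clusters


if __name__ == "__main__":
    centroids, clusters = cluster_words(WORD_STRENGTH, k=3)
    for c in sorted(clusters):
        print(f"cluster {c} (centroid {centroids[c]:.2f}): {clusters[c]}")
```

With three clusters, the toy scores separate into a negative, a near-neutral, and a positive group. On real review data the number of clusters and the initialisation would need to be chosen and validated rather than fixed as they are here.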

Metadata
Title
Opinion mining on large scale data using sentiment analysis and k-means clustering
Authors
Sumbal Riaz
Mehvish Fatima
M. Kamran
M. Wasif Nisar
Publication date
4 August 2017
Publisher
Springer US
Published in
Cluster Computing / Special Issue 3/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-017-1077-z
