Published in: Arabian Journal for Science and Engineering 4/2021

13 February 2021 | Research Article-Computer Engineering and Computer Science

An Improved Feature Selection Based on Naive Bayes with Kernel Density Estimator for Opinion Mining

Authors: Raja Rajeswari Sethuraman, John Sanjeev Kumar Athisayam


Abstract

Opinion mining has gained much attention in recent years owing to the rapid growth of social media. The task is to analyze customer reviews and support decision-making by classifying each review as positive or negative. Text reviews are high-dimensional, which leads to the curse of dimensionality. To handle this high dimensionality, an improved gain ratio is proposed that selects the highest-ranked features. A Naïve Bayes classifier with kernel density estimation is used to evaluate the selected feature set. This classifier is nonparametric: it computes the class-conditional probability density function from a kernel estimator. It produces higher accuracy on various benchmark datasets.
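The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not define the improved gain ratio, so the code below uses the standard gain ratio (information gain divided by split information) as a stand-in. The classifier assumes a Gaussian kernel with a fixed bandwidth, which is an illustrative choice rather than the paper's configuration. The names `entropy`, `gain_ratio`, `KernelNaiveBayes`, and `bandwidth` are hypothetical.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy in bits of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def gain_ratio(feature, labels):
    """Standard gain ratio of a discrete feature (not the paper's improved variant)."""
    feature, labels = np.asarray(feature), np.asarray(labels)
    values, counts = np.unique(feature, return_counts=True)
    weights = counts / counts.sum()
    # Expected label entropy after splitting on the feature.
    conditional = sum(w * entropy(labels[feature == v])
                      for v, w in zip(values, weights))
    split_info = float(-(weights * np.log2(weights)).sum())
    if split_info == 0.0:  # constant feature carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


class KernelNaiveBayes:
    """Naive Bayes with one Gaussian kernel density estimate per feature and class."""

    def __init__(self, bandwidth=0.5):  # assumed fixed bandwidth
        self.bandwidth = bandwidth

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.train_ = {c: X[y == c] for c in self.classes_}
        self.log_prior_ = {c: np.log(np.mean(y == c)) for c in self.classes_}
        return self

    def _log_density(self, X, c):
        h = self.bandwidth
        # diffs[i, j, d]: scaled distance from query i to training point j on feature d
        diffs = (X[:, None, :] - self.train_[c][None, :, :]) / h
        kernel = np.exp(-0.5 * diffs ** 2) / (h * np.sqrt(2.0 * np.pi))
        density = kernel.mean(axis=1)  # average over training points, per feature
        # Naive independence assumption: sum the per-feature log densities.
        return np.log(density + 1e-300).sum(axis=1)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = np.column_stack(
            [self.log_prior_[c] + self._log_density(X, c) for c in self.classes_]
        )
        return self.classes_[scores.argmax(axis=1)]
```

In such a pipeline one would score every term with `gain_ratio`, keep the top-ranked columns, and fit `KernelNaiveBayes` on the reduced matrix. The bandwidth controls how smooth each estimated density is and would normally be tuned on held-out data.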


Metadata
Title
An Improved Feature Selection Based on Naive Bayes with Kernel Density Estimator for Opinion Mining
Authors
Raja Rajeswari Sethuraman
John Sanjeev Kumar Athisayam
Publication date
13 February 2021
Publisher
Springer Berlin Heidelberg
Published in
Arabian Journal for Science and Engineering / Issue 4/2021
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI
https://doi.org/10.1007/s13369-021-05381-5
