Introduction
Natural language processing
Sentiment analysis
Data sources
Review
Sentiment classification
Naïve Bayes
Nearest neighbor
Centroid based
Support vector machine
Unsupervised techniques
Dictionary (Lexicon) based techniques
Statistics (Corpus) based techniques
Complex challenges
Document level
Sentence level
Feature level
Lexicon level
Discussion
SNo. | Naïve Bayes | k-Nearest neighbor | Centroid |
---|---|---|---|
1 | Yes | Yes | Yes |
2 | Yes | Yes | No |
3 | Yes | No | No |
4 | Word probability | Value of k | Centroid vector |
5 | Probability weights | Distance similarity | Vector distance |
6 | Simple and fast | Handle correlated features | Classify on vector distance |
7 | Assume feature independence | Sensitive to irrelevant features | Sensitive to noise |
8 | Yes | Too expensive | No |

SNo. | Support vector machine | Lexicon (dictionary) based | Statistical (corpus) based |
---|---|---|---|
1 | Yes | No | No |
2 | No | NA | NA |
3 | No | No | Yes |
4 | Kernel function | Word polarity | Feature matrix |
5 | Hyperplane | Word polarity | Word distance |
6 | Classify on hyperplane | Can identify new lexicons | Handle online data |
7 | Require more resources | Struggle with domain context | Conceptual document size |
8 | No | Yes | Yes |