| Determine the sentiment polarity | GCN | To extract graph representations | Effective in enhancing the dependency information of sentences and improving the model's ability to learn sentiment extraction. Effective in constructing undirected graphs that fully utilize sentence dependency trees and derive precise affective relations | 164 | 82 |
SenticNet | To leverage affective dependencies for specific aspects |
Dependency tree | To capture syntactical information |
Aspect-specific affective enhanced dependency graph | To enhance affective information |
LSTM | To derive hidden contextual representations from sentence embeddings |
GloVe | For word embeddings in non-BERT models, capturing semantic relationships |
L2 regularization | To prevent overfitting by penalizing large weights |
Adam optimizer | For efficient parameter optimization |
Uniform initialization | To initialize weights and biases uniformly for neural network layers |
BERT | For contextual word embeddings for sentiment analysis |
| Hotel recommendation system based on sentiment analysis of the reviews | BERT | To classify sentiments in reviews using an ensemble model with three phases | Effective in reducing computational time and achieving higher accuracy for sentiment analysis by combining all feature settings | 70 | 23.33 |
Word2vec | To generate word vectors as textual features for the sentiment classification model |
TF–IDF | To calculate the importance of frequent words in the reviews as textual features for the classification model |
Random forest classifier | To classify the sentiment of reviews based on various textual features |
Fuzzy logic | To categorize reviews into different aspects by handling misspellings and typographical errors |
TPU v3-8 | To accelerate the training process of the BERT model |
Fuzzy string matching | To improve the accuracy of aspect-based review categorization by handling variations in text |
| Aspect-level sentiment classification | Interactive multitask learning networks | Solves multiple tasks simultaneously, enabling better exploitation of interactions between tasks | Effective in co-extracting aspect and opinion terms. Effective in utilizing domain-specific knowledge through joint training and domain-specific embeddings to enhance task performance. Effective in maintaining competitive performance even without domain-specific embeddings | 112 | 22.4 |
Message passing mechanism | Allows informative interactions between tasks by sending useful information back to a shared latent representation |
CNN | Used in the feature extraction component after the word embedding layer to extract features from the input sequence |
Shared latent representation | A sequence of latent vectors shared among all tasks, initialized by the feature extraction component and updated through message passing |
GloVe | Provides pre-trained word vectors to capture general semantic information
FastText | Provides pre-trained word vectors, trained on a large domain-specific corpus for the restaurant and laptop domains |
Adam optimizer | To optimize the learning process |
| Aspect extraction | Neural word embeddings | To map words that usually co-occur within the same context to nearby points in the embedding space | Effective in extracting coherent aspects and identifying separable aspects despite challenges in specific categories such as taste and smell | 220 | 31.43 |
Aspect embeddings | To represent aspects in the same embedding space as words |
Grid search | To optimize hyperparameters of topic models using the topic coherence metric |
Gibbs sampling | To perform 1000 iterations of sampling for all topic models to infer topic distributions |
GibbsLDA++ | To implement LocLDA for topic modeling using Gibbs sampling
Word2vec | To initialize the word embedding matrix with pre-trained word vectors and set specific parameters |
K-means clustering | To initialize the aspect embedding matrix with cluster centroids from word embeddings |
Adam optimizer | To optimize model parameters during training |
Orthogonality penalty | To enforce an orthogonality constraint on aspect embeddings |
| Aspect-category sentiment analysis and aspect-term sentiment analysis | CNN | To efficiently extract N-gram features at multiple granularities | Effective in controlling sentiment information flow at a fine granularity. Effective in unraveling aspect and sentiment information. Effective in differentiating the sentiments of multiple entities within the same sentence | 300 | 50 |
Gated Tanh-ReLU units | To selectively output sentiment features based on the given aspect or entity (a minimal sketch of this gating follows the table)
Multiple filters in convolutional layers | To capture features at different granularities within each receptive field |
GloVe | To initialize word embedding vectors |
Uniform distribution | To initialize out-of-vocabulary words with a uniform distribution |
Adagrad optimizer | To optimize model parameters |
| Aspect-level sentiment classification | LSTM | To model aspects and texts simultaneously and learn long-term dependencies while avoiding the gradient vanishing or exploding problem | Effective in learning sentiment polarities of different aspects. Effective in identifying sentiment-indicating words for specific aspects when multiple aspects appear within a sentence | 192 | 32 |
Bi-LSTM | To learn the hidden semantics of words in both the sentence and the aspect target by processing the sequence in both forward and backward directions |
Attention-over-attention module | To automatically generate mutual attention from aspect-to-text and text-to-aspect (a minimal sketch follows the table)
Uniform initialization | To initialize all weight matrices randomly from a uniform distribution |
Zero initialization | To initialize all bias terms to zero |
L2 regularization | To prevent overfitting |
GloVe | To initialize word embeddings |
Adam optimizer | To optimize the model |
| Aspect term extraction | LSTM | To build initial aspect and opinion representations by recording sequential information | Effective in leveraging opinion summaries for improved aspect extraction. Effective in discovering uncommon aspects by utilizing history attention mechanisms | 114 | 19 |
Truncated history attention (THA) | To encode historical information into aspect representations by distilling useful features from recent aspect predictions and generating history-aware aspect representations |
Selective transformation network (STN) | To obtain opinion summaries by applying aspect information to transform initial opinion representations and using attention over these transformed representations |
Bi-linear attention network | To calculate the opinion summary as a weighted sum of new opinion representations based on their associations with the current aspect representation |
GloVe | To initialize word embeddings |
Uniform distribution | To initialize embeddings for out-of-vocabulary words by sampling randomly from a uniform distribution
Glorot uniform initialization | To initialize the matrices in LSTMs using the Glorot Uniform strategy |
| Aspect-based sentiment classification | GCN | To exploit syntactical information and word dependencies (a minimal dependency-graph GCN sketch follows the table) | Effective in demonstrating the insufficiency of directly integrating syntax information into the attention mechanism | 205 | 41 |
Bi-LSTM | To capture contextual information regarding word order in sentences |
Multi-layered graph convolution structure | To encode and update the representation of nodes in the graph using features of immediate neighbors and to draw syntactically relevant words to the target aspect |
GloVe | To initialize word embeddings |
Uniform initialization | To initialize all model weights uniformly |
Adam optimizer | To optimize the model |
L2 regularization | To prevent overfitting |
| Aspect-level sentiment analysis | Bi-LSTM | To learn representations for features of a sentence by integrating context information in both forward and backward directions | Effective in encoding context and dependency information into aspect vectors for sentiment classification. Effective in propagating relevant information along the sequence of words and syntactic dependency paths | 169 | 33.8 |
GCN | To enhance the embeddings learned by Bi-LSTM by integrating dependency information directly from the dependency tree of the sentence |
Dependency tree | To structure the sentence in a way that highlights syntactic relationships between words |
GloVe | To initialize word embeddings |
Part-of-speech (POS) embeddings | To incorporate syntactic information into the model by embedding POS tags |
Bi-LSTM | To capture contextual information for each word by learning embeddings |
Adam optimizer | To optimize the model parameters |
| Aspect extraction and aspect sentiment classification | BERT | To initialize the word embeddings with pre-trained language model representations | Effective in improving aspect extraction by incorporating contextualized domain knowledge. Effective in enhancing classification across multiple review-based tasks through joint post-training | 285 | 57 |
FP16 computation | To reduce the size of both the model and hidden representations of data |
Adam optimizer | To optimize the model parameters during training |
| Aspect-level sentiment classification | Global lexical graph | To encode corpus-level word co-occurrence information | Effective in leveraging both syntactic and lexical graphs to improve classification and sentiment polarity identification | 95 | 23.75 |
Hierarchical syntactic graph | To differentiate various types of dependency relations by grouping similar dependency types |
Hierarchical lexical graph | To distinguish different types of word co-occurrence relations |
Bi-level interactive graph convolution network | To fully exploit and integrate the information from both the syntactic and lexical graphs |
GloVe | To initialize word embeddings |
spaCy toolkit | To extract dependency relations from text |
Adam optimizer | To optimize the neural network |
L2 regularization | To prevent overfitting |
| Aspect-level sentiment classification | Knowledge transfer from document-level data | To improve the performance of aspect-level sentiment classification by leveraging less expensive, document-level data | Effective in capturing domain-specific opinion words. Effective in handling sentences with negation words. Effective in recognizing neutral instances to compensate for the lack of aspect-level training examples | 107 | 17.83 |
LSTM | To enhance aspect-level sentiment classification by integrating document-level knowledge |
GloVe | To initialize embeddings |
L2 regularization | To prevent overfitting by penalizing large weights |
RMSProp optimizer | To optimize the model parameters |
| Aspect-level sentiment classification | Interactive attention networks (IAN) | To interactively learn attention in contexts and targets and generate their representations separately | Effective in interactively learning and modeling the representations of targets and contexts for sentiment classification. Effective in enhancing sentiment polarity prediction by fully considering the interaction between target and context | 494 | 70.57 |
LSTM | To handle sequential data and capture long-term dependencies in the text for sentiment classification |
GloVe | To initialize word embeddings from context and target with pre-trained word vectors |
Uniform initialization | To initialize embeddings by sampling from a uniform distribution |
Zero initialization | To initialize all biases to zero |
Momentum optimization | To train the parameters of IAN |
L2 regularization | To prevent overfitting |
| Aspect sentiment triplet extraction | Unified tagging system | To label aspect terms and sentiments using a unified tagging schema built on top of stacked Bi-LSTM networks | Effective in extracting aspects and their associated sentiments simultaneously. Effective in leveraging mutual information among aspect extraction, sentiment classification, and opinion term extraction to enhance overall performance. Effective in incorporating sentiment classification signals to aid in the accurate extraction of opinion terms | 104 | 26 |
Bi-LSTM | To perform sequence tagging for aspect extraction, sentiment classification, and opinion term extraction by capturing contextual information in both directions |
BIO-like tagging system | To label opinion terms, providing a structured way to identify the beginning, inside, and outside of opinion expressions |
GCN | To utilize semantic and syntactic information in a sentence for opinion term tagging |
GloVe | To initialize word embeddings |
SGD optimizer | To train the model with a stochastic gradient descent algorithm |
| Aspect-level sentiment classification | GCN | To capture sentiment dependencies between multiple aspects in one sentence | Effective in capturing sentiment dependencies between multiple aspects in one sentence. Effective in capturing interactive information among multiple aspects | 97 | 24.25 |
Bi-LSTM | To capture the contextual information for each word |
GloVe | To initialize the word embeddings |
BERT | To initialize the word embeddings with pre-trained language model representations |
Normal distribution | To initialize the weight matrix of the last fully connected layer using a normal distribution |
Uniform distribution | To initialize all weight matrices (except the last fully connected layer) using a uniform distribution |
L2 regularization | To prevent overfitting |
Adam optimizer | To optimize the model parameters |
| Aspect-level sentiment classification | Fine-grained attention mechanism | To capture the word-level interaction between aspect and context | Effective in linking and fusing information between context and aspect words. Effective in handling aspects with multiple words, reducing information loss in coarse-grained attention mechanisms. Effective in capturing aspect-level interactions to improve performance, especially in datasets with multiple aspects per sentence | 238 | 39.67 |
Coarse-grained attention mechanism | To capture the overall interaction between aspect and context |
Multi-grained attention network (MGAN) | To combine fine-grained and coarse-grained attention mechanisms for comprehensive aspect-level sentiment analysis |
Bi-LSTM | To capture temporal interactions among words by processing sequences in both forward and backward directions |
GloVe | To initialize word embeddings for both context and aspect words |
Uniform initialization | To initialize the weight matrix and bias by sampling from a uniform distribution |
L2 regularization | To prevent overfitting |
| Aspect-based sentiment analysis | Dependency parsing | To obtain the dependency tree of a sentence | Effective in capturing important syntactic structures for sentiment analysis. Effective in handling multiple aspects within a single sentence, improving the accuracy across different semantic distance ranges | 211 | 52.75 |
Relational graph attention network (R-GAT) | To encode the aspect-oriented dependency tree structure for sentiment prediction |
Graph attention network (GAT) | To generalize encoding graphs with labeled edges |
Bi-LSTM | To encode the word embeddings of tree nodes and the aspect words |
GloVe | To provide word embeddings for R-GAT |
BERT | To provide pre-trained word representations, with fine-tuning on the task |
Adam optimizer | To train the model with efficient gradient-based optimization |
| Target-dependent aspect detection and targeted aspect-based polarity classification | LSTM | To model sequences and maintain long-term dependencies in the text data | Effective in significantly improving aspect detection. Effective in using target-level attention to identify parts of target expressions with higher sentiment salience | 187 | 31.17 |
Stacked attention mechanism | To focus on different levels of information, specifically target-level and sentence-level |
Sentic LSTM | To integrate explicit commonsense knowledge with implicit knowledge within the LSTM architecture |
Recurrent additive network | To simulate semantic patterns and enhance the LSTM by modeling the additive effects of sentiments |
Syntax-based concept parser | To extract a set of concept candidates at each time step |
AffectiveSpace embedding | To provide concept embeddings that represent the affective aspects of the concepts |
| Aspect-level sentiment classification | Syntax-based GCN | To model the syntactic dependency tree and enhance sentence representation toward a given aspect independently | Effective in incorporating syntactic dependency trees and knowledge graphs for aspect-level sentiment classification. Effective in enriching sentence representations toward given aspects | 91 | 22.75 |
Knowledge-based GCN | To model commonsense knowledge graphs independently and enrich sentence representation toward a given aspect |
Bi-LSTM | To obtain contextualized word representations as input features for the GCN by modeling the sentence from the embeddings |
GloVe | To provide pre-trained word embeddings |
BERT | To generate contextualized word embeddings that capture the meaning of words in context |
Uniform initialization | To initialize all out-of-vocabulary words and weights with a uniform distribution |
Adam optimizer | To optimize the model parameters |
| Aspect-level sentiment classification | Dependency graph | To represent a sentence as a dependency graph instead of a word sequence | Effective in leveraging dependency graphs to propagate sentiment features from syntax-dependent words to the aspect target. Effective in utilizing syntax information to improve sentiment classification performance | 116 | 23.2 |
Graph attention network (GAT) | To propagate sentiment features from important syntax neighborhood words to the aspect target in the dependency graph |
LSTM | To explicitly capture aspect-related information across layers during recursive neighborhood expansion within the TD-GAT framework |
GloVe | To provide pre-trained word embeddings for initializing the model |
BERT | To incorporate deep contextualized word representations |
L2 regularization | To prevent overfitting by penalizing large weights |
Adam optimizer | To optimize the model parameters |
SGD optimizer | To fine-tune and stabilize the model after initial training with the Adam optimizer |
| Target-dependent sentiment classification | BERT | To leverage pre-trained DL models that understand the context of words in a sentence by looking at the words before and after the target word | Effective in integrating target position output information into BERT models to enhance classification accuracy. Effective in applying information fusion techniques such as element-wise multiplication or concatenation to improve model performance | 155 | 31 |
Target-dependent BERT | To enhance the BERT model by incorporating target-specific information for improved performance in aspect-level sentiment classification |
Adam optimizer | To optimize the model parameters |
| Aspect-based sentiment analysis and targeted sentiment analysis | LSTM | To model the sequential dependencies in the data and capture long-term relationships within the text | Effective in using target- and aspect-dependent sentence attention to retrieve relevant information for both aspect categorization and sentiment classification. Effective in incorporating affective properties through knowledge integration | 330 | 55 |
Hierarchical attention mechanism | To focus on target- and sentence-level information for more precise sentiment analysis |
Sentic LSTM | To extend the traditional LSTM by integrating commonsense knowledge tightly into the recurrent encoder |
Bi-LSTM | To process the input sequence in both forward and backward directions |
Pre-trained skip-gram model | To initialize the word embeddings |
| Aspect-level sentiment classification | Feature capsules | To transform N-gram features into capsules that represent more complex patterns and features | Effective in transferring knowledge from document-level tasks. Effective in leveraging shared features across related tasks. Effective in utilizing multi-task learning variants to achieve robust performance despite label noise | 108 | 21.6 |
Semantic capsules | To aggregate feature capsules into aspect-related sentence-level representations |
Class capsules | To generate capsules that correspond to sentiment polarities |
GloVe | To use pre-trained word embeddings for improved word representation |
Adam optimizer | To optimize the neural network |
| Sentence-pair classification | BERT | To enhance the accuracy of TABSA by fine-tuning a pre-trained BERT model using its final hidden state representations and a classification layer | Effective in expanding the corpus by converting target and aspect information into auxiliary sentences. Effective in constructing auxiliary sentences for complex ABSA tasks | 248 | 49.6 |
Auxiliary sentence construction | To create supplementary sentences that aid in transforming the (T)ABSA task into a sentence-pair classification task (a minimal sketch follows the table)
| Domain aspect classification, aspect-term and opinion-word separation, and sentiment polarity classification | Word2vec | To compute word embeddings for word similarity calculations | Effective in multilingual domain aspect classification. Effective in separating aspect terms from opinion words without additional supervision | 108 | 18 |
Apache Spark MLlib | To implement and compute domain-based word embeddings efficiently |
LDA-based topic model | To identify and extract topics from text data, adapted with biased topic modeling parameters |
Maximum entropy classifier | To categorize sentences into predefined classes based on the learned distributions |
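The gated Tanh-ReLU unit listed in the table selectively passes convolutional sentiment features according to the given aspect. Below is a minimal PyTorch sketch of one such gated block; the layer names, tensor shapes, and the way the aspect vector is injected are illustrative assumptions, not the surveyed model's exact implementation.

```python
# Sketch of a gated Tanh-ReLU convolutional block (GCAE-style); illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedTanhReLU(nn.Module):
    def __init__(self, emb_dim, n_filters, kernel_size, aspect_dim):
        super().__init__()
        self.conv_s = nn.Conv1d(emb_dim, n_filters, kernel_size)   # sentiment feature channel
        self.conv_a = nn.Conv1d(emb_dim, n_filters, kernel_size)   # aspect gate channel
        self.aspect_proj = nn.Linear(aspect_dim, n_filters)        # projects the aspect vector

    def forward(self, x, aspect):
        # x: (batch, seq_len, emb_dim) word embeddings; aspect: (batch, aspect_dim)
        x = x.transpose(1, 2)                                      # -> (batch, emb_dim, seq_len)
        s = torch.tanh(self.conv_s(x))                             # candidate sentiment features
        g = F.relu(self.conv_a(x) + self.aspect_proj(aspect).unsqueeze(-1))  # aspect-controlled gate
        out = s * g                                                # gate controls information flow
        return F.max_pool1d(out, out.size(-1)).squeeze(-1)         # (batch, n_filters) pooled features
```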
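The attention-over-attention module computes mutual attention between the aspect and the text. The sketch below assumes Bi-LSTM hidden states for both the sentence and the aspect; the averaging and normalization choices shown are one plausible reading of such a module rather than the authors' code.

```python
# Sketch of an attention-over-attention (AOA) step for aspect-level sentiment; illustrative only.
import torch
import torch.nn.functional as F

def attention_over_attention(h_sent, h_asp):
    # h_sent: (batch, n, dim) sentence hidden states; h_asp: (batch, m, dim) aspect hidden states
    interact = h_sent @ h_asp.transpose(1, 2)      # (batch, n, m) word-pair interaction scores
    alpha = F.softmax(interact, dim=1)             # text-to-aspect attention (normalized over sentence words)
    beta = F.softmax(interact, dim=2)              # aspect-to-text attention (normalized over aspect words)
    beta_avg = beta.mean(dim=1, keepdim=True)      # (batch, 1, m) averaged aspect-word importance
    gamma = alpha @ beta_avg.transpose(1, 2)       # (batch, n, 1) final sentence-level attention
    return (h_sent * gamma).sum(dim=1)             # (batch, dim) aspect-aware sentence representation
```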
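Several of the surveyed models apply graph convolutions over a sentence's dependency tree and then focus the result on the aspect words. The sketch below shows one GCN layer over a dependency adjacency matrix plus a simple aspect mask; the dimensions, degree normalization, and masking step are illustrative assumptions.

```python
# Sketch of a dependency-graph GCN layer with aspect masking; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphConv(nn.Module):
    """One GCN layer: each word representation is updated from its syntactic neighbours."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h:   (batch, seq_len, dim) contextual word representations (e.g. from a Bi-LSTM)
        # adj: (batch, seq_len, seq_len) adjacency built from the dependency tree (with self-loops)
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1)   # node degrees for normalization
        h = self.linear(adj @ h) / deg                     # aggregate neighbours, then project
        return F.relu(h)

def aspect_mask(h, aspect_start, aspect_len):
    """Zero out non-aspect positions so downstream attention retrieves aspect-oriented features."""
    mask = torch.zeros_like(h)
    mask[:, aspect_start:aspect_start + aspect_len, :] = 1.0
    return h * mask
```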
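The sentence-pair study converts (T)ABSA into BERT sentence-pair classification by pairing each review with a constructed auxiliary sentence per target-aspect combination. The helper below sketches that construction; the question template is an assumed example of a QA-style variant, and `build_sentence_pairs` is a hypothetical name.

```python
# Sketch of auxiliary-sentence construction for BERT sentence-pair classification; illustrative only.
def build_sentence_pairs(review, targets, aspects):
    """Pair the original review with one auxiliary question per (target, aspect) combination."""
    pairs = []
    for target in targets:
        for aspect in aspects:
            auxiliary = f"what do you think of the {aspect} of {target} ?"  # assumed template
            pairs.append((review, auxiliary))   # fed to BERT as [CLS] review [SEP] auxiliary [SEP]
    return pairs

# Example: one review, one target, two aspects -> two sentence pairs.
pairs = build_sentence_pairs(
    "LOC1 is transport convenient but a bit noisy at night .",
    targets=["LOC1"],
    aspects=["transit location", "safety"],
)
```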