4 Experiments
Our experiments are designed to answer the following research questions:

- How does our approach compare with existing methods on the dialogue classification task? (Section 4.3.1)
- How does the heterogeneous graph information affect the dialogue classification performance? (Section 4.3.2)
- What are the contributions of each component in our approach? (Section 4.3.3)
4.1 Datasets and Experiment Settings
Statistics of the two datasets are summarized below:

| Dataset | Labels | Train | Dev | Test | Avg Len | Max Len |
| --- | --- | --- | --- | --- | --- | --- |
| CM | 37 | 15,791 | 1,996 | 1,997 | 680 | 5,059 |
| ECS | 14 | 68,759 | 8,697 | 8,698 | 5,023 | 98,749 |
The labels are organized into three business types, each covering a set of conversation intentions:

| Business type | Conversation intentions |
| --- | --- |
| Consulting | Subscription information inquiry, regulations, tariff, processing mode, account information, marketing activities information, number status, e-commerce product information, etc. |
| Processing | Download/setup, cancel, move/install/uninstall, change, open, print/mail, pay, cancel/reopen, replace/exchange, reset/modify/reissue, etc. |
| Complaining | Uninformed customization, business usage, business processing, dissatisfaction with business regulations, information security, network problems, marketing problems, cost problems, etc. |
An example dialogue and its category label (English translations in parentheses):

| Dial Id | Sentence Id | Sentence Info | Category |
| --- | --- | --- | --- |
| 07dc2 | 1 | text: 我是您的客服,很高兴为您服务。 (I am your customer service agent, happy to serve you.) id: 1; member_type: 3 | 异常评论: 评论泄露隐私 (Abnormal comments: comments leak privacy) |
| | 2 | text: 订单的评论泄露了我的手机号。 (The review of the order gave away my phone number.) id: 2; member_type: 1 | |
| | 3 | text: 请发送给我订单编号。 (Please send me the order number.) id: 3; member_type: 2 | |
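Concretely, a record of this form can be serialized as nested JSON. The field names `text`, `id`, `member_type`, and the category string come straight from the table above; the container keys (`dial_id`, `sentences`, `category`) are our assumptions about the storage layout, not the dataset's actual schema.

```python
# Illustrative serialization of the example dialogue record above.
dialogue = {
    "dial_id": "07dc2",  # dialogue identifier from the table
    "category": "异常评论: 评论泄露隐私",  # Abnormal comments: comments leak privacy
    "sentences": [
        {"id": 1, "member_type": 3, "text": "我是您的客服,很高兴为您服务。"},
        {"id": 2, "member_type": 1, "text": "订单的评论泄露了我的手机号。"},
        {"id": 3, "member_type": 2, "text": "请发送给我订单编号。"},
    ],
}
```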
4.2 Baselines
- TextRNN [29] is a recurrent neural network that handles text while taking the order of words into account. It processes a sentence recurrently, passing the hidden state of the previous time step as input to the current one, so that contextual information is carried forward through the sequence.
- TextRNN-Att [30] is a text classification model that combines recurrent neural networks (RNNs) with an attention mechanism. It uses a bidirectional RNN to encode the input text into hidden states, then applies an attention layer to aggregate the hidden states into a sentence representation. The attention layer assigns different weights to different parts of the text according to their relevance to the classification task.
- TextCNN [31], short for Text Convolutional Neural Network, is a deep learning model designed for text classification and sentiment analysis. It handles variable-length sentences and learns complex semantic features from the text (a minimal sketch is given after this list).
- CNN-LSTM [32] is a widely used model consisting of a regional CNN and an LSTM. By combining the two components, it leverages both the local spatial information captured by the CNN and the sequential dependencies captured by the LSTM, allowing it to extract meaningful features from the input and capture complex relationships within sequential data.
- BERT [25] is a Transformer-based language model pre-trained on large-scale corpora that has achieved remarkable success on many NLP tasks. Its attention mechanism weighs the importance of different words in a sentence based on their relevance to one another, so a word is represented in light of the entire context rather than only its immediate neighbors.
- Roberta [33] is a robustly optimized version of BERT, a pre-trained language model that uses bidirectional Transformers to learn contextual representations of text.
- ERNIE [34] stands for Enhanced Representation through kNowledge IntEgration, reflecting its ability to incorporate various types of knowledge into the pre-training of language models.
- Han [7] is a hierarchical attention network with two levels of attention applied at the word level and the sentence level, a hierarchy similar to our graph. This design lets it capture not only local interactions between words but also broader interactions and contextual dependencies between sentences.
- DAG [22] models a conversation as a Directed Acyclic Graph, where each node represents an utterance and each edge represents the dependency or influence between utterances. A DAG can capture the information flow and long-distance dependencies in a conversation.
- InductGCN [35] constructs a graph from the statistics of the training documents only, represents document vectors as a weighted sum of word vectors, and performs one-directional GCN propagation at test time.
- TextGCN [36] incorporates semantic information and relationships from text by constructing a text graph and applying graph convolution operations. It performs text classification without external embeddings, making it a valuable approach for specific text classification tasks.
- AttentionXML [37] introduces an attention mechanism and a probabilistic label tree (PLT). The attention mechanism captures subtle, context-dependent associations between text and labels, improving classification accuracy.
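As an illustration of this family of baselines, below is a minimal PyTorch sketch of TextCNN, referenced from the TextCNN entry above. The hyper-parameters (embedding size 128, kernel sizes 3/4/5, 100 filters) are common defaults rather than the settings used in our experiments, and `num_classes=37` simply mirrors the CM label count.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Minimal TextCNN: parallel 1-D convolutions over word embeddings,
    max-over-time pooling, and a linear classifier."""

    def __init__(self, vocab_size, num_classes=37, embed_dim=128,
                 kernel_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Each branch: convolve, ReLU, then take the max over time.
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))       # (batch, num_classes)

# A batch of padded token-id sequences yields one logit vector per text:
logits = TextCNN(vocab_size=30000)(torch.randint(0, 30000, (8, 256)))
```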
4.3 Results and Analysis
4.3.1 Comparing DialGNN with Baseline Methods
Overall results on both datasets (accuracy and F1, %):

| Models | CM Accuracy | CM F1 | ECS Accuracy | ECS F1 |
| --- | --- | --- | --- | --- |
| TextRNN | 58.3 | 34.4 | 54.7 | 48.4 |
| TextRNN-Att | 61.5 | 39.0 | 57.7 | 51.9 |
| TextCNN | 60.6 | 43.4 | 56.8 | 50.8 |
| CNN_LSTM | 47.4 | 28.0 | 55.2 | 50.0 |
| BERT | 69.2 | 53.8 | 59.6 | 54.8 |
| Roberta | 67.7 | 53.9 | 58.8 | 53.0 |
| ERNIE | 67.7 | 50.6 | 57.9 | 51.2 |
| Han | 62.7 | 46.6 | 58.5 | 53.3 |
| DAG | 11.5 | 1.6 | 14.3 | 3.3 |
| InductGCN | 43.7 | 35.7 | 49.1 | 42.7 |
| TextGCN | 43.3 | 28.6 | 49.9 | 42.9 |
| AttentionXML | 30.7 | 30.9 | 43.1 | 43.1 |
| DialGNN (BERT) | 70.2 | 59.3 | 60.3 | 54.9 |
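All scores above are percentages. Assuming F1 is macro-averaged over classes (which weights every label equally and would explain why F1 trails accuracy on these imbalanced label sets), both metrics can be computed with scikit-learn as in the sketch below; the `evaluate` helper is ours, not part of any released code.

```python
from sklearn.metrics import accuracy_score, f1_score

def evaluate(y_true, y_pred):
    """Return accuracy and macro-F1 as percentages. Macro averaging
    (our assumption about the evaluation setup) weights all classes
    equally, so rare labels pull the score down on imbalanced data."""
    acc = 100 * accuracy_score(y_true, y_pred)
    macro_f1 = 100 * f1_score(y_true, y_pred, average="macro")
    return acc, macro_f1

# Toy check: class 2 is never predicted, so macro-F1 (~55.6) < accuracy (75.0).
print(evaluate([0, 0, 1, 2], [0, 0, 1, 1]))
```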
Results of applying DialGNN, and its DialGNN-seg variant, on top of different base models (accuracy and F1, %):

| Models | CM Accuracy | CM F1 | ECS Accuracy | ECS F1 |
| --- | --- | --- | --- | --- |
| CNN_LSTM | 47.4 | 28.0 | 55.2 | 50.0 |
| + DialGNN | 54.1 | 41.8 | 57.2 | 52.3 |
| Han | 62.7 | 46.6 | 58.5 | 53.3 |
| + DialGNN | 63.1 | 48.1 | 59.1 | 54.3 |
| + DialGNN-seg | – | – | 63.2 | 59.3 |
| BERT | 69.2 | 53.8 | 59.6 | 54.8 |
| + DialGNN | 70.2 | 59.3 | 60.3 | 54.9 |
| + DialGNN-seg | – | – | 65.4 | 61.2 |
4.3.2 Comparisons on Graph Designs
Comparison of different graph designs (accuracy and F1, %):

| Models | Accuracy | F1 |
| --- | --- | --- |
| Base (BERT-tiny) | 65.1 | 41.0 |
| + context graph | 63.2 | 38.6 |
| + DialGNN | 68.5 | 57.4 |
| Base2 (BERT) | 69.2 | 53.8 |
| + async init | 60.9 | 41.2 |
| + sent nodes agg | 61.9 | 45.4 |
| + DialGNN | 70.2 | 59.3 |
4.3.3 Ablation Study
Ablation results for BERT + DialGNN (accuracy and F1, %):

| Models | Accuracy | F1 |
| --- | --- | --- |
| w/o TF-IDF | 68.7 | 53.6 |
| w/o sent-word update | 69.4 | 55.8 |
| w/o word-sent update | 69.2 | 52.0 |
| BERT + DialGNN | 70.2 | 59.3 |
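To make the "w/o TF-IDF" row concrete, here is a hedged sketch only, assuming TF-IDF is used to weight word-sentence edges in a TextGCN-style graph (the actual construction is defined in Section 3.1); removing the component would amount to replacing these weights with uniform ones.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Pre-segmented (whitespace-tokenized) sentences, taken from the
# example dialogue in Section 4.1.
sentences = [
    "订单 的 评论 泄露 了 我 的 手机号",
    "请 发送 给 我 订单 编号",
]

# token_pattern=r"\S+" keeps every whitespace-delimited token, including
# single-character Chinese words that the default pattern would drop.
vectorizer = TfidfVectorizer(token_pattern=r"\S+")
tfidf = vectorizer.fit_transform(sentences)  # shape: (num_sentences, num_words)

# Non-zero entries correspond to weighted word-sentence edges; the
# "w/o TF-IDF" ablation would set all of these weights to a constant.
words = vectorizer.get_feature_names_out()
for s, w in zip(*tfidf.nonzero()):
    print(f"sentence {s} -- {words[w]}: {tfidf[s, w]:.3f}")
```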
4.4 Case Study
Categories predicted by each model for three example dialogues (original Chinese labels in parentheses):

| Model | Example 1 | Example 2 | Example 3 |
| --- | --- | --- | --- |
| DialGNN | Malicious refund-only without return (恶意不退货仅退款) | Reset/modify/reissue (重置/修改/补发) | Dissatisfaction with business regulations (业务规定不满) |
| Roberta | Fraudulent shipping insurance (骗取运费险) | Business use issues (业务使用问题) | Move/install/dismantle (移机/装机/拆机) |
| ERNIE | Fraudulent shipping insurance (骗取运费险) | Business processing issues (业务办理问题) | Move/install/dismantle (移机/装机/拆机) |
| TextRNN | Blackmail using reviews (利用评价要挟) | Business regulations (业务规定) | Product/business functions (产品/业务功能) |
| TextRNN_Att | Blackmail using reviews (利用评价要挟) | Business processing issues (业务办理问题) | Product/business functions (产品/业务功能) |
| TextCNN | Blackmail using reviews (利用评价要挟) | Business processing issues (业务办理问题) | Change (变更) |
| InductGCN | Fraudulent shipping insurance (骗取运费险) | Print/mail (打印/邮寄) | Move/install/dismantle (移机/装机/拆机) |