research-article

Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network Approach

Authors:
Wei Huang, University of Science and Technology of China, Hefei, China
Enhong Chen, University of Science and Technology of China, Hefei, China
Qi Liu, University of Science and Technology of China, Hefei, China
Yuying Chen, University of Science and Technology of China & Ant Financial Services Group, Hefei, China
Zai Huang, University of Science and Technology of China, Hefei, China
Yang Liu, University of Science and Technology of China, Hefei, China
Zhou Zhao, Zhejiang University, Hangzhou, China
Dan Zhang, iFLYTEK Research, Hefei, China
Shijin Wang, iFLYTEK Research, Hefei, China

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, November 2019, Pages 1051–1060. https://doi.org/10.1145/3357384.3357885

Published: 3 November 2019

ABSTRACT

Hierarchical multi-label text classification (HMTC) is a fundamental but challenging task in numerous applications (e.g., patent annotation), in which documents are assigned to multiple categories organized in a hierarchical structure. The categories assigned to a document at different levels tend to depend on one another. However, most prior studies of HMTC employ classifiers that either deal with all categories simultaneously or decompose the original problem into a set of flat multi-label classification subproblems, ignoring both the associations between texts and the hierarchical structure and the dependencies among different levels of that structure. To that end, we propose a novel framework called Hierarchical Attention-based Recurrent Neural Network (HARNN), which classifies documents into the most relevant categories level by level by integrating texts and the hierarchical category structure. Specifically, we first apply a documentation representing layer to obtain representations of the texts and the hierarchical structure. Then, we develop a hierarchical attention-based recurrent layer to model the dependencies among different levels of the hierarchical structure in a top-down fashion. Here, a hierarchical attention strategy is proposed to capture the associations between texts and the hierarchical structure. Finally, we design a hybrid method that predicts the categories of each level while also classifying all categories in the entire hierarchical structure precisely. Extensive experimental results on two real-world datasets demonstrate the effectiveness and explanatory power of HARNN.
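
The level-by-level idea in the abstract can be illustrated with a small sketch. This is not the HARNN implementation: the abstract gives no equations, so every concrete choice below is an assumption made for illustration. Those assumptions include dot-product attention between category embeddings and token vectors, the per-token `emphasis` handed down from one level to the next (a stand-in for the recurrent layer), the mean-pooled global scorer, and the hypothetical blending weight `alpha` that mixes per-level and global scores.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(tokens, category):
    """Dot-product attention of one category embedding over the token vectors."""
    weights = softmax([dot(t, category) for t in tokens])
    context = [sum(w * t[d] for w, t in zip(weights, tokens))
               for d in range(len(category))]
    return weights, context


def classify_top_down(tokens, levels, alpha=0.5):
    """Score categories level by level, then blend with a global score.

    tokens: list of token vectors.
    levels: list (top level first) of lists of category embeddings.
    alpha:  hypothetical blending weight (assumption, not from the abstract).
    Returns (local, global_, hybrid), each a list of per-level score lists.
    """
    n = len(tokens)
    emphasis = [1.0] * n  # per-token emphasis handed down from the level above
    local = []
    for categories in levels:
        scaled = [[e * v for v in t] for e, t in zip(emphasis, tokens)]
        attended = [attend(scaled, c) for c in categories]
        scores = [sigmoid(dot(ctx, c))
                  for (_, ctx), c in zip(attended, categories)]
        local.append(scores)
        # Hand this level's attention down, weighted by its predictions and
        # rescaled so that the average emphasis stays at 1.
        total = sum(scores)
        emphasis = [n * sum(s * w[i] for s, (w, _) in zip(scores, attended)) / total
                    for i in range(n)]
    # Global view: every category scored against the mean-pooled document.
    dim = len(tokens[0])
    pooled = [sum(t[d] for t in tokens) / n for d in range(dim)]
    global_ = [[sigmoid(dot(pooled, c)) for c in categories]
               for categories in levels]
    hybrid = [[(1 - alpha) * l + alpha * g for l, g in zip(ls, gs)]
              for ls, gs in zip(local, global_)]
    return local, global_, hybrid


# Toy input: three 2-d token vectors, a top level with two categories
# and a second level with three categories (all values are made up).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
levels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]],
]
local, global_, hybrid = classify_top_down(tokens, levels, alpha=0.5)
```

With `alpha=0` the result reduces to the per-level scores, and with `alpha=1` to the global scores, which makes the trade-off between the two views easy to inspect. The attention weights returned by `attend` are also what one would examine when looking for the kind of explanatory signal the abstract mentions.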

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published in
CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN: 9781450369763
DOI: 10.1145/3357384
General Chairs:
Wenwu Zhu, Tsinghua University, China
Dacheng Tao, University of Massachusetts, USA
Xueqi Cheng, Institute of Computing Technology, CAS, China
Program Chairs:
Peng Cui, Tsinghua University, China
Elke Rundensteiner, Worcester Polytechnic Institute, USA
David Carmel, Amazon Research, USA
Qi He, LinkedIn, USA
Jeffrey Xu Yu, Chinese University of Hong Kong, China
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
attention mechanism
hierarchical attention networks
hierarchical multi-label text classification
Qualifiers
- research-article
Acceptance Rates
CIKM '19 Paper Acceptance Rate: 202 of 1,031 submissions, 20%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
Article Metrics
- Total Citations: 84
- Total Downloads: 2,777
- Downloads (Last 12 months): 330
- Downloads (Last 6 weeks): 43