Published in: Soft Computing 8/2020

24-09-2019 | Focus

An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system

Authors: Xiao-mei Yu, Wen-zhi Feng, Hong Wang, Qian Chu, Qi Chen


Abstract

Natural language processing (NLP) is one of the key techniques in intelligent question-answering (Q&A) systems. Although recurrent neural networks and long short-term memory (LSTM) networks show clear advantages on well-known English Q&A datasets, they handle Chinese less well: indeterminacy, polysemy and the lack of morphological inflection in Chinese make NLP on large and diverse Chinese Q&A datasets complex. In this paper, we first analyze the limitations of applying LSTM and bidirectional LSTM (Bi-LSTM) models to noisy Chinese Q&A datasets. We then integrate attention mechanisms and multi-granularity word segmentation into Bi-LSTM and propose an attention mechanism and multi-granularity-based Bi-LSTM model (AM–Bi-LSTM), which combines an improved attention mechanism with a novel treatment of multi-granularity word segmentation to handle the complex NLP in Chinese Q&A datasets. Furthermore, we formulate the similarity of questions and answers so that it can be computed quantitatively, which helps to achieve better performance in Chinese Q&A systems. Finally, we verify the proposed model on a noisy Chinese Q&A dataset. The experimental results demonstrate that AM–Bi-LSTM achieves significant improvement on evaluation metrics including accuracy and mean average precision, and that it outperforms baseline methods and other LSTM-based models.
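
The abstract does not state how question-answer similarity is formulated or exactly how mean average precision is computed. As a rough illustration only, the sketch below assumes that questions and answers have already been encoded as fixed-length vectors (for example, by a Bi-LSTM encoder), uses cosine similarity as a stand-in for the paper's similarity measure, and computes mean average precision over ranked candidate answers. All function names here are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def cosine_similarity(q: Sequence[float], a: Sequence[float]) -> float:
    """Cosine similarity between a question vector and an answer vector.

    Assumption: cosine similarity stands in for the paper's (unstated) measure.
    """
    dot = sum(x * y for x, y in zip(q, a))
    norm_q = math.sqrt(sum(x * x for x in q))
    norm_a = math.sqrt(sum(y * y for y in a))
    if norm_q == 0.0 or norm_a == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_q * norm_a)


def rank_answers(question_vec: Sequence[float],
                 answer_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate answer indices sorted from most to least similar."""
    scores = [cosine_similarity(question_vec, a) for a in answer_vecs]
    return sorted(range(len(answer_vecs)), key=lambda i: scores[i], reverse=True)


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for one question.

    ranked_labels holds 1 (correct) or 0 (incorrect) for each candidate,
    listed in ranked order.
    """
    hits = 0
    total = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this correct answer's rank
    return total / hits if hits else 0.0


def mean_average_precision(all_ranked_labels: Sequence[Sequence[int]]) -> float:
    """Mean of the per-question average precision values."""
    if not all_ranked_labels:
        return 0.0
    return sum(average_precision(r) for r in all_ranked_labels) / len(all_ranked_labels)
```

In a setup like this, each question's candidates would be ordered with `rank_answers`, the correctness labels arranged in that order, and the resulting lists passed to `mean_average_precision`.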


Metadata

Title: An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system
Authors: Xiao-mei Yu, Wen-zhi Feng, Hong Wang, Qian Chu, Qi Chen
Publication date: 24-09-2019
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 8/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-019-04367-8
