
Machine Vision and Applications


02.02.2018 | Special Issue Paper

Multitask learning for neural generative question answering

Authors: Yanzhou Huang, Tao Zhong

Published in: Machine Vision and Applications | Issue 6/2018


Abstract

Neural generative models for question answering (QA) usually employ sequence-to-sequence (Seq2Seq) learning to generate answers from the user’s questions, whereas retrieval-based models select the best-matching answer from a repository of pre-defined QA pairs. One key challenge for neural generative QA models is their tendency to generate high-frequency, generic answers regardless of the question, partly because they optimize a log-likelihood objective. In this paper, we investigate multitask learning (MTL) for neural network-based methods in a QA scenario. We define our main task as generative QA via Seq2Seq learning and our auxiliary task as discriminative QA via binary QA classification. The main and auxiliary tasks are learned jointly with shared representations, which improves generalization and transfers classification labels as extra evidence to guide the word-sequence generation of the answers. Experimental results from both automatic evaluation and human annotation demonstrate the superiority of our proposed method over the baselines.
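
As a rough illustration of the joint training idea in the abstract, the sketch below combines a log-likelihood loss for the generative main task with a binary cross-entropy loss for the discriminative auxiliary task. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the weighted-sum form of the objective, and the `aux_weight` parameter are hypothetical, since the abstract does not say how the two losses are combined.

```python
import math


def generation_nll(token_probs):
    """Main task: negative log-likelihood of the reference answer.

    token_probs holds the probability the decoder assigned to each
    reference answer token (each value must lie in (0, 1]).
    """
    return -sum(math.log(p) for p in token_probs)


def classification_bce(match_prob, label):
    """Auxiliary task: binary cross-entropy for a QA pair.

    match_prob is the predicted probability (strictly between 0 and 1)
    that the answer matches the question; label is 1 for a matching
    pair and 0 otherwise.
    """
    return -(label * math.log(match_prob)
             + (1 - label) * math.log(1.0 - match_prob))


def joint_loss(token_probs, match_prob, label, aux_weight=0.5):
    """Weighted sum of both losses.

    The weighted-sum form and aux_weight are assumptions made for
    illustration; they are not taken from the paper.
    """
    return (generation_nll(token_probs)
            + aux_weight * classification_bce(match_prob, label))


# Toy values: a two-token answer and a confident "matching pair" prediction.
loss = joint_loss([0.5, 0.25], match_prob=0.9, label=1)
```

In a real model both losses would be computed from the same shared encoder representations, so gradients from the classification task also shape the features used by the decoder.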


https://github.com/fxsjy/jieba.

For simplicity, we translate the Chinese into English.

All are native speakers of Chinese and hold at least a bachelor’s degree.

https://code.google.com/archive/p/word2vec/.

https://www.tensorflow.org/.



Title: Multitask learning for neural generative question answering
Authors: Yanzhou Huang, Tao Zhong
Publication date: 02.02.2018
Publisher: Springer Berlin Heidelberg
Published in: Machine Vision and Applications / Issue 6/2018
Print ISSN: 0932-8092
Electronic ISSN: 1432-1769
DOI: https://doi.org/10.1007/s00138-018-0908-0
