
2021 | Original Paper | Book Chapter

ThinkTwice: A Two-Stage Method for Long-Text Machine Reading Comprehension

Authors: Mengxing Dong, Bowei Zou, Jin Qian, Rongtao Huang, Yu Hong

Published in: Natural Language Processing and Chinese Computing

Publisher: Springer International Publishing


Abstract

Long-text machine reading comprehension (LT-MRC) requires a machine to answer questions based on a lengthy text. Although transformer-based models achieve promising results, most of them cannot handle long sequences because of their high time cost. A common solution based on a sliding window splits the passage into equally spaced fragments and then predicts the answer from each fragment separately, without considering the other contextual fragments. However, this approach suffers from a lack of long-distance dependencies, which severely damages performance. To address this issue, we propose ThinkTwice, a two-stage method for LT-MRC. ThinkTwice casts the process of LT-MRC into two main steps: 1) it first retrieves the fragments in which the final answer is most likely to lie; 2) it then extracts the answer span from these fragments instead of from the lengthy document. We conduct experiments on NewsQA. The experimental results demonstrate that ThinkTwice can capture the most informative fragments from a long text. Meanwhile, ThinkTwice achieves considerable improvements over all existing baselines. All code has been released on GitHub (https://github.com/Walle1493/ThinkTwice).
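The abstract above describes the two stages only at a high level. The Python sketch below is not the authors' released implementation; it merely illustrates the retrieve-then-extract pattern under assumed settings: the passage is split into overlapping word windows, each window is scored against the question with a simple lexical-overlap placeholder (standing in for the learned retriever of stage 1), and an off-the-shelf extractive reader is run only over the top-scoring fragments (stage 2). The window size, stride, scoring function, and reader model name are all illustrative assumptions.

```python
# Minimal sketch of the retrieve-then-extract idea (NOT the authors' ThinkTwice code).
# Stage 1: keep only the fragments most relevant to the question.
# Stage 2: extract the answer span from those fragments instead of the full document.
from transformers import pipeline


def split_into_fragments(passage: str, window: int = 300, stride: int = 150):
    """Split a long passage into overlapping word windows (sliding-window style)."""
    words = passage.split()
    fragments, start = [], 0
    while True:
        fragments.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break
        start += stride
    return fragments


def score_fragment(question: str, fragment: str) -> float:
    """Stage-1 placeholder: lexical overlap between question and fragment words."""
    q, f = set(question.lower().split()), set(fragment.lower().split())
    return len(q & f) / (len(q) or 1)


def think_twice_sketch(question: str, passage: str, top_k: int = 2) -> str:
    # Stage 1: retrieve the fragments the answer is most likely to lie in.
    fragments = split_into_fragments(passage)
    top = sorted(fragments, key=lambda f: score_fragment(question, f), reverse=True)[:top_k]

    # Stage 2: run a standard extractive QA reader over the retained fragments only,
    # then keep the highest-scoring span across fragments.
    reader = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
    candidates = [reader(question=question, context=fragment) for fragment in top]
    return max(candidates, key=lambda c: c["score"])["answer"]
```

Restricting the reader to a handful of fragments keeps each input within the encoder's length limit while still letting the most informative parts of the document compete for the final answer, which is the intuition the abstract describes.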

Footnotes
1
For example, the maximum position embedding length of BERT is 512.
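The snippet below (an illustration added here, not part of the paper) reads this limit from a standard BERT configuration with the Hugging Face transformers library and shows that longer inputs are truncated to it.

```python
# Illustrative check of the 512-token limit (assumes the transformers library
# and the public bert-base-uncased checkpoint; not part of the paper).
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("bert-base-uncased")
print(config.max_position_embeddings)  # 512

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
long_text = "word " * 1000  # far longer than the limit
ids = tokenizer(long_text, truncation=True,
                max_length=config.max_position_embeddings)["input_ids"]
print(len(ids))  # capped at 512, [CLS] and [SEP] included
```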
 
2
[CLS] and [SEP] are special tokens: after encoding, the former can in principle represent the overall information of the whole input sequence, while the latter is used to separate input segments.
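For illustration (again an example added here, not the paper's code), encoding a question together with a passage fragment shows where these special tokens are inserted:

```python
# Illustrative encoding of a question/fragment pair (assumes bert-base-uncased).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("Who scored the goal?", "The goal was scored by John.")
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
# ['[CLS]', 'who', 'scored', 'the', 'goal', '?', '[SEP]',
#  'the', 'goal', 'was', 'scored', 'by', 'john', '.', '[SEP]']
```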
 
Metadata
Title
ThinkTwice: A Two-Stage Method for Long-Text Machine Reading Comprehension
Authors
Mengxing Dong
Bowei Zou
Jin Qian
Rongtao Huang
Yu Hong
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-88480-2_34