ABSTRACT
Automated essay scoring (AES) is a promising, yet challenging task. Current state-of-the-art AES models ignore the domain difference and cannot effectively leverage data from different domains. In this paper, we propose a domain-adaptive framework to improve the domain adaptability of AES models. We design two domain-independent self-supervised tasks and jointly train them with the AES task simultaneously. The self-supervised tasks enable the model to capture the shared knowledge across different domains and act as the regularization to induce a shared feature space. We further propose to enhance the model's robustness to domain variation via a novel domain adversarial training technique. The main idea of the proposed domain adversarial training is to train the model with small well-designed perturbations to make the model robust to domain variation. We obtain the perturbation via a variation of the Fast Gradient Sign Method (FGSM). Our approach achieves new state-of-the-art performance in both in-domain and cross-domain experiments on the ASAP dataset. We also show that the proposed domain adaptation framework is architecture-free and can be successfully applied to different models.
Supplemental Material
- Dimitrios Alikaniotis, Helen Yannakoudakis, and Marek Rei. 2016. Automatic Text Scoring Using Neural Networks. (2016). https://doi.org/10.18653/v1/p16--1068Google Scholar
- Martín Arjovsky, Soumith Chintala, and Léon Bottou. 2017. Wasserstein GAN. CoRR abs/1701.07875 (2017). arXiv:1701.07875 http://arxiv.org/abs/1701.07875Google Scholar
- Yigal Attali and Jill Burstein. 2006. Automated essay scoring with e-rater® V. 2. The Journal of Technology, Learning and Assessment 4, 3 (2006).Google Scholar
- Fabio Maria Carlucci, Antonio D'Innocente, Silvia Bucci, Barbara Caputo, and Tatiana Tommasi. 2019. Domain Generalization by Solving Jigsaw Puzzles. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019. Computer Vision Foundation / IEEE, 2229--2238. https://doi.org/10.1109/CVPR.2019.00233Google ScholarCross Ref
- Xilun Chen and Claire Cardie. 2018. Multinomial Adversarial Networks for Multi-Domain Text Classification. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 1 (Long Papers), Marilyn A. Walker, Heng Ji, and Amanda Stent (Eds.). Association for Computational Linguistics, 1226--1240. https://doi.org/10. 18653/v1/n18-1111Google ScholarCross Ref
- Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie, and Kilian Q. Weinberger. 2018. Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification. Trans. Assoc. Comput. Linguistics 6 (2018), 557--570. https://transacl.org/ojs/index.php/tacl/article/view/1413Google ScholarCross Ref
- Madalina Cozma, Andrei M. Butnaru, and Radu Tudor Ionescu. 2018. Automated essay scoring with string kernels and word embeddings. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers, Iryna Gurevych and Yusuke Miyao (Eds.). Association for Computational Linguistics, 503--509. https://doi.org/10.18653/v1/P18--2080Google ScholarCross Ref
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Jill Burstein, Christy Doran, and Thamar Solorio (Eds.). Association for Computational Linguistics, 4171--4186. https://doi.org/10.18653/v1/n19-1423Google Scholar
- Fei Dong and Yue Zhang. 2016. Automatic Features for Essay Scoring - An Empirical Study. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1- 4, 2016, Jian Su, Xavier Carreras, and Kevin Duh (Eds.). The Association for Computational Linguistics, 1072--1077. https://doi.org/10.18653/v1/d16-1115Google ScholarCross Ref
- Fei Dong, Yue Zhang, and Jie Yang. 2017. Attention-based Recurrent Convolutional Neural Network for Automatic Essay Scoring. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), Vancouver, Canada, August 3-4, 2017, Roger Levy and Lucia Specia (Eds.). Association for Computational Linguistics, 153--162. https://doi.org/10.18653/v1/K17-1017Google ScholarCross Ref
- Youmna Farag, Helen Yannakoudakis, and Ted Briscoe. 2018. Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018, Volume 1 (Long Papers), Marilyn A.Walker, Heng Ji, and Amanda Stent (Eds.). Association for Computational Linguistics, 263--271. https://doi.org/10.18653/v1/n18-1024Google ScholarCross Ref
- Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor S. Lempitsky. 2016. Domain-Adversarial Training of Neural Networks. J. Mach. Learn. Res. 17 (2016), 59:1--59:35. http://jmlr.org/papers/v17/15-239.htmlGoogle Scholar
- Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde- Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, Zoubin Ghahramani, Max Welling, Corinna Cortes, Neil D. Lawrence, and Kilian Q. Weinberger (Eds.). 2672--2680. http://papers. nips.cc/paper/5423-generative-adversarial-netsGoogle ScholarDigital Library
- Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and Harnessing Adversarial Examples. (2015). http://arxiv.org/abs/1412.6572Google Scholar
- Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Computation 9, 8 (1997), 1735--1780. https://doi.org/10.1162/neco.1997.9. 8.1735Google ScholarDigital Library
- Cancan Jin, Ben He, Kai Hui, and Le Sun. 2018. TDNN: A Two-stage Deep Neural Network for Prompt-independent Automated Essay Scoring. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, Iryna Gurevych and Yusuke Miyao (Eds.). Association for Computational Linguistics, 1088--1097. https://doi.org/10.18653/v1/P18-1100Google ScholarCross Ref
- Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. (2015). http://arxiv.org/abs/1412.6980Google Scholar
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, Lake Tahoe, Nevada, United States, Peter L. Bartlett, Fernando C. N. Pereira, Christopher J. C. Burges, Léon Bottou, and Kilian Q. Weinberger (Eds.). 1106--1114. http://papers.nips.cc/paper/4824-imagenet-classification-with-deepconvolutional- neural-networksGoogle ScholarDigital Library
- Jiawei Liu, Yang Xu, and Lingzhe Zhao. 2019. Automated Essay Scoring based on Two-Stage Learning. CoRR abs/1901.07744 (2019). arXiv:1901.07744 http://arxiv.org/abs/1901.07744Google Scholar
- Pengfei Liu, Xipeng Qiu, and Xuanjing Huang. 2017. Adversarial Multi-task Learning for Text Classification. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, Regina Barzilay and Min-Yen Kan (Eds.). Association for Computational Linguistics, 1--10. https://doi.org/10.18653/v1/P17- 1001Google ScholarCross Ref
- Ryo Masumura, Yusuke Shinohara, Ryuichiro Higashinaka, and Yushi Aono. 2018. Adversarial Training for Multi-task and Multi-lingual Joint Modeling of Utterance Intent Classification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, Ellen Riloff, David Chiang, Julia Hockenmaier, and Jun'ichi Tsujii (Eds.). Association for Computational Linguistics, 633--639. https://doi.org/10.18653/ v1/d18--1064Google ScholarCross Ref
- Takeru Miyato, Andrew M. Dai, and Ian J. Goodfellow. 2017. Adversarial Training Methods for Semi-Supervised Text Classification. (2017). https://openreview. net/forum?id=r1X3g2_xlGoogle Scholar
- Farah Nadeem, Huy Nguyen, Yang Liu, and Mari Ostendorf. 2019. Automated Essay Scoring with Discourse-Aware Neural Models. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, BEA@ACL 2019, Florence, Italy, August 2, 2019, Helen Yannakoudakis, Ekaterina Kochmar, Claudia Leacock, Nitin Madnani, Ildikó Pilán, and Torsten Zesch (Eds.). Association for Computational Linguistics, 484--493. https://doi.org/10.18653/ v1/w19-4450Google ScholarCross Ref
- Mehdi Noroozi and Paolo Favaro. 2016. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles. In Computer Vision - ECCV 2016 - 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VI (Lecture Notes in Computer Science, Vol. 9910), Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer, 69--84. https://doi.org/10.1007/978- 3-319-46466-4_5Google Scholar
- Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, Alessandro Moschitti, Bo Pang, and Walter Daelemans (Eds.). ACL, 1532--1543. https://doi.org/10.3115/v1/d14-1162Google Scholar
- Peter Phandi, Kian Ming Adam Chai, and Hwee Tou Ng. 2015. Flexible Domain Adaptation for Automated Essay Scoring Using Correlated Linear Regression. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, Lluís Màrquez, Chris Callison-Burch, Jian Su, Daniele Pighin, and Yuval Marton (Eds.). The Association for Computational Linguistics, 431--439. https://doi.org/10.18653/ v1/d15-1049Google ScholarCross Ref
- Kaveh Taghipour and Hwee Tou Ng. 2016. A Neural Approach to Automated Essay Scoring. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, Jian Su, Xavier Carreras, and Kevin Duh (Eds.). TheAssociation for Computational Linguistics, 1882--1891. https://doi.org/10.18653/v1/d16-1193Google ScholarCross Ref
- Yi Tay, Minh C. Phan, Luu Anh Tuan, and Siu Cheung Hui. 2018. SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, Sheila A. McIlraith and Kilian Q. Weinberger (Eds.). AAAI Press, 5948--5955. https://www.aaai.org/ ocs/index.php/AAAI/AAAI18/paper/view/16431Google ScholarCross Ref
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5998--6008. http://papers.nips.cc/paper/7181-attention-is-all-you-needGoogle ScholarDigital Library
- Yucheng Wang, Zhongyu Wei, Yaqian Zhou, and Xuanjing Huang. 2018. Automatic Essay Scoring Incorporating Rating Schema via Reinforcement Learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, Ellen Riloff, David Chiang, Julia Hockenmaier, and Jun'ichi Tsujii (Eds.). Association for Computational Linguistics, 791--797. https://doi.org/10.18653/v1/d18--1090Google ScholarCross Ref
- Helen Yannakoudakis, Ted Briscoe, and Ben Medlock. 2011. A New Dataset and Method for Automatically Grading ESOL Texts. In The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, Dekang Lin, Yuji Matsumoto, and Rada Mihalcea (Eds.). The Association for Computer Linguistics, 180--189. https://www.aclweb.org/anthology/P11-1019/Google Scholar
Index Terms
- Domain-Adaptive Neural Automated Essay Scoring
Recommendations
Domain Adaptation for Speaker Verification Based on Self-supervised Learning with Adversarial Training
MultiMedia ModelingAbstractSpeaker verification models trained on a single domain have difficulty keeping performance on new domain data. Adversarial training maps different domain data to the same subspace to handle this problem. However, adversarial training only uses ...
Domain Adaptation for Learning from Label Proportions Using Domain-Adversarial Neural Network
AbstractLearning from Label Proportions (LLP) is a machine learning problem where the training data are composed of bags of instances, and only the class label proportions for each bag are given. In some domains, we can directly obtain label distributions;...
Domain consistency regularization for unsupervised multi-source domain adaptive classification
Highlights- We propose a novel multi-source domain adaptation method for classification.
- We ...
AbstractDeep learning-based multi-source unsupervised domain adaptation (MUDA) has been actively studied in recent years. Compared with single-source unsupervised domain adaptation (SUDA), domain shift in MUDA exists not only between the ...
Comments