Soft Computing

01-07-2014 | Methodologies and Application

Automatic generation of multiple choice questions using dependency-based semantic relations

Authors: Naveed Afzal, Ruslan Mitkov

Published in: Soft Computing | Issue 7/2014

Abstract

In this paper, we present an unsupervised dependency-based approach to extracting semantic relations for use in the automatic generation of multiple choice questions (MCQs). MCQs, also known as multiple choice tests, are a popular solution for large-scale assessment because they make it much easier for test-takers to take tests and for examiners to interpret the results. Manual generation of MCQs is very expensive and time-consuming, yet MCQs often need to be produced on a large scale and within short iterative cycles. We approach the problem of automated MCQ generation with the help of unsupervised relation extraction, a technique used in a number of related natural language processing problems. The goal of unsupervised relation extraction is to identify the most important named entities and terminology in a document and then recognise semantic relations between them, without any prior knowledge of the semantic types of the relations or their specific linguistic realisation. We use these techniques to process instructional texts and identify those facts (terminology, entities, and the semantic relations between them) that are likely to be important for assessing test-takers' familiarity with the instructional material. We investigate an approach that learns semantic relations between named entities by employing a dependency tree model. Our findings show that an optimised configuration of our MCQ generation system is capable of attaining high precision rates, which are much more important than recall in the automatic generation of MCQs. We also carried out a user-centric evaluation of the system, in which subject domain experts evaluated automatically generated MCQ items in terms of readability, usefulness of semantic relations, relevance, acceptability of questions and distractors, and overall MCQ usability. The results of this evaluation allow us to draw conclusions about the utility of the approach in practical e-Learning applications.
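
To make the final step of such a pipeline concrete, the following is a minimal sketch, not the authors' implementation, of how one already-extracted relation might be turned into an MCQ item. It assumes a relation is available as two typed entities joined by a predicate, blanks out the object entity to form the stem, and draws distractors from other entities of the same semantic type. The `Relation` class, the `make_mcq` function, the `entity_pool` structure and the placeholder entity names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Relation:
    """Hypothetical container for one extracted fact (not from the paper)."""
    subject: str
    subject_type: str
    predicate: str
    obj: str
    obj_type: str


def make_mcq(relation, entity_pool, n_distractors=3, seed=0):
    """Blank out the object entity and pick same-type entities as distractors.

    entity_pool is an assumed mapping from a semantic type to the entity
    names of that type found in the instructional text.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    stem = f"{relation.subject} {relation.predicate} ________."
    # Candidate distractors: same semantic type as the answer, excluding
    # the answer itself and the entity already shown in the stem.
    candidates = sorted(
        name
        for name in entity_pool.get(relation.obj_type, [])
        if name not in (relation.obj, relation.subject)
    )
    if len(candidates) < n_distractors:
        raise ValueError("not enough same-type entities to build distractors")
    options = rng.sample(candidates, n_distractors) + [relation.obj]
    rng.shuffle(options)
    return {"stem": stem, "options": options, "answer": relation.obj}


if __name__ == "__main__":
    # Placeholder entities, purely illustrative.
    pool = {"PROTEIN": ["ProteinA", "ProteinB", "ProteinC", "ProteinD", "ProteinE"]}
    fact = Relation("ProteinA", "PROTEIN", "activates", "ProteinB", "PROTEIN")
    print(make_mcq(fact, pool))
```

Restricting distractors to the answer's semantic type is one simple way to keep them plausible. Because an item built from a wrongly extracted relation is unusable, a sketch like this also shows why precision of the extraction step matters more than recall.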

Title: Automatic generation of multiple choice questions using dependency-based semantic relations
Authors: Naveed Afzal, Ruslan Mitkov
Publication date: 01-07-2014
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 7/2014
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-013-1141-4
