Evolutionary algorithms for subgroup discovery in e-learning: A practical application using Moodle data
Introduction
The design and implementation of web-based education systems have grown exponentially in recent years, spurred by the fact that neither students nor teachers are bound to a specific location and that this form of computer-based education is virtually independent of any specific hardware platform. These systems accumulate a great deal of information which is very valuable for analyzing students’ behavior and for assisting authors in the detection of possible errors, shortcomings and improvements. However, due to the vast quantities of data these systems can generate daily, this information is very difficult to manage manually, and authors demand tools which assist them in this task, preferably on a continuous basis. The use of data mining is a promising approach to achieving this objective (Romero & Ventura, 2006, 2007).
In the knowledge discovery in databases (KDD) process, the data mining step consists of the automatic extraction of implicit and interesting patterns from large data collections. Data mining techniques and tasks include statistics, clustering, classification, outlier detection, association rule mining, sequential pattern mining, text mining, and subgroup discovery, among others (Klösgen & Zytkow, 2002).
In recent years, researchers have begun to investigate various data mining methods in order to help teachers improve e-learning systems. A review can be found in Romero and Ventura (2007). These methods allow the discovery of new knowledge based on students’ usage data.
Subgroup discovery is a specific method for discovering descriptive rules (Klösgen, 1996; Wrobel, 1997). The objective is to discover characteristics of subgroups with respect to a specific property of interest (represented in the rule consequent). It must be noted that subgroup discovery aims at discovering individual rules (or local patterns of interest), which must be represented in explicit symbolic form and which must be relatively simple in order to be recognized as actionable by potential users. Therefore, the subgroups discovered in data have an explanatory nature, and the interpretability of the extracted knowledge for the final user is a crucial aspect in this field. This task has been applied to different domains: detection of patient groups at risk of atherosclerotic coronary heart disease (Gamberger & Lavrac, 2002b), mining UK traffic data (Kavsek, Lavrac, & Bullas, 2002), personal web pages (Nakada & Kunifuji, 2003), identification of interesting diagnostic patterns to supplement a medical documentation and consultation system (Atzmueller, Puppe, & Buscher, 2004), and marketing problems (del Jesus, González, Herrera, & Mesonero, 2007).
This work proposes the application of subgroup discovery to the usage data of the course management system Moodle at the University of Cordoba, Spain. Moodle is a free open source course management system designed to help educators create effective online learning communities. Moodle has a flexible array of course activities such as forums, chats, quizzes, resources, choices, surveys, or assignments. Our objective is to obtain rules which describe relationships between the student’s usage of the different activities and modules provided by this e-learning system and the final score obtained in the courses. These rules can help the teacher to discover beneficial or detrimental relationships between the use of web-based educational resources and the student’s learning.
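As an illustration of what such a rule looks like and how its quality might be judged, the following minimal sketch evaluates one candidate rule on a toy data set. The attribute names (`quizzes_passed`, `forum_posts`, `mark`), their discretised values and the records themselves are hypothetical and are not taken from the Moodle data. Coverage, support, confidence and weighted relative accuracy (WRAcc) are measures commonly used in subgroup discovery; they are shown here as an assumption and are not necessarily the exact measures used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    # Hypothetical discretised usage attributes: "low", "medium" or "high".
    quizzes_passed: str
    forum_posts: str
    # Hypothetical target variable: "fail" or "pass".
    mark: str


def evaluate_rule(records, antecedent, target):
    """Score the rule `antecedent -> mark == target` on `records`.

    `antecedent` maps attribute names to the value each must take.
    """
    n = len(records)
    # Records that satisfy every condition in the antecedent.
    covered = [
        r for r in records
        if all(getattr(r, attr) == value for attr, value in antecedent.items())
    ]
    # Covered records that also belong to the target class.
    hits = [r for r in covered if r.mark == target]
    n_target = sum(1 for r in records if r.mark == target)

    coverage = len(covered) / n
    support = len(hits) / n
    confidence = len(hits) / len(covered) if covered else 0.0
    # WRAcc = p(Cond) * (p(Class | Cond) - p(Class))
    wracc = coverage * (confidence - n_target / n)
    return {
        "coverage": coverage,
        "support": support,
        "confidence": confidence,
        "wracc": wracc,
    }


# Toy records, invented for illustration only.
RECORDS = [
    Student("high", "high", "pass"),
    Student("high", "low", "pass"),
    Student("high", "medium", "pass"),
    Student("high", "low", "fail"),
    Student("low", "low", "fail"),
    Student("low", "high", "fail"),
    Student("medium", "low", "pass"),
    Student("low", "medium", "fail"),
]

if __name__ == "__main__":
    # Candidate rule: IF quizzes_passed = high THEN mark = pass
    print(evaluate_rule(RECORDS, {"quizzes_passed": "high"}, "pass"))
```

A positive WRAcc indicates that the target class is more frequent inside the subgroup than in the whole population, which is the kind of unusual distribution a teacher would want to inspect.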
We focus our attention on a subgroup discovery algorithm based on genetic algorithms (GAs) called SDIGA (Subgroup Discovery Iterative Genetic Algorithm). SDIGA is an evolutionary model for the extraction of fuzzy rules for the subgroup discovery task. This algorithm is described in detail in del Jesus et al. (2007). Its main characteristics are presented in this paper.
We compare the results obtained by this algorithm with those obtained by two classical subgroup discovery methods: Apriori-SD (Kavsek & Lavrac, 2006) and CN2-SD (Lavrac, Kavsek, Flach, & Todorovski, 2004). Furthermore, we also use an algorithm for class association rule discovery, CBA (Classification Based on Association) (Liu, Hsu, & Ma, 1998). We present an experimental study in which SDIGA obtains the best results for our educational mining problem.
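To make the notion of a fuzzy rule concrete, the sketch below computes the degree to which one example matches a fuzzy antecedent. It is not the SDIGA implementation: the triangular membership functions, the three linguistic labels on a variable normalised to [0, 1], the use of the minimum as conjunction operator, and the variable names (`quiz_score`, `forum_activity`) are all assumptions made for illustration.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed uniform partition of [0, 1] into three linguistic labels.
LABELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def membership(x, label):
    """Degree to which the normalised value x belongs to a linguistic label."""
    return triangular(x, *LABELS[label])


def match_degree(example, antecedent):
    """Degree to which `example` satisfies a conjunctive fuzzy antecedent.

    `antecedent` maps variable names to linguistic labels; the conjunction
    is modelled with the minimum t-norm (an assumption of this sketch).
    """
    return min(membership(example[var], label) for var, label in antecedent.items())


if __name__ == "__main__":
    # Hypothetical student with usage values normalised to [0, 1].
    student = {"quiz_score": 0.75, "forum_activity": 1.0}
    # Antecedent of: IF quiz_score is high AND forum_activity is high THEN ...
    rule = {"quiz_score": "high", "forum_activity": "high"}
    print(match_degree(student, rule))
```

Unlike a crisp rule, which an example either satisfies or not, a fuzzy rule covers each example to a degree between 0 and 1, so a student with a moderately high quiz score still contributes partially to the subgroup.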
This paper is arranged in the following way: Section 2 describes the problem of discovering rules in e-learning and surveys some specific work in the area. Section 3 introduces the subgroup discovery task, the type of rules and quality measures used and the fuzzy evolutionary approach. Section 4 describes the e-learning case study, the experimentation carried out and the analysis of results. Finally, the conclusions and further research are outlined.
Section snippets
Rule discovery in learning management systems
Many web-based educational systems with different capabilities and approaches have been developed to deliver online education. There are different types of web-based educational systems: particular web-based courses, learning management systems, and adaptive and intelligent web-based educational systems (Romero & Ventura, 2006). This paper is mostly oriented towards learning management systems. Different terms are used to refer to these systems: learning management systems (LMS), course
Subgroup discovery: classic approaches and evolutionary proposals
We have described some of the data mining techniques most used in e-learning, but subgroup discovery can also be applied to this task. In this section, the subgroup discovery task is introduced and classical and evolutionary approaches are described. First, we describe the topic of subgroup discovery and the classical approaches. Then, we analyze the use of evolutionary algorithms for rule induction. Finally, we introduce an evolutionary proposal for the subgroup discovery task.
E-learning case study: usage data of the Cordoba University Moodle e-learning system
In this section we examine the Moodle case study. We first describe our specific problem and then show the experimental results obtained in the execution of the different subgroup discovery algorithms. Finally we analyze several rules from the point of view of the teacher with the aim of improving the e-learning courses.
Conclusions
In this work we have described the application of subgroup discovery to e-learning, with the case study of the Moodle course management system. We have used real usage data collected from students at the University of Cordoba, Spain.
We have compared the results obtained by different algorithms for subgroup discovery, showing the suitability of evolutionary subgroup discovery for this problem. In particular, the SDIGA algorithm obtains a small number of rules which are highly understandable for
Acknowledgement
The authors gratefully acknowledge the financial support provided by the Spanish Department of Research under projects TIN2005-08386-C05-01, TIN2005-08386-C05-02 and TIN2005-08386-C05-03.
References (56)
- et al. Evolving fuzzy rule based controllers using genetic algorithms. Fuzzy Sets and Systems (1996)
- et al. Active subgroup mining: A case study in coronary heart disease risk group detection. Artificial Intelligence in Medicine (2003)
- et al. Educational data mining: A survey from 1995 to 2005. Expert Systems with Applications (2007)
- et al. Integrating fuzzy knowledge by genetic algorithms. IEEE Transactions on Evolutionary Computation (1998)
- Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In...
- Alcalá, J., Sánchez, L., García, S., del Jesus, M. J., Ventura, S., Garrell, J. M., et al. (2007). KEEL: A data mining...
- Atzmueller, M., Puppe, F., & Buscher, H. P. (2004). Towards knowledge-intensive subgroup discovery. In Proceedings...
- et al. SD-Map – A fast algorithm for exhaustive subgroup discovery
- Castro, F., Vellido, A., Nebot, A., & Minguillon, J. (2005). Detecting atypical student behaviour on an e-learning...
- et al. The CN2 induction algorithm. Machine Learning (1989)
- MOGUL: A methodology to obtain genetic fuzzy rule-based systems under the iterative rule learning approach. International Journal of Intelligent Systems
- Genetic fuzzy systems: evolutionary tuning and learning of fuzzy knowledge bases
- Evolutionary fuzzy rule induction process for subgroup discovery: a case study in marketing. IEEE Transactions on Fuzzy Systems
- Online education and learning management systems
- Expert-guided subgroup discovery: Methodology and application. Journal of Artificial Intelligence Research
- Evolutionary computation in data mining
- Search-intensive concept induction. Evolutionary Computation
- SLAVE: A genetic learning system based on an iterative approach. IEEE Transactions on Fuzzy Systems
- Spatial clustering methods in data mining: A survey
- Mining frequent patterns by pattern-growth: Methodology and implications. ACM SIGKDD Explorations Newsletter
- Classification and modeling with linguistic information granules
- APRIORI-SD: Adapting association rule learning to subgroup discovery. Applied Artificial Intelligence
- Explora: A multipattern and multistrategy discovery assistant