Evolutionary algorithms for subgroup discovery in e-learning: A practical application using Moodle data

doi:10.1016/j.eswa.2007.11.026

Expert Systems with Applications

Volume 36, Issue 2, Part 1, March 2009, Pages 1632-1644

https://doi.org/10.1016/j.eswa.2007.11.026

Abstract

This work describes the application of subgroup discovery using evolutionary algorithms to the usage data of the Moodle course management system, in a case study at the University of Cordoba, Spain. The objective is to obtain rules which describe relationships between the student’s usage of the different activities and modules provided by this e-learning system and the final marks obtained in the courses. We use an evolutionary algorithm for the induction of fuzzy rules in canonical form and disjunctive normal form. The results obtained by different algorithms for subgroup discovery are compared, showing the suitability of evolutionary subgroup discovery for this problem.

Introduction

The design and implementation of web-based education systems have grown exponentially in recent years, spurred by the fact that neither students nor teachers are bound to a specific location and that this form of computer-based education is virtually independent of any specific hardware platform. These systems accumulate a great deal of information which is very valuable for analyzing students’ behavior and assisting authors in the detection of possible errors, shortcomings and improvements. However, because of the vast quantities of data these systems can generate daily, this information is very difficult to manage manually, and authors demand tools which assist them in this task, preferably on a continuous basis. The use of data mining is a promising area in the achievement of this objective (Romero & Ventura, 2006; Romero & Ventura, 2007).

In the knowledge discovery in databases (KDD) process, the data mining step consists of the automatic extraction of implicit and interesting patterns from large data collections. Data mining techniques or tasks include statistics, clustering, classification, outlier detection, association rule mining, sequential pattern mining, text mining, and subgroup discovery, among others (Klösgen & Zytkow, 2002).

In recent years, researchers have begun to investigate various data mining methods in order to help teachers improve e-learning systems. A review can be found in (Romero & Ventura, 2007). These methods allow the discovery of new knowledge based on students’ usage data.

Subgroup discovery is a specific method for discovering descriptive rules (Klösgen, 1996, Wrobel, 1997). The objective is to discover characteristics of subgroups with respect to a specific property of interest (represented in the rule consequent). It must be noted that subgroup discovery aims at discovering individual rules (or local patterns of interest), which must be represented in explicit symbolic form and which must be relatively simple in order to be recognized as actionable by potential users. Therefore, the subgroups discovered in data have an explanatory nature and the interpretability for the final user of the extracted knowledge is a crucial aspect in this field. This task has been applied to different domains: detection of patient groups with risk for atherosclerotic coronary heart disease (Gamberger & Lavrac, 2002b), mining UK traffic data (Kavsek, Lavrac, & Bullas, 2002), personal web pages (Nakada & Kunifuji, 2003), identification of interesting diagnostic patterns to supplement a medical documentation and consultation system (Atzmueller, Puppe, & Buscher, 2004) or marketing problems (del Jesus, González, Herrera, & Mesonero, 2007).
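To make the idea of a subgroup rule concrete, the following minimal sketch scores one candidate rule on a handful of made-up student records. The records, the attribute names (`quiz_attempts`, `mark`) and the helper `rule_quality` are hypothetical and exist only for illustration. Weighted relative accuracy (WRAcc) is used because it is a common quality measure in the subgroup discovery literature; this excerpt does not state which measures the paper itself applies.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical toy data: one row per student (not taken from the paper).
RECORDS: List[Record] = [
    {"quiz_attempts": "high", "mark": "pass"},
    {"quiz_attempts": "high", "mark": "pass"},
    {"quiz_attempts": "high", "mark": "pass"},
    {"quiz_attempts": "high", "mark": "fail"},
    {"quiz_attempts": "low", "mark": "fail"},
    {"quiz_attempts": "low", "mark": "fail"},
    {"quiz_attempts": "low", "mark": "pass"},
    {"quiz_attempts": "low", "mark": "fail"},
]


def rule_quality(
    records: List[Record],
    antecedent: Callable[[Record], bool],
    target_attr: str,
    target_value: str,
) -> Dict[str, float]:
    """Score the rule 'IF antecedent THEN target_attr = target_value'."""
    n = len(records)
    if n == 0:
        raise ValueError("records must not be empty")

    covered = [r for r in records if antecedent(r)]
    n_cond = len(covered)
    n_target = sum(1 for r in records if r[target_attr] == target_value)
    n_both = sum(1 for r in covered if r[target_attr] == target_value)

    coverage = n_cond / n                               # share of records the rule covers
    confidence = n_both / n_cond if n_cond else 0.0     # accuracy inside the subgroup
    # WRAcc: subgroup size weighted by how far the target share inside the
    # subgroup departs from the target share in the whole data set.
    wracc = coverage * (confidence - n_target / n)
    return {"coverage": coverage, "confidence": confidence, "wracc": wracc}


if __name__ == "__main__":
    quality = rule_quality(
        RECORDS, lambda r: r["quiz_attempts"] == "high", "mark", "pass"
    )
    print(quality)
```

A positive WRAcc indicates that the property of interest is more frequent inside the subgroup than in the data as a whole, which is what makes a rule a candidate for closer inspection by the teacher.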

This work proposes the application of subgroup discovery to the usage data of the course management system Moodle at the University of Cordoba, Spain. Moodle is a free open source course management system designed to help educators create effective online learning communities. Moodle has a flexible array of course activities such as forums, chats, quizzes, resources, choices, surveys, or assignments. Our objective is to obtain rules which describe relationships between the student’s usage of the different activities and modules provided by this e-learning system and the final score obtained in the courses. These rules can help the teacher to discover beneficial or detrimental relationships between the use of web-based educational resources and the student’s learning.

We focus on a subgroup discovery algorithm based on genetic algorithms (GAs) called SDIGA (Subgroup Discovery Iterative Genetic Algorithm). SDIGA is an evolutionary model for the extraction of fuzzy rules for the subgroup discovery task. The algorithm is described in detail in (del Jesus et al., 2007); its main characteristics are presented in this paper.
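As an illustration of what a fuzzy rule in disjunctive normal form looks like in practice, the sketch below computes how strongly one example matches such a rule. It is not SDIGA itself: the triangular partitions on a 0 to 10 scale, the variable names (`quiz_score`, `forum_activity`), and the choice of maximum for the disjunction and minimum for the conjunction are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Triangle = Tuple[float, float, float]  # (left foot, peak, right foot)

# Hypothetical linguistic labels on a 0-10 scale (assumed, not from the paper).
LABELS: Dict[str, Triangle] = {
    "low": (0.0, 0.0, 5.0),
    "medium": (0.0, 5.0, 10.0),
    "high": (5.0, 10.0, 10.0),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in a triangular fuzzy set with peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def match_degree(example: Dict[str, float], rule: Dict[str, List[str]]) -> float:
    """Degree to which an example satisfies the antecedent of a DNF fuzzy rule.

    Each variable in the rule lists one or more allowed labels (a disjunction);
    the variables themselves are combined as a conjunction.
    """
    degrees = []
    for variable, allowed_labels in rule.items():
        value = example[variable]
        # Disjunction over labels of one variable: take the maximum membership.
        degrees.append(max(triangular(value, *LABELS[lab]) for lab in allowed_labels))
    # Conjunction over variables: take the minimum.
    return min(degrees) if degrees else 1.0


if __name__ == "__main__":
    # IF quiz_score is (medium OR high) AND forum_activity is high THEN ...
    rule = {"quiz_score": ["medium", "high"], "forum_activity": ["high"]}
    student = {"quiz_score": 7.5, "forum_activity": 8.0}
    print(match_degree(student, rule))
```

A rule in canonical form is the special case in which every variable lists exactly one label.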

We compare the results obtained by this algorithm with those obtained by two classical subgroup discovery methods: Apriori-SD (Kavsek & Lavrac, 2006) and CN2-SD (Lavrac, Kavsek, Flach, & Todorovski, 2004). Furthermore, we also use CBA (Classification Based on Association) (Liu, Hsu, & Ma, 1998), an algorithm for class association rule discovery. We present an experimental study in which SDIGA obtains the best results for our educational mining problem.

This paper is arranged in the following way: Section 2 describes the problem of discovering rules in e-learning and surveys some specific work in the area. Section 3 introduces the subgroup discovery task, the type of rules and quality measures used and the fuzzy evolutionary approach. Section 4 describes the e-learning case study, the experimentation carried out and the analysis of results. Finally, the conclusions and further research are outlined.

Section snippets

Rule discovery in learning management systems

Many web-based educational systems with different capabilities and approaches have been developed to deliver online education. There are different types of web-based educational systems: particular web-based courses, learning management systems, and adaptive and intelligent web-based educational systems (Romero & Ventura, 2006). This paper is mostly oriented towards learning management systems. Different terms are used to denominate these systems: learning management systems (LMS), course

Subgroup discovery: classic approaches and evolutionary proposals

We have described some of the data mining techniques most used in e-learning, but subgroup discovery can also be applied to this task. In this section, the subgroup discovery task is introduced and classical and evolutionary approaches are described. First, we describe the topic of subgroup discovery and the classical approaches. Then, we analyze the use of evolutionary algorithms for rule induction. Finally, we introduce an evolutionary proposal for the subgroup discovery task.

E-learning case study: usage data of the Cordoba University Moodle e-learning system

In this section we examine the Moodle case study. We first describe our specific problem and then show the experimental results obtained in the execution of the different subgroup discovery algorithms. Finally we analyze several rules from the point of view of the teacher with the aim of improving the e-learning courses.

Conclusions

In this work we have described the application of subgroup discovery to e-learning, with the case study of the Moodle course management system. We have used real usage data collected from students at the University of Cordoba, Spain.

We have compared the results obtained by different algorithms for subgroup discovery, showing the suitability of evolutionary subgroup discovery for this problem. In particular, the SDIGA algorithm obtains a small number of rules which are highly understandable for

Acknowledgement

The authors gratefully acknowledge the financial support provided by the Spanish department of Research under TIN2005-08386-C05-01, TIN2005-08386-C05-02 and TIN2005-08386-C05-03 Projects.

References (56)

B. Carse et al. Evolving fuzzy rule based controllers using genetic algorithms. Fuzzy Sets and Systems (1996)
D. Gamberger et al. Active subgroup mining: A case study in coronary heart disease risk group detection. Artificial Intelligence in Medicine (2003)
C. Romero et al. Educational data mining: A survey from 1995 to 2005. Expert Systems with Applications (2007)
C.H. Wang et al. Integrating fuzzy knowledge by genetic algorithms. IEEE Transactions on Evolutionary Computation (1998)
Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In...
Alcalá, J., Sánchez, L., García, S., del Jesus, M. J., Ventura, S., Garrell, J. M., et al. (2007). KEEL: A data mining...
Atzmueller, M., Puppe, F., & Buscher, H. P. (2004). Towards knowledge-intensive subgroup discovery. In Proceedings...
M. Atzmueller et al. SD-Map – A fast algorithm for exhaustive subgroup discovery
Castro, F., Vellido, A., Nebot, A., & Minguillon, J. (2005). Detecting atypical student behaviour on an e-learning...
P. Clark et al. The CN2 induction algorithm. Machine Learning (1989)
O. Cordón et al. MOGUL: A methodology to obtain genetic fuzzy rule-based systems under the iterative rule learning approach. International Journal of Intelligent Systems (1999)
O. Cordón et al. Genetic fuzzy systems: evolutionary tuning and learning of fuzzy knowledge bases (2001)
M.J. del Jesus et al. Evolutionary fuzzy rule induction process for subgroup discovery: a case study in marketing. IEEE Transactions on Fuzzy Systems (2007)
M. Flate. Online education and learning management systems (2003)
Freyberger, J., Heffernan, N., & Ruiz, C. (2004). Using association rules to guide a search for best fitting transfer...
Gamberger, D., & Lavrac, N. (2002). Descriptive induction through subgroup discovery: A case study in a medical domain....
D. Gamberger et al. Expert-guided subgroup discovery: Methodology and application. Journal of Artificial Intelligence Research (2002)
A. Ghosh et al. Evolutionary computation in data mining (2005)
A. Giordana et al. Search-intensive concept induction. Evolutionary Computation (1995)
A. González et al. SLAVE: A genetic learning system based on an iterative approach. IEEE Transactions on Fuzzy Systems (1999)
J. Han et al. Spatial clustering methods in data mining: A survey (2001)
J. Han et al. Mining frequent patterns by pattern-growth: Methodology and implications. ACM SIGKDD Explorations Newsletter (2000)
H. Ishibuchi et al. Classification and modeling with linguistic information granules (2004)
Jovanoski, V., & Lavrac, N. (2001). Classification rule learning with APRIORI-C. In progress in artificial intelligence...
Kavsek, B., Lavrac, N., & Bullas, J. C. (2002). Rule induction for subgroup discovery: a case study in mining UK...
B. Kavsek et al. APRIORI-SD: Adapting association rule learning to subgroup discovery. Applied Artificial Intelligence (2006)
KEEL (2007). Available from...
W. Klösgen. Explora: A multipattern and multistrategy discovery assistant
