Introduction
- Which machine learning classification model best classifies students’ dissertation project grades from a small dataset, with a reasonable and significant accuracy rate?
- What are the key indicators that could help in building a classification model for predicting students’ dissertation project grades?
- Could students’ performance in any course (excluding the Dissertation) be predicted with a reasonable and significant accuracy rate using only students’ pre-admission records, course names, and instructors’ names?
Research methodology
Participants and datasets
Student Attribute | Data Type | Attribute Details |
---|---|---|
ID | Ordinal | 1, 2, 3, 4, …, 52 (50 students in total) |
Age | Ratio | Between 30 and 52 |
B.Sc. degree | Nominal | Computer Science, Information System, … |
B.Sc. grade | Ratio | 3.25, 3.62, … |
Course names | Nominal | Introduction to AI, Knowledge Management, … |
Course grades | Ordinal | A, B, C, and F |
Instructor names | Nominal | Instructor 1, Instructor 2, … |
Descriptive Statistics | Dataset1 | Dataset2 |
---|---|---|
Number of instances | 273 | 38 |
Dependent Variables (DV) | Grades (All Courses Grades) | Grade (Dissertation Grade) |
DV Mean | 3.39 | 3.44 |
DV Median | 4 | 4 |
DV Mode | 4 | 4 |
Accuracy Baseline | P(4) = (136/234) × 100 = 58.1% | P(4) = (23/38) × 100 = 60.5% |
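The accuracy baseline in the table is the share of the most frequent grade (the mode, 4). The following is a minimal Python sketch of that calculation, not the authors' code. The function name `majority_baseline` and the grade list are illustrative assumptions; only the Dataset2 counts (23 of 38 instances graded 4) come from the table. The distribution of the remaining 15 grades is not reported, so they are filled with a placeholder value.

```python
from collections import Counter

def majority_baseline(labels):
    """Return (most frequent label, its share of all labels in percent)."""
    label, count = Counter(labels).most_common(1)[0]
    return label, 100.0 * count / len(labels)

# Dataset2 counts from the table: 23 of 38 dissertation grades are 4 (A).
# The split of the other 15 grades is not reported; 3 is a placeholder.
dataset2_grades = [4] * 23 + [3] * 15

mode, baseline = majority_baseline(dataset2_grades)
print(f"P({mode}) = {baseline:.1f}%")  # prints: P(4) = 60.5%
```

Any classifier that does not beat this figure is doing no better than always predicting the most common grade.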
Tools
Data Analysis & Procedures
Dataset pre-processing phase
Module Description | Courses |
---|---|
Informatics Research Methods | 0 |
Knowledge Representation | 1 |
Learning from Data | 2 |
Introduction to AI | 3 |
Knowledge Management | 4 |
Web Design Project | 5 |
Applied Databases | 6 |
Knowledge Engineering | 7 |
Data Mining and Exploration | 8 |
Introduction to Computational Linguistics | 9 |
Speech Processing | 10 |
Dissertation | 11 |
Grade | Grades (new variable name) |
---|---|
A | 4 |
B | 3 |
C | 2 |
Fail | 1 |
Instructor Name | Instructors (new variable name) |
---|---|
Instructor 2 | 2 |
Instructor 3 | 3 |
Instructor 4 | 4 |
BSc Degree | BSc Deg. (new variable name) |
---|---|
Mathematics | 0 |
BSc Computing1 | 1 |
Information Technology | 1 |
Operations & Information Management | 2 |
Management Information System | 2 |
Business Information Technology | 2 |
Electronic Engineering | 3 |
Electrical Engineering | 3 |
Engineering | 3 |
Engineering | 3 |
Electrical & Electronics Engineering | 3 |
Electrical Engineering | 3 |
Electrical Engineering - Computers & Control Section | 3 |
Electronics Engineering (Computer & Control) | 3 |
Computer Information Systems | 4 |
Information System | 4 |
Computer Information Systems - Circuits and Systems | 4 |
Computer Science/Information System | 4 |
Computer Engineering | 5 |
Computer Engineering | 5 |
Computer Engineering | 5 |
Computer Engineering | 5 |
Computer Engineering | 5 |
Computer Science | 6 |
Computer Science Mathematical Statistics | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Science | 6 |
Computer Systems | 7 |
Computer Systems | 7 |
Software Engineering | 8 |
Software Engineering | 8 |
Software Engineering | 8 |
- Preserve the newly encoded numeric (ordinal) attributes from being converted to continuous ones.
- Avoid producing values that do not belong to any of the Grades attribute’s classes.
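One way to read the two goals above is that the encoded grades should remain a closed set of discrete class labels rather than numbers on a continuous scale. The sketch below illustrates that reading in Python, under stated assumptions: the lookup table is copied from the grade-encoding table above, while the helper names `encode_grades` and `is_valid_class` and the sample records are hypothetical and do not come from the source.

```python
# Lookup table copied from the grade-encoding table above.
GRADE_CODES = {"A": 4, "B": 3, "C": 2, "Fail": 1}

def encode_grades(grades):
    """Map letter grades to their integer class codes."""
    return [GRADE_CODES[g] for g in grades]

def is_valid_class(value):
    """True only for values that are one of the encoded grade classes."""
    return value in GRADE_CODES.values()

# Hypothetical records, for illustration only.
sample = ["A", "B", "A", "Fail"]
codes = encode_grades(sample)
print(codes)

# A classifier outputs one of the known classes; a model that treats the
# codes as continuous could output a value such as 3.4, which is no grade.
print(is_valid_class(3), is_valid_class(3.4))
```

Treating the codes as class labels means a prediction is always one of 1, 2, 3, or 4, which is the property the second goal asks for.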
Attributes selection phase
Classification model evaluation
- Null hypothesis (H0): there is no difference between the accuracy achieved by the classification algorithms and the NIR (the accuracy of random prediction).
- Alternative hypothesis (H1): there is a difference between the accuracy achieved by the classification algorithms and the NIR (the accuracy of random prediction).
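The source does not name the test used to compare accuracy with the NIR. A common choice is an exact binomial test, and the Python sketch below assumes it. It is one-sided, so it checks only whether accuracy exceeds the NIR, which is narrower than the two-sided H1 stated above. The function name `binomial_p_value`, the 0.05 threshold, and the count of 29 correct predictions are illustrative assumptions. The count approximates 76.3% of 38 and treats a reported accuracy as if it came from a single test set of 38 instances.

```python
from math import comb

def binomial_p_value(correct, n, nir):
    """One-sided exact p-value: P(X >= correct) for X ~ Binomial(n, nir)."""
    return sum(
        comb(n, k) * nir**k * (1.0 - nir) ** (n - k)
        for k in range(correct, n + 1)
    )

n = 38          # Dataset2 instances (from the descriptive statistics table)
nir = 23 / 38   # accuracy baseline for Dataset2 (from the same table)
correct = 29    # hypothetical count, roughly 76.3% of 38

p_value = binomial_p_value(correct, n, nir)
print(f"accuracy = {correct / n:.3f}, NIR = {nir:.3f}, p = {p_value:.4f}")
# alpha = 0.05 is an assumed significance level, not taken from the source.
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

With a sample this small, an accuracy several points above the NIR can still fail to reach significance, which is why the test is reported alongside raw accuracy.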
Results
Datasets summary statistics
Key attributes
Evaluation of classification models
Classification Algorithm (Dataset2) | Accuracy (Numeric Attributes) | Kappa (Numeric Attributes) | Accuracy (Nominal Attributes) | Kappa (Nominal Attributes) | Notes |
---|---|---|---|---|---|
MLP - ANN | 60.5% | 0.0% | 60.5% | 0.0% | Three hidden layers (3, 2, 1); other values were also tried, but the accuracy stayed the same |
LDA | 71.1% | 44.7% | 57.9% | 19.3% | |
NB | 65.8% | 32.1% | 71.1% | 37.4% | Using kernel |
SVM | 68.4% | 33.4% | 76.3% | 49.3% | Final values used for the model: sigma = 0.1590384 and C = 1; for nominal, sigma = 0.0564085 and C = 1 |
KNN | 65.8% | 28.8% | 65.8% | 31.9% | k = 7; for nominal, k = 5 |
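Kappa measures agreement beyond what chance alone would produce, which is why it is reported next to accuracy. The MLP row above has accuracy equal to the 60.5% baseline and a kappa of 0.0%, which is the result a classifier would get by predicting the most frequent grade for every instance. The source does not say whether the MLP actually behaved this way. The Python sketch below illustrates the arithmetic with Cohen's kappa; the function name `accuracy_and_kappa` and the placeholder split of the 15 non-majority grades are assumptions.

```python
from collections import Counter

def accuracy_and_kappa(actual, predicted):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: product of the marginal class frequencies.
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# 23 of 38 Dataset2 grades are 4 (from the descriptive statistics table).
# The other 15 grades are not broken down in the source; 3 is a placeholder.
actual = [4] * 23 + [3] * 15
always_majority = [4] * 38   # predicts the mode for every instance

acc, kappa = accuracy_and_kappa(actual, always_majority)
print(f"accuracy = {acc:.1%}, kappa = {kappa:.1%}")
```

For a constant predictor the kappa is 0 however the other 15 grades are distributed, because chance agreement then equals observed agreement.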
Classification Algorithm (Dataset1) | Accuracy (Numeric Attributes) | Kappa (Numeric Attributes) | Accuracy (Nominal Attributes) | Kappa (Nominal Attributes) | Notes |
---|---|---|---|---|---|
MLP - ANN | 58.1% | 0.0% | 58.1% | 0.0% | |
LDA | 56.4% | −1.0% | 63.2% | 35.1% | |
NB | 57.7% | 0.1% | 58.1% | 0.0% | Using kernel |
SVM | 58.1% | 0.0% | 69.7% | 41.7% | C = 1 and C = 0.25. Sigma was held constant at 0.2410613 and accuracy was used to select the optimal model (largest value), for numeric; the final values used for the model were sigma = 0.2410613 and C = 0.25, for nominal |
KNN | 55.6% | 11.4% | 56.4% | 5.1% | k = 7, k = 9 |
Classification Algorithm (Dataset2) | Accuracy | Kappa | Attribute Type | Classification Algorithm (Dataset1) | Accuracy | Kappa | Attribute Type |
---|---|---|---|---|---|---|---|
MLP - ANN | 60.5% | 0.0% | Numeric | MLP - ANN | 58.1% | 0.0% | Nominal |
LDA | 71.1% | 44.7% | Numeric | LDA | 63.2% | 35.1% | Nominal |
NB | 71.1% | 37.4% | Nominal | NB | 58.1% | 0.0% | Nominal |
SVM | 76.3% | 49.3% | Nominal | SVM | 69.7% | 41.7% | Nominal |
KNN | 65.8% | 31.9% | Nominal | KNN | 56.4% | 5.1% | Nominal |