2021 | Book

Artificial Intelligence in Education

22nd International Conference, AIED 2021, Utrecht, The Netherlands, June 14–18, 2021, Proceedings, Part II

Editors: Ido Roll, Danielle McNamara, Sergey Sosnovsky, Rose Luckin, Vania Dimitrova

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science


About this book

This two-volume set LNAI 12748 and 12749 constitutes the refereed proceedings of the 22nd International Conference on Artificial Intelligence in Education, AIED 2021, held in Utrecht, The Netherlands, in June 2021.*

The 40 full papers presented together with 76 short papers, 2 panel papers, 4 industry papers, 4 doctoral consortium, and 6 workshop papers were carefully reviewed and selected from 209 submissions. The conference provides opportunities for the cross-fertilization of approaches, techniques and ideas from the many fields that comprise AIED, including computer science, cognitive and learning sciences, education, game design, psychology, sociology, linguistics as well as many domain-specific areas.

*The conference was held virtually due to the COVID-19 pandemic.

Table of Contents



Scrutability, Control and Learner Models: Foundations for Learner-Centred Design in AIED

There is a huge, and growing, amount of personal data that has the potential to help people learn. There is also a growing and broad concern about the ways that personal data is harvested and used. This makes it timely to draw on the decades of AIED research towards creating systems and interfaces that enable learners to truly harness and control their learning data. This invited keynote will present a whirlwind tour of my learner modelling research and a selection of other work that has influenced my own towards the goal of putting people in control of their own learning data and its use. I will explain the rationale for my focus on scrutability, as a foundation for users to harness and control their learning data, especially for learning contexts. I will share key lessons from my work for creating AIED systems that are deeply learner centred. Building on this, I will present a vision for AIED, one that takes a learner-centred perspective to designing AIED systems and recognises the inherent limitations of learning data. This is a broad view of AIED that returns to its founding goals of creating advanced learning technologies.

Judy Kay

Short Papers

Open Learner Models for Multi-activity Educational Systems

In recent years, there has been an increasing trend in the use of student-centred approaches within educational systems that engage students in various higher-order learning activities such as creating resources, creating solutions, rating the quality of resources, and giving feedback. In response to this trend, this paper proposes an interpretable and open learner model called MA-Elo that captures an abstract representation of a student’s knowledge state based on their engagement with multiple types of learning activities. We apply MA-Elo to three data sets obtained from an educational system supporting multiple student activities. Results indicate that the proposed approach can provide a higher predictive performance compared with baseline and some state-of-the-art learner models.

Solmaz Abdi, Hassan Khosravi, Shazia Sadiq, Ali Darvishi
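
As a rough illustration of the Elo-style update that rating-based learner models build on, the sketch below applies the standard Elo rule to one student rating and one item difficulty. It is not the MA-Elo model from the paper above: the function names, the step size `k`, and the plain logistic scale are assumptions chosen for illustration.

```python
import math


def expected_success(student_rating: float, item_difficulty: float) -> float:
    """Logistic probability that the student succeeds on the item."""
    return 1.0 / (1.0 + math.exp(-(student_rating - item_difficulty)))


def elo_update(student_rating, item_difficulty, correct, k=0.4):
    """Return the updated (student_rating, item_difficulty) after one attempt.

    `k` is a hypothetical step size; `correct` is True or False.
    """
    p = expected_success(student_rating, item_difficulty)
    outcome = 1.0 if correct else 0.0
    # A surprising success raises the student rating and lowers the
    # estimated item difficulty by the same amount, and vice versa.
    new_student = student_rating + k * (outcome - p)
    new_item = item_difficulty - k * (outcome - p)
    return new_student, new_item
```

Because both ratings remain plain numbers, a model of this kind can be shown to learners directly, which is one reason Elo-style models are often described as interpretable.
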
Personal Vocabulary Recommendation to Support Real Life Needs

The vocabulary taught in language classes or through digital language learning tools is disconnected from the real-life needs of many language learners. Immigrants, refugees, and students abroad learn a language to navigate through their daily lives and often need words that are missing from the curricula they study. Today’s language learners rely heavily on digital translators and dictionaries, creating a database of words they need in their everyday life. The availability of this data could allow personal vocabulary suggestions that meet real-life needs. To show the unsuitability of commonly provided vocabulary lists, we compare them to the vocabulary needed by 37 Syrian refugees living in Lebanon and Germany. We show that the vocabulary provided by the Cambridge English List and Duolingo has low usefulness and low efficiency and discuss future directions for personal vocabulary recommendations.

Victoria Abou-Khalil, Brendan Flanagan, Hiroaki Ogata
Artificial Intelligence Ethics Guidelines for K-12 Education: A Review of the Global Landscape

To scope the global landscape of ethical issues involving the use of AI in K-12 education, we identified relevant ethics guidance documents, and then compared and contrasted concerns raised and principles applied. We found that while AIEdK-12 ethics guidelines employed many principles common to non-AIEd policy statements (e.g., transparency), new ethical principles were being engaged including pedagogical appropriateness and children’s rights.

Cathy Adams, Patti Pente, Gillian Lemermeyer, Geoffrey Rockwell
Quantitative Analysis to Further Validate WC-GCMS, a Computational Metric of Collaboration in Online Textual Discourse

Online learning is increasingly prevalent, both for its advantage of unhindered access to quality learning and for its leverage for education during the pandemic. Improving the social experience in online learning could scaffold the cognitive benefits it provides. A potential strategy is to support online groups in real time, similar to how a teacher guides face-to-face (F2F) group learning in a traditional classroom. Previously, we introduced the Word-Count/Gini-Coefficient Measure of Symmetry (WC-GCMS) that can automatically reflect the collaboration level of online textual discourse. In this paper, we introduce Social Coherence (SC), another marker of collaboration, and our analysis shows that WC-GCMS is sensitive to the SC level of group discourse, further validating the potency of the metric.

Adetunji Adeniran, Judith Masthoff
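
The exact definition of WC-GCMS is given in the authors' earlier work and is not reproduced in this listing. As a minimal sketch of the ingredient its name refers to, the code below computes a Gini coefficient over per-participant word counts; the function name and the choice of input are assumptions for illustration only.

```python
def gini(word_counts):
    """Gini coefficient of non-negative counts.

    0.0 means every participant contributed the same number of words;
    values near (n - 1) / n mean one participant dominated.
    """
    n = len(word_counts)
    total = sum(word_counts)
    if n == 0 or total == 0:
        return 0.0
    # Mean absolute difference over all ordered pairs, normalised by
    # twice the mean: sum|x_i - x_j| / (2 * n^2 * mean) = ... / (2 * n * total).
    diff_sum = sum(abs(a - b) for a in word_counts for b in word_counts)
    return diff_sum / (2 * n * total)
```

For example, `gini([10, 10, 10])` is 0.0 for a perfectly balanced three-person discussion, while a discussion in which one person writes everything approaches the upper bound of 2/3 for three participants.
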
Generation of Automatic Data-Driven Feedback to Students Using Explainable Machine Learning

This paper proposes a novel approach that employs learning analytics techniques combined with explainable machine learning to provide automatic and intelligent actionable feedback that supports students’ self-regulation of learning in a data-driven manner. Prior studies within the field of learning analytics predict students’ performance and use the prediction status as feedback without explaining the reasons behind the prediction. Our proposed method, which has been developed based on LMS data from a university course, extends this approach by explaining the root causes of the predictions and automatically provides data-driven recommendations for action. The effectiveness of the underlying predictive model is evaluated, with the results demonstrating 90 per cent accuracy.

Muhammad Afzaal, Jalal Nouri, Aayesha Zia, Panagiotis Papapetrou, Uno Fors, Yongchao Wu, Xiu Li, Rebecka Weegar
Interactive Personas: Towards the Dynamic Assessment of Student Motivation within ITS

An intelligent system can provide sufficient collaborative opportunities and support yet fail to be pedagogically effective if the students are unwilling to participate. One of the common ways to assess motivation is using self-report questionnaires, which often do not take the context and the dynamic aspect of motivation into account. To address this, we propose personas, a user-centered design approach. We describe two design iterations where we: identify motivational factors related to students’ collaborative behaviors; and develop a set of representative personas. These personas could be embedded in an interface and be used as an alternative method to assess motivation within ITS.

Ishrat Ahmed, Adam Clark, Stefania Metzger, Ruth Wylie, Yoav Bergner, Erin Walker
Agent-Based Classroom Environment Simulation: The Effect of Disruptive Schoolchildren’s Behaviour Versus Teacher Control over Neighbours

Schoolchildren’s academic progress is known to be affected by the classroom environment. It is important for teachers and administrators to understand their pupils’ status and how various factors in the classroom may affect them, as it can help them adjust pedagogical interventions and management styles. In this study, we expand a novel agent-based model of classroom interactions of our design, towards a more efficient model, enriched with further parameters of peers and teacher’s characteristics, which we believe renders a more realistic setting. Specifically, we explore the effect of disruptive neighbours and teacher control. The dataset used for the design of our model consists of 65,385 records, which represent 3,315 classes in 2007, from 2,040 schools in the UK.

Khulood Alharbi, Alexandra I. Cristea, Lei Shi, Peter Tymms, Chris Brown
Integration of Automated Essay Scoring Models Using Item Response Theory

Automated essay scoring (AES) is the task of automatically grading essays without human raters. Many AES models offering different benefits have been proposed over the past few decades. This study proposes a new framework for integrating AES models that uses item response theory (IRT). Specifically, the proposed framework uses IRT to average prediction scores from various AES models while considering the characteristics of each model for evaluation of examinee ability. This study demonstrates that the proposed framework provides higher accuracy than individual AES models and simple averaging methods.

Itsuki Aomi, Emiko Tsutsumi, Masaki Uto, Maomi Ueno
Towards Sharing Student Models Across Learning Systems

Modern AIED systems develop sophisticated and multidimensional models of students. However, what is learned about students in one system—their skills, behaviors, and affect—is not carried over to other systems that could benefit students by using the information, potentially reducing both the effectiveness and efficiency of these systems. This challenge has been cited by a number of researchers as one of the most important for the field of AIED. In this paper, we discuss existing progress towards resolving this challenge, break down five sub-challenges, and propose how to address the sub-challenges.

Ryan S. Baker, Bruce M. McLaren, Stephen Hutt, J. Elizabeth Richey, Elizabeth Rowe, Ma. Victoria Almeda, Michael Mogessie, Juliana M. AL. Andres
Protecting Student Privacy with Synthetic Data from Generative Adversarial Networks

Educational data requires layers of protection that prohibit easy access to sensitive student data. However, the additional layers of security hinder research that relies on educational data to progress. In this paper, a Least Squares GAN (LSGAN) is proposed to create synthetic student performance datasets based on a master dataset without recreating samples. Synthetic data is less likely to be traced back to a student thereby reducing privacy issues. Two feature subsets were considered in the study: sequential, and all features. GANs trained on the sequential data produced new datasets that were representative of student performance from the training dataset, while the GAN trained on all features was not able to capture characteristics from the dataset. Based on the results, the synthetic dataset can provide an alternative unrestricted source of data without compromising student privacy.

Peter Bautista, Paul Salvador Inventado
Learning Analytics and Fairness: Do Existing Algorithms Serve Everyone Equally?

Systemic inequalities still exist within Higher Education (HE). Reports from Universities UK show a 13% degree-awarding gap for Black, Asian and Minority Ethnic (BAME) students, with similar effects found when comparing students across other protected attributes, such as gender or disability. In this paper, we study whether existing prediction models to identify students at risk of failing (and hence providing early and adequate support to students) work equally effectively for the majority vs minority groups. We also investigate whether disaggregating data by protected attributes and building individual prediction models for each subgroup (e.g., a specific prediction model for females vs the one for males) could enhance model fairness. Our experiments, conducted over 35,067 students and evaluated over 32,538 students, show that existing prediction models do indeed seem to favour the majority group. Contrary to our hypothesis, creating individual models does not help improve accuracy or fairness.

Vaclav Bayer, Martin Hlosta, Miriam Fernandez
Exploiting Structured Error to Improve Automated Scoring of Oral Reading Fluency

In order to track the development of young readers’ oral reading fluency (ORF) at scale, it is necessary to move away from hand-scoring responses to automating the assessment of ORF, while retaining the quality of the scores. We present a method for improving automated ORF scoring that utilizes an observed systematicity in machine error, namely, that cases with low estimated reading accuracy are harder to score correctly for fluency. We show that the method yields an improved performance, including on out-of-domain data.

Beata Beigman Klebanov, Anastassia Loukina
Data Augmentation for Enlarging Student Feature Space and Improving Random Forest Success Prediction

One of the main problems encountered when predicting student success, as a tool to aid students, is the lack of data used to model each student. This lack of data is due in part to the small number of students in each university course and also to the limited number of features that describe the educational background for each student. In this article, we introduce new features by augmenting the student feature space to obtain an improved model. These features are divided into several groups, namely, external added data, metric and counter data, and evolutive data. We will then assess the quality of the augmented data to classify at-risk students in their first year of university. For this article, the classifiers are built using Random Forests. As this learning method measures variable importance, we can enquire into the relevance of the augmented data, as well as the data groups that allow a more significant collection of features.

Timothy H. Bell, Christel Dartigues-Pallez, Florent Jaillet, Christophe Genolini
The School Path Guide: A Practical Introduction to Representation and Reasoning in AI for High School Students

This paper presents a structured activity to introduce high school students to the topics of representation and reasoning in Artificial Intelligence, which are completely new for them at this educational level. The activity has been designed in the scope of the Erasmus+ project called AI+, which aims to develop a curriculum of Artificial Intelligence (AI) for high school students in Europe. As established in the AI+ principles, all the teaching activities are based on the use of the student’s smartphone as the core element to introduce a practical approach to AI in classes. In this case, a smartphone app is developed by students using the MIT App Inventor software. The topics of representation and reasoning are introduced to students by means of topological maps and graph-like representations, which are used later to perform simple probabilistic reasoning over them.

Sara Guerreiro-Santalla, Francisco Bellas, Oscar Fontenla-Romero
Kwame: A Bilingual AI Teaching Assistant for Online SuaCode Courses

Introductory hands-on courses such as our smartphone-based coding course, SuaCode, require a lot of support for students to accomplish learning goals. Online environments make it even more difficult to get assistance, especially recently because of COVID-19. Given the multilingual context of SuaCode students (learners across 42 African countries that are mostly Anglophone or Francophone), we developed a bilingual Artificial Intelligence (AI) Teaching Assistant (TA), Kwame, that provides answers to students’ coding questions from SuaCode courses in English and French. Kwame is a Sentence-BERT (SBERT)-based question-answering (QA) system that we trained and evaluated offline using question-answer pairs created from the course’s quizzes, lesson notes and students’ questions in past cohorts. Kwame finds the paragraph most semantically similar to the question via cosine similarity. We compared the system with TF-IDF and Universal Sentence Encoder. Our results showed that fine-tuning on the course data and returning the top 3 and 5 answers improved the accuracy results. Kwame will make it easy for students to get quick and accurate answers to questions in SuaCode courses.

George Boateng
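
The retrieval step described in the Kwame abstract, picking the paragraphs closest to a question by cosine similarity and returning the top few, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embeddings are assumed to have been produced already by some sentence encoder, and the function names and toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_paragraphs(question_vec, paragraph_vecs, k=3):
    """Return indices of the k paragraphs most similar to the question."""
    scored = [
        (cosine_similarity(question_vec, vec), index)
        for index, vec in enumerate(paragraph_vecs)
    ]
    # Highest similarity first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Toy 2-dimensional "embeddings" standing in for real sentence vectors.
paragraphs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best_two = top_k_paragraphs([1.0, 0.1], paragraphs, k=2)
```

Returning several candidates rather than one trades precision for a better chance that a correct answer is among those shown, which matches the abstract's report that top-3 and top-5 answers improved accuracy.
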
Early Prediction of Children’s Disengagement in a Tablet Tutor Using Visual Features

Intelligent tutoring systems could benefit from human teachers’ ability to monitor students’ affective states by watching them and thereby detecting early warning signs of disengagement in time to prevent it. Toward that goal, this paper describes a method that uses input from a tablet tutor’s user-facing camera to predict whether the student will complete the current activity or disengage from it. Training a disengagement predictor is useful not only in itself but also in identifying visual indicators of negative affective states even when they don’t lead to non-completion of the task. Unlike prior work that relied on tutor-specific features, the method relies solely on visual features and so could potentially apply to other tutors. We present a deep learning method to make such predictions based on a Long Short Term Memory (LSTM) model that uses a target replication loss function. We train and test the model on screen capture videos of children in Tanzania using a tablet tutor to learn basic Swahili literacy and numeracy. We achieve balanced-class-size prediction accuracy of 73.3% when 40% of the activity is still left. We also analysed how prediction accuracy varies among tutor activities, revealing two distinct causes of disengagement.

Bikram Boote, Mansi Agarwal, Jack Mostow
An Educational System for Personalized Teacher Recommendation in K-12 Online Classrooms

In this paper, we propose a simple yet effective solution to build practical teacher recommender systems for online one-on-one classes. Our system consists of (1) a pseudo matching score module that provides reliable training labels; (2) a ranking model that scores every candidate teacher; (3) a novelty boosting module that gives additional opportunities to new teachers; and (4) a diversity metric that guardrails the recommended results to reduce the chance of collision. Offline experimental results show that our approach outperforms a wide range of baselines. Furthermore, we show that our approach is able to reduce the number of student-teacher matching attempts from 7.22 to 3.09 in a five-month observation on a third-party online education platform.

Jiahao Chen, Hang Li, Wenbiao Ding, Zitao Liu
Designing Intelligent Systems to Support Medical Diagnostic Reasoning Using Process Data

We captured 36 medical professionals’ process data across five medical cases using CResME, a multimedia system designed to activate illness scripts. Findings showed medical expertise was unrelated to diagnostic performance when illness scripts were disrupted, and that process data was predictive of diagnostic performance for some medical cases. Implications of our study illustrate ways to design AIEd systems capable of scaffolding diagnostic reasoning to reduce medical errors.

Elizabeth B. Cloude, Nikki Anne M. Ballelos, Roger Azevedo, Analia Castiglioni, Jeffrey LaRochelle, Anya Andrews, Caridad Hernandez
Incorporating Item Response Theory into Knowledge Tracing

The popularity of artificial neural networks has brought high predictive power to many difficult machine learning problems. Knowledge tracing (KT), the task of tracking students’ understanding of various concepts over time, is included in this category. But the deep learning methods which have performed best in knowledge tracing are hard to explain in a statistical sense. In this work, we leverage the psychological theory from Item Response Theory (IRT) to build interpretable neural networks for knowledge tracing which are competitive with other deep learning methods. This presents a trade-off between a small loss in predictive power and an increase in interpretability. The advantage of IRT-inspired knowledge tracing is that it transforms the high-dimensional student ability representation from deep learning models into an explainable IRT representation at each timestep. Further, the item parameters from IRT models can be directly recovered from the trained neural network weights.

Geoffrey Converse, Shi Pu, Suely Oliveira
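
The abstract above does not state which IRT model the networks are built around. As one common example, the two-parameter logistic (2PL) item response function is sketched below; the function and parameter names are assumptions for illustration.

```python
import math


def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL IRT model.

    theta: student ability
    a:     item discrimination (how sharply the item separates abilities)
    b:     item difficulty (ability at which success probability is 0.5)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

The form is a single sigmoid unit applied to `a * (theta - b)`, which suggests how ability and item parameters could be read off a suitably constrained network layer.
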
Automated Model of Comprehension V2.0

Reading comprehension is key to knowledge acquisition and to reinforcing memory for previous information. While reading, a mental representation is constructed in the reader’s mind. The mental model comprises the words in the text, the relations between the words, and inferences linking to concepts in prior knowledge. The automated model of comprehension (AMoC) simulates the construction of readers’ mental representations of text by building syntactic and semantic relations between words, coupled with inferences of related concepts that rely on various automated semantic models. This paper introduces the second version of AMoC that builds upon the initial model with a revised processing pipeline in Python leveraging state-of-the-art NLP models, additional heuristics for improved representations, as well as a new radiant graph visualization of the comprehension model.

Dragos-Georgian Corlatescu, Mihai Dascalu, Danielle S. McNamara
Pre-course Prediction of At-Risk Calculus Students

Identifying students who are at risk of failing a mathematics course at the earliest possible moment allows for support and scaffolding to be applied when it can have greatest impact. However, because risk of non-success can arise from a complex interaction of factors, early detection of struggling students is difficult. Machine learning is particularly suited to modeling this challenging interplay of variables. In this study, we measure how well machine learning models can identify at-risk students before an entry-level university calculus course begins. Five classification algorithms were applied to data combined from the student information system, an adaptive placement test, and a student survey. We were able to produce predictions before class start that were competitive with other studies using course activity data after coursework began. In addition, important features of the model provided insights into possible causes of student non-success.

James Cunningham, Raktim Mukhopadhyay, Rishabh Ranjit Kumar Jain, Jeffrey Matayoshi, Eric Cosyn, Hasan Uzun
Examining Learners’ Reflections over Time During Game-Based Learning

Reflections are critical components of game-based learning environments (GBLEs) as learners must accurately use and monitor self-regulatory processes while learning with instructional materials. Within this study, we examined how middle-school students (N = 35) learned with Crystal Island, a microbiology-based GBLE where learners are required to diagnose a disease infecting researchers on an island. This study aimed to identify how learners’ time reflecting changed during gameplay and is related to learners’ scientific reasoning actions (e.g., information gathering, note-taking, hypothesis formation and testing) and whether this was related to learning gains. Results from a multilevel growth model indicated that time spent reflecting increased over time, but the specific timing of reflection prompts (e.g., after submitting a diagnosis) was related to the time learners reflected over time. Further, time engaging in scientific actions and learning gains moderated the relationship between time spent reflecting between different reflection prompts but did not have a main effect on time spent reflecting. This paper discusses implications for when and how reflection prompts should be triggered during game-based learning and designing GBLEs capable of intelligently and dynamically modeling, scaffolding, and fostering reflective thinking.

Daryn A. Dever, Elizabeth B. Cloude, Roger Azevedo
Examining the Use of a Teacher Alerting Dashboard During Remote Learning

Remote learning in response to the COVID-19 pandemic has introduced many challenges for educators. It is important to consider how AI technologies can be leveraged to support educators and, in turn, help students learn in remote settings. In this paper, we present the results of a mixed-methods study that examined how teachers used a dashboard with real-time alerts during remote learning. Specifically, three high school teachers held remote synchronous classes and received alerts in the dashboard about students’ difficulties on scientific inquiry practices while students conducted virtual lab investigations in an intelligent tutoring system. Quantitative analyses revealed that students significantly improved across a majority of inquiry practices during remote use of the technologies. Additionally, through qualitative analyses of the transcribed audio data, we identified five trends related to dashboard use in a remote setting, including three reflecting effective implementations of dashboard features and two reflecting the limitations of dashboard use. Implications regarding the design of dashboards for use across varying contexts are discussed.

Rachel Dickler, Amy Adair, Janice Gobert, Huma Hussain-Abidi, Joe Olsen, Mariel O’Brien, Michael Sao Pedro
Capturing Fairness and Uncertainty in Student Dropout Prediction – A Comparison Study

This study aims to explore and improve ways of handling a continuous variable dataset, in order to predict student dropout in MOOCs, by implementing various models, including the ones most successful across various domains, such as recurrent neural network (RNN), and tree-based algorithms. Unlike existing studies, we arguably fairly compare each algorithm with the dataset that it can perform best with, thus ‘like for like’. That is, we use a time-series dataset ‘as is’ with algorithms suited for time-series, as well as a conversion of the time-series into a discrete-variables dataset, through feature engineering, with algorithms that handle discrete variables well. We show that these much lighter discrete models outperform the time-series models. Our work additionally shows the importance of handling the uncertainty in the data, via these ‘compressed’ models.

Efthyvoulos Drousiotis, Panagiotis Pentaliotis, Lei Shi, Alexandra I. Cristea
Dr. Proctor: A Multi-modal AI-Based Platform for Remote Proctoring in Education

Technological advancements have enabled remote exams as a viable alternative to in-person proctoring. In light of the COVID-19 pandemic, educational institutions relied heavily on remote operation. The sudden shift exposed the weaknesses in available proctoring solutions, as pertains to fairness, economic viability, data privacy, network issues and usability. Moreover, whether they are equal in function to physical proctoring is questionable. Based on extensive research, we establish the system requirements and design for Dr. Proctor, a non-commercial solution that addresses many of the exposed concerns about remote proctoring.

Ahmed E. Elshafey, Mohammed R. Anany, Amr S. Mohamed, Nourhan Sakr, Sherif G. Aly
Multimodal Trajectory Analysis of Visitor Engagement with Interactive Science Museum Exhibits

Recent years have seen a growing interest in investigating visitor engagement in science museums with multimodal learning analytics. Visitor engagement is a multidimensional process that unfolds temporally over the course of a museum visit. In this paper, we introduce a multimodal trajectory analysis framework for modeling visitor engagement with an interactive science exhibit for environmental sustainability. We investigate trajectories of multimodal data captured during visitor interactions with the exhibit through slope-based time series analysis. Utilizing the slopes of the time series representations for each multimodal data channel, we conduct an ablation study to investigate how additional modalities lead to improved accuracy while modeling visitor engagement. We are able to enhance visitor engagement models by accounting for varying levels of visitors’ science fascination, a construct integrating science interest, curiosity, and mastery goals. The results suggest that trajectory-based representations of the multimodal visitor data can serve as the foundation for visitor engagement modeling to enhance museum learning experiences.

Andrew Emerson, Nathan Henderson, Wookhee Min, Jonathan Rowe, James Minogue, James Lester
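
The slope-based representation mentioned in the abstract above reduces each data channel's time series to its trend. A minimal sketch, assuming evenly spaced samples and an ordinary least-squares fit (the paper's exact procedure is not given in this listing, and the function name is hypothetical):

```python
def series_slope(values):
    """Least-squares slope of a series sampled at t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    # slope = cov(t, y) / var(t)
    numerator = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    denominator = sum((t - mean_t) ** 2 for t in range(n))
    return numerator / denominator
```

One slope per modality (for example, one per sensor channel) gives a compact feature vector, and dropping entries from that vector is one way an ablation over modalities could be set up.
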
Analytics of Emerging and Scripted Roles in Online Discussions: An Epistemic Network Analysis Approach

This paper investigates emerging roles in the context of the community of inquiry model. The paper reports the results of a study that demonstrated the application of epistemic network and clustering analyses to reveal the roles that different students assumed during an asynchronous course with online discussions. The proposed method highlights the differences and similarities between emerging and scripted roles based on the development of social and cognitive presences, two key constructs of the model of communities of inquiry.

Máverick Ferreira, Rafael Ferreira Mello, Rafael Dueire Lins, Dragan Gašević
Towards Automatic Content Analysis of Rhetorical Structure in Brazilian College Entrance Essays

Essay scorers manually look for the presence of required rhetorical categories to evaluate coherence, which is a time-consuming task. Several attempts in the literature have been reported to automate the identification of rhetorical categories in essays with machine learning. However, existing machine learning algorithms are mostly trained on content features, which can lead to over-fitting and hinder model generalizability. Thus, this paper proposes a set of content-independent features to identify rhetorical categories. The best performing classifier, XGBoost, achieved performance comparable to human annotation and outperformed previous models.

Rafael Ferreira Mello, Giuseppe Fiorentino, Péricles Miranda, Hilário Oliveira, Mladen Raković, Dragan Gašević
Contrasting Automatic and Manual Group Formation: A Case Study in a Software Engineering Postgraduate Course

This paper proposes the comparison of a group formation approach based on an evolutionary algorithm with a manual approach performed by an instructor with ten years of experience on this task. The groups were created based on the professional, psychological, and experience profile of each student. The results obtained demonstrated the algorithm’s potential, reaching an average similarity of 83.46% with the groups formed manually by the instructor.

Giuseppe Fiorentino, Péricles Miranda, André Nascimento, Ana Paula Furtado, Henrik Bellhäuser, Dragan Gašević, Rafael Ferreira Mello
Aligning Expectations About the Adoption of Learning Analytics in a Brazilian Higher Education Institution

Stakeholders’ buy-in is fundamental for the successful implementation of Learning Analytics (LA) in Higher Education. We present the results of a survey in a Brazilian HEI, to investigate the ideal and realistic expectations of students and instructors about the adoption of LA. Results indicate a high interest in using LA for improving the learning experience, but with ideal expectations higher than realistic expectations, and point out key challenges and opportunities for Latin American researchers to join efforts towards building solid evidence that can inform educational policy-makers and managers, and support the development of strategies for LA services in the region.

Samantha Garcia, Elaine Cristina Moreira Marques, Rafael Ferreira Mello, Dragan Gašević, Taciana Pontual Falcão
Interactive Teaching with Groups of Unknown Bayesian Learners

In this work we empirically explore the extension of an interactive approach for machine teaching from single learners to groups of learners. We use interactivity to overcome the common mismatch between the knowledge the teacher has about the students and the students themselves. With a multi-learner setting we also investigated the best way to consider the class: as a whole or divided into partitions according to the students’ priors. The results of a user study where we teach a Bayesian estimation task have shown that, regardless of considering partitions or not, the interactive approaches significantly increase the learning performance of the class when compared to non-interactive alternatives.

Carla Guerra, Francisco S. Melo, Manuel Lopes
Multi-task Learning Based Online Dialogic Instruction Detection with Pre-trained Language Models

In this work, we study computational approaches to detect online dialogic instructions, which are widely used to help students understand learning materials and build effective study habits. This task is rather challenging due to the widely varying quality and pedagogical styles of dialogic instructions. To address these challenges, we utilize pre-trained language models, and propose a multi-task paradigm which enhances the ability to distinguish instances of different classes by enlarging the margin between categories via contrastive loss. Furthermore, we design a strategy to fully exploit the misclassified examples during the training stage. Extensive experiments on a real-world online educational data set demonstrate that our approach achieves superior performance compared to representative baselines. To encourage reproducible results, we make our implementation available online at .

Yang Hao, Hang Li, Wenbiao Ding, Zhongqin Wu, Jiliang Tang, Rose Luckin, Zitao Liu
Impact of Predictive Learning Analytics on Course Awarding Gap of Disadvantaged Students in STEM

In this work, we investigate the degree-awarding gap in distance higher education by studying the impact of a Predictive Learning Analytics system, when applying it to 3 STEM (Science, Technology, Engineering and Mathematics) courses with over 1,500 students. We focus on Black, Asian and Minority Ethnicity (BAME) students and students from areas with high deprivation, a proxy for low socio-economic status. Nineteen teachers used the system to obtain predictions of which students were at risk of failing and got in touch with them to support them (intervention group). The learning outcomes of these students were compared with students whose teachers did not use the system (comparison group). Our results show that students in the intervention group had 7% higher chances of passing the course, when controlling for other potential factors of success, with the actual pass rates being 64% vs 61%. When disaggregated: 1) BAME students had 10% higher pass rates (55% vs 45%) than BAME students in the comparison group and 2) students from the most deprived areas had 4% higher pass rates (58% vs 54%) in the intervention group compared to the comparison group.

Martin Hlosta, Christothea Herodotou, Vaclav Bayer, Miriam Fernandez
Evaluation of Automated Image Descriptions for Visually Impaired Students

Illustrations are widely used in education, and sometimes, alternatives are not available for visually impaired students. Therefore, those students would benefit greatly from an automatic illustration description system, but only if those descriptions were complete, correct, and easily understandable using a screenreader. In this paper, we report on a study for the assessment of automated image descriptions. We interviewed experts to establish evaluation criteria, which we then used to create an evaluation questionnaire for sighted non-expert raters, and description templates. We used this questionnaire to evaluate the quality of descriptions which could be generated with a template-based automatic image describer. We present evidence that these templates have the potential to generate useful descriptions, and that the questionnaire identifies problems with description templates.

Anett Hoppe, David Morris, Ralph Ewerth
Way to Go! Effects of Motivational Support and Agents on Reducing Foreign Language Anxiety

Using a tutoring system for English as a foreign language, we studied the impact on students’ anxiety levels of an animated agent that provides motivational, supportive feedback. We compared two types of feedback — explanatory and motivational supportive feedback — presented in three ways: by text, by voice, or by a character agent. Results showed that using an agent that gives motivational, supportive feedback decreases the learners’ anxiety levels overall. We also found that performance and gender interact with the effectiveness of the treatment for reducing foreign language anxiety (FLA). Our findings have implications for promoting equity and determining how best to improve positive emotions and reduce anxiety for all students.

Daneih Ismail, Peter Hastings
“I didn’t copy his code”: Code Plagiarism Detection with Visual Proof

Code plagiarism in online courses gives a false idea of the performance of students. In 2020, we ran a smartphone-based online coding course, SuaCode Africa 2.0, in which 27% of plagiarism cases were found in the final assignment submissions. Hence, a need arose to develop software that detects plagiarism among source code. The software described in this paper detects plagiarized source code containing English and French texts. Also, the code examples provided by the instructors are taken into consideration. In other words, code blocks present in the examples can be reused by any student. We trained machine learning models on three cosine-similarity-based metrics extracted from the TF-IDF feature vectors of the code files. The system provides proof of plagiarism on a GUI tool that visualizes the similar sections of the flagged files. This software will contribute to having a sincere evaluation of the impact of SuaCode on the students, thereby preventing the production of incompetent programmers.
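As a rough illustration of the kind of feature mentioned above, the following minimal sketch computes a cosine similarity between TF-IDF vectors of code files. It is not the authors' implementation: the abstract does not specify the three metrics, and the whitespace tokenization, the smoothed IDF formula, and the function names `tfidf_vectors` and `cosine` are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {token: weight} dict per document.

    Assumptions for illustration: tokens are whitespace-separated and
    IDF is smoothed as log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            tok: count * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical submissions: the first two are identical, the third shares no tokens.
submissions = ["int x = 1 ;", "int x = 1 ;", "print hello world"]
vectors = tfidf_vectors(submissions)
similarity_same = cosine(vectors[0], vectors[1])
similarity_different = cosine(vectors[0], vectors[2])
```

In a setup like the one described, scores of this kind would serve as input features to a trained classifier rather than being thresholded directly; that reading is an inference from the abstract, not a documented detail.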

Samuel John, George Boateng
An Epistemic Model-Based Tutor for Imperative Programming

We developed a tutor for imperative programming in C++. It covers algorithm formulation, program design and coding – all three stages involved in writing a program to solve a problem. The design of the tutor is epistemic, i.e., true to real-life programming practice. The student works through all three stages of programming in an interleaved fashion, and within the context of a single code canvas. The student has the sole agency to compose the program and write the code. The tutor uses goals and plans as prompts to scaffold the student through the programming process designed by an expert. It provides drill-down immediate feedback at the abstract, concrete and bottom-out levels at each step. So, by the end of the session, the student is guaranteed to write the complete and correct program for a given problem. We used a model-based architecture to implement the tutor because of the ease with which it facilitates adding problems to the tutor. In a preliminary study, we found that practicing with the tutor helped students solve problems with fewer erroneous actions and in less time.

Amruth N. Kumar
Long Term Retention of Programming Concepts Learned Using Tracing Versus Debugging Tutors

We studied long-term retention of the concepts that introductory programming students learned using two software tutors on tracing the behavior of functions and debugging functions. Whereas the concepts covered by the tutor on the behavior of functions were interdependent, the concepts covered by the debugging tutor were independent. We analyzed the data of the students who had used the tutors more than once, hours to weeks apart. Our objective was to find whether students retained what they had learned during the first session until the second session. We found that the more problems students solved during the first session, the greater the retention. Knowledge and retention varied between the debugging and behavior tutors, even though they both dealt with functions, possibly because the debugging tutor covered independent concepts whereas the behavior tutor covered interdependent concepts.

Amruth N. Kumar
Facilitating the Implementation of AI-Based Assistive Technologies for Persons with Disabilities in Vocational Rehabilitation: A Practical Design Thinking Approach

Digital and AI-based assistive technologies (AI-AT) are becoming more important for the inclusion of persons with disabilities (PWD). One challenge in providing PWD with AI-AT is to meet their requirements and needs. At the same time, they are often embedded in organizational contexts and thus need to be cost-effective and easy to learn and handle. This short paper introduces a systematic approach to match the individual needs and organizational context with AI-AT that support working and learning of PWD. The approach combines Design Thinking (DT) methods, participatory elements, and online collaboration tools in a cycle of three workshops. The aim is to understand the target group better, identify, evaluate and choose appropriate AI-AT and develop innovation spaces that help introduce and test AI-AT. The approach was developed for a vocational rehabilitation setting but can also be easily adapted for various settings (e.g., educational technology or corporate AI projects).

Marco Kähler, Rolf Feichtenbeiner, Susan Beudt
Quantifying the Impact of Severe Weather Conditions on Online Learning During the COVID-19 Pandemic

From October to November 2020 the Philippines was struck by eight typhoons, two of which caused widespread flooding, utilities interruptions, property destruction, and loss of life. How did these severe weather conditions affect online learning participation of students pursuing their undergraduate and graduate studies in the midst of the COVID-19 pandemic? We used CausalImpact analysis to explore September 2020 to January 2021 data collected from the Moodle Learning Management System of one university in the Philippines. We found that overall student online participation was significantly negatively affected by typhoons. However, the effect on participation in Assignments and Quizzes was not significant. These findings suggested that students continued to invest their time and energy in activities that have a direct bearing on their final grades.

Ezekiel Adriel Lagmay, Ma. Mercedes T. Rodrigo
I-Mouse: A Framework for Player Assistance in Adaptive Serious Games

A serious game is an educational digital game created to entertain and to achieve a characterizing goal of promoting learning. However, a serious game’s major challenge is capturing and sustaining player attention and motivation, thus restricting learning abilities. Adaptive frameworks in serious games (Adaptive serious games) tackle the challenge by automatically assisting players in balancing boredom and frustration. The current state-of-the-art in Adaptive serious games targets modeling a player’s cognitive states by considering eye-tracking characteristics like gaze, fixation, pupil diameter, or mouse tracking characteristics such as mouse positions. However, a combination of eye and mouse tracking characteristics has seldom been used. Hence, we present I-Mouse, a framework for predicting the need for player assistance in educational serious games through a combination of eye and mouse-tracking data. The I-Mouse framework comprises four steps: (a) Feature generation for identifying cognitive states, (b) Partition clustering for player state modeling, (c) Data balancing of the clustered data, and (d) Classification to predict the need for assistance. We evaluate the framework using a real game data set to predict the need for assistance, and Random Forest is the best performing model with an accuracy of 99% amongst the trained classification models.

Riya Lalwani, Ashish Chouhan, Varun John, Prashant Sonar, Aakash Mahajan, Naresh Pendyala, Alexander Streicher, Ajinkya Prabhune
Parent-EMBRACE: An Adaptive Dialogic Reading Intervention

Dialogic reading is a practice where adults and children engage in a dialogue as they read together to improve children’s language strategies and comprehension. These dialogues are often initiated by parent questioning behaviors, but parents do not always engage in this behavior spontaneously. In this paper, we describe an adaptive intervention for dialogic reading, Parent-EMBRACE, built into an iPad application that uses an embodied cognition approach and is designed specifically for Latino dual language learners in the US. The intervention: 1) Models parent question asking, 2) Provides parents with on-demand hints on questions that can be asked at particular moments during the story, 3) Prompts parents to ask questions at appropriate times, 4) Includes a dashboard that presents parents with data on their question-asking behaviors, 5) Provides all support in both English and Spanish. We discuss the implications of this intervention as an intelligent tutoring system for parent-child interactions, and plans to extend and evaluate the system.

Arun Balajiee Lekshmi Narayanan, Ju Eun Lim, Tri Nguyen, Ligia E. Gomez, M. Adelaida Restrepo, Chris Blais, Arthur M. Glenberg, Erin Walker
Using Fair AI with Debiased Network Embeddings to Support Help Seeking in an Online Math Learning Platform

There has been a long-standing issue of sparse discussion forum participation in online learning, which can impede students’ help-seeking practices. Researchers have examined AI techniques such as link prediction with network analysis to connect help seekers with help providers. However, little is known about whether these AI systems will treat students fairly. In this study, we aim to lay the groundwork for a recommender system that can (1) fairly suggest peers who are likely to answer a question and (2) predict the response quality of students.

Chenglu Li, Wanli Xing, Walter Leite
A Multimodal Machine Learning Framework for Teacher Vocal Delivery Evaluation

The quality of vocal delivery is one of the key indicators for evaluating teacher enthusiasm, which has been widely accepted to be connected to overall course quality. However, existing evaluation of vocal delivery is mainly conducted with manual ratings, which face two core challenges: they are subjective and time-consuming. In this paper, we present a novel machine learning approach that utilizes pairwise comparisons and a multimodal orthogonal fusing algorithm to generate large-scale objective evaluation results of teacher vocal delivery in terms of fluency and passion. We collect two datasets from real-world education scenarios and the experiment results demonstrate the effectiveness of our algorithm. To encourage reproducible results, we make our code publicly available at .

Hang Li, Yu Kang, Yang Hao, Wenbiao Ding, Zhongqin Wu, Zitao Liu
Solving ESL Sentence Completion Questions via Pre-trained Neural Language Models

Sentence completion (SC) questions present a sentence with one or more blanks that need to be filled in, with three to five possible words or phrases as options. SC questions are widely used for students learning English as a Second Language (ESL) and building computational approaches to automatically solve such questions is beneficial to language learners. In this work, we propose a neural framework to solve SC questions in English examinations by utilizing pre-trained language models. We conduct extensive experiments on a real-world K-12 ESL SC question dataset and the results demonstrate the superiority of our model in terms of prediction accuracy. Furthermore, we run precision-recall trade-off analysis to discuss the practical issues when deploying it in real-life scenarios. To encourage reproducible results, we make our code publicly available at .

Qiongqiong Liu, Tianqiao Liu, Jiafu Zhao, Qiang Fang, Wenbiao Ding, Zhongqin Wu, Feng Xia, Jiliang Tang, Zitao Liu
DanceTutor: An ITS for Coaching Novice Ballet Dancers Using Pose Recognition of Whole-Body Movements

This paper presents the design, development and evaluation of a prototype intelligent dance tutoring system, DanceTutor, for coaching students in low-resource settings. The system evaluates seventeen core body points on a dancer using video footage captured from a mobile phone or web camera using a combination of simple algorithms and 2D pose estimation software. Detailed feedback is provided on the quality and correctness of the dancer’s pose for the first five static dance positions in Ballet, and then for intermediate to advanced exercises with permutations of the five basic Ballet positions. Evaluation of the prototype revealed the highly subjective nature and cultural biases of evaluating the quality of a dancer’s technique. Three experienced dance teachers, trained in different countries, evaluated 165 video recordings of 11 candidate dancers. The system was only able to achieve 47% consensus overall with the feedback and grading results produced by the dance teachers, who each evaluated tension and height differently. There was however a 60% agreement between DanceTutor and one teacher who used the most granular evaluation strategy matching DanceTutor’s baseline and assessment features.

Lurlynn Maharaj-Pariagsingh, Phaedra S. Mohammed
Tracing Embodied Narratives of Critical Thinking

Critical Thinking (CrT) is generally characterized as an abstract thinking process, detached from the (bodily) actions one engages in during the process. Though recent cognitive theories assert that all thinking is action-based, the embodied and distributed cognitive processes underlying CrT have not been identified. We present preliminary findings from the first iteration of a design-based research project which involves probing possible connections between CrT and one’s (bodily) action sequences. We performed sequential pattern mining and qualitative analysis on the study participants’ action logs to find differences in participants’ CrT processes. Our analysis showed that only a subset of participants contextualized their assumptions, inferences, and implications in the different information resources available in the environment. A majority of participants’ actions performed within the interface were incoherent. These results have implications for automated analyses of the CrT process, and for the design of AI-based scaffolds to support CrT development.

Shitanshu Mishra, Rwitajit Majumdar, Aditi Kothiyal, Prajakt Pande, Jayakrishnan Madathil Warriem
Multi-armed Bandit Algorithms for Adaptive Learning: A Survey

Adaptive learning aims to provide each student with individual tasks specifically tailored to his/her strengths and weaknesses. However, realizing it is challenging because of the complexity of online learning. Many problems remain unsolved, such as knowledge component sequencing, activity sequencing, exercise sequencing, question sequencing, and pedagogical strategy. Bandit algorithms are particularly suitable for modeling the process of making a decision and using feedback on its outcome to inform future decisions. They are finding their way into practical applications in various areas, especially in online platforms where data is readily available and automation is the only way to scale. This paper presents a survey on bandit algorithms for facilitating adaptive learning in different settings. The findings indicate that the various bandit algorithms have great potential to solve the above problems. Also, we discuss issues and challenges of developing and using adaptive learning systems based on the multi-armed bandit framework.
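The decide-then-learn-from-feedback loop described above can be sketched with epsilon-greedy, one of the simplest bandit strategies. It is chosen here purely for illustration and is not singled out by the abstract. The class name `EpsilonGreedyBandit`, the idea of treating each candidate exercise as an arm, and the 0/1 reward for an incorrect or correct answer are all assumptions of this sketch.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: explore with probability epsilon, else exploit."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each arm was chosen
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore: pick a random arm with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Exploit: pick the arm with the highest estimated mean reward
        # (ties resolve to the lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Hypothetical usage: three candidate exercises, reward 1.0 for a correct answer.
bandit = EpsilonGreedyBandit(n_arms=3, epsilon=0.0, seed=0)
bandit.update(1, 1.0)          # exercise 1 was answered correctly
chosen = bandit.select_arm()   # with epsilon=0.0 the choice is purely greedy
bandit.update(1, 0.0)          # a later incorrect answer lowers the estimate
```

Setting `epsilon` to 0.0 in the usage lines only makes the example deterministic; a nonzero value is what allows the learner to keep trying arms whose estimates are still uncertain.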

John Mui, Fuhua Lin, M. Ali Akber Dewan
Paraphrasing Academic Text: A Study of Back-Translating Anatomy and Physiology with Transformers

This paper explores a general approach to paraphrase generation using a pre-trained seq2seq model fine-tuned using a back-translated anatomy and physiology textbook. Human ratings indicate that the paraphrase model generally preserved meaning and grammaticality/fluency: 70% of meaning ratings were above 75, and 40% of paraphrases were considered more grammatical/fluent than the originals. An error analysis suggests potential avenues for future work.

Andrew M. Olney
PAKT: A Position-Aware Self-attentive Approach for Knowledge Tracing

Knowledge Tracing aims to model a student’s knowledge state from her past learning interactions and predict her performance in the future. Although structures such as positional encoding or a forgetting gate have already been used in Knowledge Tracing models, the considerable potential of positional information is not fully utilized. In this paper, we propose a Position-aware Self-Attentive Knowledge Tracing (PAKT) model with a position supervision mechanism. Extensive experimental results show that PAKT outperforms other benchmarks on several popular datasets.

Yuanxin Ouyang, Yucong Zhou, Hongbo Zhang, Wenge Rong, Zhang Xiong
Identifying Struggling Students by Comparing Online Tutor Clickstreams

New ways to identify students in need of assistance are imperative to the evolution of online tutoring platforms. Currently implemented models to identify struggling students use costly and tedious classroom observation paired with students’ platform usage, and are often suitable for only a subset of students. With the recent influx of new students to online tutoring platforms due to COVID-19, a simple method to quickly identify struggling students could help facilitate effective remote learning. To this end, we created an anomaly detection algorithm that models the normal behavior of students during remote learning and recognizes when students deviate from this behavior. We demonstrated how anomalous behavior revealed which students needed additional assistance and predicted student learning outcomes.

Ethan Prihar, Alexander Moore, Neil Heffernan
Exploring Dialogism Using Language Models

Dialogism is a philosophical theory centered on the idea that life involves a dialogue among multiple voices in a continuous exchange and interaction. Considering human language, different ideas or points of view take the form of voices, which spread throughout any discourse and influence it. From a computational point of view, voices can be operationalized as semantic chains that contain related words. This study introduces and evaluates a novel method of identifying semantic chains using BERT, a state-of-the-art language model for computational linguistics. The resulting model generalizes to multiple relations including repetitions, semantically related concepts from WordNet (i.e., synonyms, hypernyms, hyponyms, and siblings), as well as pronominal resolutions. By combining the attention scores between words, word pairs are merged into connected components that denote emerging voices from the discourse. The introduced visualization argues for a denser capturing of inner semantic links between words and even compound words in contrast to classical methods of building lexical chains.

Stefan Ruseti, Maria-Dorinela Dascalu, Dragos-Georgian Corlatescu, Mihai Dascalu, Stefan Trausan-Matu, Danielle S. McNamara
EduPal Leaves No Professor Behind: Supporting Faculty via a Peer-Powered Recommender System

The swift transitions in higher education after the COVID-19 outbreak identified a gap in the pedagogical support available to faculty. We propose a smart, knowledge-based chatbot that addresses issues of knowledge distillation and provides faculty with personalized recommendations. Our collaborative system crowdsources useful pedagogical practices and continuously filters recommendations based on theory and user feedback, thus enhancing the experiences of subsequent peers. We build a prototype for our local STEM faculty as a proof of concept and receive favorable feedback that encourages us to extend our development and outreach, especially to underresourced faculty.

Nourhan Sakr, Aya Salama, Nadeen Tameesh, Gihan Osman
Computer-Supported Human Mentoring for Personalized and Equitable Math Learning

Computer tutor data indicate that more learning opportunities yield greater achievement, but also confirm there are gaps in the number and quality of opportunities marginalized students receive that technology alone does not address. Personalized learning with mentors can close this gap in opportunities but is expensive to implement. We introduce a free, web-based application, Personalized Learning² (PL²), designed to improve mentoring efficiency by connecting mentors to intervention and instructional resources. Preliminary findings indicated that PL²’s categorization of students based on math learning software data enabled mentors to focus their efforts, and that mentors found PL² resources to positively expand how they taught and mentored.

Peter Schaldenbrand, Nikki G. Lobczowski, J. Elizabeth Richey, Shivang Gupta, Elizabeth A. McLaughlin, Adetunji Adeniran, Kenneth R. Koedinger
Internalisation of Situational Motivation in an E-Learning Scenario Using Gamification

Self-directed learning is of critical importance in adult learning, for example, when taking part in online courses or learning at universities. To work on a challenging topic continuously requires learners to self-motivate. By applying self-determination theory to address the basic psychological needs for competence, autonomy and relatedness, the internalisation of motivation can be fostered. We implemented a learning environment, which addresses these needs using gamification elements to scaffold situational motivation, and compared it with a control version in a user study to investigate the effect of the implemented gamification elements on the internalization of situational motivation. Our results show an internalization of situational motivation with significantly higher internalised and significantly lower extrinsic situational motivation in the gamified version relative to the control condition.

Philipp Schaper, Anna Riedmann, Birgit Lugrin
Learning Association Between Learning Objectives and Key Concepts to Generate Pedagogically Valuable Questions

It has been shown that answering questions contributes to students learning effectively. However, generating questions is an expensive task and requires a lot of effort. Although there has been research reported on the automation of question generation in the literature of Natural Language Processing, these technologies do not necessarily generate questions that are useful for educational purposes. To fill this gap, we propose QUADL, a method for generating questions that are aligned with a given learning objective. The learning objective reflects the skill or concept that students need to learn. The QUADL method first identifies a key concept, if any, in a given sentence that has a strong connection with the given learning objective. It then converts the given sentence into a question for which the predicted key concept becomes the answer. The results from the survey using Amazon Mechanical Turk suggest that the QUADL method can be a step towards generating questions that effectively contribute to students’ learning.

Machi Shimmei, Noboru Matsuda
Exploring the Working and Effectiveness of Norm-Model Feedback in Conceptual Modelling – A Preliminary Report

Having learners (K7–10) acquire system thinking skills is challenging. Together with teachers we deploy qualitative representations of complex systems to enable this learning process. Teachers select their own topics for their learners to work on, so lessons vary in content depending on the teacher’s preference. Within this setting we face the challenge of adequately coaching learners while they create their knowledge models. For this, we use norm-model based feedback, ignoring learner-specific information. Here we report the working and effectiveness of this approach.

Loek Spitz, Marco Kragten, Bert Bredeweg
A Comparative Study of Learning Outcomes for Online Learning Platforms

Personalization and active learning help educational systems to close the gap between students with varying abilities. We run a comparative head-to-head study of learning outcomes for two popular online platforms: Platform A, which delivers content over lecture videos and multiple-choice quizzes, and Platform B, which provides interactive problem-solving exercises and personalized feedback. We observe a statistically significant increase in the learning outcomes on Platform B. Further, the results of the self-assessment questionnaire suggest that participants using Platform B improve their metacognition.

Francois St-Hilaire, Nathan Burns, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Dung Do Vu, Antoine Frau, Joseph Potochny, Farid Faraji, Vincent Pavero, Neroli Ko, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Laurent Charlin, Yoshua Bengio, Iulian Vlad Serban, Ekaterina Kochmar
Explaining Engagement: Learner Behaviors in a Virtual Coding Camp

Engagement is critical to learning, yet current research rarely explores its underlying contextual influences, such as differences across modalities and tasks. Accordingly we examine how patterns of behavioral engagement manifest in a diverse group of ten middle school girls participating in a synchronous virtual computer science camp. We form multimodal measures of behavioral engagement from learner chats and speech. We found that the function of modalities varies, and chats are useful for short responses, whereas speech is better for elaboration. We discuss implications of our work for the design of intelligent systems that support online educational experiences.

Angela E. B. Stewart, Jaemarie Solyst, Amanda Buddemeyer, Leshell Hatley, Sharon Henderson-Singer, Kimberly Scott, Erin Walker, Amy Ogan
Using AI to Promote Equitable Classroom Discussions: The TalkMoves Application

Inclusion in mathematics education is strongly tied to discourse-rich classrooms, where students’ ideas play a central role. Talk moves are specific discursive practices that promote inclusive and equitable participation in classroom discussions. This paper describes the development of the TalkMoves application, which provides teachers with detailed feedback on their usage of talk moves based on accountable talk theory. Building on our recent work to automate the classification of teacher talk moves, we have expanded the application to also include feedback on a set of student talk moves. We present results from several deep learning models trained to classify student sentences into student talk moves with performance up to 73% F1. The classroom data used for training these models were collected from multiple sources that were pre-processed and annotated by highly reliable experts. We validated the performance of the model on an out-of-sample dataset which included 166 classroom transcripts collected from teachers piloting the application.

Abhijit Suresh, Jennifer Jacobs, Charis Clevenger, Vivian Lai, Chenhao Tan, James H. Martin, Tamara Sumner
Investigating Effects of Selecting Challenging Goals

Goal setting is a vital component of self-regulated learning. Numerous studies show that selecting challenging goals has strong positive effects on performance. We investigate the effect of support for goal setting in SQL-Tutor. The experimental group had support for selecting challenging goals, while the control group students could select goals freely. The experimental group achieved the same learning outcomes as the control group, but by attempting and solving significantly fewer, but more complex problems. Causal modelling revealed that the experimental group students who selected more challenging goals were superior in problem solving. We also found a significant improvement in self-reported goal setting skills of the experimental group.

Faiza Tahir, Antonija Mitrović, Valerie Sotardi
Modeling Frustration Trajectories and Problem-Solving Behaviors in Adaptive Learning Environments for Introductory Computer Science

Modeling a learner’s frustration in adaptive environments can inform scaffolding. While much work has explored momentary frustration, there is limited research investigating the dynamics of frustration over time and its relationship with problem-solving behaviors. In this paper, we clustered 86 undergraduate students into four frustration trajectories as they worked with an adaptive learning environment for introductory computer science. The results indicate that students who initially reported high levels of frustration but then reported lower levels later in their problem solving were more likely to have sought help. These findings provide insight into how frustration trajectory models can guide adaptivity during extended problem-solving episodes.

Xiaoyi Tian, Joseph B. Wiggins, Fahmid Morshed Fahid, Andrew Emerson, Dolly Bounajim, Andy Smith, Kristy Elizabeth Boyer, Eric Wiebe, Bradford Mott, James Lester
Behavioral Phenotyping for Predictive Model Equity and Interpretability in STEM Education

Predictive models are increasingly being deployed in social and behavioral applications in support of decision making that directly affects people’s lives. Given such high stakes, it is important to develop models with interpretable and defensible features, with decisions that are unbiased toward historically marginalized groups. In this work we investigate the use of nonnegative matrix factorization (NMF) for generating interpretable features in an educational setting, combined with a standard bias mitigation algorithm for training predictive models. Our application in this work is predicting enrollment in STEM degrees, and improving fairness of our models through bias mitigation. We perform our experiments on the High School Longitudinal Study of 2009, and evaluate our results using both objective metrics and subjective interpretation of the NMF factors, or behavioral phenotypes. Our empirical results from these experiments suggest that NMF combined with bias mitigation can potentially be used to improve fairness measures while simultaneously aiding in interpretability.

Marcus Tyler, Alex Liu, Ravi Srinivasan
Teaching Underachieving Algebra Students to Construct Models Using a Simple Intelligent Tutoring System

An algebraic model uses a set of algebraic equations to describe a situation. Constructing such models is a fundamental skill, but many students still lack the skill, even after taking several algebra courses in high school and college. For underachieving college students, we developed a tutoring system that taught students to decompose the to-be-modelled situation into schema applications, where a schema represents a simple relationship such as distance-rate-time or part-whole. However, when a model consists of multiple schema applications, it needs some connection among them, usually represented by letting the same variable appear in the slots of two or more schemas. Students in our studies seemed to have more trouble identifying connections among schemas than identifying the schema applications themselves. This paper describes a newly designed tutoring system that emphasizes such connections. An evaluation was conducted using a regression discontinuity design. It produced a marginally reliable positive effect of moderate size (d = 0.4).

Kurt VanLehn, Fabio Milner, Chandrani Banerjee, Jon Wetzel
Charisma and Learning: Designing Charismatic Behaviors for Virtual Human Tutors

Charisma is a powerful device of communication. Research on the charisma of a specific type of leader in a specific type of organization, teachers in the classroom, has indicated the positive influence of a teacher’s charismatic behaviors, often referred to as immediacy behaviors, on student learning. How do we realize such behaviors in a virtual tutor? How do such behaviors impact student learning? In this paper, we discuss the design of a charismatic virtual human tutor. We developed verbal and nonverbal (with the focus on voice) charismatic strategies and realized such strategies through scripted tutorial dialogues and pre-recorded voices. A study with the virtual human tutor has shown an intriguing impact of charismatic behaviors on student learning.

Ning Wang, Aditya Jajodia, Abhilash Karpurapu, Chirag Merchant
AI-Powered Teaching Behavior Analysis by Using 3D-MobileNet and Statistical Optimization

Artificial intelligence technology can provide multi-angle analysis of and feedback on the teaching process. This paper provides an innovative aid for classroom teaching evaluation and addresses the lack of teacher behavior analysis in AI applications. First, a 3D-MobileNet framework is proposed for behavior recognition, which can process time-domain information in video through layered training. Next, we design a comprehensive model using both the analytic hierarchy process and the entropy weight method (AHP-EW) to output quantitative teaching evaluation results in three dimensions. This model combines the subjective and objective weights through a statistical optimization strategy to improve credibility. Finally, we test our model on a 45-min teaching video and compare it with the existing model in various aspects, showing that our method is highly feasible and competitive.
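
As a rough illustration of the entropy weight method named in the abstract above, the sketch below derives objective criterion weights from a small score matrix and merges them with subjective weights. The subjective weights are taken as given (in AHP they would come from pairwise comparisons), and the multiplicative merge in `combine_weights` is an assumption made for illustration; the abstract does not state which statistical optimization strategy the paper uses. All function names and numbers are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Objective weights per criterion (column) via the entropy weight method.

    `matrix` holds one row per evaluated item and one column per criterion.
    Values are assumed to be positive, with at least two rows.
    """
    m = len(matrix)
    n = len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [x / total for x in column]
        # Normalised Shannon entropy: 1.0 means the criterion does not discriminate.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        divergences.append(1 - entropy)
    spread = sum(divergences)
    if spread == 0:
        # Every criterion is uniform, so fall back to equal weights.
        return [1 / n] * n
    return [d / spread for d in divergences]


def combine_weights(subjective, objective):
    """Hypothetical merge: normalised product of subjective and objective weights."""
    products = [a * b for a, b in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]


# Made-up scores: 3 teaching clips rated on 3 criteria.
scores = [[0.8, 0.5, 0.6], [0.7, 0.9, 0.6], [0.9, 0.2, 0.6]]
objective = entropy_weights(scores)
combined = combine_weights([0.5, 0.3, 0.2], objective)
```

Criteria whose scores vary more across items receive larger objective weights; a criterion with identical scores for every item (the third column here) receives none.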

Ruhan Wang, Jiahao Lyu, Qingyun Xiong, Junqi Guo
Assessment2Vec: Learning Distributed Representations of Assessments to Reduce Marking Workload

Reducing instructors’ workload in online and large-scale learning environments could be one of the most important factors in educational systems. To address this challenge, techniques such as Artificial Intelligence have been considered in tutoring systems and automatic essay scoring tasks. In this paper, we construct a novel model, namely Assessment2Vec, that learns distributed representations of assessments and marks assessments automatically with a Supervised Contrastive Learning loss, which will effectively reduce instructors’ workload in marking a large number of assessments. The experimental results on real-world datasets show the effectiveness of the proposed approach.

Shuang Wang, Amin Beheshti, Yufei Wang, Jianchao Lu, Quan Z. Sheng, Stephen Elbourn, Hamid Alinejad-Rokny, Elizabeth Galanis
Toward Stable Asymptotic Learning with Simulated Learners

Simulations of human learning have shown potential for supporting ITS authoring and testing, in addition to other use cases. To date, simulated learner technologies have often failed to robustly achieve perfect performance even with considerable training. In this work we identify an impediment to producing perfect asymptotic learning performance in simulated learners and introduce one significant improvement to the Apprentice Learner Framework to this end.

Daniel Weitekamp, Erik Harpstead, Kenneth Koedinger
A Word Embeddings Based Clustering Approach for Collaborative Learning Group Formation

Today, collaborative learning has become quite central as a method for learning, and over the past decades, a large number of studies have demonstrated the benefits from various theoretical and methodological perspectives. This study proposes a novel approach that utilises Natural Language Processing (NLP) methods, particularly pre-trained word embeddings, to automatically create homogeneous or heterogeneous groups of students in terms of knowledge and knowledge gaps expressed in assessments. The two different ways of creating groups serve two different pedagogical purposes: (1) homogeneous group formation based on students’ knowledge can support teachers’ pedagogical activities, such as feedback provision, and make them more time efficient, and (2) heterogeneous groups can support and enhance collaborative learning. We evaluate the performance of the proposed approach through experiments with a dataset from a university course in programming didactics.

Yongchao Wu, Jalal Nouri, Xiu Li, Rebecka Weegar, Muhammad Afzaal, Aayesha Zia
Intelligent Agents Influx in Schools: Teacher Cultures, Anxiety Levels and Predictable Variations

Artificially intelligent robots entered Japanese schools in 2017 in the same impetuous way in which computers flooded education worldwide during the 80s. Unlike computers, which became indispensable to school culture, AI robots have yet to find a place in the classroom. This paper presents a pilot study aimed at finding clusters of common teacher attitudes to better plan the deployment of such AI agents in the future. The results of teacher surveys, first, indicated that the most influential factor in teacher adoption of such technology is coding literacy rather than age or study major; and second, served to train machine learning algorithms and develop a “culture stress system” that predicts teacher anxiety and recommends an optimum number of AI agents that targeted teachers in a school can plausibly accommodate.

R. Yamamoto Ravenor
WikiMorph: Learning to Decompose Words into Morphological Structures

This paper presents WikiMorph, a tool that automatically breaks words down into morphemes and etymological compounds (morphemes from root languages) and generates contextual definitions for each component. It comes in two flavors: a dataset and a deep-learning-based model. The dataset was extracted from Wiktionary and contains over 450k entries. We then used this dataset to train a GPT-2 model to generalize and decompose any word into morphemes and their definitions. We find that the model accurately generates complex breakdowns when given a high-quality initial definition.

Jeffrey T. Yarbro, Andrew M. Olney
Individualization of Bayesian Knowledge Tracing Through Elo-infusion

For as long as the Bayesian Knowledge Tracing (BKT) approach has been known, there have been attempts to account not only for skill-level factors but also for individual student factors. Many computational methods for implementing individualization in BKT have been proposed over the past 25 years of BKT’s existence. To this day, virtually none of these individualization approaches has been suited for easy implementation. Either they were purely analytical (only fit for post-hoc analyses) or required significant computational effort to realize (e.g., calibrating individual factors as students cleared units of content). In this work, we discuss implementing the individualization of BKT using the mechanism of an Elo rating schema. Elo has been established in the educational domain for some time and offers tangible theoretical and practical benefits. We show that infusing BKT with even a single-parameter Elo component to track student-specific factors results in significant quantitative and qualitative improvements to modeling student learning. This approach is easy to implement in a system already featuring BKT.
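
To make the two ingredients in the abstract above concrete, the sketch below pairs the standard BKT posterior update with a standard Elo-style rating update for a single student-level parameter. How the paper actually couples the two is not described in this listing: the coupling used here (shifting the learn probability in log-odds space by the student rating `theta`) is purely an assumption for illustration, and all parameter values and function names are hypothetical.

```python
import math


def bkt_predict(p_know, slip, guess):
    """Probability of a correct response under standard BKT."""
    return p_know * (1 - slip) + (1 - p_know) * guess


def bkt_update(p_know, correct, slip, guess, learn):
    """Standard BKT step: Bayesian posterior on mastery, then a learning transition."""
    if correct:
        evidence = p_know * (1 - slip)
        posterior = evidence / (evidence + (1 - p_know) * guess)
    else:
        evidence = p_know * slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - guess))
    return posterior + (1 - posterior) * learn


def elo_update(theta, correct, expected, k=0.4):
    """Elo-style update: move the rating by k times the prediction error."""
    outcome = 1.0 if correct else 0.0
    return theta + k * (outcome - expected)


def logistic(x):
    return 1 / (1 + math.exp(-x))


def trace(responses, p_init=0.3, slip=0.1, guess=0.2, learn=0.15, k=0.4):
    """Track mastery and a single student rating over a response sequence."""
    p_know, theta = p_init, 0.0
    history = []
    for correct in responses:
        expected = bkt_predict(p_know, slip, guess)
        # ASSUMPTION (not from the paper): the student rating shifts the
        # learn probability in log-odds space.
        learn_i = logistic(math.log(learn / (1 - learn)) + theta)
        p_know = bkt_update(p_know, correct, slip, guess, learn_i)
        theta = elo_update(theta, correct, expected, k)
        history.append((p_know, theta))
    return history


history = trace([True, False, True, True])
```

Each entry of `history` holds the mastery estimate and the student rating after one response; a student who outperforms the BKT prediction accumulates a positive rating and, under the assumed coupling, a faster learning transition.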

Michael Yudelson
Self-paced Graph Memory Network for Student GPA Prediction and Abnormal Student Detection

Student learning performance prediction (SLPP) is a crucial step in high school education. However, traditional methods fail to consider abnormal students. In this study, we organized every student’s learning data as a graph to use the schema of graph memory networks (GMNs). To distinguish the students and make GMNs learn robustly, we proposed to train GMNs in an “easy-to-hard” process, leading to the self-paced graph memory network (SPGMN). SPGMN chooses the low-difficulty samples as a batch to tune the model parameters in each training iteration. This approach not only improves robustness but also rearranges the student samples from normal to abnormal. The experimental results show that SPGMN achieves higher prediction accuracy and greater robustness than traditional methods. The resulting student sequence reveals that abnormal students have a different course selection pattern from normal students.

Yue Yun, Huan Dai, Ruoqi Cao, Yupei Zhang, Xuequn Shang
Using Adaptive Experiments to Rapidly Help Students

Adaptive experiments can increase the chance that current students obtain better outcomes from a field experiment of an instructional intervention. In such experiments, the probability of assigning students to conditions changes while more data is being collected, so students can be assigned to interventions that are likely to perform better. Digital educational environments lower the barrier to conducting such adaptive experiments, but they are rarely applied in education. One reason might be that researchers have access to few real-world case studies that illustrate the advantages and disadvantages of these experiments in a specific context. We evaluate the effect of homework email reminders on students by conducting an adaptive experiment using the Thompson Sampling algorithm and compare it to a traditional uniform random experiment. We present this as a case study on how to conduct such experiments, and we raise a range of open questions about the conditions under which adaptive randomized experiments may be more or less useful.
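
The assignment mechanism described in the abstract above can be sketched with Beta-Bernoulli Thompson Sampling. This is a minimal simulation under stated assumptions, not the study's setup: outcomes are assumed to be binary, each condition starts from a uniform Beta(1, 1) prior, and the "true" success rates are made-up numbers used only to drive the simulation. The function names are hypothetical.

```python
import random


def thompson_assign(successes, failures, rng):
    """Draw once from each condition's Beta posterior and pick the largest draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_experiment(true_rates, n_students, seed=0):
    """Simulate an adaptive experiment; returns per-condition success/failure counts."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_students):
        arm = thompson_assign(successes, failures, rng)
        # Simulated binary outcome for the assigned condition.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical rates for two conditions (e.g. no reminder vs. reminder).
successes, failures = run_experiment([0.1, 0.9], n_students=1000, seed=1)
assigned = [s + f for s, f in zip(successes, failures)]
```

As evidence accumulates, `assigned` shifts toward the condition with the better observed outcomes, whereas a uniform random experiment would keep the split near even regardless of the results.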

Angela Zavaleta-Bernuy, Qi Yin Zheng, Hammad Shaikh, Jacob Nogas, Anna Rafferty, Andrew Petersen, Joseph Jay Williams
A Comparison of Hints vs. Scaffolding in a MOOC with Adult Learners

Scaffolding and providing feedback on problem-solving activities during online learning has consistently been shown to improve performance in younger learners. However, less is known about the impacts of feedback strategies on adult learners. This paper investigates how two computer-based support strategies, hints and required scaffolding questions, contribute to performance and behavior in an edX MOOC with integrated assignments from ASSISTments, a web-based platform that implements diverse student supports. Results from a sample of 188 adult learners indicated that those given scaffolds benefited less from ASSISTments support and were more likely to request the correct answer from the system.

Yiqiu Zhou, Juan Miguel Andres-Bray, Stephen Hutt, Korinn Ostrow, Ryan S. Baker
An Ensemble Approach for Question-Level Knowledge Tracing

Knowledge tracing, where a machine models students’ knowledge as they interact with coursework, is a well-established area in the field of Artificial Intelligence in Education. In this paper, an ensemble approach is proposed that addresses existing limitations in question-centric knowledge tracing and achieves the goal of predicting future question correctness. The proposed approach consists of two models. The first is a Light Gradient Boosting Machine (LightGBM) built by incorporating all relevant key features engineered from the data. The second is a Multiheaded-Self-Attention Knowledge Tracing model (MSAKT) that extracts historical student knowledge of future questions by calculating their contextual similarity with previously attempted questions. The proposed model’s effectiveness is evaluated by conducting experiments on a large Kaggle dataset, achieving an Area Under ROC Curve (AUC) score of 0.84 with 84% accuracy using 10-fold cross-validation.
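
The abstract above does not say how the two models' outputs are combined, so the sketch below assumes the simplest option, a weighted average of predicted correctness probabilities, and scores it with a pairwise AUC. The predictions and labels are made-up placeholders standing in for LightGBM and MSAKT outputs; `blend` and `auc` are hypothetical helper names.

```python
def blend(p_a, p_b, weight_a=0.5):
    """Weighted average of two models' predicted probabilities (assumed scheme)."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(p_a, p_b)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Requires at least one positive and
    one negative label.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Placeholder predictions for five future question attempts.
labels = [1, 0, 1, 1, 0]
tree_model = [0.7, 0.4, 0.6, 0.8, 0.5]
attention_model = [0.9, 0.3, 0.4, 0.7, 0.6]
ensemble_auc = auc(labels, blend(tree_model, attention_model))
```

The pairwise definition is quadratic in the number of attempts, which is fine for a sketch; a large dataset would call for a rank-based computation instead.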

Aayesha Zia, Jalal Nouri, Muhammad Afzaal, Yongchao Wu, Xiu Li, Rebecka Weegar

Industry and Innovation

Scaffolds and Nudges: A Case Study in Learning Engineering Design Improvements

We present a brief case study of a multi-year learning engineering effort to iteratively redesign the problem-solving experience of students using the “Solving Quadratic Equations” workspace in Carnegie Learning’s MATHia intelligent tutoring system. We consider two design changes, one involving additional scaffolds for the problem-solving task and the next involving a “nudge” for learners to more rapidly and readily engage with these scaffolds, and discuss resulting changes in the relative proportion of students who fail to master skills associated with this workspace over the course of two school years.

Stephen E. Fancsali, Martina Pavelko, Josh Fisher, Leslie Wheeler, Steven Ritter
Condensed Discriminative Question Set for Reliable Exam Score Prediction

The inevitable shift towards online learning due to the emergence of the COVID-19 pandemic triggered a strong need to assess students using shorter exams whilst ensuring reliability. This study explores a data-centric approach that utilizes feature importance to select a discriminative subset of questions from the original exam. Furthermore, the discriminative question subset’s ability to approximate the students’ exam scores is evaluated by measuring the prediction accuracy and by quantifying the error interval of the prediction. The approach was evaluated using two real-world exam datasets of the Scholastic Aptitude Test (SAT) and Exame Nacional do Ensino Médio (ENEM) exams, which consist of student response data and the corresponding exam scores. The evaluation was conducted against randomized question subsets of sizes 10, 20, 30 and 50. The results show that our method estimates the full scores more accurately than a baseline model for most subset sizes while maintaining a reasonable error interval. The encouraging evidence found in this paper provides support for the strong potential of the on-going study to provide a data-centric approach for exam size reduction.

Jung Hoon Kim, Jineon Baek, Chanyou Hwang, Chan Bae, Juneyoung Park
Evaluating the Impact of Research-Based Updates to an Adaptive Learning System

ALEKS is an adaptive learning system covering subjects such as math, statistics, and chemistry. Several recent studies have looked in detail at various aspects of student knowledge retention and forgetting within the system. Based on these studies, various enhancements were recently made to the ALEKS system with the underlying goal of helping students learn more and advance further. In this work, we describe how the enhancements were informed by these previous research studies, as well as the process of turning the research findings into practical updates to the system. We conclude by analyzing the potential impact of these changes; in particular, after controlling for several variables, we estimate that students using the updated system learned 9% more on average.

Jeffrey Matayoshi, Eric Cosyn, Hasan Uzun
Back to the Origin: An Intelligent System for Learning Chinese Characters

Learning Chinese characters is a challenging task for both native and foreign beginners. One major reason is that most Chinese characters in writing are distinct from each other and lack direct phonetic clues. Fortunately, many Chinese characters’ original forms have iconicity that indicates their meanings. By leveraging these characteristics and the latest computer vision (CV) techniques, we design and build an intelligent system that could automatically retrieve the iconic and original forms of Chinese characters. Furthermore, the system could provide learners with different styles of the character in chronological order to bridge the original form and the most commonly used one. Specifically, we adopt the SE-Resnet-50 classification model for both character recognition and style recognition tasks, and design a dedicated retrieval mechanism to properly select the representative characters in different styles for learners. A specific user interface is designed for beginners to upload, recognize, remember, and understand the Chinese characters.

Jinglei Yu, Jiachen Song, Yu Lu, Shengquan Yu

Doctoral Consortium

Automated Assessment of Quality and Coverage of Ideas in Students’ Source-Based Writing

Source-based writing is an important academic skill in higher education, as it helps instructors evaluate students’ understanding of subject matter. To assess the potential for supporting instructors’ grading, we design an automated assessment tool for students’ source-based summaries with natural language processing techniques. It includes a special-purpose parser that decomposes the sentences into clauses, a pre-trained semantic representation method, a novel algorithm that allocates ideas into weighted content units and another algorithm for scoring students’ writing. We present results on three sets of student writing in higher education: two sets of STEM student writing samples and a set of reasoning sections of case briefs from a law school preparatory course. We show that this tool achieves promising results by correlating well with reliable human rubrics, and by helping instructors identify issues in grades they assign. We then discuss limitations and two improvements: a neural model that learns to decompose complex sentences into simple sentences, and a distinct model that learns a latent representation.

Yanjun Gao, Rebecca J. Passonneau
Impact of Intelligent Tutoring System (ITS) on Mathematics Achievement Using ALEKS

This doctoral research will explore the impact of an Intelligent Tutoring System (ITS), such as Assessment and LEarning in Knowledge Spaces (ALEKS), on the mathematics achievement, affective engagement, and cognitive engagement of 9th-grade students from two different school districts with somewhat similar demographics. This proposed quasi-experimental study will compare ALEKS-led (with teachers present) versus teacher-led instruction to provide additional support to struggling students for fifty minutes per week for six weeks. This research will also explore the challenges posed by using ITS in the classroom.

Rashmi Khazanchi
Designing and Testing Assessments and Scaffolds for Mathematics Practices in Science Inquiry

This project seeks to operationalize the mathematics practices that are needed for mathematically-integrated science inquiry, as outlined in the NGSS, and examine the effects of scaffolding these on competencies on the mathematics practices and on content acquisition. Using Evidence-Centered Design, I will design and pilot an assessment that can assess and scaffold competency in mathematics practices integral to virtual scientific inquiry. The assessment and scaffolds will be piloted with students and tested as to their efficacy at improving students’ competencies on the practices of interest. After scaffolding on the practices (versus control), students will investigate two new phenomena. Statistical analyses will test if competencies on the practices are predictive of content acquisition in the new topics. The fine-grained operationalization of the practices into sub-practices and the empirical link of scaffolding these on content acquisition have implications for both research as well as science assessment and instruction.

Joe Olsen, Janice Gobert
Contextual Safeguarding in Education: Bayesian Network Risk Analysis for Decision Support

A Multi Academy Trust in the UK operates thirty-five academies educating 17,000 students across seven local authority areas. Significant societal problems are increasing risk to young people, including exploitation and violent crime associated with gang culture and drugs. A predictive analytics system is being implemented to support the delivery of contextual safeguarding, where the interplay of the school, peer, family and community environments determines the safeguarding risk. Due to the intense level of human activity required by safeguarding teams to identify and intervene with those at risk, Bayesian network risk modelling is being integrated with traditional analytics to extend and augment human capacity. The participants are keenly aware of the potential for harm from this data: in its collation, appropriateness of methods, accuracy and validity of output, and the human interpretation and resulting actions and impact on young people.

Matthew Woodruff, Graham Feek