ABSTRACT
Completion rates for massive open online classes (MOOCs) are notoriously low. Identifying student patterns related to course completion may help to develop interventions that can improve retention and learning outcomes in MOOCs. Previous research predicting MOOC completion has focused on click-stream data, student demographics, and natural language processing (NLP) analyses. However, most of these analyses have not taken full advantage of the multiple types of data available. This study combines click-stream data and NLP approaches to examine whether students' online activity and the language they produce in the online discussion forum are predictive of successful class completion. We conduct this analysis on a subsample of 320 students in a MOOC on educational data mining, each of whom completed at least one graded assignment and produced at least 50 words in the discussion forums. The findings indicate that a mix of click-stream data and NLP indices can predict with substantial accuracy (78%) whether students complete the MOOC. This predictive power suggests that student interaction data and language data within a MOOC can help us both to understand student retention in MOOCs and to develop automated signals of student success.
Index Terms
- Combining click-stream data with NLP tools to better understand MOOC completion