DOI: 10.1145/2883851.2883931
Research article
Public Access

Combining click-stream data with NLP tools to better understand MOOC completion

Published: 25 April 2016

ABSTRACT

Completion rates for massive open online courses (MOOCs) are notoriously low. Identifying patterns of student behavior related to course completion may help in developing interventions that improve retention and learning outcomes in MOOCs. Previous research on predicting MOOC completion has focused on click-stream data, student demographics, and natural language processing (NLP) analyses, but most of this work has not taken full advantage of the multiple types of data available. This study combines click-stream data with NLP approaches to examine whether students' online activity and the language they produce in the discussion forums are predictive of successful course completion. We conduct this analysis on a subsample of 320 students in a MOOC on educational data mining who completed at least one graded assignment and produced at least 50 words in the discussion forums. The findings indicate that a combination of click-stream data and NLP indices can predict with substantial accuracy (78%) whether students complete the MOOC. This predictive power suggests that interaction data and language data within a MOOC can help us both to understand student retention and to develop automated signals of student success.
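To make the general approach concrete, the following is a minimal, illustrative sketch of how per-student click-stream counts and NLP indices from forum text might be combined in a single completion classifier. It is not the authors' pipeline: the feature names, the input file mooc_features.csv, and the choice of logistic regression with 10-fold cross-validation are assumptions for illustration only, and the paper's reported 78% accuracy comes from its own models, not from this code.

# Illustrative sketch only: not the authors' pipeline. Feature names,
# the input file, and the classifier choice are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# One row per student: click-stream counts plus NLP indices computed from
# at least 50 words of that student's discussion-forum posts.
students = pd.read_csv("mooc_features.csv")  # hypothetical file

clickstream = ["n_lecture_views", "n_forum_posts", "n_assignment_attempts"]
nlp_indices = ["lexical_sophistication", "text_cohesion", "sentiment"]

X = students[clickstream + nlp_indices]
y = students["completed"]  # 1 = completed the MOOC, 0 = did not

# A simple, interpretable baseline: standardize the features, fit a logistic
# regression, and estimate accuracy with 10-fold cross-validation.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"Mean cross-validated accuracy: {scores.mean():.2f}")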

Published in

LAK '16: Proceedings of the Sixth International Conference on Learning Analytics & Knowledge
April 2016, 567 pages
ISBN: 9781450341905
DOI: 10.1145/2883851

Copyright © 2016 ACM. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance rates: LAK '16 accepted 36 of 116 submissions (31%); the overall acceptance rate for the conference series is 236 of 782 submissions (30%).
