Addressing the assessment challenge with an online system that tutors as it assesses

  • Original Paper
  • Published in: User Modeling and User-Adapted Interaction

Abstract

Secondary teachers across the United States are being asked to use formative assessment data (Black and Wiliam 1998a,b; Roediger and Karpicke 2006) to inform their classroom instruction. At the same time, critics of the US government’s No Child Left Behind legislation are calling the bill “No Child Left Untested”. Among other things, critics point out that every hour spent assessing students is an hour lost from instruction. But does it have to be? What if we better integrated assessment into classroom instruction and allowed students to learn during the test? We developed an approach that provides immediate tutoring on practice assessment items that students cannot solve on their own. Our hypothesis is that we can achieve more accurate assessment by using not only data on whether students get test items right or wrong, but also data on the effort required for students to solve a test item with instructional assistance. We have integrated assistance and assessment in the ASSISTment system. The system helps teachers make better use of their time by offering instruction to students while providing a more detailed evaluation of student abilities than is possible under current approaches. Our approach to assessing student math proficiency is to use the data the system collects through its interactions with students to estimate their performance on an end-of-year high-stakes state test. Our results show that we can do a reliably better job of predicting student end-of-year exam scores by leveraging the interaction data, and that a model based only on the interaction information makes better predictions than a traditional assessment model that uses only information about correctness on the test items.
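
To make the model comparison in the abstract concrete, the sketch below fits two linear regressions on synthetic data: one predicting an end-of-year exam score from correctness alone, and one that also uses assistance metrics. This is a minimal illustration, not the authors' code; the feature names (pct_correct, hints_per_item, attempts_per_item) and the data generator are hypothetical stand-ins for the ASSISTment interaction measures.

    # Minimal sketch (not the paper's actual model or features): compare a
    # correctness-only regression with one that adds tutoring-interaction
    # features when predicting an end-of-year exam score.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 400  # students

    # Synthetic data: one latent proficiency drives correctness,
    # help-seeking behavior, and the exam score.
    proficiency = rng.normal(0.0, 1.0, n)
    pct_correct = np.clip(0.6 + 0.15 * proficiency + rng.normal(0, 0.08, n), 0, 1)
    hints_per_item = np.clip(1.2 - 0.8 * proficiency + rng.normal(0, 0.3, n), 0, None)
    attempts_per_item = np.clip(1.5 - 0.5 * proficiency + rng.normal(0, 0.3, n), 1, None)
    exam_score = 30 + 15 * proficiency + rng.normal(0, 4, n)  # MCAS-like scale

    X_correctness = pct_correct.reshape(-1, 1)
    X_interaction = np.column_stack([pct_correct, hints_per_item, attempts_per_item])

    for label, X in [("correctness only", X_correctness),
                     ("correctness + interaction", X_interaction)]:
        # 5-fold cross-validated mean absolute error for each feature set.
        mae = -cross_val_score(LinearRegression(), X, exam_score,
                               scoring="neg_mean_absolute_error", cv=5).mean()
        print(f"{label}: MAE = {mae:.2f}")

On data generated this way the interaction features reduce prediction error because hints and attempts carry signal about proficiency beyond right/wrong responses, which is the intuition behind the paper's result; the paper's actual features and evaluation differ.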

References

  • Anozie, N., Junker, B.W.: Predicting end-of-year accountability assessment scores from monthly student records in an online tutoring system. In: Beck, J., Aimeur, E., Barnes, T. (eds.) Educational Data Mining: Papers from the AAAI Workshop, pp. 1–6. AAAI Press, Menlo Park, CA, Technical Report WS-06-05 (2006)

  • Ayers E., Junker B.W.: Do skills combine additively to predict task difficulty in eighth grade mathematics? In: Beck, J., Aimeur, E., Barnes, T. (eds.) Educational Data Mining: Papers from the AAAI Workshop, pp. 14–20. AAAI Press, Menlo Park, CA, Technical Report WS-06-05 (2006)

  • Baker, R.S., Corbett, A.T., Koedinger, K.R.: Detecting student misuse of intelligent tutoring systems. In: Lester, J.C., Vicari, R.M., Paraguaçu, F. (eds.) Intelligent Tutoring Systems: 7th International Conference, ITS 2004, Maceió, Alagoas, Brazil, Proceedings, pp. 531–540. Springer-Verlag, Berlin (2004)

  • Baker, R.S., Roll, I., Corbett, A.T., Koedinger, K.R.: Do performance goals lead students to game the system? In: Proceedings of the 12th International Conference on Artificial Intelligence in Education, pp. 57–64. Amsterdam, The Netherlands (2005)

  • Beck J.E., Sison J.: Using knowledge tracing in a noisy environment to measure student reading proficiencies. Int. J. Artif. Intell. Educ. 16, 129–143 (2006)

  • Beck J.E., Jia P., Mostow J.: Automatically assessing oral reading fluency in a computer tutor that listens. Technol. Instr. Cogn. Learn. 2, 61–81 (2004)

  • Black P., Wiliam D.: Assessment and classroom learning. Assess. Educ.: Princ., Policy Pract. 5, 7–74 (1998a)

  • Black P., Wiliam D.: Inside the black box: raising standards through classroom assessment. Phi Delta Kappan 80(2), 139–149 (1998b)

  • Boston, C.: The concept of formative assessment. Pract. Assess. Res. Eval. 8(9) (2002)

  • Campione J.C., Brown A.L., Bryant R.J.: Individual differences in learning and memory. In: Sternberg, R.J. (ed.) Human Abilities: An Information-processing Approach, pp. 103–126. W. H. Freeman, New York (1985)

  • Corbett, A.T., Bhatnagar, A.: Student modeling in the ACT Programming Tutor: adjusting a procedural learning model with declarative knowledge. In: User Modeling: Proceedings of the Sixth International Conference, UM97, Chia Laguna, Sardinia, Italy, pp. 243–254. Springer-Verlag, Wien, New York (1997)

  • Corbett A.T., Anderson J.R., O’Brien A.T.: Student modeling in the ACT programming tutor. In: Nichols, P., Chipman, S., Brennan, R. (eds) Cognitively Diagnostic Assessment. Erlbaum, Hillsdale, NJ (1995)

  • Computing Research Association: Cyberinfrastructure for Education and Learning for the Future: A Vision and Research Agenda. Final report of the Cyberlearning Workshop Series, held Fall 2004–Spring 2005 by the Computing Research Association and the International Society of the Learning Sciences. Retrieved from http://www.cra.org/reports/cyberinfrastructure.pdf on 10 November 2006 (2005)

  • Embretson S.E.: Structured Rasch models for measuring individual-difference in learning and change. Int. J. Psychol. 27(3–4), 372–372 (1992)

  • Feng M., Heffernan N.T.: Towards live informing and automatic analyzing of student learning: reporting in the ASSISTment system. J. Interact. Learn. Res. 18(2), 207–230. AACE, Chesapeake, VA (2007)

  • Feng, M., Heffernan, N.T., Koedinger, K.R.: Addressing the testing challenge with a web-based e-assessment system that tutors as it assesses. In: Carr, L.A., De Roure, D.C., Iyengar, A., Goble, C.A., Dahlin, M. (eds.) Proceedings of the Fifteenth International World Wide Web Conference, Edinburgh, UK, pp. 307–316. ACM Press, New York, NY (2006)

  • Feng M., Heffernan N., Beck J., Koedinger K.: Can we predict which groups of questions students will learn from? In: Baker, Beck (eds) Proceedings of the First International Conference on Educational Data Mining, pp. 218–225. Montreal, Canada (2008)

  • Feng M., Beck J., Heffernan N., Koedinger K.: Can an intelligent tutoring system predict math proficiency as well as a standardized test? In: Baker, Beck (eds) Proceedings of the First International Conference on Educational Data Mining, pp. 107–116. Montreal, Canada (2008)

  • Fischer G., Seliger E.: Multidimensional linear logistic models for change. In: van der Linden, W.J., Hambleton, R.K. (eds) Handbook of Modern Item Response Theory, Chap. 19. Springer-Verlag, New York (1997)

  • Grigorenko E.L., Sternberg R.J.: Dynamic testing. Psychol. Bull. 124, 75–111 (1998)

  • Hulin C.L., Lissak R.I., Drasgow F.: Recovery of two- and three-parameter logistic item characteristic curves: A Monte Carlo study. Appl. Psychol. Meas. 6(3), 249–260 (1982)

  • Jannarone R.J.: Conjunctive item response theory kernels. Psychometrika 51(3), 357–373 (1986)

  • Koedinger, K.R., Aleven, V., Heffernan, N.T., McLaren, B., Hockenberry, M.: Opening the door to non-programmers: authoring intelligent tutor behavior by demonstration. In: Proceedings of the 7th International Conference on Intelligent Tutoring Systems, pp. 162–173. Maceió, Brazil (2004)

  • Massachusetts Department of Education: Massachusetts Mathematics Curriculum Framework. Retrieved from http://www.doe.mass.edu/frameworks/math/2000/final.pdf, 6 November 2005 (2000)

  • MCAS technical report: Retrieved from http://www.cs.wpi.edu/mfeng/pub/mcas_techrpt01.pdf, 5 August 2005 (2001)

  • Mitchell T.: Machine Learning. McGraw-Hill, Columbus, OH (1997)

  • Mostow J., Aist G.: Evaluating tutors that listen: an overview of Project LISTEN. In: Feltovich, P. (ed.) Smart Machines in Education, pp. 169–234. MIT/AAAI Press, Menlo Park, CA (2001)

  • Olson, L.: State test programs mushroom as NCLB mandate kicks in. Education Week, 20 November 2004, pp. 10–14 (2004)

  • Olson, L.: Special report: testing takes off. Education Week, 30 November 2005, pp. 10–14 (2005)

  • Raftery A.E.: Bayesian model selection in social research. Sociol. Methodol. 25, 111–163 (1995)

  • Razzaq, L., Heffernan, N.T.: Scaffolding vs. hints in the Assistment System. In: Ikeda, Ashley, Chan (eds.) Proceedings of the 8th International Conference on Intelligent Tutoring Systems, Jhongli, Taiwan, pp. 635–644. Springer-Verlag, Berlin (2006)

  • Razzaq, L., Feng, M., Nuzzo-Jones, G., Heffernan, N.T., Koedinger, K.R., Junker, B., Ritter, S., Knight, A., Aniszczyk, C., Choksey, S., Livak, T., Mercado, E., Turner, T.E., Upalekar, R., Walonoski, J.A., Macasek, M.A., Rasmussen, K.P.: The ASSISTment project: blending assessment and assisting. In: Proceedings of the 12th International Conference on Artificial Intelligence in Education, Amsterdam, The Netherlands, pp. 555–562. IOS Press, Amsterdam (2005)

  • Razzaq, L., Heffernan, N.T., Lindeman, R.W.: What level of tutor interaction is best? In: Luckin, Koedinger (eds.) Proceedings of the 13th Conference on Artificial Intelligence in Education, Los Angeles, CA, pp. 222–229. IOS Press, Amsterdam, The Netherlands (2007)

  • Roediger H.L. III, Karpicke J.D.: The power of testing memory. Perspect. Psychol. Sci. 1(3), 181–210 (2006)

  • Sternberg R.J., Grigorenko E.L.: All testing is dynamic testing. Issues Educ. 7, 137–170 (2001)

  • Sternberg R.J., Grigorenko E.L.: Dynamic Testing: The Nature and Measurement of Learning Potential. Cambridge University Press, Cambridge (2002)

  • Tan E.S., Imbos T., Does R.J.M.: A distribution-free approach to comparing growth of knowledge. J. Educ. Measure. 31(1), 51–65 (1994)

  • Tatsuoka K.K.: Rule space: an approach for dealing with misconceptions based on item response theory. J. Educ. Measure. 20, 345–354 (1983)

  • van der Linden, W.J., Hambleton, R.K. (eds.): Handbook of Modern Item Response Theory. Springer-Verlag, New York, NY (1997)

  • Walonoski, J., Heffernan, N.T.: Detection and analysis of off-task gaming behavior in intelligent tutoring systems. In: Ikeda, Ashley, Chan (eds.) Proceedings of the 8th International Conference on Intelligent Tutoring Systems, Jhongli, Taiwan, pp. 382–391. Springer-Verlag, Berlin (2006)

  • Zimowski, M., Muraki, E., Mislevy, R., Bock, D.: BILOG-MG 3: Multiple-Group IRT Analysis and Test Maintenance for Binary Items. Scientific Software International, Inc., Lincolnwood, IL. http://www.ssicentral.com/ (2005)

Author information

Corresponding author

Correspondence to Mingyu Feng.

About this article

Cite this article

Feng, M., Heffernan, N. & Koedinger, K. Addressing the assessment challenge with an online system that tutors as it assesses. User Model User-Adap Inter 19, 243–266 (2009). https://doi.org/10.1007/s11257-009-9063-7
