Computers & Education

Volume 100, September 2016, Pages 94-109

Automated essay evaluation software in English Language Arts classrooms: Effects on teacher feedback, student motivation, and writing quality

https://doi.org/10.1016/j.compedu.2016.05.004

Highlights

  • We compare the effects of combining teacher feedback with automated feedback.

  • Teachers using automated feedback gave proportionately more higher-level feedback.

  • The combined feedback condition was associated with greater student persistence.

  • Automated feedback helped save time and effort without sacrificing writing quality.

  • Combining teacher and automated feedback benefits teachers and students.

Abstract

Automated Essay Evaluation (AEE) systems are being increasingly adopted in the United States to support writing instruction. AEE systems are expected to assist teachers in providing increased higher-level feedback and expediting the feedback process, while supporting gains in students’ writing motivation and writing quality. The current study explored these claims using a quasi-experimental design. Four eighth-grade English Language Arts (ELA) classes were assigned to a combined feedback condition in which they received feedback on their writing from their teacher and from an AEE system called PEG Writing®. Four other eighth-grade ELA classes were assigned to a teacher feedback-only condition, in which they received feedback from their teacher via GoogleDocs. Results indicated that teachers gave the same median amount of feedback to students in both conditions, but gave proportionately more feedback on higher-level writing skills to students in the combined PEG + Teacher Feedback condition. Teachers also agreed that PEG helped them save one-third to one-half of the time it took to provide feedback when they were the sole source of feedback (i.e., the GoogleDocs condition). At the conclusion of the study, students in the combined feedback condition demonstrated increases in writing persistence, though there were no differences between groups with regard to final-draft writing quality.

Introduction

A commonly used method for teaching writing is to provide instructional feedback (Biber et al., 2011, Black and Wiliam, 2009, Hattie and Timperley, 2007, Kellogg and Whiteford, 2009). Instructional feedback is information provided by an agent—such as a teacher, peer, or computer—that indicates both correctness/incorrectness and ways to improve performance or understanding (Hattie and Timperley, 2007, Parr and Timperley, 2010). Struggling writers, in particular, need targeted instructional feedback because they tend to produce shorter, less-developed, and more error-filled and problem-laden texts than their peers (Troia, 2006).

However, the role of instructional feedback in the teaching of writing is not without controversy. Proponents advocate its role in supporting motivation and writing quality by (a) indicating to the author his/her position relative to a desired level of quality; (b) identifying areas in need of improvement related to low-level writing skills (spelling, word choice, mechanics, grammar) or high-level skills (idea development and elaboration, organization, rhetoric); and (c) prompting additional practice attempts in which the author incorporates and eventually internalizes the feedback (Ferster et al., 2012, Kellogg et al., 2010, Parr and Timperley, 2010). In contrast, others argue that providing instructional feedback is (a) too time consuming and leads to teacher burnout (Anson, 2000, Baker, 2014, Lee, 2014); (b) too difficult for teachers to provide given the complexity of writing ability (Marshall and Drummond, 2006, Parr and Timperley, 2010); and (c) ineffective or incapable of achieving substantial, generalizable gains in students’ writing performance (Bangert-Drowns et al., 1991, Biber et al., 2011, Kluger and DeNisi, 1998). Nevertheless, instructional feedback continues to be recommended as a method for teaching writing (American Psychological Association, 2015, Graham et al., 2015, Graham et al., 2015, Graham et al., 2012, Graham and Perin, 2007).

In the U.S., an increasingly common form of instructional feedback for writing is feedback provided by automated essay evaluation (AEE) systems (Warschauer & Grimes, 2008). AEE systems are web-based formative writing assessment software programs that provide students with immediate automated feedback in the form of essay ratings and individualized suggestions for improvement when revising (Shermis & Burstein, 2013). Some of the principal benefits of AEE are efficiency and flexibility. While there is no consensus regarding the optimal timing of feedback (see Shute (2008) for review), immediate feedback is often preferred (Chan et al., 2014, Ferster et al., 2012), especially in classroom settings (Hattie and Timperley, 2007, Shute, 2008). In addition, unlike teacher or peer feedback, automated feedback allows students to control feedback timing. Students receive feedback when they request it, either in the middle of, or after having completed, an essay draft. This enables feedback to be immediately actionable, accelerating the practice-feedback loop (Foltz et al., 2013, Kellogg et al., 2010).

While automated feedback addresses some of the barriers faced by teachers when providing instructional feedback, the intended use of AEE systems is to complement and not replace teacher feedback (Foltz, 2014, Foltz et al., 2013, Kellogg et al., 2010). Indeed, AEE is thought to free up instructional time and allow teachers to be more selective in the type of feedback they provide, thereby improving students’ writing motivation and writing performance (Grimes & Warschauer, 2010). For instance, after implementing AEE in her high school, a school administrator reported that, “[AEE] has helped motivate our students to write while making it easier for educators to provide the feedback needed to ensure growth in writing” (Schmelzer, 2004, p.34).

Yet, the growing adoption of AEE in the U.S. has been accompanied by a number of concerns and fears. For instance, despite its intended role as a complement to teacher feedback, some fear that AEE will come to replace the teacher as the primary feedback agent (Ericsson and Haswell, 2006, Herrington and Moran, 2001), and thereby negate the social communicative function of writing (National Council of Teachers of English [NCTE], 2013). Others are concerned that AEE can be easily fooled into assigning high scores to essays which are long, syntactically complex, and replete with complex vocabulary (Bejar et al., 2014, Higgins and Heilman, 2014). Concerns such as these have led some groups to summarily reject the use of AEE (Conference on College Composition and Communication, 2014, National Council of Teachers of English, 2013).

This debate over AEE’s virtues and ills is compounded by two related issues. First, there is a dearth of research on AEE used for the purpose of formative assessment—i.e., assessment for, rather than of, learning (Black & Wiliam, 2009). By far, the majority of research has focused on the psychometric properties of the automated scoring engine, rather than documenting evidence that automated feedback is associated with desired changes in teacher feedback practices or students’ writing motivation or writing quality (Stevenson & Phakiti, 2014). Indeed, a recent chapter on the formative use of AEE in the Handbook of Writing Research still primarily discusses the features of the AEE scoring systems and the reliability and agreement of those systems with human essay ratings. The chapter authors acknowledge that research “still needs to be conducted to gain a more comprehensive understanding of the impact of [automated] feedback that can guide best-use practice” (Shermis, Burstein, Elliot, Miel, & Foltz, 2015, p.406). Second, previous research has most often examined the effects of automated feedback in isolation from teacher feedback (Stevenson & Phakiti, 2014). Such designs lack ecological validity and may inadvertently bolster fears that adoption of AEE will replace teachers as feedback agents.

Given the controversies surrounding the use of instructional feedback and AEE, as well as the dearth of prior research focusing on the intended usage of AEE, the current study was designed to explore the implications for instruction and student performance when teacher feedback on writing was combined with automated feedback. Specific outcomes of interest included the amount, type, and level of teacher feedback; students’ writing motivation; and final-draft writing quality. To further provide context for this study, three areas of prior research will be discussed: (1) categorization of teacher feedback on writing, (2) effects of teacher and automated feedback on writing motivation, and (3) effects of teacher and automated feedback on writing quality.

Section snippets

Categorizing teacher feedback on writing

Teacher feedback on writing is commonly categorized as having at least two components: feedback type and feedback level. Feedback type relates to the manner in which feedback is presented to the student. A common distinction is between direct and indirect feedback (Biber et al., 2011, Black and Wiliam, 1998, Cho et al., 2006, DeGroff, 1992, Shute, 2008). Direct feedback (i.e., directives) involves teachers making a correction or directly telling students what needs to be revised. Indirect

Study purpose

This study examined effects on teacher feedback, and students’ writing motivation and final-draft writing quality associated with a combined automated + teacher feedback condition, in which students received feedback from an AEE system called PEG Writing® as well as their teacher, and a teacher-feedback-only condition, in which students received feedback from their teacher via the comment and edit functions of GoogleDocs. To date, no research has evaluated the effects of a combined

Setting and participants

This study was conducted in a middle school in an urban school district in the mid-Atlantic region of the United States. The district serves approximately 10,000 students in ten elementary schools, three middle schools, and one high school. In this district, 43% of students are African-American, 20% are Hispanic/Latino, and 33% are White. Approximately 9% of students are English Language Learners, and 50% come from low-income families.

Two eighth-grade English Language Arts (ELA)

Pretest analyses

A one-way ANOVA indicated that groups were equivalent with regard to prior literacy achievement: PEG + Teacher (M = 916.23, SD = 256.79), GoogleDocs (M = 851.37, SD = 252.56); F(1, 142) = 2.34, p = 0.129. Non-parametric analyses performed on the individual writing-motivation survey items revealed that the null hypothesis of equal distributions across feedback conditions was retained in all cases. Hence, at pretest, groups were equivalent with respect to prior literacy ability and writing
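As a rough illustration of the equivalence checks reported above, the following sketch uses Python (NumPy and SciPy) to run a one-way ANOVA on two simulated groups of prior-literacy scores and a non-parametric test on one Likert-type survey item. The simulated data, the per-group sample size, and the choice of the Mann-Whitney U test are assumptions made for illustration only; this is not the authors' analysis code or data.

import numpy as np
from scipy import stats

# Illustrative sketch only -- simulated data, not the study's dataset.
rng = np.random.default_rng(42)

# Hypothetical prior-literacy scores per condition (n = 72 per group is assumed,
# consistent with the reported F(1, 142) degrees of freedom)
peg_teacher = rng.normal(916.23, 256.79, 72)
googledocs = rng.normal(851.37, 252.56, 72)

# One-way ANOVA on prior literacy achievement across the two conditions
f_stat, p_value = stats.f_oneway(peg_teacher, googledocs)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.3f}")

# Non-parametric comparison of a single Likert-type motivation item
# (Mann-Whitney U is one common choice for ordinal survey responses)
item_peg = [4, 5, 3, 4, 2, 5, 4, 3]
item_gdocs = [3, 4, 3, 5, 2, 4, 4, 3]
u_stat, p_item = stats.mannwhitneyu(item_peg, item_gdocs, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_item:.3f}")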

Discussion

To our knowledge, this was the first study to compare the effects of a combined teacher + automated feedback condition against a teacher-feedback-only condition (GoogleDocs). A recent literature review indicated the absence of such comparisons and the need to utilize a more ecologically valid experimental design than the customary automated feedback versus no-feedback control designs (Stevenson & Phakiti, 2014).

Conclusion

With the increasing adoption of AEE in classroom settings in the U.S., it is important to carefully understand the associated effects on teachers’ feedback practices and key student outcomes, such as writing motivation and writing quality, when AEE is used as intended. The current study provides partial support for the claim that AEE will afford teachers the ability to focus on higher-level writing skills, while increasing students’ writing motivation and writing quality. Nevertheless, study

Acknowledgements

An earlier version of this work was presented at the 10th Workshop on Innovative Use of NLP for Building Educational Applications in Denver, Colorado, in June 2015. This research was supported in part by a Delegated Authority contract from Measurement Incorporated® to the University of Delaware (EDUC432914150001). The opinions expressed in this paper are those of the authors and do not necessarily reflect the positions or policies of this agency, and no official endorsement

References (92)

  • M.D. Shermis

    State-of-the-art automated essay scoring: competition, results, and future directions from a United States demonstration

    Assessing Writing

    (2014)
  • M. Stevenson et al.

    The effects of computer-generated feedback on the quality of writing

    Assessing Writing

    (2014)
  • R.D. Abbott et al.

    Longitudinal relationships of levels of language in writing and between writing and reading in grades 1 to 7

    Journal of Educational Psychology

    (2010)
  • A.F. AbuSeileek

    Using track changes and word processor to provide corrective feedback to learners in writing

    Journal of Computer Assisted Learning

    (2013)
  • Y. Ahmed et al.

    Developmental relations between reading and writing at the word, sentence and text levels: a latent change score analysis

    Journal of Educational Psychology

    (2014)
  • L. Allal et al.

    Revision: Cognitive and instructional processes

    (2004)
  • I.E. Allen et al.

    Likert scales and data analyses

    Quality Progress

    (2007)
  • American Psychological Association, Coalition for Psychology in Schools and Education

    Top 20 principles from psychology for preK–12 teaching and learning

    (2015)
  • R.R. Andridge et al.

    A review of hot deck imputation for survey non-response

    International Statistical Review

    (2010)
  • T. Asparouhov et al.

    Multiple imputation with Mplus

    (2010)
  • R.L. Bangert-Drowns et al.

    The instructional effect of feedback in test-like events

    Review of Educational Research

    (1991)
  • Y. Benjamini et al.

    Controlling the false discovery rate: a practical and powerful approach to multiple testing

    Journal of the Royal Statistical Society. Series B (Methodological)

    (1995)
  • D. Biber et al.

    The effectiveness of feedback for L1-English and L2-writing development: A meta-analysis. TOEFL iBT™ research report

    (2011)
  • P. Black et al.

    Assessment and classroom learning

    Assessment in Education

    (1998)
  • P. Black et al.

    Developing the theory of formative assessment

    Educational Assessment, Evaluation and Accountability

    (2009)
  • M.B. Bunch et al.

    Automated scoring in assessment systems

  • P.E. Chan et al.

    The critical role of feedback in formative instructional practices

    Intervention in School and Clinic

    (2014)
  • C.E. Chen et al.

    Beyond the design of automated writing evaluation: pedagogical practices and perceived learning effectiveness in EFL writing classes

    Language Learning & Technology

    (2008)
  • K. Cho et al.

    Commenting on writing: typology and perceived helpfulness of comments from novice peer reviewers and subject matter experts

    Written Communication

    (2006)
  • L. Clare et al.

    Learning to write in urban elementary and middle schools: An investigation of teachers’ written feedback on student compositions

    (2000)
  • Common Core State Standards Initiative

    Common core state standards for English language arts & literacy in history/social studies, science, and technical subjects

    (2010)
  • Conference on College Composition and Communication

    Writing assessment: A position statement

    (2014)
  • L. DeGroff

    Process-writing teachers’ responses to fourth grade writers’ first drafts

    The Elementary School Journal

    (1992)
  • H. Duijnhouwer et al.

    Progress feedback effects on students’ writing mastery goal, self-efficacy beliefs, and performance

    Educational Research and Evaluation

    (2010)
  • P.F. Ericsson et al.
  • B. Ferster et al.

    Automated formative assessment as a tool to scaffold student documentary writing

    Journal of Interactive Learning Research

    (2012)
  • A. Field

    Discovering statistics using IBM SPSS statistics

    (2014)
  • P.W. Foltz

    Improving student writing through automated formative assessment: Practices and results

    (2014)
  • P.W. Foltz et al.

    Implementation and applications of the intelligent essay assessor

  • M. Franzke et al.

    Summary Street®: computer support for comprehension and writing

    Journal of Educational Computing Research

    (2005)
  • S. Graham et al.

    Research-based writing practices and the common Core

    Elementary School Journal

    (2015)
  • S. Graham et al.

    Formative assessment and writing: a meta-analysis

    Elementary School Journal

    (2015)
  • S. Graham et al.

    Assessing the writing achievement of young struggling writers: application of generalizability theory

    Learning Disability Quarterly

    (2014)
  • S. Graham et al.

    A meta-analysis of writing instruction for students in the elementary grades

    Journal of Educational Psychology

    (2012)
  • S. Graham et al.

    Writing next: Effective strategies to improve writing of adolescents in middle and high schools – A report to Carnegie Corporation of New York

    (2007)
  • D. Grimes et al.

    Utility in a fallible tool: a multi-site case study of automated writing evaluation

    Journal of Technology, Learning, and Assessment

    (2010)