1995 | OriginalPaper | Chapter
Adjusting subjectively rated scores
Author: Nicholas T. Longford
Published in: Models for Uncertainty in Educational Testing
Publisher: Springer New York
Included in: Professional Book Archive
In the context of educational testing, estimation of the variance components and of the reliability coefficients is at best of secondary importance to the pivotal task: assigning scores to the essays (performances, problem solutions, or the like) in a way that reflects their quality as faithfully as possible. In ideal circumstances, this would correspond to reconstructing the true score $\alpha_i$ for each essay. A more realistic target is to get as close to $\alpha_i$ as possible. This chapter discusses improvements on the trivial estimator of the true score, the mean score over the $K$ sessions, $$y_{i,.} = (y_{i,j_{i1}} + \ldots + y_{i,j_{iK}})/K,$$ by means of several adjustment schemes. The variance components $\sigma_a^2$, $\sigma_b^2$, and $\sigma_e^2$ play an important role in these schemes. To motivate them, consider the following extreme cases. When everybody has the same true score, $\sigma_a^2 = 0$, each examinee should be given the same score, irrespective of the grades given by the raters. Similarly, when the raters vary a great deal in their severities (large $\sigma_b^2$), or the rating is very inconsistent (large $\sigma_e^2$), an extreme score (say, 0 or 9 on the scale 0–9) is not strong evidence of very poor or very high quality of the essay. On average, it may be prudent to ‘pull’ the extreme scores closer to the mean, so as to hedge our bets against the largest possible errors.
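The extreme cases above can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not one of the chapter's adjustment schemes. It assumes the K ratings of an essay come from K different raters, that the three variance components (true-score, rater-severity, and inconsistency variance) are known, and that the mean score is pulled toward an overall mean by a simple linear weight. The function names, the `grand_mean` argument, and the weight formula are hypothetical choices made for this example.

```python
from statistics import mean


def mean_score(ratings):
    """Trivial estimator: the average of the K ratings given to one essay."""
    return mean(ratings)


def shrunken_score(ratings, grand_mean, var_a, var_b, var_e):
    """Pull the mean score toward grand_mean (assumed, illustrative form).

    var_a: true-score variance, var_b: rater-severity variance,
    var_e: inconsistency variance. All are assumed known and non-negative.
    """
    if min(var_a, var_b, var_e) < 0:
        raise ValueError("variance components must be non-negative")
    k = len(ratings)
    # Assumption: each rating comes from a different rater, so severity and
    # inconsistency both average out at rate 1/K in the mean score.
    noise = (var_b + var_e) / k
    total = var_a + noise
    # With no variation at all there is nothing to adjust; keep the mean.
    weight = var_a / total if total > 0 else 1.0
    return grand_mean + weight * (mean_score(ratings) - grand_mean)
```

With a true-score variance of zero the weight is zero and every essay receives `grand_mean`, matching the first extreme case. As the severity or inconsistency variance grows, the weight falls and an extreme mean score such as 9 is pulled further toward the overall mean.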