Published in: Quality & Quantity 3/2014

01.05.2014

Intercoder reliability indices: disuse, misuse, and abuse

Author: Guangchao Charles Feng

Abstract

Although intercoder reliability has been considered crucial to the validity of a content study, the choice among the many available indices has been controversial. This study analyzed all content studies published in two major communication journals that reported intercoder reliability, to examine how scholars conduct intercoder reliability tests. The results revealed that over the past 30 years some intercoder reliability indices have been persistently misused with respect to the levels of measurement, the number of coders, and the means of reporting reliability. Implications of misuse, disuse, and abuse are discussed, and suggestions regarding the proper choice of indices in various situations are offered.


Footnotes
1
Coders may also be called annotators, judges, raters, observers, classifiers, and so on, depending on the research field. Intercoder and interrater are used interchangeably throughout the paper.
 
2
When the reliability value is far lower than percent agreement, e.g., percent agreement is above 0.8 while the reliability value is close to or below 0, this may indicate that the marginal distribution is too skewed.
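A minimal worked example (not from the paper, but illustrating this footnote): suppose two coders each code 100 units into two categories, agree on 90 of them, and each assigns the rare category to only 5 units. Percent agreement is \(p_{o}=0.90\), yet a chance-corrected index such as Cohen's \(\kappa\) gives
\[
p_{e} = 0.95^{2} + 0.05^{2} = 0.905, \qquad \kappa = \frac{p_{o}-p_{e}}{1-p_{e}} = \frac{0.90-0.905}{1-0.905} \approx -0.05,
\]
because the heavily skewed marginals push expected agreement above observed agreement.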
 
3
It is identical to the \(S\) coefficient of Bennett et al. (1954).
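For orientation (a standard statement of the coefficient, not quoted from the paper): with \(q\) categories and observed agreement \(p_{o}\), \(S\) corrects for chance under a uniform distribution over categories,
\[
S = \frac{p_{o} - 1/q}{1 - 1/q} = \frac{q\,p_{o} - 1}{q - 1}.
\]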
 
4
As Lombard et al. (2002) argued, the proportion of studies using percent agreement was probably underestimated, because most of the studies recorded as "NA" likely used percent agreement.
 
5
Both have multiple-coder versions proposed by other scholars: Fleiss (1971) extended \(\pi \), while Light (1971) and Conger (1980) proposed multiple-coder versions of \(\kappa \).
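As a reference point (the standard formulation, not reproduced from the paper), Fleiss's extension for \(N\) units, \(n\) coders per unit, and \(n_{ij}\) coders assigning unit \(i\) to category \(j\) is
\[
\kappa_{F} = \frac{\bar{P}-\bar{P}_{e}}{1-\bar{P}_{e}}, \qquad \bar{P} = \frac{1}{N}\sum_{i=1}^{N}\frac{\sum_{j} n_{ij}^{2}-n}{n(n-1)}, \qquad \bar{P}_{e} = \sum_{j} p_{j}^{2}, \quad p_{j} = \frac{1}{Nn}\sum_{i=1}^{N} n_{ij}.
\]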
 
6
Cohen (1968) later proposed weighted \(\kappa \) for ordinal ratings. Krippendorff's (2004a) \(\alpha \) can be applied to all levels of measurement. Some indices, such as ICCs, are applicable only to interval ratings, while others, such as \(I_{r}\), Brennan and Prediger's (1981) \(\kappa \), and \(\pi \), have no higher-level counterparts.
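A sketch of why \(\alpha \) spans all levels of measurement (standard definitions, given here for orientation): \(\alpha = 1 - D_{o}/D_{e}\), where observed disagreement \(D_{o}\) and expected disagreement \(D_{e}\) are computed from a difference function matched to the level of measurement, e.g.,
\[
\delta^{2}_{ck} = \begin{cases} 0 & c = k\\ 1 & c \ne k \end{cases} \quad \text{(nominal)}, \qquad \delta^{2}_{ck} = (c-k)^{2} \quad \text{(interval)}.
\]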
 
7
Although there is a consensus that percent agreement, including Holsti's method, generally overestimates reliability because it makes no allowance for chance agreement, its use for nominal-scale codings is not considered misuse. The rationale is explained below.
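The overestimation follows from the common structure of chance-corrected indices (a standard formulation, shown for orientation): with observed agreement \(A_{o}\) and expected chance agreement \(A_{e}\),
\[
\text{reliability} = \frac{A_{o}-A_{e}}{1-A_{e}},
\]
whereas percent agreement reports \(A_{o}\) alone; whenever \(A_{e}>0\) and \(A_{o}<1\), the uncorrected \(A_{o}\) exceeds the chance-corrected value.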
 
8
Whether standard errors should be reported for obtained reliability values is still debated in the literature; therefore, not reporting standard errors is not treated as a problem here.
 
9
There are many modeling approaches, such as log-linear, IRT (item response theory), latent class, and mixture modeling. In a separate study by the author, log-linear modeling was found to be no better than most indices.
 
10
Although variables with binary outcomes belong to the nominal level, most indices behave more alike on binary and interval variables than on multi-category nominal ones.
 
References
Artstein, R., Poesio, M.: Inter-coder agreement for computational linguistics. Comput. Linguist. 34(4), 555–596 (2008)
Brennan, R., Prediger, D.: Coefficient kappa: some uses, misuses, and alternatives. Educ. Psychol. Meas. 41(3), 687 (1981)
Cicchetti, D., Feinstein, A.: High agreement but low kappa: II. Resolving the paradoxes. J. Clin. Epidemiol. 43(6), 551–558 (1990)
Cronbach, L.: Coefficient alpha and the internal structure of tests. Psychometrika 16(3), 297–334 (1951)
Fleiss, J.: Measuring nominal scale agreement among many raters. Psychol. Bull. 76(5), 378–382 (1971)
Gwet, K.: Inter-rater reliability: dependency on trait prevalence and marginal homogeneity. Stat. Methods Inter-Rater Reliab. Assess. Ser. 2, 1–9 (2002)
Gwet, K.: Computing inter-rater reliability and its variance in the presence of high agreement. Br. J. Math. Stat. Psychol. 61(1), 29–48 (2008)
Gwet, K.: Handbook of Inter-Rater Reliability: A Definitive Guide to Measuring the Extent of Agreement Among Multiple Raters. Advanced Analytics LLC, Gaithersburg (2010)
Holsti, O.: Content Analysis for the Social Sciences and Humanities. Addison-Wesley, Reading, MA (1969)
Kolbe, R.H., Burnett, M.S.: Content-analysis research: an examination of applications with directives for improving research reliability and objectivity. J. Consum. Res. 18(2), 243–250 (1991). http://www.jstor.org/stable/2489559
Krippendorff, K.: Content Analysis: An Introduction to Its Methodology, 2nd edn. Sage, Thousand Oaks (2004a)
Krippendorff, K.: A dissenting view on so-called paradoxes of reliability coefficients. In: Salmon, C.T. (ed.) Communication Yearbook, vol. 36, pp. 481–500. Routledge, New York (2012)
Light, R.J.: Measures of response agreement for qualitative data: some generalizations and alternatives. Psychol. Bull. 76(5), 365–377 (1971)
Lin, L.: A concordance correlation coefficient to evaluate reproducibility. Biometrics 45(1), 255 (1989)
Lombard, M., Snyder-Duch, J.: Content analysis in mass communication: assessment and reporting of intercoder reliability. Hum. Commun. Res. 28(4), 587–604 (2002)
Osgood, C.: The representational model and relevant research methods. In: de Sola Pool, I. (ed.) Trends in Content Analysis, pp. 33–88. University of Illinois Press, Champaign (1959)
Riffe, D., Lacy, S., Fico, F.: Analyzing Media Messages: Using Quantitative Content Analysis in Research. Lawrence Erlbaum Associates, New Jersey (2005)
Scott, W.: Reliability of content analysis: the case of nominal scale coding. Public Opin. Q. 19, 321–325 (1955). doi:10.1086/266577
Spiegelman, M., Terwilliger, C., Fearing, F.: The reliability of agreement in content analysis. J. Soc. Psychol. 37, 175–187 (1953)
Zhao, X.: A Reliability Index (ai) that Assumes Honest Coders and Variable Randomness. Association for Education in Journalism and Mass Communication, Chicago (2012)
Zhao, X., Liu, J.S., Deng, K.: Assumptions behind inter-coder reliability indices. In: Salmon, C.T. (ed.) Communication Yearbook, vol. 36, pp. 419–480. Routledge, New York (2012)
Metadata
Title
Intercoder reliability indices: disuse, misuse, and abuse
Author
Guangchao Charles Feng
Publication date
01.05.2014
Publisher
Springer Netherlands
Published in
Quality & Quantity / Issue 3/2014
Print ISSN: 0033-5177
Electronic ISSN: 1573-7845
DOI
https://doi.org/10.1007/s11135-013-9956-8
