General estimators for the reliability of qualitative data

Cooil, Bruce; Rust, Roland T.

doi:10.1007/BF02301413

Published: June 1995

Psychometrika, Volume 60, pages 199–220 (1995)


Abstract

We study a proportional reduction in loss (PRL) measure for the reliability of categorical data and consider the general case in which each of N judges assigns a subject to one of K categories. This measure has been shown to be equivalent to a measure proposed by Perreault and Leigh for a special case when there are two equally competent judges, and the correct category has a uniform prior distribution. We consider a general framework where the correct category is assumed to have an arbitrary prior distribution, and where classification probabilities vary by correct category, judge, and category of classification. In this setting, we consider PRL reliability measures based on two estimators of the correct category: the empirical Bayes estimator and an estimator based on the judges' consensus choice. We also discuss four important special cases of the general model and study several types of lower bounds for PRL reliability.
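
As an illustration of the two-judge special case, the Perreault and Leigh (1989) index is usually written as I_r = sqrt((F_o/n - 1/K) * K/(K - 1)) when the observed proportion of agreement F_o/n exceeds 1/K, and 0 otherwise. The sketch below computes that index and a simple consensus choice among N judges. It is not the authors' code: the function names are hypothetical, and the plurality rule with ties left unresolved is a simplifying assumption, not the paper's definition of the consensus estimator.

```python
import math
from collections import Counter


def perreault_leigh_index(ratings_a, ratings_b, num_categories):
    """Two-judge reliability index in the form given by Perreault and Leigh (1989).

    ratings_a, ratings_b: equal-length sequences of category labels, one per subject.
    num_categories: K, the number of categories available to the judges.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both judges must rate the same non-empty set of subjects")
    if num_categories < 2:
        raise ValueError("at least two categories are required")

    # Observed proportion of subjects on which the two judges agree (F_o / n).
    agreement = sum(a == b for a, b in zip(ratings_a, ratings_b)) / len(ratings_a)
    chance = 1.0 / num_categories

    # The index is defined as 0 when agreement does not exceed 1/K.
    if agreement <= chance:
        return 0.0
    return math.sqrt((agreement - chance) * num_categories / (num_categories - 1))


def consensus_category(judgments):
    """Plurality choice among the judges' labels for one subject.

    Returns None on a tie (an assumption made here for simplicity).
    """
    counts = Counter(judgments)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    judge_1 = [1, 2, 3, 1]
    judge_2 = [1, 2, 3, 2]
    print(perreault_leigh_index(judge_1, judge_2, num_categories=3))
    print(consensus_category(["a", "b", "a"]))
```

With three categories and agreement on three of four subjects, the quantity under the square root is (0.75 - 1/3) * 3/2 = 0.625. The general model in the abstract relaxes the assumptions behind this special case (equally competent judges, uniform prior), which this sketch does not attempt to cover.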

References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csáki (Eds.), 2nd International Symposium on Information Theory (pp. 267–281). Budapest: Akadémiai Kiadó.
Agresti, A. (1990). Categorical data analysis. New York: John Wiley & Sons.
Batchelder, W. H., & Romney, A. K. (1986). The statistical analysis of a general Condorcet model for dichotomous choice situations. In B. Grofman & G. Owen (Eds.), Information pooling and group decision making (pp. 103–112). Greenwich, CN: JAI Press.
Batchelder, W. H., & Romney, A. K. (1988). Test theory without an answer key. Psychometrika, 53, 193–224.
Batchelder, W. H., & Romney, A. K. (1989). New results in test theory without an answer key. In Edward E. Roskam (Ed.), Mathematical psychology in progress (pp. 229–248). Berlin, Heidelberg, New York: Springer-Verlag.
Clogg, C. C. (1981). New developments in latent structure analysis. In D. M. Jackson & E. F. Borgatta (Eds.), Factor analysis and measurement in sociological research (pp. 215–246). London: Sage.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46.
Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70, 213–220.
Cooil, B., & Rust, R. T. (1994). Reliability and expected loss: A unifying principle. Psychometrika, 59, 203–216.
Costner, H. L. (1965). Criteria for measures of association. American Sociological Review, 30, 341–353.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: John Wiley & Sons.
David, F. N., & Barton, D. E. (1962). Combinatorial chance. London: Griffin.
David, H. A. (1981). Order statistics (2nd ed.). New York: John Wiley & Sons.
Dillon, W. R., & Mulani, N. (1984). A probabilistic latent class model for assessing inter-judge reliability. Multivariate Behavioral Research, 19, 438–458.
Haberman, S. J. (1974). Log-linear models for frequency tables derived by indirect observation. Maximum likelihood equations. Annals of Statistics, 2, 911–924.
Hughes, M. A., & Garrett, D. E. (1990). Intercoder reliability estimation approaches in marketing: A generalizability theory framework for quantitative data. Journal of Marketing Research, 27, 185–195.
Johnson, N. L., & Kotz, S. (1969). Discrete distributions. Boston, MA: Houghton Mifflin.
Kesten, H., & Morse, N. (1959). A property of the multinomial distribution. Annals of Mathematical Statistics, 30, 120–127.
Kozelka, R. M. (1956). Approximate upper percentage points for extreme values in multinomial sampling. Annals of Mathematical Statistics, 27, 507–512.
Loevinger, J. (1948). The technic of homogeneous tests compared with some aspects of “scale analysis” and factor analysis. Psychological Bulletin, 45, 507–530.
Marshall, A. W., & Olkin, I. (1979). Inequalities: Theory of majorization and its applications. New York: Academic Press.
Mellenbergh, G. J., & van der Linden, W. J. (1979). The internal and external optimality of decisions based on tests. Applied Psychological Measurement, 3, 257–273.
Perreault, W. D. Jr., & Leigh, L. E. (1989). Reliability of nominal data based on qualitative judgments. Journal of Marketing Research, 26, 135–148.
Romney, A. K., Weller, S. C., & Batchelder, W. H. (1986). Culture as consensus: A theory of culture and informant accuracy. American Anthropologist, 88, 313–338.
Rust, R. T., Simester, D., Brodie, R. J., & Nilikant, V. (in press). Model selection criteria: An investigation of relative accuracy, posterior probabilities, and combinations of criteria. Management Science.
Schouten, H. J. A. (1982). Measuring pairwise agreement among many observers, II: Some improvements and additions. Biometrical Journal, 24, 431–435.
Schouten, H. J. A. (1986). Nominal scale agreement among observers. Psychometrika, 51, 453–466.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica, 50, 1–25.
Winer, B. J. (1971). Statistical principles in experimental design. New York: McGraw-Hill.
Woodroofe, M. (1982). On model selection and the arc sine laws. Annals of Statistics, 10, 1182–1194.

Author information

Authors and Affiliations

Owen Graduate School of Management, Vanderbilt University, 401 21st Avenue South, Nashville, TN 37203
Bruce Cooil & Roland T. Rust

Corresponding author

Correspondence to Bruce Cooil.

Additional information

Bruce Cooil is Associate Professor of Statistics, and Roland T. Rust is Professor and area head for Marketing, Owen Graduate School of Management, Vanderbilt University. The authors thank three anonymous reviewers and an Associate Editor for their helpful comments and suggestions. This work was supported in part by the Dean's Fund for Faculty Research of the Owen Graduate School of Management, Vanderbilt University.

About this article

Cite this article

Cooil, B., Rust, R.T. General estimators for the reliability of qualitative data. Psychometrika 60, 199–220 (1995). https://doi.org/10.1007/BF02301413

Received: 06 May 1993
Revised: 05 April 1994
Issue Date: June 1995
DOI: https://doi.org/10.1007/BF02301413
