Published in: Educational Assessment, Evaluation and Accountability 1/2021

08-02-2021

Making sense out of measurement non-invariance: how to explore differences among educational systems in international large-scale assessments

Authors: Edwin Cuellar, Ivailo Partchev, Robert Zwitser, Timo Bechger

Abstract

International large-scale assessment in education aims to compare educational achievement across many countries. Differences between countries in language, culture, and education give rise to differential item functioning (DIF). For many decades, DIF has been regarded as a nuisance and a threat to validity. In this paper, we take a different stance and argue that DIF holds essential information about the differences between countries. To uncover this information, we explore multivariate analysis techniques for analyzing DIF, with an emphasis on visualization. PISA 2012 data are used for illustration.

Footnotes
1
Orthonormal means that \(\mathbf{U}^{\prime}\mathbf{U}=\mathbf{V}^{\prime}\mathbf{V}=\mathbf{I}\), implying that the columns of both \(\mathbf{U}\) and \(\mathbf{V}\) are mutually orthogonal (perpendicular) and that the sum of squares of each column is 1.
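As a small illustration of this definition, the sketch below checks whether the columns of a matrix are orthonormal by forming the cross-product matrix and comparing it with the identity. It is not taken from the paper: the helper names `gram` and `has_orthonormal_columns`, the tolerance, and the two example matrices are assumptions chosen for illustration only.

```python
import math


def gram(matrix):
    """Return M'M for a matrix stored as a list of rows."""
    n_cols = len(matrix[0])
    return [
        [sum(row[i] * row[j] for row in matrix) for j in range(n_cols)]
        for i in range(n_cols)
    ]


def has_orthonormal_columns(matrix, tol=1e-12):
    """Check that M'M equals the identity matrix within a tolerance."""
    g = gram(matrix)
    return all(
        abs(g[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(len(g))
        for j in range(len(g))
    )


# Hypothetical example 1: a rotation matrix; its columns are
# perpendicular and each has a sum of squares equal to 1.
theta = math.pi / 6
U = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

# Hypothetical example 2: columns that are linearly independent
# but neither perpendicular nor of unit length.
W = [
    [1.0, 1.0],
    [0.0, 1.0],
]

print(has_orthonormal_columns(U))
print(has_orthonormal_columns(W))
```

The second matrix fails the check even though its columns are linearly independent, which shows that orthonormality is the stronger requirement.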
 
2
This information is provided by the PISA technical report (see Appendix A, OECD 2014).
 
Metadata
Title
Making sense out of measurement non-invariance: how to explore differences among educational systems in international large-scale assessments
Authors
Edwin Cuellar
Ivailo Partchev
Robert Zwitser
Timo Bechger
Publication date
08-02-2021
Publisher
Springer Netherlands
Published in
Educational Assessment, Evaluation and Accountability / Issue 1/2021
Print ISSN: 1874-8597
Electronic ISSN: 1874-8600
DOI
https://doi.org/10.1007/s11092-021-09355-x