Published in: Quality & Quantity 3/2017

04-04-2016

Spurious relationships arising from aggregate variables in linear regression

Authors: David J. Armor, Chenna Reddy Cotla, Thomas Stratmann

Abstract

Linear regressions that use aggregated values from a group variable such as a school or a neighborhood are commonplace in the social sciences. This paper uses Monte Carlo methods to demonstrate that aggregated variables produce spurious relationships with other dependent and independent variables in a model even when there are no underlying relationships among those variables. The size of the spurious relationships (or postulated effects) increases as the number of observations per group decreases. Although this problem is remedied by including the individual-level variable in the regression, the problem has not been discussed in the methodological literature. Accordingly, studies using aggregate variables must be interpreted with caution if the individual-level measurements are not available.
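The mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' actual simulation design: individuals are assigned to groups at random, an individual-level variable `x` affects the outcome `y`, and the group mean of `x` has no effect of its own. The function name `simulate`, the parameter values, and the data-generating model are all hypothetical choices made for illustration. Because each person's own `x` is part of the group mean, the aggregate correlates with `y` when `x` is omitted; under these particular assumptions (unit variances, slope 1) the expected correlation is roughly 1/sqrt(2 × group size), so it grows as groups get smaller.

```python
import numpy as np


def simulate(n_groups, group_size, beta=1.0, seed=0):
    """Return (corr(y, group mean of x), coefficient on the group mean
    when the individual-level x is also included in the regression).

    Assumed (hypothetical) model: y = beta * x + noise, with no true
    group-level effect and purely random assignment to groups.
    """
    rng = np.random.default_rng(seed)
    n = n_groups * group_size

    # Individual-level predictor and outcome; x is i.i.d., so grouping
    # consecutive observations amounts to random assignment.
    x = rng.standard_normal(n)
    y = beta * x + rng.standard_normal(n)

    # Aggregate variable: each person's group mean of x (includes own x).
    groups = np.repeat(np.arange(n_groups), group_size)
    group_mean = x.reshape(n_groups, group_size).mean(axis=1)[groups]

    # Correlation between the outcome and the aggregate alone.
    r_aggregate = np.corrcoef(y, group_mean)[0, 1]

    # OLS of y on intercept, individual x, and the group mean.
    design = np.column_stack([np.ones(n), x, group_mean])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)

    return r_aggregate, coefs[2]


if __name__ == "__main__":
    for size in (5, 20, 100):
        r, b_group = simulate(n_groups=2000, group_size=size)
        print(f"group size {size:>3}: corr(y, group mean) = {r:.3f}, "
              f"group-mean coefficient with x controlled = {b_group:.3f}")
```

In this sketch the correlation with the aggregate shrinks as group size grows, and the coefficient on the group mean stays close to zero once the individual-level variable is in the model, which is consistent with the remedy the abstract describes.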

Footnotes
1
The shapes of the actual distributions are unknown, but assuming they are normal, the 100,000 samples would generate an extremely small standard error, so that a correlation as small as .01 would be significant at the .05 level.
 
2
Simulations were also run for 60 and 80 schools, but results differed only slightly from the 50 school case.
 
3
In the dataset, the variable was named "metasum".
 
4
It is understood that the coefficients for S and P in model (4) can differ from those in model (3), even though the same β symbols are used.
 
5
The full set of simulation correlations that go into model (4) is available from the authors.
 
Metadata
Title
Spurious relationships arising from aggregate variables in linear regression
Authors
David J. Armor
Chenna Reddy Cotla
Thomas Stratmann
Publication date
04-04-2016
Publisher
Springer Netherlands
Published in
Quality & Quantity / Issue 3/2017
Print ISSN: 0033-5177
Electronic ISSN: 1573-7845
DOI
https://doi.org/10.1007/s11135-016-0335-0