Published in: Demography 5/2016

19-08-2016

Spatial Variation in the Quality of American Community Survey Estimates

Authors: David C. Folch, Daniel Arribas-Bel, Julia Koschinsky, Seth E. Spielman



Abstract

Social science research, public and private sector decisions, and allocations of federal resources often rely on data from the American Community Survey (ACS). However, this critical data source has high uncertainty in some of its most frequently used estimates. Using 2006–2010 ACS median household income estimates at the census tract scale as a test case, we explore spatial and nonspatial patterns in ACS estimate quality. We find that spatial patterns of uncertainty in the northern United States differ from those in the southern United States, and they are also different in suburbs than in urban cores. In both cases, uncertainty is lower in the former than the latter. In addition, uncertainty is higher in areas with lower incomes. We use a series of multivariate spatial regression models to describe the patterns of association between uncertainty in estimates and economic, demographic, and geographic factors, controlling for the number of responses. We find that these demographic and geographic patterns in estimate quality persist even after we account for the number of responses. Our results indicate that data quality varies across places, making cross-sectional analysis both within and across regions less reliable. Finally, we present advice for data users and potential solutions to the challenges identified.


Footnotes
1
Children in poverty are those under 18 living in a family whose income is below the poverty level and who are related to the householder by birth, marriage, or adoption. In 32,007 of the 72,539 census tracts (44.1 %) in the contiguous United States, the MOE is greater than or equal to the estimate.
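The comparison of MOE and estimate in this footnote can be restated as a coefficient of variation (CV), the uncertainty measure referred to in later footnotes. The following is a minimal sketch, assuming the published MOE is at the 90 % confidence level (the Census Bureau's convention for ACS tables, so the standard error is MOE/1.645). The function name and the sample figures are hypothetical.

```python
Z_90 = 1.645  # assumption: ACS margins of error are published at the 90 % level


def cv_from_moe(estimate, moe):
    """Return the coefficient of variation (standard error / estimate)."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    standard_error = moe / Z_90
    return standard_error / abs(estimate)


# Hypothetical tract: median household income of 50,000 with an MOE of 8,225.
example_cv = cv_from_moe(50_000, 8_225)
```

Under this assumption, a tract whose MOE equals its estimate has a CV of about 0.61.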
 
2
The 2011–2013 ACS estimates represent the last three-year data release.
 
3
By 2009, the Census Bureau had collected sufficient data to start distributing one-year, three-year, and five-year estimates each year.
 
4
These estimates are based on census tracts in the contiguous United States, with outlier tracts removed (see the Data section).
 
5
The statistic is computed for each census tract on the map. Statistical significance is based on 999 random permutations of the data, and a significance level of .05.
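The permutation approach described in this footnote can be sketched as follows. The footnote does not name the statistic, so the sketch assumes a local Moran's I with row-standardized neighbor weights and a one-sided test for positive association. The function names, the `neighbors` layout, and the sample values are hypothetical.

```python
import random


def local_stat(z_i, neighbor_values):
    """Local Moran-style statistic: own value times the mean of neighbor values."""
    return z_i * (sum(neighbor_values) / len(neighbor_values))


def pseudo_p_value(i, z, neighbors, permutations=999, seed=0):
    """Conditional permutation test for unit i.

    z         -- values expressed as deviations from the mean
    neighbors -- dict mapping a unit index to the list of its neighbor indices
    """
    rng = random.Random(seed)
    observed = local_stat(z[i], [z[j] for j in neighbors[i]])
    others = [z[j] for j in range(len(z)) if j != i]  # unit i stays fixed
    k = len(neighbors[i])
    at_least_as_large = 0
    for _ in range(permutations):
        shuffled_neighbors = rng.sample(others, k)  # random stand-in neighbors
        if local_stat(z[i], shuffled_neighbors) >= observed:
            at_least_as_large += 1
    # The observed arrangement counts as one of the (permutations + 1) draws.
    return (at_least_as_large + 1) / (permutations + 1)


# Hypothetical data: three high values that neighbor each other, 17 low values.
z = [1.7, 1.7, 1.7] + [-0.3] * 17
p = pseudo_p_value(0, z, {0: [1, 2]})
significant = p <= 0.05
```

With 999 permutations the smallest attainable pseudo p-value is 1/1000, which is why a threshold of .05 is comfortably resolvable.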
 
6
These areas are determined by the December 2009 definition of Core Based Statistical Areas (http://www.census.gov/population/metro/files/lists/2009/List1.txt).
 
7
We order the census tracts within an MSA by increasing distance from their city center; we then split this ordered list into 100 bins (percentiles). We repeat this process for the 150 largest MSAs and then pool the tracts by percentile for all MSAs into a single set of 100 bins. When completed, the first bin has the urban core of all the MSAs, and the 100th bin has the urban fringe of each MSA. The points in Fig. 7 represent the median CV value from each bin.
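The binning procedure in this footnote can be sketched as below. The input layout (a dict of MSA name to a list of `(distance, cv)` pairs) and the function names are hypothetical, and ties in distance are broken by input order, which the footnote does not specify.

```python
from collections import defaultdict
from statistics import median


def percentile_bins(distances, n_bins=100):
    """Assign each tract of ONE MSA a bin 0..n_bins-1 by rank of distance."""
    order = sorted(range(len(distances)), key=lambda t: distances[t])
    bins = [0] * len(distances)
    for rank, tract in enumerate(order):
        bins[tract] = rank * n_bins // len(distances)
    return bins


def median_cv_by_bin(msas, n_bins=100):
    """Pool tracts from all MSAs by bin and return the median CV per bin."""
    pooled = defaultdict(list)
    for tracts in msas.values():
        distances = [distance for distance, _ in tracts]
        for (_, cv), b in zip(tracts, percentile_bins(distances, n_bins)):
            pooled[b].append(cv)
    return {b: median(cvs) for b, cvs in sorted(pooled.items())}


# Hypothetical toy input with two MSAs and two bins instead of 100.
toy = {
    "A": [(1.0, 0.30), (5.0, 0.10), (3.0, 0.20), (7.0, 0.05)],
    "B": [(2.0, 0.40), (9.0, 0.20)],
}
toy_medians = median_cv_by_bin(toy, n_bins=2)
```

Binning by rank within each MSA, rather than by raw distance, is what lets small and large metropolitan areas be pooled on a common core-to-fringe scale.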
 
8
Urban centers are identified using the U.S. Geological Survey Geographic Names Information System. The latitude-longitude marker for the first city listed in the MSA name is extracted from the database, and then distances are computed from each census tract centroid to the urban center.
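The distance step can be illustrated with a great-circle calculation. The footnote does not state which distance measure was used, so the haversine formula, the Earth radius constant, and the function names below are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two latitude-longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distances_to_center(centroids, center):
    """Distance from each tract centroid (lat, lon) to the urban center (lat, lon)."""
    return [great_circle_km(lat, lon, center[0], center[1]) for lat, lon in centroids]
```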
 
9
Federally subsidized housing programs include public housing (traditional and HOPE VI), multifamily housing (including housing for the elderly and disabled, Sections 202 and 811), and vouchers (predominantly, housing choice vouchers for tenants). Address-level records are aggregated to the tract level.
 
10
Age diversity is computed using four groups: younger than 18 years, 18–34 years, 35–64 years, and 65 and older. Racial/ethnic diversity is also computed using four groups: white (non-Hispanic), African American (non-Hispanic), Asian (non-Hispanic), and Hispanic.
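A four-group diversity score of this kind could be computed as in the sketch below. The footnote does not say which diversity measure was used, so the normalized Shannon entropy here is an assumption, as are the function name and the sample counts.

```python
import math


def entropy_diversity(counts):
    """Normalized Shannon entropy: 0 for a single group, 1 for equal shares.

    counts -- population counts for each group (at least two groups)
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        raise ValueError("need a positive total and at least two groups")
    shares = [c / total for c in counts if c > 0]  # empty groups contribute 0
    entropy = -sum(s * math.log(s) for s in shares)
    return entropy / math.log(len(counts))


# Hypothetical tract: counts for the four age groups listed in the footnote.
age_diversity = entropy_diversity([900, 1100, 1600, 400])
```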
 
11
All the results shown relate to a spatial weights matrix built using the queen contiguity criterion, under which two observations are neighbors and are thus assigned a weight of 1 if they share a border of any length, including a single point. This matrix is then standardized so that every row sums to 1, effectively converting \( \sum_j w_{ij} u_j \) into the average value of \( u \) in the surroundings of \( i \).
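The effect of row standardization can be seen in a small sketch. It uses a regular grid as a stand-in for census tracts (real tract polygons would need a GIS library), and all function names are hypothetical.

```python
def queen_neighbors(n_rows, n_cols):
    """Queen contiguity on a grid: cells sharing an edge or a corner are neighbors."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[r * n_cols + c] = [
                rr * n_cols + cc
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)
            ]
    return neighbors


def row_standardize(neighbors):
    """Turn binary neighbor lists into weights that sum to 1 within each row."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}


def spatial_lag(weights, u):
    """Weighted sum over neighbors; with row-standardized weights, their mean."""
    return {i: sum(w * u[j] for j, w in row.items()) for i, row in weights.items()}


grid = queen_neighbors(3, 3)
weights = row_standardize(grid)
lag = spatial_lag(weights, list(range(9)))  # hypothetical values 0..8
```

In the 3 x 3 example the center cell has eight neighbors and each corner has three, so the same raw value of a neighbor contributes less to the lag of a well-connected cell.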
 
12
The southern part of the country is defined as Alabama, Arizona, Arkansas, California, Colorado, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, Nevada, New Mexico, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Utah, Virginia, and West Virginia. The remaining states, excluding Alaska and Hawaii, are assigned to the northern part.
 
13
The Chow test identifies whether there is a statistical difference between the magnitude of regression coefficients in the North and South models.
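For orientation, the classic (nonspatial) form of the Chow statistic compares the residual sum of squares of a pooled model with the sum from the two regional models. The version used with spatial regression models may differ in detail; the subscripts below are notation introduced here, with \( P \) for the pooled model, \( N \) and \( S \) for the North and South models, \( k \) for the number of coefficients in each model, and \( n_N \), \( n_S \) for the regional sample sizes.

\[ F = \frac{\left( RSS_{P} - \left( RSS_{N} + RSS_{S} \right) \right) / k}{\left( RSS_{N} + RSS_{S} \right) / \left( n_{N} + n_{S} - 2k \right)} \]

A large value of \( F \) indicates that allowing the coefficients to differ between the two regions reduces the residual variation by more than chance alone would suggest.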
 
14
Because it is directly related to the standard error of the estimate, in a simple random sample and with variance held constant, the CV is proportional to \( 1/\sqrt{n} \). In a log-log setting, such as the one in our regressions, this means that sample size (hu_respond) should theoretically have an effect of \( -0.5 \) on the CV, because \( 1/\sqrt{n}=1/{n}^{0.5}={n}^{-0.5} \); this is almost exactly what we find in the national regression. The smaller coefficients associated with the other specifications and geographic variation in the coefficient likely reflect differences in the response propensities of those particular subsets of households. These varying response propensities create “design effects.” In other words, the need to adjust for varying response propensities, in a sense, makes the sample less random, thus loosening the theoretical relationship between sample size and CV.
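The theoretical elasticity of \( -0.5 \) can be checked numerically. The sketch below builds hypothetical CVs that follow the simple-random-sample relationship exactly and fits a bivariate log-log regression; the constant 3.0, the sample sizes, and the function names are illustrative assumptions, not values from the article.

```python
import math


def ols_slope(x, y):
    """Slope of a bivariate least-squares fit of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def loglog_elasticity(sample_sizes, cvs):
    """Slope of log(CV) on log(n), i.e., the elasticity of CV to sample size."""
    return ols_slope([math.log(n) for n in sample_sizes],
                     [math.log(cv) for cv in cvs])


sizes = [25, 50, 100, 200, 400, 800]          # hypothetical response counts
cvs = [3.0 / math.sqrt(n) for n in sizes]     # CV proportional to 1/sqrt(n)
slope = loglog_elasticity(sizes, cvs)         # equals -0.5 up to rounding error
```

Any proportionality constant drops into the intercept of the log-log fit, so only departures from simple random sampling can move the slope away from \( -0.5 \).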
 
15
The simulations repeatedly dropped 10 % of the observations and then reestimated the model. The distribution of parameter estimates across the simulations displayed only minor variations in the estimates.
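A robustness check of this kind can be sketched as follows. A bivariate least-squares slope stands in for the article's spatial regression models, and the number of simulations, the seed, and the function names are hypothetical, since the footnote does not report them.

```python
import random


def ols_slope(x, y):
    """Slope of a bivariate least-squares fit of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def drop_and_refit(x, y, n_sims=200, drop_share=0.10, seed=0):
    """Repeatedly drop a random share of observations and re-estimate the slope."""
    rng = random.Random(seed)
    n = len(x)
    keep_n = n - round(n * drop_share)
    slopes = []
    for _ in range(n_sims):
        kept = rng.sample(range(n), keep_n)  # indices retained in this run
        slopes.append(ols_slope([x[i] for i in kept], [y[i] for i in kept]))
    return slopes


# Hypothetical data on an exact line, so every re-estimate recovers the slope.
xs = list(range(50))
ys = [1 + 2 * v for v in xs]
slopes = drop_and_refit(xs, ys, n_sims=20)
spread = max(slopes) - min(slopes)
```

A narrow spread of estimates across runs is what the footnote describes as only minor variation.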
 
Metadata
Title
Spatial Variation in the Quality of American Community Survey Estimates
Authors
David C. Folch
Daniel Arribas-Bel
Julia Koschinsky
Seth E. Spielman
Publication date
19-08-2016
Publisher
Springer US
Published in
Demography / Issue 5/2016
Print ISSN: 0070-3370
Electronic ISSN: 1533-7790
DOI
https://doi.org/10.1007/s13524-016-0499-1
