2021 | Book

# A Course on Small Area Estimation and Mixed Models

## Methods, Theory and Applications in R

Authors: Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza

Book series: Statistics for Social and Behavioral Sciences

### About this book

This advanced textbook explores small area estimation techniques, covers the underlying mathematical and statistical theory, and offers hands-on support with their implementation. It presents the theory in a rigorous way and compares and contrasts various statistical methodologies, helping readers understand how to develop new methodologies for small area estimation. It also includes numerous sample applications of small area estimation techniques. The underlying R code is provided in the text and applied to four datasets that mimic data from labor market and living conditions surveys. The socioeconomic indicators estimated at the small area level include total unemployment, unemployment rates, average annual household incomes, and poverty indicators. Given its scope, the book will be useful for master's and PhD students, and for official and other applied statisticians.

### Table of contents

##### Chapter 1. Small Area Estimation
Abstract
This chapter gives some introductory comments about small area estimation and mixed models. As the book illustrates the statistical methodology with applications of R codes to synthetic data, the chapter describes the structure and contents of the employed data files. All the examples are carried out with two files containing unit-level data and two files containing aggregated data at the domain level.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 2. Design-Based Direct Estimation
Abstract
This chapter gives a short introduction to the survey sampling theory and describes some properties of direct estimators, with special emphasis on estimators of means, totals, and ratios. For each estimator, the design-based expectation and variance are calculated and a direct estimator of the variance is given. The chapter also presents design-based resampling methods for variance estimation. The last section of the chapter provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
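
As a rough illustration of the direct estimators mentioned in this abstract, the following Python sketch computes a Horvitz–Thompson-type total and a Hájek-type mean for one domain from sampling weights. It is not taken from the book, which provides its own R code; the function name `direct_estimates` and the toy data are assumptions made for this example.

```python
def direct_estimates(y, w, in_domain):
    """Weighted direct estimates for a single domain.

    y         : observed values of the target variable in the sample
    w         : sampling weights (inverse inclusion probabilities)
    in_domain : booleans marking the sampled units that belong to the domain
    """
    # Horvitz-Thompson-type estimator of the domain total
    total = sum(wi * yi for yi, wi, d in zip(y, w, in_domain) if d)
    # Estimated domain size (sum of the weights in the domain)
    size = sum(wi for wi, d in zip(w, in_domain) if d)
    # Hajek-type estimator of the domain mean (ratio of the two)
    mean = total / size if size > 0 else float("nan")
    return total, size, mean


# Hypothetical toy sample: y = 1 if unemployed, 0 otherwise
y = [1, 0, 1, 1, 0]
w = [10.0, 20.0, 10.0, 30.0, 30.0]
dom = [True, True, False, True, False]

total, size, mean = direct_estimates(y, w, dom)
print(total, size, mean)
```

Only units sampled in the domain contribute, which is why such estimators become unstable when the domain sample size is small.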
##### Chapter 3. Design-Based Indirect Estimation
Abstract
This chapter presents some design-based indirect small area estimators using different types of auxiliary information. It describes the basic synthetic, the post-stratified, the sample size dependent, and the GREG estimators of domain totals and means. Further, the chapter presents an application to a labor force survey where the different steps in the construction of design-based indirect estimators are illustrated. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
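
The idea behind synthetic estimation can be sketched as follows: group means are estimated from the whole sample and then applied to known population counts of the domain in each group. This Python sketch shows a group-wise synthetic estimator in a form often seen in the small area literature; the book's exact definitions may differ, and the name `synthetic_total` and the toy data are assumptions.

```python
from collections import defaultdict


def synthetic_total(y, w, group, domain_group_sizes):
    """Group-wise synthetic estimate of a domain total.

    y, w               : sample values and sampling weights (whole sample)
    group              : group label (e.g. sex-age class) of each sampled unit
    domain_group_sizes : known population counts of the domain by group
    Note: every group in domain_group_sizes must appear in the sample.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for yi, wi, g in zip(y, w, group):
        num[g] += wi * yi
        den[g] += wi
    # Weighted group means estimated from the whole sample, not the domain
    group_means = {g: num[g] / den[g] for g in den}
    # Apply the overall group means to the domain's population counts
    return sum(n * group_means[g] for g, n in domain_group_sizes.items())


# Hypothetical toy data
y = [1, 0, 1, 1]
w = [10.0, 10.0, 20.0, 20.0]
group = ["a", "a", "b", "b"]
estimate = synthetic_total(y, w, group, {"a": 100, "b": 50})
print(estimate)
```

The estimate borrows strength from all domains, at the price of a possible bias when the domain behaves differently from the overall group means.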
##### Chapter 4. Prediction Theory
Abstract
This chapter introduces the prediction theory for finite populations and studies the problem of predicting linear population parameters under a general linear model. It also proves the general prediction theorem under a superpopulation linear model and derives the best linear unbiased predictors of population totals under some linear models. Some of the obtained predictors are widely used in statistical inference on finite populations. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 5. Linear Models
Abstract
This chapter presents a short introduction to linear regression models with fixed effects. A proposition gives the explicit solutions of the likelihood equations for calculating the maximum likelihood estimators of the model parameters. The chapter also contains a collection of examples based on simple linear regression models. For each considered model, best linear unbiased predictors of domain-level means are obtained. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 6. Linear Mixed Models
Abstract
This chapter introduces linear mixed models, which have wide applicability in small area estimation due to their flexibility in combining different types of information and explaining sources of error. Three of the most used fitting methods are presented under two parametrizations. They allow the calculation of maximum likelihood, residual maximum likelihood, and method of moments estimators. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 7. Nested Error Regression Models
Abstract
This chapter deals with the estimation of the regression and variance component parameters of the nested error regression model. It describes three fitting methods for calculating maximum likelihood, residual maximum likelihood, and method of moments estimators. The chapter gives the derivations of the formulas required for programming the fitting algorithms. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
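
For orientation, the nested error regression model is commonly written as below in the small area literature. The notation is an assumption chosen for this sketch and may differ from the one used in the book: $y_{dj}$ is the target variable of unit $j$ in domain $d$, $x_{dj}$ a vector of auxiliary variables, $u_d$ a domain random effect, and $e_{dj}$ a unit-level error.

$$
y_{dj} = x_{dj}^{\top}\beta + u_d + e_{dj}, \qquad j = 1,\dots,n_d,\quad d = 1,\dots,D,
$$

$$
u_d \sim N(0,\sigma_u^2), \qquad e_{dj} \sim N(0,\sigma_e^2), \qquad \text{all mutually independent.}
$$

The fitting methods named in the abstract estimate the regression parameter $\beta$ and the variance components $\sigma_u^2$ and $\sigma_e^2$.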
##### Chapter 8. EBLUPs Under Nested Error Regression Models
Abstract
This chapter treats the problem of predicting linear combinations of components of a finite population random vector. The linear parameters have the form of weighted sums with known non-negative weights. By assuming that the population target vector follows a nested error regression model, this chapter introduces empirical best linear unbiased predictors and model-assisted estimators of small area linear parameters and derives the expressions of the predictors of a single observation and of a domain mean. It also describes a parametric bootstrap algorithm for estimating the mean squared errors. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 9. Mean Squared Error of EBLUPs
Abstract
This chapter treats the problem of approximating and estimating the mean squared error of empirical best linear unbiased predictors of small area linear parameters under linear mixed models. This is done in several steps: first, when all the model parameters are known; second, when only the variance component parameters are unknown; third, and last, when all the parameters are unknown. The chapter gives the final expression of an analytic estimator of the mean squared error and its particularization to the nested error regression model. The last section of the chapter provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 10. EBPs Under Nested Error Regression Models
Abstract
This chapter derives empirical best predictors of additive parameters based on nested error regression models and pays special attention to the prediction of mean incomes, poverty proportions, and poverty gaps in small areas. The chapter calculates the distribution of the non-observed part of the population target vector conditional on its observed part. Based on the conditional distribution, the empirical best predictors are derived. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 11. EBLUPs Under Two-Fold Nested Error Regression Models
Abstract
This chapter introduces the Henderson 3, maximum likelihood, and residual maximum likelihood methods for estimating the regression and variance component parameters of the two-fold nested error regression model. It derives the empirical best linear unbiased predictors of population linear parameters and approximates the mean squared error. The chapter presents simulation experiments and provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 12. EBPs Under Two-Fold Nested Error Regression Models
Abstract
This chapter derives empirical best predictors of additive parameters based on two-fold nested error regression models and pays special attention to the prediction of mean incomes, poverty proportions, and poverty gaps in small areas. The chapter calculates the distribution of the non-observed part of the population target vector conditional on its observed part. Based on the conditional distribution, the empirical best predictors are derived. A parametric bootstrap algorithm is recommended for estimating the mean squared errors. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 13. Random Regression Coefficient Models
Abstract
This chapter describes a modification of the nested error regression model that has random regression coefficients. Intuitively, the slope parameter of an explanatory variable need not be constant and may take different values in different domains. The random regression coefficient model gives a practical solution to this problem by assuming that the beta parameters are random, which provides a more flexible way of modeling. The chapter gives fitting algorithms, derives the empirical best predictors of domain means, and approximates the mean squared errors. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 14. EBPs Under Unit-Level Logit Mixed Models
Abstract
Binomial-logit mixed models are generalized linear mixed models for dichotomous or count variables. By introducing random effects, they take into account the between-domain variability that is not explained by the auxiliary variables. For fitting the model, this chapter describes the method of simulated moments and also introduces the EM and the ML-Laplace approximation algorithms. The chapter presents several model-based predictors of population-based and model-based parameters and treats the problem of MSE estimation by parametric bootstrap. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 15. EBPs Under Unit-Level Two-Fold Logit Mixed Models
Abstract
Unit-level two-fold logit mixed models can be used to estimate proportions at different levels of aggregation or to study the temporal behavior of domain proportions. These models have random effects that take into account the between-domain and the between-subdomain (within-domain) variability that is not explained by the auxiliary variables. The chapter derives the Laplace approximation algorithm for calculating the maximum likelihood estimators of the model parameters and introduces empirical best and plug-in predictors for estimating domain proportions at different levels of aggregation. The chapter presents simulation experiments and provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 16. Fay–Herriot Models
Abstract
Area-level linear mixed models are useful tools for analyzing aggregated data. Because of its applicability and interpretability, the Fay–Herriot model is the basic area-level linear mixed model for small area estimation. For these models, the chapter derives the best linear unbiased predictor of a linear combination of fixed and random effects. The chapter shows how to incorporate the sampling error variances into the Fay–Herriot model, presents four methods for estimating the model parameters, and gives predictors of population quantities. It also approximates the corresponding mean squared errors and develops a hierarchical Bayes approach. The last section of the chapter provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
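
To make the shrinkage idea behind the Fay–Herriot predictor concrete, the Python sketch below computes the best linear unbiased predictor when the random effect variance is treated as known. It is a simplified illustration with a single scalar covariate and is not the book's R code; the function name `fay_herriot_blup`, the argument names, and the toy numbers are assumptions. In practice the variance is estimated, which gives the empirical version of the predictor.

```python
def fay_herriot_blup(y, x, sampling_var, sigma2_u):
    """BLUP under a Fay-Herriot model with one scalar covariate.

    Assumed model: y_d = x_d * beta + u_d + e_d, with var(u_d) = sigma2_u
    (treated as known here) and var(e_d) = sampling_var[d] (known).
    """
    # Weighted least squares estimate of beta, weights 1 / (sigma2_u + var_d)
    wts = [1.0 / (sigma2_u + v) for v in sampling_var]
    beta = (sum(wt * xi * yi for wt, xi, yi in zip(wts, x, y))
            / sum(wt * xi * xi for wt, xi in zip(wts, x)))
    # Shrinkage factors: close to 1 when the direct estimate is precise
    gammas = [sigma2_u / (sigma2_u + v) for v in sampling_var]
    # Weighted combination of direct estimate and regression-synthetic part
    preds = [g * yi + (1.0 - g) * xi * beta
             for g, yi, xi in zip(gammas, y, x)]
    return beta, gammas, preds


# Hypothetical toy data: two domains, intercept-only model (x_d = 1)
y = [10.0, 14.0]             # direct estimates
x = [1.0, 1.0]
sampling_var = [1.0, 3.0]    # known sampling error variances
beta, gammas, preds = fay_herriot_blup(y, x, sampling_var, sigma2_u=1.0)
print(beta, gammas, preds)
```

The second domain has the larger sampling variance, so its prediction is pulled more strongly toward the regression part than that of the first domain.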
##### Chapter 17. Area-Level Temporal Linear Mixed Models
Abstract
This chapter deals with two temporal extensions of the basic Fay–Herriot area-level model. The first model assumes that the domain-time random effects are independent. The second model assumes an autoregressive correlation structure across time within each domain. For the two models, the residual maximum likelihood fitting algorithm is given and the empirical best linear unbiased predictors of domain means are derived. Further, the problem of estimating the mean squared errors is also addressed. The last section provides R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 18. Area-Level Spatio-Temporal Linear Mixed Models
Abstract
This chapter describes an area-level linear mixed model with a simultaneous autoregressive vector of area random effects. This model takes into account the spatial correlation among data from different areas to borrow additional strength from the areas. The chapter also presents spatio-temporal models that take into account the temporal and spatial correlations for improving the predictions. The chapter gives a residual maximum likelihood Fisher-scoring fitting algorithm, derives the EBLUPs of linear parameters, presents simulation results, and gives some R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 19. Area-Level Bivariate Linear Mixed Models
Abstract
This chapter describes the bivariate Fay–Herriot model under complete parametrization. It gives the Fisher-scoring algorithms for calculating the maximum likelihood and the residual maximum likelihood estimators of the model parameters, and it approximates the matrices of mean squared crossed errors of the empirical best linear unbiased predictors of population linear parameters. Some simulations and R codes illustrate the theoretical developments.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 20. Area-Level Poisson Mixed Models
Abstract
This chapter describes the area-level Poisson mixed model in the framework of small area estimation. It also presents several fitting methods for estimating the model parameters, namely, the method of moments, the penalized quasi-likelihood approach, and the EM and the Laplace approximation algorithms for calculating maximum likelihood estimators. The chapter derives empirical best and plug-in predictors of domain counts and proportions and gives some R codes.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Chapter 21. Area-Level Temporal Poisson Mixed Models
Abstract
This chapter presents temporal Poisson mixed models for estimating domain counts and proportions. The first Poisson model uses independent time random effects. The second model assumes that time random effects follow an autoregressive process of order one. The mathematical developments are only presented for the first model. The chapter presents the algebraic calculations required for programming the ML-Laplace approximation algorithm. It also gives empirical best predictors of domain proportions and counts. For estimating the mean squared error of the EBP, a parametric bootstrap estimator is introduced. Finally, some R codes are given and applied to simulated data.
Domingo Morales, María Dolores Esteban, Agustín Pérez, Tomáš Hobza
##### Backmatter
Title
A Course on Small Area Estimation and Mixed Models
Authors
Domingo Morales
María Dolores Esteban
Agustín Pérez
Tomáš Hobza