Published in: Empirical Economics 1/2021

14-05-2020

Does the choice of balance-measure matter under genetic matching?

Authors: Adeola Oyenubi, Martin Wittenberg

Abstract

In applied studies, the influence of balance measures on the performance of matching estimators is often taken for granted. This paper considers the performance of different balance measures that have been used in the literature when balance is being optimized. We also propose the use of the entropy measure in assessing balance. To examine the effect of balance measures, we conduct a series of simulation studies in which we optimize balance using a genetic algorithm (GenMatch). We find that balance measures do influence matching estimates under the GenMatch algorithm: the bias and root-mean-square error (RMSE) of the estimated treatment effect vary with the choice of balance measure. While the performance of different balance measures varies across simulation designs, some patterns do emerge. In an artificial data generating process (DGP) with one covariate, the proposed entropy balance measure has the lowest RMSE. However, in more realistic DGPs with many covariates, the standardized difference in means appears to be a very robust measure of balance: it either dominates the other measures or comes in as a close second in terms of bias and/or RMSE. The implication of these results is that the sensitivity of matching estimates to the choice of balance measure should be given greater attention in empirical studies.
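As an illustration of one of the balance measures named in the abstract, the following minimal sketch computes the standardized difference in means for a single covariate. It assumes the common pooled-standard-deviation definition, which may differ from the exact variant used in the paper; the function name and sample data are hypothetical.

```python
import math
import statistics


def standardized_difference(treated, control):
    """Absolute standardized difference in means for one covariate.

    Assumption: the denominator is the square root of the average of the
    two sample variances (a common convention, not necessarily the paper's).
    """
    mean_t = statistics.fmean(treated)
    mean_c = statistics.fmean(control)
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2.0
    )
    if pooled_sd == 0.0:
        # Both groups are constant: balanced only if the constants coincide.
        return 0.0 if mean_t == mean_c else math.inf
    return abs(mean_t - mean_c) / pooled_sd


# Hypothetical covariate values for treated and matched control units.
treated = [2.0, 3.0, 4.0, 5.0]
control = [1.0, 2.0, 3.0, 4.0]
print(standardized_difference(treated, control))
```

Smaller values indicate better balance on that covariate; a value of zero means the two group means coincide.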

Footnotes
1
The traditional implementation of propensity score matching requires discussion of the propensity score specification (see Lehrer and Kordas 2013 and their discussion on estimation of propensity scores). This, in itself, may affect the results, apart from the impact of balance measures. Here we fix the matching method and check if the result is sensitive to the choice of balance measure.
 
2
This balance measure, the entropic distance metric between distributions, was originally proposed by Oyenubi (2018) in his PhD thesis, written under the supervision of Prof. Martin Wittenberg and Prof. Patrizio Piraino.
 
3
This number is called the population size in genetic algorithms.
 
4
The fitness function can either minimize the mean of the balance statistics across all covariates or perform lexical optimization (see Diamond and Sekhon 2013).
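The two fitness options mentioned in this footnote can be contrasted with a small sketch. It assumes that each candidate solution is summarized by one balance statistic per covariate and that smaller values mean better balance; under that assumption, lexical optimization is illustrated as comparing the worst statistic first, then the second worst, and so on. The function names and numbers are hypothetical, and this is not the GenMatch implementation.

```python
def mean_fitness(balance_stats):
    """Average balance statistic across covariates (lower is better)."""
    return sum(balance_stats) / len(balance_stats)


def lexical_key(balance_stats):
    """Sort key for lexical comparison: worst statistic first.

    Python compares tuples element by element, so the candidate with the
    smaller worst-case imbalance wins; ties move to the next-worst value.
    """
    return tuple(sorted(balance_stats, reverse=True))


# Hypothetical imbalance statistics for three covariates.
candidate_a = [0.30, 0.01, 0.01]  # one badly balanced covariate
candidate_b = [0.12, 0.11, 0.10]  # evenly moderate imbalance

best_by_mean = min([candidate_a, candidate_b], key=mean_fitness)
best_by_lexical = min([candidate_a, candidate_b], key=lexical_key)

print("mean criterion prefers:", best_by_mean)
print("lexical criterion prefers:", best_by_lexical)
```

In this example the mean criterion picks the first candidate despite its one poorly balanced covariate, while the lexical criterion picks the second because its worst covariate is better balanced.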
 
5
Selection favors good solutions, making them more likely to enter the next generation of solutions (the offspring population). Crossover combines two or more current solutions to form a new solution (an offspring in the new population). Mutation encourages diversity among solutions by randomly changing parts of a candidate solution in the current population to produce new solutions. See Mebane and Sekhon (2011) for more details.
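The three operators described in this footnote can be sketched as a toy genetic algorithm. This is a simplified illustration under stated assumptions, not the algorithm used by GenMatch: candidates are plain lists of numbers, the fitness function is a hypothetical stand-in for a balance measure (lower is better), and the best candidate is always carried over unchanged.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Minimize `fitness` with selection, crossover and mutation."""
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 10.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    initial_best = min(fitness(ind) for ind in population)

    for _ in range(generations):
        # Selection: rank by fitness and keep the better half as parents.
        population.sort(key=fitness)
        parents = population[: pop_size // 2]

        # Carry the current best over unchanged so the best score never worsens.
        offspring = [list(parents[0])]
        while len(offspring) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: each gene is copied from one of the two parents.
            child = [m if rng.random() < 0.5 else f
                     for m, f in zip(mother, father)]
            # Mutation: occasionally perturb a gene to keep diversity.
            child = [g + rng.gauss(0.0, 1.0) if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        population = offspring

    best = min(population, key=fitness)
    return best, fitness(best), initial_best


def toy_fitness(weights):
    """Hypothetical objective: squared distance to an arbitrary target."""
    target = [1.0, 2.0, 3.0]
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best, best_score, initial_best = evolve(toy_fitness, n_genes=3)
print(best_score <= initial_best)
```

In genetic matching the candidates would instead be covariate weights and the fitness would be a balance measure evaluated on the matched sample.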
 
6
We do not expect this extreme case in our context because of the common support assumption. However, in finite samples there may be areas of thin or no support, which this measure is useful for detecting.
 
7
The (average) treatment to control ratio for Simulation 1 is displayed in online Appendix B. For Simulation 2, the probability of treatment assignment at the average value of the covariates is approximately 0.5.
 
8
i.e., \(\%\,\text{Bias} = \left| \frac{\frac{1}{500}\sum_{i=1}^{500} \left( \hat{\theta}_{i} - \theta \right)}{\theta} \right| \times 100\), where \(\theta\) is the true ATT and \(\hat{\theta}_{i}\) is the estimate of \(\theta\) in iteration \(i\) of the simulations. The RMSE is given by \(\sqrt{\frac{1}{500}\sum_{i=1}^{500} \left( \hat{\theta}_{i} - \theta \right)^{2}}\). For Simulation 1, \(\theta\) is replaced with the absolute value of the minimum bias estimate.
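The two performance metrics in this footnote can be computed as in the following sketch. It assumes a list of per-replication ATT estimates and a known, non-zero true ATT; RMSE is taken as the square root of the mean squared deviation, following the usual definition. The function names and the four sample estimates are hypothetical (the paper uses 500 replications).

```python
import math


def percent_bias(estimates, theta):
    """Absolute mean estimation error relative to theta, in percent."""
    mean_error = sum(e - theta for e in estimates) / len(estimates)
    return abs(mean_error / theta) * 100.0


def rmse(estimates, theta):
    """Root-mean-square error of the estimates around theta."""
    mean_sq_error = sum((e - theta) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mean_sq_error)


# Hypothetical ATT estimates from four simulation replications.
estimates = [1.8, 2.1, 2.3, 1.9]
theta = 2.0  # assumed true ATT

print(percent_bias(estimates, theta))
print(rmse(estimates, theta))
```

Percent bias lets positive and negative errors offset each other across replications, whereas RMSE penalizes every deviation, so an estimator can score well on one metric and poorly on the other.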
 
9
Note that for Simulations 2 and 3 we use a subset of the balance measures used in Simulation 1. Since Simulations 2 and 3 have both discrete and continuous covariates, only balance measures that can handle both kinds of variables can be used.
 
10
This is because the mean and variance are sufficient to identify a normal distribution. Measures that look beyond these two quantities will, therefore, be inefficient.
 
11
The only exception is the default balance measure in Simulation 3 under bias correction and φ = −1.5.
 
Literature
Austin PC (2009) Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat Med 28(25):3083–3107
Belitser SV, Martens EP, Pestman WR, Groenwold RH, Boer A, Klungel OH (2011) Measuring balance and model selection in propensity score methods. Pharmacoepidemiol Drug Saf 20(11):1115–1129
Busso M, DiNardo J, McCrary J (2014) New evidence on the finite sample properties of propensity score reweighting and matching estimators. Rev Econ Stat 96(5):885–897
Caliendo M, Kopeinig S (2008) Some practical guidance for the implementation of propensity score matching. J Econ Surv 22(1):31–72
Carr J (2014) An introduction to genetic algorithms. Sr Proj 1:40
Casella G, Berger RL (2002) Statistical inference, vol 2. Duxbury, Pacific Grove
Crump RK, Hotz VJ, Imbens GW, Mitnik OA (2009) Dealing with limited overlap in estimation of average treatment effects. Biometrika 96(1):187–199
Dehejia RH, Wahba S (1999) Causal effects in nonexperimental studies: reevaluating the evaluation of training programs. J Am Stat Assoc 94(448):1053–1062
Diamond A, Sekhon JS (2013) Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies. Rev Econ Stat 95(3):932–945
Frölich M (2004) Finite-sample properties of propensity-score matching and weighting estimators. Rev Econ Stat 86(1):77–90
Granger C, Maasoumi E, Racine J (2004) A dependence metric for possibly nonlinear processes. J Time Ser Anal 25(5):649–669
Hainmueller J (2012) Entropy balancing for causal effects: a multivariate reweighting method to produce balanced samples in observational studies. Polit Anal 20(1):25–46
Ho DE, Imai K, King G, Stuart EA (2007) Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Polit Anal 15:199–236
Hoaglin DC, Mosteller F, Tukey JW (1983) Understanding robust and exploratory data analysis, vol 3. Wiley, New York
Huber M (2009) Testing for covariate balance using nonparametric quantile regression and resampling methods. Unpublished working and discussion papers
Huber M, Lechner M, Wunsch C (2013) The performance of estimators based on the propensity score. J Econom 175(1):1–21
Iacus SM, King G, Porro G, Katz JN (2012) Causal inference without balance checking: coarsened exact matching. Polit Anal 20(1):1–24
Imai K, Ratkovic M (2014) Covariate balancing propensity score. J R Stat Soc Ser B (Stat Methodol) 76(1):243–263
Imai K, King G, Stuart EA (2008) Misunderstandings between experimentalists and observationalists about causal inference. J R Stat Soc Ser A (Stat Soc) 171(2):481–502
Khan S, Tamer E (2010) Irregular identification, support conditions, and inverse weight estimation. Econometrica 78(6):2021–2042
King G, Lucas C, Nielsen RA (2017) The balance-sample size frontier in matching methods for causal inference. Am J Polit Sci 61(2):473–489
Kinnear Jr KE (1994) A perspective on the work in this book. In: Advances in genetic programming, p 3–19
Kvam PH, Vidakovic B (2007) Nonparametric statistics with applications to science and engineering, vol 653. Wiley, New York
LaLonde RJ (1986) Evaluating the econometric evaluations of training programs with experimental data. Am Econ Rev 76(4):604–620
Lechner M, Strittmatter A (2019) Practical procedures to deal with common support problems in matching estimation. Econom Rev 38(2):1–15
Lee BK, Lessler J, Stuart EA (2010) Improving propensity score weighting using machine learning. Stat Med 29(3):337–346
Lehrer SF, Kordas G (2013) Matching using semiparametric propensity scores. Empir Econ 44(1):13–45
Maasoumi E, Racine JS (2008) A robust entropy-based test of asymmetry for discrete and continuous processes. Econom Rev 28(1–3):246–261
Maasoumi E, Wang L (2019) The gender gap between earnings distributions. J Polit Econ 127(5):2438–2504
Mitchell M (1998) An introduction to genetic algorithms. MIT Press, Cambridge
Oyenubi A (2018) Quantifying balance for causal inference: an information-theoretic perspective. Doctoral dissertation, University of Cape Town
Parizzi A, Brcic R (2011) Adaptive InSAR stack multilooking exploiting amplitude statistics: a comparison between different techniques and practical results. IEEE Geosci Remote Sens Lett 8(3):441–445
Sekhon JS (2011) Multivariate and propensity score matching software with automated balance optimization: the Matching package for R
Setoguchi S, Schneeweiss S, Brookhart MA, Glynn RJ, Cook EF (2008) Evaluating uses of data mining techniques in propensity score estimation: a simulation study. Pharmacoepidemiol Drug Saf 17(6):546–555
Tan YP, Nagamani J, Lu H (2003) Modified Kolmogorov–Smirnov metric for shot boundary detection. Electron Lett 39(18):1313–1315
Zhao Z (2004) Using matching to estimate treatment effects: data requirements, matching metrics, and Monte Carlo evidence. Rev Econ Stat 86(1):91–107
Metadata
Title
Does the choice of balance-measure matter under genetic matching?
Authors
Adeola Oyenubi
Martin Wittenberg
Publication date
14-05-2020
Publisher
Springer Berlin Heidelberg
Published in
Empirical Economics / Issue 1/2021
Print ISSN: 0377-7332
Electronic ISSN: 1435-8921
DOI
https://doi.org/10.1007/s00181-020-01873-9