RKHS-based covariate balancing for survival causal effect estimation

Authors: Wu Xue, Xiaoke Zhang, Kwun Chuen Gary Chan, Raymond K. W. Wong

Published in: Lifetime Data Analysis | Issue 1/2024


Estimating survival causal effects from right-censored data is of key interest in both survival analysis and causal inference. Propensity score weighting is one of the most popular methods in the literature. However, because it relies on the inverse of estimated propensity scores, its practical performance can be highly unstable, especially when covariate overlap between the treatment and control groups is limited. To address this problem, this paper develops a covariate balancing method for estimating the counterfactual survival function. The proposed method is nonparametric and balances covariates in a reproducing kernel Hilbert space (RKHS) via weights that are counterparts of inverse propensity scores. The uniform rate of convergence of the proposed estimator is shown to be the same as that of the classical Kaplan–Meier estimator. The appealing practical performance of the proposed method is demonstrated by a simulation study and by two real-data applications: the causal effect of smoking on the survival time of stroke patients, and the causal effect of endotoxin on the survival time of female patients with lung cancer.
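
To make the role of the weights concrete, the following is a minimal sketch and not the authors' estimator: a Kaplan–Meier curve in which each subject carries a nonnegative weight. The function name `weighted_kaplan_meier` and the toy data are assumptions introduced for illustration. In practice the weights would be inverse propensity scores or balancing weights of the kind the paper proposes; neither is computed here. With all weights equal to one, the sketch reduces to the classical Kaplan–Meier estimator.

```python
import numpy as np


def weighted_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier survival estimate at each distinct event time.

    times   : observed follow-up times (event or censoring time)
    events  : 1 if the event was observed, 0 if right-censored
    weights : nonnegative weight per subject (hypothetical input, e.g.
              inverse propensity scores or balancing weights)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    weights = np.asarray(weights, dtype=float)

    event_times = np.unique(times[events == 1])
    surv = np.empty(len(event_times))
    s = 1.0
    for k, t in enumerate(event_times):
        # Weighted number of subjects still under observation just before t.
        at_risk = weights[times >= t].sum()
        # Weighted number of observed events exactly at t.
        failed = weights[(times == t) & (events == 1)].sum()
        if at_risk > 0:
            s *= 1.0 - failed / at_risk
        surv[k] = s
    return event_times, surv


# Toy data (illustrative only): five subjects, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
t_unit, s_unit = weighted_kaplan_meier(toy_times, toy_events, np.ones(5))
print(t_unit, s_unit)
```

Because each factor is a ratio of weighted sums, rescaling all weights by a common constant leaves the curve unchanged. A single very large weight, as can arise from a propensity score close to zero, dominates those ratios, which is the instability the abstract refers to.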

Publisher: Springer US
Print ISSN: 1380-7870
Electronic ISSN: 1572-9249
