Published in: Lifetime Data Analysis 2/2022

03-03-2022

Bayesian penalized Buckley-James method for high dimensional bivariate censored regression models

Authors: Wenjing Yin, Sihai Dave Zhao, Feng Liang


Abstract

For high dimensional gene expression data, one important goal is to identify a small number of genes that are associated with disease progression or patient survival. In this paper, we consider the problem of variable selection for multivariate survival data. We propose an estimation procedure for high dimensional accelerated failure time (AFT) models with bivariate censored data. The method extends the Buckley-James method by minimizing a penalized \(L_2\) loss function whose penalty is induced by a bivariate spike-and-slab prior specification. In the proposed algorithm, censored observations are imputed using the Kaplan-Meier estimator, which avoids a parametric assumption on the error terms. Our empirical studies demonstrate that the proposed method outperforms alternative procedures designed for univariate survival data, regardless of whether the true events are correlated, and that it provides a formal way of handling bivariate survival data in AFT models. Findings from the analysis of a myeloma clinical trial using the proposed method are also presented.
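
To make the imputation step concrete, the following is a minimal sketch of the classical univariate Buckley-James idea on which the abstract builds. It is not the authors' bivariate procedure: the function names (`km_masses`, `bj_impute`, `bj_fit`) are hypothetical, a plain ridge penalty stands in for the spike-and-slab-induced penalty, there is no intercept, and the largest residual is treated as an event, a common convention that makes the Kaplan-Meier masses sum to one.

```python
import numpy as np

def km_masses(resid, delta):
    """Kaplan-Meier probability mass placed on each residual.

    delta[i] is 1 for an observed event and 0 for a censored observation.
    Censored residuals receive zero mass, except that the largest residual
    is treated as an event (assumed convention).
    """
    resid = np.asarray(resid, dtype=float)
    delta = np.asarray(delta).astype(int)
    n = len(resid)
    # Sort by residual; at ties, events come before censored observations.
    order = np.lexsort((1 - delta, resid))
    d = delta[order].astype(float)
    d[-1] = 1.0
    at_risk = n - np.arange(n)
    # Survival probability just before each sorted residual.
    surv_before = np.concatenate(([1.0], np.cumprod(1.0 - d / at_risk)[:-1]))
    mass = np.empty(n)
    mass[order] = surv_before * d / at_risk
    return mass

def bj_impute(y, delta, X, beta):
    """Replace each censored response by fitted value + E[residual | residual > own]."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta).astype(int)
    fitted = X @ beta
    resid = y - fitted
    mass = km_masses(resid, delta)
    y_star = y.copy()
    for i in np.where(delta == 0)[0]:
        later = resid > resid[i]
        tail = mass[later].sum()
        if tail > 0:  # a censored largest residual is left unchanged
            y_star[i] = fitted[i] + (mass[later] * resid[later]).sum() / tail
    return y_star

def bj_fit(y, delta, X, lam=1.0, n_iter=20):
    """Alternate imputation and a ridge-penalized least squares update.

    The ridge penalty is an illustrative stand-in, not the paper's penalty.
    """
    p = X.shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        y_star = bj_impute(y, delta, X, beta)
        beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y_star)
    return beta
```

Here `y` would hold the log of the observed time (the minimum of the event and censoring times) and `X` the covariates, such as gene expression values. Because the imputation depends only on the Kaplan-Meier estimate of the residual distribution, no parametric error model is needed, which is the property the abstract highlights.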

Metadata
Title
Bayesian penalized Buckley-James method for high dimensional bivariate censored regression models
Authors
Wenjing Yin
Sihai Dave Zhao
Feng Liang
Publication date
03-03-2022
Publisher
Springer US
Published in
Lifetime Data Analysis / Issue 2/2022
Print ISSN: 1380-7870
Electronic ISSN: 1572-9249
DOI
https://doi.org/10.1007/s10985-022-09549-5
