Published in: Lifetime Data Analysis 4/2023

02-07-2023

Quantile forward regression for high-dimensional survival data

Authors: Eun Ryung Lee, Seyoung Park, Sang Kyu Lee, Hyokyoung G. Hong

Abstract

Despite the urgent need for an effective prediction model tailored to individual interests, existing models have mainly been developed for the mean outcome, targeting average people. Additionally, the direction and magnitude of covariates’ effects on the mean outcome may not hold across different quantiles of the outcome distribution. To accommodate the heterogeneous characteristics of covariates and provide a flexible risk model, we propose a quantile forward regression model for high-dimensional survival data. Our method selects variables by maximizing the likelihood of the asymmetric Laplace distribution (ALD) and derives the final model based on the extended Bayesian Information Criterion (EBIC). We demonstrate that the proposed method enjoys a sure screening property and selection consistency. We apply it to the national health survey dataset to show the advantages of a quantile-specific prediction model. Finally, we discuss potential extensions of our approach, including the nonlinear model and the globally concerned quantile regression coefficients model.
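
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. Censoring is ignored, so the outcome is treated as fully observed. The quantile fit uses a simple iteratively reweighted least squares approximation. The EBIC penalty uses the common simplified form k log n + 2γ k log p, which may differ from the criterion in the paper. All function names (`check_loss`, `fit_quantile`, `ebic`, `quantile_forward`) and the simulated data are hypothetical. The sketch relies on the fact that, for an ALD with the scale profiled out, maximizing the likelihood is equivalent to minimizing the mean check loss.

```python
import numpy as np

def check_loss(r, tau):
    """Check loss rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (r < 0))

def fit_quantile(Z, y, tau, n_iter=50, eps=1e-6):
    """Approximate linear quantile regression by iteratively reweighted
    least squares (an illustrative approximation, not an exact LP solver)."""
    beta = np.linalg.lstsq(Z, y, rcond=None)[0]
    ridge = 1e-10 * np.eye(Z.shape[1])  # tiny ridge for numerical stability
    for _ in range(n_iter):
        r = y - Z @ beta
        # Weights chosen so that w * r**2 approximates the check loss.
        w = np.where(r >= 0, tau, 1 - tau) / np.maximum(np.abs(r), eps)
        Zw = Z * w[:, None]
        beta = np.linalg.solve(Z.T @ Zw + ridge, Zw.T @ y)
    return beta

def ebic(X, y, cols, tau, gamma=1.0):
    """Assumed EBIC: -2 * profile ALD log-likelihood (up to a constant)
    plus k*log(n) + 2*gamma*k*log(p)."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta = fit_quantile(Z, y, tau)
    sigma_hat = check_loss(y - Z @ beta, tau).mean()  # ALD scale MLE
    k = len(cols)
    return 2 * n * np.log(sigma_hat) + k * np.log(n) + 2 * gamma * k * np.log(p)

def quantile_forward(X, y, tau, max_steps=5, gamma=1.0):
    """Add one covariate per step (largest ALD likelihood gain), then keep
    the prefix of the forward path with the smallest EBIC."""
    p = X.shape[1]
    path, scores = [], [ebic(X, y, [], tau, gamma)]
    for _ in range(max_steps):
        candidates = [j for j in range(p) if j not in path]
        # Same model size for all candidates, so min EBIC = max likelihood.
        cand_scores = [ebic(X, y, path + [j], tau, gamma) for j in candidates]
        best = int(np.argmin(cand_scores))
        path.append(candidates[best])
        scores.append(cand_scores[best])
    return path[: int(np.argmin(scores))]

# Simulated, uncensored toy data: only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
n, p = 200, 30
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * rng.normal(size=n)

selected = quantile_forward(X, y, tau=0.5)
print("selected covariates at tau=0.5:", selected)
```

Running the same function at several values of `tau` gives one selected set per quantile, which is the sense in which the resulting prediction model is quantile-specific. Handling censored survival times would require an additional adjustment that this sketch does not attempt.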

Metadata
Title
Quantile forward regression for high-dimensional survival data
Authors
Eun Ryung Lee
Seyoung Park
Sang Kyu Lee
Hyokyoung G. Hong
Publication date
02-07-2023
Publisher
Springer US
Published in
Lifetime Data Analysis / Issue 4/2023
Print ISSN: 1380-7870
Electronic ISSN: 1572-9249
DOI
https://doi.org/10.1007/s10985-023-09603-w
