Abstract
We propose a Bayesian implementation of lasso regression that accomplishes both shrinkage and variable selection. We focus on the appropriate specification of the shrinkage parameter λ through Bayes factors that evaluate the inclusion of each covariate in the model formulation. We associate this parameter with the values of the Pearson and partial correlations at the boundary between significance and insignificance, as defined by Bayes factors. This gives λ a meaningful interpretation and leads to a simple specification of the parameter. Moreover, we use these values to specify the parameters of a gamma hyperprior for λ. The hyperprior parameters are elicited so that appropriate levels of practical significance of the Pearson correlation are achieved while, at the same time, prior support is avoided for values of λ that activate the Lindley-Bartlett paradox or lead to over-shrinkage of the model coefficients. The proposed method is illustrated using two simulation studies and a real dataset. For the first simulation study, we present results for different prior values of λ, together with a detailed robustness analysis with respect to the hyperprior parameters of λ. In all examples, we present detailed comparisons with a variety of ordinary and Bayesian lasso methods.
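To make the link between λ and a correlation threshold concrete, the following is a minimal sketch for the ordinary (non-Bayesian) lasso with a single standardized covariate. It is not the Bayes factor procedure summarized above. It assumes the objective 0.5·Σ(y − xβ)² + λ|β| with x and y standardized to unit sample variance, in which case the estimate is a soft-thresholded Pearson correlation and the covariate is dropped whenever |r| ≤ λ/(n − 1). The function names and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_single_covariate(x, y, lam):
    """Minimise 0.5 * sum((ys - xs * b)**2) + lam * |b| for one covariate.

    xs and ys are x and y standardized to zero mean and unit sample
    variance (an assumption of this sketch), so that xs'xs = n - 1 and
    xs'ys = (n - 1) * r, where r is the Pearson correlation.
    Returns (r, b).
    """
    xs = (x - x.mean()) / x.std(ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    n = len(xs)
    r = float(xs @ ys) / (n - 1)
    # Closed-form minimiser: soft-threshold r at lam / (n - 1).
    b = float(soft_threshold(r, lam / (n - 1)))
    return r, b


if __name__ == "__main__":
    # Simulated data, used only to exercise the function.
    rng = np.random.default_rng(0)
    x = rng.normal(size=50)
    y = 0.5 * x + rng.normal(size=50)
    for lam in (0.0, 5.0, 50.0):
        r, b = lasso_single_covariate(x, y, lam)
        print(f"lambda={lam:5.1f}  r={r:.3f}  beta={b:.3f}  "
              f"cutoff |r| <= {lam / (len(x) - 1):.3f}")
```

In this simplified setting, choosing λ is equivalent to choosing the smallest absolute correlation that keeps a covariate in the model, which is the kind of interpretation the abstract describes for the Bayesian case.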
Cite this article
Lykou, A., Ntzoufras, I. On Bayesian lasso variable selection and the specification of the shrinkage parameter. Stat Comput 23, 361–390 (2013). https://doi.org/10.1007/s11222-012-9316-x