Much of the theory and many of the computational algorithms for ℓ₁-penalized methods in the high-dimensional context have been developed for convex loss functions, e.g., the squared error loss for linear models (Chapters 2 and 6) or the negative log-likelihood in a generalized linear model (Chapters 3 and 6). However, there are many models where the negative log-likelihood is a non-convex function. Important examples include mixture models and linear mixed effects models, which we describe in more detail. Both address, in different ways, the issue of modeling a grouping structure among the observations, a quite common feature in complex situations. We discuss in this chapter how to deal with non-convex but smooth ℓ₁-penalized likelihood problems. Regarding computation, we can typically find only a local optimum of the corresponding non-convex optimization problem, whereas the theory is given for the estimator defined by a global optimum. Particularly in high-dimensional problems, it is difficult to compute a global optimum, and it would be desirable to have some theoretical properties of estimators arising from “reasonable” local optima. However, this is largely an open problem.