Unimodal regression via prefix isotonic regression
Introduction
Given univariate real data values y_1, …, y_n with nonnegative real weights w_1, …, w_n, where n ≥ 1, and given an L_p metric, 1 ≤ p < ∞, the isotonic regression of the data is the set of values ŷ_1, …, ŷ_n that minimizes
( ∑_{i=1}^{n} w_i |y_i − ŷ_i|^p )^{1/p}    (1)
subject to the increasing isotonic constraint that ŷ_1 ≤ ŷ_2 ≤ ⋯ ≤ ŷ_n (for the L_∞ metric the error is max_i w_i |y_i − ŷ_i|). Note that the values are merely required to be nondecreasing, rather than strictly increasing. The unimodal regression of the data is the set of values ŷ_1, …, ŷ_n that minimizes Eq. (1) subject to the unimodal constraint that there is an m, 1 ≤ m ≤ n, such that ŷ_1 ≤ ⋯ ≤ ŷ_m ≥ ŷ_{m+1} ≥ ⋯ ≥ ŷ_n, i.e., such that the regression is increasing on [1, m] and decreasing on [m, n]. The unimodal constraint is also called an umbrella ordering, and isotonic regression is often called monotonic regression, though in some application areas this term allows the values to be decreasing.
Isotonic regression does not yield a smooth curve, but rather a collection of level sets where the regression is constant. Fig. 1 gives an example of an isotonic regression of a set of data with equal weights, where circles represent data points and lines represent level sets, with a filled circle representing a data point which is also a level set. Fig. 2 shows a unimodal regression.
By the error of a regression we mean the quantity in Eq. (1). In the algorithms the value called error is actually the pth power of this quantity, in order to simplify calculations.
Both isotonic regression and unimodal regression are examples of nonparametric shape-constrained regression. Our interest in efficient unimodal regression was motivated by its repeated use in dose-response problems with competing failure modes (Hardwick and Stout, 2000, Pan, 1996). For such problems, as the dose increases the efficacy increases but the toxicity increases as well. The goal is to find the dose that maximizes the probability of being efficacious and non-toxic, and it is usually assumed that this probability distribution is unimodal. More generally such regressions are of use in a wide range of applications when there is prior knowledge about the shape of a response function but no assumption of a parametric form. See, for example, the references to water-level time-series data in Haiminen and Gionis (2004) and to tree growth in Turner and Wollan (1997). The latter is another example of competing failure modes, where as trees in a newly planted grove grow, their “vigor” initially increases as they increase in size, but eventually starts decreasing as they compete for nutrients and light.
In Section 2 we examine previous work on the problem of determining unimodal regression. In Section 3 we introduce the notion of prefix isotonic regression, and in Sections 3.1, 3.2, 3.3 we develop algorithms for the L_2, L_1, and unweighted L_∞ versions of this problem, taking Θ(n), Θ(n log n), and Θ(n) time, respectively. These then yield unimodal algorithms of the same time complexity. All of these algorithms, both isotonic and unimodal, are optimal, and, except for the L_2 isotonic problem, all are the first optimal solutions to their problems. In Section 3.4 we examine the slightly different problem of determining only the value at i of the isotonic regression on the first i values, for all i. Section 4 contains an immediate corollary of the results on prefix isotonic regression, namely that unimodal regression can be computed in the same time bounds. Section 5 concludes with some final remarks.
Throughout, we assume that the data is given in order of increasing x values. If the data is not so ordered, then an initial sorting step, taking Θ(n log n) time, is needed. Since the values of the x_i are irrelevant, we simplify notation by assuming x_i = i.
Previous work
It is well-known that the L_2 increasing isotonic regression can be determined in Θ(n) time. Apparently all published algorithms use the “pair adjacent violators” (PAV) approach (Ayer et al., 1955). In this approach, initially each data value is viewed as a level set. At each step, if there are two adjacent level sets that are out of order (i.e., the left level set is above the right one) then the sets are combined and the weighted mean of their data values becomes the value of the new level set.
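The PAV approach for the L_2 metric can be sketched as follows. This is an illustrative implementation, not the paper's pseudocode: the function name, the stack-of-tuples representation, and the assumption of strictly positive weights are choices of this sketch.

```python
def pav(y, w):
    """Weighted L2 increasing isotonic regression via pair adjacent violators.

    Assumes strictly positive weights; returns fitted values with
    yhat[0] <= yhat[1] <= ... <= yhat[-1].
    """
    blocks = []  # stack of level sets, each (weighted sum, total weight, point count)
    for yi, wi in zip(y, w):
        s, wt, n = yi * wi, wi, 1
        # Pool while the level set to the left has a larger mean (out of order).
        while blocks and blocks[-1][0] / blocks[-1][1] > s / wt:
            ps, pw, pn = blocks.pop()
            s, wt, n = s + ps, wt + pw, n + pn
        blocks.append((s, wt, n))
    yhat = []
    for s, wt, n in blocks:
        yhat.extend([s / wt] * n)  # every point in a level set gets the set's mean
    return yhat
```

Each point is pushed once and popped at most once, so a single left-to-right pass over sorted data runs in Θ(n) time.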
Prefix isotonic regression
By determining an isotonic regression we mean determining the error of the regression and the extents and regression values of the level sets. Given real-valued data values y_1, …, y_n with nonnegative real weights w_1, …, w_n, and given an L_p metric on the reals, let Iso(i) denote the isotonic regression on (y_1, w_1), …, (y_i, w_i). The prefix isotonic regression problem is to determine Iso(i) for all 1 ≤ i ≤ n.
Note that prefix isotonic regression determines exactly the set of increasing isotonic regressions needed for unimodal regression.
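For the L_2 metric, one way to realize a prefix computation (a sketch under this paper's framing, not its exact pseudocode; the function name is an assumption) is to run PAV left to right, storing for each level set its weighted sum, weighted sum of squares, total weight, and error, and recording the accumulated error after each new point. Following the earlier remark, "error" here is the pth power (p = 2) of the quantity in Eq. (1).

```python
def prefix_isotonic_errors(y, w):
    """For each i, the squared L2 error of the increasing isotonic
    regression on the first i+1 points, in one left-to-right PAV pass.

    A level set is (sum of w*y, sum of w*y^2, total weight, its error),
    where a set with values s, ss, wt has error ss - s**2/wt.
    Assumes strictly positive weights.
    """
    blocks = []   # stack of level sets
    total = 0.0   # sum of level-set errors = error of the current prefix
    errors = []
    for yi, wi in zip(y, w):
        s, ss, wt, err = wi * yi, wi * yi * yi, wi, 0.0
        # Cross-multiplied comparison of means avoids division: ps/pwt > s/wt.
        while blocks and blocks[-1][0] * wt > s * blocks[-1][2]:
            ps, pss, pwt, perr = blocks.pop()
            total -= perr
            s, ss, wt = s + ps, ss + pss, wt + pwt
            err = ss - s * s / wt
        total += err
        blocks.append((s, ss, wt, err))
        errors.append(total)
    return errors
```

Since every point is merged into a larger level set at most once after being pushed, all n prefix errors are produced in Θ(n) total time, matching the bound quoted in the introduction.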
Unimodal regression
It is straightforward to modify the algorithm in Fig. 3 to utilize prefix isotonic regression. The errorl values are calculated via a standard prefix increasing isotonic regression, and the errorr values via a prefix increasing isotonic regression that passes through the data in right-to-left order. The time analysis is simple: the total time is dominated by the time to perform the two prefix isotonic regressions.
Theorem 2 Given weighted data
Final comments
It has been shown that the problem of determining the unimodal regression of a set of data can be optimally solved by using an approach based on prefix isotonic regression. This approach is quite similar to that in Frisén (1980), Geng and Shi (1990), Mureika et al. (1992), Pehrsson and Frisén (1983) and Turner (1998), but achieves greater efficiency by organizing the regression calculations into a systematic prefix calculation. The prefix approach not only reduces the asymptotic time of
Acknowledgements
This work was supported in part by National Science Foundation grant DMS–0072910. A preliminary version of portions of this paper appeared in Stout (2000).
The author thanks the reviewers for their helpful comments.
References (18)

- et al., Locating service centers with precedence constraints. Discrete Appl. Math. (1993)
- et al., Efficient computation of an isotonic median regression. Appl. Math. Lett. (1995)
- An algorithm for tree-ordered isotonic median regression. Statist. Probab. Lett. (1996)
- Turner and Wollan, Locating a maximum using isotonic regression. Comput. Statist. Data Anal. (1997)
- et al., A fast scaling algorithm for minimizing separable convex functions subject to chain constraints. Oper. Res. (2001)
- Ayer et al., An empirical distribution function for sampling with incomplete information. Ann. Math. Statist. (1955)
- et al., A fast merging algorithm. J. Assoc. Comput. Mach. (1979)
- Frisén, Unimodal regression. The Statistician (1980)
- Geng and Shi, Isotonic regression for umbrella orderings. Appl. Stat. (1990)