Abstract
Algorithms are given for determining weighted isotonic regressions satisfying order constraints specified via a directed acyclic graph (DAG). For the L 1 metric a partitioning approach is used which exploits the fact that L 1 regression values can always be chosen to be data values. Extending this approach, algorithms for binary-valued L 1 isotonic regression are used to find L p isotonic regressions for 1<p<∞. Algorithms are given for trees, 2-dimensional and multidimensional orderings, and arbitrary DAGs. Algorithms are also given for L p isotonic regression with constrained data and weight values, L 1 regression with unweighted data, and L 1 regression for DAGs where there are multiple data values at the vertices.
Similar content being viewed by others
References
Ahuja, R.K., Orlin, J.B.: A fast scaling algorithm for minimizing separable convex functions subject to chain constraints. Oper. Res. 49, 784–789 (2001)
Angelov, S., Harb, B., Kannan, S., Wang, L.-S.: Weighted isotonic regression under the L 1 norm. In: Symposium on Discrete Algorithms (SODA), pp. 783–791 (2006)
Barlow, R.E., Bartholomew, D.J., Bremner, J.M., Brunk, H.D.: Statistical Inference Under Order Restrictions. Wiley, New York (1972)
Brunk, H.D.: Maximum likelihood estimates of monotone parameters. Ann. Math. Stat. 26, 607–616 (1955)
Chakravarti, N.: Isotonic median regression for orders represented by rooted trees. Nav. Res. Logist. 39, 599–611 (1992)
Chandrasekaran, R., Rhy, Y.U., Jacob, V.S., Hong, S.: Isotonic separation. INFORMS J. Comput. 17, 462–474 (2005)
Chepoi, V., Cogneau, D., Fichet, B.: Polynomial algorithms for isotonic regression. In: L 1 Statistical Procedures and Related Topics. Lecture Notes—Monograph Series, vol. 31, pp. 147–160 (1967). Institute of Mathematical Statistics
Dembczynski, K., Greco, S., Kotlowski, W., Slowinski, R.: Statistical model for rough set approach to multicriteria classification. In: PKDD 2007: 11th European Conference on Principles and Practice of Knowledge Discovery in Databases. Springer Lecture Notes in Computer Science, vol. 4702, pp. 164–175 (2007)
Dykstra, R.L., Robertson, T.: An algorithm for isotonic regression of two or more independent variables. Ann. Stat. 10, 708–716 (1982)
Gebhardt, F.: An algorithm for monotone regression with one or more independent variables. Biometrika 57, 263–271 (1970)
Jackson, D.: Note on the median of a set of numbers. Bull. Am. Math. Soc. 27, 160–164 (1921)
Kaufman, Y., Tamir, A.: Locating service centers with precedence constraints. Discrete Appl. Math. 47, 251–261 (1993)
Legg, D., Townsend, D.: Best monotone approximation in L ∞[0,1]. J. Approx. Theory 42, 30–35 (1984)
Maxwell, W.L., Muckstadt, J.A.: Establishing consistent and realistic reorder intervals in production-distribution systems. Oper. Res. 33, 1316–1341 (1985)
Pardalos, P.M., Xue, G.: Algorithms for a class of isotonic regression problems. Algorithmica 23, 211–222 (1999)
Pardalos, P.M., Xue, G.-L., Yong, L.: Efficient computation of an isotonic median regression. Appl. Math. Lett. 8, 67–70 (1995)
Punera, K., Ghosh, J.: Enhanced hierarchical classification via isotonic smoothing. In: Proceedings of the International Conference on the World Wide Web 2008, pp. 151–160 (2008)
Qian, S.: An algorithm for tree-ordered isotonic median regression. Stat. Probab. Lett. 27, 195–199 (1996)
Robertson, T., Wright, F.T.: Multiple isotonic median regression. Ann. Stat. 1, 422–432 (1973)
Robertson, T., Wright, F.T., Dykstra, R.L.: Order Restricted Statistical Inference. Wiley, New York (1988)
Spouge, J., Wan, H., Wilbur, W.J.: Least squares isotonic regression in two dimensions. J. Optim. Theory Appl. 117, 585–605 (2003)
Stout, Q.F.: Unimodal regression via prefix isotonic regression. Comput. Stat. Data Anal. 53, 289–297 (2008). A preliminary version appeared in “Optimal algorithms for unimodal regression”. Comput. Stat. 32 (2000)
Stout, Q.F.: Strict L ∞ isotonic regression. J. Optim. Theory Appl. 152, 121–135 (2012)
Stout, Q.F.: Weighted L ∞ isotonic regression (2012, submitted)
Stout, Q.F.: Isotonic regression for multiple independent variables (2012, submitted)
Strömberg, U.: An algorithm for isotonic regression with arbitrary convex distance function. Comput. Stat. Data Anal. 11, 205–219 (1991)
Thompson, W.A. Jr.: The problem of negative estimates of variance components. Ann. Math. Stat. 33, 273–289 (1962)
Acknowledgements
The author thanks the referees for their helpful suggestions.
Author information
Authors and Affiliations
Corresponding author
Appendix: Computing the L p Mean
Appendix: Computing the L p Mean
The computation of the L p -mean of a level set for general p is more complicated than for p=1,2, or ∞. We are interested in identifying those aspects which might be more than linear in the time of the remainder of the algorithm. We assume that the data values have been presorted.
Simple calculus [11] shows that the L p mean of the data (y,w) is the unique C such that
which does not, in general, have an analytic solution. An important aspect is to determine the data values that bracket the L p mean, i.e., to determine which are in the RHS and which are on the LHS of (7).
Given the bracketing values, the time to find the mean to desired accuracy may depend upon the data and we make no assumptions concerning it. Given p, n, and data (y,w), the total time to find all L p means used in the algorithm is decomposed into a part independent of the data and the remaining portion, denoted \(\mathcal{T}_{p}(n,\mathbf{y},\mathbf{w})\), that is data dependent. For general p the data independent portion requires evaluating (7) for each set at least once per partitioning stage, and thus Θ(n) time per stage. Since there can be Θ(n) stages in each of the algorithms in Sect. 5, the data independent time for finding means is Θ(n 2) once the bracketing values are known. By first partitioning on data values, all of the partitioning algorithms can determine the bracketing values in time no more than the remaining time of the algorithm.
If p is an even integer, (7) reduces to finding the unique solution of ∑ i=1,n w i (y i −C)p−1=0, i.e., no bracketing values nor presorting are required. This is a sum of (p−1)st degree polynomials, and hence can be determined by summing up, over the data values, the coefficients of each term and then evaluating the result. For p an odd integer once again the terms are polynomials so, given the bracketing values, one can use sums of coefficients to evaluate the summations in (7). Since roots of polynomials of degree ≤4 can be found exactly in constant time, \(\mathcal{T}_{2} = \mathcal{T}_{3} = \mathcal{T}_{4} =\mathcal{T}_{5} = 0\).
For PAV sets are merged rather than partitioned. This does not change the data independent time for general p, but for p an even integer each set is reduced to p+1 coefficients and merging sets merely requires adding coefficients. Hence the data independent time is Θ(n). For p odd the situation is a bit more complex since the bracketing values are needed. One can construct a balanced search tree where data values are keys and at each node, for each term there is the sum of the coefficients beneath it. The only change from being on the LHS vs. the RHS of (7) is whether the term has a plus or minus sign, and hence, given this tree, one can descend through the tree and determine the bracketing data values in logarithmic time. Merging sets corresponds to merging their trees, including maintaining the information about sums of coefficients in subtrees. With careful merging this can be accomplished in Θ(nlogn) total time.
Rights and permissions
About this article
Cite this article
Stout, Q.F. Isotonic Regression via Partitioning. Algorithmica 66, 93–112 (2013). https://doi.org/10.1007/s00453-012-9628-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-012-9628-4