Bin width selection in multivariate histograms by the combinatorial method

Devroye, Luc; Lugosi, Gábor

doi:10.1007/BF02603004

Published: June 2004

Test, volume 13, pages 129–145 (2004)

Abstract

We present several multivariate histogram density estimates that are universally L₁-optimal to within a constant factor and an additive term \(O\left(\sqrt{\log n / n}\right)\). The bin widths are chosen by the combinatorial method developed by the authors in Combinatorial Methods in Density Estimation (Springer-Verlag, 2001). The present paper solves a problem left open in that book.
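
As a rough illustration of the kind of selection rule the abstract refers to, the following Python sketch applies a minimum distance rule over Yatracos sets, in the spirit of the combinatorial method of Devroye and Lugosi (2001), to a finite list of candidate bin widths for a univariate regular histogram. It is a simplified sketch under stated assumptions, not the paper's multivariate construction: the data split, the finite candidate set, the bin anchoring at the origin, and the names `hist_density`, `select_bin_width`, and `holdout_frac` are all assumptions made for illustration.

```python
import numpy as np


def hist_density(train, h, points):
    """Evaluate the regular histogram with bins [k*h, (k+1)*h) built from
    `train` at the locations `points` (hypothetical helper)."""
    train_bins = np.floor(train / h).astype(int)
    point_bins = np.floor(points / h).astype(int)
    offset = min(train_bins.min(), point_bins.min())
    size = int(point_bins.max() - offset + 1)
    counts = np.bincount(train_bins - offset, minlength=size)
    return counts[point_bins - offset] / (len(train) * h)


def select_bin_width(data, widths, holdout_frac=0.5):
    """Pick a bin width by a minimum distance rule (illustrative assumption).

    The last `holdout_frac` of the sample is held out. For every pair of
    candidates (a, b) the set A = {f_a > f_b} is formed, and each candidate k
    is scored by the largest |integral of f_k over A - held-out frequency of A|.
    The candidate with the smallest score is returned.
    """
    data = np.asarray(data, dtype=float)
    widths = np.asarray(widths, dtype=float)
    m = int(len(data) * holdout_frac)
    if not 0 < m < len(data):
        raise ValueError("holdout_frac must leave both subsamples non-empty")
    train, val = data[:-m], data[-m:]

    # Common refinement of all candidate grids over a range covering the data.
    lo, hi = data.min(), data.max()
    L = min(np.floor(lo / h) * h for h in widths)
    U = max((np.floor(hi / h) + 1) * h for h in widths)
    grids = [np.arange(np.floor(L / h), np.ceil(U / h) + 1) * h for h in widths]
    edges = np.unique(np.concatenate(grids + [[L, U]]))
    edges = edges[(edges >= L) & (edges <= U)]
    mids = 0.5 * (edges[:-1] + edges[1:])
    lens = np.diff(edges)

    # Every candidate is constant on each refinement cell.
    dens = np.array([hist_density(train, h, mids) for h in widths])
    mass = dens * lens  # mass[k, c] = integral of candidate k over cell c

    # Empirical measure of the held-out points on the refinement cells.
    cell = np.clip(np.searchsorted(edges, val, side="right") - 1, 0, len(mids) - 1)
    emp = np.bincount(cell, minlength=len(mids)) / len(val)

    delta = np.zeros(len(widths))
    for a in range(len(widths)):
        for b in range(len(widths)):
            if a == b:
                continue
            A = dens[a] > dens[b]  # Yatracos set for the pair (a, b)
            gap = np.abs(mass[:, A].sum(axis=1) - emp[A].sum())
            delta = np.maximum(delta, gap)
    return widths[int(np.argmin(delta))], delta
```

A call such as `select_bin_width(sample, [0.01, 0.5, 20.0])` returns the chosen width together with the score of each candidate. The pairwise loop costs on the order of the squared number of candidates times the number of refinement cells, so this form only suits small candidate lists.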

References

Abou-Jaoude, S. (1976a). Conditions nécessaires et suffisantes de convergence L₁ en probabilité de l'histogramme pour une densité. Annales de l'Institut Henri Poincaré, 12:213–231.
Abou-Jaoude, S. (1976b). La convergence L₁ et L_∞ de l'estimateur de la partition aléatoire pour une densité. Annales de l'Institut Henri Poincaré, 12:299–317.
Atilgan, T. (1990). On derivation and application of AIC as a data-based criterion for histograms. Communications in Statistics - Theory and Methods, 19:885–903.
Barron, A., Birgé, L., and Massart, P. (1999). Risk bounds for model selection via penalization. Probability Theory and Related Fields, 113:301–415.
Biau, G. and Devroye, L. (2002, to appear). On the risk of estimates for block decreasing densities. Journal of Multivariate Analysis.
Birgé, L. and Rozenholc, Y. (2002). How many bins should be put in a regular histogram. Technical report.
Castellan, G. (2000). Sélection d'histogrammes ou de modèles exponentiels de polynomes par morceaux à l'aide d'un critère de type Akaike. Technical report.
Chen, X. R. and Zhao, L. C. (1987). Almost sure L₁-norm convergence for data-based histogram density estimates. Journal of Multivariate Analysis, 21:179–188.
Devroye, L. (1987). A Course in Density Estimation. Birkhäuser-Verlag, Boston.
Devroye, L. and Györfi, L. (1985). Nonparametric Density Estimation: The L₁ View. Wiley, New York.
Devroye, L. and Lugosi, G. (1996). A universally acceptable smoothing factor for kernel density estimates. Annals of Statistics, 24:2499–2512.
Devroye, L. and Lugosi, G. (1997). Nonasymptotic universal smoothing factors, kernel complexity and Yatracos classes. Annals of Statistics, 25:2626–2637.
Devroye, L. and Lugosi, G. (2001). Combinatorial Methods in Density Estimation. Springer-Verlag, New York.
Freedman, D. and Diaconis, P. (1981). On the histogram as a density estimator: L₂ theory. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 57:453–476.
Hall, P. (1990). Akaike's information criterion and Kullback-Leibler loss for histogram density estimation. Probability Theory and Related Fields, 85:449–467.
Hall, P. and Hannan, E. (1988). On stochastic complexity and nonparametric density estimation. Biometrika, 75:705–714.
Kanazawa, Y. (1988). An optimal variable cell histogram. Communications in Statistics, Part A: Theory and Methods, 17:1401–1422.
Kanazawa, Y. (1992). An optimal variable cell histogram based on the sample spacings. Annals of Statistics, 20:291–304.
Kanazawa, Y. (1993). Hellinger distance and Akaike's information criterion for the histogram. Statistics and Probability Letters, 17:293–298.
Kim, B. and Van Ryzin, J. (1975). Uniform consistency of a histogram density estimator and modal estimation. Communications in Statistics, 4:303–315.
Kogure, A. (1987). Asymptotically optimal cells for a histogram. Annals of Statistics, 15:1023–1030.
Lecoutre, J.-P. (1985). The L₂-optimal cell width for the histogram. Statistics and Probability Letters, 3:303–306.
Lugosi, G. and Nobel, A. (1996). Consistency of data-driven histogram methods for density estimation and classification. Annals of Statistics, 24:687–706.
Rodriguez, C. and Van Ryzin, J. (1985). Maximum entropy histograms. Statistics and Probability Letters, 3:117–120.
Rodriguez, C. and Van Ryzin, J. (1986). Large sample properties of maximum entropy histograms. IEEE Transactions on Information Theory, IT-32:751–759.
Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics, 9:65–78.
Schläfli, L. (1950). Gesammelte Mathematische Abhandlungen. Birkhäuser-Verlag, Basel.
Scott, D. (1979). On optimal data-based histograms. Biometrika, 66:605–610.
Stone, C. J. (1985). An asymptotically optimal histogram selection rule. In L. M. Le Cam and R. A. Olshen, eds., Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II, pp. 513–520. Wadsworth, Belmont.
Taylor, C. (1987). Akaike's information criterion and the histogram. Biometrika, 74:636–639.
Vapnik, V. N. and Chervonenkis, A. Y. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16:264–280.
Wand, M. (1997). Data-based choice of histogram bin width. The American Statistician, 51:59–64.
Yu, B. and Speed, T. (1990). Stochastic complexity and model selection II: Histograms. Technical report.
Yu, B. and Speed, T. (1992). Data compression and histograms. Probability Theory and Related Fields, pp. 195–229.
Zhao, L. C., Krishnaiah, P. R., and Chen, X. R. (1990). Almost sure L_r-norm convergence for data-based histogram estimates. Theory of Probability and its Applications, 35:396–403.

Author information

Authors and Affiliations

Department of Economics, Universitat Pompeu Fabra, Barcelona, Spain
Gábor Lugosi
School of Computer Science, McGill University, H3A 2K6, Montreal, Canada
Luc Devroye

About this article

Cite this article

Devroye, L., Lugosi, G. Bin width selection in multivariate histograms by the combinatorial method. Test 13, 129–145 (2004). https://doi.org/10.1007/BF02603004

Received: 15 April 2002
Accepted: 15 November 2002
Issue Date: June 2004
DOI: https://doi.org/10.1007/BF02603004

Key Words

AMS subject classification

62G05