Bin width selection in multivariate histograms by the combinatorial method

Test

Abstract

We present several multivariate histogram density estimates that are universally \(L_1\)-optimal to within a constant factor and an additive term \(O\left(\sqrt{\log n / n}\right)\). The bin widths are chosen by the combinatorial method developed by the authors in Combinatorial Methods in Density Estimation (Springer-Verlag, 2001). The present paper solves a problem left open in that book.
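
One way to read the optimality claim is as an oracle-type inequality, sketched below. The notation is an assumption introduced here for illustration and is not taken from the abstract: \(f\) is the unknown density, \(f_{n,h}\) is the histogram estimate built from \(n\) observations with bin widths \(h\), \(H\) is the data-selected bin width, and \(C\) is the unspecified constant factor. The abstract gives neither the value of \(C\) nor the exact class of bin widths over which the infimum is taken.

```latex
\mathbb{E}\int \bigl| f_{n,H} - f \bigr|
  \;\le\; C \,\inf_{h}\, \mathbb{E}\int \bigl| f_{n,h} - f \bigr|
  \;+\; O\!\left(\sqrt{\frac{\log n}{n}}\right)
  \qquad \text{for every density } f .
```

"Universally" refers to the bound holding for every density \(f\), with no smoothness or tail conditions assumed.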

References

  • Abou-Jaoude, S. (1976a). Conditions nécessaires et suffisantes de convergence \(L_1\) en probabilité de l'histogramme pour une densité. Annales de l'Institut Henri Poincaré, 12:213–231.

  • Abou-Jaoude, S. (1976b). La convergence \(L_1\) et \(L_\infty\) de l'estimateur de la partition aléatoire pour une densité. Annales de l'Institut Henri Poincaré, 12:299–317.

  • Atilgan, T. (1990). On derivation and application of AIC as a data-based criterion for histograms. Communications in Statistics - Theory and Methods, 19:885–903.

  • Barron, A., Birgé, L., and Massart, P. (1999). Risk bounds for model selection via penalization. Probability Theory and Related Fields, 113:301–415.

  • Biau, G. and Devroye, L. (2002, to appear). On the risk of estimates for block decreasing densities. Journal of Multivariate Analysis.

  • Birgé, L. and Rozenholc, Y. (2002). How many bins should be put in a regular histogram. Technical report.

  • Castellan, G. (2000). Sélection d'histogrammes ou de modèles exponentiels de polynomes par morceaux à l'aide d'un critère de type Akaike. Technical report.

  • Chen, X. R. and Zhao, L. C. (1987). Almost sure \(L_1\)-norm convergence for data-based histogram density estimates. Journal of Multivariate Analysis, 21:179–188.

  • Devroye, L. (1987). A Course in Density Estimation. Birkhäuser-Verlag, Boston.

  • Devroye, L. and Györfi, L. (1985). Nonparametric Density Estimation: The \(L_1\) View. Wiley, New York.

  • Devroye, L. and Lugosi, G. (1996). A universally acceptable smoothing factor for kernel density estimates. Annals of Statistics, 24:2499–2512.

  • Devroye, L. and Lugosi, G. (1997). Nonasymptotic universal smoothing factors, kernel complexity and Yatracos classes. Annals of Statistics, 25:2626–2637.

  • Devroye, L. and Lugosi, G. (2001). Combinatorial Methods in Density Estimation. Springer-Verlag, New York.

  • Freedman, D. and Diaconis, P. (1981). On the histogram as a density estimator: \(L_2\) theory. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 57:453–476.

  • Hall, P. (1990). Akaike's information criterion and Kullback-Leibler loss for histogram density estimation. Probability Theory and Related Fields, 85:449–467.

  • Hall, P. and Hannan, E. (1988). On stochastic complexity and nonparametric density estimation. Biometrika, 75:705–714.

  • Kanazawa, Y. (1988). An optimal variable cell histogram. Communications in Statistics, Part A: Theory and Methods, 17:1401–1422.

  • Kanazawa, Y. (1992). An optimal variable cell histogram based on the sample spacings. Annals of Statistics, 20:291–304.

  • Kanazawa, Y. (1993). Hellinger distance and Akaike's information criterion for the histogram. Statistics and Probability Letters, 17:293–298.

  • Kim, B. and Van Ryzin, J. (1975). Uniform consistency of a histogram density estimator and modal estimation. Communications in Statistics, 4:303–315.

  • Kogure, A. (1987). Asymptotically optimal cells for a histogram. Annals of Statistics, 15:1023–1030.

  • Lecoutre, J.-P. (1985). The \(L_2\)-optimal cell width for the histogram. Statistics and Probability Letters, 3:303–306.

  • Lugosi, G. and Nobel, A. (1996). Consistency of data-driven histogram methods for density estimation and classification. Annals of Statistics, 24:687–706.

  • Rodriguez, C. and Van Ryzin, J. (1985). Maximum entropy histograms. Statistics and Probability Letters, 3:117–120.

  • Rodriguez, C. and Van Ryzin, J. (1986). Large sample properties of maximum entropy histograms. IEEE Transactions on Information Theory, IT-32:751–759.

  • Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics, 9:65–78.

  • Schläfli, L. (1950). Gesammelte Mathematische Abhandlungen. Birkhäuser-Verlag, Basel.

  • Scott, D. (1979). On optimal and data-based histograms. Biometrika, 66:605–610.

  • Stone, C. J. (1985). An asymptotically optimal histogram selection rule. In L. Le Cam and R. A. Olshen, eds., Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II, pp. 513–520. Wadsworth, Belmont.

  • Taylor, C. (1987). Akaike's information criterion and the histogram. Biometrika, 74:636–639.

  • Vapnik, V. N. and Chervonenkis, A. Y. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16:264–280.

  • Wand, M. (1997). Data-based choice of histogram bin width. The American Statistician, 51:59–64.

  • Yu, B. and Speed, T. (1990). Stochastic complexity and model selection II: Histograms. Technical report.

  • Yu, B. and Speed, T. (1992). Data compression and histograms. Probability Theory and Related Fields, pp. 195–229.

  • Zhao, L. C., Krishnaiah, P. R., and Chen, X. R. (1990). Almost sure \(L_r\)-norm convergence for data-based histogram estimates. Theory of Probability and its Applications, 35:396–403.

Cite this article

Devroye, L., Lugosi, G. Bin width selection in multivariate histograms by the combinatorial method. Test 13, 129–145 (2004). https://doi.org/10.1007/BF02603004

  • DOI: https://doi.org/10.1007/BF02603004
