Lower and Upper Bounds for Misclassification Probability Based on Renyi's Information

Abstract

Fano's inequality is an important result in Shannon's information theory that has found applications in numerous proofs of convergence. It also provides a lower bound on the symbol error probability in a communication channel in terms of Shannon's definitions of entropy and mutual information. The result is significant, moreover, because it offers insight into how classification performance is influenced by the amount of information transferred through the classifier. We have previously extended Fano's lower bound on the probability of error to a family of lower and upper bounds based on Renyi's definitions of entropy and mutual information. Despite their theoretical appeal, however, these bounds were practically incomputable. In this paper, we present modifications that make these bounds usable in practical situations. The significance of the modified bounds is threefold: they illustrate a theoretical use of Renyi's definition of information, they extend Fano's result with an upper bound on the probability of classification error, and they provide insight into how the information transfer through a classifier affects its performance. The performance of the modified bounds is investigated in various numerical examples, including applications to digital communication channels, designed to highlight the major conclusions.
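To make the quantities in the abstract concrete, the sketch below computes Shannon's conditional entropy H(X|Y) from a hypothetical classifier joint (confusion) distribution, evaluates Fano's classical lower bound on the misclassification probability by bisection, and evaluates Renyi's entropy of order alpha for comparison. It is a minimal illustration only: the matrix P_xy and the function names are assumptions made here, and the modified Renyi-based lower and upper bounds derived in the paper are not reproduced.

```python
import numpy as np

# Hypothetical joint distribution P(true class X, decided class Y) for a
# 4-class problem; rows = true class, columns = classifier decision.
P_xy = np.array([
    [0.20, 0.02, 0.02, 0.01],
    [0.02, 0.20, 0.02, 0.01],
    [0.01, 0.02, 0.21, 0.01],
    [0.01, 0.01, 0.02, 0.21],
])
P_xy = P_xy / P_xy.sum()  # ensure a proper joint distribution

def shannon_conditional_entropy(P_xy):
    """H(X|Y) in bits, computed from the joint distribution P(X, Y)."""
    P_y = P_xy.sum(axis=0)
    H = 0.0
    for j, py in enumerate(P_y):
        if py > 0:
            p = P_xy[:, j] / py          # P(X | Y = y_j)
            p = p[p > 0]
            H += py * (-(p * np.log2(p)).sum())
    return H

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha in bits; alpha -> 1 recovers Shannon entropy."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(alpha, 1.0):
        return -(p * np.log2(p)).sum()
    return np.log2((p ** alpha).sum()) / (1.0 - alpha)

def fano_lower_bound(H_x_given_y, num_classes, tol=1e-9):
    """Smallest Pe consistent with Fano's inequality
       H(X|Y) <= h_b(Pe) + Pe * log2(num_classes - 1), found by bisection."""
    def h_b(p):  # binary entropy in bits
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)
    def rhs(p):
        return h_b(p) + p * np.log2(num_classes - 1)
    if H_x_given_y <= 0:
        return 0.0
    # rhs(p) increases from 0 to log2(num_classes) on [0, (M-1)/M]
    lo, hi = 0.0, (num_classes - 1) / num_classes
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rhs(mid) < H_x_given_y:
            lo = mid
        else:
            hi = mid
    return lo

H_cond = shannon_conditional_entropy(P_xy)
print("H(X|Y)            =", round(H_cond, 4), "bits")
print("Fano lower bound  =", round(fano_lower_bound(H_cond, P_xy.shape[0]), 4))
print("Actual error prob =", round(1.0 - np.trace(P_xy), 4))
print("Renyi H_2(X)      =", round(renyi_entropy(P_xy.sum(axis=1), 2.0), 4), "bits")
```

Run on this illustrative matrix, the Fano bound stays below the actual error probability of the assumed classifier, which is the behaviour the abstract's lower bounds generalize.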





Cite this article

Erdogmus, D., Principe, J.C. Lower and Upper Bounds for Misclassification Probability Based on Renyi's Information. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 37, 305–317 (2004). https://doi.org/10.1023/B:VLSI.0000027493.48841.39
