2010 | OriginalPaper | Chapter

2. Renyi’s Entropy, Divergence and Their Nonparametric Estimators

Authors: Dongxin Xu, Deniz Erdogmus

Published in: Information Theoretic Learning

Publisher: Springer New York

Abstract

It is evident from Chapter 1 that Shannon’s entropy occupies a central role in information-theoretic studies. Yet, the concept of information is so rich that perhaps there is no single definition that will be able to quantify information properly. Moreover, from an engineering perspective, one must estimate entropy from data, which is a nontrivial matter. In this book we concentrate on Alfred Renyi’s seminal work on information theory to derive a set of estimators that allow entropy and divergence to be applied as cost functions in adaptation and learning. Therefore, we are mainly interested in computationally simple, nonparametric estimators that are continuous and differentiable in terms of the samples, so as to yield well-behaved gradient algorithms that can optimize adaptive system parameters. Many factors affect the determination of the optimum of the performance surface, such as gradient noise, learning rates, and misadjustment; therefore, in these types of applications the entropy estimator’s bias and variance are not as critical as they are, for instance, in coding or rate-distortion theory. Moreover, in adaptation one is only interested in the extremum (maximum or minimum) of the cost, which creates independence from its actual values, because only relative assessments are necessary. Following our nonparametric goals, what matters most in learning is to develop cost functions or divergence measures that can be derived directly from data, without further assumptions, and that capture as much structure as possible within the data’s probability density function (PDF).
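As a concrete illustration of the kind of sample-based estimator the chapter develops, the sketch below computes Renyi's quadratic entropy directly from data through a Gaussian Parzen window: the resulting pairwise-kernel sum (the "information potential") is continuous and differentiable in the samples, which is what makes gradient-based adaptation possible. This is a minimal sketch, assuming a Gaussian kernel and an arbitrary bandwidth; the function names, default width, and NumPy details are illustrative choices, not the chapter's own code.

```python
import numpy as np

def renyi_quadratic_entropy(samples, sigma=1.0):
    """Estimate Renyi's quadratic entropy H_2(X) = -log E[p(X)] from samples.

    Uses a Gaussian Parzen window of width sigma; the estimate is -log of the
    information potential V = (1/N^2) * sum_i sum_j G(x_i - x_j; 2*sigma^2),
    which is smooth and differentiable with respect to every sample.
    """
    x = np.asarray(samples, dtype=float).reshape(len(samples), -1)
    n, d = x.shape
    # Pairwise squared distances between all sample pairs.
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian kernel of variance 2*sigma^2 (convolution of two Parzen kernels).
    var = 2.0 * sigma ** 2
    gauss = np.exp(-sq_dist / (2.0 * var)) / ((2.0 * np.pi * var) ** (d / 2.0))
    information_potential = gauss.mean()
    return -np.log(information_potential)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=500)          # hypothetical sample set
    print(renyi_quadratic_entropy(data, sigma=0.5))
```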

Metadata
Title
Renyi’s Entropy, Divergence and Their Nonparametric Estimators
Authors
Dongxin Xu
Deniz Erdogmus
Copyright Year
2010
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4419-1570-2_2
