
2010 | Original Paper | Book Chapter

1. Information Theory, Machine Learning, and Reproducing Kernel Hilbert Spaces

Author: José C. Principe

Published in: Information Theoretic Learning

Publisher: Springer New York


Abstract

The common problem faced by many data processing professionals is how best to extract the information contained in data. In our daily lives and in our professions we are bombarded by huge amounts of data, but most often the data themselves are not our primary interest. Data hide, in their time structure or in their spatial redundancy, important clues for answering the information-processing questions we pose. We use the term information in the colloquial sense here, so it may mean different things to different people, which is acceptable for now. We all realize that computers and the Web have tremendously accelerated both the accessibility of data and the rate at which data are generated. The pressure to distill information from data will therefore mount at an increasing pace, and old ways of dealing with this problem will be forced to evolve and adapt to the new reality. To many (including the author) this represents nothing less than a paradigm shift, from hypothesis-based to evidence-based science, and it will affect core design strategies in many disciplines, including learning theory and adaptive systems.


Metadata
Title
Information Theory, Machine Learning, and Reproducing Kernel Hilbert Spaces
Author
José C. Principe
Copyright year
2010
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4419-1570-2_1