
2019 | OriginalPaper | Chapter

2. Machine Learning Overview

Authors: Jiawei Zhang, Philip S. Yu

Published in: Broad Learning Through Fusions

Publisher: Springer International Publishing


Abstract

Learning denotes the process of acquiring new declarative knowledge, organizing that knowledge into general yet effective representations, and discovering new facts and theories through observation and experimentation. Learning is one of the most important skills humankind can master, and it is also what distinguishes us from the other animals on this planet. For example, from our past experiences we know that the sun rises in the east and sets in the west, that the moon revolves around the earth, and that a year has 365 days; all of this is knowledge derived from our past life experiences.


Footnotes
1
Machine learning models usually denote learning algorithms that have been well trained on some training data. In the remainder of this book, we will not distinguish between machine learning models and machine learning algorithms by default.
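
To make this distinction concrete, the following is a minimal sketch (assuming Python with scikit-learn, an illustrative choice not tied to this chapter): the learning algorithm is the untrained estimator together with its hyperparameters, while the model is the object obtained after fitting that estimator to training data.

    # A minimal sketch of the algorithm-vs-model distinction.
    # Assumes Python with scikit-learn; the library and the toy data
    # are illustrative choices, not part of the chapter itself.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Toy training data: 100 labeled examples with 4 features each.
    X_train, y_train = make_classification(n_samples=100, n_features=4,
                                           random_state=0)

    # The learning *algorithm*: an untrained estimator plus its hyperparameters.
    algorithm = LogisticRegression(max_iter=200)

    # The *model*: the same algorithm after training on the data
    # (scikit-learn's fit() returns the now-trained estimator).
    model = algorithm.fit(X_train, y_train)

    # The trained model can then make predictions on new observations.
    print(model.predict(X_train[:5]))

In this sense a "model" is simply a trained instance of an "algorithm", which is why the two terms are treated interchangeably throughout the book.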
 
Metadata
Title
Machine Learning Overview
Authors
Jiawei Zhang
Philip S. Yu
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-12528-8_2
