2019 | Original Paper | Book Chapter

2. Fundamentals of Machine Learning

Authors: Ke-Lin Du, M. N. S. Swamy

Published in: Neural Networks and Statistical Learning

Publisher: Springer London

Abstract

This chapter deals with the fundamental concepts and theories of machine learning. It first introduces various learning and inference methods, followed by learning and generalization, model selection, and neural networks as universal machines. Some other important topics are also covered.

Metadata
Title
Fundamentals of Machine Learning
Authors
Ke-Lin Du
M. N. S. Swamy
Copyright Year
2019
Publisher
Springer London
DOI
https://doi.org/10.1007/978-1-4471-7452-3_2