Abstract
This chapter deals with linear regression in detail, studying modeling problems whose model output is linear in the model's parameters. The method of least squares, which minimizes the sum of squared model errors, is derived, and its properties are analyzed. Furthermore, various extensions are introduced, such as weighting and regularization. The concept of the smoothing or hat matrix, which maps the measured output values to the model output values, is introduced; it is required to understand the leave-one-out error in linear regression and the local behavior of kernel methods in later chapters. Also, the important concept of the "effective number of parameters" is discussed, as it is a key topic throughout the whole book. In addition, recursive updating methods are introduced that are capable of dealing with data streams. Finally, the more advanced issue of linear subset selection is discussed in detail; it allows regressors to be selected incrementally, thereby carrying out structure optimization. These approaches are also a recurring theme throughout the book.
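The core ideas summarized above (least squares, the hat matrix, and the effective number of parameters) can be sketched numerically. The following is a minimal illustration with synthetic data, not taken from the chapter itself; the data, seed, and model size are hypothetical:

```python
import numpy as np

# Synthetic data: output linear in the parameters (hypothetical example)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])  # regression matrix
theta_true = np.array([2.0, -3.0])
y = X @ theta_true + 0.1 * rng.standard_normal(50)

# Least squares: minimize the sum of squared model errors ||y - X theta||^2
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Smoothing (hat) matrix maps measured outputs to model outputs: y_hat = S y
S = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = S @ y
assert np.allclose(y_hat, X @ theta_hat)

# For ordinary least squares, trace(S) equals the number of regressors;
# with regularization, trace(S) drops below it ("effective number of parameters")
print(round(np.trace(S)))  # 2
```

With ridge regularization one would replace `inv(X.T @ X)` by `inv(X.T @ X + lam * np.eye(2))`, and the trace of `S` then falls below the nominal parameter count.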
Notes
- 1.
This error is called the equation error and is different from the output error (difference between process and model output) used in the other examples. The reason for using the equation error here is that for IIR filters the output error would not be linear in the parameters; see Chap. 18.
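The distinction in this note can be sketched as follows: forming the equation error means regressing the measured output on its own measured past, which is linear in the parameters. This is a small hypothetical illustration (first-order IIR model, noise-free signals), not code from the book:

```python
import numpy as np

# First-order IIR model: y(k) = a*y(k-1) + b*u(k-1); hypothetical signals
rng = np.random.default_rng(2)
u = rng.standard_normal(100)
a_true, b_true = 0.7, 1.5
y = np.zeros(100)
for k in range(1, 100):
    y[k] = a_true * y[k - 1] + b_true * u[k - 1]

# Equation error: predict y(k) from the MEASURED y(k-1) -> linear in (a, b),
# so ordinary least squares applies directly.
X = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
print(np.allclose(theta, [a_true, b_true]))  # True (noise-free data)

# The output error would instead feed the MODEL's own past outputs back,
# making the simulated output a nonlinear function of a; least squares
# would then no longer apply directly.
```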
- 2.
Strictly speaking, the regression matrix must have at least as many rows as columns. (Recall that for the FIR and IIR filter examples, the number of rows is smaller than N.) Moreover, this condition is not sufficient since additionally the columns must be linearly independent.
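A small numerical illustration of why the row condition alone is not sufficient (hypothetical numbers): a regression matrix can have more rows than columns and still yield a singular \( \underline {X}^T \underline {X}\) when its columns are linearly dependent.

```python
import numpy as np

# Three rows, two columns -- but the second column is 2x the first,
# so the columns are linearly dependent.
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
H = X.T @ X

print(np.linalg.matrix_rank(X))         # 1, not 2: column rank deficient
print(np.isclose(np.linalg.det(H), 0))  # True: normal equations have no unique solution
```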
- 3.
Note that the Hessian \( \underline {H}\) is symmetric and therefore all eigenvalues are real. Furthermore, the eigenvalues are non-negative because the Hessian is positive semi-definite since \( \underline {H}= \underline {X}^T \underline {X}\). If \( \underline {X}\) and thus \( \underline {H}\) are not singular (i.e., have full rank), the eigenvalues are strictly positive.
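The eigenvalue properties stated in this note are easy to verify numerically. A sketch with a generic (hypothetical) full-rank regression matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))  # generic regression matrix, full column rank
H = X.T @ X                       # Hessian of the least-squares loss (up to a factor of 2)

assert np.allclose(H, H.T)        # symmetric -> real eigenvalues
eig = np.linalg.eigvalsh(H)
print(np.all(eig >= 0))           # True: H = X^T X is positive semi-definite
print(np.all(eig > 0))            # True here, since X has full column rank
```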
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Nelles, O. (2020). Linear Optimization. In: Nonlinear System Identification. Springer, Cham. https://doi.org/10.1007/978-3-030-47439-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47438-6
Online ISBN: 978-3-030-47439-3
eBook Packages: Physics and Astronomy