
2013 | Original Paper | Book Chapter

7. Moving Beyond Linearity

Authors: Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani

Published in: An Introduction to Statistical Learning

Publisher: Springer New York


Abstract

So far in this book, we have mostly focused on linear models. Linear models are relatively simple to describe and implement, and have advantages over other approaches in terms of interpretation and inference. However, standard linear regression can have significant limitations in terms of predictive power. This is because the linearity assumption is almost always an approximation, and sometimes a poor one. In Chapter 6 we see that we can improve upon least squares using ridge regression, the lasso, principal components regression, and other techniques. In that setting, the improvement is obtained by reducing the complexity of the linear model, and hence the variance of the estimates. But we are still using a linear model, which can only be improved so far! In this chapter we relax the linearity assumption while still attempting to maintain as much interpretability as possible. We do this by examining very simple extensions of linear models like polynomial regression and step functions, as well as more sophisticated approaches such as splines, local regression, and generalized additive models.
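The simplest of the extensions named above, polynomial regression, can be sketched as ordinary least squares on powers of the predictor. The following is a minimal illustration on simulated data (the non-linear truth `sin(x)`, the noise level, and the degree 4 are all assumptions chosen for the sketch, not taken from the chapter):

```python
import numpy as np

# Simulated data with a non-linear relationship (hypothetical example)
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=200)
y = np.sin(x) + rng.normal(scale=0.2, size=200)

# Degree-4 polynomial regression: least squares on columns 1, x, x^2, x^3, x^4
X = np.vander(x, N=5, increasing=True)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

def f_hat(x0):
    """Evaluate the fitted polynomial at new points x0."""
    return np.vander(np.atleast_1d(x0), N=5, increasing=True) @ beta_hat
```

Because the fit is still linear in the coefficients, all the usual least-squares machinery (standard errors, inference) carries over unchanged.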


Footnotes
1
If \(\hat{\mathbf{C}}\) is the \(5 \times 5\) covariance matrix of the \(\hat{\beta }_{j}\), and if \(\boldsymbol{\ell}_{0}^{T} = (1,x_{0},x_{0}^{2},x_{0}^{3},x_{0}^{4})\), then \(\mbox{ Var}[\hat{f}(x_{0})] = \boldsymbol{\ell}_{0}^{T}\hat{\mathbf{C}}\boldsymbol{\ell}_{0}\).
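This pointwise variance formula is straightforward to compute directly. A sketch, assuming simulated data and the standard least-squares estimate \(\hat{\mathbf{C}} = \hat{\sigma}^2 (X^T X)^{-1}\):

```python
import numpy as np

# Hypothetical simulated data for a degree-4 polynomial fit
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=200)
y = np.sin(x) + rng.normal(scale=0.2, size=200)

X = np.vander(x, N=5, increasing=True)          # columns 1, x, ..., x^4
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Estimated 5 x 5 covariance matrix C_hat of beta_hat
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (len(y) - 5)        # residual variance estimate
C_hat = sigma2_hat * np.linalg.inv(X.T @ X)

# Var[f_hat(x0)] = l0^T C_hat l0 at a chosen point x0
x0 = 0.5
l0 = np.array([1.0, x0, x0**2, x0**3, x0**4])
var_f_x0 = l0 @ C_hat @ l0
se_f_x0 = np.sqrt(var_f_x0)                      # pointwise standard error
```

The square root of this quantity gives the pointwise standard error used to draw confidence bands around the fitted curve.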
 
2
We exclude \(C_0(X)\) as a predictor in (7.5) because it is redundant with the intercept. This is similar to the fact that we need only two dummy variables to code a qualitative variable with three levels, provided that the model will contain an intercept. The decision to exclude \(C_0(X)\) instead of some other \(C_k(X)\) in (7.5) is arbitrary. Alternatively, we could include \(C_0(X), C_1(X),\ldots,C_K(X)\), and exclude the intercept.
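This dummy-coding convention can be illustrated directly. A minimal sketch, assuming simulated data and three hypothetical cutpoints: the indicator for the leftmost bin (the analogue of \(C_0(X)\)) is dropped, and the intercept plays its role.

```python
import numpy as np

# Hypothetical step-function data: a jump in the mean at x = 5
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=300)
y = np.where(x < 5, 1.0, 3.0) + rng.normal(scale=0.3, size=300)

cuts = [2.5, 5.0, 7.5]  # assumed cutpoints, chosen for illustration

# Indicators C_1(X), ..., C_K(X); the leftmost bin C_0(X) is omitted
# because the intercept column absorbs it.
C = np.column_stack([
    (x >= lo) & (x < hi) if hi is not None else x >= lo
    for lo, hi in zip(cuts, cuts[1:] + [None])
]).astype(float)

X = np.column_stack([np.ones_like(x), C])  # intercept + K = 3 dummies
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

An observation in the leftmost bin has all dummies equal to zero, so its fitted value is just the intercept; each coefficient on a dummy measures the shift of that bin relative to the leftmost one.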
 
3
Cubic splines are popular because most human eyes cannot detect the discontinuity at the knots.
 
4
There are actually five knots, including the two boundary knots. A cubic spline with five knots would have nine degrees of freedom. But natural cubic splines have two additional natural constraints at each boundary to enforce linearity, resulting in \(9 - 4 = 5\) degrees of freedom. Since this includes a constant, which is absorbed in the intercept, we count it as four degrees of freedom.
 
5
The exact formulas for computing \(\hat{g}(x_{i})\) and S λ are very technical; however, efficient algorithms are available for computing these quantities.
 
6
A partial residual for \(X_3\), for example, has the form \(r_{i} = y_{i} - f_{1}(x_{i1}) - f_{2}(x_{i2})\). If we know \(f_1\) and \(f_2\), then we can fit \(f_3\) by treating this residual as a response in a non-linear regression on \(X_3\).
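Cycling this partial-residual update over the predictors is the backfitting algorithm. A rough sketch on simulated data, where a crude local-average smoother (with a hypothetical `width` parameter) stands in for the spline or local-regression fits a real GAM would use:

```python
import numpy as np

# Hypothetical additive truth: sin term + quadratic term + linear term
rng = np.random.default_rng(3)
n = 300
X = rng.uniform(-1, 1, size=(n, 3))
y = (np.sin(2 * X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2]
     + rng.normal(scale=0.1, size=n))

def smooth(xj, r, width=0.2):
    """Crude local-average smoother of residual r against predictor xj."""
    out = np.empty_like(r)
    for i, x0 in enumerate(xj):
        w = np.abs(xj - x0) < width
        out[i] = r[w].mean()
    return out

f = np.zeros((n, 3))        # current estimates of f_1, f_2, f_3 at the data
alpha = y.mean()            # intercept
for _ in range(20):         # backfitting sweeps
    for j in range(3):
        # Partial residual: remove the intercept and the other fitted terms
        others = [k for k in range(3) if k != j]
        r = y - alpha - f[:, others].sum(axis=1)
        f[:, j] = smooth(X[:, j], r)
        f[:, j] -= f[:, j].mean()   # center each term for identifiability

fitted = alpha + f.sum(axis=1)
mse = np.mean((y - fitted) ** 2)
```

Each sweep holds all but one function fixed and refits that one to the partial residual; iterating until the \(f_j\) stabilize yields the additive fit.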
 
Metadata
Title
Moving Beyond Linearity
Authors
Gareth James
Daniela Witten
Trevor Hastie
Robert Tibshirani
Copyright Year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-7138-7_7