The truth is almost never linear!
But often the linearity assumption is “good enough”
What about when it's not?
Polynomials
Step Functions
Splines
Local Regression
Generalized Additive Models
Create new variables \(X_1 = X\), \(X_2 = X^2\), and so on, then treat as multiple linear regression
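For illustration, a minimal sketch in R (the simulated `x` and `y` and the degree-4 choice are assumptions, not from the text):

```r
# Simulated data for illustration only
set.seed(1)
x <- runif(100, 0, 10)
y <- sin(x) + rnorm(100, sd = 0.3)

# Create X1 = x, X2 = x^2, ... explicitly, then fit as an
# ordinary multiple linear regression
fit <- lm(y ~ x + I(x^2) + I(x^3) + I(x^4))
summary(fit)
```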
Not really interested in the coefficients; more interested in the fitted function values at any value \(x_0\): \(\hat{f}(x_0) = \hat{\beta}_0 + \hat{\beta}_1 x_0 + \hat{\beta}_2 x_0^2 + \hat{\beta}_3 x_0^3 + \cdots + \hat{\beta}_d x_0^d\)
Since \(\hat{f}(x_0)\) is a linear function of the \(\hat{\beta}_\ell\), can get a simple expression for the pointwise variance \(\mathrm{Var}[\hat{f}(x_0)]\) at any value of \(x_0\). In the figure above, we have computed the fit and pointwise standard errors on a grid of values for \(x_0\). We show \(\hat{f}(x_0) \pm 2 \cdot \mathrm{se}[\hat{f}(x_0)]\)
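Continuing the hypothetical fit above, `predict()` with `se.fit = TRUE` returns the pointwise standard errors needed for those bands; a sketch:

```r
# Evaluate the fit and its pointwise standard errors on a grid of x0 values
x.grid <- seq(min(x), max(x), length.out = 200)
pred <- predict(fit, newdata = data.frame(x = x.grid), se.fit = TRUE)

# Fitted curve with +/- 2 standard-error bands
plot(x, y, col = "darkgrey")
lines(x.grid, pred$fit, lwd = 2)
lines(x.grid, pred$fit + 2 * pred$se.fit, lty = 2)
lines(x.grid, pred$fit - 2 * pred$se.fit, lty = 2)
```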
We either fix the degree \(d\) at some reasonably low value, or use cross-validation to choose \(d\)
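A hedged sketch of the cross-validation route, using `cv.glm()` from the `boot` package on the simulated data above (the candidate degrees 1 through 5 are an arbitrary choice):

```r
library(boot)  # for cv.glm()

df <- data.frame(x = x, y = y)
cv.errors <- rep(NA, 5)
for (d in 1:5) {
  # glm() with its default gaussian family fits the same model as lm(),
  # but is compatible with boot::cv.glm()
  glm.fit <- glm(y ~ poly(x, d), data = df)
  cv.errors[d] <- cv.glm(df, glm.fit, K = 10)$delta[1]
}
which.min(cv.errors)  # degree with the lowest estimated test error
```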
Logistic regression follows naturally. For example, in the figure we model:
\(\Pr(y_i > 250 \mid x_i) = \frac{\exp(\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \cdots + \beta_d x_i^d)}{1 + \exp(\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \cdots + \beta_d x_i^d)}\)
To get confidence intervals, compute upper and lower bounds on the logit scale, and then invert to get onto the probability scale
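A minimal sketch of that construction, assuming a hypothetical data frame `wage.df` with numeric columns `x` and `y` (the event `y > 250` matches the figure; the degree-4 fit is an assumption):

```r
# Hypothetical data frame wage.df; the event of interest is y > 250
glm.fit <- glm(I(y > 250) ~ poly(x, 4), data = wage.df, family = binomial)

x.grid <- seq(min(wage.df$x), max(wage.df$x), length.out = 200)
pred <- predict(glm.fit, newdata = data.frame(x = x.grid), se.fit = TRUE)
# For a glm, predict() returns fits on the link (logit) scale by default

# Upper and lower bounds on the logit scale ...
upper <- pred$fit + 2 * pred$se.fit
lower <- pred$fit - 2 * pred$se.fit

# ... then invert with the logistic function to get probabilities
prob       <- plogis(pred$fit)
prob.upper <- plogis(upper)
prob.lower <- plogis(lower)
```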
Can do separately on several variables—just stack the variables into one matrix, and separate out the pieces afterwards (see GAMs later)
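As a sketch with two hypothetical predictors `x1` and `x2` in a data frame `dat`:

```r
# Each poly() call contributes a block of columns to one model matrix;
# the corresponding coefficients separate out afterwards
fit2 <- lm(y ~ poly(x1, 3) + poly(x2, 2), data = dat)
```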
Caveat: polynomials have notorious tail behavior — very bad for extrapolation
Can fit using `y ~ poly(x, degree = 3)` in an R formula
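One usage note: by default `poly()` builds an orthogonal polynomial basis rather than the raw powers; `raw = TRUE` changes the coefficients but not the fitted values. For example, on the simulated data above:

```r
fit.orth <- lm(y ~ poly(x, degree = 3))              # orthogonal basis (default)
fit.raw  <- lm(y ~ poly(x, degree = 3, raw = TRUE))  # raw powers x, x^2, x^3
all.equal(fitted(fit.orth), fitted(fit.raw))         # same fitted values
```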
Another way of creating transformations of a variable — cut the variable into distinct regions
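In R this is typically done with `cut()`, which turns a numeric variable into a factor of intervals; a minimal sketch on the simulated data above (four equal-width regions is an arbitrary choice):

```r
# cut() bins x into intervals; lm() then creates a dummy variable
# per region, giving a piecewise-constant fit
fit.step <- lm(y ~ cut(x, breaks = 4))
summary(fit.step)
```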