# Nonlinear Models in R: Polynomial Regression, Regression Splines, Smoothing Splines, Local Regression, and Generalized Additive Models (GAMs)

Time: 2021-04-19

## Overview

Here we relax the linearity assumption of the standard linear model, which is sometimes only a poor approximation. One way to address this is regularization, which reduces the complexity of the model; but such techniques still rely on a linear model and can only improve it so far. This post focuses on extensions of the linear model:

• *Polynomial regression* — a simple way to provide a nonlinear fit to data.
• *Step functions* — cut the range of a variable into *K* distinct regions to produce a qualitative variable; this has the effect of fitting a piecewise constant function.
• *Regression splines* — more flexible than polynomials and step functions, and in fact an extension of both.
• *Local regression* — similar to regression splines, but the regions are allowed to overlap, and they do so in a smooth way.
• *Smoothing splines* — also similar to regression splines, but they minimize a residual sum of squares criterion subject to a smoothness penalty.
• *Generalized additive models* — extend the methods above to handle multiple predictors.

## Polynomial regression

This is the most traditional way to extend the linear model. By adding higher-degree polynomial terms, polynomial regression lets us generate nonlinear curves while still estimating the coefficients by least squares.
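The idea can be sketched in a few lines. This is a minimal example on synthetic data (the examples later in the post use the `Wage` data instead):

```r
# Fit a degree-4 polynomial by ordinary least squares on synthetic data.
set.seed(1)
x <- runif(100, 0, 10)
y <- sin(x) + rnorm(100, sd = 0.2)

# poly() builds an orthogonal polynomial basis; poly(x, 4, raw = TRUE)
# would use x, x^2, x^3, x^4 directly, with identical fitted values.
fit <- lm(y ~ poly(x, 4))

# Predict on a grid: the fitted curve is nonlinear in x, yet the model
# is still linear in its basis, so least squares applies unchanged.
grid <- seq(0, 10, length.out = 200)
pred <- predict(fit, newdata = data.frame(x = grid))
```

The key point is that the model stays linear in the coefficients, so everything we know about `lm()` (standard errors, ANOVA, prediction intervals) carries over.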

## Step functions

Step functions are often used in biostatistics and epidemiology.
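The bin-and-fit idea can be sketched as follows, again on synthetic data: `cut()` turns a numeric variable into *K* ordered bins, and `lm()` then fits one constant per bin.

```r
# Piecewise-constant fit: K = 4 bins, one fitted constant per bin.
set.seed(1)
x <- runif(100, 0, 10)
y <- sin(x) + rnorm(100, sd = 0.2)

bins <- cut(x, 4)      # qualitative variable with 4 levels
fit  <- lm(y ~ bins)   # intercept + 3 level offsets
table(bins)            # observation counts per region
```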

## Regression spline

Regression splines are one of many approaches that extend polynomial regression and step functions using *basis functions*. In fact, polynomials and step functions are simply special cases of basis functions.

This is an example of piecewise cubic fitting (top left).

To fix this, a better solution is to impose constraints so that the fitted curve must be continuous — and, for a cubic spline, so that its first and second derivatives are also continuous at each knot.
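These continuity constraints are built directly into the B-spline basis produced by `bs()` from the `splines` package (which ships with R). A minimal sketch on synthetic data:

```r
library(splines)  # part of the standard R distribution

# A cubic B-spline basis from bs() has continuity of the fit and of its
# first two derivatives built in at every knot.
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.2)

fit <- lm(y ~ bs(x, knots = c(2.5, 5, 7.5)))

# A cubic spline with K knots uses K + 3 basis functions plus an intercept:
length(coef(fit))  # K = 3 knots -> 7 coefficients
```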

### Choose the location and number of knots

One option is to place more knots where we think the change is fastest and fewer knots where it is more stable. But in practice, knots are usually placed in a uniform way.

It should be clear that in this case, there are actually five knots, including boundary knots.

So how many knots should we use? A simple choice is to try many knots and see which produces the best curve. However, a more objective approach is to use cross validation.

## Smoothing splines

We have discussed regression splines, which are created by specifying a set of knots, generating a sequence of basis functions, and then estimating the spline coefficients by least squares. Smoothing splines are another way to create a spline. Recall that our goal is to find a function that fits the observed data well, i.e. that makes RSS small. However, with no constraints on the function, we could always drive RSS to zero by choosing a function that interpolates all of the data exactly.
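Formally, the smoothing spline is the function $g$ that minimizes a penalized residual sum of squares:

```latex
\min_{g}\ \sum_{i=1}^{n} \bigl( y_i - g(x_i) \bigr)^2 \;+\; \lambda \int g''(t)^2 \, dt,
\qquad \lambda \ge 0
```

When $\lambda = 0$ the penalty has no effect and $g$ can interpolate the data; as $\lambda \to \infty$, $g$ is forced toward a straight line — the least squares fit. The minimizer turns out to be a natural cubic spline with knots at every data point, shrunk according to $\lambda$.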

### Select the smoothing parameter lambda

Again, we turn to cross-validation. It turns out that LOOCV can be computed very efficiently for smoothing splines, regression splines, and other arbitrary basis functions.
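The shortcut, stated here as in standard treatments of smoothing splines, expresses the LOOCV error in terms of a single fit $\hat g_\lambda$ on the full data and the smoother matrix $S_\lambda$:

```latex
\mathrm{RSS}_{cv}(\lambda)
= \sum_{i=1}^{n} \bigl( y_i - \hat g_\lambda^{(-i)}(x_i) \bigr)^2
= \sum_{i=1}^{n} \left[ \frac{y_i - \hat g_\lambda(x_i)}{1 - \{S_\lambda\}_{ii}} \right]^2
```

Here $\hat g_\lambda^{(-i)}$ denotes the fit with observation $i$ held out, yet all $n$ leave-one-out errors follow from one fit on all the data.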

Smoothing splines are often preferable to regression splines because they usually create simpler models with comparable fit.

## Local regression

Local regression computes the fit at a target point *x*₀ using only the nearby training observations.
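As a hand-rolled sketch of the idea (on synthetic data, with a hypothetical `local_fit` helper invented for illustration): at a target point *x*₀, take the fraction `span` of nearest neighbours, weight them with a tricube kernel, and fit a weighted least squares line.

```r
# Hypothetical helper: local linear regression at a single target point x0.
local_fit <- function(x, y, x0, span = 0.2) {
  k <- ceiling(span * length(x))
  d <- abs(x - x0)
  h <- sort(d)[k]                              # distance to k-th nearest neighbour
  w <- ifelse(d <= h, (1 - (d / h)^3)^3, 0)    # tricube weights, zero outside window
  fit <- lm(y ~ x, weights = w)                # weighted least squares line
  unname(predict(fit, newdata = data.frame(x = x0)))
}

set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.2)
local_fit(x, y, x0 = 5)   # close to sin(5)
```

In practice one would use `loess()` rather than rolling this by hand, but the sketch shows what "local" means: points outside the window get weight zero, and closer points count more.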

Local regression can be generalized in many ways. In a setting with multiple features, one particularly useful generalization is a model that is fit globally in some variables but locally in others.

## Generalized additive models

GAMs provide a general framework for extending the linear model by allowing nonlinear functions of each variable while maintaining additivity.

Fitting a GAM with smoothing splines is not so simple, because least squares cannot be used directly. Instead, a method called *backfitting* is used.
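A minimal backfitting sketch on synthetic data, using `smooth.spline()` as the smoother for each coordinate (the `gam` package does this with more care; the data and variable names here are made up for illustration):

```r
# Backfitting for y = f1(x1) + f2(x2) + noise: cycle over the terms,
# smoothing the partial residuals of each against its own predictor.
set.seed(1)
n  <- 300
x1 <- runif(n, -2, 2)
x2 <- runif(n, -2, 2)
y  <- sin(x1) + x2^2 + rnorm(n, sd = 0.2)

f1 <- rep(0, n)
f2 <- rep(0, n)
for (it in 1:20) {
  # update f1 holding f2 fixed, then centre it (the mean goes to the intercept)
  f1 <- predict(smooth.spline(x1, y - mean(y) - f2), x1)$y
  f1 <- f1 - mean(f1)
  # update f2 holding f1 fixed
  f2 <- predict(smooth.spline(x2, y - mean(y) - f1), x2)$y
  f2 <- f2 - mean(f2)
}
rss <- sum((y - mean(y) - f1 - f2)^2)
```

Each pass refits one component against the residuals left by all the others; because the model is additive, the loop typically stabilizes after a handful of iterations.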

Advantages

• GAMs allow nonlinear functions to be fitted to each predictor, so we can automatically model nonlinear relationships that standard linear regression would miss, without trying many different transformations of each variable by hand.
• The nonlinear fits can potentially make more accurate predictions of the response *Y*.
• Because the model is additive, we can still examine the effect of each predictor on *Y* while holding the other variables fixed.

Shortcomings

• The main limitation is that the model is restricted to be additive, so important interactions can be missed.

## Examples

### Polynomial regression and step function

```r
library(ISLR)
attach(Wage)
```

We can easily use `poly()` to fit a polynomial, specifying the variable and the degree. This function returns a matrix of orthogonal polynomials, which means each column is a linear combination of `age`, `age^2`, `age^3`, and `age^4`. If you want the raw variables directly, specify `raw=TRUE`; this does not affect the predictions, but it makes the coefficient estimates easier to interpret.

```r
fit <- lm(wage ~ poly(age, 4), data = Wage)
kable(coef(summary(fit)))
```

Now we create a grid of `age` values at which we want predictions. Finally, we plot the data and the degree-4 polynomial fit.

```r
ageLims <- range(age)
age.grid <- seq(from = ageLims[1], to = ageLims[2])

pred <- predict(fit, newdata = list(age = age.grid), se = TRUE)
```

```r
# Standard-error bands: fitted curve plus/minus two standard errors
se.bands <- cbind(pred$fit + 2 * pred$se.fit, pred$fit - 2 * pred$se.fit)

plot(age, wage, xlim = ageLims, cex = .5, col = "darkgrey")
lines(age.grid, pred$fit, lwd = 2, col = "blue")
matlines(age.grid, se.bands, lwd = 2, col = "blue", lty = 3)
```

In this simple setting, we can compare the nested polynomial models with an ANOVA F-test, which tests the null hypothesis that the simpler model is sufficient against the more complex alternative.

```r
## Analysis of Variance Table
##
## Model 1: wage ~ age
## Model 2: wage ~ poly(age, 2)
## Model 3: wage ~ poly(age, 3)
## Model 4: wage ~ poly(age, 4)
## Model 5: wage ~ poly(age, 5)
##   Res.Df     RSS Df Sum of Sq      F Pr(>F)
## 1   2998 5022216
## 2   2997 4793430  1    228786 143.59 <2e-16 ***
## 3   2996 4777674  1     15756   9.89 0.0017 **
## 4   2995 4771604  1      6070   3.81 0.0510 .
## 5   2994 4770322  1      1283   0.80 0.3697
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

We see that the p-value comparing the linear model `M_1` to the quadratic model `M_2` is essentially zero, indicating that a linear fit is not sufficient. The cubic term is also significant, while the quartic term's p-value is borderline (about 5%). We therefore conclude that a quadratic or cubic model is a reasonable fit to this data, preferring the simpler of the two.

We can also use cross validation to select polynomial degree.

In fact, the minimum cross-validation error here occurs for the quartic polynomial, but choosing the cubic or quadratic model costs little. Next, we consider predicting whether an individual earns more than $250,000 per year.

However, computing confidence intervals directly on the probability scale is unreasonable, because we end up with some negative probabilities. To generate sensible confidence intervals, it makes more sense to compute them on the *logit* scale and then transform back.

Plot:

```r
plot(age, I(wage > 250), xlim = ageLims, type = "n", ylim = c(0, .2))
lines(age.grid, pfit, lwd = 2, col = "blue")
matlines(age.grid, se.bands, lwd = 1, col = "blue", lty = 3)
```

### Step functions

Here we discretize `age` into bins using `cut()`.

```r
table(cut(age, 4))
```

```r
##
## (17.9,33.5]   (33.5,49]   (49,64.5] (64.5,80.1]
##         750        1399         779          72
```

```r
fit <- lm(wage ~ cut(age, 4), data = Wage)
coef(summary(fit))
```

```r
##                        Estimate Std. Error t value  Pr(>|t|)
## (Intercept)              94.158      1.476  63.790 0.000e+00
## cut(age, 4)(33.5,49]     24.053      1.829  13.148 1.982e-38
## cut(age, 4)(49,64.5]     23.665      2.068  11.443 1.041e-29
## cut(age, 4)(64.5,80.1]    7.641      4.987   1.532 1.256e-01
```

### Splines with the `splines` package

Here, we will use cubic splines.

Because we use a cubic spline with three knots, the resulting basis has six functions.

```r
dim(bs(age, knots = c(25, 40, 60)))
## [1] 3000    6

dim(bs(age, df = 6))
## [1] 3000    6

attr(bs(age, df = 6), "knots")
##   25%   50%   75%
## 33.75 42.00 51.00
```

Fit the spline curve.

We can also fit smoothing splines. Here we first fit a smoothing spline with 16 degrees of freedom, and then let cross-validation select the smoothness, which yields 6.8 effective degrees of freedom.

```r
fit  <- smooth.spline(age, wage, df = 16)
fit2 <- smooth.spline(age, wage, cv = TRUE)
fit2$df
## [1] 6.795

lines(fit, col = 'red', lwd = 2)
lines(fit2, col = 'blue', lwd = 1)
legend('topright', legend = c('16 DF', '6.8 DF'),
       col = c('red', 'blue'), lty = 1, lwd = 2, cex = 0.8)
```

### Local regression

We perform local regression with the `loess()` function.
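A self-contained sketch with `loess()` on synthetic data (the text's own example uses the `Wage` data); `span` is the fraction of observations contributing to each local fit.

```r
# Smaller span -> more local, wigglier fit; larger span -> smoother fit.
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.2)

fit.2 <- loess(y ~ x, span = 0.2)   # 20% of points per local fit
fit.5 <- loess(y ~ x, span = 0.5)   # 50% of points per local fit

grid <- seq(0, 10, length.out = 100)
plot(x, y, cex = .5, col = "darkgrey")
lines(grid, predict(fit.2, data.frame(x = grid)), col = "red", lwd = 2)
lines(grid, predict(fit.5, data.frame(x = grid)), col = "blue", lwd = 2)
```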

## GAMs

Now we use a GAM to predict `wage` from spline functions of `year` and `age`, with `education` as a qualitative predictor. Since this is just a big linear regression model with an appropriate choice of basis functions, we can simply fit it with the `lm()` function.
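To make that point concrete, here is a sketch of the same kind of fit on synthetic stand-in data (the variable names and data-generating process are invented for illustration): natural-spline terms from `ns()` are just extra columns in the design matrix, so `lm()` handles them directly.

```r
library(splines)  # ns() natural-spline basis, ships with R

set.seed(1)
n    <- 500
year <- sample(2003:2009, n, replace = TRUE)
age  <- runif(n, 18, 80)
edu  <- factor(sample(c("HS", "College", "Advanced"), n, replace = TRUE))
wage <- 50 + 0.5 * (year - 2003) +
        30 * exp(-((age - 45) / 15)^2) +   # nonlinear effect of age
        10 * (edu == "College") + 25 * (edu == "Advanced") +
        rnorm(n, sd = 5)

# An additive model fit by ordinary least squares:
gam1 <- lm(wage ~ ns(year, 4) + ns(age, 5) + edu)
length(coef(gam1))   # 1 intercept + 4 + 5 spline columns + 2 dummies
```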

In order to fit the model with smoothing splines instead of natural splines, we can no longer use `lm()`; we need the `gam` library.

Plot these two models.

The function of `year` looks rather linear. We can create a new model that treats it as such and then use ANOVA tests to decide which model is best.

```r
## Analysis of Variance Table
##
## Model 1: wage ~ ns(age, 5) + education
## Model 2: wage ~ year + s(age, 5) + education
## Model 3: wage ~ s(year, 4) + s(age, 5) + education
##   Res.Df     RSS Df Sum of Sq    F  Pr(>F)
## 1   2990 3712881
## 2   2989 3693842  1     19040 15.4 8.9e-05 ***
## 3   2986 3689770  3      4071  1.1    0.35
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

It appears that a GAM with a linear function of `year` is significantly better than a GAM that omits `year` entirely; however, there is no evidence that a nonlinear function of `year` is needed (p = 0.35).

```r
##
## Deviance Residuals:
##     Min      1Q  Median      3Q     Max
## -119.43  -19.70   -3.33   14.17  213.48
##
## (Dispersion Parameter for gaussian family taken to be 1236)
##
##     Null Deviance: 5222086 on 2999 degrees of freedom
## Residual Deviance: 3689770 on 2986 degrees of freedom
## AIC: 29888
##
## Number of Local Scoring Iterations: 2
##
## Anova for Parametric Effects
##              Df  Sum Sq Mean Sq F value  Pr(>F)
## s(year, 4)    1   27162   27162      22 2.9e-06 ***
## s(age, 5)     1  195338  195338     158 < 2e-16 ***
## education     4 1069726  267432     216 < 2e-16 ***
## Residuals  2986 3689770    1236
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova for Nonparametric Effects
##             Npar Df Npar F  Pr(F)
## (Intercept)
## s(year, 4)        3    1.1   0.35
## s(age, 5)         4   32.4 <2e-16 ***
## education
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

In the ANOVA for nonparametric effects, the large p-value for `s(year, 4)` again confirms that a nonlinear function of `year` contributes nothing beyond a linear term.
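As a runnable stand-in for the fits above (which need the `ISLR` data and the `gam` package), here is a minimal smooth-term GAM fit with `mgcv`, which ships with R; the synthetic data and variable names are invented for illustration, not taken from `Wage`.

```r
library(mgcv)  # recommended package shipped with R; the text uses gam instead

set.seed(1)
n    <- 500
year <- sample(2003:2009, n, replace = TRUE)
age  <- runif(n, 18, 80)
edu  <- factor(sample(c("HS", "College", "Advanced"), n, replace = TRUE))
wage <- 50 + 0.5 * (year - 2003) +
        30 * exp(-((age - 45) / 15)^2) +   # nonlinear effect of age
        10 * (edu == "College") + 25 * (edu == "Advanced") +
        rnorm(n, sd = 5)

# Penalized-regression smooths for year and age; education enters as dummies.
# k = 5 keeps the year smooth's basis below its 7 unique values.
fit <- mgcv::gam(wage ~ s(year, k = 5) + s(age) + edu)
summary(fit)
```

`mgcv` selects the smoothness of each term automatically by generalized cross-validation or REML, rather than requiring fixed degrees of freedom as in `s(age, 5)` from the `gam` package.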

Next, we fit a GAM using local regression as the building block.

We can also use local regression to create interaction terms before calling `gam()`.

We can then plot the resulting surface.
