## Original link: http://tecdat.cn/?p=23509

We use generalized additive models (GAMs) in our research work. The mgcv package is an excellent suite of software for specifying, fitting and visualizing GAMs for very large datasets.

This post describes what is currently possible with mgcv for checking and visualizing fitted GAMs.

First, we need to load mgcv:

`library('mgcv')`

# Popular example datasets

The data set in dat is a well-studied example in the GAM literature. It contains four covariates, labeled x0 to x3, that have a non-linear relationship with the response variable to varying degrees.
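The post does not show how dat is created. A minimal sketch, assuming it comes from mgcv's built-in simulator `gamSim()` with its first example (four covariates x0 to x3); the seed and the argument values here are assumptions, not taken from the original:

```r
library(mgcv)

set.seed(20)  # arbitrary seed, chosen only for reproducibility (assumption)

# Example 1 of gamSim() simulates a response y and covariates x0..x3
dat <- gamSim(eg = 1, n = 400, dist = "normal", scale = 2, verbose = FALSE)

# Inspect the response and the four covariates
str(dat[, c("y", "x0", "x1", "x2", "x3")])
```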

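We want to model these relationships by using splines to approximate the true relationship between each covariate and the response. To fit an additive model, we use

`mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")`

The data and method arguments must be named: the second positional argument of gam() is the family, not the data.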

mgcv provides a summary() method to extract information about the fitted GAM.
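As an illustration of what `summary()` returns, the sketch below fits the model and pulls a few components out of the summary object. The simulated data (via `gamSim()`) and the seed are assumptions made so that the example runs on its own:

```r
library(mgcv)

set.seed(20)  # arbitrary seed (assumption)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)

# Additive model with one smooth per covariate, smoothness selected by REML
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

sm <- summary(mod)
sm$s.table    # one row per smooth: edf, Ref.df, F, p-value
sm$r.sq       # adjusted R-squared
sm$dev.expl   # proportion of deviance explained
```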

The k.check() function checks whether each smooth function in the model uses a sufficient number of basis functions. You may not have called k.check() directly: it is run by gam.check(), which also prints additional diagnostics and produces four model diagnostic plots.
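In mgcv the basis-dimension check is available as `k.check()`. The sketch below runs it on the additive model and then refits with a larger basis for one smooth; the simulated data and the choice of `k = 20` for `s(x2)` are assumptions for illustration only:

```r
library(mgcv)

set.seed(20)  # arbitrary seed (assumption)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# One row per smooth: k' (maximum possible edf), edf, k-index and p-value
kc <- k.check(mod)
kc

# If a smooth looks constrained (edf close to k', low p-value), refit with a
# larger basis dimension and check again
mod_k <- gam(y ~ s(x0) + s(x1) + s(x2, k = 20) + s(x3),
             data = dat, method = "REML")
kc_k <- k.check(mod_k)
kc_k
```

A low p-value together with an edf close to k' suggests that k may be too small for that smooth.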

# Plotting smooth functions

To visualize estimated GAMs, mgcv provides the plot.gam() method and the vis.gam() function, both of which draw base-graphics plots from fitted model objects. To visualize the four estimated smooth functions in the GAM, we use

`plot(mod)`

The result is a plot of every smooth function in the GAM mod.

With `pages = 1`, plot() draws all the panels on a single graphics device and lines up the individual plots.
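A short sketch of the panel layout and a few commonly used `plot.gam()` options. The data simulation is an assumption so that the example is self-contained; the particular option values are illustrative choices, not the original author's settings:

```r
library(mgcv)

set.seed(20)  # arbitrary seed (assumption)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# All four smooths on one page, shaded confidence bands, intervals that
# include uncertainty in the overall mean, and a separate y-axis per panel
plot(mod, pages = 1, shade = TRUE, seWithMean = TRUE, scale = 0)

# A single smooth (the third one, s(x2)) with partial residuals added
plot(mod, select = 3, residuals = TRUE, pch = 1, cex = 0.5)
```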

# Extracting smooth function data

If you want to extract most of the data used to build these plots, that is, the estimated values of the smooth functions in mod and their standard errors, you can use predict() with type = "terms".

`predict(mod, type = "terms", terms = "s(x1)", se.fit = TRUE)`
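One way to get the values behind a single panel with mgcv alone is to evaluate the smooth on a grid with `predict()`. This is a sketch under assumptions: the data are simulated with `gamSim()`, the grid size of 100 is arbitrary, and the data frame `sm_x1` is a name introduced here for illustration:

```r
library(mgcv)

set.seed(20)  # arbitrary seed (assumption)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# Grid over x1; the other covariates only need placeholder values because
# each column of a "terms" prediction depends on its own covariate alone
newd <- data.frame(x0 = 0.5, x1 = seq(0, 1, length.out = 100),
                   x2 = 0.5, x3 = 0.5)

pr <- predict(mod, newdata = newd, type = "terms", se.fit = TRUE)

# Estimated smooth of x1 with an approximate 95% interval (+/- 2 SE)
sm_x1 <- data.frame(x1  = newd$x1,
                    est = pr$fit[, "s(x1)"],
                    se  = pr$se.fit[, "s(x1)"])
sm_x1$lower <- sm_x1$est - 2 * sm_x1$se
sm_x1$upper <- sm_x1$est + 2 * sm_x1$se
head(sm_x1)
```

The resulting data frame can be passed to any plotting system to build a customized figure.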

# Diagnostic plots

The diagnostic plots are produced by gam.check():

`gam.check(mod)`

The result is an array of four diagnostic plots: a QQ plot (top left) and a histogram (bottom left) of the model residuals, a plot of residuals versus the linear predictor (top right), and a plot of observed versus fitted values (bottom right).

The QQ plot is also available through its own user-accessible function. For example, qq.gam(mod) produces the QQ plot at the top left of the figure above.

`qq.gam(mod, rep = 100)`

With `rep` greater than zero, the result is a QQ plot of the residuals with reference quantiles obtained by simulating data from the fitted model. With the default `rep = 0`, the reference quantiles are computed directly, without simulation (Augustin et al., 2012).
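The panels of the diagnostic figure can also be drawn by hand from quantities stored in the fitted model, which is useful when a single panel needs customizing. This is a rough sketch with base graphics, not a reproduction of mgcv's internal plotting code; the simulated data and the axis labels are assumptions:

```r
library(mgcv)

set.seed(20)  # arbitrary seed (assumption)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

res <- residuals(mod, type = "deviance")  # deviance residuals
lp  <- predict(mod, type = "link")        # linear predictor
fv  <- fitted(mod)                        # fitted values on the response scale

op <- par(mfrow = c(2, 2))
qq.gam(mod, rep = 100)                    # QQ plot, simulated reference quantiles
plot(lp, res, xlab = "Linear predictor", ylab = "Deviance residuals")
abline(h = 0, lty = 2)
hist(res, xlab = "Deviance residuals", main = "Histogram of residuals")
plot(fv, dat$y, xlab = "Fitted values", ylab = "Response")
par(op)
```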

plot() also handles many of the more specialized smooth functions, for example two-dimensional smooth functions. For a model mod that contains a 2D smooth such as s(x, z):

`plot(mod)`

By default, plot() draws a 2D smooth function as a contour plot.
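The model with the two-dimensional smooth is not shown in the post. A minimal sketch, assuming simulated data with two covariates `x` and `z`; the surface, the noise level, the basis size `k = 40` and the names `d2` and `mod2` are all hypothetical choices made for illustration:

```r
library(mgcv)

set.seed(1)  # arbitrary seed (assumption)
n <- 500
d2 <- data.frame(x = runif(n), z = runif(n))
# Hypothetical true surface plus Gaussian noise
d2$y <- sin(2 * pi * d2$x) * cos(2 * pi * d2$z) + rnorm(n, sd = 0.3)

# Isotropic 2D thin plate regression spline of x and z
mod2 <- gam(y ~ s(x, z, k = 40), data = d2, method = "REML")

plot(mod2)              # default: contour plot with standard-error contours
plot(mod2, scheme = 2)  # heat map with contours overlaid

# vis.gam() gives a perspective view of the fitted surface
vis.gam(mod2, view = c("x", "z"), plot.type = "persp", theta = 30)
```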

Factor-smooth interaction terms, the smooth equivalent of random slopes and intercepts, are plotted on a single panel, and colors are used to distinguish the different random smooth functions.

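```
## simulated data; n, nf, a, b and the seed are assumed values (the post
## omits them), modeled on the factor-smooth example in mgcv's help pages
library(mgcv)
set.seed(0)
f0 <- function(x) 2 * sin(pi * x)
f1 <- function(x, a = 2, b = -1) exp(a * x) + b
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
n <- 500; nf <- 25
fac <- sample(1:nf, n, replace = TRUE)
x0 <- runif(n); x1 <- runif(n); x2 <- runif(n)
a <- rnorm(nf) * 0.2 + 2; b <- rnorm(nf) * 0.5
f <- f0(x0) + f1(x1, a[fac], b[fac]) + f2(x2)
fac <- factor(fac)
y <- f + rnorm(n) * 2
## one random smooth of x1 per level of fac (bs = "fs")
mod <- gam(y ~ s(x0) + s(x1, fac, bs = "fs", k = 5) + s(x2, k = 20), method = "ML")
plot(mod)
```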

The result is a plot of a more complex GAM that includes a factor-smooth interaction term, bs = 'fs'.

# What else can be done?

plot() can handle most of the smooth functions that mgcv can estimate, including by-variable smooths with factor and continuous by variables, random effect smooths (bs = 're'), 2D tensor product smooths, and models with parametric terms.
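As a sketch of two of these term types, the model below combines a factor by-variable smooth with a random-intercept smooth. Everything about the data (the names `d3`, `g`, `subj`, the two group curves, and the noise levels) is hypothetical and serves only to make the example runnable:

```r
library(mgcv)

set.seed(2)  # arbitrary seed (assumption)
n <- 400
d3 <- data.frame(
  x    = runif(n),
  g    = factor(sample(c("a", "b"), n, replace = TRUE)),  # two groups
  subj = factor(sample(1:20, n, replace = TRUE))          # 20 subjects
)
u  <- rnorm(20, sd = 0.5)  # hypothetical subject-level intercepts
mu <- ifelse(d3$g == "a", sin(2 * pi * d3$x), 2 * d3$x^2)
d3$y <- mu + u[as.integer(d3$subj)] + rnorm(n, sd = 0.3)

# A separate smooth of x for each level of g, plus a random intercept for subj.
# The factor g also enters as a parametric term because by-factor smooths are
# centered and do not carry the group means.
mod3 <- gam(y ~ g + s(x, by = g) + s(subj, bs = "re"),
            data = d3, method = "REML")

# Two by-variable panels plus a QQ plot of the estimated random effects
plot(mod3, pages = 1)
summary(mod3)$s.table
```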

## References

Augustin, N. H., Sauleau, E.-A., and Wood, S. N. (2012). On quantile quantile plots for generalized linear models. _Computational statistics & data analysis_ 56, 2404–2409. doi:10.1016/j.csda.2012.01.026.
