# tecdat | Generalized Additive Models (GAMs) in R and Visualization of Smooth Functions

Time: 2022-8-18

We use generalized additive models (GAMs) in our research work. The mgcv package is an excellent suite of software for specifying, fitting and visualizing GAMs for very large datasets.

This post describes what is currently possible with Generalized Additive Models (GAMs).

First, we need to load mgcv:

``library('mgcv')``

# Popular example datasets

The data in dat are well studied in GAM-related work and contain four covariates, labeled x0 to x3, that are related to the response y non-linearly to varying degrees. We want to recover these relationships by using splines to approximate the true relationship between each covariate and the response. To fit an additive model, we use

``mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")``
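The text does not show how dat is created. The sketch below assumes it comes from mgcv's gamSim() simulator (example 1), which returns a response y and covariates x0 to x3; the seed and sample size are arbitrary choices, not values from the original post.

```r
library(mgcv)

set.seed(42)  # arbitrary seed, used only for reproducibility

# Assumption: dat is simulated with gamSim() example 1, which provides
# y and the covariates x0 to x3 (plus the true component functions).
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)

# Name the data and method arguments explicitly: the second positional
# argument of gam() is family, not data.
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")
```

Naming the arguments avoids relying on argument order, which is easy to get wrong with gam().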

mgcv provides a summary() method to extract information about the fitted GAM, and the k.check() function to check whether each smooth function in the model uses a sufficient number of basis functions. You may not have used k.check() directly: it is called by gam.check(), which prints additional diagnostics and produces four model diagnostic plots.
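As a minimal sketch of these two steps, the following refits the model on data assumed to come from gamSim() example 1 (the post does not say how dat was built), then prints the model summary and runs the basis-dimension check, which mgcv exports as k.check().

```r
library(mgcv)

set.seed(42)  # arbitrary seed
# Assumed data source; the original post does not show this step
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# Parametric coefficients, approximate significance of the smooth
# terms, and overall fit statistics
sm <- summary(mod)
print(sm)

# Basis-dimension check only: one row per smooth term
kc <- k.check(mod)
print(kc)
```

A low p-value for a term in the k.check() output suggests that its basis dimension may be too small and that refitting with a larger k is worth trying.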

# Drawing smooth functions

To visualize estimated GAMs, mgcv provides the plot.gam() method and the vis.gam() function, both of which draw base-graphics plots from fitted model objects. To visualize the four smooth functions estimated in the GAM, we use

``plot(mod)``

The result is a plot of every smooth function in the GAM mod.

With pages = 1, plot() draws all panels on a single graphics device and lines up the individual plots.
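A short sketch of some common plot.gam() options follows. The data are again assumed to come from gamSim() example 1, and the particular options (shaded bands, partial residuals) are illustrative choices rather than the settings used for the figures in the post.

```r
library(mgcv)

set.seed(42)  # arbitrary seed
# Assumed data source; the original post does not show this step
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# All four smooths on one page, with shaded confidence bands and
# partial residuals added to each panel. The return value is the list
# of plot data that plot.gam() returns invisibly.
pd <- plot(mod, pages = 1, shade = TRUE, residuals = TRUE, pch = 1, cex = 0.5)

# A single panel: select = 2 picks the second smooth, s(x1)
plot(mod, select = 2, shade = TRUE)
```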

# Extracting smooth function data

If you want to extract most of the data used to build these plots, you can capture the value that plot() returns invisibly: a list with one element per smooth function, holding the data used to draw it. The second element corresponds to s(x1).

``pd <- plot(mod); pd[[2]]``

# Diagnostic plots

Diagnostic plots are produced by gam.check():

``gam.check(mod)``

The result is an array of four diagnostic plots: a QQ plot (top left) and a histogram (bottom left) of the model residuals, a plot of residuals versus the linear predictor (top right), and a plot of observed versus fitted values (bottom right).
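The four-panel diagnostic display described above can be sketched with mgcv's gam.check(). The model is refit here on data assumed to come from gamSim() example 1 so that the block runs on its own.

```r
library(mgcv)

set.seed(42)  # arbitrary seed
# Assumed data source; the original post does not show this step
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# gam.check() arranges the four panels itself and also prints
# convergence information and the basis-dimension check to the console.
# capture.output() keeps that console text so it can be inspected later.
check_text <- capture.output(gam.check(mod))
cat(check_text, sep = "\n")
```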

Of these four graphs, the QQ plot also has its own user-accessible function: qq.gam(mod) produces the QQ plot at the top left of the above figure.

``qq.gam(mod, rep = 100)``

The result is a QQ plot of the residuals; with rep greater than 0, the reference quantiles are obtained by simulating data from the fitted model.
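In mgcv the stand-alone QQ plot is drawn by qq.gam(). The sketch below contrasts its default with the simulation-based variant; the data source (gamSim() example 1) and the value rep = 100 are assumptions made for illustration.

```r
library(mgcv)

set.seed(42)  # arbitrary seed
# Assumed data source; the original post does not show this step
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# Default (rep = 0): reference quantiles are computed directly,
# without simulation, when the model family allows it
qq.gam(mod)

# rep > 0: reference quantiles and a reference band come from data
# sets simulated from the fitted model
qq.gam(mod, rep = 100, level = 0.9, type = "deviance")

# The deviance residuals that the plot is based on
res <- residuals(mod, type = "deviance")
```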

plot() also handles many of the more specialized smooth functions, for example two-dimensional smooth functions.

``plot(mod)``

The default plot of a 2D smooth function produced by plot().
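The model behind the 2D example is not shown in the post. As a hedged illustration, the sketch below simulates a simple bivariate surface (the data frame sim2d, the test function, and the model name m2d are all hypothetical) and draws the fitted smooth in several ways.

```r
library(mgcv)

set.seed(1)  # arbitrary seed
n <- 500

# Hypothetical bivariate data: a smooth surface in x and z plus noise
sim2d <- data.frame(x = runif(n), z = runif(n))
sim2d$y <- sin(2 * pi * sim2d$x) * cos(2 * pi * sim2d$z) + rnorm(n, sd = 0.3)

# One isotropic two-dimensional smooth of x and z
m2d <- gam(y ~ s(x, z), data = sim2d, method = "REML")

plot(m2d)              # default: contours of the estimated surface
plot(m2d, scheme = 2)  # heat map with overlaid contours

# vis.gam() offers perspective and contour views of the same fit
vis.gam(m2d, view = c("x", "z"), plot.type = "persp", theta = 30)
```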

Factor-smooth interaction terms, which are the smooth-curve equivalent of random slopes and intercepts, are plotted on a single panel, with colors used to distinguish the different random smooth functions.

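```r
## simulated data
## (the seed, n, nf, a and b are assumed values; the original listing omits them)
set.seed(0)
f0 <- function(x) 2 * sin(pi * x)
f1 <- function(x, a = 2, b = -1) exp(a * x) + b
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
n <- 500; nf <- 10
fac <- sample(1:nf, n, replace = TRUE)
x0 <- runif(n); x1 <- runif(n); x2 <- runif(n)
a <- rnorm(nf) * 0.2 + 2; b <- rnorm(nf) * 0.5
f <- f0(x0) + f1(x1, a[fac], b[fac]) + f2(x2)
fac <- factor(fac)
y <- f + rnorm(n) * 2

## model formula is assumed; the original listing does not show the fit
mod <- gam(y ~ s(x0) + s(x1, fac, bs = "fs", k = 5) + s(x2, k = 20), method = "ML")
plot(mod)
```

Results of a more complex GAM with factor-smooth interaction terms, bs = 'fs'.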

# What else can be done?

plot() can handle most smooth functions that mgcv can estimate, including by-variable smooth functions with factor and continuous by variables, random-effect smooth functions (bs = 're'), and 2D tensor product smooth functions, as well as models with parametric terms.
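To make the list above concrete, here is a minimal sketch of one model that combines a factor by-variable smooth, a tensor product smooth, a random-effect smooth, and a parametric term. The data frame sim, its variables, and the simulated effects are all hypothetical and chosen only so that the example runs.

```r
library(mgcv)

set.seed(2)  # arbitrary seed
n <- 600

# Hypothetical data: three covariates, a 3-level factor and a
# 10-level grouping factor for the random effect
sim <- data.frame(
  x0  = runif(n), x1 = runif(n), x2 = runif(n),
  fac = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  g   = factor(sample(1:10, n, replace = TRUE))
)
g_eff <- rnorm(10, sd = 0.5)                       # one intercept per level of g
shape <- c(a = 1, b = 2, c = 3)[as.character(sim$fac)]  # curve differs by fac
sim$y <- sin(shape * pi * sim$x0) + sim$x1 * sim$x2 +
  g_eff[as.integer(sim$g)] + rnorm(n, sd = 0.3)

m_all <- gam(
  y ~ fac +              # parametric term
    s(x0, by = fac) +    # one smooth of x0 per level of fac
    te(x1, x2) +         # 2D tensor product smooth
    s(g, bs = "re"),     # random intercept for g
  data = sim, method = "REML"
)

# One panel per smooth term, all on a single page
plot(m_all, pages = 1)
```

Including fac as a parametric term alongside s(x0, by = fac) is the usual choice here, because each by-variable smooth is centered and so cannot carry the differences in level between groups.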

## References

Augustin, N. H., Sauleau, E.-A., and Wood, S. N. (2012). On quantile quantile plots for generalized linear models. _Computational Statistics & Data Analysis_ 56, 2404–2409. doi:10.1016/j.csda.2012.01.026.
