Time: 2021-12-1

# Model background

Every dynamic phenomenon can be characterized by a latent process Λ(t) that evolves in continuous time t. When modeling repeated measures of a marker, we usually do not think of the marker as a latent process measured with error. Yet this is the underlying assumption of mixed model theory. The latent process mixed model exploits this framework to extend linear mixed model theory to any type of outcome (ordinal, binary, categorical, or continuous with any distribution).

## Latent process mixed model

The latent process mixed model was introduced in Proust-Lima et al. (2006, https://doi.org/10.1111/j.1541-0420.2006.00573.x) and Proust-Lima et al. (2013, https://doi.org/10.1111/bmsp.12000).

The quantity of interest, defined as a latent process, is modeled as a function of time with a linear mixed model:

$$\Lambda(t) = X(t)\beta + Z(t)u_i + w_i(t)$$

where:

• $X(t)$ and $Z(t)$ are vectors of covariates ($Z(t)$ is included in $X(t)$);

• $\beta$ is the vector of fixed effects (i.e. population mean effects);

• $u_i$ is the vector of random effects (i.e. individual effects), distributed according to a zero-mean multivariate normal distribution with covariance matrix $B$;

• $(w_i(t))$ is a Gaussian process that can be added to the model to relax the within-subject correlation structure.

The relationship between the latent process and the observations $Y_{ij}$ of the marker (for subject $i$ and occasion $j$) is then defined by the observation equation:

$$Y_{ij} = H(\Lambda(t_{ij}) + \epsilon_{ij};\ \eta)$$

where:

• $t_{ij}$ is the measurement time for subject $i$ at occasion $j$;

• $\epsilon_{ij}$ is an independent zero-mean Gaussian error;

• $H$ is a link function, parameterized by $\eta$, that transforms the latent process into the scale and metric of the marker.

Different parametric families are used. When the marker is continuous, $H^{-1}$ belongs to a parametric family of monotonically increasing functions, among which:

• Linear transformation: the model then reduces to a linear mixed model (2 parameters);

• Rescaled Beta cumulative distribution function (4 parameters);

• Quadratic I-splines with m knots (m + 2 parameters).
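The structure described above (a latent process with random effects, observed through a link function H) can be made concrete with a small base R simulation. This is only a sketch: it assumes a linear link, a random intercept and slope, and no Gaussian process w_i(t), and every numeric value (`beta`, `B`, `eta`, the sample sizes) is an arbitrary choice for illustration.

```r
# Simulate repeated measures from a latent process mixed model with a linear link.
# All parameter values below are illustrative assumptions, not estimates.
set.seed(1)
n_subjects <- 200
n_visits <- 5

beta <- c(0, 0.5)                       # fixed intercept and slope of the latent process
B <- matrix(c(1, 0.2, 0.2, 0.3), 2, 2)  # covariance matrix of the random effects u_i
eta <- c(10, 4)                         # linear link: Y = eta[1] + eta[2] * (Lambda + eps)

# Draw u_i ~ N(0, B) using the Cholesky factor of B
u <- matrix(rnorm(n_subjects * 2), n_subjects, 2) %*% chol(B)

sim <- do.call(rbind, lapply(seq_len(n_subjects), function(i) {
  t_ij <- sort(runif(n_visits, 0, 3))                            # measurement times
  lambda <- beta[1] + beta[2] * t_ij + u[i, 1] + u[i, 2] * t_ij  # latent process
  eps <- rnorm(n_visits)                                         # error with variance 1
  data.frame(id = i, time = t_ij, Y = eta[1] + eta[2] * (lambda + eps))
}))

head(sim)
```

With a linear link the two `eta` values only shift and rescale the latent process, which is why this case reduces to an ordinary linear mixed model.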

When the marker is discrete (binary or ordinal), H is a threshold function: each level of Y corresponds to an interval of Λ(t_ij) + ϵ_ij whose boundaries are to be estimated.
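A threshold link can be sketched in a few lines of base R. The threshold values and the helper name `threshold_link` are hypothetical; in a fitted model the thresholds are estimated. A marker with four ordered levels needs three thresholds.

```r
# Threshold link: map a continuous latent value to ordered levels 0..3.
# The three thresholds are hypothetical; in a fitted model they are estimated.
thresholds <- c(-0.5, 0.8, 2.0)

threshold_link <- function(x, cuts) {
  # findInterval() counts how many thresholds lie at or below x,
  # which is the level index when levels start at 0
  findInterval(x, cuts)
}

latent_plus_error <- c(-1.2, 0.0, 1.5, 3.1)
threshold_link(latent_plus_error, thresholds)
# [1] 0 1 2 3
```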

## Identifiability

As with any latent variable model, the metric of the latent variable must be defined. In lcmm, the variance of the errors is fixed to 1 and the mean intercept (in β) is fixed to 0.

# Example

In this post, the latent process mixed model implemented in `lcmm` is illustrated by studying the linear trajectory of depressive symptoms (measured with the CES-D scale) as a function of age, centered at 65 years, and sex. Correlated random effects on the intercept and on age65 are included.

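## Model considered

The _fixed-effect part_ is $\beta_0 + \beta_1\,\mathrm{age65} + \beta_2\,\mathrm{male} + \beta_3\,\mathrm{age65} \times \mathrm{male}$, and the _random-effect part_ is $u_{0i} + u_{1i}\,\mathrm{age65}$.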

## Estimating the model with different continuous link functions H

We use age centered at 65 years and expressed in decades.
The latent process mixed model can be fitted with different link functions, as shown below; the link is chosen with the `link` argument. The calls below are abbreviated: they omit the `data` argument and, in places, other arguments that are unchanged from the first call.
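The centering and rescaling of age can be written as a one-line transformation. The sketch below uses a toy data frame in place of the real data; the column names `ID` and `age` are assumptions.

```r
# Toy data frame standing in for the real data (column names are assumptions)
dat <- data.frame(ID = c(1, 1, 2), age = c(65, 75, 90))

# Center age at 65 years and express it in decades
dat$age65 <- (dat$age - 65) / 10
dat
```

With this coding, the intercept refers to age 65 and the slope is a change per ten years.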

When a linear link function is specified, the model reduces to a standard linear mixed model. The linear link is the default:

``lcmm(CESD ~ age65*male, random=~ age65, subject='ID', link='linear')``

It is identical to the model fitted with `hlme`. The only difference with an hlme object is the parameterization of the intercept and of the residual standard error.

``hlme(CESD ~ age65*male, random=~ age65, subject='ID')``

The log-likelihood is the same, but the estimated parameters β are not on the same scale.

```
loglik
[1] -7056.652
```

### Nonlinear link function 1: Beta cumulative distribution function

The rescaled cumulative distribution function (CDF) of a Beta distribution provides concave, convex or sigmoid transformations between the marker and its underlying latent process.

``lcmm(CESD ~ age65*male, random=~ age65, subject='ID', link='beta')``
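To see how a Beta CDF can take concave, convex or sigmoid shapes, the following base R sketch evaluates `pbeta` for three pairs of shape parameters. The shape values are illustrative assumptions and do not follow the parameterization that lcmm uses internally; only the qualitative shapes are the point.

```r
# Shapes available from a Beta CDF on [0, 1]; the shape parameters are
# illustrative assumptions and do not follow lcmm's internal parameterization.
x <- seq(0, 1, by = 0.25)

concave <- pbeta(x, shape1 = 0.5, shape2 = 1)  # equals sqrt(x)
convex  <- pbeta(x, shape1 = 2,   shape2 = 1)  # equals x^2
sigmoid <- pbeta(x, shape1 = 2,   shape2 = 2)  # equals 3x^2 - 2x^3

round(cbind(x, concave, convex, sigmoid), 3)
```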

### Nonlinear link function 2: quadratic I-splines

The family of quadratic I-splines can approximate any continuous increasing link function. It involves knots distributed within the range of the marker. By default, 5 equidistant knots spanning the range of the marker are used:

``lcmm(CESD ~ age65*male, random=~ age65, subject='ID', link='splines')``

The number of knots and their location can be specified. The number of knots is entered first, followed by `-`, then the location is given as `equi`, `quant` or `manual`, for equidistant knots, knots at the quantiles of the marker distribution, or interior knots entered manually in the argument `intnodes`, respectively. The string ends with `-splines`. For example, `7-equi-splines` means I-splines with 7 equidistant knots, and `6-quant-splines` means I-splines with 6 knots located at the quantiles of the marker distribution.

For example, with 5 knots placed at the quantiles:

``lcmm(link='5-quant-splines')``
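The difference between equidistant and quantile-based knots matters most for skewed markers. The base R sketch below compares the two placements on simulated data. The simulated marker is hypothetical, and the placement rule used here (boundary knots at the minimum and maximum, interior knots at evenly spaced quantiles) is an assumption about how such knots are typically chosen, not a statement of lcmm's exact internals.

```r
# Compare equidistant and quantile-based knot positions for a skewed marker.
# The simulated marker is an illustrative assumption (a 0-52 score with many low values).
set.seed(42)
y <- pmin(round(rexp(500, rate = 1 / 8)), 52)

n_knots <- 5
probs <- seq(0, 1, length.out = n_knots)

knots_equi  <- seq(min(y), max(y), length.out = n_knots)  # evenly spaced over the range
knots_quant <- unname(quantile(y, probs = probs))         # follow the data density

rbind(equi = knots_equi, quant = knots_quant)
```

Quantile knots concentrate where most observations lie, which gives the spline more flexibility in the densely observed part of the scale.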

## Select the best model

To select the most appropriate link function, the different models can be compared. Usually this is done by comparing the models in terms of goodness of fit, using measures such as the AIC or the UACV.

The AIC (like the UACV) is reported in the output of each model.
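As a reminder of what is being compared, the AIC is computed from the log-likelihood and the number of estimated parameters. The sketch below applies the standard formula to the log-likelihood reported earlier for the linear link. The helper name `aic` is hypothetical, and the parameter count of 8 is an assumption derived from the model description (4 fixed effects, 3 variance-covariance terms for the two correlated random effects, 1 residual standard error).

```r
# AIC = -2 * log-likelihood + 2 * number of parameters
aic <- function(loglik, npm) -2 * loglik + 2 * npm

# Linear-link model: log-likelihood reported earlier; 8 parameters assumed
# (4 fixed effects, 3 random-effect variance-covariance terms, 1 residual SE)
aic(loglik = -7056.652, npm = 8)
```

Lower values indicate a better trade-off between fit and complexity, so a more flexible link is preferred only if its gain in log-likelihood outweighs its extra parameters.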

In this case, according to the AIC, the best fit is provided by the model with the I-splines link function with 5 knots at the quantiles. The estimated link functions can be compared in a plot:

```r
# Plot the estimated link function, then label the compared links
plot(mli, which = "linkfunction", xlab = "latent process")
legend(x = "topleft", legend = c("linear", "beta", "splines (5 equidistant knots)", "splines (5 knots at quantiles)"))
```

We see that the two spline transformations are very close. The linear model does not seem appropriate, as shown by the gap between the linear curve and the spline curves. The Beta transformation differs from the splines only at high values of the latent process. Confidence bands for a transformation can be obtained by a Monte Carlo method:

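```r
# Monte Carlo confidence bands for the estimated link function
linkspl5q <- predictlink(mspl5q, ndraws = 2000)
plot(linkspl5q, lty = 2)
legend(x = "left", legend = c("95% confidence band", "quantile spline"), lty = c(2, NA))
```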

## Estimating the model with discrete link function H

Sometimes, for markers with only a limited number of levels, continuous link functions are not appropriate and the ordinal nature of the marker must be handled. The `lcmm` function handles this case with a threshold link function. However, the numerical complexity of a model with a threshold link function is much higher (because of the numerical integration over the random-effect distribution). This should be kept in mind when fitting such a model, and the number of random effects should be chosen sparingly.

Note that the model becomes a cumulative probit mixed model. Here is an example with the HIER variable (4 levels), because a threshold link function for CESD, with its range of 0-52, would involve too many parameters (e.g. 52 threshold parameters).

`` lcmm(HIER ~ age65*male, link='thresholds')``
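As a sketch of why the threshold model is a cumulative probit model, assume ordered levels k = 1, ..., K with thresholds η_1 < ... < η_{K-1} (this indexing is a notational assumption), no Gaussian process w_i(t), and an error variance fixed to 1 as stated in the identifiability section. The conditional distribution of the marker given the random effects would then be:

```latex
P(Y_{ij} \le k \mid u_i)
  = P\big(\Lambda(t_{ij}) + \epsilon_{ij} \le \eta_k \mid u_i\big)
  = \Phi\big(\eta_k - \Lambda(t_{ij})\big),
  \qquad k = 1, \dots, K-1,
```

where Φ is the standard normal cumulative distribution function. The marginal likelihood requires integrating these probabilities over the distribution of u_i, which is the numerical integration responsible for the extra computational cost.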

## Post-fit outputs

### Summary

The summary of the model includes convergence, goodness-of-fit criteria and estimated parameters.

### Predicted trajectories for a covariate profile

The predicted trajectories can be computed on the natural scale of the dependent variable for a given profile of covariates:

``women <- predictY(msp, newdata=datnew, var.time="age", draws=TRUE)``
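The `datnew` data frame passed as `newdata` must contain every covariate used in the model, plus the variable named in `var.time`. A minimal sketch of how such a grid might be built for one profile is shown below; the age range and the number of grid points are assumptions.

```r
# Prediction grid for one covariate profile (the grid bounds and size are assumptions)
datnew <- data.frame(age = seq(65, 95, length.out = 100))
datnew$age65 <- (datnew$age - 65) / 10  # same centering and scaling as in the model
datnew$male <- 0                        # profile: women; set to 1 for men

head(datnew, 3)
```

Keeping the raw `age` column alongside `age65` lets the predictions be plotted against age in years while the model still receives the centered covariate.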

Then plot:

```r
# Plot the predicted trajectory and label the curves
plot(women, xlab = "age")
legend(x = "topleft", legend = c("female", "male", "95% confidence interval", "95% confidence interval"))
```

### Goodness of fit 1: plot of residuals

The subject-specific residuals (Q-Q plot in the bottom right panel) should be Gaussian.

### Goodness of fit 2: plot of predictions versus observations

The mean predictions and observations can be plotted according to age. Note that the predictions and observations are on the scale of the latent process (the observations are transformed with the estimated link function):

``plot(mspl5q, which="fit", var.time="age65", xlab="(age - 65) / 10", break.times=8, ylab="latent process")``
