Original link: http://tecdat.cn/?p=22206
Model background
Each dynamic phenomenon can be characterized by a latent process Λ(t) that evolves in continuous time t. When modeling repeated measurements of a marker, we usually do not think of the marker as a latent process measured with error. Yet this is the underlying assumption of mixed model theory. Latent process mixed models use this framework to extend linear mixed model theory to any type of outcome (ordinal, binary, continuous, categorical, with any distribution).
Latent process mixed model
The latent process mixed model was introduced in Proust-Lima et al. (2006, https://doi.org/10.1111/j.15410420.2006.00573.x and 2013, https://doi.org/10.1111/bmsp.12000).
A linear mixed model describes the quantity of interest, defined as a latent process, as a function of time:
$$\Lambda(t) = X(t)\beta + Z(t)u_i + w_i(t)$$
where:

X(t) and Z(t) are vectors of covariates (Z(t) is included in X(t));

β are the fixed effects (i.e. the population mean effects);

u_i are the random effects (i.e. the individual effects); they follow a zero-mean multivariate normal distribution with covariance matrix B;

(w_i(t)) is a Gaussian process that can be added to the model to relax the within-subject correlation structure.
The relationship between the latent process and the observations Y_ij of the marker of interest (for subject i and occasion j) is defined by the observation equation:
$$Y_{ij} = H\bigl(\Lambda(t_{ij}) + \epsilon_{ij};\ \eta\bigr)$$
where:

t_ij is the measurement time for subject i at occasion j;

ε_ij is an independent zero-mean Gaussian error;

H is the link function, with parameters η, that transforms the latent process into the scale and metric of the marker.
Different parametric families are used. When the marker is continuous, H^{-1} is a parametric family of increasing monotonic functions, among:

the linear transformation: this reduces to the linear mixed model (2 parameters);

the rescaled beta cumulative distribution family (4 parameters);

the basis of quadratic I-splines with m knots (m + 2 parameters).
When the marker is discrete (binary or ordinal), H is a threshold function: each level of Y corresponds to an interval of Λ(t_ij) + ε_ij whose boundaries are to be estimated.
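The two layers of the model (a latent process driven by fixed and random effects, then a link function applied to the latent process plus error) can be illustrated with a small simulation. This is a minimal base R sketch, not lcmm code: every numeric value is an illustrative assumption, the Gaussian process w_i(t) is left out, and the linear link is assumed to be written as Y = eta1 + eta2 * (Λ + ε).

```r
# Minimal simulation sketch of a latent process mixed model (base R only).
# All numeric values below are illustrative assumptions, not estimates.
set.seed(1)
n_subj <- 50                      # number of subjects
n_occ  <- 4                       # occasions per subject
id <- rep(seq_len(n_subj), each = n_occ)
t  <- rep(seq(0, 3, length.out = n_occ), times = n_subj)  # measurement times t_ij

beta <- c(0, 0.5)                 # fixed effects: mean intercept set to 0, slope
B    <- matrix(c(1, 0.2, 0.2, 0.3), 2, 2)   # covariance matrix of the random effects

# Draw u_i ~ N(0, B) using the Cholesky factor of B
u <- matrix(rnorm(n_subj * 2), n_subj, 2) %*% chol(B)

# Latent process: fixed part plus subject-specific random intercept and slope
lambda <- beta[1] + beta[2] * t + u[id, 1] + u[id, 2] * t

# Observation step with an assumed linear link: Y = eta1 + eta2 * (Lambda + eps)
eps <- rnorm(length(t))           # error variance fixed to 1
eta <- c(10, 4)                   # hypothetical link parameters
y   <- eta[1] + eta[2] * (lambda + eps)

sim <- data.frame(id = id, t = t, lambda = lambda, y = y)
head(sim)
```

Replacing the last transformation with a nonlinear increasing function, or with a set of cut points, gives the other link families described above.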
Identifiability
As in any latent variable model, the metric of the latent variable has to be defined. In lcmm, the variance of the errors is set to 1 and the mean intercept (in β) is set to 0.
Example
In this article, the latent process mixed model fitted with lcmm is illustrated by studying the linear trajectory of depressive symptoms (measured with the CES-D scale) as a function of age, centered around 65 years, with adjustment for sex (male). Random effects on the intercept and on age65 are included.
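The model considered is:
$$CESD_{ij} = H\bigl(\beta_0 + \beta_1\,age65_{ij} + \beta_2\,male_i + \beta_3\,age65_{ij}\,male_i + u_{0i} + u_{1i}\,age65_{ij} + \epsilon_{ij};\ \eta\bigr)$$
The fixed effect part is β0 + β1 age65_ij + β2 male_i + β3 age65_ij male_i (with β0 set to 0 for identifiability); the random effect part is u_0i + u_1i age65_ij.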
Estimating the model with different continuous link functions H
We use age centered around 65 years and expressed in decades, i.e. age65 = (age - 65) / 10.
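The centering and rescaling can be sketched as follows. The data frame and its three ages are hypothetical values chosen only to show the transformation.

```r
# Hypothetical ages at three visits (illustrative values only)
dat <- data.frame(age = c(65, 72.5, 80))

# Center at 65 years and rescale to decades
dat$age65 <- (dat$age - 65) / 10
dat$age65
```

With this coding, the intercept refers to age 65 and a one-unit change in age65 corresponds to ten years.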
The latent process mixed model can be fitted with different link functions, as shown below. The link function is chosen with the argument link.
Linear link function
With a linear link function, the model reduces to a standard linear mixed model. The linear link is the default:
# assumes the paquid data shipped with lcmm, with age65 = (age - 65) / 10 added
mlin <- lcmm(CESD ~ age65*male, random=~ age65, subject='ID', data=paquid) # link = 'linear' by default
It is exactly the same model as the one fitted by hlme. The only difference from the hlme object is the parameterization of the intercept and of the residual standard error.
mlin2 <- hlme(CESD ~ age65*male, random=~ age65, subject='ID', data=paquid) # same model fitted with hlme
The log-likelihoods are the same, but the estimated parameters β are not on the same scale:
mlin2$loglik
[1] -7056.652
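The difference in scale can be made explicit with a short derivation. It assumes that the linear link is written as Y = η1 + η2 (Λ + ε), which is consistent with the two parameters of the linear transformation and with the identifiability constraints (error variance 1, mean intercept 0). The exact parameterization used internally by lcmm should be checked against the package documentation.

```latex
\begin{aligned}
Y_{ij} &= \eta_1 + \eta_2\,\bigl(\Lambda(t_{ij}) + \epsilon_{ij}\bigr),
          \qquad \epsilon_{ij} \sim \mathcal{N}(0, 1) \\
       &= \underbrace{\eta_1}_{\text{intercept}}
        + \underbrace{\eta_2\,X(t_{ij})\beta}_{\text{fixed effects}}
        + \underbrace{\eta_2\,Z(t_{ij})u_i}_{\text{random effects}}
        + \underbrace{\eta_2\,\epsilon_{ij}}_{\text{residual, s.d. } \eta_2}
\end{aligned}
```

Under this assumption, the hlme intercept corresponds to η1, the hlme residual standard error to η2, each hlme regression coefficient to η2 times its lcmm counterpart, and the hlme random effect covariance matrix to η2² B.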
Nonlinear link function 1: beta cumulative distribution function
The rescaled cumulative distribution function (CDF) of a beta distribution provides concave, convex or sigmoid transformations between the marker and its underlying latent process.
mbeta <- lcmm(CESD ~ age65*male, random=~ age65, subject='ID', data=paquid, link='beta')
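The three shapes mentioned above can be reproduced with the base R function pbeta. This is only a sketch of the idea: the shape parameters are illustrative assumptions, the helper beta_link is hypothetical, and the rescaling is simpler than the one used inside lcmm.

```r
# Shapes produced by a rescaled beta CDF (base R sketch, not lcmm's internal code).
# The shape parameters (a, b) are illustrative assumptions.
x <- seq(0, 1, length.out = 101)      # latent scale rescaled to [0, 1]
y_max <- 52                           # upper bound of the marker scale (0 to 52)

# Hypothetical helper: beta CDF stretched to the marker range
beta_link <- function(x, a, b) y_max * pbeta(x, shape1 = a, shape2 = b)

concave <- beta_link(x, a = 0.5, b = 1)   # steep at first, then flattening
convex  <- beta_link(x, a = 2,   b = 1)   # flat at first, then steepening
sigmoid <- beta_link(x, a = 3,   b = 3)   # S-shaped

# A few values along the grid for each shape
round(cbind(x, concave, convex, sigmoid)[c(1, 26, 51, 76, 101), ], 1)
```

All three curves are increasing and share the same end points, so only the curvature between the bounds changes with the parameters.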
Nonlinear link function 2: quadratic I-splines
The family of quadratic I-splines approximates any continuous increasing link function. It involves knots distributed within the range of the marker. By default, 5 equidistant knots located within the range of the marker are used:
mspl <- lcmm(CESD ~ age65*male, random=~ age65, subject='ID', data=paquid, link='splines')
The number of knots and their location can be specified. The number of knots is entered first, followed by the location keyword equi, quant or manual, for equidistant knots, knots at the quantiles of the marker distribution, or interior knots entered manually in the argument intnodes, respectively. For example, '7-equi-splines' means I-splines with 7 equidistant knots, and '6-quant-splines' means I-splines with 6 knots located at the quantiles of the marker distribution.
With 5 knots located at the quantiles:
mspl5q <- lcmm(CESD ~ age65*male, random=~ age65, subject='ID', data=paquid, link='5-quant-splines')
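The difference between the two automatic knot placements is easiest to see on a skewed marker. The sketch below uses base R only and a simulated right-skewed score, which is an illustrative assumption and not the CES-D data. It shows one plausible way of computing the knots; the exact rule applied by lcmm may differ.

```r
# Equidistant versus quantile knots for a skewed marker (base R sketch).
# The simulated marker below is an illustrative assumption, not the CES-D data.
set.seed(42)
y <- pmin(round(rexp(500, rate = 1 / 8)), 52)   # right-skewed scores on 0 to 52

n_knots <- 5
# Equidistant knots: evenly spaced over the observed range
knots_equi  <- seq(min(y), max(y), length.out = n_knots)
# Quantile knots: placed where the observations are concentrated
knots_quant <- quantile(y, probs = seq(0, 1, length.out = n_knots), names = FALSE)

knots_equi
knots_quant
```

With a skewed distribution, quantile knots cluster in the region where most observations lie, which gives the spline more flexibility there.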
Select the best model
To select the most appropriate link function, these models can be compared. Usually this is done by comparing the models in terms of goodness of fit, using measures such as the AIC or the UACV.
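As a worked example of the criterion, the AIC of the linear-link model can be recomputed by hand. Two assumptions are made here: the log-likelihood reported earlier for the linear model is taken as -7056.652 (negative sign assumed), and the model is assumed to have 8 parameters, counted from its specification.

```r
# AIC = 2 * (number of parameters) - 2 * log-likelihood
aic <- function(loglik, npm) 2 * npm - 2 * loglik

# Linear-link model: log-likelihood reported earlier, sign assumed negative.
# Assumed parameter count: 3 fixed effects (mean intercept fixed at 0),
# 3 variance-covariance parameters for the random effects, 2 link parameters.
loglik_lin <- -7056.652
npm_lin    <- 3 + 3 + 2
aic(loglik_lin, npm_lin)
```

The model with the lowest AIC is preferred. A more flexible link adds parameters, so it is retained only if the gain in log-likelihood outweighs the penalty.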
AIC (the UACV is reported in the output of each model):
In this case, according to the AIC, the best fit is given by the model with the I-splines link function with 5 knots at the quantiles. The estimated link functions can be compared in a plot:
plot(mlin, which="linkfunction", xlab="latent process")
legend(x="topleft", legend=c("linear", "beta", "splines (5 equidistant knots)", "splines (5 knots at quantiles)"), lty=1)
We see that the two spline transformations are very close. The linear model does not seem appropriate, as shown by the gap between the linear curve and the spline curves. The beta transformation differs from the splines only for high values of the latent process. Confidence bands for the transformations can be obtained by a Monte Carlo method:
linkspl5q <- predictlink(mspl5q, ndraws=2000) # Monte Carlo draws for the estimated link
plot(linkspl5q, lty=2)
legend(x="left", legend=c("95% confidence bands", "for splines at quantiles"), lty=c(2, NA))
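The general idea behind such Monte Carlo bands can be sketched in base R: draw parameter values from their approximate sampling distribution, evaluate the transformation for each draw, and take pointwise quantiles. The estimates and covariance matrix below are hypothetical placeholders, and a linear link is used for simplicity. This is not the code used by lcmm.

```r
# Monte Carlo confidence band for a link function (base R sketch).
# Estimates and their covariance matrix are hypothetical placeholders.
set.seed(123)
est <- c(eta1 = 10, eta2 = 4)                   # hypothetical link parameter estimates
V   <- matrix(c(0.25, 0.02, 0.02, 0.04), 2, 2)  # hypothetical covariance of the estimates

ndraws <- 2000
draws <- matrix(rnorm(ndraws * 2), ndraws, 2) %*% chol(V)
draws <- sweep(draws, 2, est, "+")              # draws ~ N(est, V)

grid <- seq(-2, 4, length.out = 50)             # latent process values
# One transformed curve per draw: rows = draws, columns = grid points
curves <- draws[, 1] + outer(draws[, 2], grid)

# Pointwise 2.5%, 50% and 97.5% quantiles across the draws
band <- apply(curves, 2, quantile, probs = c(0.025, 0.5, 0.975))
round(band[, c(1, 25, 50)], 2)
```

The first and third rows of band give the lower and upper limits of a pointwise 95% band along the grid.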
Estimating the model with a discrete link function H
Sometimes, for markers with only a limited number of levels, continuous link functions are not appropriate, and the ordinal nature of the marker has to be handled. The lcmm function handles this case with a threshold link function. However, the numerical complexity of the model with a threshold link function is much higher (because of the numerical integration over the random effect distribution). This has to be kept in mind when fitting this model, and the number of random effects should be chosen sparingly.
Note that the model becomes a cumulative probit mixed model. Here is an example with the HIER variable (4 levels), because with a range of 0 to 52 (i.e. 52 threshold parameters), the threshold link function for CESD would involve too many parameters.
lcmm(HIER ~ age65*male, random=~ age65, subject='ID', data=paquid, link='thresholds')
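The threshold mechanism can be sketched in base R. The latent values and the three cut points below are illustrative assumptions, not estimates from the HIER model.

```r
# Threshold link sketch (base R): a 4-level ordinal marker from a latent scale.
# Threshold and latent values are illustrative assumptions.
set.seed(7)
latent <- rnorm(1000, mean = 0.3)            # hypothetical values of the latent process
noisy  <- latent + rnorm(1000)               # add the error, variance fixed to 1

thresholds <- c(-0.5, 0.8, 2.0)              # M - 1 = 3 boundaries for M = 4 levels
level <- findInterval(noisy, thresholds)     # observed level: 0, 1, 2 or 3

table(level)

# Cumulative probit: P(Y <= k | Lambda) = pnorm(threshold_k - Lambda)
p_le_0 <- pnorm(thresholds[1] - 0.3)         # lowest level, subject with Lambda = 0.3

# Number of thresholds needed: 4-level marker versus a 0 to 52 score
c(four_levels = 4 - 1, score_0_52 = length(0:52) - 1)
```

Each additional level of the marker adds one threshold to estimate, which is why a 4-level variable is far cheaper to fit than a score ranging from 0 to 52.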
Post-fit outputs
Summary
The summary of the model includes convergence, goodness-of-fit criteria and the estimated parameters.
Predicted trajectories according to a profile of covariates
The predicted trajectories can be computed in the scale of the dependent variable and according to a profile of covariates:
women <- predictY(msp, newdata=datnew, var.time="age") # datnew holds the covariate profile
Then plot:
plot(women, xlab="age")
plot(men, add=TRUE)
legend(x="topleft", legend=c("women", "men", "95% confidence interval", "95% confidence interval"), lty=c(1, 1, 2, 2))
Goodness of fit 1: plot of residuals
The subject-specific residuals (Q-Q plot in the bottom right panel) should be Gaussian.
Goodness of fit 2: plot of predictions versus observations
Mean predictions and observations can be plotted against age. Note that predictions and observations are in the scale of the latent process (observations are transformed with the estimated link function):
plot(mspl5q, which="fit", var.time="age65", xlab="(age - 65) / 10", break.times=8, ylab="latent process")