Block Gibbs Sampling for Bayesian Multiple Linear Regression in R

Time: 2021-08-23

Original link: http://tecdat.cn/?p=11617

Original source: Tuoduan Data Tribe official account


In this article, I work through block Gibbs sampling for Bayesian multiple linear regression. I first derive the conditional posterior distributions needed for the block Gibbs sampler, then code the sampler in R and test it on simulated data.

Bayesian model

Suppose we have $n$ subjects, an outcome vector $y$, and an $n \times p$ design matrix $X$. Bayesian multiple linear regression assumes that $y$ is drawn from a multivariate normal distribution. Using an identity matrix in the covariance encodes the assumption that the observations are independent:

$$y \sim \mathcal{N}_n\left(X\beta,\; \phi I_n\right)$$

So far, this is no different from frequentist multiple linear regression. Maximizing the likelihood yields the familiar solution:

$$\hat{\beta} = (X^T X)^{-1} X^T y$$

The model becomes Bayesian once we specify prior distributions. In this example I use the following priors:

$$p(\beta) \propto 1, \qquad \phi \sim \text{Inverse-Gamma}(a,\, b)$$

Block Gibbs sampling

Before coding the sampler, we need to derive the conditional posterior distribution of each parameter for the Gibbs sampler. Starting with the variance parameter $\phi$, its conditional posterior is conjugate:

$$\phi \mid \beta, y \sim \text{Inverse-Gamma}\!\left(a + \frac{n}{2},\; b + \frac{1}{2}\,(y - X\beta)^T (y - X\beta)\right)$$

The conditional posterior of $\beta$ takes a bit more linear algebra to derive:

$$\beta \mid \phi, y \sim \mathcal{N}_p\!\left(\hat{\beta},\; \phi\,(X^T X)^{-1}\right)$$

This is a very beautiful and intuitive result: the conditional posterior covariance matrix, $\phi\,(X^T X)^{-1}$, is exactly the estimated covariance matrix of $\hat{\beta}$ from frequentist multiple linear regression.

Also note that the conditional posterior of $\beta$ is a multivariate normal distribution. In each iteration of the Gibbs sampler we therefore draw the entire coefficient vector at once from this posterior, rather than one coefficient at a time; this block update is what makes the sampler a "block" Gibbs sampler.
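Below is a minimal sketch of the sampler implied by these two conditional posteriors, assuming the flat prior on $\beta$ and the Inverse-Gamma$(a, b)$ prior on $\phi$ given above. The function name block_gibbs and its defaults are my own, not from the original post.

# Block Gibbs sampler for Bayesian multiple linear regression.
# Assumes a flat prior on beta and an Inverse-Gamma(a, b) prior on phi.
library(MASS)   # mvrnorm() for multivariate normal draws

block_gibbs <- function(y, X, n_iter = 1000, a = 1, b = 1) {
  n <- nrow(X); p <- ncol(X)
  XtX_inv  <- solve(t(X) %*% X)
  beta_hat <- XtX_inv %*% t(X) %*% y   # MLE; conditional posterior mean of beta
  phi   <- 1                           # starting value for the variance
  draws <- matrix(NA, nrow = n_iter, ncol = p + 1,
                  dimnames = list(NULL, c(paste0("beta_", 1:p), "phi")))
  for (i in seq_len(n_iter)) {
    # Block draw of the whole coefficient vector:
    #   beta | phi, y ~ MVN(beta_hat, phi * (X'X)^{-1})
    beta <- mvrnorm(1, mu = beta_hat, Sigma = phi * XtX_inv)
    # Variance draw:
    #   phi | beta, y ~ Inverse-Gamma(a + n/2, b + SSR/2)
    ssr <- sum((y - X %*% beta)^2)
    phi <- 1 / rgamma(1, shape = a + n / 2, rate = b + ssr / 2)
    draws[i, ] <- c(beta, phi)
  }
  draws
}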

simulation

For the simulation, I generated the outcome vector from the model above using fixed, known values of the coefficients $\beta$ and the variance $\phi$.

Running the Gibbs sampler yields estimates of the true coefficients and the variance parameter. I ran 500,000 iterations, discarded the first 100,000 as burn-in, and thinned the remainder by keeping every 10th draw.
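The original data-generating code is not shown, so here is an illustrative sketch: the dimensions (n = 1000, p = 3) and true values are assumptions of mine, and the reshaping step builds the long-format post_dist data frame (columns iter, param, draw, true_vals) that the plotting code below expects.

library(dplyr)
library(tidyr)

set.seed(42)
n <- 1000; p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # design matrix with intercept
true_beta <- c(2, -1, 0.5)                           # illustrative true coefficients
true_phi  <- 4                                       # illustrative true variance
y <- X %*% true_beta + rnorm(n, sd = sqrt(true_phi))

draws <- block_gibbs(y, X, n_iter = 500000)
keep  <- seq(100001, 500000, by = 10)    # drop 100,000 burn-in draws, thin by 10
post  <- draws[keep, ]

# Long-format draws with the true values attached, for plotting
post_dist <- as.data.frame(post) %>%
  mutate(iter = row_number()) %>%
  pivot_longer(-iter, names_to = "param", values_to = "draw") %>%
  left_join(data.frame(param = colnames(post),
                       true_vals = c(true_beta, true_phi)),
            by = "param")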

Below are the trace plots of the MCMC chains, with the true values shown as red lines.

#   Calculate posterior summary statistics
library(dplyr)
library(ggplot2)

post_sum_stats <- post_dist %>%
  group_by(param) %>%
  summarise(median = median(draw),
            lwr = quantile(draw, .025),
            upr = quantile(draw, .975))

#   Merge the summary statistics onto the draws
post_dist <- post_dist %>%
  left_join(post_sum_stats, by = 'param')

#   Plot the MCMC chains, one panel per parameter
ggplot(post_dist, aes(x = iter, y = draw)) +
  geom_line() +
  geom_hline(aes(yintercept = true_vals), colour = 'red') +
  facet_wrap(~ param, scales = 'free_y')

[Figure: MCMC trace plots for each parameter; true values marked by red horizontal lines]

Here are the posterior distributions of the parameters after burn-in and thinning:

ggplot(post_dist, aes(x = draw)) +
  geom_histogram(bins = 50) +
  geom_vline(aes(xintercept = true_vals)) +
  facet_wrap(~ param, scales = 'free')

[Figure: posterior histograms for each parameter; true values marked by vertical lines]

The sampler appears to produce reasonable posterior estimates of the parameters. To make sure the Bayesian estimator is well calibrated, I repeated this process for 1,000 simulated data sets.

This produces 1,000 sets of posterior means and 1,000 sets of 95% credible intervals. On average, the 1,000 posterior means should be centered on the true values, and the true parameter values should fall inside the 95% credible intervals about 95% of the time.
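Here is a sketch of how this calibration check could be coded with the hypothetical block_gibbs() from above; the chain length is shortened only to keep 1,000 replications tractable.

# Calibration check: for each simulated data set, record the posterior mean
# of every parameter and whether its 95% credible interval covers the truth.
n_sims    <- 1000
true_vals <- c(true_beta, true_phi)

results <- replicate(n_sims, {
  X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))
  y <- X %*% true_beta + rnorm(n, sd = sqrt(true_phi))
  d <- block_gibbs(y, X, n_iter = 5000)[-(1:1000), ]  # short chain, 1,000 burn-in
  lwr <- apply(d, 2, quantile, probs = .025)
  upr <- apply(d, 2, quantile, probs = .975)
  rbind(post_mean = colMeans(d),
        covered   = as.numeric(lwr <= true_vals & true_vals <= upr))
})

# Average the 1,000 posterior means and the 95% interval coverage per parameter
apply(results, c(1, 2), mean)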

The following is a summary of these assessments.

[Table: for each parameter, the true value, the average posterior mean across the 1,000 simulations, the percent bias, and the 95% CI coverage]

The "estimated average" column is the mean of the 1,000 posterior means from the simulations. Its percent bias is below 5%, and the 95% CI coverage is about 95% for all parameters.

Extensions

We can extend this model in many ways. For example, a distribution other than the normal can be used to fit other types of outcomes. If we had binary outcome data, we could model it as:

$$y_i \sim \text{Bernoulli}(p_i), \qquad \log\!\left(\frac{p_i}{1 - p_i}\right) = x_i^T \beta$$

We would then place a prior distribution on $\beta$. This idea extends Bayesian linear regression to Bayesian GLMs.

In the linear case outlined in this article, the covariance matrix can also be modeled more flexibly. Here we assumed the covariance matrix is diagonal with a single common variance, which is the homoskedasticity assumption of multiple linear regression. If the data are clustered (for example, multiple observations per subject), we could instead model the entire covariance matrix using an inverse-Wishart distribution.


