Using the tail dependence of copulas in R to analyze loss and loss-adjustment expense data

Time: 2021-11-28

The problem of dependence between two random variables has attracted much attention. Dependence reflects the degree of association between two random variables, and it is not the same thing as correlation. The most commonly used correlation measure, Pearson's correlation coefficient, only captures the linear relationship between two random variables, and its value depends not only on their copula but also on their marginal distribution functions.

Intuitively, the joint distribution of two (or more) random variables can be expressed as a function of their marginal distribution functions. That function is the copula, and it does not depend on the marginal distributions of the random variables: it reflects the dependence "structure" between the two (or more) variables and contains all the information about their dependence.
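To make this concrete, here is a small sketch in base R (with simulated, hypothetical data): Pearson's correlation changes under a strictly increasing transformation of the margins, while Kendall's tau, which is a function of the copula alone, does not.

```r
# Simulated example: margins matter for Pearson, not for Kendall
set.seed(1)
x <- rnorm(1000)
y <- x + rnorm(1000)
x2 <- exp(x); y2 <- exp(y)      # strictly increasing transform of the margins
cor(x, y); cor(x2, y2)          # Pearson correlations differ
cor(x, y, method = "kendall")   # Kendall's tau is exactly unchanged...
cor(x2, y2, method = "kendall") # ...since the ranks are unchanged
```

Since Kendall's tau only uses the ranks of the observations, it is invariant under any strictly increasing transformation of each margin.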

Joe (1990) tail dependence index

Joe (1990) proposed a (strong) tail dependence index. For example, for the lower tail, consider

lambda_L = lim_{z -> 0} P(U <= z | V <= z) = lim_{z -> 0} C(z,z)/z

Upper and lower tail (empirical) dependence functions

Our idea is to draw these functions. Define, for the lower tail,

L(z) = C(z,z)/z

and, for the upper tail,

R(z) = (1 - 2z + C(z,z))/(1 - z)

so that lambda_L = lim_{z -> 0} L(z) and lambda_U = lim_{z -> 1} R(z). The upper index can also be written in terms of the survival copula C*, i.e. C*(u,v) = u + v - 1 + C(1-u, 1-v), since R(z) = C*(1-z, 1-z)/(1-z). The empirical counterparts of these functions are easily deduced by replacing C with the empirical copula built from the ranks of the observations. So, for the upper tail, on the right, and for the lower tail, on the left, we obtain the graphs shown.

Loss and expense data

Copula functions are widely used in economics, finance, insurance and other fields. As early as 1998, Frees and Valdez (1998) studied the relationship between the claim amount and claim-settlement expenses, characterized it with a copula, and applied it to premium pricing.

For the code, consider some real data, such as the loss and expense data set.

The loss and expense data set has 1500 observations and 2 variables. The two columns contain indemnity payments (loss) and allocated loss adjustment expenses (alae); the latter are additional costs associated with the settlement of claims (such as claim-investigation and legal costs). Our idea is to draw the lower tail function on the left and the upper tail function on the right. We can then compare this graph with the graphs of some copulas having the same Kendall's tau.
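One possible setup, as a sketch: it assumes the loss–ALAE data is available as lossalae in the evd package (any two-column matrix of claim data would do), builds the pseudo-observations U and V from the ranks, and draws the empirical lower and upper tail dependence functions on a grid u.

```r
library(evd)               # assumed source of the lossalae data set
data(lossalae)
X <- lossalae
n <- nrow(X)
U <- rank(X[,1])/(n + 1)   # pseudo-observations of the losses
V <- rank(X[,2])/(n + 1)   # pseudo-observations of the ALAEs
tau <- cor(X, method = "kendall")[1, 2]   # Kendall's tau, used below
u <- seq(.005, .5, by = .005)             # grid for the tail functions
# Empirical lower and upper (strong) tail dependence functions
Lemp <- Vectorize(function(z) mean((U <= z) & (V <= z))/z)
Remp <- Vectorize(function(z) mean((U >= z) & (V >= z))/(1 - z))
plot(c(u, u + .5 - u[1]), c(Lemp(u), Remp(u + .5 - u[1])), type = "l",
     ylim = c(0, 1), xlab = "", ylab = "L (left) and R (right)")
abline(v = .5, col = "grey")
```

The left half of the plot shows L near the lower corner, the right half shows R near the upper corner; the names U, V, u and tau are reused by the snippets below.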

Gaussian copula

Consider first the Gaussian copula, with the parameter chosen so that it has the same Kendall's tau as the data.

> paramgauss=sin(pi*tau/2)   # Gaussian correlation matching Kendall's tau
> copgauss=normalCopula(paramgauss)
> Lga=function(z) pCopula(c(z,z),copgauss)/z
> Rga=function(z) (1-2*z+pCopula(c(z,z),copgauss))/(1-z)
> Lgs=Vectorize(Lga)(u)
> Rgs=Vectorize(Rga)(u+.5-u[1])
> lines(c(u,u+.5-u[1]),c(Lgs,Rgs))

Gumbel copula

Or the Gumbel copula.

> paramgumbel=1/(1-tau)   # Gumbel parameter matching Kendall's tau
> copgumbel=gumbelCopula(paramgumbel, dim = 2)
> Lgb=Vectorize(function(z) pCopula(c(z,z),copgumbel)/z)(u)
> Rgb=Vectorize(function(z) (1-2*z+pCopula(c(z,z),copgumbel))/(1-z))(u+.5-u[1])
> lines(c(u,u+.5-u[1]),c(Lgb,Rgb),col="blue")

Confidence intervals

However, since we do not have confidence intervals, it remains difficult to draw a conclusion (even if the Gumbel copula seems more suitable than the Gaussian one). One strategy is to generate samples from these copulas and visualize the resulting curves. For the Gaussian copula:

> nsimul=500
> MGS=matrix(NA,nsimul,2*length(u))
> for(s in 1:nsimul){
+ Xs=rCopula(nrow(X),copgauss)
+ Us=rank(Xs[,1])/(nrow(Xs)+1)
+ Vs=rank(Xs[,2])/(nrow(Xs)+1)
+ # empirical L and R functions of this simulated sample
+ Ls=Vectorize(function(z) mean((Us<=z)&(Vs<=z))/z)(u)
+ Rs=Vectorize(function(z) mean((Us>=z)&(Vs>=z))/(1-z))(u+.5-u[1])
+ MGS[s,]=c(Ls,Rs)
+ lines(c(u,u+.5-u[1]),MGS[s,],col="red")
+ }

To add pointwise 90% confidence intervals:

> Q05=function(x) quantile(x,.05); Q95=function(x) quantile(x,.95)
> V05=apply(MGS,2,Q05); V95=apply(MGS,2,Q95)
> lines(c(u,u+.5-u[1]),V05,col="red",lwd=2)
> lines(c(u,u+.5-u[1]),V95,col="red",lwd=2)

Gaussian copula curves (left), Gumbel copula curves (right). Although statistical convergence is very slow, it is simple to assess whether the underlying copula has tail dependence, especially when it exhibits tail independence. For example, consider a Gaussian copula sample of size 1000: this is the result for one generated sample, or look at the left tail (on a logarithmic scale). Now consider 10,000 samples. On these graphs it is quite difficult to determine whether the limit is 0 or a strictly positive value (a classical statistical problem, since the value of interest lies on the boundary of the parameter support). Therefore, a simple idea is to consider a weaker tail dependence index.
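That slow convergence can be seen directly on the theoretical curve. A small sketch, assuming the copula package and a hypothetical correlation of 0.5: the Gaussian copula has lambda_U = 0 whenever rho < 1, yet R(z) decays toward 0 very slowly as z approaches 1.

```r
library(copula)             # assumed available
copg <- normalCopula(0.5)   # hypothetical parameter rho = 0.5
R <- function(z) (1 - 2*z + pCopula(c(z, z), copg))/(1 - z)
# Even deep in the tail, R(z) is still far from its limit 0
sapply(c(.9, .99, .999), R)
```

This is exactly why reading the limit off an empirical plot is unreliable, and why a weaker index is useful.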

===

Ledford and Tawn (1996) tail dependence index

Another way to describe tail dependence can be found in Ledford & Tawn (1996). Assume that U and V have the same distribution. If the variables are (strictly) independent,

P(U <= z, V <= z) = P(U <= z)^2

while if they are (strictly) comonotonic (here, equal, since they have the same distribution), then

P(U <= z, V <= z) = P(U <= z)

Therefore, suppose that

P(U <= z, V <= z) = P(U <= z)^a

Then a = 2 can be interpreted as independence, while a = 1 represents strong (perfect) positive dependence. Consider the transformation 2/a - 1 to obtain a parameter in [0,1] whose value increases with the strength of the dependence. To derive a tail dependence index, assume that the limit

lim_{z -> 0} 2 log P(U <= z) / log P(U <= z, V <= z) - 1

exists; it will be interpreted as a (weak) tail dependence index. Accordingly, define the functions

L2(z) = 2 log(z)/log C(z,z) - 1 for the lower tail (on the left), and
R2(z) = 2 log(1-z)/log(1 - 2z + C(z,z)) - 1 for the upper tail (on the right).

The R code for computing the empirical versions of these functions is very simple.

> L2emp=function(z) 2*log(mean(U<=z))/
+ log(mean((U<=z)&(V<=z)))-1
> R2emp=function(z) 2*log(mean(U>=1-z))/
+ log(mean((U>=1-z)&(V>=1-z)))-1
> L=Vectorize(L2emp)(u)
> R=Vectorize(R2emp)(rev(u))
> plot(c(u,u+.5-u[1]),c(L,R),type="l",ylim=0:1)
> abline(v=.5,col="grey")
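As a sanity check on this index (a sketch with simulated, hypothetical data in base R): for independent variables the index should be near 0, and for comonotonic variables it equals 1 exactly, since the joint and marginal probabilities then coincide.

```r
set.seed(1)
n <- 10000
U1 <- runif(n); V1 <- runif(n)   # independent pair: index near 0
U2 <- runif(n); V2 <- U2         # comonotonic pair: index equal to 1
L2 <- function(U, V, z) 2*log(mean(U <= z))/log(mean((U <= z) & (V <= z))) - 1
L2(U1, V1, .1)   # close to 0
L2(U2, V2, .1)   # exactly 1, since both probabilities coincide
```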

Gaussian copula

Similarly, these empirical functions can be compared with their parametric counterparts, such as the functions obtained from a Gaussian copula (with the same Kendall's tau).

> copgauss=normalCopula(paramgauss)
> L2ga=function(z) 2*log(z)/log(pCopula(c(z,z),
+ copgauss))-1
> R2ga=function(z) 2*log(1-z)/log(1-2*z+
+ pCopula(c(z,z),copgauss))-1
> lines(c(u,u+.5-u[1]),c(Vectorize(L2ga)(u),
+ Vectorize(R2ga)(u+.5-u[1])),col="red")

Gumbel copula

> copgumbel=gumbelCopula(paramgumbel, dim = 2)
> L2gb=function(z) 2*log(z)/log(pCopula(c(z,z),
+ copgumbel))-1
> R2gb=function(z) 2*log(1-z)/log(1-2*z+
+ pCopula(c(z,z),copgumbel))-1
> Lgl=Vectorize(L2gb)(u); Rgl=Vectorize(R2gb)(u+.5-u[1])
> lines(c(u,u+.5-u[1]),c(Lgl,Rgl),col="blue")

Looking again at simulated confidence intervals, the Gumbel copula provides a good fit here.
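A quick numerical check of how the two indices relate (a sketch, with a hypothetical Gumbel parameter theta = 2 rather than the one fitted to the data): when the strong upper index lambda_U is positive, 1 - 2z + C(z,z) behaves like lambda_U (1 - z) near 1, so the weak upper index tends to 1.

```r
library(copula)                 # assumed available
cop2 <- gumbelCopula(2, dim = 2)   # hypothetical theta = 2
R2 <- function(z) 2*log(1 - z)/log(1 - 2*z + pCopula(c(z, z), cop2)) - 1
sapply(c(.9, .99, .999), R2)    # increases (slowly) toward 1
```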

Extreme-value copulas

We now consider the extreme-value copulas within the copula family. In the bivariate case, an extreme-value copula can be written

C(u,v) = exp( log(uv) A( log(u)/log(uv) ) )

where A is a Pickands dependence function, i.e. a convex function satisfying max(t, 1-t) <= A(t) <= 1 for t in [0,1]. In this case, the Kendall coefficient can be written

tau = integral_0^1 t(1-t)/A(t) dA'(t)

For example, A(t) = (t^theta + (1-t)^theta)^(1/theta) yields the Gumbel copula. Now let us look at (nonparametric) inference, more precisely at the estimation of the dependence function. The starting point of the most standard estimator is the observation that, if (U,V) has copula C, then

Z = log(U)/log(UV)

has distribution function

H(z) = z + z(1-z) A'(z)/A(z)

Conversely, the Pickands dependence function can be written

A(t) = exp( integral_0^t (H(z) - z)/(z(1-z)) dz )

so a natural estimator of the Pickands function replaces H with the empirical cumulative distribution function of Z. This is the estimation method proposed in Capéraà, Fougères & Genest (1997). Here, we can use

> Z=log(U)/log(U*V)
> h=function(t) mean(Z<=t)
> H=Vectorize(h)
> A=function(t){
+ f=function(t) (H(t)-t)/(t*(1-t))
+ return(exp(integrate(f,lower=0,upper=t,
+ subdivisions=10000)$value))
+ }
> A=Vectorize(A)
> plot(c(0,u,1),c(1,A(u),1),type="l")

The estimate of the Pickands dependence function is obtained by integration, and the upper tail dependence index can be read off the figure above.

> A(.5)/2
[1] 0.4055346
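As a parametric benchmark (not from the original analysis; a hypothetical theta = 2 rather than the value fitted to the data): the Gumbel copula's Pickands function has the closed form A(t) = (t^theta + (1-t)^theta)^(1/theta), and for any extreme-value copula the upper tail dependence index is 2(1 - A(1/2)).

```r
theta <- 2   # hypothetical Gumbel parameter
Agumbel <- function(t) (t^theta + (1 - t)^theta)^(1/theta)
c(Agumbel(0), Agumbel(1))   # endpoints equal 1, as required of a Pickands function
2*(1 - Agumbel(.5))         # upper tail index: 2 - 2^(1/theta), about 0.586
```

The closer A(1/2) is to 1/2, the stronger the upper tail dependence; A identically 1 corresponds to independence.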
