## Original link: http://tecdat.cn/?p=5334

Geometric Brownian motion (GBM) is the standard workhorse for simulating most financial instruments that involve some form of path dependence. Although GBM rests on well-founded theory, one should never forget its original purpose: modeling the motion of particles driven by strictly normally distributed impulses. The basic formula is:

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$$

where $S_t$ is the price, $\mu$ the drift, $\sigma$ the volatility, and $W_t$ a standard Wiener process that represents the innovation. This works very well for modeling gases, but it has some serious shortcomings in financial modeling. The problem is that the Wiener process comes with two very strict conditions:

a) The innovations are normally distributed, with mean zero and variance $t$.

b) The innovations are independent.

Anyone who has worked with financial market data knows that stock returns do not satisfy the first condition, and sometimes not even the second. The general consensus is that stock returns are skewed, leptokurtic, and fat-tailed. Although stock returns tend to converge to the normal distribution as the sampling frequency decreases (i.e. monthly returns are more normal than daily returns), most scholars would agree that a t-distribution or a Cauchy distribution fits returns better.
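The frequency effect mentioned above can be illustrated with a small simulation. This is only a sketch on simulated data: the t-distribution with 5 degrees of freedom, the 21 trading days per month, and the helper `excess_kurtosis` are assumptions made for the illustration, not part of the original analysis.

```r
# Illustration with simulated data: heavy-tailed "daily" returns become
# closer to normal once they are aggregated into "monthly" returns.
set.seed(42)

# Sample excess kurtosis (close to 0 for normally distributed data)
excess_kurtosis <- function(r) {
  z <- (r - mean(r)) / sd(r)
  mean(z^4) - 3
}

n_months <- 5000
days_per_month <- 21   # assumed number of trading days per month

# Assumption: daily log returns follow a scaled t-distribution with 5 degrees of freedom
daily <- 0.01 * rt(n_months * days_per_month, df = 5)

# Log returns are additive, so a monthly return is the sum of its daily returns
monthly <- colSums(matrix(daily, nrow = days_per_month))

kurt_daily <- excess_kurtosis(daily)
kurt_monthly <- excess_kurtosis(monthly)
print(c(daily = kurt_daily, monthly = kurt_monthly))
```

The excess kurtosis of the aggregated series should come out much closer to zero than that of the daily series, which is what "more normal" means here.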

In practice, however, most people simply adopt the normal distribution when simulating Brownian motion and accept that the resulting asset prices are not 100% accurate. I, on the other hand, am not satisfied with this half-solution, and I will show the cost of blindly trusting GBM in the examples below. I propose a pseudo-Brownian method in which the random innovations are sampled from a kernel density estimate of the empirical returns rather than from an assumed normal distribution. The advantage of this method is that it produces results closer to those observed in the past without replicating the past exactly (which would be the result of sampling random innovations directly from past innovations).
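The difference between replaying the past and sampling from a kernel density estimate can be sketched as follows. The data are simulated stand-ins, and the sampling scheme shown (resample a past value, then add Gaussian noise with the bandwidth as standard deviation) is one common way to draw from a Gaussian kernel density estimate; it is not necessarily how the function later in this post does it.

```r
set.seed(2013)
# Stand-in for past returns (assumption: scaled t-distributed values, not market data)
rets <- 0.05 * rt(300, df = 4)

# (1) Plain bootstrap: every draw is an exact copy of a past return
boot_draws <- sample(rets, size = 1000, replace = TRUE)

# (2) Draws from a Gaussian kernel density estimate: pick a past return and
#     add normal noise whose standard deviation is the KDE bandwidth
bw <- bw.nrd0(rets)   # default bandwidth rule used by density()
kde_draws <- sample(rets, size = 1000, replace = TRUE) + rnorm(1000, mean = 0, sd = bw)

all(boot_draws %in% rets)   # do all bootstrap draws replay past values?
any(kde_draws %in% rets)    # does any KDE draw coincide with a past value?
```

The bootstrap draws are all past values, while the kernel draws are new values scattered around them, which is the "close to the past, but not a copy" property described above.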

# Introductory examples

Before we get to the interesting part, where we show how much money is on the line, let's start with a simple example. We need to load three packages and their dependencies (you can download the R code for this post at the bottom of the page).

```
install.packages("quantmod")
require(quantmod)
```

For our first example, we will try to simulate the returns of AT&T. The following commands download the price information from Yahoo Finance and calculate the monthly log returns. To back up what I claimed at the beginning, we compare the return distribution with the normal distribution.

```
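att <- getSymbols("T", from = "1985-01-01", to = "2015-12-31", auto.assign = FALSE)
attr <- as.numeric(monthlyReturn(att$T.Adjusted, type = "log"))  # monthly log returns
plot(density(attr), main = "Distribution of AT&T Returns")
rug(jitter(attr))
# overlay a normal density with the same mean and standard deviation
curve(dnorm(x, mean = mean(attr), sd = sd(attr)), add = TRUE, col = "red", lty = 2)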
```

Even without a master's degree in art history, most people would agree that the two lines do not match. For those who do not trust this visual approach, the reliable Kolmogorov-Smirnov (KS) test provides a more formal method.

```
set.seed(2013)
ks.test(attr, rnorm(n = length(attr), mean = mean(attr), sd = sd(attr)))
```

The test returns a p-value of 0.027, which is far from satisfactory (the smaller the p-value, the more strongly we must conclude that the two distributions differ). Next, we set up the standard GBM function. I am fully aware that GBM functions exist in many packages. Nevertheless, I decided to write my own function to make the inner workings more transparent.

```
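myGBM <- function(x, mu, sigma, n, dt = 1) { # n and dt = 1 are assumed; the original signature was lost
  y <- numeric(n)
  for (i in seq_len(n)) {
    y[i] <- x <- x + (mu * dt * x) + # drift
      rnorm(1, mean = 0, sd = 1) * sqrt(dt) * sigma * x # random innovation
  }
  y # simulated price path of length n
}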
```
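Read as a formula, the update in this function appears to be the Euler discretization of the GBM equation over a time step $\Delta t$. The symbols $S_t$ and $\varepsilon_t$ are notation chosen here for illustration: $S_t$ corresponds to `x`, and $\varepsilon_t$ to the draw from `rnorm`.

$$S_{t+\Delta t} = S_t + \mu S_t\,\Delta t + \sigma S_t \sqrt{\Delta t}\,\varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, 1)$$

The factor $\sqrt{\Delta t}$ reflects that the increment of a Wiener process over $\Delta t$ has variance $\Delta t$.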

In this simple function (I know there are more elegant ways to do this, but the result is the same), the rnorm function acts as the driver of the Wiener process. Needless to say, this does not respect what we have seen above. In contrast, my pseudo-Brownian function samples the random innovations from a kernel density estimate of past empirical returns.

```
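pseudoGBM <- function(x, rets, n, ...) {
  # Kernel density estimate of the past returns; ... is passed on to density().
  # Assumption: the returns are demeaned first, so that mean(rets) below acts
  # as the drift and the sampled values as zero-mean innovations.
  dens <- density(rets - mean(rets), ...)
  # Draw n innovations from the density grid, weighted by the estimated density
  samp <- sample(dens$x, size = n, replace = TRUE, prob = dens$y)
  y <- numeric(n)
  for (i in 1:n) {
    y[[i]] <- x + x * (mean(rets) + samp[i])
    x <- y[[i]]
  }
  return(y)
}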
```

Admittedly, this function is a bit terse, because it assumes static increments (that is, dt = 1) and requires little user input. It only needs a starting value (x), a vector of past returns (rets), and the desired path length (n). The `...` argument lets the user pass further arguments to the density function, so the smoothness of the kernel density estimate can be controlled by supplying a bandwidth (bw =). Without further ado, let's use the above functions for simulation. In the first example, we use each of the two functions to simulate a single price path from the starting value x, which is the last price in the series. To see how the two methods perform, we calculate the returns of the simulated series and compare their distributions with the empirical distribution.

```
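x <- as.numeric(tail(att$T.Adjusted, n = 1))
set.seed(2013)
# Assumption: the path length n = length(attr) is chosen here; the original value was lost
attGBM <- myGBM(x = x, mu = mean(attr), sigma = sd(attr), n = length(attr))
attPGBM <- pseudoGBM(x = x, rets = attr, n = length(attr))
# log returns of the two simulated paths
attGBMr <- diff(log(attGBM))
attPGBMr <- diff(log(attPGBM))
d1 <- density(attr)
d2 <- density(attGBMr)
d3 <- density(attPGBMr)
plot(range(d1$x, d2$x, d3$x), range(d1$y, d2$y, d3$y), type = "n",
     ylab = "Density", xlab = "Returns", main = "Comparison of Achieved Densities")
lines(d1, col = "black", lwd = 1)  # empirical returns
lines(d2, col = "red", lty = 2)    # standard GBM
lines(d3, col = "blue", lty = 3)   # pseudo GBM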
```

Clearly, the pseudoGBM function (blue line) does better than the standard GBM function (red line) at generating returns close to the empirical return distribution (black line). Again, critical (or visually impaired) readers can check the results of the KS test.

```
ks.test(attr, attPGBMr)
ks.test(attr, attGBMr)
```

Again, we see that the pseudoGBM function (p-value = 0.41) does much better than the GBM function (p-value = 0.02).

# Advanced examples

As promised, our second example shows how much money is on the line when someone wrongly assumes a normal distribution although it does not represent the underlying data. In the wake of the financial dark ages, Europe has developed a particular appetite for structured financial products that allow participation in the stock market while limiting or eliminating downside risk. Such securities are usually path dependent, so GBM is usually used to model them.

We will use a specific product offered by Generali Germany, Rente Chance Plus, which is the original reason why I developed the pseudoGBM function. When I worked at a private bank, my task was to evaluate this particular security. I started with a standard Monte Carlo simulation based on GBM, but soon realized that this was not enough. Rente Chance Plus offers 20% participation in the EUROSTOXX 50 index, capped at an index return of 15%, with no downside for the initial investment or for realized gains. The security is evaluated at the end of each year. Although Generali is free to change the participation rate and the cap rate over the 20-year investment period, we will assume for demonstration purposes that these factors remain unchanged.
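As a rough sketch of how these terms translate into a yearly credited return, consider the following helper. The function name `annual_payoff` and its argument names are hypothetical; the 20% participation, the 15% cap, and the zero floor follow the description above.

```r
# Hypothetical helper: annual return credited to the investor.
# participation and cap default to the figures quoted in the text.
annual_payoff <- function(index_return, participation = 0.20, cap = 0.15) {
  floored <- pmax(index_return, 0)   # no downside: losses are not passed on
  capped <- pmin(floored, cap)       # the index return counts only up to the cap
  participation * capped
}

index_returns <- c(-0.10, 0.05, 0.15, 0.30)
annual_payoff(index_returns)
```

Under these assumptions, index returns of -10%, 5%, 15% and 30% are credited as 0%, 1%, 3% and 3%, so the most the investor can earn in a single year is 0.15 × 0.20 = 3%.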

Mirroring our procedure from above, we first download the EUROSTOXX 50 price information from Yahoo Finance.

`eu <- getSymbols("^STOXX50E", from = "1990-01-01", to = "2015-12-31", auto.assign = FALSE)`

Next, let's look at how well, or how badly, the data fit the normal distribution.

```
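# monthly log returns (Ad() extracts the adjusted close; missing quotes are dropped)
eu.r <- as.numeric(monthlyReturn(na.omit(Ad(eu)), type = "log"))
plot(density(eu.r), main = "Distribution of EUROSTOXX 50 Returns")
ks.test(eu.r, rnorm(n = length(eu.r), mean = mean(eu.r), sd = sd(eu.r)))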
```

Strictly speaking, this looks even worse than the AT&T distribution. The EUROSTOXX returns are clearly negatively skewed and slightly leptokurtic. The KS test returns a p-value of 0.06, confirming the visual mismatch. Now that we have established that the normal distribution is not the most appropriate choice, we can look at the consequences of wrongly assuming it. We will run a simulation with 10000 iterations using both the standard GBM and my pseudoGBM function and compare the results (if you are replicating the following code, treat yourself to a cup of coffee while you wait, because it will take some time to run).

```
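x <- as.numeric(tail(na.omit(Ad(eu)), n = 1))  # last observed index level
# 10000 simulated paths of 240 months each, one path per column
SIM1 <- as.data.frame(matrix(replicate(10000, {eu.GBM <- myGBM(x = x, mu = mean(eu.r), sigma = sd(eu.r), n = 240)}), ncol = 10000))
SIM2 <- as.data.frame(matrix(replicate(10000, {eu.PGBM <- pseudoGBM(x = x, n = 240, rets = eu.r)}), ncol = 10000))
# keep the starting value plus every 12th month, giving yearly series from 2016 on
sim1 <- ts(rbind(rep(x, 10000), as.matrix(SIM1[seq(12, 240, 12), ])), start = c(2016), frequency = 1)
sim2 <- ts(rbind(rep(x, 10000), as.matrix(SIM2[seq(12, 240, 12), ])), start = c(2016), frequency = 1)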
```
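If the waiting time is a concern, the standard GBM recursion with dt = 1 can also be written without an explicit loop over time steps. The following is a sketch with illustrative parameter values (`x0`, `mu`, `sigma` and the number of paths are assumptions, not the values estimated from the index); it checks the vectorized result against the explicit recursion for one path.

```r
set.seed(2013)
n_steps <- 240     # months, as in the simulation above
n_paths <- 1000    # fewer paths than in the text, to keep the sketch quick
x0 <- 3000         # illustrative starting level, not the actual index value
mu <- 0.004        # illustrative monthly drift
sigma <- 0.05      # illustrative monthly volatility

# One standard normal innovation per step and path
Z <- matrix(rnorm(n_steps * n_paths), nrow = n_steps)

# With dt = 1 each step multiplies the price by (1 + mu + sigma * z),
# so a path is the starting value times a cumulative product
paths <- x0 * apply(1 + mu + sigma * Z, 2, cumprod)

# Cross-check the first path against the explicit recursion
x_loop <- x0
loop_path <- numeric(n_steps)
for (i in seq_len(n_steps)) {
  x_loop <- x_loop + mu * x_loop + sigma * Z[i, 1] * x_loop
  loop_path[i] <- x_loop
}
all.equal(paths[, 1], loop_path)
```

The same idea would carry over to the pseudo-Brownian variant by replacing the scaled normal innovations with draws from the kernel density estimate.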

Of course, we are not interested in the price level of the EUROSTOXX 50 itself, but in the returns as evaluated under the constraints of the participation rate and the cap rate. The good news is that the hardest part is behind us. Calculating the returns and applying the constraints is simple. Presenting the results takes a little more effort.

```
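s1.r <- diff(log(sim1))  # annual log returns, standard GBM
s2.r <- diff(log(sim2))  # annual log returns, pseudo GBM
# no downside, 20% participation, index return capped at 15%
s1.r <- ifelse(s1.r <= 0, 0, ifelse(s1.r <= 0.15, s1.r * 0.2, 0.15 * 0.20))
s2.r <- ifelse(s2.r <= 0, 0, ifelse(s2.r <= 0.15, s2.r * 0.2, 0.15 * 0.20))
S1 <- colSums(s1.r)  # cumulative return per simulated path
S2 <- colSums(s2.r)
plot(density(S1), col = "darkred", main = "Cumulative Returns")
lines(density(S2), col = "darkblue")
rug(jitter(S1), side = 3, col = "darkred")
rug(jitter(S2), side = 1, col = "darkblue")
ks.test(S1, S2)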
```

We can clearly see that the cumulative returns simulated by the pseudoGBM function (blue) are negatively skewed and cover a wider range than those simulated by the standard GBM function (red). Note that because the security has no downside, the distributions look different in the lower tail. The KS test confirms with great certainty that the two distributions differ (although the small p-value is mainly due to the large sample size). Now to the million-dollar question (quite literally): how much is on the line? Well, if Generali used a normal distribution to forecast returns and reinsured accordingly, they would have

`mean(S1) - mean(S2)`

**… underestimated the cumulative return by about 0.6%.** This may not seem like much, but if we assume that the volume of the security is 1 billion euros, then Generali would fall short by 6 million euros, a considerable amount of money lost just by assuming the wrong distribution.
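Spelled out, the arithmetic behind that figure, using the rounded 0.6% and the assumed volume of 1 billion euros, is:

$$0.006 \times 1{,}000{,}000{,}000\ \text{EUR} = 6{,}000{,}000\ \text{EUR}$$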

# Conclusions and limitations

So what did we learn from this? The distribution used to model the innovations in any path-dependent security pricing model can have a significant impact. Although this statement is obvious in itself, the size of the difference between the distributions is surprising. Of course, the people working at Generali and other institutions are probably smarter than I am, and they know very well that the normal distribution is not always the best choice. However, most of them will use more formal (but possibly inaccurate) distributions, such as the t-distribution or the Cauchy distribution. Using a kernel density distribution is a method I have not heard of anywhere else, and there are reasons for that.

Firstly, there is no guarantee that the kernel density estimate represents the unknown underlying distribution any more accurately than the normal distribution does. Using past data to predict the future always makes a data scientist uneasy, but unfortunately we have no choice. Admittedly, the normal distribution inherent in the standard GBM relies on past information too (namely the historical mean and standard deviation), but it has the great advantage of formal solutions because of its core role (the pun occurred to me only in hindsight) in probability theory.

Secondly, kernel density estimation is very sensitive to the bandwidth used. If the bandwidth is too large, you get a smooth distribution that is barely different from the normal distribution. If the bandwidth is too small, you get a distribution that puts heavy emphasis on extreme values, especially if the data sample from which you estimate the kernel density is quite small. In the examples above, we used the automatic bandwidth selector built into the density function, but there is hardly any way to know what the optimal bandwidth is.
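The bandwidth sensitivity can be made visible with a small experiment. This is a sketch on simulated data; the sample, the two fixed bandwidths, and the helper `count_modes` are assumptions chosen to exaggerate the effect.

```r
set.seed(2013)
# Stand-in for a small sample of returns (assumption: simulated, not market data)
rets <- 0.05 * rt(100, df = 4)

# Count the local maxima (bumps) of a density estimate
count_modes <- function(d) sum(diff(sign(diff(d$y))) < 0)

d_auto  <- density(rets)              # automatic bandwidth (bw = "nrd0")
d_small <- density(rets, bw = 0.002)  # very small bandwidth: spiky estimate
d_large <- density(rets, bw = 0.2)    # very large bandwidth: oversmoothed estimate

# Bandwidths actually used and number of bumps in each estimate
c(auto = d_auto$bw, small = d_small$bw, large = d_large$bw)
c(auto = count_modes(d_auto), small = count_modes(d_small), large = count_modes(d_large))
```

Plotting the three estimates with `plot(d_small)`, `lines(d_auto)` and `lines(d_large)` shows the same picture: the small bandwidth puts a spike on almost every observation, while the large one smooths away nearly all structure.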

The method above has further limitations, because we have made a number of rather unrealistic assumptions. In the Generali example, we assumed that Generali does not change the participation rate and the cap rate, which is unlikely. More generally, we made some basic assumptions about financial markets. Knowingly (hopefully), we assumed that capital markets are efficient. We therefore assumed that there is no autocorrelation in the returns, which is the second condition of the Wiener process. But does this represent the underlying data?

`acf(eu.r, main = "Autocorrelation of EUROSTOXX 50 Returns")`
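A formal complement to reading the autocorrelation plot could be the Ljung-Box test from base R. The sketch below uses simulated series rather than the index returns (the AR(1) coefficient of 0.5 and the lag of 12 are arbitrary assumptions), simply to show what the test reports when independence does or does not hold.

```r
set.seed(2013)
n <- 300
white_noise <- rnorm(n)                                        # independent innovations
ar1 <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))    # autocorrelated series

# Ljung-Box test; null hypothesis: no autocorrelation up to the chosen lag
lb_white <- Box.test(white_noise, lag = 12, type = "Ljung-Box")
lb_ar1 <- Box.test(ar1, lag = 12, type = "Ljung-Box")

c(white_noise = lb_white$p.value, ar1 = lb_ar1$p.value)
```

A small p-value, as for the autocorrelated series, speaks against independent innovations; applying the same call to `eu.r` would test the second Wiener condition on the actual data.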