The practical application of R language Fama French three factor model: portfolio optimization

Time:2021-4-8

Link to the original text:http://tecdat.cn/?p=20360 

This article explains how to implement and use factor models in R for portfolio optimization in financial mathematics.

Macroeconomic factor model with single market factor

We’ll start with a simple example of a single known factor, the market index. The model is

$$x_t = \alpha + \beta f_t + \epsilon_t, \quad t = 1, \dots, T$$

The explicit factor $f_t$ is the S&P 500 index. We will do a simple least squares (LS) regression to estimate the intercept $\alpha$ and the loading $\beta$:

$$\underset{\alpha, \beta}{\text{minimize}} \quad \sum_{t=1}^{T} \| x_t - \alpha - \beta f_t \|^2$$

Most lines of code are used to prepare data rather than to perform factor modeling. Let’s start preparing the data:

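library(xts)
library(quantmod)

# Set the start and end dates and the list of stock names
begin_date <- "2016-01-01"
end_date <- "2017-12-31"
stock_namelist <- c("AAPL", "AMD", "ADI", "ABBV", "AEZS", "A", "APD", "AA", "CF")
sector_namelist <- c(rep("Information Technology", 3), rep("Health Care", 3), rep("Materials", 3))

# Download the adjusted prices from Yahoo Finance
data_set <- xts()
for (stock_index in 1:length(stock_namelist))
  data_set <- cbind(data_set, Ad(getSymbols(stock_namelist[stock_index],
                                            from = begin_date, to = end_date,
                                            auto.assign = FALSE)))
colnames(data_set) <- stock_namelist

# S&P 500 index as the market factor ("^GSPC" is the assumed Yahoo ticker)
SP500_index <- Ad(getSymbols("^GSPC", from = begin_date, to = end_date, auto.assign = FALSE))
colnames(SP500_index) <- "index"
head(data_set)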
#>                AAPL  AMD      ADI     ABBV AEZS        A       APD       AA       CF
#> 2016-01-04 98.74225 2.77 49.99239 49.46063 4.40 39.35598 107.89010 23.00764 35.13227
#> 2016-01-05 96.26781 2.75 49.62508 49.25457 4.21 39.22057 105.96097 21.96506 34.03059
#> 2016-01-06 94.38389 2.51 47.51298 49.26315 3.64 39.39467 103.38042 20.40121 31.08988
#> 2016-01-07 90.40047 2.28 46.30082 49.11721 3.29 37.72138  99.91463 19.59558 29.61520
#> 2016-01-08 90.87848 2.14 45.89677 47.77789 3.29 37.32482  99.39687 19.12169 29.33761
#> 2016-01-11 92.35001 2.34 46.98954 46.25827 3.13 36.69613  99.78938 18.95583 28.14919

head(SP500_index)
#>              index
#> 2016-01-04 2012.66
#> 2016-01-05 2016.71
#> 2016-01-06 1990.26
#> 2016-01-07 1943.09
#> 2016-01-08 1922.03
#> 2016-01-11 1923.67
plot(SP500_index)


# Compute the log-returns of the stocks and of the S&P 500 index (the explicit factor)
X <- diff(log(data_set), na.pad = FALSE)
N <- ncol(X)  # number of stocks
T <- nrow(X)  # number of days
f <- diff(log(SP500_index), na.pad = FALSE)

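Now we are ready to fit the factor model. The LS fit is easy to implement in R using the closed-form solution

$$\hat{\beta} = \frac{\mathrm{cov}(x_t, f_t)}{\mathrm{var}(f_t)}, \qquad \hat{\alpha} = \bar{x} - \hat{\beta} \bar{f}$$

where $\bar{x}$ and $\bar{f}$ are the sample means of the returns and of the factor: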

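beta <- cov(X, f)/as.numeric(var(f))       # N x 1 loadings
alpha <- colMeans(X) - beta*colMeans(f)    # N x 1 intercepts
sigma2 <- rep(NA, N)                       # residual variance of each stock
for (i in 1:N) {
  eps_i <- X[, i] - alpha[i] - beta[i]*f   # residuals of stock i
  sigma2[i] <- (1/(T-2)) * t(eps_i) %*% eps_i
}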

print(alpha)
#>              index
#> AAPL  0.0003999086
#> AMD   0.0013825599
#> ADI   0.0003609968
#> ABBV  0.0006684632
#> AEZS -0.0022091301
#> A     0.0002810616
#> APD   0.0001786375
#> AA    0.0006429140
#> CF   -0.0006029705
print(beta)
#>          index
#> AAPL 1.0957919
#> AMD  2.1738304
#> ADI  1.2683047
#> ABBV 0.9022748
#> AEZS 1.7115761
#> A    1.3277212
#> APD  1.0239453
#> AA   1.8593524
#> CF   1.5702493 
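The closed-form estimates can be sanity-checked without downloading anything. The following is a minimal sketch on synthetic data: the names `T_obs`, `alpha_true`, `beta_true` and all the numbers are made up for illustration, and `lm()` is used only as an independent cross-check.

```r
# Synthetic single-factor data (all values are hypothetical)
set.seed(42)
T_obs <- 500                                   # number of days
N <- 4                                         # number of assets
f <- rnorm(T_obs, mean = 0.0005, sd = 0.01)    # made-up market log-returns
beta_true <- c(0.8, 1.0, 1.5, 2.0)
alpha_true <- c(0.0002, 0, -0.0003, 0.001)
X <- sapply(1:N, function(i) alpha_true[i] + beta_true[i] * f + rnorm(T_obs, sd = 0.005))
colnames(X) <- paste0("asset", 1:N)

# Closed-form LS estimates, as in the text
beta <- cov(X, f) / var(f)                     # N x 1 matrix of loadings
alpha <- colMeans(X) - beta * mean(f)          # N x 1 matrix of intercepts

# Cross-check against lm(), one regression per asset
coef_lm <- t(sapply(1:N, function(i) coef(lm(X[, i] ~ f))))
print(cbind(alpha = alpha[, 1], beta = beta[, 1]))
print(max(abs(coef_lm - cbind(alpha, beta))))  # difference at floating-point level
```

The two approaches should agree up to rounding, since both solve the same least squares problem.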

Alternatively, we can do the fitting in matrix notation (time runs along the rows of X):

$$\underset{\alpha, \beta}{\text{minimize}} \quad \| X^T - \alpha 1^T - \beta f^T \|_F^2$$

We define $\Gamma = [\alpha, \beta]$ and the extended factors $\tilde{F} = [1, f]$. Then we minimize

$$\underset{\Gamma}{\text{minimize}} \quad \| X^T - \Gamma \tilde{F}^T \|_F^2$$

whose solution is $\Gamma = X^T \tilde{F} (\tilde{F}^T \tilde{F})^{-1}$:

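F_ <- cbind(ones = 1, f)                       # extended factor matrix [1, f]
Gamma <- t(X) %*% F_ %*% solve(t(F_) %*% F_)   # LS solution; columns are alpha and beta
colnames(Gamma) <- c("alpha", "beta")
print(Gamma)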

#>              alpha      beta
#> AAPL  0.0003999086 1.0957919
#> AMD   0.0013825599 2.1738304
#> ADI   0.0003609968 1.2683047
#> ABBV  0.0006684632 0.9022748
#> AEZS -0.0022091301 1.7115761
#> A     0.0002810616 1.3277212
#> APD   0.0001786375 1.0239453
#> AA    0.0006429140 1.8593524
#> CF   -0.0006029705 1.5702493
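E <- xts(t(t(X) - Gamma %*% t(F_)), index(X))  # residuals
Psi <- (1/(T-2)) * t(E) %*% E                  # residual covariance matrix
Sigma <- as.numeric(var(f)) * Gamma[, 2] %o% Gamma[, 2] + diag(diag(Psi))  # model covariance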
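As a rough illustration of the matrix formulation, here is a sketch on synthetic data. The variable names mirror the ones used in the text, but the data and the name `T_obs` are invented; it also checks that the matrix solution reproduces the cov/var formula for beta.

```r
# Synthetic data: three assets driven by one factor (values are hypothetical)
set.seed(1)
T_obs <- 400; N <- 3
f <- rnorm(T_obs, sd = 0.01)
X <- cbind(a = 0.001 + 0.9 * f, b = 1.2 * f, c = -0.0005 + 1.6 * f) +
  matrix(rnorm(T_obs * N, sd = 0.004), T_obs, N)

F_ <- cbind(ones = 1, f)                        # T x 2 extended factor matrix
Gamma <- t(solve(t(F_) %*% F_, t(F_) %*% X))    # N x 2: columns are alpha and beta
colnames(Gamma) <- c("alpha", "beta")

E <- X - F_ %*% t(Gamma)                        # T x N residuals
Psi <- crossprod(E) / (T_obs - 2)               # residual covariance matrix
Sigma <- var(f) * Gamma[, "beta"] %o% Gamma[, "beta"] + diag(diag(Psi))  # model covariance

# The matrix solution agrees with the cov/var formula
beta_cf <- as.numeric(cov(X, f) / var(f))
print(Gamma)
print(max(abs(Gamma[, "beta"] - beta_cf)))
```

Using `solve(A, b)` rather than forming the explicit inverse is a common numerical choice; both give the same Gamma here.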

Alternatively, we can let an existing R function do the fitting for us and store the result in factor_model:

 cbind(alpha = factor_model$alpha, beta = factor_model$beta)
#>              alpha     index
#> AAPL  0.0003999086 1.0957919
#> AMD   0.0013825599 2.1738304
#> ADI   0.0003609968 1.2683047
#> ABBV  0.0006684632 0.9022748
#> AEZS -0.0022091301 1.7115761
#> A     0.0002810616 1.3277212
#> APD   0.0001786375 1.0239453
#> AA    0.0006429140 1.8593524
#> CF   -0.0006029705 1.5702493 

Visualizing covariance matrices

It is interesting to visualize the estimated covariance matrix of the log-returns,

$$\Sigma = \mathrm{var}(f_t) \, \beta \beta^T + \mathrm{Diag}(\Psi)$$

as well as the estimated covariance matrix of the residuals, Ψ. Let’s start with the covariance matrix of the log-returns:

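# corrplot package assumed for the correlation plot
corrplot::corrplot(cov2cor(Sigma), mar = c(0, 0, 1, 0), main = "Covariance matrix of log-returns from the single-factor model")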


We can observe that all the stocks are highly correlated, which is the effect of the market factor. To inspect the dependence among the stocks once the market has been removed, we plot the correlation matrix of the residuals, Ψ:

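# corrplot package assumed; order = "hclust" groups similar stocks together
corrplot::corrplot(cov2cor(Psi), mar = c(0, 0, 1, 0), order = "hclust", addrect = 3,
                   main = "Covariance matrix of residuals")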


cbind(stock_namelist, sector_namelist)  # sector of each stock
#>       stock_namelist sector_namelist         
#>  [1,] "AAPL"         "Information Technology"
#>  [2,] "AMD"          "Information Technology"
#>  [3,] "ADI"          "Information Technology"
#>  [4,] "ABBV"         "Health Care"           
#>  [5,] "AEZS"         "Health Care"           
#>  [6,] "A"            "Health Care"           
#>  [7,] "APD"          "Materials"             
#>  [8,] "AA"           "Materials"             
#>  [9,] "CF"           "Materials"

Interestingly, we can observe that the automatic clustering performed on Ψ correctly identifies the sector of each stock.
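The idea that residual correlations reveal sectors can be illustrated with a small sketch. It assumes synthetic returns with one hidden factor per sector (all names and numbers are hypothetical) and uses base R's `hclust` directly instead of a plotting package.

```r
# Synthetic returns: one market factor plus one hidden factor per sector
set.seed(7)
T_obs <- 500
sectors <- rep(c("Tech", "Health", "Materials"), each = 3)   # hypothetical labels
N <- length(sectors)
f <- rnorm(T_obs, sd = 0.01)                                 # market factor
g <- matrix(rnorm(T_obs * 3, sd = 0.02), T_obs, 3)           # sector factors
colnames(g) <- unique(sectors)
X <- sapply(1:N, function(i) 1.2 * f + g[, sectors[i]] + rnorm(T_obs, sd = 0.005))
colnames(X) <- paste0(sectors, 1:N)

# Remove the market factor by LS, then cluster the residual correlations
F_ <- cbind(1, f)
E <- X - F_ %*% solve(t(F_) %*% F_, t(F_) %*% X)
clusters <- cutree(hclust(as.dist(1 - cor(E)), method = "average"), k = 3)
print(table(sectors, clusters))
```

Each sector should land in its own cluster, because the market component that made all assets look alike has been regressed out.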

Evaluating investment funds

In this example, we will assess the performance of several investment funds based on factor models. We use the S&P 500 index as the explicit market factor and assume a zero risk-free return, r_f = 0. In particular, we consider six exchange-traded funds (ETFs): SPY, XIVH, SPHB, SPLV, USMV, and JKD.

First, we load the data

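# Set the start and end dates and the list of ETF names
begin_date <- "2016-10-01"
end_date <- "2017-06-30"
stock_namelist <- c("SPY", "XIVH", "SPHB", "SPLV", "USMV", "JKD")

# Download the adjusted prices from Yahoo Finance
data_set <- xts()
for (stock_index in 1:length(stock_namelist))
  data_set <- cbind(data_set, Ad(getSymbols(stock_namelist[stock_index],
                                            from = begin_date, to = end_date,
                                            auto.assign = FALSE)))
colnames(data_set) <- stock_namelist

# S&P 500 index as the market factor ("^GSPC" is the assumed Yahoo ticker)
SP500_index <- Ad(getSymbols("^GSPC", from = begin_date, to = end_date, auto.assign = FALSE))
colnames(SP500_index) <- "index"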

head(data_set)
#>                 SPY   XIVH     SPHB     SPLV     USMV      JKD
#> 2016-10-03 203.6610 29.400 31.38322 38.55683 42.88382 119.8765
#> 2016-10-04 202.6228 30.160 31.29729 38.10687 42.46553 119.4081
#> 2016-10-05 203.5195 30.160 31.89880 38.02249 42.37048 119.9421
#> 2016-10-06 203.6610 30.160 31.83196 38.08813 42.39899 120.0826
#> 2016-10-07 202.9626 30.670 31.58372 37.98500 42.35146 119.8296
#> 2016-10-10 204.0197 31.394 31.87970 38.18187 42.56060 120.5978

head(SP500_index)
#>              index
#> 2016-10-03 2161.20
#> 2016-10-04 2150.49
#> 2016-10-05 2159.73
#> 2016-10-06 2160.77
#> 2016-10-07 2153.74
#> 2016-10-10 2163.66

# Compute the log-returns of the ETFs and of the S&P 500 index (the explicit factor)
X <- diff(log(data_set), na.pad = FALSE)
N <- ncol(X)  # number of ETFs
T <- nrow(X)  # number of days
f <- diff(log(SP500_index), na.pad = FALSE)

Now we can compute the alpha and beta of all the ETFs:

 #>              alpha      beta
#> SPY   7.142225e-05 1.0071424
#> XIVH  1.810392e-03 2.4971086
#> SPHB -2.422107e-04 1.5613533
#> SPLV  1.070918e-04 0.6777149
#> USMV  1.166177e-04 0.6511667
#> JKD   2.569578e-04 0.8883843 

Some observations can now be made:

  • SPY is an ETF that tracks the S&P 500. As expected, its alpha is almost zero and its beta is almost one: α = 7.142225 × 10^-5 and β = 1.0071424.
  • XIVH is an ETF with a high alpha, and the computed alpha is indeed the highest among the ETFs (1-2 orders of magnitude larger): α = 1.810392 × 10^-3.
  • SPHB is an ETF that is supposed to have a high beta; the computed beta is among the highest, but not the highest: β = 1.5613533. Interestingly, the computed alpha is negative, so this ETF should be treated with caution.
  • SPLV is an ETF designed to reduce volatility, and the computed beta is indeed low: β = 0.6777149.
  • USMV is also an ETF designed to reduce volatility, and the computed beta is indeed the lowest: β = 0.6511667.
  • JKD shows a good compromise.

We can use some visualizations:

barplot(rev(alpha), horiz = TRUE, main = "alpha", las = 1)
barplot(rev(beta), horiz = TRUE, main = "beta", las = 1)


We can also use the Sharpe ratio to compare the ETFs more systematically. Recall the factor model for one asset and one factor:

$$x_t = \alpha + \beta f_t + \epsilon_t, \quad t = 1, \dots, T$$

We get

$$E[x_t] = \alpha + \beta E[f_t], \qquad \mathrm{var}[x_t] = \beta^2 \mathrm{var}[f_t] + \sigma^2$$

where $\sigma^2$ is the variance of the residual $\epsilon_t$. The Sharpe ratio is as follows:

$$SR = \frac{E[x_t]}{\sqrt{\mathrm{var}[x_t]}} = \frac{\alpha + \beta E[f_t]}{\sqrt{\beta^2 \mathrm{var}[f_t] + \sigma^2}} = \frac{\alpha/\beta + E[f_t]}{\sqrt{\mathrm{var}[f_t] + \sigma^2/\beta^2}} \approx \frac{\alpha/\beta + E[f_t]}{\sqrt{\mathrm{var}[f_t]}}$$

where the approximation assumes $\beta > 0$ and $\sigma^2 \ll \beta^2 \mathrm{var}[f_t]$. Therefore, one way to rank different assets by Sharpe ratio is to rank them by the ratio α/β:

 print(ranking)
#>         alpha/beta         SR         alpha      beta
#> XIVH  7.249952e-04 0.13919483  1.810392e-03 2.4971086
#> JKD   2.892417e-04 0.17682677  2.569578e-04 0.8883843
#> USMV  1.790904e-04 0.12280053  1.166177e-04 0.6511667
#> SPLV  1.580189e-04 0.10887903  1.070918e-04 0.6777149
#> SPY   7.091574e-05 0.14170591  7.142225e-05 1.0071424
#> SPHB -1.551287e-04 0.07401566 -2.422107e-04 1.5613533 
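The alpha/beta column of this ranking can be reproduced from the alpha and beta values reported earlier. The sketch below hard-codes those reported numbers; the Sharpe ratio column is left out because it needs the return series themselves.

```r
# alpha and beta as reported in the ETF table above
alpha <- c(SPY = 7.142225e-05, XIVH = 1.810392e-03, SPHB = -2.422107e-04,
           SPLV = 1.070918e-04, USMV = 1.166177e-04, JKD = 2.569578e-04)
beta <- c(SPY = 1.0071424, XIVH = 2.4971086, SPHB = 1.5613533,
          SPLV = 0.6777149, USMV = 0.6511667, JKD = 0.8883843)

ratio <- alpha / beta                          # proxy for the Sharpe ratio ranking
idx <- order(ratio, decreasing = TRUE)         # best alpha/beta first
ranking <- cbind("alpha/beta" = ratio[idx], alpha = alpha[idx], beta = beta[idx])
print(ranking)
```

With the return matrix X available, a Sharpe ratio column could be added along the lines of `colMeans(X) / sqrt(diag(var(X)))`, under the zero risk-free rate assumed in this example.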

We can see that:

  • In terms of α/β, XIVH is the best (it has the largest α), while SPHB is the worst (its α is negative).
  • In terms of Sharpe ratio (or rather information ratio, because we ignore the risk-free rate), JKD is the best, followed by SPY. This confirms the view that most investment funds do not outperform the market.
  • SPHB is clearly the worst by any measure: negative α, negative α/β ratio, and the lowest Sharpe ratio.
  • JKD achieves the best performance because it has a good alpha (though not the best) and a moderate beta of 0.88.
  • XIVH and SPHB have large betas and therefore extreme exposure to the market.
  • USMV has the lowest exposure to the market and an acceptable alpha, and its Sharpe ratio is close to the second- and third-highest values.

Fama-French three-factor model

This example illustrates the Fama-French three-factor model using nine stocks from the S&P 500 index. Let’s start by loading the data:

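# Set the start and end dates and the list of stock names
begin_date <- "2013-01-01"
end_date <- "2017-08-31"
stock_namelist <- c("AAPL", "AMD", "ADI", "ABBV", "AEZS", "A", "APD", "AA", "CF")

# Download the adjusted prices from Yahoo Finance
data_set <- xts()
for (stock_index in 1:length(stock_namelist))
  data_set <- cbind(data_set, Ad(getSymbols(stock_namelist[stock_index],
                                            from = begin_date, to = end_date,
                                            auto.assign = FALSE)))
colnames(data_set) <- stock_namelist

# Load the Fama-French factors
# (the file name and skip count assume a local copy of the daily factor CSV)
mydata <- read.csv("F-F_Research_Data_Factors_daily.CSV", skip = 4)
mydata <- mydata[-nrow(mydata), ]  # remove the last row
fama_lib <- xts(x = mydata[, c(2, 3, 4)], order.by = as.Date(paste(mydata[, 1]), "%Y%m%d"))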

head(fama_lib)
#>            Mkt.RF   SMB   HML
#> 1926-07-01   0.10 -0.24 -0.28
#> 1926-07-02   0.45 -0.32 -0.08
#> 1926-07-06   0.17  0.27 -0.35
#> 1926-07-07   0.09 -0.59  0.03
#> 1926-07-08   0.21 -0.36  0.15
#> 1926-07-09  -0.71  0.44  0.56
tail(fama_lib)
#>            Mkt.RF   SMB   HML
#> 2017-11-22  -0.05  0.10 -0.04
#> 2017-11-24   0.21  0.02 -0.44
#> 2017-11-27  -0.06 -0.36  0.03
#> 2017-11-28   1.06  0.38  0.84
#> 2017-11-29   0.02  0.04  1.45
#> 2017-11-30   0.82 -0.56 -0.50

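# Compute the log-returns of the stocks and align the Fama-French factors with them
X <- diff(log(data_set), na.pad = FALSE)
N <- ncol(X)  # number of stocks
T <- nrow(X)  # number of days
F <- fama_lib[index(X)]/100  # factors are reported in percent, hence the division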

Now we have the three factors in the matrix F and want to fit the model

$$X^T = \alpha 1^T + B F^T + E^T$$

where the loadings now form a matrix of betas, $B = [\beta_1, \beta_2, \beta_3] \in \mathbb{R}^{N \times 3}$. We can do an LS fit:

$$\underset{\alpha, B}{\text{minimize}} \quad \| X^T - \alpha 1^T - B F^T \|_F^2$$

More conveniently, we define $\Gamma = [\alpha, B]$ and the extended factors $\tilde{F} = [1, F]$. The LS formulation can then be written as

$$\underset{\Gamma}{\text{minimize}} \quad \| X^T - \Gamma \tilde{F}^T \|_F^2$$

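F_ <- cbind(ones = 1, F)                          # extended factor matrix [1, F]
Gamma <- t(solve(t(F_) %*% F_, t(F_) %*% X))      # LS solution: alpha followed by the loadings
colnames(Gamma) <- c("alpha", "b1", "b2", "b3")
print(Gamma)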
#>              alpha        b1          b2          b3
#> AAPL  1.437845e-04 0.9657612 -0.23339130 -0.49806858
#> AMD   6.181760e-04 1.4062105  0.80738336 -0.07240117
#> ADI  -2.285017e-05 1.2124008  0.09025928 -0.20739271
#> ABBV  1.621380e-04 1.0582340  0.02833584 -0.72152627
#> AEZS -4.513235e-03 0.6989534  1.31318108 -0.25160182
#> A     1.146100e-05 1.2181429  0.10370898 -0.20487290
#> APD   6.281504e-05 1.0222936 -0.04394061  0.11060938
#> AA   -4.587722e-05 1.3391852  0.62590136  0.99858692
#> CF   -5.777426e-04 1.0387867  0.48430007  0.82014523 

Alternatively, an existing R function can do the fit for us:

#>              alpha    Mkt.RF         SMB         HML
#> AAPL  1.437845e-04 0.9657612 -0.23339130 -0.49806858
#> AMD   6.181760e-04 1.4062105  0.80738336 -0.07240117
#> ADI  -2.285017e-05 1.2124008  0.09025928 -0.20739271
#> ABBV  1.621380e-04 1.0582340  0.02833584 -0.72152627
#> AEZS -4.513235e-03 0.6989534  1.31318108 -0.25160182
#> A     1.146100e-05 1.2181429  0.10370898 -0.20487290
#> APD   6.281504e-05 1.0222936 -0.04394061  0.11060938
#> AA   -4.587722e-05 1.3391852  0.62590136  0.99858692
#> CF   -5.777426e-04 1.0387867  0.48430007  0.82014523 
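The multi-factor fit works exactly like the single-factor one, only with a wider factor matrix. Below is a minimal sketch on synthetic data: the factor values are random numbers that merely borrow the Fama-French column names, and the division by 100 mirrors the assumption that the factors are quoted in percent.

```r
# Synthetic three-factor data (all values are hypothetical)
set.seed(3)
T_obs <- 600; N <- 4; K <- 3
F_pct <- matrix(rnorm(T_obs * K, sd = 1), T_obs, K,
                dimnames = list(NULL, c("Mkt.RF", "SMB", "HML")))
F_FamaFrench <- F_pct / 100                      # percent to decimal returns
B_true <- matrix(runif(N * K, -0.5, 1.5), N, K)  # made-up loadings
X <- F_FamaFrench %*% t(B_true) + matrix(rnorm(T_obs * N, sd = 0.005), T_obs, N)
colnames(X) <- paste0("stock", 1:N)

F_ <- cbind(alpha = 1, F_FamaFrench)             # T x 4 extended factor matrix
Gamma <- t(solve(t(F_) %*% F_, t(F_) %*% X))     # N x 4: alpha followed by the loadings
print(round(Gamma, 4))

# Same numbers from lm() with a multivariate response
print(max(abs(Gamma - t(coef(lm(X ~ F_FamaFrench))))))
```

If the factors were left in percent, the estimated loadings would come out 100 times smaller, which is an easy mistake to spot when comparing against a market beta near one.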

Statistical factor model

Let’s now consider statistical factor models, or implicit factor models, where neither the factors nor the loadings are available. Recall the principal factor method for the model $X^T = \alpha 1^T + B F^T + E^T$ with K factors:

  1. PCA:
    • Sample mean: $\hat{\alpha} = \bar{x} = \frac{1}{T} X^T 1_T$
    • Demeaned matrix: $\bar{X} = X - 1_T \bar{x}^T$
    • Sample covariance matrix: $\hat{\Sigma} = \frac{1}{T-1} \bar{X}^T \bar{X}$
    • Eigen-decomposition: $\hat{\Sigma} = \hat{U} \hat{\Lambda} \hat{U}^T$
  2. Estimates (with $\hat{U}_1$ and $\hat{\Lambda}_1$ holding the K largest eigenvectors and eigenvalues):
    • $\hat{B} = \hat{U}_1 \hat{\Lambda}_1^{1/2}$
    • $\hat{\Psi} = \mathrm{Diag}(\hat{\Sigma} - \hat{B} \hat{B}^T)$
    • $\hat{\Sigma} = \hat{B} \hat{B}^T + \hat{\Psi}$
  3. Update the eigen-decomposition: $\hat{\Sigma} - \hat{\Psi} = \hat{U} \hat{\Lambda} \hat{U}^T$
  4. Repeat steps 2-3 until convergence.

For K = 3 factors, the estimated alpha and loadings are:

#>              alpha                                        
#> AAPL  0.0007074564 0.0002732114 -0.004631647 -0.0044814226
#> AMD   0.0013722468 0.0045782146 -0.035202146  0.0114549515
#> ADI   0.0006533116 0.0004151904 -0.007379066 -0.0053058139
#> ABBV  0.0007787929 0.0017513359 -0.003967816 -0.0056000810
#> AEZS -0.0041576357 0.0769496344  0.002935950  0.0006249473
#> A     0.0006902482 0.0012690079 -0.005680162 -0.0061507654
#> APD   0.0006236565 0.0005442926 -0.004229364 -0.0057976394
#> AA    0.0006277163 0.0027405024 -0.009796620 -0.0149177957
#> CF   -0.0000573028 0.0023108605 -0.007409061 -0.0153425661

Again, an existing R function can do the work for us:

#>              alpha      factor1      factor2       factor3
#> AAPL  0.0007074564 0.0002732114 -0.004631647 -0.0044814226
#> AMD   0.0013722468 0.0045782146 -0.035202146  0.0114549515
#> ADI   0.0006533116 0.0004151904 -0.007379066 -0.0053058139
#> ABBV  0.0007787929 0.0017513359 -0.003967816 -0.0056000810
#> AEZS -0.0041576357 0.0769496344  0.002935950  0.0006249473
#> A     0.0006902482 0.0012690079 -0.005680162 -0.0061507654
#> APD   0.0006236565 0.0005442926 -0.004229364 -0.0057976394
#> AA    0.0006277163 0.0027405024 -0.009796620 -0.0149177957
#> CF   -0.0000573028 0.0023108605 -0.007409061 -0.0153425661
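The principal factor procedure described above can be sketched as follows. This is an illustrative version on synthetic data, not the code that produced the tables: `T_obs`, `B_true`, `Sigma_scm` and the `n_iter` safety cap are assumptions added for the example.

```r
# Synthetic data with K = 2 hidden factors (all values are hypothetical)
set.seed(11)
T_obs <- 800; N <- 6; K <- 2
B_true <- matrix(rnorm(N * K, sd = 0.01), N, K)
X <- matrix(rnorm(T_obs * K), T_obs, K) %*% t(B_true) +
  matrix(rnorm(T_obs * N, sd = 0.004), T_obs, N)

alpha <- colMeans(X)                                 # step 1: sample mean
X_ <- sweep(X, 2, alpha)                             # demeaned data
Sigma <- crossprod(X_) / (T_obs - 1)                 # sample covariance matrix
Sigma_scm <- Sigma                                   # kept for comparison
Sigma_prev <- matrix(0, N, N)
eigSigma <- eigen(Sigma, symmetric = TRUE)
n_iter <- 0
while (norm(Sigma - Sigma_prev, "F") / norm(Sigma, "F") > 1e-3 && n_iter < 100) {
  B <- eigSigma$vectors[, 1:K, drop = FALSE] %*% diag(sqrt(eigSigma$values[1:K]), K, K)
  Psi <- diag(diag(Sigma - B %*% t(B)))              # keep only the diagonal
  Sigma_prev <- Sigma
  Sigma <- B %*% t(B) + Psi                          # step 2: low rank plus diagonal
  eigSigma <- eigen(Sigma - Psi, symmetric = TRUE)   # step 3: update the decomposition
  n_iter <- n_iter + 1
}
print(cbind(alpha, B))
```

Because Psi absorbs whatever the low-rank part leaves on the diagonal, the fitted Sigma keeps the sample variances and only restructures the covariances.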

Final comparison of covariance matrix estimates from different factor models

We will finally compare the following estimators:

  • Sample covariance matrix
  • Macroeconomic one-factor model
  • Fundamental three-factor Fama-French model
  • Statistical factor model

We estimate the models during a training phase and then compare each estimated covariance matrix with the sample covariance matrix of the test phase. The estimation error is evaluated in terms of the Frobenius norm, as well as the PRIAL (percentage improvement in average loss):

$$\mathrm{PRIAL} = 100 \times \frac{\| \Sigma_{scm} - \Sigma_{true} \|_F^2 - \| \hat{\Sigma} - \Sigma_{true} \|_F^2}{\| \Sigma_{scm} - \Sigma_{true} \|_F^2}$$

which is 0 when the estimate $\hat{\Sigma}$ is as good as the sample covariance matrix $\Sigma_{scm}$ and 100 when it matches the true covariance matrix $\Sigma_{true}$ exactly.

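Load the training and test sets:

# Set the start and end dates and the list of stock names
begin_date <- "2013-01-01"
end_date <- "2015-12-31"
# same nine stocks as in the earlier sections (assumed)
stock_namelist <- c("AAPL", "AMD", "ADI", "ABBV", "AEZS", "A", "APD", "AA", "CF")

# Prepare the stock data
data_set <- xts()
for (stock_index in 1:length(stock_namelist))
  data_set <- cbind(data_set, Ad(getSymbols(stock_namelist[stock_index],
                                            from = begin_date, to = end_date,
                                            auto.assign = FALSE)))
colnames(data_set) <- stock_namelist
X <- diff(log(data_set), na.pad = FALSE)
N <- ncol(X)  # number of stocks
T <- nrow(X)  # number of days

# Fama-French factors
# (the file name and skip count assume a local copy of the daily factor CSV)
mydata <- read.csv("F-F_Research_Data_Factors_daily.CSV", skip = 4)
mydata <- mydata[-nrow(mydata), ]  # remove the last row
fama_lib <- xts(x = mydata[, c(2, 3, 4)], order.by = as.Date(paste(mydata[, 1]), "%Y%m%d"))
F_FamaFrench <- fama_lib[index(X)]/100  # percent to decimal

# Prepare the index ("^GSPC" is the assumed Yahoo ticker)
SP500_index <- Ad(getSymbols("^GSPC", from = begin_date, to = end_date, auto.assign = FALSE))
f_SP500 <- diff(log(SP500_index), na.pad = FALSE)

# Split the data into training data and test data
T_trn <- round(0.45*T)
X_trn <- X[1:T_trn, ]
X_tst <- X[(T_trn+1):T, ]
F_FamaFrench_trn <- F_FamaFrench[1:T_trn, ]
f_SP500_trn <- f_SP500[1:T_trn, ]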

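Now let’s use the training data to estimate the different factor models:

# Sample covariance matrix
Sigma_SCM <- cov(X_trn)

# Single-factor model
F_ <- cbind(ones = 1, f_SP500_trn)
Gamma <- t(solve(t(F_) %*% F_, t(F_) %*% X_trn))
beta <- Gamma[, 2]
E <- xts(t(t(X_trn) - Gamma %*% t(F_)), index(X_trn))
Psi <- (1/(T_trn-2)) * t(E) %*% E
Sigma_SP500 <- as.numeric(var(f_SP500_trn)) * beta %o% beta + diag(diag(Psi))

# Fama-French three-factor model
F_ <- cbind(ones = 1, F_FamaFrench_trn)
Gamma <- t(solve(t(F_) %*% F_, t(F_) %*% X_trn))
B <- Gamma[, 2:4]
E <- xts(t(t(X_trn) - Gamma %*% t(F_)), index(X_trn))
Psi <- (1/(T_trn-4)) * t(E) %*% E
Sigma_FamaFrench <- B %*% cov(F_FamaFrench_trn) %*% t(B) + diag(diag(Psi))

# Statistical single-factor model
K <- 1
X_trn_ <- X_trn - matrix(colMeans(X_trn), T_trn, N, byrow = TRUE)
Sigma_prev <- matrix(0, N, N)
Sigma <- (1/(T_trn-1)) * t(X_trn_) %*% X_trn_
eigSigma <- eigen(Sigma)
while (norm(Sigma - Sigma_prev, "F")/norm(Sigma, "F") > 1e-3) {
  B <- eigSigma$vectors[, 1:K, drop = FALSE] %*% diag(sqrt(eigSigma$values[1:K]), K, K)
  Psi <- diag(diag(Sigma - B %*% t(B)))
  Sigma_prev <- Sigma
  Sigma <- B %*% t(B) + Psi
  eigSigma <- eigen(Sigma - Psi)
}
Sigma_PCA1 <- Sigma

# Statistical three-factor model
K <- 3
Sigma_prev <- matrix(0, N, N)
Sigma <- (1/(T_trn-1)) * t(X_trn_) %*% X_trn_
eigSigma <- eigen(Sigma)
while (norm(Sigma - Sigma_prev, "F")/norm(Sigma, "F") > 1e-3) {
  B <- eigSigma$vectors[, 1:K] %*% diag(sqrt(eigSigma$values[1:K]), K, K)
  Psi <- diag(diag(Sigma - B %*% t(B)))
  Sigma_prev <- Sigma
  Sigma <- B %*% t(B) + Psi
  eigSigma <- eigen(Sigma - Psi)
}
Sigma_PCA3 <- Sigma

# Statistical five-factor model
K <- 5
Sigma_prev <- matrix(0, N, N)
Sigma <- (1/(T_trn-1)) * t(X_trn_) %*% X_trn_
eigSigma <- eigen(Sigma)
while (norm(Sigma - Sigma_prev, "F")/norm(Sigma, "F") > 1e-3) {
  B <- eigSigma$vectors[, 1:K] %*% diag(sqrt(eigSigma$values[1:K]), K, K)
  Psi <- diag(diag(Sigma - B %*% t(B)))
  Sigma_prev <- Sigma
  Sigma <- B %*% t(B) + Psi
  eigSigma <- eigen(Sigma - Psi)
}
Sigma_PCA5 <- Sigma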

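Finally, let’s compare the different estimates on the test data:

Sigma_true <- cov(X_tst)
# Sigma_SP500, Sigma_PCA1 and Sigma_PCA5 denote the single-factor and the statistical
# one- and five-factor estimates (names assumed by analogy with Sigma_PCA3)
error <- c(SCM = norm(Sigma_SCM - Sigma_true, "F"),
           SP500 = norm(Sigma_SP500 - Sigma_true, "F"),
           FamaFrench = norm(Sigma_FamaFrench - Sigma_true, "F"),
           PCA1 = norm(Sigma_PCA1 - Sigma_true, "F"),
           PCA3 = norm(Sigma_PCA3 - Sigma_true, "F"),
           PCA5 = norm(Sigma_PCA5 - Sigma_true, "F"))
barplot(error, main = "Covariance matrix estimation error", cex.names = 0.75, las = 1)

ref <- norm(Sigma_SCM - Sigma_true, "F")^2  # squared error of the sample covariance matrix
PRIAL <- 100*(ref - error^2)/ref
barplot(PRIAL, main = "PRIAL for covariance matrix estimation", cex.names = 0.75, las = 1)

We can see that using a factor model helps in estimating the covariance matrix.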
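The whole comparison pipeline (split, estimate, measure the error, compute the PRIAL) can be tried end to end without market data. The sketch below uses synthetic single-factor returns and compares only two estimators; every name and number in it is an assumption made for the example.

```r
# Synthetic single-factor returns (all values are hypothetical)
set.seed(5)
T_obs <- 300; N <- 10
f <- rnorm(T_obs, sd = 0.01)
beta_true <- runif(N, 0.5, 2)
X <- f %o% beta_true + matrix(rnorm(T_obs * N, sd = 0.01), T_obs, N)

# Train/test split, as in the text
T_trn <- round(0.45 * T_obs)
trn <- 1:T_trn
tst <- (T_trn + 1):T_obs

# Estimators fitted on the training data
Sigma_SCM <- cov(X[trn, ])
F_ <- cbind(1, f[trn])
Gamma <- t(solve(t(F_) %*% F_, t(F_) %*% X[trn, ]))
E <- X[trn, ] - F_ %*% t(Gamma)
Psi <- diag(colSums(E^2) / (T_trn - 2))
Sigma_factor <- var(f[trn]) * Gamma[, 2] %o% Gamma[, 2] + Psi

# Errors against the test-period sample covariance, and PRIAL
Sigma_true <- cov(X[tst, ])
error <- c(SCM = norm(Sigma_SCM - Sigma_true, "F"),
           factor1 = norm(Sigma_factor - Sigma_true, "F"))
ref <- norm(Sigma_SCM - Sigma_true, "F")^2
PRIAL <- 100 * (ref - error^2) / ref
print(PRIAL)
```

By construction the sample covariance matrix scores a PRIAL of exactly 0, so any positive value for another estimator means it is closer to the test-period covariance than the plain sample estimate.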

