Fama-French (FF) Three-Factor Model in R: A Multi-Factor Extension of CAPM for Analyzing and Visualizing Portfolio Risk and Return in the Stock Market

Time:2022-3-13

Original link:http://tecdat.cn/?p=24983

In this article, we go beyond CAPM's simple linear regression and explore the Fama-French (FF) multi-factor model of equity risk and return.

The FF model extends CAPM by regressing portfolio returns on several variables in addition to market returns. From a general data science perspective, FF extends CAPM's simple linear regression (one independent variable) to multiple linear regression (several independent variables).

We will look at the FF three-factor model, which tests the explanatory power of (1) market returns (the same as CAPM), (2) firm size (small versus big), and (3) firm value (the book-to-market ratio). The firm value factor is labeled HML in FF, which stands for high minus low and refers to a firm's book-to-market ratio. When we regress portfolio returns on the HML factor, we are investigating how much of the returns results from including stocks with a high book-to-market ratio (sometimes called the value premium, because stocks with a high book-to-market ratio are called value stocks).
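For reference, the two regressions can be written side by side. The notation below is an assumption chosen for illustration (it does not appear in the original): R_p is the portfolio return, R_f the risk-free rate, R_m the market return, and SMB and HML the size and value factors.

```latex
% CAPM: a single market factor
R_p - R_f = \alpha + \beta_{mkt}\,(R_m - R_f) + \varepsilon

% Fama-French three-factor model: market, size, and value
R_p - R_f = \alpha + \beta_{mkt}\,(R_m - R_f) + \beta_{smb}\,\mathit{SMB} + \beta_{hml}\,\mathit{HML} + \varepsilon
```

The left-hand side is the portfolio's excess return, and the first regressor is the market's excess return.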
 

A large part of this article covers importing data from the FF website and wrangling it for use with our portfolio returns. We will see that wrangling data is conceptually easy to understand but time-consuming in practice. Still, mashing together data from different sources is a necessary skill in any industry that receives data streams from different vendors and wants to use them creatively. Once the data is wrangled, fitting the model takes very little time.

Today, we will work with our usual portfolio, which consists of:

+ SPY (an S&P 500 fund), weighted 25%
+ EFA (a non-US equities fund), weighted 25%
+ IJS (a small-cap value fund), weighted 20%
+ EEM (an emerging markets fund), weighted 20%
+ AGG (a bond fund), weighted 10%

Before calculating the beta of the portfolio, we need to find the monthly return of the portfolio.

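library(tidyverse)
library(tidyquant)
library(timetk)
library(broom)
library(glue)

symbols <- c("SPY", "EFA", "IJS", "EEM", "AGG")
# The date range below is an assumption; the source does not state it
prices <- 
  getSymbols(symbols, src = "yahoo", from = "2012-12-31", to = "2017-12-31",
             auto.assign = TRUE, warnings = FALSE) %>% 
  map(~ Ad(get(.))) %>% 
  reduce(merge) %>% 
  `colnames<-`(symbols)
w <- c(0.25, 0.25, 0.20, 0.20, 0.10)

# Monthly log returns in long format
asset_returns_long <-  
  prices %>% 
  to.monthly(indexAt = "lastof", OHLC = FALSE) %>% 
  tk_tbl(preserve_index = TRUE, rename_index = "date") %>%
  gather(asset, returns, -date) %>% 
  group_by(asset) %>% 
  mutate(returns = log(returns) - log(lag(returns))) %>% 
  na.omit()
# Portfolio returns, rebalanced monthly
portfolio_returns_tq_rebalanced_monthly <- 
  asset_returns_long %>%
  tq_portfolio(assets_col = asset, returns_col = returns, weights = w,
               col_rename = "returns", rebalance_on = "months")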

From here on, we will work with this single object of monthly portfolio returns.
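To make the idea of a weighted portfolio return concrete, here is a minimal base R sketch. The prices, asset names, and weights are invented for illustration, and the weighted sum is a simplification of monthly rebalancing rather than a reimplementation of tq_portfolio().

```r
# Made-up month-end prices for two hypothetical assets (not real market data)
prices_demo <- data.frame(
  asset_a = c(100, 110, 121),
  asset_b = c(50, 50, 55)
)
weights_demo <- c(0.6, 0.4)  # illustrative weights; they must sum to 1

# Monthly log returns: log(P_t) - log(P_{t-1})
log_returns <- as.data.frame(lapply(prices_demo, function(p) diff(log(p))))

# Portfolio return each month as the weighted sum of the asset returns
portfolio_returns_demo <- drop(as.matrix(log_returns) %*% weights_demo)
portfolio_returns_demo
```

Three prices per asset yield two monthly returns, so the resulting vector has one element fewer than the price series.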

Importing and Wrangling the Fama-French Factors

Our first task is to get the FF data. Fortunately, FF make their factor data available on the internet. We will document every step of importing and cleaning this data, perhaps to an extent that is overkill. It is tedious now, but it will save time later when we need to update this model or extend it to the 5-factor case.

Take a look at the FF website. The data is packaged as zip files, so we need to do a bit more than call read_csv(). Let's use tempfile() from base R to create a variable called temp. This is where we will put the zipped file.

temp <- tempfile()

R has created a temporary file path and stored it in temp. Next we download the 3-factor zip by passing its URL to download.file() and storing the result in temp.

First, we will split the URL string into three pieces: base, factor, and format. This is not necessary for today's task, but it will be handy if we want to build a Shiny application that lets users choose a factor set from the FF website, or if we just want to rerun this analysis with a different set of FF factors. We then glue the pieces together and save the resulting string as full_url.
 

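# These three values were missing from the source; they are assumptions
# based on the file naming used by Ken French's data library
base <- "http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/"
factor <- "Global_3_Factors"
format <- "_CSV.zip"

full_url <-
  glue(base, factor, format)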
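The benefit of splitting the URL into pieces can be sketched in base R. The helper name build_factor_url() and all three values below are hypothetical placeholders, not the real FF address. The point is only that swapping the factor piece yields a new download link.

```r
# Hypothetical helper: joins the three URL pieces into one string
build_factor_url <- function(base, factor, format) {
  paste0(base, factor, format)
}

# Placeholder values for illustration only
example_url <- build_factor_url(
  base   = "https://example.com/ftp/",
  factor = "Some_Factor_Set",
  format = "_CSV.zip"
)
example_url

# Changing only the factor piece points at a different (hypothetical) file
other_url <- build_factor_url("https://example.com/ftp/", "Another_Factor_Set", "_CSV.zip")
```

Keeping base and format fixed means a user-facing app would only need to supply the factor name.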

Now we pass full_url to download.file().

download.file(full_url, temp, quiet = TRUE)

Finally, we can read the CSV file with read_csv() after unzipping the data with unz().

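# The file name inside the zip is assumed to match the factor name
Global_3_Factors <- 
  read_csv(unz(temp, "Global_3_Factors.csv"))

head(Global_3_Factors)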

We have imported the dataset, but we do not see any factors, just a single column of oddly formatted dates.

When this happens, it can usually be fixed by skipping the rows that contain metadata. Let's see what happens if we skip six rows.

Global_3_Factors <- 
  read_csv(unz(temp, "Global_3_Factors.csv"), 
    skip = 6)

head(Global_3_Factors)

This is what we were expecting. There are five columns: one called X1 that holds the oddly formatted dates, then Mkt-RF for market returns above the risk-free rate, SMB for the size factor, HML for the value factor, and RF for the risk-free rate.

However, the data has been read in as character; look at the class of each column.
 

map(Global_3_Factors, class)
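The effect of metadata rows and stray text on column classes can be reproduced offline. This is a sketch under stated assumptions: the text below is an invented stand-in for a factor file (three metadata lines rather than six, made-up numbers), and base R's read.csv() is swapped in for read_csv() so that no packages or downloads are needed.

```r
# Invented stand-in for a factor file: metadata lines, a header row,
# two monthly rows, then a second header that starts an annual section
raw_text <- paste(
  "This file was created for illustration only",
  "Factors are in percent",
  "Missing data are indicated by -99.99",
  ",Mkt-RF,SMB,HML,RF",
  "199007,0.86,0.77,-0.25,0.68",
  "199008,-10.82,-1.60,0.60,0.66",
  "Annual Factors",
  ",Mkt-RF,SMB,HML,RF",
  "1991,18.0,5.0,-3.0,5.6",
  sep = "\n"
)

# Skip the three metadata lines so the real header row is read as the header
factors_raw <- read.csv(text = raw_text, skip = 3, stringsAsFactors = FALSE)

# The repeated header further down forces every column to character
sapply(factors_raw, class)
```

Because one non-numeric entry is enough to make a whole column character, the columns still need explicit conversion after the skip.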

We have two options for coercing these columns to the right format. First, we can do it on import by supplying col_types = cols(...) with a type for each numeric column.

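Global_3_Factors <- 
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6,
           col_types = cols(`Mkt-RF` = col_double(), SMB = col_double(),
                            HML = col_double(), RF = col_double()))
head(Global_3_Factors)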

This works well, but it is specific to the FF 3-factor set with these particular column names. If we import a different FF factor set, we will need to specify different column names.

As an alternative, the following code converts the columns to numeric after import. It is more general and can be applied to other FF factor sets.

To do this, we rename the X1 column to date and then convert the remaining columns to numeric with mutate_at(). The vars() function works like select(): by putting a negative sign in front of date, we tell it to operate on every column except date.

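# Note: readr 2.0 and later name the unnamed first column ...1 instead of X1
Global_3_Factors <- 
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6) %>%
  rename(date = X1) %>% 
  mutate_at(vars(-date), as.numeric)

head(Global_3_Factors)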
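The "every column except date" conversion can also be sketched in base R on a tiny hand-built data frame. The values are invented, and lapply() over the non-date columns stands in for mutate_at(vars(-date), as.numeric).

```r
# Invented character data, shaped like the imported factor columns
factors_chr <- data.frame(
  date = c("199007", "199008"),
  SMB  = c("0.77", "-1.60"),
  HML  = c("-0.25", "0.60"),
  stringsAsFactors = FALSE
)

# Select every column except date, whatever those columns are called
value_cols <- setdiff(names(factors_chr), "date")

# Convert only those columns to numeric and leave date untouched
factors_num <- factors_chr
factors_num[value_cols] <- lapply(factors_num[value_cols], as.numeric)

sapply(factors_num, class)
```

Because the value columns are selected by exclusion rather than by name, the same lines would work on a factor set with different or additional columns.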

Now our factors are numeric and the date column has a better label, but its format is still wrong.

We can use the lubridate package to parse the date string into a proper date format. We will use the parse_date_time() function and wrap the call in ymd() to make sure the end result is a date. Again, when working with data from a new source, the dates (and indeed any column) can arrive in many formats.

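Global_3_Factors <- 
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6) %>%
  rename(date = X1) %>% 
  mutate_at(vars(-date), as.numeric) %>% 
  mutate(date = ymd(parse_date_time(date, "%Y%m")))

head(Global_3_Factors)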
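For comparison, the same year-month strings can be parsed with base R alone. This is a sketch of an alternative rather than the approach used above: base R's as.Date() needs a day component, so we append "01" before parsing. The sample strings are illustrative.

```r
# Year-month strings in the same YYYYMM shape as the FF date column
raw_dates <- c("199007", "199008", "199101")

# Append a day so the string forms a complete date, then parse it
parsed_dates <- as.Date(paste0(raw_dates, "01"), format = "%Y%m%d")

parsed_dates
class(parsed_dates)
```

Each result lands on the first day of its month, which matches the first-of-month convention in the parsed FF dates.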

The date format looks good now, which matters because we want to trim the factor data so that its dates match our portfolio dates. Note, however, that FF uses the first day of the month, while our portfolio returns use the last day of the month. The lubridate function rollback() rolls a date back to the last day of the previous month. The first date in our FF data is "1990-07-01". Let's roll it back.

Global_3_Factors %>% 
  select(date) %>%
  mutate(date = rollback(date)) %>% 
  head(1)

If we want to reset each date to the end of its own month, we need to add one month first and then roll back.

Global_3_Factors %>% 
  select(date) %>%
  mutate(date = rollback(date + months(1))) %>% 
  head(1)
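The "add one month, then roll back" logic can be checked with base R date arithmetic. This is a sketch with an assumed helper name, end_of_month(), and illustrative dates. It mirrors the idea of rollback() without calling lubridate.

```r
first_of_month <- as.Date(c("1990-07-01", "1990-12-01", "1992-02-01"))

# Rolling back a first-of-month date by one day gives the previous month's end
prev_month_end <- first_of_month - 1

# Hypothetical helper: step to the first of the next month, then back one day
end_of_month <- function(d) {
  next_start <- seq(d, by = "month", length.out = 2)[2]
  next_start - 1
}

# Apply the helper to each date and recombine into a Date vector
same_month_end <- do.call(c, lapply(first_of_month, end_of_month))

prev_month_end
same_month_end
```

The December and leap-year February cases show why stepping to the next month first is safer than adding a fixed number of days.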

There are other ways to handle this. For example, at the outset we could have indexed our portfolio returns with indexAt = "firstof".

Finally, we only want the FF factor data that lines up with our portfolio data, so we filter() by the first() and last() dates in our portfolio returns object.

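Global_3_Factors <- 
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6) %>% 
  rename(date = X1) %>% 
  mutate_at(vars(-date), as.numeric) %>% 
  mutate(date = rollback(ymd(parse_date_time(date, "%Y%m")) + months(1))) %>% 
  filter(date >= first(portfolio_returns_tq_rebalanced_monthly$date) & 
           date <= last(portfolio_returns_tq_rebalanced_monthly$date))

head(Global_3_Factors, 3)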

tail(Global_3_Factors, 3)

All that remains is to combine the two data objects with left_join(..., by = "date"). We also convert the FF data to decimal and create a new column called R_excess that holds returns above the risk-free rate.

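ff_portfolio_returns <- 
  portfolio_returns_tq_rebalanced_monthly %>% 
  left_join(Global_3_Factors, by = "date") %>% 
  mutate(MKT_RF = `Mkt-RF` / 100,   # FF factors are quoted in percent
         SMB = SMB / 100,
         HML = HML / 100,
         RF = RF / 100,
         R_excess = returns - RF)

head(ff_portfolio_returns, 4)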
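The date filter, join, and excess-return steps can be traced on a few rows with base R. All numbers below are invented, and merge(..., all.x = TRUE) stands in for left_join(). The sketch is meant to show the mechanics, not to reproduce the actual data.

```r
# Invented portfolio returns (decimal) with month-end dates
portfolio_demo <- data.frame(
  date    = as.Date(c("2013-01-31", "2013-02-28")),
  returns = c(0.020, -0.010)
)

# Invented factor data in percent, covering a longer date range
factors_demo <- data.frame(
  date   = as.Date(c("2012-12-31", "2013-01-31", "2013-02-28")),
  MKT_RF = c(1.5, 5.0, -0.2),
  RF     = c(0.01, 0.00, 0.01)
)

# Keep only factor rows inside the portfolio's date range
in_range <- factors_demo$date >= min(portfolio_demo$date) &
  factors_demo$date <= max(portfolio_demo$date)
factors_demo <- factors_demo[in_range, ]

# Left join on date, convert percent to decimal, compute excess returns
combined <- merge(portfolio_demo, factors_demo, by = "date", all.x = TRUE)
combined$MKT_RF   <- combined$MKT_RF / 100
combined$RF       <- combined$RF / 100
combined$R_excess <- combined$returns - combined$RF
combined
```

Dividing by 100 before subtracting matters: the portfolio returns are in decimal, so subtracting a percent-scale risk-free rate would distort the excess returns.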

We now have one object containing our portfolio returns and the FF factors, and we can proceed to the simplest part of the exercise from a coding perspective, which is also the only part that our bosses, colleagues, clients, and investors will care about: the modeling and visualization.

Our data is now in good shape. CAPM uses simple linear regression, whereas FF uses multiple regression with several independent variables. Our 3-factor FF model is therefore lm(R_excess ~ MKT_RF + SMB + HML).

We will add one wrinkle to the CAPM code flow: a 95% confidence interval for our coefficients.

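# Fit the model directly; recent broom versions no longer tidy the output of do()
ff_dplyr_byhand <-
  lm(R_excess ~ MKT_RF + SMB + HML, data = ff_portfolio_returns) %>% 
  tidy(conf.int = TRUE, conf.level = .95)

ff_dplyr_byhand %>% 
  mutate_if(is.numeric, round, 3) %>% 
  select(-statistic)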
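To see what the fitted coefficients and their 95% confidence intervals look like without downloading anything, here is a base R sketch on simulated data. The factor series, the assumed betas (0.9 on the market, 0.1 on SMB, 0 on HML), and the noise level are all invented for illustration. confint() stands in for the conf.int option of tidy().

```r
set.seed(42)
n <- 120  # ten years of simulated monthly observations

# Simulated factor returns (invented parameters, not FF data)
sim <- data.frame(
  MKT_RF = rnorm(n, 0.005, 0.04),
  SMB    = rnorm(n, 0.001, 0.02),
  HML    = rnorm(n, 0.001, 0.02)
)
# Simulated excess returns driven mostly by the market factor
sim$R_excess <- 0.9 * sim$MKT_RF + 0.1 * sim$SMB + rnorm(n, 0, 0.005)

# Three-factor multiple regression
fit <- lm(R_excess ~ MKT_RF + SMB + HML, data = sim)

# Coefficient table with 95% confidence bounds, similar in shape to tidy() output
coef_table <- data.frame(
  term     = names(coef(fit)),
  estimate = unname(coef(fit)),
  confint(fit, level = 0.95),
  check.names = FALSE
)
names(coef_table)[3:4] <- c("conf.low", "conf.high")
coef_table
```

Each row holds one term of the model, with the estimate sitting between its lower and upper confidence bounds.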

Our model object now contains conf.high and conf.low columns that hold the upper and lower bounds of our confidence intervals.

We can pipe these results to ggplot() and create a scatter plot of the coefficients with their confidence intervals. I do not want to plot the intercept, so I filter it out in the code flow.

We add the confidence intervals with geom_errorbar().

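ff_dplyr_byhand %>% 
  mutate_if(is.numeric, round, 3) %>%
  filter(term != "(Intercept)") %>% 
  ggplot(aes(x = term, y = estimate, shape = term, color = term)) + 
  geom_point() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high)) +
  # Label text is illustrative; the original labels were lost
  labs(title = "FF 3-Factor Coefficients", x = "", y = "coefficient") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))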

Nothing here is too surprising: as with CAPM, we are regressing a portfolio on three factors, one of which is the market. The market factor therefore dominates the model, while the confidence intervals of the other two factors contain zero.
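Whether an interval contains zero is straightforward to check in code. The coefficient table below is invented to mirror the pattern just described (it is not the article's actual output). The check flags any term whose lower bound is at or below zero and whose upper bound is at or above it.

```r
# Invented coefficient table, shaped like the tidied model output
coefs_demo <- data.frame(
  term      = c("MKT_RF", "SMB", "HML"),
  estimate  = c(0.90, 0.05, -0.02),
  conf.low  = c(0.85, -0.03, -0.10),
  conf.high = c(0.95, 0.13, 0.06)
)

# An interval contains zero when its bounds straddle (or touch) zero
coefs_demo$contains_zero <- coefs_demo$conf.low <= 0 & coefs_demo$conf.high >= 0
coefs_demo
```

A term flagged TRUE cannot be distinguished from zero at the chosen confidence level, which is the sense in which the market factor dominates here.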

