R: the Fama-French (FF) three-factor model, a multi-factor extension of CAPM, for analyzing and visualizing stock portfolio risk/return

Time：2022-3-13

Original link:http://tecdat.cn/?p=24983

In this article, we go beyond the simple linear regression of CAPM and explore the Fama-French (FF) multi-factor model of stock risk/return.

The FF model extends CAPM by regressing portfolio returns on several variables in addition to market returns. From a general data science perspective, FF extends the simple linear regression of CAPM (one independent variable) to multiple linear regression (several independent variables).

We will look at the FF three-factor model, which tests the explanatory power of (1) market returns (the same as CAPM), (2) firm size (small versus big) and (3) firm value (book-to-market ratio). The firm value factor is labeled HML in FF, which stands for high minus low and refers to a firm's book-to-market ratio. When we regress portfolio returns on the HML factor, we are investigating how much of the returns comes from holding stocks with a high book-to-market ratio (sometimes called the value premium, because high book-to-market stocks are called value stocks).
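Written out, the three-factor regression takes the standard form below. The symbols are notation introduced here for illustration: $R_{p,t}$ is the portfolio return in month $t$, $R_{f,t}$ the risk-free rate, $R_{M,t}$ the market return, and $\varepsilon_t$ the error term.

```latex
R_{p,t} - R_{f,t} = \alpha
  + \beta_{M}\,\bigl(R_{M,t} - R_{f,t}\bigr)
  + \beta_{SMB}\,\mathrm{SMB}_t
  + \beta_{HML}\,\mathrm{HML}_t
  + \varepsilon_t
```

Setting $\beta_{SMB} = \beta_{HML} = 0$ recovers the single-factor CAPM regression.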

A large part of this article covers importing data from the FF website and wrangling it to match our portfolio returns. We will see that wrangling the data is conceptually easy to understand but time-consuming in practice. However, combining data from different sources is a necessary skill in any industry that receives data streams from different vendors and wants to use them creatively. Once the data is wrangled, fitting the model takes very little time.

Today, we will use our usual portfolio, which consists of:

``````
+ SPY (an S&P 500 fund), weighted 25%
+ EFA (a non-US equities fund), weighted 25%
+ IJS (a small-cap value fund), weighted 20%
+ EEM (an emerging-markets fund), weighted 20%
+ AGG (a bond fund), weighted 10%
``````

Before calculating the beta of the portfolio, we need to find the monthly return of the portfolio.

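``````
library(tidyverse)  # dplyr, tidyr, purrr, readr, ggplot2
library(tidyquant)  # getSymbols(), tq_portfolio()
library(timetk)     # tk_tbl()

symbols <- c("SPY", "EFA", "IJS", "EEM", "AGG")
# Adjusted prices from Yahoo; the date range is an assumption (not in the source)
prices <-
  getSymbols(symbols, src = "yahoo", from = "2012-12-31", to = "2017-12-31",
             auto.assign = TRUE, warnings = FALSE) %>%
  map(~ Ad(get(.))) %>%
  reduce(merge) %>%
  `colnames<-`(symbols)
w <- c(0.25, 0.25, 0.20, 0.20, 0.10)

# Monthly log returns in long format, dated at month end
asset_returns_long <-
  prices %>%
  to.monthly(indexAt = "lastof", OHLC = FALSE) %>%
  tk_tbl(preserve_index = TRUE, rename_index = "date") %>%
  gather(asset, returns, -date) %>%
  group_by(asset) %>%
  mutate(returns = log(returns) - log(dplyr::lag(returns))) %>%
  na.omit()
# Weighted portfolio returns, rebalanced monthly
portfolio_returns_tq_rebalanced_monthly <-
  asset_returns_long %>%
  tq_portfolio(assets_col = asset, returns_col = returns, weights = w,
               col_rename = "returns", rebalance_on = "months")
``````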
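A minimal sketch of how the weights combine asset returns into one portfolio return for a single month. The return values are hypothetical and chosen only for illustration.

```r
# Hypothetical one-month returns for the five funds, in the same order as the weights.
asset_returns <- c(SPY = 0.02, EFA = 0.01, IJS = 0.03, EEM = -0.01, AGG = 0.00)
w <- c(0.25, 0.25, 0.20, 0.20, 0.10)

# The weights should account for the whole portfolio.
stopifnot(isTRUE(all.equal(sum(w), 1)))

# Portfolio return for the month: sum of weight times asset return.
portfolio_return <- sum(w * asset_returns)
portfolio_return
```

Repeating this for every month gives the single column of portfolio returns used in the rest of the article.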

From here on, we work with a single object of portfolio returns.

Importing and wrangling the Fama-French factors

Our first task is to get the FF data. Fortunately, FF make their factor data available on the Internet. We will document each step of importing and cleaning this data, perhaps to a tedious degree. It is frustrating now, but it saves time later when we need to update this model or extend it to the 5-factor case.

Have a look at the FF website. The data is packaged as zip files, so we need to do a bit more than call `read_csv()`. We will use `tempfile()` from base R to create a variable called `temp`, which is where we will put the zipped file.

``temp <- tempfile()``

R has generated a temporary file name and stored it in `temp`. Next we download the 3-factor zip: we pass its URL to `download.file()` and store the result in `temp`.

First, we split the URL string into three pieces: base, factor and format. That is not necessary for today's task, but it will be handy if we want to build a Shiny app that lets a user choose a factor set from the FF website, or if we just want to rerun this analysis with a different set of FF factors. We then glue the pieces together and save the string as `full_url`.

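``````
library(glue)

# The three values below are assumptions (the source omits them); they follow
# the layout of the Ken French data library
base   <- "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/"
factor <- "Global_3_Factors"
format <- "_CSV.zip"

full_url <- glue(base, factor, format)
``````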
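If the analysis is to be rerun with other factor sets, the three pieces can be wrapped in a small function. The sketch below uses base R only; `ff_zip_url()` is a hypothetical helper, and the base URL, the `_CSV.zip` suffix and the factor set names are assumptions about how the FF site names its files.

```r
# Hypothetical helper: assemble the download URL for a named FF factor set.
# The default base URL and suffix are assumptions about the site layout.
ff_zip_url <- function(factor_set,
                       base = "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/",
                       suffix = "_CSV.zip") {
  stopifnot(is.character(factor_set), length(factor_set) == 1, nzchar(factor_set))
  paste0(base, factor_set, suffix)
}

# Assumed factor set names, for illustration.
url_3 <- ff_zip_url("Global_3_Factors")
url_5 <- ff_zip_url("Global_5_Factors")
url_3
```

Only the factor set name changes between calls, which is the part a user would pick in an interactive app.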

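Now we pass `full_url` to `download.file()`.

``download.file(full_url, temp, quiet = TRUE)``

Finally, we can read the CSV file with `read_csv()` after unzipping the data with `unz()`.

``````
# "Global_3_Factors.csv" is the assumed name of the file inside the zip
Global_3_Factors <-
  read_csv(unz(temp, "Global_3_Factors.csv"))

head(Global_3_Factors)
``````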

The dataset has been imported, but we do not see any factors, just one column of oddly formatted dates.

When this happens, it can usually be fixed by skipping a certain number of rows that contain metadata. Let's see what happens if we skip six lines.

``````
Global_3_Factors <-
  read_csv(unz(temp, "Global_3_Factors.csv"),
           skip = 6)

head(Global_3_Factors)
``````
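The number of lines to skip depends on the file, so it helps to look at the raw lines first. A minimal base R sketch on a made-up file (the contents are hypothetical, with two metadata lines instead of six):

```r
# Hypothetical file contents: two metadata lines, then a header and two data rows.
raw_lines <- c(
  "Example factor file",
  "Created for illustration only",
  ",Mkt-RF,SMB,HML,RF",
  "199007,1.00,0.50,-0.25,0.68",
  "199008,-2.00,0.10,0.30,0.66"
)
path <- tempfile(fileext = ".csv")
writeLines(raw_lines, path)

# Peek at the top of the file to find where the real header starts.
readLines(path, n = 3)

# Skip the two metadata lines so the third line is used as the header.
factors <- read.csv(path, skip = 2)
str(factors)
```

Once the header row is found, the same `skip` value can be passed to `read_csv()`.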

That is what we expected. There are five columns: one called `X1`, which holds the oddly formatted dates, then `Mkt-RF` for the market return in excess of the risk-free rate, `SMB` for the size factor, `HML` for the value factor, and `RF` for the risk-free rate.

However, the data has been read in as character columns. Look at the class of each column.

``map(Global_3_Factors, class)``

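We have two options for coercing these columns to the right format. First, we can do it on import by supplying `col_types = cols(...)` with an entry for each numeric column.

``````
Global_3_Factors <-
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6,
           col_types = cols(`Mkt-RF` = col_double(), SMB = col_double(),
                            HML = col_double(), RF = col_double()))
head(Global_3_Factors)
``````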

This works well, but it is specific to the FF 3-factor set and its column names. If we import a different FF factor set, we have to specify different column names.

As an alternative, the code block below converts the columns to numeric after import. It is more general and can be applied to other FF factor sets.

To do this, we rename the `X1` column to `date` and then convert the remaining columns to numeric. The `vars()` function works like `select()`: putting a minus sign in front of `date` tells it to operate on every column except `date`.

``````
Global_3_Factors <-
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6) %>%
  rename(date = X1) %>%  # readr 2.x names this column `...1` instead of X1
  mutate_at(vars(-date), as.numeric)

head(Global_3_Factors)
``````
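The same "everything except `date`" idea can be sketched in base R. The data frame below is hypothetical and stands in for a freshly imported factor set whose columns all arrived as character.

```r
# Hypothetical raw import: every column arrived as character.
raw <- data.frame(
  date = c("199007", "199008"),
  `Mkt-RF` = c("1.00", "-2.00"),
  SMB = c("0.50", "0.10"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Convert every column except `date`, whatever the factor columns are called.
factor_cols <- setdiff(names(raw), "date")
raw[factor_cols] <- lapply(raw[factor_cols], as.numeric)

sapply(raw, class)
```

Because the factor columns are found by exclusion rather than by name, the code does not change when a factor set with different columns is imported.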

Now our factors hold numeric data and the date column has a better label, but its format is still wrong.

We can use the `lubridate` package to parse the date strings into a proper date format. We use the `parse_date_time()` function and wrap it in `ymd()` to make sure the final result is a date. Again, when working with data from a new source, the dates, and in fact any column, can come in many formats.

``````
# ymd() and parse_date_time() come from lubridate
Global_3_Factors <-
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6) %>%
  rename(date = X1) %>%
  mutate_at(vars(-date), as.numeric) %>%
  mutate(date = ymd(parse_date_time(date, "%Y%m")))

head(Global_3_Factors)
``````
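For comparison, a base R sketch of the same parsing step, assuming the month stamps look like `"199007"` (year followed by month, no day). The values are hypothetical.

```r
# Hypothetical FF-style month stamps: year and month, no day.
stamps <- c("199007", "199008", "199012")

# Append a day so base R can parse them; each date lands on the first of the month.
dates <- as.Date(paste0(stamps, "01"), format = "%Y%m%d")

dates
class(dates)
```

As with the `lubridate` version, the result falls on the first day of each month.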

The dates look good now, which matters because we want to trim the factor data so that its dates align with our portfolio returns. Note, however, that FF uses the first day of the month while our portfolio returns use the last day of the month, which would cause problems when we join the two. `lubridate` offers a `rollback()` function that rolls a date back to the last day of the previous month. The first date in our FF data is "1990-07-01". Let's roll it back.

``````
Global_3_Factors %>%
  select(date) %>%
  mutate(date = rollback(date)) %>%
  head(1)
``````

If we want to reset our dates to the last day of the same month, we first need to add one month and then roll back.

``````
Global_3_Factors %>%
  select(date) %>%
  mutate(date = rollback(date + months(1))) %>%
  head(1)
``````
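The add-a-month-then-step-back logic can be checked in base R without `lubridate`. A minimal sketch, using hypothetical first-of-month dates that include a December and a leap-year February:

```r
# Hypothetical first-of-month dates, in the style of the FF data.
first_of_month <- as.Date(c("1990-07-01", "1990-12-01", "1992-02-01"))

# For each date: step to the first day of the next month, then back one day.
month_end <- first_of_month
for (i in seq_along(first_of_month)) {
  next_month <- seq(first_of_month[i], by = "month", length.out = 2)[2]
  month_end[i] <- next_month - 1
}

month_end
```

Stepping forward before stepping back keeps each date in its own month, including across a year boundary.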

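There are other ways to solve this. For example, at the outset we could have indexed our portfolio returns with `indexAt = "firstof"`.

Finally, we only want the FF factor data that aligns with our portfolio data, so we `filter()` by the `first()` and `last()` dates in our portfolio returns object.

``````
Global_3_Factors <-
  read_csv(unz(temp, "Global_3_Factors.csv"), skip = 6) %>%
  rename(date = X1) %>%
  mutate_at(vars(-date), as.numeric) %>%
  mutate(date = rollback(ymd(parse_date_time(date, "%Y%m")) + months(1))) %>%
  filter(date >= first(portfolio_returns_tq_rebalanced_monthly$date) &
           date <= last(portfolio_returns_tq_rebalanced_monthly$date))

head(Global_3_Factors, 3)
``````

``tail(Global_3_Factors, 3)``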

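We use `left_join(..., by = "date")` to merge these data objects. We also convert the FF data to decimals and create a new column called `R_excess` that holds our portfolio returns in excess of the risk-free rate.

``````
ff_portfolio_returns <-
  portfolio_returns_tq_rebalanced_monthly %>%
  left_join(Global_3_Factors, by = "date") %>%
  # FF quotes factors in percent; rescale to decimals
  mutate(MKT_RF = `Mkt-RF` / 100,
         SMB = SMB / 100,
         HML = HML / 100,
         RF = RF / 100,
         R_excess = round(returns - RF, 4))

head(ff_portfolio_returns, 4)
``````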
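A small base R sketch of the same join-and-rescale step on made-up numbers. Both data frames are hypothetical; the factors are quoted in percent and the portfolio returns in decimals.

```r
# Hypothetical month-end portfolio returns, in decimals.
portfolio <- data.frame(
  date = as.Date(c("2013-01-31", "2013-02-28")),
  returns = c(0.03, -0.01)
)

# Hypothetical FF-style factors, quoted in percent, with one extra month.
factors <- data.frame(
  date = as.Date(c("2013-01-31", "2013-02-28", "2013-03-31")),
  MKT_RF = c(5, 1, 2),
  RF = c(0.5, 0.5, 0.5)
)

# Left join: keep every portfolio row and attach the factors sharing its date.
joined <- merge(portfolio, factors, by = "date", all.x = TRUE)

# Put the factors on the same scale as the returns, then compute excess returns.
joined$MKT_RF <- joined$MKT_RF / 100
joined$RF <- joined$RF / 100
joined$R_excess <- joined$returns - joined$RF
joined
```

The join only matches when both sides use the same day of the month, which is why the dates were aligned to month end first.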

We now have one object containing our portfolio returns and the FF factors, and we can move on to the simplest part of the exercise from a coding perspective, which is also the only part our boss/colleagues/clients/investors care about: modeling and visualization.

Our data is now in good shape. CAPM uses simple linear regression, while FF uses multiple regression with several independent variables. Our 3-factor FF equation is therefore `lm(R_excess ~ MKT_RF + SMB + HML)`.

We add one argument to the CAPM code flow so that the output includes a 95% confidence interval for our coefficients.

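``````
library(broom)  # tidy()

ff_dplyr_byhand <-
  ff_portfolio_returns %>%
  do(tidy(lm(R_excess ~ MKT_RF + SMB + HML, data = .),
          conf.int = TRUE, conf.level = .95))

ff_dplyr_byhand %>%
  mutate_if(is.numeric, ~ round(., 3)) %>%
  select(-statistic)
``````

Our model object now contains `conf.low` and `conf.high` columns that hold the lower and upper bounds of our confidence intervals.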
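To see where such bounds come from without any packages, here is a minimal base R sketch on simulated data. The data-generating process (a portfolio that loads mostly on the market) and all numbers are assumptions made only for illustration.

```r
set.seed(42)  # simulated data, for illustration only
n <- 60
sim <- data.frame(
  MKT_RF = rnorm(n, 0.005, 0.04),
  SMB    = rnorm(n, 0.000, 0.02),
  HML    = rnorm(n, 0.000, 0.02)
)
# Assumed data-generating process: the portfolio loads mostly on the market.
sim$R_excess <- 0.9 * sim$MKT_RF + 0.1 * sim$SMB + rnorm(n, 0, 0.005)

fit <- lm(R_excess ~ MKT_RF + SMB + HML, data = sim)

# Coefficient estimates next to their 95% confidence bounds.
ci <- confint(fit, level = 0.95)
result <- data.frame(
  term = rownames(ci),
  estimate = unname(coef(fit)),
  conf.low = ci[, 1],
  conf.high = ci[, 2],
  row.names = NULL
)
result
```

A factor whose interval spans zero cannot be distinguished from having no effect at the 95% level.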

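We can pipe these results to `ggplot()` and create a scatter plot of the coefficients with their confidence intervals. I do not want to plot the intercept, so I filter it out in the code flow.

We add the confidence intervals with `geom_errorbar()`.

``````
ff_dplyr_byhand %>%
  mutate_if(is.numeric, ~ round(., 3)) %>%
  filter(term != "(Intercept)") %>%
  ggplot(aes(x = term, y = estimate, shape = term, color = term)) +
  geom_point() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high)) +
  # label text is an assumption; the source omits it
  labs(title = "FF 3-factor coefficients", x = "", y = "coefficient") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
``````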

The results are not surprising: as with CAPM, we are regressing a portfolio on three factors, one of which is the market. The market factor therefore dominates the model, while the confidence intervals of the other two factors contain zero.
