Construct Fama and French Three Factors in Stata
May 16, 2024
This video explains how to construct the Fama and French three-factor model in Stata. We use data from CRSP and Compustat, and at the end of the video we compare the SMB and HML factors constructed using our code with the factors available on Kenneth French's website. We use CRSP monthly stock return data to construct the SMB and HML factors. Download the Stata code to construct the FF 3 and 5 factor models: https://payhip.com/b/isHhD Download the PowerPoint presentation used in this video: https://payhip.com/b/N3atE Website: thedatahall.com
View Video Transcript
0:00
Welcome to the Data Hall YouTube channel
0:02
In our previous video, we discussed how to construct the Fama and French three-factor model
0:08
We also discussed the different ways of constructing a portfolio: the univariate portfolio, and the bivariate independent and bivariate dependent portfolios
0:19
Then we also discussed value-weighted and equal-weighted portfolios and how to construct the SMB and HML factors
0:28
In this video, we are going to discuss how to construct the Fama and French three factors in Stata
0:36
And lastly, I'm going to show you how closely our factors,
0:41
the factors that we have generated, are correlated with the factors that are available
0:46
on Kenneth French's website. So we compare our factors with the original factors and see how highly correlated they are
0:55
So this do file is quite self-explanatory. I have given a lot of comments
1:01
You just have to change the working directory. And if you are working with CRSP and Compustat data,
1:08
then all the variables that we use are exactly the same. The names of the variables are exactly the same as they are in the CRSP
1:18
and the Compustat databases. So I have given details of each and every thing
1:27
So there are different sections within this do file. First we do the stock identification. We prepare the
1:35
stock identification file where we identify different stocks that we have. Then we prepare the delisting
1:42
information. Then we move on to preparing the monthly stock returns. We adjust our stock returns for delisted stocks. We calculate the excess return. Then we clean the Compustat data. We merge the CRSP and Compustat data
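The delisting adjustment and the excess return step can be sketched as follows. This is a minimal Python illustration of the logic, not the do file from the video. It assumes one common approach, compounding the monthly return with the delisting return, and the names `ret`, `dlret` and `rf` are assumed inputs for the monthly return, the delisting return and the risk-free rate.

```python
from typing import Optional


def delisting_adjusted_return(ret: Optional[float],
                              dlret: Optional[float]) -> Optional[float]:
    """Compound the monthly return with the delisting return.

    Missing values are passed as None. Assumption: a missing component
    is treated as zero, and the result is missing only if both are missing.
    """
    if ret is None and dlret is None:
        return None
    r = 0.0 if ret is None else ret
    d = 0.0 if dlret is None else dlret
    return (1.0 + r) * (1.0 + d) - 1.0


def excess_return(adj_ret: Optional[float],
                  rf: Optional[float]) -> Optional[float]:
    """Return in excess of the risk-free rate for the same month."""
    if adj_ret is None or rf is None:
        return None
    return adj_ret - rf


# Hypothetical stock-month: a 2% return followed by a -30% delisting return,
# with a 0.4% monthly risk-free rate.
adj = delisting_adjusted_return(0.02, -0.30)
exc = excess_return(adj, 0.004)
```

Other treatments of missing delisting returns exist, so the handling of `None` here is only one possible choice.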
1:59
Then we move on to calculating the market cap for the size sorting
2:04
In the previous video, we discussed that for size sorting
2:10
we take the end-of-June market capitalization. Similarly, we move to the
2:16
book-to-market equity ratio, where we first calculate the book equity, then we calculate the market equity, and then we divide the book equity by the market equity
2:26
Now there are a lot of details regarding this. We cannot simply calculate the book-to-market ratio
2:36
We have to be specific about the time period that we use for book equity and the time period that we use for market equity
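The timing can be sketched as follows, assuming the standard Fama and French convention: for portfolios formed at the end of June of year t, size is the market equity at the end of June of year t, and book-to-market is the book equity for the fiscal year ending in calendar year t-1 divided by the market equity at the end of December of year t-1. This is a minimal Python illustration, not the do file from the video; the function names and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional, Tuple


def size_for_june_sort(sort_year: int,
                       me_by_month: Dict[Tuple[int, int], float]) -> Optional[float]:
    """Market equity at the end of June of the sort year."""
    return me_by_month.get((sort_year, 6))


def book_to_market_for_june_sort(sort_year: int,
                                 be_by_fiscal_year: Dict[int, float],
                                 me_by_month: Dict[Tuple[int, int], float]) -> Optional[float]:
    """Book equity of fiscal year t-1 over market equity of December t-1."""
    be = be_by_fiscal_year.get(sort_year - 1)
    me = me_by_month.get((sort_year - 1, 12))
    # Assumption: firms with missing or non-positive book or market equity
    # are left out of the value sort.
    if be is None or me is None or be <= 0 or me <= 0:
        return None
    return be / me


# Hypothetical firm: book equity of 500 for fiscal year 2020,
# market equity of 1000 in December 2020 and 1200 in June 2021.
be_data = {2020: 500.0}
me_data = {(2020, 12): 1000.0, (2021, 6): 1200.0}

size_2021 = size_for_june_sort(2021, me_data)
bm_2021 = book_to_market_for_june_sort(2021, be_data, me_data)
```

The gap between the fiscal year end and the June sort is there so that the accounting data would have been public when the portfolios are formed.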
2:44
Lastly, we move to the size and value portfolio sorts. We calculate the portfolio returns
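Once the six size and value portfolios exist, the factor arithmetic can be sketched as follows. This is a minimal Python illustration that assumes the standard 2x3 sort (Small/Big by Low/Medium/High book-to-market) and value weighting by lagged market capitalization; it is not the do file from the video, and the portfolio labels and function names are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def value_weighted_return(returns: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Weighted average return; weights are assumed to be lagged market caps."""
    total = sum(weights)
    return sum(r * w for r, w in zip(returns, weights)) / total


def smb_hml(p: Dict[str, float]) -> Tuple[float, float]:
    """Compute SMB and HML for one month from the six portfolio returns.

    Keys: 'SL', 'SM', 'SH' (small) and 'BL', 'BM', 'BH' (big), where the
    second letter is Low, Medium or High book-to-market.
    """
    small = (p["SL"] + p["SM"] + p["SH"]) / 3
    big = (p["BL"] + p["BM"] + p["BH"]) / 3
    high = (p["SH"] + p["BH"]) / 2
    low = (p["SL"] + p["BL"]) / 2
    return small - big, high - low


# Hypothetical two-stock portfolio: returns of 10% and -2%,
# lagged market caps of 300 and 100.
vw = value_weighted_return([0.10, -0.02], [300.0, 100.0])

# Hypothetical portfolio returns for one month.
month = {"SL": 0.010, "SM": 0.015, "SH": 0.020,
         "BL": 0.005, "BM": 0.008, "BH": 0.012}
smb, hml = smb_hml(month)
```

SMB averages over the three book-to-market groups so that it is roughly neutral to value, and HML averages over the two size groups so that it is roughly neutral to size.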
2:52
And then we test the accuracy of the factors that we have constructed against the
2:58
ones that are available on Kenneth French's website. Because this specific do file is constructed on the basis of CRSP and Compustat data,
3:10
so I'm using the exact same variable names. For example, in CRSP data we have PERMNO, which identifies stocks, then we have the exchange code for different
3:21
exchanges. We have the share code, where we can identify the common stocks and the other securities,
3:30
mutual funds, etc. I have given a detailed description of each of these variables and how this code works. I'm not going to discuss all the details in this video. I'm just going to execute this whole code. There is this timer that I have set
3:51
At the end of this video, I just want to demonstrate how much time this code takes to execute
3:58
So what I'm going to do is execute this code,
4:02
pause the video, and once we reach the end of the timer,
4:08
I can show you how much time this code took to execute
4:14
Now the code has executed and we can see that it took around
4:18
212 seconds. If I divide that by 60, that gives us the number of
4:24
minutes that it took to execute this code. So it took about 3.5 minutes, and do
4:29
remember that the data I'm using is monthly data from 1962 until the end of 2022
4:41
So this is how much time it took. Now you can download this code, but we can't share
4:53
the exact data files that we are using within this do file. The reason is that there are copyright restrictions that
5:04
we would have to go through. So due to copyright issues, we cannot share the CRSP and Compustat data, but what we have done
5:11
is we have made some dummy datasets that you can use along with this code to learn this code
5:18
and apply it to your own dataset. And now lastly, what I'm going to do is take the FF factors
5:28
Okay, this is the do file that I'm going to work with. The SMB and HML factors that I have created are called MySmb and MyHML. I'm going to merge these with the data that I have downloaded from Kenneth French's website. Once I have
5:45
merged these, we have 738 months. I'm just going to drop the unnecessary data and observations
5:53
And this is where you get to see how correlated the factors that we have
6:00
constructed are with the ones that we have downloaded from Kenneth French's website
6:08
So if I correlate the SMB factor that we have created with the SMB factor that we have downloaded,
6:15
we get a 0.99 correlation in the case of SMB and a 0.97 correlation in the case of HML, which is pretty high
6:26
Let's look at the R-squared and the coefficients. We get a 0.99 coefficient, and our R-squared is also very high at 0.98. So 98% of the variation
6:39
in the factor that we have constructed is explained by the factor that we have downloaded
6:47
from Kenneth French's website. Similarly, in the case of HML, we get a 0.95 R-squared,
6:59
and the coefficient is 0.99. So the R-squared is 95%. Now, this is how close our factors are
7:09
to the ones that we have downloaded from the Fama and French website. Now, I hope this video is useful
7:16
You can download the code and the demo files that you can use with the code from the link given in description
7:24
Do subscribe to this channel and do hit the bell icon