How to Perform Rolling Regression in R
May 16, 2024
In this video we discuss how to estimate beta using rolling regression, i.e. how to perform rolling regression in R. We use two packages: the zoo package and the slider package. The zoo package has a function called rollapply, and the slider package has several functions, of which we use slide_period. Website: thedatahall.com
0:00
Welcome to the Data Hall YouTube channel
0:02
In today's video, we are going to focus on rolling regression in R. In some of our previous videos, we discussed how to calculate
0:10
stock returns, whether weekly, monthly, or daily. We have also discussed how to download
0:20
stock data using the quantmod package. In this video, we are going to focus on
0:26
rolling regression. There are different packages that we can use
0:31
to perform rolling regression and get the betas of the variables. In this video in particular,
0:39
we are going to use the CAPM model. For this purpose we are going to calculate systematic
0:44
risk, but this time the systematic risk will be on a rolling-window basis. If you want to
0:50
compute it on a non-rolling basis, that is, if you want to calculate
0:56
weekly, monthly, or yearly systematic risk, then you would have to watch our previous video.
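For reference, the CAPM regression that the rest of the video estimates on a rolling window can be written as below. The notation is an assumption chosen for this note (the video shows no formula): R_i is the stock return, R_m the market return, R_f the risk-free rate, and beta_i the systematic risk.

```latex
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \,\bigl(R_{m,t} - R_{f,t}\bigr) + \varepsilon_{i,t}
```

In a rolling regression, this equation is re-estimated on each window of the most recent observations, so beta_i becomes a time series rather than a single number.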
1:02
Now, for the packages: one of the packages that we are going to use is the zoo package.
1:07
There are multiple packages that have this functionality. The most used is the zoo package, where we use the rollapply function.
1:16
The second package that I'm going to use, which is a more powerful package
1:21
specifically for this purpose, is the slider package, where we use the slide_period
1:26
function. So let's install and load these libraries. You need to install them if you
1:32
haven't already. I have already installed them, so I'm just going to skip this
1:37
line of code and load these libraries, all four of them.
1:42
Then I'm going to load my dataset. I have this dataset in my
1:50
working directory, so you would have to set the
1:56
working directory to where the data is. Then I'm going to load the data. What we have over
2:02
here is daily stock price data. We have different stocks: for example, we have Apple,
2:09
Microsoft, and some other stocks. We have their stock prices, the S&P 500
2:16
index, and the risk-free rate. First we need to calculate the stock
2:22
returns, so we need to calculate the daily stock return and the monthly stock return.
2:27
I will perform this whole process of estimating the rolling regression on two types of data.
2:35
That is, daily stock returns and monthly stock returns. So first I will calculate the daily stock return, and then
2:44
the monthly stock return. This is discussed extensively in my previous videos,
2:48
but I will walk you through the code. First we take the data that I have loaded over here,
2:52
this data frame, and group by stock. Then we mutate, that is, create a new column, RI, which is the stock return,
3:03
and the market return, each as the log of the current price divided by the lagged price. Then we calculate the excess stock
3:12
return and the excess market return. As for the date column, let me show you the structure
3:19
of the data that we have just loaded. You can see that the date column is currently
3:25
interpreted by R as a character vector. So what we need to do
3:33
is tell R that this date column is in a date format. In my case it's day, month, and year, so I'm using
3:42
the dmy function and specifying the date column. Lastly, I'm just going to keep these columns,
3:49
so I delete all the other columns, for example RI, MKT, and the price
3:55
columns. And because I grouped over here, I'm going to ungroup.
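The daily-return step just described might look roughly like the sketch below. The video's actual file and column names are not shown in the transcript, so the toy data and every column name here (symbol, price, sp500, rf, ex_ret, ex_mkt) are assumptions for illustration only.

```r
library(dplyr)
library(lubridate)

# Toy input standing in for the video's dataset (hypothetical values and names).
prices <- tibble(
  date   = rep(c("03/01/2011", "04/01/2011", "05/01/2011"), times = 2),
  symbol = rep(c("AAPL", "MSFT"), each = 3),
  price  = c(47.08, 47.33, 47.71, 27.98, 28.09, 28.00),
  sp500  = rep(c(1271.87, 1270.20, 1276.56), times = 2),
  rf     = 0.00001                      # assumed daily risk-free rate
)

daily <- prices %>%
  group_by(symbol) %>%                  # returns must not cross stocks
  mutate(
    ri     = log(price / lag(price)),   # log stock return
    mkt    = log(sp500 / lag(sp500)),   # log market return
    ex_ret = ri - rf,                   # excess stock return
    ex_mkt = mkt - rf,                  # excess market return
    date   = dmy(date)                  # character "dd/mm/yyyy" -> Date
  ) %>%
  select(date, symbol, ex_ret, ex_mkt) %>%
  ungroup()

str(daily)
```

The first row of each stock has no lagged price, so its returns are NA.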
4:01
If I press Ctrl+Enter, you can see that now, in the daily data, as opposed to the date being
4:18
read as a character vector, it is represented in the date format, the date format
4:24
that is recognized by R. Secondly, we have calculated our excess stock return and
4:31
excess market return. Now we are going to first set up the data and then perform these rolling regressions. Next we are going to calculate the monthly stock returns. So we take the data.
4:42
Again, this code has been discussed extensively in my video on how to calculate
4:48
daily, weekly, and monthly stock returns, so I'm just going to quickly walk you through it.
4:53
I convert this data into a wide format, because I want to convert it into an xts object
5:01
and convert it into monthly data. Lastly, I convert it back to long format,
5:10
calculate the excess market return and excess stock return, and select only the variables that I'm interested in.
5:18
So once you have this dataset, let me show you this data.
5:22
You can see that all these dates are the last date
5:28
of each month. We have the monthly excess market returns and excess stock returns, whereas in the daily data we have the daily excess returns.
5:39
Next, what we need to do is use the rollapply function. Remember, this comes from the zoo package,
5:45
so install and load this package. The easiest way is to first define a function; let's call this
5:56
function regress. This function contains three lines of code. We run the regression of our excess stock return on the excess market return, using the
6:09
data frame that is supplied to the function. Then we extract the coefficient from this regression,
6:18
and the function returns that coefficient. Once we define this
6:24
function, it does not perform anything by itself, but we will use it in
6:31
our next code. First we are going to do the rolling regression using daily data,
6:37
and then we are going to do it using monthly data. So we take our daily data.
6:43
Remember, rollapply can only be applied to a single stock.
6:49
If you supply it with the whole data frame, that would be the wrong way of working with rollapply.
6:58
In this video, I'm just going to demonstrate how to apply it to a single stock,
7:04
but you can use for loops or the apply family of functions,
7:13
or you can define a function and use map to apply it to the whole dataset, to the other symbols,
7:21
the other stocks. The ticker that I'm going to use in this case is the Apple stock.
7:28
I'm just going to select the excess stock return and the excess market return.
7:33
Remember, rollapply can sometimes give you incorrect results if you keep other,
7:40
unnecessary columns in the data. That's why I have limited it to these two columns.
7:47
I apply the rollapply function with a width of 60, which means 60 days because this is daily data.
7:53
Whatever width I specify is in terms of daily observations.
7:58
The function is the regress function that I defined over here;
8:01
what this function does is perform the regression and give me the coefficient.
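A minimal sketch of the regress helper and the rollapply call described here, on simulated single-stock data. The column names, the simulated returns, and the right alignment of the window are assumptions, and rollapply is loaded from zoo, the package that exports it. The by.column and fill arguments are the two settings the speaker explains next.

```r
library(zoo)

# Simulated excess returns for one stock (placeholder data, not the video's).
set.seed(1)
n      <- 200
ex_mkt <- rnorm(n, mean = 0, sd = 0.01)
ex_ret <- 1.2 * ex_mkt + rnorm(n, mean = 0, sd = 0.005)
apple  <- data.frame(
  date   = as.Date("2011-01-03") + 0:(n - 1),   # placeholder dates
  ex_ret = ex_ret,
  ex_mkt = ex_mkt
)

# Three steps: fit the regression on one window, extract the slope, return it.
regress <- function(window) {
  fit  <- lm(ex_ret ~ ex_mkt, data = as.data.frame(window))
  beta <- coef(fit)[["ex_mkt"]]
  beta
}

betas <- rollapply(
  as.matrix(apple[, c("ex_ret", "ex_mkt")]),  # only the two needed columns
  width     = 60,        # 60 rows per window
  FUN       = regress,
  by.column = FALSE,     # pass whole windows, not one column at a time
  fill      = NA,        # pad the incomplete windows with NA
  align     = "right"    # each window ends at the current row
)

# Attach the dates so the betas can be merged with other variables.
roll_daily <- data.frame(date = apple$date, beta = as.numeric(betas))
head(roll_daily[!is.na(roll_daily$beta), ])
```

To cover several stocks, one option is to split the data by symbol and run the same call on each piece.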
8:07
I do not want this function to be applied to each column separately, and if there are missing values,
8:14
those are represented by NA. If I execute this function, it is going to
8:20
take a few seconds because it is daily data. Let me show you the values. Remember, it
8:28
is a vector. The first 59 values are missing, and then we have
8:36
our beta on the 60th day. Now, because I cannot see any dates, and if I want
8:44
to merge this data with my other variables I would need the date, what I am doing is creating a data
8:49
frame where the date column comes from the daily data, which already had
8:58
this column, and the betas come from the vector that we have just created.
9:07
I'm just going to combine these and show you. Now we have dates and we have the betas. We have the first beta for the 29th of March 2011,
9:20
that is 1.22, and the 59 observations before it are missing
9:24
because it takes those observations to calculate the first rolling regression. Now let's perform the rolling regression
9:32
on monthly data. The only thing you need to change is
9:38
the data frame. So I'm just going to change the data frame: instead of the daily data
9:42
we have the monthly data. The rest of the code is exactly the same; 60 over here
9:49
now means 60 months of data. If I apply this and show you the data, you can see that
9:56
the first 59 months are empty. They are missing observations because we
10:04
didn't have enough previous values to calculate the beta for these time periods. Again,
10:11
because these are just the beta values, we do not know which month or time
10:18
period each beta belongs to. So what I'm going to do is take the date column from the monthly
10:25
data frame and combine it with the betas that we have just created. If I do that, you can
10:34
see that we have these betas over here, and you can see these are monthly dates. This is
10:42
what I have done using the rollapply function. Now let's perform the same task using the
10:48
slider package, where we have the slide_period function.
10:55
Now, when we are working with rollapply, remember it takes 60
11:04
observations, that is, 60 trading days, not 60 calendar days. It does not take the missing dates into account.
11:14
The dates that are missing because of holidays and weekends are ignored; it takes 60 observations straight away.
11:20
That's why we have missing observations for the first 59 values.
11:30
With slider, you can work with calendar periods. First we are going to calculate the rolling regression
11:42
on monthly data. Again, we take the monthly data and filter it, so we just take the Apple
11:49
stock data, because, as with rollapply, we can only apply this to a single
11:58
stock. If you want to apply it to your whole dataset, to multiple stocks, then you have to
12:04
loop through each stock in your dataset. What we do over here is create a new column, let's call it beta, using the
12:14
slide_period function, where we pick everything that is supplied
12:22
over here by this data frame. We need to specify the index, which is the
12:34
date column that we have in our data frame. We need to specify the period, and
12:40
let me show you the help page of this function. Okay, I haven't loaded this package.
12:52
Remember, when you are doing this, you have to load the package. So now let me
12:59
show you the documentation, but do install this package before loading it.
13:04
Let me scroll down and show you what we have in the period parameter.
13:13
We can specify year, quarter, or month. In this case we have monthly data, so we are going to work in months.
13:21
So we specify this string as "month". The function is the regression of excess stock return on excess market return,
13:32
and data is the dot, which means whatever data is supplied over here.
13:36
In the first complete window, it has 60 observations, that is, one observation for each of 60 months,
13:44
and then it keeps rolling on. We need to extract the regression coefficient. The before argument specifies how many periods back we want to look, and complete specifies whether we also want the missing observations or not, as we had
14:00
with rollapply. If you change this to FALSE, it also evaluates the
14:06
partial windows at the start, so you no longer get those missing values.
14:10
But I want to keep them, because I want to be able to compare
14:15
it with my rollapply result; that way I can demonstrate that they perform the exact same
14:22
task. Let me execute this. We get slider_m, which contains our betas from
14:32
monthly data using the slider package. If I scroll down, we have this date, 31st of
14:40
December 2015, and let me show you the value: you can see it is 0.8.
14:45
And let me show you the value from rollapply for the exact same date, and the value is exactly the same.
14:52
That demonstrates how we perform the rolling regression using these two functions.
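The monthly slider step could be sketched as follows. This is not the video's exact code: the data are simulated, the column names are assumptions, and the typed variant slide_period_dbl is called directly on the data frame instead of inside mutate, so that the result is a plain numeric vector.

```r
library(slider)

# Simulated month-end data for one stock (placeholder values and names).
set.seed(1)
n      <- 72
ex_mkt <- rnorm(n, mean = 0, sd = 0.04)
monthly <- data.frame(
  date   = seq(as.Date("2011-02-01"), by = "month", length.out = n) - 1,
  ex_mkt = ex_mkt,
  ex_ret = 1.1 * ex_mkt + rnorm(n, mean = 0, sd = 0.02)
)

monthly$beta <- slide_period_dbl(
  .x        = monthly,        # rows of the data frame are the elements
  .i        = monthly$date,   # index that defines the periods
  .period   = "month",
  .f        = ~ coef(lm(ex_ret ~ ex_mkt, data = .x))[[2]],
  .before   = 59,             # current month plus the 59 before it
  .complete = TRUE            # NA until a full 60-month window exists
)

tail(monthly, 3)
```

With one row per month and no gaps, a 60-month window holds the same 60 rows that a width of 60 selects in rollapply, which is why the two results agree on monthly data.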
15:01
Now let's come to the last part, which is the rolling regression on daily data using the
15:07
slide_period function. So we are going to use slide_period.
15:15
I'm going to make a separate video on the slider package, because it is quite useful and quite in-depth,
15:28
so I want to cover it on its own. Stay tuned to this channel;
15:34
that video will come soon. What I do is use the slide_period function as I did over here and specify all the same parameters. In the daily
15:45
data, we have the date column that contains the index; that is what I have specified.
15:50
Because in this case we have daily data, in period we have the string "day".
16:01
The function is exactly the same, and the rest of the arguments are exactly the same.
16:05
But when we apply this function and get the data, let's compare it with the rollapply
16:13
daily result. In the rollapply daily result you can see that the first beta is for the 29th
16:20
of March 2011, but in the case of slider we see that it is for the 3rd of March 2011.
16:33
Why is that? Because slide_period works with calendar days. The first observation is
16:43
the 3rd of January, and the 3rd of March is 59 calendar days after the 3rd of January.
16:52
So it is taking calendar days rather than
17:03
trading days, that is, observations, as rollapply does. That's why we have
17:08
fewer missing values at the start, and we have a different beta over
17:13
here. So if you are interested in working with calendar days, then you cannot work with the
17:18
rollapply function; you have to use slider. But slider can also work with
17:25
trading days. If you use the slide function instead: the slide function does not have
17:31
this index. If you scroll up in the help, you can see that there is this slide function.
17:41
I hoped it was on this help page, but anyway, it doesn't have this index.
17:48
So whether you comment this out or not, it doesn't matter. In any case, it won't have this index parameter over here.
17:58
If I execute it now, it is going to take a second.
18:03
And if I show you the data, now we have the exact same result as we had with
18:09
rollapply: for the 29th of March 2011 we had 1.2209, and we have 1.2209. I hope this
18:18
video was useful. Stay tuned to this channel for more videos like this, and thanks for
18:24
watching this video.
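The daily comparison in this last part can be reproduced on toy data with the sketch below. It assumes a weekday-only calendar with no holidays, simulated returns, and hypothetical column names. slide_period_dbl measures the window through the date index, so the "day" period with 59 before spans 60 calendar days, while slide_dbl has no index and simply counts 60 rows, as rollapply does.

```r
library(slider)

# Weekday-only dates starting 3 January 2011 (toy calendar, no holidays).
set.seed(1)
all_days <- seq(as.Date("2011-01-03"), as.Date("2011-06-30"), by = "day")
dates    <- all_days[!as.POSIXlt(all_days)$wday %in% c(0, 6)]
n        <- length(dates)
ex_mkt   <- rnorm(n, mean = 0, sd = 0.01)
daily    <- data.frame(
  date   = dates,
  ex_mkt = ex_mkt,
  ex_ret = 1.2 * ex_mkt + rnorm(n, mean = 0, sd = 0.005)
)

beta_fn <- function(d) coef(lm(ex_ret ~ ex_mkt, data = d))[[2]]

# Window defined through the date index: 60 calendar days.
beta_calendar <- slide_period_dbl(
  daily, daily$date, "day", beta_fn,
  .before = 59, .complete = TRUE
)

# Window defined by position: 60 rows, i.e. 60 trading days.
beta_rows <- slide_dbl(daily, beta_fn, .before = 59, .complete = TRUE)

# Date of the first non-missing beta under each definition.
first_calendar <- daily$date[which(!is.na(beta_calendar))[1]]
first_rows     <- daily$date[which(!is.na(beta_rows))[1]]
first_calendar
first_rows
```

Which of the two is appropriate depends on whether the window should be a fixed stretch of calendar time or a fixed number of observations.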