How to Calculate Stock Return in Stata

In the field of finance, we are mostly interested in analyzing stock returns and for that, we need to understand how to calculate stock returns in Stata. In this tutorial, we are going to look into the code of calculating daily, weekly, monthly, and yearly stock returns.

Daily Stock return

There are two methods of calculating stock return i.e., simple return and log return. Mostly in research, we use log return as opposed to simple return. Log return is calculated by taking the natural log of current and previous price differences. Following is the code to calculate log return:

gen ri=ln(closing/closing[_n-1])

The above code will generate a variable “ri” and it will store stock return data. ln function is used for natural log, whereas _n-1 is used for the previous closing price.

Now the above code works fine if we only have one time series in our data and it is correctly sorted, however, if we have panel data then the above code will not work correctly. Firstly, because in the previous code we have not sorted the time variable, and secondly because we are not explicitly telling Stata that we have panels data.

In case of panel data, the correct code should be following:

bys symbol_code (date): gen stock_return = ln(closing / closing[_n-1])

The “symbol_code” variable contains a unique code or symbol for each firm. So, the bys (bysort) prefix will repeat whatever command is written after colen : for each firm. As the same dates exist for each company, therefore, before executing the command it will first sort the data on “date” variable. Before executing the generate command for each firm, Stata will first sort each firm’s data based on the date variable because we have given the date variable within parenthesis.

Related Book: Introductory Econometrics for Finance by Chris Brooks

Weekly stock return

The weekly returns of stock equities are commonly used in empirical research to avoid the non-synchronicity of daily data. Therefore, to generate weekly stock returns, we will first generate a new variable that would contain the weeks of the year.  

gen wofd = wofd(date)

The above code will generate a variable wofd, although you can use any name.

After the equality sign we have used wofd function, this function will generate the week of the year from the dates stored in the date variable.

The wofd variable generated using the above code is not meaningful to the human eye, so we will format it using the following code:

format wofd %tw

So “2011w2” means the second week of the month and so on.

Next, we are only going to retain only the closing prices that are last day of the week.

bys symbol_code wofd: keep if _n == _N

The above code will only keep the last observation for each week and each symbol_code. The bys symbol_code wofd prefix will repeat the command after the colon for each symbol and week. _n stands for sequence of the observation whereas, _N stands for the total number of observations. So keep if _n==_N will only keep the last observation for each week. To further simplify the idea of “_n” and “_N”, let’s use following commands, you don’t have to use them for stock return calculation, it is just to illustrate a point.

bys symbol_code wofd:gen n=_n
bys symbol_code wofd:gen N=_N

The above commands will generate two variables i.e., small “n” and capital “N”. From the below picture you can see that “n” holds the sequence of observations within that specific week, whereas “N” has the total number of observations in that week. For example, in the 6th week we have a total 5 number of observations whereas in the 7th week there are 4 total observations. So, the command keep if n==N will only keep the observations where both the variables have the same values i.e., only the last observations for each week.

Related Article: Rolling Beta and rolling mean, median in Stata

The above commands do not generate weekly returns, for that we will use the following command. This command will be repeated for each symbol code because we have used bys symbol_code prefix and before executing the command it will sort data based on wofd variable.

bys symbol_code (wofd): gen riw= ln(closing / closing[_n-1])

Monthly Stock Return

There is an advantage of monthly stock return Therefore, it is preferred over weekly return. The main advantage of using monthly returns data instead of daily returns data is that returns are at least roughly regularly distributed with monthly data (or, at the very least, the simplifying assumption of normality is much less crazy for monthly returns than it is for daily returns). Therefore, to calculate monthly data we just need to change function in comparison with weekly function (wofd) we will type (mofd)

gen mofd = mofd(date)
format mofd %tm

By following (mofd) function we will have monthly sorted data, we can check that by viewing our data sheet, a new column will be generated as illustrated below;

Then we will execute the following commands

bys symbol_code mofd: keep if _n == _N
bys symbol_code (mofd): gen rim= ln(closing / closing[_n-1])

Yearly Stock Return

As we follow the same pattern for monthly data, we will repeat the same procedure in the case of years. Just change the function on months into years like (yofd).

gen yofd = yofd(date)
format yofd %ty

The above commands will assist you to sort yearly data. You can see a new column of yofd which represents the yearly data and a new column will be generated for concerned data.

Then we will execute the following commands

bys symbol_code yofd: keep if _n == _N
bys symbol_code (yofd): gen riy = ln(closing / closing[_n-1]
Notify of
Inline Feedbacks
View all comments
Would love your thoughts, please comment.x