Publication Style Correlation Table in Stata

This article gives a comprehensive overview of three Stata commands from the estout package that are used to output publication-style correlation tables. Although the package can also report regression results and summary statistics, it is particularly useful for correlation tables. If you want to review the basic concepts of correlation, see Chapter 8 of A Gentle Introduction to Stata by Alan C. Acock.


If you have not already done so, you can install the estout package in Stata with the following command:

ssc install estout
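If the commands in this article are kept in a do-file that is run repeatedly or shared with co-authors, it can be convenient to install the package only when it is missing. The following is a minimal sketch of that idea, assuming the machine has internet access to the SSC archive:

```stata
* Look for estout on the ado-path; "capture" suppresses the error if it is absent
capture which estout

* A non-zero return code means the command was not found, so install it
if _rc != 0 {
    ssc install estout
}
```

On a machine where estout is already installed, the block does nothing.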

For this article, we once again make use of Stata’s 1978 Automobile dataset.

sysuse auto.dta, clear


To output the correlation table we need to follow two steps:

  1. Run estpost to estimate the correlation matrix,
  2. Output the correlation table in a presentable way.

1. Run estpost to estimate the correlation matrix

correlate price mpg rep78

The command above displays a simple correlation table for the specified variables. The table includes correlations of each of the three variables with each other, i.e. ‘price’ with ‘mpg’, ‘price’ with ‘rep78’, and ‘mpg’ with ‘rep78’.

estpost correlate price mpg rep78

Adding estpost before the previous command returns a different table, with two rows and four columns. Without further options, estpost correlate reports only the correlations of the first variable with each of the remaining variables, so here it reports ‘price’ with ‘mpg’ and ‘price’ with ‘rep78’, but not ‘mpg’ with ‘rep78’. To include the correlations between all three variables (or any number of variables), we add the matrix option to the command.

estpost correlate price mpg rep78, matrix 

We now get a table with the correlations between all three variables. The matrix option reports the correlations for all possible pairwise combinations of the variables in the command. Some values still differ from those reported by correlate because the two commands handle missing values differently: correlate drops every observation that has a missing value in any of the listed variables, whereas estpost correlate by default uses all observations available for each pair of variables.

Not all variables in a correlation table necessarily have the same number of observations. In our example, ‘price’ and ‘mpg’ have 74 observations, while ‘rep78’ has 69 (5 missing values).
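The observation counts mentioned above can be checked directly in the data. The sketch below uses only built-in Stata commands; pwcorr is included here as an illustration of pairwise correlations and is not one of the estout commands discussed in this article:

```stata
sysuse auto.dta, clear

* The Obs column shows the non-missing observations for each variable
summarize price mpg rep78

* Count the observations in which rep78 is missing
count if missing(rep78)

* Pairwise correlations, with the number of observations used for each pair
pwcorr price mpg rep78, obs
```

Pairs that involve ‘rep78’ should be based on fewer observations than the ‘price’ and ‘mpg’ pair, which is the source of the differences discussed above.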

To account for these missing values, and to ensure that the estpost command calculates the correlation matrix only from observations that have non-missing values for all variables, we use the listwise option along with matrix.

estpost correlate price mpg rep78, matrix listwise

The column for the number of observations, e(count), should now show a consistent figure throughout the table.
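To see what estpost has stored before any table is written, the posted results can be listed. This is a small sketch that assumes the results are saved under the same names as the column headers in the displayed table, such as e(rho) and e(count):

```stata
sysuse auto.dta, clear
estpost correlate price mpg rep78, matrix listwise

* Show everything that estpost left behind in e()
ereturn list

* Correlation coefficients and the number of observations behind each one
matrix list e(rho)
matrix list e(count)
```

With the listwise option, every entry in e(count) should be the same number.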

Video walkthrough: https://www.youtube.com/watch?v=St6qDmvBI08

2. Output the correlation table in a presentable way

esttab using results.rtf, replace 

The esttab command writes the results posted by estpost to a file. The syntax is similar to outreg2, but there is no need to specify an option for the output file type: esttab infers the format from the file extension, .rtf in this case.
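Before writing a file, it can help to preview the table in the Results window, and the same extension-based logic applies to other output formats. A brief sketch follows; the file name results.tex is an assumption chosen for illustration:

```stata
sysuse auto.dta, clear
estpost correlate price mpg rep78, matrix listwise

* Without "using", the table is only displayed and no file is written
esttab

* The file extension determines the output format, here LaTeX
esttab using results.tex, replace
```

The options introduced in the following sections can be added to either form of the command.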

‘Long’ to ‘Wide’ Format of the Table

Because this basic syntax reports the correlation data in one column, we make it more readable and presentable by adding the unstack option to display the correlations in a ‘wide’ format.

esttab using results.rtf, replace unstack

Removing Empty Rows

The additional empty rows in the table are placeholders for t-statistics, which are not available here. They can be removed with the not option:

esttab using results.rtf, replace unstack not

Removing Number of Observations and Notes

To omit the number of observations from the bottom of the table, we add the noobs option:

esttab using results.rtf, replace unstack not noobs

To remove the note on significance levels at the bottom of the table, we also add an option of nonote:

esttab using results.rtf, replace unstack not noobs nonote

Adjusting Decimal Places

We can make the number of decimal places consistent throughout the table by specifying a number in the b() option. In this example, we limit our table to report correlations up to two decimal places:

esttab using results.rtf, replace unstack not noobs nonote b(2)

Reporting Labels Instead of Variable Names

In order to report label names instead of variable names in our correlation table, the option of label does the trick:

esttab using results.rtf, replace unstack not noobs nonote b(2) label

To output the same table in Excel, we run the estpost and esttab commands as follows:

estpost correlate price mpg rep78, matrix listwise
esttab using correlationresults.csv, replace unstack not noobs compress b(2) nonote label

Simply changing the file extension from .rtf to .csv saves the table as a CSV file that can be opened in Excel. In addition, the compress option reduces unnecessary horizontal spacing.

The steps above should now leave you with a neat, publication-worthy correlation table for use in formal academic papers.
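For convenience, the whole workflow can be collected in a short do-file. The sketch below stores the shared esttab options in a local macro named opts, which is a name chosen here for illustration. Because local macros exist only while a do-file runs, the lines need to be executed together rather than one at a time:

```stata
sysuse auto.dta, clear

* Step 1: post the correlation matrix, using complete observations only
estpost correlate price mpg rep78, matrix listwise

* Options shared by both output files
local opts replace unstack not noobs nonote b(2) label

* Step 2: write the table for Word and for Excel
esttab using results.rtf, `opts'
esttab using correlationresults.csv, `opts' compress
```

Keeping the options in one place makes it less likely that the Word and Excel versions of the table drift apart.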

For more details on the estout package, see its help files by typing help estout or help esttab in Stata.
