This article gives an overview of three Stata commands from the estout package that are used to output publication-style correlation tables. Though this package can also report regression results and summary statistics, it is particularly useful for correlation tables.

If you have not already done so, you can install the `estout` package with the following command:

ssc install estout
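
If you are unsure whether the package is already present, a small check like the one below avoids reinstalling it. This is a minimal sketch that relies only on Stata's built-in `capture` and `which` commands; it assumes an internet connection is available for the `ssc install` step.

```stata
* Try to locate estout; capture suppresses the error if it is not found
capture which estout

* A non-zero return code means Stata could not find the command
if _rc != 0 {
    ssc install estout
}
```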

For this article, we once again use Stata’s 1978 Automobile dataset.

sysuse auto.dta, clear

*estout*

To output the correlation table we need to follow two steps:

- Run *estpost* to estimate the correlation matrix,
- Output the correlation table in a presentable way.

*1. Run estpost to estimate the correlation matrix*

correlate price mpg rep78

The command above displays a simple correlation table for the specified variables. The table includes correlations of each of the three variables with each other, i.e. ‘price’ with ‘mpg’, ‘price’ with ‘rep78’, and ‘mpg’ with ‘rep78’.

estpost correlate price mpg rep78

Adding *estpost* at the start of the previous command returns a different table with two rows and four columns. Without further options, it reports the correlation of the first variable with each of the remaining variables, so here we get ‘price’ with ‘mpg’ and ‘price’ with ‘rep78’, but not ‘mpg’ with ‘rep78’. To include correlations between all pairs of variables (for any number of variables), we add the `matrix` option to our command.

estpost correlate price mpg rep78, matrix

We now get a table with correlations between all three variables reported. The *matrix* option reports correlations for all possible pairwise combinations of the variables in our command. The values still differ from those reported by *correlate* because the two commands treat missing values differently: *correlate* drops every observation that has a missing value in any of the listed variables, while *estpost correlate* by default computes each correlation from all observations available for that particular pair of variables.

This matters because not all variables in our example have the same number of observations: ‘price’ and ‘mpg’ have 74 observations, while ‘rep78’ has 69 (5 missing values).

To account for such missing values, and to ensure that *estpost* calculates the whole correlation matrix only from observations with no missing value in any of the variables, we use the *listwise* option along with `matrix`.

estpost correlate price mpg rep78, matrix listwise

The column for the number of observations, e(count), should now show a consistent figure throughout the table.
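
The difference between the two ways of handling missing values can also be seen with Stata's built-in commands. The sketch below is only an illustration and is not part of the estout workflow: `correlate` uses listwise deletion, while `pwcorr` uses pairwise deletion, and the `count` line reports how many observations are complete on all three variables (69, given the counts stated above).

```stata
sysuse auto.dta, clear

* Observations with no missing value in any of the three variables
count if !missing(price, mpg, rep78)

* Listwise deletion: every correlation uses the same complete observations
correlate price mpg rep78

* Pairwise deletion: each correlation uses all observations available
* for that pair; obs prints the number of observations behind each entry
pwcorr price mpg rep78, obs
```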

*2. Output the correlation table in a presentable way*

esttab using results.rtf, replace

*esttab* picks up the results stored by *estpost* and writes them to the named file. The syntax, though similar to *outreg2*, eliminates the need for a separate file-type option: the format is taken from the extension of the file name, .rtf in this case.

*‘Long’ to ‘Wide’ Format of the Table*

Because this basic syntax reports the correlation data in one column, we make it more readable and presentable by adding the *unstack* option to display the correlations in a ‘wide’ format.

esttab using results.rtf, replace unstack

*Removing Empty Rows*

The additional rows in the table, otherwise placeholders for t-statistics, which are not available here, can be removed with the straightforward *not* option:

esttab using results.rtf, replace unstack not
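
While experimenting with options, it can be convenient to preview the table before writing any file. As a minimal sketch, assuming the *estpost* results are still the active estimates, leaving out `using` makes *esttab* display the table in the Results window instead of saving it.

```stata
sysuse auto.dta, clear
estpost correlate price mpg rep78, matrix listwise

* No "using": the table is displayed on screen and no file is created
esttab, unstack not
```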

*Removing Number of Observations and Notes*

To omit the number of observations from the bottom of the table, we further add the `noobs` option:

esttab using results.rtf, replace unstack not noobs

To remove the note on significance levels at the bottom of the table, we also add the `nonote` option:

esttab using results.rtf, replace unstack not noobs nonote

*Adjusting Decimal Places*

We can make the number of decimal places consistent throughout the table by specifying a number in the *b()* option. In this example, we limit our table to report correlations up to two decimal places:

esttab using results.rtf, replace unstack not noobs nonote b(2)
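
The *b()* option is not limited to a plain number of decimals. As a minimal sketch, based on the numerical formats described in the estout help files, it should also accept a regular Stata display format. The choice of three decimals here is only for illustration.

```stata
sysuse auto.dta, clear
estpost correlate price mpg rep78, matrix listwise

* Three decimal places, given as a plain integer
esttab, unstack not noobs nonote b(3)

* The same precision, given as a Stata display format
esttab, unstack not noobs nonote b(%9.3f)
```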

*Reporting Labels Instead of Variable Names*

In order to report label names instead of variable names in our correlation table, the *label* option does the trick:

esttab using results.rtf, replace unstack not noobs nonote b(2) label
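
The labels shown in the table are the variable labels stored in the dataset, so they can be inspected or changed with Stata's built-in commands. The sketch below assumes you want a different label for ‘rep78’; the label text is a made-up example, and it is set before *estpost* is run to be on the safe side.

```stata
sysuse auto.dta, clear

* Show the variable labels currently stored in the dataset
describe price mpg rep78

* Replace one label (the text here is only an example)
label variable rep78 "Repair record (1978)"

* Rebuild the results and preview the labelled table on screen
estpost correlate price mpg rep78, matrix listwise
esttab, unstack not noobs nonote b(2) label
```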

To output the same table in Excel, we run the *estpost* and *esttab* commands as follows:

estpost correlate price mpg rep78, matrix listwise
esttab using correlationresults.csv, replace unstack not noobs compress b(2) nonote label

Simply changing the file extension from .rtf to .csv saves the table as a CSV file that can be opened in Excel. In addition, the *compress* option reduces unnecessary horizontal spacing.
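
When the same table is needed in both formats, the shared options can be kept in one place. This is a minimal sketch of one way to do it, using a local macro whose name (`opts`) is arbitrary. Because local macros are only visible within the run that defines them, the lines should be executed together, for example from a single do-file.

```stata
sysuse auto.dta, clear
estpost correlate price mpg rep78, matrix listwise

* Options shared by both output files
local opts replace unstack not noobs nonote b(2) label

* Word-readable and Excel-readable versions of the same table
esttab using results.rtf, `opts'
esttab using correlationresults.csv, `opts' compress
```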

The steps above should now leave you with a neat, publication-worthy correlation table for use in formal academic papers.

For more details on the estout commands, see the help files that ship with the package (`help estpost`, `help esttab`, `help estout`).