This short article is a guide to exporting summary statistics from Stata to MS Word, Excel or LaTeX using the *outreg2* command.

For this guide, we start by loading Stata’s built-in 1978 Automobile dataset using:

sysuse auto.dta
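
Because *outreg2* is a user-written command rather than part of official Stata, it has to be installed before first use. A minimal setup sketch, assuming an internet connection and that the package is fetched from the SSC archive, might look like this:

```stata
* Install outreg2 from SSC (needed only once per Stata installation)
ssc install outreg2, replace

* Load the built-in 1978 Automobile dataset, clearing any data in memory
sysuse auto, clear

* List variable names, storage types and labels
describe
```

The `describe` step is optional; it simply shows which variables are available before any tables are exported.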

*All Summary Statistics for All Variables*

To report summary statistics for all the variables in our dataset, we use the familiar *outreg2* syntax with a new option, `sum(log)`. This option indicates to Stata that a summary table is being output.

outreg2 using results, word replace sum(log)

We can also replace variable names with variable labels, a step that we explain in one of our introductory *outreg2* articles.
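
As a rough illustration of these variations, the sketch below adds the `label` option to show variable labels and swaps `word` for `excel` or `tex` to change the output format. It assumes the auto dataset is loaded and reuses the file name `results` from the command above.

```stata
* Same summary table, with variable labels shown instead of variable names
outreg2 using results, word replace sum(log) label

* The same table written for Excel and for LaTeX
outreg2 using results, excel replace sum(log)
outreg2 using results, tex replace sum(log)
```

Each call uses `replace`, so any existing output file of that name is overwritten.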

With the *summarize* command, which is typically used to return summary statistics, Stata allows an option of `detail`. This option outputs a table with additional statistics. We can report these extra statistics through the `outreg2` command by typing `detail` in the parentheses of the `sum()` option used above:

outreg2 using results, word replace sum(detail)
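
To see what `detail` adds before exporting anything, it can help to compare the two forms of *summarize* directly in the Results window. A small sketch, with `price` and `mpg` chosen only for illustration:

```stata
sysuse auto, clear

* Default output: observations, mean, standard deviation, minimum, maximum
summarize price mpg

* Detailed output: adds percentiles, variance, skewness and kurtosis
summarize price mpg, detail
```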

*Some Summary Statistics for All Variables*

If we only wish to report, say, the number of observations, mean and standard deviation of our variables (and not the minimum and maximum that are also reported by default), we add the *eqkeep()* option to specify which statistics we want to retain:

outreg2 using results, word replace sum(log) eqkeep(N mean sd)

*Some Statistics for Some Variables*

The *keep()* option retains the variables that we specify in its parentheses. However, we cannot specify both the *eqkeep()* and `keep()` options at the same time.

To obtain a summary table with a few statistics for a few variables, you can use *eqkeep()* to retain statistics and `drop()` to omit variables, or vice versa.
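
A sketch of that combination, following the description above, is shown here. The list of dropped variables is an illustrative choice meant to leave price, mpg, headroom, trunk and rep78 in the table, and the exact behaviour may depend on the installed *outreg2* version.

```stata
sysuse auto, clear

* Keep N, mean and sd as statistics; omit variables via drop() instead of keep()
outreg2 using results, word replace sum(log) eqkeep(N mean sd) ///
    drop(weight length turn displacement gear_ratio foreign)
```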

The following command will return an error since both *eqkeep()* and `keep()` appear simultaneously:

outreg2 using results, word replace sum(log) eqkeep(N mean sd) keep(price mpg headroom trunk rep78)

*Summary Statistics for Observations Used In a Regression*

Because Stata uses casewise (listwise) deletion, it omits observations with missing values from any regression analysis. Therefore, the number of observations used in a regression is often lower than the number of observations reported for each variable in the summary statistics.

For example, if we summarise the data, we see that the variables ‘price’, ‘mpg’, and ‘headroom’ have 74 observations. ‘rep78’ has 69 observations. When we regress price on the other three variables, we note that the regression used 69 observations even though there were variables with 74 observations. This is because Stata omits any observation where rep78 is missing. We therefore find estimates from only 69 observations reported in the regression results.
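
These counts can be checked directly in Stata. The sketch below uses only built-in commands and the variables named above:

```stata
sysuse auto, clear

* Non-missing observations per variable (74 for all but rep78, which has 69)
summarize price mpg headroom rep78

* Number of observations where rep78 is missing
count if missing(rep78)

* The regression silently drops those observations
regress price mpg headroom rep78

* Number of observations actually used in the estimation
display e(N)
```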

To obtain summary statistics only for the variables and observations used in a regression, we first run the regression, then use the *outreg2* command right after it with the `sum` option:

regress price mpg headroom rep78

outreg2 using results, word replace sum
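
One way to cross-check the exported table is Stata's built-in `e(sample)` function, which marks the observations used by the most recent estimation command. A minimal sketch:

```stata
sysuse auto, clear
regress price mpg headroom rep78

* Summarize only the observations that entered the regression
summarize price mpg headroom rep78 if e(sample)
```

Every variable should now report the same number of observations as the regression itself.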

*Summary Statistics for Different Groups/Categories*

To obtain summary statistics for each category of a categorical variable, we simply add the *bysort* prefix. Here, we use ‘foreign’ as our categorical variable of choice. This variable takes the value 1 when a vehicle is foreign, and 0 when a vehicle is domestic.

bysort foreign: outreg2 using results, word replace sum(log) eqkeep(N mean sd)

The summary statistics will be reported separately for foreign cars (where variable ‘foreign’ = 1) and domestic cars (where variable ‘foreign’ = 0).
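
For a quick on-screen check of the grouped figures before opening the exported file, the built-in `tabstat` command can produce the same three statistics by group. In this sketch the three variables are picked only for illustration:

```stata
sysuse auto, clear

* N, mean and standard deviation, reported separately for domestic and foreign cars
tabstat price mpg headroom, by(foreign) statistics(n mean sd)
```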

*Outputting Frequency Distribution*

The *cross* option allows us to output the frequency distribution of any variable we specify after `outreg2`. In the case of a categorical variable, it includes each category in the table.

outreg2 foreign rep78 using results, word replace cross
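
The exported frequencies can be compared against Stata's built-in `tabulate` command, sketched here for the same two variables:

```stata
sysuse auto, clear

* One-way frequency tables
tabulate foreign
tabulate rep78, missing   // the missing option also counts missing values of rep78

* Two-way table of foreign against rep78
tabulate foreign rep78
```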

Though *outreg2* achieves all of the above well, it may not be the best command to output summary statistics from Stata. Another command, `asdoc`, is perhaps more appropriate for this purpose, and will be discussed in a future article.