When a data set contains many variables, you may want to rearrange them so that they are easier to work with later. In larger data sets, for instance, it becomes difficult to identify important variables and look them up in the data editor. Variables that are used repeatedly and matter most are best placed in the first columns. For these reasons, you may want to reorder variables in Stata, which is easily done using the order command.
The Order Command:
Let’s load the auto data using the following command. The auto data will be used as the example throughout this article:
sysuse auto
The order command relocates variables to the position you specify. For instance, suppose we wish to move the weight variable to the beginning of the list of variables shown in the data editor. We would simply use the following command:
order weight
Note that we didn’t specify a position for the weight variable, yet it was placed at the top of the variable list. This is the default behaviour: any variables listed after order are moved to the beginning of the data set unless an option says otherwise. You can verify this in the data editor window, where weight is now the first column, followed by the other columns.
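The position can also be checked from the command line instead of the data editor. The lines below are a minimal sketch, assuming the auto data that ships with Stata; ds is used here only to print the variable names in their current order.

```stata
* Load the example data, discarding anything currently in memory
sysuse auto, clear

* Move weight to the first position (the default when no option is given)
order weight

* Print the variable names in dataset order; weight should be listed first
ds
```

The clear option is added so that the sketch can be rerun even when another data set is already loaded.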
The order command can also be run from the Stata menu:
Data > Data Utilities > Change order of the variables
You can choose the variables to reorder from the drop-down option in the window, as shown in the picture below.
However, the more convenient way is to use the order command to relocate variables. Just as we relocated one variable, weight, we can relocate multiple variables too. To do so, we write the order command followed by the list of variables we wish to relocate. For instance, to move the variables foreign, mpg and length to the top in the same manner, the following command will be used:
order foreign mpg length
Note that the variables are placed in the order in which they are listed: foreign, which comes immediately after order, is placed first, followed by mpg and length.
Moving variables to the end of the list
Next, suppose we wish to relocate a variable to the end of the variable list. We tell the order command to do this by adding the last option after the variable name, separated by a comma. To move the “weight” variable to the end, we will use the following command:
order weight, last
Similarly, if more than one variable is to be placed at the end of the variable list, the same order command is used with the last option. The variables keep the order in which they are listed, so the variable written just before the comma ends up as the very last variable in the list.
order foreign mpg length, last
Here the variable “length” will be the last variable in the list, with foreign and mpg immediately before it.
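To see the effect of last on several variables at once, the result can be listed from the command line. This is a small sketch, assuming a freshly loaded copy of the auto data; ds is used only to display the resulting order.

```stata
* Start from the original variable order of the auto data
sysuse auto, clear

* Move three variables to the end, keeping the order in which they are typed
order foreign mpg length, last

* The last three names printed should be foreign, mpg and length
ds
```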
Placing Variables before or after any other variable
Just as we can relocate variables to the end of the list, we can also place them before or after a specific variable, using the before() and after() options. For instance, to place the variable “turn” before the “trunk” variable, the command will be:
order turn, before(trunk)
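The after() option works the same way as before(), except that the variable is placed immediately after the named variable. A minimal sketch, again assuming a fresh copy of the auto data:

```stata
* Start from the original variable order of the auto data
sysuse auto, clear

* Place turn immediately after trunk instead of before it
order turn, after(trunk)

* In the printed list, turn should now directly follow trunk
ds
```

A list of variables can be given as well, for example order length turn, after(trunk), in which case the listed variables are inserted together at that position.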
Large data sets with many variables can make data handling and analysis difficult. However, variables can easily be found using the search bar at the top of the Variables window.
Click on the filter option in the Variables window > search for a variable by typing its name
Ordering Variables Alphabetically:
Variables can also be ordered alphabetically using the order command, which makes it easier to find the variables you need.
To order variables alphabetically, add the alphabetic option to the order command. One way to do this is to copy the variable names from the Variables window, paste them after the order command, and add alphabetic after a comma:
order price mpg rep78 headroom trunk weight length turn displacement gear_ratio foreign, alphabetic
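The listed variables are now placed alphabetically at the top of the variable list. Note that make is not included in the command above, so it stays after the sorted variables.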
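Typing every variable name is not strictly necessary. As a shorter alternative, the special variable list _all stands for every variable in the data set, so the whole data set, including make, can be alphabetized in one line. A minimal sketch, assuming the auto data:

```stata
* Start from the original variable order of the auto data
sysuse auto, clear

* _all expands to every variable, so nothing has to be copied and pasted
order _all, alphabetic

* Print the names to confirm that they are now in alphabetical order
ds
```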
If only two or three variables need to be ordered alphabetically, this can also be done with the order command. For instance, to order the variables price, rep78 and mpg alphabetically, we list them after the order command. Because variables listed after order go to the top by default, they are moved to the top of the variable list, but now in alphabetical order. In this example, mpg goes first, price comes after mpg, and rep78 follows.
order price rep78 mpg, alphabetic
The Wild Card: Sequencing variables
Wildcards in Stata are extremely useful and can save a lot of time when handling data. They are shortcuts for referring to several variables at once. In the context of the order command, a wildcard selects all variables whose names match a pattern, for example all names starting with the same letter, and relocates them together. The starting letter is written after order, followed by an asterisk (*). Let’s say we want to order the variables whose names start with “t”. In the auto data, two variables (trunk and turn) start with “t”, and both will go to the top of the variable list. The following command will be used:
order t*
In the following image, the variables starting with “t” are placed at the top of the variable list using the order command.
Similarly, in a data set that contains variables whose names start with “s”, they can be placed at the top using the command order s*.
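Before reordering with a wildcard, it can help to check which variables the pattern actually matches. The sketch below assumes the auto data and uses ds with the same pattern as a preview; this is a suggested habit, not a required step.

```stata
* Start from the original variable order of the auto data
sysuse auto, clear

* Preview the variables matched by the pattern (trunk and turn in the auto data)
ds t*

* Move every variable whose name starts with t to the top of the list
order t*

* Confirm the new order
ds
```

If no variable matches the pattern, Stata reports an error rather than silently doing nothing, so the preview also guards against typos in the pattern.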
Some data sets contain variables with numbers in their names, and these need to be ordered with care. The order command makes such variables easier to handle. To demonstrate, let’s generate four variables with numbers in their names:
gen a2=1
gen b3=1
gen a1=1
gen b2=1
These variables were generated in no particular sequence. If we wish to group them by their first letter, we will use the following command:
order a* b*
Note that all the variables starting with a and b will go to the top. In this data set, there are no other variables starting with a or b, so only the variables generated above are moved.
However, these variables are grouped only by their first letter and are not sequenced numerically: within each group they keep their existing order (a2 before a1, b3 before b2). To sequence them such that a1 comes before a2, followed by b2 and b3, the sequential option will be used:
order a* b* , sequential
This sequences the variables both alphabetically and numerically.
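The difference between the two commands is easiest to see by running them one after the other. The following sketch assumes a fresh copy of the auto data and recreates the four example variables; ds is used only to print the first-letter groups after each step.

```stata
* Start from a clean copy of the auto data
sysuse auto, clear

* Create the example variables in a deliberately mixed-up sequence
gen a2 = 1
gen b3 = 1
gen a1 = 1
gen b2 = 1

* Wildcards alone group by first letter but keep the existing order within each group
order a* b*
ds a* b*

* sequential additionally sorts the names so that the numbers run in sequence
order a* b*, sequential
ds a* b*
```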