Cheat sheets for programming languages were commonplace before the internet and powerful search engines became widely available. I believe that cheat sheets are still very useful today because search engines have become almost too powerful and return more answers than the question requires. Some of the most popular cheat sheets for Stata were prepared by Tim Essam and Laura Hughes from the U.S. Agency for International Development (USAID). Follow them on Twitter: @StataRGIS and @flaneuseks. Here are some of their cheat sheets:
Nesting local macros in Stata
Local macros are a very useful feature of Stata. Here is a simple example.
local macro1 "Hello!"
local macro2 = "How are you?"
local macro3 2+2
local macro4 = 2+2
di "`macro1' `macro2'"
di "Here's some math: `macro3' = `macro4'"
A few comments are in order here. `macro1' tells Stata to replace this expression with the contents of the local macro macro1. Make sure to leave no spaces inside the `' delimiters. Also, note how macro3 and macro4 lead to different outcomes: macro3 holds the text 2+2, while macro4 holds 4. You need the equal sign to make Stata evaluate the expression; otherwise, it treats the expression as a string.
Now, Stata applies the “parentheses rule” when replacing local macros with their contents. That is, it first replaces the innermost local macro, then the second innermost local macro, and so on. Here is an example.
local macro5 "nest"
local nestmacro "Local macros can be nested!"
di "``macro5'macro'"
Let’s add another layer to the nesting of the previous example and play with it a little bit more:
local macro5 "nest"
local macro6 "macro"
local macro7 5
local nestmacro "Local macros can be nested!"
di "``macro`macro7''`macro6''"
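As a further illustration, here is a minimal sketch of a common use of nesting: building a macro name inside a loop. The macro names list1 and list2 are hypothetical and chosen only for this example.

```stata
/* Two hypothetical lists stored in numbered local macros */
local list1 "apple banana"
local list2 "carrot leek"

forvalues i = 1/2 {
    /* The loop index is expanded first, then the resulting name (list1 or list2) */
    di "List `i': `list`i''"
}
```

Because the innermost macro is expanded first, the loop should display the contents of list1 and then the contents of list2.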
I guess you have a pretty good idea of how this works now. Good luck in your Stata coding.
A little bit more about loops and macros in Stata
Loops and lists are important tools for Stata programmers. Although they may not be the most computationally efficient techniques, they can be of great help during the early stages of code development. By the way, it is good programming practice to keep definitions local to the do-file you are writing whenever possible.
The first example is about using a counter. You can paste it into the Do-file Editor and then run it.
local fruits apple banana pear
local counter 0
foreach d of local fruits {
    local counter = `counter' + 1
    display "`counter' - `d'"
}
After running it, the output should be something like this:
1 - apple
2 - banana
3 - pear
Notice that I used the equal sign “=” to build the counter. It is required here: without it, Stata stores the text of the expression instead of evaluating it. Just for fun, see what happens when you replace “local counter = `counter' + 1” with “local counter `counter' + 1”.
Now suppose we have an ordered list of 4-digit produce (PLU) codes and we would like to display the PLU code next to each fruit name.
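To see the difference concretely, the following sketch builds a counter both ways. It uses only built-in Stata commands; the macro name total is an assumption made for this example.

```stata
/* Without the equal sign: the macro accumulates text */
local counter 0
local counter `counter'+1
local counter `counter'+1
display "`counter'"    /* displays the text 0+1+1 */

/* With the equal sign: the expression is evaluated at each step */
local total 0
local total = `total'+1
local total = `total'+1
display "`total'"      /* displays 2 */

/* Shorthand that increments a numeric local macro by one */
local ++total
display "`total'"      /* displays 3 */
```

The shorthand local ++total is a compact alternative when the counter only needs to go up by one.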
local fruits apple banana pear
local plucode 3009 4011 4414
/* The first step is to find out the number of fruits */
local fruitnumber: word count `fruits'
di "Fruit - PLU code"
/* The second step is to build a loop to pair each fruit with its PLU code */
forvalues j = 1/`fruitnumber' {
    /* The third step is to pick the j-th element of each list to have them paired */
    local pair1: word `j' of `fruits'
    local pair2: word `j' of `plucode'
    display "`pair1' - `pair2'"
}
Output:
. local fruits apple banana pear
. local plucode 3009 4011 4414
. /* The first step is to find out the number of fruits */
. local fruitnumber: word count `fruits'
. di "Fruit - PLU code"
Fruit - PLU code
.
. /* The second step is to build a loop to pair each fruit with its PLU code */
. forvalues j = 1/`fruitnumber' {
/* The third step is to pick the j-th element of each list to have them paired */
local pair1: word `j' of `fruits'
local pair2: word `j' of `plucode'
display "`pair1' - `pair2'"
}
apple - 3009
banana - 4011
pear - 4414
.
end of do-file
So, the “word count” function returns the number of elements in the macro, and “word `j' of `fruits'” returns the j-th element of the macro fruits. You can find out more about these extended macro functions in Stata by typing “help extended_fcn”.
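Extended macro functions can also look up one element directly. The sketch below reuses the two lists from the example above: it finds the position of banana with the macro list function posof and then picks the matching PLU code. The macro names pos and code are hypothetical.

```stata
local fruits apple banana pear
local plucode 3009 4011 4414

/* Position of banana in the fruits list (0 if it is not found) */
local pos: list posof "banana" in fruits

/* Pick the PLU code stored at the same position */
local code: word `pos' of `plucode'
display "banana - `code'"
```

This should display banana - 4011. Like the loop, it relies on both lists being in the same order.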
Stata tip: creating a local containing all (or almost all) variables of the data set
Locals containing a list of variables can be very useful when using Stata. A common need is a local containing all variables of a data set. This local can be created by means of the ds command.
Here is an example using the lifeexp.dta data file.
. webuse lifeexp, clear
(Life expectancy, 1998)
Now, let’s create a local named allvar that will contain all variables of this data set.
. ds
region country popgrowth lexp gnppc safewater
. local allvar `r(varlist)'
. di "`allvar'"
region country popgrowth lexp gnppc safewater
We can see that ds stored the variable list in r(varlist). One interesting variation is the creation of a local containing all variables except region. You need to specify the variables to be excluded right after ds, and add the option not after a comma.
. ds region, not
country popgrowth lexp gnppc safewater
. local othervar `r(varlist)'
. di "`othervar'"
country popgrowth lexp gnppc safewater
The command ds has several other useful applications that will be discussed later in this blog.
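As one hedged example of such an application, the sketch below uses the has(type numeric) option of ds to keep only numeric variables and then loops over them. The macro name numvars is an assumption made for this example.

```stata
webuse lifeexp, clear

/* Keep only numeric variables; string variables such as country should be left out */
ds, has(type numeric)
local numvars `r(varlist)'

/* Summarize each numeric variable in turn */
foreach v of local numvars {
    summarize `v'
}
```

Saving r(varlist) in a local right away matters because later commands such as summarize overwrite the stored results.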
Transferring IPEADATA series to Stata
A common issue that arises when converting time series data from IPEADATA to Stata format is dealing appropriately with the time variable. For instance, for monthly series the date format will be YYYY.MM. Stata usually interprets this format as numeric.
Suppose you have already downloaded a monthly series from IPEADATA and transferred it to Stata. It is very likely that the date variable (let’s call it date) has been automatically handled as a numeric variable. The first thing to pay attention to is that the numeric format drops trailing zeroes after the decimal point. This means that October of 1940 is coded as 1940.10 by IPEADATA and interpreted as 1940.1 by Stata. To recover the missing zero, the first step is to convert this variable to string format. This can be done with the string() function.
generate sdate=string(date)
To add back the missing zeroes, we can do the following:
replace sdate=sdate+"0" if length(sdate)<7
Now, we just need to tell Stata to interpret sdate as a monthly date variable. This can be accomplished with the command numdate. It is not a standard Stata command and needs to be installed on your computer (ssc install numdate).
numdate mo newdate = sdate, pattern(YM)
The line above can be read as: create a new monthly date variable named newdate from the variable sdate, which is in the YYYY.MM format.
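If you prefer not to install a user-written command, the same conversion can be sketched with built-in functions only. This is an alternative to the numdate approach, not the method described above; the toy data and the variable names year, month, and mdate are assumptions for illustration.

```stata
clear
/* Toy data in the YYYY.MM layout, stored as numbers */
input double date
1940.01
1940.10
1940.12
end

/* Split the number into year and month; round() guards against
   floating-point error in the decimal part */
generate year = floor(date)
generate month = round((date - year) * 100)

/* Build a Stata monthly date and give it a monthly display format */
generate mdate = ym(year, month)
format mdate %tm
list date mdate
```

With these three observations, mdate should display as 1940m1, 1940m10, and 1940m12. Working on the numeric value directly sidesteps the missing trailing zero, since 1940.1 and 1940.10 are the same number.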
The numdate ado file can deal with very flexible date specifications, and its help file is very comprehensive. Two other useful commands are convdate and extrdate. They are used to convert or extract parts of dates from variables that are already in the Stata date format.
A final recommendation is to take a look at Stata documentation on dates that is available at http://www.stata.com/manuals13/ddatetime.pdf.
A few tips for programming in Stata
Stata is a very powerful and useful statistical software package. Like any sophisticated tool, it takes time to learn, and you need to invest some time to master it. Programming is one of those skills where knowing even a little can be very beneficial. Below you will find four videos. The first video goes over the functionalities of the Stata Program Editor. The second video covers some basics of Stata commands. The third video talks about loops, which are an essential tool for programmers. Finally, the fourth video is about macros, which together with loops are very useful for handling repetitive tasks.
How to use the Stata Program editor:
Basics of Stata:
Quick guide to loops:
More about macros:
Exporting Stata’s correlation tables to a document file
I came across a very useful ado file for Stata named asdoc that facilitates the creation of neat tables.
To install asdoc, just type: ssc install asdoc
Here is an example of exporting a correlation table to a document named table.doc.
sysuse auto
asdoc correl price mpg headroom trunk weight, save(table.doc) dec(3)
Note that dec(3) means to export the correlations with 3 decimal places.
Asdoc has tons of other applications. Its help file is very comprehensive. And you can have a glimpse of its capabilities in the following videos:
Reading DATASUS and ANS .dbc files using R
Those interested in conducting research using Brazilian data often come across some unusual file formats used by Brazilian government agencies. One example is the .dbc files used for health data produced by DATASUS and by ANS. These .dbc files are not related to the .dbc files used by FoxPro, for instance. In fact, they are just compressed .dbf files.
There is a package in R that can convert these .dbc files into regular .dbf files. This package is named read.dbc. Once you have generated the .dbf files, you can use the package foreign and its function read.dbf to import the data into R. This very same package allows you to save the data in Stata .dta format by means of the function write.dta.
The package maptools also has a function to read dbf files named dbf.read, though I have not tried it yet.