STATA User Group September 2007 Shuk-Li Man and Hannah Evans
Content
MACROS
- Description
- Why use them?
- How to store and use/recall them
- Applied examples
LOOPS
- Applied examples (using foreach, forval and while)
MACROS WITH LOOPS
- Why use them together?
Aims of the Talk
- An understanding of what macros and loops are and the principles behind using them.
- Understand how loops and macros can solve problems in a number of different contexts.
- Be able to apply macros and loops to your work.
MACROS
What is a macro? A macro is a saved sequence of keyboard strokes that can be stored and then recalled with a single keyboard stroke. (ref: http://whatis.techtarget.com)
Macros. Why use them?
Macros increase the scope of your do file.
How? They allow you to run the same do file on multiple datasets, because the do file can adapt to the contents of each dataset.
Macros in STATA
A sequence of keystrokes (the macrocontent) is stored under a macroname (you can call it whatever you like).
Macros are either local or global.
- A local macro can only be used within the do file or program in which it is created.
- A global macro can be used or modified by any do file or program for the rest of the Stata session.
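The difference in scope can be seen in a short sketch. The macro names (greeting, project, inside) and the program name showscope are made up for illustration and are not from the talk. In Stata a local macro exists only in the do file or program that defines it, while a global macro stays available for the rest of the session.

```stata
* Illustrative names only: greeting, project, inside, showscope
local greeting hello
global project GPRD

display "`greeting'"
display "$project"

* A local macro defined inside a program is not visible outside it
capture program drop showscope
program define showscope
    local inside onlyhere
    display "inside the program: `inside'"
    display "the global is visible here too: $project"
end

showscope

* `inside' is empty here because it belonged to the program
display "outside the program: [`inside']"

* Tidy up the global so it does not linger in the session
macro drop project
```

Run as a do file, the last display shows empty brackets, because the local macro inside no longer exists once the program has finished.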
Storing a macro (macro assignment)
local animals dog cat mouse
Here the macrocontent is dog cat mouse, stored under macroname animals
local sum =2+2
Here the macrocontent is 4 (the expression after = is evaluated), stored under macroname sum
local add "=2+2"
Here the macrocontent is =2+2, stored under macroname add
Recalling a macro (after storing it)
`macroname'
Single quotes are compulsory: a left quote (`) before the macroname and a right quote (') after it.
[Figure: keyboard. The key(s) for the left quote (`) are circled in red; the key for the right quote (') is circled in green.]
Using a macro
"`macroname'"
Double quotes around the single quotes are good practice.
Examples of recalling a macro-using display
local animals dog cat mouse
display "`animals'"
. dog cat mouse
local sum =2+2
display "`sum'"
. 4
local add "=2+2"
display "`add'"
. =2+2
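A small sketch of the difference between storing an evaluated expression and storing literal text. The macro name literal is an illustrative addition; sum and add are the names from the slides.

```stata
* With "=", the expression is evaluated before it is stored
local sum = 2+2

* In double quotes, the characters are stored exactly as typed
local add "=2+2"

* Without "=", the text is stored as is (illustrative macro name)
local literal 2+2

display "`sum'"
display "`add'"
display "`literal'"

* Without double quotes, display evaluates the recalled text as an expression
display `literal'
```

The last line expands to display 2+2, so Stata shows 4 even though the macro itself holds the text 2+2.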
Using Macros in different contexts
In a macro you can store:
- The categories/levels of a variable, e.g. gender: male or female
- The results of a command run on a variable, e.g. mean age, maximum height
- The file names in a directory, e.g. all files in directory C:/datafiles
Macros-Applied Example 1 -Store the categories of a variable
levelsof gender, local(genderlevels) clean
Stores the categories female and male from the variable gender under the macroname genderlevels (the clean option leaves out the quotes Stata otherwise puts around each string value)
display "`genderlevels'"
. female male
Macros-Applied Example 2 -Storing the output from a variable (1)
summarize age
local k=r(max)
di "`k'"
. 96
* oldest person is 96
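The same idea can be tried on a dataset that ships with Stata. This sketch assumes the auto example dataset and its mpg variable, neither of which is part of the talk; the macro names biggest and average are illustrative.

```stata
* Assumption: uses Stata's shipped example dataset, not the talk's data
sysuse auto, clear

summarize mpg

* Show everything summarize left behind in r()
return list

* Copy the results into local macros before another command overwrites r()
local biggest = r(max)
local average = r(mean)

display "Highest mpg: `biggest'"
display "Mean mpg: " %4.1f `average'
```

Copying r(max) into a macro straight away matters because the next r-class command replaces the stored results.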
Macros-Applied Example 3 -Storing the output from a variable (2)
gen numword=wordcount(description)
summarize numword
    Variable |  Obs   Mean   Std. Dev.   Min   Max
     numword |    5                        1     4
(Mean and Std. Dev. are not shown on the slide.)
local k=r(max)
di "`k'"
. 4
* maximum number of words in the description is 4 and this is stored in macroname k
Macros-Applied Example 4 -Store names of files in a directory
GPRD example: text files clinical1.txt, clinical2.txt & clinical3.txt are stored in directory C:\GPRD files
local mylist: dir "C:\GPRD files\" files "clinical*.txt"
This stores the names of all text files in directory "C:\GPRD files\" beginning with the word "clinical" under the macroname mylist (a name you choose), and is the same as:
cd "C:\GPRD files\"
local mylist clinical1.txt clinical2.txt clinical3.txt
LOOPS
Loops. What is one? A set of commands executed repeatedly for.. 1. a list of elements (foreach) or… 2. a range of values (forval) 3. a specified condition (while) (Not covered in this talk)
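A minimal sketch of the three loop types side by side. The list and the counter names (pet, i) are illustrative; the while loop is included only for completeness, since the talk does not cover it.

```stata
* 1. foreach: runs once for each element of a list
foreach pet in dog cat mouse {
    display "`pet'"
}

* 2. forvalues (forval): runs once for each value in a range
forvalues i = 1/3 {
    display `i'
}

* 3. while: runs for as long as a condition holds
local i = 1
while `i' <= 3 {
    display `i'
    local i = `i' + 1
}
```

With while, the counter has to be updated by hand inside the loop; forgetting that line gives a loop that never ends.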
Loops. Why use them?
- Reduces length of do files
- Reduces the chance of human error
- Reduces checking time
- Can reduce the amount of memory STATA needs to run a command
- Increases the efficiency of a do file
Loops. Example Predictors of a prolonged HIV inpatient stay. Factors include: Ethnicity (categorical) Clinical stage (3 indicator variables) Age (continuous) Immigration status (binary)
Loops. Example 1: Forval
forval num = 1/3 {
    tab clin_stage`num' pro_stay
}
Same as:
tab clin_stage1 pro_stay
tab clin_stage2 pro_stay
tab clin_stage3 pro_stay
num is the lname (an arbitrary name for the loop counter); 1/3 is the range.
Loops. Example 1: Foreach
foreach factor of varlist ethnicity immigration {
    recode `factor' 99=.
    tab `factor' pro_stay, row chi2 exact
}
Same as:
recode ethnicity 99=.
tab ethnicity pro_stay, row chi2 exact
recode immigration 99=.
tab immigration pro_stay, row chi2 exact
factor is the lname (an arbitrary name for the current element of the list).
List types: in (general list), of varlist, of numlist, of newlist, of local lmacname, of global gmacname
Loops. Example 2: Foreach
GPRD example: save all text files in .dta format so that they can be appended together.
foreach x in clinical1 clinical2 clinical3 {
    insheet using `x'.txt, clear
    save `x', replace
}
Product: clinical1.dta, clinical2.dta and clinical3.dta
Loops. Foreach-other examples
foreach x in …. {                General list, e.g. filenames
foreach x of numlist …. {        The same as using forval
foreach x of newlist …. {        For new variable names not yet in STATA
foreach x of local macroname {   Runs the loop through the list stored in a macro
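A short sketch of three of these list types in use. It assumes Stata's shipped auto dataset; the macro name vars and the new variable names flag1 and flag2 are illustrative, not from the talk.

```stata
* Assumption: uses Stata's shipped example dataset, not the talk's data
sysuse auto, clear

* of local: loop through the list stored in a macro
local vars price mpg weight
foreach v of local vars {
    summarize `v', meanonly
    display "`v': mean = " r(mean)
}

* of numlist: loop through numbers, including ranges such as 1/3
foreach n of numlist 1/3 10 {
    display `n'
}

* of newlist: loop through names of variables that do not exist yet
foreach new of newlist flag1 flag2 {
    generate `new' = 0
}
```

Note that of local takes the macroname without quotes; writing the macro in single quotes after of is a syntax error.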
MACROS WITH LOOPS
Macros and loops. Why use together?
- Macros and loops advantage: can use the same do file on multiple data files.
- Loops advantage: shorter, less error-prone do file.
- Additional loops advantage: can get around commands that do not allow by() as a sub-option.
Macros and Loops -Example 1: using Macros Example 1
levelsof gender, local(levels) clean
display "`levels'"
. female male
foreach x of local levels {
    tab1 smoking bmi if gender=="`x'"
}
Same as:
tab1 smoking bmi if gender=="female"
tab1 smoking bmi if gender=="male"
Macros and Loops Example 2
HAVE            WANT
HPVtypes        type9   type23   type36
9 23 & 36       1       1        1
36              0       0        1
23 & 36         0       1        1
23              0       1        0
9 & 23          1       1        0
Macros and Loops Example 2 -using Macros Example 3
1. Find the greatest number of words in the description
gen numbword=wordcount(description)
summarize numbword
    Variable |  Obs   Mean   Std. Dev.   Min   Max
    numbword |    5                        1     4
(Mean and Std. Dev. are not shown on the slide.)
local k=r(max)
di "`k'"
. 4
Greatest number of words in the description is 4
2. Generate variables for each word
forval i=1/`k' {
    gen word`i'=word(description, `i')
}
Macros and Loops Example 2
HPVtypes        word1   word2   word3   word4
9 23 & 36       9       23      &       36
36              36
23 & 36         23      &       36
23              23
9 & 23          9       &       23
Macros and Loops using Macros- Example 4 cont…
3. Generate variable for each type
forval i=1/36 {
    * creates variables type1-type36, all containing zero
    gen type`i'=0
    forval j=1/`k' {
        * gives 1s to type`i' where a word equals that type; words that are "&" are ignored
        * e.g. if type 36 is present in the original variable HPVtypes, 0 changes to 1 in type36
        replace type`i'=1 if real(word`j')==`i'
    }
    * stores the maximum of type`i' in its own macro, so that k (the number of words) is not overwritten
    sum type`i', meanonly
    local m=r(max)
    * drops those variables where the maximum is zero, i.e. types that are not in the original variable HPVtypes
    if `m'==0 {
        drop type`i'
    }
}
Macros and Loops Example 2-Output
HPVtypes        type9   type23   type36
9 23 & 36       1       1        1
36              0       0        1
23 & 36         0       1        1
23              0       1        0
9 & 23          1       1        0
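The three steps can be run end to end on a toy dataset. This is a sketch under assumptions: the five rows typed in below are one reading of the HPVtypes column on the slides, and the string variable is called HPVtypes throughout rather than description.

```stata
* Assumption: five rows as read from the slides
clear
input str12 HPVtypes
"9 23 & 36"
"36"
"23 & 36"
"23"
"9 & 23"
end

* Step 1: greatest number of words in any row
generate numword = wordcount(HPVtypes)
summarize numword, meanonly
local k = r(max)

* Step 2: one variable per word
forvalues i = 1/`k' {
    generate word`i' = word(HPVtypes, `i')
}

* Step 3: one 0/1 variable per type, keeping only types that occur
forvalues i = 1/36 {
    generate type`i' = 0
    forvalues j = 1/`k' {
        * real("&") is missing, so "&" never matches a type number
        quietly replace type`i' = 1 if real(word`j') == `i'
    }
    summarize type`i', meanonly
    if r(max) == 0 {
        drop type`i'
    }
}

list HPVtypes type*
```

Because nothing in the loop names types 9, 23 or 36 explicitly, the same code should pick up any other type numbers between 1 and 36 that appear if the data change.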
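Macros and Loops Example 2 -updated dataset
[Table: the HPVtypes data as above, updated with entries that include types 21 and 16, alongside the type indicator variables. The cell values are not recoverable from the slide.]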
Macros and Loops Example 2 Problem: Want to append all files in a directory that begin with the word clinical
REFRESH Macros- Example 5 -Store names of files in a directory
Text files received from GPRD: clinical1.txt, clinical2.txt and clinical3.txt, stored in directory C:\GPRD files
local mylist: dir "C:\GPRD files\" files "clinical*.txt"
di `"`mylist'"'
. "clinical1.txt" "clinical2.txt" "clinical3.txt"
REFRESH Loops-Foreach example 2
GPRD example: save all text files in .dta format so that they can be appended together
1.
foreach x in clinical1 clinical2 clinical3 {
    insheet using `x'.txt, clear
    save `x', replace
}
Product: clinical1.dta, clinical2.dta and clinical3.dta
Macros and Loops example 2
cd "C:\GPRD files\"
local mylist: dir "C:\GPRD files\" files "clinical*.txt"
* saves every text file with the name clinical at the beginning as a .dta file
foreach x of local mylist {
    * `x' is e.g. clinical1.txt; stub drops the .txt extension
    local stub = subinstr("`x'", ".txt", "", 1)
    insheet using "`x'", clear
    save "`stub'", replace
}
* drops all observations from the dataset left in memory (clinical3.dta)
gen row_no=_n
sum row_no
local k=r(max)
drop in 1/`k'
* appends clinical1.dta, clinical2.dta, clinical3.dta together and saves as clinical_all.dta
foreach x of local mylist {
    local stub = subinstr("`x'", ".txt", "", 1)
    append using "`stub'"
}
save clinical_all, replace
Macros and Loops Example 2 Result A file called clinical_all which contains all text files with file name starting with “clinical” from directory “C:\GPRD files\”.
Go away today with…
- An understanding of what macros and loops are and the principles behind using them.
- Understand how loops and macros can solve problems in a number of different contexts.
- Be able to apply macros and loops to your work.
And Hopefully…. BE PERSUADED THAT USING LOOPS AND MACROS IN DO FILES IS REQUIRED FOR YOU TO WORK MORE EFFICIENTLY IN THE FUTURE AND BE MORE CONFIDENT THAT YOUR RESULTS ARE ALL CORRECT.
______________ Thank you