2014 Nordic and Baltic Stata Users Group Meeting. Working sideways in Stata. Jakob Hjort, DataManager, MPH, Department of Cardiology, Aarhus University Hospital.


1 2014 Nordic and Baltic Stata Users Group Meeting. Working sideways in Stata. Jakob Hjort, DataManager, MPH, Department of Cardiology, Aarhus University Hospital, DK-8200 Aarhus, Denmark

2 The rectangular dataset

3 Statistics The rectangular dataset

4 Statistics The rectangular dataset "It is not the data we want, it's the essence of data" results

5 Datamanagement The rectangular dataset

6 Datamanagement The rectangular dataset

7 Datamanagement Statistics

8 Datamanagement Statistics The rectangular dataset - transpose?

9 The rectangular dataset – subset in matrix using mata?

    use "family.dta", clear
    * Dataset with: fam_name, inc_mother & inc_father
    mata
    st_view(x=0,.,("inc_mother","inc_father"))
    income=colsum(x')'
    st_addvar("long","inc_household")
    st_store(.,"inc_household",income)
    end
    list fam_name inc_mother inc_father inc_household
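A self-contained variant of the Mata approach above can be tried without family.dta. This is only a sketch: the three families and their incomes are invented for illustration, and rowsum(x) is swapped in for colsum(x')' because it gives the same row totals without the double transpose.

```stata
* Hypothetical stand-in for family.dta (names and incomes are made up)
clear
input str10 fam_name long inc_mother long inc_father
"Hansen"  300000 280000
"Jensen"  250000 310000
"Nielsen" 400000 150000
end

mata:
// View onto the two income columns (no copy of the data is made)
st_view(x=., ., ("inc_mother", "inc_father"))
// Row totals; equivalent to colsum(x')' on the slide
income = rowsum(x)
// (void) discards the variable index that st_addvar() returns
(void) st_addvar("long", "inc_household")
st_store(., "inc_household", income)
end

list fam_name inc_mother inc_father inc_household
```

The totals are copied into the ordinary Mata matrix income before the new variable is added, so the result does not depend on the view after the dataset has changed.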

10 Datamanagement: The direct approach

    generate [type] newvar=exp [if] [in]

11 Datamanagement: The direct approach

    generate [type] newvar=exp [if] [in]

Columns: Weight, Height, BMI

Ex.: generate BMI=Weight/Height^2

12 Datamanagement: The direct approach

    egen [type] newvar=fcn(arguments) [if] [in] [,options]

rowtotal, rowmin, rowmax, rowfirst, rowlast, rowmean, rowmedian, rowmiss, rownonmiss, rowpctile, rowsd, concat, anycount, anymatch, anyvalue, count, diff, fill, group, iqr, kurt, max, mdev, mean, median, min, mode, mtr, pc, pctile, rank, sd, seq, skew, std, tag, total

13 Datamanagement: The direct approach

    egen [type] newvar=fcn(arguments) [if] [in] [,options]

(Same function list as on slide 12.)

Columns: IncJan, IncFeb, IncMar, IncApr, IncMay, IncJun, IncJul, …, income

Ex.: egen income=rowtotal(inc*)
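How rowtotal() treats missing values is worth checking before relying on it. The sketch below uses three invented monthly variables and compares egen rowtotal() with plain addition and with the missing option. The variables are listed explicitly because, once income exists, the wildcard inc* would match it as well.

```stata
* Invented data: complete row, one gap, all missing
clear
input incJan incFeb incMar
100 200 300
100   . 300
  .   .   .
end

* rowtotal() counts missing values as zero
egen income = rowtotal(incJan incFeb incMar)

* Plain addition gives missing as soon as one term is missing
generate income_plus = incJan + incFeb + incMar

* With the missing option, a row with no data at all stays missing
egen income_strict = rowtotal(incJan incFeb incMar), missing

list
```

Here income is 600, 400 and 0, income_plus is 600 followed by two missing values, and income_strict is 600, 400 and missing.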

14 Looking under the skirts – just for inspiration

    viewsource _growmin.ado    (the rowmin() function of egen)

    program define _growmin
        version 6, missing
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0
        syntax varlist [if] [in] [, BY(string)]
        if `"`by'"' != "" {
            _egennoby rowmin() `"`by'"'
        }
        tempvar touse
        mark `touse' `if' `in'
        quietly {
            gen `type' `g' =.
            tokenize `varlist'
            while "`1'"!="" {
                replace `g' = cond(`1' < `g',`1',`g')
                mac shift
            }
        }
    end

15 Looking under the skirts – just for inspiration

    viewsource _growmin.ado    (the rowmin() function of egen)

        program define _growmin
            version 6, missing
            gettoken type 0 : 0
            gettoken g 0 : 0
            gettoken eqs 0 : 0
            syntax varlist [if] [in] [, BY(string)]
            if `"`by'"' != "" {
                _egennoby rowmin() `"`by'"'
            }
            tempvar touse
            mark `touse' `if' `in'
            quietly {
    1.          gen `type' `g' =.
    2.          tokenize `varlist'
    3.          while "`1'"!="" {
    4.              replace `g' = cond(`1' < `g',`1',`g')
    5.              mac shift
    6.          }
            }
        end

Steps:
1. Initialize target variable
2. Prepare the variable-list
3. Looping
4. In-the-loop-commands

16 Prepare the variable-list

Steps: 1. Initialize target variable 2. Prepare the variable-list 3. Looping 4. In-the-loop-commands

    local vars incJan incFeb incMar incApr incMay incJun ///
               incJul incAug incSep incOct incNov incDec

Full specification of each and every variable – OK with 12, but what in case of hundreds? The list is stored in `vars'.

    unab vars: inc*
    unab vars: incJan-incDec

Variables can be specified with wildcards. The expanded list is stored in `vars' (unab means unabbreviate; the command name itself, however, cannot be written out in full).

    ds inc*
    ds incJan-incDec

    Output: incJan incFeb incMar incApr incMay incJun incJul incAug incSep incOct incNov incDec

Variables can be specified with wildcards. The list is stored in `r(varlist)'. Nice feature: the expanded list is shown for inspection.
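The two wildcard approaches can be compared on a small example. This is a sketch on an invented one-row dataset with three income variables and an unrelated variable age. Because r(varlist) is overwritten by the next r-class command, the sketch copies it into a local macro right away.

```stata
* Invented dataset: three income variables plus one unrelated variable
clear
set obs 1
foreach m in Jan Feb Mar {
    generate inc`m' = 0
}
generate age = 40

* unab expands the wildcard and stores the result in a local macro
unab vars : inc*
display "`vars'"

* ds lists the matching variables and leaves them in r(varlist)
ds inc*
local fromds `r(varlist)'
display "`fromds'"

* Count the words in the list, e.g. as a sanity check before looping
local n : word count `vars'
display `n'
```

Both macros hold the same three names in dataset order, and age is not picked up by the wildcard.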

17 Looping

"foreach" is the quickest and the most transparent loop command

    foreach lvar in incJan incFeb {
        // do stuff with "`lvar'"
    }

    unab lvar: inc*
    foreach lvar in `lvar' {
        // do stuff with "`lvar'"
    }

    ds inc*
    foreach lvar in `r(varlist)' {
        // do stuff with "`lvar'"
    }

Steps: 1. Initialize target variable 2. Prepare the variable-list 3. Looping 4. In-the-loop-commands

18 Looping

"foreach" is the quickest and the most transparent loop command

(Same three foreach loops as on slide 17.)

Typing the macro quotes:
    Left single-quote  `  = hold Alt + press 0 9 6 on the numeric keypad
    Right single-quote '  = hold Alt + press 0 3 9 on the numeric keypad

Steps: 1. Initialize target variable 2. Prepare the variable-list 3. Looping 4. In-the-loop-commands
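The loops above build the list first and then pass it to foreach ... in. As a possible alternative not shown on the slides, foreach ... of varlist expands the wildcard itself. The sketch below uses two invented variables, and the counter k is only there to show how many variables were visited.

```stata
* Invented data: two income variables, three observations
clear
set obs 3
generate incJan = 10 * _n
generate incFeb = 20 * _n

local k = 0
* "of varlist" expands inc* and checks that the variables exist
foreach lvar of varlist inc* {
    quietly summarize `lvar'
    display "`lvar': mean = " r(mean)
    local ++k
}
display "variables visited: `k'"
```

With foreach ... in the list is taken literally, word by word, so a misspelled name only fails inside the loop. With of varlist the error surfaces before the first iteration.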

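19 In the loop

Variant 1, with cond():

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        replace minimum = cond(`lvar' < minimum,`lvar',minimum)
    }

Variant 2, with the if qualifier:

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        replace minimum = `lvar' if `lvar'<minimum
    }

Variant 3, with the if command:

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        if `lvar'<minimum {
            replace minimum = `lvar'
        }
    }

! The slide carries a warning mark, presumably for variant 3: the if command evaluates its expression once, using the first observation only, and then replaces minimum in every row, so this variant does not compute a row-wise minimum.

Steps: 1. Initialize target variable 2. Prepare the variable-list 3. Looping 4. In-the-loop-commands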
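The difference between the if qualifier and the if command is easiest to see on a tiny example. This sketch uses two invented rows chosen so that the two forms disagree, with egen rowmin() as the reference. The variable names min_qual, min_cmd and min_egen are made up for the illustration.

```stata
* Invented data: row minima are 1 and 0
clear
input incJan incFeb incMar
5 1 9
1 8 0
end

generate min_qual = .
generate min_cmd  = .
foreach lvar of varlist inc* {
    * if qualifier: the condition is checked in every observation
    quietly replace min_qual = `lvar' if `lvar' < min_qual

    * if command: the condition is checked once, in observation 1 only
    if `lvar' < min_cmd {
        quietly replace min_cmd = `lvar'
    }
}

* Reference result from egen
egen min_egen = rowmin(inc*)
list
```

min_qual matches min_egen (1 and 0). min_cmd ends up as 1 and 8: the decision to replace is taken from the first row alone and then applied to all rows.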

20 Some of the Danish participants who might know "the DREAM database" will probably be able to see how these approaches can be useful when working with this fantastic but difficult construction.

21 Thank you very much

