
1 Error check in data. Hein Stigum, Mar-16. Presentation, data and programs at: http://folk.uio.no/heins/

2 Example data
HUMIS
– Birth cohort, 5 counties in Norway
– N=475 mother-child pairs
– Repeated questionnaires
Purpose
– Outcome: Growth after birth
– Exposure: Contaminants in mother's milk

3 Agenda
Potential problems
– String variables, Missing, …
Univariate
Bivariate
Multivariable
Individual growth

4 Potential problems

5 String variables: string to numeric
    encode KJONN if KJONN!=" ", generate(sex3)
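A minimal sketch of the same conversion on made-up data, to show what encode produces and how the result might be checked. The values "Gutt" and "Jente" are hypothetical; the slide does not show the contents of KJONN.

```stata
* Toy data; the sex labels below are hypothetical, not taken from HUMIS
clear
input str5 KJONN
"Gutt"
"Jente"
" "
"Jente"
end

* Same command as on the slide: blank entries are skipped and stay missing
encode KJONN if KJONN!=" ", generate(sex3)

* Check the result: labelled values first, then the underlying numeric codes
tabulate sex3, missing
tabulate sex3, missing nolabel

* The blank string must have become a missing value in the new variable
assert missing(sex3) if KJONN==" "
```

encode numbers the categories in alphabetical order and attaches the original strings as value labels, so the tabulation with nolabel shows which code each string received.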

6 Missing
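The slide shows no commands, so the following is only a sketch of one way missing values might be checked, on made-up data. The variable names are borrowed from later slides; the code 999 for "not recorded" is a hypothetical convention, not something stated in the presentation.

```stata
* Toy data; 999 as a placeholder for "not recorded" is an assumption
clear
input id weight1 mHeight
1 3500 168
2 999  172
3 4100 .
4 3300 999
end

* Turn the placeholder code into Stata's system missing value
mvdecode weight1 mHeight, mv(999)

* Count missing values per variable
misstable summarize weight1 mHeight

* Pitfall: missing counts as larger than any number in comparisons
count if weight1>4000                       // includes the missing value
count if weight1>4000 & !missing(weight1)   // excludes it
```

The last two counts differ because Stata treats a missing value as larger than any number, which matters for conditions such as BMI1>35 in the later plots.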

7 Univariate outliers

8 Commands for previous plot
    local i=1
    foreach var of varlist age1 weight1 fHCB BMI1 mHeight mWeight {
        graph hbox `var', marker(1, mlabel(id) msymbol(i) mlabpos(0) mlabangle(-90)) ///
            name(plt`i', replace)
        local ++i
    }
    graph combine plt1 plt2 plt3 plt4 plt5 plt6, col(2)

9 Bivariate outliers

10 Commands for previous plot
    twoway (scatter mWeight mHeight) ///
           (scatter mWeight mHeight if BMI1>35 | BMI1<16, mcol(red)) ///
           (qfit mWeight mHeight) ///
           (qfit mWeight mHeight if mHeight<185) ///
           , legend(off) text(110 195 "BMI>35", col(red)) ///
           ytitle("Mother's weight") xtitle("Mother's height")

11 Multivariable outliers: weight

12 Commands for previous plot
    gen agesq=age^2
    gen ageqb=age^3
    regress weight age agesq ageqb if age>=0 & age<1000
    capture: drop xb res
    predict xb, xb      /* predicted value */
    predict res, res    /* residuals */
    tw (scatter weight age) (scatter weight age if abs(res)>4000, mcol(red)) ///
       (line xb age, sort lcol(red)) if age>=0 & age<1000, legend(off)

13 Plot of individual growth patterns: weight versus age

14-29 Weight by age, pages 1 to 16 (plots of individual growth patterns)

30 Commands for previous plots
    * Individual growth patterns. Note: 16 pages of 30 plots each
    * Repeated measurements, long format, age nested in id
    sort id age                   /* sort by id-number and age */
    global d=30                   /* 30 plots per page */
    forvalues i=1(1)16 {          /* 16 pages*30 plots=480 subjects */
        local j=(`i'-1)*$d+1      /* plot subjects in id-interval: j<=id<=k */
        local k=`i'*$d
        twoway (line weight age, connect(ascending)) if id>=`j' & id<=`k' ///
            , by(id, compact title("Weight by age, `i'") note("")) ///
            ylabel(0(5000)15000) xlabel(0(200)800)
        graph export "H:\Projects\HUMIS\Weight gain\plt`i'.emf", replace   /* Enhanced Metafile Format */
    }                             /* end of loop */
    * Make a new Photo album in PowerPoint and add all plots. This gives one plot per page at maximum size.

31 After new data merge: plot of individual growth patterns, weight versus age


48 Individual plots in large datasets?
Scan 1 page (=30 curves) in 5 sec
– Hours used = 5N/(30*60*60)
Scan all
– If N=50 000, need 2.3 hours
May instead scan curves of subjects with medium to large residuals
– Residual>1000 finds
  190 of the 470 children = 40%
  12 of the 15 deviant growth patterns = 80%
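The scan-time formula and the residual screen can be sketched as follows. This is an illustration under stated assumptions, not the author's program: the data are simulated, the regression is a plain linear fit rather than the cubic from slide 12, and the variable maxres and the planted error in subject 7 are made up for the example.

```stata
* Scan time in hours: N/30 pages at 5 seconds each, 3600 seconds per hour
local N = 50000
display "Hours used: " %4.1f 5*`N'/(30*60*60)

* Simulated long-format data (hypothetical): 10 subjects, 6 visits each
clear
set seed 1
set obs 60
gen id = ceil(_n/6)
bysort id: gen age = 100*_n
gen weight = 3500 + 10*age + rnormal(0,300)
replace weight = weight + 3000 if id==7 & age==300   // planted error

* Residuals from a simple fit (the slides use a cubic in age)
regress weight age
predict res, res

* Largest absolute residual per subject; maxres is a hypothetical name
egen maxres = max(abs(res)), by(id)

* Plot only the subjects that exceed the threshold from the slide
twoway (line weight age, connect(ascending)) if maxres>1000, by(id, compact)
```

Screening on the subject-level maximum keeps the whole curve of a flagged child in the plot, so the deviant point can be judged against that child's other measurements.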

49 Summing up
Graph, outliers
– Uni: Boxplots
– Bi: Scatterplots
– Multi: Scatterplots + residuals
– Individual growth
Merge errors are not rare!
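Since merge errors are the closing point, here is a hedged sketch of checks that might be run around a merge. The presentation does not show its merge commands; the two toy files, the tempfile name mothers and the variable choices below are all assumptions made for illustration.

```stata
* Toy "using" file: one row per mother (hypothetical values)
clear
input id mHeight
1 168
2 172
3 165
end
isid id                      // stops with an error if id is not unique
tempfile mothers
save `mothers'

* Toy "master" file: one row per child (hypothetical values)
clear
input id weight1
1 3500
2 4100
4 3300
end
isid id

* One-to-one merge on id, then inspect how the records matched
merge 1:1 id using `mothers'
tabulate _merge              // 1=master only, 2=using only, 3=matched
list id weight1 mHeight if _merge!=3
```

Unmatched records (here id 3 and id 4) are the first place to look, and replotting the individual growth curves after the merge, as on slide 31, catches records that matched on the wrong id.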

