Published by Bertina Wiggins. Modified over 6 years ago.
1
Moving from SAS to Stata: Making customized tables in RTF using -rtfutil- and other packages
Inge Christoffer Olsen, PhD
Diakonhjemmet Hospital, Norway
Thank you for your kind introduction, and the opportunity to give this talk. The title of the talk is Clinical database management: From raw data through study tabulations to analysis datasets. Say a little about background: CRO, academia, SAS, Stata.
2
Background
Used SAS when working for a Norwegian CRO
Forced into SPSS when moving to Diakonhjemmet
Managed to force Stata onto the researchers at Diakonhjemmet
I will begin with a quote from the famous physicist Max Planck: "An experiment is a question which science poses to Nature and a measurement is the recording of Nature's answer." Meaning we cannot understand Nature without measurements. We need to take care of our measurements!
3
Background I love Stata! Programming is so efficient, clear and easy.
4
Background But… Programming is so efficient, clear and easy.
5
Stata output How do you come from here:
6
Final table
…to here?
Diagnosis               Male   Female   Total
Rheumatoid arthritis      21       57      78
Spondyloarthritis         76       15      91
Psoriatic arthritis       15       15      30
Ulcerative colitis        61       32      93
Crohn's disease           92       63     155
Psoriasis                 30        5      35
Total                    295      187     482
7
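Cut and paste 1: Text
tabulate diagnosis sex if visit_no==
                     |          Sex
           Diagnosis |      Male     Female |     Total
---------------------+----------------------+----------
Rheumatoid arthritis |        21         57 |        78
   Spondyloarthritis |        76         15 |        91
 Psoriatic arthritis |        15         15 |        30
  Ulcerative colitis |        61         32 |        93
     Crohn's disease |        92         63 |       155
           Psoriasis |        30          5 |        35
---------------------+----------------------+----------
               Total |       295        187 |       482
* Possible to remedy by using fixed-width fonts, but still only text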
8
Cut and paste 2: HTML
Randomised and Sex
Yes
Diagnosis               Male   Female
Rheumatoid arthritis      21       57
Spondyloarthritis         76       15
Psoriatic arthritis       15       15
Ulcerative colitis        61       32
Crohn's disease           92       63
Psoriasis                 30        5
* Sometimes OK, but unreliable
9
Cut and paste 3: Picture Nice, but useless
10
Disappointed
The only major feature where Stata is inferior to SAS or SPSS
Forced to enter results manually
Error prone!
11
Reports
In SAS it was relatively easy to report tables from statistical analyses in RTF format using PROC REPORT
Characteristic   Statistic  Abatacept   Adalimumab  Anakinra    Certol. Pegol  Etanercept  Golimumab   Infliximab  Rituximab   Tocilizumab
                            N = 6       N = 133     N = 2       N = 190        N = 203     N = 327     N = 46      N = 42      N = 30
Age (Years) (a)  n          6           133         2           190            203         327         46          42          30
                 Mean       47.77       42.82       46.40       51.48          45.47       44.12       46.05       55.27       53.73
                 Std. Dev.  15.47       13.57       27.44       14.72          13.62       13.14       13.34       12.21       12.77
                 Median     51.60       41.20       46.40       53.75          45.00       42.90       43.60       54.95       54.30
                 Min/Max    26.5/64.1   19.0/74.7   27.0/65.8   18.7/82.5      19.3/76.3   17.7/80.5   25.5/79.0   30.7/78.0   29.8/73.7
Sex, n(%)        Male       1 (16.7)    64 (48.1)   0 (0.0)     44 (23.2)      80 (39.4)   151 (46.2)  20 (43.5)   6 (14.3)    4 (13.3)
                 Female     5 (83.3)    69 (51.9)   2 (100)     146 (76.8)     123 (60.6)  176 (53.8)  26 (56.5)   36 (85.7)   26 (86.7)
No. of prev. Biologics (b) 77 85 62 114 29 32 27 3.7 1.5 4.0 2.0 1.4 2.1 2.2 2.3 0.8 0.7 1.2 1.0 1.3 1.1 3.5 3/5 1/4 1/6 1/8 1/7 1/5
No. of prev. DMARDs 129 172 181 280 40 36 28 2.7 3.0 1.9 2.5 1.6 0.9 2.8 0/4 0/6 0/5 0/3
Biologics Naive, n(%) Yes 52 (40.3) 87 (50.6) 119 (65.7) 166 (59.3) 11 (27.5) 4 (11.1) 1 (3.6) No 6 (100) 77 (59.7) 85 (49.4) 62 (34.3) 114 (40.7) 29 (72.5) 32 (88.9) 27 (96.4)
12
Reports
Most of the work a statistician or researcher does in Stata ends up in a report or article
Most reports and articles are presented in a document, usually Word (or PDF using LaTeX)
It is unsatisfying to rely on manually entering tables from Stata output
Natively it is possible to export raw results to Excel, but this still requires a lot of manual work
I hate manual work!
There must be a way to produce Word/RTF tables
13
Solution
Fortunately there are many user-written programs for Stata (available through SSC)
I will present how RTF tables can be produced using the package -rtfutil-, together with results-compiling packages such as -(x)contract-, -(x)collapse- and -parmest-
14
Aim
Typical Table 2 in an article
Variable                Treatment 1   Treatment 2   Difference (95% CI)
Contvar 1, mean (SD)    1.5 (5.54)    0.7 (3.94)    1.29 ( )
Contvar 2, mean (SD)    1.6 (5.67)    0.7 (4.41)    1.09 ( )
Contvar 3, mean (SD)    0.3 (1.01)    -0.2 (1.38)   -0.02 ( )
Catvar 1, n(%)          10 (23.3)     7 (16.7)      6.6 ( )
Contvar 4, mean (SD)    0.1 (0.59)    -0.2 (0.67)   -0.01 ( )
Contvar 5, mean (SD)    0.1 (0.23)    0.0 (0.25)    0.08 ( )
Catvar 2, n(%)          29 (87.9)     39 (92.9)     -5 ( )
Contvar 6, mean (SD)    0.1 (1.28)    -0.2 (1.68)   0.14 ( )
15
Step 1
Use -xcollapse- by treatment to get the mean and SD
The resulting dataset 1 has two lines, one per treatment, each with the mean and SD
Restore the original dataset and run a regression analysis (e.g. with -mixed-)
Use -margins- to get the treatment difference
Use -parmest- to store the result as dataset 2
Append the second dataset to the first
Add a variable indicating the endpoint
Store in an endpoint-specific temporary dataset
16
Step 2
Repeat step 1 for each continuous endpoint
Combine all endpoint-specific datasets
var       varlab                Treatment  estimate  max95  min95  mean  sd
contvar1  Contvar 1, mean (SD)  1
                                2
                                3
contvar2  Contvar 2, mean (SD)  1
                                2
                                3
contvar3  Contvar 3, mean (SD)  1
                                2
                                3
contvar4  Contvar 4, mean (SD)  1
                                2
                                3
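Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the speaker's code: it runs on the auto dataset shipped with Stata, treats foreign as the treatment and mpg, weight and length as the endpoints (all assumptions), and swaps in the built-in -collapse- and -regress- for -xcollapse-, -mixed-, -margins- and -parmest- so that it runs without installing anything.

```stata
* Steps 1 and 2 sketched on the auto data shipped with Stata.
* Assumptions: foreign plays the role of treatment; mpg, weight and
* length play the role of the continuous endpoints.
tempfile summ results
local first = 1
foreach v in mpg weight length {
    sysuse auto, clear

    * Dataset 1: mean and SD per treatment (-collapse- for -xcollapse-)
    preserve
    collapse (mean) mean=`v' (sd) sd=`v', by(foreign)
    generate byte treatment = foreign + 1
    drop foreign
    save `summ', replace
    restore

    * Dataset 2: treatment difference with 95% CI
    * (-regress- standing in for -mixed-, -margins- and -parmest-)
    regress `v' i.foreign
    local b  = _b[1.foreign]
    local se = _se[1.foreign]
    local t  = invttail(e(df_r), 0.025)
    clear
    set obs 1
    generate byte treatment = 3
    generate estimate = `b'
    generate min95 = `b' - `t' * `se'
    generate max95 = `b' + `t' * `se'

    * Append dataset 1, tag the endpoint, add to the combined dataset
    append using `summ'
    generate var = "`v'"
    if !`first' {
        append using `results'
    }
    save `results', replace
    local first = 0
}
sort var treatment
list var treatment mean sd estimate min95 max95, sepby(var) noobs
```

The combined dataset has three lines per endpoint: treatments 1 and 2 carry mean and sd, and line 3 carries the difference with its confidence limits, which is the layout shown in the table above.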
17
Step 3
Create a text variable using e.g. -sdecode- to make it RTF-ready
var       varlab                Treatment  estimate  max95  min95  mean  sd  text
contvar1  Contvar 1, mean (SD)  1                                            \qr{1.51 (5.544)}
                                2                                            \qr{0.67 (3.943)}
                                3                                            \qr{1.29 (-0.35 \endash 2.94)}
contvar2  Contvar 2, mean (SD)  1                                            \qr{1.56 (5.667)}
                                2                                            \qr{0.69 (4.412)}
                                3                                            \qr{1.09 (-0.68 \endash 2.86)}
contvar3  Contvar 3, mean (SD)  1                                            \qr{0.25 (1.009)}
                                2                                            \qr{-0.15 (1.376)}
                                3                                            \qr{-0.02 (-0.50 \endash 0.47)}
contvar4  Contvar 4, mean (SD)  1                                            \qr{0.065 (0.5879)}
                                2                                            \qr{ (0.6674)}
                                3                                            \qr{ ( \endash )}
\qr means right align, \endash is an en dash
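One way to build such a text variable is sketched below. It uses the built-in string() function rather than -sdecode-, so it is a stand-in for the approach on the slide, not a reproduction of it. The numbers are the contvar1 values shown above; the display formats are assumptions chosen to match them.

```stata
* Step 3 sketched with built-in string functions in place of -sdecode-.
* The values are those shown for contvar1 on the slide.
clear
input str8 var byte treatment mean sd estimate min95 max95
"contvar1" 1 1.51 5.544 .    .     .
"contvar1" 2 0.67 3.943 .    .     .
"contvar1" 3 .    .     1.29 -0.35 2.94
end

generate str80 text = ""

* Treatment lines: mean (SD), right-aligned with \qr
replace text = "\qr{" + string(mean, "%9.2f") + " (" + ///
    string(sd, "%9.3f") + ")}" if treatment < 3

* Difference line: estimate (lower \endash upper)
replace text = "\qr{" + string(estimate, "%9.2f") + " (" + ///
    string(min95, "%9.2f") + " \endash " + ///
    string(max95, "%9.2f") + ")}" if treatment == 3

list var treatment text, noobs
```

Keeping the RTF control words inside the text variable means the later export step only has to write the strings out, with no further formatting logic.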
18
Step 4
Use -xrewide- (or reshape wide) to get to one line:
xrewide text, i(var varlab) j(treatment)
var       varlab                text1                text2               text3
contvar1  Contvar 1, mean (SD)  \qr{1.51 (5.544)}    \qr{0.67 (3.943)}   \qr{1.29 (-0.35 \endash 2.94)}
contvar2  Contvar 2, mean (SD)  \qr{1.56 (5.667)}    \qr{0.69 (4.412)}   \qr{1.09 (-0.68 \endash 2.86)}
contvar3  Contvar 3, mean (SD)  \qr{0.065 (0.5879)}  \qr{ (0.6674)}      \qr{ ( \endash )}
contvar4  Contvar 4, mean (SD)  \qr{0.23 (0.988)}    \qr{-0.17 (1.337)}  \qr{-0.03 (-0.50 \endash 0.43)}
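For readers who prefer the built-in route, here is a small sketch of the same step with -reshape wide-. The toy data reuse the contvar1 and contvar2 strings from the slide; the mean column is included only to show that variables which vary within var and are not being reshaped have to be dropped first, since -reshape wide- otherwise stops with an error.

```stata
* Step 4 sketched with the built-in -reshape wide-.
clear
input str8 var byte treatment mean str40 text
"contvar1" 1 1.51 "\qr{1.51 (5.544)}"
"contvar1" 2 0.67 "\qr{0.67 (3.943)}"
"contvar1" 3 .    "\qr{1.29 (-0.35 \endash 2.94)}"
"contvar2" 1 1.56 "\qr{1.56 (5.667)}"
"contvar2" 2 0.69 "\qr{0.69 (4.412)}"
"contvar2" 3 .    "\qr{1.09 (-0.68 \endash 2.86)}"
end

* Keep only the identifier, the j variable and the text to spread out
keep var treatment text

* One line per endpoint, with text1, text2 and text3 as columns
reshape wide text, i(var) j(treatment)
list var text1 text2 text3, noobs
```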
19
Step 5
Repeat steps 1 to 4 for categorical and/or time-to-event variables and compile everything into a final results dataset
Sort the dataset in the order in which you want to present the data
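For a categorical endpoint, the n (%) cell can be compiled along these lines. This is only a sketch under stated assumptions: it uses the built-in -contract- rather than -xcontract-, the auto dataset with foreign as the treatment, and a made-up binary endpoint (rep78 of 4 or more) named catvar1 purely for illustration.

```stata
* Step 5 sketched for one categorical endpoint with built-in -contract-.
* Assumptions: foreign is the treatment; catvar1 is a hypothetical
* binary endpoint derived from rep78.
sysuse auto, clear
generate byte catvar1 = rep78 >= 4 if !missing(rep78)
drop if missing(catvar1)

* One line per treatment and category, with the count in n
contract foreign catvar1, zero freq(n)

* Percentage within treatment
bysort foreign: egen denom = total(n)
generate pct = 100 * n / denom

* Keep the "yes" category and build the RTF-ready text
keep if catvar1 == 1
generate byte treatment = foreign + 1
generate str40 text = "\qr{" + string(n) + " (" + string(pct, "%4.1f") + ")}"
list treatment n denom pct text, noobs
```

The result has the same var-by-treatment shape as the continuous endpoints, so it can be appended to them before the final sort.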
20
Step 6
Use the -rtfutil- package to write the results dataset to RTF
* Open the RTF file for writing
tempname handle2
rtfopen `handle2' using "Output/Table 2.rtf", template(minimal) replace paper(a4land) landscape
* Load the results dataset and write the table title
use work/total, clear
file write `handle2' "{\pard\b Typical Table 2 in an article \par}" _n
* Get row-begin, delimiter and row-end strings in locals b, d and e (cwidths() takes the column widths)
rtfrstyle varlab text1 text2 text3, cwidths( ) local(b d e)
* Write the header line and one table row per observation
listtab varlab text1 text2 text3, handle(`handle2') begin("`b'") delim("`d'") end("`e'") ///
    head("`b' \ql{\b Variable} `d' \qr{\b Treatment 1} `d' \qr{\b Treatment 2} `d' \qr{\b Difference (95% CI)} `e'")
* Close the RTF file
rtfclose `handle2'
21
Strengths
Produces ready-to-use tables to be inserted into an article or report
No manual work!
Possible to include .eps figures directly in the RTF document
Quick once you have the results datasets
22
Weaknesses
Quite a lot of programming (initially)
Can take a lot of tweaking to get the result exactly as you want
Inclusion of figures is not fully supported in RTF; you need to open the document in Word and include the figure files
23
Final organisation
Raw output from eCRF
Imported into Stata
Tabulation Datasets (TD)
Analysis Datasets
Results dataset
RTF tables
24
Additional tips
The -project- module is fantastic for organizing and maintaining Stata projects
It uses checksums to keep track of which files are unchanged, so only recently changed do-files and the do-files that depend on them are run
25
Acknowledgements
The -rtfutil- module is written by Roger B. Newson
The -project- module is written by Robert Picard
26
The end Thank you!