AARHUS UNIVERSITET Faculty of Agricultural Sciences Literate programming with SAS - and other languages Søren Højsgaard Faculty of Agricultural.


1 AARHUS UNIVERSITET, Faculty of Agricultural Sciences. Literate programming with SAS - and other languages. Søren Højsgaard, Faculty of Agricultural Sciences, Aarhus University, Denmark. SASforum, May 2009, Copenhagen

2 Take-home message  Literate programming: Combining text, code and results in one document  Supports text formats:  LaTeX / OpenOffice (OpenDocument Text)  In combination with the ’engines’  SAS, R, S-plus, Maple, Stata, …  Ensures reproducibility of analysis  Great help in ”recalling what I did 2 months ago”  StatWeave does all this – and is free…  This talk: Focus on StatWeave with OpenOffice and SAS/R …

3 Overview – Combining code, documentation and results Source document  Writing  SAS statements  More writing  R statements  Even more writing  More SAS statements  More writing… Final document  Writing  SAS statements  SAS output  SAS graphics  More writing  R statements  R output  Even more writing  SAS statements  SAS output  More writing…

4 Hello StatWeave World…

5 What is literate programming?  Knuth (1979) coined the term literate programming:  Create software as works of literature:  Embed source code into descriptive text (rather than the opposite, which is common practice)  Software should follow the flow of thoughts and logic  Should be designed to be readable by humans (and not only by compilers / programs).  Very useful idea in statistics…

6 Why literate programming?  Reproducible statistical analysis  Research, consulting  Document exactly what has been done  Possible to re-run if data change  Manuals, course notes etc.  Shown output guaranteed to be result of shown code

7 Some systems for literate programming  Comments inside code  WEB (Knuth 1979) and friends  Sweave (Leisch 2002)  R code in LaTeX documents  odfWeave (Kuhn and Coulter 2007)  R code in OpenOffice documents  SASweave (Lenth and Højsgaard 2007)  SAS / R code in LaTeX documents  StatWeave  SAS / R / Maple / S-plus / Stata … code in LaTeX and OpenOffice documents

8 StatWeave  StatWeave created by Russ Lenth, University of Iowa, USA  Available: http://www.cs.uiowa.edu/~rlenth/StatWeave/  StatWeave is still under development, but becoming ”mature” and stable.  StatWeave design goals  Support many languages  R, S-plus, SAS, Stata, Maple, …  Support different word processing systems, currently  LaTeX  OpenDocument Text (ODT) www.openoffice.org  Portability: Usable on all platforms (written in Java)  Extendible:  Add other languages

9 Under the hood of StatWeave  Source file is regular text document but with code chunks added (with special tags)  Two basic operations  Weaving: Process source file into single document with code listings, output listings, graphs…  Tangling: Extract code from source file to run later  Weaving is useful for reproducible statistical analysis
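The two operations on slide 9 can be illustrated with a toy sketch in Python. This is not StatWeave's implementation or its tag syntax: the `<<engine>>` / `@` chunk markers (loosely modelled on Sweave-style notation), the function names, and the `fake_run` stand-in for a real engine are all assumptions made purely for illustration.

```python
import re

# Hypothetical chunk syntax for this sketch only:
# "<<engine>>" opens a chunk and a line containing "@" closes it.
CHUNK = re.compile(r"<<(\w+)>>\n(.*?)\n@\n?", re.DOTALL)

def tangle(source):
    """Tangling: extract only the code chunks, as (engine, code) pairs."""
    return [(m.group(1), m.group(2)) for m in CHUNK.finditer(source)]

def weave(source, run):
    """Weaving: replace each chunk by its code listing plus run(engine, code)."""
    def render(m):
        engine, code = m.group(1), m.group(2)
        return f"[{engine} code]\n{code}\n[{engine} output]\n{run(engine, code)}\n"
    return CHUNK.sub(render, source)

def fake_run(engine, code):
    # Stand-in for sending the code to a real engine (assumption: no engine is called).
    return f"(result of running {len(code.splitlines())} line(s))"

source = "Some writing.\n<<R>>\n1 + 1\n@\nMore writing.\n"

print(tangle(source))
print(weave(source, fake_run))
```

In this sketch the surrounding text passes through untouched, which mirrors the slide's point that the source file is an ordinary text document with code chunks added.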

10 Running StatWeave  Command-line interface: statweave SAS-HelloWorld-swv.odt statweave --tangle SAS-HelloWorld-swv.odt statweave --keepall SAS-HelloWorld-swv.odt  Graphical User Interface:  Generally, source xxx-swv.odt becomes output xxx.odt
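The naming rule at the end of slide 10 (source xxx-swv.odt becomes output xxx.odt) can be written down as a small Python helper. The function name `woven_name` is hypothetical, and since the slide says "generally", this should be read as a sketch of the default convention rather than a guarantee of what StatWeave does in every case.

```python
from pathlib import Path

def woven_name(source):
    """Map 'xxx-swv.odt' to 'xxx.odt', following the naming rule on the slide."""
    p = Path(source)
    if not p.stem.endswith("-swv"):
        raise ValueError(f"expected a file name ending in -swv{p.suffix}: {source}")
    # Drop the '-swv' marker from the stem and keep the original extension.
    return str(p.with_name(p.stem[: -len("-swv")] + p.suffix))

print(woven_name("SAS-HelloWorld-swv.odt"))
```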

11 Chicken weight data  Set global options (for SAS code)  Inline evaluation of expressions

12 … chicken weight data

13  Output can be saved for later use  - and display

14 Code reuse and argument substitution  Save code chunks for later execution  Pass arguments to code chunks  Simplest case: Not unlike a macro…

15 …code reuse and argument substitution  Customize display and output (tables) by reusable code chunk

16 …code reuse and argument substitution
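The idea of saving a code chunk and recalling it later with arguments, "not unlike a macro", can be sketched in Python with plain text substitution. The `$name` placeholder syntax, the helper names `save_chunk` and `recall_chunk`, and the data set name `chicks` are assumptions for illustration; StatWeave's own notation for saved chunks and arguments is not shown in the transcript.

```python
from string import Template

# Saved chunks, keyed by name (hypothetical storage for this sketch).
saved_chunks = {}

def save_chunk(name, code):
    """Store a code chunk whose $placeholders are filled in later."""
    saved_chunks[name] = Template(code)

def recall_chunk(name, **args):
    """Return the saved chunk with its placeholders replaced by the arguments."""
    return saved_chunks[name].substitute(**args)

save_chunk("means", "proc means data=$data;\n  var $var;\nrun;")

# The same chunk, reused with different arguments.
print(recall_chunk("means", data="chicks", var="weight"))
print(recall_chunk("means", data="chicks", var="time"))
```

`Template.substitute` raises `KeyError` when an argument is missing, which is a reasonable behaviour for a sketch like this: a chunk with an unfilled placeholder would otherwise be sent to the engine as broken code.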

17 Multi-language example: SAS, R and DOS together  Can use different engines in the same source file  Use SAS when appropriate; use R when appropriate; use Maple when appropriate…  Weaving:  SAS/R/XX chunks assembled into separate code files.  Code files are processed in order of first appearance in the source file

18 …Multi-language example: SAS, R and DOS together
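The weaving rule from slide 17 (chunks are assembled into one code file per engine, and the files are processed in order of first appearance) can be sketched as follows. The function name `assemble` and the example chunks are hypothetical, and the sketch only models the grouping and ordering; it does not run any engine.

```python
def assemble(chunks):
    """Group (engine, code) chunks into one code file per engine.

    Python dicts keep insertion order, so the engines come out in order
    of their first appearance in the source.
    """
    files = {}
    for engine, code in chunks:
        files.setdefault(engine, []).append(code)
    return {engine: "\n".join(parts) for engine, parts in files.items()}

chunks = [
    ("SAS", "data a; x = 1; run;"),
    ("R", "x <- 1"),
    ("SAS", "proc print data=a; run;"),
]

for engine, code in assemble(chunks).items():
    print(f"--- {engine} ---\n{code}")
```

Note that the second SAS chunk ends up in the SAS file, which is processed before the R file even though the R chunk appears earlier in the source. That ordering is what makes cross-engine dependencies delicate.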

23  Synchronization issue: SAS chunk depends on data from R chunk which depends on data from SAS chunk….  Solution: The restart option will restart the engines
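One way to picture why a restart resolves the synchronization issue is to treat it as a cut point: everything before the restart is run engine by engine, and only then does the next batch begin. The sketch below models that reading in Python. How StatWeave implements its restart option internally is not described in the transcript, so the batching logic, the `plan_runs` name, and the boolean restart flag are all assumptions.

```python
def plan_runs(chunks):
    """Split (engine, code, restart) chunks into batches at restart points.

    Within a batch, code is grouped per engine in order of first appearance.
    Assumed semantics for illustration only.
    """
    batches = [[]]
    for engine, code, restart in chunks:
        if restart and batches[-1]:
            batches.append([])  # a restart closes the current batch
        batches[-1].append((engine, code))

    plan = []
    for batch in batches:
        files = {}
        for engine, code in batch:
            files.setdefault(engine, []).append(code)
        plan.append([(engine, "\n".join(parts)) for engine, parts in files.items()])
    return plan

chunks = [
    ("SAS", "/* write data for R */", False),
    ("R", "# read SAS data, write result", False),
    ("SAS", "/* read result from R */", True),  # depends on the R chunk above
]

for i, batch in enumerate(plan_runs(chunks), start=1):
    print(f"batch {i}: {[engine for engine, _ in batch]}")
```

Without the restart flag on the third chunk, both SAS chunks would land in a single SAS file that runs before R, so the last chunk would look for a result that does not exist yet.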

24 Code chunks are processed as a whole  Code chunks are processed as a ”unit”, so in general one cannot split a call to proc xxxx over several chunks:  Thus the following is illegal

25 … one exception in SAS: IML

26 Odds and ends – Maple  Differentiate y= sin(x) x x x  Output is ugly, but it reads:

27 Odds and ends – calling the shell  Want to list all StatWeave / OpenOffice source files: *-swv.odt
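The slide's example issues a shell command from within the source document; the command itself is not preserved in the transcript. As a rough equivalent, the same listing can be produced in Python. The function name `list_sources` is hypothetical, and the pattern `*-swv.odt` is the one given on the slide.

```python
from pathlib import Path

def list_sources(directory="."):
    """Return the names of all *-swv.odt files in a directory, sorted."""
    return sorted(p.name for p in Path(directory).glob("*-swv.odt"))

# List the StatWeave / OpenOffice source files in the current directory.
print(list_sources())
```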

28 Summary  Reproducible statistical analyses  Integrate text, code and results in one document  Several text formats  Several languages  This talk (and the examples) are available at http://genetics.agrsci.dk/~sorenh/misc/  All credit is due to Russ Lenth, the creator of StatWeave. Thanks!!!!
