AARHUS UNIVERSITET, Faculty of Agricultural Sciences. Literate programming with multiple languages. Russell V. Lenth, Department of Statistics.


1 AARHUS UNIVERSITET, Faculty of Agricultural Sciences: Literate programming with multiple languages
Russell V. Lenth, Department of Statistics & Actuarial Science, The University of Iowa, USA
Søren Højsgaard, Faculty of Agricultural Sciences, Aarhus University, Denmark
DSC 2009, July 2009, Copenhagen, Denmark

2 Take-home message
- Literate programming: combining text, code and results in one document
- StatWeave does this
- Supports text formats:
  - LaTeX / OpenOffice (OpenDocument Text)
- In combination with one or several of the "engines":
  - SAS, R, S-plus, Maple, Stata, Matlab, shell…
- StatWeave is
  - "Sweave for generalized values of LaTeX and S"
  - Java-based and hence portable
  - A great help in creating reproducible statistical analyses
  - Extensible: add languages

3 Overview – Combining code, documentation and results
Source document:
- Writing
- SAS statements
- More writing
- R statements
- Even more writing
- More SAS statements
- More writing…
Final document:
- Writing
- SAS statements
- SAS output
- SAS graphics
- More writing
- R statements
- R output
- Even more writing
- SAS statements
- SAS output
- More writing…

4 Example: R + LaTeX

5

6

7 Example: SAS + OpenDocument Text

8

9 What is literate programming?
- Term coined by Knuth (1979): create software as works of literature
  - Embed source code into descriptive text (rather than the opposite)
  - Software should follow the flow of thoughts and logic
  - Should be designed to be readable by humans (and not only by compilers / programs)
- Some systems for literate programming (in statistics):
  - Sweave (Leisch 2002): R code in LaTeX documents
  - odfWeave (Kuhn and Coulter 2007): R code in OpenOffice documents
  - SASweave (Lenth and Højsgaard 2007): SAS / R code in LaTeX documents
  - StatWeave: SAS / R / Maple / S-plus / Stata / Matlab / shell… code in LaTeX and OpenOffice documents

10 Why literate programming?
- Reproducible statistical analysis
  - Research, consulting
  - Document exactly what has been done
  - Possible to re-run if data change
- Maintain one document only (at least in principle)
  - Manuals, course notes etc.
  - Shown output is guaranteed to be the result of the shown code

11 StatWeave
- StatWeave was created by Russ Lenth, University of Iowa, USA
- Available at http://www.cs.uiowa.edu/~rlenth/StatWeave/
- StatWeave is still under development, but is becoming mature and stable
- The source file is a regular text document with code chunks added (marked by special tags)
- Two basic operations:
  - Weaving: process the source file into a single document with code listings, output listings, graphs…
  - Tangling: extract the code from the source file to run later
- Weaving is useful for reproducible statistical analysis
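The difference between weaving and tangling can be illustrated with a toy model. This is only a sketch, not StatWeave's implementation: the `<<engine>>` / `<<end>>` chunk markers, the function names and the `fake_engine` executor are all invented here for illustration and are not StatWeave's tag syntax.

```python
import re

# Hypothetical chunk markers, invented for this sketch (not StatWeave syntax):
#   <<R>>
#   ...code...
#   <<end>>
CHUNK_RE = re.compile(r"<<(\w+)>>\n(.*?)\n<<end>>", re.DOTALL)


def tangle(source):
    """Extract only the code from the source, as (engine, code) pairs."""
    return [(m.group(1), m.group(2)) for m in CHUNK_RE.finditer(source)]


def weave(source, run_chunk):
    """Replace each chunk with a listing of its code followed by its output.

    run_chunk(engine, code) stands in for sending the code to a real engine.
    """
    def render(match):
        engine, code = match.group(1), match.group(2)
        output = run_chunk(engine, code)
        return f"[{engine} code]\n{code}\n[{engine} output]\n{output}"

    return CHUNK_RE.sub(render, source)


def fake_engine(engine, code):
    """Pretend executor: only reports how many lines it was given."""
    return f"{len(code.splitlines())} line(s) executed"


SOURCE = """Some writing.
<<R>>
x <- c(1, 2, 3)
mean(x)
<<end>>
More writing."""

if __name__ == "__main__":
    print(tangle(SOURCE))
    print(weave(SOURCE, fake_engine))
```

Tangling discards the surrounding text and keeps the code; weaving keeps the text and puts each chunk's listing and output where the chunk stood.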

12 Running StatWeave
- Command-line interface:
  statweave SAS-HelloWorld-swv.odt
  statweave --tangle SAS-HelloWorld-swv.odt
  statweave --keepall SAS-HelloWorld-swv.odt
- Graphical user interface:
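For scripted or batch use, the command lines above could be driven from another program. The following is a minimal sketch that assumes `statweave` is installed and on the PATH; the helper names `statweave_command` and `run_statweave` are hypothetical, and only the two flags shown on the slide are used. Whether the flags may be combined is not stated in the slides.

```python
import subprocess


def statweave_command(source_file, tangle=False, keepall=False):
    """Build the argument list for one statweave call.

    Only the flags that appear on the slide (--tangle, --keepall) are used.
    """
    args = ["statweave"]
    if tangle:
        args.append("--tangle")
    if keepall:
        args.append("--keepall")
    args.append(source_file)
    return args


def run_statweave(source_file, **options):
    """Run statweave on a source file; raises if the command fails.

    Assumption: the statweave executable is available on the PATH.
    """
    return subprocess.run(statweave_command(source_file, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    print(" ".join(statweave_command("SAS-HelloWorld-swv.odt", tangle=True)))
```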

13 Example: SAS + ODT
- Set global options (for SAS code)
- Inline evaluation of expressions

14 Example: SAS + ODT

15
- Output can be saved for later use and display

16 Code reuse and argument substitution
- Save code chunks for later execution
- Pass arguments to code chunks
- Simplest case: not unlike a macro…
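The macro-like idea of saving a chunk and recalling it with arguments can be sketched as follows. This is not StatWeave's mechanism or syntax: the `save_chunk` / `recall_chunk` names and the `$name` placeholder style (taken from Python's `string.Template`) are assumptions chosen purely for illustration.

```python
from string import Template

# Hypothetical store of saved chunks, keyed by a label chosen by the author.
saved_chunks = {}


def save_chunk(name, code):
    """Save a code chunk whose $placeholders are filled in later."""
    saved_chunks[name] = Template(code)


def recall_chunk(name, **args):
    """Return the saved chunk with its arguments substituted.

    Raises KeyError if the chunk or one of its arguments is missing.
    """
    return saved_chunks[name].substitute(**args)


# A SAS chunk saved once, then reused with different arguments.
save_chunk("print_head", "proc print data=$dataset (obs=$n);\nrun;")

if __name__ == "__main__":
    print(recall_chunk("print_head", dataset="mydata", n=5))
    print(recall_chunk("print_head", dataset="otherdata", n=10))
```

Failing loudly on a missing argument is a deliberate choice in this sketch: a silently unfilled placeholder would produce code that no longer matches what the document claims was run.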

17 Example: SAS + ODT - code reuse and argument substitution
- Customize display and output (tables) with a reusable code chunk

18 Example: SAS + ODT - code reuse and argument substitution

19 Example: Multiple languages - SAS, R and DOS together
- Can use different engines in the same source file
- Use SAS when appropriate; use R when appropriate; use Maple when appropriate…
- Weaving:
  - SAS/R/XX chunks are assembled into separate code files
  - Code files are processed in order of first appearance in the source file
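The assembly rule described for weaving (one code file per engine, processed in order of first appearance) can be sketched as below. The function name and the `(engine, code)` chunk representation are assumptions made for this sketch; it models only the ordering rule, not how StatWeave actually writes or runs its files.

```python
def assemble_code_files(chunks):
    """Group chunks into one code file per engine.

    chunks is a list of (engine, code) pairs in source-document order.
    The result lists (engine, combined_code) pairs in order of each engine's
    first appearance, relying on dicts preserving insertion order.
    """
    files = {}
    for engine, code in chunks:
        files.setdefault(engine, []).append(code)
    return [(engine, "\n".join(parts)) for engine, parts in files.items()]


CHUNKS = [
    ("SAS", "/* first SAS chunk */"),
    ("R", "# first R chunk"),
    ("SAS", "/* second SAS chunk */"),
]

if __name__ == "__main__":
    for engine, code in assemble_code_files(CHUNKS):
        print(f"--- {engine} file ---")
        print(code)
```

Under this rule both SAS chunks end up in the SAS file, which is processed before the R file, even though the second SAS chunk appears after the R chunk in the source.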

20 Example: Multiple languages

21

22

23

24

25
- Synchronization issue: a SAS chunk depends on data from an R chunk, which in turn depends on data from an earlier SAS chunk…
- Solution: the restart option will restart the engines
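Why a restart resolves the circular dependency can be shown with a small model. This sketch rests on an assumed reading of the restart option: a chunk flagged with restart closes the batch of code files collected so far, so those files run before a fresh batch begins. The `plan_runs` function and the `(engine, code, restart)` triples are hypothetical, and the actual StatWeave semantics may differ.

```python
def plan_runs(chunks):
    """Split chunks into batches at each restart, then group by engine.

    chunks is a list of (engine, code, restart) triples in source order.
    Returns a list of batches; each batch is a list of (engine, code_file)
    pairs in order of the engine's first appearance within that batch.
    """
    batches = [[]]
    for engine, code, restart in chunks:
        if restart and batches[-1]:
            batches.append([])  # assumed meaning of restart: start a new batch
        batches[-1].append((engine, code))

    plan = []
    for batch in batches:
        files = {}
        for engine, code in batch:
            files.setdefault(engine, []).append(code)
        plan.append([(e, "\n".join(parts)) for e, parts in files.items()])
    return plan


CHUNKS = [
    ("SAS", "/* SAS: create data set A */", False),
    ("R", "# R: read A, create data set B", False),
    ("SAS", "/* SAS: read B */", True),  # must not run before the R chunk
]

if __name__ == "__main__":
    for number, batch in enumerate(plan_runs(CHUNKS), start=1):
        print(f"batch {number}: {[engine for engine, _ in batch]}")
```

Without the restart flag, both SAS chunks would land in one SAS file that runs before the R file, so the last chunk would look for data that R has not yet produced.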

26 Example: Maple + LaTeX

27

28 Example: Maple + ODT  Differentiate y= sin(x) x x x  Output is ugly, but it reads:

29 Odds and ends – calling the shell
- Want to list all StatWeave / OpenOffice source files: *-swv.odt
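For comparison, the same listing can be produced outside the shell. This is only an illustrative equivalent of the `*-swv.odt` pattern, not something StatWeave itself does; the function name and the demo file names (other than `SAS-HelloWorld-swv.odt`, which appears in the slides) are made up.

```python
import tempfile
from pathlib import Path


def list_statweave_sources(directory="."):
    """Return the sorted names of files matching *-swv.odt in a directory."""
    return sorted(p.name for p in Path(directory).glob("*-swv.odt"))


def demo():
    """Create a few empty files in a temporary directory and list the matches."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("SAS-HelloWorld-swv.odt", "notes.odt", "R-example-swv.odt"):
            (Path(tmp) / name).touch()
        return list_statweave_sources(tmp)


if __name__ == "__main__":
    print(demo())
```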

30 Code chunks are processed as a whole
- Code chunks are processed as a "unit", so in general one cannot split a call to proc xxxx over several chunks
- Thus the following is illegal:

31 … one exception in SAS: IML

32 Summary
- Reproducible statistical analyses
- Integrate text, code and results in one document
- Several text formats
- Several languages
- This talk (and the examples) is available at http://genetics.agrsci.dk/~sorenh/misc/
- All credit is due to Russ Lenth, the creator of StatWeave. Thanks!

