Published by Bertha Mason. Modified over 9 years ago.
1
AARHUS UNIVERSITET
Faculty of Agricultural Sciences
Literate programming with multiple languages
Russell V. Lenth, Department of Statistics & Actuarial Science, The University of Iowa, USA
Søren Højsgaard, Faculty of Agricultural Sciences, Aarhus University, Denmark
DSC 2009, July 2009, Copenhagen, Denmark
2
Take-home message
Literate programming: combining text, code and results in one document
StatWeave does this
  Supports text formats: LaTeX / OpenOffice (OpenDocument Text)
  In combination with one or several of the "engines" SAS, R, S-plus, Maple, Stata, Matlab, shell…
StatWeave is "Sweave for generalized values of LaTeX and S"
Java-based and hence portable
A great help in creating reproducible statistical analyses
Extensible: new languages can be added
3
Overview – Combining code, documentation and results
Source document: Writing; SAS statements; More writing; R statements; Even more writing; More SAS statements; More writing…
Final document: Writing; SAS statements; SAS output; SAS graphics; More writing; R statements; R output; Even more writing; SAS statements; SAS output; More writing…
4
Example: R + LaTeX
7
Example: SAS + OpenDocument Text
9
What is literate programming?
Term coined by Knuth (1979): create software as works of literature
  Embed source code into descriptive text (rather than the opposite)
  Software should follow the flow of thoughts and logic
  Should be designed to be readable by humans (and not only by compilers / programs)
Some systems for literate programming (in statistics)
  Sweave (Leisch 2002): R code in LaTeX documents
  odfWeave (Kuhn and Coulter 2007): R code in OpenOffice documents
  SASweave (Lenth and Højsgaard 2007): SAS / R code in LaTeX documents
  StatWeave: SAS / R / Maple / S-plus / Stata / Matlab / shell… code in LaTeX and OpenOffice documents
10
Why literate programming?
Reproducible statistical analysis
  Research, consulting
  Document exactly what has been done
  Possible to re-run if data change
Maintain one document only (at least in principle)
  Manuals, course notes etc.
  Shown output guaranteed to be the result of shown code
11
StatWeave
StatWeave created by Russ Lenth, University of Iowa, USA
Available: http://www.cs.uiowa.edu/~rlenth/StatWeave/
StatWeave is still under development but is becoming "mature" and stable.
Source file is a regular text document with code chunks added (marked by special tags)
Two basic operations
  Weaving: process the source file into a single document with code listings, output listings, graphs…
  Tangling: extract the code from the source file to run later
Weaving is useful for reproducible statistical analysis
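The difference between the two operations can be illustrated with a minimal sketch. It is not StatWeave's implementation: the chunk markers (`<<engine>>=` to open, `@` to close) are borrowed from noweb-style notation purely for illustration, and `split_source`, `tangle`, `weave` and the `run` callback are hypothetical names. The real engine call is replaced by whatever function the caller passes in.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical chunk markers for illustration only; StatWeave's real tags
# are not shown in the slides.
CHUNK = re.compile(r"^<<(\w+)>>=\n(.*?)^@\n", re.DOTALL | re.MULTILINE)

Segment = Tuple[Optional[str], str]  # (engine name or None for prose, body)


def split_source(source: str) -> List[Segment]:
    """Split a source document into prose segments and code chunks, in order."""
    segments: List[Segment] = []
    pos = 0
    for m in CHUNK.finditer(source):
        if m.start() > pos:
            segments.append((None, source[pos:m.start()]))
        segments.append((m.group(1), m.group(2)))
        pos = m.end()
    if pos < len(source):
        segments.append((None, source[pos:]))
    return segments


def tangle(source: str) -> str:
    """Tangling: keep only the code and drop the prose."""
    return "".join(body for engine, body in split_source(source) if engine is not None)


def weave(source: str, run: Callable[[str, str], str]) -> str:
    """Weaving: keep the prose and follow each code listing with its output."""
    parts: List[str] = []
    for engine, body in split_source(source):
        parts.append(body)
        if engine is not None:
            parts.append(run(engine, body))  # run(engine, code) returns the output text
    return "".join(parts)
```

With a stand-in `run` that only labels the output, `weave` returns the prose, the code listing and the labelled output in document order, while `tangle` returns the code alone.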
12
Running StatWeave
Command-line interface:
  statweave SAS-HelloWorld-swv.odt
  statweave --tangle SAS-HelloWorld-swv.odt
  statweave --keepall SAS-HelloWorld-swv.odt
Graphical User Interface:
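For scripted use, the three command lines above could be driven from Python roughly as follows. This is a sketch under assumptions: only the invocations shown on the slide are reproduced, the helper names `statweave_command` and `run_statweave` are hypothetical, and nothing is assumed about what each flag does beyond its name or about whether the flags can be combined.

```python
import shutil
import subprocess
from typing import List, Optional


def statweave_command(source: str, option: Optional[str] = None) -> List[str]:
    """Build the argument list for one of the three invocations shown on the slide."""
    if option not in (None, "--tangle", "--keepall"):
        raise ValueError(f"option not shown on the slide: {option!r}")
    return ["statweave"] + ([option] if option else []) + [source]


def run_statweave(source: str, option: Optional[str] = None) -> int:
    """Run StatWeave if the executable is on PATH and return its exit code."""
    cmd = statweave_command(source, option)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("statweave executable not found on PATH")
    return subprocess.run(cmd).returncode
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.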
13
Example: SAS + ODT
Set global options (for SAS code)
Inline evaluation of expressions
14
Example: SAS + ODT
15
Output can be saved for later use - and display
16
Code reuse and argument substitution
Save code chunks for later execution
Pass arguments to code chunks
Simplest case: not unlike a macro…
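The macro-like idea of a saved chunk with arguments can be sketched in a few lines. The placeholder syntax here is Python's `string.Template`, not StatWeave's own notation, and the chunk name `print_data`, the arguments `dataset` and `nobs`, and the helper `expand_chunk` are all hypothetical.

```python
from string import Template

# Hypothetical saved chunk: a SAS step with $-placeholders.
# The placeholder syntax belongs to string.Template, not to StatWeave.
saved_chunks = {
    "print_data": Template("proc print data=$dataset(obs=$nobs);\nrun;\n"),
}


def expand_chunk(name: str, **args: str) -> str:
    """Return the saved chunk with its placeholders replaced by the given arguments."""
    return saved_chunks[name].substitute(**args)


# Reuse the same chunk with different arguments.
first = expand_chunk("print_data", dataset="sashelp.class", nobs="5")
second = expand_chunk("print_data", dataset="work.results", nobs="10")
```

`substitute` raises `KeyError` when an argument is missing, so a forgotten argument fails loudly instead of leaving a placeholder in the generated code.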
17
Example: SAS + ODT - code reuse and argument substitution
Customize display and output (tables) with a reusable code chunk
18
Example: SAS + ODT - code reuse and argument substitution
19
Example: Multiple languages - SAS, R and DOS together
Can use different engines in the same source file
Use SAS when appropriate; use R when appropriate; use Maple when appropriate…
Weaving: SAS/R/XX chunks are assembled into separate code files.
Code files are processed in order of first appearance in the source file
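The assembly rule stated above (one code file per engine, processed in order of first appearance) can be sketched as follows. This only illustrates the ordering rule and is not StatWeave's code; the function name `assemble_code_files` and the `(engine, code)` pair representation are assumptions.

```python
from typing import Dict, List, Tuple


def assemble_code_files(chunks: List[Tuple[str, str]]) -> Dict[str, str]:
    """Concatenate chunks per engine.

    Python dicts preserve insertion order, so iterating over the result
    visits the engines in order of first appearance in the source.
    """
    files: Dict[str, List[str]] = {}
    for engine, code in chunks:
        files.setdefault(engine, []).append(code)
    return {engine: "".join(parts) for engine, parts in files.items()}


chunks = [
    ("SAS", "proc means data=a; run;\n"),
    ("R", "summary(x)\n"),
    ("SAS", "proc print data=a; run;\n"),
]
code_files = assemble_code_files(chunks)
processing_order = list(code_files)
```

Here the SAS file is processed before the R file because SAS appears first, even though the second SAS chunk comes after the R chunk in the source.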
20
Example: Multiple languages
25
Synchronization issue: a SAS chunk depends on data from an R chunk, which depends on data from a SAS chunk…
Solution: the restart option will restart the engines
26
Example: Maple + LaTeX
28
Example: Maple + ODT
Differentiate y= sin(x) x x x
Output is ugly, but it reads:
29
Odds and ends – calling the shell
Want to list all StatWeave / OpenOffice source files: *-swv.odt
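Outside StatWeave, the same file selection can be reproduced with Python's standard library. The sketch below applies the `*-swv.odt` pattern from the slide to a list of names; the function name `statweave_odt_sources` and the example file names other than `SAS-HelloWorld-swv.odt` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def statweave_odt_sources(names: Iterable[str]) -> List[str]:
    """Return the names matching the *-swv.odt pattern, sorted alphabetically."""
    return sorted(n for n in names if fnmatchcase(n, "*-swv.odt"))


names = ["SAS-HelloWorld-swv.odt", "notes.odt", "R-example-swv.odt", "report-swv.tex"]
sources = statweave_odt_sources(names)
```

To list matching files in a directory directly, `pathlib.Path(".").glob("*-swv.odt")` applies the same pattern to the file system.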
30
Code chunks are processed as a whole
Code chunks are processed as a "unit", so in general one cannot split a call to proc xxxx over several chunks.
Thus the following is illegal:
31
… one exception in SAS: IML
32
Summary
Reproducible statistical analyses
Integrate text, code and results in one document
Several text formats
Several languages
This talk (and the examples) available at http://genetics.agrsci.dk/~sorenh/misc/
All credit is due to Russ Lenth, the creator of StatWeave.
Thanks!!!!