Literate programming with multiple languages

Presentation Transcript

  1. Literate programming with multiple languages • Søren Højsgaard, Faculty of Agricultural Sciences, Aarhus University, Denmark • Russell V. Lenth, Department of Statistics & Actuarial Science, The University of Iowa, USA • DSC 2009, July 2009, Copenhagen, Denmark

  2. Take-home message • Literate programming: Combining text, code and results in one document • StatWeave does this • Supports text formats: • LaTeX / OpenOffice (OpenDocument Text) • In combination with one or several of the "engines" • SAS, R, S-plus, Maple, Stata, Matlab, shell… • StatWeave is • "Sweave for generalized values of LaTeX and S" • Java-based and hence portable • A great help in creating reproducible statistical analyses • Extensible: Add languages

  3. Overview – Combining code, documentation and results • Source document: Writing; SAS statements; More writing; R statements; Even more writing; More SAS statements; More writing… • Final document: Writing; SAS statements; SAS output; SAS graphics; More writing; R statements; R output; Even more writing; SAS statements; SAS output; More writing…

  4. Example: R + LaTeX

  5. Example: R + LaTeX

  6. Example: R + LaTeX

  7. Example: SAS + OpenDocument Text

  8. Example: SAS + OpenDocument Text

  9. What is literate programming? • Term coined by Knuth (1979): • Create software as works of literature: • Embed source code into descriptive text (rather than the opposite) • Software should follow the flow of thoughts and logic • Should be designed to be readable by humans (and not only by compilers / programs) • Some systems for literate programming (in statistics) • Sweave (Leisch 2002) • R code in LaTeX documents • odfWeave (Kuhn and Coulter 2007) • R code in OpenOffice documents • SASweave (Lenth and Højsgaard 2007) • SAS / R code in LaTeX documents • StatWeave • SAS / R / Maple / S-plus / Stata / Matlab / shell… code in LaTeX and OpenOffice documents

  10. Why literate programming? • Reproducible statistical analysis • Research, consulting • Document exactly what has been done • Possible to re-run if data change • Maintain one document only (at least in principle) • Manuals, course notes etc. • Shown output guaranteed to be result of shown code

  11. StatWeave • StatWeave created by Russ Lenth, University of Iowa, USA • Available: http://www.cs.uiowa.edu/~rlenth/StatWeave/ • StatWeave is still under development, but is becoming "mature" and stable • The source file is a regular text document with code chunks added (marked with special tags) • Two basic operations • Weaving: Process the source file into a single document with code listings, output listings, graphs… • Tangling: Extract the code from the source file to run later • Weaving is useful for reproducible statistical analysis
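
The two operations on slide 11 can be illustrated with a toy sketch in Python. It is not StatWeave's implementation: the chunk markers are the noweb-style `<<>>=` ... `@` delimiters known from Sweave (StatWeave's own tags may differ), and `toy_engine` is a hypothetical stand-in for a real engine such as SAS or R.

```python
import re

# Noweb-style chunk: "<<options>>=" on its own line, code, then "@" on its own line.
# Assumption: this mirrors Sweave's syntax; StatWeave's tags are not shown in the slides.
CHUNK = re.compile(r"^<<([^>\n]*)>>=[ \t]*\n(.*?)^@[ \t]*\n?", re.MULTILINE | re.DOTALL)


def tangle(source: str) -> str:
    """Tangling: keep only the code chunks, in document order."""
    return "".join(match.group(2) for match in CHUNK.finditer(source))


def weave(source: str, run) -> str:
    """Weaving: replace each chunk by its code listing followed by its output."""
    def render(match):
        code = match.group(2)
        return "[code]\n" + code + "[output]\n" + run(code) + "\n"
    return CHUNK.sub(render, source)


def toy_engine(code: str) -> str:
    # Hypothetical engine: evaluates one Python expression and returns its value as text.
    return str(eval(code.strip()))


SOURCE = "Some text.\n<<>>=\n1 + 2\n@\nMore text.\n"

print(tangle(SOURCE))
print(weave(SOURCE, toy_engine))
```

Because the woven output is produced by actually running the chunk, the displayed result cannot drift away from the displayed code, which is the reproducibility argument made on slide 10.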

  12. Running StatWeave • Command-line interface: • statweave SAS-HelloWorld-swv.odt • statweave --tangle SAS-HelloWorld-swv.odt • statweave --keepall SAS-HelloWorld-swv.odt • Graphical User Interface:

  13. Example: SAS + ODT • Set global options (for SAS code) • Inline evaluation of expressions

  14. Example: SAS + ODT

  15. Example: SAS + ODT • Output can be saved for later use and display

  16. Code reuse and argument substitution • Save code chunks for later execution • Pass arguments to code chunks • Simplest case: Not unlike a macro…
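
The idea on slide 16 (save a chunk, then re-run it with different arguments) can be sketched as follows. This is only an illustration of the macro-like behaviour, written in Python with `string.Template`; the names `save_chunk` and `reuse_chunk`, the `$dataset` / `$variable` placeholders and the SAS snippet are all hypothetical, and StatWeave's actual syntax for saved chunks is not shown in this transcript.

```python
from string import Template

# Registry of saved chunks, keyed by a label chosen by the author of the document.
saved_chunks = {}


def save_chunk(name: str, code: str) -> None:
    """Store a code chunk containing $placeholders for later execution."""
    saved_chunks[name] = Template(code)


def reuse_chunk(name: str, **args: str) -> str:
    """Return the saved chunk with its placeholders replaced by the given arguments."""
    # substitute() raises KeyError if an argument is missing, so errors surface early.
    return saved_chunks[name].substitute(**args)


# Hypothetical SAS chunk with two arguments.
save_chunk("means", "proc means data=$dataset;\n  var $variable;\nrun;\n")

print(reuse_chunk("means", dataset="trial", variable="yield"))
print(reuse_chunk("means", dataset="trial", variable="height"))
```

The generated text would then be handed to the engine like any other chunk, so one saved chunk can produce several consistently formatted tables.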

  17. Example: SAS + ODT - code reuse and argument substitution • Customize display and output (tables) with a reusable code chunk

  18. Example: SAS + ODT - code reuse and argument substitution

  19. Example: Multiple languages - SAS, R and DOS together • Can use different engines in the same source file • Use SAS when appropriate; use R when appropriate; use Maple when appropriate… • Weaving: • SAS/R/XX chunks are assembled into separate code files, one per engine • Code files are processed in order of first appearance in the source file
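
The weaving rule on slide 19 can be sketched in a few lines of Python. The function name `assemble_code_files` and the example chunks are hypothetical; the sketch only illustrates the stated behaviour (one code file per engine, processed in order of first appearance), not how StatWeave is actually implemented.

```python
def assemble_code_files(chunks):
    """Group (engine, code) chunks into one list of code per engine.

    Python dicts preserve insertion order, so the keys end up in order of
    each engine's first appearance in the source document.
    """
    files = {}
    for engine, code in chunks:
        files.setdefault(engine, []).append(code)
    return files


# Hypothetical document with interleaved SAS and R chunks.
chunks = [
    ("SAS", "data a; x = 1; run;"),
    ("R", "x <- 1"),
    ("SAS", "proc print data=a; run;"),
    ("R", "print(x)"),
]

for engine, codes in assemble_code_files(chunks).items():
    print(engine, "->", codes)
```

Note the consequence: both SAS chunks run before any R chunk, even though an R chunk sits between them in the document.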

  20. Example: Multiple languages

  21. Example: Multiple languages

  22. Example: Multiple languages

  23. Example: Multiple languages

  24. Example: Multiple languages

  25. Example: Multiple languages • Synchronization issue: a SAS chunk depends on data from an R chunk, which in turn depends on data from an earlier SAS chunk… • Solution: The restart option will restart the engines
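
Why a restart helps can be sketched as follows. This is one plausible reading, not a description of StatWeave's internals: it assumes that a chunk flagged with restart closes the code files collected so far, so that later chunks go into a fresh round of files. The function `plan_runs`, the boolean flag and the chunk labels are hypothetical.

```python
def plan_runs(chunks):
    """Return the order in which per-engine code files would be executed.

    chunks: list of (engine, code, restart) tuples in document order.
    Assumption: restart=True starts a new round of code files.
    """
    rounds = [{}]
    for engine, code, restart in chunks:
        if restart and rounds[-1]:
            rounds.append({})  # close the current round of code files
        rounds[-1].setdefault(engine, []).append(code)
    # Flatten: within each round, engines run in order of first appearance.
    return [(engine, codes) for files in rounds for engine, codes in files.items()]


# SAS writes data, R transforms it, SAS reads the R result.
without_restart = [
    ("SAS", "sas: write data", False),
    ("R", "r: read SAS data, write result", False),
    ("SAS", "sas: read R result", False),
]
with_restart = without_restart[:2] + [("SAS", "sas: read R result", True)]

print(plan_runs(without_restart))  # both SAS chunks run before R
print(plan_runs(with_restart))     # SAS, then R, then SAS again
```

Without the restart, the last SAS chunk would run before the R chunk has produced the data it needs; with it, the execution order matches the data dependencies.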

  26. Example: Maple + LaTeX

  27. Example: Maple + LaTeX

  28. Example: Maple + ODT • Differentiate y= sin(x) xxx • Output is ugly, but it reads:

  29. Odds and ends – calling the shell • Goal: list all StatWeave / OpenOffice source files: *-swv.odt

  30. Code chunks are processed as a whole • Code chunks are processed as a "unit", so in general one cannot split a call to proc xxxx over several chunks • Thus the following is illegal:

  31. … one exception in SAS: IML

  32. Summary • Reproducible statistical analyses • Integrate text, code and results in one document • Several text formats • Several languages • This talk (and the examples) available at http://genetics.agrsci.dk/~sorenh/misc/ • All credit is due to Russ Lenth, the creator of StatWeave. Thanks!!!!