Reproducible Research And Dynamic Documents in Stata


1 Reproducible Research And Dynamic Documents in Stata
E. F. Haghish, University of Freiburg
Meeting notes (01/09/15 12:14): What's your experience with Stata? Do you have your laptops?

2 Part 1: Reproducible Analyses
Why should data analysis be reproducible?
How can we communicate the results of the analysis effectively?
What kinds of errors or obstacles might make the procedure inefficient?

3 It Is The Statistics Era!
Faster and cheaper computers
Data is everywhere
The Internet makes gathering data easy and cheap
Many jobs are available in data science
Quantitative studies are flourishing

4 Changes in the traditional Statistics practice
More data analysis is done than in the past
Many exploratory analyses are run that never make it into the published work
Analyses are shared with colleagues over the Internet
Writing scientific publications has become more collaborative than before
Statistical programming has become popular
Web-based and interactive statistical applications are emerging

5 Reproducible Analysis
Reproducible Research is a more general term with a broader scope.
Unreproducible quantitative research = unreliable results.
Unintentional errors can happen at any stage of research: design, procedure, assessment, data collection, data management and preparation, analysis, writing, and publication.
We can divide these errors into two groups: those made before the data are digitized and those made after.
Reproducible Analysis focuses on the procedure from the time the data are digitized until the results are written up.

6 Obstacles of Statistical Analysis
A major problem in the social sciences is that students are taught statistics by "mouse & click", although most software supports writing syntax. Point-and-click analysis has several drawbacks:
Easy to forget
Not reproducible
Slow
Mistakes cannot be corrected
Cannot be supervised or checked
The procedure cannot be shared
Cannot be reused

7 Improving Reproducibility?
Writing syntax alone does not guarantee reproducibility.
Make the code easy to rerun by connecting the whole procedure within a "main file" (the master file).
The do and run commands cause Stata to execute the commands stored in a file just as if they were entered from the keyboard. do echoes the commands as it executes them, whereas run is silent.
Store different pieces of code in separate files and call them from the master file. Running the master file then runs everything in the right order.
Always start from the raw data to avoid confusion.
Comment your code.
Always assume you intend to share your code.
Write clean code and explain it where needed.

8 Example of Master file
use rawdata.dta, clear
do preparation.do
do descriptive.do
do analysis.do
do report.do
In the master file, the procedure becomes observable in a logical order. As the data analysis gets more complicated and the number of files increases, the master file makes re-reading the project much easier, faster, and more efficient. A dynamic document can be written within the do-files. This procedure is especially useful with the Weaver package.
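A slightly fuller master file might look like the sketch below. The do-file names are the ones from the slide; the version number, the log name master, and the use of run for one step are illustrative assumptions, not part of the original example.

```stata
* master.do: reruns the whole analysis from the raw data.
* Do-file names follow the slide; version number and log name are assumptions.
version 13                  // pin the Stata version (assumed; use your own)
clear all
set more off

* a named log, so that do-files opening their own log do not clash with it
log using master, replace text name(master)

use rawdata.dta, clear
do preparation.do           // do echoes each command as it executes
run descriptive.do          // run executes the file silently
do analysis.do
do report.do

log close master
```

Because every step starts from rawdata.dta and runs in a fixed order, rerunning this one file should rebuild all results.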

9 Dynamic Documents
Literate programming
Producing analysis reports
Taking notes on complex statistical procedures
Teaching statistics

10 Markup Language
Markup, in a broad sense, is a "computer language" used for annotating, formatting, and styling a document using text tags. Examples: HTML, RTF, XML, LaTeX

11 HTML Markup Example Try it now

12 LaTeX Markup Example
LaTeX is very sophisticated: you can do almost anything with it and still keep your document light and fast. Try it online at

13 Markdown
Invented by John Gruber (2004), Markdown is a lightweight markup language.
Other programmers have developed different versions of it.
It is very popular.
It has a very simple syntax for annotating documents.
It is limited, however, and not as sophisticated as HTML or LaTeX.
It is used for creating a "standard document" that has only the essentials.
It supports headings, paragraphs, basic tables, images, links, bold and italic text, etc.

14 Markdown
In contrast to HTML and LaTeX, Markdown focuses on the "content" of the document and provides nothing for changing its formatting.
The strength of Markdown is its simplicity.
After exporting to Microsoft Word docx, reduce the left and right margins of the document to 1 cm.

15 Markdown
Make text italic: *text* or _text_
Make text bold: **text**
Italic and bold: ***text*** or ___text___

16 Markdown
Header 1
This is Header 1
============
Header 2
This is Header 2
------------

17 Markdown
Alternatively, headers can be specified by starting the line with hash signs:
# This is header 1
## This is header 2
### This is header 3
#### This is header 4
##### This is header 5
###### This is header 6

18 Markdown

19 Markdown
Adding a web link: [text](http://url.com/)
Adding an image: ![explanation](./path.png)
Note that the image CANNOT be resized or aligned in the document. It is imported at its current dimensions and always placed at the left side of the document. A large image will ruin the layout, especially in Microsoft Office, LibreOffice, and PDF formats.

20 Markdown
Creating an ordered list
1. Apple
2. Orange
3. Cherry

21 Markdown
Creating an unordered list, which can also be nested using tabs.
* Abacus
    * answer
* Bubbles
    1. bunk
    2. bupkis
        * BELITTLER
    3. burper
* Cunning

22 Markdown
To add a horizontal line:
---
* * *
To begin a new paragraph, leave one line empty between the paragraphs. To force a line break within a paragraph, leave 2 or more spaces at the end of the line.

23 Remember!
Write your document WITH ONLY ONE MARKUP LANGUAGE.
Markdown's simplicity can improve the readability of your document, so consider writing with Markdown unless you want to write a very sophisticated document in LaTeX or HTML.

24 Part 2: Lab Session
Three software packages are taught in the lab session:
MarkDoc
Weaver
CodeMap

25
ssc install markdoc
ssc install weaver
CodeMap only works on Mac.

26 MarkDoc vs Weaver
Weaver only creates HTML and PDF.
MarkDoc creates HTML, PDF, Microsoft Word DOCX, OpenOffice ODT, and LaTeX.
Weaver is very robust and is programmed entirely in Stata.
MarkDoc relies on a third-party document converter named Pandoc.
MarkDoc is suitable for writing documents that include a lot of text, and for cases where the author intends to do further work on the generated DOCX, LaTeX, etc.
Weaver is suitable for briefly explaining the results of a data analysis and sharing the PDF.
Weaver also provides a live preview of the document while weaving.

27 MarkDoc
Everything should be wrapped in a SMCL log file.
qui log using example, replace
qui log c    //removes this command from the document
markdoc example, export(html) replace

28 MarkDoc
See markdoc-text.do
Text is written as comments inside the log file and can be written in three markup languages: Markdown, HTML, and LaTeX. This do-file includes three documents written in Markdown, HTML, and LaTeX. Which one is nicer?
/*
Writing text in MarkDoc
=======================

This is heading 2
-----------------

Text should be written as comment
*/

29 MarkDoc
Stata commands are used between the text comments as usual. MarkDoc automatically includes them in the document, regardless of the markup language you are writing with.
There are many ways to add an image or figure to the document. The HTML, PDF, and LaTeX formats are very versatile, but to add an image to a Microsoft Word document only Markdown can be used.
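Putting the last three slides together, a minimal MarkDoc do-file could look like the sketch below. It assumes markdoc and Pandoc are installed; the log name example and the markdoc call come from the slides, while the auto dataset and the wording of the text are only illustrative.

```stata
* A minimal MarkDoc document (sketch; assumes markdoc and Pandoc are installed).
quietly log using example, replace smcl

/*
Fuel efficiency in the auto data
================================

The output that follows summarizes price and mileage.
*/

sysuse auto, clear
summarize price mpg

quietly log close
markdoc example, export(html) replace
```

The Markdown text lives inside an ordinary Stata comment, so the do-file still runs as a normal analysis script when MarkDoc is not used.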

30 Writing Dynamic Text
Use macros or returned values together with text to refer to results. The txt command writes text to the document. This cannot be done inside comment signs, because macros are not interpreted there. The txt command can also contain markup.
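As a rough illustration, the sketch below stores a returned value in a local macro and passes it to txt. The summarize, local, and string() steps are standard Stata; the exact form of the txt call (a single quoted string) is an assumption, so check the MarkDoc help file for the version you have installed.

```stata
quietly log using example, replace smcl

sysuse auto, clear
quietly summarize price

* store a formatted returned value in a local macro
local avg = string(r(mean), "%9.2f")

* txt is the MarkDoc command named on the slide; this call form is an assumption
txt "The average price in the sample is **`avg'** dollars."

quietly log close
markdoc example, export(html) replace
```

Because the macro is expanded when the do-file runs, the sentence should update itself whenever the data or the analysis changes.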

31 Hiding Commands
Use /**/ before a command to hide it. This DOES NOT hide the output.
To hide the output, use Stata's "quietly" prefix.
With "qui log off" and "qui log on" you can exclude parts of the code and results from the document.
See markdoc_dynamic_text.do
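The three techniques can be combined as in the sketch below. The /**/ marker, the quietly prefix, and the log off/log on pair are taken from the slide; the dataset, variables, and model are illustrative assumptions.

```stata
quietly log using example, replace smcl

/**/ sysuse auto, clear                 // command hidden; its output still appears
quietly generate lprice = ln(price)     // quietly suppresses the output

qui log off                             // nothing below is written to the log ...
describe
qui log on                              // ... until logging is switched back on

regress lprice mpg weight

quietly log close
markdoc example, export(html) replace
```

Everything between log off and log on still runs, so later commands can depend on it even though it never reaches the document.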

32 Dynamic Tables
MarkDoc can also create dynamic tables. Use the tble command.

33 Stata Journal Publications
Use Markdown to create LaTeX files
Use style(stata)
Use the texmaster option

34 Weaver
Weaver has a set of commands for writing the document:
weave starts a new document
div puts the commands and results in separate frames
img works the same as in MarkDoc
knit writes dynamic text
report prints a PDF while working on the document
weavend closes the document

35 Weaver
codes only shows the command
results only shows the results

36 CodeMap
For understanding the structure of a complicated statistical package or data analysis.
It reveals the connections between code files and functions.
Useful for advanced users who are interested in learning statistical programming.

