Do files, log files, and workflow in Stata
Biostatistics 212 Lecture 2

Housekeeping
Everyone connected to web, servers, etc?
Questions from Lab 1
–Page up to repeat/edit a command
–Storage types (help data_types)
–Brackets, italics, commas, etc in a Stata command – see handout
  tabulate var1 var2 [, chi2]   comma optional (note brackets)
  ttest contvar, by(catvar)     comma required
–Definition of a p-value
–Death as an outcome, SE of a proportion, etc
–P=.000?
–Sig figs
–Why is summarize caccat wrong?
Final Project
Anything else?
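The comma rule from the handout can be tried directly. This is a small sketch that assumes the auto dataset shipped with Stata as a stand-in for the course data; the variable names (foreign, rep78, mpg) come from that dataset, not from the lecture.

```stata
* Assumption: the auto dataset shipped with Stata stands in for the course data
sysuse auto, clear

* Options are optional for tabulate, so the comma appears only when one is given
tabulate foreign rep78
tabulate foreign rep78, chi2

* A two-group t test needs by(), so here the comma is required
ttest mpg, by(foreign)
```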

Today...
Rationale for Do and Log files
How they work
Demonstrations
Lab

Last week
Using Stata interactively for immediate analysis
–Fill in the blanks
–Like a calculator

What happens if…
A question arises about your results?
You decide to do something differently?
–Add a new variable to your model
–Categorize a variable differently
You get new data?
You lose something?
–Overwrite your data file, computer crash, etc
ALL OF THESE THINGS WILL HAPPEN TO YOU!

Cardinal Principles
Keep your source data pristine and secure
Document everything you do to it
Document every analysis
Make sure you can repeat everything you do easily, quickly, and accurately
Do and Log files make this easy!

One systematic approach
Import data
Save as a Stata dataset
Clean the data using a do file, save new dataset
Analyze the data using other do files
Document each step with a log file
Transfer results from log files to tables, figures, etc.
More on this later
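The steps above might be strung together roughly as follows. This is only a sketch: every file name (rawdata.csv, study_raw.dta, clean.do, analysis.do) is hypothetical, and import delimited is one of several import commands that could apply depending on the source file.

```stata
* Steps 1-2: import once and save as a Stata dataset (file names are hypothetical)
import delimited using "rawdata.csv", clear
save "study_raw.dta", replace

* Step 3: a cleaning do file reads study_raw.dta and saves a new, cleaned dataset
do clean.do

* Steps 4-5: analysis do files read the cleaned dataset; each opens its own log
do analysis.do
```

Keeping the import step separate means the source file is read once and never overwritten by later steps.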

Do files
A list of commands
Text
Create with the do file editor
Run
–With do file editor button, or
–do yourdofile.do

Do files
Demo
–Simple list of commands
–Different types of comments
–Run in three different ways
–“run” vs. “do”
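A do file can be as short as the sketch below. The contents are an assumption for illustration (they use the auto dataset shipped with Stata); only the file name yourdofile.do comes from the slides.

```stata
* Contents of yourdofile.do: a simple list of commands
sysuse auto, clear
summarize mpg weight
tabulate foreign
```

Typed in the Command window, do yourdofile.do runs these commands and shows their output in the Results window, while run yourdofile.do runs the same commands without displaying the output.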

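Do files
“Comments” are a way to document your logic. Here are the options:
* Anything after an asterisk at the start of a line is a comment
/* Anything until you reach the closing symbol is a comment */
Other options: // (comment to the end of the line) and /// (continue the command on the next line)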
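The four comment styles can be seen side by side in a short do file. This sketch assumes the auto dataset shipped with Stata; note that // and /// need a space before them and work in do files, not in the Command window.

```stata
* A line that starts with an asterisk is a comment
sysuse auto, clear   // text after two slashes is ignored to the end of the line

/* A block comment can span
   several lines until the closing marker */

summarize mpg weight ///
    length               // three slashes continue a command on the next line
```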

Do files
Advantages
–Plan your analysis
–Cut and paste, find and replace, etc
–Repeat quickly and easily and reproducibly
–Comments enhance documentation
–Development cycle iterations: you will get errors, make corrections, rerun, etc

Log files
A record of all Stata output
Plain text (.log) versus Stata formatted (.smcl)
–We use plain text for this course
Start and stop with button or commands
–log using yourlogname.log (open)
  , append (add to end of an existing log)
  , replace (overwrite an existing log)
–log close (close)
–log off (pause)
–log on (un-pause)
Don’t edit log files!
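The log commands listed above fit together as in this sketch. It assumes no log is already open when it starts and uses the auto dataset shipped with Stata as example data; yourlogname.log is the placeholder name from the slide.

```stata
* Open a plain-text log; replace overwrites an existing file of the same name
log using yourlogname.log, replace

sysuse auto, clear      // assumption: built-in example data
summarize mpg

log off                 // pause: output below is not recorded
describe
log on                  // resume recording

log close

* Later, add to the end of the same log instead of overwriting it
log using yourlogname.log, append
tabulate foreign
log close
```

The .log extension is what requests plain text; without an extension Stata writes a .smcl file.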

Log files
Demo
–Start logging, run commands, close and look
–.smcl vs. .log
–Long output command or lots of commands

Log files
Advantages
–Complete documentation
–Time/date of run
–No “buffer” problem
–Documents analysis on data as it was at that time

Log files
Command logs, FYI
–List of commands you enter
–Control same as other logs
  cmdlog using
  cmdlog close
  cmdlog off
  cmdlog on
–I never use them! Use do files instead.

Using Do and Log files together
Open the log file WITHIN the do file!
–Everything documented every time
–Improves repeatability
Open your dataset WITHIN the do file!
–Subset for inclusions/exclusions in do file also
Save your dataset WITHIN the do file!
–And save it with a different name
–NEVER save manually except right after importing data into Stata
–Watch for “proliferating datasets” problem
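A do file that follows these three rules might look like the sketch below. All file names and the age variable are hypothetical, chosen only to show where each step goes.

```stata
* Hypothetical file, dataset, and variable names throughout
log using cleaning.log, replace       // open the log within the do file

use "study_raw.dta", clear            // open the dataset within the do file
keep if age >= 18 & !missing(age)     // example inclusion rule; age is assumed

* ... cleaning or analysis commands go here ...

save "study_clean.dta", replace       // save under a different name than the source
log close
```

Because the save uses a new name, rerunning the do file always starts from the untouched source data.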

Using Do and Log files together
Demo
–Within do file:
  Open log, close log
  Open dataset
  “Capture log close”
  cd – PC vs. Mac
  Set more off/on
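The demo items could be combined into a do file header along these lines. This is a sketch under assumptions: the folder paths and the log name lab2.log are hypothetical, and branching on c(os) is just one way to handle the PC vs. Mac difference in cd.

```stata
capture log close     // capture suppresses the error raised when no log is open
set more off          // do not pause long output with --more--

* Folder paths are hypothetical; c(os) reports "Windows", "MacOSX", or "Unix"
if c(os) == "Windows" {
    cd "C:/biostat212/lab2"
}
else {
    cd "~/biostat212/lab2"
}

log using lab2.log, replace
* ... commands ...
log close
set more on
```

Starting with capture log close lets the do file be rerun after an error that left a log open.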

Using Do and Log files together
Advantages
–Full documentation
–Easy repeatability
–Data security and file management system

Using Do and Log files together
It’s worth the effort!

What happens if… Revisited
A question arises about your results?
You decide to do something differently?
–Add a new variable to your model
–Categorize a variable differently
You get new data?
You lose something?
–Overwrite your data file, computer crash, etc

Advice from a former TA (Lee Zane)

My Advice
Thou shalt do MOST of your work in do files
Thou shalt open a log WHEN YOU ARE READY to document your analysis
i.e. feel free to explore your data, follow instincts, etc quickly without do/log files

Lab today
Lab 2
–Walks you through do and log files
–Set up template for future labs

Preview of next week…
Cleaning your data
–Generating new variables
–Manipulating data
–Labeling