STATA for S-052
M. Shane Tutwiler, Your Friendly S-040 Lecturer
William Johnston, IT Services
Harvard Graduate School of Education

Getting the files
The do-file used in this workshop and all data files are in the Stata Help tab of the course iSite.
– Download SATdata.csv, auto.dta and Stata for S-052.do and save them to a new folder called Stata_Workshop on your desktop or on a USB drive.

Contact Information
Office: Gutman
Want to set up a consultation?
– hgse.service-now.com/ess/research.do
Want to learn more on your own?
– itservices.gse.harvard.edu/its/services/research-online-resources/stata

Agenda: Overview
I. Overview of Stata
II. Getting Started
III. ‘Do’ files
IV. Basic data cleaning
V. Basic data management
VI. Beginning analysis
VII. Questions

Getting Help in Stata
Many pathways to getting help in Stata:
– . help command
– . search command
– . findit command
– Use the help menu
– Look online with a web browser
– Set up an appointment

Some notes
A word about programming in and using Stata:
– Stata is case sensitive, so Myvar is different from myvar
– All commands in Stata are lower-case
– and = “&”, or = “|”, not = “!”
– Assignment is “=”, value equivalency is “==”
– Numeric missing values are coded as extremely large numbers and are displayed as a period (.); missing string values are blank
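A minimal do-file sketch of these notes. It assumes the auto dataset that ships with Stata (loaded with sysuse) as a stand-in for the workshop data; the variable is_foreign and the cutoffs are made up for illustration.

```stata
* Assumption: the shipped auto dataset stands in for the workshop data
sysuse auto, clear

* Names are case sensitive: price exists, Price does not
confirm variable price
capture confirm variable Price
display _rc          // nonzero return code: Price was not found

* "=" assigns, "==" tests equality (is_foreign is a hypothetical name)
generate byte is_foreign = (foreign == 1)

* "&" is and, "|" is or, "!" is not
count if price > 10000 & foreign == 1
count if mpg < 15 | mpg > 35
count if !(foreign == 1)

* Numeric missing (.) is larger than any number,
* so exclude it explicitly when using ">"
count if rep78 > 3                     // also counts missing rep78
count if rep78 > 3 & !missing(rep78)   // counts observed values only
```

The last two lines are the ones that tend to bite: a condition such as rep78 > 3 quietly includes observations where rep78 is missing.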

How to Begin a Session?
Specify your directory
– cd "_______"
Begin using a log file
– log using "______.log"
Open your data and look at it
– insheet using "SATdata.csv", comma
– browse
– describe

Anatomy of a Stata Command
Stata commands follow a pattern:
[prefix:] command [varlist] [if] [in] [weight] [, options]
For example:
– bysort region: summarize expense, detail
– mean csat if income >= & region != .
– list state in 1/10, nolabel
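The same pattern can be tried on the auto dataset that ships with Stata. This is only a sketch: the variables (foreign, price, mpg, rep78, make) come from that dataset rather than from SATdata, and the 5000 cutoff is an arbitrary value chosen for illustration.

```stata
* Assumption: shipped auto dataset, not the workshop's SATdata
sysuse auto, clear

* [prefix:] command [varlist] [, options]
bysort foreign: summarize price, detail

* command [varlist] [if]   (5000 is an arbitrary illustrative cutoff)
mean mpg if price >= 5000 & rep78 != .

* command [varlist] [in] [, options]
list make foreign in 1/10, nolabel
```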

Getting Started: Opening Data
– Stata formatted data (.dta): use "file name"
– Comma-separated values: insheet using "file name", comma
– Tab-delimited values: insheet using "file name", tab
– Web-based data files: webuse "file name" (example datasets from Stata's website) or use "web location"
– Flat files: create a dictionary {beyond the scope of this workshop}
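A small round-trip sketch of the comma-separated case. It assumes the shipped auto dataset and writes a throwaway file (auto_copy.csv, a made-up name) to Stata's temporary directory, so nothing here depends on the workshop files. Stata 13 and later also provide import delimited and export delimited for the same job.

```stata
* Assumption: shipped auto dataset; auto_copy.csv is a hypothetical file name
sysuse auto, clear

* Write the data out as comma-separated text
local csvfile "`c(tmpdir)'/auto_copy.csv"
outsheet using "`csvfile'", comma replace

* Read it back in, replacing the data in memory
insheet using "`csvfile'", comma clear
describe
```

Note that value-labeled variables such as foreign come back as text (Domestic, Foreign), because outsheet writes the labels unless nolabel is specified.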

Looking at Data
Look at your data: did our data import correctly? How are our data measured? What kinds of variables do we have?
– Editor: . edit
– Browser: . browse
– Other commands: . codebook, . describe
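A quick sketch of a first look at a dataset, again assuming the shipped auto dataset as a stand-in. The browse and edit commands open the Data Editor window, so they are left as comments here.

```stata
* Assumption: shipped auto dataset
sysuse auto, clear

* Storage types, display formats, and variable labels
describe

* Ranges, missing counts, and value labels for selected variables
codebook rep78 foreign

* Print the first few rows in the Results window
list make price mpg in 1/5

* Interactive alternatives (these open the Data Editor):
* browse
* edit
```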

Examining Data
There are several ways to look at our data in Stata. How would we describe the distribution of our data?
Graphs of distribution
– Histograms: histogram
– Scatterplots: scatter
Charts/Tables of frequency and distribution
– Frequency tables: table
– Cross-tabs: tabulate
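One way these commands might look in practice, sketched on the shipped auto dataset (an assumption; the variables are not the SATdata ones).

```stata
* Assumption: shipped auto dataset
sysuse auto, clear

* Graphs of distribution
histogram mpg, frequency
scatter price mpg

* Frequencies: tabulate with one variable gives a one-way table,
* with two variables a cross-tab
tabulate rep78
tabulate rep78 foreign

* table gives a frequency count per category
table foreign
```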

Basic Data Operations, part 1
Generating a new variable
– gen newvarname = expression
Subsetting
– keep varlist
– drop varlist
– keep if expression, drop if expression
Joining two datasets
– . merge
Note: this is covered in detail in the Data Management Workshop!
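A hedged sketch of these operations on the shipped auto dataset. The names price_per_lb and prices, and the 10000 cutoff, are invented for illustration; the merge uses the merge 1:1 syntax available in Stata 11 and later.

```stata
* Assumption: shipped auto dataset; new names below are hypothetical
sysuse auto, clear

* Generate a new variable from an expression
generate price_per_lb = price / weight

* Subset columns, then subset rows
keep make price weight price_per_lb
drop if price > 10000

* Save a small lookup file keyed on make, then merge it back in
preserve
keep make price
tempfile prices
save "`prices'"
restore
drop price
merge 1:1 make using "`prices'"

* _merge == 3 means the row was found in both datasets
tabulate _merge
```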

Basic Data Operations, part 2
Labeling
– To label a variable: . label variable varname "label text"
– To label values:
  . label define labelname 1 "high" 0 "low"
  . label values varname labelname
Renaming
– . rename varname1 varname2
Replacing values of an already generated variable
– . replace newvarname = expression
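A short sketch that strings these commands together on the shipped auto dataset. The variable names hi_price and pricey, the label name hilo, and the 6000 cutoff are all assumptions made for illustration. Note the straight double quotes around label text.

```stata
* Assumption: shipped auto dataset; names and cutoff are hypothetical
sysuse auto, clear

* Start everyone at 0, then replace with 1 above the cutoff
generate byte hi_price = 0
replace hi_price = 1 if price > 6000 & !missing(price)
replace hi_price = . if missing(price)

* Label the variable, define a value label, and attach it
label variable hi_price "Price above 6,000 dollars"
label define hilo 0 "low" 1 "high"
label values hi_price hilo

* Rename and check the result
rename hi_price pricey
tabulate pricey
```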

Apply Your Knowledge
Use the SATdata dataset.
– Generate a dichotomous variable called hi_score from the csat variable, where 1 indicates a score greater than 922 and 0 indicates a score less than or equal to 922.
– Label the values as 0 = low and 1 = high.

Beginning Analysis: Useful commands
Looking at distributions
– table, histogram, summarize
Testing the normality assumption
– sktest, ladder, gladder
Beginning to look at relationships
– tabulate, pwcorr, ttest, anova
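A sketch of how these commands might be called, using the shipped auto dataset as a stand-in (an assumption; the variables differ from SATdata). gladder is the graphical counterpart of ladder and is omitted here.

```stata
* Assumption: shipped auto dataset
sysuse auto, clear

* Looking at distributions
summarize price mpg, detail

* Testing the normality assumption
sktest price mpg
ladder mpg

* Beginning to look at relationships
tabulate rep78 foreign, chi2
pwcorr price mpg weight, sig
ttest mpg, by(foreign)
anova mpg foreign
```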

Apply Your Knowledge
– Generate a histogram of the expense variable.
– Generate a two-way table to see whether the distribution of expense differs across the values of your newly created hi_score variable.
– If you have time, see if there is a significant correlation between SAT scores and the average amount of money that each state spends on education (expense).

Building Regression Models
Regression models
– Linear regression: regress depvar indepvar1 indepvar2 …
– Logistic regression: logit depvar indepvar1 indepvar2 …
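As a rough illustration of the two commands, here is a sketch on the shipped auto dataset (an assumption; the model choices are for demonstration only, not a recommendation). The continuous outcome price goes to regress, and the 0/1 outcome foreign goes to logit.

```stata
* Assumption: shipped auto dataset; predictors chosen only for illustration
sysuse auto, clear

* Linear regression: continuous dependent variable
regress price mpg weight

* Logistic regression: dichotomous (0/1) dependent variable
logit foreign mpg weight
```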

Apply Your Knowledge
– Generate two scatterplots: one to look at the relationship between expense and csat, one to look at expense and hi_score.
– Depending on your assessment of each relationship (linear or not), run the appropriate regression to test for the relative effect of expense on either csat or hi_score.

Saving data, code, and output
Saving your newly transformed data
– save "pathname\filename.dta"
– outsheet using "pathname\filename"
Saving your code
– SAVE YOUR DO-FILE!!!!!
Saving your output: create a log file
– . log using "pathname\filename"
– . log close (!!!!) Not closing = not saving!
Saving graphs
– . graph save
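A minimal end-to-end sketch of saving everything from one session. It assumes the shipped auto dataset and writes to Stata's temporary directory; the file names (workshop_demo.log, auto_demo.dta, auto_demo.csv, price_hist.gph), the log name demo, and the variable price_k are all made up for illustration. Forward slashes in paths work on Windows as well as Mac and Linux.

```stata
* Assumption: shipped auto dataset; all file names below are hypothetical
local outdir "`c(tmpdir)'"

* Open a named log so it does not clash with a log that is already open
log using "`outdir'/workshop_demo.log", text replace name(demo)

sysuse auto, clear
generate price_k = price / 1000

* Save the transformed data as .dta and as comma-separated text
save "`outdir'/auto_demo.dta", replace
outsheet using "`outdir'/auto_demo.csv", comma replace

* Save a graph
histogram price_k
graph save "`outdir'/price_hist.gph", replace

* Close the log, otherwise the output is not fully written
log close demo
```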

Thanks! Questions? Gutman Library, room 323a