Getting Started with Stata
2/11/2010
Tom Tomberlin, Nealia Khan
Learning Technologies Center, Harvard Graduate School of Education

Agenda
I. Overview of Stata
II. Getting Started
III. Do-files
IV. Basic data cleaning
V. Basic data management
VI. Beginning analysis
VII. Special topics (time permitting)

Overview
Why use Stata?
• Availability
• Can self-program, or use menus
• Cutting-edge statistical methods (including user-defined functions)
• Publication-quality graphics

Stats and Graphics

Getting Started
A word about programming in and using Stata:
– Stata is case sensitive, so Myvar is different from myvar
– All commands in Stata are lower-case
– "and" = &, "or" = |, "not" = !
– Assignment is "=", value equivalency is "=="
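
The operator rules above can be sketched with a small toy dataset. The values and the variable name high are made up for illustration; only csat and expense come from the workshop data.

```stata
* Toy data (made-up values) to illustrate operators and case sensitivity
clear
input csat expense
900 5000
950 4500
1010 4000
880 6000
end

* "=" assigns a value; "==" tests whether two values are equal
gen high = 0
replace high = 1 if csat > 922

* & (and), | (or), ! (not) combine or negate conditions
count if csat > 922 & expense < 4800
count if csat < 890 | csat > 1000
count if !(high == 1)

* Names are case sensitive: Myvar and myvar are two different variables
gen Myvar = 1
gen myvar = 2
```

Typing Count or GEN instead of count or gen should produce an "unrecognized command" error, since commands are lower-case.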

Windows in Stata

Getting Started
Opening Stata
Opening data:
– Stata-formatted data: "use" command
– Comma-separated values: "insheet using"
– Tab-delimited values: "insheet using"
– Flat files: create a dictionary
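
A minimal sketch of the first two cases, assuming a made-up two-row dataset and hypothetical file names (toy_sat.csv, toy_sat.dta) written to the current working directory. The file is created first so that the example does not depend on any workshop file.

```stata
* Build a tiny made-up dataset and write it out as a comma-separated file
clear
input csat expense
900 5000
950 4500
end
outsheet using "toy_sat.csv", comma replace

* Comma-separated file: insheet using, with the file name in quotes
insheet using "toy_sat.csv", comma clear

* Stata-formatted file: save it, then reopen it with use
save "toy_sat.dta", replace
use "toy_sat.dta", clear
```

In Stata 13 and later, import delimited supersedes insheet, although insheet is still accepted.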

Apply Your Knowledge
Exercise 1:
– Open Stata
– Using the insheet command, open the comma-separated values data file located at F:\workshops\SATdata.csv
(HINT: all Stata commands must be written in lower case. Don't forget to put pathnames in quotes!)

Examining Data
Look at your data: did our data import correctly?
– How are our data measured?
– What kinds of variables do we have?
How would we describe the distribution of our data?
– Graphs
  • Histograms
  • Scatterplots
– Charts/Tables
  • Frequency tables
  • Cross-tabs

Looking at Data
There are several ways to look at our data in Stata:
– Editor
– Browser
– Stata commands
  • codebook
  • des
  • Tables of frequency and distribution
  • Graphs of distribution

Examining Data
Let's look at how the variable 'csat' is distributed:
– hist csat
– tab csat
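
As a rough illustration of these commands together, the sketch below runs them on a toy dataset with made-up values. With the workshop file loaded, the input block would not be needed.

```stata
* Toy data (made-up values) standing in for the workshop file
clear
input csat expense
900 5000
950 4500
1010 4000
880 6000
930 5200
990 4300
end

* Variable names, storage types, and labels ("des" is the short form)
describe

* Range, missing values, and summary of one variable
codebook csat

* Frequency table ("tab") and histogram ("hist") of csat
tabulate csat
histogram csat
```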

Do-files
What are do-files? Do-files are essentially a list of all the commands that you wish to run and the settings that you would like to set.
– Why use them?
  • Replication
  • Collaboration
  • Audit trail
  • Help
– How to create and run one

Do-files Creating and running a do-file
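
A minimal sketch of what a do-file might contain, assuming a hypothetical file name example.do and a made-up toy dataset in place of the workshop file. Saved as a text file, it could be run from the Command window with do example.do.

```stata
* example.do (hypothetical file name)

* Settings: start from an empty dataset and do not pause long output
clear
set more off

* Keep an audit trail of everything that runs (log name is an assumption)
log using "example_log", text replace

* Toy data (made-up values) standing in for the workshop file
input csat expense
900 5000
950 4500
1010 4000
880 6000
end

* The commands you would otherwise type one at a time
summarize csat
tabulate csat

log close
```

Because every step is written down in order, a collaborator can rerun the file and should get the same results.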

Do-files
Exercise 2: Create a simple do-file from the commands that you have already entered. (HINT: you must clear the data in memory before opening a new dataset.)

Basic Data Cleaning
– Labeling
  • To label a variable: label var varname "label"
  • To label values:
      label define labelname 1 "high" 0 "low"
      label val varname labelname
– Renaming
  • ren varname1 varname2
– Recoding
  • recode varname oldvalue=newvalue
– Generating a new variable
  • gen newvarname=somevalue
– Replacing values of an already generated variable
  • replace newvarname=somevalue
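
The cleaning commands above might be combined as in the following sketch. The data values, the 1/2-coded variable public, the label set yesno, and the variable big_spender are all hypothetical and used only for illustration.

```stata
* Toy data (made-up values); "public" is a hypothetical 1/2-coded variable
clear
input score expense public
900 5000 1
950 4500 2
1010 4000 1
880 6000 2
end

* Renaming: old name first, then new name
rename score csat

* Labeling a variable (the label text goes in double quotes)
label var csat "SAT score (toy data)"

* Recoding: turn the 1/2 coding into 1/0
recode public 2=0

* Labeling values: define the label set, then attach it to the variable
label define yesno 0 "no" 1 "yes"
label values public yesno

* Generating a new variable, then replacing some of its values
gen big_spender = 0
replace big_spender = 1 if expense > 4800
```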

Basic Data Management
Subsetting
– keep
– drop
– if
Merging
– merge
– Both files must be sorted by the linkage variable!
– ex: merge linkage_var using "F:\workshops\newfile"
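
A sketch of sorting, merging, and subsetting with two made-up files linked by a hypothetical variable id. It uses the older merge syntax shown on the slide and is meant to be run as a whole from a do-file, since the temporary file name lives in a local macro.

```stata
* Second (made-up) file: sort by the linkage variable and save it
clear
input id expense
1 5000
2 4500
3 4000
end
sort id
tempfile extra
save "`extra'"

* First (made-up) file: also sorted by the linkage variable
clear
input id csat
1 900
2 950
3 1010
end
sort id

* Older merge syntax, as on the slide; _merge records where each row matched
merge id using "`extra'"
tabulate _merge

* Subsetting: drop a variable, then keep only observations meeting a condition
drop _merge
keep if csat > 922
```

Stata 11 and later also offer the form merge 1:1 id using filename, which does not require sorting beforehand.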

Basic Data Cleaning
Exercise 3: Generate a dichotomous variable called hi_score from the csat variable, where a value of 1 indicates a score greater than 922 and a 0 indicates a score less than or equal to 922. Label it as 0=low and 1=high.

Beginning Analysis
Univariate analysis
• summarize
• histogram
• table
Bivariate analysis
• tabulate
• pwcorr
• ttest
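
One way these commands might be used, sketched on a toy dataset with made-up values. The grouping variables hi_spend and above are hypothetical and created only so that tabulate and ttest have something to compare.

```stata
* Toy data (made-up values)
clear
input csat expense
900 5000
950 4500
1010 4000
880 6000
930 5200
990 4300
end

* Hypothetical 0/1 grouping variables for the bivariate commands
gen hi_spend = expense > 4800
gen above = csat > 922

* Univariate: summary statistics, histogram, one-way frequency table
summarize csat
histogram csat
tabulate hi_spend

* Bivariate: two-way table, pairwise correlation with p-value, t test by group
tabulate hi_spend above
pwcorr csat expense, sig
ttest csat, by(hi_spend)
```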

Apply Your Knowledge
Exercise 4:
• Generate a histogram of the expense variable.
• Generate a two-way table to see if distributions are the same or different for the values of expense by the different values of your newly created hi_score variable.
• If you have time, see if there is a significant correlation between scores on SATs and the average amount of money that each state spends on education.

Beginning Analysis
Multivariate models
– Linear regression
  • regress depvar indepvar1 indepvar2 … indepvarN
– Logistic regression
  • logit depvar indepvar1 indepvar2 … indepvarN
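
A minimal sketch of both model commands on a toy dataset. The values are made up, chosen so that the binary outcome is not perfectly predicted by expense, and the variable high_csat is hypothetical. The estimates carry no substantive meaning.

```stata
* Toy data (made-up values)
clear
input csat expense
880 6000
900 4200
910 5500
930 5200
950 4500
990 5800
1010 4000
870 4800
end

* Hypothetical binary outcome for the logistic model
gen high_csat = csat > 922

* Linear regression: continuous dependent variable first, then predictors
regress csat expense

* Logistic regression: 0/1 dependent variable first, then predictors
logit high_csat expense
```

In both commands the first variable listed is the dependent variable and the rest are independent variables.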

Apply Your Knowledge
Exercise 5: Generate two scatterplots: one to look at the relationship between expense and csat, and one to look at expense and hi_score. Depending on your assessment of each relationship (linear or not), run the appropriate regression to test for the relative effect of expense on either csat or hi_score.

Thanks
Questions?
Gutman Library, room 323a&b