API-208: Stata Review Session Daniel Yew Mao Lim Harvard University Spring 2013.

Roadmap
  Getting Started
  Importing Data
  Data Management
  Data Analysis
  Programming

Getting Started: Orientation
  COMMAND WINDOW: commands typed here
  VARIABLES WINDOW: variable list shown here
  RESULTS WINDOW: results and commands displayed here
  REVIEW WINDOW: past commands appear here

Getting Started: Syntax

Getting Started: Syntax Example

Getting Started: Useful Commands I
  if
  in
  by
  sum
  help
  ssc install
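A minimal sketch of the commands listed on this slide. It assumes the `auto` example dataset that ships with Stata; that dataset and its variable names are not from the slides.

```stata
* Load the example dataset shipped with Stata (an assumption, not from the slides)
sysuse auto, clear

* if: restrict a command to observations that meet a condition
summarize price if foreign == 1

* in: restrict a command to a range of observation numbers
list make price in 1/5

* by: repeat a command for each group (bysort sorts the data first)
bysort foreign: summarize mpg

* sum is an accepted abbreviation of summarize
sum mpg weight

* help opens the documentation for a command
help summarize

* ssc install downloads a user-written package (needs internet access),
* for example the psmatch2 package mentioned later in these slides:
* ssc install psmatch2
```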

Getting Started: Useful Commands II
Arithmetic Operators
  +   addition
  -   subtraction
  *   multiplication
  /   division
  ^   power

Getting Started: Useful Commands III
Relational Operators
  >   greater than
  <   less than
  >=  greater than or equal to
  <=  less than or equal to
  ==  equal to
  ~=  not equal to
  !=  not equal to

Getting Started: Useful Commands IV
Logical (Boolean) Operators
  &   and (example: A & B)
  |   or (example: A | B)
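The arithmetic, relational, and logical operators can be combined in `generate` expressions and `if` conditions. The sketch below again assumes Stata's bundled `auto` dataset; the new variable names are made up for illustration.

```stata
sysuse auto, clear

* Arithmetic: price per pound of weight, and mpg squared
generate price_per_lb = price / weight
generate mpg_sq = mpg^2

* Relational: count cars with mpg of at least 25
count if mpg >= 25

* != and ~= are interchangeable
count if foreign != 1
count if foreign ~= 1

* Logical and: foreign cars with mpg above 25
count if foreign == 1 & mpg > 25

* Logical or: cars that are foreign or get more than 25 mpg
count if foreign == 1 | mpg > 25

* Missing values compare as larger than any number, so exclude them explicitly
count if rep78 > 3 & !missing(rep78)
```

Note that a single `=` assigns a value (as in `generate`), while `==` tests for equality.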

Getting Started: Example

Getting Started: Worked Example
  Average share of ADB loans during first and second years on UNSC between 1985 and 2004
  Average share of ADB loans during first and second years on UNSC between 1985 and 2004, for each country
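One way the two questions could be answered with `if`, `by`, and the operators above. The dataset used in the session is not part of this transcript, so the sketch builds toy data; the variable names `country`, `year`, `unscyear` (0 = not on the council, 1 = first year, 2 = second year), and `adbshare` are hypothetical.

```stata
* Toy data standing in for the session's dataset (all names and values hypothetical)
clear
input str3 country year unscyear adbshare
"AAA" 1990 1 4.0
"AAA" 1991 2 6.0
"AAA" 1992 0 3.0
"BBB" 1984 1 9.0
"BBB" 1995 1 2.0
"BBB" 1996 2 4.0
end

* Average share during the first or second council year, 1985 to 2004
summarize adbshare if (unscyear == 1 | unscyear == 2) & year >= 1985 & year <= 2004

* The same average, separately for each country
bysort country: summarize adbshare if (unscyear == 1 | unscyear == 2) & year >= 1985 & year <= 2004
```

The parentheses around the "or" condition matter: without them, `&` binds more tightly than `|` and the year restriction would apply only to second-year observations.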

Getting Started: Creating Do-files
  Text file containing all commands relevant to the analysis
  Useful for batch processing

Getting Started: Creating Do-files

Getting Started: Commenting in Do-files
  *                 everything written on this line is ignored
  /* text here */   everything written in between is ignored

Getting Started: Commenting in Do-files
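A short do-file sketch showing the comment styles in context. The file name `analysis.do` is hypothetical, and the `//` style is not on the slide; it is a third form that Stata accepts inside do-files.

```stata
* analysis.do (hypothetical file name); run it with:  do analysis.do
* A line that starts with an asterisk is ignored

/* Everything between the slash-star markers is ignored,
   even across several lines */

sysuse auto, clear      // two slashes start a comment at the end of a line
summarize price mpg     /* this form also works inline */
```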

Importing Data: Data Types
  Stata data (.dta)
  .xls
  .csv
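A sketch of reading each of the three file types. To stay self-contained it first writes copies of Stata's bundled `auto` dataset to the working directory; the `auto_copy.*` file names are made up. `import delimited` and `export delimited` require Stata 13 or later (earlier versions used `insheet` and `outsheet`).

```stata
sysuse auto, clear

* Stata format (.dta): save a copy, then load it with use
save "auto_copy.dta", replace
use "auto_copy.dta", clear

* Excel (.xls): firstrow treats the first row as variable names
export excel using "auto_copy.xls", firstrow(variables) replace
import excel using "auto_copy.xls", firstrow clear

* Comma-separated (.csv)
sysuse auto, clear
export delimited using "auto_copy.csv", replace
import delimited using "auto_copy.csv", clear
```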

Data Management: Data Structure
  Cross-sectional
  Time-series
  Panel

Data Management: Datasets
  merge : add variables across datasets.
  append : add observations across datasets.
  reshape : convert data from wide to long or from long to wide.
  rename : change the name of a variable.
  drop : eliminate variables or observations.
  keep : keep variables or observations.
  sort : arrange into ascending order.
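The dataset commands can be sketched on toy data as follows. Every dataset, file name, and variable name here is invented for illustration, and the `merge m:1` syntax assumes Stata 11 or later.

```stata
* Toy wide-format data: one row per country, one aid column per year
clear
input str3 country aid2000 aid2001
"AAA" 10 12
"BBB" 20 25
end

* reshape: wide to long gives one row per country-year
reshape long aid, i(country) j(year)

* rename a variable, then sort into ascending order
rename aid aidamount
sort country year
save "aid_long.dta", replace

* Toy country-level dataset to merge in
clear
input str3 country population
"AAA" 5
"BBB" 8
end
save "pop.dta", replace

* Toy extra observations to append
clear
input str3 country year aidamount
"CCC" 2000 7
end
save "aid_extra.dta", replace

* merge: add the population variable to every country-year row
use "aid_long.dta", clear
merge m:1 country using "pop.dta"

* keep and drop act on observations (with if) or on variables
keep if _merge == 3
drop _merge

* append: add the extra observations underneath
append using "aid_extra.dta"
```

After the `append`, the new country has a missing value for `population` because the appended file does not contain that variable.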

Data Management: Missing Data
  Recode
  List-wise deletion
  Multiple imputation
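A sketch of the first two approaches on toy data, assuming a placeholder code of -99 for "no answer" (a hypothetical coding; all variable names are invented). Multiple imputation is handled by Stata's `mi` suite and is not shown here; see `help mi`.

```stata
* Toy data in which -99 stands for a missing answer (hypothetical coding)
clear
input id income age
1 50 30
2 -99 41
3 70 -99
4 65 52
5 40 25
6 80 60
end

* Recode the placeholder to Stata's missing value (.)
mvdecode income age, mv(-99)

* Report how many values are missing in each variable
misstable summarize income age

* Estimation commands drop incomplete rows automatically (list-wise deletion)
regress income age
display e(N)
```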

Data Management: Outliers
  Impossible values
  Extreme values
  Logarithmic function
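A sketch of how each point might be checked, assuming Stata's bundled `auto` dataset. The plausibility cutoffs for `mpg` are illustrative assumptions, not values from the slides.

```stata
sysuse auto, clear

* Impossible values: list anything outside a plausible range
* (the cutoffs are illustrative assumptions)
list make mpg if mpg <= 0 | mpg > 100

* Extreme values: the detail option shows percentiles and the
* smallest and largest observations
summarize price, detail

* Logarithmic function: a log transformation pulls in a long right tail
generate lnprice = ln(price)
summarize price lnprice
```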

Data Management: Modifying Data
  generate : create new variable.
  replace : replace old values.
  recode : change values by conditions.
  label define : defines value labels (or “dictionary”).
  label values : attaches value labels (or “dictionary”) to a variable.
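The five commands in sequence, sketched on Stata's bundled `auto` dataset. The new variable names, the 3000-pound cutoff, and the label text are all made up for illustration.

```stata
sysuse auto, clear

* generate: create a new variable
generate heavy = 0

* replace: overwrite values where a condition holds
replace heavy = 1 if weight > 3000

* recode: collapse the 1-5 repair record into three groups,
* writing the result to a new variable
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(repgroup)

* label define creates the dictionary; label values attaches it
label define repgroup_lbl 1 "poor" 2 "average" 3 "good"
label values repgroup repgroup_lbl

tabulate repgroup heavy
```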

Data Analysis: Exploring Data
  summarize : descriptive statistics.
  codebook : display contents of variables.
  describe : display properties of variables.
  count : counts cases.
  list : show values.
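A quick first look at a dataset using these commands, assuming Stata's bundled `auto` dataset.

```stata
sysuse auto, clear

* describe: variable names, storage types, and labels
describe

* codebook: range, missing values, and unique values of one variable
codebook rep78

* summarize: observations, mean, standard deviation, minimum, maximum
summarize price mpg

* count: number of cases that meet a condition
count if missing(rep78)

* list: show the values themselves
list make price mpg in 1/5
```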

Data Analysis: Analyzing Data
  tabstat : tables with statistics.
  tabulate : one- or two-way frequency tables (related: tab1 and tab2).
  table : calculates and displays tables of statistics.
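A sketch of the tabulation commands, assuming Stata's bundled `auto` dataset. `table` is left out of the runnable code because its options changed in Stata 17; check `help table` for the syntax in your version.

```stata
sysuse auto, clear

* One-way frequency table
tabulate foreign

* Two-way frequency table with row percentages
tabulate rep78 foreign, row

* tab1: a separate one-way table for each variable listed
tab1 foreign rep78

* tab2: two-way tables for all pairs of the variables listed
tab2 foreign rep78

* tabstat: chosen statistics for several variables, by group
tabstat price mpg, by(foreign) statistics(mean sd n)
```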

Data Analysis: Worked Example Exercise 1: Create an aidsize variable with three categories based on the amount of ADB loans received (adbconstant): small (0 to 99), medium (100 to 999), and large (1000 or more). Include labels.
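One possible solution sketch for Exercise 1. The session's dataset is not part of this transcript, so the values of `adbconstant` below are toy data; the numeric codes 1 to 3 and the label name `aidsize_lbl` are also assumptions. The upper bounds are written as `< 100` and `< 1000` so that non-integer amounts such as 99.5 still fall into a category.

```stata
* Toy values for adbconstant (the real data are not in this transcript)
clear
input adbconstant
0
50
99
100
999
1000
2500
.
end

* Start with missing, then fill in each category
generate aidsize = .
replace aidsize = 1 if adbconstant >= 0 & adbconstant < 100
replace aidsize = 2 if adbconstant >= 100 & adbconstant < 1000
replace aidsize = 3 if adbconstant >= 1000 & !missing(adbconstant)

* Attach labels to the three categories
label define aidsize_lbl 1 "small" 2 "medium" 3 "large"
label values aidsize aidsize_lbl

tabulate aidsize, missing
```

The `!missing()` check on the last category is needed because Stata treats missing values as larger than any number; without it, observations with no loan amount would be coded as large.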

Data Analysis: MLE
  regress : standard OLS.
  probit / logit : binary dependent variable.
  oprobit : ordered probit regression.
  ologit : ordered logistic regression.
  xtreg : fixed, between, and random effects, and population averaged linear models.
  xtregar : fixed and random effects models with AR(1) disturbance.
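A sketch of the cross-sectional estimation commands, assuming Stata's bundled `auto` dataset; the model specifications are chosen only to show the syntax, not because they are sensible models. `xtreg` and `xtregar` are omitted because they need panel data declared with `xtset`, and `auto` is cross-sectional.

```stata
sysuse auto, clear

* OLS: continuous dependent variable
regress price mpg weight

* Binary dependent variable (foreign is coded 0/1)
logit foreign mpg weight
probit foreign mpg weight

* Ordered dependent variable (rep78 takes the values 1 to 5)
ologit rep78 mpg weight
oprobit rep78 mpg weight
```

All of these follow the same pattern: the command name, then the dependent variable, then the independent variables.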

Data Analysis: Matching
  psmatch2 : propensity score matching.
  cem : coarsened exact matching.

Data Analysis: Interpreting Coefficients

Programming

Conclusion
  Pattern recognition
  Self-learning
  Programming

Q&A Thank you!