Marissa Gargano Sarrynna Sou Susan Edwards Chris Cummiskey


Complex Data Analysis: Improve your data analytic abilities using Stata and Early Grade Reading and Mathematics Assessment Data. Sunday March 5, 8:30 – 14:45, Georgia 2 (South Tower), CIES 2017, Downtown Sheraton, Atlanta, GA. Marissa Gargano, Sarrynna Sou, Susan Edwards, Chris Cummiskey.

Before we data dive. Introductions around the room: name, affiliation, position, favorite thing to do in ATL. What Stata version do you have (12, 13, 14)? Stata use/comfort level. Statistical background. Experience with data from EGRA/EGMA. What you hope to get out of this class.

Assumptions for Participation. You currently have your computer with you. You currently have Stata on your computer. Your computer has access to the internet. You currently have Microsoft Word and Excel on your computer. You have conducted analysis in Stata. You have a general grasp of statistics. You are not terrified by the word “regression”.

Workshop Brief: Review the agenda. Two sections: Tanzania Cross-sectional Survey [Morning]; Kenya Impact Evaluation Survey [Afternoon].

Data Access Agreements. Make sure everyone has signed the data access agreement. Make sure everyone has the Workshop Folder copied over to their computer.

Install “Groups” into your Stata. Connect to the internet. Open Stata and type the following into your command bar:
ssc install groups

Let's Get Started on Simple Survey Statistics. For more than 56 years, statistics research has been one of our primary specialties. Our statisticians, epidemiologists, and bio-statisticians conduct complex statistical analyses to support wide-ranging research programs in both laboratory and social sciences.

Census vs. Random Sample
Census: Entire population. Calculate true values. No uncertainty. No need to be statistically significant! Too expensive. Not practical.
Random Sample: Portion of population. Calculate estimated values. Will always have some uncertainty. Need for statistical significance. Cost effective.

Background: Simple Random Sample vs. School-Based Complex (Cluster) Sample
Simple Random Sample: Directly sample students. Only 1 stage sampled. Not practical. Too expensive (not cost effective).
Complex (Cluster) Sample: Sample school → student. More than 1 stage sampled. Practical. Cost effective.

Sample Analysis: Two Types
Descriptive: Describe the sample (not the population from which the sample came).
Inferential: Project the sample to the population.
[Diagram labels: Population, Sample, Sample weights, Cluster Effect]

Inferential Statistics: It is all about 2 simple things
Point Estimates (i.e. means, percentages): Use sample weights to make the sample representative of the population. This accounts for any over/under representation in the sample and minimizes bias.
Precision Estimates (i.e. standard error): Use sufficient sample size and small cluster size. Try to make the point estimates as precise as possible. Minimize the uncertainty of the point estimate.
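To make the role of sample weights concrete, here is a minimal sketch on a made-up four-student dataset. The variable names orf and wt and all of the values are hypothetical and are not taken from the workshop data.

```stata
* Hypothetical toy data: orf = oral reading fluency score,
* wt = sample weight (number of students each sampled student represents)
clear
input orf wt
10 300
20 300
40 100
50 100
end

* Unweighted mean: describes only the four sampled students
mean orf

* Weighted mean: projects the sample to the 800 students it represents
mean orf [pweight = wt]
```

The unweighted mean is 30, while the weighted mean is 22.5, because the two low-scoring students each stand in for three times as many students as the two high-scoring ones.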

Inferential Statistics: Put Point and Precision together
Once you have the proper Point Estimate and Precision Estimate, you can create various statistical figures:
95% Confidence Intervals: Representative of the population and sufficiently tight.
Formal statistical tests: Test for differences among groups. Test for correlations.
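The way the point estimate and its standard error combine can be written out as follows. These are standard textbook expressions, not formulas shown on the slides; the design effect approximation assumes equal cluster sizes, with m the number of students sampled per school and ρ the intraclass correlation (both symbols are introduced here for illustration).

```latex
% 95% confidence interval for a design-based estimate;
% d is the design degrees of freedom (number of PSUs minus number of strata)
\hat{\mu} \;\pm\; t_{0.975,\,d}\,\widehat{\mathrm{SE}}(\hat{\mu}),
\qquad d = n_{\mathrm{PSU}} - n_{\mathrm{strata}}

% Approximate inflation of the standard error caused by clustering
\widehat{\mathrm{SE}}_{\mathrm{cluster}} \approx \sqrt{\mathrm{deff}}\;\widehat{\mathrm{SE}}_{\mathrm{SRS}},
\qquad \mathrm{deff} \approx 1 + (m - 1)\,\rho
```

Under this approximation, sampling fewer students per school (smaller m) keeps the design effect, and therefore the confidence interval, smaller for a given total sample size.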

“Inferential Analysis”: Failure to account for sample methodology
If students were sampled with SRS: Not weighted. Does not account for cluster effect.
Analyzed accounting for the proper sample methodology: Weighted. Accounts for cluster effect.
* p<0.05
Treating the data as SRS: Incorrectly estimates the population's mean orf (public looks better than private). Overly confident in the precision of the orf means [incorrectly concludes statistically significant differences].
Source: RTI International, Grade 2 Early Grade Reading Assessment (EGRA) and Snapshot of School Management Effectiveness in Indonesia (EdData II), March-April 2014.

Stata Coding
If students were sampled with SRS: mean orf, over(private)
Analyzed accounting for the proper sample methodology: svy: mean orf, over(private)
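The contrast between the two commands can be tried on simulated data. The sketch below invents a clustered sample of 40 schools with 10 students each; the variable names school_id and wt_final, the weights, and the score model are all assumptions made for illustration and do not come from the Indonesia or Tanzania files.

```stata
* Simulate a hypothetical two-stage sample: 40 schools, 10 students each
clear
set seed 12345
set obs 40
gen school_id = _n
gen private = mod(_n, 4) == 0            // 10 private, 30 public schools
gen school_effect = rnormal(0, 10)       // shared by all students in a school
expand 10                                // 10 sampled students per school
gen orf = 30 + 8*private + school_effect + rnormal(0, 12)
gen wt_final = cond(private, 50, 200)    // hypothetical sample weights

* Naive analysis: treats the 400 students as a simple random sample
mean orf, over(private)

* Design-based analysis: schools are the clusters, students carry weights
svyset school_id [pweight = wt_final]
svy: mean orf, over(private)
```

Comparing the standard errors in the two output tables shows how much precision the naive analysis claims when it ignores the clustering of students within schools.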

What is “svy”?
“svy” (for “survey design”) is the Stata command prefix used for complex sample analysis. It is equivalent to SPSS's “csaplan” (complex sample analysis plan).
IF set up correctly, it: Accounts for the over/under representation in the sample, using the sample weights. Accounts for the cluster effect, using the Taylor linearized variance estimation.
Very easy to use once the svyset has been specified and saved to the dataset.

What is “svyset”?
“svyset” (for “survey design set up”) is the Stata command used at the data processing stage. It tells Stata how the complex sample was drawn (so you can analyze using the “svy” command).
Very easy to mess up if: The data processor has little experience with complex samples. The data processor has not been involved with the sample methodology and drawing the sample. The data processor has not been involved in the questionnaire design and pilot data collection.
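As an illustration of what the set-up step looks like, the sketch below declares a stratified two-stage design on simulated data. The names region, school_id, and wt_final are hypothetical placeholders; the real variable names depend on how each dataset was processed.

```stata
* Simulate a hypothetical sample: 40 schools in 2 strata, 10 students each
clear
set seed 2017
set obs 40
gen school_id = _n
gen region = mod(_n, 2) + 1              // two hypothetical strata
expand 10                                // 10 sampled students per school
gen orf = 30 + rnormal(0, 12)
gen wt_final = cond(region == 1, 150, 80)

* Declare the design: PSU = school, weight = wt_final, strata = region,
* with Taylor linearized variance estimation
svyset school_id [pweight = wt_final], strata(region) vce(linearized)

* Summarize the declared design (strata, PSUs, observations per PSU)
svydescribe
```

Once the dataset is saved after svyset, the settings travel with the file, so later analyses only need the svy: prefix.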

OK, TOO MUCH THEORY… LET'S GET TANGIBLE in TZ
Open a do-file. Save it to the Tanzania “Do Files” folder:
<path>\Analysis Workshop\Materials\Tanzania_2013\Do Files\ 1_Analyze_Tanzania-Data_CIES2017_Analysis-Workshop.do

In the Do-File
Read in the Tanzania Census Data [School Level]:
use "…<path>\CIES-2017 Complex Analysis Workshop\Materials\Tanzania_2013\Data\ Census_List-Frame For Tanzania2013-National Survey_PLSE7.dta", clear
Read in the Tanzania Sampled Data [Student Level]:
use "...<path>\CIES-2017 Complex Analysis Workshop\Materials\Tanzania_2013\Data\ PUF_3.Tanzania2014-National_grade2_EGRA-EGMA-SSME_English-Kiswahili.dta", clear

10 Minute Break?