Published by Marlene Hunt. Modified over 6 years ago.
1
Marissa Gargano Sarrynna Sou Susan Edwards Chris Cummiskey
Complex Data Analysis: Improve your data analytic abilities using Stata and Early Grade Reading and Mathematics Assessment Data. Sunday March 5, 8:30 – 14: Georgia 2 (South Tower), CIES 2017, Downtown Sheraton, Atlanta, GA.
2
Before we data dive Introductions around the room:
Name, Affiliation, Position, Favorite thing to do in ATL. What Stata version do you have (12, 13, 14)? Stata use/comfort level. Statistical background. Experience with data from EGRA/EGMA. What you hope to get out of this class.
3
Assumption for Participation
Currently have your computer with you. Currently have Stata on your computer. Computer has access to the internet. Currently have Microsoft Word and Excel on your computer. Have conducted analysis in Stata. Have a general grasp of statistics. Are not terrified by the word “regression”.
4
Workshop Brief: Review the Agenda
Two Sections: Tanzania Cross-sectional Survey [Morning] Kenya Impact Evaluation Survey [Afternoon]
5
Data Access Agreements
Make sure everyone has signed the data access agreement… Make sure everyone has the Workshop Folder copied over to their computer.
6
Install “groups” in Stata.
Connect to the internet. Open Stata and type the following into the command bar:
ssc install groups
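A minimal sketch of the install step followed by a quick check that the command works. The `auto` dataset ships with Stata and is used here only as a stand-in; it is not part of the workshop materials.

```stata
* Install the user-written -groups- command from SSC (needs internet access).
ssc install groups, replace

* Try it on a dataset that ships with Stata: list each observed
* combination of the two variables with its frequency and percent.
sysuse auto, clear
groups foreign rep78
```

If the install fails, the usual cause is a blocked or missing internet connection rather than a problem with Stata itself.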
7
Let’s get started on simple survey statistics.
For more than 56 years, statistics research has been one of our primary specialties. Our statisticians, epidemiologists, and bio-statisticians conduct complex statistical analyses to support wide-ranging research programs in both laboratory and social sciences.
8
Census vs. Random Sample
Census: Entire population. Calculate true values. No uncertainty. No need for statistical significance. Too expensive. Not practical.
Random Sample: Portion of population. Calculate estimated values. Will always have some uncertainty. Need for statistical significance. Cost effective.
9
Background: Simple Random Sample vs. School-Based Complex Random
Simple Random Sample: Directly sample students. Only 1 stage sampled. Not practical. Too expensive (not cost effective).
Complex (Cluster) Sample: Sample schools, then students. More than 1 stage sampled. Practical. Cost effective.
10
Sample Analysis: Two Types
Descriptive: Describe the sample (not the population from which the sample came).
Inferential: Project the sample to the population.
[Diagram labels: Population, Sample, Cluster Effect, Sample Weights]
11
Inferential Statistics: It is all about 2 simple things
Point Estimates (i.e. means, percentages): Use sample weights to make the sample representative of the population. Accounts for any over/under representation in the sample. Minimize bias.
Precision Estimates (i.e. standard errors): Use sufficient sample size and small cluster size. Try to make the point estimates as precise as possible. Minimize the uncertainty of the point estimate.
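To make the role of sample weights concrete, here is a small sketch with four made-up pupils. The variable names `orf`, `private`, and `wt` and all values are assumptions for illustration, not workshop data. Private-school pupils are half of this sample, but their weights say they stand for only 20 of the 80 pupils in the population.

```stata
* Toy data: hypothetical scores and weights, for illustration only.
clear
input orf private wt
10 0 30
20 0 30
40 1 10
50 1 10
end

* Unweighted mean: (10 + 20 + 40 + 50) / 4 = 30
mean orf

* Weighted mean: (10*30 + 20*30 + 40*10 + 50*10) / 80 = 22.5
mean orf [pweight=wt]
```

The unweighted estimate is pulled upward by the over-represented private pupils; the weights restore each group to its population share.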
12
Inferential Statistics: Put Point and Precision together
Once you have the proper Point Estimate and Precision Estimate, you can create various statistical figures. 95% Confidence Intervals: representative of the population and sufficiently tight. Formal statistical tests: test for differences among groups; test for correlations.
13
“Inferential Analysis”: Failure to account for sample methodology
If students were sampled with SRS: Not weighted. Does not account for cluster effect. This incorrectly estimates the population’s mean orf (oral reading fluency), so public looks better than private, and is overly confident in the precision of the orf means [incorrectly concluding statistically significant differences].
Analyzed accounting for the proper sample methodology: Weighted. Accounts for cluster effect.
* p<0.05
Source: RTI International, Grade 2 Early Grade Reading Assessment (EGRA) and Snapshot of School Management Effectiveness in Indonesia (EdData II), March-April 2014.
14
Stata Coding: svy: mean orf, over(private)
If students were sampled with SRS: mean orf, over(private)
Analyzed accounting for the proper sample methodology: svy: mean orf, over(private)
* p<0.05
The SRS analysis incorrectly estimates the population’s mean orf, so public looks better than private, and it is overly confident in the precision of the orf means [incorrectly concluding statistically significant differences].
Source: RTI International, Grade 2 Early Grade Reading Assessment (EGRA) and Snapshot of School Management Effectiveness in Indonesia (EdData II), March-April 2014.
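The contrast between the two commands can be tried on simulated data. The sketch below is not the Indonesia dataset: the school counts, weights, effect sizes, and the variable names `school_id`, `urban`, and `wt` are all assumptions chosen to mimic a clustered sample in which urban schools are oversampled.

```stata
* Simulated data only: every variable name and number here is hypothetical.
clear
set seed 2017
set obs 40
gen school_id = _n
gen private = school_id > 20
gen urban = mod(school_id, 2)
* Shared school-level component: pupils in the same school look alike.
gen school_effect = rnormal(0, 12)
* Ten sampled pupils per school.
expand 10
gen orf = max(0, 30 + 8*private + 10*urban + school_effect + rnormal(0, 10))
* Urban schools are oversampled here, so their pupils carry smaller weights.
gen wt = cond(urban, 20, 180)

* Naive analysis: treats the 400 pupils as a simple random sample.
mean orf, over(private)

* Design-based analysis: declare schools as clusters and apply the weights.
svyset school_id [pweight=wt]
svy: mean orf, over(private)
```

Comparing the two output tables, the naive means lean toward the oversampled urban pupils, and the naive standard errors ignore the fact that pupils within a school are correlated.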
15
What is “svy”? “svy” for “survey design”
Stata command used for complex sample analysis. Equivalent to SPSS’s “csaplan” (complex sample analysis plan). IF set up correctly: accounts for the over/under representation in the sample, using the sample weights; accounts for the cluster effect, using Taylor linearized variance estimation. Very easy to use once the svyset has been specified and saved to the dataset.
16
What is “svyset”? “svyset” for “survey design set up”
Stata command used at the data processing stage. Tells Stata how the complex sample was drawn (so you can analyze using the “svy” command). Very easy to mess up if: the data processor has little experience with complex samples; the data processor has not been involved with the sample methodology and drawing the sample; the data processor has not been involved in the questionnaire design or pilot data collection.
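As a rough illustration of what a svyset declaration looks like, the sketch below builds a simulated stratified two-stage sample and declares it. This is not the design of the workshop datasets: `region`, `school_id`, `student_id`, `wt_final`, and all numbers are hypothetical, and a real declaration has to follow the actual sampling methodology.

```stata
* Simulated two-stage sample: 3 strata, 10 schools each, 8 pupils per school.
* All names and values are hypothetical.
clear
set seed 5
set obs 30
gen school_id = _n
gen region = mod(_n, 3) + 1
gen school_effect = rnormal(0, 12)
expand 8
bysort school_id: gen student_id = _n
gen orf = max(0, 35 + school_effect + rnormal(0, 10))
gen wt_final = 50 * region

* Stage 1: schools within region strata. Stage 2: pupils within schools.
svyset school_id [pweight=wt_final], strata(region) || student_id

* Typing svyset alone displays the stored settings; svydescribe
* summarizes strata and sampling units.
svyset
svydescribe

* Any svy: command now uses the declared design.
svy: mean orf
* Design effects (DEFF, DEFT) show how much clustering inflates the variance.
estat effects
```

Because no finite population correction is declared, Stata bases the variance on the first stage only and prints a note saying so. Once the dataset is saved, the svyset settings are stored with it, which is why a correct declaration at the processing stage matters.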
17
OK, TOO MUCH THEORY… LET’S GET TANGIBLE in TZ
Open a do-file. Save it to the Tanzania “Do Files” folder: <path>\Analysis Workshop\Materials\Tanzania_2013\Do Files\1_Analyze_Tanzania-Data_CIES2017_Analysis-Workshop.do
18
In the Do-File
Read in the Tanzania Census Data [School Level]:
use "…<path>\CIES-2017 Complex Analysis Workshop\Materials\Tanzania_2013\Data\Census_List-Frame For Tanzania2013-National Survey_PLSE7.dta", clear
Read in the Tanzania Sampled Data [Student Level]:
use "...<path>\CIES-2017 Complex Analysis Workshop\Materials\Tanzania_2013\Data\PUF_3.Tanzania2014-National_grade2_EGRA-EGMA-SSME_English-Kiswahili.dta", clear
19
10 Minute Break?