
18b. PROC SURVEY Procedures in SAS ®

Prerequisites

Recommended modules to complete before viewing this module:
- 1. Introduction to the NLTS2 Training Modules
- 2. NLTS2 Study Overview
- 3. NLTS2 Study Design and Sampling
- NLTS2 Data Sources, either 4. Parent and Youth Surveys or 5. School Surveys, Student Assessments, and Transcripts
- 9. Weighting and Weighted Standard Errors

Prerequisites (cont’d)

Recommended modules to complete before viewing this module:
- NLTS2 Documentation
  - 10. Overview
  - 11. Data Dictionaries
  - 12. Quick References
- Accessing Data
  - 14b. Files in SAS
  - 15b. Frequencies in SAS
  - 16b. Means in SAS
  - 17b. Manipulating Variables in SAS

Overview

- PROC SURVEY procedures
- Analysis files
- Statements
- Frequencies
- Crosstabs
- Means
- Comparative means
- Closing
- Important information

NLTS2 restricted-use data

NLTS2 data are restricted. Data used in these presentations are from a randomly selected subset of the restricted-use NLTS2 data. Results in these presentations cannot be replicated with the NLTS2 data licensed by NCES.

PROC SURVEY procedures

- PROC SURVEY procedures account for complex (stratified/clustered) sampling designs, correctly calculating standard errors with weighted data.
- Survey designs that call for complex sampling require different methods to calculate standard errors.
- Procedures we have used to this point assume a simple random sample.

These results cannot be replicated with full dataset; all output in modules generated with a random subset of the full data.

PROC SURVEY procedures

Weighted standard errors from PROC SURVEY procedures differ greatly from those produced using basic SAS procedures.

PROC SURVEY includes procedures for
- Frequencies (PROC SURVEYFREQ)
- Means (PROC SURVEYMEANS)
- Crosstabs (PROC SURVEYFREQ or SURVEYMEANS)
- Comparative means (PROC SURVEYMEANS)
- Regressions (PROC SURVEYREG or SURVEYLOGISTIC)

PROC SURVEY procedures

Variation among methods for calculating standard errors
- Different programs that produce weighted standard errors for complex samples will typically generate slightly different estimates.
- Estimated standard errors are close in SAS PROC SURVEY procedures, SUDAAN, and SPSS Complex Samples procedures but are not exactly the same.
- There is no uniformity to the differences; sometimes the standard errors from SAS PROC SURVEY are slightly larger, and sometimes they are slightly smaller.

PROC SURVEY procedures

Variation among methods for calculating standard errors (cont.)
- Standard errors in our reports and published tables were calculated with formulas for estimation and may be slightly different from those produced by SAS PROC SURVEY procedures. Ours tend to be slightly larger than those from SAS.
- Standard errors produced by the general procedures in SAS (frequencies, crosstabs, or descriptives) differ greatly from those generated by PROC SURVEY procedures.
- Don’t use unweighted standard errors!

Analysis files

How to prepare the data to use PROC SURVEY procedures

The first step is to create an analysis data set.
- Combine data or select an existing file from a given source/wave.
- If combining files, add the appropriate analysis weight to the new file.
  - Use the wave and source weight if working within a single file.
  - Use the weight from the smallest data source if working with multiple files.

Once the analysis file has been created or selected, two variables need to be added to that file.
- Add “Stratum” and “Cluster,” found in the “n2sample” file.
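The step of adding Stratum and Cluster to an analysis file could be sketched as the match-merge below. This is a minimal sketch, not the module's own program: the libref nlts2, its path, the output name PrScoresEmp_svy, and the use of ID as the youth identifier are all assumptions made for illustration. PrScoresEmp is the analysis file used in this module's example.

```sas
/* Assumptions (not from the slides): the libref nlts2 and its path,   */
/* the output data set name, and ID as the youth identifier.           */
libname nlts2 "C:\path\to\nlts2";   /* hypothetical location */

/* A match-merge with BY requires both inputs sorted on the key. */
proc sort data=nlts2.PrScoresEmp out=work.analysis;
   by ID;
run;

proc sort data=nlts2.n2sample (keep=ID Stratum Cluster) out=work.design;
   by ID;
run;

/* Attach the design variables to every record in the analysis file. */
data work.PrScoresEmp_svy;
   merge work.analysis (in=inAnalysis)
         work.design;
   by ID;
   if inAnalysis;   /* keep only youth present in the analysis file */
run;
```

Keeping only records flagged by inAnalysis leaves the analysis file's row count unchanged, so the merge adds the two design variables without adding youth who have no data in the file.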

Statements

PROC SURVEY procedures look much the same as other procedures.

There are two added statements in PROC SURVEY procedures that define the sample, “STRATA” and “CLUSTER.”
- In NLTS2, the variables specified will always be “Stratum” and “Cluster.”

    STRATA Stratum ;
    CLUSTER Cluster ;

There is always a WEIGHT statement.

    WEIGHT [weight variable] ;

Statements

- As in PROC FREQ, there is a TABLES statement with options for frequencies and crosstabulations.
- As in PROC MEANS, there is a separate statement for a categorical “by” or classification variable.
  - In PROC SURVEYMEANS the statement is “DOMAIN” (not “CLASS”).

    DOMAIN w2_Dis12 ;

- SAS PROC SURVEY procedures have other options that use replicate weights.
  - In our examples, we will be using the option based on Taylor linearization, which uses a single weight and the STRATA and CLUSTER statements.
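As an illustration of the Taylor linearization setup described above, the sketch below names the variance method explicitly. VARMETHOD=TAYLOR is the procedure's default, so writing it out only documents the choice. The data set name work.PrScoresEmp_svy is an assumption (an analysis file that already contains Stratum and Cluster); ndaPC_pr and wt_na are the analysis variable and weight from this module's example.

```sas
/* Assumption: work.PrScoresEmp_svy is an analysis file that already */
/* holds Stratum, Cluster, and the weight wt_na.                     */
proc surveymeans data=work.PrScoresEmp_svy
                 varmethod=taylor   /* the default, stated for clarity */
                 mean stderr;       /* print means and standard errors */
   strata  Stratum;
   cluster Cluster;
   weight  wt_na;
   var     ndaPC_pr;
   domain  w2_Dis12;                /* DOMAIN plays the role of CLASS */
run;
```

DOMAIN is used here rather than a BY statement or a subsetting WHERE because domain analysis keeps the full sample design in the variance calculation for each subgroup.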

Frequencies

How to run frequencies in PROC SURVEY procedures

Running frequencies, or any other PROC SURVEY procedure, does not differ much from the Base SAS procedures.

Syntax for frequencies in PROC SURVEYFREQ

    PROC SURVEYFREQ data = [ddname].[file name] ;
      TABLES [variable(s)] ;
      STRATA stratum ;
      CLUSTER cluster ;
      WEIGHT [weight] ;
    run ;

Crosstabs

How to run crosstabs

Syntax for crosstabulations in PROC SURVEYFREQ

    PROC SURVEYFREQ Data = [ddname].[file name] ;
      TABLES [row variable] * [column variable] / [options] ;
      STRATA stratum ;
      CLUSTER cluster ;
      WEIGHT [main analysis weight] ;
    run ;

Crosstabs

Options on the TABLES statement control printed output.
- Print row percentages (“ROW”) and/or column percentages (“COL”)
- Suppress printing of weighted counts (“NOWT”), cell percentages (“NOCELLPERCENT”), and cell counts (“NOFREQ”)

    TABLES [row variable] * [column variable] / row col nowt nocellpercent nofreq ;
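A concrete version of these TABLES options might look like the sketch below, using variables from this module's example. The data set name work.PrScoresEmp_svy is an assumption (an analysis file with Stratum and Cluster already merged in), and placing na_Age4 in the rows is an illustrative choice rather than the module's own layout.

```sas
/* Assumption: work.PrScoresEmp_svy already contains Stratum, Cluster, */
/* and the weight wt_na.                                                */
proc surveyfreq data=work.PrScoresEmp_svy;
   /* Age group in rows, so ROW percentages show how responses are  */
   /* distributed within each age group; counts and cell percentages */
   /* are suppressed to keep the table short.                        */
   tables na_Age4 * ndaF1_Friend / row nowt nocellpercent nofreq;
   strata  Stratum;
   cluster Cluster;
   weight  wt_na;
run;
```

With only ROW requested, each row of the printed table sums to 100 percent, which makes it straightforward to compare the response distribution across age groups.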

Means

How to run means

Syntax in PROC SURVEYMEANS

    PROC SURVEYMEANS Data = [ddname].[file name] ;
      VAR [variable(s)] ;
      STRATA stratum ;
      CLUSTER cluster ;
      WEIGHT [weight] ;
    run ;

Comparative Means

How to run comparative means

Syntax in PROC SURVEYMEANS

    PROC SURVEYMEANS Data = [ddname].[file name] ;
      DOMAIN [by or classification variable] ;
      VAR [variable(s)] ;
      STRATA stratum ;
      CLUSTER cluster ;
      WEIGHT [weight] ;
    run ;

Example

Use the file created in Module 14b, Accessing Data Files in SAS, PrScoresEmp.sas7bdat.

Merge sample data from the n2sample file.
- Bring in Stratum and Cluster.
- Weight variable will be wt_na.

Using PROC SURVEY procedures, run
- Frequency of ndaF1_Friend
- Crosstab of ndaF1_Friend by na_Age4 and w2_Dis12
  - Are differences significant? If so, how do perceptions vary by age? By disability category?
- Means of ndaPC_pr
- Comparative means of ndaPC_pr by na_Age4 and w2_Dis12
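One possible way to code this exercise is sketched below. It is not the module's own solution: the data set name work.PrScoresEmp_svy is an assumption (PrScoresEmp with Stratum and Cluster already merged in), and the CHISQ option, which requests a Rao-Scott chi-square test, is one way to address the significance question rather than a method the slides specify.

```sas
/* Assumption: work.PrScoresEmp_svy is PrScoresEmp with Stratum and */
/* Cluster merged in from n2sample.                                  */

/* Frequency of ndaF1_Friend */
proc surveyfreq data=work.PrScoresEmp_svy;
   tables  ndaF1_Friend;
   strata  Stratum;
   cluster Cluster;
   weight  wt_na;
run;

/* Crosstabs by age and by disability category. CHISQ adds a         */
/* Rao-Scott chi-square test of association for each table.          */
proc surveyfreq data=work.PrScoresEmp_svy;
   tables  na_Age4  * ndaF1_Friend
           w2_Dis12 * ndaF1_Friend / row nowt chisq;
   strata  Stratum;
   cluster Cluster;
   weight  wt_na;
run;

/* Overall mean of ndaPC_pr plus comparative means. Each variable on */
/* the DOMAIN statement produces its own subgroup table.             */
proc surveymeans data=work.PrScoresEmp_svy mean stderr;
   var     ndaPC_pr;
   domain  na_Age4 w2_Dis12;
   strata  Stratum;
   cluster Cluster;
   weight  wt_na;
run;
```

A single PROC SURVEYMEANS step covers both the overall mean and the comparative means, because the procedure prints full-sample statistics alongside the domain tables.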


Closing

Topics discussed in this module
- PROC SURVEY procedures
- Analysis files
- Statements
- Frequencies
- Crosstabs
- Means
- Comparative means

Next module:
- 19. Multivariate Analysis Using NLTS2 Data

Important information

- NLTS2 website contains reports, data tables, and other project-related information
- Information about obtaining the NLTS2 database and documentation can be found on the NCES website
- General information about restricted data licenses can be found on the NCES website
- address: