Lecture 3 Topic - Descriptive Procedures. Programs 3-4. LSB 4:1-4:4; 4:9-4:11; 8:1-8:5; 5:1-5:2.



Descriptive Procedures In SAS

Syntax for Procedures

PROC PROCNAME DATA=datasetname <options> ;
  substatements / <options> ;

The WHERE statement is a useful substatement available to all procedures.

PROC PRINT DATA=demo ;
  VAR marstat ;
  WHERE state = 'MN';
RUN;
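Because WHERE works in any procedure, the same subsetting condition can be reused outside PROC PRINT. The sketch below uses a small made-up dataset (the name demo_toy, the variables, and all values are hypothetical, not the TOMHS file) and applies the condition in PROC MEANS.

```sas
* Hypothetical toy data, not the TOMHS file;
DATA demo_toy;
  INPUT state $ marstat $ age;
  DATALINES;
MN M 45
MN S 52
WI M 38
IA S 61
;
RUN;

* WHERE restricts the procedure to the rows that match the condition;
PROC MEANS DATA=demo_toy N MEAN;
  VAR age;
  WHERE state = 'MN';
RUN;
```

With these four rows, only the two MN observations are summarized, so PROC MEANS reports N = 2 and a mean age of 48.5.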

Data Layout of tomhs.data (in Course Notes)

Variable   Type   Len   Pos   Informat    Description
PTID       Char    10     1   $10.        Patient ID
CLINIC     Char     1    12   $1.         Clinical center
RANDDATE   Num      6    14   mmddyy10.   Randdate
SBPBL      Num      3   115   3.          SBP at baseline

DATA tomhs;
  INFILE 'folderpath\tomhs.data';
  INPUT @1   ptid     $10.
        @12  clinic   $1.
        @14  randdate mmddyy10.
        @115 sbpbl    3. ;
RUN;

Note: You can give any legal variable name.

Program 3

DATA weight;
  INFILE 'C:\SAS_Files\tomhs.data' ;
  ptid clinic group sex height weight sbpbl sbp12 3.;
  bmi = (weight* )/(height*height);
  sbpchg = sbp12 - sbpbl;
  IF group = 6 THEN active = 2;
  ELSE active = 1;
RUN;
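The derived variables sbpchg and active can be checked on a few hand-entered rows. This is a minimal sketch using in-stream data; the dataset name bp_toy and every value are made up for illustration, and the BMI calculation is left out.

```sas
* Hypothetical rows - group code then baseline and 12-month SBP;
DATA bp_toy;
  INPUT group sbpbl sbp12;
  sbpchg = sbp12 - sbpbl;
  IF group = 6 THEN active = 2;
  ELSE active = 1;
  DATALINES;
1 140 128
6 138 135
3 150 .
;
RUN;

PROC PRINT DATA=bp_toy;
RUN;
```

The third row has a missing sbp12, so its sbpchg is missing as well. Note also that the IF/ELSE logic assigns active = 1 to every row whose group is not 6, which would include a row with a missing group.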

PROC PRINT DATA = weight (OBS=5) NOOBS;
  TITLE 'Proc Print: Five observations from the TOMHS Study';
RUN;

PROC MEANS DATA = weight;
  VAR height weight bmi;
  TITLE 'Proc Means Example 1';
RUN;

PROC MEANS DATA = weight MEAN MEDIAN STD MAXDEC=2;
  VAR height weight bmi;
  TITLE 'Proc Means Example 2 (specifying options)';
RUN;

Proc Print: Five observations from the TOMHS Study

ptid     clinic   sex   height   weight   bmi
C03615   C
B00979   B
B00644   B
D01348   D
A01088   A

Proc Means Example 1

The MEANS Procedure

Variable   N   Mean   Std Dev   Minimum   Maximum
height
weight
bmi

(Numeric values not preserved.)

Proc Means Example 2 (specifying options)

The MEANS Procedure

Variable   Mean   Median   Std Dev
height
weight
bmi

(Numeric values not preserved.)

PROC MEANS DATA = weight N MEAN STD MAXDEC=2 ;
  CLASS clinic;
  VAR height weight bmi;
  TITLE 'Proc Means Example 3 (Using a CLASS statement)';
RUN;

clinic   N Obs   Variable   N   Mean   Std Dev
A        18      height
                 weight
                 bmi
B        29      height
                 weight
                 bmi
C        36      height
                 weight
                 bmi
D        17      height
                 weight
                 bmi

(Means and standard deviations not preserved.)

PROC TTEST DATA = weight ;
  VAR sbpchg;
  CLASS active;
  TITLE 'T-Test Comparing Active and Placebo Groups';
RUN;

T-Test Comparing Active and Placebo Groups

The TTEST Procedure

Statistics
Reported for sbpchg in each active group and for Diff (1-2): N, Lower CL Mean, Mean, Upper CL Mean, Lower CL Std Dev, Std Dev, Upper CL Std Dev, Std Err.

T-Tests
Variable   Method   Variances   DF   t Value   Pr > |t|
sbpchg     Pooled   Equal

(Numeric values not preserved.)

PROC UNIVARIATE DATA = weight ;
  VAR bmi;
  ID ptid;
  TITLE 'Proc Univariate Example 1';
RUN;

* Note: PROC UNIVARIATE produces a lot of output ;

Proc Univariate Example 1

The UNIVARIATE Procedure
Variable: bmi

Moments
N                 100     Sum Weights        100
Mean                      Sum Observations
Std Deviation             Variance
Skewness                  Kurtosis
Uncorrected SS            Corrected SS
Coeff Variation           Std Error Mean

Basic Statistical Measures
Location     Variability
Mean         Std Deviation
Median       Variance
Mode         Range
             Interquartile Range

Tests for Location: Mu0=0
Test          Statistic     p Value
Student's t   t             Pr > |t|    <.0001
Sign          M    50       Pr >= |M|   <.0001
Signed Rank   S    2525     Pr >= |S|   <.0001

(Other numeric values not preserved.)

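Quantiles

Quantile      Estimate
100% Max
99%
95%
90%
75% Q3
50% Median
25% Q1
10%
5%
1%
0% Min

Extreme Observations
Lists the five lowest and five highest values of bmi, each with its ptid (from the ID statement) and observation number.

(Numeric values not preserved.)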

* High resolution graphs can also be produced. The following makes a histogram and normal plot ;
ODS GRAPHICS ON;
PROC UNIVARIATE DATA = weight;
  VAR bmi;
  HISTOGRAM bmi / NORMAL MIDPOINTS=20 to 40 by 2;
  INSET N = 'N' (5.0) MEAN = 'Mean' (5.1) STD = 'Sdev' (5.1)
        MIN = 'Min' (5.1) MAX = 'Max' (5.1) / POS=NW HEADER='Summary Statistics';
  LABEL bmi = 'Body Mass Index (kg/m2)';
  TITLE 'Histogram of BMI';
  PROBPLOT bmi / NORMAL (MU=est SIGMA=est);
RUN;

* PROC SGPLOT can do several types of plots - here a boxplot ;
PROC SGPLOT;
  VBOX bmi; * Vertical boxplot;
  TITLE 'Boxplot of BMI';
RUN;

* Using SGPLOT to make side-by-side boxplots;
PROC SGPLOT;
  TITLE "Boxplot of BMI for Men and Women";
  HBOX bmi / CATEGORY=sex;
RUN;

Later we will see how to format the values 1 and 2 so that they display as men and women.
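One common way to attach labels to coded values is a user-defined format. The sketch below is only an illustration under stated assumptions: the format name sexf, the dataset toy_box, and its values are hypothetical, and the coding 1 = men, 2 = women follows the slides.

```sas
* Hypothetical toy data with sex coded 1 or 2;
DATA toy_box;
  INPUT sex bmi;
  DATALINES;
1 27.1
1 30.4
1 25.9
2 24.8
2 29.3
2 26.0
;
RUN;

* User-defined format that maps the numeric codes to labels;
PROC FORMAT;
  VALUE sexf 1 = 'Men'
             2 = 'Women';
RUN;

* The FORMAT statement makes the plot use the labels as category names;
PROC SGPLOT DATA=toy_box;
  HBOX bmi / CATEGORY=sex;
  FORMAT sex sexf.;
  TITLE 'Boxplot of BMI for Men and Women';
RUN;
```

The format changes only how the values are displayed; the stored values of sex remain 1 and 2.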

Program 4

DATA weight;
  INFILE 'C:\SAS_Files\tomhs.data' ;
  ptid clinic age sex height weight cholbl 3.0 ;
  bmi = (weight* )/(height*height);
RUN;

PROC FREQ DATA=weight;
  TABLES sex clinic ;
  TITLE 'Frequency Distribution of Sex and Clinical Center';
RUN;

Frequency Distribution of Sex and Clinical Center

The FREQ Procedure

sex      Frequency   Percent   Cumulative Frequency   Cumulative Percent

clinic   Frequency   Percent   Cumulative Frequency   Cumulative Percent
A
B
C
D

(Numeric values not preserved.)

* 2-Way Frequency Tables ;
PROC FREQ DATA=weight;
  TABLES sex*clinic / CHISQ ;
  TITLE 'Cross Tabulation of Clinical Center and Sex';
RUN;

Cross Tabulation of Clinical Center and Sex

The FREQ Procedure

Table of sex by clinic
Each cell shows Frequency, Percent, Row Pct, and Col Pct. The frequencies are:

sex      A    B    C    D    Total
1       12   20   30   11     73
2        6    9    6    6     27
Total

For sex = 2 the cell percents are 6.00, 9.00, 6.00, and 6.00. The remaining percentages and the column totals are not preserved.

Slide annotation: Percent men in clinic A.

Statistics for Table of sex by clinic

Statistic                      DF   Value   Prob
Chi-Square
Likelihood Ratio Chi-Square
Mantel-Haenszel Chi-Square
Phi Coefficient
Contingency Coefficient
Cramer's V

(Numeric values not preserved.)

USEFUL TABLE OPTIONS
CHISQ: performs chi-square analyses for 2-way tables
MISSING: includes missing data as a separate category
LIST: makes a condensed table (useful when looking at 3-way or higher tables)
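To see LIST and MISSING together, the sketch below builds a small made-up dataset (the name toy_freq and all values are hypothetical) that contains one missing value, then requests a 3-way table in list form.

```sas
* Hypothetical toy data - the period marks a missing value of active;
DATA toy_freq;
  INPUT sex clinic $ active;
  DATALINES;
1 A 1
1 A 2
2 B 1
2 B .
1 C 2
2 C 1
;
RUN;

* LIST prints one row per combination instead of stacked 2-way tables;
* MISSING keeps the missing value of active as its own category;
PROC FREQ DATA=toy_freq;
  TABLES sex*clinic*active / LIST MISSING;
RUN;
```

Without MISSING, the row with the missing value of active would be left out of the table and reported only in the count of missing frequencies.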

* Using PROC SGPLOT for bar charts;
PROC SGPLOT;
  VBAR clinic;
  TITLE "Vertical Bar Chart of Clinical Center";
  LABEL clinic = "Clinical Center";
RUN;

The plot can be embedded in an HTML document or kept as a separate file, which can then be inserted into Office documents.

* DATALABEL puts values on top of bar;
PROC SGPLOT;
  YAXIS LABEL = "Mean Cholesterol" VALUES = (0 to 300 by 50);
  VBAR clinic / RESPONSE=cholbl STAT=MEAN DATALABEL ;
  TITLE 'Mean Cholesterol by Clinical Center';
  LABEL clinic = "Clinical Center";
RUN;

PROC SGPLOT DATA=weight;
  YAXIS LABEL = "Body Mass Index (BMI)" ;
  XAXIS LABEL = "Age (y)" ;
  REG X=age Y=bmi / CLM;
  WHERE sex = 2;
  TITLE 'Plot of BMI and Age for Women';
RUN;

PROC CORR DATA=weight;
  VAR bmi age;
  WHERE sex = 2;
  TITLE 'Correlation of BMI and Age for Women';
RUN ;

Pearson Correlation Coefficients, N = 27
Prob > |r| under H0: Rho=0

       bmi   age
bmi
age

Each cell shows the correlation coefficient and, beneath it, the p-value testing whether the correlation is significantly different from zero. (Numeric values not preserved.)

ODS GRAPHICS ON;
PROC REG DATA=weight ;
  MODEL bmi=age;
  WHERE sex = 2;
  TITLE 'Simple Linear Regression';
RUN;

Partial Output

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept                                                        <.0001
age

Regression equation: bmi = intercept + slope*age, using the parameter estimates above. (Numeric values not preserved.)

Note: PROC REG has many options for plotting (see Ch 5, H, I). With ODS GRAPHICS ON, many plots are produced by default.

Fit plot from PROC REG

Exercise 3 See exercise 3 in course notes