Quick Data Summaries in SAS


Quick Data Summaries in SAS
- Start by bringing in data
  - Use permanent data set for these examples
- Proc Summary
  - Produces summaries relatively easily
  - Designed to produce a table of output that can be manipulated further
    ***This is a critical difference from tabulate***
  - Need to pre-sort data by any "by" groups
  - Need to print out results

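Quick Data Summaries in SAS
Basic Summary Syntax:
  Proc sort;
    By var1 var2;
  Run;
  Proc summary;
    By var1 var2;    /* same "by" groups the data were sorted on */
    Var variable3;   /* variable to summarize */
    Output out=new_table mean=mean_name n=n_name ….;
  Run;
  Proc print;        /* proc summary prints nothing on its own */
  Run;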
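The syntax above can be tried on a concrete data set. This is a minimal sketch, assuming the sample data set sashelp.class (with variables sex and height) is available in your SAS installation; the names class_sorted, height_stats, mean_height and n_height are made up for illustration and do not come from the slides.

```sas
/* Sort a copy of the sample data by the grouping variable */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

/* One output row per BY group, written to work.height_stats */
proc summary data=work.class_sorted;
  by sex;
  var height;
  output out=work.height_stats mean=mean_height n=n_height;
run;

/* PROC SUMMARY does not print by default, so print the output table */
proc print data=work.height_stats;
run;
```

Because the result is an ordinary data set, it can be fed into a later data step or procedure, which is the difference from tabulate that the first slide stresses.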

Statistics available in Proc Summary
- Mean, n, standard deviation, variance, coefficient of variation, sum
- Minimum, maximum, range, number of missing observations, median
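Each statistic in this list is requested with a keyword on the output statement. The sketch below shows one way to ask for all of them at once, again assuming the sample data set sashelp.class and its weight variable; the output variable names (mean_wt, sd_wt, and so on) are illustrative choices, not names from the slides.

```sas
/* Request every statistic listed above for a single variable */
proc summary data=sashelp.class;
  var weight;
  output out=work.weight_stats
    mean=mean_wt n=n_wt std=sd_wt var=var_wt cv=cv_wt sum=sum_wt
    min=min_wt max=max_wt range=range_wt nmiss=nmiss_wt median=median_wt;
run;

proc print data=work.weight_stats;
run;
```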

Some Quirks of Proc Summary
- Whenever you use proc summary, it adds two new variables to the output data set: _type_ and _freq_ (note the underscores at the beginning and end of the variable names)
  - _freq_ indicates the number of observations in each summary row
  - _type_ indicates which combination of class variables a summary row describes (it is 0 when there are no class variables)
  - You can ignore these variables in virtually all cases
- You need to remember what is the "active" dataset, or specify the dataset that summary will operate on
  - The active dataset is the most recently created dataset by default
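A short sketch of both quirks, assuming the sample data set sashelp.class. It uses a class statement, which the slides do not show, only so that _type_ takes more than one value; the names by_sex and mean_height are illustrative.

```sas
/* DATA= names the input explicitly instead of relying on the active data set */
proc summary data=sashelp.class;
  class sex;            /* CLASS does not need pre-sorted data, unlike BY */
  var height;
  output out=work.by_sex mean=mean_height;
run;

/* _type_ = 0 is the overall row; _type_ = 1 rows are the levels of sex */
proc print data=work.by_sex;
run;

/* Keep only the per-group rows and leave out the bookkeeping variables */
proc print data=work.by_sex;
  where _type_ = 1;
  var sex mean_height;
run;
```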

Shannon's Diversity Index
H = -∑ p_i ln(p_i), where p_i is the proportion of observations in category i
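As a worked illustration of the formula, consider a hypothetical sample (not from the slides) with three categories whose proportions are 0.5, 0.3 and 0.2:

```latex
\begin{align*}
H &= -\sum_{i=1}^{3} p_i \ln(p_i) \\
  &= -\bigl(0.5\ln 0.5 + 0.3\ln 0.3 + 0.2\ln 0.2\bigr) \\
  &\approx -\bigl(-0.3466 - 0.3612 - 0.3219\bigr) \\
  &\approx 1.030
\end{align*}
```

The counts behind each p_i are the kind of per-group n that proc summary writes to its output table.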