SAS® 101
Based on Learning SAS by Example: A Programmer’s Guide, Chapters 16 & 17
By Tasha Chapman, Oregon Health Authority


Topics covered…
- PROC Freq
  - Options
  - Using formats
  - Missing data
  - Order=
  - Multi-dimensional tables
  - Statistics

Topics covered…
- PROC Means
  - Options
  - Class statement
  - Missing data
  - Output statement
  - _TYPE_ and Chartype
- ODS NOPROCTITLE

PROC Freq

- PROC Freq can be used to run simple frequency tables on your data
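A minimal sketch of the simplest call. The "Demographics" dataset from the slides is not available here, so this assumes the sashelp.heart sample data shipped with SAS as a stand-in; work.demo is a hypothetical name.

```sas
/* Assumption: sashelp.heart stands in for the "Demographics" dataset.
   Keep only a few categorical variables so the output stays short. */
data work.demo;
    set sashelp.heart(keep=Sex Chol_Status BP_Status);
run;

/* With no table statement, PROC Freq produces a one-way frequency
   table for every variable in the dataset. */
proc freq data=work.demo;
run;
```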

PROC Freq
Results of PROC Freq on “Demographics”

PROC Freq
- Use the table statement to print frequency tables for selected variables only
- Use the nocum option to suppress cumulative statistics
- Use the nopercent option to suppress percent statistics
- Options can be used together or separately
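The options above might be combined as in the sketch below. The dataset (sashelp.heart) and the variables Sex and Chol_Status are assumptions standing in for the slides' data.

```sas
/* Assumption: sashelp.heart replaces the slides' dataset. */
proc freq data=sashelp.heart;
    /* Only the listed variables get a table; options follow the slash. */
    tables Sex Chol_Status / nocum nopercent;
run;
```

Dropping either option after the slash brings the corresponding columns back.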

- where statement – Only include selected observations
- format statement – Apply format to selected variables
  - Only applies to current procedure
  - Can be used to group data

Using formats
- Use formats to group data
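A sketch of grouping with a format, combined with a where statement. The format name agegrp, its cut points, and the use of sashelp.heart with its AgeAtStart and Sex variables are all illustrative assumptions.

```sas
/* Hypothetical format that collapses ages into three groups. */
proc format;
    value agegrp
        low -< 40 = 'Under 40'
        40  -< 50 = '40 to 49'
        50 - high = '50 and over';
run;

proc freq data=sashelp.heart;
    where Sex = 'Female';          /* only selected observations     */
    tables AgeAtStart;
    format AgeAtStart agegrp.;     /* applies to this procedure only */
run;
```

Because PROC Freq counts formatted values, the table has one row per age group rather than one row per age.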

Missing data
- By default, missing values are excluded from the frequency table
- This affects the percent calculations, which are based on non-missing values only

Missing data
- Use the missing option to include missing values in the frequency table
- Can also create a label for missing values in your PROC Format
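One way this could look, assuming sashelp.heart and its Chol_Status variable (which has some missing values) as stand-ins. The format name $cholfmt and the label text are hypothetical.

```sas
/* Hypothetical format that labels the missing (blank) character value.
   Values not listed in the format print unchanged. */
proc format;
    value $cholfmt ' ' = 'Not recorded';
run;

proc freq data=sashelp.heart;
    /* missing: treat missing as a regular level and include it in percents */
    tables Chol_Status / missing;
    format Chol_Status $cholfmt.;
run;
```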

Order=
- By default, PROC Freq orders your frequency table based on the internal (unformatted) values
- Use the order= option to change the order
- Missing values, if included in the table, will always be listed first, regardless of the order= setting

| order=             | Results |
|--------------------|---------|
| internal (default) | Orders values by their internal (unformatted) values |
| formatted          | Orders values by their formatted values |
| freq               | Orders values from the most to least frequent |
| data               | Orders values based on their order in the input dataset |
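As a short illustration, order= goes on the PROC Freq statement itself. The dataset sashelp.heart and the variable DeathCause are assumptions used in place of the slides' data.

```sas
/* Assumption: sashelp.heart replaces the slides' dataset. */
proc freq data=sashelp.heart order=freq;
    tables DeathCause;   /* most frequent cause is listed first */
run;
```

Swapping in order=formatted or order=data changes only the row order, not the counts.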

Order=

Multi-dimension tables
- Can create simple cross-tabulations

Multi-dimension tables
- Use the nocol option to suppress column percent statistics
- Use the norow option to suppress row percent statistics
- Use the nopercent option to suppress total percent statistics
- Options can be used together or separately

- Use the list option to display cross-tab tables in a list format
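The cross-tabulation options from the last few slides can be sketched as follows, again assuming sashelp.heart and its Sex and Chol_Status variables as stand-ins.

```sas
/* Assumption: sashelp.heart replaces the slides' dataset. */
proc freq data=sashelp.heart;
    /* Row variable * column variable; show cell counts only */
    tables Sex * Chol_Status / nocol norow nopercent;

    /* Same cross-tab, printed as one row per combination */
    tables Sex * Chol_Status / list;
run;
```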


Multi-dimension tables
- There are multiple ways to request tables:

| Notation | Result |
|----------|--------|
| table A * (B C D); | Three tables: A by B; A by C; A by D |
| table (A B) * (C D); | Four tables: A by C; A by D; B by C; B by D |
| table A * B * C; | One three-way table with the format Page * Row * Column. Each classification of A would appear on a separate page. |
| table Ques1 - Ques10; | Ten tables, one each for Ques1 through Ques10 |
| table VarA -- VarB; | One table each for all variables between VarA and VarB in the SAS dataset (by varnum) |
| table Ques: ; | One table each for all variables that begin with “Ques” |
| table _numeric_; | One table each for all numeric variables |
| table _character_; | One table each for all character variables |
| table _all_; | One table each for all variables |
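Two of the notations above, tried on a stand-in dataset. The use of sashelp.heart and its variable names is an assumption; the slides' variables A, B, C are placeholders.

```sas
/* Assumption: sashelp.heart replaces the slides' dataset. */
proc freq data=sashelp.heart;
    /* Grouping: two tables, Sex by Chol_Status and Sex by BP_Status */
    tables Sex * (Chol_Status BP_Status);

    /* Variable list keyword: one table per character variable */
    tables _character_;
run;
```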

Statistics
- PROC Freq is also used to calculate certain statistics, such as chi-square, odds ratio, and relative risk
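A sketch of requesting these statistics through table options. It assumes sashelp.heart, where Sex and Status each have two levels, so the cross-tab is the 2x2 table that relrisk expects.

```sas
/* Assumption: sashelp.heart replaces the slides' dataset. */
proc freq data=sashelp.heart;
    /* chisq: chi-square tests; relrisk: odds ratio and relative risks */
    tables Sex * Status / chisq relrisk;
run;
```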

PROC Means

- PROC Means can be used to run simple summary statistics on your data
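The simplest form, shown on sashelp.heart as an assumed stand-in for "Demographics":

```sas
/* With no var statement and no statistic keywords, PROC Means reports
   N, Mean, Std Dev, Minimum, and Maximum for every numeric variable. */
proc means data=sashelp.heart;
run;
```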

PROC Means
Results of PROC Means on “Demographics”

- Many options to control output of PROC Means
- NMiss Mean Median – Examples of statistics that can be specified in PROC Means (see later slide for list of statistical keywords)
- class statement – Allows for grouping by categorical variables
- var statement – Only provides statistics for listed analysis variables
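Put together, those pieces might look like this. The dataset sashelp.heart and the variables Sex, Cholesterol, and Weight are assumptions.

```sas
/* Statistic keywords go on the PROC Means statement. */
proc means data=sashelp.heart n nmiss mean median;
    class Sex;                 /* one row of statistics per group */
    var Cholesterol Weight;    /* analysis variables              */
run;
```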

PROC Means

- Statistics available in PROC Means

PROC Means
- maxdec= option – Specifies the number of decimal places for statistics
- where statement – Only include selected observations
- format statement – Apply format to selected variables
  - Only applies to current procedure
  - Can be used to group class data
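A sketch combining the three. The format agegrp and its cut points are hypothetical, and sashelp.heart with its Status, AgeAtStart, and Cholesterol variables is an assumed stand-in.

```sas
/* Hypothetical format used to group the class variable. */
proc format;
    value agegrp
        low -< 40 = 'Under 40'
        40  -< 50 = '40 to 49'
        50 - high = '50 and over';
run;

proc means data=sashelp.heart maxdec=1;   /* one decimal place      */
    where Status = 'Alive';               /* selected observations  */
    class AgeAtStart;
    var Cholesterol;
    format AgeAtStart agegrp.;            /* classes = age groups   */
run;
```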

Class variables
- Table can also include multiple class variables
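For instance, with two class variables (names taken from sashelp.heart, an assumed stand-in), the printed table has one row per combination of levels:

```sas
proc means data=sashelp.heart mean maxdec=1;
    class Sex BP_Status;   /* every Sex x BP_Status combination */
    var Cholesterol;
run;
```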


Missing data

| Where             | Default | Override |
|-------------------|---------|----------|
| Analysis variable | Excludes that observation from the calculation of statistics | None |

Missing data
- N Obs: Number of observations in that class category
- N: Number of non-missing values for the analysis variable
  - These are the observations used in the calculation of Mean and similar statistics

Missing data (Missing option)

| Where             | Default | Override |
|-------------------|---------|----------|
| Analysis variable | Excludes that observation from the calculation of statistics | None |
| Class variable    | Excludes that observation from the table | MISSING option |

Missing data (Missing option)
- Includes missing values for all class variables
- Includes missing values for selected class variables only
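The two cases plausibly correspond to where the missing option is placed: on the PROC Means statement or on an individual class statement. A sketch, assuming sashelp.heart and its Smoking_Status variable (which has some missing values):

```sas
/* Missing levels kept for ALL class variables */
proc means data=sashelp.heart missing n mean;
    class Sex Smoking_Status;
    var Cholesterol;
run;

/* Missing levels kept only for Smoking_Status */
proc means data=sashelp.heart n mean;
    class Sex;
    class Smoking_Status / missing;
    var Cholesterol;
run;
```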

Missing data (Missing option)

Output statement
- Create output datasets using the output statement
- out= specifies the name of the output dataset(s)
- By default, the output dataset will include N, Mean, Min, Max, and Std. Dev for all levels of your class variable(s), regardless of which statistics you specify in the PROC Means statement
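A minimal sketch of the default behavior. The dataset name work.chol_stats is hypothetical and sashelp.heart is an assumed stand-in for the slides' data.

```sas
proc means data=sashelp.heart noprint;   /* noprint: dataset only, no report */
    class Sex BP_Status;
    var Cholesterol;
    output out=work.chol_stats;          /* no statistics named: defaults    */
run;

/* Inspect _TYPE_, _FREQ_, _STAT_, and the analysis variable */
proc print data=work.chol_stats;
run;
```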

Output statement
- Gender/Blood type : Class variables
- _TYPE_ : Level of class variable(s)
- _FREQ_ : Number of observations in that class category (N Obs)
- _STAT_ : Name of the statistic
- Cholesterol : Analysis variable

Output statement (_TYPE_)
- _TYPE_ : Level of class variable(s)
  - 0 = All observations
  - 1 = Classified by Blood Type only
  - 2 = Classified by Gender only
  - 3 = Classified by both Blood Type and Gender

Output statement (_TYPE_)
- Use the chartype option to store _TYPE_ as a character string of 0s and 1s, a binary representation of which class variables are in use
  - (chartype is short for Character Type)

Output statement (_TYPE_)
- _TYPE_ : Level of class variable(s) (using chartype)

| Gender | Blood Type | Interpretation      |
|--------|------------|---------------------|
| 0      | 0          | All observations    |
| 0      | 1          | Blood Type only     |
| 1      | 0          | Gender only         |
| 1      | 1          | Blood Type x Gender |
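A sketch of chartype in use, with Sex and BP_Status from sashelp.heart standing in (by assumption) for Gender and Blood Type. The names work.chol_types and MeanChol are hypothetical.

```sas
proc means data=sashelp.heart noprint chartype;
    class Sex BP_Status;
    var Cholesterol;
    output out=work.chol_types mean=MeanChol;
run;

/* _TYPE_ is now character: one digit per class variable, in class order.
   '11' keeps only the rows classified by both variables. */
proc print data=work.chol_types;
    where _TYPE_ = '11';
run;
```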

Output statement (_TYPE_)

Output statement (Missing data)

Lesson: By default (that is, without the MISSING option), if an observation is missing data for a class variable, that observation is excluded from all analyses in the procedure

Output statement (Missing data)

Output statement
- You can specify which statistics to include through the output statement, in the form statistic=new variable name
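For example, each keyword below is a statistic and each name after the equals sign is the new variable that holds it. The variable and dataset names are hypothetical, and sashelp.heart is an assumed stand-in.

```sas
proc means data=sashelp.heart noprint;
    class Sex;
    var Cholesterol;
    output out=work.chol_summary
           n=N_Chol
           mean=Mean_Chol
           median=Median_Chol;
run;
```

When statistics are named this way, the output dataset has one row per class level and no _STAT_ variable.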

Output statement
- Use the autoname option to automatically generate new variable names
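A sketch with autoname, which goes after a slash at the end of the output statement. The dataset work.auto_stats is a hypothetical name and sashelp.heart is an assumed stand-in.

```sas
proc means data=sashelp.heart noprint;
    class Sex;
    var Cholesterol Weight;
    /* Generated names follow the pattern variable_statistic,
       e.g. Cholesterol_Mean and Weight_Median */
    output out=work.auto_stats mean= median= / autoname;
run;
```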

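Output statement
- If you request more than one statistic without naming the new variables, each statistic tries to use the analysis variable's own name, so the output dataset will not contain everything you asked for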

Output statement
- Can assign different statistics to each variable

Output statement
- Can have multiple output statements with different specifications for each dataset
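Both ideas, per-variable statistics and multiple output statements, can be sketched in one step. All dataset and new variable names here are hypothetical, and sashelp.heart is an assumed stand-in for the slides' data.

```sas
proc means data=sashelp.heart noprint;
    class Sex;
    var Cholesterol Weight;

    /* Different statistics for different analysis variables */
    output out=work.by_variable
           mean(Cholesterol)=Mean_Chol
           max(Weight)=Max_Weight;

    /* A second dataset with its own specification */
    output out=work.ranges
           min= max= / autoname;
run;
```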

Output statement

Additional Reading
- Steps to Success with PROC Means
- Advanced Tips and Techniques with PROC Means

ODS NOPROCTITLE

ODS
- Some procedures (such as FREQ and MEANS) will print a procedure title at the top of their output
- This cannot be controlled by title statements

ODS NOPROCTITLE
- Use an ODS NOPROCTITLE statement to turn off the procedure titles
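A short sketch, assuming sashelp.heart as a stand-in dataset and an illustrative title. The setting stays in effect for later procedures, so the example turns procedure titles back on afterwards.

```sas
ods noproctitle;                      /* suppress "The FREQ Procedure" */
title 'Cholesterol status by sex';    /* your own title still prints   */

proc freq data=sashelp.heart;
    tables Sex * Chol_Status;
run;

ods proctitle;                        /* restore the default behavior  */
title;                                /* clear the title               */
```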

For next week…
- Read chapter 15