Chapter 17 Supplement: Alternatives to IF-THEN/ELSE Processing STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina


1 Chapter 17 Supplement: Alternatives to IF-THEN/ELSE Processing
STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina

2 SAS has several versatile and convenient built-in features that serve as alternatives to IF-THEN/ELSE processing. Using the alternatives may result in simplified programming, an economy of code, greater efficiency, and greater readability of programs.

3 IF-THEN/ELSE Statements
Simple and easy to use.
Not always easy to read or to make changes to.
Can be less efficient than other methods.
Alternatives include SELECT groups, ARRAY processing, and PROC FORMAT. Examples include recoding variable values, validating data, and controlling output appearance.

4 Conditional Processing
data one;
   length teacher counselor $30;
   input rating $20.;
   if rating='Exemplary' then teacher='Frodo';
   else if rating in ('Poor','Fair') then do;
      teacher='Aragorn';
      counselor='Gandalf';
   end;
   else do;
      teacher='unassigned';
      counselor='Legolas';
   end;
cards;
…
;

5 SELECT Group
data one;
   length teacher counselor $30;
   input rating $20.;
   select (rating);
      when ('Exemplary') teacher='Frodo';
      when ('Poor','Fair') do;
         teacher='Aragorn';
         counselor='Gandalf';
      end;
      otherwise do;
         teacher='unassigned';
         counselor='Legolas';
      end;
   end;
cards;
…
;
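To see the SELECT group at work, the following is a minimal sketch that runs the slide's logic on three made-up ratings. The data lines are assumptions added for illustration and are not part of the original slides.

```sas
/* Sketch: the SELECT group from the slide, run on made-up ratings. */
data one;
   length teacher counselor $30;
   input rating $20.;
   select (rating);
      when ('Exemplary') teacher='Frodo';
      when ('Poor','Fair') do;
         teacher='Aragorn';
         counselor='Gandalf';
      end;
      otherwise do;            /* catches every rating not listed above */
         teacher='unassigned';
         counselor='Legolas';
      end;
   end;
datalines;
Exemplary
Fair
Good
;
run;

proc print data=one;
run;
```

The OTHERWISE branch is worth keeping: if no WHEN clause matches and there is no OTHERWISE statement, SAS stops the DATA step with an error.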

6 Subsetting Conditional Statements
data males females;
   input sex $1. grade 2.;
   if sex='M' then output males;
   else if sex='F' then output females;
cards;
…
;
proc freq data=males;
   tables grade;
proc freq data=females;
   tables grade;
run;
The two mutually exclusive subsets are used with the same procedure.

7 BY-Group Processing
data one;
   input sex $1. grade 2.;
cards;
…
;
proc sort;
   by sex;
proc freq;
   by sex;
   tables grade;
run;

8 Subsetting IF Statements
data one;
   input sex $1. grade 2.;
cards;
…
;
data M10;
   set one;
   if sex='M' and grade=10;
proc freq data=M10;
   tables grade;
data F7;
   set one;
   if sex='F' and grade=7;
proc freq data=F7;
   tables grade;
run;

9 WHERE Statements
Instead of creating a data set for each subset of interest, use the WHERE statement to specify a subset of the data for the procedure.
data one;
   input sex $1. grade 2.;
cards;
…
;
proc freq;
   tables grade;
   where sex='M' and grade=10;
proc freq;
   tables grade;
   where sex='F' and grade=7;
run;

10 WHERE= Data Set Option
proc freq data=one (where=(sex='M' and grade=10));
   tables grade;
proc freq data=one (where=(sex='F' and grade=7));
   tables grade;
run;
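The two WHERE approaches can be compared side by side in a short sketch. The four data lines below are invented for illustration; the layout (sex in column 1, grade in columns 2 and 3) follows the INPUT statement used on the slides.

```sas
/* Sketch with made-up data: sex in column 1, grade in columns 2-3. */
data one;
   input sex $1. grade 2.;
datalines;
M10
F 7
M 9
F 7
;
run;

/* WHERE statement: subset the observations inside the procedure. */
proc freq data=one;
   where sex='F' and grade=7;
   tables grade;
run;

/* WHERE= data set option: the same subset, attached to the data set name. */
proc freq data=one (where=(sex='F' and grade=7));
   tables grade;
run;
```

Both steps should report the same two observations, and neither creates an intermediate data set.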

11 New Variables Just for Output Appearance
The gender2 variable is created for the sole purpose of printing more user-friendly values of F and M (instead of 1 and 2) in PROC FREQ output.
data one;
   input gender;
   if gender=1 then gender2='F';
   else if gender=2 then gender2='M';
cards;
…
;
proc freq;
   tables gender2;
run;

12 PROC FORMAT
Create a user-defined format to control the appearance of output. PROC FREQ will print the values of the gender variable as F and M instead of 1 and 2.
proc format;
   value gender 1='F' 2='M';
data one;
   input gender;
   format gender gender.;
cards;
…
;
proc freq;
run;
The gender. format may be applied by using the FORMAT statement with a procedure, or it may be applied in the DATA step as shown here. If the format is applied in the DATA step, then the same format will apply to the variable in procedures where the variable is used.
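The slide applies the format in the DATA step. As a complement, here is a minimal sketch of the other option it mentions, a FORMAT statement inside the procedure. The three data lines are assumptions made up for the example.

```sas
proc format;
   value gender 1='F' 2='M';
run;

/* Made-up data; no FORMAT statement in the DATA step this time. */
data one;
   input gender;
datalines;
1
2
2
;
run;

/* The format applies to this procedure only; the stored values stay 1 and 2. */
proc freq data=one;
   tables gender;
   format gender gender.;
run;
```

Applied this way, the format is not stored with the data set, so other procedures show 1 and 2 unless they carry their own FORMAT statement.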

13 Data Validation
Suppose that the valid values for a gender variable are 1 and 2 and that other values are invalid.
data one;
   input gender;
   if gender not in (1,2) then gender=.;
cards;
…
;

14 Data Validation with an Informat
proc format;
   invalue check 1,2=_same_
                 other=_error_;
data one;
   input gender check.;
cards;
…
;

15 Data Validation with an Informat
proc format;
   invalue check 1,2=_same_
                 other=_error_;
The keyword OTHER indicates range values that are excluded from all the other ranges for an informat.
When _ERROR_ is specified as an informatted value, all values in the corresponding informat range are invalid and a missing value will be assigned to the variable.
When _SAME_ is specified as an informatted value, a value in the corresponding informat range stays the same.
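A small run makes the effect of _SAME_ and _ERROR_ visible. This is a sketch under the assumption of three single-digit data lines, which are invented here; the informat definition is the one from the slide.

```sas
proc format;
   invalue check 1,2=_same_
                 other=_error_;
run;

/* Made-up data: the third value is outside the valid set. */
data one;
   input gender check.;
datalines;
1
2
3
;
run;

proc print data=one;
run;
```

The first two observations should keep their values, while the third should be stored as missing, with an invalid-data note for gender expected in the log.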

16 New Variables for Aggregate Analysis
The user creates the group variable to divide the records into four groups based on percentile values.
data one;
   input percentile;
   if 1<=percentile<=25 then group=1;
   else if 26<=percentile<=50 then group=2;
   else if 51<=percentile<=75 then group=3;
   else if 76<=percentile<=99 then group=4;
cards;
…
;
proc freq;
   tables group;
run;

17 User-Defined Format
data one;
   input percentile;
cards;
…
;
proc format;
   value group 1-25=1
               26-50=2
               51-75=3
               76-99=4;
proc freq;
   tables percentile;
   format percentile group.;
run;
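The following sketch runs the grouping format on four made-up percentile values to show that PROC FREQ tabulates the formatted groups rather than the raw values. The data lines are assumptions, and the labels are quoted here, which is equivalent to the unquoted form.

```sas
proc format;
   value group 1-25='1'
               26-50='2'
               51-75='3'
               76-99='4';
run;

/* Made-up percentiles: one in group 1, two in group 2, one in group 4. */
data one;
   input percentile;
datalines;
12
30
49
77
;
run;

/* No group variable is created; the format does the grouping. */
proc freq data=one;
   tables percentile;
   format percentile group.;
run;
```

Because the ranges are written for whole numbers, a value such as 25.5 would fall between groups and print unformatted. The range operator `<-`, as in `25<-50`, is one way to close such gaps.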

18 Creating New Variables from Existing Ones
There is a need to convert the letter grades to numeric grades. The following statements create a new variable called numgrade.
if grade='A' then numgrade=4;
else if grade='B+' then numgrade=3.5;
else if grade='B' then numgrade=3;
else if grade='C+' then numgrade=2.5;
else if grade='C' then numgrade=2;
else if grade='D+' then numgrade=1.5;
else if grade='D' then numgrade=1;
else if grade='F' then numgrade=0;

19 INPUT Function and Informats, PUT Function and Formats
proc format;
   invalue number 'A'=4 'B+'=3.5 'B'=3 'C+'=2.5
                  'C'=2 'D+'=1.5 'D'=1 'F'=0;
   value $words 'A'='A student'
                'B+','B'='B student'
                'C+','C'='C student'
                'D+','D'='D student'
                'F'='F student';

20 INPUT Function and Informats, PUT Function and Formats
data grades;
   input grade $;
   numgrade = input(grade,number.);
   text = put(grade,$words.);
cards;
A
;
proc print;
run;
obs   grade   numgrade   text
 1      A         4      A student

21 INPUT Function
The general syntax without optional arguments is: INPUT(source,informat)
It returns the value produced when an expression (source) is read using a specified informat. The informat type determines whether the result of the INPUT function is numeric or character. With a numeric informat, the INPUT function converts character values to numeric values.

22 PUT Function
The general syntax without optional arguments is: PUT(source,format)
It writes the value of a numeric or character variable or constant (source) using the specified format and returns the result. The format must be of the same type as the source. The PUT function always returns a character value, so it also converts numeric values to character values.
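Both functions also work with the formats and informats that ship with SAS. The sketch below is an illustration using the standard w. informat and Z format; the data set and variable names are hypothetical and do not come from the slides.

```sas
/* Hypothetical names; shows both conversion directions in one step. */
data convert;
   charvalue = '1234';
   numvalue  = input(charvalue, 8.);   /* character to numeric          */
   backtext  = put(numvalue, z6.);     /* numeric to character, 0-padded */
   put numvalue= backtext=;            /* write both values to the log  */
run;
```

The log line should read numvalue=1234 backtext=001234, and backtext is a character variable because PUT always returns a character value.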

23 Converting Rules in a Table
The table shows a test form number for each grade.
Grade    1    2    3
Form    47   53   12
data one;
   input grade score;
   if grade=1 then form=47;
   else if grade=2 then form=53;
   else if grade=3 then form=12;
cards;
…
;

24 ARRAY Statement
data one;
   input grade score;
   array number {3} _temporary_ (47 53 12);
   form=number{grade};
cards;
…
;
Temporary arrays eliminate using unnecessary variables for processing. Temporary array elements are particularly useful when their values are used for computations. If a temporary array element needs to be retained, it can be assigned to a variable.
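A runnable version of the lookup is sketched below, using the form numbers 47, 53, and 12 from the table on slide 23. The data lines and the range check are additions made for illustration.

```sas
data one;
   input grade score;
   /* Element k holds the form number for grade k (values from slide 23). */
   array number {3} _temporary_ (47 53 12);
   /* Guard added for the sketch: a grade outside 1-3 would otherwise     */
   /* cause an array subscript out of range error.                        */
   if 1 <= grade <= 3 then form = number{grade};
datalines;
1 88
3 75
2 91
;
run;

proc print data=one;
run;
```

The three observations should receive forms 47, 12, and 53, in that order. With the guard in place, an unexpected grade leaves form missing instead of stopping the step.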

25 Larger Tables
Adjusted Score Depends on the Age and Raw Score
[Table: one row per age, with columns Raw Score 1 through Raw Score 5; the cell values are not available.]

26 ARRAY Statement
The adjusted score is easily obtained using the ARRAY statement and without using IF-THEN/ELSE statements.
data one;
   input age rawscore;
   array grid {13:15, 1:5} _temporary_
      ( … );
   adjustedscore=grid(age,rawscore);
cards;
…
;
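Since the table values are not available, the following sketch fills the two-dimensional array with hypothetical adjusted scores simply to show the mechanics. Every number in the initial value list and in the data lines is invented.

```sas
data one;
   input age rawscore;
   /* Hypothetical adjusted scores. Rows are ages 13 to 15 and columns  */
   /* are raw scores 1 to 5; the list is filled row by row.             */
   array grid {13:15, 1:5} _temporary_
      (10 20 30 40 50
       11 21 31 41 51
       12 22 32 42 52);
   /* Guard added for the sketch so out-of-range input is left missing. */
   if 13 <= age <= 15 and 1 <= rawscore <= 5 then
      adjustedscore = grid{age, rawscore};
datalines;
13 1
14 5
15 3
;
run;

proc print data=one;
run;
```

With these made-up values the adjusted scores should come out as 10, 51, and 32. Declaring the bounds as 13:15 lets age serve directly as the row subscript, so no offset arithmetic is needed.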