EPIB 698C Lecture 3. Raul Cruz-Cano, Fall 2011.


1 EPIB 698C Lecture 3 Raul Cruz-Cano Fall 2011

2 Creating and Redefining Variables
You can create and redefine variables with assignment statements of the form:
  variable = expression;
Type of expression     Example
Numeric constant       Age = 10;
Character constant     Gender = 'Female';
An existing variable   Age = age_at_baseline;
Addition               Age = age_at_baseline + 10;
Vector Notation

3 Home gardener's data
Gardeners were asked to estimate the pounds they harvested for four crops: tomatoes, zucchini, peas and grapes. Here is the data:
  Gregor
  Molly
  Luther
  Susan
Tasks:
- add a new variable group with a value of 14;
- add a variable type to indicate home gardener;
- create a new variable zucchini_1 equal to zucchini*10;
- derive the total pounds of crops for each gardener;
- derive the % of tomatoes for each gardener.

4 Home gardener's data
  DATA homegarden;
    INFILE 'C:\garden.txt';
    INPUT Name $ 1-7 Tomato Zucchini Peas grapes;
    group = 14;
    Type = 'home';
    Zucchini_1 = Zucchini * 10;
    Total = tomato + zucchini_1 + peas + grapes;
    PerTom = (Tomato / Total) * 100;
  Run;

5 Home gardener's data
Check the log window:
  Missing values were generated as a result of performing an operation on missing values.
The last subject has a missing value for peas, so the variables Total and PerTom, which are calculated from peas, are set to missing.

6 SAS functions
SAS has over 400 functions, with the following general form:
  function-name(argument, argument, ...)
All functions must have parentheses, even if they do not require any arguments.
Examples:
  X = int(log(10));
  Mean_score = mean(score1, score2, score3);
The MEAN function returns the mean of its non-missing arguments. This differs from adding the arguments and dividing by their number, which returns a missing value if any argument is missing.
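The difference between MEAN and a hand-written average can be checked with a small sketch. The data set name mean_demo and the score values are made up for illustration.

```sas
/* Hypothetical data: the second row has a missing score2 */
data mean_demo;
    input score1 score2 score3;
    mean_fn  = mean(score1, score2, score3);    /* uses non-missing arguments only */
    mean_raw = (score1 + score2 + score3) / 3;  /* missing if any score is missing */
    datalines;
80 90 100
80 . 100
;
run;

proc print data=mean_demo;
run;
```

For the second row, mean_fn is 90 (the mean of 80 and 100), while mean_raw is missing and the log shows the "Missing values were generated" note.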

7 Common Functions and Operators
Functions:
- ABS: absolute value
- EXP: exponential
- LOG: natural logarithm
- MAX and MIN: maximum and minimum
- SQRT: square root
- SUM: sum of variables. Example: SUM(of x1-x10, x21)
Arithmetic operators: +, -, *, /, ** (exponentiation; not ^)

8 More SAS functions
Function name   Example                                   Result
Max             Y = Max(1, 3, 5);                         Y = 5
Round           Y = Round(1.236, 0.01);                   Y = 1.24
Sum             Y = sum(1, 3, 5);                         Y = 9
Length          a = 'my cat'; Y = Length(a);              Y = 6
Trim            a = 'my '; b = 'cat'; Y = trim(a) || b;   Y = 'mycat'
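A short sketch of these functions in one DATA step follows. The data set and variable names are illustrative only. Note that the second argument of ROUND is the rounding unit, not a number of decimal places, so rounding to two decimals uses 0.01.

```sas
/* Illustrative only: variable names are not from the lecture data */
data func_demo;
    a = 'my   ';                    /* character value with trailing blanks */
    b = 'cat';
    y_max    = max(1, 3, 5);        /* 5 */
    y_round  = round(1.236, 0.01);  /* 1.24: nearest multiple of 0.01 */
    y_round2 = round(1.236, 2);     /* 2: nearest multiple of 2 */
    y_sum    = sum(1, 3, 5);        /* 9 */
    y_len    = length('my cat');    /* 6 */
    y_trim   = trim(a) || b;        /* 'mycat': trailing blanks removed first */
run;

proc print data=func_demo;
run;
```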

9 Selected date functions
Function   Description                                                 Example              Result
Today      Returns the current date                                    X = today();         Today's date
QTR        Returns the yearly quarter from a SAS date value            X = QTR(366);        1
Month      Returns the month value from a SAS date value               X = Month(366);      1
Day        Returns the day value from a SAS date value                 X = day(369);        4
MDY        Returns a SAS date value from month, day and year input     X = MDY(1, 1, 60);   0

10 Working with SAS Dates
A SAS date is a numeric value equal to the number of days since Jan. 1, 1960. For example:
Date      SAS date value
Jan. 1,
Jan. 1,
Jan. 1,
Jan. 1,
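The day-count convention can be checked directly with a minimal sketch. The variable names are illustrative, and the dates are written as SAS date literals, which the slides do not use.

```sas
/* Illustrative check of SAS date values; results are written to the log */
data _null_;
    d0 = mdy(1, 1, 1960);    /* 0: the reference date */
    d1 = '01JAN1961'd;       /* 366, because 1960 was a leap year */
    q  = qtr(d1);            /* 1 */
    m  = month(d1);          /* 1 */
    dd = day(d1 + 3);        /* 4: Jan. 4, 1961 */
    put d0= d1= q= m= dd=;
    put d1= date9.;          /* the same value shown as 01JAN1961 */
run;
```

Because a SAS date is just a number, adding 3 to d1 moves the date forward three days.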

11 Example: pumpkin carving contest data
This data contains each contestant's name, age, type of pumpkin (carved or decorated), date of entry and the scores from 5 judges.
  Alicia Grossman 13 c
  Matthew Lee 9 D
  Elizabeth Garcia 10 C
  Lori Newcombe 6 D
  Jose Martinez 7 d
  Brian Williams 11 C
We will:
- derive the mean score using the MEAN function;
- transform the values of Type to upper case;
- get the day of the month from the SAS date.

12 Example: pumpkin carving contest data
  DATA contest;
    INFILE 'C:\pumpkin.txt';
    /* Type holds c/d codes, so it must be read as character ($) */
    INPUT Name $16. Age Type $ Date MMDDYY10. (Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
    AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
    DayEntered = DAY(Date);
    Type = UPCASE(Type);
  run;

13 Using the IF-THEN statement
The IF-THEN statement is used for conditional processing.
Example: you want to derive mean test scores for female students but not for male students. Here the mean is computed only when gender is female.
Syntax: IF condition THEN action;
E.g.: If gender = 'F' then mean_score = mean(scr1, scr2);

14 Using the IF-THEN statement
List of logical comparison operators:
Logical comparison         Mnemonic   Symbol
Equal to                   EQ         =
Not equal to               NE         ^= or ~=
Less than                  LT         <
Less than or equal to      LE         <=
Greater than               GT         >
Greater than or equal to   GE         >=
Equal to one in a list     IN
Note: Missing numeric values are treated as the most negative values you can reference on your computer.

15 Using the IF-THEN statement
Example: We have data containing the following information on subjects:
  Age Gender Midterm Quiz FinalExam
  21 M 80 B F 90 A M 87 B F 80 C F 95 A M 88 C 93
Task: group students based on their age (<20, 20 to <40, 40 to <60, >=60).

16
  data conditional;
    input Age Gender $ Midterm Quiz $ FinalExam;
    datalines;
  21 M 80 B F 90 A M 87 B F 80 C F 95 A M 88 C 93
  ;
  data new1;
    set conditional;
    if Age < 20 then AgeGroup = 1;
    if 20 <= Age < 40 then AgeGroup = 2;
    if 40 <= Age < 60 then AgeGroup = 3;
    if Age >= 60 then AgeGroup = 4;
  run;

17 Multiple conditions with AND and OR
Syntax: IF condition1 AND condition2 THEN action;
E.g.:
  If age < 40 and gender = 'F' then group = 1;
  If age < 40 or gender = 'F' then group = 2;

18 IF-THEN statement, multiple conditions
Example: We have data containing the following information on subjects:
  Age Gender Midterm Quiz FinalExam
  21 M 80 B F 90 A M 87 B F 80 C F 95 A M 88 C 93
Task: group students based on their age (<40, >=40) and gender.

19
  data new1;
    set conditional;
    if age < 40 and gender = 'F' then group = 1;
    if age >= 40 and gender = 'F' then group = 2;
    if age < 40 and gender = 'M' then group = 3;
    if age >= 40 and gender = 'M' then group = 4;
  run;

20
Note: Missing numeric values are treated as the most negative values you can reference on your computer.
Example: group age into age groups with missing values
  21 M 80 B F 90 A 93. M 87 B F 80 C F 95 A+ 97. M 88 C
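One consequence of this rule is that a missing age satisfies a condition such as Age < 20. The sketch below uses a made-up three-row data set (ages_demo) to show the effect. The MISSING function used to guard against it is a standard SAS function, but it is not part of the original slides.

```sas
/* Hypothetical data: the second subject has a missing age */
data ages_demo;
    input Age;
    datalines;
15
.
45
;
run;

data grouped_demo;
    set ages_demo;
    /* Unguarded: the missing age compares as smaller than 20 */
    if Age < 20 then NaiveGroup = 1;
    else NaiveGroup = 2;
    /* Guarded: test for missing first so the group stays missing */
    if missing(Age) then AgeGroup = .;
    else if Age < 20 then AgeGroup = 1;
    else AgeGroup = 2;
run;

proc print data=grouped_demo;
run;
```

For the second row NaiveGroup is 1, while AgeGroup stays missing.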

21 Multiple actions with DO, END
Syntax:
  IF condition THEN DO;
    action1;
    action2;
  END;
E.g.:
  If age <= 20 then do;
    group = 1;
    exam_date = "Monday";
  End;

22 IF-THEN statement, with multiple actions
Example: We have data containing the following information on subjects:
  Age Gender Midterm Quiz FinalExam
  21 M 80 B F 90 A M 87 B F 80 C F 95 A M 88 C 93
Task: group students based on their age, and assign a test date based on the age group.
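One possible way to carry out this task is sketched below. The ages, the cutoff of 20 and the exam days are assumptions made for illustration, not the lecture's own solution. The LENGTH statement is added because SAS fixes a character variable's length at its first use, and without it 'Wednesday' would be truncated to the six characters of 'Monday'.

```sas
/* Hypothetical ages; only Age is needed for this sketch */
data students_demo;
    input Age;
    datalines;
18
21
43
;
run;

data exam_demo;
    set students_demo;
    length exam_date $ 9;          /* long enough for 'Wednesday' */
    if Age <= 20 then do;
        group = 1;
        exam_date = 'Monday';
    end;
    else do;
        group = 2;
        exam_date = 'Wednesday';
    end;
run;

proc print data=exam_demo;
run;
```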

23 IF-THEN/ELSE statement
Syntax:
  IF condition1 THEN action1;
  ELSE IF condition2 THEN action2;
  ELSE IF condition3 THEN action3;
The IF-THEN/ELSE statement has two advantages over a series of IF-THEN statements:
(1) It is more efficient and uses less computing time, because once a condition is true the remaining ELSE branches are skipped.
(2) The ELSE logic ensures that your groups are mutually exclusive, so that you do not put one observation into more than one group.

24 IF-THEN/ELSE statement
  data new1;
    set conditional;
    if Age < 20 then AgeGroup = 1;
    else if Age >= 20 and Age < 40 then AgeGroup = 2;
    else if Age >= 40 and Age < 60 then AgeGroup = 3;
    else if Age >= 60 then AgeGroup = 4;
  run;

  DATA contest;
    INFILE 'C:\pumpkin.txt';
    /* Type holds c/d codes, so it must be read as character ($) */
    INPUT Name $16. Age Type $ Date MMDDYY10. (Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
    If age <= 10 then mean_score = mean(Scr1, Scr2);
    else mean_score = mean(Scr1, Scr2, Scr3, Scr4, Scr5);
    AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
    DayEntered = DAY(Date);
    Type = UPCASE(Type);
  run;

25 The IN operator
If you want to test whether a value is one of several possible choices, you can use multiple OR conditions like this:
  IF grade = 'A' or grade = 'B' or grade = 'C' then PASS = 'yes';
An alternative is to use the IN operator. The values in the list can be separated by blanks or by commas:
  IF grade in ('A' 'B' 'C') then PASS = 'yes';
  IF grade in ('A', 'B', 'C') then PASS = 'yes';
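A self-contained sketch of the IN operator follows, using made-up grades. The ELSE branch and the LENGTH statement are additions for illustration and are not on the slide.

```sas
/* Hypothetical grades */
data grades_demo;
    input grade $;
    datalines;
A
C
F
;
run;

data pass_demo;
    set grades_demo;
    length PASS $ 3;                            /* room for 'yes' and 'no' */
    if grade in ('A', 'B', 'C') then PASS = 'yes';
    else PASS = 'no';
run;

proc print data=pass_demo;
run;
```

The first two rows get PASS = 'yes' and the last row gets PASS = 'no'.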

26 Simplifying programs with Arrays
A SAS array is a collection of elements (usually SAS variables) that lets you write SAS statements referring to the whole group of variables. Arrays are defined with the ARRAY statement:
  ARRAY name (n) variable-list;
name: the name you give to the array
n: the number of variables in the array
E.g.: ARRAY store (4) macys sears target costco;
  store(1) is the variable macys
  store(2) is the variable sears

27 Simplifying programs with Arrays
A radio station is conducting a survey asking people to rate 10 songs. The rating is on a scale of 1 to 5, with 1 = do not like the song and 5 = like the song. If a listener does not want to rate a song, he or she enters a "9" to indicate a missing value.
Here is the data with location, listener's age and ratings for the 10 songs:
  Albany
  Richmond
  Oakland
  Richmond
  Berkeley
We want to change each 9 to a missing value (.).

28 Simplifying programs with Arrays
  DATA songs;
    INFILE 'F:\radio.txt';
    INPUT City $ 1-15 Age domk wj hwow simbh kt aomm libm tr filp ttr;
    ARRAY song (10) domk wj hwow simbh kt aomm libm tr filp ttr;
    DO i = 1 TO 10;
      IF song(i) = 9 THEN song(i) = .;
    END;
  run;

29 Using shortcuts for lists of variable names
When writing SAS programs, we often need to write a list of variable names. When a data set has many variables, a shortcut for lists of variable names is helpful.
Numbered range list: variables that start with the same characters and end with consecutive numbers can be part of a numbered range list.
E.g.:
  INPUT cat8 cat9 cat10 cat11;
  INPUT cat8 - cat11;

30 Using shortcuts for lists of variable names
Name range list: a name range list depends on the internal order, or position, of the variables in a SAS data set. This order is determined by the appearance of the variables in the DATA step.
E.g.:
  Data new;
    Input x1 x2 y2 y3;
  Run;
Then the internal order is: x1 x2 y2 y3
The shortcut for this variable list uses a double hyphen: x1 -- y3
PROC CONTENTS with the POSITION option can be used to find out the internal order.
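The two kinds of range list can be compared in a small sketch. The single data row and the variable names total_all and total_x are made up. PROC CONTENTS is shown with the VARNUM option, which lists variables in their internal order.

```sas
/* Hypothetical one-row data set matching the example above */
data new;
    input x1 x2 y2 y3;
    total_all = sum(of x1 -- y3);   /* name range: x1 x2 y2 y3, gives 10 */
    total_x   = sum(of x1 - x2);    /* numbered range: x1 x2, gives 3 */
    datalines;
1 2 3 4
;
run;

/* List the variables in their internal (creation) order */
proc contents data=new varnum;
run;
```

The name range x1 -- y3 stops at y3, so total_all and total_x, which are created later in the step, are not part of it.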

31 Using shortcuts for lists of variable names
  DATA songs;
    INFILE 'C:\radio.txt';
    INPUT City $ 1-15 Age domk wj hwow simbh kt aomm libm tr filp ttr;
    ARRAY new (10) Song1 - Song10;
    ARRAY old (10) domk -- ttr;
    DO i = 1 TO 10;
      IF old(i) = 9 THEN new(i) = .;
      ELSE new(i) = old(i);
    END;
    AvgScore = MEAN(OF Song1 - Song10);
  run;
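The same pattern can be tried without the radio.txt file. The sketch below is a scaled-down version with three songs and two made-up listeners. The variable names s_a, s_b and s_c are hypothetical stand-ins for the song variables.

```sas
/* Scaled-down, self-contained version with hypothetical ratings */
data songs_demo;
    input City $ Age s_a s_b s_c;
    array new (3) Song1 - Song3;    /* creates Song1, Song2, Song3 */
    array old (3) s_a -- s_c;       /* name range over the input variables */
    do i = 1 to 3;
        if old(i) = 9 then new(i) = .;
        else new(i) = old(i);
    end;
    AvgScore = mean(of Song1 - Song3);   /* mean of non-missing ratings */
    drop i;
    datalines;
Albany 54 4 9 5
Oakland 27 9 9 2
;
run;

proc print data=songs_demo;
run;
```

For the Albany row the ratings become 4, missing and 5, so AvgScore is 4.5. For the Oakland row only the last rating survives, so AvgScore is 2. Keeping the original variables and writing the cleaned values to a second array leaves the raw data available for checking.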