By Sasikumar Palanisamy


SAS Supervisor within You vs SAS: Debugging SAS Data Steps with the Program Data Vector

Agenda: Introduction; SAS Supervisor; SAS DATA Step; Compilation Phase; Execution Phase; Return Controls; What is PDV?; Data Flow Process through PDV; Summary. Proprietary & Confidential. © 2015 Chiltern

Introduction: The SAS DATA step is the primary method you have used to create SAS data sets. Most of the time it works as expected, but sometimes it doesn't. Sometimes you know why, and sometimes you don't. Stuck in a rut, scratching your head, wondering what went wrong and where? If this describes you, let's talk.

SAS Supervisor: The SAS Supervisor controls the execution flow of a SAS job, and it has a look-ahead capability. Why learn about the SAS Supervisor? To know how a SAS program is structured and controlled, and to know how to examine SAS code from the supervisor's point of view. The SAS Supervisor is keyword driven. DATA step keywords, for example: INPUT, INFILE, PUT, IF-THEN. PROC step keywords, for example: VAR, ID, TABLE, CLASS. Keywords valid anywhere, for example: TITLE, OPTIONS.

SAS DATA Step: A SAS DATA step consists of a group of SAS statements that begins with a DATA statement. The statements that make up the DATA step are compiled, and the syntax is checked. If the syntax is correct, the statements are executed. In its simplest form, the DATA step is a loop with an automatic output and return action. (Diagram: the compilation phase and the execution phase lead to the new SAS data set, which is made up of a descriptor portion and a data portion.)

Compilation Phase: During this phase, each statement within the DATA step is scanned for syntax errors. The input buffer, the PDV, and the descriptor portion of the SAS data set are created by the end of the compilation phase. In short, SAS compiles the statements (including syntax checking) and creates an input buffer, a program data vector, and descriptor information.

Execution Phase: (1) The variables in the PDV are initialized to missing. (2) The DATA step program is executed. (3) The values of the variables in the PDV are output (copied) to the output SAS data set. (4) These steps repeat until the input data source is exhausted.

Execution Phase (flowchart): Execution begins with the DATA statement, which counts iterations. SAS sets variable values to missing in the PDV and then reaches the data-reading statement: is there a record to read? If yes, SAS reads an input record, executes any additional executable statements, writes an observation to the SAS data set, and returns to the beginning of the DATA step. If no, SAS closes the data set and goes on to the next DATA or PROC step.
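The flowchart above can be sketched as a small Python model. This is not SAS code and not how SAS is implemented; `run_data_step`, `double_xx`, and the variable names `xx` and `yy` are hypothetical names chosen for illustration, and the model simply follows the steps listed on the slide.

```python
def run_data_step(records, statements):
    """Hypothetical model of the implicit DATA step loop (not SAS itself)."""
    output = []
    source = iter(records)
    n = 0
    while True:
        n += 1  # the DATA statement counts iterations (_N_)
        # set variable values to missing in the PDV
        pdv = {"_N_": n, "_ERROR_": 0, "xx": None, "yy": None}
        record = next(source, None)  # data-reading statement
        if record is None:
            break  # no record to read: close the data set
        pdv.update(record)
        for statement in statements:  # additional executable statements
            statement(pdv)
        # implied output: copy the non-automatic variables to the data set
        output.append({k: v for k, v in pdv.items() if not k.startswith("_")})
    return output, n


def double_xx(pdv):
    """Example statement, assumed for illustration: yy = 2 * xx."""
    pdv["yy"] = 2 * pdv["xx"]


rows, iterations = run_data_step([{"xx": 5}, {"xx": 8}, {"xx": 3}], [double_xx])
print(rows)
print(iterations)
```

In this model three records produce three observations but four iterations, because the loop only discovers that the input is exhausted when the fourth read fails.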

Return Controls: SAS statements and events that return control to the SAS Supervisor. ABORT or STOP: if an ABORT or STOP statement is executed, control is passed to the SAS Supervisor. Implied return: executed whenever you reach the bottom of the DATA step. DELETE statement: execution of a DELETE statement passes control to the SAS Supervisor. Subsetting IF statement: if the expression evaluates as false, control is returned to the SAS Supervisor. RETURN statement: execution of a RETURN statement passes control to the SAS Supervisor. Failed read operation: if a read operation (INPUT, SET, MERGE) fails, control passes to the SAS Supervisor.

What is PDV? The Program Data Vector (PDV) is a logical area of memory that is created during DATA step processing. SAS builds a SAS data set by reading one observation at a time into the PDV and, unless the code directs otherwise, writes the observation to the target data set. The PDV contains two types of variables. Permanent: data set variables and computed variables. Temporary: automatic variables (_N_ and _ERROR_) and option-defined variables (e.g., FIRST.by-variable, LAST.by-variable, IN=variable, END=variable).
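To make the option-defined temporary variables concrete, here is a minimal Python sketch of how FIRST. and LAST. flags behave for a sorted BY variable. It only imitates the behavior; `by_group_flags` is a hypothetical helper, and the sample values are assumed for illustration.

```python
def by_group_flags(values):
    """Return (value, first, last) for each item of a list sorted by the BY variable."""
    flags = []
    for i, value in enumerate(values):
        # first = 1 when the previous value differs (or there is none)
        first = int(i == 0 or values[i - 1] != value)
        # last = 1 when the next value differs (or there is none)
        last = int(i == len(values) - 1 or values[i + 1] != value)
        flags.append((value, first, last))
    return flags


flags = by_group_flags(["aa", "bb", "bb"])
for value, first, last in flags:
    print(value, first, last)
```

A group with a single observation has both flags set to 1, while the first of two observations in a group has FIRST. = 1 and LAST. = 0.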

Data Flow Process through PDV: Input data set (VARLBL, XX): aa 5; bb 8; bb 3. PDV columns: _N_, _ERROR_, FIRST.varlbl, LAST.varlbl, varlbl, xx, yy, zz. PDV at the start of the first iteration: _N_ = 1, with the other values missing (.). PDV after the first record is processed: _N_ = 1, varlbl = aa, xx = 5, yy = 10. Output data set (VARLBL, XX, YY, ZZ): aa 5 10; bb 3 6 ??.

Data Flow Process through PDV: Here is the quick math for the 'bb' records. 1st record: xx = 8; yy = 2 * 8 = 16; zz = 0 + 16 = 16. 2nd record: xx = 3; yy = 2 * 3 = 6; zz = 16 + 6 = 22, since zz is retained. But according to the SAS Supervisor, "22" is incorrect. Input data set (VARLBL, XX): aa 5; bb 8; bb 3.
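The hand calculation above can be checked with a few lines of Python. This is only the arithmetic the programmer expected, with zz starting at 0 for the 'bb' group and carried across records; it does not model what SAS actually did.

```python
zz = 0  # zz starts at 0 for the 'bb' group and is retained across records
trace = []
for xx in (8, 3):  # the two 'bb' records
    yy = 2 * xx
    zz = zz + yy
    trace.append((xx, yy, zz))

print(trace)
```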

Data Flow Process through PDV: SAS's answer is 6. What went wrong in the logic, and how do we debug it? Let's use the PDV to look at the processing more closely. We add four PUT statements for debugging.

Data Flow Process through PDV: Here is the result from the log. SAS did not write the 4th PUT statement for the first 'bb' record. Because the subsetting IF statement on last.varlbl was false for that record, SAS skipped the statements that followed it and passed control back to the top of the DATA step to read the next record.

Data Flow Process through PDV: The fix is to move "if last.varlbl" so that it is the last statement. Output data set (VARLBL, XX, YY, ZZ): aa 5 10; bb 3 6 22.
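The effect of the statement order can be reproduced with a small Python model. The original SAS code is not part of this transcript, so the step order below is an assumption reconstructed from the slides: zz is retained and reset to 0 at the first record of each group, yy = 2 * xx, zz = zz + yy, and a subsetting IF on last.varlbl. `run_step` and `subset_before_sum` are hypothetical names.

```python
RECORDS = [("aa", 5), ("bb", 8), ("bb", 3)]


def run_step(records, subset_before_sum):
    """Model one DATA step; the flag places the subsetting IF before or after the sum."""
    output = []
    zz = None  # retained across iterations
    for i, (varlbl, xx) in enumerate(records):
        first = i == 0 or records[i - 1][0] != varlbl
        last = i == len(records) - 1 or records[i + 1][0] != varlbl
        if first:
            zz = 0  # assumed reset at the start of each BY group
        yy = 2 * xx
        if subset_before_sum:
            if not last:
                continue  # subsetting IF is false: back to the top, sum skipped
            zz = zz + yy
        else:
            zz = zz + yy
            if not last:
                continue  # subsetting IF as the last statement: only output is skipped
        output.append((varlbl, xx, yy, zz))
    return output


buggy = run_step(RECORDS, subset_before_sum=True)
fixed = run_step(RECORDS, subset_before_sum=False)
print(buggy)
print(fixed)
```

With the subsetting IF ahead of the sum, the first 'bb' record never contributes to zz, so the 'bb' total is 6. With the subsetting IF last, both records are accumulated and the total is 22.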

Summary: We should think like the SAS Supervisor! Key points: the DATA step process; the compilation phase; the execution phase; return controls; the PDV and its importance.

Thanks to: PhilaSUG 2016 for the opportunity to present, and Terek Peterson for reviewing this presentation and providing guidance.

Contact Information: Sasikumar Palanisamy, Chiltern International Ltd., 1016 W 9th Ave, King of Prussia, PA, (484)-679-2689, Sasikumar.Palanisamy@chiltern.com