Chapter 4: Using Lookup Tables to Match Data: Arrays


1 Chapter 4: Using Lookup Tables to Match Data: Arrays
4.1: Introduction to Lookup Techniques
4.2: Using One-Dimensional Arrays as Lookup Tables
4.3: Using a Multidimensional Array as a Lookup Table
4.4: Loading a Multidimensional Array from a SAS Data Set


3 Objectives
Define table lookup.
List table lookup techniques.
Instructor note: Let's make sure that we are all talking about the same type of processing when we talk about "table lookup".

4 Table Lookups
Lookup values for a table lookup can be stored in the following:
array
hash object
format
data set
Lookup techniques include the following:
array subscript value
hash object key value
FORMAT statement, PUT function
merge, join
Diagram labels: Data Values, lookup, Lookup Values.
Instructor note: The values are what we are looking up. The technique is how we are doing the lookup.


6 4.01 Multiple Choice Poll
Which of these is an example of a table lookup?
A. You have the data for January sales in one data set, February sales in a second data set, and March sales in a third. You need to create a report for the entire first quarter.
B. You want to send birthday cards to employees. The employees' names and addresses are in one data set and their birth dates are in another.
C. You need to calculate the amount each customer owes for his purchases. The price per item and the number of items purchased are stored in the same data set.

7 4.01 Multiple Choice Poll – Correct Answer
B. You want to send birthday cards to employees. The employees' names and addresses are in one data set and their birth dates are in another.

8 Overview of Table Lookup Techniques
Arrays, hash objects, and formats provide an in-memory lookup table. The merge and join use lookup values that are stored on disk.
Instructor note: This is important because the size of the data set can determine whether you can use arrays, hash objects, or formats.

9 Overview of Arrays
An array is similar to a row of buckets, numbered 1, 2, 3, 4.
Instructor note: This explanation is for one-dimensional arrays. We'll see multidimensional arrays in the next section, but the picture isn't much different. A multidimensional array is still similar to a row of buckets, but each bucket is identified with multiple numbers.

10 Overview of Arrays
An array is similar to a numbered row of buckets. SAS puts a value in a bucket based on the bucket number.
Instructor note: In the case of multidimensional arrays, SAS puts a value in a bucket based on multiple numbers.

11 Overview of Arrays
An array is similar to a numbered row of buckets. SAS puts a value in a bucket based on the bucket number. A value is retrieved from a bucket based on the bucket number.
Instructor note: In the case of multidimensional arrays, SAS retrieves a value from a bucket based on multiple numbers.

12 Overview of a Hash Object
A hash object is similar to rows of buckets that are identified by the value of a key. Each row has a Key bucket and one or more Data buckets.
Instructor note: The key value can have multiple data items associated with it.

13 Overview of a Hash Object
A hash object is similar to rows of buckets that are identified by the value of a key. SAS puts value(s) in the data bucket(s) based on the value(s) in the key bucket.
Instructor note: Go over this briefly. We'll spend more time on this in Chapter 5.

14 Overview of a Hash Object
A hash object is similar to rows of buckets that are identified by the value of a key. SAS puts value(s) in the data bucket(s) based on the value(s) in the key bucket. Value(s) are retrieved from the data bucket(s) based on the value(s) in the key bucket.
Instructor note: Go over this briefly. We'll spend more time on this in Chapter 5.
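
To make the key bucket and data bucket picture concrete, here is a minimal sketch of a hash object lookup. The variable names Code and Name, the object name lookup, and the data values are made up for illustration and do not come from the course data.

```sas
data _null_;
   length Code $ 2 Name $ 20;        /* hypothetical key and data variables */
   declare hash lookup();
   rc=lookup.defineKey('Code');      /* key bucket */
   rc=lookup.defineData('Name');     /* data bucket */
   rc=lookup.defineDone();

   /* Put values in the data bucket based on the value in the key bucket. */
   Code='AU'; Name='Australia';     rc=lookup.add();
   Code='US'; Name='United States'; rc=lookup.add();

   /* Retrieve the data value based on the key value. */
   Code='AU'; Name=' ';
   rc=lookup.find();
   if rc=0 then put Code= Name=;     /* rc=0 means the key was found */
run;
```

The FIND method returns zero when the key exists and copies the stored data value into Name in the PDV.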

15 Overview of a Format
A format is similar to rows of buckets that are identified by the data value. Each row has a Data Value bucket and a Label bucket.
Instructor note: Go over this briefly. We'll spend more time on this in Chapter 6.

16 Overview of a Format
A format is similar to rows of buckets that are identified by the data value. SAS puts data values and label values in the buckets when the format is used in a FORMAT statement, PUT function, or PUT statement.
Instructor note: Go over this briefly. We'll spend more time on this in Chapter 6.

17 Overview of a Format
A format is similar to rows of buckets that are identified by the data value. SAS puts data values and label values in the buckets when the format is used in a FORMAT statement, PUT function, or PUT statement. SAS uses a binary search on the data value bucket in order to return the value in the label bucket.
Instructor note: Go over this briefly. We'll spend more time on this in Chapter 6.
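
As a small sketch of a format used as a lookup table, the following defines a character format and applies it with the PUT function. The format name $country and its values are hypothetical, chosen only for illustration.

```sas
proc format;
   value $country                  /* hypothetical format name and values */
      'AU'='Australia'
      'US'='United States'
      other='Unknown';
run;

data _null_;
   Code='AU';
   /* The data value is looked up and the matching label is returned. */
   Name=put(Code,$country.);
   put Code= Name=;
run;
```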

18 Overview of Merges and Joins
The DATA step MERGE and the SQL join operators are similar to multiple stacks of buckets that are referred to by the value of one or more common variables. Each stack has a By Value bucket and a Data bucket.
Instructor note: Everyone should be familiar with DATA step MERGEs and SQL joins.
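
For comparison with the in-memory techniques, here is a minimal sketch of a lookup by a common variable, modeled on the birthday card scenario from the earlier poll. The data set names addresses, birthdates, cards, and cards_sql and all of the values are made up for illustration.

```sas
data addresses;
   input Employee_ID Name $;
   datalines;
101 Ann
102 Bob
;
run;

data birthdates;
   input Employee_ID Birth_Date :date9.;
   format Birth_Date date9.;
   datalines;
101 12MAR1980
102 05JUL1975
;
run;

/* DATA step MERGE: both inputs must be sorted by the BY variable. */
data cards;
   merge addresses birthdates;
   by Employee_ID;
run;

/* A comparable SQL inner join on the common variable. */
proc sql;
   create table cards_sql as
   select a.Employee_ID, a.Name, b.Birth_Date
      from addresses as a, birthdates as b
      where a.Employee_ID=b.Employee_ID;
quit;
```

The two results match here because every Employee_ID appears exactly once in each input; with unmatched or repeated keys, a merge and an inner join can differ.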


20 4.02 Multiple Answer Poll
Which techniques do you currently use when performing table lookups?
Arrays
Hash object
Formats
Merges
Joins
There is no single answer to this poll. It is for your information only.

21 Chapter 4: Using Lookup Tables to Match Data: Arrays
4.2: Using One-Dimensional Arrays as Lookup Tables

22 Objectives
Define one-dimensional arrays.
Use a one-dimensional array for a table lookup task.

23 Overview of Arrays (Review)
An array is similar to a row of numbered buckets (1, 2, 3, 4). SAS puts a value in a bucket based on the bucket number. A value is retrieved from a bucket based on the bucket number.
Instructor note: This is review, so just ask if there are questions.

24 Defining Arrays (Review)
An array is a temporary grouping of SAS variables that are arranged in a particular order and identified by an array name. An array exists only for the duration of the current DATA step.
The following tasks can be accomplished using an array:
performing repetitive calculations on a group of variables
creating many variables with the same attributes
restructuring data
performing a table lookup with one or more numeric factors
Instructor note: Go over these quickly. We'll be using arrays primarily for the last task, table lookups. Students will see restructuring data as a side task in the first example.

25 Using One-Dimensional Arrays (Review)
To use an array, declare the array by using an ARRAY statement.
General form for the one-dimensional ARRAY statement:
ARRAY array-name {number-of-elements} <$> <length> <list-of-variables> <(initial-values)>;
Instructor note: Cover this quickly. Students should already know this. The following slides are animations of each portion of the ARRAY statement.

26 Using One-Dimensional Arrays (Review)
Examples of an ARRAY statement follow.
array numarray{3} num1-num3;
array char{4} $ 6;
array num{5} _temporary_ (5, 6, 7, 8, 9);
array yr{2000:2002} yr2000 yr2001 yr2002;
Instructor note: These slides are animations of each portion of the ARRAY statement.

27 Using One-Dimensional Arrays (Review)
The array name follows the ARRAY keyword: numarray, char, num, and yr.

28 Using One-Dimensional Arrays (Review)
The number of elements is given in braces: {3}, {4}, {5}, and {2000:2002}.

29 Using One-Dimensional Arrays (Review)
array numarray{3} num1-num3;
names three numeric variables
array char{4} $ 6;
creates four character variables, char1-char4, each with a length of 6
array num{5} _temporary_ (5, 6, 7, 8, 9);
creates five temporary numeric elements and stores the numeric values 5, 6, 7, 8, 9
array yr{2000:2002} yr2000 yr2001 yr2002;
names three numeric variables
Instructor note: Point out the _temporary_ keyword in the num array example. Temporary array elements are automatically retained across iterations of the DATA step, so the initial values persist. PDV variables that an ARRAY statement creates without initial values are reset to missing at the top of each DATA step iteration after the first.
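
The ARRAY statement examples above can be exercised in a short DATA step. This is only a sketch: the loop, the multiplication by 10, and the variable total are illustrative additions, not part of the course program.

```sas
data _null_;
   array numarray{3} num1-num3;
   array num{5} _temporary_ (5, 6, 7, 8, 9);

   /* Put a value in each bucket based on the bucket number. */
   do i=1 to 3;
      numarray{i}=num{i}*10;
   end;

   /* Retrieve values from buckets based on the bucket number. */
   total=numarray{1}+numarray{3};
   put num1= num2= num3= total=;   /* log: num1=50 num2=60 num3=70 total=120 */
run;
```

Because numarray names the PDV variables num1 through num3, assigning to numarray{2} is the same as assigning to num2. The elements of num have no variable names and can be reached only through the array reference.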


31 4.03 Multiple Choice Poll
How many elements are referenced by the following ARRAY statement?
array numarray{*} num1-num12;
Choices shown: 1, 12, Unknown

32 4.03 Multiple Choice Poll – Correct Answer
How many elements are referenced by the ARRAY statement array numarray{*} num1-num12;?
C. 12

33 The DIM Function
You can use the DIM function to return the number of elements in an array. For example, use the DIM function to provide the end value for a DO loop.
array numarray{*} num1-num12;
<additional statements>
do i=1 to dim(numarray);
end;
Equivalent code:
array numarray{12} num1-num12;
<additional statements>
do i=1 to 12;
end;
Instructor note: Point out that if you specify the array dimension with an asterisk (*), you need to name the elements so that SAS can determine how many elements the array has. There is information about the DIM function under this slide.
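
A runnable sketch of the DIM function follows. The second half is an addition that goes beyond the slide: when an array has custom bounds such as {1974:2007}, DIM still returns the element count, but a loop over the subscripts is better written with the LBOUND and HBOUND functions. The assignments inside the loops are placeholders for illustration.

```sas
data _null_;
   array numarray{*} num1-num12;
   do i=1 to dim(numarray);
      numarray{i}=i*i;              /* placeholder calculation */
   end;
   n=dim(numarray);
   put n= num1= num12=;             /* log: n=12 num1=1 num12=144 */

   /* With bounds 1974:2007 the subscripts do not start at 1. */
   array yr{1974:2007} Yr1974-Yr2007;
   n_yr=dim(yr);
   do y=lbound(yr) to hbound(yr);
      yr{y}=0;                      /* placeholder assignment */
   end;
   put n_yr=;                       /* log: n_yr=34 */
run;
```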

34 Business Scenario
The data set orion.employee_payroll contains each employee's hire date and current salary.
Partial Listing of orion.employee_payroll
Employee_ID Employee_Gender Salary Birth_Date Employee_Hire_Date Term_Date Marital_Status Dependents
120101 M 163040 18AUG1976 01JUL2003 . S
120102 108255 11AUG1969 01JUN1989 O 2
120103 87975 22JAN1949 01JAN1974 1
120104 F 46230 11MAY1954 01JAN1981
120105 27110 21DEC1974 01MAY1999
Instructor note: Point out that each observation has a hire date and that we can use the YEAR function to determine the year in which the employee was hired.

35 Business Scenario
The data set orion.salary_stats contains statistics for all Orion Star employees for the years 1974 through 2007. For example, the average salary of the employees hired in 1974 is currently about $39,000.
Partial Listing of orion.salary_stats
Statistic Yr1974 Yr1975 Yr1976 . . . Yr2006 Yr2007
Num_of_Emps 61 4 6 97 3
Median_Salary 30025 30020 26970 27240
Std_Salary
Sum_Salary 132150 235030 86585
Avg_Salary
Instructor note: The data set orion.salary_stats has one column for each year.

36 Business Scenario
The two data sets must be combined to calculate the difference between the average salary and the actual current salary for each employee based on the year of hire.
Partial Listing of compare
Using One Dimensional Arrays
Obs Employee_ID Year_Hired Salary Average Salary_Dif
$163, $35, $127,957.50
$108, $88, $19,666.25
$87, $39, $48,731.39
$46, $36, $9,793.33
$27, $36, $-9,423.75
$26, $39, $-12,283.61
$30, $39, $-8,768.61
$27, $27, $


38 Setup for the Poll
The two data sets that need to be combined are as follows:
Partial Listing of orion.salary_stats
Statistic Yr1974 Yr1975 Yr1976 . . . Yr2006 Yr2007
Avg_Salary
Partial Listing of orion.employee_payroll
Employee_ID Employee_Gender Salary Birth_Date Employee_Hire_Date . . .
120101 M 163040 18AUG1976 01JUL2003
120102 108255 11AUG1969 01JUN1989
120103 87975 22JAN1949 01JAN1974
120104 F 46230 11MAY1954 01JAN1981
120105 27110 21DEC1974 01MAY1999

39 4.04 Poll
Can the two data sets be merged with the DATA step MERGE statement or joined with the SQL procedure?
Yes
No

40 4.04 Poll – Correct Answer
Can the two data sets be merged with the DATA step MERGE statement or joined with the SQL procedure?
No. They currently do not have a variable in common.

41 4.05 Poll
What do the two data sets have in common?
They have the year in common.
They have nothing in common.

42 4.05 Poll – Correct Answer
What do the two data sets have in common?
They have the year in common. In the data set orion.salary_stats, each column except the first represents a year. In the data set orion.employee_payroll, the year can be extracted from the Employee_Hire_Date variable.

43 Using a One-Dimensional Array
data compare;
   keep Employee_ID Year_Hired Salary Average Salary_Dif;
   format Salary Average Salary_Dif dollar12.2;
   array yr{1974:2007} Yr1974-Yr2007;
   if _n_=1 then set orion.salary_stats (where=(Statistic='Avg_Salary'));
   set orion.employee_payroll (keep=Employee_ID Employee_Hire_Date Salary);
   Year_Hired=year(Employee_Hire_Date);
   Average=yr{Year_Hired};
   Salary_Dif=Salary-Average;
run;
Program: p304d01
Instructor note: This is the big picture slide. Go over the three highlighted statements quickly.
1) Point out the colon that starts the dimensions at 1974 and ends them at 2007.
2) The SET statement for orion.salary_stats is in the IF _N_=1 statement because only one observation has Statistic='Avg_Salary'. Without the condition, SAS would reach the end of that input on the second iteration and stop the DATA step. When you have multiple SET statements, the DATA step stops at the first end of file that SAS encounters. Variables read with a SET statement are retained, so the Yr1974-Yr2007 values stay in the PDV for every later iteration. We'll talk about this a lot later.
3) This is the lookup statement, where we use the YR array to determine the appropriate average based on the Year_Hired variable.
The transition can be "Now let's look at how execution works."
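
The lookup statement Average=yr{Year_Hired}; assumes that every hire year falls inside the array bounds. A subscript outside the bounds causes an "Array subscript out of range" error at execution time. The following sketch shows one way to guard the lookup. It uses hypothetical miniature data sets (stats and payroll, with made-up values and only the years 2001 through 2003) so that it runs without the Orion library.

```sas
/* Hypothetical miniature versions of the two data sets. */
data stats;
   input Statistic :$12. Yr2001-Yr2003;
   datalines;
Avg_Salary 30000 32000 35000
Num_of_Emps 4 6 3
;
run;

data payroll;
   input Employee_ID Employee_Hire_Date :date9. Salary;
   datalines;
1 01JUL2003 40000
2 15MAR2001 28000
3 01JAN1999 50000
;
run;

data compare;
   keep Employee_ID Year_Hired Salary Average Salary_Dif;
   array yr{2001:2003} Yr2001-Yr2003;
   if _n_=1 then set stats(where=(Statistic='Avg_Salary'));
   set payroll;
   Year_Hired=year(Employee_Hire_Date);
   /* Guard: look up only when the year is a valid subscript. */
   if lbound(yr)<=Year_Hired<=hbound(yr) then Average=yr{Year_Hired};
   else Average=.;
   Salary_Dif=Salary-Average;
run;
```

The employee hired in 1999 gets a missing Average and a missing Salary_Dif instead of stopping the step. Whether a missing value or some other handling is appropriate depends on the report.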

44 Execution
Instructor note: After compilation, this is where the DATA step execution starts. The KEEP, FORMAT, and ARRAY statements are compile-time statements. The ARRAY statement has associated the YR array with each of the variables Yr1974 through Yr2007, so yr{1974} refers to Yr1974, yr{1975} to Yr1975, and so on through yr{2007}.
Partial PDV: all variables are missing. _N_=1.

45 Execution
Instructor note: The first statement to execute is the IF statement. Because the condition is true, the PDV is loaded with the appropriate observation from orion.salary_stats.
Partial PDV: Statistic=Avg_Salary. Yr1974 through Yr2007 hold the average salaries. _N_=1.

46 Execution
Instructor note: When this SET statement executes, an observation from orion.employee_payroll is loaded into the PDV. After this, the Year_Hired variable can be calculated.
Partial PDV: Employee_ID=120101, Employee_Hire_Date=01JUL2003, Salary=163040, Year_Hired=., Average=.

47 Execution
Instructor note: The variable Year_Hired is what we'll use to do the table lookup.
Partial PDV: Year_Hired=2003.

48 Execution
Instructor note: We used the separate statement Year_Hired=year(Employee_Hire_Date); and then used the Year_Hired variable as the index value to do the lookup. These could be combined into Average=yr{year(Employee_Hire_Date)}; but the program is easier to follow with two separate statements. There is not much difference in efficiency from using two statements, as long as your data sets are relatively small. The next couple of slides show how the assignment statement works.

49 Execution
Average=yr{Year_Hired}; resolves to Average=yr{2003};
Instructor note: The next couple of slides show how the assignment statement works.

50 Execution
Instructor note: All this does is the calculation (Salary_Dif=Salary-Average).

51 Execution
Implicit OUTPUT; Implicit RETURN;
Instructor note: After the implicit output, control returns to the top of the DATA step.

52 Execution
Continue until EOF
Instructor note: Execution continues until the end of file is encountered.

53 Resulting Data
proc print data=compare(obs=8);
   var Employee_ID Year_Hired Salary Average Salary_Dif;
   title 'Using One Dimensional Arrays';
run;
PROC PRINT Output
Using One Dimensional Arrays
Obs Employee_ID Year_Hired Salary Average Salary_Dif
$163, $35, $127,957.50
$108, $88, $19,666.25
$87, $39, $48,731.39
$46, $36, $9,793.33
$27, $36, $-9,423.75
$26, $39, $-12,283.61
$30, $39, $-8,768.61
$27, $27, $
Instructor note: This is the result. Notice the VAR statement in PROC PRINT, which controls the order of the variables.


55 4.06 Multiple Answer Poll
Which of the following ARRAY statements is similar to the statement array yr{1974:2007} Yr1974-Yr2007; and will compile without errors?
A. array yr{34} Yr1974-Yr2007;
B. array yr{ } Yr1974-Yr2007;
C. array yr{74:07} Yr1974-Yr2007;
D. array yr{74-07} Yr1974-Yr2007;
E. array yr{*} Yr1974-Yr2007;

56 4.06 Multiple Answer Poll – Correct Answers
Which of the following ARRAY statements is similar to the statement array yr{1974:2007} Yr1974-Yr2007; and will compile without errors?
A and E. B, C, and D won't compile without errors. B and D use a dash (-) instead of a colon (:) for the range of index values. In answer C, the stop value is smaller than the start value.


58 Exercise These exercises reinforce the concepts discussed previously.

59 Chapter 4: Using Lookup Tables to Match Data: Arrays
4.3: Using a Multidimensional Array as a Lookup Table

60 Objectives
Define a multidimensional array.
Explain the differences between a one-dimensional array and a multidimensional array.
Use a multidimensional array as a lookup table.

61 Business Scenario
The SAS data set orion.profit has information about every company for the years 2003 through 2007, separated by month.
Partial Listing of orion.profit(where=(Sales ne .))
Company YYMM Sales Cost Salaries profit
Logistics 03M01 $457,809 $210,914 $127,525 $119,370
03M02 $325,138 $149,718 $47,895
03M03 $276,805 $127,827 $134,198 $14,780
03M04 $558,806 $264,868 $159,741
03M05 $641,954 $303,324 $204,432
03M06 $827,976 $389,207 $304,571
03M07 $819,373 $389,020 $138,047 $292,306
03M08 $794,750 $373,204 $140,206 $281,340
Instructor note: Point out the YYMM variable. We're going to investigate that first (see the next slide).


63 4.07 Quiz
What is the type of the variable YYMM in the data set orion.profit?

64 4.07 Quiz – Correct Answer
What is the type of the variable YYMM in the data set orion.profit?
Use PROC CONTENTS.
proc contents data=orion.profit;
run;
YYMM is a numeric variable. It represents a SAS date.
Instructor note: This is important because we will use the YYMM variable to get the dimensions for the array.

65 Business Scenario
This table contains the budgeted amounts for each of those months and years. Each row represents a month, and each column represents a year.
Yr2003 Yr2004 Yr2005 Yr2006 Yr2007
$1,590,000 $1,880,000 $2,300,000 $1,960,000 $1,970,000 $1,290,000 $1,550,000 $1,830,000 $1,480,000 $1,640,000 $1,160,000 $1,380,000 $1,410,000 $1,440,000 $1,710,000 $2,100,000 $2,420,000 $2,130,000 $2,270,000 $1,990,000 $2,350,000 $2,840,000 $2,480,000 $2,670,000 $2,560,000 $3,020,000 $3,580,000 $3,070,000 $3,410,000 $2,590,000 $2,890,000 $3,550,000 $3,010,000 $3,490,000 $2,550,000 $3,030,000 $3,500,000 $1,070,000 $1,180,000 $1,260,000 $1,520,000 $1,270,000 $1,600,000 $1,360,000 $1,700,000 $1,470,000 $1,780,000 $1,540,000 $1,950,000 $2,870,000 $3,120,000 $3,760,000 $3,210,000 $4,370,000
Instructor note: This is a table of values on a piece of paper at this point. Later, we'll see the same information in a SAS data set.

66 Business Scenario
You need to combine the budget amounts in the table with the actual amounts in the SAS data set to create the following report:
Listing of budget_amt
Actual vs Budgeted Amounts (Two Observations)
Company YYMM Sales Cost Salaries profit BudgetAmt
Logistics 03M $457, $210, $127, $119,370 $1,590,000
Logistics 03M $325, $149, $127, $47,895 $1,290,000
Instructor note: We are using only two observations from orion.profit because we want to type in only a few of the values from the table. Usually, you don't type the values into an array but load them from a SAS data set, which we'll do in a few slides. To get us started, we'll do some typing.


68 4.08 Quiz
What do the data set orion.profit and the table have in common?
Partial Listing of orion.profit (where=(Sales ne .))
Company YYMM Sales Cost Salaries profit
Logistics 03M01 $457,809 $210,914 $127,525 $119,370
03M02 $325,138 $149,718 $47,895
03M03 $276,805 $127,827 $134,198 $14,780
03M04 $558,806 $264,868 $159,741
Lookup table (first two rows)
Yr2003 Yr2004 Yr2005 Yr2006 Yr2007
$1,590,000 $1,880,000 $2,300,000 $1,960,000 $1,970,000
$1,290,000 $1,550,000 $1,830,000 $1,480,000 $1,640,000

69 4.08 Quiz – Correct Answer
What do the data set orion.profit and the table have in common?

They have the month and year in common. In the data set orion.profit, the month and year can be extracted from the variable YYMM. In the lookup table, each row represents a month and each column represents a year.

Instructor note: It is important that the students see that the two have the month and year in common. There are two factors; therefore, we'll use a two-dimensional array for the combination.

70 Overview of Two-Dimensional Arrays
To combine the table of budgeted values with the data set containing profit, use a two-dimensional array. A two-dimensional array is similar to a row of buckets in which each bucket is referred to with two numbers.

1,1  1,2
2,1  2,2

SAS puts a value in a bucket based on multiple numbers. Values are retrieved from a bucket based on multiple numbers.
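The bucket idea can be tried in a few lines. This is a minimal sketch, assuming a hypothetical 2-by-2 temporary array named Bucket and a hypothetical variable named Value; neither name comes from the course data.

```sas
data _null_;
   /* Hypothetical 2 x 2 temporary array: four "buckets" */
   array Bucket{2,2} _temporary_;

   /* Store a value using two numbers: row 1, column 2 */
   Bucket{1,2} = 42;

   /* Retrieve it with the same two numbers */
   Value = Bucket{1,2};

   /* Write the retrieved value to the log */
   put Value=;
run;
```

The log line should show the stored value, which confirms that the pair of subscripts identifies exactly one bucket.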

71 Using Multidimensional Arrays
General form for the multidimensional ARRAY statement:

ARRAY array-name {…,rows,cols} <$> <length> <elements> <(initial values)>;

rows specifies the number of array elements in a row arrangement.
cols specifies the number of array elements in a column arrangement.

Example: array B{2,5} B1-B10;

Instructor note: The ARRAY statement for a multidimensional array is just like the one for a one-dimensional array, except that the curly braces { } hold multiple dimensions that show the number of elements for each factor. You can think of them as the number of elements in the rows and the number of elements in the columns. There is no limit to the number of dimensions you can have in a multidimensional array. Just remember that the entire array has to fit in memory at one time. Be sure to point out the variable names, B1-B10.
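To see how the variables B1-B10 line up with the row and column subscripts, the following sketch walks the example array and writes each element's variable name to the log. The names Row, Col, and Name are hypothetical helpers introduced only for this illustration.

```sas
data _null_;
   /* The example array from the slide: 2 rows, 5 columns */
   array B{2,5} B1-B10;
   length Name $32;

   /* DIM(B,1) is the row count; DIM(B,2) is the column count */
   do Row = 1 to dim(B,1);
      do Col = 1 to dim(B,2);
         /* VNAME returns the name of the variable behind an array element */
         Name = vname(B{Row,Col});
         put Row= Col= Name=;
      end;
   end;
run;
```

The log lists the elements row by row, so the first row corresponds to B1-B5 and the second row to B6-B10.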

72 Using Multidimensional Arrays
array B{2,5} B1-B10 ( , , , , , , , , , );

Yr2003 Yr2004 Yr2005 Yr2006 Yr2007
$1,590,000 $1,880,000 $2,300,000 $1,960,000 $1,970,000
$1,290,000 $1,550,000 $1,830,000 $1,480,000 $1,640,000
$1,160,000 $1,380,000 $1,410,000 $1,440,000
$1,710,000 $2,100,000 $2,420,000 $2,130,000 $2,270,000
$1,990,000 $2,350,000 $2,840,000 $2,480,000 $2,670,000
$2,560,000 $3,020,000 $3,580,000 $3,070,000 $3,410,000
$2,590,000 $2,890,000 $3,550,000 $3,010,000 $3,490,000
$2,550,000 $3,030,000 $3,500,000
$1,070,000 $1,180,000 $1,260,000 $1,520,000
$1,270,000 $1,600,000 $1,360,000 $1,700,000
$1,470,000 $1,780,000 $1,540,000 $1,950,000
$2,870,000 $3,120,000 $3,760,000 $3,210,000 $4,370,000

Instructor note: We're going to start with the first two rows from this table, and we will type the numbers into the ARRAY statement as the initial values. We're also creating 10 variables to hold those constants.

73 Using Multidimensional Arrays
array B{2,5} B1-B10 ( , , , , , , , , , );

PDV
B{1,1}  B{1,2}  B{1,3}  B{1,4}  B{1,5}  B{2,1}  B{2,2}  B{2,3}  B{2,4}  B{2,5}
B1      B2      B3      B4      B5      B6      B7      B8      B9      B10

Instructor note: This is what the PDV would look like.

75 Setup for the Poll The following ARRAY statement creates 10 new variables. array B{2,5} B1-B10 ( , , , , , , , , , );

76 4.09 Multiple Answer Poll
Which of the following would be equivalent to the two-dimensional ARRAY statement on the previous slide?

A. array B{*} B1-B10 ( , , , , , , , , , );
B. array B{2,2003:2007} B1-B10 ( , , , , , , , , , );
C. array B{2,5} ( , , , , , , , , , );
D. array B{2,5} _temporary_ ( , , , , , , , , , );

77 4.09 Multiple Answer Poll – Correct Answers
Which of the following would be equivalent to the two-dimensional ARRAY statement on the previous slide?

A. array B{*} B1-B10 ( , , , , , , , , , );
B. array B{2,2003:2007} B1-B10 ( , , , , , , , , , );
C. array B{2,5} ( , , , , , , , , , );
D. array B{2,5} _temporary_ ( , , , , , , , , , );

Correct answers: B and C. Each creates a two-dimensional array with 2 rows and 5 columns. Answer A creates a one-dimensional array; you cannot use the {*} notation with a multidimensional array. Answer D creates a temporary array, which does not create slots for variables in the PDV.
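One way to check the shape of an array declared with a bounded dimension, such as answer B, is to ask SAS for its dimensions and bounds. This is a small sketch; Rows, Cols, FirstYear, and LastYear are hypothetical variable names, and the initial values are left out because they do not affect the shape.

```sas
data _null_;
   /* Same shape as answer B: 2 rows, columns subscripted 2003 through 2007 */
   array B{2,2003:2007} B1-B10;

   Rows      = dim(B,1);      /* number of elements in dimension 1 */
   Cols      = dim(B,2);      /* number of elements in dimension 2 */
   FirstYear = lbound(B,2);   /* lower bound of dimension 2 */
   LastYear  = hbound(B,2);   /* upper bound of dimension 2 */

   put Rows= Cols= FirstYear= LastYear=;
run;
```

The array still has 2 rows and 5 columns; only the subscript values used to address the columns change.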

78 Business Scenario
Find the budgeted amounts for each company, year, and month.

Partial Listing of orion.profit
Company    YYMM   Sales     Cost      Salaries  profit
Logistics  03M01  $457,809  $210,914  $127,525  $119,370
Logistics  03M02  $325,138  $149,718  $127,525  $47,895

+

Yr2003      Yr2004      Yr2005      Yr2006      Yr2007
$1,590,000  $1,880,000  $2,300,000  $1,960,000  $1,970,000
$1,290,000  $1,550,000  $1,830,000  $1,480,000  $1,640,000

=

Partial Listing of budget_amt
Company    YYMM   Sales     Cost      Salaries  profit    BudgetAmt
Logistics  03M01  $457,809  $210,914  $127,525  $119,370  $1,590,000
Logistics  03M02  $325,138  $149,718  $127,525  $47,895   $1,290,000

Instructor note: Back to our example. We need to combine the orion.profit data set with the first two rows of budget information to create a data set that contains the budget amount.

79 Using Multidimensional Arrays
Find the budgeted amounts for each company, year, and month.

Company    YYMM
Logistics  03M01
           03M02

Yr2003      Yr2004      Yr2005      Yr2006      Yr2007
$1,590,000  $1,880,000  $2,300,000  $1,960,000  $1,970,000
$1,290,000  $1,550,000  $1,830,000  $1,480,000  $1,640,000

Instructor note: We need to dissect the YYMM variable to find the year value for the column and the month value for the row.

80 Using Multidimensional Arrays
data budget_amt;
   drop Y M;
   array B{2,2003:2007} _temporary_ ( , , , , , , , , , );
   set orion.profit(where=(Sales ne .) obs=2);
   Y=year(YYMM);
   M=month(YYMM);
   BudgetAmt=B{M,Y};
run;

p304d03

Instructor note: Step 1: point out _temporary_. Steps 2 and 3: we calculate the subscript values. Step 4: the table lookup.
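The following is a self-contained sketch of the same lookup that can be run without the orion library. It rests on several assumptions: work.profit_sample is a hypothetical stand-in for orion.profit built from the two observations shown on the slides, YYMM is assumed to be a SAS date value displayed with a YYMM format, and the initial values are assumed to be the first two rows of the budget table, typed row by row.

```sas
/* Hypothetical stand-in for orion.profit (two observations from the slides) */
data work.profit_sample;
   input Company :$10. YYMM :date9. Sales Cost Salaries Profit;
   format YYMM yymm5.;
   datalines;
Logistics 01JAN2003 457809 210914 127525 119370
Logistics 01FEB2003 325138 149718 127525 47895
;
run;

data work.budget_amt;
   drop Y M;
   /* Assumed initial values: budget rows for months 1 and 2, years 2003-2007 */
   array B{2,2003:2007} _temporary_
      (1590000 1880000 2300000 1960000 1970000
       1290000 1550000 1830000 1480000 1640000);
   set work.profit_sample;
   Y = year(YYMM);      /* column subscript */
   M = month(YYMM);     /* row subscript */
   BudgetAmt = B{M,Y};  /* table lookup */
   put BudgetAmt=;
run;
```

Under these assumptions the two BudgetAmt values written to the log should be 1590000 and 1290000, matching the report on the Business Scenario slide.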

81 Execution
Temporary array B (not in the PDV):
B{1,2003} B{1,2004} B{1,2005} B{1,2006} B{1,2007}
B{2,2003} B{2,2004} B{2,2005} B{2,2006} B{2,2007}
PDV: Company=(blank) YYMM=. Sales=. Cost=. Salaries=. profit=. Y=. M=. BudgetAmt=. _N_=1 (Y, M, and _N_ are flagged D, for drop)
Instructor note: This is where processing is at the end of compilation but before execution. Notice that the array is set up and the values are loaded.

82 Execution
PDV: Company=Logistics YYMM=03M01 Sales=457809 Cost=210914 Salaries=127525 profit=119370 Y=. M=. BudgetAmt=. _N_=1
Instructor note: When the SET statement executes, it fills in the PDV.

83 Execution
PDV: Company=Logistics YYMM=03M01 Sales=457809 Cost=210914 Salaries=127525 profit=119370 Y=2003 M=1 BudgetAmt=. _N_=1
Instructor note: Y and M are calculated.

84 Execution
PDV: Company=Logistics YYMM=03M01 Sales=457809 Cost=210914 Salaries=127525 profit=119370 Y=2003 M=1 BudgetAmt=. _N_=1
Instructor note: This is the table lookup statement. The two assignment statements could be combined into this statement: BudgetAmt=B{month(YYMM),year(YYMM)}; The next slide shows how it works.
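A quick way to confirm that the one-statement form returns the same value as the two-step form is to compute both side by side. This sketch assumes the first two rows of the budget table as initial values and uses a date literal in place of an observation from orion.profit; TwoStep and OneStep are hypothetical variable names.

```sas
data _null_;
   /* Assumed initial values: budget rows for months 1 and 2, years 2003-2007 */
   array B{2,2003:2007} _temporary_
      (1590000 1880000 2300000 1960000 1970000
       1290000 1550000 1830000 1480000 1640000);

   YYMM = '01FEB2003'd;   /* stand-in for a value read by the SET statement */

   /* Two-step form: calculate the subscripts, then look up */
   Y = year(YYMM);
   M = month(YYMM);
   TwoStep = B{M,Y};

   /* One-statement form: the function calls are the subscripts */
   OneStep = B{month(YYMM), year(YYMM)};

   put TwoStep= OneStep=;
run;
```

Both variables should hold 1290000. The two-step form costs two extra variables in the PDV, which is why the original program drops Y and M.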

85 Execution
BudgetAmt=B{1,2003};
PDV: Company=Logistics YYMM=03M01 Sales=457809 Cost=210914 Salaries=127525 profit=119370 Y=2003 M=1 BudgetAmt=1590000 _N_=1
Instructor note: The values for the array subscripts are substituted into the assignment statement, and the value is copied from the array element into the PDV.

86 Execution
Implicit OUTPUT; Implicit RETURN;
PDV: Company=Logistics YYMM=03M01 Sales=457809 Cost=210914 Salaries=127525 profit=119370 Y=2003 M=1 BudgetAmt=1590000 _N_=1
Instructor note: After the implicit OUTPUT, there is an implicit RETURN.

87 Execution
Reinitialize PDV
PDV: Company=Logistics YYMM=03M01 Sales=457809 Cost=210914 Salaries=127525 profit=119370 Y=. M=. BudgetAmt=. _N_=2
Instructor note: At the top of the DATA step, the PDV is reinitialized. The variables read by the SET statement keep their values; Y, M, and BudgetAmt are reset to missing.

88 Execution
PDV: Company=Logistics YYMM=03M02 Sales=325138 Cost=149718 Salaries=127525 profit=47895 Y=2003 M=2 BudgetAmt=. _N_=2
Instructor note: The SET statement executes and loads observation 2 from orion.profit into the PDV. Y and M are calculated.

89 Execution
PDV: Company=Logistics YYMM=03M02 Sales=325138 Cost=149718 Salaries=127525 profit=47895 Y=2003 M=2 BudgetAmt=. _N_=2
Instructor note: The lookup occurs. Skip over this slide quickly, because the next one shows how.

90 Execution
BudgetAmt=B{2,2003};
PDV: Company=Logistics YYMM=03M02 Sales=325138 Cost=149718 Salaries=127525 profit=47895 Y=2003 M=2 BudgetAmt=1290000 _N_=2
Instructor note: The array element value is copied into the PDV.

91 Execution
Implicit OUTPUT; Implicit RETURN;
PDV: Company=Logistics YYMM=03M02 Sales=325138 Cost=149718 Salaries=127525 profit=47895 Y=2003 M=2 BudgetAmt=1290000 _N_=2
Instructor note: After observation 2 is output, the implicit RETURN goes back to the top of the DATA step.

92 Execution
Execution stops.
PDV: Company=Logistics YYMM=03M02 Sales=325138 Cost=149718 Salaries=127525 profit=47895 Y=2003 M=2 BudgetAmt=1290000 _N_=2
Instructor note: Because we used the OBS=2 data set option, execution stops.
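The OBS=2 option also keeps the month subscript inside the two rows that were typed in. The sketch below shows one possible guard for a value that falls outside the array, since a subscript outside the declared bounds causes an "Array subscript out of range" error. The date literals and the initial values (the first two budget rows) are assumptions made for illustration.

```sas
data _null_;
   /* Assumed initial values: budget rows for months 1 and 2, years 2003-2007 */
   array B{2,2003:2007} _temporary_
      (1590000 1880000 2300000 1960000 1970000
       1290000 1550000 1830000 1480000 1640000);

   /* January is inside the array; March is not (only 2 rows) */
   do YYMM = '01JAN2003'd, '01MAR2003'd;
      Y = year(YYMM);
      M = month(YYMM);

      /* Look up only when both subscripts are within the declared bounds */
      if lbound(B,1) <= M <= hbound(B,1)
         and lbound(B,2) <= Y <= hbound(B,2) then BudgetAmt = B{M,Y};
      else BudgetAmt = .;

      put YYMM= date9. M= BudgetAmt=;
   end;
run;
```

For the March date the guard leaves BudgetAmt missing instead of stopping the step with an error. Loading all 12 months into the array removes the need for the OBS= restriction.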

94 Exercise These exercises reinforce the concepts discussed previously.

95 Chapter 4: Using Lookup Tables to Match Data: Arrays
4.1: Introduction to Lookup Techniques 4.2: Using One-Dimensional Arrays as Lookup Tables 4.3: Using a Multidimensional Array as a Lookup Table 4.4: Loading a Multidimensional Array from a SAS Data Set

96 Objectives Load a multidimensional array from a SAS data set.
Identify the advantages of an array as a lookup table. Identify the disadvantages of an array as a lookup table.

97 Business Scenario
Budget values are stored in a SAS data set named orion.budget, where the rows represent months and the columns represent years. Load the array from the values in the SAS data set.

Listing of orion.budget
Month Yr2003 Yr2004 Yr2005 Yr2006 Yr2007
1 $1,590,000 $1,880,000 $2,300,000 $1,960,000 $1,970,000
2 $1,290,000 $1,550,000 $1,830,000 $1,480,000 $1,640,000
3 $1,160,000 $1,380,000 $1,410,000 $1,440,000
4 $1,710,000 $2,100,000 $2,420,000 $2,130,000 $2,270,000
5 $1,990,000 $2,350,000 $2,840,000 $2,480,000 $2,670,000
6 $2,560,000 $3,020,000 $3,580,000 $3,070,000 $3,410,000
7 $2,590,000 $2,890,000 $3,550,000 $3,010,000 $3,490,000
8 $2,550,000 $3,030,000 $3,500,000
9 $1,070,000 $1,180,000 $1,260,000 $1,520,000
10 $1,270,000 $1,600,000 $1,360,000 $1,700,000
11 $1,470,000 $1,780,000 $1,540,000 $1,950,000
12 $2,870,000 $3,120,000 $3,760,000 $3,210,000 $4,370,000

Instructor note: This time we have the array data in a data set where each observation is a month and the variables represent years.

98 Stored Array Values
Array values should be read from a SAS data set when the following conditions exist:
   There are too many values to initialize easily in the array.
   Values change frequently.
   The same values are used in many programs.

Instructor note: Main reason: to avoid having to type them.

99 Using Multidimensional Arrays
data budget_amt;
   drop Yr2003-Yr2007 Month I J Y M;
   array B{12,2003:2007} _temporary_;
   if _n_=1 then do I=1 to 12;
      set orion.budget;
      array tmp{2003:2007} Yr2003-Yr2007;
      do J=2003 to 2007;
         B{I,J}=tmp{J};
      end;
   end;
   set orion.profit(where=(Sales ne .));
   Y=year(YYMM);
   M=month(YYMM);
   BudgetAmt=B{M,Y};
run;

p304d04

Instructor note:
1) The DO I=1 TO 12 loop drives execution of the SET statement so that it can read all of the observations of the orion.budget data. You could have used the NOBS= option and then used that variable instead of the 12. Because we know there are only 12 months of data in this data set, there's no reason not to use the 12.
2) The TMP array points to the variables in the orion.budget data.
3) This assignment statement loads the B array. We load the entire month row into the final B array by iterating from 2003 to 2007. Later we'll use the B array for the lookup.
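The NOBS= variation mentioned in the instructor note can be sketched as follows. To keep the example runnable without the orion library, it uses two hypothetical stand-in data sets, work.budget_sample and work.profit_sample, built from the first two months shown on the slides; NumMonths is a hypothetical variable name, and YYMM is assumed to be a SAS date value.

```sas
/* Hypothetical stand-in for orion.budget (first two months only) */
data work.budget_sample;
   input Month Yr2003-Yr2007;
   datalines;
1 1590000 1880000 2300000 1960000 1970000
2 1290000 1550000 1830000 1480000 1640000
;
run;

/* Hypothetical stand-in for orion.profit */
data work.profit_sample;
   input Company :$10. YYMM :date9. Sales;
   format YYMM yymm5.;
   datalines;
Logistics 01JAN2003 457809
Logistics 01FEB2003 325138
;
run;

data work.budget_amt;
   drop Yr2003-Yr2007 Month I J Y M;
   array B{2,2003:2007} _temporary_;
   /* NumMonths is set at compile time, so the DO loop can use it
      before the SET statement executes */
   if _n_ = 1 then do I = 1 to NumMonths;
      set work.budget_sample nobs=NumMonths;
      array tmp{2003:2007} Yr2003-Yr2007;
      do J = 2003 to 2007;
         B{I,J} = tmp{J};   /* copy one month's row into the lookup array */
      end;
   end;
   set work.profit_sample;
   Y = year(YYMM);
   M = month(YYMM);
   BudgetAmt = B{M,Y};
   put BudgetAmt=;
run;
```

With the sample data, the log should show 1590000 and then 1290000. Note that the first dimension of B still has to be declared large enough for the number of observations; NOBS= only replaces the hard-coded loop limit.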

101 4.10 Multiple Choice Poll
How many elements are in the array defined by the ARRAY statement?

array B{12,2003:2007} _temporary_;

24 48 60

102 4.10 Multiple Choice Poll – Correct Answer
How many elements are in the array defined by the ARRAY statement?

array B{12,2003:2007} _temporary_;

24 48 60

d. 60
The array has 12 rows and 5 columns (2003 through 2007), so it holds 12 x 5 = 60 elements.
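The count can be verified with the DIM function, which returns the number of elements in a given dimension. In this short sketch, Elements is a hypothetical variable name.

```sas
data _null_;
   array B{12,2003:2007} _temporary_;

   /* 12 elements in dimension 1 times 5 elements in dimension 2 */
   Elements = dim(B,1) * dim(B,2);

   put Elements=;   /* writes Elements=60 to the log */
run;
```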

103 Execution
Temporary array B (not in the PDV): B{1,2003} B{1,2004} B{1,2005} B{1,2006} B{1,2007} B{2,2003} B{2,2004} . . . B{12,2006} B{12,2007}
Partial PDV: I=. Month=. Yr2003-Yr2007 (referenced as tmp{2003}-tmp{2007}) J=. Company YYMM Sales Cost Salaries profit Y M BudgetAmt _N_=1
Instructor note: This is where processing is at the end of compilation. Notice that the array B has no values in it, but it is temporary and not part of the PDV.

104 Execution
Partial PDV: I=1 Month=. J=. _N_=1
Instructor note: Because the IF condition is true, the DO statement executes and I is set to 1.

105 Execution
Partial PDV: I=1 Month=1 J=. _N_=1
Instructor note: The first observation of the data set orion.budget is loaded into the PDV.

106 Execution
Partial PDV: I=1 Month=1 J=2003 _N_=1
Instructor note: The ARRAY statement is not executable; that is why the highlighting skips to the DO statement. J is now 2003.

107 Execution
Partial PDV: I=1 Month=1 J=2003 _N_=1
Instructor note: The assignment statement loads the appropriate array element. Go through the next slides very quickly unless someone has a question.

108 Execution
B{1,2003}=tmp{2003};
Partial PDV: I=1 Month=1 J=2003 _N_=1
Instructor note: The assignment statement uses values from the PDV to fill in the appropriate array element with the value referenced by the array TMP.

109 Execution
Partial PDV: I=1 Month=1 J=2004 _N_=1
Instructor note: J increments to 2004.

110 Execution
Partial PDV: I=1 Month=1 J=2004 _N_=1
Instructor note: The subscript J is still in bounds, so the DO loop executes again.

111 Execution
B{1,2004}=tmp{2004};
Partial PDV: I=1 Month=1 J=2004 _N_=1
Instructor note: Go through the next slides quickly. You don't have to say much except, "Here we go again."

112 Execution
Continue until J=2008
Partial PDV: I=2 Month=1 J=2008 _N_=1
Instructor note: I increments to 2. Control returns to the top of the outer DO loop.
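The value J=2008 in the PDV follows from how an iterative DO loop ends: the index is incremented past the stop value before the loop exits. A minimal sketch of just that behavior:

```sas
data _null_;
   /* The loop body is empty on purpose; only the index matters here */
   do J = 2003 to 2007;
   end;

   /* After the loop, J is one increment past the stop value */
   put J=;   /* writes J=2008 to the log */
run;
```

This is also why J appears in the DROP statement of the program: the loop index is an ordinary variable in the PDV.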

113 Execution
Partial PDV: I=2 Month=1 J=2008 _N_=1
Instructor note: The DO statement executes a second time.

114 Execution
Partial PDV: I=2 Month=2 J=2008 _N_=1
Instructor note: The second observation is copied into the PDV.

115 Execution
Partial PDV: I=2 Month=2 J=2003 _N_=1
Instructor note: J is reset to 2003.

116 Execution
Partial PDV: I=2 Month=2 J=2003 _N_=1
Instructor note: The assignment statement executes where I is 2 and J goes from 2003 to 2007. Again, don't say anything; run through the next set of slides quickly.

117 Execution B{2,2003}=tmp{2003}; ... Partial PDV
data budget_amt; drop Yr2003-Yr2007 Month I J Y M; array B{12,2003:2007} _Temporary_; if _n_=1 then do I=1 to 12; set orion.budget; array tmp{2003:2007} Yr2003-Yr2007; do J=2003 to 2007; B{I,J}= tmp{J}; end; set orion.profit(where=(Sales ne .)); Y=year(YYMM); M=month(YYMM); BudgetAmt=B{M,Y}; run; Partial Listing of orion.budget Month Yr2003 Yr2004 . . . 1 2 3 4 . B{2,2003}=tmp{2003}; B{1,2003} B{2,2003} B{1,2005} B{1,2006} B{1,2007} B{1,2004} B{2,2004} B{12,2006} B{12,2007} . . . TAG_Audio: TAG_Instructor: TAG_Movie: TAG_Print: TAG_Section: Partial PDV I Month Yr2003 Yr2004 Yr2005 Yr2006 Yr2007 J Company 2 2003 D tmp{2003} tmp{2004} tmp{2006} tmp{2007} tmp{2005} YYMM Sales Cost Salaries profit Y M Budget Amt _N_ . 1 D ...

118 Execution ... Partial PDV Partial Listing of orion.budget I Month
data budget_amt; drop Yr2003-Yr2007 Month I J Y M; array B{12,2003:2007} _Temporary_; if _n_=1 then do I=1 to 12; set orion.budget; array tmp{2003:2007} Yr2003-Yr2007; do J=2003 to 2007; B{I,J}=tmp{J}; end; set orion.profit(where=(Sales ne .)); Y=year(YYMM); M=month(YYMM); BudgetAmt=B{M,Y}; run; Partial Listing of orion.budget Month Yr2003 Yr2004 . . . 1 2 3 4 . B{1,2003} B{2,2003} B{1,2005} B{1,2006} B{1,2007} B{1,2004} B{2,2004} B{12,2006} B{12,2007} . . . TAG_Audio: TAG_Instructor: TAG_Movie: TAG_Print: TAG_Section: Partial PDV I Month Yr2003 Yr2004 Yr2005 Yr2006 Yr2007 J Company 2 2004 D tmp{2003} tmp{2004} tmp{2006} tmp{2007} tmp{2005} YYMM Sales Cost Salaries profit Y M Budget Amt _N_ . 1 D ...

119 Execution B{2,2004}=tmp{2004}; ... Partial PDV
data budget_amt; drop Yr2003-Yr2007 Month I J Y M; array B{12,2003:2007} _Temporary_; if _n_=1 then do I=1 to 12; set orion.budget; array tmp{2003:2007} Yr2003-Yr2007; do J=2003 to 2007; B{I,J}= tmp{J}; end; set orion.profit(where=(Sales ne .)); Y=year(YYMM); M=month(YYMM); BudgetAmt=B{M,Y}; run; Partial Listing of orion.budget Month Yr2003 Yr2004 . . . 1 2 3 4 . B{2,2004}=tmp{2004}; B{1,2003} B{2,2003} B{1,2005} B{1,2006} B{1,2007} B{1,2004} B{2,2004} B{12,2006} B{12,2007} . . . TAG_Audio: TAG_Instructor: TAG_Movie: TAG_Print: TAG_Section: Partial PDV I Month Yr2003 Yr2004 Yr2005 Yr2006 Yr2007 J Company 2 2004 D tmp{2003} tmp{2004} tmp{2006} tmp{2007} tmp{2005} YYMM Sales Cost Salaries profit Y M Budget Amt _N_ . 1 D ...

120 Execution
Partial PDV: I=2, J=2005, _N_=1

121 Execution
Eventually, I=12 and J=2006.
Instructor note: So we don't bore them too much, we skip until I is 12 and J is 2006 to load the last two array elements.
Partial PDV: I=12, J=2006, _N_=1

122 Execution
Instructor note: Don't talk through these next slides. They just load the end of the array, so skip quickly through them.
Partial PDV: I=12, J=2006, _N_=1

123 Execution
Executing: B{12,2006}=tmp{2006};
Partial PDV: I=12, J=2006, _N_=1

124 Execution
Partial PDV: I=12, J=2007, _N_=1

125 Execution
Partial PDV: I=12, J=2007, _N_=1

126 Execution
Executing: B{12,2007}=tmp{2007};
Partial PDV: I=12, J=2007, _N_=1

127 Execution
Instructor note: Once J is 2008, the inner DO loop exits and the second END statement executes.
Partial PDV: I=12, J=2008, _N_=1

128 Execution
Instructor note: At that point, I increments to 13.
Partial PDV: I=13, Month=12, J=2008, _N_=1

129 Execution
Instructor note: The DO statement executes again, but I (13) is now past the stop value of 12, so the outer loop ends.
Partial PDV: I=13, Month=12, J=2008, _N_=1

130 Execution
Partial Listing of orion.profit (Company, YYMM, Sales, . . .): Logistics 03M01 457809; 03M02 325138; 03M03 276805; 03M04 558806; . . .
Instructor note: So the second SET statement (orion.profit) executes for the first time. Note that _N_ is still 1.
Partial PDV: I=13, Month=12, J=2008, Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=., M=., BudgetAmt=., _N_=1

131 Execution
Instructor note: The assignment statements execute using the YYMM data from the first observation of orion.profit.
Partial PDV: I=13, Month=12, J=2008, Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=2003, M=1, BudgetAmt=., _N_=1

132 Execution
Instructor note: This is the table lookup statement.
Partial PDV: I=13, Month=12, J=2008, Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=2003, M=1, BudgetAmt=., _N_=1

133 Execution
Executing: BudgetAmt=B{1,2003};
Instructor note: The values of M and Y are substituted as the array subscripts, and the array element is copied into the PDV.
Partial PDV: I=13, Month=12, J=2008, Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=2003, M=1, BudgetAmt=(value of B{1,2003}), _N_=1

134 Execution
Implicit OUTPUT; Implicit RETURN;
Instructor note: After the implicit output, the DATA step executes a second time.
Partial PDV: I=13, Month=12, J=2008, Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=2003, M=1, BudgetAmt=(value of B{1,2003}), _N_=1

135 Execution
Reinitialize PDV
Instructor note: Variables created in the DATA step (not read by a SET statement) are reinitialized to missing. Note that this includes I and J. Variables read by a SET statement, such as Month and Company, keep their values.
Partial PDV: I=., Month=12, J=., Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=., M=., BudgetAmt=., _N_=2
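The reinitialization step can be checked with a small sketch. It is not part of the course program: the data set work.three and the variables Obs, First, and Second are hypothetical names chosen for illustration. The idea is to show that a DO-loop index is reset to missing at the top of each iteration, while the elements of a _TEMPORARY_ array keep the values loaded when _N_ was 1.

```sas
/* Hypothetical three-row driver data set, used only to make the step iterate */
data work.three;
   do Obs=1 to 3;
      output;
   end;
run;

data _null_;
   array T{2} _temporary_;        /* temporary array: elements are retained */
   if _n_=1 then do I=1 to 2;     /* load once, on the first iteration only */
      T{I}=I*10;
   end;
   set work.three;
   First=T{1};                    /* copy elements into variables for the log */
   Second=T{2};
   put _n_= I= First= Second=;
   /* Log, iteration 1: I=3 (loop index after the loop ends), First=10, Second=20 */
   /* Log, iterations 2 and 3: I=. (reinitialized), First=10, Second=20 (retained) */
run;
```

The retained array elements are what make the one-time load at _N_=1 sufficient for every later lookup.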

136 Execution
False (the condition _n_=1)
Instructor note: The IF/THEN DO group does not execute again because _N_ is 2. That is important: it prevents SAS from hitting the end of file on orion.budget and ending the DATA step.
Partial PDV: I=., Month=12, J=., Company=Logistics, YYMM=03M01, Sales=457809, Cost=210914, Salaries=127525, profit=119370, Y=., M=., BudgetAmt=., _N_=2

137 Execution
Instructor note: The second SET statement executes again and the second observation of orion.profit is copied into the PDV.
Partial PDV: I=., Month=12, J=., Company=Logistics, YYMM=03M02, Sales=325138, Cost=149718, Salaries=127525, profit=47895, Y=., M=., BudgetAmt=., _N_=2

138 Execution
Instructor note: Y and M are calculated.
Partial PDV: I=., Month=12, J=., Company=Logistics, YYMM=03M02, Sales=325138, Cost=149718, Salaries=127525, profit=47895, Y=2003, M=2, _N_=2

139 Execution
Implicit OUTPUT; Implicit RETURN;
Instructor note: There is the implicit output and the implicit return.
Partial PDV: I=., Month=12, J=., Company=Logistics, YYMM=03M02, Sales=325138, Cost=149718, Salaries=127525, profit=47895, Y=2003, M=2, BudgetAmt=(value of B{2,2003}), _N_=2

140 Execution
Continue until EOF
Instructor note: And execution continues until the end of the file in orion.profit.
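The walkthrough above can be reproduced end to end without the orion library. The following is a minimal sketch, not the course data: work.budget and work.profit are hypothetical miniature stand-ins (three months, two years, made-up budget amounts), and the array bounds are shrunk to match. The program logic is otherwise the same as in the slides.

```sas
/* Hypothetical stand-in for orion.budget: one row per month, one column per year */
data work.budget;
   input Month Yr2003 Yr2004;
   datalines;
1 100 110
2 200 210
3 300 310
;
run;

/* Hypothetical stand-in for orion.profit; YYMM is a SAS date */
data work.profit;
   input Company :$12. YYMM :yymmn6. Sales;
   format YYMM yymm5.;
   datalines;
Logistics 200301 457809
Logistics 200302 325138
Logistics 200403 .
Logistics 200403 276805
;
run;

data work.budget_amt;
   drop Yr2003-Yr2004 Month I J Y M;
   array B{3,2003:2004} _temporary_;      /* rows = months, columns = years */
   if _n_=1 then do I=1 to 3;             /* load the lookup table once */
      set work.budget;
      array tmp{2003:2004} Yr2003-Yr2004;
      do J=2003 to 2004;
         B{I,J}=tmp{J};
      end;
   end;
   set work.profit(where=(Sales ne .));   /* row with missing Sales is skipped */
   Y=year(YYMM);
   M=month(YYMM);
   BudgetAmt=B{M,Y};                      /* direct-access lookup */
run;

/* With the made-up data above, BudgetAmt should be 100, 200, and 310 */
proc print data=work.budget_amt;
run;
```

Because the lookup is by subscript, work.profit does not need to be sorted or indexed.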

141 Advantages of Using an Array / Disadvantages of Using an Array
Advantages of Using an Array:
faster than a hash object or format, if you can use it
use of positional order
use of multiple values to determine the array element to be returned
ability to use a non-sorted and non-indexed base data set
use of numeric expressions to determine which element of the array is to be looked up; exact match not required
Disadvantages of Using an Array:
a contiguous chunk of memory requested at compile time
memory requirements to load the entire array
requirement that you must have a numeric value as a pointer to the array elements
the return of only a single value from the lookup operation
dimensions supplied at compile time by either hard-coding or macro variables
Instructor note: The advantages mean the following. Speed: stress this advantage; it is the most important. Positional order: we had the idea of 1st month, 2nd month, and so forth. Multiple values: we could use a multidimensional array, with one dimension for each value. The data does not have to be sorted or indexed. Numeric expressions: we calculated the values of M and Y using SAS functions. An array is always one of the best methods to use if you can.
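Two of the points above, dimensions supplied by macro variables and lookup by numeric expression, can be sketched together. This is an illustration under assumptions: work.rates, Level, Rate, Score, LookupRate, and the macro variables Lo and Hi are all hypothetical names, and the TRIMMED keyword of PROC SQL's INTO clause assumes SAS 9.3 or later.

```sas
/* Hypothetical lookup data: one rate per integer level */
data work.rates;
   input Level Rate;
   datalines;
1 0.05
2 0.10
3 0.15
;
run;

/* Put the array bounds into macro variables so they are known at compile time */
proc sql noprint;
   select min(Level), max(Level)
      into :Lo trimmed, :Hi trimmed
      from work.rates;
quit;

data _null_;
   array R{&Lo:&Hi} _temporary_;     /* bounds resolved before compilation */
   do until(Done);                   /* load every row of the lookup table */
      set work.rates end=Done;
      R{Level}=Rate;
   end;
   do Score=1.2, 2.7, 3.0;           /* inexact values, mapped by an expression */
      LookupRate=R{floor(Score)};
      put Score= LookupRate=;        /* log: 0.05, 0.1, and 0.15 respectively */
   end;
   stop;                             /* no further input, so end the step */
run;
```

The FLOOR expression is the design choice here: the data value does not have to match a stored key exactly, only to compute a valid subscript.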

142 Review of Arrays
An array uses less memory than a hash object.
The size of the array is determined at compilation time.
The subscript value(s) must be numeric.
Subscript values must be consecutive integers.
One data value can be associated with the subscript value(s).
An array selects values by direct access based on the subscript value.
Arrays can only be used in the DATA step.
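Because the array size is fixed at compilation time, a subscript outside the declared bounds stops the DATA step with an "Array subscript out of range" error. One way to guard a lookup, sketched here as an assumption rather than something the course shows, is to test the subscripts with the LBOUND and HBOUND functions first. The value 1000 is a made-up budget amount.

```sas
data _null_;
   array B{12,2003:2007} _temporary_;
   B{1,2003}=1000;                          /* hypothetical budget value */
   M=1;
   do Y=2003, 2008;                         /* 2008 is outside the second dimension */
      if lbound(B,1)<=M<=hbound(B,1)
         and lbound(B,2)<=Y<=hbound(B,2)
         then BudgetAmt=B{M,Y};             /* safe: both subscripts are in range */
      else BudgetAmt=.;                     /* out of range: return missing instead */
      put M= Y= BudgetAmt=;
      /* log: BudgetAmt=1000 for Y=2003 and BudgetAmt=. for Y=2008 */
   end;
run;
```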


144 Exercise These exercises reinforce the concepts discussed previously.

