Published by Homer Garrett. Modified over 9 years ago.
1
Chapter 8 Producing Summary Reports
2
Section 8.1 Introduction to Summary Reports
3
Objectives
- Identify the different report writing procedures.
- Create one-way and two-way frequency tables using the FREQ procedure.
- Restrict the variables processed by the FREQ procedure.
- Generate simple descriptive statistics using the MEANS procedure.
- Group observations of a SAS data set for analysis using the CLASS statement in the MEANS procedure.
4
Summary Reports
[Diagram: a small data set feeds a report writing step that produces a report; a large data set feeds a combined summarize-data-and-report-writing step that produces a report.]

Small Data Set
LastName    FirstName   Age
TORRES      JAN          23
LANGKAMM    SARAH        46
SMITH       MICHAEL      71
WAGSCHAL    NADJA        37
TOERMOEN    JOCHEN       16

Large Data Set
The same rows, continuing with many more (...), for example:
Ingersol    Hans         32
Himelewski  Janice       87
5
Summary Report Procedures Toolbox
- PROC FREQ produces frequency counts.
- PROC MEANS produces simple statistics.
- PROC REPORT produces flexible detail and summary reports.
6
PROC FREQ Output

Distribution of Job Code Values

The FREQ Procedure

Job                               Cumulative    Cumulative
Code      Frequency    Percent     Frequency       Percent
----------------------------------------------------------
FLTAT1          14      20.29            14         20.29
FLTAT2          18      26.09            32         46.38
FLTAT3          12      17.39            44         63.77
PILOT1           8      11.59            52         75.36
PILOT2           9      13.04            61         88.41
PILOT3           8      11.59            69        100.00
7
PROC MEANS Output

Salary by Job Code

The MEANS Procedure

Analysis Variable : Salary

Job        N
Code     Obs     N        Mean     Std Dev     Minimum      Maximum
-------------------------------------------------------------------
FLTAT1    14    14    25642.86     2951.07    21000.00     30000.00
FLTAT2    18    18    35111.11     1906.30    32000.00     38000.00
FLTAT3    12    12    44250.00     2301.19    41000.00     48000.00
PILOT1     8     8    69500.00     2976.10    65000.00     73000.00
PILOT2     9     9    80111.11     3756.48    75000.00     86000.00
PILOT3     8     8    99875.00     7623.98    92000.00    112000.00
-------------------------------------------------------------------
8
PROC REPORT Output

Salary Analysis

Job Code   Home Base        Salary
----------------------------------
FLTAT1     CARY           $131,000
           FRANKFURT      $100,000
           LONDON         $128,000
FLTAT2     CARY           $245,000
           FRANKFURT      $181,000
           LONDON         $206,000
FLTAT3     CARY           $217,000
           FRANKFURT      $134,000
           LONDON         $180,000
PILOT1     CARY           $211,000
           FRANKFURT      $135,000
           LONDON         $210,000
PILOT2     CARY           $323,000
           FRANKFURT      $240,000
           LONDON         $158,000
PILOT3     CARY           $300,000
           FRANKFURT      $205,000
           LONDON         $294,000
                        ==========
                        $3,598,000
9
Section 8.2 Basic Summary Reports
10
SAS Vocabulary: PROC FREQ, TABLES, NLEVELS, crosstabular, * (asterisk), PROC MEANS, VAR, CLASS, MAXDEC=
11
11 Goal Report 1 International Airlines wants to know how many employees are in each job code. Distribution of Job Code Values The FREQ Procedure Job Cumulative Cumulative Code Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ FLTAT1 14 20.29 14 20.29 FLTAT2 18 26.09 32 46.38 FLTAT3 12 17.39 44 63.77 PILOT1 8 11.59 52 75.36 PILOT2 9 13.04 61 88.41 PILOT3 8 11.59 69 100.00
12
12 Categorize job code and salary values to determine how many employees fall into each group. Salary Distribution by Job Codes The FREQ Procedure Table of JobCode by Salary JobCode Salary Frequency ‚ Percent ‚ Row Pct ‚ Col Pct ‚Less tha‚25,000 t‚More tha‚ Total ‚n 25,000‚o 50,000‚n 50,000‚ ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆ Flight Attendant ‚ 5 ‚ 39 ‚ 0 ‚ 44 ‚ 7.25 ‚ 56.52 ‚ 0.00 ‚ 63.77 ‚ 11.36 ‚ 88.64 ‚ 0.00 ‚ ‚ 100.00 ‚ 100.00 ‚ 0.00 ‚ ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆ Pilot ‚ 0 ‚ 0 ‚ 25 ‚ 25 ‚ 0.00 ‚ 0.00 ‚ 36.23 ‚ 36.23 ‚ 0.00 ‚ 0.00 ‚ 100.00 ‚ ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆƒƒƒƒƒƒƒƒˆ Total 5 39 25 69 7.25 56.52 36.23 100.00 Goal Report 2
13
PROC FREQ displays frequency counts of the data values in a SAS data set.

General form of a simple PROC FREQ step:

    PROC FREQ DATA=SAS-data-set;
    RUN;

Example: Creating a Frequency Report

    proc freq data=ia.crew;
    run;
14
Creating a Frequency Report
By default, PROC FREQ
- analyzes every variable in the SAS data set
- displays each distinct data value
- calculates the number of observations in which each data value appears (and the corresponding percentage)
- indicates, for each variable, how many observations have missing values.
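The defaults listed above can be tried on a tiny data set. The following is a minimal sketch, assuming a made-up data set work.demo with hypothetical variables Team and Score (neither comes from ia.crew). With no TABLES statement, PROC FREQ should produce one table per variable, and the observations with missing values should be reported below each table as "Frequency Missing" rather than included in the percentages.

```sas
/* Hypothetical data: work.demo, Team, and Score are illustrative names only. */
data work.demo;
   input Team $ Score;   /* in list input, a lone period is read as missing */
   datalines;
A 10
B 20
A .
. 30
B 20
;
run;

/* No TABLES statement: every variable gets its own one-way table. */
title 'Default PROC FREQ Output';
proc freq data=work.demo;
run;
title;
```

Running it on a data set this small makes it easy to compare each table with the raw rows by eye.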
15
Default Frequency Reports

    proc freq data=ia.crew;
    run;

[Diagram: with no TABLES statement, ia.crew produces one frequency report per variable: Distribution of LastName, Distribution of Salary, Distribution of JobCode, Distribution of FirstName, Distribution of EmpID, Distribution of HireDate, Distribution of Phone, Distribution of Location.]
16
16 Variables to Analyze PROC FREQ is appropriate for variables with only a few values. For example, if you have a class list with one row for each student, it would not be very meaningful to analyze the student ID if there is one row per person in the table. PROC FREQ enables you to choose the variables to analyze.
18
18 Printing Selected Variables SAS enables you to select the variables to display or analyze. In PROC PRINT, what statement selected the variables for the output? The VAR statement
20
Printing Selected Variables
SAS enables you to select the variables to display or analyze. In PROC FREQ, what statement selects the variables? The TABLES statement.

PROC     Statement to select variables
PRINT    VAR
21
Printing Selected Variables
SAS enables you to select the variables to display or analyze.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
22
One-Way Frequency Report
Use the TABLES statement to limit the variables included in the frequency counts. These are typically variables that have a limited number of distinct values.

General form of a PROC FREQ step with a TABLES statement:

    PROC FREQ DATA=SAS-data-set;
       TABLES SAS-variables;
    RUN;

Options can follow the variable list after a slash; ignore them for now.
23
One-Way Frequency Report
Use the TABLES statement to analyze JobCode. For example:

    proc freq data=ia.crew;
       tables JobCode;
    run;
24
Creating a Frequency Report – Example

    title 'Distribution of Job Code Values';
    proc freq data=ia.crew;
       tables JobCode;
    run;

Distribution of Job Code Values

The FREQ Procedure

Job                               Cumulative    Cumulative
Code      Frequency    Percent     Frequency       Percent
----------------------------------------------------------
FLTAT1          14      20.29            14         20.29
FLTAT2          18      26.09            32         46.38
FLTAT3          12      17.39            44         63.77
PILOT1           8      11.59            52         75.36
PILOT2           9      13.04            61         88.41
PILOT3           8      11.59            69        100.00
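As a check on how the columns of this report relate, the figures for the first two job codes can be reproduced by hand. Each Percent is the frequency divided by the total of 69 observations, and the cumulative columns are running totals. This is worked arithmetic on the values shown in the report, not additional SAS output.

```latex
\begin{align*}
\text{Percent}_{\text{FLTAT1}} &= \frac{14}{69} \times 100 \approx 20.29 \\
\text{Percent}_{\text{FLTAT2}} &= \frac{18}{69} \times 100 \approx 26.09 \\
\text{Cumulative Frequency}_{\text{FLTAT2}} &= 14 + 18 = 32 \\
\text{Cumulative Percent}_{\text{FLTAT2}} &= \frac{32}{69} \times 100 \approx 46.38
\end{align*}
```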
25
25 One-Way Frequency Report You can select more than one variable to analyze by listing them all in the TABLES statement. Separate them with a space. This creates one report for each variable. For example: proc freq data=ia.crew; tables JobCode Location; RUN;
26
26 The FREQ Procedure Job Cumulative Cumulative Code Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ FLTAT1 14 20.29 14 20.29 FLTAT2 18 26.09 32 46.38 FLTAT3 12 17.39 44 63.77 PILOT1 8 11.59 52 75.36 PILOT2 9 13.04 61 88.41 PILOT3 8 11.59 69 100.00 Cumulative Cumulative Location Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ CARY 27 39.13 27 39.13 FRANKFURT 19 27.54 46 66.67 LONDON 23 33.33 69 100.00 title; proc freq data=ia.crew; tables JobCode Location; run; Creating a Frequency Report – Example JobCode Report Location Report
27
27 Use the NLEVELS option in the PROC FREQ statement to display the number of levels for the variables included in the frequency counts. Displaying the Number of Levels – Example title 'Distribution of Location Values'; proc freq data=ia.crew nlevels; tables Location; run;
28
28 Distribution of Location Values The FREQ Procedure Number of Variable Levels Variable Levels ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Location 3 Cumulative Cumulative Location Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ CARY 27 39.13 27 39.13 FRANKFURT 19 27.54 46 66.67 LONDON 23 33.33 69 100.00 Creating a Frequency Report – Example
29
29 Creating a Frequency Report To display the number of levels without displaying the frequency counts, add the NOPRINT option to the TABLES statement. proc freq data=ia.crew nlevels; tables JobCode Location / noprint; title 'Number of Levels for Job Code and Location'; run; Number of Levels for Job Code and Location The FREQ Procedure Number of Variable Levels Variable Levels ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ JobCode 6 Location 3
30
30 Creating a Frequency Report To display the number of levels for all variables without displaying any frequency counts, use the _ALL_ keyword and the NOPRINT option in the TABLES statement. (You must also use the NLEVELS option.) title 'Number of Levels for All Variables'; proc freq data=ia.crew nlevels; tables _all_ / noprint; run;
31
Analyzing Categories of Values
International Airlines wants to use formats to categorize the flight crew by job code.

Stored values             Formatted value
FLTAT1, FLTAT2, FLTAT3    Flight Attendant
PILOT1, PILOT2, PILOT3    Pilot
32
32 proc format; value $codefmt 'FLTAT1'-'FLTAT3'='Flight Attendant' 'PILOT1'-'PILOT3'='Pilot'; run; proc freq data = ia.crew; format JobCode $codefmt.; tables JobCode; run; Analyzing Categories of Values – Example
33
33 Distribution of Job Code Values The FREQ Procedure Cumulative Cumulative JobCode Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Flight Attendant 44 63.77 44 63.77 Pilot 25 36.23 69 100.00 Analyzing Categories of Values – Example PROC FREQ automatically groups the data by the formatted value of a variable if a format is associated with that variable.
34
34 This exercise reinforces the concepts discussed previously. Exercise
35
35 Exercises 1.Use the StudyAbroad2 delimited data file to create a data set called StudyLocations. The variables in order are Country, Cost, Time, and BeginDate. Format Cost to reflect currency values and BeginDate to a readable date value. Change the column headings to Country, Trip Cost, Length of Program, and Trip Begin Date. 2.Create a listing report to verify all the work for #1 above. 3.Use PROC FREQ to determine the frequencies for Country and Time.
36
36 Exercises – A Solution data StudyLocations; infile 'StudyAbroad2.csv' dsd; input Country :$15. Cost Time :$8. BeginDate :mmddyy10.; format BeginDate mmddyy10. Cost dollar8.; label Cost='Trip Cost' Time='Length of Program' BeginDate = 'Trip Begin Date'; run; proc print data= StudyLocations noobs label; run; proc freq data=StudyLocations; tables Country Time; run;
37
37 Exercises Length Trip of Trip Begin Country Cost Program Date Germany $4,200 Semester 09/01/2007 France $8,162 Year 10/01/2007 Great Britain $8,225 Year 09/01/2007 Australia $7,500 Year 06/01/2007 Sweden $5,286 Semester 12/01/2007 Spain $3,500 Semester 09/01/2007 Mexico $2,300 Semester 09/01/2007 France $3,971 Semester 10/01/2007 Great Britain $8,225 Year 09/01/2007 Sweden $5,286 Semester 12/01/2007 Germany $4,200 Semester 09/01/2007 Great Britain $4,700 Semester 09/01/2007 Germany $7,625 Year 09/01/2007 Partial PROC PRINT Output
38
38 Exercises The FREQ Procedure Cumulative Cumulative Country Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Australia 11 20.75 11 20.75 France 9 16.98 20 37.74 Germany 7 13.21 27 50.94 Great Britain 10 18.87 37 69.81 Mexico 4 7.55 41 77.36 Spain 5 9.43 46 86.79 Sweden 7 13.21 53 100.00 Length of Program Cumulative Cumulative Time Frequency Percent Frequency Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Semester 25 47.17 25 47.17 Year 28 52.83 53 100.00 Partial PROC FREQ Output
39
Crosstabular Frequency Reports
A two-way, or crosstabular, frequency report analyzes all possible combinations of the distinct values of two variables. The asterisk (*) operator in the TABLES statement is used to cross variables.

General form of the FREQ procedure to create a crosstabular report:

    PROC FREQ DATA=SAS-data-set;
       TABLES variable1 * variable2;
    RUN;
40
40 proc format; value $codefmt 'FLTAT1'-'FLTAT3'='Flight Attendant' 'PILOT1'-'PILOT3'='Pilot'; value money low-<25000 ='Less than 25,000' 25000-50000='25,000 to 50,000' 50000<-high='More than 50,000'; run; proc freq data=ia.crew; tables JobCode*Salary; format JobCode $codefmt. Salary money.; title 'Salary Distribution by Job Codes'; run; Crosstabular Frequency Reports – Example
41
Crosstabular Frequency Reports

Salary Distribution by Job Codes

The FREQ Procedure

Table of JobCode by Salary

JobCode           Salary

Frequency        |
Percent          |
Row Pct          |
Col Pct          |Less tha|25,000 t|More tha|  Total
                 |n 25,000|o 50,000|n 50,000|
-----------------+--------+--------+--------+
Flight Attendant |      5 |     39 |      0 |     44
                 |   7.25 |  56.52 |   0.00 |  63.77
                 |  11.36 |  88.64 |   0.00 |
                 | 100.00 | 100.00 |   0.00 |
-----------------+--------+--------+--------+
Pilot            |      0 |      0 |     25 |     25
                 |   0.00 |   0.00 |  36.23 |  36.23
                 |   0.00 |   0.00 | 100.00 |
-----------------+--------+--------+--------+
Total                   5       39       25       69
                     7.25    56.52    36.23   100.00

JobCode is the first variable (rows); Salary is the second variable (columns).
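Each cell of the crosstabulation stacks four numbers in the order given by the legend (Frequency, Percent, Row Pct, Col Pct). As an illustration, the Flight Attendant / "Less than 25,000" cell can be reproduced from the counts in the table: 5 observations in the cell, 69 overall, 44 in the Flight Attendant row, and 5 in the "Less than 25,000" column. This is worked arithmetic on the values shown, not additional SAS output.

```latex
\begin{align*}
\text{Percent} &= \frac{5}{69} \times 100 \approx 7.25 \\
\text{Row Pct} &= \frac{5}{44} \times 100 \approx 11.36 \\
\text{Col Pct} &= \frac{5}{5} \times 100 = 100.00
\end{align*}
```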
42
42 proc freq data=ia.crew; tables JobCode*Location / crosslist; title 'Location Distribution for Job Codes'; run; Crosstabular Frequency Reports – Example To display the crosstabulation results in a listing form, add the CROSSLIST option to the TABLES statement.
43
43 Location Distribution for Job Codes The FREQ Procedure Table of JobCode by Location Job Row Column Code Location Frequency Percent Percent Percent ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ FLTAT1 CARY 5 7.25 35.71 18.52 FRANKFURT 4 5.80 28.57 21.05 LONDON 5 7.25 35.71 21.74 Total 14 20.29 100.00 ------------------------------------------------------------- FLTAT2 CARY 7 10.14 38.89 25.93 FRANKFURT 5 7.25 27.78 26.32 LONDON 6 8.70 33.33 26.09 Total 18 26.09 100.00 ------------------------------------------------------------- Crosstabular Frequency Reports Partial Output
44
44 This exercise reinforces the concepts discussed previously. Exercise
45
45 Exercises Using the StudyLocations data set you created in a previous exercise, create a crosstabular frequency report using the CROSSLIST option. Display the length of the program by country.
46
46 Exercises proc freq data=StudyLocations; tables Time*Country /crosslist; run;
47
47 Exercises
48
48 International Airlines wants to determine the minimum, maximum, and average salary for each job code. Business Task
49
Calculating Summary Statistics – Example
The MEANS procedure displays simple descriptive statistics for the numeric variables in a SAS data set.

General form of a simple PROC MEANS step:

    PROC MEANS DATA=SAS-data-set;
    RUN;

Example:

    proc means data=ia.crew;
       title 'Salary Analysis';
    run;

How many variables will be analyzed?
50
50 Salary Analysis The MEANS Procedure Variable N Mean Std Dev Minimum Maximum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ HireDate 69 9812.78 1615.44 7318.00 12690.00 Salary 69 52144.93 25521.78 21000.00 112000.00 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Calculating Summary Statistics In this example, PROC MEANS analyzed two variables, HireDate and Salary.
51
Calculating Summary Statistics
By default, PROC MEANS
- analyzes every numeric variable in the SAS data set
- prints the following statistics: N, MEAN, STD, MIN, MAX
- excludes missing values before calculating statistics.
52
Summary Statistics
Default Statistics:

N       number of rows with nonmissing values
MEAN    arithmetic mean (or average)
STD     standard deviation
MIN     minimum value
MAX     maximum value
53
Choosing Summary Statistics
Other Statistics:

RANGE     difference between lowest and highest values
MEDIAN    50th percentile value
SUM       total
NMISS     number of rows with missing values

For more information on other PROC MEANS options, refer to the SAS OnlineDoc.
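These statistics are requested by keyword in the PROC MEANS statement. The sketch below is one way to try them, assuming the sashelp.class sample data set that ships with SAS as a stand-in for ia.crew; Height and Weight are variables in that sample, not in the course data.

```sas
/* sashelp.class is a SAS sample data set used here in place of ia.crew. */
title 'Selected Statistics for Height and Weight';
proc means data=sashelp.class n nmiss median range sum maxdec=1;
   var Height Weight;   /* limit the analysis to two numeric variables */
run;
title;
```

Listing any statistic keyword replaces the default set, so N is named explicitly here to keep it in the output.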
54
54 Choosing Summary Statistics To see a different statistic or control the number of default statistics, list the statistics you want in the PROC MEANS statement as an option to the step. You saw this in Chapter 2 when you worked on syntax errors.
55
55 Choosing Summary Statistics title 'Salary Analysis'; proc means data=ia.crew mean max min; run; Salary Analysis The MEANS Procedure Variable Mean Maximum Minimum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ HireDate 9812.78 12690.00 7318.00 Salary 52144.93 112000.00 21000.00 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ The order of the statistics changed from the default order. Because min is listed in the statement after max, that is the order that they appear in the output.
56
Grouping Observations
PROC MEANS may not always print two digits to the right of the decimal point. To control the maximum number of decimal places that PROC MEANS uses when printing results, use the MAXDEC= option in the PROC MEANS statement.

General form of the PROC MEANS statement with the MAXDEC= option:

    PROC MEANS DATA=SAS-data-set MAXDEC=number;
    RUN;
57
57 Choosing Summary Statistics title 'Salary Analysis'; proc means data=ia.crew mean max min maxdec=1; run; Salary Analysis The MEANS Procedure Variable Mean Maximum Minimum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ HireDate 9812.8 12690.0 7318.0 Salary 52144.9 112000.0 21000.0 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Values are rounded to the specified number of decimals. The range of values for the MAXDEC= option is 0-8. The MAXDEC= option does not use format names to format the values.
59
Printing Selected Variables
SAS enables you to select the variables to display and analyze. In PROC MEANS, what statement selects the variables? The VAR statement.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
60
Printing Selected Variables
SAS enables you to select the variables to display and analyze.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
MEANS    VAR
61
61 The VAR statement restricts the variables processed by PROC MEANS. General form of the VAR statement: VAR SAS-variable(s); Selecting Variables
62
62 proc means data=ia.crew; var Salary; title 'Salary Analysis'; run; ia.crew Selecting Variables – Example Salary Analysis The MEANS Procedure Analysis Variable : Salary N Mean Std Dev Minimum Maximum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ 69 52144.93 25521.78 21000.00 112000.00 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ...
63
63 The CLASS statement in the MEANS procedure groups the observations of the SAS data set for analysis. General form of the CLASS statement: CLASS SAS-variable(s); Grouping Observations
64
64 title 'Salary by Job Code'; proc means data=ia.crew maxdec=2; var Salary; class JobCode; run; ia.crew Grouping Observations – Example...
65
65 Salary by Job Code The MEANS Procedure Analysis Variable : Salary Job N Code Obs N Mean Std Dev Minimum Maximum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ FLTAT1 14 14 25642.86 2951.07 21000.00 30000.00 FLTAT2 18 18 35111.11 1906.30 32000.00 38000.00 FLTAT3 12 12 44250.00 2301.19 41000.00 48000.00 PILOT1 8 8 69500.00 2976.10 65000.00 73000.00 PILOT2 9 9 80111.11 3756.48 75000.00 86000.00 PILOT3 8 8 99875.00 7623.98 92000.00 112000.00 ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Grouping Observations
66
66 Using Formats with PROC MEANS You cannot format the statistics, but you can format the CLASS variables. proc format; value $codefmt 'FLTAT1'-'FLTAT3'='Flight Attendant' 'PILOT1'-'PILOT3'='Pilot'; run; title 'Salary by Job Code'; proc means data=ia.crew mean max min maxdec=2; var Salary; class JobCode; format JobCode $codefmt.; run;
67
67 Salary by Job Code The MEANS Procedure Analysis Variable : Salary N JobCode Obs Mean Maximum Minimum ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ Flight Attendant 44 34590.91 48000.00 21000.00 Pilot 25 83040.00 112000.00 65000.00 Grouping Observations
68
68 This exercise reinforces the concepts discussed previously. Exercise
69
Exercises
Using the StudyLocations data set that you created in the previous section and PROC MEANS, create a report that
- includes the minimum, maximum, and range of Cost
- groups the cost of the trip by Country
- displays appropriately formatted values.
70
70 Exercises proc means data= StudyLocations min max range; var cost; class Country; run;
71
Exercises (Alternate Solution)

    proc means data=StudyLocations min max range nonobs;
       var cost;
       class Country;
    run;

To remove the N Obs column from the output, use the NONOBS option in the PROC MEANS statement.
72
72 This exercise reinforces the concepts discussed previously. Exercise – Section 8.2
73
Section 8.3 The REPORT Procedure
74
74 Objectives Use the REPORT procedure to create a listing report. Apply the ORDER usage type to sort the data in a listing report. Apply the SUM and GROUP usage types to create a summary report. Use the RBREAK statement to produce a grand total.
75
SAS Vocabulary: PROC REPORT, WINDOWS|WD, NOWINDOWS|NOWD, PROC TABULATE, COLUMN, DEFINE, FORMAT=, WIDTH=, ORDER, GROUP, RBREAK, HEADLINE, HEADSKIP
76
REPORT Procedure Features
PROC REPORT enables you to
- create listing reports: rows are listed one line at a time (as in PROC PRINT output)
- create summary reports: data is grouped and many rows are combined in one line of output.

Data Set
LastName   FirstName   Date     Purchase
TORRES     JAN         16409      120.80
TORRES     JAN         16578      500.20
SMITH      MICHAEL     16614       82.25
SMITH      MICHAEL     15999       16.48
SMITH      MICHAEL     16080       25.45
YONKERS    JESSIE      16783      832.98
ZIMMEL     JIMMY       16999       48.32

Summarized Report
LastName   FirstName   Date     Purchase
TORRES     JAN         16409      621.00
SMITH      MICHAEL     16614      124.18
YONKERS    JESSIE      16783      832.98
ZIMMEL     JIMMY       16999       48.32
77
REPORT Procedure Features
PROC REPORT enables you to
- create listing reports: rows are listed one line at a time (as in PROC PRINT output)
- create summary reports: data is grouped and many rows are combined in one line of output
- enhance reports easily, for example, with formats, labels, and groups
- request separate subtotals and grand totals
- generate reports in an interactive point-and-click environment (the default) or in a programming environment.
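To make the idea of a summary report concrete, here is a minimal sketch that collapses the seven purchase rows from the earlier sample data into one line per person. It relies on the GROUP and SUM usage types named in this section's objectives; the data set name work.purchases and the informat lengths are assumptions made for illustration, and Date is left out of the report because a grouped line has no single date.

```sas
/* Hypothetical data set name; the rows are the sample purchases shown earlier. */
data work.purchases;
   input LastName :$10. FirstName :$10. Date Purchase;
   datalines;
TORRES JAN 16409 120.80
TORRES JAN 16578 500.20
SMITH MICHAEL 16614 82.25
SMITH MICHAEL 15999 16.48
SMITH MICHAEL 16080 25.45
YONKERS JESSIE 16783 832.98
ZIMMEL JIMMY 16999 48.32
;
run;

/* GROUP collapses rows that share a name; SUM totals Purchase per group. */
title 'Purchases Summarized by Customer';
proc report data=work.purchases nowd;
   column LastName FirstName Purchase;
   define LastName  / group;
   define FirstName / group;
   define Purchase  / analysis sum format=8.2;
run;
title;
```

The totals should agree with the summarized values shown earlier, for example 120.80 + 500.20 = 621.00 for TORRES and 82.25 + 16.48 + 25.45 = 124.18 for SMITH.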
78
78 PROC REPORT versus PROC PRINT
79
Creating a List Report
General form of a simple PROC REPORT step:

    PROC REPORT DATA=SAS-data-set;
    RUN;

Selected options:

WINDOWS | WD        invokes the procedure in an interactive REPORT window (default).
NOWINDOWS | NOWD    displays the report in the OUTPUT window.

Example:

    proc report data=ia.crew nowd;
    run;
80
80 proc report data=ia.crew nowd; run; Creating a List Report Output JobCod Location Phone EmpID e Salary LONDON 2388 E01163 FLTAT2 34000 CARY 1381 E02102 FLTAT3 42000 LONDON 2553 E00710 FLTAT2 33000 CARY 2554 E01818 PILOT2 82000 CARY 2569 E03921 FLTAT3 47000 LONDON 2577 E03339 FLTAT2 35000 LONDON 2582 E03555 PILOT2 83000 CARY 2599 E02766 FLTAT2 32000 LONDON 2745 E03740 PILOT1 73000 FRANKFURT 1160 E01483 FLTAT2 33000 CARY 2779 E01384 FLTAT2 38000 FRANKFURT 2797 E00223 PILOT3 105000 FRANKFURT 1136 E04581 PILOT1 69000 FRANKFURT 1183 E00632 PILOT3 100000 FRANKFURT 2960 E03884 FLTAT2 38000 LONDON 2997 E00034 FLTAT3 44000 LONDON 1156 E03591 FLTAT3 47000 FRANKFURT 1194 E04064 FLTAT2 37000 FRANKFURT 1197 E01996 FLTAT1 26000 LONDON 1160 E04356 FLTAT2 34000 LONDON 1552 E01447 FLTAT3 45000 FRANKFURT 1553 E02679 FLTAT1 27000 CARY 1555 E02606 FLTAT2 36000 LONDON 1565 E03323 FLTAT1 22000
81
81 What do you notice about JobCode? Creating a List Report JobCod Location Phone EmpID e Salary LONDON 2388 E01163 FLTAT2 34000 CARY 1381 E02102 FLTAT3 42000 LONDON 2553 E00710 FLTAT2 33000 CARY 2554 E01818 PILOT2 82000 CARY 2569 E03921 FLTAT3 47000 LONDON 2577 E03339 FLTAT2 35000 LONDON 2582 E03555 PILOT2 83000 CARY 2599 E02766 FLTAT2 32000 LONDON 2745 E03740 PILOT1 73000 FRANKFURT 1160 E01483 FLTAT2 33000 CARY 2779 E01384 FLTAT2 38000 FRANKFURT 2797 E00223 PILOT3 105000 FRANKFURT 1136 E04581 PILOT1 69000 FRANKFURT 1183 E00632 PILOT3 100000 FRANKFURT 2960 E03884 FLTAT2 38000 LONDON 2997 E00034 FLTAT3 44000 LONDON 1156 E03591 FLTAT3 47000 FRANKFURT 1194 E04064 FLTAT2 37000 FRANKFURT 1197 E01996 FLTAT1 26000 LONDON 1160 E04356 FLTAT2 34000 LONDON 1552 E01447 FLTAT3 45000 FRANKFURT 1553 E02679 FLTAT1 27000 CARY 1555 E02606 FLTAT2 36000 LONDON 1565 E03323 FLTAT1 22000 Output
82
82 proc report data=ia.crew; run; Creating a List Report What happens if you forget the NOWD option? Try it. Your instructor can show you how to easily change the width of JobCode, the format of Salary, and change the color of Salary to green.
83
Creating a List Report

    proc report data=ia.crew;
    run;

What happens if you forget the NOWD option? An interactive window opens, and you can change the report interactively rather than by modifying the code. You must close this window before you submit any other code; otherwise, that code waits in the buffer until the window is closed. After the window is closed, any submitted code is executed.
84
The REPORT Procedure
The default listing displays
- each data value as it is stored in the data set, or the formatted value if a format is stored with the data
- variable names or labels as report column headings
- a default width for the report columns (the width that is used is discussed later)
- character values left-justified
- numeric values right-justified
- observations in the order in which they are stored in the data set.
86
Printing Selected Variables
SAS enables you to select the variables to display and analyze. In PROC REPORT, what statement selects the variables? The COLUMN statement.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
MEANS    VAR
87
Reference: Printing Selected Variables
SAS enables you to select the variables to display and analyze.

PROC      Statement to select variables
PRINT     VAR
FREQ      TABLES
MEANS     VAR
REPORT    COLUMN
88
Printing Selected Variables
You can use a COLUMN statement in PROC REPORT to do the following:
- select the variables to appear in the report
- order the variables in the report

General form of the COLUMN statement:

    COLUMN SAS-variables;
89
89 Sample Listing Report – Example Partial SAS Output title 'Salary Analysis'; proc report data=ia.crew nowd; column JobCode Location Salary; run; Salary Analysis JobCod Location Salary e PILOT1 LONDON 72000 FLTAT3 CARY 41000 PILOT2 FRANKFURT 81000 PILOT2 FRANKFURT 83000 FLTAT2 LONDON 36000 PILOT1 LONDON 65000 FLTAT2 FRANKFURT 35000 FLTAT2 FRANKFURT 38000 FLTAT1 LONDON 28000 FLTAT3 LONDON 44000 FLTAT2 CARY 37000...
90
90 The DEFINE Statement You can enhance the report by using DEFINE statements to perform the following tasks: define how each variable is used in the report assign formats to variables specify report column headers and column widths change the order of the rows in the report
91
The DEFINE Statement
General form of the DEFINE statement:

    DEFINE variable / ;

The slash (/) is required.

Add a DEFINE statement to the PROC REPORT step for each variable whose appearance you want to change from the default. You do not need a DEFINE statement for every variable.
92
The DEFINE Statement
Selected attributes:

'report-column-header'    defines the report column header.

If there is a label stored in the descriptor portion of the data set, it is the default header. If one is not stored, SAS uses the variable name.

Example:

    title 'Salary Analysis';
    proc report data=ia.crew nowd;
       column JobCode Location Salary;
       define Salary / 'Annual Salary';
    run;
93
Sample Listing Report – Example

    title 'Salary Analysis';
    proc report data=ia.crew nowd;
       column JobCode Location Salary;
       define Salary / 'Annual Salary';
    run;

The / is required.

Salary Analysis

                        Annual
JobCod  Location        Salary
e
PILOT1  LONDON           72000
FLTAT3  CARY             41000
PILOT2  FRANKFURT        81000
PILOT2  FRANKFURT        83000
FLTAT2  LONDON           36000
PILOT1  LONDON           65000
FLTAT2  FRANKFURT        35000
FLTAT2  FRANKFURT        38000
...
94
94 Selected attributes: If there is a format stored in the descriptor portion of the data set, it is the default format. The DEFINE Statement FORMAT=assigns a format to a variable. Example: title 'Salary Analysis'; proc report data=ia.crew nowd; column JobCode Location Salary; define Salary / 'Annual Salary' format= dollar8.; run;
95
95 Sample Listing Report – Example Salary Analysis Annual JobCod Location Salary e PILOT1 LONDON $72,000 FLTAT3 CARY $41,000 PILOT2 FRANKFURT $81,000 PILOT2 FRANKFURT $83,000 FLTAT2 LONDON $36,000 PILOT1 LONDON $65,000 FLTAT2 FRANKFURT $35,000 FLTAT2 FRANKFURT $38,000... title 'Salary Analysis'; proc report data=ia.crew nowd; column JobCode Location Salary; define Salary / 'Annual Salary' format= dollar8. ; run; Use one /, followed by all attributes in any order.
96
The DEFINE Statement
Selected attributes:

WIDTH=    controls the width of a report column.

The default width is
- the variable length for character variables
- 9 for numeric variables
- the format width if there is a format stored in the descriptor portion of the data set.

The WIDTH= option enables you to change the width of JobCode so that the e is not on a separate line.
97
97 title 'Salary Analysis'; proc report data=ia.crew nowd; column JobCode Location Salary; define Salary / 'Annual Salary' format= dollar8. ; define JobCode / width= 8; run; Sample Listing Report – Example Salary Analysis Annual JobCode Location Salary PILOT1 LONDON $72,000 FLTAT3 CARY $41,000 PILOT2 FRANKFURT $81,000 PILOT2 FRANKFURT $83,000 FLTAT2 LONDON $36,000 PILOT1 LONDON $65,000 FLTAT2 FRANKFURT $35,000 FLTAT2 FRANKFURT $38,000... The order of the DEFINE statements does not matter.
98
98 Enhancing the Listing Report – Example Change column headings. Increase the column widths. Add a format to display Salary with dollar signs and commas. proc report data=ia.crew nowd; column JobCode Location Salary; define JobCode / width=8 'Job Code'; define Location / 'Home Base'; define Salary / format=dollar10.; run;...
99
99 Enhancing the Listing Report – Example Partial SAS Output Job Code Home Base Salary PILOT1 LONDON $72,000 FLTAT3 CARY $41,000 PILOT2 FRANKFURT $81,000 PILOT2 FRANKFURT $83,000 FLTAT2 LONDON $36,000 PILOT1 LONDON $65,000 FLTAT2 FRANKFURT $35,000 FLTAT2 FRANKFURT $38,000 FLTAT1 LONDON $28,000...
100
100 Enhancing the Listing Report – Example Change the report to group the pilots and flight attendants. Job Code Home Base Salary PILOT1 LONDON $72,000 FLTAT3 CARY $41,000 PILOT2 FRANKFURT $81,000 PILOT2 FRANKFURT $83,000 FLTAT2 LONDON $36,000 PILOT1 LONDON $65,000 FLTAT2 FRANKFURT $35,000 FLTAT2 FRANKFURT $38,000 FLTAT1 LONDON $28,000...
101
ORDER Usage Type
Selected attributes:

ORDER    orders the rows in the report.

The ORDER attribute
- orders the report in ascending order (include the DESCENDING option in the DEFINE statement to force descending order)
- suppresses repetitious printing of values
- does not need the data to be sorted beforehand.
102
102 ORDER Usage Type – Example Display the data in order by JobCode. proc report data=ia.crew nowd; column JobCode Location Salary; define JobCode / order width=8 'Job Code'; define Location / 'Home Base'; define Salary / format=dollar10.; run;
103
ORDER Usage Type – Example
Partial SAS Output

Salary Analysis

Job Code   Home Base        Salary
FLTAT1     LONDON          $28,000
           FRANKFURT       $25,000
           CARY            $23,000
           ...
           FRANKFURT       $27,000
           LONDON          $22,000
FLTAT2     LONDON          $36,000
           FRANKFURT       $35,000
           ...
           FRANKFURT       $33,000
           CARY            $38,000

Each job code (FLTAT1, FLTAT2) is printed only on the first row of its group; the repeated values on the following rows are suppressed.
104
104 ORDER Usage Type – Example

Display the data in descending order by JobCode.

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / descending order width=8 'Job Code';
   define Location / 'Home Base';
   define Salary / format=dollar10.;
run;

The DESCENDING keyword can go anywhere in the DEFINE statement after the slash (/). It cannot be abbreviated.
105
105 ORDER Usage Type – Example

Partial SAS Output

Salary Analysis

Job Code  Home Base      Salary
PILOT3    LONDON       $108,000
          CARY         $112,000
          LONDON        $94,000
          ...
PILOT2    FRANKFURT     $81,000
          FRANKFURT     $83,000
          ...
PILOT1    LONDON        $72,000
          LONDON        $65,000
          CARY          $71,000
          ...
106
106 ORDER Usage Type – Example

What if you also want Location in sorted order?

Salary Analysis

Job Code  Home Base      Salary
FLTAT1    LONDON        $28,000
          FRANKFURT     $25,000
          CARY          $23,000
          ...
          FRANKFURT     $27,000
          LONDON        $22,000
FLTAT2    LONDON        $36,000
          FRANKFURT     $35,000
          ...
107
107 ORDER Usage Type – Example

Display the data in order by JobCode and Location.

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;
108
108 ORDER Usage Type – Example

Output

Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY          $23,000
                        $21,000
                        ...
          FRANKFURT     $25,000
                        $22,000
                        ...
          LONDON        $28,000
                        $29,000
                        $24,000
                        $25,000
                        $22,000
FLTAT2    CARY          $37,000
                        $34,000
                        $33,000
                        ...
          FRANKFURT     $35,000
                        $38,000
                        ...
109
109 ORDER Usage Type – Example

How did SAS know to group Location in JobCode?

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;
110
110 ORDER Usage Type – Example

How did SAS know to group Location in JobCode?

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;

Remember: The COLUMN statement selects the variables and controls their order in the output.
111
111 ORDER Usage Type – Example

How did SAS know to group Location in JobCode? It used the order of the variables in the COLUMN statement. To nest JobCode in Location instead, specify Location before JobCode in the COLUMN statement.

proc report data=ia.crew nowd;
   column Location JobCode Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;
112
112 ORDER Usage Type – Example

Specify Location before JobCode in the COLUMN statement.

Salary Analysis

Home Base  Job Code      Salary
CARY       FLTAT1       $23,000
                        $21,000
                        $29,000
                        $30,000
                        $28,000
           FLTAT2       $37,000
                        $34,000
                        $33,000
                        ...
FRANKFURT  FLTAT1       $25,000
                        $22,000
                        $26,000
                        $27,000
           FLTAT2       $35,000
                        $38,000
                        $33,000
                        ...
113
113 This exercise reinforces the concepts discussed previously. Exercise
114
114 Exercises

Using the StudyLocations data set that you created earlier and PROC REPORT, create the following report:

1. Display the variables in the following order: Country, Length of Program, Trip Begin Date, and Trip Cost.
2. Format the Trip Cost appropriately as currency, and format the Trip Begin Date so that December 1, 2007 appears as 01/12/2007.
3. Give an appropriate width for the other variables.
4. Order the rows by Country.
5. Title the report Study Abroad Options.
115
115 Exercises

title 'Study Abroad Options';
proc report data=StudyLocations nowd;
   column country time begindate cost;
   define beginDate / format=ddmmyy10.;
   define country / order;
run;
116
116 Exercises Partial Output
117
117 Business Task International Airlines wants to summarize Salary by JobCode for each Location.
118
118 Desired Report

Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000

You want one line for each combination of JobCode and Location (for example, one line for the FLTAT1 flight attendants from Cary) that shows the total salary.
119
119 The DEFINE Statement – Review

General form of the DEFINE statement in PROC REPORT:

DEFINE variable / <usage> <attribute-list>;

The USAGE attribute is necessary to produce this summary report.
120
120 The DEFINE Statement – USAGE Attribute

By default in PROC REPORT,
   character variables have a display usage and produce a listing report. (Each row is listed and there is no summarization or collapsing of rows.)
   numeric variables have an analysis usage and produce summary reports.

Variable Type   Default Usage   Report Produced
Character       Display         Listing
Numeric         Analysis        Summary
121
121 The DEFINE Statement – USAGE Attribute

The analysis usage for numeric variables
   uses a default statistic of SUM. (You can choose a different statistic.)
   has no visible effect by default when the report also contains character variables.

Character data has a display usage by default. If you have at least one column with a display usage, you get a listing report.
122
122 The DEFINE Statement – USAGE Attribute

If your report has at least one character column with a display usage, PROC REPORT produces a listing report by default, regardless of the number of numeric columns.

Variable Type   Default Usage   Report Produced
Character       Display         Listing
Numeric         Analysis        Summary
123
123 Character and Numeric Variables

Original data set:

JobCode   Salary
PILOT1     72000
FLTAT3     41000
PILOT2     81000
PILOT2     83000
FLTAT2     36000
PILOT1     65000
...

JobCode: display usage type (character variable default)
Salary: analysis usage type (numeric variable default). The default statistic is SUM.

Report produced: listing report
124
124 Numeric Variables Only

Original data set: Salary only (analysis usage, SUM statistic)
...

Report produced: summary report

Salary
378000

(Sum of all Salary values in the data set)

Why did you get a summary report?
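The numeric-only case can be tried on a few rows. This is a minimal sketch, assuming a small hypothetical data set named work.crew_sample that holds only the six observations shown on the earlier slide (it is not the ia.crew data set). Because the only column is numeric, it has an analysis usage, and the report should collapse to a single row containing the sum, 378000.

```sas
/* Hypothetical sample: the six rows shown on the slide, not ia.crew */
data work.crew_sample;
   input JobCode $ Salary;
   datalines;
PILOT1 72000
FLTAT3 41000
PILOT2 81000
PILOT2 83000
FLTAT2 36000
PILOT1 65000
;
run;

/* Only a numeric (analysis) column is listed, so there is no display */
/* column to force a listing report; PROC REPORT prints one row with  */
/* the SUM of Salary.                                                  */
proc report data=work.crew_sample nowd;
   column Salary;
   define Salary / format=dollar10.;
run;
```

Adding JobCode to the COLUMN statement without a usage brings back the listing report, because JobCode defaults to a display usage.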
125
125 Defining Group Variables To have character columns appear in the summarized report, use the GROUP attribute.
126
126 Defining Group Variables

In order for grouping to take effect, the word GROUP must be placed in the DEFINE statement for every character variable.

Example:

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;
127
127 Defining Group Variables All observations whose group variables have the same values are collapsed into a single row in the report.
128
128 Defining Group Variables

Original data set:

JobCode   Salary
PILOT1     72000
FLTAT3     41000
PILOT2     81000
PILOT2     83000
FLTAT2     36000
PILOT1     65000

JobCode as display usage, Salary as analysis usage (SUM statistic): listing report produced, with one row for each observation.

JobCode as group usage, Salary as analysis usage (SUM statistic): summary report produced.

JobCode   Salary
FLTAT2     36000
FLTAT3     41000
PILOT1    137000
PILOT2    164000

FLTAT2 is in both reports.
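The collapsing shown above can be reproduced on the same six rows. This is a sketch under stated assumptions: work.crew_sample is a hypothetical data set built from the rows on the slide, not the ia.crew data set used elsewhere in the chapter.

```sas
/* Hypothetical sample: the six rows from the slide */
data work.crew_sample;
   input JobCode $ Salary;
   datalines;
PILOT1 72000
FLTAT3 41000
PILOT2 81000
PILOT2 83000
FLTAT2 36000
PILOT1 65000
;
run;

/* GROUP collapses observations that share a JobCode value into one */
/* row; Salary keeps its default analysis usage and SUM statistic.  */
proc report data=work.crew_sample nowd;
   column JobCode Salary;
   define JobCode / group width=8 'Job Code';
   define Salary / format=dollar10.;
run;
```

The two PILOT1 rows (72000 and 65000) should appear as one row with 137000, and the two PILOT2 rows as one row with 164000, which matches the summary report on the slide.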
129
129 Defining Group Variables

As you saw with the ORDER usage, nesting of group variables is determined by the order of the variables in the COLUMN statement.

Example:

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;
130
130 Summarizing the Data

Partial SAS Output

Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
131
131 Defining Group Variables

Original data set, with JobCode as group usage, Location as group usage, and Salary as analysis usage (SUM statistic).

Report:

Salary Analysis

JobCode   Location     Salary
FLTAT2    CARY          67000
FLTAT3    CARY          85000
          FRANKFURT     93000
...
132
132 Defining Group Variables

List Location before JobCode in the COLUMN statement.

Example:

proc report data=ia.crew nowd;
   column Location JobCode Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;
133
133 Defining Group Variables

proc report data=ia.crew nowd;
   column Location JobCode Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;

Output

Home Base  Job Code      Salary
CARY       FLTAT1      $131,000
           FLTAT2      $245,000
           FLTAT3      $217,000
           PILOT1      $211,000
           PILOT2      $323,000
           PILOT3      $300,000
FRANKFURT  FLTAT1      $100,000
           FLTAT2      $181,000
           FLTAT3      $134,000
           PILOT1      $135,000
           PILOT2      $240,000
           PILOT3      $205,000
LONDON     FLTAT1      $128,000
           FLTAT2      $206,000
           FLTAT3      $180,000
           PILOT1      $210,000
           PILOT2      $158,000
           PILOT3      $294,000

Location appears first and JobCode is nested in Location.
134
134 Reference: Defining Group Variables If you have a group variable, there must be no display or order variables. Group variables produce summary reports (observations collapsed into groups). Display and order variables produce listing reports (one row for each observation).
135
135 Reference: Defining Analysis Variables

Default usage for numeric variables is analysis with a default statistic of SUM.

If the report contains group variables, then the report displays the sum of the numeric variables’ values for each group.
If the report contains at least one display or order variable and no group variables, then the report lists all of the values of the numeric variable.
If the report contains only numeric variables, then the report displays grand totals for the numeric variables.
136
136 Defining Analysis Variables

Selected statistics include the following:

SUM    sum (default)
N      number of nonmissing values
MEAN   average
MAX    maximum value
MIN    minimum value

To specify a statistic other than SUM, type the name of the statistic after the slash in the DEFINE statement.

Example:

define Salary / mean format=dollar10.;
137
137 Summarizing the Data

Specify the MEAN statistic.

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / mean format=dollar10.;
run;
138
138 Summarizing the Data

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / mean format=dollar10.;
run;

Output

Job Code  Home Base      Salary
FLTAT1    CARY          $26,200
          FRANKFURT     $25,000
          LONDON        $25,600
FLTAT2    CARY          $35,000
          FRANKFURT     $36,200
          LONDON        $34,333
FLTAT3    CARY          $43,400
          FRANKFURT     $44,667
          LONDON        $45,000
PILOT1    CARY          $70,333
          FRANKFURT     $67,500
          LONDON        $70,000
PILOT2    CARY          $80,750
          FRANKFURT     $80,000
          LONDON        $79,000
PILOT3    CARY         $100,000
          FRANKFURT    $102,500
          LONDON        $98,000
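Group means from PROC REPORT can be cross-checked against PROC MEANS with a CLASS statement, which this chapter introduced earlier. This is a minimal sketch, assuming a hypothetical six-row data set named work.crew_sample taken from the listing shown earlier (it is not the ia.crew data set). Both steps should report a mean of 68500 for PILOT1 and 82000 for PILOT2.

```sas
/* Hypothetical sample: six rows from the earlier listing */
data work.crew_sample;
   input JobCode $ Salary;
   datalines;
PILOT1 72000
FLTAT3 41000
PILOT2 81000
PILOT2 83000
FLTAT2 36000
PILOT1 65000
;
run;

/* Mean salary per job code with PROC REPORT */
proc report data=work.crew_sample nowd;
   column JobCode Salary;
   define JobCode / group width=8 'Job Code';
   define Salary / mean format=dollar10.;
run;

/* The same statistic with PROC MEANS and a CLASS statement */
proc means data=work.crew_sample mean maxdec=0;
   class JobCode;
   var Salary;
run;
```

PROC REPORT gives more control over headings and layout, while PROC MEANS is the shorter route when only the statistics matter.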
139
139 This exercise reinforces the concepts discussed previously. Exercise
140
140 Exercises

Using the CollegeStats data set, produce the following report:

Average SAT Scores and GPAs for Second Attempt
by Gender

             Average
          Second SAT  Average
Gender         Score   HS_GPA
Female         1,138     3.39
Male           1,089     3.29
141
141 Exercises – A Solution

proc format;
   value $gender 'm','M' = 'Male'
                 'f','F' = 'Female';
run;

title 'Average SAT Scores and GPAs for Second Attempt';
title2 'by Gender';
options nodate nonumber center ls=64;

proc report data=collegestats nowd;
   column gender SAT_Score_II HS_GPA;
   define gender / group width=6 format=$gender.;
   define SAT_Score_II / mean 'Average Second SAT Score' format=comma12.;
   define HS_GPA / mean 'Average HS_GPA' width=7;
run;
142
142 Printing Grand Totals

You can use an RBREAK statement to add the following:
   a grand total to the top or bottom of the report
   a line before the grand total
   a line after the grand total

General form of the RBREAK statement:

RBREAK BEFORE | AFTER </ options>;
143
143 Printing Grand Totals

Selected options:

SUMMARIZE   prints the total.
OL          prints a single line above the total.
DOL         prints a double line above the total.
UL          prints a single line below the total.
DUL         prints a double line below the total.

Refer to SAS OnlineDoc for more information about the RBREAK statement and other PROC REPORT options.
144
144 The RBREAK Statement

Use the RBREAK statement to display the grand total at the bottom of the report.

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
run;

The SUMMARIZE option gives you the grand total.
145
145 The RBREAK Statement

Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
146
146 Printing Subtotals

You can use a BREAK statement to add the following:
   a subtotal to the top or bottom of each group in the report
   a line before the subtotal
   a line after the subtotal

General form of the BREAK statement:

BREAK BEFORE | AFTER VariableName </ options>;
147
147 The BREAK Statement

Use the BREAK statement to display the subtotal at the end of each group in the report.

Example: You want subtotals after each JobCode.

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
   break after JobCode / summarize ol;
run;
148
148 The BREAK Statement Subtotals
149
149 The BREAK Statement

Use the SKIP option in the BREAK statement to add a blank line between the groups.

Example: Add a blank line after each group.

proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
   break after JobCode / summarize ol skip;
run;
150
150 The BREAK Statement Breaks
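Subtotals and the grand total are easy to check by hand on a small sample. This is a minimal sketch, assuming a hypothetical data set named work.crew_sample built from the first eight rows of the listing report shown earlier (it is not the full ia.crew data set). The LENGTH statement is needed because FRANKFURT has nine characters and list input would otherwise truncate it to eight.

```sas
/* Hypothetical sample: first eight rows of the earlier listing */
data work.crew_sample;
   length JobCode $ 6 Location $ 9;
   input JobCode $ Location $ Salary;
   datalines;
PILOT1 LONDON 72000
FLTAT3 CARY 41000
PILOT2 FRANKFURT 81000
PILOT2 FRANKFURT 83000
FLTAT2 LONDON 36000
PILOT1 LONDON 65000
FLTAT2 FRANKFURT 35000
FLTAT2 FRANKFURT 38000
;
run;

proc report data=work.crew_sample nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   /* Subtotal after each JobCode, overlined, then a blank line */
   break after JobCode / summarize ol skip;
   /* Grand total at the bottom with a double line above it */
   rbreak after / summarize dol;
run;
```

With these rows the subtotals should be 109000 for FLTAT2, 41000 for FLTAT3, 137000 for PILOT1, and 164000 for PILOT2, with a grand total of 451000. The OL, DOL, and SKIP options affect only traditional monospace LISTING output, so the lines and blank rows may not appear in other ODS destinations.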
151
151 Enhancing the Report

To add a line under your column headings, use the HEADLINE option in the PROC REPORT statement.

proc report data=ia.crew nowd headline;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
run;
152
152 Enhancing the Report

Headline

Salary Analysis

Job Code  Home Base      Salary
-------------------------------
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
153
153 Enhancing the Report

To skip a line under your column headings, use the HEADSKIP option in the PROC REPORT statement.

proc report data=ia.crew nowd headline headskip;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
run;
154
154 Enhancing the Report

Headskip

Salary Analysis

Job Code  Home Base      Salary
-------------------------------

FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
155
155 This exercise reinforces the concepts discussed previously. Exercise
156
156 Exercises

Modify the previous exercise that used the CollegeStats data set and its PROC REPORT step to create the following output:

Average SAT Scores and GPAs for Second Attempt
by Gender

             Average
          Second SAT  Average
Gender         Score   HS_GPA
-----------------------------

Female         1,138     3.39

Male           1,089     3.29

        ------------  -------
               1,114     3.34
        ============  =======

A blank line appears between genders.
157
157 Exercises

proc format;
   value $gender 'm','M' = 'Male'
                 'f','F' = 'Female';
run;

title 'Average SAT Scores and GPAs for Second Attempt';
title2 'by Gender';
options nodate nonumber;

proc report data=collegestats nowd headline headskip;
   column gender SAT_Score_II HS_GPA;
   define gender / group width=6 format=$gender.;
   define SAT_Score_II / mean 'Average Second SAT Score' format=comma12.;
   define HS_GPA / mean 'Average HS_GPA' width=7;
   rbreak after / summarize ol dul;
   break after gender / skip;
run;
158
158 This exercise reinforces the concepts discussed previously. Exercise – Section 8.3