The Information Delivery Process: Data In, Information Out (Manage, Organize, Exploit)

Presentation transcript:

The Information Delivery Process: Data In, Information Out. Manage, Organize, Exploit.

2 Turning Data Into Information [Diagram: Data, DATA Step, SAS Data Sets, PROC Steps, Information]

3 Turning Data Into Information Process of delivering meaningful information: 80% data-related (access, scrub, transform, manage, store and retrieve); 20% analysis.

4 The Raw Data Partial fixed-column raw data file: 

5 Browsing the Data Values                 

6 Reading a Raw Data File Raw Data File SAS Data Set

7 Reading Raw Data Files Raw Data File DATA Step SAS Data Set data...; infile...; input...; run; 0031GOLDENBERG DESIREE 0040WILLIAMS ARLENE M. 0071PERRY ROBERT A. 0082MCGWIER-WATTSCHRISTINA      

8 Reading Raw Data Files In order to create a SAS data set from a raw data file, you must: start a DATA step and name the SAS data set being created (DATA statement); identify the location of the raw data file to read (INFILE statement); describe how to read the data fields from the raw data file (INPUT statement).
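The three statements above can be sketched as one small program. This is a minimal illustration, not the course's own example: the fileref RAWDEMO, the data set WORK.EMPDEMO, and the two records are hypothetical, and the raw file is written to a temporary location first so the step has something to read. The TRUNCOVER option on INFILE is an addition not covered in the slides; it keeps SAS from moving to the next record when a line is shorter than the columns requested.

```sas
/* Hypothetical setup: write a tiny fixed-column file to a temporary
   fileref so the example does not depend on an existing file. */
filename rawdemo temp;

data _null_;
  file rawdemo;
  put '0031GOLDENBERG';
  put '0040WILLIAMS';
run;

data work.empdemo;            /* DATA statement: names the data set     */
  infile rawdemo truncover;   /* INFILE statement: locates the raw file */
  input empid    $ 1-4        /* INPUT statement: describes the fields  */
        lastname $ 5-17;
run;

proc print data=work.empdemo;
run;
```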

9 Creating a SAS Data Set with the DATA Statement General form of the DATA statement: DATA SAS-data-set(s); This DATA statement creates a SAS data set called WORK.EMPDATA: data work.empdata;

10 Pointing to a Raw Data File with the INFILE Statement General form of the INFILE statement: INFILE 'filename'; Examples: OS/390: infile 'edc.prog1.employee'; UNIX: infile '/user/prog1/employee.dat'; Windows: infile 'C:\workshop\winsas\prog1\employee.dat';

11 Reading Raw Data Using Column Input General form of column input: INPUT variable $ startcol-endcol ...; To read raw data values with column input: 1. name the SAS variable you want to create; 2. use a dollar sign, $, if the SAS variable is character; 3. specify the starting column, a dash, and the ending column of the raw data field.

12 Reading Raw Data Using Column Input 0031GOLDENBERG DESIREE PILOT input empid $ 1-4 lastname $

13 Reading Raw Data Using Column Input 0031GOLDENBERG DESIREE PILOT input empid $ 1-4 lastname $ 5-17 firstname $

14 Reading Raw Data Using Column Input 0031GOLDENBERG DESIREE PILOT input empid $ 1-4 lastname $ 5-17 firstname $ jobcode $

15 Reading Raw Data Using Column Input 0031GOLDENBERG DESIREE PILOT input empid $ 1-4 lastname $ 5-17 firstname $ jobcode $ salary 37-45;
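A runnable sketch of the column-input statement built up on these slides might look like the following. The column ranges for EMPID, LASTNAME, and SALARY come from the slides; the ranges for FIRSTNAME (18-30) and JOBCODE (31-36) are assumptions, since the transcript lost them, and the job codes and salary values are invented for illustration. In-stream DATALINES stands in for the raw data file.

```sas
data work.empdata;
  input empid     $ 1-4
        lastname  $ 5-17
        firstname $ 18-30    /* assumed column range */
        jobcode   $ 31-36    /* assumed column range */
        salary      37-45;   /* no $: numeric variable */
  datalines;
0031GOLDENBERG   DESIREE      PILOT150221.62
0040WILLIAMS     ARLENE M.    FLTAT123666.12
;
run;

proc print data=work.empdata;
run;
```

Because each field is located by its column positions, embedded blanks such as the one in "ARLENE M." are read as part of the value.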


17 Business Scenario International Airlines is preparing to review its flight crew. The immediate goal is to read the Excel spreadsheet and create a SAS data set. Excel Spreadsheet SAS Data Set

18 What is the Import Wizard? A point-and-click graphical interface that enables you to create a SAS data set from several types of external files, including: dBASE file (*.DBF); Excel 97 Spreadsheet (*.XLS); Microsoft Access Table; Delimited file (*.*); Comma Separated Values (*.CSV).

19 The Raw Data The aircraft data is stored in a fixed-column raw data file with four fields: aircraft model, aircraft ID, date in service, and last maintenance date. [Partial data listing not preserved in the transcript.]

20 Using Formatted Input The raw data file will be read with formatted input. Raw Data File DATA Step SAS Data Set data sas-data-set-name; infile raw-filename; input pointer-control variable informat-name; run;   

21 What is a SAS Format? A format is an instruction that the SAS System uses to write data values. SAS formats have the following form: $formatw.d

22 SAS Formats Selected SAS formats: w.d standard numeric format $w. standard character format COMMAw.d commas in a number: 12, DOLLARw.d dollar signs and commas in a number: $12,234.41
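To see how these formats change what is written without changing the stored value, a small sketch such as the one below can be run; it writes the same number to the SAS log three ways. The variable name and the choice of widths are illustrative.

```sas
data _null_;
  amount = 12234.41;
  put amount 8.2;         /* w.d:       12234.41   */
  put amount comma10.2;   /* COMMAw.d:  12,234.41  */
  put amount dollar11.2;  /* DOLLARw.d: $12,234.41 */
run;
```

The width w must leave room for every character written, including commas, the dollar sign, and the decimal point.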

23 SAS Formats

24 Using Formatted Input General form of the INPUT statement with formatted input: INPUT pointer-control variable informat...; Pointer controls: @n moves the pointer to column n. +n moves the pointer n positions.

25 Using Formatted Input Formatted input can be used to read nonstandard data values by: moving the input pointer to the starting position of the field; naming the variable; specifying an informat. An informat specifies the width of the input field and how to read the data values that are stored in the field.

26 Using Formatted Input General form of an informat: $informat-namew.d where $ indicates a character informat; informat-name names the informat; w is an optional field width; . is the required delimiter; d optionally specifies a decimal for numeric informats.

27 Selected Informats 7. or 7.0 reads seven columns of numeric data. 7.2 reads seven columns of numeric data and inserts a decimal point in the data value. $5. reads five columns of character data and removes leading blanks. $CHAR5. reads five columns of character data and preserves leading blanks.

28 Selected Informats COMMA7. reads seven columns of numeric data and removes selected nonnumeric characters, such as dollar signs and commas. PD4. reads four columns of packed decimal data. MMDDYY10. reads dates of the form 01/20/2000.
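A short sketch of three of these informats in use is shown below. The data set name, variable names, and the single data record are made up for illustration; the column pointers (@1, @9, @17) simply match where the values sit in that record.

```sas
data work.informat_demo;
  input @1  amount comma7.   /* '$12,345' is read as 12345    */
        @9  cents  7.2       /* '1234567' is read as 12345.67 */
        @17 code   $5.;      /* 'AB123' is read as character  */
  datalines;
$12,345 1234567 AB123
;
run;

proc print data=work.informat_demo;
run;
```

The 7.2 informat supplies the decimal point only because the raw value does not contain one.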

29 Working with Date Values The raw data file contains date values. These date values will be read with the MMDDYY10. informat:     

30 Converting Dates to SAS Date Values SAS uses date informats to read and convert dates to SAS date values. For example, Stored Value InformatConverted Value 10/29/1999MMDDYY OCT1999DATE /10/1999DDMMYY
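The conversion can be checked with a small sketch like the one below, which reads the same calendar date in three layouts and writes the resulting SAS date value to the log. It uses the INPUT function, which the slides do not cover, purely to keep the example compact; the three literal dates are illustrative.

```sas
data _null_;
  d1 = input('10/29/1999', mmddyy10.);
  d2 = input('29OCT1999',  date9.);
  d3 = input('29/10/1999', ddmmyy10.);
  put d1= d2= d3=;                 /* each is 14546, days since 01JAN1960 */
  put d1 mmddyy10. +1 d1 date9.;   /* 10/29/1999 29OCT1999                */
run;
```

A SAS date value is just a number, so a date format is needed whenever the value should be displayed as a readable date.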

31 SAS Formats Selected SAS date formats: MMDDYYw. writes 101692 (MMDDYY6.), 10/16/92 (MMDDYY8.), 10/16/1992 (MMDDYY10.). DATEw. writes 16OCT92 (DATE7.), 16OCT1992 (DATE9.).

32 Locating and Browsing the Raw Data File Browse the raw data file and determine the column layout and type of each field: aircraft model, aircraft ID, date in service, and last maintenance date. [Partial raw data file listing not preserved in the transcript.]

33 Starting the DATA Step Use the DATA statement to begin the DATA step and name the SAS data set: data work.aircraft; other SAS statements run; Use the INFILE statement to identify the input raw data file: data work.aircraft; infile 'aircraft.dat'; other SAS statements run;

34 Writing the INPUT Statement Use the INPUT statement and pointer control to read the record starting with the first column. Read the value with the $16. informat and assign it to the variable MODEL. data work.aircraft; infile 'aircraft.dat'; input @1 model $16. other SAS statements run;

35 Writing the INPUT Statement Use the INPUT statement and pointer control to read the record starting with column 18. Read the value with the $6. informat and assign the value to AIRCRAFTID. data work.aircraft; infile 'aircraft.dat'; input @1 model $16. @18 aircraftid $6. other SAS statements run;

36 Writing the INPUT Statement Use the INPUT statement and pointer control to read the record starting with column 25. Read the value with the MMDDYY10. informat and assign the value to INSERVICE. data work.aircraft; infile 'aircraft.dat'; input @1 model $16. @18 aircraftid $6. @25 inservice mmddyy10. other SAS statements run;

37 Writing the INPUT Statement Use the INPUT statement and pointer control to read the record starting with column 36. Read the value with the MMDDYY10. informat and assign the value to LASTMAINT. data work.aircraft; infile 'aircraft.dat'; input @1 model $16. @18 aircraftid $6. @25 inservice mmddyy10. @36 lastmaint mmddyy10.; run;
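Putting the four pieces together, a self-contained version of this DATA step could look like the sketch below. The column positions and informats follow the slide text; the two records are invented, and DATALINES replaces the INFILE statement so the step runs without aircraft.dat. The FORMAT statement is an addition so that the dates print readably.

```sas
data work.aircraft;
  input @1  model      $16.
        @18 aircraftid $6.
        @25 inservice  mmddyy10.
        @36 lastmaint  mmddyy10.;
  format inservice lastmaint mmddyy10.;  /* display dates, not day counts */
  datalines;
JetCruise LF5200 030003 04/05/1994 03/11/2001
JetCruise LF5100 030005 12/20/1979 07/18/2000
;
run;

proc print data=work.aircraft;
run;
```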

38 SAS Syntax Rules SAS statements are free-format: one or more blanks or special characters can be used to separate words; they can begin and end in any column; a single statement can span multiple lines; several statements can be on the same line. Example (shown on the slide with unconventional spacing): data work.mech_pilot; infile 'c:\coursedata\emplist.dat'; input lastname $ 1-20 firstname $ jobtitle $ salary 54-59; run; proc means data=work.mech_pilot n mean; class jobtitle; var salary;run;


40 SAS Syntax Rules SAS statements usually begin with an identifying keyword and always end with a semicolon. data work.mech_pilot; infile 'c:\coursedata\emplist.dat'; input lastname $ 1-20 firstname $ jobtitle $ salary 54-59; run; proc print data=work.mech_pilot; run; proc means data=work.mech_pilot n mean; class jobtitle; var salary; run;

41 Adding a New Variable  Create a new variable by extracting the four-digit year values from the SAS date values.

42 Using an Assignment Statement An assignment statement evaluates an expression and assigns the resulting value to a variable. General syntax of an assignment statement: variable=expression;

43 Using Operators Selected operators for basic arithmetic calculations in an assignment statement:

44 Using SAS Functions A SAS function is a routine that returns a value that is determined from specified arguments. General syntax of a SAS function: function-name(argument1,argument2,...)

45 Using SAS Functions SAS functions: perform arithmetic operations; compute statistics (for example, mean); manipulate SAS dates and process character values; perform many other tasks.
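As a rough illustration of assignment statements, arithmetic operators, and functions working together, the sketch below computes a few values and writes them to the log. The variable names other than YRBEG_SERVICE and all of the literal values are hypothetical; the date constant '05APR1994'd is a SAS feature the slides do not introduce.

```sas
data _null_;
  inservice     = '05APR1994'd;         /* SAS date constant           */
  yrbeg_service = year(inservice);      /* date function: 1994         */
  avg           = mean(10, 20, 60);     /* statistic function: 30      */
  job           = upcase('pilot');      /* character function: PILOT   */
  bonus         = 50000 * 0.05 + 100;   /* arithmetic expression: 2600 */
  put yrbeg_service= avg= job= bonus=;
run;
```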

46 Creating a Vertical Bar Chart Use the GCHART procedure and the VBAR statement to create a vertical bar chart. proc gchart data=work.aircraft; vbar yrbeg_service; title 'Aircraft In Service, by Year'; run;

47 Reading a Subset of Raw Data Use the DATA step that was written earlier. Add a subsetting IF statement to process only the subset in which the value of AGE is at least 15. data work.aircraft; infile 'aircraft.dat'; input @1 model $16. @18 aircraftid $6. @25 inservice mmddyy10. @36 lastmaint mmddyy10.; yrbeg_service=year(inservice); age=year(today())-yrbeg_service; if age>=15; run;
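The effect of the subsetting IF can be tried with a sketch like this one. The data set name WORK.OLD_AIRCRAFT and the records are hypothetical, and only three of the four fields are read to keep it short. Because AGE is computed from TODAY(), which observations survive depends on the date the program is run.

```sas
data work.old_aircraft;
  input @1  model      $16.
        @18 aircraftid $6.
        @25 inservice  mmddyy10.;
  yrbeg_service = year(inservice);
  age = year(today()) - yrbeg_service;
  if age >= 15;   /* subsetting IF: only rows passing the test are output */
  format inservice mmddyy10.;
  datalines;
JetCruise LF5200 030003 04/05/1994
JetCruise LF5100 030005 12/20/1979
;
run;

proc print data=work.old_aircraft;
run;
```

When the condition is false, SAS stops processing that observation and returns to the top of the DATA step without writing it to the output data set.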

48 What Is a SAS Data Library?

49 What Is a SAS Data Library? Regardless of which host operating system you use, you identify SAS data libraries by assigning each one a libref. libref

50 What Is a SAS Data Library? By default, SAS creates two SAS data libraries: a temporary library called WORK and a permanent library called SASUSER.

51 SAS Data Libraries You can think of a SAS data library as a drawer in a filing cabinet and a SAS data set as one of the file folders in the drawer.

52 SAS Data Libraries When you invoke SAS, you automatically have access to a temporary and a permanent SAS data library: WORK (temporary library) and SASUSER (permanent library). You can create and access your own permanent libraries, for example IA (permanent library).

53 Reading a SAS Data Set The SET statement names the input data set and the DATA statement names the output data set. Either one can be a temporary SAS data set or a permanent SAS data set.

54 Two-level SAS Filenames Every SAS file has a two-level name: libref.filename. The first name (libref) refers to the library. The second name (filename) refers to the file in the library. The data set MECH_PILOT is a SAS file in the WORK library.
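The ideas on the last few slides (librefs, two-level names, and reading one SAS data set into another with SET) can be sketched together as follows. The LIBNAME statement is the usual way to assign a libref, although the slides do not show it. Pointing IA at the WORK directory is an assumption made only so the sketch runs anywhere; the data set names and values are hypothetical.

```sas
/* Assumption: IA is pointed at the WORK directory only so this runs
   anywhere. A real permanent library would name a directory that
   persists after the SAS session ends. */
libname ia "%sysfunc(pathname(work))";

data work.mech_pilot;          /* libref WORK, filename MECH_PILOT */
  input lastname $ salary;     /* simple list input for brevity    */
  datalines;
PERRY 52000
WILLIAMS 48000
;
run;

data ia.mech_pilot_copy;       /* DATA statement: output data set  */
  set work.mech_pilot;         /* SET statement: input data set    */
run;

proc print data=ia.mech_pilot_copy;
run;
```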

55 Browsing the Data Portion The PRINT procedure displays the data portion of a SAS data set. By default, PROC PRINT displays all observations, all variables, and an OBS column on the left-hand side.

56 Browsing the Data Portion General form of the PRINT procedure: PROC PRINT DATA=SAS-data-set; RUN; Example: proc print data=work.empdata; run;

57 Objectives Generate list reports using the PRINT procedure. Display selected variables in a list report using the VAR statement. Display selected observations in a list report using the WHERE statement. Sort the observations in a SAS data set using the SORT procedure.

58 Creating a List Report      PROC Step proc print data=work.empdata; var empid salary jobcode; run;

59 Formatting Data Values      proc print data=work.empsort; format salary dollar11.2; run;
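A compact sketch combining the list-report pieces named in the objectives (SORT, VAR, WHERE, and FORMAT) is given below. The three employee records are invented for illustration; WORK.EMPDATA and WORK.EMPSORT reuse the data set names that appear on the slides.

```sas
data work.empdata;
  input empid $ jobcode $ salary;
  datalines;
0031 PILOT 50221.62
0040 FLTAT 23666.12
0071 FLTAT 21957.71
;
run;

proc sort data=work.empdata out=work.empsort;
  by salary;                    /* ascending order by default      */
run;

proc print data=work.empsort;
  var empid jobcode salary;     /* which variables, in which order */
  where jobcode = 'FLTAT';      /* which observations              */
  format salary dollar11.2;     /* how values are displayed        */
run;
```

Using OUT= on PROC SORT leaves the original data set in its original order.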

60 Creating a Frequency Report            PROC Step

61 Creating a Frequency Report The FREQ procedure displays frequency counts of the data values in a SAS data set. General form of a simple PROC FREQ step: PROC FREQ DATA=SAS-data-set; RUN; Example: proc freq data=work.empsort; run;

62 Creating a One-Way Frequency Report Only variables listed on the TABLES statement are included in the frequency counts. These are typically variables that have a limited number of distinct values. General form of a PROC FREQ step: PROC FREQ DATA=SAS-data-set; TABLES SAS-variables; RUN;

63 Calculating Job Code Frequencies        

64 Calculating Salary Frequencies          

65 Calculating Job Code/Salary Frequencies                     

66 Creating a Frequency Report By default, PROC FREQ: analyzes every variable in the SAS data set; displays each distinct data value; calculates the number of observations in which each data value appears (and corresponding percentage); indicates for each variable how many observations have missing values.
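To contrast the default behavior with a TABLES statement, a minimal sketch might look like this. The data set WORK.EMPFREQ and its values are hypothetical.

```sas
data work.empfreq;
  input empid $ jobcode $ salary;
  datalines;
0031 PILOT 50221.62
0040 FLTAT 23666.12
0071 FLTAT 21957.71
0082 PILOT 96387.39
;
run;

/* One-way frequency table for JOBCODE only. Without the TABLES
   statement, EMPID and SALARY would each get a table as well. */
proc freq data=work.empfreq;
  tables jobcode;
run;
```

Restricting the report to JOBCODE avoids long tables for variables such as SALARY, where nearly every value is distinct.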

67 Calculating Summary Statistics The MEANS procedure displays simple descriptive statistics for the numeric variables in a SAS data set. General form of a simple PROC MEANS step: PROC MEANS DATA=SAS-data-set; RUN; Example: proc means data=ia.aircraftcap; run;

68 Calculating Summary Statistics        proc means data=ia.aircraftcap; run;

69 Calculating Summary Statistics By default, PROC MEANS: analyzes every numeric variable in the SAS data set; prints the statistics N, MEAN, STD, MIN, and MAX; excludes missing values before calculating statistics.

70 Selecting Variables proc means data=ia.aircraftcap; var totpasscap; run;

71 Grouping Observations proc means data=ia.aircraftcap maxdec=2; var totpasscap; class model; run;
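Since IA.AIRCRAFTCAP is not available outside the course environment, the sketch below builds a small stand-in in WORK with invented capacities and then runs the same PROC MEANS step against it. The variable names MODEL and TOTPASSCAP follow the slide; everything else is an assumption.

```sas
data work.aircraftcap;
  input model $ totpasscap;
  datalines;
LF5200 207
LF5200 165
SF1000 150
SF1000 172
;
run;

proc means data=work.aircraftcap maxdec=2;
  var totpasscap;    /* analyze only this numeric variable           */
  class model;       /* one set of statistics per value of MODEL     */
run;
```

MAXDEC=2 limits the printed statistics to two decimal places; it does not change the stored values.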

72 Calculating Capacity Statistics for Each Type of Plane          