Chapter 2: Getting Data into SAS

Slides:



Advertisements
Similar presentations
The SAS ® System Additional Information on Statistical Analysis Programming.
Advertisements

The INFILE Statement Reading files into SAS from an outside source: A Very Useful Tool!
Chapter 17 Read Raw Data in Fixed Format using Formatted Input Objectives Distinguish between standard and nonstandard numeric data Read standard fixed-field.
Using Macros and Visual Basic for Applications (VBA) with Excel
Introduction to SAS Programming Christina L. Ughrin Statistical Software Consulting Some notes pulled from SAS Programming I: Essentials Training.
Exploring Microsoft Excel 2002 Chapter 7 Chapter 7 List and Data Management: Converting Data to Information By Robert T. Grauer Maryann Barber Exploring.
SUNY Morrisville-Norwich Campus- Week 7 CITA 130 Advanced Computer Applications II Spring 2005 Prof. Tom Smith.
Chapter 7 Data Management. Agenda Database concept Import data Input and edit data Sort data Function Filter data Create range name Calculate subtotal.
XP New Perspectives on Microsoft Office Excel 2003, Second Edition- Tutorial 11 1 Microsoft Office Excel 2003 Tutorial 11 – Importing Data Into Excel.
Tutorial 11: Connecting to External Data
Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward SAS Essentials - Elliott & Woodward1.
Into to SAS ®. 2 List the components of a SAS program. Open an existing SAS program and run it. Objectives.
Creating SAS® Data Sets
COMPREHENSIVE Excel Tutorial 8 Developing an Excel Application.
Welcome to SAS…Session..!. What is SAS..! A Complete programming language with report formatting with statistical and mathematical capabilities.
11 Chapter 2: Working with Data in a Project 2.1 Introduction to Tabular Data 2.2 Accessing Local Data 2.3 Importing Text Files 2.4 Editing Tables in the.
Chapter 2: Working with Data in a Project
© 2008 The McGraw-Hill Companies, Inc. All rights reserved. ACCESS 2007 M I C R O S O F T ® THE PROFESSIONAL APPROACH S E R I E S Lesson 4 – Creating New.
SAS Workshop Lecture 1 Lecturer: Annie N. Simpson, MSc.
Chapter 10: Working with Large Data Spreadsheet-Based Decision Support Systems Prof. Name Position (123) University Name.
CHAPTER 13 Creating a Workbook Part 1. Learning Objectives Understand spreadsheets and Excel Enter data in cells Edit cell content Work with columns and.
Chapter 20 Creating Multiple Observations from a Single Record Objectives Create multiple observations from a single record containing repeating blocks.
1 Performing Spreadsheet What-If Analysis Applications of Spreadsheets.
Using Advanced INPUT Techniques Peter Cosette Dave Hall Amy Dunn-Ruiz Eric Lyon.
EPIB 698C Lecture 2 Notes Instructor: Raul Cruz 2/14/11 1.
Chapter 1: Introduction to SAS  SAS programs: A sequence of statements in a particular order  Rules for SAS statements: –Every SAS statement ends in.
BMTRY 789 Lecture 2 SAS Syntax, entering raw data, etc. Lecturer: Annie N. Simpson, MSc. Readings – Chapters 1, 2, 12, & 13 Lab Problems 1.1, 1.2, 1.3,
I OWA S TATE U NIVERSITY Department of Animal Science Getting Your Data Into SAS (Chapter 2 in the Little SAS Book) Animal Science 500 Lecture No. 3 September.
® Microsoft Office 2010 Word Tutorial 3 Creating a Multiple-Page Report.
Chapter 17 Creating a Database.
Lesson 2 Topic - Reading in data Chapter 2 (Little SAS Book)
1 Chapter 2: Working with Data in a Project 2.1 Introduction to Tabular Data 2.2 Accessing Local Data 2.3 Accessing Remote Data 2.4 Importing Text Files.
Chapter 4 Creating a Custom Publication from Scratch Microsoft Publisher 2013.
A lesson approach © 2011 The McGraw-Hill Companies, Inc. All rights reserved. a lesson approach Microsoft® Excel 2010 © 2011 The McGraw-Hill Companies,
A Simple Guide to Using SPSS ( Statistical Package for the Social Sciences) for Windows.
Here’s another problem (see section 2.13 on page 54). A file contains two different types of records (say A’s and B’s) and we only want to read in the.
1 Statistical Software Programming. STAT 6360 –Statistical Software Programming Data Input in SAS Many ways to get your data into SAS: –Through data entry.
Chapter 1: Overview of SAS System Basic Concepts of SAS System.
With Microsoft Excel 2007Comprehensive 1e© 2008 Pearson Prentice Hall1 Chapter 4: PowerPoint Presentation GO! with Microsoft Excel ® 2007 Comprehensive.
Microsoft® Excel Create an Excel table. 1 Work with the Table Tools Design tab. 2 Sort and filter records in a table. 3 Identify structured references.
Chapter 18 Reading Free-Format Data. 2 Objectives Read free-format data not recognized in fixed fields. Read free-format data separated by non-blank delimiters,
Chapter 2 Getting Data into SAS Directly enter data into SAS data sets –use the ViewTable window. You can define columns (variables) with the Column Attributes.
Microsoft Office 2013 Try It! Chapter 4 Storing Data in Access.
Lesson 2 Topic - Reading in data Programs 1 and 2 in course notes –Chapter 2 (Little SAS Book)
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 5 & 6 By Ravi Mandal.
Chapter 11 Enhancing an Online Form and Using Macros Microsoft Word 2013.
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 3 & 4 By Tasha Chapman, Oregon Health Authority.
Creating a Workbook Part 1
Excel Tutorial 8 Developing an Excel Application
GO! with Microsoft Office 2016
Tutorial 11: Connecting to External Data
Lesson 2 Topic - Reading raw data into SAS
Lesson 2 Tables and Charts
GO! with Microsoft Access 2016
Introduction to SAS®.
Instructor: Raul Cruz-Cano 7/9/2012
Chapter 3: Working With Your Data
Former Chapter 23: Selecting Efficient Sorting Strategies
Chapter 1: Introduction to SAS
Instructor: Raul Cruz-Cano
Chapter 4: Sorting, Printing, Summarizing
MODULE 7 Microsoft Access 2010
Chapter 7: Macros in SAS Macros provide for more flexible programming in SAS Macros make SAS more “object-oriented”, like R Not a strong suit of text ©
Chapter 9 Lesson 2 Notes.
Microsoft Excel 2007 – Level 2
Microsoft Excel 2007 – Level 1
Spreadsheets and Data Management
Lesson 13 Working with Tables
Introduction to SAS Essentials Mastering SAS for Data Analytics
Microsoft Excel 2007 – Level 2
Presentation transcript:

Chapter 2: Getting Data into SAS Data is stored in many different formats Here are four categories of methods to read in data Entering data directly through the keyboard (small data sets) Creating SAS data sets from raw data lines Book likes direct data entry via Viewtable; I find it cumbersome. See class demos for use of raw data lines. © Fall 2011 John Grego and the University of South Carolina

Getting Data into SAS Here are four categories of methods to read in data Converting other software’s data files (e.g., Excel) into SAS data sets (my favorite!) Reading other software’s data files directly (often requires additional SAS/ACCESS products) © Fall 2011 John Grego and the University of South Carolina

Direct Data Entry with Viewtable window Select Tools → Table Editor, Edit column attributes Enter data File→Save As… © Fall 2011 John Grego and the University of South Carolina

Import Window Allows you to import various types of data files (Microsoft Excel formats) The default is for the first row to be variable names. Use Options… to change. Options… button also selects worksheet from workbook We will step through this using ToughSpreadsheet.sas © Fall 2011 John Grego and the University of South Carolina

Import Window-Libraries Work library--the data set is deleted after exiting SAS Other libraries—data set saved (but not necessarily library location) after exiting SAS © Fall 2011 John Grego and the University of South Carolina

PROC IMPORT Can save PROC IMPORT step used to import data The default generated step includes options discussed in text DBMS=EXCEL may work better than DBMS=XLSX © Fall 2011 John Grego and the University of South Carolina

Reading in Raw Data Type data directly into a SAS program using the statements that appear to the right cards; datalines; lines; Bring up agency.sas © Fall 2011 John Grego and the University of South Carolina

Reading in Raw Data-INFILE If your raw data is an external text file, you could use an INFILE statement to tell SAS where it is. Specify the full path name If your lines are longer than 256 charcters, use LRECL=.. Fall95.sas has an old INFILE statement. LongRecord.sas has an example of LRECL later.

Data Separated by Spaces This style is called “free format” or list input since the number of spaces in between variables is flexible Use the INPUT statement to name variables Include a $ after names of character variables Bring up agency.sas. Again. Add some spaces to show free format still works.

Data Arranged in Columns Knowledge of this approach is particularly critical for large data sets Each value of a variable is found at the same spot on the data line Advantages: Don’t need spacing between values Missing values don’t need a special symbol (can be blank) Show Fall95.sas and David Hitchcock’s example

Data Arranged in Columns-II Advantages: Character data can have embedded blanks You can skip variables you don’t need to read into SAS Example: INPUT var1 1-10 var2 11-15 var3 $ 16-30; Fixed width

Data Not in Standard Format Types of non-standard data: Numbers with commas or dollar signs Dates and times of day We can read nonstandard data using codes known as informats Most informats end in . (so SAS won’t confuse them with a variable) Show ToughSpreadsheet.xlsx or ToughSpreadsheet.txt

SAS Informats Import from Excel often assigns informats automatically Pages 44-45 list many SAS informats (Note that date informats are converted to a numerical value—Julian date) Run ToughSpreadsheet.sas

Other Inputting Issues You can mix input styles: read in some variables list-style; others column-style; others using informats; even the order can be shuffled; E.g, you can explicity move the pointer to a specific column number: @50 moves the pointer to the 50th column I like @x for formatted input. @’charstring’—the book has a fake-y example of this.

Messy Data colon modifier: Tells SAS exactly how many columns a variable occupies, but stops when it reaches a space This method is not appropriate for characters with embedded spaces Example: Deptname :$15. tells SAS to read Deptanme for 15 characters or until it reaches a space (or delimiter character). It handles ragged free format data well.

Multiple Lines of Data per Observation Sometimes each observation will be on several lines in the raw data file (census data, standardized test scores, etc) Use / to tell SAS when to go to the next line Or use #2, for example, to tell SAS to go to the 2nd line of the observation PACT data we worked with had over 2000 columns. Run LongRecord.sas

Multiple Records per Line of Raw Data Sometimes several observations will be on one line of data This is common for textbook exercises Use @@ to tell SAS to keep the pointer on the raw data line and prepare to read the next observation Simple example: data a; input y x @@; 25.1 3 26.2 2 28.4 1 ; run;

Reading Part of a Data File Sometimes we want to modify data input based on values of one variable We can read just the first variable(s) using the @ sign @x, @ and @@ are not closely related, but the text draws a connection Run Fall95.sas and atandatat.sas

Reading Delimited Files These instructions have been almost completely subsumed by the Import Wizard and the widespread use of Excel DLM= specifies alternative delimiters Comma: DLM=‘,’ Tab: DLM=‘09’X #: DLM=‘#’ Skim INFILE material, though we cover it in HW 3. Show INFILE options in David Hitchcock’s example.

Missing values Two delimiters in a row are read as a single delimiter What if two commas in a row indicate a missing value? Or data values contain commas? Can use DSD option Data values with commas must be in quotes

Truncated Records MISSOVER adds missing values at the end of a short record rather than reading more data from the next line TRUNCOVER is used to advance to the next line when the length of a record is shorter than the variable field.

SAS data sets: Temporary and Permanent Data sets stored in Work library are temporary (removed upon exiting SAS) Data sets stored in other libraries are permanent (will be saved upon exiting SAS)

Permanent SAS Data Sets You can specify the library when creating a data set in the DATA step Example: Suppose you have a library called sportlib (this is a “libref”) DATA sportlib.baseball; creates a data set baseball to be stored in the permanent sportlib library

Direct References You can create a permanent SAS data set without creating a library Example: Suppose you want to save baseball.sas7bdat in your 540 directory: DATA “e:\STAT 540\baseball”;

Temporary SAS Data Sets DATA work.baseball; stores baseball in the temporary work library DATA baseball; stores baseball in the default temporary work library Book covers PROC CONTENTS