Using Advanced INPUT Techniques
Peter Cosette, Dave Hall, Amy Dunn-Ruiz, Eric Lyon


Advanced Input Outline
  Handling Missing Values and Short Data Lines
  Detecting the End of Files
  Advanced Options for Reading Data Files
  Reading Data Conditionally
  The Single and the Double Trailing @
  Using Variable and Informat Lists
  Creating Multiple Obs. from One Line of Input
  Relative Column Pointers

LOG: "NOTE: SAS went to a new line when INPUT statement reached past the end of a line."
Solutions:
  MISSOVER: sets the remaining variables to missing when the INPUT statement lists more variables than the line has data values.
  PAD: for column input, pads each line with blanks up to the record length (256 bytes in the slides).
  TRUNCOVER: good for reading variable-length records with missing data.
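As a minimal sketch of the MISSOVER fix (the data set name, variable names, and values are made up for illustration), the third data line below is one value short. Without MISSOVER, SAS would go looking for Score3 on the next line; with it, Score3 is set to missing and the record is kept.

```sas
/* Hypothetical data: the last line has only two of the three scores. */
data scores;
   infile datalines missover;   /* missing values instead of jumping to the next line */
   input Name $ Score1 Score2 Score3;
   datalines;
Ann 90 85 77
Bob 88 92 81
Cal 70 65
;
run;

proc print data=scores;
run;
```

For an external file, the same option goes on the INFILE statement that names the file. With column or formatted input, TRUNCOVER or PAD is often the better fit there.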

Inputting Data with Missing Values (missing.txt): use MISSOVER

Inputting Data with Missing Values (short.txt): 001Amy Dunn-Ruiz Peter Cossette David Hall Eric Lyon. Use PAD

END=Last Option
  Last is a temporary variable that is false (0) until the last record of the external file is read, and then becomes true (1).
  END= is also a SET statement option, used to detect when you are reading the last observation in a SAS data set.

Data Example for END=Last and OBS: days.txt
  Sun 0
  Mon 0
  Tue 8
  Wed 2
  Thu 10
  Fri 3
  Sat 1
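One way END=Last might be used with days.txt is sketched below. It assumes the file sits in the current directory with one day and one count per line; the names Day, Count, and Total are illustrative. It reads the external file because END= is intended for external files rather than in-stream DATALINES.

```sas
/* Assumption: days.txt is in the current directory, one "Day Count" pair per line. */
data total;
   infile 'days.txt' end=last;   /* last is 0 until the final record is read */
   input Day $ Count;
   Total + Count;                /* sum statement: Total is retained across records */
   if last then output;          /* write a single observation after the last record */
   keep Total;
run;

proc print data=total;
run;
```

Because of the IF statement, the data set gets one row, written only when the final record has been read.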

OBS and FIRSTOBS
  No matter how many records are in the file, obs=3 reads only the first 3.
  Use FIRSTOBS with OBS to read selected records from the middle of a file: firstobs=4 obs=7 reads records 4 through 7 (OBS= gives the number of the last record to read, not a count).
  Useful for trying out code on a small part of a large data set.
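A short sketch of the two options together, again assuming days.txt is in the current directory with one record per line (Day and Count are illustrative names):

```sas
/* Read only records 4 through 7 of the file. */
data middle;
   infile 'days.txt' firstobs=4 obs=7;
   input Day $ Count;
run;

proc print data=middle;
run;
```

With the seven-line file shown earlier, this would pick up the records from Wed through Sat.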

Reading in Multiple Files
  The Wildcard
  End=Finished
  Filename
  Dummy Variable

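The ? Wildcard
If you have multiple files with similar names, such as Person1, Person2, Person3, ...:
  data wild;
     infile 'e:\files\info\person?.txt';
     input Name Style Letter;
  run;
The ? matches exactly one character. Use the * wildcard to match any number of characters, for example when the file numbers run to more than one digit. (Variables that hold text need a $ after their name on the INPUT statement.)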

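End=Finished
  data lotsofiles;
     if finished = 0 then infile 'grover.txt' end=finished;
     else if finished2 = 0 then infile 'bigbird.txt' end=finished2;
     else infile 'snuffy.txt';
     input Name Style Place;
  run;
Use this to read in one file; when that's done, it moves on to read in the next file, and so on. Every file except the last needs its own END= variable (finished2 is a name added here for the second file).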

Filename
  filename super ('grover.txt' 'oscar.txt' 'bert.txt');
  data lotsofiles;
     infile super;
     input Name Place Style;
  run;
We called our fileref super; you can name it anything you want. This is a little less typing than the End=Finished method.

Dummy with External File
  data lotsofiles;
     infile 'e:\super.txt';
     input External $30.;
     infile dummy filevar=External end=Last;
     do until (last);
        input Name Place Style;
        output;
     end;
  run;
e:\super.txt contains a list of all of the filenames we want to read in. This is easier to use than End=Finished or Filename when you have LOTS of files to read in.

Dummy with Datalines
  data lotsofiles;
     input External $30.;
     infile dummy filevar=External end=Last;
     do until (last);
        input Name Place Style;
        output;
     end;
  datalines;
  e:\wilma.txt
  e:\betty.txt
  e:\fred.txt
  e:\barney.txt
  etc.....
  ;
  run;
Same as the previous slide, but the list of filenames comes from datalines rather than an external file.

Reading Multiple Lines of Data
  Use the line pointers #1, #2, #3, etc. to name the lines, and SAS reads them in together as one observation.
  Or use a slash (/) to indicate where the lines are separated (though this is not always clear; the author doesn't like this technique).
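A small sketch of the line-pointer approach, using invented data in which each person takes up three lines (all names and values are illustrative):

```sas
/* Each observation is spread over three input lines. */
data people;
   input #1 Name $ Age
         #2 City $
         #3 Height Weight;
   datalines;
Ann 34
Boston
65 130
Bob 41
Denver
70 180
;
run;

/* The slash form of the same INPUT statement would be:
   input Name $ Age / City $ / Height Weight;            */

proc print data=people;
run;
```

The #n form states which line each variable comes from, which is part of why it tends to be easier to read than the slash form.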

Data in Columns HondaCivic Ford Focus BMW X

Mixing Record Types with Conditional Input
  Use this when some records in the data have more variables than others.
  Using the @ sign fixes the errors; @n is the absolute column pointer.

Mixing Record Types with Conditional Input star89.txt
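The contents of star89.txt are not shown here, so the sketch below uses an invented layout instead: a record type in column 1, a name in columns 3-10, and, for type A records only, a score in columns 12-14. It reads the type with an absolute column pointer, holds the line, and then chooses the INPUT statement that matches the record.

```sas
/* Hypothetical layout: col 1 = Type, cols 3-10 = Name, cols 12-14 = Score (type A only). */
data mixed;
   input @1 Type $1. @;                     /* read the type and hold the line */
   if Type = 'A' then input @3 Name $8. @12 Score 3.;
   else input @3 Name $8.;                  /* type B records have no score */
   datalines;
A Ann      85
B Bob
A Cal      90
;
run;

proc print data=mixed;
run;
```

Type B records end up with a missing Score rather than causing SAS to read into the next line.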

More @ Sign: For Use as a Filter
  A trailing @ tells SAS not to move on after reading in Year; it lets the IF statement run rather than going on to the next line.
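A minimal sketch of the filter idea with made-up data: Year is read first, the line is held, and the rest of the record is read only when the year is the one wanted.

```sas
/* Keep only the 1989 records; the other lines are never fully read. */
data only89;
   input Year @;                  /* trailing @ holds the line */
   if Year ne 1989 then delete;   /* drop the record and start the next iteration */
   input Name $ Score;            /* continues on the held line */
   datalines;
1988 Ann 71
1989 Bob 85
1989 Cal 90
;
run;

proc print data=only89;
run;
```

On large files this can save time, since the unwanted records are discarded after reading a single field.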

The Double Trailing @@
  Use it when there are, for example, only 2 variables but more than 2 values on the data line.
  Normally SAS won't read all the data on that line; it moves to the next line after inputting the first 2 values.
  @@ makes everything nice again, UNLESS THERE'S A MISSING VALUE... THEN IT WON'T WORK.
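A short sketch with invented values: two variables, but several pairs of values on each data line. The double trailing @@ holds the line across iterations of the DATA step, so each pair becomes its own observation.

```sas
/* Five X/Y pairs spread over two lines. */
data pairs;
   input X Y @@;   /* keep reading from the same line until it runs out */
   datalines;
1 2 3 4 5 6
7 8 9 10
;
run;

proc print data=pairs;
run;
```

If one value were left out of a line, every later pair would shift by one position, which is the reason for the caution about missing values.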

Using Variable & Informat Lists
  You can supply a single informat to a list of variables and save typing:
  input (FirstName LastName MT1-MT3) (2*$10. 3*$2.);
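As a smaller illustration of the same idea, the sketch below applies one informat to five variables. The layout is invented: a 3-character Id followed by five single-digit answers with no separators.

```sas
/* One informat (1.) is applied to each of Q1 through Q5 in turn. */
data quiz;
   input Id $3. (Q1-Q5) (1.);
   datalines;
00113452
00254321
;
run;

proc print data=quiz;
run;
```

The variable list and the informat list each go in their own parentheses; when the informat list is shorter than the variable list, SAS reuses it from the start.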

Using Relative Column Pointers to Read a Complex Data Structure Effectively
  The + sign is a relative column pointer: +n moves the pointer n columns to the right.
  Wt3 3. etc…….
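A brief sketch of relative column pointers with a made-up layout: a 3-character Id followed by three 3-digit weights, each separated by one blank column. The +1 skips the blank column before each weight.

```sas
/* Id in cols 1-3, then Wt1-Wt3 as 3-digit fields with one blank between them. */
data weights;
   input Id $3. +1 Wt1 3. +1 Wt2 3. +1 Wt3 3.;
   datalines;
001 150 148 145
002 200 195 190
;
run;

proc print data=weights;
run;
```

The pointer can also be placed inside an informat list, for example `input Id $3. +1 (Wt1-Wt3) (3. +1);`, which is a handy shorthand when the repeating block is long.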

Summary of Key Terms
  MISSOVER: used to handle data with missing values
  PAD: used to handle column data with missing values
  END=Last: used to detect the end of a data file
  OBS: used to read the first n obs. from a data file
  FIRSTOBS: sets the first obs. read from a data file
  FILEVAR: used to specify the filename
  @: used to "hold the line" for conditional input
  Double @@: use with caution!
  Informat lists and relative column pointers: useful!