HRP223 2008 Copyright © 1999-2008 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and.

Presentation transcript:

HRP Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to the maximum extent possible under the law.

HRP Topic 2 – Using EG

At this point you can:
• Start up a project
• Use SAS as a calculator
• Set some configuration options
  – Remember to work in WORK, rather than SASUSER
• Create a library
• Import a dataset
  – into WORK or your custom library
• Subset a dataset
  – You can use data steps, or write or point and click your way to SQL

Working on a Project
• Set up a library to hold your permanent data.
• Import data into that library.
• Look at what you’ve got.
• Check for bad data.
• Subset the data to keep the data you want.
• Make a report.

Make the Library
• Tools menu > Assign Library…
• Review the code (if you want)
• Check the log
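For readers who prefer to type the code, the Assign Library wizard amounts to a libname statement. The sketch below is a minimal example, not the wizard's exact output; the libref day2 matches the one used later in these slides, but the folder path is a made-up placeholder.

```sas
* Assumption: the folder path is hypothetical, so point it at your own data folder;
libname day2 "C:\projects\hrp223\day2";
```

After running it, check the log for a note saying the libref was assigned.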

Write the Import Code
Where is the dataset node in the flowchart? The log is good. It is a bug in EG: the dataset node is not drawn if you use proc import.

• You really want to put the source file in the library.
  – Tweak the code and link the import node to the library.
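A tweaked import step that writes into the library might look like the sketch below. It assumes a comma-separated source file at a hypothetical path and assumes the day2 library already exists; the slides do not say what file type was imported.

```sas
* Assumptions: the file path and the CSV file type are hypothetical;
proc import datafile="C:\projects\hrp223\day2\source.csv"
            out=day2.source   /* write into the day2 library, not WORK */
            dbms=csv
            replace;          /* overwrite day2.source if it already exists */
   getnames=yes;              /* first row holds the variable names */
run;
```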

Files in a Library
• Once a file is in a library, you can access it just like any other file on your computer.

Structure
• If you have a dataset on the left margin of a process flow, you will have a problem in your future.
  – Put every dataset into a library.
  – If your datasets move across machines, you just need to change the one library reference path.
• Add a note (File > New > Note) with information on the origin of every data file and connect it.
  – Include the time, date, and source of the file (titles also help).

Add a Variable
• To add a variable with EG:
  – Select the dataset
  – Choose Filter and Query… from the Data menu
  – Name the query and new dataset
  – Select the current variables (drag and drop to select data)
  – Click Computed Columns
  – Click New, then click Build Expression
  – Fill in the expression and click OK
  – Select the new variable and give it a good name
  – Select the new variable (drag and drop to select data)

Calculate Stuff
• Calculate the discounted price and then get some descriptive statistics on the new values.
  – Either reopen the previous filter and add in the formula there, or just make a new data set by filtering the previously created data set.
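The Filter and Query task writes SQL behind the scenes. A hand-written version of a computed column could look roughly like this sketch; the variable name price, the 10% discount, and the output names work.discounted and sale_price are all assumptions made for illustration.

```sas
* Assumptions: price, the 10 percent discount, and the new names are hypothetical;
proc sql;
   create table work.discounted as
      select *,
             price * (1 - 0.10) as sale_price   /* new computed column */
      from day2.source;
quit;
```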

Click on the data set to analyze, or choose it from the list. The procedures shown are Proc Means and Proc Univariate.
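If you would rather type the procedures than use the menus, a minimal sketch follows. It assumes a dataset named work.discounted containing a numeric variable sale_price; both names are hypothetical.

```sas
* Assumption: work.discounted and sale_price are hypothetical names;
proc means data=work.discounted n mean std min max;
   var sale_price;
run;

proc univariate data=work.discounted;
   var sale_price;
   histogram sale_price;   /* needs enough observations to be useful */
run;
```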

Procs or Menu Items
• Use the task list (right side of the screen), organized by task name, to look up the procedure that goes with a menu item. If you are told to use a procedure, you can find the corresponding menu item the same way.

Not enough data for a useful histogram.
Be glad you did not need to memorize this stuff.

Looking at Categorical Data
• In this source file we have a categorical “tour” variable. What are its values?
• Use the Describe > One-Way Frequencies menu option to see the categories. Drag Tour from the left pane and drop it into the Analysis variables group.

Proc Freak
• The procedure that does frequency counts is proc freq (pronounced freak). It is very important to learn because it does the core categorical analysis for basic epidemiological studies.
• The EG code is:

    PROC FREQ DATA=day2.source;
      TABLES Tour;
    RUN;

• This could be simplified.

The Levels
• You have already seen how to subset a dataset using the GUI and SQL.
• What if you want to subset into 3 different data sets? You could do a lot of pointing and clicking or write a little program.

Splitting in a Data Step
• All data steps begin with the data statement.
• Most have a set statement saying where the data is coming from, and they should end with a run statement.

    * A list of what data sets to make;
    data fj12 ps27 sh43;
      * based on what file? ;
      set day2.source;
      * Check the value of tour and if TRUE output;
      if tour = "FJ12" then output fj12;
      if tour = "PS27" then output ps27;
      if tour = "SH43" then output sh43;
      return; * This line is optional;
    run;
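One possible variation, not taken from the slides, chains the tests with else. Because a row can match only one tour code, SAS can skip the remaining comparisons once a match is found. The dataset and variable names are the ones used above.

```sas
* Variation on the split above, assuming each row has exactly one tour code;
data fj12 ps27 sh43;
   set day2.source;
   if tour = "FJ12" then output fj12;
   else if tour = "PS27" then output ps27;
   else if tour = "SH43" then output sh43;
run;
```

Rows with any other tour value are written to none of the three data sets, exactly as in the original version.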

What is a statement?
• A statement is a single instruction beginning with a keyword and ending in a semicolon.
• You can use white space and new lines to make them easier to read.
  – Look back at the proc sql statements you have seen and notice where the semicolons are. SQL statements are LONG.
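As an illustration of where the semicolons fall, here is a sketch of a short SQL subset. The output name work.fj12_only is an assumption; the tour value comes from the splitting example in these slides.

```sas
proc sql;                           /* statement 1 ends here */
   create table work.fj12_only as   /* statement 2 starts here ... */
      select *
      from day2.source
      where tour = "FJ12";          /* ... and ends four lines later */
quit;                               /* statement 3 */
```

The whole create table query is one statement, so it has exactly one semicolon, at the very end.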

How does a Data Step work?
• The data statement says make this (or these) data set(s).
  – SAS then reads every line down to the run statement and gathers a list of all variables used. This list is called the program data vector (PDV).
  – It then sets all the variables to missing.
  – It then does the instruction listed on each line of the data step program in the order that the lines are written.

How SAS Processes a Dataset
• When you create a SAS data set, SAS does the following things:
  1. SAS reads every line down to the run statement and gathers a list of all variables used.
     – This list is called the program data vector (PDV).
  2. It sets the values in the PDV to missing.
  3. Then it does all the instructions you tell it to do, in the order you have written them.
  4. Then it writes all the variables out to the new dataset.
  5. It then repeats the process if there is more data.
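One way to watch this loop in action is to print the PDV to the log on every pass. The sketch below assumes the day2.source dataset from earlier slides; the obs=3 option, which limits the step to the first three rows, is added here only to keep the log short.

```sas
* data _null_ runs the step without creating a new dataset;
data _null_;
   set day2.source (obs=3);   /* load one row per pass, first three rows only */
   put _all_;                 /* write every variable in the PDV to the log */
run;
```

The log should show one line per pass, including the automatic variables _N_ and _ERROR_ that SAS keeps in the PDV.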

How SAS Processes a Dataset (2)
• In the example below, SAS will look in the existing dataset called Teletubbies and it will find two variables, teletubby and thing. Then it will find the variable called kid.
• Then it will do the instructions in order.

    data Teletubbies2;   *name of a new data set;
      set Teletubbies;   *load 1 observation of data;
      kid = "Andrew";    * fill in the blank;
      output;            *write the variables to teletubbies2;
      return;            *return to the top of the step;
    run;                 *end of these instructions;

The Set Statement

    set Teletubbies;

• This line tells SAS to load one row of data from the data set Teletubbies into the PDV. The first time this line is run, the first row of data is loaded into the PDV.
• When there is no more data to load, the data step is done.

Variable Assignment
• In the example, the word Andrew is assigned to the variable kid. All variables are assigned from the right side into the variable named on the left.

    kid = "Andrew";

• If a variable appears on both the left and right side of an equal sign, the expression on the right is evaluated using the original value and the result is then written to the variable on the left.

    aNumber = aNumber + 4;

  Assignment goes from right to left: the aNumber on the right holds the original value, and the aNumber on the left receives the new value.
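A small self-contained check of the right-to-left rule is sketched below. The starting value of 10 is an arbitrary choice for illustration.

```sas
data _null_;
   aNumber = 10;            /* original value, chosen arbitrarily */
   aNumber = aNumber + 4;   /* right side uses 10, left side receives 14 */
   put aNumber=;            /* writes aNumber=14 to the log */
run;
```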

How SAS Processes a Dataset (3)
• If you do not include the output and return statements, SAS will do them automatically. So, the previous data step would typically be written like this.

    data Teletubbies2;
      set Teletubbies;
      kid = "Andrew";
    run;

Test Your Understanding (2)

    data test3a test3b;
      set source;
      if isMale = 1 then output test3a;
      hasCancer = 1;
      output test3b;
    run;

Adding Variables with Code
• Creating a new variable is simple. All you need to do is give the bit of data a name and give it a value. The variable being assigned information is always on the left side of the equal sign. Actually, it is too easy, because it lets you miss bugs.

    data day2.paid;
      set day2.source;
      didpay = 'nope';
    run;

What is a bug anyway?
• When you write a program and it doesn’t work the way that you intended, it is described as having a bug.
• There are many types of bugs. Syntax and semantic errors are relatively easy to find and fix. When these errors happen, SAS cannot figure out what you want done. Conceptual errors happen when SAS understands the words you give it but it does not do what you intended. These can be very, very hard to find and fix.
• Spotting syntax bugs is easy. You just need to look in the SAS log.

What is a bug anyway? (2)
• You will look in the log window to find out if SAS found any syntax errors.

    * oops forgot the "then";
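The code that went with this slide is not in the transcript, so the following is only a guess at the kind of mistake it showed: an if statement missing its then. The dataset and tour value are borrowed from the splitting example.

```sas
* Broken on purpose: SAS stops at this step and writes an ERROR to the log;
data work.fj12;
   set day2.source;
   if tour = "FJ12" output fj12;   * oops forgot the "then";
run;

* Fixed version;
data work.fj12;
   set day2.source;
   if tour = "FJ12" then output fj12;
run;
```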

Uninitialized Bug
• If you are trying to assign a string of letters to a character variable but you forget to quote the string, SAS thinks it is a variable, but that variable never gets a value.

    data day2.paid;
      set day2.source;
      didpay = nope;
    run;

Uninitialized Bug (2)
• If empty variables show up in your datasets (usually with names that look like typos or spelling mistakes), you probably made this mistake.
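For the unquoted example above, the log contains the line "NOTE: Variable nope is uninitialized." and the output dataset gains an empty variable named nope. A sketch of the repair is simply to quote the string:

```sas
* Quoting the value makes it a character string instead of a variable name;
data day2.paid;
   set day2.source;
   didpay = "nope";
run;
```

Searching the log for the word uninitialized is a quick way to catch this kind of bug.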

Parts of a SAS Dataset
• You have seen how to browse a SAS dataset like a spreadsheet. There are two parts of a dataset which you do not see when you browse the data.
  – There is a section that acts like a dictionary, which has a description of the data set, including, among other things, the types of variables (character or numeric) and when the data set was created.
  – There is sometimes a section that has “index” information. You can create an index to help speed up processing of huge files.

Seeing the Details with EG

By Position
• Knowing the variables’ order can help you do complex things.

If you want to code…
• You can see the dictionary of attributes by typing a proc contents step in a code window:

    proc contents data=teletubbies;
    run;

• To get the variables in their stored order, use:

    proc contents data=teletubbies position;
    run;

Running it All
• If you return to this project later and want to rerun the code, keep in mind that you can right-click on the left-most node and choose Run Branch.