Quantify the Example Data
First, code and quantify the data (assign column locations and variable names). Use the sample data to create a data set from the first 10 counties. Include: ID, County, Number of reporting units (v1), Number of employees (v2), Payroll (v3). Save to your flash drive as ‘countydata’.

The SAS ® System Statistical Analysis Programming

Introduction to SAS ®
Arguably the most popular computer software for conducting statistical data analysis. Does both data management and statistical analysis. Useful for managing even the most complex data sets. Operates on its own language.

Introduction to SAS ® Open the SAS ® Window

Introduction to SAS ®
You essentially have 4 windows within SAS: the Explorer sidebar window, the Log window, the Editor window, and the Output window. You can resize and reconfigure these windows, and minimize and maximize them as you would in any Windows-based program.

Introduction to SAS ®
The Editor window is for constructing and running programs. “Programming” in SAS involves writing out step-by-step instructions, in the correct order, in a format the SAS System can understand. The program you write must be perfect; if it is not, SAS will give you error messages in the Log window.

SAS ® Programming
Three major components to most SAS programs: input, manipulation, and output.

SAS ® Programming: Input
Most of the time, data are placed into a data file and read into the program. The program tells the system which variables are located in which columns.

SAS ® Programming: Input Input data & column locations

SAS ® Programming: Manipulation
Data are then manipulated to accomplish the tasks for which the program was written: transforming or combining variables, or conducting statistical or other analyses.

SAS ® Programming: Manipulation Manipulate the Data

SAS ® Programming: Output
A program produces program output and a log. The results of the program are written to the Output window. You must save these results.

SAS ® Programming: Output

SAS ® Programming: Log

SAS ® Programming: Basic Input Statement (the “DATA Step”)
Begins with an “options” statement that formats what the output page will look like. Names the temporary data set: “data1,” “data2,” etc., or a text name (no spaces; up to 32 characters in current SAS versions, 8 in very old releases). Tells SAS where to find your actual data set (the file location). Gives the “input,” that is, the column locations for your variables.

SAS ® Programming: Input Options Temporary Data Set Data Location Input Column Locations

SAS ® Programming: Basic Input Statement
After your input statement, you add statements to transform or manipulate the data, then statements to perform analysis procedures. The program ends with a RUN statement.

SAS ® Programming: Input Data Manipulations & Transformations Analysis Procedure

SAS ® Programming: Syntax
SAS statements are commands or instructions that can be interpreted by the SAS system. These commands appear as blue text in the Enhanced Editor window: DATA, PROC, PUT, INPUT, RUN, etc.

SAS ® Programming: Syntax
Every SAS statement must end in a semicolon (;). This is how the system knows the statement is complete. One of the most common errors is omitting semicolons. Comments begin with an asterisk (*) and, like any other statement, end with a semicolon.

SAS ® Programming: Syntax
In the Enhanced Editor, plain text is black, numerical values are teal, SAS statements are blue, and errors are red. Basic arithmetic operators can be used (+, -, *, /).

SAS ® Programming: Logical Operators
Symbol        Abbreviation   Operation
=             eq             equal to
^= or ~=      ne             not equal to
>             gt             greater than
<             lt             less than
>= or =>      ge             greater than or equal to
<= or =<      le             less than or equal to
&             and            and
|             or             or

Building a SAS ® Program
1. Open the SAS program and click inside the Editor window.
2. Add your “options” statement: options nocenter nonumber nodate linesize=88 pagesize=72;
3. Add the “data” statement, then the name of your first temporary data set (data1).

Building a SAS ® Program

4. Add the “infile” statement, then the file location where your data is stored.
5. Add the “input” statement, then each variable name followed by its column location. A dollar sign ($) after a variable name signifies that the variable is character (text) data. It is recommended that you enter data in 80-column lines; #2 in the input statement signifies the start of a new line.
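Steps 2 through 5 might be put together roughly as follows. This is only a sketch: the county names, the values, and the column locations are made up for illustration, and INFILE DATALINES stands in for the quoted file path a real program would use, so that the example runs without an external file.

```sas
* Step 2: page-formatting options from the slide;
options nocenter nonumber nodate linesize=88 pagesize=72;

* Step 3: name the temporary data set;
data data1;
  * Step 4: a real program would put a quoted file path here;
  infile datalines;
  * Step 5: variable names and assumed column locations;
  * The dollar sign marks county as character data;
  input id 1-2 county $ 4-11 v1 13-15 v2 17-20 v3 22-26;
  datalines;
01 CountyA   12  150  3200
02 CountyB    4   60   900
;
run;

* Print the data set to confirm the columns were read correctly;
proc print data=data1;
run;
```

Because column input reads fixed positions, the data lines must stay aligned with the column ranges named in the INPUT statement.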

Building a SAS ® Program

6. Add statements for data management or statistical analysis. SAS statements vary based on the task to be accomplished. Data management: create new variables, change values, etc. Statistical procedures: frequencies, correlations, crosstabulations, regression, etc.

Building a SAS ® Program

Hands-On Exercise 1: Build a Basic SAS Program
Using SAS, write a basic program for the county data set you created. For your analysis, run a “print” command: Proc Print; var county v1 v2 v3;

Exercise 1

SAS ® Procedures: PROC Commands
SAS procedures that perform different operations are called with “PROC” commands. There are many different PROC commands; we’ll touch on a few of the most used. Some are for data management, some for statistical analysis.

SAS ® Procedures: PROC PRINT
Prints the data you have in your temporary SAS data set. Will print the variables you designate (either those from your initial INPUT statement, or variables you create). Helps you better understand your data set and helps you spot errors.

SAS ® Procedures: PROC PRINT
Proc Print; var v1 v2 v3; This statement tells SAS to print the data for v1, v2, and v3. If you run “PROC PRINT” without any variables designated, it will print ALL of your variables.

SAS ® Procedures: PROC PRINT
You should run a proc print when you transform variables or create new variables, to ensure that the transformations were done correctly. Example: create a new variable by adding two others: newvar = v1+v2; Proc print; var v1 v2 newvar; Check the output to ensure that the operation is correct.
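The check described above could look like the following sketch. The two data rows are invented placeholder values, not the real county figures; only the variable names v1, v2, and newvar come from the slides.

```sas
data data1;
  input id county $ v1 v2 v3;
  * New variables are created inside the DATA step;
  newvar = v1 + v2;
  datalines;
1 CountyA 12 150 3200
2 CountyB 4 60 900
;
run;

* Print the inputs next to the result to check the arithmetic by eye;
proc print data=data1;
  var v1 v2 newvar;
run;
```

Naming the data set with data= is optional here, since PROC PRINT otherwise uses the most recently created data set, but it makes the intent explicit.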

Variable Manipulations
SAS will permit you to perform many different types of variable manipulations. Add variables: newvar1 = v1+v2+v3; Subtract variables: newvar2 = v3 - v2;

Variable Manipulations
Multiply variables: newvar3 = v2 * v3; Divide variables: newvar4 = v2/v1; More complex transformations can be done following the basic rules for arithmetic operations: newvar5 = (v1+v2/v3)*4;

Variable Manipulations
You can also use your new variables in other transformations: newvar6 = newvar4*newvar5; You can also create categorical variables by reformatting your data into new variables. If you have a survey question with responses showing ‘year of birth,’ you can convert it to ‘age.’

Variable Manipulations

For example, if you have a series of data for a variable: Variable name: “vexample” Values: We want to create a categorical variable with the categories and corresponding values of: Low = 1 Medium = 2 High = 3

Variable Manipulations Give your new variable a name like “newvexample” or “vexamplecat” Your new categorical variable would be created with this if/then syntax:

Variable Manipulations If your data is not as simple as and so on, you can use the “PROC SORT” command to help you sort your data set

Variable Manipulations Run a PROC SORT for v2, and then run a PROC PRINT to show the variable rearranged in ascending order
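A minimal sketch of that sort-then-print sequence, using made-up rows in place of the real county data:

```sas
data data1;
  input id county $ v1 v2 v3;
  datalines;
1 CountyA 12 150 3200
2 CountyB 4 60 900
3 CountyC 30 720 15800
;
run;

* Sort the data set in place by v2, ascending by default;
proc sort data=data1;
  by v2;
run;

* The printed rows now appear in ascending order of v2;
proc print data=data1;
  var county v2;
run;
```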

Variable Manipulations

Now, create a new variable “newv2” with the following categories:
Low = 1 (values less than 100)
Medium = 2 (values 100 to 500)
High = 3 (values more than 500)
Run a PROC PRINT and PROC FREQ to check your transformations.
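One way to write this recode is an IF/THEN/ELSE chain, sketched below with invented data rows. The cutoffs are the ones listed above; the explicit missing-value check is an addition, included because SAS treats a missing value as smaller than any number and would otherwise code it as Low.

```sas
data data1;
  input id county $ v1 v2 v3;
  * Low = 1 under 100, Medium = 2 from 100 to 500, High = 3 over 500;
  if v2 = . then newv2 = .;
  else if v2 < 100 then newv2 = 1;
  else if v2 <= 500 then newv2 = 2;
  else newv2 = 3;
  datalines;
1 CountyA 12 150 3200
2 CountyB 4 60 900
3 CountyC 30 720 15800
;
run;

* Compare each original value with its assigned category;
proc print data=data1;
  var county v2 newv2;
run;

* Count how many counties fall into each category;
proc freq data=data1;
  tables newv2;
run;
```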

Variable Manipulations

IF/THEN Statements
In the previous exercise, you saw how if/then statements can be used to create new variables. If/then statements are very powerful and can be used in a number of ways to help you manage your data.

IF/THEN Statements: Segmenting Data Sets with the IF Statement
The SAS “IF” command can be used to segment or partition your data set. For example, suppose you only want to examine certain cases in your data set: only females, only people over age 55, only Florida counties with populations greater than 500,000, etc.

IF/THEN Statements
You can segment in this way using the IF statement. Suppose we only want to examine the number of reporting units in our sample for counties with a “low” number of employees. “If newv2 is low” looks like this in SAS language: IF newv2=1;

IF/THEN Statements

Combining IF statements to segment data sets with the DATA command
It is very useful to combine the IF command, which segments data, with the DATA command we learned earlier. Recall that your initial data step started with the command: data data1; This created the initial temporary SAS data set.

IF/THEN Statements
The temporary data set “data1” contained all of the cases that you entered into your data set. If you now want to examine only a subset of those cases, you can do that in a second data set: data data2; set data1; This creates a second temporary data set called “data2” (remember that SAS allows a large number of data sets).

IF/THEN Statements
We can now use an IF statement to segment the data in our set “data2.” Let’s create a second data set that includes only counties with a “medium” number of employees. Run a PROC PRINT to check the output.
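A sketch of that second data set follows. The rows and their newv2 codes are illustrative placeholders; the code 2 for “medium” follows the category scheme used earlier.

```sas
data data1;
  input id county $ v2 newv2;
  datalines;
1 CountyA 150 2
2 CountyB 60 1
3 CountyC 720 3
4 CountyD 310 2
;
run;

* Copy data1 into data2, keeping only the medium counties;
data data2;
  set data1;
  if newv2 = 2;
run;

* Only the rows with newv2 equal to 2 should appear;
proc print data=data2;
run;
```

An IF statement with no THEN acts as a filter: rows that fail the condition are simply not written to the new data set, and data1 itself is left unchanged.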

IF/THEN Statements

The PROC PRINT shows us that the temporary data set we’re now dealing with has only the 5 counties with a “medium” number of employees

IF/THEN Statements: Hands-On Exercise
Use the commands we’ve just learned to:
1. Create a new variable for high, medium, and low payroll amounts (newv3).
2. Use the DATA and IF statements to create a new data set that contains only those counties with the highest payroll for gasoline service stations. Run a PROC PRINT to check your results.

IF/THEN Statements

The IF and THEN commands are most often used together with the operators we talked about before

SAS ® Programming: Logical Operators
Symbol        Abbreviation   Operation
=             eq             equal to
^= or ~=      ne             not equal to
>             gt             greater than
<             lt             less than
>= or =>      ge             greater than or equal to
<= or =<      le             less than or equal to
&             and            and
|             or             or

IF/THEN Statements: More Complex IF Statements
Multiple conditions can be connected using “and” or “or” to make more complex statements: if v1 eq 2 or v2 gt 5 and v3 ne 2 then newvar = 1; SAS evaluates “and” before “or,” so use parentheses to make the intended grouping explicit.

IF/THEN Statements: Using IF and THEN
The general form of this command (for creating new variables, separating data sets, etc.) is: IF a condition on a variable holds (written with an operator abbreviation: eq, ne, gt, lt, ge, le) THEN assign a value to the new variable. For example: IF v2 eq 5 then newv2 = 1; Again, you can combine conditions for more complex statements.
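The effect of grouping can be seen in a small sketch. The variable names newvar_a and newvar_b and the data rows are hypothetical; the first statement spells out how SAS reads the unparenthesized condition, and the second shows the other possible grouping.

```sas
data check;
  input v1 v2 v3;
  * Same meaning as the statement without parentheses;
  if v1 eq 2 or (v2 gt 5 and v3 ne 2) then newvar_a = 1;
  * A different grouping that can give a different answer;
  if (v1 eq 2 or v2 gt 5) and v3 ne 2 then newvar_b = 1;
  datalines;
2 1 2
1 9 3
1 1 1
;
run;

* Rows where the two groupings disagree show a missing value in one column;
proc print data=check;
run;
```

When neither condition is met the new variable is left missing, so an ELSE statement would be needed if every row should receive a value.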

IF/THEN Statements

Add Variables & Cases Two other important data management functions that SAS can perform are adding additional cases or observations and adding new variables

Add Variables & Cases: Adding Cases
The term for adding cases or observations is “concatenation.” This allows you to add new cases to the bottom of your existing data set. You simply create a second data set and add it to your initial data set.
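Concatenation is typically done by listing both data sets in one SET statement, as in this sketch. The rows are invented; data3 is used as the name of the combined set.

```sas
* Initial data set;
data data1;
  input id county $ v1 v2 v3;
  datalines;
1 CountyA 12 150 3200
2 CountyB 4 60 900
;
run;

* Additional cases, with the same variables;
data data2;
  input id county $ v1 v2 v3;
  datalines;
3 CountyC 30 720 15800
4 CountyD 9 310 6100
;
run;

* Stack data2 underneath data1;
data data3;
  set data1 data2;
run;

proc print data=data3;
run;
```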

Add Variables & Cases Initial Data Set Additional Cases Merged Set

Add Variables & Cases: Hands-On Exercise
You have already created one data set of 10 counties.
1. Create a new data set containing information for the next 4 counties (Collier, Columbia, De Soto, and Dixie).
2. Add these cases to your existing data set.
3. Do a PROC PRINT for data3 to verify.

Exercise

Add Variables & Cases: Adding Variables
Adding variables to your existing data is simple as well. Again, you will need to create a second data set that will essentially add a column or columns to your initial data set. The second data set will contain the new variable you are adding and one variable that exactly matches a variable in your initial data set, usually the sequential ID number (similar to Access).

Add Variables & Cases
To make sure that the data sets are properly combined, you must SORT both the initial and the second data set by the matching variable. The syntax looks like this:
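A sort-then-merge sequence along these lines is one common way to do it. In this sketch the new variable v4 and all of the values are hypothetical, and id is assumed to be the matching variable.

```sas
* Initial data set, deliberately out of order;
data data1;
  input id county $ v1;
  datalines;
2 CountyB 4
1 CountyA 12
;
run;

* Second data set with the matching id and the new variable;
data data2;
  input id v4;
  datalines;
1 7
2 3
;
run;

* Both data sets must be sorted by the matching variable;
proc sort data=data1;
  by id;
run;

proc sort data=data2;
  by id;
run;

* Match rows on id and combine their columns;
data data3;
  merge data1 data2;
  by id;
run;

proc print data=data3;
run;
```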

Add Variables & Cases Initial Data Set Added Variables Merge

SAS ® Statistical Procedures: Descriptive Procedure for Continuous Data
PROC UNIVARIATE provides basic descriptive information for continuous variables. The syntax looks like this:
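A minimal call might look like the following sketch, run on a few invented rows. The choice of v2 as the analysis variable is only an example.

```sas
data data1;
  input id county $ v1 v2 v3;
  datalines;
1 CountyA 12 150 3200
2 CountyB 4 60 900
3 CountyC 30 720 15800
4 CountyD 9 310 6100
;
run;

* Descriptive statistics such as the mean and quantiles for v2;
proc univariate data=data1;
  var v2;
run;
```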

SAS ® Statistical Procedures

Descriptive Procedure for Categorical Data
PROC FREQ provides basic descriptive information for categorical or ordinal variables. The syntax looks like this:
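For a single categorical variable, a one-way frequency table could be requested as in this sketch. The newv2 codes in the data lines are placeholders.

```sas
data data1;
  input id county $ newv2;
  datalines;
1 CountyA 2
2 CountyB 1
3 CountyC 3
4 CountyD 2
;
run;

* Counts and percentages for each category of newv2;
proc freq data=data1;
  tables newv2;
run;
```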

SAS ® Statistical Procedures

Analytical Procedures for Continuous Data
PROC CORR provides an analysis of the association between two continuous variables. It computes a correlation coefficient that shows the level of association, as well as a p-value showing the significance of that association. The syntax looks like this:
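A sketch of a correlation between two of the county variables, again on made-up values:

```sas
data data1;
  input id v1 v2;
  datalines;
1 12 150
2 4 60
3 30 720
4 9 310
5 17 240
;
run;

* Correlation between v1 and v2, reported with its p-value;
proc corr data=data1;
  var v1 v2;
run;
```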

SAS ® Statistical Procedures Correlation coefficient p-value

SAS ® Statistical Procedures: Analytical Procedures for Categorical Data
PROC FREQ can also be used to calculate the level of association between two categorical or nominal variables. A chi-square (χ²) test can be added to assess the significance level of that association. The syntax looks like this (dependent variable first, then independent variable):
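A two-way table with a chi-square test might be requested as follows. This is a sketch under assumptions: the category codes are invented, and treating newv3 as the dependent variable and newv2 as the independent variable is an illustrative choice, not something the slides confirm.

```sas
data data1;
  input id newv2 newv3;
  datalines;
1 1 1
2 1 2
3 2 2
4 2 3
5 3 3
6 3 3
7 1 1
8 2 2
;
run;

* Crosstab of newv3 by newv2, with the chi-square test added as an option;
proc freq data=data1;
  tables newv3*newv2 / chisq;
run;
```

With this few observations SAS will warn that the expected cell counts are too small for the chi-square test to be reliable, so a real analysis needs a larger sample.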

SAS ® Statistical Procedures Crosstab Table Chi-square analysis

SAS ® Statistical Procedures
PROC FREQ can also be used with the DEVIATION option, which shows how far each cell frequency deviates from its expected frequency (this is not a standard deviation). Many SAS procedures have additional analyses that can be added in this way.

SAS ® Statistical Procedures: Multivariate Analysis
PROC REG computes the association between a continuous dependent variable and one or more independent variables. PROC LOGISTIC computes the association between a categorical dependent variable and one or more independent variables.

SAS ® Statistical Procedures: Regression Analysis
PROC REG uses the “model” command. Construct your model with your dependent variable first, then your independent variables. The syntax looks like this:
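A regression call in that form could be sketched as below. Using payroll (v3) as the dependent variable with v1 and v2 as predictors is an assumption made for illustration, and the data values are invented.

```sas
data data1;
  input id v1 v2 v3;
  datalines;
1 12 150 3200
2 4 60 900
3 30 720 15800
4 9 310 6100
5 17 240 5400
6 22 480 9900
;
run;

* Dependent variable to the left of the equals sign, predictors to the right;
proc reg data=data1;
  model v3 = v1 v2;
run;
quit;
```

PROC REG stays active after RUN so that further statements can be submitted; QUIT ends the procedure.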

SAS ® Statistical Procedures

These are only a few examples of the analyses you can do with SAS. SAS can also do time series analysis, factor analysis, ANOVA, t-tests, and more!