Tamara Arenovich Tony Panzarella


Introduction to SAS
Tamara Arenovich, Tony Panzarella

I. OBJECTIVES
This session is intended to introduce you to SAS: what it is, how it works, and how you will use it. The focus is on the SAS programming essentials needed to get you started on your SAS session. We will also cover some basic descriptive statistics.

II. WHAT IS SAS?
The acronym SAS stands for Statistical Analysis System. Simply put, it is a software program that allows you to analyze large amounts of data quickly. It works by having you tell it what to do through a sequence of steps (or commands). Through this sequence of steps, four major tasks are generally performed:
1. Data access
2. Data management
3. Data analysis
4. Data presentation

III. EXPLORING THE SAS ENVIRONMENT (IN WINDOWS)
A. Enhanced Editor
[I] Syntax Rules
SAS programs must be written following syntax rules. SAS statements:
- usually begin with a keyword
- always end with a semicolon
- are not case-sensitive, except inside quotation marks:
    sex = 'm'; is the same as SEX = 'm';
    sex = 'm'; is not the same as sex = 'M';

[II] Steps in a SAS Program
There are two types of steps in a SAS program:
- DATA steps
- PROC steps
A SAS program can contain any number of DATA and PROC steps.
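The original slide's example did not survive in this transcript. As a minimal sketch of the two step types, the program below has one DATA step followed by one PROC step. The dataset name work.participants and its values are made up for illustration.

```sas
/* DATA step: create a small temporary dataset (hypothetical values) */
data work.participants;
    input id sex age;
    datalines;
1 1 67
2 0 54
3 1 71
;
run;

/* PROC step: print the dataset that was just created */
proc print data=work.participants;
run;
```

Each step ends with a RUN statement, and every statement ends with a semicolon.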

[III] Running (Submitting) SAS Programs
Until you are sure your SAS program is completely error-free, it is preferable to submit only sections of your SAS code at a time.

B. SAS Log
After you submit a SAS program, the SAS log contains information about the processing of the program, including any error or warning messages.
In the SAS log, you should always see:
- The SAS statements
- NOTES
You might also see:
- WARNINGS
- ERRORS
Note: Always read your SAS log after running a SAS program.

C. SAS Results Viewer / Output
As a general rule, PROC steps generate output; DATA steps do not. The results of your PROC steps can be viewed in the Results Viewer window. Earlier versions of SAS (e.g. 9.1, 9.2) print results to the Output window by default, and ODS commands are required to make that output look nicer and to generate graphics. Your practicum sites may be working with earlier versions of SAS.

D. SAS Files
All of your SAS programs, SAS datasets, SAS logs, and SAS output can be saved:

Type of File    File Extension
SAS program     .sas
SAS log         .log
SAS output      .mht (default), .lst (earlier versions)
SAS dataset     .sas7bdat

E. Results Window
Lists all the reports that appear in the Output window. You can also use the Results window to jump through your output for easy navigation.

F. Explorer Window
The Explorer window allows you to browse SAS libraries and SAS datasets.

IV. THE FOUR MAJOR TASKS
1. Data access
2. Data management
3. Data analysis
4. Data presentation

Task 1: Data Access (Reading your data into SAS)
[I] SAS Libraries
All SAS files follow a two-part naming system: libref.filename
A libref is a reference to a directory (a physical location) on your computer.
To define the libref, use either a LIBNAME statement (code) or the New Library window (point & click).
Example: libname lunl7 'C:\Users\projects\Data';

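If you use the New Library window instead of a LIBNAME statement:
- There is no record of this library reference in your program or log file.
- SAS will remember this library designation across sessions (when the "Enable at startup" box is checked).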

Permanent SAS Libraries:
- Libraries that you create are known as permanent SAS libraries.
- You may create as many permanent SAS libraries as you wish.
Rules for naming your permanent library (the libref):
- Must be 1 to 8 characters long.
- Must begin with a letter or underscore.
Temporary SAS Library:
- Known as the WORK library.
- If no libref is specified, WORK is assumed.

[II] Reading a SAS Dataset into SAS
Assign a library reference that refers to the directory on your computer where the SAS dataset is saved. No CARDS statement is required because the data are already in SAS format; use the SET statement instead.
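A minimal sketch of this pattern follows. So that it runs anywhere, it reads from SASHELP.CLASS, a sample dataset that ships with SAS, rather than from your own directory. The commented LIBNAME line shows where your own libref would go; the libref name mylib is hypothetical, and the path is the one from the earlier example.

```sas
/* With your own data you would first assign a libref, for example:
   libname mylib 'C:\Users\projects\Data';
   and then read mylib.<your dataset> in the SET statement.          */

/* Runnable stand-in using the SASHELP library that ships with SAS */
data work.class_copy;      /* new dataset, stored in the temporary WORK library */
    set sashelp.class;     /* existing SAS dataset being read */
run;
```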

[III] Import/Export Wizard The Import/Export wizard guides you through the importing or exporting process You can import your data from a variety of data sources (e.g. Excel, Access, SPSS, Stata), but make sure your data is structured appropriately prior to importing


Task 2: Data Management
In Task 2, you modify the current SAS dataset and turn it into a new SAS dataset that is appropriate for analysis. All of this cleaning is performed in the DATA step.
[I] Naming a SAS Dataset
All SAS datasets have two-level names: libref.filename. The dataset name (the second part):
- can be up to 32 characters long
- is not case-sensitive
- must begin with a letter or underscore
- may contain letters, underscores or numbers in subsequent characters
- may not contain special characters (e.g., #)

Examples of valid SAS dataset names:
    baseline
    baseline1
    _baseline
Examples of invalid SAS dataset names:
    base line    (cannot have spaces)
    baseline#1   (# is not a valid character)
    1baseline    (cannot begin with a number)

[II] Viewing Contents of SAS Dataset
- Use PROC CONTENTS to view descriptive information about the contents of your dataset.
- Use PROC PRINT to view the actual data.
    - Use the VAR statement to specify the variables to be displayed.
    - Use the WHERE statement to specify the observations to be displayed.
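As an illustration, the sketch below runs both procedures on SASHELP.CLASS, a sample dataset that ships with SAS. The choice of variables and the age cut-off are arbitrary.

```sas
/* Descriptive information: variable names, types, lengths, number of observations */
proc contents data=sashelp.class;
run;

/* The actual data, limited to selected variables and observations */
proc print data=sashelp.class;
    var name sex age;      /* variables to display */
    where age >= 14;       /* observations to display */
run;
```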

[IV] Types of SAS Variables
Character variables:
- Can contain letters, numbers, special characters and blanks
- Length 1 to 32,767 bytes [default length is 8 characters]
- When creating new variables, place the LENGTH statement before the SET statement (a LENGTH statement must appear before the variable is first used)
Numeric variables:
- 8 bytes of storage by default
- Provides space for 16 to 17 significant digits
SEX as a character variable might be coded as:
    a). sex = '1' or sex = '2'
    b). sex = 'Male' or sex = 'Female'
SEX as a numeric variable might be coded as:
    a). sex = 1 or sex = 2
The way variable values are stored affects what you can do with the variables.

[V] Creating New SAS Datasets
Within the DATA step, use the DATA and SET statements to create a new SAS dataset. In the DATA statement, specify the new SAS dataset that you are about to create. In the SET statement, specify the SAS dataset that you are reading from.
Example:
    data yoga.females;
        set yoga.data1;
        <insert data management & cleaning statements here>;
        if sex=1;
    run;

[VI] Common Data Management Activities Performed in the DATA Step
Keep or Remove Observations
Use the WHERE statement or the IF statement.

Comparison operators that can be used with a WHERE or IF statement:

Definition                  Mnemonic   Symbol
Equal to                    EQ         =
Not equal to                NE         ^=  ~=
Greater than                GT         >
Less than                   LT         <
Greater than or equal to    GE         >=
Less than or equal to       LE         <=
Equal to one of a list      IN         in ()

Logical operators that can be used with a WHERE or IF statement:

Definition                          Mnemonic   Symbol
If both expressions are true        AND        &
If either expression is true        OR         |
To reverse logic of a comparison    NOT        ^

So, assuming sex is coded only as 0 or 1 (1 = female) with no missing values, any of the WHERE statements below could be used to restrict the dataset to female participants only:
    where sex = 1;
    where sex eq 1;
    where sex ^in (0);
    where sex not in (0);
The following WHERE statement could be used to restrict your dataset to female participants aged 65 and older:
    where sex = 1 AND age GE 65;
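To make the operators concrete, here is a small sketch that applies the last condition with both a WHERE statement and a subsetting IF statement. The dataset work.participants and its values are hypothetical, and the coding sex = 1 for female follows the examples in this session.

```sas
/* Hypothetical data: sex coded 0/1 (1 = female, as in the examples here) */
data work.participants;
    input id sex age;
    datalines;
1 1 67
2 0 70
3 1 58
4 1 80
;
run;

/* WHERE: observations are filtered as they are read from the input dataset */
data work.older_females;
    set work.participants;
    where sex = 1 and age ge 65;
run;

/* Subsetting IF: observations are read first, then kept only if the condition is true */
data work.older_females_if;
    set work.participants;
    if sex = 1 and age ge 65;
run;
```

Both output datasets keep the same observations here (ids 1 and 4). The practical difference is that WHERE can only use variables that already exist in the input dataset, while IF can also use variables created earlier in the same DATA step.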

Keep or Drop Variables
There are three different ways to do this, and all three are done in the DATA step:
Method 1: Use the KEEP (or DROP) statement in the DATA step.
Method 2: Use the KEEP= (or DROP=) dataset option in the DATA statement.
Method 3: Use the KEEP= (or DROP=) dataset option in the SET statement.

Method 1: Use the KEEP (or DROP) statement in the DATA step.

Method 2: Use the KEEP= (or DROP=) dataset option in the DATA statement.

Method 3: Use the KEEP= (or DROP=) dataset option in the SET statement.
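The three methods can be sketched side by side. The example uses SASHELP.CLASS, a sample dataset that ships with SAS; the output dataset names m1 to m3 are arbitrary.

```sas
/* Method 1: KEEP statement inside the DATA step */
data work.m1;
    set sashelp.class;
    keep name age;
run;

/* Method 2: KEEP= dataset option in the DATA statement */
data work.m2 (keep=name age);
    set sashelp.class;
run;

/* Method 3: KEEP= dataset option in the SET statement */
data work.m3;
    set sashelp.class (keep=name age);
run;
```

All three produce a dataset containing only name and age. With Method 3 the other variables are never read in, so they cannot be used in calculations inside that DATA step; with Methods 1 and 2 they remain available during the step and are dropped only when the output dataset is written.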

Creating a new variable
There are many ways to create new variables. Two ways are shown here:
- Using equations
- Using conditional (if-then-else) logic
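A minimal sketch of both approaches, using SASHELP.CLASS (which ships with SAS and records height in inches and weight in pounds). The new variable names bmi and age_group, and the age cut-off, are chosen for illustration only.

```sas
data work.class_new;
    set sashelp.class;

    /* Using an equation: body mass index from pounds and inches */
    bmi = (weight / height**2) * 703;

    /* Using conditional (if-then-else) logic */
    length age_group $ 8;          /* set the length before the first assignment */
    if age < 13 then age_group = 'Under 13';
    else age_group = '13 plus';
run;
```

Note that SAS treats a missing numeric value as smaller than any number, so an observation with missing age would fall into the first group; real cleaning code usually handles missing values explicitly.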

Renaming a variable: Use the RENAME statement in the DATA step.

Create descriptive labels for variable names: Use the LABEL statement in the DATA step.
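The RENAME and LABEL statements might be used as in the sketch below, again on SASHELP.CLASS. The new name gender and the label text are illustrative.

```sas
data work.class_renamed;
    set sashelp.class;
    rename sex = gender;                 /* old-name = new-name */
    label age    = 'Age in years'
          height = 'Height (inches)';
run;

/* The LABEL option asks PROC PRINT to show labels instead of variable names */
proc print data=work.class_renamed label;
    var name gender age height;
run;
```

The new name takes effect in the output dataset, so any other statements inside the same DATA step still refer to the variable by its old name.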

Format the Values of a Variable
Create and apply formats when you wish to change the appearance of variable values. Creating and applying user-defined formats involves two steps:
1. Create the formats using the PROC FORMAT step.
2. Apply the format in the DATA step.
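The two steps can be sketched as follows. The format name sexfmt, the dataset, and the coding 0 = Male / 1 = Female are assumptions made for this example.

```sas
/* Step 1: create the format (assumed coding: 0 = Male, 1 = Female) */
proc format;
    value sexfmt 0 = 'Male'
                 1 = 'Female';
run;

/* Step 2: apply the format in the DATA step (hypothetical data) */
data work.participants;
    input id sex age;
    format sex sexfmt.;        /* note the trailing period on the format name */
    datalines;
1 1 67
2 0 54
;
run;

proc print data=work.participants;
run;
```

The stored values of sex are still 0 and 1; only their displayed appearance changes.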

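Merging Files
- The MERGE statement can be used to combine two or more SAS datasets.
- Ensure unique identifiers are present and that the files are sorted by those identifiers, then name the identifiers in a BY statement.
- One-to-one and one-to-many (or many-to-one) merges work as expected. Many-to-many merges do NOT produce every combination of matching rows with MERGE and should be avoided!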
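A minimal sketch of a one-to-many merge is shown below. Both datasets and all of their values are hypothetical: one row per participant in work.demog, and one or more rows per participant in work.visits.

```sas
data work.demog;                 /* one row per id */
    input id sex age;
    datalines;
1 1 67
2 0 54
3 1 71
;
run;

data work.visits;                /* one or more rows per id */
    input id visit score;
    datalines;
1 1 10
1 2 12
2 1 9
3 1 15
;
run;

/* Both files must be sorted by the identifier before merging */
proc sort data=work.demog;  by id; run;
proc sort data=work.visits; by id; run;

data work.combined;
    merge work.demog work.visits;
    by id;                       /* match rows on the identifier */
run;
```

The result has one row per visit, with each participant's sex and age repeated on every one of their visit rows.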

Summary – in the DATA Step
Keep or remove observations:
- Use the WHERE statement or the IF statement
Keep or drop variables:
- Use the KEEP statement in the DATA step
- Use the KEEP= dataset option in the DATA statement
- Use the KEEP= dataset option in the SET statement
Create a new variable:
- Using equations
- Using conditional (if-then-else) logic
Rename a variable:
- Use the RENAME statement
Create descriptive labels:
- Use the LABEL statement
Format the values of a variable:
- Create the format using the PROC FORMAT step
- Apply the format in the DATA step


Task 3: Data Analysis Two very common SAS procedures: PROC FREQ and PROC MEANS.

[I] PROC FREQ
Produces frequency counts and cross-tabular frequency tables.
- Can be used with either numeric or character variables.
- In the PROC FREQ statement, specify the name of the SAS dataset you wish to analyze.
- In the TABLES statement, list the variables you want frequencies of.
- For a cross-tabular frequency table, use an asterisk (*) in the TABLES statement to cross variables.
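For instance, a one-way table and a cross-tabulation could be requested as in this sketch, which uses SASHELP.CLASS (a sample dataset that ships with SAS).

```sas
proc freq data=sashelp.class;
    tables sex;          /* one-way frequency table */
    tables sex*age;      /* cross-tabulation: sex in rows, age in columns */
run;
```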

[II] PROC MEANS
Displays simple descriptive statistics for variables in a SAS dataset.
- Numeric variables only.
- In the PROC MEANS statement, specify the name of the SAS dataset to be analyzed.
- In the VAR statement, list the variables to be analyzed.
- An optional CLASS statement may be used.
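A short sketch on SASHELP.CLASS follows; the choice of analysis and class variables is arbitrary.

```sas
proc means data=sashelp.class;
    class sex;               /* optional: statistics are reported separately for each sex */
    var height weight;       /* numeric analysis variables */
run;
```

By default PROC MEANS reports N, mean, standard deviation, minimum and maximum for each analysis variable.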

[III] Useful Statements That Can Be Used in Most PROC Steps
BY Statement
This statement may be used with PROC FREQ. It allows you to perform subgroup analysis, working similarly to the CLASS statement in PROC MEANS. Before using the BY statement in any procedure, you must first sort your data on the BY variable.
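The sort-then-analyze pattern might look like the sketch below. It writes the sorted copy of SASHELP.CLASS to a WORK dataset (the name class_sorted is arbitrary) so that the original is left untouched.

```sas
/* Sort on the BY variable first, writing the result to a new dataset */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

/* One frequency table of age for each value of sex */
proc freq data=work.class_sorted;
    by sex;
    tables age;
run;
```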

WHERE Statement
This statement may be used with both PROC FREQ and PROC MEANS (and others!)

FORMAT Statement This statement may be used in the PROC FREQ procedure on the analysis variable, or it may be used in the PROC MEANS procedure on the class variable. Use this statement if you did not assign the format of interest in your DATA step, but wish to assign it for a specific procedure only.

LABEL Statement
This statement may be used in both PROC FREQ and PROC MEANS. In PROC MEANS, it may be used with both the analysis variable and the class variable. Use this statement if you did not assign the descriptive label of interest in your DATA step, but wish to assign it for a specific procedure only.
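As a sketch of how the WHERE, FORMAT and LABEL statements can be combined inside a single PROC step, the example below uses SASHELP.CLASS, where sex is stored as the characters 'F' and 'M'. The format name $sexfmt, the label text, and the age cut-off are made up for illustration.

```sas
/* A character format for the class variable */
proc format;
    value $sexfmt 'F' = 'Female'
                  'M' = 'Male';
run;

proc means data=sashelp.class;
    where age >= 12;                        /* restrict the observations analyzed */
    class sex;
    var height;
    format sex $sexfmt.;                    /* applies to this procedure only */
    label height = 'Height (inches)';       /* applies to this procedure only */
run;
```

Because the FORMAT and LABEL statements appear in the PROC step rather than in a DATA step, nothing is stored with the dataset.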

SUMMARY – in the PROC Steps
BY statement:
- To be used with PROC FREQ
WHERE statement:
- To be used with both PROC FREQ and PROC MEANS
FORMAT statement:
- To be used with PROC FREQ on the analysis variable
- To be used with PROC MEANS on the class variable
LABEL statement:
- To be used with both PROC FREQ and PROC MEANS


Task 4: Data Presentation
Here we will show you three methods:
1). SAS system options
2). Adding titles and/or footnotes
3). Saving your output as a PDF or Excel file

[I] SAS System Options You may use the OPTIONS statement to change SAS system options.

Commonly used options:

Option          Description
linesize = n    Specifies the line size (printer line width) for the SAS log and the SAS output files
pagesize = n    Specifies the number of lines that can be printed per page of SAS output
nonumber        Suppresses the printing of page numbers (by default, page numbers are printed)
nodate          Suppresses the printing of today's date (by default, the date is printed)
errors = n      Specifies the maximum number of observations for which complete error messages are printed
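Several options can be set in one OPTIONS statement, as in this sketch. The particular values are arbitrary examples.

```sas
/* Set line width and page length, and suppress page numbers and the date */
options linesize=80 pagesize=60 nonumber nodate;

proc print data=sashelp.class;
run;
```

Options stay in effect for the rest of the SAS session, or until another OPTIONS statement changes them.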

[II] Titles and/or Footnotes Add titles and/or footnotes to your SAS output by using the TITLE and/or FOOTNOTE statement, respectively, in any PROC step.
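A minimal sketch, with made-up title and footnote text:

```sas
title 'Listing of the SASHELP.CLASS sample data';
footnote 'Prepared for the Introduction to SAS session';

proc print data=sashelp.class;
run;

/* Titles and footnotes stay in effect until changed; blank statements clear them */
title;
footnote;
```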

[III] Create PDF Reports Use ODS (Output Delivery System) statements to write your output to a PDF, HTML, or RTF file. Specifically, you need to write two statements: The ODS PDF FILE = statement specifies the destination of the new PDF file you are about to create. Note that in this statement you must also give the PDF file a name. The ODS PDF CLOSE statement closes the PDF destination.
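The two statements wrap around whichever PROC steps should appear in the report, as sketched here. The path and file name are hypothetical; replace them with a folder that exists on your computer.

```sas
/* Open the PDF destination (hypothetical path and file name) */
ods pdf file='C:\Users\projects\report.pdf';

proc freq data=sashelp.class;
    tables sex;
run;

/* Close the PDF destination; the file is complete once this runs */
ods pdf close;
```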

When working in older versions of SAS, the ODS GRAPHICS ON statement may generate additional results (e.g. residual plots in PROC MIXED).

ODS can be extremely useful when you need to save some part of your output directly to a SAS file (e.g. simulations…)
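One way this can be done is with the ODS OUTPUT statement, sketched below on the assumption that the goal is to capture a one-way PROC FREQ table as a SAS dataset. OneWayFreqs is the ODS table name for that table; the output dataset name work.sex_freqs is arbitrary.

```sas
/* Route the one-way frequency table into a SAS dataset */
ods output OneWayFreqs=work.sex_freqs;

proc freq data=sashelp.class;
    tables sex;
run;

/* The captured table is now an ordinary dataset */
proc print data=work.sex_freqs;
run;
```

To find the table names that a procedure produces, submit ods trace on; before the procedure and read the names from the SAS log.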

[V] SAS Online Documentation
Access the SAS Online Documentation by clicking the last icon in the toolbar. There is a lot of information here! It is particularly useful if you know the procedure or statement you want to use and would like to get the syntax for it. Two useful chapters in the Online Documentation are:
- Procedures: Select Contents -> SAS Products -> Base SAS -> Procedures
- SAS/STAT: Select Contents -> SAS Products -> SAS/STAT -> SAS/STAT User's Guide

Contact information:
Tamara Arenovich
Manager, Biostatistical Consulting Service
Centre for Addiction and Mental Health
Tel: 416-535-8501 ext. 36338
tamara.arenovich@camh.ca
tamara.arenovich@utoronto.ca

Acknowledgments
We thank Ms. Thi Ho & Ms. Anthea Lau for preparation of this material and this PowerPoint file.

- THE END -