BMTRY 789 Lecture9: Proc Tabulate Readings – Chapter 11 & Selected SUGI Reading Lab Problems - 11.1, 11.2 Homework Due Next Week– HW6.

Slides:



Advertisements
Similar presentations
Aggregate Data and Statistics
Advertisements

Summer Computing Workshop. Introduction to Variables Variables are used in every aspect of programming. They are used to store data the programmer needs.
Access 2007 ® Use Databases How can Microsoft Access 2007 help you structure your database?
General Astronomy Using Excel for Lab Analysis. Introduction Being able to use a spreadsheet to help in analysis of any laboratory work is a very useful.
Livelihoods analysis using SPSS. Why do we analyze livelihoods?  Food security analysis aims at informing geographical and socio-economic targeting 
Example 2.11 Comparison of Male and Female Movie Stars’ Salaries Exploring Data with Pivot Tables.
John Porter Why this presentation? The forms data take for analysis are often different than the forms data take for archival storage Spreadsheets are.
Six compound procedures and higher-order procedures.
Quick Data Summaries in SAS Start by bringing in data –Use permanent data set for these examples Proc Tabulate –Produces summaries very quickly and easily.
Modules, Hierarchy Charts, and Documentation
Introduction to SQL Session 1 Retrieving Data From a Single Table.
Pet Fish and High Cholesterol in the WHI OS: An Analysis Example Joe Larson 5 / 6 / 09.
DAY 21: MICROSOFT ACCESS – CHAPTER 5 MICROSOFT ACCESS – CHAPTER 6 MICROSOFT ACCESS – CHAPTER 7 Akhila Kondai October 30, 2013.
Significance Testing 10/15/2013. Readings Chapter 3 Proposing Explanations, Framing Hypotheses, and Making Comparisons (Pollock) (pp ) Chapter 5.
SAS PROC REPORT PROC TABULATE
XP New Perspectives on Microsoft Access 2002 Tutorial 51 Microsoft Access 2002 Tutorial 5 – Enhancing a Table’s Design, and Creating Advanced Queries and.
SAS Workshop Lecture 1 Lecturer: Annie N. Simpson, MSc.
Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward SAS ESSENTIALS -- Elliott & Woodward1.
IE 212: Computational Methods for Industrial Engineering
PROC REPORT organizes the output in many ways, from the simple to highly complex… PROC REPORT NOWINDOWS HEADLINE HEADSKIP; COLUMN variable-list; DEFINE.
Outlines Chapter 3 –Chapter 3 – Loops & Revision –Loops while do … while – revision 1.
HPR Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and.
Chapter 03: Lecture Notes (CSIT 104) 11 Chapter 3 Charts: Delivering a Message Exploring Microsoft Office Excel 2007.
Chapter 6 SAS ® OLAP Cube Studio. Section 6.1 SAS OLAP Cube Studio Architecture.
Statistics for the Social Sciences Psychology 340 Fall 2013 Correlation and Regression.
BMTRY 789 Introduction to SAS Programming Lecturer: Annie N. Simpson, MSc.
Conditional Statements For computer to make decisions, must be able to test CONDITIONS IF it is raining THEN I will not go outside IF Count is not zero.
Chapter 7 File I/O 1. File, Record & Field 2 The file is just a chunk of disk space set aside for data and given a name. The computer has no idea what.
Research Design 10/16/2012. Readings Chapter 3 Proposing Explanations, Framing Hypotheses, and Making Comparisons (pp ) Chapter 5 Making Controlled.
Access 2007 ® Use Databases How can Microsoft Access 2007 help you structure your database?
A Simple Guide to Using SPSS ( Statistical Package for the Social Sciences) for Windows.
PROCESSING OF DATA The collected data in research is processed and analyzed to come to some conclusions or to verify the hypothesis made. Processing of.
Making Python Pretty!. How to Use This Presentation… Download a copy of this presentation to your ‘Computing’ folder. Follow the code examples, and put.
Priya Ramaswami Janssen R&D US. Advantages of PROC REPORT -Very powerful -Perform lists, subsets, statistics, computations, formatting within one procedure.
1 FUNCTIONS - I Chapter 5 Functions help us write more complex programs.
BMTRY 789 Lecture 11: Debugging Readings – Chapter 10 (3 rd Ed) from “The Little SAS Book” Lab Problems – None Homework Due – None Final Project Presentations.
Spreadsheet Models for Managers: Session 14 14/1 Copyright © Richard Brenner Spreadsheet Models for Managers Session 14 Using Macros II Function.
Chapter 4 concerns various SAS procedures (PROCs). Every PROC operates on: –the most recently created dataset –all the observations –all the appropriate.
Matlab tutorial course Lesson 4: Writing your own functions: programming constructs
DAY 21: MICROSOFT ACCESS – CHAPTER 5 MICROSOFT ACCESS – CHAPTER 6 MICROSOFT ACCESS – CHAPTER 7 Aliya Farheen October 29,2015.
FORMAT statements can be used to change the look of your output –if FORMAT is in the DATA step, then the formats are permanent and stored with the dataset.
LECTURE 02: EVALUATING MODELS January 27, 2016 SDS 293 Machine Learning.
BMTRY 789 Lecture 6: Proc Sort, Random Number Generators, and Do Loops Readings – Chapters 5 & 6 Lab Problem - Brain Teaser Homework Due – HW 2 Homework.
Chapter 8: Using Basic Statistical Procedures “33⅓% of the mice used in the experiment were cured by the test drug; 33⅓% of the test population were unaffected.
ANOVA and Multiple Comparison Tests
Uncertainty and confidence Although the sample mean,, is a unique number for any particular sample, if you pick a different sample you will probably get.
Chapter 6: Modifying and Combining Data Sets  The SET statement is a powerful statement in the DATA step DATA newdatasetname; SET olddatasetname;.. run;
MICROSOFT ACCESS – CHAPTER 5 MICROSOFT ACCESS – CHAPTER 6 MICROSOFT ACCESS – CHAPTER 7 Sravanthi Lakkimsety Mar 14,2016.
The Law of Averages. What does the law of average say? We know that, from the definition of probability, in the long run the frequency of some event will.
Class Seven Turn In: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 For Class Eight: Chapter 20: 18, 20, 24 Chapter 22: 34, 36 Read Chapters 23 &
Working Through a Survey From Raw Data to Meaningful Visuals Xiaobing Shuai
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 14 & 19 By Tasha Chapman, Oregon Health Authority.
Loops Brent M. Dingle Texas A&M University Chapter 6 – Section 6.3 Multiway Branches (and some from Mastering Turbo Pascal 5.5, 3 rd Edition by Tom Swan)
SAS ® 101 Based on Learning SAS by Example: A Programmer’s Guide Chapters 16 & 17 By Tasha Chapman, Oregon Health Authority.
Excel is our friend. Really. Excel is a spreadsheet program used a great deal in business and in science. It can do a lot of things but we will mainly.
CSC 108H: Introduction to Computer Programming
AP CSP: Cleaning Data & Creating Summary Tables
CMPT 120 Topic: Python’s building blocks -> More Statements
Applied Business Forecasting and Regression Analysis
Introduction to Python
Loops BIS1523 – Lecture 10.
Chapter 6: Modifying and Combining Data Sets
Other Kinds of Arrays Chapter 11
Intro to PHP & Variables
Number and String Operations
Coding Concepts (Data Structures)
Subscript and Summation Notation
Producing Descriptive Statistics
Understand the importance of data context
Introduction to SAS Essentials Mastering SAS for Data Analytics
Presentation transcript:

BMTRY 789 Lecture9: Proc Tabulate Readings – Chapter 11 & Selected SUGI Reading Lab Problems , 11.2 Homework Due Next Week– HW6

Summer 2009BMTRY 789 Intro to SAS Programming2 Ready for a break yet? I AM!

Summer 2009BMTRY 789 Intro to SAS Programming3 Why Use Proc Tabulate? SAS Software provides hundreds of ways you can analyze your data. You can use the DATA step to slice and dice your data, and there are dozens of procedures that will process your data and produce all kinds of statistics. Odds are that no matter how you organize and analyze your data, you’ll end up producing a report in the form of a table. This is why every SAS user needs to know how to use PROC TABULATE.

Summer 2009BMTRY 789 Intro to SAS Programming4 Why Use Proc Tabulate? While TABULATE doesn’t do anything that you can’t do with other PROCs, the payoff is in the output. TABULATE computes a variety of statistics, and it neatly packages the results in a single table. Unfortunately, TABULATE has gotten a bad rap as being a difficult procedure to learn. If we take things step by step, anyone can learn PROC TABULATE.

Summer 2009BMTRY 789 Intro to SAS Programming5 One dimensional tables To really understand TABULATE, you have to start very simply. The simplest possible table in TABULATE has to have three things: a PROC TABULATE statement, a TABLE statement, and a CLASS or VAR statement.

Summer 2009BMTRY 789 Intro to SAS Programming6 One dimensional tables The PROC TABULATE statement looks like this: PROC TABULATE DATA=TEMP; TABLE SCORE1; RUN; If you run this code as is, you will get an error message because TABULATE can’t figure out whether the variable SCORE1 is intended as an analysis variable, which is used to compute statistics, or a classification variable, which is used to define row or column categories in the table. wrong

Summer 2009BMTRY 789 Intro to SAS Programming7 One dimensional tables In this case, we want to use SCORE1 as the analysis variable. We will be using it to compute a statistic. To tell TABULATE that SCORE1 is an analysis variable, you use a VAR statement. The syntax of a VAR statement is simple: you just list the variables that will be used for analysis. So now the syntax for our PROC TABULATE is: PROC TABULATE DATA=TEMP; VAR SCORE1; TABLE SCORE1; RUN; right

Summer 2009BMTRY 789 Intro to SAS Programming8 One dimensional tables The result is the table shown below. It has a single column, with the header SCORE1 to identify the variable, and the header SUM to identify the statistic. There is just a single table cell, which contains the value for the sum of Score1 for all of the observations in the dataset. score1 Sum

Summer 2009BMTRY 789 Intro to SAS Programming9 ADDING A STATISTIC The previous table shows what happens if you don’t tell TABULATE which statistic to use. If the variable in your table is an analysis variable, meaning that it is listed in a VAR statement, then the statistic you will get by default is the sum. Sometimes the sum will be the statistic that you want. Most likely, sum isn’t the statistic that you want. PROC TABULATE DATA=TEMP; VAR SCORE1; TABLE SCORE1*MEAN; RUN; score1 Mean 10.22

Summer 2009BMTRY 789 Intro to SAS Programming10 ADDING ANOTHER STATISTIC Each of the tables shown so far was useful, but the power of PROC TABULATE comes from being able to combine several statistics and/or several variables in the same table. TABULATE does this by letting you specify a series of “tables” within a single large table. We’re going to add a “table” showing the number of observations, as well as showing the mean SCORE. PROC TABULATE DATA=TEMP; VAR SCORE1; TABLE SCORE1*N SCORE1*MEAN; RUN; OR PROC TABULATE DATA=TEMP; VAR SCORE1; TABLE SCORE1*(N MEAN); RUN; *NOTE: When SAS is creating a one-dimensional table, additional variables, statistics, and categories are always added as new columns. score1 NMean

Summer 2009BMTRY 789 Intro to SAS Programming11 ADDING A CLASSIFICATION VARIABLE After seeing the tables we’ve built so far in this chapter, you’re probably asking yourself, “Why use PROC TABULATE? Everything I’ve seen so far could be done with a PROC MEANS.” One answer to this question is classification variables. By specifying a variable to categorize your data, you can produce a concise table that shows values for various subgroups in your data. PROC TABULATE DATA=TEMP; CLASS TREAT; VAR SCORE1; TABLE SCORE1*MEAN*TREAT; RUN; score1 Mean treat

Summer 2009BMTRY 789 Intro to SAS Programming12 TWO-DIMENSIONAL TABLES You probably noticed that our example table is not very elegant in appearance. That’s because it only takes advantage of one dimension. It has multiple columns, but only one row. It is much more efficient to build tables that have multiple rows and multiple columns. You can fit more information on a page, and the table looks better, too. The easiest way to build a two-dimensional table is to build it one dimension at a time. First we’ll build the columns, and then we’ll add the rows. PROC TABULATE DATA=TEMP; CLASS TREAT; VAR SCORE1; TABLE TREAT, SCORE1*(N MEAN); RUN; score1 NMean treat

Summer 2009BMTRY 789 Intro to SAS Programming13 ADDING CLASSIFICATION VARIABLES ON BOTH DIMENSIONS The previous example showed how to reformat a table from one to two dimensions, but it did not show the true power of two-dimensional tables. With two dimensions, you can classify your statistics by two different variables at the same time. To do this, you put one classification variable in the row dimension and one classification in the column dimension. PROC TABULATE DATA=TEMP; CLASS TREAT VISIT; VAR SCORE1; TABLE TREAT, SCORE1*VISIT*(N MEAN); RUN; score1 visit 123 NMeanN N treat *Notice how the analysis variable SCORE1 remains as the column heading, and N and MEAN remains as the statistics, but now there are additional column headings to show the three categories of VISIT.

Summer 2009BMTRY 789 Intro to SAS Programming14 ADDING ANOTHER CLASSIFICATION VARIABLE The previous example showed how to add a classification variable to both the rows and columns of a two-dimensional table. But you are not limited to just one classification per dimension. PROC TABULATE DATA=TEMP; CLASS TREAT VISIT PTN; VAR SCORE1; TABLE TREAT PTN, SCORE1*VISIT*(N MEAN); RUN; *This ability to stack multiple mini-tables within a single table can be a powerful tool for delivering large quantities of information in a user-friendly format (i.e. TABLE 1 of most publications). score1 visit 123 NMeanN N treat ptn

Summer 2009BMTRY 789 Intro to SAS Programming15 NESTING THE CLASSIFICATION VARIABLES So far all we have done is added additional “tables” to the bottom of our first table. We used the space operator between each of the row variables to produce stacked tables. By using a series of row variables, we can explore a variety of relationships between the variables. PROC TABULATE DATA=TEMP; CLASS TREAT VISIT; VAR SCORE1; TABLE TREAT*SCORE1*VISIT*(N MEAN); RUN; treat 12 score1 visit NMeanN N N N N

Summer 2009BMTRY 789 Intro to SAS Programming16 ADDING TOTALS TO THE ROWS As your tables get more complex, you can help make your tables more readable by adding row and column totals. Totals are quite easy to generate in TABULATE because you can use the ALL variable. This is a built in classification variable supplied by TABULATE that stands for “all observations.” You do not have to list it in the CLASS statement because it is a classification variable by definition. PROC TABULATE DATA=TEMP; CLASS TREAT VISIT; VAR SCORE1; TABLE TREAT, SCORE1*(VISIT ALL)*(N MEAN); RUN; score1 visit All 123 NMeanN N N treat

Summer 2009BMTRY 789 Intro to SAS Programming17 ADDING TOTALS TO THE ROWS AND COLUMNS Not only can you use ALL to add row totals, but you can also use ALL to produce column totals. What you do is list ALL as an additional variable in the row definition of the TABLE statement. No asterisk is needed because we just want to add a total at the bottom of the table. PROC TABULATE DATA=TEMP; CLASS TREAT VISIT; VAR SCORE1; TABLE TREAT ALL, SCORE1*(VISIT ALL)*(N MEAN); RUN; score1 visit All 123 NMeanN N N treat All

Summer 2009BMTRY 789 Intro to SAS Programming18 MAKING THE TABLE PRETTY The preceding examples have shown how to create basic tables. They contained all of the needed information, but they were pretty ugly. The next thing you need to learn is a few tricks to clean up your tables. PROC TABULATE DATA=TEMP; CLASS TREAT VISIT; VAR SCORE1; TABLE TREAT ALL=‘Total’, SCORE1*(VISIT ALL=‘Total’)*(N MEAN); RUN; score1 visit Total 123 NMeanN N N treat Total

Summer 2009BMTRY 789 Intro to SAS Programming19 MAKING THE TABLE PRETTY Another thing that could be improved about this table is getting rid of the excessive column headings. Proc Format; value visit 1='Visit 1' 2='Visit 2' 3='Visit 3'; value tr 1='Therapy 1' 2='Therapy 2'; run; PROC TABULATE DATA=TEMP; CLASS TREAT VISIT; VAR SCORE1; TABLE TREAT=‘ ’ ALL=‘Total’, SCORE1=‘ ‘ *(VISIT=‘ ‘ ALL=‘Total’)*MEAN/ box=‘Average Score’; Format treat tr. Visit visit.; RUN; Average Score Visit 1Visit 2Visit 3Total Therapy Therapy Total