SAS Lecture 6 – SAS/GRAPH Aidan McDermott, May 3, 2005.

2 SAS/GRAPH A small number of graphic types are commonly used in public health presentations and publications. These basic types are used either alone or combined to form a composite graphic. Here we will look at how to build some of these basic types of graph. Golden Rule: Everybody is a graph critic.

3 Two types of graph maker If you are using SAS for statistics and data management, it seems natural to use it to produce your graphs as well. Sometimes a statistical procedure will produce the graph you are looking for anyway. Consider whether you need a one-off graph for a presentation or production-line graphs. To produce "quick and dirty" graphs you can use Graph-n-go: very easy to use; not bad for putting multiple graphs on one page; data viewer is a graph type; only a small number of graph types available; not all options available; labor intensive, so not suitable for production-line graphs. Alternatively, use SAS/GRAPH procedures: very flexible; complete control over graphic elements; less labor intensive in the long run; harder to learn; the same controls can be used for SAS/STAT graphics output.

4 Some common types of graph: charts; histograms; stem and leaf plots; boxplots; plots; contour plots / 3-dimensional plots; maps; Gantt charts; trellis plots; trees / pedigrees / dendrograms; (mathematical) graphs / networks; flow charts / entity-relationship diagrams.

6 Graph-n-go Solutions → Reporting → Graph-n-go. The top two icons represent data models; the rest are data viewers.

7 Graph-n-go Choose and configure a data model. Choose a dataset. Right mouse button click on the data model and choose properties. Set which columns to use, where clauses etc.

8 Graph-n-go Choose a viewer and position it on the viewer area (e.g. a bar chart). Drag and drop the data model onto the viewer to associate data with the viewer. Right mouse button on the viewer and choose properties. Configure (choose variables to plot etc).

9 Graph-n-go When finished, the graph can be exported to HTML etc. Choose File → Export → Write to file. You'll see more in the lab.

10 Graphic output within SAS You have already seen some graphic output from within SAS: proc means, proc univariate, proc genmod, proc lifetest etc. all produce graphs. Other procedures in SAS specifically produce graphs, even some that are not part of SAS/GRAPH (proc boxplot is an example). Here our aim is to produce publication- or presentation-quality graphs.

11 Graph basics SAS stores graphs in catalogs (an entity similar to a folder in Windows). Graphs are stored in a SAS proprietary format. By default graphs are stored in a catalog called GSEG in the WORK library. Graphs can be translated to PostScript, GIF, JPEG, and a number of other commonly used formats for printing or including in other documents (Word, HTML, etc.).
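As a hedged illustration of the translation step, the sketch below sends one chart to a GIF file through a device driver rather than leaving it only in the GSEG catalog. The fileref grafout, the file name chart.gif, and the use of the sashelp.class sample data are assumptions made for illustration; they do not come from the lecture.

```sas
/* Hypothetical output file; change the path to suit your system */
filename grafout 'chart.gif';

/* Route graphics output to the GIF driver and the fileref above */
goptions reset=all device=gif gsfname=grafout gsfmode=replace;

proc gchart data=sashelp.class;
   vbar sex;            /* frequency bar chart, one bar per value of sex */
run;
quit;

/* Release the file and restore the default graphics options */
filename grafout clear;
goptions reset=all;
```

The graph is still written to the WORK.GSEG catalog as usual; the device driver only adds the external file.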

12 Graphic control There are three ways to control the look of a SAS/GRAPH graph: 1. Use options within the procedure. 2. Use global statements. 3. Use GOPTIONS.

13 GOPTIONS Set the environment for a graphics program to run and send output, independent of the program; remain in effect for the entire SAS session unless changed or reset; control the appearance of graphic elements by specifying default fonts, colors, text heights etc. Useful when you want the same options in multiple procs.

14 PROC GOPTIONS Used to review the current GOPTIONS; lists all of the current GOPTIONS alphabetically in the LOG window: proc goptions; run; You can also type goptions at the command line.

15 GOPTIONS Syntax: GOPTIONS options-list; ROTATE=PORTRAIT | LANDSCAPE (overrides the setting in the print dialog box). RESET=ALL resets all options to their defaults, including all global statements. RESET=GOPTIONS resets only GOPTIONS settings.

16 COLORS= default color list for the device driver (device dependent). GUNIT= unit of measurement for heights in global statements such as TITLE and FOOTNOTE: CELL (character cells), PCT (percent of graphics area), IN (inches).
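A minimal sketch of how these options might be combined at the top of a program. The particular values (percent units, text height 4, landscape) are illustrative choices, not settings prescribed by the lecture.

```sas
/* Start from the defaults so earlier settings do not leak in */
goptions reset=all;

/* Measure heights as a percentage of the graphics area,
   set a default text height, and request landscape output */
goptions gunit=pct htext=4 rotate=landscape;

/* List the current settings in the log to confirm them */
proc goptions;
run;
```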

17 Data From the SAS samples folder. Three Californian pollutant monitoring stations (AZU, LIV, SFO) One monthly measurement (taken on the 15th of the month) for CO, O3, SO4, temperature etc. for each station. 36 observations in all Month is a numeric variable taking the value 1 for January, 2 for February, etc.

18 Californian Air pollutant Data – ca88air

19 Charts Examples Look for graphic elements in each chart Look for common data types Look for similarities among the examples

20-26 (Example charts; images not reproduced in this transcript.)

27 Charts All the examples used a small number of graphic elements Main difference between plots is the polygon/area type Most involved a categorical/discrete variable and a numeric variable. A histogram uses a continuous variable to create categories. The counts of a categorical variable can be used to create the numeric variable.

28 Proc GCHART Produces charts based on the values of one or more chart variables: vertical and horizontal bar charts, block charts, pie charts etc. Graphs are based on statistics (counts, percentages, sums, or means). Supports run-group processing and both numeric and character chart variables.

29 Proc GCHART example
   proc format;
      value seas 1 = 'Win' 2 = 'Spr' 3 = 'Sum' 4 = 'Fal';
   run;
   data ca88air;
      set vol1.ca88air(where=(station="SFO"));
      /* map calendar month to season */
      if ( month in (12,1,2) ) then season = 1;
      else if ( month in (3,4,5) ) then season = 2;
      else if ( month in (6,7,8) ) then season = 3;
      else if ( month in (9,10,11)) then season = 4;
      format season seas.;
      format month mth.;   /* mth. is not defined on this slide */
   run;

30 Proc GCHART example
   title1 h=4 'Mean seasonal carbon monoxide for station SFO';
   footnote j=l h=4 f=simplex 'Bar Chart - vertical';
   proc gchart data=ca88air;
      vbar season / sumvar=co type=mean discrete ctext=black clm=95;
   run;
   quit;

31

32 Proc GCHART syntax
   PROC GCHART data=data-set-name;
      One of the following:
      VBAR variables / options;
      HBAR variables / options;
      STAR variables / options;
      PIE variables / options;
      BLOCK variables / options;
   run;

33 VBAR Produces a separate bar chart for each chart variable. Each bar represents the statistic selected for a value of the chart variable. The response axis (vertical) provides a scale for the statistic graphed; the midpoint axis is the horizontal axis.

34 VBAR SYNTAX VBAR chart-variable(s) / options; chart-variable(s) specifies one or more variables that define the categories of data to chart. options specifies appearance, statistics, axes, and midpoint options.

35 VBAR Midpoints are the values of the chart variable that identify categories of data. By default, midpoints are selected or calculated by the procedure. The way the procedure handles the midpoints depends on whether the values of the chart variable are character, discrete numeric, or continuous numeric. For character chart variables, a separate bar is drawn for each value.

36 VBAR For numeric chart variables, each bar represents a range of values by default. The DISCRETE option generates a midpoint for each unique value of the chart variable. Without DISCRETE, the procedure generates midpoints that represent ranges of values: by default it determines the ranges, calculates the median value of each range, and displays that median value at each midpoint on the chart. A value that falls exactly halfway between two midpoints is placed in the higher range.

37 VBAR OPTIONS For character or discrete numeric values, you can use the MIDPOINTS= option to rearrange the midpoints or to exclude midpoints from the chart. For character data, list the MIDPOINTS= values in quotes: MIDPOINTS='Sydney' 'Atlanta' 'Paris'

38 VBAR OPTIONS For continuous numeric variables, use the MIDPOINTS= option to change the number of midpoints, to control the range of values each midpoint represents, or to change the order of the midpoints. To control the range of values each midpoint represents, use the MIDPOINTS= option to specify the median value of each range. For example, to select the ranges 20-29, 30-39, and 40-49, specify MIDPOINTS=
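To make the effect of MIDPOINTS= on a continuous variable concrete, here is a small sketch. It assumes the sashelp.class sample data set (student heights in inches) rather than the lecture's air quality data, and the midpoint values 55, 65, and 75 are chosen only for illustration.

```sas
title1 'Number of students by height range';

proc gchart data=sashelp.class;
   /* Default: the procedure chooses the midpoints itself */
   vbar height;
run;
   /* Explicit midpoints: three bars centered on 55, 65 and 75 */
   vbar height / midpoints=55 65 75;
run;
quit;

title1;   /* cancel the title */
```

Comparing the two charts shows how the number and position of bars changes once the midpoints are supplied explicitly.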

39 VBAR OPTIONS Other options: DISCRETE draws a separate bar for each value of a numeric variable. TYPE=statistic specifies the chart statistic: FREQ (frequency; the default when SUMVAR= is not specified), PCT (percentage), SUM (sum; the default when SUMVAR= is specified), MEAN (mean). CLM=confidence-level sets the confidence level for the chart's confidence intervals (error bars).

40 VBAR SYNTAX SUMVAR=variable specifies the variable to use for sum or mean calculations for each midpoint. The resulting statistics are represented by the length of the bars along the response axis, and they are displayed at major tick marks. Required when specifying TYPE=MEAN or TYPE=SUM. RAXIS=AXISn assigns an AXIS definition to the response axis; MAXIS=AXISn assigns one to the midpoint axis.
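The options above can be tried on any small data set. The sketch below again assumes sashelp.class (not the lecture data) and shows SUMVAR= with TYPE=MEAN for a character chart variable, then DISCRETE with TYPE=FREQ for a numeric one, using run-group processing to draw both charts in one procedure call.

```sas
title1 'Mean height by sex';

proc gchart data=sashelp.class;
   /* Character chart variable: one bar per value of sex,
      bar length is the mean of height within each group */
   vbar sex / sumvar=height type=mean;
run;

   /* Same procedure, new run group: one bar per distinct age */
   title1 'Number of students at each age';
   vbar age / discrete type=freq;
run;
quit;

title1;   /* cancel the title */
```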

41 GLOBAL STATEMENTS Define titles and footnotes; control axes, symbols, patterns, and legends. Can be defined anywhere, inside a proc or before a proc. Remain in effect until canceled, replaced, or the end of the SAS session. Cancel by repeating the statement with no options or by using goptions reset=all;

42 GLOBAL STATEMENTS TITLE defines titles; AXIS defines the appearance of axes; FOOTNOTE defines footnotes; PATTERN defines patterns used in graphs (histograms); LEGEND defines legends; SYMBOL defines symbols (plotting); NOTE adds text to a graph.

43 TITLE STATEMENT Creates, changes, or cancels a title for all subsequent graphics output in a SAS session. Up to 10 titles are allowed. The keyword TITLE can be followed by any number of text strings and options. Text strings are enclosed in single or double quotes. The most recently created TITLE of a given number replaces the previous TITLE of the same number.

44 Title syntax TITLEn <options | 'text-n'>; Options: FONT=font specifies the font for the subsequent text. HEIGHT=n (or H=n) specifies the height of the text in number of units. JUSTIFY=R | L | C (or J=) specifies the alignment: C=center, R=right, L=left. By default, JUSTIFY=C.
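A short sketch of the TITLE options in use. The title text and heights are made up for illustration, and PROC GSLIDE is used here only because it draws titles and footnotes without needing any data.

```sas
goptions reset=all gunit=pct;   /* heights below are percentages */

/* Options apply to the text strings that follow them */
title1 font=swissb height=6 justify=left 'Left-justified main title';
title2 height=4 'Centered by default' justify=right 'Right-hand note';
footnote1 height=3 justify=left 'Footnotes take the same options';

proc gslide;
run;
quit;

/* Cancel the titles and footnote so they do not carry over */
title1;
footnote1;
```

Canceling title1 also cancels every higher-numbered title, so title2 does not need its own statement.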

45 PATTERN STATEMENT Defines the characteristics of patterns used in charts: the type of fill pattern (solid, empty, lined) and the color. An example of a global statement.

46 PATTERN STATEMENT PATTERN options; Options: COLOR= pattern color. VALUE= fill: E (empty), S (solid), Ln (left-slanting lines), Rn (right-slanting lines), Xn (crosshatched lines), where n is a density from 1 to 5, with 1 indicating the lightest.

47 Proc GCHART example
   pattern1 color=blue value=solid;
   pattern2 color=red value=solid;
   proc gchart data=ca88air;
      star month / sumvar=co type=mean discrete ctext=black noheading;
   run;
   quit;

48

49 Exporting graphs Make sure the graphics window has focus by clicking on it. Choose File → Export as Image and select the type of image (GIF, …). Then open the other software program (e.g. PowerPoint) and insert the picture.

50 Saving graphs Graphs can also be saved in a SAS catalog. They are stored in a SAS proprietary format and can be viewed with proc greplay, which also allows multiple plots on one page.
   goptions replace;
   libname mylib 'c:\Temp\sasclass\myfiles';
   proc gchart data=mydat gout=mylib.mygraphs;
   …
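The save-and-replay cycle can be sketched as follows. To keep the example independent of any directory, it assumes a catalog named mygraphs in the WORK library and the sashelp.class sample data; both are stand-ins for the permanent library and data set on the slide.

```sas
/* Store the chart in a named catalog instead of WORK.GSEG */
proc gchart data=sashelp.class gout=work.mygraphs;
   vbar sex;
run;
quit;

/* NOFS runs GREPLAY with statements rather than its windows */
proc greplay igout=work.mygraphs nofs;
   list igout;        /* write the catalog contents to the log */
   replay _last_;     /* redisplay the most recent entry */
run;
quit;
```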

51 PROC GPLOT Graphs one variable against another, producing presentation-quality plots. The coordinates of each point correspond to the values in one or more observations of the input data set. Run-group processing: the procedure does not end with a RUN statement; you can submit new statements and produce more graphs without another PROC statement; it ends with QUIT or the next PROC or DATA step.

52 Proc GPLOT Produces two-dimensional graphs that plot one variable against another within a set of coordinate axes. Graphs are automatically scaled to the values of your data, although scaling can be controlled with options or with AXIS statements. Plot types include scatterplots, bubble plots, and plots with interpolated lines (SYMBOL statement).

53

54 GPLOT SYNTAX
   PROC GPLOT data=data-set-name;
      PLOT request-list;
The request list is of the form vertical*horizontal, e.g. PLOT y*x; or vertical*horizontal=variable, e.g. PLOT y*x=z;

55 GPLOT SYNTAX Graphics options on the PLOT statement: CTEXT=color; LEGEND=LEGENDn (uses the nth global LEGEND statement); HAXIS=AXISn (uses the nth global AXIS statement); VAXIS=AXISn (uses the nth global AXIS statement).
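A compact sketch tying the PLOT statement options to their global statements. It assumes the sashelp.class sample data (height in inches, weight in pounds) instead of the lecture's ca88air data, and the colors and label text are arbitrary.

```sas
goptions reset=all;

/* One SYMBOL definition per level of the third variable */
symbol1 value=dot  interpol=none color=blue height=1.5;
symbol2 value=plus interpol=none color=red  height=1.5;

axis1 label=('Height (inches)');
axis2 label=(angle=90 'Weight (pounds)');
legend1 label=none position=(bottom center outside);

proc gplot data=sashelp.class;
   /* vertical*horizontal=third variable */
   plot weight*height=sex / haxis=axis1 vaxis=axis2
                            legend=legend1 ctext=black;
run;
quit;
```

Each AXISn and LEGENDn definition has no effect until a PLOT option such as HAXIS=, VAXIS=, or LEGEND= refers to it.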

56 Proc GPLOT example Suppose we are asked to draw a plot of ozone by month for the three stations SFO, LIV, AZU. After consulting the help we might try:
   proc gplot data=ca88air;
      plot o3 * month;
   run;
   quit;
which produces:

57

58 Proc GPLOT example Increase the size of the text, use a format to print month names, and clear the unwanted footnote:
   goptions gunit=pct htext=4;
   footnote1;
   proc gplot data=ca88air;
      plot o3 * month;
      format month mth.;
      title1 '1988 Air Quality Data - Ozone';
   run;

59

60 Proc GPLOT example Back to the help: you can make a plot stratified by station. The x-axis is too crowded, so use a different format:
   proc gplot data=ca88air;
      plot o3 * month = station;
      format month mthc.;
      title1 '1988 Air Quality Data - Ozone';
   run;

61

62 Proc GPLOT example The symbols in the plot are too small, so use SYMBOL global statements:
   symbol1 v=dot i=join c=blue h=1.3;
   symbol2 v=dot i=join c=green h=1.3;
   symbol3 v=dot i=join c=brown h=1.3;
   proc gplot data=ca88air;
      plot o3 * month = station;
      format month mthc.;
      title1 '1988 Air Quality Data - Ozone';
   run;

63

64 Proc GPLOT example The x-axis is not right, so use an AXIS global statement:
   axis1 minor = none
         label = (f=simplex j=c 'Ozone levels at three locations')
         major = (h=1.1)
         order = (0 to 13 by 1)
         value = (f=simplex h=3.0);
   proc gplot data=ca88air;
      plot o3 * month = station / haxis=axis1;
      format month mthc.;
      title1 '1988 Air Quality Data - Ozone';
   run;

65

66 Proc GPLOT example The x-axis has extra characters: use a new format or an AXIS global statement. The y-axis label needs to be rotated and placed in the center of the axis. The legend needs moving: use a LEGEND global statement.
   axis1 minor = none
         label = (f=centb j=c 'Ozone levels at three locations')
         major = (h=1.0)
         order = (0 to 13 by 1)
         value = (f=simplex h=3.0 " " "J" "F" "M" "A" "M" "J" "J" "A" "S" "O" "N" "D" " ");

67 Proc GPLOT example
   axis2 label = (f=centb rotate=0 angle=90 j=c 'Ozone')
         value = (f=simplex h=3.0);
   legend1 across=3 position=(bottom center inside) label=none;
   proc gplot data=ca88air;
      /* legend=legend1 is needed for the LEGEND1 definition to take effect */
      plot o3 * month = station / haxis=axis1 vaxis=axis2 legend=legend1;
      format month mthc.;
      title1 '1988 Air Quality Data - Ozone';
   run;

68

69 proc g3d and proc gcontour produce 3-dimensional analogs of gplot.

70 Maps You can use proc gmap to make simple presentation maps. There is another product by SAS called SAS/GIS, i.e. SAS / geographical information system.

71

72 Data taken from the CDC web page AIDS prevalence during rate is given for each state per 100,000 of population state is given by name and two letter code map data is provided by SAS in the library maps -- the map we will use is maps.us if you look in the maps library you will see data for maps for most countries and world maps

73 Data These data use FIPS coding to match geographic boundaries, e.g. the FIPS code for Alaska is 02 and for Maryland is 24. We need to join the AIDS data and the FIPS codes in order to map the data:
   proc sort data=aids; by name; run;
   proc sort data=state; by name; run;
   data join;
      merge aids(in=inaids) state(in=instate);
      by name;
      /* keep only states present in both data sets */
      if inaids and instate then output join;
   run;

74 Proc GMAP proc gmap is used to create a number of different types of map. The map we are interested in is a choropleth map, in which the rates are color-coded by state. Such a map shares many of the properties of a chart, particularly a pie or star chart: both use areas to represent information, but in the choropleth map the color or shading carries the displayed information.

75 Proc GMAP First we set up some global title and footnote statements: title1 color=blue font=centb "Acquired immunodeficiency syndrome (AIDS) by state" ; title2 font=cent "(per 100,000 of population)" ; title3 font=cent "12 months ending June, 1998" ; footnote1 color=green justify=left " Choropleth Map";

76 Proc GMAP The syntax of proc gmap is like that of other graphic procedures we have met, but it specifically requires: a map dataset (maps.us in this case); an id variable that is present in both the map dataset and the dataset we wish to map (in this case the variable state is in both datasets and contains the FIPS code). The syntax is:
   proc gmap map=map data=data;
      id idvar;
      choro rate / options;
   run;
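To show the pieces working together without the lecture's AIDS data, the sketch below builds a tiny response data set by hand. The three rate values are invented placeholders with no real-world meaning, and the data set name rates is hypothetical; only the FIPS codes (2 for Alaska, 6 for California, 24 for Maryland) and the maps.us map data set are real.

```sas
/* Hypothetical response data: FIPS state code plus a made-up rate */
data rates;
   input state rate;
   datalines;
2 10
6 20
24 30
;
run;

title1 'Choropleth sketch with invented rates';

/* ALL draws every state in the map, even those with no rate */
proc gmap map=maps.us data=rates all;
   id state;                      /* present in both data sets */
   choro rate / coutline=black;
run;
quit;

title1;
```

Because state is numeric in maps.us, the response data set stores the FIPS code as a number as well so that the id variables match.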

77 Proc GMAP title1 color=blue font=centb "Acquired immunodeficiency syndrome (AIDS) by state" ; title2 font=cent "(per 100,000 of population)" ; title3 font=cent "12 months ending June, 1998" ; footnote1 color=green justify=left " Choropleth Map"; proc gmap map=maps.us data=join; id state; choro rate / coutline=black midpoints= ; run;

78

79 Proc GMAP Instead of a choropleth map, you could also make a surface map. For example:
   proc gmap map=maps.us data=join;
      id state;
      surface rate / constant=20 cbody=red nlines=100;
      footnote1 color=green justify=left " Surface Map";
   run;

80

81 Axis statement Defines the appearance and location of axes and tick marks, the text and appearance of the axis label, and the order of data values on the axis. Up to 99 AXIS statements can be active in a SAS session. Syntax: AXIS<1...99> <options>;

82 Axis statement options ORDER=(value-list) specifies the data values in the order they are to appear on the axis. The values specified by ORDER= are the major tick mark values. These values are displayed at the major tick marks unless they are modified by the VALUE= option. Examples: ORDER=(10 to 50 by 10) ORDER=(10,20,30,40,50)

83 Axis statement options LABEL=(text-description 'text string') By default, the text of the axis label is either the variable name or a previously assigned variable label. Enclose each string in quotation marks. Text-description options: COLOR=text-color, ANGLE=degrees, FONT=font | NONE, HEIGHT=text-height, JUSTIFY=LEFT | CENTER | RIGHT. Example: label=(font=swissb color=blue j=l a=90 'Systolic BP mmHg')

84 Axis statement options VALUE=(text-description-1 'text-1' ... text-description-n 'text-n') modifies the major tick mark values, that is, the text that labels the major tick marks on the axis. Text-description defines the appearance and 'text' is the text of a major tick mark value. Text-description options: COLOR=text-color, ANGLE=degrees, FONT=font | NONE, HEIGHT=text-height, JUSTIFY=LEFT | CENTER | RIGHT.

85 Symbol statement Specifies symbols in GPLOT. Defines the appearance of symbols and plot lines, including bars, boxes, confidence limits, and area fills, and sets the interpolation method.

86 SYMBOL options; COLOR=symbol-color; FONT=font; HEIGHT=n; INTERPOL=interpolation method, e.g. R (regression), STEP (for KM plots), BOX; VALUE=symbol; WIDTH=n.
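As one hedged example of the INTERPOL= option, the sketch below asks for a linear regression line (INTERPOL=RL) through a scatterplot. It assumes the sashelp.class sample data; the color and width are arbitrary.

```sas
goptions reset=all;

/* Dots for the observations, with a fitted straight line */
symbol1 value=dot interpol=rl color=blue width=2;

proc gplot data=sashelp.class;
   plot weight*height;
run;
quit;
```

Changing interpol=rl to interpol=join connects the points in data order instead, as in the ozone plots earlier in the lecture.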