Creating a Compact Columnar Output with PROC REPORT Walter R. Young Principal Clinical Programmer Analyst Wyeth.



Why create a columnar output from a data set?
- For own use.
- To show others at a meeting.
- To put in a standardized regulatory report.
- To put in a publication.
The above are ranked in approximately increasing order of the effort required to create the report.
Author's Opinion: It is the neatest, most heuristic way to present a data set.

PROC CONTENTS for ECGTEST: 57 Variables, 28 Observations

Solution 1: Default PROC PRINT
Advantages:
- Easy and neat for a narrow data set.
- Can use ID and VAR statements.
Disadvantages:
- Virtually no beautification options.
- Doesn't work for a wide data set.
- Wraps neatly, but one can't control the wrapping except by increasing the page size.
- Wide columns are truncated.
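As a minimal sketch of Solution 1, the step below uses SASHELP.CLASS as a stand-in, because the variables of ECGTEST are not listed in these slides. The choice of ID and VAR variables is an assumption made only for illustration.

```sas
/* Stand-in data: SASHELP.CLASS replaces ECGTEST for illustration only */
proc print data=sashelp.class;
   id name;                     /* identifies each row; replaces the Obs column */
   var sex age height weight;   /* selects and orders the remaining columns */
run;
```

With only five narrow variables nothing wraps, which is the case where PROC PRINT is adequate.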

Default PROC PRINT

Default PROC REPORT

Solution 2: Use a Default PROC REPORT with the NOWD Option (Required in Batch)
Advantages:
- Output is columnar.
- A small number of variables fit on a page.
Disadvantages:
- Rows are not identified across multiple pages.
- Spacing between columns is uneven.
- Column labels split unattractively.
- Column order isn't optimum.
- Wide columns cause the program to fail.
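A default PROC REPORT needs nothing beyond the data set name and NOWD. The sketch below again assumes SASHELP.CLASS as stand-in data.

```sas
/* Default report: every variable, in data set position order, labels as headers */
proc report data=sashelp.class nowd;
run;
```

Because this data set contains a character variable, the default report lists one row per observation rather than a row of sums.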

PROC REPORT Defaults and Properties
- The PROC was designed to run interactively.
- Labels are used as column headers.
- Variables are in position order.
- Spacing is 2, including before the first column. The actual spacing rules are explainable but messy.
- The WRAP and NAMED options create messy output.
- If all variables are numeric and none is specified as DISPLAY, they are summed instead of listed.
- The MISSING option is needed to print all data rows.
- If a variable name that is not in the data set is listed in both the COLUMN and DEFINE statements, no error results.
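The summing behavior is easy to trip over, so here is a small sketch of it, assuming SASHELP.CLASS as stand-in data. The first step requests only numeric columns and gives no usage; the second declares one of them as DISPLAY.

```sas
/* Only numeric columns, no usage given: they default to ANALYSIS (SUM),
   so the report collapses to a single row of totals */
proc report data=sashelp.class nowd;
   column age height weight;
run;

/* Declaring a column as DISPLAY turns this into a detail report:
   one row per observation */
proc report data=sashelp.class nowd;
   column age height weight;
   define age / display;
run;
```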

PROC REPORT Justification Rules
- Default is right for numeric, left for character.
- Numerical values are right justified within formats, which are justified within the width.
- Character values are justified in the width with leading blanks retained and trailing blanks eliminated.
- Rules apply simultaneously to labels and values.

PROC REPORT SPLIT Character
- The SPLIT character splits both labels and FLOW variables.
- Choose a printable character (e.g., "~", "|", "\") other than the default "/", which is common in many entered texts.
- If a word's length in a flowed variable is greater than the variable's width, the word will split at that width.
- To indent flowed text, insert a split character plus the desired number of spaces and one at the end of the text.
- If there is a split character in the flowed variable, words at the end of the field will split randomly due to a SAS bug which will be fixed in a future SAS version. To fix this, one either widens the field to eliminate non-indented flow or writes a macro to insert split characters where desired.
- If unprintable printer control characters exist in the flowed variable, they must be removed. This is an uncommon problem which can happen if data is coming from many sources.
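A minimal sketch of a non-default split character combined with FLOW follows. The data set NOTES and its variables SUBJ and COMMENT are hypothetical, invented for this illustration. WIDTH=, FLOW and HEADLINE act on traditional monospace (LISTING) output.

```sas
/* Hypothetical example data: one long free-text value containing a "/" */
data work.notes;
   length subj $4 comment $80;
   subj    = '0001';
   comment = 'Patient reported mild headache and/or nausea after the second dose';
   output;
run;

/* "~" is the split character, so the "/" in "and/or" is left alone */
proc report data=work.notes nowd split='~' headline;
   column subj comment;
   define subj    / display width=7 'Subject~ID';                   /* label splits at ~ */
   define comment / display flow width=25 'Investigator~Comment';   /* text wraps within 25 */
run;
```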

Solution 3: Use the ID Option (Introduced in 6.12) With PROC REPORT
- Observations are identified.
- However, all other default problems exist.
- Won't work if the width of any variable:
  1. Exceeds the inherent PROC REPORT limit.
  2. Plus the width of the ID variables plus the spaces between columns exceeds the line size. (In this case the FLOW option must be used.)
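In PROC REPORT, ID is requested as an option on a DEFINE statement. The sketch below assumes SASHELP.CLASS as stand-in data and treats NAME as the identifying variable.

```sas
proc report data=sashelp.class nowd;
   column name sex age height weight;
   /* NAME, and any column to its left, is repeated at the left of every
      page that a wide row spills onto */
   define name / id;
run;
```

This stand-in data is narrow enough to fit on one page width, so the repetition only becomes visible with a wider data set or a smaller line size.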

Solution 4: PROC REPORT with Minimal Options for a Narrow Data Set
- Must use a COLUMN statement (analogous to a VAR statement in PROC PRINT).
- Use a BREAK statement for spaces between lines.
- Use HEADLINE, HEADSKIP, a BREAK statement or an underline to separate labels from the observations.
- Specify a constant spacing between columns.
- Customize labels in the DEFINE statements or use variable names (system NOLABEL option).
- Possibly use the PANELS= option to minimize paper use.
- The above gives you most of the features of PUT statement formatting (DATA _NULL_).
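Put together, the minimal options above might look like the following sketch. SASHELP.CLASS is assumed as stand-in data, and the labels, formats and widths are illustrative choices. HEADLINE, HEADSKIP, SPACING=, PANELS= and BREAK ... / SKIP act on traditional monospace (LISTING) output.

```sas
proc report data=sashelp.class nowd headline headskip spacing=2 panels=2;
   column sex name age height weight;                  /* column selection and order */
   define sex    / order   width=3 'Sex';
   define name   / display width=8 'Name';
   define age    / display format=3.  width=3 'Age';
   define height / display format=5.1 width=6 'Height';
   define weight / display format=5.1 width=6 'Weight';
   break after sex / skip;                             /* blank line after each SEX group */
run;
```

Each width is at least as long as its label, which avoids the unattractive label splitting seen in the default report.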

Solution 5: Use PROC REPORT With All Applicable Options for a Wide Data Set
If the width of a data set won't fit within the line size, one should make the output compact. In a compact output, the maximum number of observations of the variables should be made to fit either on the width of a single page or on the width of a minimum number of pages.
Author's Opinion: Presenting the data in columns on a single page width, neatly and informatively, is more heuristic than presenting it on multiple pages.

To make the report compact
- Make every reasonable effort to limit the width to a single page.
- Reduce the space between columns to one.
- Drop the space before the 1st column (SPACING=0 in DEFINE).
- Drop unnecessary variables from the COLUMN statement.
- Drop variables having the same value for all observations and consider putting them in a title, footnote or legend.
- Sort the data by sensible variables having a fair number of rows for each combination in the BY statement and use the BY in PROC REPORT.
- For data sets wider than a single page, pick the minimum number of ID variables to adequately identify all observations. Balance the width of the non-ID variables across pages.
- Use PROC FREQ to determine whether long variables can be coded and describe the code in a legend.
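Several of these steps can be sketched together, again assuming SASHELP.CLASS as stand-in data. SEX moves out of the columns and into the BY line, the spacing between columns drops to one, and the space before the first column is removed. The PROC FREQ step shows how one might check whether a variable has few enough distinct values to be coded.

```sas
proc sort data=sashelp.class out=work.class;
   by sex;
run;

proc report data=work.class nowd headline spacing=1;   /* one blank between columns */
   by sex;                                             /* SEX is shown in the BY line */
   column name age height weight;
   define name   / display spacing=0 width=8 'Name';   /* no blanks before column 1 */
   define age    / display format=3.  width=3 'Age';
   define height / display format=5.1 width=6 'Height';
   define weight / display format=5.1 width=6 'Weight';
run;

/* How many distinct values does a candidate variable take? */
proc freq data=sashelp.class;
   tables sex / nocum;
run;
```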

To make the report compact (continued)
- Don't use the FLOW option unless necessary, as it increases the number of lines per observation. Also, consider not using the SPLIT character in its label.
- Don't alter any variable if proofreading.
- Eliminate variables which have a one-to-one relationship with other variables.
- Sensibly condense character variables.
- Edit variables without eliminating their meaning.
- Transfer meaning from a variable to its label.
- Since formats can alter variable widths, apply them prior to calculating column widths.
- Use the STYLE attribute, some of the 6 font parameters and ODS. While good for publications, this doesn't support a standardized line size and appearance.

For alphanumeric variables
- Determine their maximum width in the data set.
- If a format increases this width, use that width.
- Consider removing any invariant prefixes or suffixes.
- If the FLOW parameter is required, consider the line size constraint, calculate the width plus spacing of all other variables and:
  1. For a single FLOW variable, use its maximum width.
  2. For multiple FLOW variables, determine how to best allocate their widths to minimize lines per observation.
  3. See if other data can be put on the added line(s) per observation (e.g., concatenate visit date, SPLIT character and visit name and use the FLOW option).
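One way to find the maximum width actually used by a character variable is a PROC SQL query over LENGTH(), sketched here on SASHELP.CLASS as stand-in data. The column aliases are illustrative names.

```sas
proc sql;
   select max(length(name)) as name_width,   /* longest NAME value, trailing blanks ignored */
          max(length(sex))  as sex_width
      from sashelp.class;
quit;
```

If a format will be applied in the report, the same query can be run on the formatted value, for example by wrapping the variable in PUT() with that format.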

For numeric variables
- Determine their range, maximum value and whether they're integer, and then specify an appropriate format (not the default BEST) and decimal point.
- For date time variables, specify an appropriate compact format (e.g., MMDDYY6.). Separate date and time with DATEPART. If time is missing for all observations, remove it from the report.
- If it has a format which transforms it into an alphanumeric variable, apply the format and treat it as though it were an alphanumeric variable.
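The sketch below illustrates these steps on a small hypothetical data set; VISITS, SUBJ, VISITDTM, VISITDT and HR are names invented for the example. PROC MEANS gives the range needed to choose a format, DATEPART extracts the date, and MMDDYY6. prints it in six characters.

```sas
/* Hypothetical data: a datetime and an integer measurement */
data work.visits;
   input subj $ visitdtm :datetime18. hr;
   visitdt = datepart(visitdtm);          /* date portion only */
   datalines;
0001 12MAR2004:08:15:00 72
0002 15MAR2004:14:40:00 101
;
run;

/* Range check: three digits are enough for HR in this data */
proc means data=work.visits min max;
   var hr;
run;

proc report data=work.visits nowd headline split='~' spacing=1;
   column subj visitdt hr;
   define subj    / display width=7 'Subject';
   define visitdt / display format=mmddyy6. width=6 'Visit~Date';
   define hr      / display format=3. width=3 'HR';
run;
```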

To increase the attractiveness of a compact report
- Appropriately order the COLUMN statement variables.
- Appropriately specify ORDER variables and the BREAK statement (e.g., blank line between ID variables).
- Use informative labels neatly spanned in the COLUMN statement and appropriately split in the DEFINE.
- Use tricks (e.g., unprintable character at end of label, SPLIT character and blank at beginning of label, SPACING=0) to separately justify labels and values.
- Use neat and informative titles, footnotes and/or legends. If necessary, expand a label's meaning in a legend.
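A spanned label is written as a quoted string inside parentheses in the COLUMN statement. The sketch below assumes SASHELP.CLASS as stand-in data; the header text and labels are illustrative. In monospace (LISTING) output, the hyphens at both ends of the spanning header ask PROC REPORT to pad it with dashes across the columns it covers.

```sas
proc report data=sashelp.class nowd headline split='~' spacing=1;
   column sex name ('-Measurements-' age height weight);   /* header spans three columns */
   define sex    / order spacing=0 width=3 'Sex';
   define name   / display width=8 'Student~Name';          /* label split at ~ */
   define age    / display format=3.  width=3 'Age';
   define height / display format=5.1 width=6 'Height';
   define weight / display format=5.1 width=6 'Weight';
   break after sex / skip;                                  /* blank line between groups */
run;
```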

20 Possible Variables for Compact Output

Condensing and Editing LVALC and SASNAME

Compact Output (15 Variable) Data Step

Compact Output

Final Compact Output

Compact Output PROC REPORT

Compact Output With BY Statement

Compact Output With Patient ID Option

General ECGTEST Compact Report
- Decide what variables to always exclude.
- Decide constant variables for title.
- Decide what 1 to 1 variables to include.
- Count and remove applicable leading zeroes.
- Determine which data condensing tricks work.
- If not condensable, they must be output in full.
- Predetermine variables that need FLOW.
- Determine which variables have a fixed width.
- Calculate width of all remaining variables.
- Use BY variables and the ID option for date.

Compact Output With Visit ID Option

Original AE PROC REPORT Code

Changes Made to Produce Final Listing
- Width of all variables was minimized.
- Leading zeroes were stripped from subject; it was concatenated with age and sex, and FLOW was added.
- Century was eliminated, and date was output after subject and made an ORDER variable.
- Body system was coded into footnotes.
- Verbatim label indentation was corrected.
- Labels were beautified. "STUDY DAY" was centered.
- Width of indented column was maximized to eliminate FLOW of the concatenated variables.
- DAI was put in the data set and FLOW added.
The above reduced the output from 21 to 11 pages.

Final AE PROC REPORT Code

Creating an Automated AE Listing
- Find variables needed for other projects and:
  1. Find if their width is variable and calculate it.
  2. Make attractive labels with SPLIT characters.
  3. Exclude them if they are blank (e.g., time).
  4. Use minimum possible width.
  5. 2 lines per observation: Thus use FLOW?
- Maximize width of the verbatim variable.
- User should specify variables and their order.
- Change footnotes to an automated legend.
- Add options for the BY variables.
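The width calculation is the easiest part to automate. The macro below is a hypothetical helper, not the author's code: %MAXWIDTH and its parameters are invented names, SASHELP.CLASS stands in for the AE data, and the TRIMMED keyword assumes SAS 9.3 or later. It stores the longest observed length of a character variable in a macro variable that a DEFINE statement can then use.

```sas
/* Hypothetical helper: put the maximum used length of &var into macro variable &mvar */
%macro maxwidth(data=, var=, mvar=);
   %global &mvar;
   proc sql noprint;
      select max(length(&var)) into :&mvar trimmed
         from &data;
   quit;
%mend maxwidth;

%maxwidth(data=sashelp.class, var=name, mvar=w_name)

proc report data=sashelp.class nowd headline spacing=1;
   column name age;
   define name / display width=&w_name 'Name';   /* width taken from the data */
   define age  / display format=3. width=3 'Age';
run;
```

A fuller version would loop over a user-supplied variable list and also compare each width with the length of the longest word in the label.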