Creating Summary Data Sets Ron Cody, Ed.D. Robert Wood Johnson Medical School.

Test data set (CLINIC)

    SUBJECT  GENDER  AGE_GROUP  BLOOD_TYPE  HR  SBP  DBP
    1        M       1          A
             M       1          B
             M       2          O
             M       1          A
             F       2          A
             F       1          B
             F       2          O
             F       2          O
             F       1          A
             F       1          B
             M       1          B
             M       2          B

PROC MEANS vs. PROC SUMMARY

    PROC MEANS DATA=data_set_name NOPRINT;

is equivalent to

    PROC SUMMARY DATA=data_set_name;

Creating a SUMMARY Data Set Containing MEANS

    PROC MEANS DATA=CLINIC NOPRINT;
       /* Equivalent to PROC SUMMARY DATA=CLINIC; */
       CLASS GENDER;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
    RUN;

Listing of data set OUT1

    Obs  GENDER  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
         F
         M

Using a BY Statement Instead of a CLASS Statement

    PROC SORT DATA=CLINIC;
       BY GENDER;
    RUN;

    PROC MEANS DATA=CLINIC NOPRINT;
       BY GENDER;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
    RUN;

Listing of data set OUT1

    Obs  GENDER  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    1    F
         M

Creating a SUMMARY Data Set Containing MEANS Broken Down by GENDER and AGE_GROUP

    PROC MEANS DATA=CLINIC NOPRINT;
       CLASS GENDER AGE_GROUP;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
    RUN;

    GENDER  AGE_GROUP  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    M
    F
    F
    M
    M

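Explaining the _TYPE_ Variable

    CLASS GENDER AGE_GROUP;

    Class Variables        Representation
    GENDER  AGE_GROUP      Binary  Decimal
    0       0              00      0
    0       1              01      1
    1       0              10      2
    1       1              11      3

A 1 means that the class variable is used in the breakdown for that row; the decimal value is what is stored in _TYPE_.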
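One way to put the _TYPE_ values to work is to route each level of summarization to its own data set. The following is only a sketch: it assumes the CLINIC data set from these slides and a numeric _TYPE_ (no CHARTYPE option), and the output data set names GRAND, BY_AGE, BY_GENDER, and CELLS are made up for illustration.

```sas
PROC MEANS DATA=CLINIC NOPRINT;
   CLASS GENDER AGE_GROUP;
   VAR HR SBP DBP;
   OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
RUN;

/* GRAND, BY_AGE, BY_GENDER, and CELLS are hypothetical names */
DATA GRAND BY_AGE BY_GENDER CELLS;
   SET OUT1;
   SELECT (_TYPE_);
      WHEN (0) OUTPUT GRAND;      /* binary 00: overall means       */
      WHEN (1) OUTPUT BY_AGE;     /* binary 01: AGE_GROUP only      */
      WHEN (2) OUTPUT BY_GENDER;  /* binary 10: GENDER only         */
      WHEN (3) OUTPUT CELLS;      /* binary 11: GENDER by AGE_GROUP */
      OTHERWISE;                  /* no other values are expected   */
   END;
RUN;
```

The rightmost variable on the CLASS statement corresponds to the lowest-order bit, which is why AGE_GROUP alone gives _TYPE_ = 1 and GENDER alone gives _TYPE_ = 2.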

Demonstrating the NWAY Option

    PROC MEANS DATA=CLINIC NOPRINT NWAY;
       CLASS GENDER AGE_GROUP;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
    RUN;

    GENDER  AGE_GROUP  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    F
    M
    M

Outputting More than One Statistic

    PROC MEANS DATA=CLINIC NOPRINT;
       CLASS GENDER;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN   = M_HR   M_SBP   M_DBP
                       N      = N_HR   N_SBP   N_DBP
                       MAX    = MAX_HR MAX_SBP MAX_DBP
                       MEDIAN = MED_HR MED_SBP MED_DBP;
    RUN;

    GENDER  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP  N_HR  N_SBP
    F
    M

    N_DBP  MAX_HR  MAX_SBP  MAX_DBP  MED_HR  MED_SBP  MED_DBP

Partial List of Some Available Statistics

    Keyword   Description
    -------   -----------------------------------------------------------
    MEAN      Mean
    N         Number of non-missing values
    NMISS     Number of missing values
    MIN       Smallest non-missing value
    MAX       Largest value
    MEDIAN    Median
    RANGE     Range (difference between the minimum and maximum values)
    Q1        25th percentile
    Q3        75th percentile
    QRANGE    Interquartile range (difference between the 25th and 75th percentiles)
    STD       Standard deviation
    STDERR    Standard error
    UCLM      Upper bound of the 95% confidence interval
    LCLM      Lower bound of the 95% confidence interval
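Any of these keywords can be used on the OUTPUT statement in the same way as MEAN. As an illustrative sketch, assuming the CLINIC data set from these slides, the step below requests quartile and confidence-limit statistics for SBP; the data set name SPREAD and the output variable names are invented for this example.

```sas
/* SPREAD and the output variable names are hypothetical */
PROC MEANS DATA=CLINIC NOPRINT NWAY;
   CLASS GENDER;
   VAR SBP;
   OUTPUT OUT=SPREAD Q1     = Q1_SBP
                     Q3     = Q3_SBP
                     QRANGE = IQR_SBP
                     LCLM   = LOW_SBP
                     UCLM   = HIGH_SBP;
RUN;
```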

Demonstrating the AUTONAME OUTPUT Option

    PROC MEANS DATA=CLINIC NOPRINT;
       CLASS GENDER;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN   =
                       N      =
                       MAX    =
                       MEDIAN = / AUTONAME;
    RUN;

    GENDER  _TYPE_  _FREQ_  HR_Mean  SBP_Mean  DBP_Mean  HR_N  SBP_N
    F
    M

    DBP_N  HR_Max  SBP_Max  DBP_Max  HR_Median  SBP_Median  DBP_Median

Another Way of Naming Output Variables

    PROC MEANS DATA=CLINIC NOPRINT NWAY;
       CLASS GENDER AGE_GROUP;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN=;
    RUN;

Listing of Data Set OUT1

    GENDER  AGE_GROUP  _TYPE_  _FREQ_  HR  SBP  DBP
    F
    F
    M
    M

Dropping Unneeded Variables in the Output Data Set

    PROC MEANS DATA=CLINIC NOPRINT NWAY;
       CLASS GENDER AGE_GROUP;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1(DROP= _:) MEAN=M_HR M_SBP M_DBP;
    RUN;

Listing of Data Set OUT1

    GENDER  AGE_GROUP  M_HR  M_SBP  M_DBP
    F
    F
    M
    M

Demonstrating the CHARTYPE Procedure Option

    PROC MEANS DATA=CLINIC NOPRINT CHARTYPE;
       CLASS GENDER AGE_GROUP;
       VAR HR SBP DBP;
       OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
    RUN;

Demonstrating CHARTYPE Option

    GENDER  AGE_GROUP  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    M
    F
    F
    M
    M

Demonstrating the CHARTYPE Procedure Option

    PROC PRINT DATA=OUT1 NOOBS;
       TITLE "Demonstrating CHARTYPE Option";
       WHERE _TYPE_ EQ "10";
    RUN;

Demonstrating CHARTYPE Option

    GENDER  AGE_GROUP  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    M

Another Way to Name Variables (Instead of Using a VAR Statement)

    PROC MEANS DATA=CLINIC NOPRINT;
       CLASS GENDER;
       ***VAR STATEMENT OPTIONAL;
       OUTPUT OUT=OUT1 MEAN(HR)         = M_HR
                       N(HR SBP DBP)    = N_HR N_SBP N_DBP
                       MAX(SBP)         = MAX_SBP
                       MEDIAN(SBP DBP)  = MED_SBP MED_DBP;
    RUN;

    GENDER  _TYPE_  _FREQ_  M_HR  N_HR  N_SBP  N_DBP  MAX_SBP  MED_SBP  MED_DBP
    F
    M

Multi-way Breakdowns Using a TYPES Statement

    PROC MEANS DATA=CLINIC NOPRINT CHARTYPE;
       CLASS GENDER AGE_GROUP BLOOD_TYPE;
       VAR HR SBP DBP;
       TYPES GENDER AGE_GROUP*GENDER BLOOD_TYPE*GENDER;
       OUTPUT OUT=OUT1 MEAN=M_HR M_SBP M_DBP;
    RUN;

    GENDER  AGE_GROUP  BLOOD_TYPE  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    M
    F       .          A
    F       .          B
    F       .          O
    M       .          A
    M       .          B
    M       .          O
    F
    F
    M
    M

Using the _TYPE_ Values to Create Multiple Data Sets

    DATA GENDER AGE_BY_GENDER BLOOD_BY_GENDER;
       SET OUT1;
       IF _TYPE_ = "100" THEN OUTPUT GENDER;
       ELSE IF _TYPE_ = "110" THEN OUTPUT AGE_BY_GENDER;
       ELSE IF _TYPE_ = "101" THEN OUTPUT BLOOD_BY_GENDER;
    RUN;

Listing of Data Set GENDER

    GENDER  AGE_GROUP  BLOOD_TYPE  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    M

Listing of Data Set AGE_BY_GENDER

    GENDER  AGE_GROUP  BLOOD_TYPE  _TYPE_  _FREQ_  M_HR  M_SBP  M_DBP
    F
    F
    M
    M

Examples of TYPES Statements

    TYPES A A*C D*C;
    TYPES A*(B C D);
    TYPES () A A*C*D;
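To make the parenthesized shorthand concrete, the sketch below writes the second example both ways. It assumes a hypothetical data set SURVEY with class variables A, B, C, and D and a numeric variable X; none of these names come from the CLINIC examples. Both steps should request the same three two-way breakdowns.

```sas
/* SURVEY, X, M_X, GROUPED, and EXPANDED are hypothetical names */
PROC MEANS DATA=SURVEY NOPRINT CHARTYPE;
   CLASS A B C D;
   VAR X;
   TYPES A*(B C D);        /* shorthand form */
   OUTPUT OUT=GROUPED MEAN=M_X;
RUN;

PROC MEANS DATA=SURVEY NOPRINT CHARTYPE;
   CLASS A B C D;
   VAR X;
   TYPES A*B A*C A*D;      /* the same request written out in full */
   OUTPUT OUT=EXPANDED MEAN=M_X;
RUN;
```

In the third example on the slide, the empty parentheses request the overall summary, that is, the row with no class variables.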

Using PROC FREQ to Count Frequencies

    PROC FREQ DATA=CLINIC NOPRINT;
       TABLES AGE_GROUP / OUT=NUMBER;
    RUN;

Listing of Data Set NUMBER

    AGE_GROUP  COUNT  PERCENT

Renaming the COUNT Variable

    PROC FREQ DATA=CLINIC NOPRINT;
       TABLES AGE_GROUP / OUT=NUMBER(RENAME=(COUNT=N_AGE)
                                     DROP=PERCENT);
    RUN;

Listing of Data Set NUMBER

    AGE_GROUP  N_AGE

Using PROC MEANS to Count Frequencies

    PROC MEANS DATA=CLINIC NOPRINT NWAY;
       CLASS AGE_GROUP;
       VAR HR; /* ANY NUMERIC VARIABLE */
       OUTPUT OUT=COUNTS(RENAME=(_FREQ_ = N_AGE)
                         DROP=_TYPE_ DUMMY) N=DUMMY;
    RUN;

Listing of Data Set COUNTS

    AGE_GROUP  N_AGE

Using PROC FREQ to Count Frequencies in a Two-way Table

    PROC FREQ DATA=CLINIC NOPRINT;
       TABLES GENDER*BLOOD_TYPE / OUT=FREQOUT(DROP=PERCENT
                                              RENAME=(COUNT=NUMBER));
    RUN;

Listing of Data Set FREQOUT

    GENDER  BLOOD_TYPE  NUMBER
    F       A           2
    F       B           2
    F       O           2
    M       A           2
    M       B           3
    M       O           1

Using PROC FREQ to Output More than One Data Set

    PROC FREQ DATA=CLINIC NOPRINT;
       TABLES AGE_GROUP / OUT=OUT1;
       TABLES GENDER / OUT=OUT2;
       TABLES GENDER*AGE_GROUP / OUT=OUT3;
    RUN;

Listing of Data Set OUT1

    AGE_GROUP  COUNT  PERCENT

Listing of Data Set OUT2

    GENDER  COUNT  PERCENT
    F       6      50
    M

Listing of Data Set OUT3

    GENDER  AGE_GROUP  COUNT  PERCENT
    F
    F
    M
    M