Match-Merge in the Data Step

Example Data

    data htwt;
      length id 8;
      input height weight id @@;
      datalines;
    56.50 98 1 62.25 145 2 62.50 128 3 64.75 119 4
    68.75 144 5 60.00 117 6 58.00 125 7
    ;
    run;

    data chol;
      input chol id @@;
      datalines;
    234 1 172 2 248 3 215 4 145 5 281 6 335 7
    ;
    run;

    proc print data=htwt noobs; run;
    proc print data=chol noobs; run;

Match-merging combines observations from two or more SAS data sets into a single observation in a new SAS data set, matching rows on the value of the BY variable.

    data tot1;
      merge htwt chol;
      by id;
    run;

    proc print data=tot1 noobs; run;

Match-Merge in the data step requires:
- All data sets are sorted on the BY variable.
- The BY variable exists and has the same name in all data sets being merged.

    proc sort data=htwt; by id; run;
    proc sort data=chol; by id; run;

    data tot1;
      merge htwt chol;
      by id;
    run;

    proc print data=tot1; run;
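When the key variable has a different name in one of the data sets, the same-name requirement can usually be met with the rename= data set option on the MERGE statement. The following is a minimal sketch; the data sets htwt_small and chol_pid and the variable patid are hypothetical names made up for this illustration.

```sas
/* Hypothetical data: the key is called id here */
data htwt_small;
  input height weight id @@;
  datalines;
56.50 98 1 62.25 145 2 62.50 128 3
;
run;

/* Hypothetical data: the same key is called patid here */
data chol_pid;
  input chol patid @@;
  datalines;
234 1 172 2 248 3
;
run;

/* Each data set is sorted on its own key variable */
proc sort data=htwt_small; by id; run;
proc sort data=chol_pid; by patid; run;

/* rename= is applied as chol_pid is read, so BY id works for both inputs */
data tot_renamed;
  merge htwt_small chol_pid(rename=(patid=id));
  by id;
run;

proc print data=tot_renamed noobs; run;
```

Renaming on input leaves chol_pid itself unchanged; only the merged data set carries the name id.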

Match-Merge: Different Number of Rows

    data htwt;
      input height weight id @@;
      datalines;
    56.50 98 1 62.25 145 2 62.50 128 3 64.75 119 4
    68.75 144 5 60.00 117 6 63.00 156 9 63.00 134 10
    ;
    run;

    data chol;
      input chol id @@;
      datalines;
    215 1 145 2 281 3 335 4 196 7
    ;
    run;

    proc print data=htwt noobs; run;
    proc print data=chol noobs; run;

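With no subsetting, every id found in either data set appears in the result; variables from a data set that has no row for that id are missing.

    data tot4;
      merge htwt(in=h) chol(in=c);
      by id;
    run;

    proc print data=tot4 noobs; run;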

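The forgotten BY

Without a BY statement, MERGE pairs observations by position (first with first, second with second), so values from different ids end up in the same row.

    data tot3;
      merge htwt chol;
    run;

    proc print data=tot3; run;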
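The effect of leaving out BY is easier to see on a few rows. This is a small sketch with made-up data sets (a, b, and nobymerge are names chosen for the illustration, not part of the original example).

```sas
/* Hypothetical data: ids 1, 2, 9 */
data a;
  input height id @@;
  datalines;
56.50 1 62.25 2 63.00 9
;
run;

/* Hypothetical data: ids 1, 7 */
data b;
  input chol id @@;
  datalines;
215 1 196 7
;
run;

/* No BY statement: rows are paired by position, not by id */
data nobymerge;
  merge a b;
run;

/* With BY: rows are paired by id */
data bymerge;
  merge a b;
  by id;
run;

proc print data=nobymerge noobs; run;
proc print data=bymerge noobs; run;
```

In nobymerge the second row carries the height that belongs to id 2 but is labelled id 7, because the id read from b overwrites the id read from a. In bymerge, ids 2 and 7 stay in separate rows.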

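in= Option

The in= option creates a temporary flag that is 1 when the data set contributes to the current observation. Here only ids present in htwt are kept.

    data tot5;
      merge htwt(in=h) chol(in=c);
      by id;
      if h;
    run;

    proc print data=tot5 noobs; run;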

in= Option

Here only ids present in chol are kept.

    data tot6;
      merge htwt(in=h) chol(in=c);
      by id;
      if c;
    run;

    proc print data=tot6 noobs; run;

in= Option

Here only ids present in both data sets are kept.

    data tot6;
      merge htwt(in=h) chol(in=c);
      by id;
      if h and c;
    run;

    proc print data=tot6 noobs; run;

Three Data Sets

    data htwt;
      length id 8;
      input height weight id @@;
      datalines;
    56.50 98 1 62.25 145 2 62.50 128 3 64.75 119 4
    68.75 144 5 60.00 117 6 63.00 156 9 63.00 134 10
    ;
    run;

    data chol;
      input chol id @@;
      datalines;
    215 1 145 2 281 3 335 4 196 7
    ;
    run;

    data bp;
      input dbp sbp id @@;
      datalines;
    83 125 1 73 108 4 71 108 5 79 116 6
    89 170 7 80 120 8 70 108 9 79 123 10
    ;
    run;

    proc print data=htwt noobs; run;
    proc print data=chol noobs; run;
    proc print data=bp noobs; run;

Match-Merge Three Data Sets

    proc sort data=bp; by id; run;
    proc sort data=htwt; by id; run;
    proc sort data=chol; by id; run;

    data tot6;
      merge bp(in=b) htwt(in=h) chol(in=c);
      by id;
      if b and h and c;
    run;

    proc print data=tot6 noobs; run;
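The in= flags can also route observations to different output data sets in a single pass, instead of running one data step per subset. A minimal sketch on made-up two-table data (htwt_s, chol_s, and the three output names are hypothetical):

```sas
/* Hypothetical data: ids 1, 2, 9 */
data htwt_s;
  input height weight id @@;
  datalines;
56.50 98 1 62.25 145 2 63.00 156 9
;
run;

/* Hypothetical data: ids 1, 2, 7 */
data chol_s;
  input chol id @@;
  datalines;
215 1 145 2 196 7
;
run;

/* One merge, three outputs, chosen by the in= flags */
data both only_htwt only_chol;
  merge htwt_s(in=h) chol_s(in=c);
  by id;
  if h and c then output both;       /* id found in both data sets */
  else if h then output only_htwt;   /* id found only in htwt_s    */
  else output only_chol;             /* id found only in chol_s    */
run;

proc print data=both noobs; run;
proc print data=only_htwt noobs; run;
proc print data=only_chol noobs; run;
```

With this data, both holds ids 1 and 2, only_htwt holds id 9, and only_chol holds id 7. The flags h and c are temporary and are not written to any of the output data sets.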

Data sets with the same variable: overlaying columns

    data chol1 (keep=sbp height chol id);
      length id 8;
      set fram.frex4(obs=15);
      where sbp ne . and height ne . and chol ne .;
      id=_n_;
    run;

    /* obs= applies to input data sets only; the loop already writes 15 rows */
    data chol2 (keep=chol id);
      call streaminit(12345);
      do id=1 to 15;
        chol=int(rand("normal",240,40));
        output;
      end;
    run;

    proc print data=chol1 noobs; run;
    proc print data=chol2 noobs; run;

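Both data sets contain chol. The value from the data set listed last on the MERGE statement (chol2) overwrites the other.

    proc sort data=chol1; by id; run;

    data totchol;
      merge chol1 chol2;
      by id;
    run;

    proc print data=totchol noobs; run;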

With the order reversed, chol comes from chol1.

    data totchol2;
      merge chol2 chol1;
      by id;
    run;

    proc print data=totchol2 noobs; run;
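The overlay can be checked on two tiny data sets, and rename= offers one way to keep both values when overwriting is not wanted. This is a sketch with made-up data; c1, c2, chol_1, and chol_2 are hypothetical names.

```sas
/* Hypothetical data: two measurements of chol for the same ids */
data c1;
  input chol id @@;
  datalines;
200 1 210 2
;
run;

data c2;
  input chol id @@;
  datalines;
240 1 250 2
;
run;

/* Same variable name in both inputs: the data set listed last wins */
data overlay;
  merge c1 c2;
  by id;
run;

/* Renaming on input keeps both values side by side */
data keepboth;
  merge c1(rename=(chol=chol_1)) c2(rename=(chol=chol_2));
  by id;
run;

proc print data=overlay noobs; run;
proc print data=keepboth noobs; run;
```

In overlay, chol is 240 and 250 (the values from c2). In keepboth, each id has chol_1 from c1 and chol_2 from c2.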

Summary
- All data sets must be sorted on the BY variable.
- The BY variable must have the same name in all data sets on the MERGE statement.
- Columns with the same name are overlaid.
- Output is controlled with the in= option.