Merging in SAS


These slides show alternative ways of merging two datasets using the IN= data set option (see the SAS OnlineDoc > “BASE SAS”, “SAS Language Reference: Dictionary” > “Data step options” > “IN=”). In the slides, the red data goes into the merged data set. The greyed-out observations are left out.

The perfect merge
Dataset A: ID, V1, V2
Dataset B: ID, V3, V

Not so perfect (if a or b;)
Dataset A (in=a): ID, V1, V2
Dataset B (in=b): ID, V3, V

If a=b; (both datasets contribute)
Dataset A (in=a): ID, V1, V2
Dataset B (in=b): ID, V3, V

If a; (must be in dataset A)
Dataset A (in=a): ID, V1, V2
Dataset B (in=b): ID, V3, V

If b; (must be in dataset B)
Dataset A (in=a): ID, V1, V2
Dataset B (in=b): ID, V3, V
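One way to see what each IF condition keeps or drops is to route the observations into separate output data sets according to the IN= flags. The following is only a sketch: the data set names dataset_a, dataset_b, both, only_a and only_b are hypothetical, and both inputs are assumed to be sorted by ID already.

```sas
/* Split the merge result by where each ID was found.         */
/* dataset_a, dataset_b and the output names are hypothetical. */
data both only_a only_b;
    merge dataset_a (in=a) dataset_b (in=b);
    by ID;
    if a and b then output both;    /* ID present in both inputs */
    else if a then output only_a;   /* ID present only in A      */
    else output only_b;             /* ID present only in B      */
run;
```

Under these assumptions, "if a;" would keep the observations in both and only_a, "if b;" those in both and only_b, and "if a or b;" all three groups.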

Notes
The examples assume there is a unique identifier. This can be a single variable (for example, CRSP's PERMNO or Compustat's GVKEY) or a combination of variables (for example, PERMNO and DATE for a panel dataset).
Assumption: both data sets are sorted by the unique identifier(s).
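The sorting assumption can be met with PROC SORT before the merge. This is a minimal sketch for the panel case mentioned above, with PERMNO and DATE as the identifiers. The data set names dataset_a, dataset_b, a_sorted, b_sorted, a_dups and b_dups are hypothetical.

```sas
/* Sort each input by the identifier(s) used in the BY statement. */
/* NODUPKEY keeps one observation per PERMNO/DATE combination;    */
/* DUPOUT= collects the discarded duplicates so that uniqueness   */
/* can be checked (an empty a_dups/b_dups means the key is unique).*/
proc sort data=dataset_a out=a_sorted nodupkey dupout=a_dups;
    by permno date;
run;

proc sort data=dataset_b out=b_sorted nodupkey dupout=b_dups;
    by permno date;
run;
```

Writing to OUT= data sets leaves the original inputs untouched, which seems the safer choice when checking for duplicates.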

Sample code
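A minimal sketch of such a merge, assuming two hypothetical data sets dataset_a and dataset_b that share the identifier ID and are already sorted by it:

```sas
/* Merge two sorted data sets by ID and keep only matched IDs. */
/* dataset_a, dataset_b and merged are hypothetical names.     */
data merged;
    merge dataset_a (in=a) dataset_b (in=b);
    by ID;
    if a and b;      /* both data sets contribute; "if a=b;" selects the same rows */
    /* if a or b;       keep every ID found in either data set */
    /* if a;            keep only IDs found in dataset_a       */
    /* if b;            keep only IDs found in dataset_b       */
run;
```

The variables a and b created by IN= are temporary 0/1 flags. They can be used in the DATA step but are not written to the output data set.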

Typical problems
If both datasets were complete (that is, they both contain the same observed units), the IF statements would be unnecessary: "if a and b" would be equivalent to leaving the statement out altogether.
If you do not have a BY statement (there is no identifier, but you somehow know that each row of one dataset corresponds to the same row in the other dataset), the datasets are just "glued" side by side.
Common mishap: the BY variables have different formats across datasets. SAS will merge the datasets, but will put a WARNING in the log.
Another common mishap is to have variables with the same name (that are not the ID) in both datasets: one of them will be overwritten.
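The same-name problem can be avoided by renaming the clashing variable in one of the inputs with the RENAME= data set option. In this sketch the variable name value and the data set names are hypothetical:

```sas
/* Both inputs are assumed to contain a non-ID variable called VALUE. */
/* Renaming it on the way in keeps both versions in the result.       */
data merged;
    merge dataset_a (in=a)
          dataset_b (in=b rename=(value=value_b));
    by ID;
    if a and b;
run;
```

After the merge, value holds the number from dataset_a and value_b the number from dataset_b, so the two can also be compared.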

References
Good references are 4.html and a manual called "Combining and modifying SAS data sets: examples", which is in the RC library. It has a lot of examples. Unfortunately, it does not exist in an online version: only the code is available online, and the explanations, which are very good, are not.