Merging in SAS These slides show alternatives regarding the merge of two datasets using the IN data set option (check in the SAS onlinedoc > “BASE SAS”,

1 Merging in SAS These slides show alternatives for merging two datasets using the IN= data set option (see the SAS OnlineDoc > “BASE SAS” > “SAS Language Reference: Dictionary” > “Data step options” > “IN=”). In the slides, the red data goes into the merged data set. The greyed-out observations are left out.

2 The perfect merge Dataset A Dataset B ID V1 V2 ID V3 V4 1123 1343 24214342854234 31294363325434 41227674763234 5232345229324 65344356554324 734389788434 832467878895342

3 Not so perfect (if a or b;) Dataset A (in=a) Dataset B (in=b) ID V1 V2 ID V3 V4 1343 24214342854234 3129436 41227674763234 5229324 65344356554324 734389 832467878895342

4 If a=b; (both datasets contribute) Dataset A (in=a) Dataset B (in=b) ID V1 V2 ID V3 V4 1343 24214342854234 3129436 41227674763234 5229324 65344356554324 734389 832467878895342

5 If a; (must be in dataset A) Dataset A (in=a) Dataset B (in=b) ID V1 V2 ID V3 V4 1343 24214342854234 3129436... 41227674763234 5229324 65344356554324 734389... 832467878895342

6 If b; (must be in dataset B) Dataset A (in=a) Dataset B (in=b) ID V1 V2 ID V3 V4 ..1343 24214342854234 3129436 41227674763234 ..5229324 65344356554324 734389 832467878895342

7 Notes The examples assume there is a unique identifier. This can be either one variable (e.g., CRSP's PERMNO or Compustat's GVKEY) or more than one variable (for example, PERMNO and DATE for a panel dataset). Assumption: both data sets are sorted by the unique identifier(s).
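The sorting assumption can be met with PROC SORT before the merge. The following is only a sketch: the dataset names (dsa, dsb, panel_a) and the key name (id) are hypothetical placeholders, not names from the slides. The last step is one possible way to check that the key really is unique: duplicates, if any, are written to a separate dataset instead of being silently dropped from the input.

```sas
/* Hypothetical names: dsa and dsb are the two inputs, id is the key. */
proc sort data=dsa; by id; run;
proc sort data=dsb; by id; run;

/* For a panel, sort by every identifier (here PERMNO and DATE). */
proc sort data=panel_a; by permno date; run;

/* Optional uniqueness check: dsa is left untouched, dsa_unique gets   */
/* one row per id, and dsa_dups receives any rows with a repeated id.  */
proc sort data=dsa out=dsa_unique nodupkey dupout=dsa_dups;
    by id;
run;
```

If dsa_dups comes out empty, the key is unique in dsa.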

8 Sample code
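The code for this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of the pattern the earlier slides describe, not the author's original code. It assumes two hypothetical datasets, dsa (ID, V1, V2) and dsb (ID, V3, V4), both already sorted by ID.

```sas
/* Hypothetical inputs: dsa (id v1 v2) and dsb (id v3 v4), sorted by id. */
data merged;
    merge dsa (in=a) dsb (in=b);   /* a and b are 1 when that dataset contributes */
    by id;
    if a and b;     /* keep ids found in both datasets; same result as: if a=b; */
    /* Alternatives (use one subsetting IF at a time):                     */
    /* if a or b;   keeps every id, the same as having no IF statement     */
    /* if a;        keeps ids in dsa; v3 and v4 are missing without a match */
    /* if b;        keeps ids in dsb; v1 and v2 are missing without a match */
run;
```

The IN= variables exist only during the data step and are not written to the output dataset.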

9 Typical problems If both datasets were complete (both have the same observed units), the IF statements would be unnecessary; "if a and b" would be equivalent to leaving the statement out altogether. If you do not have a BY statement (no identifier; you somehow know that each row of one dataset corresponds to the same row in the other dataset), the datasets are just "glued" side by side. Common mishaps: if the BY variables have different formats across datasets, SAS will merge the datasets but will put a WARNING in the log. Another common mishap is to have variables with the same name (that are not the ID): one of them will be overwritten.
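One way to avoid the overwriting mishap is to rename the clashing variable as it is read, using the RENAME= data set option. This is a sketch under stated assumptions: dsa and dsb are hypothetical datasets sorted by id, and both happen to carry a non-key variable called value.

```sas
/* Hypothetical: dsa and dsb both contain a non-key variable named value. */
data merged;
    merge dsa (in=a)
          dsb (in=b rename=(value=value_b));   /* dsb's copy becomes value_b */
    by id;
    if a and b;
run;
/* Without RENAME=, both columns would share one name and only one value */
/* would survive in the merged dataset.                                  */
```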

10 References Good references are http://ftp.sas.com/techsup/download/technote/ts644.html and a manual called "Combining and modifying SAS data sets: examples", which is in the RC library. It has a lot of examples. Unfortunately, the manual does not exist in an online version (only the code is available online, but the explanations in the manual are very good).

