Published by Baldwin Melvyn Walton. Modified over 9 years ago.
1
Grant Brown
2
AIDS patients – compliance with treatment
Binary response – complied or not
Attempt to find factors associated with better compliance.
Seems a natural choice for logistic regression.
3
Variable coding was problematic. The model was entirely composed of factors, and many were included. Client had convergence problems. Many ‘cells’ were empty, so no MLE could be found.
4
It is very simple to check where the errors are coming from: run proc freq on the factors in your model. Start with simple frequencies to check for any extremely skewed groups. Move on to cross-tab frequencies, which give lots of output, but show each cell.
proc freq data=dataset;
  tables var1 var2 var3;
run;
proc freq data=dataset;
  tables var1*var2*var3;
run;
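As a rough sketch of the empty-cell check described above, the step below builds a tiny made-up data set and asks proc freq to write every combination of factor levels to an output data set, including combinations that never occur. The data set names toy, cellcounts, and emptycells and the data values are hypothetical and are not from the client's data.

```sas
/* Hypothetical toy data: two factors, one combination never observed */
data toy;
  input var1 $ var2 $;
  datalines;
A X
A Y
B X
B X
;
run;

/* SPARSE writes every combination of levels to OUT=, including zero counts */
proc freq data=toy noprint;
  tables var1*var2 / out=cellcounts sparse;
run;

/* Keep only the empty cells; with the toy data this should be var1=B, var2=Y */
data emptycells;
  set cellcounts;
  if count = 0;
run;

proc print data=emptycells;
run;
```

Listing the empty cells directly can be easier than scanning a large cross-tab by eye when many factors are involved.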
5
Relational databases are ubiquitous.
Keys
Identifier fields
▪ Numbers
▪ Dates
Examples
Problems
Everything. Databases grow over time, so problems can compound.
6
The syntax is deceptively simple.
Issues:
▪ Sorting
▪ Catching non-matched cases
▪ Many-to-many woes
General advice:
▪ Understand input datasets
▪ Verify output datasets
…
merge DataSet1 DataSet2;
by SomeVar1 SomeVar2;
…
7
Combines ‘uniquely’ matched records. Matched by combination of ‘by’ variables. Data set on the right side of the statement wins in cases of conflict. If you need to understand its behavior more specifically, you are probably doing something wrong (multiple instances of the same ‘by’ values in both data sets).
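To illustrate the "right side wins" rule, here is a minimal sketch with two made-up data sets that share a non-by variable. The names ds_left, ds_right, combined, id, and score are hypothetical, and the sketch assumes each id appears at most once in each input.

```sas
/* Hypothetical toy data: both data sets carry a non-BY variable named score */
data ds_left;
  input id score;
  datalines;
1 10
2 20
;
run;

data ds_right;
  input id score;
  datalines;
2 99
3 30
;
run;

/* Both inputs are already in id order, so no PROC SORT is needed here */
data combined;
  merge ds_left ds_right;
  by id;
run;

/* For id=2 the score from ds_right (99) replaces the one from ds_left (20) */
proc print data=combined;
run;
```

Unmatched records (id 1 and id 3 here) pass through with their own values; the overwrite only happens where both data sets contribute to the same by group.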
8
PROC SORT DATA = data1;
  BY var1 var2 var3;
RUN;
PROC SORT DATA = data2;
  BY var1 var2 var3;
RUN;
DATA Merged NotMerged1 NotMerged2;
  MERGE data1 (IN = in1) data2 (IN = in2);
  BY var1 var2 var3;
  IF (in1 AND in2) THEN OUTPUT Merged;
  ELSE IF (in1 AND NOT in2) THEN OUTPUT NotMerged1;
  ELSE IF (in2 AND NOT in1) THEN OUTPUT NotMerged2;
RUN;
9
DATA newdata;
  MERGE data1 data2;
  BY var1;
RUN;
10
Limit your input data sets to ‘by’ fields and information needed in the output before the merge. Spot check your results. Count observations in the logs. Create a ‘unique’ variable beforehand if you are getting mysterious records. If you are doing the same thing over and over and just changing variable names, spend 10 minutes looking at macro examples.
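The advice above can be combined into one small macro. This is only a sketch under stated assumptions: the macro name checked_merge, its parameters, the work data sets prefixed with an underscore, and the variables x1 and x2 are all hypothetical, and the choice to keep only matched records is one option among several.

```sas
/* Hypothetical macro: names and parameters are illustrative only */
%macro checked_merge(left=, right=, by=, keepleft=, keepright=, out=);
  /* Keep only the BY fields and the variables needed in the output */
  proc sort data=&left (keep=&by &keepleft) out=_left;
    by &by;
  run;

  proc sort data=&right (keep=&by &keepright) out=_right;
    by &by;
  run;

  /* Repeated keys in the right-hand input are written to _right_dups. */
  /* An empty _right_dups means the BY values are unique in that input. */
  proc sort data=_right out=_right_nodup nodupkey dupout=_right_dups;
    by &by;
  run;

  /* Merge the trimmed, sorted inputs and keep matched records only */
  data &out;
    merge _left (in=in1) _right (in=in2);
    by &by;
    if in1 and in2;
  run;
%mend checked_merge;

/* Example call; x1 and x2 stand in for whatever variables are needed */
%checked_merge(left=data1, right=data2, by=var1 var2 var3,
               keepleft=x1, keepright=x2, out=merged);
```

After a call, the observation counts in the log for _left, _right, _right_dups, and the output data set give a quick consistency check before any spot checking of individual records.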