Iowa State University, Department of Animal Science: Modifying and Combining SAS Data Sets (Chapter 6 in The Little SAS Book), Animal Science 500 Lecture.


1 Iowa State University, Department of Animal Science. Modifying and Combining SAS Data Sets (Chapter 6 in The Little SAS Book). Animal Science 500, Lecture No. 6, September 16, 2010

2 Modifying Variable Lengths
- The code below stores only 3 characters for the county by specifying the length of the location variable.
- The LENGTH statement can also be used with numeric variables, where it sets the number of bytes used for storage.

data new;
length location $3;
input location $ 1-15 date $ rainfall;
/* remember you could use the datalines statement to do the same thing */
cards;
Story county   6/10/10 4.5
Polk county    6/10/10 6.5
Story county   7/10/10 4.9
Polk county    7/10/10 2.4
;
run;
quit;
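One way to confirm what the LENGTH statement did is to look at the variable attributes with PROC CONTENTS. The following is a minimal sketch, not part of the original slides: the data set name len_check is made up for illustration, and the data lines assume the location value is padded out to column 15.

```sas
/* Sketch: LENGTH has to come before INPUT to take effect */
data len_check;                      /* hypothetical data set name */
  length location $3;
  input location $ 1-15 date $ rainfall;
  datalines;
Story county   6/10/10 4.5
Polk county    6/10/10 6.5
;
run;

/* PROC CONTENTS lists each variable with its type and length */
proc contents data=len_check;
run;

/* PROC PRINT shows the stored (truncated) values of location */
proc print data=len_check;
run;
```

In the PROC CONTENTS listing, location should appear as a character variable of length 3, so only the first three characters of each county name are kept.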

3 Creating new variables
- You can create variables in the initial data step where you input the data, or you can use a new data step to create them, as we did when we created:
  1. Adjusted backfat variables for barrows and gilts
  2. Total weight gain during the test period
  3. Average daily gain during the test period
- New variables do not have to be calculations. You could divide backfat, loin muscle area, gain, etc. into categories to evaluate whether model effects differ across the classes you developed.

4 Creating new variables
- If you choose to create a new data step, you will need to either create a new file in SAS or modify the existing file.
  - Example: what we have done in lab with the following statements:
    Data Pig13; Set Pig12;
  - Alternatively, you could create a new file in SAS.
    - If you create a new file, you may need to merge it with the original data file.
    - This is a place where students often have difficulty.
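The new-data-step approach can be sketched as follows. This is an illustration under stated assumptions rather than the lab code itself: the variable names ONWT, OFFWT, and DOT (on-test weight, off-test weight, days on test) follow the lab's ADG formula, while the id variable and the three records are made up.

```sas
/* Hypothetical test records: id, on-test weight, off-test weight, days on test */
data Pig12;
  input id onwt offwt dot;
  datalines;
1 70 250 100
2 65 262 105
3 72 245 98
;
run;

/* New data step: read Pig12 with SET and add calculated variables */
data Pig13;
  set Pig12;
  gain = offwt - onwt;   /* total weight gain during the test period */
  adg  = gain / dot;     /* average daily gain during the test period */
run;

proc print data=Pig13;
run;
```

Because Pig13 is built with SET, it already contains every variable from Pig12 plus the new ones, so no merge is needed in this case.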

5 Creating New Variables in SAS
- Both of these options result in a data set named "new" that contains all the variables that have been defined.
- The first option creates the variables log_rain and sqrt_rain in the initial data step.
- The second option creates a new file in SAS, also named "new"; the SET statement tells SAS to read the data from the existing "new" into this file.

data new;
input @1 location $ 1-15 date mmddyy8. rainfall;
log_rain = log(rainfall);
sqrt_rain = sqrt(rainfall);
datalines;
Story county   6/10/10 4.5
Polk county    6/10/10 6.5
Story county   7/10/10 4.9
Polk county    7/10/10 2.4
;
run;

data new;
set new;
log_rain = log(rainfall);
sqrt_rain = sqrt(rainfall);
run;

6 Creating New Variables in SAS
- SAS has many other functions to perform calculations for trigonometry, finance, and other applications.
  - Some examples, assuming x is the variable you want to modify:
    - Log = log(x)
    - Sin = sin(x)
    - Cos = cos(x)
  - As we will see in the next lab, you may find that the distribution of your data does not meet the assumptions of the analysis of variance.
  - You will often need to transform the data using some of these functions.
- Use SAS Help to find the correct notation.
- Some helpful search terms are: SAS Functions, Arithmetic Functions, Numeric Variables, Logical Operators
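A few common transformations can be tried side by side in one short data step. This is a minimal sketch: the data set name transform is made up, and the four values are the rainfall amounts used elsewhere in this lecture.

```sas
/* Sketch: apply several transformations to the same variable */
data transform;                     /* hypothetical data set name */
  input rainfall @@;                /* @@ reads several values per data line */
  log_rain   = log(rainfall);       /* natural logarithm */
  log10_rain = log10(rainfall);     /* base-10 logarithm */
  sqrt_rain  = sqrt(rainfall);      /* square root */
  datalines;
4.5 6.5 4.9 2.4
;
run;

proc print data=transform;
run;
```

Keep in mind that LOG and SQRT return missing values (with a note in the log) for arguments outside their domain, such as negative numbers.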

7 Operators
- Recall from previous discussions:
- Addition, subtraction, multiplication, and division are specified by +, -, *, and /, respectively.
- For exponentiation, a double asterisk ** is used.
  - exprainfall = exprainfall**2
- Parentheses can be used to group expressions, and these expressions can be nested several levels deep. SAS follows the standard PEMDAS order, () ** * / + -, when evaluating expressions.
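The order of operations is easy to check with a DATA _NULL_ step that writes results to the SAS log. This is a small sketch; the variable names a through d are arbitrary.

```sas
/* Sketch: order of operations, results written to the log with PUT */
data _null_;
  a = 2 + 3 * 4;      /* multiplication before addition: 14 */
  b = (2 + 3) * 4;    /* parentheses first: 20 */
  c = 2 * 3 ** 2;     /* exponentiation before multiplication: 18 */
  d = 10 / 2 - 3;     /* division before subtraction: 2 */
  put a= b= c= d=;
run;
```

The log should show a=14 b=20 c=18 d=2. When in doubt about how SAS will read an expression, adding parentheses makes the intent explicit.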

8 Logical Operators
- SAS can also evaluate logical expressions.
  - <, >, =, if, then, else, else if, and (&), or (|), not (^), ...
  - Search Logical Operators in SAS Help

data new;
input @1 location $ 1-15 date mmddyy8. rainfall;
*creating a new variable based on location;
if location = "Story county" then x = rainfall + 5;
else x = 5;
*creating a new variable based on level of rainfall;
if rainfall < 3 then y = 1;
else if rainfall < 4.9 then y = 2;
else y = 3;
datalines;
Story county   6/10/10 4.5
Polk county    6/10/10 6.5
Story county   7/10/10 4.9
Polk county    7/10/10 2.4
;
Run;
Quit;

data new;
set new;
log_rain = log(rainfall);
sqrt_rain = sqrt(rainfall);
run;

9 Other ways to Modify Variables - Do Loops
- DO loops can be used to create an ordered sequence of numbers.
- Below is an example of a DO loop in SAS.
- The program "loops" through the values of Q from 1 to 5 and performs the requested calculations for the current value of Q. The OUTPUT statement tells SAS to write Q and the new variables to the dataset DO. The END statement signifies the end of the loop. An END statement is necessary for each DO statement! Notice that neither INPUT nor DATALINES statements are used, because the data are read from the existing dataset with SET.

data do;
set new;
do q=1 to 5;
  q_rain=q*rainfall;
  q_rainsquared=q_rain**2;
  output;
end;
run;

proc print data = do;
run;
Quit;
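A DO loop can also generate a sequence of numbers entirely on its own, with no input data at all. This is a minimal sketch; the data set name squares and the variable q_squared are made up for illustration.

```sas
/* Sketch: a DO loop that builds a data set from scratch */
data squares;            /* hypothetical data set name */
  do q = 1 to 5;
    q_squared = q**2;    /* calculation for the current value of q */
    output;              /* write one observation per pass through the loop */
  end;
run;

proc print data=squares;
run;
```

This produces five observations, one for each value of q from 1 to 5. Without the OUTPUT statement inside the loop, SAS would write only a single observation at the end of the data step.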

10 Modifying your data using PROC Transpose
- Sometimes you need to reshape data that is in a long format (shown below)

FamID  Year  FamInc
1      2007  40000
1      2008  40500
1      2009  41000
2      2007  45000
2      2008  45400
2      2009  45800
3      2007  75000
3      2008  76000
3      2009  77000

11 Modifying your data using PROC Transpose
- into a wide format (shown below).

FamID  FamInc07  FamInc08  FamInc09
1      40000     40500     41000
2      45000     45400     45800
3      75000     76000     77000

12 Modifying your data using PROC Transpose
- How do we accomplish this?
- Use PROC TRANSPOSE to reshape the data from a long to a wide format.

13 Modifying your data using PROC Transpose

data long1;
input famid year faminc;
cards;
1 96 40000
1 97 40500
1 98 41000
2 96 45000
2 97 45400
2 98 45800
3 96 75000
3 97 76000
3 98 77000
;
run; quit;

proc transpose data=long1 out=wide1 prefix=faminc;
by famid;
id year;
var faminc;
run; quit;

proc print data = wide1;
run; quit;

Notice that the option prefix=faminc specifies a prefix to use in constructing names for the transposed variables in the output data set. The SAS automatic variable _NAME_ contains the name of the variable being transposed.

14 Modifying your data using PROC Transpose
- What does this get you?
- SAS output that looks like the following:

Obs  famid  _NAME_  faminc96  faminc97  faminc98
1    1      faminc  40000     40500     41000
2    2      faminc  45000     45400     45800
3    3      faminc  75000     76000     77000

15 Removing Observations
- When performing statistical calculations, SAS by default uses all of the observations in the dataset.
- You can selectively delete observations that you do not wish to use with:
  - IF statements, which specify which observations to keep
  - IF and THEN DELETE, which delete the observations that meet the condition

16 Removing Observations
- Both statements result in a file with only data from Story County

data new;
set new;
if location = "Story county";
Run;
Quit;

data new;
set new;
if location = "Polk county" then delete;
run;

17 Removing Variables
- You may only need to use a few variables from a larger dataset. This can be done with KEEP or DROP statements.
- For many datasets, you can leave unneeded variables in the dataset, and SAS can handle them with ease.
  - This will be the case for most of you, dealing with several hundred or even several thousand observations.
  - Those dealing with very large data sets might benefit from using the KEEP or DROP statements.
    - Caution: think ahead so that you do not drop a variable needed later in some calculation or function.

18 Removing Variables
- Let's say we wanted to keep only the location, x, and y variables from the "new" file.
- Both data steps below will accomplish the task.

data subset;
set new;
drop date rainfall;
run;

data subset;
set new;
keep location x y;
run;

19 Combining Datasets
- There are two different ways to combine datasets in a data step: the SET statement and the MERGE statement.
- The SET statement is used to add observations to an existing dataset.
  - This is what we have done in lab.
  - We have developed new variables using calculations with existing data and have adjusted current data in the data set.
  - Data Pig12; Set Pig12;
    - ADG = (OFFWT - ONWT) / DOT;
- Consider the following example, using monthly rainfall totals in Gainesville in 1995 and 1996. When used to combine observations into one dataset, the SET statement works as follows:

20 Combining Datasets - Set

data rain95;
input month rainfall @@;
year=1995;
datalines;
1 3.08 2 1.07 3 6.14 4 5.18 5 2.47 6 7.55
7 7.66 8 7.20 9 2.10 10 4.33 11 3.15 12 1.29
;
run; quit;

data rain96;
input month rainfall @@;
year=1996;
datalines;
1 0.97 2 0.66 3 10.52 4 1.72 5 2.01 6 6.05
7 11.00 8 4.90 9 2.23 10 6.18 11 1.73 12 6.63
;
run; quit;

data rain9596;
set rain95 rain96;
run; quit;

21 Combining Datasets
- The MERGE statement adds the variables in one dataset to another dataset.
- Consider the following example using Southern teams in the National Basketball Association:

22 Combining Datasets - Merge

data nba1;
input @1 city $11. @11 division $;
Cards;
Orlando    Atlantic
Miami      Atlantic
Atlanta    Central
Charlotte  Central
;
run; quit;

data nba2;
input mascot $ @@;
Cards;
Magic Heat Hawks Hornets
;
run; quit;

data nba3;
merge nba1 nba2;
run; quit;

Proc Print;
run; quit;

23 Combining Datasets - Merge Output Results

Obs  city       division  mascot
1    Orlando    Atlantic  Magic
2    Miami      Atlantic  Heat
3    Atlanta    Central   Hawks
4    Charlotte  Central   Hornets

24 Combining Datasets
- In the last example, the observations were ordered so that each row of one dataset corresponded to the same row of the other. You will often need to put datasets together based on the values of variables that are included in both datasets. To do this, both datasets must first be sorted by the common variable or variables.

25 Combining Datasets
- PROC SORT sorts the data file using the variable chosen in the BY statement. We will discuss PROC SORT in more detail later on.
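A sorted merge on a common variable can be sketched as follows. This is a reconstruction under stated assumptions, not the original slide code: it assumes nba2 also carries a city variable (with rows deliberately out of order), and it reads city with the :$11. modified list input rather than column positions.

```sas
/* Sketch: match-merge two data sets on the common variable city */
data nba1;
  input city :$11. division $;
  datalines;
Orlando Atlantic
Miami Atlantic
Atlanta Central
Charlotte Central
;
run;

/* Assumption: nba2 includes city so rows can be matched by value */
data nba2;
  input city :$11. mascot $;
  datalines;
Charlotte Hornets
Atlanta Hawks
Orlando Magic
Miami Heat
;
run;

/* Both data sets must be sorted by the BY variable before merging */
proc sort data=nba1; by city; run;
proc sort data=nba2; by city; run;

data nba3;
  merge nba1 nba2;
  by city;               /* match rows on city instead of row position */
run;

proc print data=nba3;
run;
```

With the BY statement, each mascot is attached to the row with the matching city, regardless of the order in which the rows were originally entered.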

26 Combining Datasets with Merge - Output

The SAS System    10:16 Saturday, September 11, 2010    14

Obs  city       division  mascot
1    Atlanta    Central   Hawks
2    Charlotte  Central   Hornets
3    Miami      Atlantic  Heat
4    Orlando    Atlantic  Magic

