Published by Randolf Mitchell. Modified over 9 years ago.

17b. Accessing Data: Manipulating Variables in SAS®

Prerequisites
Recommended modules to complete before viewing this module
  1. Introduction to the NLTS2 Training Modules
  2. NLTS2 Study Overview
  3. NLTS2 Study Design and Sampling
  NLTS2 Data Sources, either
    4. Parent and Youth Surveys or
    5. School Surveys, Student Assessments, and Transcripts
  NLTS2 Documentation
    10. Overview
    11. Data Dictionaries
    12. Quick References

Prerequisites
Recommended modules to complete before viewing this module (cont'd)
  13. Analysis Example: Descriptive/Comparative Using Longitudinal Data
  Accessing Data
    14b. Files in SAS
    15b. Frequencies in SAS

Overview
  Purpose
  Modifying existing variables
  Creating new variables
  Summary
  Closing
  Important information

NLTS2 restricted-use data
NLTS2 data are restricted. Data used in these presentations are from a randomly selected subset of the restricted-use NLTS2 data. Results in these presentations cannot be replicated with the NLTS2 data licensed by NCES.

Purpose
Learn to
  Modify an existing variable
  Create a new variable
  Join/combine data from different sources

Modifying existing variables
How to modify a variable.
  To collapse categories, break a continuous variable into categories, or recode a variable, it is not always necessary to create a new variable in SAS.
  User-assigned formats control how output prints but do not change the variable.
Syntax for categorizing an existing variable with a format
    PROC FORMAT ;
      VALUE b2catfmt
        low-1   = "(<=1) 1 or younger"
        2-5     = "(2-5) 2 to 5 years of age"
        6-10    = "(6-10) 6 to 10 years of age"
        11-high = "(>=11) 11 or older" ;
    PROC FREQ data = collapse ;
      TABLES np1B2a ;
      FORMAT np1B2a b2catfmt. ;
    RUN ;
These results cannot be replicated with the full dataset; all output in these modules was generated with a random subset of the full data.
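
To see that a format changes only how values print, the following minimal sketch uses toy data. The data set toy, the variable age_dx, and the format name agecat are hypothetical and are not NLTS2 names or values.

```sas
/* Toy data: hypothetical ages at diagnosis, not NLTS2 values */
DATA toy ;
  INPUT age_dx @@ ;
  DATALINES ;
0 1 3 5 7 10 12 15 .
;
RUN ;

PROC FORMAT ;
  VALUE agecat
    low-1   = "(<=1) 1 or younger"
    2-5     = "(2-5) 2 to 5 years of age"
    6-10    = "(6-10) 6 to 10 years of age"
    11-high = "(>=11) 11 or older" ;
RUN ;

/* Frequencies are grouped by the formatted value */
PROC FREQ DATA = toy ;
  TABLES age_dx / MISSING ;
  FORMAT age_dx agecat. ;
RUN ;

/* The stored values are untouched: statistics use the raw ages */
PROC MEANS DATA = toy N MEAN MIN MAX ;
  VAR age_dx ;
RUN ;
```

For numeric formats, the keyword low does not include missing values, so the missing age is reported on its own line rather than in the first category. The ranges above assume whole-number ages; a value such as 1.5 would fall between the ranges and print unformatted.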

Modifying existing variables
Syntax to modify an existing variable
  Create a new variable rather than permanently changing the existing variable
  Create a new format so values are meaningful
    PROC FORMAT ;
      VALUE b2catfmt
        1 = "(1) 1 or younger"
        2 = "(2) 2 to 5 years of age"
        3 = "(3) 6 to 10 years of age"
        4 = "(4) 11 or older" ;
  Recode the variable in a data step
    This would result in a temporary change. Why? What would make it a permanent change?
    DATA collapse ;
      SET sasdb.n2w1parent ;
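
One way to answer the question on this slide: a one-level data set name such as collapse is stored in the WORK library, which SAS clears at the end of the session, so the change is temporary. A two-level name that uses a permanent libref keeps the result. The sketch below assumes the libref sasdb is already assigned as in the slides; the libref mylib, its folder, and the output data set name are hypothetical.

```sas
/* Hypothetical folder; replace with a location you can write to */
LIBNAME mylib "C:\nlts2_work" ;

/* Temporary: a one-level name is stored in WORK,
   which is cleared when the SAS session ends */
DATA collapse ;
  SET sasdb.n2w1parent ;
RUN ;

/* Permanent: a two-level name writes the data set to the folder
   behind the libref; the source file sasdb.n2w1parent is unchanged */
DATA mylib.n2w1parent_recode ;
  SET sasdb.n2w1parent ;
RUN ;
```

User-defined formats are also stored in WORK by default. To keep them between sessions, PROC FORMAT accepts a LIBRARY= option, and the FMTSEARCH= system option tells SAS where to look for the stored formats.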

Modifying existing variables
Syntax to recode an existing variable into a new variable with value and variable labels.
    /* create age of youth when diagnosed - with age range categories */
    if missing(np1B2a) then np1B2a_Cat = np1B2a ;
    else if np1B2a <= 1 then np1B2a_Cat = 1 ;
    else if 2<=np1B2a<=5 then np1B2a_Cat = 2 ;
    else if 6<=np1B2a<=10 then np1B2a_Cat = 3 ;
    else if np1B2a > 10 then np1B2a_Cat = 4 ;
    FORMAT np1B2a_Cat b2catfmt. ;
    LABEL np1B2a_Cat = '(np1B2a_cat) Age of youth when diagnosed - categorized into ranges' ;
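
The missing-value check comes first in the recode above for a reason: SAS treats a missing numeric value as smaller than any number, so a test such as "<= 1" is true for missing values. The toy sketch below illustrates this; the names toy, age_dx, naive_cat, and safe_cat are hypothetical.

```sas
DATA toy ;
  INPUT age_dx @@ ;

  /* Without a missing check, a missing age satisfies "<= 1" */
  IF age_dx <= 1 THEN naive_cat = 1 ;
  ELSE naive_cat = 2 ;

  /* With the check first, a missing age stays missing */
  IF MISSING(age_dx) THEN safe_cat = age_dx ;
  ELSE IF age_dx <= 1 THEN safe_cat = 1 ;
  ELSE safe_cat = 2 ;

  DATALINES ;
0 1 4 .
;
RUN ;

/* The last row shows naive_cat = 1 but safe_cat = . */
PROC PRINT DATA = toy ;
RUN ;
```

Assigning the original value when it is missing, as the slide does, also carries any special missing codes over to the new variable.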

Modifying existing variables
Look at results
  Run a frequency of the new variable
  Useful to look at a crosstab of the original variable by the new variable to check how values were coded
Look at frequency distributions and crosstab of new vs. old variables
  The LIST option on the TABLES statement will print the crosstab table more compactly.
  A FORMAT statement without a format specified will strip existing formats.
    TABLES np1B2a_Cat * np1B2a /MISSPRINT LIST ;
    FORMAT np1B2a ;
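
Put together, the format, recode, and check from the preceding slides might look like the following sketch. It assumes the libref sasdb already points to the NLTS2 files; the step boundaries (RUN statements) and the one-way TABLES request are additions for completeness.

```sas
/* Value labels for the collapsed variable */
PROC FORMAT ;
  VALUE b2catfmt
    1 = "(1) 1 or younger"
    2 = "(2) 2 to 5 years of age"
    3 = "(3) 6 to 10 years of age"
    4 = "(4) 11 or older" ;
RUN ;

/* Recode into a new variable; the original np1B2a is kept */
DATA collapse ;
  SET sasdb.n2w1parent ;
  IF MISSING(np1B2a) THEN np1B2a_Cat = np1B2a ;
  ELSE IF np1B2a <= 1 THEN np1B2a_Cat = 1 ;
  ELSE IF 2 <= np1B2a <= 5 THEN np1B2a_Cat = 2 ;
  ELSE IF 6 <= np1B2a <= 10 THEN np1B2a_Cat = 3 ;
  ELSE IF np1B2a > 10 THEN np1B2a_Cat = 4 ;
  FORMAT np1B2a_Cat b2catfmt. ;
  LABEL np1B2a_Cat = '(np1B2a_cat) Age of youth when diagnosed - categorized into ranges' ;
RUN ;

/* Check the recode: new variable alone, then new by old */
PROC FREQ DATA = collapse ;
  TABLES np1B2a_Cat ;
  TABLES np1B2a_Cat * np1B2a / MISSPRINT LIST ;
  FORMAT np1B2a ;   /* strip any format from the original variable */
RUN ;
```

In the crosstab, each original value should appear under exactly one category of the new variable.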

Modifying existing variables: Example
Modifying a variable
  Use Wave 3 parent/youth interview file
  Collapse np3NbrProbs into a new variable with categories
    0-1
    2
    3
    4-6
  Remember to
    Label the variable.
    Add value formats.
    Account for missing values.

Modifying existing variables: Example
PROC FREQ with a user-defined format (no change made to np3NbrProbs)

Modifying existing variables: Example
PROC FREQ with new variable np3NbrProbs_Cat created from np3NbrProbs

Modifying existing variables: Example
Created np3NbrProbs_Cat compared with original np3NbrProbs
Stripped existing formats from np3NbrProbs with format statement
    FORMAT np3NbrProbs ;
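
One possible solution to this example is sketched below; it is not the deck's own code. It assumes the Wave 3 file is sasdb.n2w3paryouth (the name used elsewhere in this module) and that np3NbrProbs takes whole-number values from 0 to 6. The format names nbrfmt and nbrcatf, the data set name w3collapse, the category codes 1 to 4, and the label text are hypothetical.

```sas
PROC FORMAT ;
  /* Ranges applied directly to the original variable */
  VALUE nbrfmt
    0-1 = "(0-1) 0 or 1"
    2   = "(2) 2"
    3   = "(3) 3"
    4-6 = "(4-6) 4 to 6" ;
  /* Value labels for the new collapsed variable */
  VALUE nbrcatf
    1 = "(1) 0 or 1"
    2 = "(2) 2"
    3 = "(3) 3"
    4 = "(4) 4 to 6" ;
RUN ;

/* Option 1: format only, no change made to np3NbrProbs */
PROC FREQ DATA = sasdb.n2w3paryouth ;
  TABLES np3NbrProbs ;
  FORMAT np3NbrProbs nbrfmt. ;
RUN ;

/* Option 2: create np3NbrProbs_Cat, keeping missing values as they are */
DATA w3collapse ;
  SET sasdb.n2w3paryouth ;
  IF MISSING(np3NbrProbs) THEN np3NbrProbs_Cat = np3NbrProbs ;
  ELSE IF 0 <= np3NbrProbs <= 1 THEN np3NbrProbs_Cat = 1 ;
  ELSE IF np3NbrProbs = 2 THEN np3NbrProbs_Cat = 2 ;
  ELSE IF np3NbrProbs = 3 THEN np3NbrProbs_Cat = 3 ;
  ELSE IF 4 <= np3NbrProbs <= 6 THEN np3NbrProbs_Cat = 4 ;
  FORMAT np3NbrProbs_Cat nbrcatf. ;
  LABEL np3NbrProbs_Cat = '(np3NbrProbs_Cat) np3NbrProbs collapsed into four categories' ;
RUN ;

/* Frequency of the new variable, then new by old with formats stripped */
PROC FREQ DATA = w3collapse ;
  TABLES np3NbrProbs_Cat ;
  TABLES np3NbrProbs_Cat * np3NbrProbs / MISSPRINT LIST ;
  FORMAT np3NbrProbs ;
RUN ;
```

Any non-missing value outside 0 to 6 would be left missing in np3NbrProbs_Cat by this recode, which the crosstab would reveal.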

Creating new variables
How to create a new variable.
  The values in the new variable can be the results of calculations, assignments, or logic.
  A new variable can be created from an existing variable or from multiple variables, including variables from other sources and/or waves.
  Variables from other sources/waves must be added to the active data file before creating the new variable.

Creating new variables
  Be aware of any coding differences between the variables when combining values.
  Decide what to do with missing values.
Example: Create a variable using parent interview data from Waves 1 through 4.
  Has student been suspended and/or expelled in any wave?

Creating new variables
Create a format for the new variable and join data needed
    PROC FORMAT ;
      VALUE fmta
        0 = "(0) Never suspended/expelled"
        1 = "(1) Suspended or expelled in any wave"
        2 = "(2) Suspended or expelled every wave" ;
    DATA collapse ;
      MERGE sasdb.n2w1parent   (keep=ID np1d7h)
            sasdb.n2w2paryouth (keep=ID np2d5d)
            sasdb.n2w3paryouth (keep=ID np3d5d)
            sasdb.n2w4paryouth (keep=ID np4d5d) ;
      BY ID ;
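
A MERGE with a BY statement expects every input data set to be sorted (or indexed) by the BY variable. The sketch below sorts working copies first and uses the IN= data set option to record how many wave files contributed to each merged record. It assumes the libref sasdb is assigned as in the slides; the names w1 to w4, in1 to in4, and n_files are hypothetical.

```sas
/* Sort a working copy of each file by ID, keeping only what is needed */
PROC SORT DATA = sasdb.n2w1parent   (KEEP = ID np1d7h) OUT = w1 ; BY ID ; RUN ;
PROC SORT DATA = sasdb.n2w2paryouth (KEEP = ID np2d5d) OUT = w2 ; BY ID ; RUN ;
PROC SORT DATA = sasdb.n2w3paryouth (KEEP = ID np3d5d) OUT = w3 ; BY ID ; RUN ;
PROC SORT DATA = sasdb.n2w4paryouth (KEEP = ID np4d5d) OUT = w4 ; BY ID ; RUN ;

DATA collapse ;
  MERGE w1 (IN = in1)
        w2 (IN = in2)
        w3 (IN = in3)
        w4 (IN = in4) ;
  BY ID ;
  /* IN= flags are 1 when that file has a record for this ID, else 0 */
  n_files = in1 + in2 + in3 + in4 ;
RUN ;

/* How many IDs appear in 1, 2, 3, or all 4 wave files? */
PROC FREQ DATA = collapse ;
  TABLES n_files ;
RUN ;
```

IDs that are absent from a wave file get missing values for that wave's variable, which is one reason joined data tend to have more missing values.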

Creating new variables
Syntax
    If np1D7h>=0 and np2D5d>=0 and np3D5d>=0 and np4D5d>=0 then do ;
      if np1D7h=1 and np2D5d=1 and np3D5d=1 and np4D5d=1 then np4D5d_ever = 2 ;
      else if np1D7h=1 or np2D5d=1 or np3D5d=1 or np4D5d=1 then np4D5d_ever = 1 ;
      else np4D5d_ever = 0 ;
    end ;
Code will result in a variable that
  Requires a value for every wave
  Is 0 if never suspended/expelled
  Is 1 if suspended/expelled in at least one wave but not in every wave
  Is 2 if suspended/expelled in all four waves.
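
The effect of requiring a value for every wave can be seen on toy data. The sketch below applies the same logic to hypothetical 0/1 variables w1 to w4 and, for comparison, adds an alternative that is not from the slides: a yes/no flag computed from whichever waves are available. All names and data values here are made up for illustration.

```sas
DATA toy ;
  INPUT id w1 w2 w3 w4 ;

  /* Same rule as the slide: all four waves must have a value */
  IF w1 >= 0 AND w2 >= 0 AND w3 >= 0 AND w4 >= 0 THEN DO ;
    IF w1 = 1 AND w2 = 1 AND w3 = 1 AND w4 = 1 THEN strict_ever = 2 ;
    ELSE IF w1 = 1 OR w2 = 1 OR w3 = 1 OR w4 = 1 THEN strict_ever = 1 ;
    ELSE strict_ever = 0 ;
  END ;

  /* Alternative (an assumption, not the deck's method): use the waves
     that are available. N() counts non-missing arguments and MAX()
     ignores missing ones; values are assumed to be 0 or 1 only. */
  IF N(w1, w2, w3, w4) > 0 THEN lenient_ever = (MAX(w1, w2, w3, w4) = 1) ;

  DATALINES ;
1 1 1 1 1
2 0 1 0 0
3 0 0 0 0
4 1 . 0 0
5 . . . .
;
RUN ;

/* id 1: strict 2, lenient 1    id 2: strict 1, lenient 1
   id 3: strict 0, lenient 0    id 4: strict ., lenient 1
   id 5: strict ., lenient .                              */
PROC PRINT DATA = toy ;
RUN ;
```

The lenient version keeps more cases, but a 0 then means only "no in the waves that were observed", so it can understate how often the event occurred. Which rule is appropriate depends on the analysis.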

Creating new variables: Example
Creating a new variable
  Use the Wave 4 parent/youth interview file.
  Bring in np1F7 from Wave 1, np2P8_J4 from Wave 2, and np3P8_J4 from Wave 3 interview files.
  Create a new variable np4P8_J4_ever (ever done volunteer or community service).
    Initialize value to "0" if any value in np1F7, np2P8_J4, np3P8_J4, or np4P8_J4 is "0."
    Reassign to "1" if any value in np1F7, np2P8_J4, np3P8_J4, or np4P8_J4 is "1."

Creating new variables: Example
Creating a new variable (cont'd)
  Assign a variable label and value labels.
  Run a frequency of np4P8_J4_ever.
  Run a crosstabulation of np4P8_J4_ever by np4P8_J4.
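
The steps of this example could be sketched as follows. This is one possible solution, not the deck's own code. It assumes the libref sasdb is assigned, the four files are sorted by ID, and all four source variables are coded 0 = no and 1 = yes. The format name everfmt, the data set name volunteer, the flag in_w4, and the label text are hypothetical.

```sas
PROC FORMAT ;
  VALUE everfmt
    0 = "(0) No"
    1 = "(1) Yes" ;
RUN ;

DATA volunteer ;
  MERGE sasdb.n2w4paryouth (IN = in_w4 KEEP = ID np4P8_J4)
        sasdb.n2w1parent   (KEEP = ID np1F7)
        sasdb.n2w2paryouth (KEEP = ID np2P8_J4)
        sasdb.n2w3paryouth (KEEP = ID np3P8_J4) ;
  BY ID ;

  /* Wave 4 is the active file: keep only IDs present in it */
  IF in_w4 ;

  /* Initialize to 0 if any wave has a 0 ... */
  IF np1F7 = 0 OR np2P8_J4 = 0 OR np3P8_J4 = 0 OR np4P8_J4 = 0
    THEN np4P8_J4_ever = 0 ;
  /* ... then reassign to 1 if any wave has a 1 */
  IF np1F7 = 1 OR np2P8_J4 = 1 OR np3P8_J4 = 1 OR np4P8_J4 = 1
    THEN np4P8_J4_ever = 1 ;

  FORMAT np4P8_J4_ever everfmt. ;
  LABEL np4P8_J4_ever = '(np4P8_J4_ever) Ever done volunteer or community service' ;
RUN ;

/* Frequency of the new variable, then new by the Wave 4 original */
PROC FREQ DATA = volunteer ;
  TABLES np4P8_J4_ever ;
  TABLES np4P8_J4_ever * np4P8_J4 / MISSPRINT LIST ;
  FORMAT np4P8_J4 ;
RUN ;
```

Cases with no 0 or 1 in any wave are left missing in np4P8_J4_ever, because the new variable starts as missing on every record and neither IF statement assigns it a value.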

Summary
  Be aware of differences in coding between similar variables when building composite variables.
  Missing values must be considered.
    Know how missing values are being coded, particularly when using more than one variable to create another.
    Joined data are more likely to have missing values.
  Weights
    Generally, the analysis weight would be the weight from the smallest sample when combining data.
    When filling in values for a variable in an active file with values from another, it is OK to use the weight in the active file.

Summary
Know the values, mind the missing, and watch your weights!

Closing
Topics discussed in this module
  Modifying existing variables
  Creating new variables
  Summary
Next module: 18b. PROC SURVEY Procedures in SAS

Important information
  NLTS2 website contains reports, data tables, and other project-related information: http://nlts2.org/
  Information about obtaining the NLTS2 database and documentation can be found on the NCES website: http://nces.ed.gov/statprog/rudman/
  General information about restricted data licenses can be found on the NCES website: http://nces.ed.gov/statprog/instruct.asp
  E-mail address: nlts2@sri.com