
1 14a. Accessing Data Files in SPSS ®

2 Prerequisites
Recommended modules to complete before viewing this module:
- 1. Introduction to the NLTS2 Training Modules
- 2. NLTS2 Study Overview
- 3. NLTS2 Study Design and Sampling
- NLTS2 Data Sources, either
  - 4. Parent and Youth Surveys, or
  - 5. School Surveys, Student Assessments, and Transcripts
- NLTS2 Documentation
  - 10. Overview
  - 11. Data Dictionaries
  - 12. Quick References

3 Overview
- Purpose
- Open and view data files
- Limiting variables
- Subsetting cases
- Joining/combining data files
- Summary
- Closing
- Important information

4 NLTS2 restricted-use data
NLTS2 data are restricted. Data used in these presentations are from a randomly selected subset of the restricted-use NLTS2 data. Results in these presentations cannot be replicated with the NLTS2 data licensed by NCES.

5 Purpose
Learn to
- Open a data file
- See what is in a file (i.e., the contents of the file)
- Size a data file for a perfect fit
  - Reduce the number of variables
  - Reduce the number of cases (i.e., subset the data)
- Combine information from multiple sources
  - Bring in data from another source or another wave
  - Join or combine files
- Create a new file

6 Open and view data files
SAS® and SPSS® data are in separate folders. SPSS data files have a .sav extension.
Note: The data files were developed in SAS, which
- Allows 28 distinct missing values, whereas SPSS allows 3 distinct values or a range of values.
- Stores user-defined value label formats separately in a format library, whereas the SPSS convention is to store value labels with the variable.
See "Notes to SPSS Users," hyperlinked from the table of contents in the data documentation, for details.

7 Open and view data files
- Files are either read from or written to.
- Files have a name and a location where they are stored.
- SPSS needs to know the name of the file and where to find it.
- The path describes the nesting of folders.
Example: C:\myprojects\NLTS2\Data is a path or location; i.e., the file is located on drive C, in the folder Data, which is nested inside the myprojects and NLTS2 folders.
These results cannot be replicated with full dataset; all output in modules generated with a random subset of the full data.

8 Open and view data files
Syntax
- Open file from menu.
- Open file command in the SPSS syntax editor: submit a GET FILE command.
    GET FILE='C:\myprojects\NLTS2\Data\n2w1tchr.sav'.
Menu command
- From the menu: File: Open: Data [select file from browser]
- A window opens with a spreadsheet type of display.
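
The GET FILE step above might look like the following in the syntax editor. This is a minimal sketch that assumes the data were copied to the example folder C:\myprojects\NLTS2\Data; the CD command is an optional convenience that the slides do not show.

```spss
* Open the Wave 1 teacher file by giving its full path.
GET FILE='C:\myprojects\NLTS2\Data\n2w1tchr.sav'.

* Alternative: set the working folder once (assumed location).
* After that, files can be opened by name only.
CD 'C:\myprojects\NLTS2\Data'.
GET FILE='n2w1tchr.sav'.
```

Either form makes n2w1tchr.sav the active dataset. Each comment line ends in a period so that it cannot run into the command that follows it.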

9 Open and view data files
- The open file is the active dataset.
- Select the Variable View tab for details about variables.
  - Rows list the variables.
  - Columns contain descriptors and attributes about the variables and values.
- Data View has case-by-case values for each variable.
  - Each row holds data for a single respondent.
  - Each column holds the data for a single variable.
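
The same information can also be requested from the syntax editor. The sketch below is one way to do it, not a step from the slides; it assumes the Wave 1 teacher file sits in the example folder and that it contains the key variable ID.

```spss
* Assumes the Wave 1 teacher file is stored in the example folder.
GET FILE='C:\myprojects\NLTS2\Data\n2w1tchr.sav'.

* List variable names, labels, formats, and value labels.
* This is roughly the information shown in Variable View.
DISPLAY DICTIONARY.

* Print ID for the first 10 cases, a small glimpse of Data View.
LIST VARIABLES=ID /CASES=FROM 1 TO 10.
```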

10 Open and view data files: Example
Viewing files
- Open the Wave 1 Teacher file.
- Look at both Data View and Variable View.

11 Open and view data files: Example

12 Limiting variables
How to reduce the number of variables in the file
- Large files with many cases and many variables are unwieldy; simplify.
  - Fewer variables to search through
  - Fewer cases to process
- Create files that are limited to just those variables needed for analysis.
- You have the choice of DROP or KEEP. Which one is best? The one that requires less typing! If you are dropping more variables than you are keeping, use KEEP, and vice versa.

13 Limiting variables
Note: When making changes to a file
- Use work files for temporary changes.
  - Work files exist only for the duration of the program or SPSS session.
  - An active file is a work file unless it is saved.
- Save the results to a new data file.
  - Usually it is best to create a new file rather than to modify the source file.

14 Limiting variables
In syntax editor
    GET FILE='C:\MyProjects\NLTS2\Data\n2w1parent.sav'
      /KEEP=ID w1_DisHdr2001 w1_GendHdr2001 w1_IncomeHdr2001 w1_AgeHdr2001
            np1Weight np1HealthProb np1GroupMember np1ProblemCount np1E2a np1B2a.
    EXECUTE.
    SAVE OUTFILE='C:\MyProjects\NLTS2\Data\Par_w1_lmt_vars.sav'
      /DROP=np1E2a np1GroupMember.
- Notice that KEEP is on the GET FILE command and DROP is on the SAVE OUTFILE command.
- Notice that SPSS statements end in a period.

15 Limiting variables
Menu driven
- Open a file.
- From File, select Save As.
- Click Variables.
- In the next pop-up menu, click Drop All.
- In the large box with the variable list, click each little box in the Keep column to select only the variables needed from this file.
  - Little boxes with an X indicate the variables are kept; blank boxes indicate the variables are dropped.
- Click Continue, give the file a new name in the Browse window, and click Save.
- Open a new file.

16 Limiting variables: Example
Create a file with fewer variables.
- Create a new file called PrScores.sav from n2w2dirassess.sav.
- Keep only the following variables: ID ndacalc_pr ndaPC_PR ndasyn_pr NDaF1_friend na_age4 w2_dis12 w2_gend2 na_grade4 w2_incm3 wt_na.
- Open and review the new file in Variable View.
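
One possible way to work this example in the syntax editor is sketched below. It assumes both files live in the example folder C:\MyProjects\NLTS2\Data; the folder is an assumption, while the file and variable names come from the exercise.

```spss
* Read only the variables needed from the source file.
GET FILE='C:\MyProjects\NLTS2\Data\n2w2dirassess.sav'
  /KEEP=ID ndacalc_pr ndaPC_PR ndasyn_pr NDaF1_friend na_age4
        w2_dis12 w2_gend2 na_grade4 w2_incm3 wt_na.

* Save the reduced file under a new name.
* The source file is left unchanged.
SAVE OUTFILE='C:\MyProjects\NLTS2\Data\PrScores.sav'.

* Review the variables that were kept.
DISPLAY DICTIONARY.
```

KEEP is used here because the list of variables to retain is far shorter than the list to drop.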

17 Limiting variables

18 Subsetting cases
How to reduce the number of cases or records in the file
- Often analysis is done on a subset; for example:
  - Select only youth with visual impairment.
  - Select only youth who are out of secondary school.
  - Exclude younger students.
- All variables are available in the file; only cases are conditionally restricted.

19 Subsetting cases
Example: Limit Wave 4 Parent/Youth interview data to those who are 21 or older, excluding youth who are 19 or 20 (W4_Age2007 = 19 or 20).
Code in syntax editor to limit cases
    USE ALL.
    COMPUTE filter_$=(W4_Age2007>20).
    VARIABLE LABELS filter_$ 'W4_Age2007>20 (FILTER)'.
    VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
    FORMATS filter_$ (f1.0).
    FILTER BY filter_$.
    EXECUTE.
Code to select all cases
    FILTER OFF.
    USE ALL.
    EXECUTE.

20 Subsetting cases
To limit cases from menu
- Data: Select Cases
- Click If condition is satisfied and the If button.
- Build the logic condition.
- Click Continue and OK.
To select all cases from menu
- Data: Select Cases
- Select All cases and click OK.
To have the best of both worlds
- Click Paste to save the code.
- Select and run the code from the syntax editor.
- Toggle on and off as needed.

21 Subsetting cases: Example
Create a small data set with a subset of cases.
- Open PrScores.sav, created in the previous example.
- Limit cases to those classified with hearing impairment only, i.e., those with a value of 5 for w2_dis12.
- Look at Variable View and Data View in the data editor. Are there any visual clues that the filter is on?
- Turn the filter off so all cases are selected.
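
A sketch of this exercise in syntax follows, using the same filter pattern as the Wave 4 age example. It assumes PrScores.sav was saved in the example folder C:\MyProjects\NLTS2\Data.

```spss
GET FILE='C:\MyProjects\NLTS2\Data\PrScores.sav'.

* Flag cases with a value of 5 for w2_dis12 and filter on the flag.
USE ALL.
COMPUTE filter_$=(w2_dis12 = 5).
VARIABLE LABELS filter_$ 'w2_dis12 = 5 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (F1.0).
FILTER BY filter_$.
EXECUTE.

* Turn the filter off so that all cases are selected again.
FILTER OFF.
USE ALL.
EXECUTE.
```

While the filter is on, the data editor marks unselected cases with a slash through the row number, shows "Filter On" in the status bar, and lists the new variable filter_$. A filter only hides cases. By contrast, SELECT IF removes the unselected cases from the active dataset unless it is preceded by TEMPORARY.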

22 Joining/combining data files
How to bring in data from another file
Purpose
- Learn to combine or join files
  - Bring in data from another source
  - Bring in data from another wave
- Learn what to watch for
  - Number of cases in the combined file
  - How cases are joined
    - Key variable, i.e., which variable to match on
    - Keyed file, i.e., which cases to keep

23 Joining/combining data files
Why do this?
- Often it is necessary to combine information from different files to perform comparative analyses, create new variables, or measure differences over time.
- For example, you may want to
  - Create composite variables from multiple sources.
  - Look at similar items at different points in time.

24 Joining/combining data files
Example of a composite variable from multiple sources
- Create a variable indicating whether a parent attended a parent/teacher conference, using Wave 2 teacher survey item nts2C8, and fill in with parent interview item np2E1a_d if teacher data are missing.
Example of items at different points in time
- Create a variable to look at the pattern of employment between Waves 2 and 3: employed both waves, either wave, or neither wave.
  - Set to employed both waves if np2HasPdJob (W2) and np3HasJob (W3) are both yes.
  - Else set to employed in either wave if np2HasPdJob or np3HasJob is yes.
  - Else set to not employed if np2HasPdJob and np3HasJob are both no.
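
Once the source items have been joined into one active file, the two composites could be sketched as follows. The names PTConf and EmpPattern are hypothetical. The coding 1 = yes and 0 = no is an assumption, so check the data dictionaries for the actual values before adapting this.

```spss
* Assumes nts2C8, np2E1a_d, np2HasPdJob, and np3HasJob are in the active file.
* Assumes every item is coded 1 = yes and 0 = no.

* Composite from multiple sources: teacher item first, parent item as fill-in.
COMPUTE PTConf = nts2C8.
IF (MISSING(nts2C8)) PTConf = np2E1a_d.

* Employment pattern across Waves 2 and 3.
* Later IF statements overwrite earlier ones, so the most specific rule runs last.
IF (np2HasPdJob = 0 AND np3HasJob = 0) EmpPattern = 0.
IF (np2HasPdJob = 1 OR np3HasJob = 1) EmpPattern = 1.
IF (np2HasPdJob = 1 AND np3HasJob = 1) EmpPattern = 2.
VALUE LABELS EmpPattern 0 'Not employed' 1 'Employed in either wave'
  2 'Employed both waves'.
EXECUTE.
```

Separate IF statements are used in place of the "else" wording because a case with one wave missing and the other wave yes still counts as employed in either wave. A case with one wave missing and the other wave no stays missing on EmpPattern.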

25 Joining/combining data files
Hypothetical example: data availability across instruments
Youth / Interview Data W1 / Assessment Data W2 / Interview Data W2 / Program Data W2
1: Yes, No
2: Yes, No, Yes, No
3: Yes
4: No, Yes
5: No, Yes, No
6: Yes
There will be missing records across files and missing items within files. If data look like this, ask which file is the main file being analyzed.

26 Joining/combining data files
- Data in all files must be sorted by the key variable.
  - The key variable matches files case by case.
  - The key variable is ID.
  - Files on the CD should be sorted by the key variable, but as you work with files they may become unsorted.
- Code in syntax editor to sort data
    SORT CASES BY ID (A).
- To sort data from menu
  - Data: Sort Cases
  - Select the Sort in ascending order radio button.
  - Select ID and move it to the Sort by box by clicking the right-facing arrow.
  - Click OK, or click Paste and run the code from the syntax editor.

27 Joining/combining data files
Code in syntax editor to join files
    MATCH FILES /FILE='C:\MyProjects\NLTS2\Data\n2w1parent.sav'
      /TABLE='C:\MyProjects\NLTS2\Data\n2w2paryouth.sav'
      /TABLE='C:\MyProjects\NLTS2\Data\n2w3paryouth.sav'
      /TABLE='C:\MyProjects\NLTS2\Data\n2w4paryouth.sav'
      /BY ID
      /KEEP=ID np1i_3a_7 np2HasPdJob np3HasJob np4HasJob.
    EXECUTE.
Why FILE or TABLE?
- FILE: All cases in the file are kept.
- TABLE: Keeps only the cases that match those found in FILE.
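
The difference between FILE and TABLE is easiest to see with toy data. The sketch below builds two tiny datasets inline instead of using NLTS2 files; the dataset names mainfile and lookup and the variables w1val and w2val are hypothetical.

```spss
* Toy main file with IDs 1 to 3.
DATA LIST FREE / ID (F4.0) w1val (F4.0).
BEGIN DATA
1 10
2 20
3 30
END DATA.
DATASET NAME mainfile.

* Toy lookup file with IDs 2 to 4.
DATA LIST FREE / ID (F4.0) w2val (F4.0).
BEGIN DATA
2 200
3 300
4 400
END DATA.
DATASET NAME lookup.

* Both toy files are already in ascending ID order.
DATASET ACTIVATE mainfile.
MATCH FILES /FILE=* /TABLE=lookup /BY ID.
EXECUTE.
LIST.
```

The merged file should contain only IDs 1, 2, and 3. ID 1 has no match in the lookup file, so its w2val is system-missing. ID 4 appears only in the TABLE file and is not added.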

28 Joining/combining data files
To join data using menu-driven options
- From the active file, go to menu item Data and select Merge Files and Add Variables.
- Select the file from the browser window and click Open.
  - Note: All variables that appear in both files will automatically be moved to the Excluded Variables box.
- Select ID in the Excluded Variables box and do the following:
  - Click Match cases on key variables in sorted file.
  - Select the External file is keyed table radio button.
    - In some versions of SPSS, this is Non-active dataset is keyed table.
    - Keyed table keeps only those cases that match those in the active file.
  - Click the arrow next to the Key Variables box to move ID into it.

29 Joining/combining data files
To join data using menu-driven options (cont'd)
- Select variables to keep or drop.
  - Active file variables are marked with "*".
  - Variables from the keyed or external file are marked with "+".
  - To drop items, move variables from the New Active Dataset box to the Excluded Variables box by highlighting each variable and clicking the left-facing arrow.
- If only selecting a few variables
  - Exclude all those marked with "+" using click/shift-click on the first and last variables.
  - Click the left-facing arrow to move the variables to the Excluded Variables box.
  - Select each variable to keep in the Excluded Variables box.
  - Click the right-facing arrow to move it to the New Active Dataset box.
- Press OK to run, or select Paste to run the code from the syntax editor.

30 Joining/combining data files
To save your work using menu-driven instructions
- Select Save As from the File menu and give the file a name.
- If a new file has already been created with a new name using the steps outlined for subsetting data files, select Save under the File menu.
To save your work using the syntax editor
- A file can be saved using either an existing or new name.
- To save a file with the name MyNewFile:
    SAVE OUTFILE='C:\MyProjects\NLTS2\Data\MyNewFile.sav'.

31 Joining/combining data files
If bringing in data from more than one file, repeat this process for each file.
Suggestion: Name files in a meaningful way, such as
- By date: AnFile_29July.sav
- By type of analysis: PI_CrossWave.sav
- By source: PI_W123.sav
- By sequence: File_5.sav

32 Joining/combining data files: Example
Combine data from another file with an existing file.
- Open and sort PrScores.sav by ID.
- Bring in np2HasPdJob from n2w2paryouth.sav.
- Bring in np3HasJob from n2w3paryouth.sav.
- Save the file as PrScoresEmp.sav.
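
This exercise could be worked in syntax roughly as follows. The sketch assumes all files are in the example folder C:\MyProjects\NLTS2\Data and that the Wave 2 and Wave 3 files are still sorted by ID as delivered.

```spss
* Open the file created in the earlier example and sort it by the key variable.
GET FILE='C:\MyProjects\NLTS2\Data\PrScores.sav'.
SORT CASES BY ID (A).

* Keep every case in PrScores and look up the two job items by ID.
* The asterisk refers to the active dataset.
MATCH FILES /FILE=*
  /TABLE='C:\MyProjects\NLTS2\Data\n2w2paryouth.sav'
  /TABLE='C:\MyProjects\NLTS2\Data\n2w3paryouth.sav'
  /BY ID
  /KEEP=ID ndacalc_pr ndaPC_PR ndasyn_pr NDaF1_friend na_age4
        w2_dis12 w2_gend2 na_grade4 w2_incm3 wt_na
        np2HasPdJob np3HasJob.
EXECUTE.

* Save the combined file under a new name.
SAVE OUTFILE='C:\MyProjects\NLTS2\Data\PrScoresEmp.sav'.
```

Because PrScores is named on FILE and the interview files on TABLE, the number of cases in PrScoresEmp.sav should equal the number in PrScores.sav.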

33 Joining/combining data files: Example

34 Closing
Congratulations, you have learned to
- Open and view a file
- Create a new file
- Reduce the size of files by specifying the
  - Variables needed
  - Cases needed
- Join files using a key variable
- Save files with a new name

35 Closing
Topics discussed in this module
- Purpose
- Open and view data files
- Limiting variables
- Subsetting cases
- Joining/combining data files
Next module: 15a. Accessing Data: Frequencies in SPSS

36 Important information
- The NLTS2 website contains reports, data tables, and other project-related information: http://nlts2.org/
- Information about obtaining the NLTS2 database and documentation can be found on the NCES website: http://nces.ed.gov/statprog/rudman/
- General information about restricted data licenses can be found on the NCES website: http://nces.ed.gov/statprog/instruct.asp
- E-mail address: nlts2@sri.com

