1
ANALYZING QUANTITATIVE DATA WITH SPSS
© Susan Mowers Fall 2016
2
HANDS-ON OPEN PRESENTATION
Type the workshop URL in your browser URL box. Double-click on the file. Save it to your desktop in a folder called “SPSS” (create the folder if needed).
3
Overview Working with quantitative data Introducing SPSS
Elements of your descriptive analysis
Before SPSS, check your data literacy: Who do the data cover? When and where were they gathered? Was it a sample, and of whom? What do the data mean? In other words, this is the background story to your analysis and the write-up of your methodology.
Descriptive statistics 101
Archive your work
Introduction to univariate and bivariate descriptive analysis for data variables
Weights
Selected options for presenting data
ICPSR data research materials for students…
Quantitative human data is very likely to include qualitative data, and is often self-reported.
Quantitative analysis has two aspects:
Descriptive statistics (describing, summarizing or showing the data at hand), e.g., according to the data from the 2006 Census, women are twice as likely as men to be single parents if they live in Ontario.
Inferential analysis, which is about predictions.
4
Learning outcomes
Find out what files you need to do basic SPSS quantitative analysis
Learn how to capture your files and your work, so that you can work effectively and back up your research results
Learn what elements you can use for a descriptive analysis of your SPSS data: General Social Survey, 2014, cycle 28, Victimization: Main file
Learn what learning tools ICPSR has to offer for quantitative research
5
Introducing SPSS Benefits of SPSS? WHERE? SPSS is installed in:
Easy to learn
All the benefits of the computing syntax code behind your GUI-driven statistics
Designed for social science data
SPSS supports labels, declared missing values and so on (good data markup), e.g.:
Variable label: Perception of local police: Treating people fairly, for the variable PLP_160 (see page 8 of the codebook)
Value labels: Married for the value 1, Living common-law for the value 2, … (see pages 2-3 of the codebook)
The labels will show up on your tables and graphs!
Missing value: 9 = Not stated. This must be declared so that SPSS treats 9 as a « missing value » and not as a valid value
SPSS also supports weighting
Caution: labels and missing values should be declared by the data producer/provider, but they still need to be checked.
Non-SPSS file formats? Conversion! Can you import or transfer to SPSS .sav? Check whether the original file has data labels, and make sure you have retained the embedded metadata (SPSS supports some « codebook information »). StatTransfer is one tool. Note that .sav is the standard SPSS data file and .por is a portable SPSS file.
WHERE? SPSS is installed in: the Vanier labs on the 2nd floor; the 3rd floor Morisset lab, here: MRT 308; and the FSS Library (FSS 2010).
SPSS = graphical pull-down menus + syntax code to re-run: you, your supervisor or instructor, and the readers of your research articles can re-run your statistics (research replication).
Labels are essential for your descriptive statistics, which are partly about showing your statistics, either through statistical tables or in graphs!
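The caution about checking labels and missing values can be acted on from the Syntax editor. A minimal sketch, assuming the workshop data file is already open; whether 9 is already declared as missing for PLP_160 is not stated in the slides, so the second command is only needed if the dictionary listing shows it is not:

```spss
* List the variable label, value labels and declared missing values for PLP_160.
DISPLAY DICTIONARY /VARIABLES=PLP_160.

* Only needed if 9 (Not stated) is not already declared as missing.
MISSING VALUES PLP_160 (9).
```

Running the dictionary listing first keeps the check separate from any change to the file.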
6
Working with quantitative data
You need to know the source and methods for your data
What is the basic information? See the Statistics Canada web site: [StatsCan page for the GSS cycle 28, 2014, Victimization]
What about the rest of the detail? See <odesi> for your data and documentation (follow the video(s): NAVIGATE … and DOWNLOAD)
What about an overview of the findings and a comparison with the previous 5 cycles? See Samuel Perreault, “Criminal Victimization in Canada, 2014”, Juristat.
Confirm the suitability of the survey and the topics!
Why do you need to gather various files before you get started in SPSS?
Understand what the data represent: are they meaningful to your research question?
Be prepared to answer how you arrived at your results: are your results replicable / valid? Shareable with your supervisor / team?
Keep your work every day in one place! Then double-check regularly: are your files backed up?
Make sure the data license allows you to share your data with the rest of your research team and supervisor at the U of O. Yes, if the data come from Odesi or ICPSR.
Describe your methodology: cite your data!! Name your key variables / measurements. Are they different from the original data? Can you provide the SPSS code of your work to your professor or team-mates?
7
OUR Research question:
Does getting older have a positive effect on the perception that the local police treat people fairly? For men and women both? As time allows, you can explore how other independent variables, such as demographic, employment, victimization, discrimination and other life experiences, affect people’s belief that the police treat people fairly.
8
DATA documentation QUESTIONNAIRE DATA DICTIONARY
9
And the Must-have data fileS
Our SPSS data file
From the GSS Victimization survey of 2014 (cycle 28); the survey started in 1988.
Fifty out of 789 variables. All cases. We will add one more variable…
Includes: Persons aged 15 and older in the ten provinces and the three territories (2 sampling methods), selected according to a telephone framework associated with addresses.
Excludes: Persons living in institutions. Canadians without telephones…
Also includes crime not reported to the police, and so provides a complement to police-reported crime data.
Best practice: Cite your data!
Statistics Canada. (2016). General social survey (GSS), 2014, cycle 28, Victimization, Main file [Public use microdata file]*. Ottawa, Ontario: Statistics Canada [producer and distributor]. Retrieved from
Citation style: APA
* or [Public use microdata file and codebook] … and codebook is implied
10
HANDS-ON OPEN PAGE: http://bit.ly/GSS_V_files
Must-have files access page for workshop Please keep this page open Can you find your public use data source from here? Normal location [LINK] Can you find your survey documentation? What documentation should you download? Normally you would download all files. The web page to the left provides all these files for convenience
11
HANDS-ON OPEN FIRST SPSS file Read in a data set (next page)
Look at the Data editor (including entering data and selecting a subset)
Look at the Variable view (including labels)
12
Hands-on – Data management
Open the first data file
In SPSS, open the data file: navigate to your Desktop folder SPSS\data\
Select gss-12M0018-E-2014-c-28-vmf_F1.sav
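The same step can be run from the Syntax editor, which leaves a record of which file was opened. A minimal sketch: the drive and user folder in the path are placeholders, and the dataset name gss_main is a hypothetical label chosen for illustration.

```spss
* The folder path is a placeholder. Replace it with the path to your own Desktop.
GET FILE='C:\Users\yourname\Desktop\SPSS\data\gss-12M0018-E-2014-c-28-vmf_F1.sav'.

* Optional: gss_main is a hypothetical name that makes the open file easy to refer to.
DATASET NAME gss_main WINDOW=FRONT.
```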
13
HANDS-ON – DATA MANAGEMENT
Look at the Data editor (including entering data and selecting a subset)
Look at the Variable view … note the variable and value labels
14
Must-have files for descriptive analysis using spss
The user guide is preferred over the Statistics Canada survey page. It answers SURVEY METHODOLOGY QUESTIONS: Does the survey cover the right population and time period?
Who (target population, response rate)
When (data gathered/covered)
Why (e.g., policing, public policy, safety, victim services)
How (survey design/method)
This is the background to your research story!
Codebook, Questionnaire… They answer: What is the size of the sample? Does the survey cover the right concepts and measurements? Will you need to do more work (and how) on the data before it is ready to analyse? What is the weight variable?
SPSS data file (either the full dataset or a subset, as you have just seen)
For more information about accessing Statistics Canada’s public use microdata files, note: Data and Odesi. Access data from Odesi, ICPSR.
Main products for descriptive statistics: SPSS, Stata, SAS, even Excel (note that the Excel add-on Colectica supports embedded metadata)
Descriptive statistics describe, summarize and show the data; they don’t allow for predictions beyond the data
15
Analyzing quantitative data through descriptive statistics
Descriptive statistics describe, summarize and show the data
Do you know your types of data? Why does it matter?
Averages won’t work with data that reflect qualitative / categorical values, e.g., from the Canadian Community Health Survey … « Degree of perception that the local police treat people fairly » or « Have you ever been homeless? » or « Highest level of education », but …
they will work with quantitative values, e.g., « Number of children living at home » or « Weeks of employment in the past year » (but the GSS 28 lacks an income variable in actual dollars).
Types of data?
Nominal: Married, Common-law, Divorced/separated, Single/Never married
Ordinal: Less than high school, High school, Some post-secondary, Post-secondary grad.
Interval-ratio: $15,000 - $29,000 (as one of 10 levels of income, in equal income ranges of $14,999)
Continuous: Positive mental health continuous scale
Main products: SPSS, Stata, SAS, even Excel (note the Colectica version of Excel)
Descriptive statistics don’t allow for predictions beyond the data itself
Qualitative variables: numbers have no meaning; nominal groups are unordered, ordinal groups are ordered
Quantitative variables: numbers have meaning; spaces between numbers are equal
16
Importing data sources
Documents (.pdf)
SPSS file (.sav). Note: our two SPSS data files together hold a subset of the variables, namely those we are analysing.
Syntax file (.sps)
Log/output file (.spv)
The full file has 1381 variables; ours has xx, including two new variables I created to prepare the data for this workshop.
17
Hands-on data management
Your desktop work/storage area
Create a folder called spss on your computer desktop
In this folder, we have the folder structure:
gsg_spss.ppt
Data
Documentation
Commands
Output
18
Hands-on DATA MANAGEMENT
Now let’s combine two data files in SPSS
Note: both files are sorted by RECID.
Under Data: select Merge Files, then Add Variables
Navigate to your SPSS\data desktop folder and select: gss-12M0018-E-2014-c-28-quant.sav
Check your file. Have the new variables WET_110 and WHW_120 been added?
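In syntax, the merge could be sketched as follows. It assumes the first file is the active dataset, that both files are sorted by RECID as the slide notes, and that the folder path (a placeholder here) is replaced with your own.

```spss
* Add the variables from the second file to the active dataset, matching cases on RECID.
* The folder path is a placeholder. Replace it with the path to your own Desktop.
MATCH FILES /FILE=*
  /FILE='C:\Users\yourname\Desktop\SPSS\data\gss-12M0018-E-2014-c-28-quant.sav'
  /BY RECID.
EXECUTE.

* Confirm that the two new variables have arrived.
DISPLAY DICTIONARY /VARIABLES=WET_110 WHW_120.
```

Matching on RECID rather than on row position is what keeps each respondent's new values attached to the right case.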
19
Descriptive statistics 101 … univariate analysis
What do the data have to say about our categorical values?
Use Frequencies for categorical values (e.g., marital status, body mass index groups, education levels, …)
For nominal values and ordinal values: distribution in raw numbers and percents (ratios)
Select statistics: mode for nominal values
Select statistics: median and mode for ordinal values
Frequencies for age groups and education level: the mode (most typical)
Descriptives for Positive mental health continuous scale
Cross-tabulation (percentages): two categorical variables, e.g., BMI nominal: overweight, underweight, just about right (do you consider yourself)
Means: compares the average (mean) between groups. Use when one variable is interval and the other is ordinal or nominal. E.g., who is more highly educated, married men or married women? (married men/women is a recoded/derived value that I created)
20
WHAT ARE FREQUENCIES? … What are sample weights?
Unweighted Frequencies Describe the distribution of your survey respondents’ legal marital status Weighted Frequencies Describe the distribution of your target population’s (Canadians) legal marital status
21
Hands-on Let’s do a frequency table for a categorical value
… choose one nominal and one ordinal variable which you previously identified
22
Menu method for frequency + USING paste to syntax (best practice)
Click on Analyse
Click on Descriptive statistics
Click on Frequencies
Select your variable, using the right arrow
Click on Statistics
Select Mode and, if relevant, Median
Click Continue
Click Paste
Run your command in your Syntax editor (highlight it and click on the right arrow at the top)
FREQUENCIES VARIABLES=PLP_160 /STATISTICS=STDDEV MEDIAN MODE /ORDER=ANALYSIS.
Do this first and continue to slides 21 – 22 (you will re-run as weighted frequencies as step 2)
Now run the complete procedure for measures of central tendency
23
Hands-on Turn on the weight
Pull-down menu: Data, then scroll down and select Weight cases
Paste to your syntax editor
Re-run your frequencies, pasting again into your syntax editor
You now have population estimates for your data! Weighted frequencies can be used as research results
We will keep the weight turned on for the rest of our descriptive statistics workshop.
Best practice: by pasting your code into your syntax editor, you can clearly see whether your data are weighted or not
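Pasted to the Syntax editor, the weighting step might look like the sketch below. The weight variable name WGHT_PER is an assumption and is not stated in the slides; confirm the actual name in the codebook before running.

```spss
* Sketch only. WGHT_PER is an assumed name for the person-level weight variable.
WEIGHT BY WGHT_PER.

* Re-run the frequency table, now weighted to the target population.
FREQUENCIES VARIABLES=PLP_160
  /STATISTICS=MEDIAN MODE
  /ORDER=ANALYSIS.
```

The weight stays on for every later procedure in the session until WEIGHT OFF is run.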
24
HANDS-ON VERIFICATION TEST
Compare your unweighted and weighted frequencies for one variable found in your data dictionary Do the valids, missing and totals match?
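One way to set up this comparison is to run the same table twice, once with the weight off and once with it on. A minimal sketch; WGHT_PER is again an assumed weight variable name to be checked against the codebook.

```spss
* Unweighted run, which describes the survey respondents.
WEIGHT OFF.
FREQUENCIES VARIABLES=PLP_160.

* Weighted run, which estimates the target population.
* WGHT_PER is an assumed weight variable name. Check the codebook.
WEIGHT BY WGHT_PER.
FREQUENCIES VARIABLES=PLP_160.
```

The weighted counts are population estimates, so they will be much larger than the respondent counts; the percentages are the figures to compare side by side.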
25
HANDS-ON VERIFICATION TEST
Now return to slide 20 and run full procedure for measures of central tendency for your categorical values.
26
(2) Descriptive statistics 101: Categorical variables, continued
Create bar graphs to show the relationship between two categorical variables
We will first do our re-code here for trust (1), no trust (0) in police.
Let’s compare age pyramids for married men versus married women
Click on Graphs
Select Legacy Dialogs
Choose Bar
Click on Clustered
Click on Define
Click on % of cases
Select Trust > Category axis
Select AGEGR10 > Define Clusters by
Click Paste and Run in your Syntax editor as usual
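The re-code and the clustered bar chart could be sketched in syntax as follows. The slides do not say which values of PLP_160 count as trust, so the mapping below is hypothetical and must be adjusted to the value labels in the codebook; the label wording for Trust is also illustrative.

```spss
* Hypothetical coding. Assumes 1 means trust and 2 or 3 mean no trust on PLP_160.
* Values not listed, such as 9, become system-missing in the new variable.
RECODE PLP_160 (1=1) (2,3=0) INTO Trust.
VARIABLE LABELS Trust 'Trust in local police to treat people fairly'.
VALUE LABELS Trust 0 'No trust' 1 'Trust'.
EXECUTE.

* Clustered bar chart showing percent of cases by Trust, clustered by age group.
GRAPH
  /BAR(GROUPED)=PCT BY Trust BY AGEGR10.
```

Recoding INTO a new variable leaves the original PLP_160 untouched, which makes the derived measure easy to document and re-run.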
27
WHAT IS A HISTOGRAM?
28
Let’s do two histograms
First for a continuous variable: Number of weeks employed, last 12 months (showing distribution)
Second for perception of fairness of local police (dummy variable). (Does perception of the fairness of local police increase with age?)
29
Hands-on Let’s do a histogram to show the distribution of Number of weeks employed – Past twelve months
30
Menu method for HISTOGRAMS USING paste to syntax (best practice)
Click on Graphs
Click on Legacy dialogs
Choose Histogram
Click on your variable of interest and use the right arrow: Weeks employed … OR Age group …
Optional: If you want separate histograms for different groups, e.g., perception of fairness of local police, you can put an additional variable in the Panel by: area. Choose Rows if you would like the two graphs on top of each other, or Columns if you want them side by side.
Click on Statistics
Click Paste
Run your commands in your Syntax editor (highlight the commands and click on the right arrow at the top)
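The pasted commands for these steps would look roughly like the sketch below. It assumes that WET_110 is the number of weeks employed in the past twelve months (check the codebook) and that Trust, the recoded trust / no trust variable, already exists in the file.

```spss
* WET_110 is assumed to be the number of weeks employed in the past twelve months.
GRAPH
  /HISTOGRAM=WET_110.

* Optional panels with one histogram per Trust group, stacked in rows.
* Trust is the recoded variable and must already exist in the file.
GRAPH
  /HISTOGRAM=WET_110
  /PANEL ROWVAR=Trust ROWOP=CROSS.
```

Replacing ROWVAR and ROWOP with COLVAR and COLOP places the panels side by side instead.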
31
Quick note: Standard deviation
From Math is Fun: “The Standard Deviation is a measure of how spread out numbers are” and “what is normal and more extreme”. From data on the height of a group of dogs …
Calculate the mean
Calculate the variance (the average of the squared differences from the mean)
The standard deviation is the square root of the variance
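Written out, the calculation described above is the following. The numbers in the worked line are illustrative values chosen for easy arithmetic, not the dog heights and not workshop data.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}

% Illustrative values: 2, 4, 4, 4, 5, 5, 7, 9
\mu = \frac{40}{8} = 5, \qquad
\sigma^2 = \frac{9+1+1+1+0+0+4+16}{8} = \frac{32}{8} = 4, \qquad
\sigma = 2
```

SPSS reports the sample standard deviation, which divides by N - 1 instead of N; for the same eight values that gives the square root of 32/7, about 2.14.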
32
WHAT ARE Descriptives? Continuous variables Weighted:
33
Hands-on Let’s do descriptives for Number of weeks employed – Past twelve months Note, Respondents aged years have been removed from the data
34
Descriptive statistics 101 ACCORDING TO OUR DATA … Continuous variables, part 2
Run basic descriptive statistics for continuous variables, e.g.:
Click on the pull-down menu Analyse
Click on Descriptive statistics
Click on Descriptives
Click on the continuous variable(s) you want, like Number of weeks employed – last 12 months, and click on the right arrow
Click on Options to the right. Select Minimum, Maximum, Mean (= average) and, for more advanced users, Standard deviation
Note: your weight should still be turned on. If not, turn it back on.
Click on Paste, and then, in the Syntax editor, run your syntax
Frequencies for age groups and education level: the mode (most typical)
Descriptives for Positive mental health continuous scale
Cross-tabulation (percentages): two categorical variables
Means: compares the average (mean) between groups. Use when one variable is interval and the other is ordinal or nominal, e.g., who has worked longer at their job, men or women?
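The Descriptives steps, and a Means comparison of the kind mentioned at the end, might paste as in the sketch below. Both variable names are assumptions: WET_110 is taken to be the number of weeks employed, and SEX is a guess at the name of the respondent's sex variable; confirm both in the codebook.

```spss
* WET_110 is assumed to hold the number of weeks employed. Confirm in the codebook.
DESCRIPTIVES VARIABLES=WET_110
  /STATISTICS=MEAN STDDEV MIN MAX.

* Compare the mean between groups. SEX is an assumed variable name.
MEANS TABLES=WET_110 BY SEX
  /CELLS=MEAN COUNT STDDEV.
```

Both procedures respect the current weight setting, so the results are population estimates only if the weight is still on.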
35
WHAT IS A CROSS TABULATION?
Note: you could group age groups into 15 year intervals for example, for a simpler crosstab
36
Don’t forget clustered bar charts with cross tabulations
37
Hands-on Let’s use Summary statistics to compare the distribution of age groups by level of perception of the fairness of local police Cross tabulation with bar graph (categorical values)
38
Descriptive statistics 101 ACCORDING TO OUR DATA …
Does age affect the perception of the fairness of the local police?
Bar graph: Show
Cross-tabulation: Summarize
Frequencies for age groups and education level: the mode (most typical)
Descriptives for Positive mental health continuous scale
Cross-tabulation (percentages): two categorical variables
Means: compares the average (mean) between groups. Use when one variable is interval and the other is ordinal or nominal, e.g., who has worked longer at their job, men or women?
39
Source for next slide: Gustave Goldmann workshop, Introduction to SPSS, Jan. 17, 2014
40
Additional hints on data management
Define variable labels: VARIABLE LABELS varname 'variable label'.
Define variable level of measurement: VARIABLE LEVEL varname (NOMINAL).
Define values for a code set: VALUE LABELS varname 1 'label1' 2 'label2' ...
(Source: Gustave Goldmann, January 2014)
Note: a menu process is possible too, using the Transform -> Recode features or the Variable view. (S.M., September 29, 2016)
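Filled in for PLP_160, these commands could look like the sketch below. The variable label and the meaning of code 9 come from the codebook examples earlier in the deck; treating PLP_160 as ordinal is an assumption about its response scale.

```spss
* Variable label as given in the codebook example for PLP_160.
VARIABLE LABELS PLP_160 'Perception of local police: Treating people fairly'.

* Assumed level of measurement. Use NOMINAL or SCALE if the codebook says otherwise.
VARIABLE LEVEL PLP_160 (ORDINAL).

* ADD VALUE LABELS keeps the existing labels and adds or updates only the one listed.
ADD VALUE LABELS PLP_160 9 'Not stated'.
```

Plain VALUE LABELS replaces every existing value label on the variable, so ADD VALUE LABELS is the safer choice when the data provider has already labelled most codes.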
41
Descriptive statistics 101: Categorical variables, continued
Create a cross tabulation to show the relationship between two categorical values
Let’s compare summary statistics of age groups for married men versus married women
Click on Analyse
Select Descriptive statistics
Choose Crosstabs
Select AGEGR10 > Rows
Select PLP_160 > Columns
Click on Cells
Unselect Observed for Counts
Click on Column for Percentages
Click on Continue
Click Paste and Run in your Syntax editor as usual
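With those selections, the pasted command should resemble the sketch below: age groups in the rows, PLP_160 in the columns, and column percentages only. The exact subcommands SPSS pastes can vary slightly by version.

```spss
* Rows are age groups, columns are perception of police fairness.
* Only column percentages are shown because Observed counts are unselected.
CROSSTABS
  /TABLES=AGEGR10 BY PLP_160
  /FORMAT=AVALUE TABLES
  /CELLS=COLUMN
  /COUNT ROUND CELL.
```

Column percentages sum to 100 within each PLP_160 category, so each column shows the age profile of the people who gave that answer.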
42
OTHER stats INTERESTS for December?
Further data visualizations with SPSS: scatterplots, box and whisker plots
Best practices …
Further analysis: multivariate, simple correlations and regressions
Data entry, data importing, cleaning up data in SPSS
SAS / Stata
43
Resources for your research projects
ICPSR Resources Where to find data and its must-have documentation? Odesi – Canadian ICPSR – International Ask us: GSG © Susan Mowers 2016