Introduction to SPSS Asst. Prof. Dr. Emrah Oney.

Topics we will cover today
SPSS at a glance
Basic Structure of SPSS
Cleaning your data
Descriptive Statistics
Charts
Other Resources

SPSS at a glance
SPSS stands for Statistical Package for the Social Sciences.
SPSS was made to be easier to use than other statistical software such as S-Plus, R, or SAS.
The newest version of SPSS is SPSS
Today we will be working on SPSS 16.0.

How to open SPSS
Go to START
Click on PROGRAMS
Click on SPSS INC
The computers in the lab typically have SPSS on the desktop. It is a red box that says SPSS on the top.

Opening a data file
Click on FILE → OPEN → DATA
Click MY COMPUTER → LOCAL DISK C:/
Click PROGRAM FILES → SPSS
Click TUTORIAL → SAMPLE FILES
Select CATALOG.SAV

Basic structure of SPSS
There are two different windows in SPSS:
1st: Data Editor window, which shows data in two forms
  Data view
  Variable view
2nd: Output Viewer window, which shows results of data analysis
*You must save the Data Editor window and the Output Viewer window separately. Make sure to save both if you want to keep your changes to the data or the analysis.*

Data view vs. Variable view
Data view
  Rows are cases
  Columns are variables
Variable view
  Rows define the variables
  Name, Type, Width, Decimals, Label, Missing, etc.
  Scale: age, weight, income
  Nominal: categories that cannot be ranked (ID number)
  Ordinal: categories that can be ranked (level of satisfaction)

Entering Data
You can either enter data by hand or import it from other programs (e.g., Excel)
Hand entering
Insert a variable by any of the following:
  Right-clicking one of the rows in variable view and selecting “Insert Variable”
  Entering a “Name” in variable view and pressing “Enter” or “Tab”
  Right-clicking on a column in the data view and selecting “Insert Variable”
  Clicking on the “Insert Variable” icon in the Toolbar
  Clicking on “Data” → “Insert Variable”

Entering Data
Define variables in variable view
Name = Name of variable displayed in data view
Type = Numeric, Comma, Dot, Scientific notation, Date, Dollar, Custom currency, & String
Width = # of digits displayed in data view
Decimals = # of decimal places displayed in data view
Label = Name of variable displayed when running analyses
Values = Value Labels, e.g. 1 = Male, 0 = Female
Missing = Values that the system will recognize as missing
Columns = # of columns used to display variable in data view
Align = variable left, right, or center aligned
Measure = scale on which variable is measured: Nominal, Ordinal, or Scale (Interval or Ratio)

Entering Data
Importing Data
Click “File” → “Open” → “Data”
Select the file type in question
If Excel:
  Make sure the top row of the Excel file lists variable names and the variables all have different names
  After selecting the file, press Enter and make sure the box “Read variable names from the first row of data” is checked
  Make sure your variables are defined properly in the variable view

Menus
File & Edit Menus
  Exactly the same as in all Windows programs
View Menu
  Allows you to customize the SPSS desktop
  Status Bar: “Processor Area” at the very bottom of the screen
  Toolbars
  Fonts
  Grid Lines
  Value Labels: make sure this is selected if you want to use them
  Variables/Data view

Menus
Data Menu
Define Dates… = Inserts a Date variable
Insert Variable
Insert Case
Go to Case…
Sort Cases… = Ascending or descending order
Transpose… = Switches cases and variables (former in columns and latter in rows)
Merge Files: more on this later
Split Files: more on this later
Select Cases = If condition is satisfied, Random sample of cases, Based on time or case range, Use filter variable

Splitting and Merging Files
Splitting
  Click on “Organize output by groups”; the grouping variable should be discrete (e.g., gender, hair color, etc.)
  Click on the grouping variable and move it to the “Groups Based on” box
  Click “OK”
Merging
  You can add either variables or cases
  If adding variables:
    Make sure both files share at least one variable that is identical, the key variable (e.g., SubID)
    Make sure both files are sorted by this variable
    Make sure, in both files, all cases have data for this variable and there are no duplicate cases
    Click on “Merge Files” → “Add Variables”
    Find the file you wish to merge with the one you have open
    The variable in the “Excluded Variables” box should be the key variable, denoted by a (+) indicating its presence in both files
    Click on “Match cases on key variables in sorted files”
    Move the key variable to the “Key Variables” box
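The checks listed for adding variables (a shared key, no missing keys, no duplicate cases, sorted files) can be sketched outside SPSS. The following is a minimal Python illustration, not SPSS's own procedure; the two small "files" and the helper names `check_key` and `add_variables` are hypothetical, and only the key name SubID comes from the slide.

```python
# Hypothetical data: two "files" that share the key variable SubID.
file_a = [
    {"SubID": 2, "age": 31},
    {"SubID": 1, "age": 25},
    {"SubID": 3, "age": 47},
]
file_b = [
    {"SubID": 3, "score": 88},
    {"SubID": 1, "score": 92},
    {"SubID": 2, "score": 75},
]


def check_key(rows, key):
    """Raise if any case lacks the key variable or the key value is duplicated."""
    seen = set()
    for row in rows:
        value = row.get(key)
        if value is None:
            raise ValueError(f"case without a value for {key!r}: {row}")
        if value in seen:
            raise ValueError(f"duplicate {key!r} value: {value}")
        seen.add(value)


def add_variables(left, right, key):
    """Match cases on the key variable and combine their variables."""
    check_key(left, key)
    check_key(right, key)
    right_by_key = {row[key]: row for row in right}
    merged = []
    for row in sorted(left, key=lambda r: r[key]):  # sort by the key variable
        combined = dict(row)
        combined.update(right_by_key.get(row[key], {}))
        merged.append(combined)
    return merged


merged = add_variables(file_a, file_b, "SubID")
for case in merged:
    print(case)
```

Each merged case carries the variables from both files; a case present in only one file would simply keep the variables it has.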

Menus
Transform Menu
Compute...
  Name new variable in “Target Variable” box
  Type equation in “Numeric Expression” box
Recode: Into Same/Different Variable
  Select variable(s) to recode and move to the “Variables” box
  Click “Old and New Values”
  Click “OK”

Obtaining Descriptive Statistics
Frequencies
Click on “Analyze” → “Descriptive Statistics” → “Frequencies”
Use to determine counts on values of variables
Cut scores and percentiles

SPSS Output for Frequency Distribution

Relative Frequency Distribution
[Table: Relative Frequency Distribution of IQ for Two Classes. Columns: IQ, Frequency, Percent, Valid Percent, Cumulative Percent; last row: Total. The cell values are not preserved.]

Grouped Relative Frequency Distribution
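[Table: Relative Frequency Distribution of IQ for Two Classes, grouped into IQ intervals starting at 80, 90, 100, 110, 120, 130, 140, and “150 and over”. Columns: IQ, Frequency, Percent, Cumulative Percent; last row: Total. The cell values are not preserved.]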
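To show how the columns of a grouped relative frequency table are computed, here is a minimal Python sketch. It uses only the 13 Class A scores listed on the median slides (the original table combined two classes, and its values are not available), and it assumes intervals of width 10; `BIN_WIDTH` and `bin_start` are names introduced for this illustration.

```python
from collections import Counter

# Class A IQ scores as listed on the median slides.
class_a = [89, 93, 97, 98, 102, 106, 109, 110, 115, 119, 128, 131, 140]

BIN_WIDTH = 10  # assumed interval width; the original table's bounds are not preserved


def bin_start(score, width=BIN_WIDTH):
    """Return the lower bound of the interval that contains the score."""
    return (score // width) * width


counts = Counter(bin_start(score) for score in class_a)
n = len(class_a)

table = []  # rows of (interval label, frequency, percent, cumulative percent)
cumulative = 0.0
for start in sorted(counts):
    percent = 100 * counts[start] / n
    cumulative += percent
    label = f"{start}-{start + BIN_WIDTH - 1}"
    table.append((label, counts[start], percent, cumulative))

print(f"{'IQ':<10}{'Frequency':>10}{'Percent':>10}{'Cum. %':>10}")
for label, freq, percent, cum in table:
    print(f"{label:<10}{freq:>10}{percent:>10.1f}{cum:>10.1f}")
print(f"{'Total':<10}{n:>10}{100.0:>10.1f}")
```

Percent is the interval's count divided by the number of cases; cumulative percent is the running total of the percent column, so it reaches 100 in the last row.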

Descriptives
Click on “Analyze” → “Descriptive Statistics” → “Descriptives”
Use to get descriptive statistics (central tendency, variability, etc.)
Use to convert variables to z-scores
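A z-score expresses each value as its distance from the mean in standard deviation units. The sketch below is a small Python illustration using the Class A IQ scores from the median slides; it assumes the sample standard deviation (n - 1 denominator), and the variable names are chosen for this example only.

```python
from statistics import mean, stdev

# Class A IQ scores as listed on the median slides.
class_a = [89, 93, 97, 98, 102, 106, 109, 110, 115, 119, 128, 131, 140]

m = mean(class_a)
s = stdev(class_a)  # sample standard deviation (n - 1 denominator), an assumption here

# z = (value - mean) / standard deviation
z_scores = [(x - m) / s for x in class_a]

for x, z in zip(class_a, z_scores):
    print(f"IQ {x:>3}  z = {z:+.2f}")
```

By construction, the z-scores have a mean of 0 and a standard deviation of 1, which makes variables measured on different scales comparable.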

Explore
Click on “Analyze” → “Descriptive Statistics” → “Explore”
Use to examine descriptive statistics by grouping variable

Explore

Cleaning your data – missing data
There are two types of missing values in SPSS: system-missing and user-defined.
System-missing data is assigned by SPSS when a function cannot be performed, for example when dividing a number by zero.
SPSS marks a system-missing value with a single period in the data cell.

Cleaning your data – missing data
User-defined missing data are values that the researcher can tell SPSS to recognize as missing. For example, 9999 is a common user-defined missing value.
To define a variable’s user-defined missing value:
  Look at your variables in VARIABLE VIEW
  Find the column labeled MISSING
  Find the variable that you would like to work with
  Select that variable’s missing cell by clicking on the gray box in the right corner
A range can also be used if you only want to use half of a scale.
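Why a missing-value code has to be declared can be seen in a short Python sketch: if 9999 is treated as a real number, it distorts every statistic. The income values below are hypothetical, and `MISSING_CODE` is a name introduced for this example; only the code 9999 comes from the slide.

```python
MISSING_CODE = 9999  # the user-defined missing value mentioned above

# Hypothetical raw data in which 9999 stands for "no answer".
raw_income = [42000, 9999, 38500, 51000, 9999]

# Treat the code as missing (None) instead of as a real value.
income = [None if value == MISSING_CODE else value for value in raw_income]

valid = [value for value in income if value is not None]
mean_valid = sum(valid) / len(valid)
mean_naive = sum(raw_income) / len(raw_income)  # wrongly counts 9999 as data

print("valid cases:", len(valid))
print("mean of valid cases:", round(mean_valid, 2))
print("mean if 9999 is counted as data:", round(mean_naive, 2))
```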

Cleaning your data – missing data cont.
When you have missing data in your data set, you can fill in the missing data with surrounding information so it does not affect your analysis.
  Click TRANSFORM
  Click REPLACE MISSING VALUES
  Select the variable with missing values and move it to the right using the arrow
  SPSS will rename and create a new variable with your filled-in data
  Click METHOD to select what type of method you would like SPSS to use when replacing missing values
  Click OK and view your new data in data view
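As an illustration of one possible replacement method, the Python sketch below fills each missing value with the mean of the non-missing values in the series. This is a simplified stand-in and not SPSS's implementation; the data and the function name `replace_with_series_mean` are hypothetical. Like the procedure described above, it leaves the original variable untouched and produces a new one.

```python
def replace_with_series_mean(values):
    """Return a new list in which each None is replaced by the mean of the valid values."""
    valid = [v for v in values if v is not None]
    if not valid:
        raise ValueError("no valid values to compute a mean from")
    series_mean = sum(valid) / len(valid)
    return [series_mean if v is None else v for v in values]


# Hypothetical variable with two missing cases.
sales = [10.0, None, 14.0, 12.0, None]
sales_filled = replace_with_series_mean(sales)

print("original:", sales)
print("filled:  ", sales_filled)
```

Filling with the series mean keeps the mean unchanged but reduces the variability of the variable, which is worth keeping in mind when choosing a method.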

Graphing Data
Click GRAPH
Click CHART BUILDER
Click HISTOGRAM
Put MEN on the X axis
Click ELEMENT PROPERTIES
Check the box labeled DISPLAY NORMAL CURVE. This will superimpose a normal curve on your graph.
You can also change the style of your graph in this element properties window.
You can copy and paste these graphs into Word and Excel files.

Graphing Continued
There are other ways to make graphs.
Click ANALYZE
Click DESCRIPTIVE STATISTICS
Click FREQUENCIES
Click services
Click CHART
Click BAR CHART
Click PERCENTAGES

Data manipulation – select cases
By selecting cases, the researcher can select only certain cases for analysis
  Click DATA
  Click SELECT CASES
  Click RANDOM SAMPLE OF CASES
  Select your preferences
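Drawing a random sample of cases can be sketched in Python as follows. The case numbers, the 10% sampling fraction, and the seed are all assumptions made for this illustration; the seed is fixed only so that the same sample is drawn each time the code runs.

```python
import random

# Hypothetical data set: 100 cases identified by case number.
cases = list(range(1, 101))

rng = random.Random(42)                # fixed seed for a reproducible sample
sample_size = round(0.10 * len(cases))  # assumed preference: roughly 10% of cases

# Sample without replacement, then restore case order for readability.
selected = sorted(rng.sample(cases, sample_size))

print("selected cases:", selected)
```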

Data manipulation – compute new variable
Computing new variables: create a new variable from multiple variables
  Click TRANSFORM
  Click COMPUTE
  Fill in the new target variable: TOTALSALES
  Fill in the numeric expression: men+women+jewel
  Create an IF statement by clicking on the IF button
  Click INCLUDE IF CASE SATISFIES CONDITION
  Enter the condition: MAIL>10000
The new variable TOTALSALES gives the total sales for cases in which more than 10,000 catalogs were mailed.
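The logic of a conditional compute can be sketched in Python. The variable names men, women, jewel, mail, and TOTALSALES follow the example above; the numbers are hypothetical, and leaving the new variable empty (None) for cases that fail the condition is the assumed behavior in this sketch.

```python
# Hypothetical cases; only the variable names come from the example above.
catalogs = [
    {"mail": 12000, "men": 500.0, "women": 900.0, "jewel": 150.0},
    {"mail": 8000, "men": 300.0, "women": 650.0, "jewel": 90.0},
    {"mail": 15000, "men": 720.0, "women": 1100.0, "jewel": 210.0},
]

for case in catalogs:
    if case["mail"] > 10000:
        # Condition satisfied: compute the sum of the three sales variables.
        case["TOTALSALES"] = case["men"] + case["women"] + case["jewel"]
    else:
        # Condition not satisfied: the new variable stays missing.
        case["TOTALSALES"] = None

for case in catalogs:
    print(case["mail"], case["TOTALSALES"])
```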

Mean
Class A: IQs of 13 Students
102 115 128 109 131 89 98 106 140 119 110 93 97
Σ Yi = 1437
Y-barA = Σ Yi / n = 1437 / 13 = 110.54
Class B: IQs of 13 Students
Σ Yi = 1433
Y-barB = Σ Yi / n = 1433 / 13 = 110.23

Mean
The mean is the “balance point.”
Each person’s score is like 1 pound placed at the score’s position on a see-saw. Below, on a 200 cm see-saw, the mean equals 110, the place on the see-saw where a fulcrum finds balance:
  1 lb at 93 cm: 17 units below
  1 lb at 106 cm: 4 units below
  1 lb at 131 cm: 21 units above
  Fulcrum at 110 cm: 0 units
The scale is balanced because 17 + 4 on the left = 21 on the right.
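The balance-point property can be checked with a few lines of Python. The three positions come from the see-saw example; the variable names are introduced for this sketch.

```python
# Positions (in cm) of the three 1 lb weights on the see-saw.
positions = [93, 106, 131]

mean_position = sum(positions) / len(positions)

# Signed distance of each weight from the mean (negative = left of the fulcrum).
deviations = [p - mean_position for p in positions]

left = sum(-d for d in deviations if d < 0)   # total distance below the mean
right = sum(d for d in deviations if d > 0)   # total distance above the mean

print("mean:", mean_position)
print("deviations:", deviations)
print("left:", left, "right:", right)
```

The distances below the mean and above the mean are equal, which is the same as saying that the deviations from the mean always sum to zero.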

Mean
Means can be badly affected by outliers (data points with extreme values unlike the rest).
Outliers can make the mean a bad measure of central tendency or common experience.
[Figure: Income in the U.S., with the labels “All of Us”, “Bill Gates”, “Outlier”, and “Mean”.]

Median
The middle value when a variable’s values are ranked in order; the point that divides a distribution into two equal halves.
When data are listed in order, the median is the point at which 50% of the cases are above and 50% are below it.
The 50th percentile.

Median
Class A: IQs of 13 Students
89 93 97 98 102 106 109 110 115 119 128 131 140
Median = 109 (six cases above, six below)

Median
If the first student (IQ 89) were to drop out of Class A, there would be a new median:
93 97 98 102 106 109 110 115 119 128 131 140
Median = (109 + 110) / 2 = 219 / 2 = 109.5 (six cases above, six below)
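Both median calculations can be reproduced with Python's standard library. The scores are the Class A values from the slide; with an odd number of cases the median is the middle value, and with an even number it is the average of the two middle values.

```python
from statistics import median

# Class A IQ scores, ranked in order.
class_a = [89, 93, 97, 98, 102, 106, 109, 110, 115, 119, 128, 131, 140]

median_13 = median(class_a)        # 13 cases: the 7th value
after_dropout = class_a[1:]        # the first student (IQ 89) drops out
median_12 = median(after_dropout)  # 12 cases: average of the 6th and 7th values

print("median of 13 cases:", median_13)
print("median of 12 cases:", median_12)
```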

Median
The median is largely unaffected by outliers, which makes it a better measure of central tendency than the mean when data are skewed: it better describes the “typical person.”
[Figure: income distribution with the labels “All of Us”, “Bill Gates”, and “outlier”.]
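A short Python sketch makes the contrast between the two measures concrete. The income figures, including the single extreme value, are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical incomes for five people, then the same group plus one extreme earner.
incomes = [30_000, 40_000, 50_000, 60_000, 70_000]
with_outlier = incomes + [1_000_000_000]

print("without outlier: mean =", mean(incomes), " median =", median(incomes))
print("with outlier:    mean =", round(mean(with_outlier), 2),
      " median =", median(with_outlier))
```

Adding one extreme value moves the mean by several orders of magnitude, while the median only shifts to the midpoint between two neighboring incomes.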

Median
If the recorded values for a variable form a symmetric distribution, the median and mean are identical.
In skewed data, the mean lies further toward the skew than the median.
[Figure: a symmetric and a skewed distribution, each marked with its mean and median.]

Median
The middle score or measurement in a set of ranked scores or measurements; the point that divides a distribution into two equal halves.
When data are listed in order, the median is the point at which 50% of the cases are above and 50% are below.
The 50th percentile.

Mode
The most common data point is called the mode. A la mode!!
[Figure: the combined IQ scores for Classes A & B.]
BTW, it is possible to have more than one mode!

Mode
It may not be at the center of a distribution.
Data distribution on the right is “bimodal” (even statistics can be open-minded)
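Finding the mode, including the case of more than one mode, can be sketched with Python's standard library. The scores below are hypothetical (the combined scores shown on the slide are not available); `multimode` returns every value that shares the highest count.

```python
from statistics import multimode

# Hypothetical IQ scores in which two values tie for most frequent.
scores = [100, 105, 105, 110, 120, 120, 130]

modes = multimode(scores)  # all values with the highest frequency

print("modes:", modes)
if len(modes) == 2:
    print("this distribution is bimodal")
```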

Mode
It may give you the most likely experience rather than the “typical” or “central” experience.
In symmetric, single-peaked distributions, the mean, median, and mode are the same.
In skewed data, the mean and median lie further toward the skew than the mode.
[Figure: a symmetric and a skewed distribution, each marked with its mean, median, and mode.]

Descriptive Statistics
Summarizing Data:
Central Tendency (or Groups’ “Middle Values”)
  Mean
  Median
  Mode
Variation (or Summary of Differences Within Groups)
  Range
  Interquartile Range
  Variance
  Standard Deviation
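The four measures of variation can be computed for the Class A IQ scores with Python's standard library, as a minimal sketch. It assumes sample (n - 1) formulas for the variance and standard deviation. Quartile conventions differ between software packages, so the interquartile range from this code need not match another program's output exactly.

```python
from statistics import quantiles, stdev, variance

# Class A IQ scores as listed on the median slides.
class_a = [89, 93, 97, 98, 102, 106, 109, 110, 115, 119, 128, 131, 140]

value_range = max(class_a) - min(class_a)  # largest value minus smallest value

q1, q2, q3 = quantiles(class_a, n=4)       # quartiles (Python's default method)
iqr = q3 - q1                              # spread of the middle half of the data

var = variance(class_a)                    # sample variance (n - 1 denominator)
sd = stdev(class_a)                        # standard deviation = square root of variance

print("range:", value_range)
print("interquartile range:", iqr)
print("variance:", round(var, 2))
print("standard deviation:", round(sd, 2))
```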

Data manipulation – recode a variable
Recoding allows a researcher to create a new variable with a different set of values
  Click TRANSFORM
  Click RECODE INTO DIFFERENT VARIABLE
  Move mail over to the right
  Create a name for the new variable: mailcategories
  Click OLD AND NEW VALUES
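What the old-and-new-values step does can be sketched in Python: each range of old values is mapped to one new category code. The variable names mail and mailcategories follow the example above; the cut points, the category codes, and the function name `recode_mail` are assumptions, since the slide does not state which ranges were used.

```python
def recode_mail(mail):
    """Map the number of catalogs mailed to a category code (hypothetical cut points)."""
    if mail is None:
        return None  # missing stays missing
    if mail <= 10000:
        return 1     # up to 10,000 catalogs
    if mail <= 20000:
        return 2     # 10,001 to 20,000 catalogs
    return 3         # more than 20,000 catalogs


# Hypothetical values of the original variable.
mail = [7500, 10000, 15500, 23000, None]

# The original variable is kept; the recoded values go into a new variable.
mailcategories = [recode_mail(value) for value in mail]

print(mailcategories)
```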

Other Resources
There are many resources online to help you learn SPSS (tutorials, blogs, etc.)
CSSCR has a QuickTime SPSS class on its website
CSSCR offers SPSS handouts, which are also on its website
CSSCR offers classes on SPSS each quarter. Come back for the SPSS Beyond the Basics class!
