Introduction to Statistical Software Package

Presentation transcript:

Chapter 4: Introduction to Statistical Software Package
- 4.1 Data Input
- 4.2 Data Editor
- 4.3 Data Computation and Transformation

Introduction to Statistical Software Package
- The statistical software package used: SPSS (Statistical Package for the Social Sciences), v. 21/v. 22
- Sophisticated software used by social scientists and related professionals for statistical analysis
- Compatible with the Windows environment

4.1 Data Input
Defining variables (in Variable view):
- Variable names
- Labels (variable and value)
- Missing values
- Variable types
- Column format
- Measurement level

4.1 Data Input: Variable names
- Must begin with a letter. The remaining characters can be letters, digits, periods or the symbols @, #, _ or $
- Cannot end with a period or underscore
- Blanks and special characters cannot be used (e.g. !, ?, ' and *)
- Must be unique; duplication is not allowed
- Cannot exceed 64 characters in length
- Reserved keywords cannot be used as variable names (e.g. ALL, AND, BY, EQ, GE, LE, LT, NE, NOT, OR, TO and WITH)
- Are not case sensitive
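The naming rules above can be sketched as a small checker. This is only an illustration in Python, not part of SPSS: the function name `is_valid_name` and the `existing` parameter are hypothetical, letters are assumed to be plain ASCII, and the reserved-word set contains only the examples listed on the slide, so it may be incomplete.

```python
import re

# Reserved keywords as listed on the slide (given there as examples,
# so this set may be incomplete).
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "LE", "LT", "NE", "NOT", "OR", "TO", "WITH"}

# First character a letter; the rest letters, digits, period, @, #, _ or $.
# Assumption: only ASCII letters are considered here.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.@#_$]*")


def is_valid_name(name, existing=()):
    """Return True if `name` satisfies the naming rules listed above."""
    if not NAME_PATTERN.fullmatch(name):
        return False  # bad first character, blank, or special character
    if len(name) > 64:
        return False  # too long
    if name[-1] in "._":
        return False  # cannot end with a period or underscore
    if name.upper() in RESERVED:
        return False  # reserved keyword
    # Names are not case sensitive, so AGE and age count as duplicates.
    if name.upper() in {other.upper() for other in existing}:
        return False
    return True


for candidate in ["age", "5age", "age_", "my var", "with", "q1.a"]:
    print(candidate, is_valid_name(candidate))
```

Passing the names already defined in the file as `existing` lets the same check cover the uniqueness rule, for example `is_valid_name("Age", existing=["age"])`.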

4.1 Data Input: Labels
- A variable label is a full description of the variable name (optional)
- A variable can also have value labels, which describe its codes; the codes can be alphanumeric or numeric

4.1 Data Input: Missing values and variable types
Missing values
- How to deal with missing data: leave the cell blank or assign missing value codes
- Rules for assigning missing value codes:
  - Must be of the same data type as the data they represent (e.g. missing numeric data must use a numeric missing value code)
  - A missing value code cannot occur as data in the data set
  - By default, the digit chosen is usually 9
Variable types
- By default, SPSS assumes all new variables are numeric with two decimal places
- Can be changed to another type (date, currency, etc.) and to a different number of decimal places
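To make the missing-value rules concrete, here is a minimal Python sketch. The rating item, its 1 to 5 range, and the helper names are all hypothetical; the point is only that the code 9 is safe because it can never be a real answer, and that coded or blank cells are left out of a statistic such as the average.

```python
MISSING_CODE = 9  # hypothetical missing-value code for a 1-5 rating item


def code_is_safe(valid_values, code):
    """A missing-value code must not also occur as a real data value."""
    return code not in valid_values


def mean_excluding_missing(values, code):
    """Average the observed values, skipping blanks (None) and the code."""
    observed = [v for v in values if v is not None and v != code]
    return sum(observed) / len(observed) if observed else None


ratings = [4, 5, 9, 3, None, 4]  # 9 = coded missing, None = blank cell

print(code_is_safe(range(1, 6), MISSING_CODE))        # item scored 1-5
print(code_is_safe(range(1, 11), MISSING_CODE))       # item scored 1-10
print(mean_excluding_missing(ratings, MISSING_CODE))
```

For an item scored 1 to 10 the check fails, which is why a different code (such as 99) would be needed there.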

4.1 Data Input: Column format and measurement level
Column format
- It is possible to adjust the width of the Data Editor columns and change the alignment of data in the column (left, centre or right)
Measurement level
- Can be specified as nominal, ordinal, interval or ratio (SPSS groups interval and ratio together as Scale)

4.2 Data Editor
For editing and viewing data and results, SPSS has seven different types of windows:
- Data editor (our focus)
- Viewer and draft viewer (output)
- Pivot table editor
- Chart editor
- Text output editor
- Syntax editor
- Script editor

4.2 Data Editor: Data view

4.2 Data Editor: Variable view

4.2 Data Editor
- Menu-driven, with a variety of pull-down menus
- The main menu bar contains 12 menus: File, Edit, View, Data, Transform, Analyze*, Direct marketing, Graphs*, Utilities, Add-ons, Windows, Help
* Available in all windows (making it easier to generate new output)

4.2 Data Editor: Toolbar
Toolbar buttons: File open, File save, File print, Dialog recall, Undo, Redo, Go to case, Go to variable, Variables, Run descriptive statistics, Find, Insert cases, Insert variable, Split file, Weight cases, Select cases, Value labels, Use sets, Show all variables, Spell check

4.2 Data Editor: Exercise
Enter the following cases:

4.3 Data Computation and Transformation: Data screening
Objectives:
- Making sure that data have been entered correctly
- Making sure that the distributions are normal
The importance of data screening:
- May affect the validity of the results that are produced
- May assist in deciding whether data transformation or nonparametric techniques are needed
This subtopic will be demonstrated using the data file work3.sav.

4.3 Data Computation and Transformation: Errors in data entry
- Errors in data entry are common, and therefore data files must be carefully screened
- Use the Frequencies or Descriptives command
[Figure: output example]
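The idea behind screening with a frequency table can be sketched outside SPSS as well. The Python example below is only an illustration: the variable `gender`, its valid codes 1 and 2, and both helper functions are hypothetical. Any value that appears in the frequency table but is not a valid code points to a data entry error, and the case number tells you which row to check.

```python
from collections import Counter


def frequency_table(values):
    """Count how often each value occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))


def out_of_range(values, valid):
    """Return (case number, value) pairs for values outside the valid codes."""
    return [(i + 1, v) for i, v in enumerate(values) if v not in valid]


# Hypothetical variable coded 1 = male, 2 = female, with two typing errors.
gender = [1, 2, 2, 1, 3, 2, 1, 22]

print(frequency_table(gender))       # codes 3 and 22 stand out
print(out_of_range(gender, {1, 2}))  # cases 5 and 8 need checking
```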

4.3 Data Computation and Transformation: Assessing normality
- The assumption of normality is a prerequisite of many inferential statistical techniques
- Normality can be explored using a combined assessment of graphics:
  - Histogram
  - Boxplot
  - Normal probability plot
  - Detrended normal plot
- And of normality tests and statistics:
  - Kolmogorov-Smirnov and Shapiro-Wilk statistics
  - Skewness
  - Kurtosis

4.3 Data Computation and Transformation: Assessing normality
- Performed using the Explore command in Descriptive Statistics under Analyze
Histogram
- Needs to be bell shaped and symmetric

4.3 Data Computation and Transformation: Assessing normality
Boxplot
- The median should be at the center of the box (symmetric)

4.3 Data Computation and Transformation: Assessing normality
Normal probability plot
- The points should fall more or less on a straight line

4.3 Data Computation and Transformation: Assessing normality
Detrended normal plot
- There should be no pattern to the clustering of points (the points should cluster around a horizontal line through zero)

4.3 Data Computation and Transformation: Assessing normality
Kolmogorov-Smirnov and Shapiro-Wilk statistics
- Kolmogorov-Smirnov: sample size ≥ 100
- Shapiro-Wilk: sample size < 100
- If the significance level (Sig.) is greater than 0.05, normality is assumed
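The decision rule on this slide can be written down as a short Python sketch. It assumes the cut-off is read as Kolmogorov-Smirnov for samples of 100 or more and Shapiro-Wilk for smaller samples; the function name and its arguments are hypothetical, and the Sig. values are meant to be copied from the Tests of Normality output rather than computed here.

```python
def normality_decision(n, sig_ks, sig_sw, alpha=0.05):
    """Pick the test by sample size, then apply the Sig. > alpha rule.

    n       -- sample size
    sig_ks  -- Sig. reported for Kolmogorov-Smirnov
    sig_sw  -- Sig. reported for Shapiro-Wilk
    Returns (name of the test used, True if normality is assumed).
    """
    if n >= 100:
        test, sig = "Kolmogorov-Smirnov", sig_ks
    else:
        test, sig = "Shapiro-Wilk", sig_sw
    return test, sig > alpha


# Hypothetical Sig. values, for illustration only.
print(normality_decision(50, sig_ks=0.20, sig_sw=0.03))
print(normality_decision(150, sig_ks=0.20, sig_sw=0.03))
```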

4.3 Data Computation and Transformation: Assessing normality
Skewness and kurtosis
- Their values need to be close to zero for the distribution to be assumed normal
Basically, the data for age can be assumed normal:
- There is no need for transformation
- The data can be analyzed using parametric tests
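To show what "close to zero" means numerically, here is a minimal Python sketch of skewness and excess kurtosis. It uses the simple moment-based formulas, which is an assumption made for brevity: statistical packages typically report bias-adjusted versions, so their numbers can differ somewhat, especially for small samples. The function name and both data sets are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]  # long tail to the right

print(skewness_and_kurtosis(symmetric))     # skewness of zero
print(skewness_and_kurtosis(right_skewed))  # clearly positive skewness
```

A positive skewness indicates a tail to the right and a negative one a tail to the left, which is the information needed to choose a transformation.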

4.3 Data Computation and Transformation: Variable transformation
- The decision to transform variables depends on the severity of the departure from normality
- The most appropriate transformation method needs to be selected*

Data distribution                                    Transformation method   Numeric expression
Moderately positive skewness                         Square root             SQRT(X)
Substantially positive skewness                      Log 10                  LG10(X)
Substantially positive skewness (with zero values)   Log 10                  LG10(X + C)
Moderately negative skewness                         Square root             SQRT(K - X)
Substantially negative skewness                      Log 10                  LG10(K - X)

* Suggested by Tabachnick and Fidell (2007) and Howell (2007)
C = a constant added to each score so that the smallest score is 1
K = a constant from which each score is subtracted so that the smallest score is 1; usually equal to the largest score + 1
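The numeric expressions in the table can be tried out with a few lines of Python. This is a sketch rather than SPSS syntax: `math.sqrt` and `math.log10` stand in for SQRT and LG10, the function names are hypothetical, and C and K are derived from the data exactly as defined above (C makes the smallest score 1, K is the largest score + 1).

```python
import math


def sqrt_transform(scores):
    """SQRT(X): for moderately positive skewness."""
    return [math.sqrt(x) for x in scores]


def log10_transform(scores):
    """LG10(X), or LG10(X + C) when the data contain zero values."""
    # C is only needed if the smallest score is not positive.
    c = 1 - min(scores) if min(scores) <= 0 else 0
    return [math.log10(x + c) for x in scores]


def reflect(scores):
    """K - X, with K = largest score + 1, so the smallest result is 1."""
    k = max(scores) + 1
    return [k - x for x in scores]


def reflect_sqrt_transform(scores):
    """SQRT(K - X): for moderately negative skewness."""
    return [math.sqrt(x) for x in reflect(scores)]


def reflect_log10_transform(scores):
    """LG10(K - X): for substantially negative skewness."""
    return [math.log10(x) for x in reflect(scores)]


print(log10_transform([0, 9, 99]))          # C = 1 is added first
print(reflect([1, 5, 10]))                  # K = 11
print(reflect_log10_transform([1, 5, 10]))
```

Note that the reflected transformations reverse the order of the scores, so a high transformed value corresponds to a low original value.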

4.3 Data Computation and Transformation
To illustrate the process of transformation, the variable hours of exercise (hourex) was examined using the preceding steps.

4.3 Data Computation and Transformation
- It is obvious that the variable hours of exercise (hourex) is not normal, so a variable transformation needs to be performed
- Based on the previous assessment, we know that hourex is not normally distributed but is substantially positively skewed. Because there are no zero values involved, the Log 10 transformation method should be used

4.3 Data Computation and Transformation: Results
It is apparent that the Log 10 transformation was appropriate, because the distribution of hourex is now relatively normal.

4.3 Data Computation and Transformation: Data transformation
Recoding
- Collapsing continuous variables into categorical variables
- Recoding negatively worded scale items
- Replacing missing values and bringing outlying cases into the distribution
Computing
- To obtain composite scores for items on a scale
Data selection
- In the Select Cases option in the Data menu, data can be selected:
  - On specified cases, using the If option
  - Randomly, using the Sample option
  - Based on time or case range, using the Range option
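Recoding and computing can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the three age bands, the 1 to 5 response scale, the reverse-scoring formula (lowest + highest - score), and the function names. None of these come from the slides or from the work3.sav file.

```python
def collapse_age(age):
    """Collapse a continuous age into three hypothetical bands."""
    if age < 30:
        return 1  # under 30
    if age < 50:
        return 2  # 30 to 49
    return 3      # 50 and over


def reverse_score(score, low=1, high=5):
    """Recode a negatively worded item so that high always means the same."""
    return low + high - score


def composite_score(items, reversed_items=(), low=1, high=5):
    """Sum the items of a scale, reverse-scoring the listed item positions."""
    total = 0
    for position, score in enumerate(items):
        if position in reversed_items:
            score = reverse_score(score, low, high)
        total += score
    return total


print([collapse_age(a) for a in [18, 30, 49, 50, 72]])
print(reverse_score(1), reverse_score(4))
# Three items on a 1-5 scale; the second one (position 1) is negatively worded.
print(composite_score([4, 2, 5], reversed_items={1}))
```

Recoding into a new variable, rather than overwriting the original, keeps the raw data available if the bands or the scoring need to change later.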