Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and 2000. Chapter 4: Introduction to Statistical Software Package


Chapter 4: Introduction to Statistical Software Package
4.1 Data Input
4.2 Data Editor
4.3 Data Computation and Transformation

Introduction to Statistical Software Package
- The statistical software package used is SPSS (Statistical Package for the Social Sciences), v. 21/v. 22.
- Sophisticated software used by social scientists and related professionals for statistical analysis.
- Compatible with the Windows environment.

4.1 Data Input
- Defining variables (in Variable view):
  - Variable names
  - Labels (variable and value)
  - Missing values
  - Variable types
  - Column format
  - Measurement level

4.1 Data Input
- Variable names
  - Must begin with a letter. The remaining characters can be letters, digits, periods, or the characters #, _ and $.
  - Cannot end with a period or an underscore.
  - Blanks and special characters cannot be used (e.g., !, ?, ' and *).
  - Must be unique; duplicates are not allowed.
  - Cannot exceed 64 characters in length.
  - Reserved keywords cannot be used as variable names (e.g., ALL, AND, BY, EQ, GE, LE, LT, NE, NOT, OR, TO and WITH).
  - Are not case sensitive.
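The naming rules above can be sketched as a small check in Python. This is only an illustration of the rules as listed on the slide, not SPSS's own validation: the helper name is_valid_name is hypothetical, only ASCII letters are treated as letters, and the reserved-word set contains just the examples given above.

```python
import re

# Reserved keywords as listed on the slide (examples; possibly not exhaustive).
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "LE", "LT", "NE", "NOT", "OR", "TO", "WITH"}

# First character must be a letter; the rest may be letters, digits, period, #, _ or $.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.#_$]*")


def is_valid_name(name, existing=()):
    """Return True if `name` passes the slide's naming rules (hypothetical helper)."""
    if len(name) > 64:
        return False
    if not NAME_PATTERN.fullmatch(name):
        return False
    if name.endswith((".", "_")):
        return False
    if name.upper() in RESERVED:
        return False
    # Names are not case sensitive, so duplicates are compared case-insensitively.
    taken = {n.upper() for n in existing}
    return name.upper() not in taken


print(is_valid_name("hourex"))                 # a simple valid name
print(is_valid_name("my var"))                 # contains a blank
print(is_valid_name("Age", existing=["age"]))  # duplicate, ignoring case
```

Passing the names already defined in the data file as `existing` covers the uniqueness rule.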

4.1 Data Input
- Labels
  - A variable label is a full description of the variable name. It is optional.
  - A variable can also have value labels, which describe coded values; the codes can be alphanumeric or numeric.
  - (Screenshots: alphanumeric code; numeric code)

4.1 Data Input
- Missing values
  - How to deal with missing data: leave the cell blank or assign a missing value code.
  - Rules for assigning missing value codes:
    - The code must be of the same data type as the data it represents (e.g., missing numeric data must use a numeric missing value code).
    - The missing value code cannot occur as data in the data set.
    - By convention, the digit chosen is usually 9.
- Variable types
  - By default, SPSS assumes all new variables are numeric with two decimal places.
  - The type can be changed (date, currency, etc.), as can the number of decimal places.
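The two rules for missing value codes can be checked mechanically before a code is adopted. The following Python sketch is an illustration only; the function name check_missing_code and the sample values are hypothetical, and blank cells are represented by None.

```python
def check_missing_code(values, code):
    """Return a list of problems with a candidate missing value code.

    `values` holds the observed data, with None standing for blank cells.
    """
    observed = [v for v in values if v is not None]
    problems = []
    # Rule 1: the code must be of the same kind (numeric vs. string) as the data.
    if any(isinstance(v, str) != isinstance(code, str) for v in observed):
        problems.append("code type differs from data type")
    # Rule 2: the code must not already occur as a real data value.
    if code in observed:
        problems.append("code already occurs as real data")
    return problems


ages = [23, 31, None, 45, 9]          # hypothetical ages; one child aged 9
print(check_missing_code(ages, 9))    # 9 clashes with a real value
print(check_missing_code(ages, 99))   # no problems reported
```

An empty list means the candidate code satisfies both rules for the data at hand.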

4.1 Data Input
- Column format
  - The width of the Data Editor columns can be adjusted, and the alignment of data in a column can be changed (left, centre or right).
- Measurement level
  - Can be specified as nominal, ordinal or scale (scale covers interval and ratio data).

4.2 Data Editor
- SPSS has seven types of windows for editing and viewing data and results:
  - Data editor (our focus)
  - Viewer and draft viewer (output)
  - Pivot table editor
  - Chart editor
  - Text output editor
  - Syntax editor
  - Script editor

4.2 Data Editor
- Data view

4.2 Data Editor
- Variable view

4.2 Data Editor
- SPSS is menu-driven and has a variety of pull-down menus.
- The main menu bar contains 12 menus:
  - File
  - Edit
  - View
  - Data
  - Transform
  - Analyze*
  - Direct Marketing
  - Graphs*
  - Utilities
  - Add-ons
  - Window
  - Help
  *Available in all windows (makes it easier to generate new output)

4.2 Data Editor
- Toolbar: File save, File open, File print, Dialog recall, Undo, Redo, Go to case, Go to variable, Variables, Run descriptive statistics, Find, Insert cases, Insert variable, Split file, Weight cases, Select cases, Value labels, Use sets, Show all variables, Spell check

4.2 Data Editor
- Exercise: enter the following cases:

4.3 Data Computation and Transformation
- Data screening
  - Objectives:
    - Making sure that data have been entered correctly
    - Checking that the distributions are normal
- The importance of data screening
  - It may affect the validity of the results produced.
  - It may help in deciding whether data transformation or nonparametric techniques are needed.
- This subtopic will be demonstrated using the data in work3.sav.

4.3 Data Computation and Transformation
- Errors in data entry
  - Errors in data entry are common, so data files must be carefully screened.
  - Use the Frequencies or Descriptives command.
  - Output example:
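Outside SPSS, the same screening idea can be sketched in Python: tabulate how often each value occurs and flag values that are not legal codes. This is an illustration only; the function names, the gender coding (1 and 2) and the data are hypothetical and do not come from work3.sav.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count) pairs sorted by value, like a simple frequency listing."""
    return sorted(Counter(values).items())


def out_of_range(values, valid_codes):
    """Return the distinct values that are not among the valid codes."""
    return sorted({v for v in values if v not in valid_codes})


# Hypothetical gender variable coded 1 or 2, with two typing errors (3 and 22).
gender = [1, 2, 2, 1, 3, 2, 1, 22]

for value, count in frequency_table(gender):
    print(value, count)

print(out_of_range(gender, valid_codes={1, 2}))
```

Values that appear in the frequency listing but are not valid codes point to cases that should be checked against the original questionnaires.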

4.3 Data Computation and Transformation
- Assessing normality
  - The assumption of normality is a prerequisite for many inferential statistical techniques.
  - Normality can be explored using a combined assessment of graphs:
    - Histogram
    - Boxplot
    - Normal probability plot
    - Detrended normal plot
  - And of normality tests and statistics:
    - Kolmogorov-Smirnov and Shapiro-Wilk statistics
    - Skewness
    - Kurtosis

4.3 Data Computation and Transformation
- Assessing normality
  - Performed using the Explore command in Descriptive Statistics under Analyze.
  - Histogram: needs to be bell-shaped and symmetric.

4.3 Data Computation and Transformation
- Assessing normality
  - Boxplot: the median should be at the centre of the box (symmetric).

4.3 Data Computation and Transformation
- Assessing normality
  - Normal probability plot: the points should fall more or less on a straight line.
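To make the "straight line" criterion concrete, the Python sketch below pairs each sorted observation with the standard normal quantile expected at its rank and then measures how linear the pairs are. It is an illustration, not SPSS's procedure: the plotting positions (i - 0.5) / n are one common choice assumed here, the function names are hypothetical, and the data are made up.

```python
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Return (expected normal quantile, observed value) pairs for a normal probability plot."""
    n = len(data)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are an assumption; other conventions exist.
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(sorted(data), start=1)]


def straightness(points):
    """Pearson correlation of the two coordinates; values near 1 mean a nearly straight line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


skewed = [1, 1, 1, 2, 2, 3, 5, 9, 20, 60]   # hypothetical, strongly right-skewed
print(straightness(normal_plot_points(skewed)))
```

The further the correlation falls below 1, the more the plotted points bend away from a straight line.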

4.3 Data Computation and Transformation
- Assessing normality
  - Detrended normal plot: there should be no pattern to the clustering of points (the points should gather around a horizontal line through zero).

4.3 Data Computation and Transformation
- Assessing normality
  - Kolmogorov-Smirnov and Shapiro-Wilk statistics
    - Kolmogorov-Smirnov: sample size ≥ 100
    - Shapiro-Wilk: sample size < 100
    - If the significance level (Sig.) is greater than 0.05, normality is assumed.

4.3 Data Computation and Transformation
- Assessing normality
  - Skewness and kurtosis: their values need to be close to zero for normality to be assumed.
- Overall, the age data can be assumed normal:
  - There is no need for transformation.
  - The data can be analyzed using parametric tests.
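As a rough illustration of what "close to zero" refers to, skewness and kurtosis can be computed from the central moments. The Python sketch below uses the simple moment-based definitions (kurtosis reported as excess kurtosis, so a normal distribution gives 0); SPSS reports sample-adjusted versions, so its values will differ somewhat, especially for small samples. The function names and data are hypothetical.

```python
from statistics import mean


def central_moment(data, k):
    """k-th central moment, dividing by n."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness: m3 / m2^1.5 (0 for a symmetric distribution)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```

A positive skewness indicates a long right tail, a negative one a long left tail.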

4.3 Data Computation and Transformation
- Variable transformation
  - The decision to transform variables depends on the severity of the departure from normality.
  - The most appropriate transformation method needs to be selected.*

  Data distribution                                      Transformation method [Num. Expression]
  Moderately positive skewness                           Square root [SQRT(X)]
  Substantially positive skewness                        Log 10 [LG10(X)]
  Substantially positive skewness (with zero values)     Log 10 [LG10(X + C)]
  Moderately negative skewness                           Square root [SQRT(K - X)]
  Substantially negative skewness                        Log 10 [LG10(K - X)]

  C = a constant added to each score so that the smallest score is 1
  K = a constant from which each score is subtracted so that the smallest score is 1; usually equal to the largest score + 1
  *Suggested by Tabachnick and Fidell (2007) and Howell (2007)
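The five numeric expressions can be mirrored in Python to see what each one does to a list of scores. This is a sketch for illustration, not SPSS syntax: the function names are hypothetical, C is taken as 1 minus the smallest score and K as the largest score + 1, following the definitions of C and K given above.

```python
import math


def sqrt_transform(scores):
    """Moderately positive skewness: SQRT(X)."""
    return [math.sqrt(x) for x in scores]


def log10_transform(scores):
    """Substantially positive skewness: LG10(X); requires all scores > 0."""
    return [math.log10(x) for x in scores]


def log10_shifted(scores):
    """Substantially positive skewness with zero values: LG10(X + C)."""
    c = 1 - min(scores)  # C makes the smallest score equal to 1
    return [math.log10(x + c) for x in scores]


def reflect_sqrt(scores):
    """Moderately negative skewness: SQRT(K - X)."""
    k = max(scores) + 1  # K makes the smallest reflected score equal to 1
    return [math.sqrt(k - x) for x in scores]


def reflect_log10(scores):
    """Substantially negative skewness: LG10(K - X)."""
    k = max(scores) + 1
    return [math.log10(k - x) for x in scores]


print(log10_shifted([0, 9, 99]))   # zero values are handled by adding C = 1
print(reflect_sqrt([1, 6, 9]))     # K = 10, so the order of the scores is reversed
```

Note that the reflected transformations (K - X) reverse the direction of the scores, which matters when interpreting the transformed variable.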

4.3 Data Computation and Transformation
- Variable transformation
  - To illustrate the process of transformation, the hours of exercise variable (hourex) is examined using the preceding steps.

4.3 Data Computation and Transformation
- Variable transformation
  - The variable hours of exercise (hourex) is clearly not normal, so a variable transformation needs to be performed.
  - Based on the previous assessment, hourex is substantially positively skewed and, because no zero values are involved, the Log 10 transformation should be used.

4.3 Data Computation and Transformation
- Variable transformation
  - Results: the Log 10 transformation was appropriate, because the distribution of hourex is now relatively normal.

4.3 Data Computation and Transformation
- Data transformation
  - Recoding
    - Collapsing continuous variables into categorical variables
    - Recoding negatively worded scale items
    - Replacing missing values and bringing outlying cases into the distribution
  - Computing
    - To obtain composite scores for items on a scale
- Data selection
  - In the Select Cases option in the Data menu, data can be selected:
    - On specified cases, using the If option
    - Randomly, using the Sample option
    - Based on time or case range, using the Range option
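The recoding, computing and selection steps listed above can be sketched in Python on a few made-up cases. Everything here is hypothetical and for illustration only: the variable names (age, q1, q2), the age cut-points, the 1 to 5 response scale and the use of the mean as the composite score are assumptions, not part of the original slides.

```python
import random


def collapse_age(age):
    """Collapse a continuous variable into categories (cut-points are hypothetical)."""
    if age < 30:
        return 1
    if age < 50:
        return 2
    return 3


def reverse_item(score, low=1, high=5):
    """Recode a negatively worded item on a low..high scale (5 becomes 1, 4 becomes 2, ...)."""
    return low + high - score


def composite(items):
    """Composite score, taken here as the mean of the item scores."""
    return sum(items) / len(items)


cases = [
    {"id": 1, "age": 25, "q1": 4, "q2": 2},
    {"id": 2, "age": 41, "q1": 3, "q2": 5},
    {"id": 3, "age": 58, "q1": 5, "q2": 1},
]

for case in cases:
    case["agegrp"] = collapse_age(case["age"])             # recoding into categories
    case["q2r"] = reverse_item(case["q2"])                 # q2 is negatively worded
    case["total"] = composite([case["q1"], case["q2r"]])   # computing a composite

older = [c for c in cases if c["age"] >= 40]        # selection by condition ("If")
sampled = random.Random(0).sample(cases, k=2)       # random selection ("Sample")
first_two = cases[0:2]                              # selection by case range ("Range")

print([c["id"] for c in older])
print([c["id"] for c in first_two])
```

Keeping the recoded values in new variables (agegrp, q2r) rather than overwriting the originals makes it possible to check the recoding afterwards.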