CADA Final Review
Assessment
– Continuous assessment (10%)
– Mini-project (20%)
– Mid-test (20%)
– Final Examination (50%): 40% from Part 1 & 2, 60% from Part 3 & 4
Main contents
– Getting Started with SPSS
– Describing Data
– Testing Hypothesis
– Examining Relationships
Part 1: Getting Started with SPSS
Try to open the SPSS data file demo.sav. SPSS example files can be found in C:\Program Files\SPSSInc\Statistics17\Samples\English. This data file is a fictitious survey of several thousand people, containing basic demographic and consumer information. In Data View, columns represent variables, and rows represent cases (observations).
Construct an SPSS data file
1. By entering data directly
2. By reading from other applications
In Variable View, each row is a variable, and each column is an attribute associated with that variable.
Summary of Types of Variables
DATA
– Categorical: nominal (type of car owned), ordinal
– Scale (Quantitative): discrete (number of children), continuous (time of an exam)
A simple frequency table
The “missing” item tells us how many people did not select one of the two valid answers.
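As a rough illustration of how such a table separates valid answers from missing ones, here is a minimal Python sketch. The responses, the "Yes"/"No" labels, and the function name `frequency_table` are made up for the example; they are not taken from demo.sav or from SPSS itself.

```python
from collections import Counter

# Hypothetical responses; None marks a respondent who gave no valid answer.
responses = ["Yes", "No", "Yes", None, "No", "Yes", None, "Yes"]

def frequency_table(values):
    """Return (counts of valid answers, number missing, valid percents)."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    missing = len(values) - len(valid)
    # Valid percent uses only the non-missing cases as the denominator.
    valid_percent = {k: 100.0 * c / len(valid) for k, c in counts.items()}
    return counts, missing, valid_percent

counts, missing, valid_percent = frequency_table(responses)
for answer, count in counts.items():
    print(f"{answer:8s}{count:4d}{valid_percent[answer]:8.1f}%")
print(f"{'Missing':8s}{missing:4d}")
```

The valid percentages are computed over the answered cases only, which is why the missing count is reported on its own row.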
Pie charts
Bar chart
Histogram
A histogram is a chart for grouped numerical data in which the frequency or percentage of each group is represented as an individual bar.
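The grouping step behind a histogram can be sketched as follows. The completion times, the bin start, and the bin width are hypothetical values chosen for illustration, and `grouped_frequencies` is a helper invented for this example.

```python
def grouped_frequencies(values, start, width, n_bins):
    """Count values falling in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_bins:  # values outside the bin range are ignored
            counts[idx] += 1
    total = sum(counts)
    return [
        (start + i * width, start + (i + 1) * width, c, 100.0 * c / total)
        for i, c in enumerate(counts)
    ]

# Hypothetical completion times in hours.
times = [1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 3.5, 4.7]
table = grouped_frequencies(times, start=1.0, width=1.0, n_bins=4)
for low, high, count, percent in table:
    # One text "bar" per group; its length is the group's frequency.
    print(f"[{low:.0f}, {high:.0f})  {'#' * count:5s} {count}  {percent:.1f}%")
```

Each bar corresponds to one group, and its height can be read either as a frequency or as a percentage of all cases.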
Stem-and-leaf plots
Stem-and-Leaf Plot of completion time in hours for agecat6= (columns: Frequency, Stem & Leaf; Extremes (>=6.2); Stem width: 1.00; Each leaf: 1 case(s))
Basic statistics
Test Relationship between Scale & Categorical Variables
Compare Means
Age, Education, and Internet Use
Internet use by age (statistics for subgroups)
ANOVA Table
The F test shows a significant difference in average hours worked per week among the five categories of education.
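To show where the F value in an ANOVA table comes from, the sketch below computes the between-groups and within-groups mean squares by hand. The data are hypothetical and use only three groups for brevity (the slide's example has five education categories); `one_way_anova_f` is a name chosen for this example.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical hours worked per week in three education groups.
groups = [[38, 40, 42], [44, 46, 48], [50, 52, 54]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread within groups would suggest; the p-value reported by SPSS comes from comparing F with the F distribution on these two degrees of freedom.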
Multiple Comparison
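Testing a single mean
The standard error of the mean is $SE = s / \sqrt{n}$, where $s$ is the sample standard deviation and $n$ is the sample size.
The t-statistic is $t = (\bar{x} - \mu_0) / SE$, where $\mu_0$ is the test value.
The 95% confidence interval of the difference is $(\bar{x} - \mu_0) \pm t_{0.025,\,n-1} \cdot SE$.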
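The three quantities on this slide can be computed directly, as in the sketch below. The sample, the test value of 40, and the function name `one_sample_t` are assumptions made for the example. The critical value 2.262 is the two-sided 5% point of the t distribution with 9 degrees of freedom; it is passed in by hand because the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, test_value, t_critical):
    """Return (standard error, t statistic, CI of the difference)."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)    # s / sqrt(n)
    diff = mean(sample) - test_value     # sample mean minus test value
    t_stat = diff / se
    ci = (diff - t_critical * se, diff + t_critical * se)
    return se, t_stat, ci

# Hypothetical sample of 10 observations, tested against a value of 40.
hours = [42, 45, 38, 50, 47, 41, 44, 39, 46, 48]
se, t_stat, ci = one_sample_t(hours, test_value=40, t_critical=2.262)
print(f"SE = {se:.3f}, t = {t_stat:.3f}")
print(f"95% CI of the difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```

If the confidence interval of the difference excludes 0, the test rejects the hypothesis that the population mean equals the test value at the 5% level.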
Testing a Hypothesis about Two related means
The paired-samples t test is recommended for this problem.
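A paired-samples t test is a one-sample t test applied to the within-pair differences, which the following sketch makes explicit. The before and after scores and the name `paired_t` are hypothetical, invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # Work with the difference within each pair.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1

# Hypothetical scores for the same five subjects measured twice.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]
t_stat, df = paired_t(before, after)
print(f"t({df}) = {t_stat:.3f}")
```

Because each subject acts as their own control, only the variability of the differences enters the standard error.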
Testing Two Independent Means
Output from t test for TV watching hours
Bar chart of completion time by Age and Gender
Two-way ANOVA
Relationship between Scale Variables
Linear regression model
The regression model becomes
life expectancy = 90 − (0.70 × birthrate)
This tells us that for an increase of 1 in birthrate, predicted life expectancy decreases by 0.70 years.
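The fitted equation can be used for prediction, and a residual is the observed value minus the predicted one. In the sketch below the coefficients come from the model above, while the function names and the observed life expectancy of 78.0 are made up for illustration.

```python
# Coefficients taken from the fitted model on the slide.
INTERCEPT = 90.0
SLOPE = -0.70

def predict_life_expectancy(birthrate):
    """Predicted life expectancy (years) for a given birthrate."""
    return INTERCEPT + SLOPE * birthrate

def residual(observed, birthrate):
    """Observed minus predicted life expectancy."""
    return observed - predict_life_expectancy(birthrate)

print(predict_life_expectancy(20))   # prediction at birthrate 20
# A one-unit increase in birthrate changes the prediction by the slope.
print(predict_life_expectancy(21) - predict_life_expectancy(20))
# Hypothetical country with birthrate 20 and observed life expectancy 78.0.
print(residual(78.0, 20))
```

A positive residual means the observed value lies above the regression line, a negative one below it.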
ANOVA
Prediction and residuals
Checking for normality
Association between Categorical Variables
Crosstabulation
Contingency table of use time by education. Here the percentages are column percentages.
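The Chi-Square Test Statistic
Gender by Hand Preference:
Female, Left: Observed = 12, Expected = 14.4
Female, Right: Observed = 108, Expected =
Male, Left: Observed = 24, Expected = 21.6
Male, Right: Observed = 156, Expected =
The test statistic is: $\chi^2 = \sum (O - E)^2 / E$, summed over all cells, where $O$ is the observed count and $E$ is the expected count.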
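The expected counts and the test statistic for the gender by hand preference table can be worked out as in this sketch. The observed counts are those from the slide; the function name `chi_square_independence` is an assumption for the example. Each expected count is (row total × column total) / grand total.

```python
def chi_square_independence(observed):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count under independence for every cell.
    expected = [
        [r * c / grand_total for c in col_totals] for r in row_totals
    ]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

# Rows: Female, Male. Columns: Left, Right (observed counts from the slide).
observed = [[12, 108], [24, 156]]
expected, chi2, df = chi_square_independence(observed)
for label, row in zip(["Female", "Male"], expected):
    print(label, [round(e, 1) for e in row])
print(f"chi-square = {chi2:.3f}, df = {df}")
```

The Left column reproduces the expected counts 14.4 and 21.6 given on the slide. With 1 degree of freedom the 5% critical value is 3.841, so the statistic is compared against that value to decide whether to reject independence.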
Chi-square Test on Independence
Since the p-value is < 0.05, you reject the null hypothesis of independence. There is strong evidence of a relationship between the primary reason for not returning and the hotel.