Published by Paulina Jackson. Modified over 6 years ago.
1
The future is a vain hope, the past is a distracting thought. Uphold our loving kindness at this instant, and be committed to our duties and responsibilities right now.
2
Applied Statistics Using SAS and SPSS
Topic: Hypothesis Testing By Prof Kelly Fan, Cal State Univ, East Bay
3
Hypothesis Testing
A statistical hypothesis is an assertion or conjecture concerning one or more populations.
Agenda: types of tests; types of errors; p-value; summary of tests; assumption checking.
6
Types of Tests
9
Types of Errors
                H0 true                        H0 false
we accept H0    Good! (Correct!)               Type II Error, or “β Error”
we reject H0    Type I Error, or “α Error”     Good! (Correct!)
10
α = Probability of Type I error = P(reject H0 | H0 true)
β = Probability of Type II error = P(accept H0 | H0 false)
We often preset α, called the significance level. The value of β depends on the specifics of H1 (and most often in the real world, we don’t know these specifics).
11
EXAMPLE: H0: μ ≤ 100, H1: μ > 100
Suppose the critical value C = 141: we reject H0 when X > 141.
(Figure: distribution of X centered at μ = 100, with the critical value C = 141 marked.)
12
What is β? β = P(X < 141 | H0 false)
These values correspond to a standard deviation of 25 for X:
μ = 150: β = P(X < 141 | μ = 150) = .3594
μ = 160: β = P(X < 141 | μ = 160) = .2236
μ = 170: β = P(X < 141 | μ = 170) = .1230
μ = 180: β = P(X < 141 | μ = 180) = .0594
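The β values on this slide can be checked numerically. The following is a minimal sketch in Python (the slides themselves use SAS and SPSS), assuming X is normally distributed with standard deviation 25 and that H0 is rejected when X exceeds the critical value 141; the names C, SIGMA and betas are illustrative only.

```python
from statistics import NormalDist

C = 141      # critical value from the example
SIGMA = 25   # standard deviation of X stated on the slide

# beta = P(X < C | mu) for each alternative mean considered on the slide
betas = {mu: NormalDist(mu, SIGMA).cdf(C) for mu in (150, 160, 170, 180)}

for mu, beta in betas.items():
    print(f"mu = {mu}: beta = {beta:.4f}")
```

The further the true mean lies above 100, the smaller β becomes, which is why β cannot be stated without knowing the specifics of H1.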
13
Note: Had α been preset at .025 (instead of .05), C would have been 149 (and β would be larger); had α been preset at .10, C would have been 132 and β would be smaller. α and β “trade off”.
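The trade-off in this note can be sketched in Python under the same assumptions as the example: X is normal with mean 100 at the boundary of H0 and standard deviation 25, and the test is one-sided to the right. The alternative mean 150 used to evaluate β is one of the values from the previous slide; the function names are illustrative.

```python
from statistics import NormalDist

MU0 = 100    # mean of X at the boundary of H0
SIGMA = 25   # standard deviation of X

def critical_value(alpha):
    """Upper-tail critical value C with P(X > C | mu = MU0) = alpha."""
    return MU0 + NormalDist().inv_cdf(1 - alpha) * SIGMA

def beta(alpha, mu1):
    """Type II error probability P(X < C | mu = mu1)."""
    return NormalDist(mu1, SIGMA).cdf(critical_value(alpha))

for alpha in (0.025, 0.05, 0.10):
    c = critical_value(alpha)
    print(f"alpha = {alpha}: C = {c:.1f}, beta at mu = 150: {beta(alpha, 150):.4f}")
```

Lowering α pushes C to the right, so more of the alternative distribution falls below C and β grows.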
14
P Value
Definition: the probability that we reject H0 when H0 is true, based on the observed data; that is, the probability, computed assuming H0 is true, of a result at least as extreme as the one observed.
Idea: the largest “risk” we pay to reject H0.
Alternate name: the observed type I error rate / the observed significance level.
When will we reject H0?
What is the formula to calculate the largest risk?
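For a one-sided test like the earlier example (H0: μ ≤ 100, X normal with standard deviation 25), the p-value is the upper-tail probability beyond the observed value, computed at the boundary μ = 100. A minimal Python sketch follows; the observed value 150 is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

MU0 = 100          # boundary value of the mean under H0
SIGMA = 25         # standard deviation of X
ALPHA = 0.05       # preset significance level
x_observed = 150   # hypothetical observed value, not from the slides

# p-value for H1: mu > MU0 is P(X >= x_observed | mu = MU0)
p_value = 1 - NormalDist(MU0, SIGMA).cdf(x_observed)
reject_h0 = p_value <= ALPHA

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

H0 is rejected exactly when the p-value does not exceed the preset α, which is the same decision as comparing X with the critical value.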
15
Steps of Hypothesis Tests
1). Set up H0 and H1 (also written Ha) properly
2). Preset the α level (the significance level)
3). Select an appropriate test
4). Calculate its p-value
5). Reject H0 if p-value ≤ the significance level; otherwise fail to reject H0
16
Set Up Hypothesis Properly
Conjecture: The fraction of defective product in a certain process is at most 10%.
Which error is more serious: incorrectly claiming that this conjecture is true, or incorrectly claiming that it is false?
The “=” sign must be in H0.
17
One Population
18
Two Populations
19
Assumption Checking
Tests/graphs for normality
Tests for equal variances
20
Example: Mortar Strength
The tension bond strength of cement mortar is an important characteristic of the product. An engineer is interested in comparing the strength of a modified formulation in which polymer latex emulsions have been added during mixing to the strength of the unmodified mortar. The experimenter has collected 10 observations on strength for the modified formulation and another 10 observations for the unmodified formulation.
21
Example: Mortar Strength
Modified Unmodified
22
SAS/SPSS Data Input
SPSS: one variable per column in the worksheet
SAS: one variable per name
23
Normality Tests/Plots
SAS: PROC UNIVARIATE DATA=** NORMAL PLOT;
Tests for Normality
Test                  Statistic    p Value
Shapiro-Wilk          W            Pr < W
Kolmogorov-Smirnov    D            Pr > D       >0.1500
Cramer-von Mises      W-Sq         Pr > W-Sq
Anderson-Darling      A-Sq         Pr > A-Sq
24
Normality Tests/Plots
SPSS: Analyze >> Descriptive Statistics >> Explore >> Plots , Normality plots with tests
25
One-sample t tests
SAS: PROC UNIVARIATE DATA=** NORMAL PLOT MU0=*;
Tests for Location: Mu0=18
Test           Statistic      p Value
Student's t    t              Pr > |t|     0.0003
Sign           M    -6.5      Pr >= |M|    0.0044
Signed Rank    S    -79.5     Pr >= |S|    0.0005
SPSS: Analyze >> Compare Means >> One-Sample T Test; choose Test Variable and Test Value
26
Two-sample t Tests and Equal-variance Tests
SAS: PROC TTEST DATA=** ;
27
Two-sample Normality Tests/Plots
SAS: PROC UNIVARIATE DATA=** NORMAL PLOT; BY group-var; VAR testing-var; (BY processing expects the data to be sorted by group-var)
SPSS: Analyze >> Descriptive Statistics >> Explore, choose Dependent and Factor lists >> Plots, Normality plots with tests
(SPSS output figure omitted)
28
Two-sample t Tests and Equal-variance Tests
SPSS: see below; choose test and grouping variables
29
Research Question
A researcher claims that a new series of math courses for elementary school is more effective than the current one. Two (1st grade) classes of students are selected to perform an experiment to verify this claim. How would you conduct the experiment to avoid confounding variables as much as possible?
30
Paired Samples
If the same set of sources is used to obtain data representing two populations, the two samples are called paired. The data might be paired:
1). As a result of “before” and “after” studies
2). From matching two subjects to form “matched pairs”
31
Tests for Paired Samples
1). Calculate the pair differences
2). Proceed as in the one-sample case
Notes:
SAS: all variables must be included in the data
SPSS: create/calculate all the variables we need
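The two steps for paired samples can be sketched in Python as follows. The before/after values are hypothetical and serve only to show the arithmetic; the sketch stops at the t statistic and its degrees of freedom, since the p-value would come from a t table or from the SAS/SPSS output.

```python
import math
import statistics

# Hypothetical paired measurements (illustrative values, not from the slides)
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 13]

# Step 1: calculate the pair differences
diffs = [a - b for a, b in zip(after, before)]

# Step 2: proceed as in the one-sample case, testing H0: mean difference = 0
n = len(diffs)
mean_diff = statistics.mean(diffs)
std_err = statistics.stdev(diffs) / math.sqrt(n)   # sample SD divided by sqrt(n)
t_stat = mean_diff / std_err
df = n - 1

print(f"mean difference = {mean_diff}, t = {t_stat:.4f}, df = {df}")
```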
32
Inferences about mean when “beyond the scope”
When the population is nonnormal and n is small, how do we make inferences about μ?
1). Non-parametric tests
2). (Optional) Use bootstrap methods to simulate the sampling distribution of the t test statistic, and then use the simulated distribution to find an (approximate) C.I. and p-value
33
Non-parametric Tests
Independent samples: Wilcoxon Rank Sum Test (also called the Mann-Whitney test)
Assumption: two distributions of the same shape
Paired samples/One sample: Wilcoxon Signed-Rank Test
Assumption: a symmetric distribution (of the differences for paired samples)
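To make the rank sum idea concrete, here is a minimal Python sketch that computes the Wilcoxon rank sum W for the first sample and the equivalent Mann-Whitney U, using average ranks for ties. The two samples are hypothetical, and the sketch does not compute a p-value; the helper name midranks is illustrative.

```python
def midranks(values):
    """Ranks 1..N of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

# Hypothetical independent samples (illustrative values only)
x = [1.1, 2.5, 3.0]
y = [2.0, 4.2, 5.1, 3.0]

ranks = midranks(x + y)                     # rank the pooled observations
w_x = sum(ranks[:len(x)])                   # rank sum of the first sample
u_x = w_x - len(x) * (len(x) + 1) / 2       # Mann-Whitney U for the first sample

print(f"W = {w_x}, U = {u_x}")
```

W and U differ only by a constant that depends on the first sample size, which is why the two names refer to the same test.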
34
Introduction to Bootstrap Methods
How to simulate the sampling distribution of a given statistic, say t, based on a given sample of size n:
1). Pretend the original sample is the entire population
2). Select a random sample of size n from the original sample (now the population) with replacement; this is called a bootstrap sample
3). Calculate the t value of the bootstrap sample, t*
4). Repeat steps 2 and 3 many times, say B times (1000 or more)
5). Use the obtained t* values to approximate the sampling distribution
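The bootstrap steps can be sketched in Python as below. Several choices here are assumptions rather than part of the slides: the sample values are hypothetical, B is set to 2000, each t* is centered at the original sample mean (because the original sample plays the role of the population), and the t* quantiles are turned into an approximate 95% bootstrap-t confidence interval for the mean.

```python
import math
import random
import statistics

# Hypothetical sample (illustrative values only, not data from the slides)
sample = [16.8, 17.2, 17.5, 16.9, 17.0, 18.1, 16.4, 17.9, 17.3, 16.6]
n = len(sample)
B = 2000                      # number of bootstrap samples (assumed)
rng = random.Random(1)        # fixed seed so the run is repeatable

sample_mean = statistics.mean(sample)
sample_se = statistics.stdev(sample) / math.sqrt(n)

t_stars = []
while len(t_stars) < B:
    boot = rng.choices(sample, k=n)          # resample with replacement
    s_star = statistics.stdev(boot)
    if s_star == 0:                          # skip a degenerate resample
        continue
    # t* uses the original sample mean as the "population" mean
    t_stars.append((statistics.mean(boot) - sample_mean) / (s_star / math.sqrt(n)))

# Use the t* values as an approximation to the sampling distribution of t
t_stars.sort()
q_lo = t_stars[int(0.025 * B)]
q_hi = t_stars[int(0.975 * B) - 1]
ci_low = sample_mean - q_hi * sample_se
ci_high = sample_mean - q_lo * sample_se

print(f"approximate 95% C.I. for the mean: ({ci_low:.3f}, {ci_high:.3f})")
```

An approximate p-value could be obtained the same way, as the proportion of t* values at least as extreme as the t statistic computed from the original sample.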
35
Review: Confidence Interval
37
One Population
38
Two Populations