Daniela Stan Raicu School of CTI, DePaul University


Presentation on theme: "Daniela Stan Raicu School of CTI, DePaul University"— Presentation transcript:

1 Daniela Stan Raicu School of CTI, DePaul University
CSC 323 Quarter: Spring 02/03 Daniela Stan Raicu School of CTI, DePaul University 1/2/2019 Daniela Stan - CSC323

2 Outline
- Confidence intervals when the population distribution is unknown
- Confidence intervals when the sample size is small
- Tests of significance

3 Is x̄ normally distributed?
Is the population normal?
- Yes: Is ?
  - Yes: x̄ is normal
  - No: has t-student distribution
- No: Is ?
  - Yes: x̄ is considered to be normal
  - No: x̄ may or may not be considered normal (we need more info)

4 Assumptions when applying z-statistic
1. The population has a normal distribution with mean µ and standard deviation σ.
2. The standard deviation σ is known.
3. The size n of the simple random sample (SRS) is large.
4. The appropriate test statistic to use for inference about µ when σ is known is the z statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
where the expected value µ0 is the value assumed in the null hypothesis H0. Under H0, z has the standard normal distribution N(0,1).
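
As a rough illustration of the z statistic on this slide, here is a minimal Python sketch. The function name `z_statistic` and the numbers in the example (sample mean 52, µ0 = 50, σ = 6, n = 36) are made up for illustration and are not course data.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """One-sample z statistic: (sample mean - mu0) / (sigma / sqrt(n)), sigma known."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
z = z_statistic(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(round(z, 2))
```

Here the standard deviation of the sample mean is 6/√36 = 1, so the sample mean sits two standard deviations above the hypothesized value.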

5 Assumptions when applying z-statistic
Is the z-statistic appropriate to use when:
1. The sample size is small?
2. The population does not have a normal distribution?
3. The population has a normal distribution but the standard deviation σ is unknown?
When the standard deviation of a statistic (in our case x̄) is estimated from data, the result is called the standard error of the statistic:
$$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$$
What is the distribution of
$$\frac{\bar{x} - \mu_0}{s / \sqrt{n}} \; ?$$
It is not normal!
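
One way to see that the statistic built from the standard error is not normal is a small simulation. The sketch below is only an illustration under assumed settings (samples of size 5 from N(0,1), 10,000 repetitions, a fixed seed). It counts how often the statistic falls outside ±1.96, which would happen about 5% of the time if it were standard normal.

```python
import math
import random
import statistics

def t_like_statistic(sample, mu0):
    """(sample mean - mu0) / (s / sqrt(n)), using the sample standard deviation s."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error

random.seed(0)  # fixed seed so the run is repeatable
n, reps = 5, 10_000
extreme = 0
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    if abs(t_like_statistic(sample, mu0=0.0)) > 1.96:
        extreme += 1

fraction_extreme = extreme / reps
# For a N(0,1) statistic this fraction would be about 0.05.
print(fraction_extreme)
```

With such a small sample the fraction comes out well above 5%, which is the "fatter tails" behaviour that motivates a different reference curve.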

6 Inference on averages for small samples (cont.)
If data arise from a population with normal distribution and n is small (n < 30), we can use a different curve, called the t-distribution or Student’s curve. The t-distribution was discovered by W. S. Gosset (born on 13 June 1876 in Canterbury, England), the chief statistician of the Guinness brewery in Dublin, Ireland. He discovered the t-distribution in order to deal with small samples arising in statistical quality control. The brewery had a policy against employees publishing under their own names, so he published his results about the t-distribution under the pen name "Student", and that name has become attached to the distribution.

7 The t-student distributions
Suppose that an SRS of size n is drawn from an N(µ, σ) population. Then the one-sample t statistic
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
has the t-distribution with n - 1 degrees of freedom.
- The degrees of freedom come from the standard deviation s in the denominator of t.
- There are many Student’s curves! There is one Student’s curve for each number of degrees of freedom; for tests on averages:
Degrees of freedom = number of observations - 1
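
The one-sample t statistic and its degrees of freedom can be sketched in a few lines of Python. The function name `t_statistic` and the five measurements are hypothetical; only the formula comes from the slide.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Return (t, degrees_of_freedom) for a one-sample t statistic."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (statistics.mean(sample) - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements, not course data.
data = [9.8, 10.2, 10.4, 9.9, 10.7]
t, df = t_statistic(data, mu0=10.0)
print(round(t, 3), df)
```

The resulting t value would then be compared with the Student’s curve for `df` degrees of freedom rather than with the normal curve.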

8 Comparing the student’s curve and the standard normal curve
[Figure: Student’s curve and standard normal curve plotted against t, in three panels: d.f.=5, d.f.=15, d.f.=30]
Student’s curve has “fatter” tails. For d.f. around 30, the Student’s curve is very similar to the standard normal curve.
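
To put rough numbers on the "fatter tails" remark, the sketch below compares the density of Student’s curve with the standard normal density at t = 3 for the three degrees of freedom shown. The helper `student_t_pdf` is written out from the standard density formula because Python's standard library has no t-distribution; the choice of t = 3 is arbitrary.

```python
import math
from statistics import NormalDist

def student_t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

normal_density = NormalDist().pdf(3.0)  # standard normal density at t = 3
for df in (5, 15, 30):
    ratio = student_t_pdf(3.0, df) / normal_density
    print(df, round(ratio, 2))
```

The ratio is above 1 for every d.f. and shrinks toward 1 as the degrees of freedom grow, which matches the picture of the curves converging.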

9 When to use the t-test
When should we use it? Each of the following conditions should hold:
- The test is a statistical test on averages.
- The sample is a simple random sample.
- The number of observations is small: the sample size n is less than 30.
- The distribution of the population is bell-shaped, that is, not too different from the normal distribution. (Not easy to check, but typically true for measurements!)

10 Tests on averages: z-test or t-test?
If the amount of current data is:
- Large: use the z-test and the normal curve.
- Small (n < 30), and the distribution of the population is:
  - Unknown but not different from the normal curve: use the t-test and the Student’s curve.
  - Unknown but quite different from the normal curve: do not use the t-test!
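
The decision on this slide can be written as a small function. This is only a sketch: the name `choose_test` and the boolean flag `population_close_to_normal` are assumptions of the example, and whether a population is "not different from the normal curve" remains a judgment the analyst has to make.

```python
def choose_test(n, population_close_to_normal):
    """Pick a test on averages following the slide's flowchart.

    population_close_to_normal is a judgment call made by the analyst
    (an assumption of this sketch); it is only consulted for small samples.
    """
    if n >= 30:
        return "z-test with the normal curve"
    if population_close_to_normal:
        return "t-test with the Student's curve"
    return "do not use the t-test"

print(choose_test(n=100, population_close_to_normal=False))
print(choose_test(n=12, population_close_to_normal=True))
print(choose_test(n=12, population_close_to_normal=False))
```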

11 Confidence intervals for proportions
Assignment:
1. Draw the flowchart for estimating the population proportions.
2. Calculate the confidence interval for the population proportion for different situations from the flowchart.
3. Calculate the confidence intervals for different confidence levels such as C = 0.96, 0.98, etc.
4. Give examples where the way we calculated the confidence intervals does not work.

12 Tests of Significance Example 1:
In the courtroom, juries must make a decision about the guilt or innocence of a defendant. Suppose you are on the jury in a murder trial. It is obviously a mistake if the jury claims the suspect is guilty when in fact he or she is innocent. What is the other type of mistake the jury could make? Which is more serious?

13 Tests of Significance Example 2:
Suppose exactly half, or 0.50, of a certain population would answer yes when asked if they support the death penalty. A random sample of 400 people results in 220, or 0.55, who answer yes. The Rule for Sample Proportions tells us that the potential sample proportions in this situation are approximately bell-shaped, with standard deviation of
$$\sqrt{\frac{(0.50)(1 - 0.50)}{400}} = 0.025$$
Find the standardized score for the observed value of 0.55. Then determine how often you would expect to see a standardized score that large or larger.

14 Tests of Significance Example 2: (cont.)
n = 400, mean = 0.50, STD = 0.025
[Figure: normal curve for the sample proportion, horizontal axis from 0.425 to 0.575 in steps of 0.025, tail area labeled 2.27%]
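
The numbers in Example 2 can be checked with Python's standard library. This is a minimal sketch; the variable names are chosen for the example. It uses the exact normal CDF rather than a printed table, so the tail area comes out at about 0.0228, close to the 2.27% shown.

```python
import math
from statistics import NormalDist

p0, n, observed = 0.50, 400, 0.55

sd = math.sqrt(p0 * (1 - p0) / n)        # standard deviation of the sample proportion
z = (observed - p0) / sd                 # standardized score of 0.55
upper_tail = 1 - NormalDist().cdf(z)     # chance of a score this large or larger

print(sd, round(z, 2), round(upper_tail, 4))
```

A standardized score of about 2 means the observed 0.55 lies two standard deviations above 0.50, something that would happen by chance only a little over 2% of the time.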

15 The Five Steps of Hypothesis Testing
1. Determining the Two Hypotheses: H0, Ha
2. Computing the Sampling Distribution
3. Collecting and Summarizing the Data (calculating the observed test statistic)
4. Determining How Unlikely the Test Statistic is if the Null Hypothesis is True (calculating the P-value)
5. Making a Decision/Conclusion (based on the P-value, is the result statistically significant?)

16 1.A. The Null Hypothesis: H0
- population parameter equals some value
- no relationship
- no change
- no difference in two groups, etc.
When performing a hypothesis test, we assume that the null hypothesis is true until we have sufficient evidence against it.
1.B. The Alternative Hypothesis: Ha
- population parameter differs from some value
- relationship exists
- a change occurred
- two groups are different, etc.

17 The Hypotheses for Proportions
Null: H0: p = p0
One-sided alternatives: Ha: p > p0 or Ha: p < p0
Two-sided alternative: Ha: p ≠ p0

19 Example: Parental Discipline
Nationwide random telephone survey of 1,250 adults.
474 respondents had children under 18 living at home; results on behavior are based on this smaller sample.
Reported margin of error: 3% for the full sample, 5% for the smaller sample.
Results of the study: “The 1994 survey marks the first time a majority of parents reported not having physically disciplined their children in the previous year. Figures over the past six years show a steady decline in physical punishment, from a peak of 64 percent in 1988.”
The 1994 sample proportion who did not spank or hit was 51%! Is this evidence that a majority of the population did not spank or hit?
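
The reported margins of error are consistent with the common conservative rule of thumb 1/√n for a 95% margin on a sample proportion. The slide does not say how the survey computed them, so the sketch below is an assumption, and the function name `conservative_margin_of_error` is made up for the example.

```python
import math

def conservative_margin_of_error(n):
    """Rule-of-thumb 95% margin of error for a sample proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

for label, n in (("full sample", 1250), ("parents only", 474)):
    print(label, round(100 * conservative_margin_of_error(n), 1), "%")
```

Rounded to whole percentages, the two values agree with the 3% and 5% quoted for the full and smaller samples.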

20 Case Study: The Hypotheses
Null: The proportion of parents who physically disciplined their children in the previous year is the same as the proportion p of parents who did not physically discipline their children. [H0: p = 0.5]
Alt: A majority of parents did not physically discipline their children in the previous year. [Ha: p > 0.5]
2. Sampling Distribution of p̂
If numerous samples of size n are taken, the sampling distribution of the sample proportions p̂ from the various samples will be approximately normal, with mean equal to p (the population proportion) and standard deviation equal to
$$\sqrt{\frac{p(1 - p)}{n}}$$
Since we assume the null hypothesis is true, we replace p with p0 to complete the test.
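
The claim about the sampling distribution can be checked by simulation. The sketch below is an illustration under assumed settings (2,000 simulated samples, a fixed seed, yes/no answers drawn independently with probability p0 = 0.5). It compares the spread of the simulated sample proportions with the formula √(p0(1 - p0)/n).

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable
p0, n, reps = 0.5, 474, 2_000

def sample_proportion(p, n):
    """Proportion of 'yes' answers in one simulated SRS of size n."""
    return sum(random.random() < p for _ in range(n)) / n

proportions = [sample_proportion(p0, n) for _ in range(reps)]

simulated_mean = statistics.mean(proportions)
simulated_sd = statistics.stdev(proportions)
theoretical_sd = math.sqrt(p0 * (1 - p0) / n)

print(round(simulated_mean, 3), round(simulated_sd, 4), round(theoretical_sd, 4))
```

The simulated mean should land near 0.5 and the simulated standard deviation near 0.023, the value the formula gives for n = 474.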

21 3. Test Statistic for Proportions
To determine if the observed proportion is unlikely to have occurred under the assumption that H0 is true, we must first convert the observed value to a standardized score.
Case study: based on the sample n = 474 (large, so proportions follow the normal distribution)
- Sample proportion with no physical discipline: p̂ = 0.51
- Standard deviation, where 0.50 is p0 from the null hypothesis:
$$\sqrt{\frac{(0.50)(1 - 0.50)}{474}} = 0.023$$
- Standardized score (test statistic):
$$z = \frac{0.51 - 0.50}{0.023} = 0.43$$
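
The case-study test statistic can be reproduced with a short function. This is a sketch; the name `proportion_z_statistic` is made up for the example. Note that it keeps the standard deviation unrounded, so it returns roughly 0.435 where the slide reports 0.43.

```python
import math

def proportion_z_statistic(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n), with the SD computed under H0."""
    sd_under_null = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / sd_under_null

z = proportion_z_statistic(p_hat=0.51, p0=0.50, n=474)
print(z)
```

Either way, the observed 51% is less than half a standard deviation above the 50% assumed under the null hypothesis.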

22 4. P-value
The P-value is the probability of observing data this extreme or more so in a sample of this size, assuming that the null hypothesis is true. A small P-value indicates that the observed data (or relationship) is unlikely to have occurred if the null hypothesis were actually true. The P-value tends to be small when there is evidence in the data against the null hypothesis.

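23 Case Study: P-value
[Figure: normal curve for the sample proportion, horizontal axis from 0.431 to 0.569 in steps of 0.023, with a second axis in z units from -3 to 3; the observed z = 0.43 is marked and the P-value is labeled 0.3446]
From the normal distribution table (Table B), z = 0.4 is the 65.54th percentile, so the P-value is 1 - 0.6554 = 0.3446.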
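
The table lookup can be mimicked with the standard normal CDF from Python's standard library. In this sketch the function name `upper_tail_p_value` is made up for the example; it evaluates the upper-tail area both at z = 0.4, as in the table lookup, and at the unrounded z = 0.43.

```python
from statistics import NormalDist

def upper_tail_p_value(z):
    """P-value for Ha: p > p0, the area to the right of z under N(0,1)."""
    return 1 - NormalDist().cdf(z)

print(round(upper_tail_p_value(0.4), 4))   # z rounded to one decimal, as in the table lookup
print(round(upper_tail_p_value(0.43), 4))  # z kept to two decimals
```

The two values differ only slightly, and both are far too large to count as evidence against the null hypothesis.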

24 5. Decision
If we think the P-value is too low to believe the observed test statistic is obtained by chance only, then we would reject chance (reject the null hypothesis) and conclude that a statistically significant relationship exists (accept the alternative hypothesis). Otherwise, we fail to reject chance and do not reject the null hypothesis of no relationship (result not statistically significant).
Typical Cut-off for the P-value
Commonly, P-values less than 0.05 are considered to be small enough to reject chance. Some researchers use 0.10 or 0.01 as the cut-off instead of 0.05. This “cut-off” value is typically referred to as the significance level α of the test.
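
The decision step amounts to one comparison. The sketch below uses a hypothetical function `decide` with a default cut-off of 0.05 and applies it to the two P-values that appear in the slides.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the significance level alpha."""
    if p_value < alpha:
        return "reject H0 (statistically significant)"
    return "fail to reject H0 (not statistically significant)"

print(decide(0.3446))          # case-study P-value from the slides
print(decide(0.0227))          # Example 2 tail area from the slides
print(decide(0.0227, 0.01))    # same result judged at a stricter cut-off
```

The last call shows why the cut-off should be fixed before looking at the data: the same P-value is significant at 0.05 but not at 0.01.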

25 P-value for Testing Proportions
Ha: p > p0: the P-value is the probability of getting a value as large as or larger than the observed test statistic (z) value.
Ha: p < p0: the P-value is the probability of getting a value as small as or smaller than the observed test statistic (z) value.
Ha: p ≠ p0: the P-value is two times the probability of getting a value as large as or larger than the absolute value of the observed test statistic (z) value.
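
The three rules can be collected in one function. This is a minimal sketch using the standard library's normal CDF; the function name `p_value` and the labels "greater", "less" and "two-sided" are conventions picked for the example.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value of a z statistic for 'greater', 'less', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha: p < p0
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: p differs from p0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(0.43, alt), 4))
```

For a positive z, the two-sided P-value is exactly twice the "greater" one, and the "greater" and "less" values add up to 1.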

