Fundamentals of Data Analysis Lecture 4 Testing of statistical hypotheses.

1 Fundamentals of Data Analysis Lecture 4 Testing of statistical hypotheses

2 Program for today
- Statistical hypothesis
- Parametric (standard) tests
- Nonparametric (distribution-free) tests

3 Statistical hypothesis
- A statistical hypothesis is an assumption or guess about a population, or a statement about its probability distribution, made in the course of reaching a decision.
- Such assumptions are then to be supported or rejected on the basis of data.
- It is a predictive statement, usually put in the form of a null hypothesis and an alternative hypothesis.
- It must be capable of being tested by scientific methods, and it relates an independent variable to some dependent variable.

4 Testing of statistical hypotheses
- The researcher bets, in advance of the experiment, that the results will agree with the theory and cannot be accounted for by the chance variation involved in sampling.
- Testing procedures enable the researcher to decide whether to accept or reject a hypothesis, or whether the observed samples differ significantly from the expected results.

5 Testing of statistical hypotheses
1. Plan and conduct the experiment so that, if the results are not explained by chance variation, the theory is confirmed.
2. Collect data.
3. Set the null hypothesis, i.e. assume that the results are due to chance alone.
4. Use a theoretical sampling distribution.
5. Obtain the probability of the sample data as if it were chance variation.
6. If the probability obtained in step 5 is less than some predetermined small percentage (say 1% or 5%), reject the null hypothesis and accept the alternative hypothesis.

6 Procedure for hypothesis testing
1. Plan and conduct the experiment so that, if the results are not explained by chance variation, the theory is confirmed.
2. Collect data.
3. Set the null hypothesis, i.e. assume that the results are due to chance alone.
4. Use a theoretical sampling distribution.
5. Obtain the probability of the sample data as if it were chance variation.
6. If the probability obtained in step 5 is less than some predetermined small percentage (say 0.1%, 1% or 5%), reject the null hypothesis and accept the alternative hypothesis.
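The six steps can be sketched on a small hypothetical experiment that is not part of the lecture: a coin is tossed 10 times and shows 9 heads, and we ask whether chance alone (a fair coin) explains the result. The data, the function name `upper_tail_probability` and the choice of the binomial distribution as the sampling distribution are assumptions made for illustration only.

```python
from math import comb

def upper_tail_probability(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of a result
    at least as extreme as k successes if H0 (success rate p) is true."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Steps 1-2: hypothetical experiment and data (assumed, not from the lecture).
n_tosses, n_heads = 10, 9

# Step 3: H0 says the coin is fair, so heads appear by chance with p = 0.5.
# Step 4: under H0 the number of heads follows a binomial distribution.
# Step 5: probability of the observed data (or more extreme) under H0.
p_value = upper_tail_probability(n_tosses, n_heads)

# Step 6: compare with several predetermined significance levels.
for alpha in (0.05, 0.01, 0.001):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: p = {p_value:.4f} -> {decision}")
```

The same data lead to different decisions at different significance levels, which is why the level has to be fixed before the data are examined.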

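7 Type I and type II errors
- A type I error is rejecting the null hypothesis when it is true; a type II error is accepting the null hypothesis when it is false.
- The probability of a type I error is fixed in advance as the level of significance for a given sample size.
- If we try to reduce the probability of a type I error, the probability of committing a type II error increases.
- For a fixed sample size, both error probabilities cannot be reduced simultaneously.
- The decision maker has to strike a balance (trade-off) by examining the costs and penalties of both types of error.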
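The trade-off between the two error probabilities can be illustrated numerically. The sketch below is a simplification and not the lecture's method: it assumes a one-sided z-test with known standard deviation, so that the type II error probability has a closed form. The effect size, `sigma` and `n` are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, effect, sigma, n):
    """Type II error probability (beta) of a one-sided z-test of
    H0: mu = mu0 against the specific alternative mu = mu0 + effect,
    assuming a normal population with known standard deviation sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # reject H0 when z >= z_crit
    shift = effect * sqrt(n) / sigma        # mean of z when the alternative is true
    return std_normal.cdf(z_crit - shift)   # probability of NOT rejecting H0

# Hypothetical setting: true shift of 0.5 sigma, sample of 16 observations.
for alpha in (0.05, 0.01, 0.001):
    beta = type_ii_error(alpha, effect=0.5, sigma=1.0, n=16)
    print(f"alpha = {alpha}: beta = {beta:.3f}")
```

With the sample size held fixed, each reduction of alpha raises beta; in this setting only a larger `n` lowers both.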

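8 Null ($H_0$) and alternative ($H_a$) hypotheses
- $H_0$: the hypothesis assumed true for the purpose of the test; for example, when comparing two methods, the assumption that both are equally good.
- $H_a$: the set of alternatives to $H_0$, accepted when $H_0$ is rejected (what one wishes to prove).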

9 The level of significance
- A percentage (usually 5%), chosen with great care, thought and reason, such that $H_0$ will be rejected when the sampling result (observed evidence) has a probability of less than 0.05 of occurring if $H_0$ is true.
- At the 5% level the researcher is willing to take as much as a 5% risk of rejecting $H_0$ when it is true.
- The significance level is the maximum value of the probability of rejecting $H_0$ when it is true.
- It is usually determined in advance, i.e. the probability of a type I error ($\alpha$) is assigned in advance and hence cannot be changed afterwards.

10 Parametric tests Parametric tests allow you to draw conclusions about various statistical parameters. Examining phenomena by calculating their parameters is a very effective way to learn about them, because parameters describe a phenomenon concisely and accurately. Parametric tests, despite their diversity, do not answer all the important questions, mainly because they can be applied only if the tested quantity (the population) has a normal distribution or one very close to it. In addition, parametric tests, as the name suggests, describe a single property of the phenomenon under study (the test results), which does not by itself give sufficient grounds for formulating general conclusions.

11 Student's t test
- It is based on the t-distribution and is used in the case of small samples.
- It is used for testing the difference between the means of two samples, the coefficients of simple and partial correlation, etc.
- Using this test, we can test the null hypothesis $H_0: \mu = \mu_0$ against the alternative hypothesis $H_1: \mu \neq \mu_0$.

12 Student's t test In practice the mean value and standard deviation of the general population are rarely known, so we must be satisfied with estimates obtained from the most frequently applied estimators: the sample average

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

and the sample standard deviation, calculated from the equation

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

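13 Student's t test We must calculate the statistic

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

where $s$ is the sample standard deviation computed with the $n - 1$ divisor. This statistic has the Student's t distribution with $n - 1$ degrees of freedom ($n$ is the sample size, i.e. the number of observations), provided that the population is normal or very close to it.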
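As a minimal sketch, the statistic in its usual form, $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with $s$ computed using the $n - 1$ divisor, can be calculated with Python's standard library. The function name `t_statistic` and the five-element sample are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """One-sample t statistic for H0: mu = mu0."""
    n = len(sample)
    x_bar = mean(sample)   # sample average
    s = stdev(sample)      # sample standard deviation (n - 1 divisor)
    return (x_bar - mu0) / (s / sqrt(n))

# Hypothetical sample, chosen so the arithmetic is easy to follow by hand:
# mean = 3, s^2 = 2.5, s / sqrt(n) = sqrt(0.5), so t = 1 / sqrt(0.5).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t = t_statistic(sample, mu0=2.0)
print(f"t = {t:.4f} with {len(sample) - 1} degrees of freedom")
```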

14 Student's t test To check the null hypothesis that the mean of the population from which the sample was drawn equals the assumed value $\mu_0$, we use tables of the Student's t distribution and, for the assumed confidence level $1 - \alpha$, read the critical value $t_\alpha$ such that

$$P(|t| \ge t_\alpha) = \alpha$$

We then compare the computed value $t$ with the critical value $t_\alpha$:
- if $|t| \ge t_\alpha$, reject the null hypothesis;
- if $|t| < t_\alpha$, there is no reason to reject the null hypothesis.
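The comparison with the critical value can be sketched as a small lookup followed by the decision rule. The table below holds only a few two-sided 5% critical values copied from a standard Student's t table; the names `T_CRIT_TWO_SIDED_05` and `decide`, and the example values of t, are hypothetical.

```python
# Two-sided critical values t_alpha with P(|t| >= t_alpha) = 0.05,
# keyed by degrees of freedom (values from a standard t table).
T_CRIT_TWO_SIDED_05 = {9: 2.262, 10: 2.228, 11: 2.201}

def decide(t, df, table=T_CRIT_TWO_SIDED_05):
    """Apply the rule: reject H0 when |t| >= t_alpha, otherwise do not."""
    t_alpha = table[df]
    if abs(t) >= t_alpha:
        return "reject H0"
    return "no reason to reject H0"

# Hypothetical computed statistics, for illustration only.
print(decide(2.5, df=9))
print(decide(-1.2, df=11))
```

A one-sided alternative would use a different column of the table and compare $t$ itself, not $|t|$, with the critical value.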

15 Student’s t test Example We know that the average burning time of a light bulb is $\mu_0 = 1059$ hours. After changes were made to the technology, it was decided to check whether these changes had shortened the burning time. The null hypothesis therefore has the form $H_0: \mu = \mu_0$, i.e. the average burning time of the bulbs has not changed. For the test, a random sample of 10 light bulbs was taken; the results are presented in the table.

16 Student’s t test Example Lighting time

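17 Student’s t test Example From the tables, for a confidence level of 0.95 and $n - 1 = 9$ degrees of freedom, we read the critical value $t_\alpha = 1.833$ (the one-sided 5% point). The computed value of $t$ does not fall in the critical region, therefore there is no reason to reject the null hypothesis.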

18 Student’s t test Exercise Twelve farms in a village were drawn independently, and the following oat yields (in q/ha) were obtained for them: 23.3, 22.1, 21.8, 19.9, 23.7, 22.3, 22.6, 21.5, 21.9, 22.8, 23.0, 22.2. At the 5% level of significance, test the hypothesis that the average yield of oats in the whole village is 22.6 q/ha against the alternative hypothesis that the average yield is higher.
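One possible way to work the exercise is sketched below, using the twelve yields given above. It assumes the usual statistic $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with the $n - 1$ divisor in $s$, and, because the alternative is one-sided ("higher"), the one-sided 5% critical value for 11 degrees of freedom taken from a standard t table. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Oat yields (q/ha) for the 12 farms, as listed in the exercise.
yields = [23.3, 22.1, 21.8, 19.9, 23.7, 22.3,
          22.6, 21.5, 21.9, 22.8, 23.0, 22.2]
mu0 = 22.6                      # hypothesised average yield (q/ha)

n = len(yields)
x_bar = mean(yields)
s = stdev(yields)               # sample standard deviation (n - 1 divisor)
t = (x_bar - mu0) / (s / sqrt(n))

# One-sided 5% critical value for n - 1 = 11 degrees of freedom
# (standard t table); H0 is rejected in favour of "higher" only if t >= t_crit.
t_crit = 1.796
reject = t >= t_crit

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}, t = {t:.3f}")
print("reject H0" if reject else "no reason to reject H0")
```

Because the sample average lies below 22.6 q/ha, the statistic is negative and cannot fall in the upper critical region, so under these assumptions the data give no reason to reject the null hypothesis in favour of a higher yield.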

19 To be continued … !

