Published by Molly Mathews. Modified over 9 years ago.
1 Hypothesis testing and parameter estimation
Bhuvan Urgaonkar
Ref: “Empirical Methods for Artificial Intelligence” by P. Cohen
2 System behavior in unknown situations
Self-tuning systems ought to behave properly in situations they have not previously encountered.
How can we quantify how well a system deals with unknown situations?
Statistical inference is one way.
3 Statistical inference
The process of drawing inferences about an unseen population from a relatively small sample
Populations and samples
Statistics: functions on samples
Parameters: functions on populations
4 Examples
Example 1: Toss a fair coin
–Parameter: expected number of heads in 10 tosses
–Can be determined analytically
Example 2: Two chess programs, A and B, play 15 games; A wins 10, draws 2, and loses 3
–Parameter: the probability that A wins, p_win
–The population of all possible chess games is too large to enumerate, so we cannot know the exact value
–We can estimate p_win as p = 10/15 ≈ 0.67
–p is a statistic derived from the above sample
5 Two kinds of statistical inference
Hypothesis testing: Answer a yes-or-no question about a population and assess the probability that the answer is wrong
–Assume p_win = 0.5 and assess the probability of the sample result p = 0.67
–If this probability is very small, conclude that A and B are not equally strong
Parameter estimation: Estimate the true value of a parameter given a statistic
–If p = 0.67, what is the “best” estimate of p_win?
–How wide an interval must we draw around p to be confident that p_win falls within it?
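Both questions can be sketched for the chess example using only the standard library. This is an illustration under stated assumptions, not the method from the slides: draws are counted as non-wins (as the estimate p = 0.67 already does), the test uses the probability of 10 or more wins rather than exactly 10, and the interval is the normal-approximation (Wald) interval with z = 1.96, which is rough for only 15 games. The function names are hypothetical.

```python
from math import comb, sqrt


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def wald_interval(successes: int, n: int, z: float = 1.96):
    """Normal-approximation interval around the sample proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumption here).
    """
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


games, wins = 15, 10

# Hypothesis testing: how surprising are 10+ wins if p_win were 0.5?
p_value = binomial_tail(games, wins, 0.5)
print(f"Pr(at least {wins} wins | p_win = 0.5) = {p_value:.3f}")

# Parameter estimation: an approximate interval around p = 10/15.
low, high = wald_interval(wins, games)
print(f"approximate 95% interval for p_win: ({low:.2f}, {high:.2f})")
```

With so few games the tail probability is not small, so this sample alone would not justify rejecting p_win = 0.5.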
7 Hypothesis testing example
Two programs, A and B, summarize news stories
–Performance is measured as recall: the proportion of the important parts of a story that make it into the summary
Suppose you run A every day for 120 days and record the mean recall score over 10 stories each day
Then you run B and want to answer:
–Is B better than A?
8 Hypothesis testing steps
Formulate a null hypothesis
–mean(A) = mean(B)
Gather a sample of 10 news stories and run them through B. Call the sample mean Emean(B)
Assuming the null hypothesis is true, estimate the distribution of mean recall scores for all possible samples of size 10 run through B
Calculate the probability of obtaining Emean(B), or a more extreme value, given this distribution
If this probability is low, reject the null hypothesis
10 Sampling distributions
The distribution of a statistic calculated from all possible samples of a given size, drawn from a given population
Example: Two tosses of a fair coin; let the sample statistic be the number of heads
–The sampling distribution is discrete
–Its elements are 0, 1, 2, with probabilities 0.25, 0.5, 0.25
How do we get sampling distributions?
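For a population this small, the sampling distribution can be obtained by enumerating every possible sample. The sketch below does that for the coin example; the function name is hypothetical, and the approach only works when the number of equally likely outcomes is small enough to list.

```python
from collections import Counter
from itertools import product


def sampling_distribution_of_heads(n_tosses: int) -> dict:
    """Enumerate all equally likely toss sequences of a fair coin and
    return the probability of each possible number of heads."""
    outcomes = list(product("HT", repeat=n_tosses))
    counts = Counter(seq.count("H") for seq in outcomes)
    return {heads: c / len(outcomes) for heads, c in sorted(counts.items())}


print(sampling_distribution_of_heads(2))  # {0: 0.25, 1: 0.5, 2: 0.25}
```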
11 Exact sampling distributions
A coin is tossed 20 times and lands heads 16 times
–Is the coin fair?
Consider the sampling distribution of the proportion of heads p_h under the null hypothesis that the coin is fair
It is easy to calculate the exact probabilities of all values of p_h for N coin tosses
–Possible values: 0/N, 1/N, …, N/N
–$\Pr(p_h = i/N) = \frac{N!}{i!\,(N-i)!} \cdot 0.5^N$
–$\Pr(p_h = 16/20) \approx 0.0046$: highly unlikely for a fair coin
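The exact probabilities follow directly from the binomial formula. A short sketch (hypothetical function name) reproduces the value for 16 heads in 20 tosses and also computes the probability of 16 or more heads, which is the quantity a one-sided test would normally use; that tail is an addition here, not a figure from the slide.

```python
from math import comb


def prob_heads(i: int, n: int) -> float:
    """Pr(p_h = i/n) for a fair coin: n! / (i! (n-i)!) * 0.5**n."""
    return comb(n, i) * 0.5**n


n_tosses, observed = 20, 16

point = prob_heads(observed, n_tosses)
tail = sum(prob_heads(i, n_tosses) for i in range(observed, n_tosses + 1))

print(f"Pr(p_h = 16/20)  = {point:.4f}")
print(f"Pr(p_h >= 16/20) = {tail:.4f}")
```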
12 Estimated sampling distributions
Unlike the sampling distribution of the proportion, that of the mean generally cannot be calculated exactly
–Recall the news story example
It can, however, be estimated, thanks to a remarkable theorem
13 Central limit theorem
The sampling distribution of the mean of samples of size N approaches a normal distribution as N increases
–If samples are drawn from a population with mean M and standard deviation SD, then the mean of the sampling distribution is M and its standard deviation is SD/sqrt(N)
–This holds irrespective of the shape of the population distribution, provided its variance is finite
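A small simulation can make the theorem concrete. The sketch below assumes a deliberately skewed population, an exponential distribution with M = 1 and SD = 1, and checks that the means of many samples of size N = 30 cluster around M with spread close to SD/sqrt(N). The population, sample size, and seed are illustrative choices, not values from the slides.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n: int, n_samples: int, rng: random.Random) -> list:
    """Draw n_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return each sample's mean."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(n_samples)
    ]


N = 30
sample_means = simulate_sample_means(N, 20_000, random.Random(0))

print("mean of sample means:", round(statistics.fmean(sample_means), 3))
print("SD of sample means:  ", round(statistics.stdev(sample_means), 3))
print("SD / sqrt(N):        ", round(1 / sqrt(N), 3))
```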
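14 The missing piece in hypothesis testing
Null hypothesis
–mean(A) = mean(B)
We don’t know the sampling distribution of Emean(B), but we do know the distribution of Emean(A) from the 120 recorded sample means!
–CLT: the sampling distribution of Emean(A) is approximately normal with mean equal to mean(A), which under the null hypothesis equals mean(B)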
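Put together, the test can be sketched as follows. The recall scores here are synthetic and Emean(B) is a made-up value, both purely for illustration. The sketch assumes a one-sided test ("Is B better than A?") and a normal sampling distribution, per the CLT. The standard deviation of the 120 daily means of A already estimates SD/sqrt(N) for samples of 10 stories, so it is not divided again.

```python
import random
from statistics import NormalDist, mean, stdev


def p_value_b_better(a_sample_means, b_sample_mean):
    """One-sided probability of a sample mean at least as large as
    b_sample_mean, assuming it comes from the same (approximately
    normal) sampling distribution as the sample means of A."""
    mu = mean(a_sample_means)   # estimates mean(A), which equals mean(B) under the null
    se = stdev(a_sample_means)  # spread of means of 10-story samples
    z = (b_sample_mean - mu) / se
    return 1.0 - NormalDist().cdf(z)


# Synthetic stand-in for 120 days of mean recall scores from program A.
rng = random.Random(1)
a_means = [rng.gauss(0.60, 0.03) for _ in range(120)]

# Hypothetical mean recall of program B on one sample of 10 stories.
emean_b = 0.68

p = p_value_b_better(a_means, emean_b)
print(f"Pr(sample mean >= {emean_b} | null hypothesis) = {p:.4f}")
```

A small probability would lead to rejecting the null hypothesis; what counts as "small" (0.05 is a common convention) is a choice the analyst has to make.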
15 Computer-aided methods for estimating sampling distributions
Use simulation to estimate the sampling distribution
Monte Carlo tests
–Used when the population distribution is known but the sampling distribution of the test statistic is not
–Draw samples from this known distribution
Bootstrap methods
–Used when the population distribution is unknown
–Idea: Resample with replacement from the sample (treat the sample as the population!)
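Both ideas fit in a few lines of standard-library Python. The Monte Carlo part reuses the fair-coin example (known population, 20 tosses, 16 heads observed); the bootstrap part resamples a small list of recall scores that are invented for illustration. Function names, trial counts, and seeds are assumptions of this sketch.

```python
import random
from statistics import fmean


def monte_carlo_tail(n_tosses, observed_heads, n_trials, rng):
    """Estimate Pr(heads >= observed_heads) for a fair coin by drawing
    n_trials simulated samples from the known population."""
    hits = 0
    for _ in range(n_trials):
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        if heads >= observed_heads:
            hits += 1
    return hits / n_trials


def bootstrap_means(sample, n_resamples, rng):
    """Estimate the sampling distribution of the mean by resampling
    the sample itself, with replacement, at its original size."""
    k = len(sample)
    return [fmean(rng.choices(sample, k=k)) for _ in range(n_resamples)]


rng = random.Random(42)

# Monte Carlo: population known (fair coin), sampling distribution simulated.
print("estimated Pr(>= 16 heads in 20):", monte_carlo_tail(20, 16, 50_000, rng))

# Bootstrap: population unknown, so the sample stands in for it.
recall_scores = [0.52, 0.61, 0.58, 0.70, 0.49, 0.66, 0.63, 0.55, 0.60, 0.68]  # hypothetical
boot = sorted(bootstrap_means(recall_scores, 2_000, rng))
print("middle 95% of bootstrap means:", boot[50], "to", boot[1949])
```

The Monte Carlo estimate should land close to the exact binomial tail probability, which is a useful sanity check before applying the same approach to statistics with no closed form.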
16 Other related concepts/techniques
Hypothesis tests that work under different conditions
–Z-test, t-test (small values of N)
–Ref: Paul Cohen
Parameter estimation
–Confidence intervals
–Analysis of variance: interaction among variables
–Contingency tables
–Ref: Paul Cohen
Expectation maximization
–X: observed data, Z: unobserved data; let Y = X ∪ Z
–Searches for the hypothesis h that maximizes E[ln P(Y | h)]
–Ref: “Machine Learning” by Tom Mitchell