Lectures prepared by: Elchanan Mossel and Yelena Shvets. Introduction to Probability, Stat 134, Fall 2005, Berkeley. Follows Jim Pitman's book Probability, Section 5.3.

The normalization of the normal
Recall: N(0,1) has density f(x) = C e^(-x²/2). We will calculate the value of C using independent X, Y ~ N(0,1). Their joint density is
f(x,y) = C² e^(-(x² + y²)/2).
Question: what is the value of C?
Answer: C = 1/√(2π), as the next slides show.
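
As a quick numerical check of this normalization (a sketch, not part of the lecture), the integral of e^(-x²/2) over the real line should equal √(2π), forcing C = 1/√(2π):

```python
import math

# Numerical check of the normalization constant: the integral of
# exp(-x^2/2) over the real line should be sqrt(2*pi), so C = 1/sqrt(2*pi).
# Midpoint rule on [-10, 10]; the tails beyond |x| = 10 are negligible.
def normal_mass(a=-10.0, b=10.0, n=200_000):
    h = (b - a) / n
    return h * sum(math.exp(-0.5 * (a + (i + 0.5) * h) ** 2) for i in range(n))

total = normal_mass()
C = 1.0 / total
print(round(C, 6))  # ~ 0.398942 = 1/sqrt(2*pi)
```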

Rotational Invariance
Note: the joint density f(x,y) = C² e^(-(x² + y²)/2) is rotationally invariant: the height depends only on the radial distance from (0,0), not on the angle. Let R = √(X² + Y²) denote that radial distance.

Rotational invariance
Note that R ∈ (r, r + dr) exactly when (X,Y) lies in the annulus A(r, r + dr), which has circumference 2πr and area 2πr dr. On A(r, r + dr) we have f(x,y) ≈ C² e^(-r²/2). Hence
P(R ∈ (r, r + dr)) ≈ 2πr · C² e^(-r²/2) dr,
so the density of R is f_R(r) = 2π C² r e^(-r²/2) for r ≥ 0. Integrating over r ∈ (0, ∞) gives 1 = 2πC², so C = 1/√(2π).

The Variance of N(0,1)
With C = 1/√(2π), the density of R is f_R(r) = r e^(-r²/2), r ≥ 0. The probability distribution of R is called the Rayleigh distribution. By the change of variables formula, S = R² ~ Exp(1/2), so E(R²) = E(S) = 2.
Therefore E(X²) + E(Y²) = E(R²) = 2, and by symmetry the variance of N(0,1) is Var(X) = E(X²) = 1.
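
The identity S = R² ~ Exp(1/2) with a uniform angle is exactly the Box-Muller construction; a small simulation (an illustrative sketch, not from the lecture) confirms that √S · cos Θ behaves like N(0,1), with variance 1:

```python
import math
import random

# Draw S ~ Exp(rate 1/2), so S plays the role of R^2, pick a uniform
# angle, and set X = sqrt(S) * cos(angle).  Then X should be N(0,1):
# mean ~ 0, variance ~ 1.  (This is the Box-Muller construction.)
random.seed(0)
n = 200_000
xs = []
for _ in range(n):
    s = random.expovariate(0.5)                # rate 1/2, mean 2
    theta = random.uniform(0.0, 2.0 * math.pi)
    xs.append(math.sqrt(s) * math.cos(theta))

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(round(mean, 3), round(var, 3))
```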

Radial Distance
A dart is thrown at a circular target by an expert. The point of contact is distributed over the target so that approximately 50% of the shots land in the bull's eye. Assume that the x- and y-coordinates of the hits, measured from the center, are distributed as (X,Y), where X and Y are independent N(0,σ²).
Questions: What is the radius of the bull's eye? What percentage of the shots land within twice the radius of the bull's eye? What is the average distance of a shot from the center?

Radial Distance
The hitting distance R = √(X² + Y²) has the Rayleigh distribution scaled by σ: P(R ≤ r) = 1 − e^(−r²/(2σ²)).
What is the radius r of the bull's eye? We need P(R ≤ r) = 0.5, so e^(−r²/(2σ²)) = 1/2, giving r = σ√(2 ln 2) ≈ 1.18σ.
What percentage of the shots land within twice that radius? P(R ≤ 2r) = 1 − e^(−4r²/(2σ²)) = 1 − e^(−4 ln 2) = 1 − 1/16 = 15/16 ≈ 94%.

Radial Distance
What is the approximate average distance of a shot from the center? The average is
E(R) = ∫₀^∞ r · (r/σ²) e^(−r²/(2σ²)) dr = σ√(π/2) ≈ 1.25σ.
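
The three dart answers follow directly from the Rayleigh CDF P(R ≤ r) = 1 − e^(−r²/(2σ²)); here is a short sketch with σ = 1 (an illustrative choice, since every answer scales linearly in σ):

```python
import math

# Worked answers for the dart example via the Rayleigh CDF.
sigma = 1.0
r_bull = sigma * math.sqrt(2.0 * math.log(2.0))  # solves P(R <= r) = 1/2
p_double = 1.0 - math.exp(-((2.0 * r_bull) ** 2) / (2.0 * sigma ** 2))
mean_dist = sigma * math.sqrt(math.pi / 2.0)     # E(R) for the Rayleigh law

print(round(r_bull, 3), round(p_double, 4), round(mean_dist, 3))
# r_bull ~ 1.177*sigma, p_double = 15/16 = 0.9375, E(R) ~ 1.253*sigma
```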

Linear Combinations of Independent Normal Variables
Suppose X, Y ~ N(0,1) are independent.
Question: what is the distribution of Z = aX + bY?
Solution: assume first that a² + b² = 1. Then there is an angle θ such that Z = (cos θ) X + (sin θ) Y.

Linear Combinations of Independent Normal Variables
Z = (cos θ) X + (sin θ) Y is the projection of (X,Y) onto the direction at angle θ. By rotational symmetry of the joint density,
P(x < Z < x + Δx) = P(x < X < x + Δx),
so Z ~ N(0,1).

Linear Combinations of Independent Normal Variables
If Z = aX + bY, where a and b are arbitrary (not both zero), we can define a new variable Z′ = Z/√(a² + b²). The coefficients of Z′ have squared sum 1, so Z′ ~ N(0,1) and Z ~ N(0, a² + b²).
If X ~ N(μ, σ²) and Y ~ N(ν, τ²) are independent, then X + Y = (μ + ν) + (X − μ) + (Y − ν), so X + Y ~ N(μ + ν, σ² + τ²).

N independent Normal Variables
Claim: if X_1, …, X_N are independent N(μ_i, σ_i²) variables, then
Z = X_1 + X_2 + … + X_N ~ N(μ_1 + … + μ_N, σ_1² + … + σ_N²).
Proof: by induction. The base case is trivial: Z_1 = X_1 ~ N(μ_1, σ_1²). Assuming the claim for N − 1 variables, we get Z_{N−1} ~ N(μ_1 + … + μ_{N−1}, σ_1² + … + σ_{N−1}²). Now Z_N = Z_{N−1} + X_N, where X_N and Z_{N−1} are independent normal variables, so by the previous result Z_N ~ N(μ_1 + … + μ_N, σ_1² + … + σ_N²).
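
A Monte Carlo sketch of the claim (the parameter values below are illustrative, not from the lecture): independent normals add both their means and their variances.

```python
import random

# Check: X ~ N(1, 4) and Y ~ N(-0.5, 2.25) independent should give
# X + Y ~ N(0.5, 6.25).  random.gauss takes the standard deviation.
random.seed(1)
n = 200_000
zs = [random.gauss(1.0, 2.0) + random.gauss(-0.5, 1.5) for _ in range(n)]

mean = sum(zs) / n
var = sum((z - mean) ** 2 for z in zs) / n
print(round(mean, 2), round(var, 2))  # theory: mean 0.5, variance 6.25
```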

Chi-square Distribution
Claim: the joint density of n independent N(0,1) variables is
f(x_1, …, x_n) = (2π)^(−n/2) e^(−(x_1² + … + x_n²)/2).
Note: the density is spherically symmetric; it depends only on the radial distance r = √(x_1² + … + x_n²).
Claim: the density of R = √(X_1² + … + X_n²) is
f_R(r) = c_n (2π)^(−n/2) r^(n−1) e^(−r²/2), r ≥ 0.
This follows from the fact that a shell of radius r and thickness dr in n dimensions has volume c_n r^(n−1) dr, where c_n denotes the surface area of a unit sphere.

Chi-square Distribution
The distribution of R² = X_1² + … + X_n² is called the chi-square distribution with n degrees of freedom.
Claim: the density of S = R² is
f_S(s) = s^(n/2 − 1) e^(−s/2) / (2^(n/2) Γ(n/2)), s > 0.

Applications of the Chi-square Distribution
Claim: consider an experiment that is repeated independently n times, where the i-th outcome has probability p_i for 1 ≤ i ≤ m. Let N_i = # of outcomes of the i-th type (N_1 + … + N_m = n). Then for large n, the statistic
R² = Σ_{i=1}^{m} (N_i − n p_i)² / (n p_i)
has approximately a chi-square distribution with m − 1 degrees of freedom.
Example: 10 draws with replacement from a box with p_b = 6/20, p_i = 4/20, p_c = 10/20; observed counts N_b = 3, N_i = 1, N_c = 6. The expected counts are 3, 2 and 5, so
R² = (3−3)²/3 + (1−2)²/2 + (6−5)²/5 = 1/2 + 1/5 = 0.7.
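
The box example can be recomputed directly; for 2 degrees of freedom the chi-square tail probability has the closed form e^(−x/2) (a standard fact, used here as a check):

```python
import math

# The chi-square statistic from the box sampling example.
probs = [6 / 20, 4 / 20, 10 / 20]   # box composition: p_b, p_i, p_c
observed = [3, 1, 6]                # N_b, N_i, N_c from the 10 draws
n = sum(observed)                   # 10 draws

expected = [n * p for p in probs]   # 3.0, 2.0, 5.0
r2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# m = 3 categories -> approximately chi-square with 2 degrees of freedom;
# for 2 df the tail probability P(chi-square >= x) equals exp(-x/2).
p_value = math.exp(-r2 / 2)
print(round(r2, 3), round(p_value, 2))
```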

Chi-square Distribution
Note: the claim allows us to "test" to what extent an outcome is consistent with an a priori guess about the actual probabilities. In the example above (10 draws with replacement; p_b = 6/20, p_i = 4/20, p_c = 10/20; N_b = 3, N_i = 1, N_c = 6) the statistic is χ² = 0.7 with m − 1 = 2 degrees of freedom. The probability of observing a statistic of this size or larger is e^(−0.7/2) ≈ 70%, so the sample is consistent with the box.

Chi-square Example
We have a sample of male and female college students and we record what type of shoes they are wearing. We would like to test the hypothesis that men and women do not differ in their shoe habits, so we set the expected number in each category to be the average of the two observed values.

                  Sandals  Sneakers  Leather shoes  Boots  Other  Totals
Male observed        6        17          13          9      5      50
Male expected       9.5       11          10        12.5     7      50
Female observed     13         5           7         16      9      50
Female expected     9.5       11          10        12.5     7      50
Total               19        22          20         25     14     100

Chi-square Example
Per-cell contributions (observed − expected)²/expected:
M/Sandals: (6 − 9.5)²/9.5 = 1.289      F/Sandals: (13 − 9.5)²/9.5 = 1.289
M/Sneakers: (17 − 11)²/11 = 3.273      F/Sneakers: (5 − 11)²/11 = 3.273
M/L. shoes: (13 − 10)²/10 = 0.900      F/L. shoes: (7 − 10)²/10 = 0.900
M/Boots: (9 − 12.5)²/12.5 = 0.980      F/Boots: (16 − 12.5)²/12.5 = 0.980
M/Other: (5 − 7)²/7 = 0.571            F/Other: (9 − 7)²/7 = 0.571
(Again, because of our balanced male/female sample, the row totals were the same, so the male and female observed-minus-expected differences are identical up to sign. This is usually not the case.)
The total chi-square value for the table is 2 × (1.289 + 3.273 + 0.900 + 0.980 + 0.571) ≈ 14.03, and the number of degrees of freedom is (2 − 1)(5 − 1) = 4. The probability of a chi-square statistic with 4 degrees of freedom exceeding this value is about 0.007, which allows us to reject the null hypothesis.
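
The table's statistic can be recomputed from the observed counts (the sandals and boots counts are inferred from the per-cell contributions together with the equal row totals); for 4 degrees of freedom the tail probability has the closed form e^(−x/2)(1 + x/2):

```python
import math

# Recompute the shoe-example chi-square.  Observed counts come from the
# per-cell contributions; sandals and boots counts are inferred from the
# requirement that both row totals equal 50.
female = {"sandals": 13, "sneakers": 5, "leather": 7, "boots": 16, "other": 9}
male = {"sandals": 6, "sneakers": 17, "leather": 13, "boots": 9, "other": 5}

chi2 = 0.0
for shoe in female:
    expected = (female[shoe] + male[shoe]) / 2  # average of the two observed
    chi2 += (female[shoe] - expected) ** 2 / expected
    chi2 += (male[shoe] - expected) ** 2 / expected

# Tail probability for a chi-square with 4 degrees of freedom.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)
print(round(chi2, 3), round(p_value, 4))
```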