Published by Ilene Lamb. Modified over 9 years ago.
1
Stats 241.3 Probability Theory Summary
2
The Sample Space, S
The sample space, S, for a random phenomenon is the set of all possible outcomes.
3
An Event, E
An event, E, is any subset of the sample space, S, i.e. any set of outcomes (not necessarily all outcomes) of the random phenomenon.
[Venn diagram: the event E drawn as a region inside the sample space S]
4
Probability
5
Suppose we are observing a random phenomenon. Let S denote the sample space for the phenomenon, the set of all possible outcomes. An event E is a subset of S. A probability measure P is defined on S by defining, for each event E, P[E] with the following properties:
1. $P[E] \ge 0$ for each E.
2. $P[S] = 1$.
3. $P\left[\bigcup_i E_i\right] = \sum_i P[E_i]$ if $E_i \cap E_j = \emptyset$ for all $i \ne j$.
6
Finite uniform probability space
Many examples fall into this category:
1. Finite number of outcomes.
2. All outcomes are equally likely.
3. $P[E] = n(E)/n(S)$, where n(E) and n(S) are the numbers of outcomes in E and in S.
To handle problems in this case we have to be able to count: count n(E) and n(S).
7
Techniques for counting
8
Basic Rule of counting
Suppose we carry out k operations in sequence. Let
$n_1$ = the number of ways the first operation can be performed,
$n_i$ = the number of ways the i-th operation can be performed once the first (i - 1) operations have been completed, i = 2, 3, …, k.
Then $N = n_1 n_2 \cdots n_k$ = the number of ways the k operations can be performed in sequence.
9
Diagram:
10
Basic Counting Formulae
1. Permutations: How many ways can you order n objects? $n!$
2. Permutations of size k (< n): How many ways can you choose k objects from n objects in a specific order? ${}_nP_k = \dfrac{n!}{(n-k)!}$
11
3. Combinations of size k (≤ n): A combination of size k chosen from n objects is a subset of size k where the order of selection is irrelevant. How many ways can you choose a combination of size k from n objects? ${}_nC_k = \binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$
12
Important Notes
1. In combinations ordering is irrelevant. Different orderings result in the same combination.
2. In permutations order is relevant. Different orderings result in different permutations.
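The counting formulae can be checked by brute force. The following is a minimal sketch using Python's standard library; the values n = 5 and k = 3 are illustrative and not taken from the slides.

```python
import math
from itertools import combinations, permutations

n, k = 5, 3            # illustrative values, not from the slides
objects = range(n)

# Closed-form counts.
orderings = math.factorial(n)   # n!
perms_k = math.perm(n, k)       # n! / (n - k)!
combs_k = math.comb(n, k)       # n! / (k! (n - k)!)

# Brute-force enumeration agrees with the formulae.
assert perms_k == len(list(permutations(objects, k)))
assert combs_k == len(list(combinations(objects, k)))

# Every combination of size k can be ordered in k! ways,
# which is why permutations outnumber combinations by a factor k!.
assert perms_k == combs_k * math.factorial(k)

print(orderings, perms_k, combs_k)  # 120 60 10
```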
13
Rules of Probability
14
The additive rule
$P[A \cup B] = P[A] + P[B] - P[A \cap B]$
and if $A \cap B = \emptyset$ then $P[A \cup B] = P[A] + P[B]$.
15
The additive rule for more than two events
$P\left[\bigcup_{i=1}^{n} A_i\right] = \sum_{i=1}^{n} P[A_i]$ if $A_i \cap A_j = \emptyset$ for all $i \ne j$.
16
The Rule for complements
For any event E, $P[\bar{E}] = 1 - P[E]$.
17
Conditional Probability, Independence and The Multiplicative Rule
18
The conditional probability of A given B is defined to be:
$P[A \mid B] = \dfrac{P[A \cap B]}{P[B]}$ if $P[B] \ne 0$.
19
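The multiplicative rule of probability
$P[A \cap B] = P[A]\,P[B \mid A]$ if $P[A] \ne 0$, and $P[A \cap B] = P[B]\,P[A \mid B]$ if $P[B] \ne 0$,
and $P[A \cap B] = P[A]\,P[B]$ if A and B are independent. This is the definition of independence.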
20
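The multiplicative rule for more than two events
$P[A_1 \cap A_2 \cap \cdots \cap A_n] = P[A_1]\,P[A_2 \mid A_1]\,P[A_3 \mid A_1 \cap A_2] \cdots P[A_n \mid A_1 \cap \cdots \cap A_{n-1}]$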
21
Independence for more than 2 events
22
Definition: The set of k events $A_1, A_2, \ldots, A_k$ are called mutually independent if
$P[A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_m}] = P[A_{i_1}]\,P[A_{i_2}] \cdots P[A_{i_m}]$
for every subset $\{i_1, i_2, \ldots, i_m\}$ of $\{1, 2, \ldots, k\}$.
E.g. for k = 3, $A_1, A_2, A_3$ are mutually independent if:
$P[A_1 \cap A_2] = P[A_1]\,P[A_2]$,
$P[A_1 \cap A_3] = P[A_1]\,P[A_3]$,
$P[A_2 \cap A_3] = P[A_2]\,P[A_3]$,
$P[A_1 \cap A_2 \cap A_3] = P[A_1]\,P[A_2]\,P[A_3]$.
23
Definition: The set of k events $A_1, A_2, \ldots, A_k$ are called pairwise independent if
$P[A_i \cap A_j] = P[A_i]\,P[A_j]$ for all $i \ne j$.
E.g. for k = 3, $A_1, A_2, A_3$ are pairwise independent if:
$P[A_1 \cap A_2] = P[A_1]\,P[A_2]$,
$P[A_1 \cap A_3] = P[A_1]\,P[A_3]$,
$P[A_2 \cap A_3] = P[A_2]\,P[A_3]$.
It is not necessarily true that $P[A_1 \cap A_2 \cap A_3] = P[A_1]\,P[A_2]\,P[A_3]$.
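A standard example separates the two definitions: toss a fair coin twice and consider the events "first toss is a head", "second toss is a head" and "both tosses agree". The sketch below is an illustration added here (the example is not from the slides) and checks by enumeration that these events are pairwise independent but not mutually independent.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin tosses, all four outcomes equally likely.
S = list(product("HT", repeat=2))

def prob(event):
    """P[E] = n(E) / n(S) in a finite uniform probability space."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def A1(s):
    return s[0] == "H"      # first toss is a head

def A2(s):
    return s[1] == "H"      # second toss is a head

def A3(s):
    return s[0] == s[1]     # both tosses agree

def intersection(*events):
    """Event that all of the given events occur."""
    return lambda s: all(e(s) for e in events)

pairwise = all(
    prob(intersection(a, b)) == prob(a) * prob(b)
    for a, b in [(A1, A2), (A1, A3), (A2, A3)]
)
triple = prob(intersection(A1, A2, A3)) == prob(A1) * prob(A2) * prob(A3)

print(pairwise, triple)  # True False
```

Here the triple intersection is the single outcome HH with probability 1/4, while the product of the three probabilities is 1/8.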
24
Bayes Rule for probability
25
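A generalization of Bayes Rule
Let $A_1, A_2, \ldots, A_k$ denote a set of events such that $S = A_1 \cup A_2 \cup \cdots \cup A_k$ and $A_i \cap A_j = \emptyset$ for all $i \ne j$. Then
$P[A_i \mid B] = \dfrac{P[A_i]\,P[B \mid A_i]}{\sum_{j=1}^{k} P[A_j]\,P[B \mid A_j]}$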
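As a minimal sketch of the generalized rule, the function below computes the posterior probabilities $P[A_i \mid B]$ from the priors $P[A_i]$ and the likelihoods $P[B \mid A_i]$, assuming the events $A_i$ partition the sample space. The function name `bayes` and all numbers are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def bayes(priors, likelihoods):
    """Return [P[A_i | B]] given [P[A_i]] and [P[B | A_i]].

    Assumes the A_i are mutually exclusive and exhaustive.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P[A_i and B]
    total = sum(joint)                                    # P[B]
    return [j / total for j in joint]

# Illustrative numbers only (not from the slides).
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

posterior = bayes(priors, likelihoods)
print(posterior)  # [Fraction(5, 21), Fraction(2, 7), Fraction(10, 21)]
```

The least likely event a priori (prior 1/5) becomes the most likely a posteriori because B is much more probable under it.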
26
Random Variables: an important concept in probability
27
A random variable, X, is a numerical quantity whose value is determined by a random experiment.
28
Definition – The probability function, p(x), of a random variable, X.
For any random variable, X, and any real number, x, we define
$p(x) = P[X = x] = P[\{X = x\}]$
where {X = x} = the set of all outcomes (event) with X = x. For continuous random variables p(x) = 0 for all values of x.
29
Definition – The cumulative distribution function, F(x), of a random variable, X.
For any random variable, X, and any real number, x, we define
$F(x) = P[X \le x] = P[\{X \le x\}]$
where {X ≤ x} = the set of all outcomes (event) with X ≤ x.
30
Discrete Random Variables
For a discrete random variable X the probability distribution is described by the probability function p(x), which has the following properties:
1. $0 \le p(x) \le 1$
2. $\sum_x p(x) = 1$
3. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$
31
[Graph: probability function p(x) of a discrete random variable, with the points a and b marked on the x-axis]
32
Continuous random variables
For a continuous random variable X the probability distribution is described by the probability density function f(x), which has the following properties:
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P[a \le X \le b] = \int_a^b f(x)\,dx$
33
Graph: Continuous Random Variable probability density function, f(x)
34
The distribution function F(x)
This is defined for any random variable, X: $F(x) = P[X \le x]$
Properties
1. $F(-\infty) = 0$ and $F(\infty) = 1$.
2. F(x) is non-decreasing (i.e. if $x_1 < x_2$ then $F(x_1) \le F(x_2)$).
3. $F(b) - F(a) = P[a < X \le b]$.
35
4. $p(x) = P[X = x] = F(x) - F(x-)$, where $F(x-) = \lim_{u \to x^-} F(u)$.
5. If p(x) = 0 for all x (i.e. X is continuous) then F(x) is continuous.
36
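6. For Discrete Random Variables F(x) is a non-decreasing step function with $F(x) = \sum_{u \le x} p(u)$ and jumps of size $p(x) = F(x) - F(x-)$.
[Graph: step function F(x) with a jump of height p(x) at each possible value x]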
37
7. For Continuous Random Variables F(x) is a non-decreasing continuous function with $F(x) = \int_{-\infty}^{x} f(u)\,du$.
[Graph: F(x), whose slope at x is f(x)]
To find the probability density function, f(x), one first finds F(x); then $f(x) = F'(x) = \dfrac{dF(x)}{dx}$.
38
Some Important Discrete distributions
39
The Bernoulli distribution
40
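Suppose that we have an experiment that has two outcomes:
1. Success (S)
2. Failure (F)
These terms are used in reliability testing. Suppose that p is the probability of success (S) and q = 1 – p is the probability of failure (F). This experiment is sometimes called a Bernoulli Trial.
Let X = 1 if the experiment results in success (S) and X = 0 if it results in failure (F). Then
$p(x) = P[X = x] = \begin{cases} q & x = 0 \\ p & x = 1 \end{cases}$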
41
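The probability distribution with probability function
$p(x) = \begin{cases} q = 1 - p & x = 0 \\ p & x = 1 \end{cases}$
is called the Bernoulli distribution.
[Graph: probability function with heights q = 1 – p and p]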
42
The Binomial distribution
43
We observe a Bernoulli trial (S,F) n times. Let X denote the number of successes in the n trials. Then X has a binomial distribution, i.e.
$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
where
1. p = the probability of success (S), and
2. q = 1 – p = the probability of failure (F).
44
The Poisson distribution
Suppose events are occurring randomly and uniformly in time. Let X be the number of events occurring in a fixed period of time. Then X will have a Poisson distribution with parameter λ:
$p(x) = \dfrac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, \ldots$
45
The Geometric distribution
Suppose a Bernoulli trial (S,F) is repeated until a success occurs. Let X = the trial on which the first success (S) occurs. The probability function of X is:
$p(x) = P[X = x] = (1 - p)^{x-1} p = p\,q^{x-1}, \quad x = 1, 2, 3, \ldots$
46
The Negative Binomial distribution
Suppose a Bernoulli trial (S,F) is repeated until k successes occur. Let X = the trial on which the k-th success (S) occurs. The probability function of X is:
$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \ldots$
47
The Hypergeometric distribution
Suppose we have a population containing N objects. Suppose the elements of the population are partitioned into two groups. Let a = the number of elements in group A and let b = the number of elements in the other group (group B). Note N = a + b. Now suppose that n elements are selected from the population at random. Let X denote the number of selected elements that come from group A. The probability distribution of X is
$p(x) = P[X = x] = \dfrac{\binom{a}{x}\binom{b}{n-x}}{\binom{N}{n}}$
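The discrete probability functions of this section can be written directly from their standard formulas. This is a minimal sketch in standard-library Python, assuming the usual parameterizations (success probability p, Poisson rate `lam`, group sizes a and b); the function names and parameter values are chosen here for illustration.

```python
import math

def binomial(x, n, p):
    """P[X = x] for the number of successes in n Bernoulli trials."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson(x, lam):
    """P[X = x] for a Poisson count with parameter lam."""
    return lam**x * math.exp(-lam) / math.factorial(x)

def geometric(x, p):
    """P[first success occurs on trial x], x = 1, 2, ..."""
    return (1 - p)**(x - 1) * p

def negative_binomial(x, k, p):
    """P[k-th success occurs on trial x], x = k, k + 1, ..."""
    return math.comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

def hypergeometric(x, a, b, n):
    """P[x of the n selected elements come from group A]."""
    return math.comb(a, x) * math.comb(b, n - x) / math.comb(a + b, n)

# Each probability function should sum to 1 over its support.
binom_total = sum(binomial(x, 10, 0.3) for x in range(11))
poisson_total = sum(poisson(x, 2.0) for x in range(60))    # tail is negligible
hyper_total = sum(hypergeometric(x, 6, 4, 5) for x in range(6))

print(binom_total, poisson_total, hyper_total)
```

With k = 1 the negative binomial reduces to the geometric distribution, which gives a quick consistency check between the two functions.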
48
Continuous Distributions
51
Continuous Distributions
The Uniform distribution from a to b
$f(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
52
The Normal distribution (mean μ, standard deviation σ)
$f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
53
The Exponential distribution
$f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$
54
The Weibull distribution A model for the lifetime of objects that do age.
55
The Weibull distribution with parameters α and β.
56
[Graph: the Weibull density, f(x), for (α = 0.5, β = 2), (α = 0.7, β = 2) and (α = 0.9, β = 2)]
57
The Gamma distribution An important family of distributions
58
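The Gamma distribution
Let the continuous random variable X have density function:
$f(x) = \begin{cases} \dfrac{\lambda^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$
Then X is said to have a Gamma distribution with parameters α and λ.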
59
[Graph: the gamma distribution for (α = 2, λ = 0.9), (α = 2, λ = 0.6) and (α = 3, λ = 0.6)]
60
Comments
1. The set of gamma distributions is a family of distributions (parameterized by α and λ).
2. Contained within this family are other distributions:
a. The Exponential distribution – in the case α = 1, the gamma distribution becomes the exponential distribution with parameter λ. The exponential distribution arises if we are measuring the lifetime, X, of an object that does not age. It is also used as a distribution for waiting times between events occurring uniformly in time.
b. The Chi-square distribution – in the case α = ν/2 and λ = ½, the gamma distribution becomes the chi-square (χ²) distribution with ν degrees of freedom. Later we will see that a sum of squares of independent standard normal variates has a chi-square distribution, with degrees of freedom = the number of independent terms in the sum of squares.
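The two special cases can be checked numerically. The sketch below assumes the gamma density is parameterized by a shape α and a rate λ, i.e. $f(x) = \lambda^{\alpha} x^{\alpha-1} e^{-\lambda x} / \Gamma(\alpha)$, which is the form consistent with the comments above; the function names are chosen here for illustration.

```python
import math

def gamma_pdf(x, alpha, lam):
    """Gamma density with shape alpha and rate lam (assumed parameterization)."""
    return lam**alpha / math.gamma(alpha) * x**(alpha - 1) * math.exp(-lam * x)

def exponential_pdf(x, lam):
    """Exponential density with parameter lam."""
    return lam * math.exp(-lam * x)

def chi_square_pdf(x, nu):
    """Chi-square density with nu degrees of freedom."""
    return x**(nu / 2 - 1) * math.exp(-x / 2) / (2**(nu / 2) * math.gamma(nu / 2))

xs = [0.1, 0.5, 1.0, 2.5, 7.0]

# alpha = 1 gives the exponential distribution with parameter lam.
exp_ok = all(
    math.isclose(gamma_pdf(x, 1, 0.6), exponential_pdf(x, 0.6)) for x in xs
)

# alpha = nu / 2 and lam = 1/2 give the chi-square distribution with nu d.f.
chi_ok = all(
    math.isclose(gamma_pdf(x, nu / 2, 0.5), chi_square_pdf(x, nu))
    for x in xs for nu in (1, 2, 5)
)

print(exp_ok, chi_ok)  # True True
```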
61
Expectation
62
Let X denote a discrete random variable with probability function p(x); then the expected value of X, E(X), is defined to be:
$E(X) = \sum_x x\,p(x)$
and if X is continuous with probability density function f(x):
$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$
63
Expectation of functions
Let X denote a discrete random variable with probability function p(x); then the expected value of g(X), E[g(X)], is defined to be:
$E[g(X)] = \sum_x g(x)\,p(x)$
and if X is continuous with probability density function f(x):
$E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$
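For a discrete random variable the definition $E[g(X)] = \sum_x g(x)\,p(x)$ is a one-line computation. The sketch below uses a fair six-sided die as an illustrative example (not taken from the slides); `expect` is a helper name chosen here.

```python
from fractions import Fraction

# Probability function of a fair six-sided die (illustrative example).
p = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g, p):
    """E[g(X)] = sum over x of g(x) p(x) for a discrete random variable."""
    return sum(g(x) * px for x, px in p.items())

mean = expect(lambda x: x, p)              # E(X)
second_moment = expect(lambda x: x**2, p)  # E(X^2)
variance = second_moment - mean**2         # E(X^2) - [E(X)]^2

print(mean, second_moment, variance)  # 7/2 91/6 35/12
```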
64
Moments of a Random Variable
65
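The k-th moment of X:
$\mu_k = E(X^k) = \begin{cases} \sum_x x^k p(x) & \text{if X is discrete} \\ \int_{-\infty}^{\infty} x^k f(x)\,dx & \text{if X is continuous} \end{cases}$
The first moment of X, $\mu = \mu_1 = E(X)$, is the center of gravity of the distribution of X. The higher moments give different information regarding the distribution of X.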
66
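The k-th central moment of X:
$E[(X - \mu)^k] = \begin{cases} \sum_x (x - \mu)^k p(x) & \text{if X is discrete} \\ \int_{-\infty}^{\infty} (x - \mu)^k f(x)\,dx & \text{if X is continuous} \end{cases}$
where $\mu = E(X)$.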
67
Moment generating functions
68
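Definition
Let X denote a random variable. Then the moment generating function of X, $m_X(t)$, is defined by:
$m_X(t) = E(e^{tX}) = \begin{cases} \sum_x e^{tx} p(x) & \text{if X is discrete} \\ \int_{-\infty}^{\infty} e^{tx} f(x)\,dx & \text{if X is continuous} \end{cases}$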
69
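Properties
1. $m_X(0) = 1$
2. $m_X^{(k)}(0) = \mu_k = E(X^k)$, where $m_X^{(k)}(0)$ is the k-th derivative of $m_X(t)$ at t = 0.
3. $m_X(t) = 1 + \mu_1 t + \dfrac{\mu_2}{2!} t^2 + \dfrac{\mu_3}{3!} t^3 + \cdots$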
70
4. Let X be a random variable with moment generating function $m_X(t)$. Let Y = bX + a. Then
$m_Y(t) = m_{bX+a}(t) = E(e^{(bX+a)t}) = e^{at} E(e^{X(bt)}) = e^{at}\,m_X(bt)$
5. Let X and Y be two independent random variables with moment generating functions $m_X(t)$ and $m_Y(t)$. Then
$m_{X+Y}(t) = E(e^{(X+Y)t}) = E(e^{Xt} e^{Yt}) = E(e^{Xt})\,E(e^{Yt}) = m_X(t)\,m_Y(t)$
71
6. Let X and Y be two random variables with moment generating functions $m_X(t)$ and $m_Y(t)$ and distribution functions $F_X(x)$ and $F_Y(y)$ respectively. If $m_X(t) = m_Y(t)$ then $F_X(x) = F_Y(x)$. This ensures that the distribution of a random variable can be identified by its moment generating function.
72
M. G. F.’s - Continuous distributions
73
M. G. F.’s - Discrete distributions
74
Note: The distribution of a random variable X can be described by:
77
Jointly distributed Random variables Multivariate distributions
78
Discrete Random Variables
79
The joint probability function; p(x,y) = P[X = x, Y = y]
80
Continuous Random Variables
81
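Definition: Two random variables are said to have joint probability density function f(x,y) if
1. $f(x, y) \ge 0$
2. $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
3. $P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$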
82
Marginal and conditional distributions
83
Marginal Distributions (Discrete case):
Let X and Y denote two random variables with joint probability function p(x,y); then
the marginal probability function of X is $p_X(x) = \sum_y p(x, y)$
the marginal probability function of Y is $p_Y(y) = \sum_x p(x, y)$
84
Marginal Distributions (Continuous case):
Let X and Y denote two random variables with joint probability density function f(x,y); then
the marginal density of X is $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$
the marginal density of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$
85
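Conditional Distributions (Discrete Case):
Let X and Y denote two random variables with joint probability function p(x,y) and marginal probability functions $p_X(x)$, $p_Y(y)$; then
the conditional probability function of Y given X = x is $p_{Y|X}(y \mid x) = \dfrac{p(x, y)}{p_X(x)}$
the conditional probability function of X given Y = y is $p_{X|Y}(x \mid y) = \dfrac{p(x, y)}{p_Y(y)}$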
86
Conditional Distributions (Continuous Case):
Let X and Y denote two random variables with joint probability density function f(x,y) and marginal densities $f_X(x)$, $f_Y(y)$; then
the conditional density of Y given X = x is $f_{Y|X}(y \mid x) = \dfrac{f(x, y)}{f_X(x)}$
the conditional density of X given Y = y is $f_{X|Y}(x \mid y) = \dfrac{f(x, y)}{f_Y(y)}$
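In the discrete case, marginal and conditional distributions are obtained from the joint probability function by summing and dividing. The sketch below uses a small hypothetical joint table; the probabilities and the names `p_X`, `p_Y` and `cond_Y_given_X` are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint probability function p(x, y); values are illustrative.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2),
}

# Marginals: sum the joint probability function over the other variable.
p_X = defaultdict(Fraction)
p_Y = defaultdict(Fraction)
for (x, y), p in joint.items():
    p_X[x] += p
    p_Y[y] += p

def cond_Y_given_X(y, x):
    """Conditional probability function of Y given X = x: p(x, y) / p_X(x)."""
    return joint[(x, y)] / p_X[x]

print(dict(p_X))             # marginal of X: 1/4 at x = 0, 3/4 at x = 1
print(dict(p_Y))             # marginal of Y: 3/8 at y = 0, 5/8 at y = 1
print(cond_Y_given_X(1, 1))  # 2/3
```

For each fixed x the conditional probabilities sum to 1 over y, as any probability function must.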
87
The bivariate Normal distribution
88
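Let
$f(x_1, x_2) = \dfrac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\, e^{-\frac{1}{2} Q(x_1, x_2)}$
where
$Q(x_1, x_2) = \dfrac{\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2}{1-\rho^2}$
This distribution is called the bivariate Normal distribution. The parameters are $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$ and $\rho$.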
89
Surface Plots of the bivariate Normal distribution
90
Marginal distributions
1. The marginal distribution of $x_1$ is Normal with mean $\mu_1$ and standard deviation $\sigma_1$.
2. The marginal distribution of $x_2$ is Normal with mean $\mu_2$ and standard deviation $\sigma_2$.
91
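Conditional distributions
1. The conditional distribution of $x_1$ given $x_2$ is Normal with:
mean $\mu_{1|2} = \mu_1 + \rho\dfrac{\sigma_1}{\sigma_2}(x_2 - \mu_2)$ and standard deviation $\sigma_{1|2} = \sigma_1\sqrt{1-\rho^2}$
2. The conditional distribution of $x_2$ given $x_1$ is Normal with:
mean $\mu_{2|1} = \mu_2 + \rho\dfrac{\sigma_2}{\sigma_1}(x_1 - \mu_1)$ and standard deviation $\sigma_{2|1} = \sigma_2\sqrt{1-\rho^2}$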
92
Independence
93
Definition: Two random variables X and Y are defined to be independent if
$p(x, y) = p_X(x)\,p_Y(y)$ if X and Y are discrete
$f(x, y) = f_X(x)\,f_Y(y)$ if X and Y are continuous
94
Multivariate distributions (k ≥ 2)
95
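Definition
Let $X_1, X_2, \ldots, X_n$ denote n discrete random variables; then $p(x_1, x_2, \ldots, x_n)$ is the joint probability function of $X_1, X_2, \ldots, X_n$ if
1. $0 \le p(x_1, \ldots, x_n) \le 1$
2. $\sum_{x_1} \cdots \sum_{x_n} p(x_1, \ldots, x_n) = 1$
3. $P[(X_1, \ldots, X_n) \in A] = \sum_{(x_1, \ldots, x_n) \in A} p(x_1, \ldots, x_n)$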
96
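Definition
Let $X_1, X_2, \ldots, X_k$ denote k continuous random variables; then $f(x_1, x_2, \ldots, x_k)$ is the joint density function of $X_1, X_2, \ldots, X_k$ if
1. $f(x_1, \ldots, x_k) \ge 0$
2. $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_k)\,dx_1 \cdots dx_k = 1$
3. $P[(X_1, \ldots, X_k) \in A] = \int \cdots \int_A f(x_1, \ldots, x_k)\,dx_1 \cdots dx_k$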
97
The Multinomial distribution
Suppose that we observe an experiment that has k possible outcomes $\{O_1, O_2, \ldots, O_k\}$ independently n times. Let $p_1, p_2, \ldots, p_k$ denote the probabilities of $O_1, O_2, \ldots, O_k$ respectively. Let $X_i$ denote the number of times that outcome $O_i$ occurs in the n repetitions of the experiment.
98
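The joint probability function of $X_1, X_2, \ldots, X_k$:
$p(x_1, x_2, \ldots, x_k) = \dfrac{n!}{x_1!\,x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$, where $x_1 + x_2 + \cdots + x_k = n$,
is called the Multinomial distribution.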
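A minimal sketch of the multinomial probability function, assuming the standard form $n!/(x_1! \cdots x_k!)\; p_1^{x_1} \cdots p_k^{x_k}$ with $x_1 + \cdots + x_k = n$. The function name `multinomial_pmf` and the probabilities are illustrative.

```python
import math
from itertools import product

def multinomial_pmf(counts, probs):
    """P[X_1 = x_1, ..., X_k = x_k] where n = sum(counts)."""
    coef = math.factorial(sum(counts))
    for x in counts:
        coef //= math.factorial(x)   # exact: multinomial coefficients are integers
    return coef * math.prod(p**x for p, x in zip(probs, counts))

probs = (0.2, 0.3, 0.5)   # illustrative outcome probabilities
n = 4

# Summing over every count vector with x_1 + x_2 + x_3 = n should give 1.
total = sum(
    multinomial_pmf(c, probs)
    for c in product(range(n + 1), repeat=len(probs))
    if sum(c) == n
)

print(multinomial_pmf((1, 1, 2), probs), total)
```

For counts (1, 1, 2) the coefficient is 4!/(1! 1! 2!) = 12, so the probability is 12 × 0.2 × 0.3 × 0.5² = 0.18 up to floating-point rounding.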
99
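The Multivariate Normal distribution
Recall the univariate normal distribution
$f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
and the bivariate normal distribution
$f(x_1, x_2) = \dfrac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\dfrac{\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2}{2(1-\rho^2)}\right\}$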
100
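The k-variate Normal distribution
$f(x_1, \ldots, x_k) = f(\mathbf{x}) = \dfrac{1}{(2\pi)^{k/2}\,|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})}$
where $\mathbf{x} = (x_1, \ldots, x_k)'$, $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_k)'$ is the mean vector and $\Sigma$ is the k × k covariance matrix.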
101
Marginal distributions
102
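Definition
Let $X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k$ denote k discrete random variables with joint probability function $p(x_1, x_2, \ldots, x_q, x_{q+1}, \ldots, x_k)$; then the marginal joint probability function of $X_1, X_2, \ldots, X_q$ is
$p_{12 \ldots q}(x_1, \ldots, x_q) = \sum_{x_{q+1}} \cdots \sum_{x_k} p(x_1, \ldots, x_k)$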
103
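Definition
Let $X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k$ denote k continuous random variables with joint probability density function $f(x_1, x_2, \ldots, x_q, x_{q+1}, \ldots, x_k)$; then the marginal joint density function of $X_1, X_2, \ldots, X_q$ is
$f_{12 \ldots q}(x_1, \ldots, x_q) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_k)\,dx_{q+1} \cdots dx_k$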
104
Conditional distributions
105
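Definition
Let $X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k$ denote k discrete random variables with joint probability function $p(x_1, x_2, \ldots, x_q, x_{q+1}, \ldots, x_k)$; then the conditional joint probability function of $X_1, X_2, \ldots, X_q$ given $X_{q+1} = x_{q+1}, \ldots, X_k = x_k$ is
$p(x_1, \ldots, x_q \mid x_{q+1}, \ldots, x_k) = \dfrac{p(x_1, \ldots, x_k)}{p_{q+1 \ldots k}(x_{q+1}, \ldots, x_k)}$
where the denominator is the marginal joint probability function of $X_{q+1}, \ldots, X_k$.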
106
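Definition
Let $X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k$ denote k continuous random variables with joint probability density function $f(x_1, x_2, \ldots, x_q, x_{q+1}, \ldots, x_k)$; then the conditional joint density function of $X_1, X_2, \ldots, X_q$ given $X_{q+1} = x_{q+1}, \ldots, X_k = x_k$ is
$f(x_1, \ldots, x_q \mid x_{q+1}, \ldots, x_k) = \dfrac{f(x_1, \ldots, x_k)}{f_{q+1 \ldots k}(x_{q+1}, \ldots, x_k)}$
where the denominator is the marginal joint density of $X_{q+1}, \ldots, X_k$.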
107
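Definition – Independence of sets of vectors
Let $X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k$ denote k continuous random variables with joint probability density function $f(x_1, x_2, \ldots, x_q, x_{q+1}, \ldots, x_k)$; then the variables $X_1, X_2, \ldots, X_q$ are independent of $X_{q+1}, \ldots, X_k$ if
$f(x_1, \ldots, x_k) = f_1(x_1, \ldots, x_q)\,f_2(x_{q+1}, \ldots, x_k)$
where $f_1$ and $f_2$ are the marginal joint densities of the two sets of variables. A similar definition holds for discrete random variables.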
108
Definition – Mutual Independence
Let $X_1, X_2, \ldots, X_k$ denote k continuous random variables with joint probability density function $f(x_1, x_2, \ldots, x_k)$; then the variables $X_1, X_2, \ldots, X_k$ are called mutually independent if
$f(x_1, \ldots, x_k) = f_1(x_1)\,f_2(x_2) \cdots f_k(x_k)$
where $f_i$ is the marginal density of $X_i$. A similar definition holds for discrete random variables.
109
Expectation for multivariate distributions
110
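Definition
Let $X_1, X_2, \ldots, X_n$ denote n jointly distributed random variables with joint density function $f(x_1, x_2, \ldots, x_n)$; then
$E[g(X_1, \ldots, X_n)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \ldots, x_n)\,f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n$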
111
Some Rules for Expectation
112
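1. $E[X_i] = \int \cdots \int x_i\,f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n = \int_{-\infty}^{\infty} x_i\,f_i(x_i)\,dx_i$
Thus you can calculate $E[X_i]$ either from the joint distribution of $X_1, \ldots, X_n$ or from the marginal distribution of $X_i$.
2. (The Linearity property) $E[a_1 X_1 + \cdots + a_n X_n] = a_1 E[X_1] + \cdots + a_n E[X_n]$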
113
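3. (The Multiplicative property) Suppose $X_1, \ldots, X_q$ are independent of $X_{q+1}, \ldots, X_k$; then
$E[g(X_1, \ldots, X_q)\,h(X_{q+1}, \ldots, X_k)] = E[g(X_1, \ldots, X_q)]\,E[h(X_{q+1}, \ldots, X_k)]$
In the simple case when k = 2:
$E[XY] = E[X]\,E[Y]$ if X and Y are independent.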
114
Some Rules for Variance
115
Ex: Tchebychev’s inequality
$P[|X - \mu| \ge k\sigma] \le \dfrac{1}{k^2}$, where $\mu = E(X)$ and $\sigma^2 = Var(X)$.
116
Note: If X and Y are independent, then $Cov(X, Y) = 0$ and $Var(X + Y) = Var(X) + Var(Y)$.
117
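The correlation coefficient $\rho_{XY} = \dfrac{Cov(X, Y)}{\sigma_X \sigma_Y}$
Properties
1. $-1 \le \rho_{XY} \le 1$
2. $|\rho_{XY}| = 1$ if there exist a and b such that $P[Y = bX + a] = 1$, where $\rho_{XY} = +1$ if b > 0 and $\rho_{XY} = -1$ if b < 0.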
118
Some other properties of variance
119
4. Variance: Multiplicative Rule for independent random variables
Suppose that X and Y are independent random variables; then:
$Var(XY) = Var(X)\,Var(Y) + \mu_X^2\,Var(Y) + \mu_Y^2\,Var(X)$
120
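Mean and Variance of averages
Let $X_1, \ldots, X_n$ be n mutually independent random variables each having mean μ and standard deviation σ (variance σ²). Let
$\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$
Then $E[\bar{X}] = \mu$ and $Var(\bar{X}) = \dfrac{\sigma^2}{n}$.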
121
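The Law of Large Numbers
Let $X_1, \ldots, X_n$ be n mutually independent random variables each having mean μ, and let $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$. Then for any ε > 0 (no matter how small)
$P[|\bar{X} - \mu| < \varepsilon] \to 1$ as $n \to \infty$.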
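The Law of Large Numbers can be illustrated by simulation: as n grows, the sample mean of independent draws settles near the common mean μ. The sketch below assumes Uniform(0, 1) draws (so μ = 0.5), a choice made here purely for illustration, and fixes the random seed so the run is repeatable.

```python
import random

random.seed(0)   # fixed seed so the illustration is repeatable
mu = 0.5         # mean of the Uniform(0, 1) distribution

def sample_mean(n):
    """Average of n independent Uniform(0, 1) draws."""
    return sum(random.random() for _ in range(n)) / n

# The absolute error |sample mean - mu| tends to shrink as n increases.
errors = {n: abs(sample_mean(n) - mu) for n in (10, 1_000, 100_000)}
for n, err in errors.items():
    print(n, err)
```

Since $Var(\bar{X}) = \sigma^2/n$, the typical size of the error falls like $1/\sqrt{n}$, which is what the printed errors reflect.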
122
Conditional Expectation:
123
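Definition
Let $X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k$ denote k continuous random variables with joint probability density function $f(x_1, x_2, \ldots, x_q, x_{q+1}, \ldots, x_k)$; then the conditional joint density function of $X_1, X_2, \ldots, X_q$ given $X_{q+1} = x_{q+1}, \ldots, X_k = x_k$ is
$f(x_1, \ldots, x_q \mid x_{q+1}, \ldots, x_k) = \dfrac{f(x_1, \ldots, x_k)}{f_{q+1 \ldots k}(x_{q+1}, \ldots, x_k)}$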
124
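Definition
Let $U = h(X_1, X_2, \ldots, X_q, X_{q+1}, \ldots, X_k)$; then the Conditional Expectation of U given $X_{q+1} = x_{q+1}, \ldots, X_k = x_k$ is
$E[U \mid x_{q+1}, \ldots, x_k] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h(x_1, \ldots, x_k)\,f(x_1, \ldots, x_q \mid x_{q+1}, \ldots, x_k)\,dx_1 \cdots dx_q$
Note this will be a function of $x_{q+1}, \ldots, x_k$.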
125
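A very useful rule
Let $(x_1, x_2, \ldots, x_q, y_1, y_2, \ldots, y_m) = (\mathbf{x}, \mathbf{y})$ denote q + m random variables, and let $U = g(\mathbf{x}, \mathbf{y})$. Then
$E[U] = E_{\mathbf{y}}\big[E[U \mid \mathbf{y}]\big]$
$Var[U] = E_{\mathbf{y}}\big[Var[U \mid \mathbf{y}]\big] + Var_{\mathbf{y}}\big[E[U \mid \mathbf{y}]\big]$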
126
Functions of Random Variables
127
Methods for determining the distribution of functions of Random Variables
1. Distribution function method
2. Moment generating function method
3. Transformation method
128
Distribution function method
Let X, Y, Z, … have joint density f(x, y, z, …). Let W = h(X, Y, Z, …).
First step: Find the distribution function of W: G(w) = P[W ≤ w] = P[h(X, Y, Z, …) ≤ w]
Second step: Find the density function of W: g(w) = G'(w).
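As a worked illustration of the two steps (an example chosen here, not taken from the slides), let X be Uniform(0, 1) and W = X². For 0 < w < 1, G(w) = P[X² ≤ w] = P[X ≤ √w] = √w, so g(w) = G'(w) = 1/(2√w). The sketch below compares G with an empirical distribution function from simulated values and checks g by numerical differentiation.

```python
import math
import random

random.seed(1)   # fixed seed so the illustration is repeatable

# Simulate W = X^2 with X ~ Uniform(0, 1).
samples = [random.random() ** 2 for _ in range(200_000)]

def empirical_cdf(w):
    """Fraction of simulated values of W that are <= w."""
    return sum(s <= w for s in samples) / len(samples)

def G(w):
    """First step: distribution function of W, valid for 0 < w < 1."""
    return math.sqrt(w)

def g(w):
    """Second step: density of W, g(w) = G'(w)."""
    return 1 / (2 * math.sqrt(w))

# The empirical and derived distribution functions should agree closely.
for w in (0.04, 0.25, 0.81):
    print(w, empirical_cdf(w), G(w))

# A central difference of G approximates the density g.
h = 1e-6
slope = (G(0.25 + h) - G(0.25 - h)) / (2 * h)
print(slope, g(0.25))
```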
129
Use of moment generating functions
1. Using the moment generating functions of X, Y, Z, …, determine the moment generating function of W = h(X, Y, Z, …).
2. Identify the distribution of W from its moment generating function.
This procedure works well for sums, linear combinations, averages etc.
130
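Let $x_1, x_2, \ldots$ denote a sequence of independent random variables.
Sums: Let $S = x_1 + x_2 + \cdots + x_n$; then $m_S(t) = m_{x_1}(t)\,m_{x_2}(t) \cdots m_{x_n}(t)$
Linear Combinations: Let $L = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$; then $m_L(t) = m_{x_1}(a_1 t)\,m_{x_2}(a_2 t) \cdots m_{x_n}(a_n t)$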
131
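Arithmetic Means
Let $x_1, x_2, \ldots$ denote a sequence of independent random variables coming from a distribution with moment generating function m(t), and let $\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$. Then
$m_{\bar{x}}(t) = \left[m\left(\dfrac{t}{n}\right)\right]^n$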
132
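The Transformation Method
Theorem: Let X denote a random variable with probability density function f(x) and U = h(X). Assume that h(x) is either strictly increasing or strictly decreasing; then the probability density of U is:
$g(u) = f\big(h^{-1}(u)\big)\left|\dfrac{d\,h^{-1}(u)}{du}\right| = f(x)\left|\dfrac{dx}{du}\right|$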
133
The Transformation Method (many variables)
Theorem: Let $x_1, x_2, \ldots, x_n$ denote random variables with joint probability density function $f(x_1, x_2, \ldots, x_n)$. Let
$u_1 = h_1(x_1, x_2, \ldots, x_n)$
$u_2 = h_2(x_1, x_2, \ldots, x_n)$
…
$u_n = h_n(x_1, x_2, \ldots, x_n)$
define an invertible transformation from the x’s to the u’s.
134
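Then the joint probability density function of $u_1, u_2, \ldots, u_n$ is given by:
$g(u_1, \ldots, u_n) = f(x_1, \ldots, x_n)\,|J|$
where
$J = \dfrac{\partial(x_1, \ldots, x_n)}{\partial(u_1, \ldots, u_n)} = \det\left[\dfrac{\partial x_i}{\partial u_j}\right]$
is the Jacobian of the transformation.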
135
Some important results Distribution of functions of random variables
136
The method used to derive these results will be indicated by: 1. DF- Distribution Function Method. 2. MGF- Moment generating function method 3. TF- Transformation method
137
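Student’s t distribution
Let Z and U be two independent random variables with:
1. Z having a Standard Normal distribution, and
2. U having a χ² distribution with ν degrees of freedom;
then the distribution of $t = \dfrac{Z}{\sqrt{U/\nu}}$ is Student’s t distribution with ν degrees of freedom, with density
$g(t) = K\left(1 + \dfrac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$, where $K = \dfrac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\;\Gamma\left(\frac{\nu}{2}\right)}$.
(Method: DF)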
138
The Chi-square distribution
Let $Z_1, Z_2, \ldots, Z_\nu$ be ν independent random variables having a Standard Normal distribution; then
$U = \sum_{i=1}^{\nu} Z_i^2$
has a χ² distribution with ν degrees of freedom.
(Method: DF for ν = 1, MGF for ν > 1)
139
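Distribution of the sample mean
Let $x_1, x_2, \ldots, x_n$ denote a sample from the normal distribution with mean μ and variance σ²; then
$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
has a Normal distribution with mean $\mu_{\bar{x}} = \mu$ and variance $\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n}$.
(Method: MGF)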
140
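The Central Limit theorem
If $x_1, x_2, \ldots, x_n$ is a sample from a distribution with mean μ and standard deviation σ, then if n is large $\bar{x}$ has approximately a normal distribution with mean $\mu_{\bar{x}} = \mu$ and variance $\sigma_{\bar{x}}^2 = \dfrac{\sigma^2}{n}$.
(Method: MGF)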
141
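Distribution of the sample variance
Let $x_1, x_2, \ldots, x_n$ denote a sample from the normal distribution with mean μ and variance σ². Let
$s^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
then
$U = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}$
has a χ² distribution with ν = n - 1 degrees of freedom.
(Method: MGF)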
142
Distribution of sums of Gamma R. V.’s
Let $X_1, X_2, \ldots, X_n$ denote n independent random variables each having a gamma distribution with parameters $(\lambda, \alpha_i)$, i = 1, 2, …, n. Then $W = X_1 + X_2 + \cdots + X_n$ has a gamma distribution with parameters $(\lambda, \alpha_1 + \alpha_2 + \cdots + \alpha_n)$.
Distribution of a multiple of a Gamma R. V.
Suppose that X is a random variable having a gamma distribution with parameters $(\lambda, \alpha)$. Then W = aX has a gamma distribution with parameters $(\lambda/a, \alpha)$.
(Method: MGF)
143
Distribution of sums of Binomial R. V.’s
Let $X_1, X_2, \ldots, X_k$ denote k independent random variables each having a binomial distribution with parameters $(p, n_i)$, i = 1, 2, …, k. Then $W = X_1 + X_2 + \cdots + X_k$ has a binomial distribution with parameters $(p, n_1 + n_2 + \cdots + n_k)$.
Distribution of sums of Negative Binomial R. V.’s
Let $X_1, X_2, \ldots, X_n$ denote n independent random variables each having a negative binomial distribution with parameters $(p, k_i)$, i = 1, 2, …, n. Then $W = X_1 + X_2 + \cdots + X_n$ has a negative binomial distribution with parameters $(p, k_1 + k_2 + \cdots + k_n)$.
(Method: MGF)
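The result for sums of binomial random variables can be checked numerically without moment generating functions: the probability function of a sum of two independent random variables is the convolution of their probability functions. This swaps in direct convolution for the MGF argument named on the slide. The parameter values (p = 0.4, n₁ = 3, n₂ = 5) and function names are illustrative.

```python
import math

def binomial_pmf(n, p):
    """List of P[X = x] for x = 0, ..., n with X ~ Binomial(p, n)."""
    return [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

def convolve(a, b):
    """Probability function of the sum of two independent counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

p = 0.4   # common success probability (illustrative)
w = convolve(binomial_pmf(3, p), binomial_pmf(5, p))
target = binomial_pmf(3 + 5, p)

matches = all(math.isclose(u, v) for u, v in zip(w, target))
print(matches)  # True
```

The check relies on both variables sharing the same p; with different success probabilities the sum is in general not binomial.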