1
Stats 241.3 Probability Theory Summary
2
The Sample Space, S The sample space, S, for a random phenomenon is the set of all possible outcomes.
3
An Event, E An event, E, is any subset of the sample space, S, i.e. any set of outcomes (not necessarily all outcomes) of the random phenomenon.
4
Probability
5
Suppose we are observing a random phenomenon. Let S denote the sample space for the phenomenon, the set of all possible outcomes. An event E is a subset of S. A probability measure P is defined on S by defining for each event E a value P[E] with the following properties: 1. P[E] ≥ 0, for each E. 2. P[S] = 1. 3. If A1, A2, … are events with Ai ∩ Aj = ∅ for all i ≠ j, then P[A1 ∪ A2 ∪ ⋯] = P[A1] + P[A2] + ⋯
6
Finite uniform probability space Many examples fall into this category: 1. Finite number of outcomes. 2. All outcomes are equally likely. 3. P[E] = n(E)/n(S). To handle problems in this case we have to be able to count: count n(E) and n(S).
7
Techniques for counting
8
Basic Rule of counting Suppose we carry out k operations in sequence. Let n1 = the number of ways the first operation can be performed, and ni = the number of ways the i-th operation can be performed once the first (i − 1) operations have been completed, i = 2, 3, …, k. Then N = n1 n2 ⋯ nk = the number of ways the k operations can be performed in sequence.
9
Diagram:
10
Basic Counting Formulae 1. Permutations: How many ways can you order n objects? n! 2. Permutations of size k (< n): How many ways can you choose k objects from n objects in a specific order? n!/(n − k)!
11
3. Combinations of size k (≤ n): A combination of size k chosen from n objects is a subset of size k where the order of selection is irrelevant. How many ways can you choose a combination of size k from n objects (order of selection is irrelevant)? n!/[k!(n − k)!] = (n choose k)
12
Important Notes 1. In combinations ordering is irrelevant: different orderings result in the same combination. 2. In permutations order is relevant: different orderings result in different permutations.
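A quick numeric check of these counting formulas, as a minimal sketch in Python (the values n = 5 and k = 3 are illustrative only):

```python
import math

n, k = 5, 3                 # illustrative values only

# Permutations of all n objects: n!
print(math.factorial(n))    # 120

# Permutations of size k: n!/(n - k)!  (order of selection matters)
print(math.perm(n, k))      # 60

# Combinations of size k: n!/[k!(n - k)!]  (order of selection is irrelevant)
print(math.comb(n, k))      # 10
```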
13
Rules of Probability
14
The additive rule P[A ∪ B] = P[A] + P[B] − P[A ∩ B], and if A ∩ B = ∅, then P[A ∪ B] = P[A] + P[B].
15
The additive rule for more than two events If Ai ∩ Aj = ∅ for all i ≠ j, then P[A1 ∪ A2 ∪ ⋯ ∪ Ak] = P[A1] + P[A2] + ⋯ + P[Ak].
16
The Rule for complements For any event E, P[Ē] = 1 − P[E], where Ē denotes the complement of E.
17
Conditional Probability, Independence and the Multiplicative Rule
18
The conditional probability of A given B is defined to be: P[A | B] = P[A ∩ B] / P[B], provided P[B] > 0.
19
The multiplicative rule of probability P[A ∩ B] = P[A] P[B | A] = P[B] P[A | B], and P[A ∩ B] = P[A] P[B] if A and B are independent. This is the definition of independence.
20
The multiplicative rule for more than two events P[A1 ∩ A2 ∩ ⋯ ∩ Ak] = P[A1] P[A2 | A1] P[A3 | A1 ∩ A2] ⋯ P[Ak | A1 ∩ A2 ∩ ⋯ ∩ Ak−1].
21
Independence for more than 2 events
22
Definition: The set of k events A1, A2, …, Ak are called mutually independent if: P[Ai1 ∩ Ai2 ∩ ⋯ ∩ Aim] = P[Ai1] P[Ai2] ⋯ P[Aim] for every subset {i1, i2, …, im} of {1, 2, …, k}. For example, for k = 3, the events A1, A2, A3 are mutually independent if: P[A1 ∩ A2] = P[A1] P[A2], P[A1 ∩ A3] = P[A1] P[A3], P[A2 ∩ A3] = P[A2] P[A3], and P[A1 ∩ A2 ∩ A3] = P[A1] P[A2] P[A3].
23
Definition: The set of k events A1, A2, …, Ak are called pairwise independent if: P[Ai ∩ Aj] = P[Ai] P[Aj] for all i and j. For example, for k = 3, the events A1, A2, A3 are pairwise independent if: P[A1 ∩ A2] = P[A1] P[A2], P[A1 ∩ A3] = P[A1] P[A3], P[A2 ∩ A3] = P[A2] P[A3]. It is not necessarily true that P[A1 ∩ A2 ∩ A3] = P[A1] P[A2] P[A3].
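The standard illustration of events that are pairwise but not mutually independent uses two fair coin tosses; the following enumeration is an added example, not from the slides:

```python
from itertools import product

# Sample space: two fair coin tosses; each of the 4 outcomes has probability 1/4
S = list(product("HT", repeat=2))
P = lambda E: len(E) / len(S)

A1 = {s for s in S if s[0] == "H"}                       # first toss is a head
A2 = {s for s in S if s[1] == "H"}                       # second toss is a head
A3 = {s for s in S if (s[0] == "H") != (s[1] == "H")}    # exactly one head

# Pairwise independence holds: P[Ai ∩ Aj] = P[Ai] P[Aj] for every pair
print(P(A1 & A2), P(A1) * P(A2))    # 0.25 0.25
print(P(A1 & A3), P(A1) * P(A3))    # 0.25 0.25
print(P(A2 & A3), P(A2) * P(A3))    # 0.25 0.25

# ...but mutual independence fails: P[A1 ∩ A2 ∩ A3] ≠ P[A1] P[A2] P[A3]
print(P(A1 & A2 & A3), P(A1) * P(A2) * P(A3))            # 0.0 0.125
```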
24
Bayes Rule for probability
25
A generalization of Bayes Rule Let A1, A2, …, Ak denote a set of events such that A1 ∪ A2 ∪ ⋯ ∪ Ak = S and Ai ∩ Aj = ∅ for all i ≠ j (i.e. the events form a partition of S). Then P[Ai | B] = P[B | Ai] P[Ai] / (P[B | A1] P[A1] + ⋯ + P[B | Ak] P[Ak]).
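A small numeric illustration of the generalized Bayes Rule (an added example with hypothetical probabilities, not from the slides):

```python
# Hypothetical partition A1, A2, A3 of S with prior probabilities P[Ai]
prior = [0.5, 0.3, 0.2]
# Hypothetical conditional probabilities P[B | Ai]
like = [0.1, 0.4, 0.8]

# Law of total probability: P[B] = sum over i of P[B | Ai] P[Ai]
pB = sum(l * p for l, p in zip(like, prior))

# Bayes Rule: P[Ai | B] = P[B | Ai] P[Ai] / P[B]
posterior = [l * p / pB for l, p in zip(like, prior)]

print(pB)          # 0.33
print(posterior)   # [0.1515..., 0.3636..., 0.4848...] -- sums to 1
```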
26
Random Variables an important concept in probability
27
A random variable, X, is a numerical quantity whose value is determined by a random experiment.
28
Definition – The probability function, p(x), of a random variable, X. For any random variable, X, and any real number, x, we define p(x) = P[X = x] = P[{X = x}], where {X = x} = the set of all outcomes (an event) with X = x. For continuous random variables p(x) = 0 for all values of x.
29
Definition – The cumulative distribution function, F(x), of a random variable, X. For any random variable, X, and any real number, x, we define F(x) = P[X ≤ x] = P[{X ≤ x}], where {X ≤ x} = the set of all outcomes (an event) with X ≤ x.
30
Discrete Random Variables For a discrete random variable X the probability distribution is described by the probability function p(x), which has the following properties: 1. 0 ≤ p(x) ≤ 1. 2. Σx p(x) = 1. 3. P[a ≤ X ≤ b] = Σ p(x), the sum taken over a ≤ x ≤ b.
31
Graph: Discrete Random Variable – plot of the probability function p(x), with the values a and b marked on the x-axis.
32
Continuous random variables For a continuous random variable X the probability distribution is described by the probability density function f(x), which has the following properties: 1. f(x) ≥ 0. 2. ∫ f(x) dx = 1 (the integral over the whole real line). 3. P[a ≤ X ≤ b] = ∫ f(x) dx, the integral taken from a to b.
33
Graph: Continuous Random Variable probability density function, f(x)
34
The distribution function F(x) This is defined for any random variable, X: F(x) = P[X ≤ x]. Properties 1. F(−∞) = 0 and F(∞) = 1. 2. F(x) is non-decreasing (i.e. if x1 < x2 then F(x1) ≤ F(x2)). 3. F(b) − F(a) = P[a < X ≤ b].
35
4. p(x) = P[X = x] = F(x) − F(x−), where F(x−) = the limit of F(u) as u approaches x from the left. 5. If p(x) = 0 for all x (i.e. X is continuous) then F(x) is continuous.
36
6. For discrete random variables, F(x) is a non-decreasing step function with F(x) = Σ p(u), the sum taken over u ≤ x; the size of the jump in F(x) at x is p(x).
37
7. For continuous random variables, F(x) is a non-decreasing continuous function with F(x) = ∫ f(u) du, the integral taken from −∞ to x; f(x) is the slope of F(x) at x. To find the probability density function, f(x), one first finds F(x); then f(x) = F′(x).
38
Some Important Discrete distributions
39
The Bernoulli distribution
40
Suppose that we have an experiment that has two outcomes: 1. Success (S) 2. Failure (F) These terms are used in reliability testing. Suppose that p is the probability of success (S) and q = 1 − p is the probability of failure (F). This experiment is sometimes called a Bernoulli Trial. Let X = 1 if the outcome is S and X = 0 if the outcome is F. Then p(1) = P[X = 1] = p and p(0) = P[X = 0] = q = 1 − p.
41
The probability distribution with probability function p(x) = P[X = x] = p^x q^(1−x) for x = 0, 1 (i.e. p(0) = q = 1 − p and p(1) = p) is called the Bernoulli distribution.
42
The Binomial distribution
43
We observe a Bernoulli trial (S, F) n times. Let X denote the number of successes in the n trials. Then X has a binomial distribution, i.e. p(x) = P[X = x] = (n choose x) p^x q^(n−x), x = 0, 1, 2, …, n, where 1. p = the probability of success (S), and 2. q = 1 − p = the probability of failure (F).
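A minimal sketch of the binomial probability function with illustrative values n = 10 and p = 0.3, computed directly from the formula and cross-checked against scipy.stats:

```python
import math
from scipy.stats import binom

n, p = 10, 0.3          # illustrative values
q = 1 - p

# p(x) = (n choose x) p^x q^(n - x), x = 0, 1, ..., n
pmf = [math.comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

print(sum(pmf))              # 1.0 (the probabilities sum to one)
print(pmf[3])                # P[X = 3] ≈ 0.2668
print(binom.pmf(3, n, p))    # same value from scipy.stats
```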
44
The Poisson distribution Suppose events are occurring randomly and uniformly in time. Let X be the number of events occurring in a fixed period of time. Then X will have a Poisson distribution with parameter λ: p(x) = P[X = x] = (λ^x / x!) e^(−λ), x = 0, 1, 2, …
45
The Geometric distribution Suppose a Bernoulli trial (S, F) is repeated until a success occurs. Let X = the trial on which the first success (S) occurs. The probability function of X is: p(x) = P[X = x] = (1 − p)^(x−1) p = p q^(x−1).
46
The Negative Binomial distribution Suppose a Bernoulli trial (S, F) is repeated until k successes occur. Let X = the trial on which the k-th success (S) occurs. The probability function of X is: p(x) = P[X = x] = (x − 1 choose k − 1) p^k q^(x−k), x = k, k + 1, k + 2, …
47
The Hypergeometric distribution Suppose we have a population containing N objects. Suppose the elements of the population are partitioned into two groups. Let a = the number of elements in group A and let b = the number of elements in the other group (group B). Note N = a + b. Now suppose that n elements are selected from the population at random. Let X denote the number of the selected elements that come from group A. The probability distribution of X is p(x) = P[X = x] = (a choose x)(b choose n − x) / (N choose n).
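The Poisson, geometric, negative binomial and hypergeometric probability functions above are all available in scipy.stats; the sketch below uses illustrative parameter values. Note that scipy's nbinom counts failures before the k-th success (so the trial number X above corresponds to x − k failures), and hypergeom is parameterized by the total N, the group size a, and the sample size n:

```python
from scipy.stats import poisson, geom, nbinom, hypergeom

# Poisson with parameter lam (illustrative rate of 2.5 events per period)
lam = 2.5
print(poisson.pmf(3, lam))            # P[X = 3]

# Geometric: X = trial of the first success, with p = 0.2 (illustrative)
p = 0.2
print(geom.pmf(4, p))                 # P[X = 4] = (1 - p)^3 p = 0.1024

# Negative binomial: X = trial of the k-th success.
# scipy's nbinom counts failures before the k-th success, so P[X = x] = nbinom.pmf(x - k, k, p).
k, x = 3, 7
print(nbinom.pmf(x - k, k, p))        # P[X = 7] when k = 3

# Hypergeometric: a objects in group A, b in group B, a sample of size n
a, b, n = 6, 14, 5                    # illustrative; N = a + b = 20
print(hypergeom.pmf(2, a + b, a, n))  # P[X = 2] elements of group A in the sample
```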
48
Continuous Distributions
49
Continuous random variables For a continuous random variable X the probability distribution is described by the probability density function f(x), which has the following properties: 1. f(x) ≥ 0. 2. ∫ f(x) dx = 1 (the integral over the whole real line). 3. P[a ≤ X ≤ b] = ∫ f(x) dx, the integral taken from a to b.
50
Graph: Continuous Random Variable probability density function, f(x)
51
Continuous Distributions The Uniform distribution from a to b: f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise.
52
The Normal distribution (mean μ, standard deviation σ): f(x) = [1/(σ√(2π))] e^(−(x − μ)²/(2σ²)), −∞ < x < ∞.
53
The Exponential distribution with parameter λ: f(x) = λ e^(−λx) for x ≥ 0, and f(x) = 0 for x < 0.
54
The Weibull distribution A model for the lifetime of objects that do age.
55
The Weibull distribution with parameters α and β.
56
The Weibull density, f(x), plotted for the parameter settings (α = 0.5, β = 2), (α = 0.7, β = 2) and (α = 0.9, β = 2).
57
The Gamma distribution An important family of distributions
58
The Gamma distribution Let the continuous random variable X have density function: f(x) = [λ^α / Γ(α)] x^(α−1) e^(−λx) for x ≥ 0, and f(x) = 0 for x < 0. Then X is said to have a Gamma distribution with parameters α and λ.
59
Graph: the gamma distribution, plotted for three parameter combinations: (2, 0.9), (2, 0.6) and (3, 0.6).
60
Comments 1. The set of gamma distributions is a family of distributions (parameterized by α and λ). 2. Contained within this family are other distributions: a. The Exponential distribution – in the case α = 1, the gamma distribution becomes the exponential distribution with parameter λ. The exponential distribution arises if we are measuring the lifetime, X, of an object that does not age. It is also used as a distribution for waiting times between events occurring uniformly in time. b. The Chi-square distribution – in the case α = ν/2 and λ = ½, the gamma distribution becomes the chi-square (χ²) distribution with ν degrees of freedom. Later we will see that a sum of squares of independent standard normal variates has a chi-square distribution, with degrees of freedom = the number of independent terms in the sum of squares.
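A small numeric check of these special cases, as a sketch using scipy.stats (whose gamma distribution uses a shape parameter a = α and scale = 1/λ):

```python
import numpy as np
from scipy.stats import gamma, expon, chi2

x = np.linspace(0.5, 5, 4)

# Exponential with parameter lam is gamma with alpha = 1 (shape a = 1, scale = 1/lam)
lam = 0.7                          # illustrative rate
print(gamma.pdf(x, a=1, scale=1/lam))
print(expon.pdf(x, scale=1/lam))   # identical values

# Chi-square with nu degrees of freedom is gamma with alpha = nu/2, lam = 1/2 (scale = 2)
nu = 4
print(gamma.pdf(x, a=nu/2, scale=2))
print(chi2.pdf(x, df=nu))          # identical values
```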
61
Expectation
62
Let X denote a discrete random variable with probability function p(x) (probability density function f(x) if X is continuous); then the expected value of X, E(X), is defined to be: E(X) = Σx x p(x), and if X is continuous with probability density function f(x), E(X) = ∫ x f(x) dx (the integral over the whole real line).
63
Expectation of functions Let X denote a discrete random variable with probability function p(x); then the expected value of g(X), E[g(X)], is defined to be: E[g(X)] = Σx g(x) p(x), and if X is continuous with probability density function f(x), E[g(X)] = ∫ g(x) f(x) dx.
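A small added example: E[X] and E[g(X)] with g(x) = x² for a fair six-sided die.

```python
# Fair die: p(x) = 1/6 for x = 1, 2, ..., 6
support = range(1, 7)
p = {x: 1/6 for x in support}

EX = sum(x * p[x] for x in support)        # E[X] = sum of x p(x)
EX2 = sum(x**2 * p[x] for x in support)    # E[X^2] = sum of x^2 p(x)

print(EX)    # 3.5
print(EX2)   # 15.1666...; note E[X^2] is not (E[X])^2 = 12.25
```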
64
Moments of a Random Variable
65
The k-th moment of X: μk = E(X^k). The first moment of X, μ = μ1 = E(X), is the center of gravity of the distribution of X. The higher moments give different information regarding the distribution of X.
66
The k-th central moment of X: E[(X − μ)^k], where μ = μ1 = E(X) is the first moment.
67
Moment generating functions
68
Definition Let X denote a random variable. Then the moment generating function of X, mX(t), is defined by: mX(t) = E[e^(tX)].
69
Properties 1. mX(0) = 1. 2. The k-th derivative of mX(t) at t = 0 equals the k-th moment: mX^(k)(0) = E(X^k) = μk. 3. mX(t) = 1 + μ1 t + μ2 t²/2! + μ3 t³/3! + ⋯
70
4. Let X be a random variable with moment generating function mX(t), and let Y = bX + a. Then mY(t) = m(bX+a)(t) = E(e^((bX+a)t)) = e^(at) E(e^(X(bt))) = e^(at) mX(bt). 5. Let X and Y be two independent random variables with moment generating functions mX(t) and mY(t). Then mX+Y(t) = E(e^((X+Y)t)) = E(e^(Xt) e^(Yt)) = E(e^(Xt)) E(e^(Yt)) = mX(t) mY(t).
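A symbolic sketch (an added example using sympy) checking properties 1, 2 and 5 for the Bernoulli m.g.f. mX(t) = q + p e^t; property 5 shows that the sum of n independent Bernoulli(p) variables has the Binomial(n, p) m.g.f.:

```python
import sympy as sp

t, p = sp.symbols('t p', positive=True)
q = 1 - p

# M.G.F. of a Bernoulli(p) random variable: m_X(t) = E[e^{tX}] = q + p e^t
mX = q + p * sp.exp(t)

# Property 1: m_X(0) = 1
print(sp.simplify(mX.subs(t, 0)))          # 1

# Property 2: derivatives at t = 0 give the moments
print(sp.diff(mX, t).subs(t, 0))           # p = E[X]
print(sp.diff(mX, t, 2).subs(t, 0))        # p = E[X^2]

# Property 5: for independent X1, ..., Xn, m_{X1+...+Xn}(t) = [m_X(t)]^n,
# which equals the m.g.f. of a Binomial(n, p) random variable.
n = 3
mS = sp.expand(mX**n)
binom_mgf = sp.expand(sum(sp.binomial(n, k) * p**k * q**(n - k) * sp.exp(t * k)
                          for k in range(n + 1)))
print(sp.simplify(mS - binom_mgf))         # 0
```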
71
6. Let X and Y be two random variables with moment generating functions mX(t) and mY(t) and distribution functions FX(x) and FY(y) respectively. If mX(t) = mY(t), then FX(x) = FY(x). This ensures that the distribution of a random variable can be identified by its moment generating function.
72
M. G. F.’s - Continuous distributions
73
M. G. F.’s - Discrete distributions
74
Note: The distribution of a random variable X can be described by: 1. its probability function p(x) (probability density function f(x) if X is continuous), 2. its cumulative distribution function F(x), or 3. its moment generating function mX(t).
77
Jointly distributed Random variables Multivariate distributions
78
Discrete Random Variables
79
The joint probability function; p(x,y) = P[X = x, Y = y]
80
Continuous Random Variables
81
Definition: Two random variables are said to have joint probability density function f(x, y) if 1. f(x, y) ≥ 0, 2. ∫∫ f(x, y) dx dy = 1 (the integral over the whole plane), and 3. P[(X, Y) ∈ A] = the integral of f(x, y) over the region A.
82
Marginal and conditional distributions
83
Marginal Distributions (Discrete case): Let X and Y denote two random variables with joint probability function p(x, y); then the marginal probability function of X is pX(x) = Σy p(x, y), and the marginal probability function of Y is pY(y) = Σx p(x, y).
84
Marginal Distributions (Continuous case): Let X and Y denote two random variables with joint probability density function f(x, y); then the marginal density of X is fX(x) = ∫ f(x, y) dy, and the marginal density of Y is fY(y) = ∫ f(x, y) dx.
85
Conditional Distributions (Discrete Case): Let X and Y denote two random variables with joint probability function p(x, y) and marginal probability functions pX(x), pY(y); then the conditional probability function of Y given X = x is pY|X(y|x) = p(x, y)/pX(x), and the conditional probability function of X given Y = y is pX|Y(x|y) = p(x, y)/pY(y).
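A tiny numeric sketch of the marginal and conditional formulas for a hypothetical 2 × 3 joint probability table (an added example):

```python
import numpy as np

# Hypothetical joint probability function p(x, y); rows index x, columns index y
p = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.15, 0.20]])
print(p.sum())               # 1.0 -- a valid joint probability function

# Marginals: p_X(x) = sum over y of p(x, y);  p_Y(y) = sum over x of p(x, y)
pX = p.sum(axis=1)
pY = p.sum(axis=0)
print(pX, pY)                # [0.4 0.6]  [0.35 0.35 0.3]

# Conditional of Y given X = x: p(y | x) = p(x, y) / p_X(x); each row sums to 1
p_Y_given_X = p / pX[:, None]
print(p_Y_given_X)
```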
86
Conditional Distributions (Continuous Case): Let X and Y denote two random variables with joint probability density function f(x, y) and marginal densities fX(x), fY(y); then the conditional density of Y given X = x is fY|X(y|x) = f(x, y)/fX(x), and the conditional density of X given Y = y is fX|Y(x|y) = f(x, y)/fY(y).
87
The bivariate Normal distribution
88
Let f(x1, x2) = [1/(2π σ1 σ2 √(1 − ρ²))] exp{−Q/2}, where Q = [((x1 − μ1)/σ1)² − 2ρ((x1 − μ1)/σ1)((x2 − μ2)/σ2) + ((x2 − μ2)/σ2)²] / (1 − ρ²). This distribution is called the bivariate Normal distribution. The parameters are μ1, μ2, σ1, σ2 and ρ.
89
Surface Plots of the bivariate Normal distribution
90
Marginal distributions 1. The marginal distribution of x1 is Normal with mean μ1 and standard deviation σ1. 2. The marginal distribution of x2 is Normal with mean μ2 and standard deviation σ2.
91
Conditional distributions 1. The conditional distribution of x1 given x2 is Normal with mean μ1 + ρ(σ1/σ2)(x2 − μ2) and standard deviation σ1√(1 − ρ²). 2. The conditional distribution of x2 given x1 is Normal with mean μ2 + ρ(σ2/σ1)(x1 − μ1) and standard deviation σ2√(1 − ρ²).
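A simulation sketch of these conditional-distribution formulas (an added example with illustrative parameter values):

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.6     # illustrative parameters

cov = [[s1**2, rho * s1 * s2],
       [rho * s1 * s2, s2**2]]
x = rng.multivariate_normal([mu1, mu2], cov, size=200_000)

# Keep draws whose x2 coordinate is close to a chosen value and inspect x1 among them
x2_val = 0.5
sel = x[np.abs(x[:, 1] - x2_val) < 0.05, 0]

# Theoretical conditional mean and standard deviation of x1 given x2 = x2_val
cond_mean = mu1 + rho * (s1 / s2) * (x2_val - mu2)    # = 1.8 here
cond_sd = s1 * np.sqrt(1 - rho**2)                    # = 1.6 here

print(sel.mean(), cond_mean)
print(sel.std(), cond_sd)
```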
92
Independence
93
Definition: Two random variables X and Y are defined to be independent if p(x, y) = pX(x) pY(y) for all x and y (if X and Y are discrete), or f(x, y) = fX(x) fY(y) for all x and y (if X and Y are continuous).
94
Multivariate distributions (k ≥ 2)
95
Definition Let X1, X2, …, Xn denote n discrete random variables; then p(x1, x2, …, xn) is the joint probability function of X1, X2, …, Xn if 1. 0 ≤ p(x1, …, xn) ≤ 1, 2. the sum of p(x1, …, xn) over all values of (x1, …, xn) is 1, and 3. P[(X1, …, Xn) ∈ A] = the sum of p(x1, …, xn) over the points in A.
96
Definition Let X1, X2, …, Xk denote k continuous random variables; then f(x1, x2, …, xk) is the joint density function of X1, X2, …, Xk if 1. f(x1, …, xk) ≥ 0, 2. the integral of f(x1, …, xk) over all of k-dimensional space is 1, and 3. P[(X1, …, Xk) ∈ A] = the integral of f(x1, …, xk) over the region A.
97
The Multinomial distribution Suppose that we observe an experiment that has k possible outcomes {O1, O2, …, Ok} independently n times. Let p1, p2, …, pk denote the probabilities of O1, O2, …, Ok respectively. Let Xi denote the number of times that outcome Oi occurs in the n repetitions of the experiment.
98
The joint probability function of X1, X2, …, Xk is p(x1, x2, …, xk) = [n!/(x1! x2! ⋯ xk!)] p1^x1 p2^x2 ⋯ pk^xk, where x1 + x2 + ⋯ + xk = n; this distribution is called the Multinomial distribution.
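This probability function is available as scipy.stats.multinomial; a minimal sketch with illustrative values (n = 10 repetitions, k = 3 outcomes):

```python
import math
from scipy.stats import multinomial

n = 10
p = [0.2, 0.3, 0.5]     # illustrative outcome probabilities (sum to 1)
x = [2, 3, 5]           # observed counts of each outcome (sum to n)

# Direct use of the formula n!/(x1! x2! x3!) p1^x1 p2^x2 p3^x3
direct = (math.factorial(n)
          / math.prod(math.factorial(xi) for xi in x)
          * math.prod(pi**xi for pi, xi in zip(p, x)))

print(direct)                       # ≈ 0.08505
print(multinomial.pmf(x, n, p))     # same value from scipy.stats
```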
99
The Multivariate Normal distribution Recall the univariate normal distribution, f(x) = [1/(σ√(2π))] e^(−(x − μ)²/(2σ²)), and the bivariate normal distribution given earlier.
100
The k-variate Normal distribution f(x1, …, xk) = (2π)^(−k/2) |Σ|^(−1/2) exp{−(x − μ)ᵀ Σ⁻¹ (x − μ)/2}, where x = (x1, …, xk)ᵀ, μ is the vector of means and Σ is the covariance matrix.
101
Marginal distributions
102
Definition Let X1, X2, …, Xq, Xq+1, …, Xk denote k discrete random variables with joint probability function p(x1, x2, …, xq, xq+1, …, xk); then the marginal joint probability function of X1, X2, …, Xq is p12…q(x1, …, xq) = the sum of p(x1, …, xk) over all values of xq+1, …, xk.
103
Definition Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the marginal joint density function of X1, X2, …, Xq is f12…q(x1, …, xq) = the integral of f(x1, …, xk) over xq+1, …, xk.
104
Conditional distributions
105
Definition Let X1, X2, …, Xq, Xq+1, …, Xk denote k discrete random variables with joint probability function p(x1, x2, …, xq, xq+1, …, xk); then the conditional joint probability function of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is p(x1, …, xq | xq+1, …, xk) = p(x1, …, xk) / pq+1…k(xq+1, …, xk), where pq+1…k is the marginal joint probability function of Xq+1, …, Xk.
106
Definition Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the conditional joint density function of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is f(x1, …, xq | xq+1, …, xk) = f(x1, …, xk) / fq+1…k(xq+1, …, xk), where fq+1…k is the marginal joint density function of Xq+1, …, Xk.
107
Definition – Independence of sets of vectors Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the variables X1, X2, …, Xq are independent of Xq+1, …, Xk if f(x1, …, xk) = f12…q(x1, …, xq) fq+1…k(xq+1, …, xk). A similar definition holds for discrete random variables.
108
Definition – Mutual Independence Let X1, X2, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xk); then the variables X1, X2, …, Xk are called mutually independent if f(x1, x2, …, xk) = f1(x1) f2(x2) ⋯ fk(xk). A similar definition holds for discrete random variables.
109
Expectation for multivariate distributions
110
Definition Let X1, X2, …, Xn denote n jointly distributed random variables with joint density function f(x1, x2, …, xn); then E[g(X1, …, Xn)] = the integral of g(x1, …, xn) f(x1, …, xn) over all values of (x1, …, xn).
111
Some Rules for Expectation
112
1. E[Xi] can be computed either from the joint density, as the integral of xi f(x1, …, xn), or from the marginal density of Xi, as the integral of xi fi(xi). Thus you can calculate E[Xi] either from the joint distribution of X1, …, Xn or from the marginal distribution of Xi. 2. The Linearity property: E[aX + bY] = a E[X] + b E[Y].
113
3. (The Multiplicative property) Suppose X1, …, Xq are independent of Xq+1, …, Xk; then E[g(X1, …, Xq) h(Xq+1, …, Xk)] = E[g(X1, …, Xq)] E[h(Xq+1, …, Xk)]. In the simple case when k = 2: E[XY] = E[X] E[Y] if X and Y are independent.
114
Some Rules for Variance
115
Ex: Tchebychev's inequality: P[|X − μ| ≥ kσ] ≤ 1/k², i.e. the probability that X differs from its mean μ by at least k standard deviations is at most 1/k².
116
Note: If X and Y are independent, then Cov(X, Y) = 0, and hence Var(X + Y) = Var(X) + Var(Y).
117
The correlation coefficient ρXY = Cov(X, Y)/(σX σY). Properties: 1. −1 ≤ ρXY ≤ 1. 2. |ρXY| = 1 if there exist a and b such that P[Y = bX + a] = 1, where ρXY = +1 if b > 0 and ρXY = −1 if b < 0.
118
Some other properties of variance
119
4. Variance: Multiplicative Rule for independent random variables. Suppose that X and Y are independent random variables; then: Var(XY) = σX²σY² + μX²σY² + μY²σX², where μX = E[X], μY = E[Y], σX² = Var(X) and σY² = Var(Y).
120
Mean and Variance of averages Let x̄ = (x1 + x2 + ⋯ + xn)/n, where X1, …, Xn are n mutually independent random variables each having mean μ and standard deviation σ (variance σ²). Then E[x̄] = μ and Var(x̄) = σ²/n (so the standard deviation of x̄ is σ/√n).
121
The Law of Large Numbers Let x̄ = (x1 + x2 + ⋯ + xn)/n, where X1, …, Xn are n mutually independent random variables each having mean μ. Then for any δ > 0 (no matter how small), P[|x̄ − μ| < δ] → 1 as n → ∞.
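A quick simulation illustrating both of these results (an added sketch using an exponential population with mean μ = 2, so σ² = 4):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, delta = 2.0, 0.2          # population mean and a small tolerance (illustrative)

for n in (10, 100, 10_000):
    # 5,000 sample means, each based on n observations
    xbar = rng.exponential(scale=mu, size=(5_000, n)).mean(axis=1)
    # E[xbar] stays at mu, Var(xbar) = sigma^2 / n shrinks,
    # and P[|xbar - mu| < delta] approaches 1 as n grows
    print(n, xbar.mean().round(3), xbar.var().round(4),
          np.mean(np.abs(xbar - mu) < delta).round(3))
```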
122
Conditional Expectation:
123
Definition Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the conditional joint density function of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is f(x1, …, xq | xq+1, …, xk) = f(x1, …, xk) / fq+1…k(xq+1, …, xk).
124
Definition Let U = h(X1, X2, …, Xq, Xq+1, …, Xk); then the Conditional Expectation of U given Xq+1 = xq+1, …, Xk = xk is E[U | xq+1, …, xk] = the integral of h(x1, …, xk) f(x1, …, xq | xq+1, …, xk) over x1, …, xq. Note this will be a function of xq+1, …, xk.
125
A very useful rule Let (x1, x2, …, xq, y1, y2, …, ym) = (x, y) denote q + m random variables. Then E[g(x, y)] = Ey[ E[g(x, y) | y] ], i.e. the expectation of g can be computed by first taking the conditional expectation given y and then averaging over y.
126
Functions of Random Variables
127
Methods for determining the distribution of functions of Random Variables 1.Distribution function method 2.Moment generating function method 3.Transformation method
128
Distribution function method Let X, Y, Z …. have joint density f(x,y,z, …) Let W = h( X, Y, Z, …) First step Find the distribution function of W G(w) = P[W ≤ w] = P[h( X, Y, Z, …) ≤ w] Second step Find the density function of W g(w) = G'(w).
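A tiny symbolic sketch of the two steps (an added example, not from the slides): the density of W = X² when X is Uniform(0, 1).

```python
import sympy as sp

w = sp.symbols('w', positive=True)    # consider 0 < w < 1
x = sp.symbols('x', positive=True)

# First step: G(w) = P[W <= w] = P[X^2 <= w] = P[X <= sqrt(w)]
#             = integral of f(x) = 1 from 0 to sqrt(w)
G = sp.integrate(sp.Integer(1), (x, 0, sp.sqrt(w)))
print(G)          # sqrt(w)

# Second step: g(w) = G'(w)
g = sp.diff(G, w)
print(g)          # 1/(2*sqrt(w)), for 0 < w < 1
```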
129
Use of moment generating functions 1.Using the moment generating functions of X, Y, Z, …determine the moment generating function of W = h(X, Y, Z, …). 2.Identify the distribution of W from its moment generating function This procedure works well for sums, linear combinations, averages etc.
130
Let x1, x2, … denote a sequence of independent random variables. Sums: let S = x1 + x2 + ⋯ + xn; then mS(t) = mx1(t) mx2(t) ⋯ mxn(t). Linear Combinations: let L = a1x1 + a2x2 + ⋯ + anxn; then mL(t) = mx1(a1t) mx2(a2t) ⋯ mxn(ant).
131
Arithmetic Means Let x1, x2, … denote a sequence of independent random variables coming from a distribution with moment generating function m(t), and let x̄ = (x1 + ⋯ + xn)/n. Then mx̄(t) = [m(t/n)]^n.
132
The Transformation Method Theorem: Let X denote a random variable with probability density function f(x) and let U = h(X). Assume that h(x) is either strictly increasing (or strictly decreasing); then the probability density of U is: g(u) = f(h⁻¹(u)) |d h⁻¹(u)/du|, i.e. g(u) = f(x) |dx/du| with x expressed in terms of u.
133
The Transformation Method (many variables) Theorem: Let x1, x2, …, xn denote random variables with joint probability density function f(x1, x2, …, xn). Let u1 = h1(x1, x2, …, xn), u2 = h2(x1, x2, …, xn), …, un = hn(x1, x2, …, xn) define an invertible transformation from the x's to the u's.
134
Then the joint probability density function of u1, u2, …, un is given by: g(u1, …, un) = f(x1, …, xn) |J|, where the xi are expressed in terms of u1, …, un and J = ∂(x1, …, xn)/∂(u1, …, un) is the Jacobian of the inverse transformation.
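A symbolic sketch of the many-variable transformation method (an added example): U1 = X1 + X2, U2 = X1 − X2 for two independent standard normals.

```python
import sympy as sp

x1, x2, u1, u2 = sp.symbols('x1 x2 u1 u2', real=True)

# Joint density of two independent standard normal variables
f = sp.exp(-(x1**2 + x2**2) / 2) / (2 * sp.pi)

# Transformation u1 = x1 + x2, u2 = x1 - x2; its inverse:
inv = {x1: (u1 + u2) / 2, x2: (u1 - u2) / 2}

# Jacobian of the inverse transformation (the x's as functions of the u's)
J = sp.Matrix([[sp.diff(inv[x1], u1), sp.diff(inv[x1], u2)],
               [sp.diff(inv[x2], u1), sp.diff(inv[x2], u2)]]).det()

# g(u1, u2) = f(x1(u), x2(u)) |J|
g = sp.simplify(f.subs(inv) * sp.Abs(J))
print(g)    # exp(-(u1**2 + u2**2)/4) / (4*pi): U1 and U2 are independent N(0, 2)
```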
135
Some important results Distribution of functions of random variables
136
The method used to derive these results will be indicated by: 1. DF- Distribution Function Method. 2. MGF- Moment generating function method 3. TF- Transformation method
137
Student's t distribution Let Z and U be two independent random variables with: 1. Z having a Standard Normal distribution, and 2. U having a χ² distribution with ν degrees of freedom. Then the distribution of t = Z/√(U/ν) is the Student's t distribution with ν degrees of freedom. (DF)
138
The Chi-square distribution Let Z1, Z2, …, Zν be ν independent random variables having a Standard Normal distribution; then U = Z1² + Z2² + ⋯ + Zν² has a χ² distribution with ν degrees of freedom. (DF for ν = 1, MGF for ν > 1)
139
Distribution of the sample mean Let x1, x2, …, xn denote a sample from the normal distribution with mean μ and variance σ². Then x̄ = (x1 + ⋯ + xn)/n has a Normal distribution with mean μ and variance σ²/n. (MGF)
140
The Central Limit Theorem If x1, x2, …, xn is a sample from a distribution with mean μ and standard deviation σ, then, if n is large, x̄ has approximately a normal distribution with mean μ and variance σ²/n. (MGF)
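A simulation sketch of the theorem (an added example with a skewed exponential population, where μ = σ = 3, and n = 50):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 3.0, 3.0, 50          # exponential(mean 3) has mu = sigma = 3

# 10,000 sample means, each from a sample of size n
xbar = rng.exponential(scale=mu, size=(10_000, n)).mean(axis=1)

# The theorem says xbar is approximately Normal(mu, sigma^2 / n)
print(xbar.mean(), mu)                   # both near 3.0
print(xbar.std(), sigma / np.sqrt(n))    # both near 0.424

# Standardized means behave like a standard normal variable
z = (xbar - mu) / (sigma / np.sqrt(n))
print(np.mean(np.abs(z) < 1.96))         # close to 0.95
```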
141
Distribution of the sample variance Let x1, x2, …, xn denote a sample from the normal distribution with mean μ and variance σ², and let s² = Σ(xi − x̄)²/(n − 1). Then (n − 1)s²/σ² = Σ(xi − x̄)²/σ² has a χ² distribution with ν = n − 1 degrees of freedom. (MGF)
142
Distribution of sums of Gamma R.V.'s Let X1, X2, …, Xn denote n independent random variables each having a gamma distribution with parameters (λ, αi), i = 1, 2, …, n. Then W = X1 + X2 + ⋯ + Xn has a gamma distribution with parameters (λ, α1 + α2 + ⋯ + αn). Distribution of a multiple of a Gamma R.V. Suppose that X is a random variable having a gamma distribution with parameters (λ, α). Then W = aX has a gamma distribution with parameters (λ/a, α). (MGF)
143
Distribution of sums of Binomial R.V.'s Let X1, X2, …, Xk denote k independent random variables each having a binomial distribution with parameters (p, ni), i = 1, 2, …, k. Then W = X1 + X2 + ⋯ + Xk has a binomial distribution with parameters (p, n1 + n2 + ⋯ + nk). Distribution of sums of Negative Binomial R.V.'s Let X1, X2, …, Xn denote n independent random variables each having a negative binomial distribution with parameters (p, ki), i = 1, 2, …, n. Then W = X1 + X2 + ⋯ + Xn has a negative binomial distribution with parameters (p, k1 + k2 + ⋯ + kn). (MGF)