Bioinformatics Tea Seminar: Statistical Methods in Bioinformatics
Chapter 1 Probability Theory (i): One Random Variable
06/05/2008 Jae Hyun Kim
Content
- Discrete Random Variable
- Discrete Probability Distributions
- Probability Generating Functions
- Continuous Random Variable
- Probability Density Functions
- Moment Generating Functions
jaekim@ku.edu
Discrete Random Variable
- A numerical quantity that, in an experiment involving some degree of randomness, takes one value from a discrete set of possible values.
- Sample space: the set of all outcomes of an experiment (or observation). For example:
  - Flip a coin: {H, T}
  - Toss a die: {1, 2, 3, 4, 5, 6}
  - Sum of two dice: {2, 3, …, 12}
- Event: any subset of the sample space.
Discrete Probability Distributions
- The probability distribution of a discrete random variable: the set of values the random variable can take, together with their associated probabilities.
- Example: Y = total number of heads when a coin is flipped twice.
- Probability distribution function: P_Y(y) = P(Y = y)
- Cumulative distribution function: F_Y(y) = P(Y ≤ y)
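As a minimal sketch of the two-flip example, the Python below enumerates the sample space and tabulates the distribution of Y. The coin is assumed to be fair (the slide does not say so), and the helper names `pmf` and `cdf` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips; every outcome is assumed equally likely (fair coin).
outcomes = list(product("HT", repeat=2))

def pmf(y):
    """P(Y = y), where Y is the number of heads in the two flips."""
    favourable = sum(1 for outcome in outcomes if outcome.count("H") == y)
    return Fraction(favourable, len(outcomes))

def cdf(y):
    """F(y) = P(Y <= y), obtained by summing the pmf."""
    return sum((pmf(k) for k in range(y + 1)), Fraction(0))

# Print y, P(Y = y) and P(Y <= y) for each possible value of Y.
for y in range(3):
    print(y, pmf(y), cdf(y))
```

Exact fractions are used so that the probabilities sum to exactly 1.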
One Bernoulli Trial
- A Bernoulli trial: a single trial with two possible outcomes, “success” or “failure”.
- Probability of success = p
The Binomial Distribution
- The binomial random variable: the number of successes in a fixed number n of independent Bernoulli trials with the same probability of success on each trial.
- Requirements:
  - Each trial must result in one of two possible outcomes.
  - The various trials must be independent.
  - The probability of success must be the same on all trials.
  - The number n of trials must be fixed in advance.
Bernoulli Trial and Binomial Distribution
- Comments:
  - A single Bernoulli trial is the special case n = 1 of the binomial distribution.
  - The probability p is often an unknown parameter.
  - There is no simple formula for the cumulative distribution function of the binomial distribution.
  - There is no unique “binomial distribution,” but rather a family of distributions indexed by n and p.
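The comments above can be illustrated with a short sketch. It uses the standard binomial probability C(n, y) p^y (1 − p)^(n − y); the function names and the numeric values of n and p are hypothetical choices for illustration.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ Binomial(n, p): C(n, y) p^y (1 - p)^(n - y)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

def binomial_cdf(y, n, p):
    """P(Y <= y); with no simple closed form, sum the pmf term by term."""
    return sum(binomial_pmf(k, n, p) for k in range(y + 1))

# n = 1 recovers a single Bernoulli trial: P(success) = p, P(failure) = 1 - p.
print(binomial_pmf(1, 1, 0.3), binomial_pmf(0, 1, 0.3))

# Different (n, p) pairs give different members of the binomial family.
print(binomial_cdf(2, 10, 0.2), binomial_cdf(2, 10, 0.5))
```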
The Hypergeometric Distribution
- N objects (n red, N − n white).
- m objects are taken at random, without replacement.
- Y = number of red objects taken.
- Biological example: N lab mice (n male, N − n female), of which m carry a mutation. The number Y of mutant males has a hypergeometric distribution.
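A minimal sketch of the hypergeometric probabilities, using the standard formula P(Y = y) = C(n, y) C(N − n, m − y) / C(N, m). The function name and the mouse counts below are hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(y, N, n, m):
    """P(Y = y): y red objects among m drawn without replacement
    from N objects of which n are red."""
    if y < 0 or y > m or y > n or m - y > N - n:
        return Fraction(0)  # impossible counts have probability zero
    return Fraction(comb(n, y) * comb(N - n, m - y), comb(N, m))

# Hypothetical numbers: N = 20 mice, n = 8 males, m = 5 mutants.
N, n, m = 20, 8, 5
dist = {y: hypergeom_pmf(y, N, n, m) for y in range(m + 1)}
mean = sum(y * p for y, p in dist.items())

print(dist)
print(mean)  # equals m * n / N for the hypergeometric distribution
```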
The Uniform/Geometric Distribution
- The uniform distribution: every value in the range has the same probability.
- The geometric distribution: the number Y of successful Bernoulli trials before, but not including, the first failure.
- Cumulative distribution function: F_Y(y) = 1 − p^(y+1), y = 0, 1, 2, …
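The geometric distribution can be sketched as follows, assuming p is the probability of success on each trial (as on the Bernoulli slide), so that P(Y = y) = p^y (1 − p). The function names and the value of p are illustrative.

```python
def geometric_pmf(y, p):
    """P(Y = y): y successes, each with probability p, then one failure."""
    return p**y * (1 - p)

def geometric_cdf(y, p):
    """P(Y <= y) = 1 - p^(y + 1): the first y + 1 trials are not all successes."""
    return 1 - p ** (y + 1)

p = 0.8  # hypothetical success probability
running_total = 0.0
for y in range(5):
    running_total += geometric_pmf(y, p)
    # The running sum of the pmf should agree with the closed-form cdf.
    print(y, geometric_pmf(y, p), running_total, geometric_cdf(y, p))
```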
The Poisson Distribution
- Events occur randomly in time or space.
- For example, the number of phone calls arriving in a fixed time interval (the time between calls is exponentially distributed).
- Approximation to the binomial distribution when:
  - n is large
  - p is small
  - np is moderate
- Binomial(n, p, x) ≈ Poisson(λ, x) with λ = np
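A quick numerical check of the approximation, as a sketch: the values of n and p are hypothetical, chosen so that n is large, p is small, and np is moderate. The Poisson probability used is the standard e^(−λ) λ^x / x!.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 1000, 0.003   # hypothetical values: n large, p small
lam = n * p          # np is about 3, which is moderate

# Largest pointwise difference between the two pmfs over x = 0..20.
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(21))
print(max_gap)
```

The gap shrinks further as p decreases with np held fixed.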
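Mean
- Mean / expected value: μ = E[Y] = Σ_y y P_Y(y)
- Expected value of g(Y): E[g(Y)] = Σ_y g(y) P_Y(y)
- Linearity property: E[aY + b] = a E[Y] + b
- Linearity property, in general: E[Σ_i a_i g_i(Y)] = Σ_i a_i E[g_i(Y)]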
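As a sketch of the expected value and its linearity, the code below reuses the two-flip example, with the coin assumed fair. The helper name `expect` and the constants 3 and 2 are illustrative.

```python
from fractions import Fraction

# Y = number of heads in two flips of a coin assumed to be fair.
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expect(g, dist):
    """E[g(Y)] = sum over y of g(y) * P(Y = y)."""
    return sum(g(y) * p for y, p in dist.items())

mean = expect(lambda y: y, dist)             # E[Y]
linear = expect(lambda y: 3 * y + 2, dist)   # E[3Y + 2]
second = expect(lambda y: y * y, dist)       # E[Y^2]

print(mean, linear, 3 * mean + 2)  # linearity: E[3Y + 2] equals 3 E[Y] + 2
print(second, mean**2)             # for the non-linear g(y) = y^2 the two values differ
```

Linearity applies to linear combinations only; for g(y) = y² the values E[g(Y)] and g(E[Y]) are not equal in this example.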
Variance
- Definition: σ² = Var(Y) = E[(Y − μ)²] = E[Y²] − μ²
Summary
General Moments
- Moment: E[Y^r] is the r-th moment of the probability distribution about zero.
  - Mean: first moment (r = 1)
- E[(Y − μ)^r] is the r-th moment about the mean.
  - Variance: r = 2
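The two kinds of moment can be computed with one small helper. This is a sketch for a fair six-sided die (an assumed example); the name `moment` and its `about` parameter are illustrative.

```python
from fractions import Fraction

# A fair six-sided die (assumed example distribution).
die = {y: Fraction(1, 6) for y in range(1, 7)}

def moment(dist, r, about=0):
    """r-th moment about the point `about`: E[(Y - about)^r]."""
    return sum(p * (y - about) ** r for y, p in dist.items())

mean = moment(die, 1)                  # first moment about zero
variance = moment(die, 2, about=mean)  # second moment about the mean

print(mean, variance)
print(moment(die, 2) - mean**2)        # same variance via E[Y^2] - mean^2
```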
Probability-Generating Function
- PGF: P(t) = E[t^Y] = Σ_y P_Y(y) t^y
- Used to derive moments:
  - Mean: μ = P′(1)
  - Variance: σ² = P″(1) + μ − μ²
- If two random variables X and Y have identical probability generating functions, they are identically distributed.
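A minimal sketch of deriving moments from a pgf, using the standard identities mean = P′(1) and variance = P″(1) + P′(1) − P′(1)². The pgf is stored as a list of polynomial coefficients; the binomial parameters and helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

def derivative(coeffs):
    """Coefficients of P'(t), given coefficients of P(t) = sum_k coeffs[k] t^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def at_one(coeffs):
    """Evaluate a polynomial at t = 1, i.e. sum its coefficients."""
    return sum(coeffs, Fraction(0))

# pgf coefficients of Binomial(n = 4, p = 1/2): P(Y = y) multiplies t^y.
n, p = 4, Fraction(1, 2)
pgf = [comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(n + 1)]

d1 = at_one(derivative(pgf))               # P'(1)  = E[Y]
d2 = at_one(derivative(derivative(pgf)))   # P''(1) = E[Y(Y - 1)]
mean = d1
variance = d2 + d1 - d1**2

print(mean, variance)  # compare with np and np(1 - p)
```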
Continuous Random Variable
- Probability density function f(x)
- Probability: P(a ≤ X ≤ b) = ∫_a^b f(x) dx
- Cumulative distribution function: F(x) = P(X ≤ x) = ∫_{−∞}^x f(u) du
Mean and Variance
- Mean: μ = E[X] = ∫ x f(x) dx
- Variance: σ² = E[(X − μ)²] = ∫ (x − μ)² f(x) dx
- Mean value of the function g(X): E[g(X)] = ∫ g(x) f(x) dx
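Chebyshev’s Inequality
- For a random variable X with mean μ and variance σ², and any d > 0: P(|X − μ| ≥ dσ) ≤ 1/d²
- Proof: σ² = ∫ (x − μ)² f(x) dx ≥ ∫_{|x − μ| ≥ dσ} (x − μ)² f(x) dx ≥ d²σ² P(|X − μ| ≥ dσ); dividing by d²σ² gives the result.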
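Chebyshev's inequality bounds the probability of a deviation of at least d standard deviations from the mean by 1/d². The sketch below compares that bound with exact tail probabilities for a binomial distribution; the parameters n and p and the helper name `tail` are hypothetical.

```python
from math import comb, sqrt

n, p = 20, 0.3   # hypothetical binomial parameters
pmf = [comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(n + 1)]
mu = n * p
sigma = sqrt(n * p * (1 - p))

def tail(d):
    """Exact P(|Y - mu| >= d * sigma) for the binomial above."""
    return sum(q for y, q in enumerate(pmf) if abs(y - mu) >= d * sigma)

for d in (1.5, 2.0, 3.0):
    # The exact tail probability should not exceed the bound 1/d^2.
    print(d, tail(d), 1 / d**2)
```

The bound holds for any distribution with finite variance, which is why it is usually far from tight for a specific one.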
The Uniform Distribution
- Pdf: f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise
- Mean and variance: μ = (a + b)/2, σ² = (b − a)²/12
The Normal Distribution
- Pdf: f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)), −∞ < x < ∞
- Mean μ, variance σ²
Approximation
- Normal approximation to binomial
  - Condition: n is large
  - Binomial(n, p, x) ≈ Normal(μ = np, σ² = np(1 − p), x)
  - Continuity correction
- Normal approximation to Poisson
  - Condition: λ is large
  - Poisson(λ, x) ≈ Normal(μ = λ, σ² = λ, x)
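The effect of the continuity correction can be seen in a small sketch. Here the correction is taken to mean evaluating the normal cdf at y + 0.5 when approximating P(Y ≤ y); the values of n, p, and y are hypothetical.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """Cdf of Normal(mu, sigma^2), written with the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p, y = 100, 0.5, 55   # hypothetical values
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Exact binomial P(Y <= y), by summing the pmf.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(y + 1))
plain = normal_cdf(y, mu, sigma)            # no continuity correction
corrected = normal_cdf(y + 0.5, mu, sigma)  # with continuity correction

print(exact, plain, corrected)
```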
The Exponential Distribution
- Pdf: f(x) = λ e^(−λx), x ≥ 0
- Cdf: F(x) = 1 − e^(−λx), x ≥ 0
- Mean 1/λ, variance 1/λ²
The Gamma Distribution
- Pdf (shape k > 0, rate λ > 0): f(x) = λ^k x^(k−1) e^(−λx) / Γ(k), x > 0
- Mean and variance: μ = k/λ, σ² = k/λ²
The Moment-Generating Function
- Definition: m(t) = E[e^(tX)]
- Useful to derive moments: m′(0) = E[X], m″(0) = E[X²], m^(n)(0) = E[X^n]
- mgf m(t) = pgf P(e^t)
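A numerical sketch of these relations for a Poisson random variable, whose pgf is the standard exp(λ(s − 1)). The mgf is built from the pgf by substituting e^t, and the derivatives at 0 are approximated by central differences; λ and the step size h are hypothetical choices.

```python
from math import exp

lam = 3.0  # hypothetical Poisson parameter

def pgf(s):
    """Poisson pgf: P(s) = exp(lam * (s - 1))."""
    return exp(lam * (s - 1))

def mgf(t):
    """mgf obtained from the pgf by substituting e^t for the pgf argument."""
    return pgf(exp(t))

h = 1e-3  # step for the finite-difference approximations
first = (mgf(h) - mgf(-h)) / (2 * h)                # approximates m'(0)  = E[X]
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2   # approximates m''(0) = E[X^2]

print(first, lam)            # E[X] for a Poisson variable is lam
print(second, lam + lam**2)  # E[X^2] is lam + lam^2
```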
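Conditional Probability
- P(A | B) = P(A ∩ B) / P(B), for P(B) > 0
- Bayes’ formula: P(B | A) = P(A | B) P(B) / P(A)
- Independence: A and B are independent if P(A ∩ B) = P(A) P(B)
- Memoryless property: P(X > s + t | X > s) = P(X > t)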
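Two of these ideas can be sketched numerically: Bayes' formula for a pair of events, and the memoryless property checked for an exponential random variable. All numbers below are hypothetical and chosen only for illustration.

```python
from math import exp

# Bayes' formula with hypothetical numbers for events A (condition) and B (positive test).
p_a = 0.01              # P(A)
p_b_given_a = 0.95      # P(B | A)
p_b_given_not_a = 0.05  # P(B | not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)   # P(B) by total probability
p_a_given_b = p_b_given_a * p_a / p_b                   # P(A | B)
print(p_a_given_b)

# Memoryless property for an exponential variable with hypothetical rate lam.
lam, s, t = 0.5, 2.0, 3.0

def survival(x):
    """P(X > x) = exp(-lam * x) for the exponential distribution."""
    return exp(-lam * x)

# {X > s + t} is a subset of {X > s}, so P(X > s + t and X > s) = P(X > s + t).
conditional = survival(s + t) / survival(s)
print(conditional, survival(t))  # the two values should agree
```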
Entropy
- Definition: H(P_Y) = −Σ_y P_Y(y) log P_Y(y)
- Entropy can be considered a function of P_Y(y): a measure of how close to uniform the distribution is and thus, in a sense, of the unpredictability of any observed value of a random variable having that distribution.
- Entropy vs variance
  - Both measure, in some sense, the uncertainty of the value of a random variable having that distribution.
  - Entropy: a function of the probabilities only.
  - Variance: also depends on the values the random variable takes.
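The contrast between entropy and variance can be sketched with three small distributions. The base-2 logarithm is an assumption (the slide does not fix a base), and the distributions and function names are hypothetical.

```python
from math import log2

def entropy(dist):
    """H = -sum p log2 p; uses only the probabilities (base 2 is an assumption)."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def variance(dist):
    """Var(Y) = E[(Y - mean)^2]; uses the values as well as the probabilities."""
    mean = sum(y * p for y, p in dist.items())
    return sum(p * (y - mean) ** 2 for y, p in dist.items())

# Same probabilities attached to different values (hypothetical distributions).
narrow = {0: 0.5, 1: 0.25, 2: 0.25}
wide = {0: 0.5, 10: 0.25, 20: 0.25}
uniform = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}

print(entropy(narrow), entropy(wide))     # same entropy
print(variance(narrow), variance(wide))   # different variances
print(entropy(uniform))                   # larger than for the non-uniform cases
```

Spreading the values out changes the variance but leaves the entropy untouched, while moving the probabilities toward uniform raises the entropy.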