Important Discrete Probability Distributions


Handy Counting Formulas
When the various outcomes of an experiment are equally likely, computing probabilities reduces to a counting problem. Say we have two experiments, each with a set of outcomes:
Experiment 1 has m outcomes
Experiment 2 has n outcomes
The total number of outcomes that can occur for both experiments is m×n.

Handy Counting Formulas
When the various outcomes of an experiment are equally likely, computing probabilities reduces to a counting problem. Say now we have k experiments with the following numbers of outcomes:
Experiment 1 has n1 outcomes
Experiment 2 has n2 outcomes
…
Experiment k has nk outcomes
The total number of outcomes that can occur for all experiments is given by the counting principle:
Total number of outcomes = n1 × n2 × … × nk
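As a quick check of the counting principle, here is a minimal R sketch. The three experiments (a coin, a die, and a card suit) and their variable names are made up for illustration and are not from the slides.

```r
# Hypothetical example: three experiments with 2, 6 and 4 outcomes
n.outcomes <- c(2, 6, 4)

# Counting principle: multiply the numbers of outcomes together
total <- prod(n.outcomes)
total

# Cross-check by listing every combined outcome explicitly
all.outcomes <- expand.grid(coin = 1:2, die = 1:6, suit = 1:4)
nrow(all.outcomes)
```

Both numbers should agree, since expand.grid builds one row per combined outcome.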

Handy Counting Formulas
How many ways are there to select r distinct items from a group of n distinct items?
Permutations (the order of selection is important): $_nP_r = \frac{n!}{(n-r)!}$
Combinations (the order of selection is irrelevant): $_nC_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$

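Handy Counting Formulas
How many ways are there to arrange n distinct items into k groups (partitions), with $n_i$ items in group i?
Partitions (grouping items into sets where order doesn't matter) are counted by the multinomial coefficient: $\binom{n}{n_1, n_2, \ldots, n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}$
Note: the group sizes must satisfy $n_1 + n_2 + \cdots + n_k = n$.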
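The multinomial coefficient for partitions can be sketched in R as follows. The helper name multinomial.coef is hypothetical (it is not a base R function), and the group sizes are illustrative.

```r
# Hypothetical helper: n! / (n1! * n2! * ... * nk!), with n = sum of group sizes
multinomial.coef <- function(group.sizes) {
  factorial(sum(group.sizes)) / prod(factorial(group.sizes))
}

# Ways to split 10 distinct items into groups of 3, 3 and 4
multinomial.coef(c(3, 3, 4))

# With only two groups it reduces to the binomial coefficient choose(n, r)
multinomial.coef(c(5, 20))
choose(25, 5)
```

This direct form relies on factorial(), which overflows for n above 170, so it is only meant for small counts.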

This is how we do permutations and combinations in R:
    factorial(5)                      # 5!
    prod(1:5)                         # 5! also
    # n_P_r is prod(n:(n-r+1))
    prod(25:(25-5+1))                 # 25_P_5
    # n_P_r is also n!/(n-r)!
    factorial(25)/(factorial(25-5))   # 25_P_5 also
    # n_C_r is choose(n,r)
    choose(25,5)                      # 25_C_5
And this is what we get: 5! = 120, 25_P_5 = 6375600, and 25_C_5 = 53130.

Probability Mass Function
Probability over a discrete set of outcomes is described by a probability mass function (PMF). A PMF can be represented as a table or displayed as a histogram.

Fiber Color     Probability
Black/Grey      0.48
Blue            0.291
Red             0.127
Orange/Brown    0.048
Pink/Purple     0.033
Green           0.017
Yellow          0.002
Other

Example: Probability Mass Function for Some Glass RI
    library(dafs)
    data(Glass)
    hist(Glass[,1], xlab="RI", main="Refractive Index of 290 Glass Fragments")
Here continuous data are treated as if they were discrete.

Cumulative Distribution Function
A function that gives the probability that a random variable is less than or equal to a specified value is a cumulative distribution function (CDF): $F(x) = \Pr(X \le x) = \sum_{x_i \le x} p(x_i)$
It varies between 0 and 1.
CDFs for discrete RVs are step functions.

Cumulative Distribution Function
The same mathematical machinery can be used to compute a CDF for a histogram of any data type:
ordinal-discrete (previous slide)
artificially ordered nominal-discrete
*continuous treated as if it were discrete (empirical CDF)
    library(mlbench)
    data(Glass)
    RI <- Glass[,1]
    hist(RI)
    plot(ecdf(RI), ylab="F(x)", xlab="x=RI", main="Empirical CDF of RIs")

Cumulative Distribution Function
In R we can compute the empirical CDF, F(x), like this:
    dat <- c(
      1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
      2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
      3,3,3,3,3,3,3,3,3,
      4,4,4,4,4,4,4,4,4
    )
    Fx <- ecdf(dat)
    Fx(3)          # F(x = 3) = Pr(X ≤ 3)
    ecdf(dat)(3)   # The same thing in one call
Don’t name anything “F” in R: F is a built-in shorthand for FALSE.
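To connect the PMF and the CDF, the sketch below rebuilds the empirical CDF by hand as a running sum of relative frequencies. It reuses the same 54 values as the slide (written with rep() for brevity); the names counts, pmf and cdf are illustrative.

```r
# Same data as above: 18 ones, 18 twos, 9 threes, 9 fours
dat <- c(rep(1, 18), rep(2, 18), rep(3, 9), rep(4, 9))

# Empirical PMF: relative frequency of each distinct outcome
counts <- table(dat)
pmf <- setNames(as.numeric(counts) / length(dat), names(counts))

# Empirical CDF: running total of the PMF
cdf <- cumsum(pmf)

cdf[["3"]]     # Pr(X <= 3) from the running total
ecdf(dat)(3)   # Pr(X <= 3) from ecdf(), should agree
```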

Cumulative Distribution Function
Use the CDF to compute the probability that a RV will lie between two specified values, such that $\Pr(a < X \le b) = F(b) - F(a)$:
    a <- 1.51593
    b <- 1.51820
    # Pr(a<RI<=b)
    ecdf(x = RI)(b) - ecdf(x = RI)(a)
    # Also Pr(a<RI<=b)
    length(which(RI > a & RI <= b))/length(RI)

Probabilities between any bounds
What if we want $\Pr(a \le X \le b)$ instead, or $\Pr(a < X < b)$ or $\Pr(a \le X < b)$? We can get these by counting, using the which and length functions:
    a <- 1.51593
    b <- 1.51820
    length(which(RI >= a & RI <= b))/length(RI)   # Pr(a<=RI<=b)
    length(which(RI > a & RI < b))/length(RI)     # Pr(a<RI<b)
    length(which(RI >= a & RI < b))/length(RI)    # Pr(a<=RI<b)
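The same counting idea can be tried without the glass data. The sketch below uses the small made-up sample from the empirical CDF slide and illustrative bounds a = 1 and b = 3. It also uses mean() on a logical vector, which gives the same proportion as length(which(...))/length(...).

```r
# Small illustrative sample: 18 ones, 18 twos, 9 threes, 9 fours
dat <- c(rep(1, 18), rep(2, 18), rep(3, 9), rep(4, 9))
a <- 1
b <- 3
Fx <- ecdf(dat)

open.closed   <- mean(dat >  a & dat <= b)   # Pr(a <  X <= b)
closed.closed <- mean(dat >= a & dat <= b)   # Pr(a <= X <= b)
open.open     <- mean(dat >  a & dat <  b)   # Pr(a <  X <  b)
closed.open   <- mean(dat >= a & dat <  b)   # Pr(a <= X <  b)

# Only the first of these equals F(b) - F(a)
Fx(b) - Fx(a)
c(open.closed, closed.closed, open.open, closed.open)
```

For discrete data the four answers differ whenever a or b is itself an observed value, which is why the endpoints matter.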

Moments and Expectation Values
Moments are handy numerical values that systematically help to describe the location and shape properties of a distribution. The mth-order moment is found by taking the expectation value of an RV raised to the mth power: $E[X^m] = \sum_i x_i^m \, p(x_i)$

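Moments and Expectation Values
1st-order moment: $E[X] = \sum_i x_i \, p(x_i) \approx \sum_i x_i \frac{n_i}{N}$, where $n_i$ is the number of times outcome $x_i$ occurs and $N$ is the total number of experiments. This is the average value of X.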

Moments and Expectation Values
1st-order moment: the mean, $\mu = E[X]$, is a location descriptor (the average value of X).
1st-order moment for a parameter g(X) on X: $E[g(X)] = \sum_i g(x_i) \, p(x_i)$, the average value of the parameter g.

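Moments and Expectation Values
2nd-order moments:
Second-order moment: $E[X^2] = \sum_i x_i^2 \, p(x_i)$. Not that interesting… but…
Second-order central moment: $\sigma^2 = E[(X-\mu)^2] = \sum_i (x_i-\mu)^2 \, p(x_i)$, a spread descriptor (the variance).
It can be shown that $\sigma^2 = E[X^2] - \mu^2$.
Population standard deviation: $\sigma = \sqrt{\sigma^2}$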
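The first and second moments can be computed directly from a PMF. The sketch below uses a small made-up PMF (the values of x and p are assumptions for illustration) and checks the identity that the second central moment equals E[X^2] minus the squared mean.

```r
# Hypothetical PMF on the outcomes 0, 1, 2, 3
x <- 0:3
p <- c(0.1, 0.2, 0.3, 0.4)

m1 <- sum(x * p)                  # 1st-order moment (mean)
m2 <- sum(x^2 * p)                # 2nd-order moment
central2 <- sum((x - m1)^2 * p)   # 2nd-order central moment (variance)

central2
m2 - m1^2                         # same value via E[X^2] - mu^2
sqrt(central2)                    # population standard deviation
```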

Moments and Expectation Values
Higher-order moments measure other distribution shape properties:
3rd order: “skewness” (no skew, left skew, right skew)
4th order: “kurtosis” (pointy-ness/flat-ness: leptokurtic, platykurtic)
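One common convention, assumed here, is to report skewness and kurtosis as standardized central moments: the 3rd and 4th moments of (X - mean)/sd. The helper std.moment below is hypothetical, and the two binomial PMFs are chosen only as a symmetric and a right-skewed example.

```r
# Hypothetical helper: m-th standardized central moment of a discrete PMF
std.moment <- function(x, p, m) {
  mu <- sum(x * p)
  sigma <- sqrt(sum((x - mu)^2 * p))
  sum(((x - mu) / sigma)^m * p)
}

x <- 0:20
p.symmetric <- dbinom(x, size = 20, prob = 0.5)   # symmetric about 10
p.right     <- dbinom(x, size = 20, prob = 0.1)   # long right tail

skew.symmetric <- std.moment(x, p.symmetric, 3)   # essentially zero
skew.right     <- std.moment(x, p.right, 3)       # positive: right skew
kurt.symmetric <- std.moment(x, p.symmetric, 4)   # compare against 3 (normal)
c(skew.symmetric, skew.right, kurt.symmetric)
```

Under this convention a normal distribution has kurtosis 3, so values above 3 are called leptokurtic and values below 3 platykurtic.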

Bernoulli Distribution
Bernoulli PMF: $p(x) = p^x (1-p)^{1-x}$ for $x \in \{0, 1\}$
“Coin Flipping” distribution
Probability of a “Heads” (success, x = 1) is p
Probability of a “Tails” (fail, x = 0) is 1 − p
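As a small check, the Bernoulli PMF p^x (1-p)^(1-x) can be evaluated by hand and compared with dbinom using size = 1. The value p = 0.7 is just an example, and the variable names are illustrative.

```r
p <- 0.7        # example probability of a "Heads"
x <- c(0, 1)    # tails = 0, heads = 1

# PMF written out from the formula p^x * (1-p)^(1-x)
pmf.by.hand <- p^x * (1 - p)^(1 - x)
pmf.by.hand
dbinom(x, size = 1, prob = p)   # built-in equivalent

# Mean and variance computed from the PMF
bern.mean <- sum(x * pmf.by.hand)                  # equals p
bern.var  <- sum((x - bern.mean)^2 * pmf.by.hand)  # equals p * (1 - p)
```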

Bernoulli Distribution
Mean: $\mu = p$
Variance: $\sigma^2 = p(1-p)$
    p <- 0.7  # Probability of a "Heads" (a success)
    bernoulli.pmf <- dbinom(x = 1:0, size = 1, prob = p)
    plot(1:0, bernoulli.pmf, type="h", main="Bernoulli PMF", xlab="x (heads=1, tails=0)", ylab="Pr(X)")
    # A sample of 10,000 "coin flips":
    sample.of.bernoulli <- rbinom(10000, size = 1, prob = p)
    hist(sample.of.bernoulli, xlim=c(0,1), xlab="x (heads=1, tails=0)", breaks=2)
    mean(sample.of.bernoulli)  # Average ~ p
    var(sample.of.bernoulli)   # Variance ~ p(1-p)

Bernoulli Distribution
Cumulative distribution function (CDF): $F(x) = 0$ for $x < 0$, $F(x) = 1-p$ for $0 \le x < 1$, and $F(x) = 1$ for $x \ge 1$
    # Plot the Cumulative Distribution Function: This one is not that interesting
    # since there are only two possibilities for what X can be ("heads"/"tails")
    bernoulli.cdf <- pbinom(q = 0:1, size = 1, prob = p)
    plot(0:1, bernoulli.cdf, type="s", main="Bernoulli CDF", xlab="x (tails=0, heads=1)", ylab="F(x)")
    # Make a prettier CDF plot by getting a big random sample
    # and plotting the empirical CDF for it:
    sample.of.bernoulli <- rbinom(100000, size = 1, prob = p)
    plot(ecdf(sample.of.bernoulli), main="Bernoulli CDF from a big random sample", xlab="x (tails=0, heads=1)", ylab="F(x)")

Binomial Distribution
Binomial PMF: $p(x) = \binom{n}{x} p^x (1-p)^{n-x}$ for $x = 0, 1, \ldots, n$
This is the distribution of the number of “heads” (successes) in n flips.
Number of “Heads” (successes) is x
Probability of a “Heads” is p
Number of flips (“Bernoulli trials”) is n
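The binomial PMF can also be written out from choose(n, x) p^x (1-p)^(n-x) and compared with dbinom. The values n = 20 and p = 0.5 are example values, and the variable names are illustrative.

```r
n <- 20     # number of flips
p <- 0.5    # probability of a "Heads"
x <- 0:n    # every possible number of heads

# PMF from the formula choose(n, x) * p^x * (1-p)^(n-x)
pmf.by.hand <- choose(n, x) * p^x * (1 - p)^(n - x)
all.equal(pmf.by.hand, dbinom(x, size = n, prob = p))

binom.total <- sum(pmf.by.hand)                     # probabilities sum to 1
binom.mean  <- sum(x * pmf.by.hand)                 # equals n * p
binom.var   <- sum((x - binom.mean)^2 * pmf.by.hand) # equals n * p * (1 - p)
```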

Binomial Distribution
Mean: $\mu = np$
Variance: $\sigma^2 = np(1-p)$
    p <- 0.5  # Probability of a "Heads" (a success)
    n <- 20   # Number of flips per trial
    binomial.pmf <- dbinom(x = 0:20, size = n, prob = p)
    plot(0:20, binomial.pmf, type="h", main="Binomial PMF", xlab="#-heads (x)", ylab="Pr(X)")
    # A sample of 1,000 trials of n-"coin flips". Each trial counts
    # the number of "heads" in n-tosses:
    sample.of.binomial <- rbinom(1000, size = n, prob = p)
    hist(sample.of.binomial, xlim=c(0,20), xlab="#-heads (x)")
    mean(sample.of.binomial)  # Average ~ np
    var(sample.of.binomial)   # Variance ~ np(1-p)

Binomial Distribution
[Figure: binomial PMF Pr(X) for n = 20, p = 0.5, alongside a histogram of a sample of 1000 drawn from it.]

Binomial Distribution
Cumulative distribution function (CDF): $F(x) = \Pr(X \le x) = \sum_{k=0}^{\lfloor x \rfloor} \binom{n}{k} p^k (1-p)^{n-k}$
Don’t worry. Just use this:
    pbinom(q = x, size = n, prob = p)
“p-functions” in R are the CDFs of the distribution. And while we’re at it:
dbinom: the “d-function” in R is the density (mass) of the distribution
pbinom: the “p-function” in R is the CDF of the distribution
qbinom: the “q-function” in R gives the quantiles of the distribution (x-values) for a given cumulative probability
rbinom: the “r-function” in R gives a random sample from the distribution
*NOTE: “p-functions” and “q-functions” are inverses of each other. For a discrete distribution this holds in the sense that the q-function returns the smallest x whose CDF value is at least the given probability.
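The relationship between the p- and q-functions can be checked with a short sketch. The values n = 20, p = 0.5 and the probability 0.3 are examples chosen for illustration.

```r
n <- 20
p <- 0.5

# Going from x to a cumulative probability and back recovers x
prob.at.10 <- pbinom(10, size = n, prob = p)
x.back <- qbinom(prob.at.10, size = n, prob = p)
x.back

# Going from a probability to x and back does not recover the probability
# exactly, because the CDF of a discrete RV jumps in steps
q <- qbinom(0.3, size = n, prob = p)
q
pbinom(q, size = n, prob = p)       # first CDF value that reaches 0.3
pbinom(q - 1, size = n, prob = p)   # one step lower is still below 0.3
```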

Binomial Distribution
Cumulative distribution function (CDF):
    # Plot the Cumulative Distribution Function:
    binomial.cdf <- pbinom(q = 0:20, size = n, prob = p)
    plot(0:20, binomial.cdf, type="s", main="Binomial CDF", xlab="#-heads (x)", ylab="F(x)")
    # Make a prettier CDF plot by getting a big random sample
    # and plotting the empirical CDF for it:
    sample.of.binomial <- rbinom(100000, size = n, prob = p)
    plot(ecdf(sample.of.binomial), main="Binomial CDF from a big random sample", xlab="#-heads (x)", ylab="F(x)")

Poisson Distribution
Poisson PMF: $p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$ for $x = 0, 1, 2, \ldots$
This is the distribution of the number of “events” occurring in an experiment which has a mean rate of occurrence λ.
Average number of “events” in an experiment is λ. Say on average you get 100 texts in a day. Then λ = 100.
Number of “events” is x
*NOTE: There is no upper limit on the “events” that can occur in an experiment, unlike for the binomial, where the upper limit of “successes” (“events”) is n.

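Poisson Distribution
Mean: $\mu = \lambda$
Variance: $\sigma^2 = \lambda$
[Figure: Poisson PMF Pr(X) for λ = 100, alongside a histogram of a sample of 365 drawn from it.]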

Poisson Distribution
Cumulative distribution function (CDF): $F(x) = \Pr(X \le x) = e^{-\lambda} \sum_{k=0}^{\lfloor x \rfloor} \frac{\lambda^k}{k!}$
    ppois(q = x, lambda = lam)
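The Poisson PMF and CDF can be reproduced by hand as a sanity check on dpois and ppois. This sketch assumes λ = 100, as in the texting example, and evaluates the PMF on the log scale, which is a choice made here to avoid very large intermediate numbers.

```r
lambda <- 100
x <- 0:200

# PMF lambda^x * exp(-lambda) / x!, evaluated on the log scale;
# lgamma(x + 1) is log(x!)
pmf.by.hand <- exp(x * log(lambda) - lambda - lgamma(x + 1))
all.equal(pmf.by.hand, dpois(x, lambda = lambda))

# CDF as the running sum of the PMF
cdf.by.hand <- cumsum(pmf.by.hand)
all.equal(cdf.by.hand, ppois(x, lambda = lambda))

# Mean from the PMF (range 0:200 holds essentially all of the mass)
pois.mean <- sum(x * pmf.by.hand)
pois.mean
```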

Poisson Distribution
Code for Poisson figures:
    # On average we get 100 "texts" per day (lambda, units: events/interval)
    lambda <- 100
    # Poisson PMF. Gives probabilities for receiving between 70-130 "texts" per day
    poisson.pmf <- dpois(x = 70:130, lambda = lambda)
    plot(70:130, poisson.pmf, type="h", main="Poisson PMF", xlab="#-events (x)", ylab="Pr(X)")
    # A sample of 365 "days" (intervals). Each "day" we count
    # the number of "texts" (events) we get:
    sample.of.poisson <- rpois(365, lambda=lambda)
    hist(sample.of.poisson)
    mean(sample.of.poisson)  # Average ~ lambda
    var(sample.of.poisson)   # Variance ~ lambda
    # Plot the Cumulative Distribution Function:
    poisson.cdf <- ppois(q = 0:200, lambda = lambda)
    plot(0:200, poisson.cdf, type="s", main="Poisson CDF", xlab="#-events (x)", ylab="F(x)")
    # Make a prettier CDF plot by getting a big random sample
    # and plotting the empirical CDF for it:
    sample.of.poisson <- rpois(100000, lambda = lambda)
    plot(ecdf(sample.of.poisson), main="Poisson CDF from a big random sample", xlab="#-events (x)", ylab="F(x)")