Probability theory 2 Tron Anders Moger September 13th 2006.


The Binomial distribution
Bernoulli distribution: One experiment with two possible outcomes, probability of success P.
If
–the experiment is repeated n times,
–the probability P is constant in all experiments, and
–the experiments are independent,
then the number of successes follows a binomial distribution.

The Binomial distribution
If X has a Binomial distribution, its PDF is defined as: $P(X=x) = \binom{n}{x} P^x (1-P)^{n-x}$ for $x = 0, 1, \dots, n$
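The binomial probabilities can be evaluated directly. The following is a minimal sketch, assuming the standard binomial probability function $P(X=x)=\binom{n}{x}P^x(1-P)^{n-x}$; the function name `binomial_pmf` and the values n=10, p=0.5 are hypothetical illustrations and are not taken from the slides.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p): comb(n, x) * p**x * (1 - p)**(n - x)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical illustration: 10 independent experiments, success probability 0.5.
n, p = 10, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(probabilities[5])    # probability of exactly 5 successes
print(sum(probabilities))  # the probabilities sum to 1, up to rounding error
```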

Example
Since the early 50s, UFOs have been reported in the U.S.
Assume P(real observation)=1/
Binomial experiments, n=10000, p=1/
X counts the number of real observations

The Hypergeometric distribution
Randomly sample n objects from a group of N, S of which are successes. The number of successes, X, in the sample follows a hypergeometric distribution: $P(X=x) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}}$

Example What is the probability of winning the lottery, that is, getting all 7 numbers on your coupon correct out of the total 34?
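The lottery question can be worked through as a hypergeometric probability. This is a sketch under the assumption that the coupon holds 7 numbers and that all 7 must match the 7 numbers drawn from 34, so N=34, S=7, n=7 and x=7; the function name `hypergeometric_pmf` is a hypothetical helper.

```python
from math import comb


def hypergeometric_pmf(x: int, n: int, N: int, S: int) -> float:
    """P(X = x) when n objects are drawn from N, of which S are successes."""
    if x < 0 or x > n or x > S or n - x > N - S:
        return 0.0
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)


# Assumed setup: 34 numbers in total, 7 winning numbers, 7 numbers on the coupon.
number_of_coupons = comb(34, 7)            # number of distinct possible coupons
p_win = hypergeometric_pmf(7, 7, 34, 7)    # all 7 numbers correct

print(number_of_coupons)
print(p_win)
```

Under these assumptions the probability is one divided by the number of possible coupons, since exactly one selection of 7 numbers wins.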

The distribution of rare events: The Poisson distribution
Assume successes happen independently, at a rate λ per time unit. The probability of x successes during a time unit is given by the Poisson distribution: $P(X=x) = \frac{\lambda^x e^{-\lambda}}{x!}$ for $x = 0, 1, 2, \dots$

Example: AIDS cases in 1991 (47 weeks)
Cases per week:
Mean number of cases per week: λ=44/47=0.936
The data can be modelled as a Poisson process with rate λ=0.936

Example cont’d:
Table columns: No. of cases, No. observed, Expected no. observed (from Poisson dist.)
Calculation: $P(X=2) = 0.936^2 e^{-0.936}/2! \approx 0.172$
Multiply by the number of weeks: 0.172*47 ≈ 8.1
The Poisson distribution fits the data fairly well!
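The calculation for two cases per week can be repeated for any number of cases. The sketch below assumes the standard Poisson probability function $\lambda^x e^{-\lambda}/x!$ and uses only the figures stated in the example (44 cases over 47 weeks); the function name `poisson_pmf` is a hypothetical helper, and the observed counts are not reproduced because they are missing from the transcript.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with rate lam."""
    return lam**x * exp(-lam) / factorial(x)


weeks = 47
lam = 44 / weeks  # mean number of cases per week, as in the example

p2 = poisson_pmf(2, lam)
expected_weeks_with_2_cases = p2 * weeks

# Expected number of weeks with 0, 1, ..., 5 cases under the Poisson model.
for cases in range(6):
    print(cases, round(poisson_pmf(cases, lam) * weeks, 1))
```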

The Poisson and the Binomial
Assume X is Bin(n,P), so E(X)=nP
Probability of 0 successes: $P(X=0)=(1-P)^n$
Can write λ=nP, hence $P(X=0)=(1-\lambda/n)^n$
If n is large and P is small, this converges to $e^{-\lambda}$, the probability of 0 successes in a Poisson distribution!
Can show that this also applies for other probabilities.
Hence, Poisson approximates Binomial when n is large and P is small (n>5, P<0.05).
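The quality of the approximation can be inspected numerically. This is a minimal sketch, assuming the standard binomial and Poisson probability functions; the values n=1000 and P=0.002 are hypothetical and chosen only so that n is large and P is small.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with rate lam."""
    return lam**x * exp(-lam) / factorial(x)


# Hypothetical values: large n, small p, so lam = n * p = 2.
n, p = 1000, 0.002
lam = n * p

# Probability of 0 successes: (1 - lam/n)**n versus exp(-lam).
p0_binomial = (1 - lam / n) ** n
p0_poisson = exp(-lam)

# Largest pointwise difference between the two distributions for x = 0..20.
max_difference = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(21)
)

print(p0_binomial, p0_poisson)
print(max_difference)
```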

Bivariate distributions
If X and Y are a pair of discrete random variables, their joint probability function expresses the probability that they simultaneously take specific values:
–joint probability: $P(x,y) = P(X=x \cap Y=y)$
–marginal probability: $P(x) = \sum_y P(x,y)$
–conditional probability: $P(y \mid x) = P(x,y)/P(x)$
–X and Y are independent if for all x and y: $P(x,y) = P(x)P(y)$

Example
The probabilities for
–A: Rain tomorrow
–B: Wind tomorrow
are given in the following table:
Rain categories: No rain, Light rain, Heavy rain
Wind categories: No wind, Some wind, Strong wind, Storm

Example cont’d:
Marginal probability of no rain: 0.36. Similarly, marg. prob. of light and heavy rain: 0.34 and 0.3. Hence the marginal dist. of rain is a PDF!
Conditional probability of no rain given storm: P(no rain,storm)/P(storm)=0.01/0.1=0.1. Similarly, cond. prob. of light and heavy rain given storm: 0.4 and 0.5. Hence the conditional dist. of rain given storm is a PDF!
Are rain and wind independent? Marg. prob. of no wind: 0.2. Under independence, P(no rain,no wind) would equal 0.36*0.2=0.072, but the table gives 0.1, so rain and wind are not independent.
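The independence check compares a joint probability with the product of the two marginals. The sketch below uses only numbers stated in the example; the full table is not reproduced because its cells are missing from the transcript, and the value P(storm)=0.1 is an assumption implied by the stated conditional probability 0.01/P(storm)=0.1.

```python
from math import isclose

# Values stated in the example.
p_no_rain = 0.36
p_no_wind = 0.2
p_no_rain_and_no_wind = 0.1

# Under independence the joint probability would equal the product of marginals.
product_of_marginals = p_no_rain * p_no_wind
independent = isclose(product_of_marginals, p_no_rain_and_no_wind)

# Conditional probability of no rain given storm.
p_no_rain_and_storm = 0.01
p_storm = 0.1  # assumption: implied by the stated result 0.01 / P(storm) = 0.1
p_no_rain_given_storm = p_no_rain_and_storm / p_storm

# The stated conditional distribution of rain given storm sums to 1.
conditional_given_storm = [0.1, 0.4, 0.5]

print(product_of_marginals, independent)
print(p_no_rain_given_storm, sum(conditional_given_storm))
```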

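Covariance and correlation
Covariance measures how two variables vary together: $\mathrm{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$
Correlation is always between -1 and 1: $\mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$
If X,Y independent, then $\mathrm{Cov}(X,Y)=0$
If Cov(X,Y)=0 then $\mathrm{Var}(X+Y)=\mathrm{Var}(X)+\mathrm{Var}(Y)$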
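Covariance and correlation can be computed from a joint probability function. This is a sketch assuming the usual definitions Cov(X,Y)=E[(X-µ_X)(Y-µ_Y)] and Corr(X,Y)=Cov(X,Y)/(σ_X σ_Y); the joint table `joint` is entirely hypothetical and is not the rain and wind table from the slides.

```python
from math import sqrt

# Hypothetical joint probability function for two 0/1 variables: {(x, y): P(x, y)}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def expectation(g, joint_pmf):
    """E[g(X, Y)] under the given joint probability function."""
    return sum(prob * g(x, y) for (x, y), prob in joint_pmf.items())


mean_x = expectation(lambda x, y: x, joint)
mean_y = expectation(lambda x, y: y, joint)
var_x = expectation(lambda x, y: (x - mean_x) ** 2, joint)
var_y = expectation(lambda x, y: (y - mean_y) ** 2, joint)

covariance = expectation(lambda x, y: (x - mean_x) * (y - mean_y), joint)
correlation = covariance / (sqrt(var_x) * sqrt(var_y))

print(covariance, correlation)
```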

Continuous random variables
Used when the outcomes can take any number (with decimals) on a scale
Probabilities are assigned to intervals of numbers; individual numbers generally have probability zero
Area under a curve: Integrals

Cdf for continuous random variables
As before, the cumulative distribution function F(x) is equal to the probability of all outcomes less than or equal to x. Thus we get $P(a < X \le b) = F(b) - F(a)$
The probability density function f(x) is however now defined so that $P(a < X \le b) = \int_a^b f(x)\,dx$
We get that $F(x_0) = \int_{-\infty}^{x_0} f(x)\,dx$

Expected values
The expectation of a continuous random variable X is defined as $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
The variance, standard deviation, covariance, and correlation are defined exactly as before, in terms of the expectation, and thus have the same properties

Example: The uniform distribution on the interval [0,1]
f(x)=1 and F(x)=x for 0≤x≤1
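For the uniform distribution the expectation integral can be approximated numerically. The sketch below assumes the definition E(X)=∫ x f(x) dx with f(x)=1 on [0,1] and uses a simple midpoint rule; the helper `integrate` and the step count are hypothetical choices, not part of the slides.

```python
def integrate(g, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width


def uniform_density(x: float) -> float:
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


total_area = integrate(uniform_density, 0.0, 1.0)
mean = integrate(lambda x: x * uniform_density(x), 0.0, 1.0)
variance = integrate(lambda x: (x - mean) ** 2 * uniform_density(x), 0.0, 1.0)

print(total_area, mean, variance)
```

The area under the density is 1, the mean comes out at 1/2 and the variance at 1/12, which agrees with the exact integrals.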

The normal distribution
The most used continuous probability distribution:
–Many observations tend to approximately follow this distribution
–It is easy and nice to do computations with
–BUT: Using it can result in wrong conclusions when it is not appropriate

Histogram of weight with normal curve displayed

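The normal distribution
The probability density function is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$, where µ is the expected value and σ the standard deviation
Notation: $N(\mu, \sigma^2)$
Standard normal distribution: $N(0,1)$
Using the normal density is often OK unless the actual distribution is very skewed
Also: µ±σ covers ca 68% of the distribution
µ±2σ covers ca 95% of the distribution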
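The share of a normal distribution lying within one or two standard deviations of the mean can be computed from the cumulative distribution function. This is a minimal sketch that expresses the normal cdf through the error function in Python's `math` module; the helper names `normal_cdf` and `coverage` are hypothetical. The values come out at roughly 68% and 95%.

```python
from math import erf, sqrt


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(X <= x) for X normally distributed with mean mu and std. dev. sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def coverage(k: float) -> float:
    """Probability that a normal variable lies within k std. devs. of its mean."""
    return normal_cdf(k) - normal_cdf(-k)


print(coverage(1))  # share of the distribution within mu +/- sigma
print(coverage(2))  # share of the distribution within mu +/- 2*sigma
```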

The normal distribution with small and large standard deviation σ

Simple method for checking if data are well approximated by a normal distribution: Explore As before, choose Analyze->Descriptive Statistics->Explore in SPSS. Move the variable to Dependent List (e.g. weight). Under Plots, check Normality Plots with tests.

Histogram of lung function for the students

Q-Q plot for lung function

Age – not normal

Q-Q plot of age

Skewed distribution, with e.g. the observations 0.40, 0.96, 11.0
A trick for data that are skewed to the right: Log-transformation!

Log-transformed data
ln(0.40)=-0.92
ln(0.96)=-0.04
ln(11)=2.40
Do the analysis on log-transformed data
SPSS: Transform->Compute
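Outside SPSS the same transformation is a one-liner. The sketch below applies the natural logarithm to the three example observations; the variable names are hypothetical.

```python
from math import log

# The three right-skewed observations from the example.
observations = [0.40, 0.96, 11.0]

# Natural-log transform; only defined for strictly positive values.
log_observations = [log(value) for value in observations]

for value, transformed in zip(observations, log_observations):
    print(value, round(transformed, 2))
```

The transformed values are much closer together than the raw ones, which is why the transformation helps with right-skewed data.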

OK, the data follow a normal distribution, so what?
First lecture, pairs of terms:
–Sample – population
–Histogram – distribution
–Mean – Expected value
In statistics we would like the results from analyzing a small sample to apply to the population
We have to collect a sample that is representative w.r.t. age, gender, home place etc.

New way of reading tables and histograms:
Histograms show that data can be described by a normal distribution
Want to conclude that data in the population are normally distributed
Mean calculated from the sample is an estimate of the expected value µ of the population normal distribution
Standard deviation in the sample is an estimate of σ in the population normal distribution
Mean±2*(standard deviation) as estimated from the sample (hopefully) covers 95% of the population normal distribution

In addition:
Most standard methods for analyzing continuous data assume a normal distribution.
When n is large and P is not too close to 0 or 1, the Binomial distribution can be approximated by the normal distribution
A similar phenomenon is true for the Poisson distribution
This happens for all distributions that can be seen as a sum of many independent observations.
It means that the normal distribution appears whenever you want to do statistics
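The normal approximation to the Binomial distribution can be checked against exact probabilities. This is a sketch under stated assumptions: the values n=100 and P=0.5 are hypothetical, the approximating normal distribution uses mean nP and variance nP(1-P), and a continuity correction of 0.5 is applied, which is a common convention that the slides do not mention.

```python
from math import comb, erf, sqrt


def binomial_cdf(x: int, n: int, p: float) -> float:
    """Exact P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for a normal distribution with mean mu and std. dev. sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


# Hypothetical values: n large, p not close to 0 or 1.
n, p = 100, 0.5
mu = n * p
sigma = sqrt(n * p * (1 - p))

exact = binomial_cdf(55, n, p)
approximate = normal_cdf(55 + 0.5, mu, sigma)  # continuity correction (assumption)

print(exact, approximate)
```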

The Exponential distribution
The exponential distribution is a distribution for positive numbers (parameter λ): $f(x) = \lambda e^{-\lambda x}$ for $x > 0$
It can be used to model the time until an event, when events arrive randomly at a constant rate
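The waiting-time interpretation can be illustrated with the density and the cumulative distribution function. The sketch assumes the standard exponential density λe^{-λx} and cdf 1-e^{-λx} for x≥0; reusing the rate 0.936 per week from the earlier AIDS example is purely an illustration, and the function names are hypothetical.

```python
from math import exp


def exponential_pdf(x: float, lam: float) -> float:
    """Density lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * exp(-lam * x) if x >= 0 else 0.0


def exponential_cdf(x: float, lam: float) -> float:
    """P(X <= x) = 1 - exp(-lam * x) for x >= 0, and 0 otherwise."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0


lam = 0.936  # illustrative rate (events per week), borrowed from the Poisson example

# Probability that the waiting time until the next event exceeds one week.
p_wait_longer_than_one_week = 1.0 - exponential_cdf(1.0, lam)

# Mean waiting time for an exponential distribution is 1 / lam.
mean_waiting_time = 1.0 / lam

print(p_wait_longer_than_one_week, mean_waiting_time)
```

The probability of waiting more than one time unit equals e^{-λ}, which is the Poisson probability of 0 events in that time unit.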

Next time: Sampling and estimation
We will talk in much more depth about the topics mentioned in the last few slides today