Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal Email: yuppal@ysu.edu
Lecture Slides 5 Random Variables Probability Distributions Discrete Distributions Discrete Uniform Probability Distribution Binomial Probability Distribution Continuous Distribution Normal Distribution
Random Variables A random variable is a numerical description of the outcome of an experiment. A discrete random variable may assume either a finite number of values or an infinite sequence of values. A continuous random variable may assume any numerical value in an interval or collection of intervals.
Example: JSL Appliances Discrete random variable with a finite number of values Let x = number of TVs sold at the store in one day, where x can take on 5 values (0, 1, 2, 3, 4)
Example: JSL Appliances Discrete random variable with an infinite sequence of values Let x = number of customers arriving in one day, where x can take on the values 0, 1, 2, . . . We can count the customers arriving, but there is no finite upper limit on the number that might arrive.
Random Variables Question Random Variable x Type Family size x = Number of dependents reported on tax return Discrete Distance from home to store x = Distance in miles from home to the store site Continuous Own dog or cat x = 1 if own no pet; = 2 if own dog(s) only; = 3 if own cat(s) only; = 4 if own dog(s) and cat(s) Discrete
Discrete Probability Distributions The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable. We can describe a discrete probability distribution with a table, graph, or equation.
Discrete Probability Distributions The probability distribution is defined by a probability function, denoted by p(x), which provides the probability for each value of the random variable. The required conditions for a discrete probability function are: p(x) > 0 p(x) = 1
Discrete Probability Distributions Using past data on TV sales, … a tabular representation of the probability distribution for TV sales was developed. Number Units Sold of Days 0 80 1 50 2 40 3 10 4 20 200 x p(x) 0 .40 1 .25 2 .20 3 .05 4 .10 1.00 80/200
Discrete Probability Distributions Graphical Representation of Probability Distribution .10 .20 .30 .40 .50 Probability 0 1 2 3 4 Values of Random Variable x (TV sales)
Expected Value and Variance The expected value, or mean, of a random variable is a measure of its central location. E(x) = = p(x) *x The variance summarizes the variability in the values of a random variable. Var(x) = 2 = p(x)*(x - )2 The standard deviation, , is defined as the positive square root of the variance.
expected number of TVs sold in a day Expected Value x p(x) x*p(x) 0 .40 .00 1 .25 .25 2 .20 .40 3 .05 .15 4 .10 .40 E(x) = 1.20 expected number of TVs sold in a day
Variance and Standard Deviation x x - (x - )2 p(x) p(x)*(x - )2 1 2 3 4 -1.2 -0.2 0.8 1.8 2.8 1.44 0.04 0.64 3.24 7.84 .40 .25 .20 .05 .10 .576 .010 .128 .162 .784 Variance of daily sales = s 2 = 1.660 Standard deviation of daily sales = 1.2884 TVs
Types of Discrete Probability Distributions: Uniform Binomial
Discrete Uniform Probability Distribution The discrete uniform probability distribution is the simplest example of a discrete probability distribution given by a formula. The discrete uniform probability function is p(x) = 1/n the values of the random variable are equally likely where: n = the number of values the random variable may assume
Discrete Uniform Probability Distribution Suppose, instead of looking at the past sales of the TVs, I assume (or think) that TVs sales have a uniform probability distribution, then the example done above would change as follows:
expected number of TVs sold in a day Expected Value x p(x) x*p(x) 0 .2 .00 1 .2 .20 2 .2 .40 3 .2 .60 4 .2 .80 E(x) = 2.0 expected number of TVs sold in a day
Variance and Standard Deviation x x - (x - )2 p(x) p(x)*(x - )2 1 2 3 4 -2.0 -1.0 0.0 1.0 2.0 4.0 1.0 0.0 .2 0.8 0.2 0.0 Variance of daily sales = s 2 = 2.0 Standard deviation of daily sales = 1.41 TVs
Example: I am bored Imagine this situation. There is heavy snowstorm. Everything is shut down. You and everybody in your family have to stay home. You are utterly bored. You catch hold of your sibling and get him or her to play this game. The game is to bet on the toss of a coin.
Example: I am bored If it turns up heads exactly once in three tosses, you win or otherwise you lose. Lets call the event of getting heads on anyone trial as a success. Similarly, the event of getting tails is a failure. Suppose the probability of getting heads (or of a success) is 0.6. The big question is that you want to find out the probability of getting exactly 1 head on three tosses.
Tree Diagram Trial 2 Outcomes Trial 1 Trial 3 H HHH = (0.6)3*(0.4)0= 0.216 H HHT = (0.6)2*(0.4)1=0.144 T H HTH = (0.6)2*(0.4)1 =0.144 H T T HTT = (0.6)1*(0.4)2 =0.096 H THH = (0.6)2*(0.4)1 = 0.144 H T T THT = (0.6)1*(0.4)2 =0.096 H T TTH = (0.6)1*(0.4)2 =0.096 T TTT = (0.6)0*(0.4)3 =0.064
So, what is the probability of you winning the game? What is the random variable here? What is the probability of getting 2 heads in three tosses? = P(HHT) + P(HTH) + P(THH) = 0.144 + 0.144 +0.144 = 0.432 Or 43.2% How does the probability distribution of our Random variable look?
Binomial Distribution 1. The experiment consists of a sequence of n identical trials. 2. Two outcomes, success and failure, are possible on each trial. 3. The probability of a success, denoted by p, does not change from trial to trial. 4. The trials are independent.
Binomial Distribution Our interest is in the number of successes occurring in the n trials. We let x denote the number of successes occurring in the n trials. Binomial Distribution is highly useful when the number of trials is large.
Binomial Distribution Binomial Probability Function where: n = the number of trials p = the probability of success on any one trial
Counting Rule for Combinations Another useful counting rule (esp. when n is large) enables us to count the number of experimental outcomes when x objects are to be selected from a set of N objects. Number of Combinations of n Objects Taken x at a Time where: n! = n(n - 1)(n - 2) . . . (2)(1) x! = x(x - 1)(x- 2) . . . (2)(1) 0! = 1
Example: I am bored Using binomial distribution, the probability of 1 head in 3 tosses is
Example: I am bored Suppose, you won. But knowing your sibling, she or he says that bet was getting exactly 2 heads in 3 tosses. Since you are bored, you have no choice but continuing to play:
Example: I am bored She again cheats. She says that bet was getting at least 2 heads in 3 tosses. What does this mean: Getting 2 or more heads P(2 heads) + P(3 heads)
Example: I am bored
Binomial Distribution Expected Value E(x) = = n*p Variance Var(x) = 2 = np(1 - p) Standard Deviation
Example: I am bored Mean (or expected value) Variance: Standard Deviation E(x) = = n*p= 3*0.6 = 1.8 Var(x) = 2 = np(1 - p) = 3*(0.6)*(1-0.6) = 0.72
Chapter 6 Continuous Probability Distributions Normal Probability Distribution x p(x) Normal
Normal Probability Distribution The normal probability distribution is the most important distribution for describing a continuous random variable. It is widely used in statistical inference.
Normal Probability Distribution It has been used in a wide variety of applications: Heights of people Scientific measurements
Normal Probability Distribution It has been used in a wide variety of applications: Test scores Amounts of rainfall
Normal Distributions The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the curve between x1 and x2. x f (x) Normal x1 x2
Normal Probability Distribution Characteristics The distribution is symmetric; its skewness measure is zero. x
Normal Probability Distribution Characteristics The highest point on the normal curve is at the mean, which is also the median and mode. x Mean = m
Normal Probability Distribution Characteristics The entire family of normal probability distributions is defined by its mean m and its standard deviation s . Standard Deviation s x Mean m
Normal Probability Distribution Characteristics The mean can be any numerical value: negative, zero, or positive. The following shows different normal distributions with different means. x -10 20
Normal Probability Distribution Characteristics The standard deviation determines the width of the curve: larger values result in wider, flatter curves. s = 15 s = 25 x Same Mean
Normal Probability Distribution Characteristics Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right). .5 .5 x Mean m
Standardizing the Normal Values or the z-scores Z-scores can be calculated as follows: We can think of z as a measure of the number of standard deviations x is from .
Standard Normal Probability Distribution A standard normal distribution is a normal distribution with mean of 0 and variance of 1. If x has a normal distribution with mean (μ) and Variance (σ), then z is said to have a standard normal distribution. s = 1 z
Example: Air Quality I collected this data on the air quality of various cities as measured by particulate matter index (PMI). A PMI of less than 50 is said to represent good air quality. The data is available on the class website. Suppose the distribution of PMI is approximately normal.
Example: Air Quality The mean PMI is 41 and the standard deviation is 20.5. Suppose I want to find out the probability that air quality is good or what is the probability that PMI is greater than 50.