Numerical parameters of a Random Variable
Remember when we were studying numerical datasets? We found some numbers useful, namely:
  The spread
  The frequencies
  The average
  The standard deviation (population or sample)
We are going to do the same with RVs. Let’s translate …
  spread becomes range
  frequencies become probabilities
  average becomes expected value
  standard deviation becomes standard deviation
The concepts of range, probability distribution and expected value we have already seen. How do we compute the standard deviation of a RV?
The Standard Deviation of a Random Variable
In the case of datasets we defined the standard deviation as the average distance from the mean, and since distances use absolute values | |, which are hard to work with, we changed to the variance, that is, the average of the squares of the distances. Recall that “average” translates to expected value, so it is natural to define the variance of a Random Variable as follows:
Let $\mu = E(X)$. Then Var(iance)(X) = expected value of the squares of the distances from $\mu$, that is,
$E((X - \mu)^2)$
As usual, this calculation can get hairy, but, as usual, there is a shortcut, based on the formula:
$E((X - \mu)^2) = E(X^2) - \mu^2$
In words, you compute the expected value of the squares (no distances) and subtract the mean squared. Let's do an example:
Here is the probability distribution table of a RV
First of all we compute $\mu$:
Now we apply the definition (no shortcut)
So, if we use the definition we must do the following calculation:
which I am way too lazy to even try!
Here is the shortcut: from the second row we get $E(X^2)$, that is
Now all we have to compute is $(2.18)^2$
Which gives 4.77 (actually )
Notation
We denote by Var(X) the variance of X. Observe that our shortcut reads:
$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$
And, of course, the standard deviation is $\sigma = \sqrt{\mathrm{Var}(X)}$.
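To see the definition and the shortcut side by side, here is a minimal Python sketch. The distribution `dist` and the helper `expected_value` are made up for illustration; they are not the table from the example above.

```python
from math import sqrt, isclose

# Hypothetical probability distribution (NOT the table from the slides):
# each key is a value of X, each entry is P(X = value). Probabilities sum to 1.
dist = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def expected_value(dist, f=lambda x: x):
    """E(f(X)): add up f(x) * P(X = x) over all values x."""
    return sum(f(x) * p for x, p in dist.items())

mu = expected_value(dist)                                        # E(X)
var_definition = expected_value(dist, lambda x: (x - mu) ** 2)   # E((X - mu)^2)
var_shortcut = expected_value(dist, lambda x: x ** 2) - mu ** 2  # E(X^2) - mu^2
sigma = sqrt(var_shortcut)                                       # standard deviation

# The two routes must agree (up to floating-point rounding).
assert isclose(var_definition, var_shortcut)
print(mu, var_definition, var_shortcut, sigma)
```

For this made-up table the mean is 2 and both routes give a variance of 1, up to rounding. The shortcut needs only one extra pass over the table (for the squares), which is why it is the one worth doing by hand.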
Special Values
E(b(n,p))
The above notation is shorthand for: the expected value of a binomial RV that counts the number of successes in n identical and independent Bernoulli trials, in each of which the probability of Success is p. Let’s look at what we have to compute (with q = 1 - p):
$E(b(n,p)) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k}$
Don’t be scared, our trusty intuition will help us. Look at this list of values of n and p, each with the number of successes we expect.
Your intuition seems to suggest that E(b(n,p)) = np and your intuition is CORRECT!
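If you would rather let a computer confirm the intuition, the sketch below adds up k times P(X = k) directly and compares the result with np. The function names and the (n, p) pairs are chosen here only for illustration.

```python
from math import comb, isclose

def binomial_pmf(n, p):
    """Return the list [P(X = 0), ..., P(X = n)] for X = b(n, p)."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

def binomial_mean(n, p):
    """E(X) straight from the definition: sum of k * P(X = k)."""
    return sum(k * prob for k, prob in enumerate(binomial_pmf(n, p)))

# Hypothetical (n, p) pairs, picked only for illustration.
for n, p in [(10, 0.5), (20, 0.3), (100, 0.25)]:
    direct = binomial_mean(n, p)
    assert isclose(direct, n * p)   # the long sum agrees with np
    print(n, p, direct, n * p)
```

This is a numerical check for a few cases, not a proof, but it does show the long sum collapsing to np each time.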
Var(b(n,p))
The above notation is shorthand for: the variance of a binomial RV that counts the number of successes in n identical and independent Bernoulli trials, in each of which the probability of Success is p. Let’s look at what we have to compute (using the shortcut, with q = 1 - p):
$\mathrm{Var}(b(n,p)) = \sum_{k=0}^{n} k^2 \binom{n}{k} p^k q^{n-k} - (np)^2$
Don’t be scared, we will NOT do the computation. Unfortunately our intuition does not tell us … diddly squat, so I will just have to tell you the answer:
Var(b(n,p)) = npq, where q = 1 - p
Therefore the two fundamental parameters of the Binomial RV X = b(n,p) are (memorizing time!!)
$\mu = np$ and $\sigma = \sqrt{npq}$
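Since we skipped the computation, here is a small Python sketch that does it numerically with the shortcut E(X^2) - [E(X)]^2 and compares the answer with npq (where q = 1 - p). The function name and the values of n and p are assumptions made for this illustration.

```python
from math import comb, sqrt, isclose

def binomial_variance(n, p):
    """Var(X) for X = b(n, p) via the shortcut E(X^2) - [E(X)]^2."""
    q = 1 - p
    pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    mean = sum(k * prob for k, prob in enumerate(pmf))
    mean_of_squares = sum(k**2 * prob for k, prob in enumerate(pmf))
    return mean_of_squares - mean**2

n, p = 20, 0.3            # hypothetical values, for illustration only
q = 1 - p
direct = binomial_variance(n, p)
assert isclose(direct, n * p * q)     # the long sum agrees with npq
print(direct, n * p * q)              # variance, two ways
print(sqrt(n * p * q))                # standard deviation sigma
```

Again this checks particular cases rather than proving the formula, but it is a quick way to convince yourself that npq is the right thing to memorize.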
Applying Chebyshev’s and Empirical rules
The same rules we learned about the role of the standard deviation in the case of numerical datasets apply verbatim to RVs. More specifically:
(Chebyshev’s) For “any” RV and any k > 1,
$P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$
( is our old cabbage leaf!)
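Chebyshev's rule says that at least 1 - 1/k^2 of the probability lies within k standard deviations of the mean. The sketch below checks this on a binomial RV, computing the probability exactly from the distribution. The function `prob_within` and the values n = 30, p = 0.2 are assumptions chosen for illustration (a deliberately lopsided distribution).

```python
from math import comb, sqrt

def prob_within(n, p, k):
    """P(|X - mu| < k * sigma) for X = b(n, p), summed exactly from the pmf."""
    q = 1 - p
    mu, sigma = n * p, sqrt(n * p * q)
    return sum(
        comb(n, x) * p**x * q**(n - x)
        for x in range(n + 1)
        if abs(x - mu) < k * sigma
    )

n, p = 30, 0.2    # hypothetical values: not symmetric, not especially "nice"
for k in (1.5, 2, 3):
    actual = prob_within(n, p, k)
    bound = 1 - 1 / k**2          # Chebyshev's guaranteed minimum
    print(k, round(actual, 4), round(bound, 4), actual >= bound)
```

The actual probabilities typically come out well above the bound. That is the price of a rule that works for "any" RV: it promises little, but it never fails.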
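(Empirical) If the distribution is “nice” then
$P(|X - \mu| < \sigma) \approx 0.68$, $P(|X - \mu| < 2\sigma) \approx 0.95$, $P(|X - \mu| < 3\sigma) \approx 0.997$
where “nice” means approximately mound-shaped and approximately symmetric.
That’s all for discrete RVs!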
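As a parting check of the Empirical rule (about 68%, 95% and 99.7% within 1, 2 and 3 standard deviations), here is a sketch using a binomial RV that is mound-shaped and symmetric. The choice n = 200, p = 0.5 and the function name are assumptions made for this illustration.

```python
from math import comb, sqrt

def prob_within(n, p, k):
    """P(|X - mu| <= k * sigma) for X = b(n, p), summed exactly from the pmf."""
    q = 1 - p
    mu, sigma = n * p, sqrt(n * p * q)
    return sum(
        comb(n, x) * p**x * q**(n - x)
        for x in range(n + 1)
        if abs(x - mu) <= k * sigma
    )

n, p = 200, 0.5   # hypothetical "nice" case: symmetric and mound-shaped
for k, rule in [(1, 0.68), (2, 0.95), (3, 0.997)]:
    # exact probability next to the Empirical rule's approximate value
    print(k, round(prob_within(n, p, k), 4), rule)
```

The exact values land close to the rule's percentages but not exactly on them, because X takes only whole-number values. The rule is an approximation, and for a lopsided distribution it can be noticeably off.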