Published by Janice Richardson. Modified over 9 years ago.
1
Statistics for Social and Behavioral Sciences Session #11: Random Variable, Expectations (Agresti and Finlay, Chapter 4) Prof. Amine Ouazad
2
Statistics Course Outline
Part I. Introduction and Research Design (Week 1)
Part II. Describing Data (Weeks 2-4)
Part III. Drawing Conclusions from Data: Inferential Statistics (Weeks 5-9)
Part IV. Correlation and Causation: Regression Analysis (Weeks 10-14)
This is where we talk about Zmapp and Ebola!
Firenze or Lebanese Express?
3
Last Session
Four rules of probability distributions:
1. P(not A) = 1 – P(A)
2. P(A or B) = P(A) + P(B) when P(A and B) = 0
3. P(A and B) = P(A) P(B given A)
3'. P(A and B) = P(A) P(B) when A and B are independent
Beware of the inverse probability fallacy:
– P(A|B) is not P(B|A).
– The two are linked by the formula P(A|B) = P(B|A) P(A) / P(B).
4
Outline
1. Random variable
   – Probability distribution of a random variable
   – Expectation of a random variable
2. The normal distribution
3. Polls and normal distributions
Next time: Probability Distributions (continued)
Chapter 4 of A&F
5
Random variable
A random variable is a variable whose value is not known in advance (ex ante): beforehand it can take any of several values, and its value is revealed only once the outcome is realized (ex post).
Examples:
– X is a random variable that, before the coin is tossed (ex ante), can take the value “Heads” or “Tails”. Once the coin is tossed (ex post), the value of X is known: it is either “Heads” or “Tails”.
– Y is a random variable that can take the values 1, 2, 3, 4, 5, or 6 depending on the roll of a die. Before the die is thrown, the value is not known. After the die is thrown, we know the value of Y.
6
Probability distribution of a random variable
Take all possible values of a random variable Y:
– Example: 1, 2, 3, 4, 5, 6
– In general: $y_1, y_2, y_3, \dots, y_K$.
The probability of the event that the random variable Y equals $y_k$ is written $P(Y = y_k)$ or simply $P(y_k)$.
The probability distribution of the random variable Y is the list of all values of $P(Y = y_k)$.
Example: for a fair die, the probability distribution of Y is the list of values P(Y=1), P(Y=2), P(Y=3), …, P(Y=6), which is {1/6, 1/6, 1/6, 1/6, 1/6, 1/6}.
Throughout the course we consider either discrete quantitative random variables or categorical random variables.
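As a minimal sketch of the definition above, the distribution of a fair die can be written as a table from each value to its probability. The names `die_distribution` and `is_valid_distribution` are made up for this illustration, and exact fractions are used so that the probabilities sum to exactly one.

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die:
# each possible value y_k is mapped to P(Y = y_k) = 1/6.
die_distribution = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_distribution(dist):
    """Return True if every probability lies in [0, 1] and they sum to one."""
    probabilities = list(dist.values())
    return all(0 <= p <= 1 for p in probabilities) and sum(probabilities) == 1


print(die_distribution[3])                      # P(Y = 3)
print(is_valid_distribution(die_distribution))  # do the probabilities sum to one?
```

Any list of probabilities that passes this check is a valid distribution for a discrete random variable; an unbalanced die would simply use unequal fractions.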
7
Expected value of a random variable
Should I play this game at all? What is my expected gain?
What are your expected gains when playing the coin game? Gain is a random variable, equal to +10 AED when getting heads and -10 AED when getting tails.
E(gain) = (Gain when getting heads) × (Probability of heads) + (Gain when getting tails) × (Probability of tails).
In general, for a random variable Y, the expected value of Y is:
$E(Y) = \sum_{k=1}^{K} y_k P(Y = y_k)$
Also note that probabilities sum to one:
$\sum_{k=1}^{K} P(Y = y_k) = 1$
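The expected-value formula can be sketched in a few lines of Python. The slide does not state the probability of heads, so this example assumes a fair coin (probability 1/2 for each side); the function name `expected_value` is also just an illustrative choice.

```python
from fractions import Fraction


def expected_value(dist):
    """E(Y) = sum over k of y_k * P(Y = y_k), for a dict {value: probability}."""
    return sum(value * prob for value, prob in dist.items())


# Coin game from the slide: +10 AED on heads, -10 AED on tails.
# ASSUMPTION: the coin is fair, so each outcome has probability 1/2.
coin_game = {+10: Fraction(1, 2), -10: Fraction(1, 2)}
expected_gain = expected_value(coin_game)
print(expected_gain)  # 0

# Same formula applied to a fair die.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(fair_die))  # 7/2
```

Under the fair-coin assumption the expected gain is zero: on average the game neither makes nor loses money. With a biased coin the same function would return a nonzero value.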
8
Expected Earnings?
Hum, how much will I earn?
“Your annual earnings right after NYU Abu Dhabi” is a random variable…
– The variable has not been realized yet.
Let’s give it a name: Y = “Your annual earnings right after NYU Abu Dhabi”. It takes potentially K values.
$E(\text{earnings}) = E(Y) = \sum_{k=1}^{K} y_k P(Y = y_k)$
Problem: we don’t observe earnings in the future!
9
Expected Earnings?
Hum, how much will I earn?
An approximation is to use the distribution of earnings of current graduates, to substitute for our lack of knowledge of $P(Y = y_k)$ for each k.
Suppose earnings take K distinct values, because no two graduates earn exactly the same annual wage. Each observed value then gets weight 1/K, and an approximation of expected earnings is
$E(Y) \approx \sum_{k=1}^{K} y_k \times \frac{1}{K}$
that is, the average earnings of current graduates.
But that’s only an approximation! What could be wrong?
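The approximation above amounts to taking a plain average. The sketch below uses a made-up list of five earnings figures (these numbers are purely hypothetical, not data from the course) to show that giving each of the K observed values weight 1/K gives the sample mean.

```python
# HYPOTHETICAL annual earnings (in AED) of K = 5 current graduates.
# These figures are invented for illustration only.
graduate_earnings = [100_000, 120_000, 150_000, 180_000, 250_000]

K = len(graduate_earnings)

# Approximate E(Y) by sum_k y_k * (1/K), i.e. the sample average.
approx_expected_earnings = sum(graduate_earnings) / K

print(approx_expected_earnings)  # 160000.0
```

The result is only as good as the assumption behind it: that future graduates face the same earnings distribution as the current graduates in the sample.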
10
Properties of the Expectation
– The expectation of a sum (or difference) is the sum (or difference) of the expectations: E(earnings – debt) = E(earnings) – E(debt).
– The expectation of a constant times a random variable is the constant times the expectation: E(Constant × Earnings) = Constant × E(Earnings). E.g. E(Earnings in AED) = 3.6 × E(Earnings in USD).
– Beware! E(XY) is not E(X) E(Y) in general. When X and Y are independent, E(XY) = E(X) E(Y).
– Law of conditional expectation: E(X) = E(E(X|Z)).
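These properties can be checked numerically on small examples. The sketch below is an illustration built on two assumed toy cases, not taken from the slides: two independent fair dice, and a "dependent" case in which Y is always equal to X. The helper name `expect` is hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)


def expect(f, joint):
    """E[f(X, Y)] for a joint distribution given as {(x, y): probability}."""
    return sum(f(x, y) * prob for (x, y), prob in joint.items())


# Case 1: X and Y are two independent fair dice.
independent = {(x, y): p * p for x, y in product(faces, faces)}
# Case 2: Y is always equal to X (fully dependent).
dependent = {(x, x): p for x in faces}

results = {}
for name, joint in [("independent", independent), ("dependent", dependent)]:
    e_x = expect(lambda x, y: x, joint)
    e_y = expect(lambda x, y: y, joint)
    results[name] = {
        # E(X + Y) == E(X) + E(Y): holds in both cases.
        "sum_rule": expect(lambda x, y: x + y, joint) == e_x + e_y,
        # E(3 X) == 3 E(X): holds in both cases.
        "scale_rule": expect(lambda x, y: 3 * x, joint) == 3 * e_x,
        # E(X Y) == E(X) E(Y): only guaranteed under independence.
        "product_rule": expect(lambda x, y: x * y, joint) == e_x * e_y,
    }
    print(name, results[name])
```

In the dependent case E(XY) = E(X²) = 91/6, while E(X) E(Y) = 49/4, so the product rule fails even though the sum and scaling rules still hold.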
12
A particular distribution
Some random variables have a particular “bell-shaped” distribution:
– Individuals’ height. What is the distribution of height at age 20? P(height). What height can I expect for my child? E(height).
– Individuals’ weight. What is the distribution of weight at age 35? P(weight). What weight can I expect at age 35? E(weight).
– The logarithm of income. What is the distribution of the log of income after graduation? P(log(income)). What log income can I expect after graduation? E(log(income)).
The “bell-shaped” distribution will now be called a “normal” distribution.
13
The normal distribution
“The normal distribution is symmetric, bell shaped, and characterized by its mean $\mu$ and standard deviation $\sigma$. The probability within any particular number of standard deviations of $\mu$ is the same for all normal distributions.”
$P(\mu - \sigma < \text{height} < \mu + \sigma) = 0.68$, or 68%
$P(\mu - 2\sigma < \text{height} < \mu + 2\sigma) = 0.95$, or 95%
$P(\mu - 3\sigma < \text{height} < \mu + 3\sigma) = 0.997$, or 99.7%
All of these are “events”.
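The three probabilities above can be reproduced with the standard library. For any normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / √2), where erf is the error function; the helper name `prob_within` is just an illustrative choice.

```python
import math


def prob_within(k):
    """P(mu - k*sigma < Y < mu + k*sigma) for a normally distributed Y."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on the mean or the standard deviation, which is exactly the point of the quoted sentence. The slide's 68%, 95% and 99.7% are these values rounded.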
14
The normal distribution
Draw a histogram with a very small bin size, so that the little stairs disappear and a smooth curve appears.
15
Comparing test scores across colleges
– “Early paleontology in Indianapolis”: test scores have a normal distribution with mean 3 and standard deviation 4.
– “Hip hop in the Middle East”: test scores have a normal distribution with mean 4 and standard deviation 1.
Problem: how do I compare Marina’s test score of 3.6 in the paleontology course with a test score of 4.1 in “Hip hop in the Middle East”?
16
Z-score!
Take a student’s paleontology test score at the end of the semester. This is a random variable.
– Its probability distribution has a mean of $\mu = 3$ and a standard deviation of $\sigma = 4$.
– Now consider the “z-scored” paleontology test score: $z = (\text{score} - \mu) / \sigma = (\text{score} - 3) / 4$.
– The z-scored paleontology test score has a mean of 0 and a standard deviation of 1.
17
Standard Normal Distribution
The standard normal distribution is simply the normal distribution with mean 0 and standard deviation 1.
A z-score of 3 means that the student is three standard deviations (of the original test scores) above the mean.
So who has a better grade, Marina or Slavoj?
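The comparison can be sketched with the usual z-score formula, z = (score – mean) / standard deviation. The sketch takes the paleontology course to have mean 3 and standard deviation 4, as on the z-score slide, and therefore the hip hop course to have mean 4 and standard deviation 1. It also assumes that the score of 4.1 is Slavoj's, which the slides suggest but do not state.

```python
def z_score(score, mean, sd):
    """Number of standard deviations by which a score lies above the mean."""
    return (score - mean) / sd


# Marina: 3.6 in "Early paleontology in Indianapolis" (mean 3, sd 4).
z_marina = z_score(3.6, mean=3, sd=4)

# ASSUMPTION: the 4.1 in "Hip hop in the Middle East" (mean 4, sd 1) is Slavoj's.
z_slavoj = z_score(4.1, mean=4, sd=1)

print(round(z_marina, 2), round(z_slavoj, 2))  # 0.15 0.1
print("Marina" if z_marina > z_slavoj else "Slavoj")
```

Under these assumptions Marina's score sits about 0.15 standard deviations above her course mean, against about 0.10 for the score of 4.1. Relative to classmates, her grade is slightly better even though the raw number is lower.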
19
Who will win the midterm elections in the US?
Midterm elections are held two years after the presidential elections in the United States. They take place in early November 2014.
A question: what fraction of the voters will vote for a Democrat in Colorado?
20
Wrap up
A random variable is a variable whose value has not been realized.
The expectation of a random variable Y is: $E(Y) = \sum_{k=1}^{K} y_k P(Y = y_k)$
Also, E(X+Y) = E(X) + E(Y), E(cX) = c E(X), and E(E(X|Z)) = E(X).
Typically the probability distribution P is not known, but we approximate it:
– Using the distribution of past values of Y (example: earnings of previous graduates).
– Using polls, for example to ask individuals how they will vote.
The normal distribution is a ubiquitous distribution that is symmetric and bell shaped. It is characterized by its mean $\mu$ and its standard deviation $\sigma$.
The standard normal distribution has mean 0 and standard deviation 1.
21
Coming up:
Readings: Chapter 4 in its entirety. It is full of interesting examples and super relevant.
Online quiz tonight.
Go to the website http://www.realclearpolitics.com/epolls/2014/senate/co/colorado_senate_gardner_vs_udall-3845.html and prepare one or two slides to present the race in Colorado.
– Who do you think will win?
– What is MoE?
– What is the likely distribution of the “fraction of voters who will vote for Gardner”?
For help:
Amine Ouazad, Office 1135, Social Science building, amine.ouazad@nyu.edu. Office hour: Tuesday from 5 to 6.30pm.
GAF: Irene Paneda, Irene.paneda@nyu.edu. Sunday recitations. At the Academic Resource Center, Monday from 2 to 4pm.