Statistics for Social and Behavioral Sciences Session #11: Random Variable, Expectations (Agresti and Finlay, Chapter 4) Prof. Amine Ouazad.


Statistics Course Outline
Part I. Introduction and Research Design (Week 1)
Part II. Describing Data (Weeks 2-4)
Part III. Drawing Conclusions from Data: Inferential Statistics (Weeks 5-9)
Part IV. Correlation and Causation: Regression Analysis (Weeks …)
This is where we talk about Zmapp and Ebola!
Firenze or Lebanese Express?

Last Session
Four rules of probability distributions:
1. P(not A) = 1 – P(A)
2. P(A or B) = P(A) + P(B) when P(A and B) = 0
3. P(A and B) = P(A) P(B given A)
3’. P(A and B) = P(A) P(B) when A and B are independent
Inverse Probability Fallacy:
– P(A|B) is not P(B|A); equivalently, P(B given A) is not P(A given B).
– We have a formula: P(A|B) = (P(B|A) P(A)) / P(B)
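
The inverse probability fallacy and the formula above can be checked on a small example. The sketch below assumes one throw of a balanced six-sided die, with A = "the die shows 6" and B = "the die shows an even number"; these events and the helper names are chosen here for illustration and are not from the lecture.

```python
from fractions import Fraction

# Sample space: one throw of a balanced six-sided die (assumed for illustration).
outcomes = [1, 2, 3, 4, 5, 6]

def prob(event):
    """Probability of an event, given as a predicate on equally likely outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def is_six(o):   # event A
    return o == 6

def is_even(o):  # event B
    return o % 2 == 0

p_a = prob(is_six)
p_b = prob(is_even)
p_a_and_b = prob(lambda o: is_six(o) and is_even(o))

# Rule 3 rearranged: P(B given A) = P(A and B) / P(A), and likewise for P(A given B).
p_b_given_a = p_a_and_b / p_a
p_a_given_b = p_a_and_b / p_b

# The formula from the slide recovers P(A|B) from P(B|A).
bayes = p_b_given_a * p_a / p_b

print("P(B|A) =", p_b_given_a)
print("P(A|B) =", p_a_given_b)
print("Formula gives P(A|B) =", bayes)
```

Here P(B|A) and P(A|B) differ, which is exactly why one cannot be substituted for the other.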

Outline
1. Random Variable
– Probability distribution of a random variable
– Expectation of a random variable
2. The normal distribution
3. Polls and normal distributions
Next time: Probability Distributions (continued)
Chapter 4 of A&F

Random variable
A random variable is a variable whose value is not given ex-ante, but rather can take one of several values ex-post.
Example:
– X is a random variable that, before the coin is tossed (ex-ante), can take the values “Heads” or “Tails”. Once the coin is tossed (ex-post), the value of X is known: it is either “Heads” or “Tails”.
– Y is a random variable that can take the values 1, 2, 3, 4, 5, or 6 depending on the throw of a die. Before the die is thrown, the value is not known. After the die is thrown, we know the value of Y.

Probability distribution of a random variable
Take all possible values of a random variable Y:
– Example: 1, 2, 3, 4, 5, 6
– In general: $y_1, y_2, y_3, \dots, y_K$.
The probability of the event that the random variable Y equals $y_k$ is denoted $P(Y = y_k)$ or simply $P(y_k)$.
The probability distribution of the random variable Y is the list of all values of $P(Y = y_k)$.
Example: for a balanced die, the probability distribution of Y is the list of values P(Y=1), P(Y=2), P(Y=3), …, P(Y=6), which is {1/6, 1/6, 1/6, 1/6, 1/6, 1/6}.
Throughout the course we consider either discrete quantitative random variables or categorical random variables.
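
As a minimal sketch, the distribution of the balanced die can be written as a table from values to probabilities. The variable names are illustrative only.

```python
from fractions import Fraction

# Probability distribution of Y, the outcome of a balanced six-sided die:
# each value y_k from 1 to 6 is mapped to P(Y = y_k) = 1/6.
distribution = {y: Fraction(1, 6) for y in range(1, 7)}

# Every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in distribution.values())

# The probabilities of all possible values add up to one.
total = sum(distribution.values())
print("Sum of probabilities:", total)
```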

Expected value of a random variable
What are your expected gains when playing the coin game? Should I play this game at all?
Gain is a random variable, equal to +10 AED when getting heads and -10 AED when getting tails.
E(gain) = (gain when getting heads) × P(heads) + (gain when getting tails) × P(tails).
In general, for a random variable Y, the expected value of Y is: $E(Y) = \sum_k y_k P(Y = y_k)$
Also note that probabilities sum to one: $\sum_k P(Y = y_k) = 1$
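
The expected-value formula can be applied directly to the coin game. The sketch below assumes the coin is fair, P(heads) = P(tails) = 1/2, which the slide does not state explicitly; the function name `expectation` is chosen here for illustration.

```python
from fractions import Fraction

def expectation(distribution):
    """E(Y) = sum over k of y_k * P(Y = y_k), for a dict {value: probability}."""
    return sum(y * p for y, p in distribution.items())

# Coin game: +10 AED on heads, -10 AED on tails.
# Assumption: the coin is fair, so each outcome has probability 1/2.
coin_game = {10: Fraction(1, 2), -10: Fraction(1, 2)}

# Balanced die, for comparison.
die = {y: Fraction(1, 6) for y in range(1, 7)}

expected_gain = expectation(coin_game)
expected_die = expectation(die)

print("Expected gain of the coin game (AED):", expected_gain)
print("Expected value of a die throw:", expected_die)
```

Under the fair-coin assumption, the gain and the loss cancel exactly, so the expected gain is zero.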

Expected Earnings?
Hmm, how much will I earn?
“Your annual earnings right after NYU Abu Dhabi” is a random variable:
– The variable has not been realized yet.
Let’s give it a name: Y = “Your annual earnings right after NYU Abu Dhabi”. Y potentially takes K values.
$E(\text{earnings}) = E(Y) = \sum_k y_k P(Y = y_k)$
Problem: we don’t observe earnings in the future!

Expected Earnings?
Hmm, how much will I earn?
An approximation is to use the distribution of earnings of current graduates, to substitute for our lack of knowledge of $P(Y = y_k)$ for each k.
Earnings take K distinct values: no two graduates earn exactly the same annual wage.
Hence an approximation of expected earnings is $E(Y) \approx \sum_{k=1}^{K} y_k \times \frac{1}{K}$, the average earnings of current graduates.
But that’s only an approximation! What could be wrong?
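
A minimal sketch of this approximation follows. The earnings figures are made-up placeholder numbers for illustration only, not data from the course or from any survey.

```python
from fractions import Fraction

# Hypothetical annual earnings of K current graduates (placeholder values,
# invented purely to illustrate the formula; no two values are equal).
earnings = [52000, 61000, 48000, 75000, 58000]
K = len(earnings)

# Approximation from the slide: each observed value gets probability 1/K.
approx_expected_earnings = sum(y * Fraction(1, K) for y in earnings)

# This is exactly the ordinary average of the observed earnings.
sample_mean = sum(earnings) / K

print("Approximate E(Y):", float(approx_expected_earnings))
print("Sample mean:     ", sample_mean)
```

The two numbers coincide by construction; whether they are close to your own expected earnings depends on how similar you are to current graduates.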

Properties of the Expectation
– The expectation of a sum (or difference) is the sum (or difference) of the expectations: E(earnings – debt) = E(earnings) – E(debt)
– The expectation of a constant times a random variable is the constant times the expectation: E(constant × earnings) = constant × E(earnings). E.g. E(earnings in AED) = 3.6 × E(earnings in USD)
– Beware! E(XY) is not E(X) E(Y) in general. When X and Y are independent, E(XY) = E(X) E(Y).
– Law of conditional expectation: E(X) = E(E(X|Z))
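
These properties can be verified on a small example. The sketch below assumes two independent balanced dice X and Y (an example chosen here, not taken from the lecture), and uses X·X as a case where the two factors are clearly not independent.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two independent balanced dice (assumed example):
# all 36 pairs (x, y) are equally likely.
pairs = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(pairs))

def E(f):
    """Expectation of f(X, Y) under the joint distribution of the two dice."""
    return sum(f(x, y) * p for x, y in pairs)

e_x = E(lambda x, y: x)
e_y = E(lambda x, y: y)

e_sum = E(lambda x, y: x + y)        # expectation of a sum
e_scaled = E(lambda x, y: 3 * x)     # expectation of a constant times X
e_product = E(lambda x, y: x * y)    # X and Y independent
e_x_squared = E(lambda x, y: x * x)  # X is not independent of itself

print("E(X+Y) == E(X)+E(Y):", e_sum == e_x + e_y)
print("E(3X) == 3 E(X):    ", e_scaled == 3 * e_x)
print("E(XY) == E(X)E(Y):  ", e_product == e_x * e_y)
print("E(X*X) == E(X)E(X): ", e_x_squared == e_x * e_x)
```

The last comparison is the only one that fails, which illustrates the warning on the slide.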

A particular distribution
Some random variables have a particular “bell-shaped” distribution:
– Individuals’ height. What is the distribution of height at age 20? P(height). What height can I expect for my child? E(height)
– Individuals’ weight. What is the distribution of weight at age 35? P(weight). What weight can I expect at age 35? E(weight)
– The logarithm of income. What is the distribution of the log of income after graduation? P(log(income)). What log income can I expect after graduation? E(log(income))
The “bell-shaped” distribution will now be called a “normal” distribution.

The normal distribution
“The normal distribution is symmetric, bell shaped, and characterized by its mean $\mu$ and standard deviation $\sigma$. The probability within any particular number of standard deviations of $\mu$ is the same for all normal distributions.”
$P(\mu - \sigma < \text{height} < \mu + \sigma) = 0.68$, or 68%
$P(\mu - 2\sigma < \text{height} < \mu + 2\sigma) = 0.95$, or 95%
$P(\mu - 3\sigma < \text{height} < \mu + 3\sigma) = 0.997$, or 99.7%
All of these are “events”.
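
The three probabilities above can be reproduced numerically. The sketch below relies on the standard identity that, for any normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / √2); the function name `prob_within` is chosen here for illustration.

```python
import math

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal random variable X.

    The result does not depend on mu or sigma, only on k.
    """
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {prob_within(k):.4f}")
```

Rounded, the results match the 68%, 95%, and 99.7% figures quoted on the slide.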

The normal distribution
Draw a histogram with a very small bin size, so that the little stairs disappear and a curve appears.

Comparing test scores across colleges
– “Early paleontology in Indianapolis”: test scores have a normal distribution with mean 3 and standard deviation 4.
– “Hip hop in the Middle East”: test scores have a normal distribution with mean 4 and standard deviation 1.
Problem: how do I compare Marina’s test score of 3.6 in the paleontology course with a test score of 4.1 in “Hip hop in the Middle East”?

Z-score!
Take a student’s paleontology test score at the end of the semester. This is a random variable.
– Its probability distribution has a mean of $\mu = 3$ and a standard deviation of $\sigma = 4$.
– Now consider the “z-scored” paleontology test score: $z = \frac{\text{score} - \mu}{\sigma} = \frac{\text{score} - 3}{4}$
– The z-scored paleontology test score has a mean of 0 and a standard deviation of 1.

Standard Normal Distribution
The standard normal distribution is simply the normal distribution with mean 0 and standard deviation 1.
A z-score of 3 means that the student is three standard deviations (of the original test scores) above the mean.
So who has a better grade, Marina or Slavoj?
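
The comparison can be sketched by z-scoring both test scores. Two assumptions are made here: the paleontology course has mean 3 and standard deviation 4 (as stated on the z-score slide) while the hip hop course has mean 4 and standard deviation 1, and the score of 4.1 in the hip hop course belongs to Slavoj, which the slides suggest but do not state outright.

```python
def z_score(score, mean, sd):
    """Number of standard deviations by which a score lies above the mean."""
    return (score - mean) / sd

# Marina: 3.6 in the paleontology course (mean 3, standard deviation 4).
marina_z = z_score(3.6, mean=3, sd=4)

# Assumed to be Slavoj: 4.1 in the hip hop course (mean 4, standard deviation 1).
slavoj_z = z_score(4.1, mean=4, sd=1)

print(f"Marina's z-score: {marina_z:.2f}")
print(f"Slavoj's z-score: {slavoj_z:.2f}")
print("Higher relative standing:", "Marina" if marina_z > slavoj_z else "Slavoj")
```

Under these assumptions the raw score of 4.1 is higher, but Marina sits slightly further above her course's mean in standard-deviation units.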

Who will win the midterm elections in the US?
Midterm elections are held two years after the presidential elections in the United States. They take place in early November.
A question: what fraction of the voters will vote for a Democrat in Colorado?

Wrap up
A random variable is a variable whose value has not been realized.
The expectation of a random variable Y is: $E(Y) = \sum_k y_k P(Y = y_k)$
Also, E(X+Y) = E(X) + E(Y), E(cX) = c E(X), and E(E(X|Z)) = E(X).
Typically the probability distribution P is not known, but we approximate it:
– Using the distribution of past values of Y (example: earnings of previous graduates).
– Using polls, for example to ask individuals how they will vote.
The normal distribution is a ubiquitous distribution that is symmetric and bell shaped. It is characterized by its mean $\mu$ and its standard deviation $\sigma$.
The standard normal distribution has mean 0 and standard deviation 1.

Coming up:
Readings: Chapter 4 entirely, full of interesting examples and super relevant.
Online quiz tonight.
Go to the website l-3845.html and prepare one or two slides to present the race in Colorado.
– Who do you think will win?
– What is MoE?
– What is the likely distribution of the “fraction of voters who will vote for Gardner”?
For help:
Amine Ouazad, Office 1135, Social Science building. Office hour: Tuesday from 5 to 6:30pm.
GAF: Irene Paneda, Sunday recitations.
At the Academic Resource Center, Monday from 2 to 4pm.