Conditional Distributions and the Bivariate Normal Distribution James H. Steiger.

Slides:



Advertisements
Similar presentations
Kin 304 Regression Linear Regression Least Sum of Squares
Advertisements

Review of Probability. Definitions (1) Quiz 1.Let’s say I have a random variable X for a coin, with event space {H, T}. If the probability P(X=H) is.
Bivariate Analyses.
Theoretical Probability Distributions We have talked about the idea of frequency distributions as a way to see what is happening with our data. We have.
© 2010 Pearson Prentice Hall. All rights reserved Least Squares Regression Models.
LINEAR REGRESSION: Evaluating Regression Models Overview Assumptions for Linear Regression Evaluating a Regression Model.
LINEAR REGRESSION: Evaluating Regression Models. Overview Assumptions for Linear Regression Evaluating a Regression Model.
Correlation and Covariance
Basic Statistical Concepts
Correlation-Regression The correlation coefficient measures how well one can predict X from Y or Y from X.
Statistics Psych 231: Research Methods in Psychology.
Analysis of Research Data
1.2: Describing Distributions
 The Law of Large Numbers – Read the preface to Chapter 7 on page 388 and be prepared to summarize the Law of Large Numbers.
Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 4: The Normal Distribution and Z-Scores.
Chap 3-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 3 Describing Data: Numerical Statistics for Business and Economics.
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution Business Statistics: A First Course 5 th.
Regression Analysis Regression analysis is a statistical technique that is very useful for exploring the relationships between two or more variables (one.
6.1 What is Statistics? Definition: Statistics – science of collecting, analyzing, and interpreting data in such a way that the conclusions can be objectively.
Investment Analysis and Portfolio Management Chapter 7.
PSYCHOLOGY: Themes and Variations Weiten and McCann Appendix B : Statistical Methods Copyright © 2007 by Nelson, a division of Thomson Canada Limited.
The Mean of a Discrete Probability Distribution
NOTES The Normal Distribution. In earlier courses, you have explored data in the following ways: By plotting data (histogram, stemplot, bar graph, etc.)
CONTINUOUS RANDOM VARIABLES AND THE NORMAL DISTRIBUTION.
Linear Regression James H. Steiger. Regression – The General Setup You have a set of data on two variables, X and Y, represented in a scatter plot. You.
Stats/Methods I JEOPARDY. Jeopardy CorrelationRegressionZ-ScoresProbabilitySurprise $100 $200$200 $300 $500 $400 $300 $400 $300 $400 $500 $400.
Correlation is a statistical technique that describes the degree of relationship between two variables when you have bivariate data. A bivariate distribution.
Points in Distributions n Up to now describing distributions n Comparing scores from different distributions l Need to make equivalent comparisons l z.
Some probability distribution The Normal Distribution
Correlation and Regression Chapter 9. § 9.3 Measures of Regression and Prediction Intervals.
Introduction to Probability and Statistics Thirteenth Edition Chapter 12 Linear Regression and Correlation.
Regression Regression relationship = trend + scatter
Educ 200C Wed. Oct 3, Variation What is it? What does it look like in a data set?
Chapter 6 The Normal Distribution. 2 Chapter 6 The Normal Distribution Major Points Distributions and area Distributions and area The normal distribution.
STA 286 week 131 Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression.
Review Lecture 51 Tue, Dec 13, Chapter 1 Sections 1.1 – 1.4. Sections 1.1 – 1.4. Be familiar with the language and principles of hypothesis testing.
Discrete Distributions. Random Variable - A numerical variable whose value depends on the outcome of a chance experiment.
1 Competence and a new baby, M | mastery baby | 0 1 | Total | | 151 | | |
Scatter Diagram of Bivariate Measurement Data. Bivariate Measurement Data Example of Bivariate Measurement:
Advanced Statistical Methods: Continuous Variables REVIEW Dr. Irina Tomescu-Dubrow.
Data Analysis, Presentation, and Statistics
Summarizing Data Graphical Methods. Histogram Stem-Leaf Diagram Grouped Freq Table Box-whisker Plot.
AP Statistics Section 15 A. The Regression Model When a scatterplot shows a linear relationship between a quantitative explanatory variable x and a quantitative.
MM150 Unit 9 Seminar. 4 Measures of Central Tendency Mean – To find the arithmetic mean, or mean, sum the data scores and then divide by the number of.
Advanced Algebra The Normal Curve Ernesto Diaz 1.
The simple linear regression model and parameter estimation
Chapter 7 The Normal Probability Distribution
Correlation and Covariance
Probability Imagine tossing two coins and observing whether 0, 1, or 2 heads are obtained. It would be natural to guess that each of these events occurs.
Normal Distribution.
Chapter 6 The Normal Curve.
BIOS 501 Lecture 3 Binomial and Normal Distribution
Lecture Slides Elementary Statistics Thirteenth Edition
AP Exam Review Chapters 1-10
CHAPTER 26: Inference for Regression
Chapter 16.
Introduction to Probability and Statistics Thirteenth Edition
Discrete Distributions
Modeling Continuous Variables
Correlation and Covariance
Ch11 Curve Fitting II.
Discrete Distributions
The normal distribution
Discrete Distributions.
Ch 4.1 & 4.2 Two dimensions concept
Warsaw Summer School 2017, OSU Study Abroad Program
Created by Erin Hodgess, Houston, Texas
The Normal Distribution
Presentation transcript:

Conditional Distributions and the Bivariate Normal Distribution James H. Steiger

Overview In this module, we have several goals: Introduce several technical terms Bivariate frequency distribution Marginal distribution Conditional distribution In connection with the bivariate normal distribution, discuss Conditional distribution calculations Regression toward the mean

Bivariate Frequency Distributions So far, we have discussed univariate frequency distributions, which table or plot a set of values and their frequencies. Xf

Bivariate Frequency Distributions We need two dimensions to table or plot a univariate (1-variable) frequency distribution – one dimension for the value, one dimension for its frequency.

Bivariate Frequency Distributions A bivariate frequency distribution presents in a table or plot pairs of values on two variables and their frequencies.

Bivariate Frequency Distributions For example, suppose you throw two coins, X and Y, simultaneously and record the outcome as an ordered pair of values. Imagine that you threw the coin 8 times, and observed the following (1=Head, 0 = Tail) (X,Y)f (1,1)2 (1,0)2 (0,1)2 (0,0)2

Bivariate Frequency Distributions To graph the bivariate distribution, you need a 3 dimensional plot, although this can be drawn in perspective in 2 dimensions (X,Y)f (1,1)2 (1,0)2 (0,1)2 (0,0)2

Bivariate Frequency Distributions

Marginal Distributions The bivariate distribution of X and Y shows how they behave together. We may also be interested in how X and Y behave separately. We can obtain this information for either X or Y by collapsing (summing) over the opposite variable. For example:

Marginal Distribution of Y

Conditional Distributions The conditional distribution of Y given X=a is the distribution of Y for only those occasions when X takes on the value a. Example: The conditional distribution of Y given X=1 is obtained by extracting from the bivariate distribution only those pairs of scores where X=1, then tabulating the frequency distribution of Y on those occasions.

Conditional Distributions The conditional distribution of Y given X=1 is: 12 02

Conditional Distributions While marginal distributions are obtained from the bivariate by summing, conditional distributions are obtained by “making a cut” through the bivariate distribution.

Marginal, Conditional, and Bivariate Relative Frequencies The notion of relative frequency generalizes easily to bivariate, marginal, and conditional probability distributions. In all cases, the frequencies are rescaled by dividing by the total number of observations in the current distribution table.

Bivariate Continuous Probability Distributions With continuous distributions, we plot probability density. In this case, the resulting plot looks like a mountainous terrain, as probability density is registered on a third axis. The most famous bivariate continuous probability distribution is the bivariate normal. Glass and Hopkins discuss the properties of this distribution in some detail.

Bivariate Continuous Probability Distributions Characteristics of the Bivariate Normal Distribution Marginal Distributions are normal Conditional Distributions are normal, with constant variance for any conditional value. Let b and c be the slope and intercept of the linear regression line for predicting Y from X.

The Bivariate Normal Distribution

Computing Conditional Distributions We simply use the formulas We have two choices : Process the entire problem in the original metric, or Process in Z score form, then convert to the original metric.

Conditional Distribution Problems The distribution of IQ for women (X) and their daughters (Y) is bivariate normal, with the following characteristics: Both X and Y have means of 100 and standard deviations of 15. The correlation between X and Y is.60.

Conditional Distribution Problems First we will process in the original metric. Here are some standard questions. What are the linear regression coefficients for predicting Y from X? They are

Conditional Distribution Problems What is the distribution of IQ scores for women whose mothers had an IQ of 145? The mean follows the linear regression rule: The standard deviation is

Conditional Distribution Problems What percentage of daughters of mothers with IQ scores of 145 will have an IQ at least as high as their mother?

Conditional Distribution Problems We simply compute the probability of obtaining a score of 145 or higher in a normal distribution with a mean of 127 and a standard deviation of 12. We have: The area above 1.5 in the standard normal curve is 6.68%.

Conditional Distribution Problems What are the social implications of this result?