Lecture Eleven Probability Models.


Outline: Bayesian Probability; Duration Models

Bayesian Probability: Facts. The incidence of the disease in the population is one in a thousand. The probability of testing positive if you have the disease is 99 in 100. The probability of testing positive if you do not have the disease is 2 in 100.

Joint and Marginal Probabilities

Filling In Our Facts

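Using Conditional Probability (S = sick, H = healthy, + = positive test): $\Pr(+ \cap H) = \Pr(+ \mid H)\,\Pr(H) = 0.02 \times 0.999 = 0.01998$ and $\Pr(+ \cap S) = \Pr(+ \mid S)\,\Pr(S) = 0.99 \times 0.001 = 0.00099$.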

Filling In Our Facts

By Sum and By Difference

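False Positive Paradox: what is the probability of being sick if you test positive, $\Pr(S \mid +)$? From conditional probability: $\Pr(S \mid +) = \Pr(S \cap +)/\Pr(+) = 0.00099/0.02097 = 0.0472$. Fewer than 5% of those who test positive are actually sick, because the disease is rare.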

Bayesian Probability By Formula: $\Pr(S \mid +) = \Pr(S \cap +)/\Pr(+) = \Pr(+ \mid S)\,\Pr(S)/\Pr(+)$, where $\Pr(+) = \Pr(+ \mid S)\,\Pr(S) + \Pr(+ \mid H)\,\Pr(H)$. Using our facts: $\Pr(S \mid +) = 0.99 \times 0.001/[0.99 \times 0.001 + 0.02 \times 0.999] = 0.00099/[0.00099 + 0.01998] = 0.00099/0.02097 = 0.0472$.
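The calculation above can be checked with a few lines of Python. This is only a sketch: the function and parameter names are illustrative and not from the lecture, while the three input probabilities are the facts stated earlier.

```python
def posterior_sick_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return Pr(S | +) by Bayes' rule (names are illustrative, not from the lecture)."""
    joint_sick_pos = sensitivity * prevalence                    # Pr(+ and S)
    joint_healthy_pos = false_positive_rate * (1 - prevalence)   # Pr(+ and H)
    marginal_pos = joint_sick_pos + joint_healthy_pos            # Pr(+)
    return joint_sick_pos / marginal_pos


p = posterior_sick_given_positive(
    prevalence=0.001, sensitivity=0.99, false_positive_rate=0.02
)
print(round(p, 4))  # 0.0472
```

Raising the prevalence argument shows how quickly the posterior climbs when the disease is less rare.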

Duration Models. Exploratory (graphical) estimates: Kaplan-Meier. Functional form estimates: exponential distribution.

Duration of Post-War Economic Expansions in Months

Estimated Survivor Function for Ten Post-War Expansions

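Kaplan-Meier Estimate of Survivor Function. At each duration at which an expansion ends, the survival factor is (# at risk - # ending)/# at risk; the survivor function is the cumulative product of these factors.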
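A minimal sketch of the Kaplan-Meier calculation follows. The durations are hypothetical placeholders, not the post-war expansion data from the lecture, and the `ended` flag for censored spells is an addition for illustration.

```python
def kaplan_meier(durations, ended=None):
    """Return [(t, S(t))] at each distinct time where at least one spell ends.

    ended[i] is False for a censored spell (still ongoing when last observed).
    """
    if ended is None:
        ended = [True] * len(durations)
    survivor = 1.0
    curve = []
    end_times = sorted({d for d, e in zip(durations, ended) if e})
    for t in end_times:
        at_risk = sum(1 for d in durations if d >= t)
        ending = sum(1 for d, e in zip(durations, ended) if e and d == t)
        # Multiply by the factor (# at risk - # ending) / # at risk.
        survivor *= (at_risk - ending) / at_risk
        curve.append((t, survivor))
    return curve


# Hypothetical durations in months (NOT the lecture's data).
spells = [12, 24, 24, 36, 58]
for t, s in kaplan_meier(spells):
    print(t, round(s, 3))
```

With no censoring, the estimate at each time is simply the fraction of spells lasting longer than that time.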

Exponential Distribution. Density: $f(t) = \lambda \exp[-\lambda t]$, $t \ge 0$. Cumulative distribution function: $F(t) = \int_0^t \lambda \exp[-\lambda u]\,du = -\exp[-\lambda u]\,\big|_0^t = -1\{\exp[-\lambda t] - \exp[0]\} = 1 - \exp[-\lambda t]$. Survivor function: $S(t) = 1 - F(t) = \exp[-\lambda t]$. Taking logarithms, $\ln S(t) = -\lambda t$.

So $\lambda = 0.022$
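The transcript does not show how this value was obtained. One plausible route, offered here only as an assumption, is a least-squares line through the origin for the log survivor function against time, whose slope is minus the rate. The data points below are hypothetical and generated from an exact exponential, so the fit simply recovers the rate used to build them.

```python
import math


def fit_rate(times, survivor):
    """Least-squares slope through the origin for ln S(t) = -rate * t."""
    pairs = [(t, math.log(s)) for t, s in zip(times, survivor) if s > 0]
    numerator = sum(t * y for t, y in pairs)
    denominator = sum(t * t for t, _ in pairs)
    return -numerator / denominator


# Hypothetical survivor values at a few durations (months), built from rate 0.022.
times = [10, 20, 40, 80]
survivor = [math.exp(-0.022 * t) for t in times]
print(round(fit_rate(times, survivor), 3))  # 0.022
```

With real Kaplan-Meier estimates the points would not lie exactly on a line, and how far they stray gives a rough check on the exponential assumption.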

Exponential Distribution (Cont.). Mean = $1/\lambda$. Memoryless feature: the duration conditional on surviving until $t = \tau$ is $\mathrm{DURC}(\tau) = \tau + 1/\lambda$. The expected remaining duration is the duration conditional on surviving until time $\tau$, i.e. DURC, minus $\tau$, or $1/\lambda$, which is equal to the overall mean, so the distribution is memoryless.

Exponential Distribution (Cont.). The hazard rate or function, $h(t)$, is the probability of failure conditional on survival until that time, and is the ratio of the density function to the survivor function. It is a constant for the exponential: $h(t) = f(t)/S(t) = \lambda \exp[-\lambda t]/\exp[-\lambda t] = \lambda$.
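The constant hazard and the memoryless property can be checked numerically. The sketch below uses the rate 0.022 per month quoted in the lecture; the function names and the evaluation points are illustrative choices.

```python
import math

RATE = 0.022  # per month, the value quoted in the lecture


def density(t, rate=RATE):
    return rate * math.exp(-rate * t)


def survivor(t, rate=RATE):
    return math.exp(-rate * t)


def hazard(t, rate=RATE):
    # Ratio of density to survivor function.
    return density(t, rate) / survivor(t, rate)


# The hazard is the same at every duration.
print([round(hazard(t), 3) for t in (1, 30, 120)])  # [0.022, 0.022, 0.022]

# Mean duration implied by the rate, in months.
print(round(1 / RATE, 1))  # 45.5

# Memoryless: Pr(T > s + t | T > s) equals Pr(T > t).
s, t = 24, 12
print(math.isclose(survivor(s + t) / survivor(s), survivor(t)))  # True
```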

Model Building Reference: Ch 20

20.2 Polynomial Models. There are models where the independent variables ($x_i$) may appear as functions of a smaller number of predictor variables. Polynomial models are one such example.

Polynomial Models with One Predictor Variable. The general model $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p + e$ becomes $y = b_0 + b_1 x + b_2 x^2 + \dots + b_p x^p + e$, that is, $x_i = x^i$.

Polynomial Models with One Predictor Variable. First order model ($p = 1$): $y = b_0 + b_1 x + e$. Second order model ($p = 2$): $y = b_0 + b_1 x + b_2 x^2 + e$. [Figure: second order curves for $b_2 < 0$ and $b_2 > 0$.]

Polynomial Models with One Predictor Variable. Third order model ($p = 3$): $y = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + e$. [Figure: third order curves for $b_3 < 0$ and $b_3 > 0$.]
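To make the polynomial model concrete, here is a small least-squares fit written in plain Python. It is a sketch, not the textbook's procedure: it solves the normal equations by Gaussian elimination, and the data are hypothetical and noise-free, so the fit returns the coefficients used to generate them. Real analyses would normally use a statistics package.

```python
def fit_polynomial(xs, ys, p):
    """Least-squares estimates [b0, ..., bp] for y = b0 + b1*x + ... + bp*x^p + e."""
    n = p + 1
    # Normal equations (X'X) b = X'y, where row i of X is [1, x_i, ..., x_i^p].
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    b = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (rhs[i] - tail) / a[i][i]
    return b


# Hypothetical data from a second order model with b0 = 2, b1 = 3, b2 = -0.5.
xs = [0, 1, 2, 3, 4, 5]
ys = [2 + 3 * x - 0.5 * x ** 2 for x in xs]
print([round(v, 6) for v in fit_polynomial(xs, ys, 2)])
```

The negative second order coefficient in this example gives a curve that opens downward.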

Polynomial Models with Two Predictor Variables. First order model: $y = b_0 + b_1 x_1 + b_2 x_2 + e$. [Figure: $y$ plotted against $x_1$ and $x_2$ for different signs of $b_1$ and $b_2$.]

20.3 Nominal Independent Variables. In many real-life situations one or more independent variables are nominal. Nominal variables are included in a regression model via indicator variables. An indicator variable (I) can assume one of two values, zero or one. Examples:
I = 1 if the temperature was below 50°; 0 if the temperature was 50° or more.
I = 1 if the first of two conditions is met; 0 if the second is met.
I = 1 if data were collected before 1980; 0 if data were collected after 1980.
I = 1 if a degree earned is in Finance; 0 if a degree earned is not in Finance.

Nominal Independent Variables; Example: Auction Car Price (II). Example 18.2 - revised (Xm18-02a). Recall: a car dealer wants to predict the auction price of a car. The dealer now believes that the odometer reading and the car color are variables that affect a car's price. Three color categories are considered: white, silver, and other colors. Note: color is a nominal variable.

Nominal Independent Variables; Example: Auction Car Price (II). Example 18.2 - revised (Xm18-02b). $I_1$ = 1 if the color is white, 0 if the color is not white. $I_2$ = 1 if the color is silver, 0 if the color is not silver. The category "Other colors" is defined by $I_1 = 0$, $I_2 = 0$.

How Many Indicator Variables? Note: To represent the situation of three possible colors we need only two indicator variables. Conclusion: To represent a nominal variable with m possible categories, we must create m-1 indicator variables.
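The m-1 rule can be sketched as a small encoding function. The function name and the sample list of colors are hypothetical; the ordering puts the white indicator first and the silver indicator second, with "other" as the omitted baseline, following the car example.

```python
def indicator_columns(values, categories):
    """Encode a nominal variable as indicator variables.

    `categories` lists the m - 1 non-baseline categories in column order;
    any value not listed falls in the baseline and gets all zeros.
    """
    return [[1 if value == category else 0 for category in categories]
            for value in values]


# Hypothetical sample of car colors; m = 3 categories, so 2 indicators.
colors = ["white", "silver", "other", "white"]
rows = indicator_columns(colors, categories=["white", "silver"])
print(rows)  # [[1, 0], [0, 1], [0, 0], [1, 0]]
```

A third column for "other" would always equal one minus the sum of the first two, so it would be perfectly collinear with the intercept, which is why only m-1 indicators are created.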

Nominal Independent Variables; Example: Auction Car Price. Solution: the proposed model is $y = b_0 + b_1(\text{Odometer}) + b_2 I_1 + b_3 I_2 + e$. [Data table with rows for white, silver, and other-color cars not reproduced.]

Example: Auction Car Price. The Regression Equation. From Excel (Xm18-02b) we get the regression equation PRICE = 16701 - .0555(Odometer) + 90.48($I_1$) + 295.48($I_2$).
The equation for an "other color" car: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(0) = 16701 - .0555(Odometer).
The equation for a white car: Price = 16701 - .0555(Odometer) + 90.48(1) + 295.48(0) = 16791.48 - .0555(Odometer).
The equation for a silver car: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(1) = 16996.48 - .0555(Odometer).
[Figure: price against odometer, one parallel line per color category.]

Example: Auction Car Price. The Regression Equation. From Excel we get the regression equation PRICE = 16701 - .0555(Odometer) + 90.48($I_1$) + 295.48($I_2$). For each additional mile the auction price decreases, on average, by 5.55 cents. A white car sells, on average, for $90.48 more than a car of the "Other color" category. A silver car sells, on average, for $295.48 more than a car of the "Other color" category.
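The fitted equation can be turned into a small prediction function. The coefficients are the ones quoted above; the function name and the 40,000-mile odometer reading are hypothetical choices for illustration.

```python
def predicted_price(odometer, color):
    """Predicted auction price from the fitted equation quoted in the lecture."""
    i1 = 1 if color == "white" else 0    # white indicator
    i2 = 1 if color == "silver" else 0   # silver indicator
    return 16701 - 0.0555 * odometer + 90.48 * i1 + 295.48 * i2


# Hypothetical odometer reading of 40,000 miles for each color category.
for color in ("other", "white", "silver"):
    print(color, round(predicted_price(40000, color), 2))
```

At any fixed odometer reading the three predictions differ only by the indicator coefficients, which is why the three lines are parallel.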

Example: Auction Car Price. The Regression Equation (Xm18-02b). There is insufficient evidence to infer that a white car and a car of "other color" sell for a different auction price. There is sufficient evidence to infer that a silver car sells for a higher price than a car of the "other color" category.

Nominal Independent Variables; Example: MBA Program Admission (MBA II). Recall: the Dean wanted to evaluate applications for the MBA program by predicting future performance of the applicants. The following three predictors were suggested: undergraduate GPA, GMAT score, and years of work experience. It is now believed that the type of undergraduate degree should be included in the model. Note: the undergraduate degree is nominal data.

Nominal Independent Variables; Example: MBA Program Admission (II). $I_1$ = 1 if B.A., 0 otherwise. $I_2$ = 1 if B.B.A., 0 otherwise. $I_3$ = 1 if B.Sc. or B.Eng., 0 otherwise. The category "Other group" is defined by $I_1 = 0$, $I_2 = 0$, $I_3 = 0$.

Nominal Independent Variables; Example: MBA Program Admission (II) MBA-II

20.4 Applications in Human Resources Management: Pay-Equity. Pay-equity can be handled in two different forms: equal pay for equal work, and equal pay for work of equal value. Regression analysis is extensively employed in cases of equal pay for equal work.

Human Resources Management: Pay-Equity. Solution: construct the following multiple regression model: $y = b_0 + b_1\,\text{Education} + b_2\,\text{Experience} + b_3\,\text{Gender} + e$. Note the nature of the variables: Education is interval; Experience is interval; Gender is nominal (Gender = 1 if male; 0 otherwise).

Human Resources Management: Pay-Equity. Solution, continued (Xm20-03). Analysis and interpretation: The model fits the data quite well. The model is very useful. Experience is a variable strongly related to salary. There is no evidence of sex discrimination.

Human Resources Management: Pay-Equity. Solution, continued (Xm20-03). Analysis and interpretation: studying the data further, we find that average experience is 12 years for women and 17 years for men, and that the average salary is $76,189 for female managers and $97,832 for male managers.