1/18/2019 ST3131, Lecture 1.


Chapter 1. Introduction
Questions:
Q1. What is Regression Analysis?
Q2. How to Do Regression Analysis?
Q3. Where to Use Regression Analysis?

Q1: What is Regression Analysis?
A useful tool for finding Functional Relationships among Variables based on data, and for using these relationships in further analysis of the data.
Functional Relationships: Mathematical Formulas or Equations that connect a response variable and several predictor variables.
Variables include:
Quantitative (Numerical) Variables (e.g. ). 2 Types: Continuous (e.g. ) and Discrete (e.g. ). A Binary Variable takes only 2 values, say 0 and 1.
Qualitative (Non-numerical) Variables, e.g., Neighborhood type ( ), Blood type ( ).

Exercise
Page 18, Problem 1.1 (b), (d), (f) and (h): Quantitative or Qualitative Variables? If the latter, state the possible categories:
(b). # of Children in a family
(d). Race
(f). Fuel Consumption
(h). Political party preference

A General Regression Model
$Y = f(X_1, X_2, \ldots, X_p) + \varepsilon$
Response Variable: $Y$
Predictor Variables: $X_1, X_2, \ldots, X_p$
Measurement Error: $\varepsilon$
$f$: unknown regression function

Parametric Regression Models
Parametric Regression Models: f is a ( ) function.
Type 1: Linear Regression Model: f is a ( ) function:
$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \varepsilon$
$\beta_0, \beta_1, \ldots, \beta_p$: Regression Parameters/Coefficients
Type 2: Nonlinear Regression Model: f is a ( ) function in ( ):
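As an illustration of the distinction, the two models below are examples chosen for this note, not taken from the slide. The first is linear in the parameters even though it is quadratic in X; the second is nonlinear because the parameter $\beta_1$ sits inside the exponential.

```latex
% Linear regression model: linear in \beta_0, \beta_1, \beta_2
Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon

% Nonlinear regression model: \beta_1 enters nonlinearly
Y = \beta_0 + e^{\beta_1 X} + \varepsilon
```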

Exercise
Linear Regression / Nonlinear Regression Model?
(a).
(b).

Types of Parametric Regression Models
Regression Type                      Conditions/Definitions
Univariate (Multivariate)            One (two or more Quantitative) Response variables
Simple (Multiple)                    Only one (two or more) Predictor variables
Linear/Nonlinear                     Response is Linear/Nonlinear in Parameters
Logistic                             Response variable is Qualitative
Analysis of Variance (Covariance)    All (Some) Predictors are Qualitative variables

An Example
A company markets and repairs small computers. How fast an electronic component (Computer Unit, the predictor variable) can be repaired (Time, the response) is very important to the efficiency of the company. The Variables in this example are Time and Units.
Questions of Interest
What is the relationship between the length of a service call (Time) and the number of electronic components (Computer Units)?
In general, how long will it take to repair k computer units?

Computer Repair Data (Table 2.5, Page 27)
Units  Min's   Units  Min's
1      23      6      97
2      29      7      109
3      49      8      119
4      64      9      149
4      74      9      145
5      87      10     154
6      96      10     166
Pre-Data Analysis
To see how Time is related to computer Units, we can draw a plot of Time against computer Units. From the plot, we can see the simple relationship between Time and Units. This will suggest what kind of model is suitable for the data.

Pre-Data Analysis
Scatter Plot (Time vs Units)
Some Simple Conclusions
Time is Linearly related to computer Units.
Time Increases with the Number of Units.
The Linearity is NOT exact; Measurement Errors exist.
Thus a Linear Regression Model can be used for the relation between Time and computer Units.
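Without the plot, the second conclusion can be checked numerically. The sketch below averages Time for each value of Units and tests whether the averages increase. It is only an illustration: the Units values for the times 74, 96, 145 and 166 are missing in the transcript and are assumed here to be 4, 6, 9 and 10, the values that reproduce the fitted equation quoted later in the lecture.

```python
from collections import defaultdict

# Computer repair data. Units for the times 74, 96, 145, 166 are an
# assumption of this sketch (they are missing in the transcript).
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

# Group the repair times by the number of units repaired.
times_by_units = defaultdict(list)
for u, t in zip(units, minutes):
    times_by_units[u].append(t)

# Mean repair time for each number of units, in increasing order of units.
mean_time = {u: sum(ts) / len(ts) for u, ts in sorted(times_by_units.items())}
for u, m in mean_time.items():
    print(f"{u:2d} units: mean time {m:6.1f} minutes")

# True if the mean time strictly increases with the number of units.
means = list(mean_time.values())
is_increasing = all(a < b for a, b in zip(means, means[1:]))
print("Mean time increases with Units:", is_increasing)
```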

Simple Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, 2, \ldots, n$
where
$\beta_0$: called the Linear Regression Intercept
$\beta_1$: called the Linear Regression Slope
$\beta_0, \beta_1$: called Regression Parameters or Coefficients
$\varepsilon_i$ = i-th Measurement Error
n = # of observations
X = Units, called the Independent, Explanatory or Predictor variable
Y = Time, called the Dependent or Response variable
$(X_i, Y_i)$: the i-th observation

Least Squares Method
The Least Squares Method is often used to fit the above Linear Regression Model:
Find $(\beta_0, \beta_1)$ to Minimize $S(\beta_0, \beta_1) = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2$
Least: Minimization
Squares: Sum of Squared residuals
The solution is called the Least Squares Estimator of $(\beta_0, \beta_1)$, denoted as $(\hat{\beta}_0, \hat{\beta}_1)$.
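The minimization has a closed-form solution. The derivation below is the standard calculus argument, added here as a worked step rather than taken from the slides; $\bar{X}$ and $\bar{Y}$ denote the sample means of the $X_i$ and $Y_i$.

```latex
S(\beta_0, \beta_1) = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2

% Set both partial derivatives to zero (the normal equations):
\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i) = 0
\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} X_i (Y_i - \beta_0 - \beta_1 X_i) = 0

% Solving the two equations gives the least squares estimates:
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}
```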

Simple Linear Regression Fit
The Fitted Equation: $\hat{Y} = 4.162 + 15.509 X$
Some Conclusions
The Least Squares fit gives the LS-Estimates $\hat{\beta}_0 = 4.162$ and $\hat{\beta}_1 = 15.509$.
The plot shows that the simple linear regression model is good for the data.
The fitted line increases with increasing Units.
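The fit can be reproduced with a few lines of plain Python. This is a minimal sketch that uses the standard closed-form estimates (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X). The function name `least_squares` is hypothetical, and the Units values for the times 74, 96, 145 and 166, which are missing in the transcript, are assumed to be 4, 6, 9 and 10.

```python
# Computer repair data (Table 2.5). Units for the times 74, 96, 145, 166
# are an assumption of this sketch (they are missing in the transcript).
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]


def least_squares(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxy and Sxx: centered cross-product and centered sum of squares.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0_hat, b1_hat = least_squares(units, minutes)
print(f"Minutes = {b0_hat:.3f} + {b1_hat:.3f} * Units")
# prints: Minutes = 4.162 + 15.509 * Units
```

With the assumed Units values the estimates agree with the fitted equation in the lecture to three decimal places.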

The Resulting Model & Prediction
The resulting model is $\hat{Y} = 4.162 + 15.509 X$, where $\hat{Y}$, read as "Y hat", stands for the estimate at X. That is, "Minutes = 4.162 + 15.509 * Units".
Prediction: X=1, Y=4.162+15.509*1=19.67; X=5, Y=4.162+15.509*5=81.71; etc.
Interpretation: it takes about 19.67 minutes to repair 1 computer unit and about 81.71 minutes to repair 5 computer units.
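The prediction step is a direct evaluation of the fitted equation. A minimal sketch, with the coefficients copied from the lecture and a hypothetical helper name `predict_minutes`:

```python
B0_HAT = 4.162   # fitted intercept quoted in the lecture
B1_HAT = 15.509  # fitted slope quoted in the lecture


def predict_minutes(units):
    """Estimated repair time in minutes for a given number of units."""
    return B0_HAT + B1_HAT * units


for k in (1, 5):
    print(f"{k} unit(s): about {predict_minutes(k):.2f} minutes")
# prints:
# 1 unit(s): about 19.67 minutes
# 5 unit(s): about 81.71 minutes
```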

Q2: How to Do Regression Analysis?
In summary, we can list the steps in Regression Analysis as:
Step 1. State the Problem of Interest
Step 2. Select Potentially Relevant Variables
Step 3. Collect Relevant Data
Step 4. Specify a Model
Step 5. Choose a Fitting Method
Step 6. Fit the Chosen Model
Step 7. Check the Resulting Model
Step 8. Apply the Resulting Model for Prediction

Step 1: State the Problem of Interest
A Key step in Regression Analysis. Different statements of the problem lead to different choices of response and predictor variables.
Example: Suppose we wish to determine whether or not an employer is discriminating against a given group of employees, say women. Data on Salary, Qualifications, and Sex are available.
Example 1: For the question "On average, are women paid less than equally qualified men?", we choose Salary as Response, and Qualification and Sex as Predictors.
Example 2: For the question "On average, are women more qualified than equally paid men?", we choose Qualification as Response, and Salary and Sex as Predictors.

Exercise
Page 19, Problem 1.3 (a), (b), (c). Which variable can be used as Response, and which can be used as Predictors? Why?
(a). Number of cylinders, gasoline consumption of cars
(b). SAT scores, grade point average, college admission
(c). Supply and demand of certain goods

Step 2: Select Potentially Relevant Variables
The choice depends on the Problem of Interest. Usually, Y denotes the Response variable, and X1, X2, …, Xp denote the Predictor variables.
For the question "Is the price of a single house in a given geographical area high?", the response is the price of a single house (Y); predictor variables may include: area of the lot (X1), area of the house (X2), age of the house (X3), number of bathrooms (X4), type of neighborhood (X5), style of the house (X6), amount of real estate taxes (X7), etc.

Step 3: Data Collection
Data are collected for the chosen response and predictor variables. The collected data are usually recorded in the following form:
Observation No.   Response Y   Predictors X1, X2, …, Xp
1                 Y1           x11, x12, …, x1p
2                 Y2           x21, x22, …, x2p
…                 …            …
n                 Yn           xn1, xn2, …, xnp
Each column lists the observations for one variable. Each row lists the observations of the response and all predictor variables for one case.

Step 4: Model Specification
Usually specified by experts in the area of study.
May be specified based on an Initial Analysis of the data.
At our level, the variables are given, and the model is often given, mostly Linear and Parametric.

Step 5: Method of Fitting
In our case, we mainly focus on the Least Squares Method. Other methods, such as Weighted Least Squares, the Maximum Likelihood method, Ridge Regression, and the Principal Components method, may be used depending on the viewpoint and the context.

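Step 6: Fitted Model and Prediction
Once the parameters are estimated, we obtain the fitted regression equation or formula; e.g., the Estimated Linear Regression Equation is:
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \cdots + \hat{\beta}_p X_p$
$\hat{Y}$: Fitted value when (X1,..,Xp) is a data point. The n Fitted Values are:
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_p x_{ip}, \quad i = 1, 2, \ldots, n$
$\hat{Y}$: Predicted Value when (X1,..,Xp) is not a data point.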
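The difference between fitted and predicted values can be sketched for a model with two predictors. Everything in this example is an assumption made for illustration: the data are hypothetical and noise-free (generated from Y = 1 + 2*X1 + 3*X2), the helper names are invented, and the coefficients are obtained by solving the normal equations with Gaussian elimination, which is one standard way to compute least squares estimates but is not described in the lecture.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(rows, y):
    """Least squares coefficients [b0, b1, ..., bp] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def evaluate(coef, point):
    """Value of the estimated regression equation at (X1, ..., Xp)."""
    return coef[0] + sum(b * x for b, x in zip(coef[1:], point))


# Hypothetical, noise-free data generated from Y = 1 + 2*X1 + 3*X2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9, 8, 19, 18, 26]

coef = fit_linear(rows, y)
fitted_values = [evaluate(coef, row) for row in rows]  # at the n data points
predicted = evaluate(coef, (6, 2))                     # at a new point

print("coefficients:", [round(c, 6) for c in coef])
print("fitted values:", [round(v, 6) for v in fitted_values])
print("predicted value at (6, 2):", round(predicted, 6))
```

Because the hypothetical data contain no error term, the fitted values coincide with the observed responses; with real data they would differ by the residuals.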

Step 7: Model Checking
To check whether the assumptions of the regression model are valid. This is the Regression Diagnostics problem. The details will be given in Chapter 4.
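A first, very rough check is to look at the residuals, the differences between observed and fitted values. The sketch below does this for the computer repair example. It is not the diagnostics procedure of Chapter 4, only an illustration of what is being examined. The Units values for the times 74, 96, 145 and 166 are assumed (4, 6, 9, 10) because they are missing in the transcript, and the variable names are hypothetical.

```python
# Computer repair data. Units for the times 74, 96, 145, 166 are an
# assumption of this sketch (they are missing in the transcript).
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

# Least squares estimates from the closed-form formulas.
n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))
         / sum((x - x_bar) ** 2 for x in units))
intercept = y_bar - slope * x_bar

# Residual = observed response minus fitted value.
residuals = [y - (intercept + slope * x) for x, y in zip(units, minutes)]

# For a least squares fit with an intercept, the residuals sum to zero
# and are orthogonal to the predictor (up to rounding error).
residual_sum = sum(residuals)
cross_sum = sum(x * e for x, e in zip(units, residuals))
print(f"sum of residuals:         {residual_sum:.2e}")
print(f"sum of Units * residuals: {cross_sum:.2e}")

for x, y, e in zip(units, minutes, residuals):
    print(f"Units={x:2d}  Minutes={y:3d}  residual={e:7.2f}")
```

Residuals that show a clear pattern against Units, or a few residuals much larger than the rest, would suggest that the model assumptions deserve a closer look.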

Q3: Where to Use Regression Analysis?
Regression Analysis is one of the most widely used statistical tools. It has extensive applications in many subject areas, including:
Agricultural Sciences
Industrial and Labor Relations
History
Government
Environmental Sciences
Military, Economics, Finance, etc.
For details, read Section 1.3, Pages 3-7.

Reading Assignments
Read Chapter 1, Sections 1-4.
Read Chapter 2, Sections 1-4.
Think about the following questions:
a). What is the SLR model?
b). How to find the LS estimates of the parameters?
c). How to compute the LS estimates manually?