Regression Dr. Jieh-Shan George YEH




Regression
Regression builds a function of independent variables (also known as predictors) to predict a dependent variable (also called the response). For example, banks assess the risk of home-loan applicants based on their age, income, expenses, occupation, number of dependents, total credit limit, etc.

Linear Regression
Linear regression predicts the response with a linear function of the predictors:
$y = c_0 + c_1 x_1 + c_2 x_2 + \dots + c_k x_k$,
where $x_1, x_2, \dots, x_k$ are the predictors and $y$ is the response to predict.
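As a minimal sketch of the formula above, the following fits a two-predictor linear model with lm() and checks that a prediction computed by hand from the coefficients matches predict(). The data are synthetic and the names x1, x2, fit_demo and new_obs are made up for illustration; they are not part of the original slides.

```r
# Synthetic data (hypothetical, for illustration only)
set.seed(1)
x1 <- runif(50)
x2 <- runif(50)
y  <- 3 + 2 * x1 - 1.5 * x2 + rnorm(50, sd = 0.1)

# Fit y = c0 + c1*x1 + c2*x2 by least squares
fit_demo <- lm(y ~ x1 + x2)
coefs <- coef(fit_demo)

# Predict for one new observation, first by hand, then with predict()
new_obs <- data.frame(x1 = 0.5, x2 = 0.2)
manual <- coefs[["(Intercept)"]] +
  coefs[["x1"]] * new_obs$x1 +
  coefs[["x2"]] * new_obs$x2
auto <- unname(predict(fit_demo, newdata = new_obs))

print(c(manual = manual, predict = auto))
```

Both values should agree, since predict() evaluates the same linear function of the fitted coefficients.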

year <- rep(2008:2010, each=4)
quarter <- rep(1:4, 3)
# Two CPI values (2009Q1 and 2010Q1) are missing from this transcript; NA marks the gaps
cpi <- c(162.2, 164.6, 166.5, 166.0, NA, 167.0, 168.6, 169.5, NA, 172.1, 173.3, 174.0)
plot(cpi, xaxt="n", ylab="CPI", xlab="")
# draw x-axis with labels such as 2008Q1
axis(1, labels=paste(year,quarter,sep="Q"), at=1:12, las=3)

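library(scatterplot3d)
# Assumption: `fit` is not defined in this transcript; a linear model of cpi on year and quarter is assumed
fit <- lm(cpi ~ year + quarter)
s3d <- scatterplot3d(year, quarter, cpi, highlight.3d=TRUE, type="h", lab=c(2,3))
# add the fitted regression plane to the 3D scatter plot
s3d$plane3d(fit)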

# predict CPI for the four quarters of 2011 with the fitted model `fit`
data2011 <- data.frame(year=2011, quarter=1:4)
cpi2011 <- predict(fit, newdata=data2011)
# style 1 for observed values, style 2 for predicted values
style <- c(rep(1,12), rep(2,4))
plot(c(cpi, cpi2011), xaxt="n", ylab="CPI", xlab="", pch=style, col=style)
axis(1, at=1:16, las=3,
     labels=c(paste(year,quarter,sep="Q"), "2011Q1", "2011Q2", "2011Q3", "2011Q4"))

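Logistic Regression
Logistic regression predicts the probability that an event occurs by fitting data to a logistic curve. In its standard form, a logistic regression model is built as the following equation:
$\mathrm{logit}(y) = c_0 + c_1 x_1 + c_2 x_2 + \dots + c_k x_k$,
where $x_1, x_2, \dots, x_k$ are the predictors, $y$ is the probability of the event, and $\mathrm{logit}(y) = \ln\frac{y}{1-y}$.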
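A minimal sketch of a logistic regression in R, assuming the built-in mtcars data set as a stand-in (it is not used in the original slides): glm() with family = binomial models the probability that a car has a manual transmission (am = 1) from its weight (wt). The names logit_fit, eta, p_hat and p_manual are illustrative.

```r
# Built-in example data; chosen here only for illustration
data(mtcars)

# Binary response am (0/1) modeled on weight with a logit link
logit_fit <- glm(am ~ wt, family = binomial, data = mtcars)

# Linear predictor (log-odds) and fitted probabilities
eta   <- predict(logit_fit, type = "link")
p_hat <- predict(logit_fit, type = "response")

# The probabilities are the inverse logit of the linear predictor
p_manual <- 1 / (1 + exp(-eta))

print(coef(logit_fit))
print(head(round(p_hat, 3)))
```

Requesting type = "response" returns probabilities between 0 and 1, while type = "link" returns values on the log-odds scale.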

Generalized Linear Regression
The generalized linear model (GLM) generalizes linear regression by allowing the linear model to be related to the response variable via a link function, and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.

Generalized Linear Regression
The GLM unifies various other statistical models, including linear regression, logistic regression and Poisson regression. Function glm() is used to fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution.
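To illustrate the unification described above, here is a small sketch, again assuming the built-in mtcars data purely as example input. A Gaussian GLM with an identity link should reproduce the lm() coefficients, and changing only the family argument gives a Poisson regression for a count response. The model names are hypothetical.

```r
data(mtcars)

# Ordinary linear regression
lm_fit <- lm(mpg ~ wt + hp, data = mtcars)

# The same model expressed as a GLM: Gaussian errors, identity link
glm_fit <- glm(mpg ~ wt + hp, family = gaussian(link = "identity"),
               data = mtcars)

# Poisson regression with a log link for a count response
# (carb, the number of carburetors, is used only as a convenient count)
pois_fit <- glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)
pois_pred <- predict(pois_fit, type = "response")

print(rbind(lm = coef(lm_fit), glm = coef(glm_fit)))
print(coef(pois_fit))
```

The family argument carries both the error distribution and the link function, which is why one function can cover all three model types.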

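data("bodyfat", package="TH.data")
myFormula <- DEXfat ~ age + waistcirc + hipcirc + elbowbreadth + kneebreadth
# Gaussian errors with a log link: log(E[DEXfat]) is linear in the predictors
bodyfat.glm <- glm(myFormula, family = gaussian("log"), data = bodyfat)
summary(bodyfat.glm)
# type="response" returns predictions on the original DEXfat scale
pred <- predict(bodyfat.glm, type="response")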

plot(bodyfat$DEXfat, pred, xlab="Observed Values", ylab="Predicted Values")
abline(a=0, b=1)

Non-linear Regression
Non-linear regression fits a curve, rather than a straight line, through the data.
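One way to fit such a curve in R is nls() from the stats package, which estimates the parameters of a non-linear model by least squares. The sketch below assumes an exponential model y = a * exp(b * x) and synthetic data; the model form, the true parameter values and the starting values are all assumptions made for illustration.

```r
# Synthetic data from y = 2 * exp(0.15 * x) plus noise (illustration only)
set.seed(42)
x <- 1:20
y <- 2 * exp(0.15 * x) + rnorm(20, sd = 0.2)

# Non-linear least squares; nls() needs starting values for the parameters
nls_fit <- nls(y ~ a * exp(b * x), start = list(a = 1.5, b = 0.12))
print(coef(nls_fit))

# Plot the data and overlay the fitted curve
plot(x, y, xlab = "x", ylab = "y")
lines(x, predict(nls_fit), lty = 2)
```

The estimates should land close to the values used to generate the data. Poor starting values can prevent nls() from converging, so they are worth choosing with some care.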