Data mining and statistical learning - labs 2-4


Lab 2, assignment 1: OLS regression of electricity consumption on temperature at 53 sites

SAS code for ridge regression

   /* Ridge regression of daily electricity consumption on temperature
      at three sites, for ridge parameters 0, 1, ..., 10 */
   proc reg data=mining.dailytemperature outest=dtempbeta ridge=0 to 10 by 1;
      model daily_consumption = stockholm g_teborg malm_ / p;
      output out=olsoutput pred=olspred;
   run;

   /* Print the OLS and ridge coefficient estimates */
   proc print data=dtempbeta;
   run;

Estimated regression parameters in ridge regression
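A ridge trace like the one on this slide can be reproduced from the OUTEST= data set created above; a minimal sketch, assuming the data set and variable names from the SAS code slide (when RIDGE= is specified, PROC REG adds the _TYPE_ and _RIDGE_ columns to the OUTEST= data set):

   /* Plot the estimated coefficients against the ridge parameter */
   proc sgplot data=dtempbeta (where=(_type_='RIDGE'));
      series x=_ridge_ y=stockholm;
      series x=_ridge_ y=g_teborg;
      series x=_ridge_ y=malm_;
      xaxis label='Ridge parameter';
      yaxis label='Estimated regression coefficient';
   run;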

Predicted vs observed values in OLS regression and ridge regression - trade-off between variance and bias

Fat content vs absorbance in different channels (wavelengths)

OLS regression fat vs channel10, channel30, channel50, channel70, channel90
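A minimal sketch of the corresponding PROC REG call; the data set name mining.tecator is an assumption, and the response and channel variables are taken from the slide title:

   /* OLS regression of fat content on five selected absorbance channels */
   proc reg data=mining.tecator;
      model fat = channel10 channel30 channel50 channel70 channel90;
   run;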

OLS regression fat vs channel1 – channel100

OLS regression fat vs channel1 – channel100

OLS regression with strongly correlated predictors
If the XᵀX matrix does not have full rank (some X-variables are linearly dependent), the least squares solution is not unique. If the X-variables are strongly correlated, then: (i) the estimated regression coefficients will be highly uncertain; (ii) the predictions may still be acceptable.
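The degree of collinearity can be examined directly in PROC REG with the VIF and COLLIN options on the MODEL statement; a minimal sketch, reusing the temperature data set from the ridge regression slide:

   /* Variance inflation factors and collinearity diagnostics */
   proc reg data=mining.dailytemperature;
      model daily_consumption = stockholm g_teborg malm_ / vif collin;
   run;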

Principal Component Analysis of lake survey data
Some variables vary much more than others. How does this influence principal components derived from the covariance and correlation matrices, respectively?
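A minimal sketch of both analyses in PROC PRINCOMP; the data set name mining.lakesurvey is an assumption. By default the procedure extracts components from the correlation matrix, and the COV option switches to the covariance matrix:

   /* PCA on the correlation matrix (default): all variables are standardized */
   proc princomp data=mining.lakesurvey out=scores_corr;
   run;

   /* PCA on the covariance matrix: high-variance variables dominate the leading components */
   proc princomp data=mining.lakesurvey cov out=scores_cov;
   run;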

Principal Component Analysis of lake survey data - score plot derived from the correlation matrix

Principal Component Analysis of lake survey data - eigenvectors derived from the correlation matrix

Principal Component Analysis of lake survey data with outliers removed - score plot derived from the correlation matrix

Principal Component Analysis of lake survey data with outliers removed - eigenvectors derived from the correlation matrix

Principal Component Analysis of lake survey data with outliers removed - MINITAB score plot derived from the correlation matrix

Principal Component Analysis of lake survey data with outliers removed - MINITAB loading plot derived from the correlation matrix

Regression of an indicator matrix
Find a linear function which is (on average) one for objects in class 1 and otherwise (on average) zero. Assign a new object to class 1 if the fitted value of this function is larger than the fitted values of the corresponding functions for the other classes.
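A minimal sketch of this approach in SAS, assuming a training data set train with predictors x1 and x2 and a group label group taking the values 1 and 2 (all names are hypothetical): one indicator variable is created per class, both indicators are regressed on the predictors in a single PROC REG call, and each object is assigned to the class with the largest fitted value.

   /* Build the indicator matrix from the group label */
   data indicators;
      set train;
      y1 = (group = 1);
      y2 = (group = 2);
   run;

   /* Regress both indicators on the predictors and store the fitted values */
   proc reg data=indicators;
      model y1 y2 = x1 x2;
      output out=fitted p=p1 p2;
   run;

   /* Assign each object to the class with the largest fitted value */
   data classified;
      set fitted;
      predicted_class = 1 + (p2 > p1);
   run;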

Discriminant analysis - decision border

3D-plot of an indicator matrix for class 1

3D-plot of an indicator matrix for class 2

Regression of an indicator matrix - discriminating function
Estimate discriminant functions for each class, and then classify a new object to the class with the largest value for its discriminant function.

Linear discriminant analysis (LDA)
LDA is an optimal classification method when the data arise from Gaussian distributions with different means and a common covariance matrix.
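A minimal sketch of LDA in PROC DISCRIM, reusing the hypothetical train data set, predictors x1 and x2, and group label group from the indicator regression sketch above; METHOD=NORMAL with POOL=YES gives the linear rule based on a pooled covariance matrix:

   /* Linear discriminant analysis with a pooled covariance matrix */
   proc discrim data=train method=normal pool=yes crossvalidate;
      class group;
      var x1 x2;
   run;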