OLS REGRESSION VS. NEURAL NETWORKS VS. MARS: A COMPARISON
R. J. Lievano, E. Kyper, University of Minnesota Duluth

Research Questions
- Are new data mining regression techniques superior to classical regression?
- Can data analysis methods implemented naively (through default automated routines) consistently yield useful results?

Assessment of a 3 × 2 × 2 × 2 (3 × 2³) factorial experiment
- Regression method (3): OLS forward stepwise regression, feedforward neural networks, multivariate adaptive regression splines (MARS)
- Type of function (2): linear, nonlinear
- Noise size (2): small, large
- Sample size (2): small, large

FORWARD STEPWISE REGRESSION
Given a set of responses Y and predictors X such that Y = F(X) + ε, where ε is an error (noise) structure:
- Find a subset X_R of X which satisfies a set of conditions such as goodness-of-fit or simplicity.
- Fit a set of successive models of the form Y_i = Σ_j β_j X_ij + ε_i.
- Stop when a specified criterion has been achieved, e.g. maximum adjusted R², or no remaining significant predictors.
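A minimal sketch of this forward pass in Python (the function name and the adjusted-R² stopping rule shown here are illustrative choices, not taken from the paper): at each step it adds the single predictor that most improves adjusted R², and stops when no candidate helps.

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise(X, y):
    """Greedy forward selection maximizing adjusted R^2.

    X: (n, p) array of candidate predictors; y: (n,) response.
    Returns the list of selected column indices.
    """
    selected, remaining = [], list(range(X.shape[1]))
    best_adj_r2 = -np.inf
    while remaining:
        # Score each candidate by the adjusted R^2 of the enlarged model.
        scores = []
        for j in remaining:
            cols = selected + [j]
            model = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
            scores.append((model.rsquared_adj, j))
        adj_r2, j_best = max(scores)
        if adj_r2 <= best_adj_r2:   # stopping criterion: no improvement
            break
        best_adj_r2 = adj_r2
        selected.append(j_best)
        remaining.remove(j_best)
    return selected
```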

MULTIVARIATE ADAPTIVE REGRESSION SPLINES (MARS)
Given a set of responses Y and predictors X such that Y = F(X) + ε, where ε is an error (noise) structure:
- Find a set of basis functions W_j (spline transformations of the X_j) which describe intervals of varying relationships between X_j and Y.
- Fit these basis functions with a stepwise regression procedure to models of the form Y = β_0 + Σ_j β_j W_j(X) + ε, until a stopping criterion has been achieved.
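MARS builds its basis functions W_j from mirrored hinge (spline) transformations max(0, x − t) and max(0, t − x) around knots t. A bare-bones sketch of the forward pass, assuming candidate knots at the observed data values (the backward pruning pass is omitted, and all names are illustrative):

```python
import numpy as np

def hinge_pair(x, knot):
    """The two mirrored hinge basis functions MARS pairs at a knot t."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

def mars_forward(X, y, max_terms=10):
    """Greedy forward pass: repeatedly add the hinge pair (variable, knot)
    that most reduces the residual sum of squares."""
    n, p = X.shape
    B = [np.ones(n)]                        # start from the intercept basis
    while len(B) + 2 <= max_terms:
        best = None
        for j in range(p):
            for t in np.unique(X[:, j]):    # candidate knots at data values
                h1, h2 = hinge_pair(X[:, j], t)
                trial = np.column_stack(B + [h1, h2])
                beta = np.linalg.lstsq(trial, y, rcond=None)[0]
                resid = y - trial @ beta
                score = resid @ resid
                if best is None or score < best[0]:
                    best = (score, h1, h2)
        B.extend([best[1], best[2]])
    return np.column_stack(B)               # design matrix of basis functions
```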

NEURAL NETWORK COMPONENTS
[Figure: a single neuron with inputs x1 through x5 and bias input x0]
A neuron computes a weighted sum of its inputs, e.g. I = 0.8 + 0.3x1 + 0.7x2 - 0.2x3 + 0.4x4 - 0.5x5, then passes it through a sigmoidal activation (transfer) function σ(I), whose output is sent to the next layer.
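Using the weights from the slide, the neuron's computation is just the weighted sum I followed by the sigmoid σ(I) = 1/(1 + e^(−I)); a quick check in Python (the input values are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights from the slide: bias 0.8 plus five input weights.
w0, w = 0.8, np.array([0.3, 0.7, -0.2, 0.4, -0.5])
x = np.array([1.0, 0.0, 1.0, 1.0, 0.0])   # example inputs (illustrative)

I = w0 + w @ x    # weighted sum: I = 0.8 + 0.3*x1 + 0.7*x2 - 0.2*x3 + 0.4*x4 - 0.5*x5
y = sigmoid(I)    # activation passed to the next layer
print(f"I = {I:.2f}, sigma(I) = {y:.3f}")
```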

[Figure: feedforward network with input layer, hidden layer, and output layer; each hidden node's output feeds the output node]
Overall (many nodes): the resulting model is just a flexible nonlinear regression of the response on a set of predictor variables.
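Stacking many such neurons gives the full model: hidden units apply the sigmoid to different weighted sums of the inputs, and the output layer linearly recombines them, which is exactly the flexible nonlinear regression described above. A minimal forward pass (the weights here are random placeholders; in practice they would be learned, e.g. by back-propagation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, w2, b2):
    """One-hidden-layer regression network.

    X: (n, p) inputs; W1: (p, h) input-to-hidden weights; b1: (h,) biases;
    w2: (h,) hidden-to-output weights; b2: scalar output bias.
    """
    H = sigmoid(X @ W1 + b1)   # hidden layer: sigmoid of weighted sums
    return H @ w2 + b2         # output layer: linear combination of hidden units

# Illustrative shapes: 5 inputs, 3 hidden units, 1 output.
p, h = 5, 3
W1, b1 = rng.normal(size=(p, h)), rng.normal(size=h)
w2, b2 = rng.normal(size=h), rng.normal()
X = rng.normal(size=(4, p))
print(forward(X, W1, b1, w2, b2))   # one prediction per row of X
```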

Hypotheses
- H1: The three methods are equivalent in accuracy (goodness of fit).
- H2: The three methods are equivalent in their ability to select valid predictors.
  - H2a: The three methods are equivalent in the degree of underfitting.
  - H2b: The three methods are equivalent in the degree of overfitting.

A SLICE OF Y = α + Σ_j β_j X_j + ε (linear functional form modeled)
[Figure: surface slice of the linear response function]

A SLICE OF Y = α + Σ_j log_e(β_j X_j) + ε (nonlinear functional form modeled)
[Figure: surface slice of the nonlinear response function]
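Both simulated surfaces follow directly from the formulas on these two slides. A sketch of the data-generating step for the factorial experiment (the coefficient values, noise scales, sample sizes, and predictor range below are placeholders, since the paper's exact settings are not shown; predictors are kept positive so the log arguments are defined):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(n, beta, sigma, nonlinear=False, alpha=1.0):
    """Draw one sample from Y = alpha + sum_j beta_j X_j + eps (linear) or
    Y = alpha + sum_j log_e(beta_j X_j) + eps (nonlinear)."""
    X = rng.uniform(0.5, 5.0, size=(n, len(beta)))   # keep log() arguments positive
    if nonlinear:
        signal = alpha + np.log(X * beta).sum(axis=1)
    else:
        signal = alpha + X @ beta
    y = signal + rng.normal(scale=sigma, size=n)     # small vs. large noise via sigma
    return X, y

# One cell of the factorial design: nonlinear form, large noise, small sample.
X, y = simulate(n=50, beta=np.array([0.5, 1.0, 2.0]), sigma=2.0, nonlinear=True)
```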

ANOVA RESULTS: METHOD MEANS AND 0.95 INTERVALS
[Figure: plots of method means with 0.95 confidence intervals]
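The method comparison itself is an ANOVA on the experiment's outcome measures. A hedged sketch of how the method means and 0.95 intervals could be computed with statsmodels (the `results` data frame, with one row per simulated fit and columns `method` and `pmse`, is hypothetical):

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical results table: one row per fitted model in the experiment,
# e.g. results = pd.read_csv("simulation_results.csv")

def compare_methods(results):
    # One-way ANOVA of prediction MSE on method.
    fit = smf.ols("pmse ~ C(method)", data=results).fit()
    print(anova_lm(fit))
    # Method means with approximate 0.95 confidence intervals.
    for method, grp in results.groupby("method"):
        m, se = grp["pmse"].mean(), grp["pmse"].sem()
        print(f"{method}: {m:.3f} +/- {1.96 * se:.3f}")
```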

Results/Conclusions
- H1 can be rejected: the three methods are not equivalent in accuracy.
- H2a cannot be rejected; however, underfitting is more prevalent in nonlinear fits with large noise for smaller samples.
- H2b can be rejected: the three methods are not equivalent in the degree of overfitting.

Results (cont.)
- Best linear PMSE: OLS regression
- Most linear overspecification: MARS
- Best nonlinear PMSE: neural networks (NNW)
- Most nonlinear overspecification: MARS
Further study is needed to answer the research questions clearly.

Further Research Conducted
- Kept the same three methods, with only large samples.
- Kept function as a factor but changed from two to three functions (1 linear, 2 nonlinear).
- Replaced noise with contamination (contaminated and uncontaminated data).
- Found that OLS regression performed best in all linear cases.
- Unlike the previous findings, MARS now performed best in all nonlinear cases, and underspecification is now significant.