CS548 Fall 2018 Model and Regression Trees

Presentation transcript:

CS548 Fall 2018 Model and Regression Trees Showcase by Geri Dimas and Han Liu Showcasing work by A. Troncoso, S. Salcedo-Sanz, C. Casanova-Mateo, J.C. Riquelme, L. Prieto on "Local models-based regression trees for very short-term wind speed prediction"

References
[1] A. Troncoso, S. Salcedo-Sanz, C. Casanova-Mateo, J. C. Riquelme, and L. Prieto, "Local models-based regression trees for very short-term wind speed prediction," Renewable Energy, vol. 81, pp. 589-598, 2015.
[2] P.-N. Tan, M. Steinbach, A. Karpatne, and V. Kumar, Introduction to Data Mining. New York, NY: Pearson, 2019.
[3] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, Data Mining: Practical Machine Learning Tools and Techniques. Amsterdam: Morgan Kaufmann, 2017.
[4] M. Robnik-Sikonja and P. Savicky, Package 'CORElearn', 2018.

The Problem: There is a need for methods that produce quick, accurate forecasts of short-term wind speeds at wind farms.

The Importance: The ability to predict wind speed matters for three reasons:
- In most countries, every installed wind farm is required to report hourly energy predictions.
- It can help determine ideal locations for new wind farm placements.
- It helps ensure the stability of the whole electric system in cases of large penetration of wind energy.

Image: https://www.iconfinder.com/icons/426577/ecol_ecology_wind_wind_energy_windmill_icon

The Method: Local models-based regression trees

This approach is used to create the regression trees; for each tower, it then chooses the model tree with the lowest errors.

Image: http://freescrapbooks.blogspot.com/2011/05/clip-art-leaf-tree.html

The Models in the Leaves:

Model 1 (M1): global; constant prediction (mean of the leaf's target values)
Model 2 (M2): global; constant prediction (median)
Model 3 (M3): global; linear model minimizing the mean squared error
Model 4 (M4): global; linear model minimizing the description length principle
Model 5 (M5): global; linear model minimizing MSE with a greedy attribute search
Model 6 (M6): local; k-nearest neighbors + mean
Model 7 (M7): local; k-nearest neighbors + weighted mean
Model 8 (M8): local; minimize a locally weighted mean squared error

Speaker notes: Define what a local model is versus a global one (explained further when we get to Models 6 and 7). Define what a linear model is.

The Data:

Hourly wind speeds collected at 8 wind towers in Guadalajara, Spain, from November 2008 to November 2010.
239 discontinuous time series for each tower:
- 70% for training (175 time series, 3061 records)
- 30% for testing (64 time series, 1091 records)
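The 70/30 split described above operates on whole time series, not individual records. A minimal sketch, using synthetic stand-in data (the real series lengths and values are not reproduced here) and the paper's counts of 175 training and 64 testing series:

```python
import random

random.seed(0)
# Hypothetical stand-in for the 239 discontinuous hourly wind-speed series:
# one list of readings per series (real lengths vary).
series = [[random.gauss(7.0, 2.0) for _ in range(13)] for _ in range(239)]

# Splitting at the series level keeps all hours of one series on one side,
# avoiding leakage between training and testing.
random.shuffle(series)
train, test = series[:175], series[175:]

print(len(train), len(test))  # 175 64
```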

The Data: Example of wind speed distribution for tower 1

Speaker notes: Data is collected hourly from each tower, so there are 8 time series available, one per tower, from November 2008 to November 2010. The graph averages the 8 towers together to produce the wind speed shown; the snippet shows hourly averages over 168 hours in October 2010.

The Process: Basic Regression Tree

1. Calculate the heterogeneity of the class/target attribute.
2. Calculate the homogeneity for each available attribute and select the one yielding the largest reduction in heterogeneity relative to the class; this attribute becomes the root/internal node.
3. Divide the data on the values of the selected attribute.
4. Repeat steps 2-3 on each partition until the termination criteria are met.

Speaker notes (Geri): The difference between the paper's approach and the basic regression tree process is that a different TYPE of regression model is applied in the leaves; M6 and M7 use the k-nearest-neighbor idea. The example proceeds breadth-first.
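Step 2 above, the core of the split selection, can be sketched with variance as the heterogeneity measure. This is a minimal illustration for a single numeric attribute, not the paper's implementation; the data points are made up:

```python
def variance(ys):
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def best_split(xs, ys):
    """Try every threshold on one numeric attribute and return the
    (threshold, gain) pair with the largest reduction in target variance."""
    base = variance(ys)
    best = (None, 0.0)
    # Excluding the largest value guarantees both sides are non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        weighted = (len(left) * variance(left) + len(right) * variance(right)) / len(ys)
        gain = base - weighted
        if gain > best[1]:
            best = (t, gain)
    return best

xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 5, 20, 21, 19]
print(best_split(xs, ys))  # splits at x <= 3, separating the two clusters
```

Repeating this over all attributes and recursing on each partition yields the basic regression tree.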

The Process: The Global Model

Figure legend: the grey points are the training data; the square on the x-axis is the x value we want to predict y for; the marked y value is the prediction produced by the global model.

Speaker notes: First consider what a global model is in a very simple case. Suppose we have a 1-D set of data and want to run regression on it. A global model considers the ENTIRE set of points when creating a prediction: the fitted line runs roughly through the data, and, assuming it accurately depicts the points, the model takes all of them into consideration when forecasting the y prediction for a given x. A local model, in contrast, only takes into consideration the points near x.
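The global model described above can be sketched as ordinary least squares over all training points. This is an illustration of the idea on made-up 1-D data, not the paper's model:

```python
def fit_global_line(xs, ys):
    """Fit y = intercept + slope * x by least squares over the
    ENTIRE training set -- every point influences the prediction."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]          # lies exactly on y = 2x
predict = fit_global_line(xs, ys)
print(predict(6))              # 12.0
```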

The Process: The Local Model: k-Nearest-Neighbor (k = 1)

Figure legend: the grey points are the training data; the square is the x value we want to predict y for; the highlighted point is its nearest neighbor in the x space, where y = 4; the marked y value is the prediction produced by the local model.

Speaker notes: The paper uses two local model types: k-nearest neighbors and weighted k-nearest neighbors (the difference between the weighted and unweighted versions is explained here). If k = 1, the nearest neighbor has y = 4, so we predict y = 4.

The Process: The Local Model: k-Nearest-Neighbor (k = 3)

Figure legend: the grey points are the training data; the square is the x value we want to predict y for; the highlighted points are its k nearest neighbors in the x space, with y = 4, y = 5, and y = 9.

Speaker notes: If k = 3, the prediction is the average of the k nearest neighbors: (4 + 5 + 9) / 3 = 6, so we predict y = 6.
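The two slides above can be reproduced in a few lines of unweighted k-NN regression. The (x, y) points are hypothetical, chosen so that the three neighbors nearest the query carry y = 4, 5, and 9, matching the example:

```python
def knn_predict(train, x_query, k):
    """Local model: average the targets of the k training points
    nearest to x_query (plain, unweighted k-NN regression)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x_query))[:k]
    return sum(y for _, y in nearest) / k

# Made-up 1-D training data; the points near x = 5 have y = 4, 5, 9.
train = [(1, 2), (2, 7), (4.8, 4), (5.3, 5), (5.6, 9), (9, 1)]

print(knn_predict(train, 5, 1))  # 4.0  (k = 1: nearest point has y = 4)
print(knn_predict(train, 5, 3))  # 6.0  (k = 3: (4 + 5 + 9) / 3)
```

The weighted variant (M7) would replace the plain mean with a distance-weighted mean, giving closer neighbors more influence.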

The Process: Using the Nearest Neighbor Algorithm in Regression Trees

Given a test instance, follow the branch of the tree that corresponds to the test instance until you reach a leaf. Then:
1. Find the k nearest neighbors of the test instance among the training instances that belong to that leaf.
2. Calculate the mean of the target values of these k nearest neighbors and output this mean as the prediction.

Speaker notes (Geri): This replaces the constant prediction of the basic regression tree process; M6 and M7 apply the k-nearest-neighbor idea in the leaves.
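The two steps above can be sketched end to end. This is a minimal illustration with a hand-built one-split tree and made-up data, not the paper's implementation; it captures M6 in spirit (M7 would weight the neighbors by distance):

```python
def knn_mean(points, x, k):
    """Mean target of the k points nearest to x (unweighted, as in M6)."""
    nearest = sorted(points, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hand-built tree: one internal node testing x <= 5; each leaf keeps
# the training instances that were routed to it during tree building.
tree = {"threshold": 5,
        "left":  [(1, 2), (2, 3), (4, 4)],     # training instances with x <= 5
        "right": [(6, 10), (8, 12), (9, 11)]}  # training instances with x > 5

def predict(tree, x, k=2):
    # Step 0: route the test instance to its leaf.
    leaf = tree["left"] if x <= tree["threshold"] else tree["right"]
    # Steps 1-2: find k nearest neighbors in the leaf and average them.
    return knn_mean(leaf, x, k)

print(predict(tree, 7))  # right leaf; neighbors (6,10) and (8,12) -> 11.0
```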

Experiment Results: Model Evaluation and Comparison

Model     Performance (MSE in m/s)   Learning time (s)   Forecasting time (s)
Model 1   1.421                      0.017               0.000
Model 2   1.427                      0.016
Model 3   1.398                      0.022
Model 4                              0.034               0.001
Model 5   1.373                      0.019
Model 6   1.326                      0.055
Model 7                              0.056
Model 8   2.788                      0.076               0.004

Speaker notes (Han): These regression models are used in the leaf nodes.

Alternative Non-RT Methods: Comparison

Method   Performance (MSE in m/s)   CPU time (s)
M6       1.326                      0.055
M7                                  0.056
SVM      1.358                      511.437
MLP      1.368                      17.773
CART     1.465                      0.016

Speaker notes: Geri.

Conclusion

Regression trees based on local models (in particular M6 and M7) are a powerful methodology for predicting wind speeds.
Regression tree algorithms compare favorably with the other methods in terms of prediction error on the testing data.
In addition, regression trees have low overall computational times, which is beneficial given the short-term nature of wind speed forecasting.

Thank you! Questions?