Look-ahead Linear Regression Trees (LLRT)
David Vogel, A.I. Insight
Ognian Asparouhov, A.I. Insight
Tobias Scheffer, Max Planck Institute for Computer Science
What is a Linear Regression Tree?
[Figure: two tree diagrams, each rooted at "All Data", one labeled "Regression Tree" and one labeled "LRT".]
A regression tree predicts a constant value in each leaf; a linear regression tree (model tree) fits a linear regression model in each leaf.
Optimizing a Regression Tree
Optimizing a Model Tree
Simulated Example #1
Simulated Example #2
[Figure: simulated data in two groups, X1=0 and X1=1.]

Error by model:
              Regression   CART split on X2   LLRT split on X1   True Model
Training      1.338        0.440              0.104              0.103
Validation    1.491        0.490              0.101              -
LLRT Idea
- Brute force: try every possible split on each splitting variable.
- For each candidate split, fit the best possible linear model in both the left and right partitions, and score the split by the combined accuracy (see the sketch after this list).
- Use aggressive optimization to make this exhaustive search tractable.
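To make the brute-force idea concrete, here is a minimal sketch of a look-ahead split search; it is not the authors' implementation, and the helper names and the use of numpy.linalg.lstsq are illustrative assumptions. For every candidate split point it fits ordinary least-squares models in both partitions and keeps the split with the lowest combined residual sum of squares.

```python
import numpy as np

def rss_of_ols_fit(X, y):
    """Fit an OLS model with intercept and return its residual sum of squares."""
    A = np.column_stack([np.ones(len(X)), X])          # add intercept column
    beta, _, _, _ = np.linalg.lstsq(A, y, rcond=None)  # least-squares fit
    resid = y - A @ beta
    return float(resid @ resid)

def best_lookahead_split(X, y, min_leaf=10):
    """Brute-force look-ahead search: try every split on every variable,
    fit linear models in both partitions, keep the lowest combined RSS."""
    n, d = X.shape
    best = (np.inf, None, None)            # (combined RSS, variable, threshold)
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:  # candidate thresholds
            left = X[:, j] <= t
            if min_leaf <= left.sum() <= n - min_leaf:
                rss = rss_of_ols_fit(X[left], y[left]) + rss_of_ols_fit(X[~left], y[~left])
                if rss < best[0]:
                    best = (rss, j, float(t))
    return best

# Tiny demo where the useful split is on X1, mirroring Simulated Example #2:
rng = np.random.default_rng(0)
X = np.column_stack([rng.integers(0, 2, 500), rng.normal(size=500)])
y = np.where(X[:, 0] == 0, 2.0 * X[:, 1], -2.0 * X[:, 1]) + 0.1 * rng.normal(size=500)
print(best_lookahead_split(X, y))  # picks variable 0 (X1)
```

A single-variable split on X2 cannot separate the two regimes here, so only a look-ahead search that fits the leaf models while evaluating the split finds the X1 cut.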
Related Citations
- 1992: RETIS (Karalic) optimizes the overall RSS when choosing splits.
- RETIS's optimization was still cited as "intractable" or "non-scalable" as recently as 2005 (Machine Learning).
Optimizations in LLRT
- Quick evaluation of many candidate leaf models from common sufficient statistics.
- Matrix solutions: Gaussian elimination versus SVD.
- Shortcuts for forward stepwise model selection.
- Limit of 10-20 candidate split points per variable.
Optimizing Regression Analysis
$\hat{\beta} = (X^\top X)^{-1} X^\top y$, where $X^\top X$ and $X^\top y$ are the per-partition sufficient statistics, and
$\mathrm{RSS} = y^\top y - \hat{\beta}^\top X^\top y$.
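A minimal sketch of the sufficient-statistics trick, assuming the statistics kept per partition are $X^\top X$, $X^\top y$, and $y^\top y$ (the class name SuffStats and the 80/20-style sweep below are illustrative, not the authors' API): each candidate model costs only a small linear solve, and moving a record across the split updates the statistics in O(d^2) instead of refitting from scratch.

```python
import numpy as np

class SuffStats:
    """Sufficient statistics for OLS on one partition: X'X, X'y, y'y."""
    def __init__(self, d):
        self.xtx = np.zeros((d, d))
        self.xty = np.zeros(d)
        self.yty = 0.0

    def add(self, x, y):       # add one record (x includes the intercept term)
        self.xtx += np.outer(x, x)
        self.xty += y * x
        self.yty += y * y

    def remove(self, x, y):    # move a record out: O(d^2), no refit needed
        self.xtx -= np.outer(x, x)
        self.xty -= y * x
        self.yty -= y * y

    def rss(self):
        """RSS = y'y - beta' X'y, with beta = (X'X)^{-1} X'y."""
        beta = np.linalg.solve(self.xtx, self.xty)
        return self.yty - beta @ self.xty

# Sweep one sorted variable: start with every record on the right, move
# records left one at a time, reading off the RSS of both models per cut.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])  # intercept + 2 vars
y = X @ np.array([1.0, 2.0, -3.0]) + 0.1 * rng.normal(size=200)
order = np.argsort(X[:, 1])

left, right = SuffStats(3), SuffStats(3)
for i in order:
    right.add(X[i], y[i])

best_rss, best_cut = np.inf, None
for k, i in enumerate(order[:-20]):
    right.remove(X[i], y[i])
    left.add(X[i], y[i])
    if k >= 20:                                 # keep at least 20 records per side
        combined = left.rss() + right.rss()
        if combined < best_rss:
            best_rss, best_cut = combined, k
print(best_rss, best_cut)
```

The sweep evaluates every cut point along a variable with two O(d^2) updates and two O(d^3) solves per cut, rather than two full regressions per cut, which is what makes the look-ahead search affordable.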
S-fold cross-validation to avoid over-fitting: candidate splits are scored on held-out folds rather than on the training data.
Stopping Rule
Stop splitting when the proposed split fails to improve the validation result under two different samplings.
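A minimal sketch of such a stopping check, assuming "samplings" means independent train/validation resamplings; the function names, the 80/20 proportion, and the evaluate callback (which fits the current leaf model and the proposed split on the training part and returns both validation errors) are all illustrative assumptions.

```python
import numpy as np

def should_stop(evaluate, X, y, n_samplings=2, seed=0):
    """Stopping rule: stop growing when the proposed split fails to improve
    the validation error in every one of n_samplings resamplings."""
    rng = np.random.default_rng(seed)
    failures = 0
    for _ in range(n_samplings):
        idx = rng.permutation(len(y))
        cut = int(0.8 * len(y))                 # 80/20 resampling is an assumption
        tr, va = idx[:cut], idx[cut:]
        err_leaf, err_split = evaluate(X[tr], y[tr], X[va], y[va])
        if err_split >= err_leaf:               # split failed this sampling
            failures += 1
    return failures == n_samplings              # stop only if it failed every time
```

Requiring the split to fail under both samplings before stopping guards against halting the tree on a single unlucky resample.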
M5 Run-time
[Figure: M5 training time in minutes (10-60) versus number of records (20,000-100,000).]
Reviewer Criticisms
- Limitation of 10-20 candidate split points per variable.
- Experimental results do not include comparisons to many other model tree algorithms.
- LLRT is slower than other model tree algorithms.