Linear Regression. Fitting Models to Data Linear Analysis Decision Trees.


1 Linear Regression

2

3 Fitting Models to Data: Linear Analysis, Decision Trees

4

5
Year | What | Notes | Who
1963 | AID: Automatic Interaction Detector | Continuous | James Morgan, John Sonquist
1973 | THAID: THeta AID | Categorical | James Morgan, Robert Messenger
1980 | CHAID: CHi-Square AID | Multiple Splits | Kass
1984 | CART: Classification and Regression Trees | Popular Approach | Leo Breiman
1986 | Iterative Dichotomiser 3 (ID3) | Categorical | Ross Quinlan
1994 | C4.5 Algorithm | Continuous and Categorical | Ross Quinlan
1994 | Bagging | Resampling | Leo Breiman
 | Boosting | Cascading Small Trees | Rob Schapire, Jerry Friedman
2001 | Random Forests | Many Trees | Leo Breiman, Adele Cutler

6 AID: Automatic Interaction Detector. Association, Co-Occurrence

7 CHAID

8 CART: Classification and Regression Trees. The CART family is oriented toward statistics and uses the concept of impurity, which measures how well the two classes are separated. Ideally we would like to separate all 0s from all 1s. http://freakonometrics.hypotheses.org/1279
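
To make the impurity idea concrete, here is a minimal Python sketch. It assumes the Gini index as the impurity measure (the slide does not say which measure is meant) and a single numeric feature; the function names gini, split_impurity and best_threshold are hypothetical and not taken from any package.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted impurity of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

def best_threshold(xs, ys):
    """Scan midpoints between sorted distinct x values.

    Returns (threshold, impurity) for the split with the lowest weighted
    impurity, or (None, impurity of the unsplit node) if no split helps.
    """
    values = sorted(set(xs))
    best = (None, gini(ys))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        imp = split_impurity(left, right)
        if imp < best[1]:
            best = (t, imp)
    return best
```

A node holding only 0s or only 1s has impurity 0, which is the ideal separation described on the slide; an even mix of two classes has the maximum two-class Gini impurity of 0.5.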

9 Fitting Models to Data

10 Titanic Case Study

11 OverFitting

12

13 Bagging: Builds multiple decision trees by repeatedly resampling the training data with replacement. A model is fit to each sample, and the trees vote to produce a consensus prediction.
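
The three bagging steps (resample with replacement, fit a model per sample, vote) can be sketched in plain Python. This is an illustration under stated assumptions, not the method of any particular package: each "tree" is a one-split stump on a single numeric feature, and the names fit_stump, bagging and predict, along with the n_models and seed parameters, are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-split classifier (a very small tree) on one numeric feature."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label

def bagging(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict(models, x):
    """Majority vote across all fitted models."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

Usage would look like `models = bagging(xs, ys)` followed by `predict(models, x_new)`. Real implementations grow full trees rather than stumps; the stump keeps the sketch short.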

14 Boosting: Learns slowly. Given the current model, we fit a decision tree to the residuals (misclassifications) from the model. We then add this new decision tree into the fitted function in order to update the residuals. Each of these trees can be rather small, with just a few terminal nodes, determined by the parameter d in the algorithm. By fitting small trees to the residuals, we slowly improve the fit in areas where the model does not perform well.
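
The residual-fitting loop described on this slide can be sketched as follows. The sketch makes several assumptions that are not on the slide: it treats a regression problem with numeric residuals rather than misclassifications, it fixes d = 1 so every tree is a single-split stump, and it introduces a shrinkage factor and a tree count (shrinkage, n_trees) as hypothetical parameters to make the "learns slowly" behaviour explicit. The training data must contain at least two distinct x values.

```python
def fit_stump(xs, rs):
    """Least-squares regression stump: a tree with one split (d = 1)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, rs) if x <= t]
        right = [r for x, r in zip(xs, rs) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean

def boost(xs, ys, n_trees=200, shrinkage=0.1):
    """Repeatedly fit a small tree to the current residuals and add it in."""
    residuals = list(ys)  # the initial model predicts 0, so residuals equal y
    trees = []
    for _ in range(n_trees):
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Add a shrunken copy of the new tree, then update the residuals.
        residuals = [r - shrinkage * tree(x) for x, r in zip(xs, residuals)]
    return lambda x: shrinkage * sum(tree(x) for tree in trees)
```

Because each tree contributes only a shrunken correction, no single tree dominates the fit; the model improves gradually in the regions where the residuals are still large.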

15 Random Forests

16 http://www.stat.berkeley.edu/~breiman/RandomForests/

17

18 Gradient Boosting

19 Many Algorithms
Decision Trees:
rpart (CART)
tree (CART)
ctree (conditional inference tree)
CHAID (chi-squared automatic interaction detection)
evtree (evolutionary algorithm)
mvpart (multivariate CART)
knnTree (nearest-neighbor-based trees)
RWeka (J4.8, M5', LMT)
LogicReg (Logic Regression)
BayesTree
TWIX (with extra splits)
party (conditional inference trees, model-based trees)
Random Forests:
randomForest (CART-based random forests)
randomSurvivalForest (for censored responses)
party (conditional random forests)
gbm (tree-based gradient boosting)
mboost (model-based and tree-based gradient boosting)

