Ungraded quiz Unit 5.

Show me your fingers
Do not shout out the answer, or your classmates will follow what you said. Use your fingers:
One finger (the right finger) = A
Two fingers = B
Three fingers = C
Four fingers = D
No finger = I don't know; I didn't study.

Which of the following is NOT a problem of traditional OLS regression?
A. Too many assumptions about the residuals and the predictors
B. It tends to overfit to the sample.
C. The model is unstable when some predictors are strongly correlated (collinearity).
D. There is only one unique solution when the sample size is large.
E. It must be a linear model.

Which of the following is NOT a characteristic of generalized regression?
A. Similar to abduction or IBE: don't fix on one single answer; consider a few.
B. It imposes a penalty on the model to reduce complexity.
C. It starts with a full model and then scales the model back.
D. It is also known as regularized regression.

Which of the following is NOT an option in JMP's generalized regression?
A. Elastic Net
B. Lasso
C. Double Lasso
D. Ridge
E. Dante Selector

Elastic net combines the penalty methods of…
A. Double Lasso and Ridge
B. Lasso and Ridge
C. Double Lasso and Dantzig selector
D. Ridge and Dantzig selector
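The elastic net penalty is a weighted blend of the lasso's L1 term and the ridge's L2 term. A minimal sketch in Python (the `alpha`/`l1_ratio` parameterization follows scikit-learn's convention; the function name is illustrative):

```python
def elastic_net_penalty(beta, alpha=1.0, l1_ratio=0.5):
    """Elastic net penalty: alpha * (l1_ratio * ||beta||_1 + (1 - l1_ratio)/2 * ||beta||_2^2).

    l1_ratio = 1 recovers the lasso (L1) penalty;
    l1_ratio = 0 recovers the ridge (L2) penalty.
    """
    l1 = sum(abs(b) for b in beta)        # lasso term: sum of absolute coefficients
    l2 = sum(b * b for b in beta)         # ridge term: sum of squared coefficients
    return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) / 2.0 * l2)
```

Setting `l1_ratio` between 0 and 1 is what makes elastic net a compromise: it can zero out coefficients like the lasso while stabilizing correlated predictors like ridge.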

AIC or AICc is better than BIC because…
A. AIC and AICc are based on the principle of information loss.
B. The Bayesian approach requires a prior input, which is usually debatable.
C. AIC is asymptotically optimal for model selection in terms of root mean square error, whereas BIC is not.
D. All of the above.
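For reference, AIC = 2k − 2 ln(L), where k is the number of parameters and L the maximized likelihood; AICc adds a small-sample correction that vanishes as n grows. A sketch of the standard formulas (function names are illustrative):

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower = less estimated information loss."""
    return 2 * k - 2 * log_likelihood

def aicc(log_likelihood, k, n):
    """AICc = AIC + 2k(k+1)/(n-k-1): small-sample correction; converges to AIC as n grows."""
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)
```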

In one of the leaves (partitioned groups) of a tree, there are 9 males and 1 female. According to Gini, this group is…
A. Highly impure
B. Highly pure
C. Neither pure nor impure; insufficient information

Which of the following statements about the leaf report is UNTRUE?
A. It is based upon nested-if logic.
B. It shows interactions of variables.
C. The probability of each scenario is the same as the p value in hypothesis testing.

Which of the following statements about the ROC curve is UNTRUE?
A. It originated from Signal Detection Theory during WWII.
B. The area under the curve indicates the predictive accuracy.
C. When the AUC value is .6 or above, it is considered acceptable.
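The area under the ROC curve has a simple interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as half). A brute-force sketch of that definition (illustrative; real packages use a faster rank-based computation):

```python
def auc(scores_pos, scores_neg):
    """AUC = P(random positive scores higher than random negative); ties count 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

An AUC of 0.5 means the classifier scores no better than chance; 1.0 means it ranks every positive above every negative.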

Which of the following statements is TRUE?
A. If the outcome variable is categorical, LogWorth (based on the likelihood-ratio chi-square) is reported.
B. LogWorth is the inverse of the p value.
C. Like the p value, a lower LogWorth is better.
D. LogWorth is a partitioning criterion based on impurity.

In JMP, if you uncheck Informative Missing before generating a partition tree, what would happen?
A. The missing data will be deleted in a listwise way.
B. The missing data will be deleted in a casewise way.
C. The missing data will be imputed.

Which of the following is NOT an option in SPSS's classification trees?
A. CHAID
B. QUEST
C. CRT

Which of the following is NOT a drawback of SPSS's classification trees?
A. The graphs in JMP are dynamic and interactive, whereas their SPSS counterparts are static.
B. The interactive model outline in SPSS shows much less detail than the tree in JMP.
C. The hierarchy of the SPSS tree might not correspond to the rank of predictor importance.
D. Neither the viewer nor the model outline allows tree pruning.