Insight: extracting conceptually appealing information from data. Exposition: displaying the decision tree results in a form to communicate insight.

Presentation transcript:

• Insight: extracting conceptually appealing information from data
• Exposition: displaying the decision tree results in a form that communicates insight and informs policy and planning
• Tell a story

• Conceptual model
• Operationalize the conceptual model
• Develop the story in context
• Key relationships and story plot
• Create testable hypotheses

• Top-down decision tree creation
• Select branches that conform to the model
• Selected branches can have lower logworth than competing branches
• Include non-significant branches that reflect the conceptual model
• Test hypotheses
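Logworth is commonly defined as -log10 of a split's p-value, so a larger value means stronger statistical evidence for the split. The sketch below assumes a binary split on a binary target, scored with a Pearson chi-square test on one degree of freedom; the function names are hypothetical and the slides do not say which test the authors used.

```python
import math
import sys

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def logworth(p_value):
    """logworth = -log10(p-value); larger means stronger evidence."""
    return -math.log10(p_value)

def split_logworth(a, b, c, d):
    """Logworth of a binary split on a binary target (assumed chi-square, df = 1)."""
    stat = chi_square_2x2(a, b, c, d)
    # Survival function of a chi-square variable with one degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    # Floor the p-value so an underflow to 0.0 does not break log10.
    return logworth(max(p, sys.float_info.min))

# Rows are the two branches, columns are the two target classes.
statistically_best = split_logworth(30, 10, 10, 30)
model_conforming = split_logworth(22, 18, 18, 22)
print(statistically_best > model_conforming)
```

In an expository tree, the analyst may still keep the second split if it is the one the conceptual model calls for, accepting its lower logworth.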

• Based on the underlying conceptual model
• Ishikawa diagram (fishbone diagram)
• Determine likely relevant dimensions from the data
• Test hypotheses

• Exposition needs interpretation
• Prediction does not need to tell a story
• Prediction needs to forecast future values accurately and to be reproducible and reliable

• Sample design to gain knowledge of the environment
• Data efficacy and operational measures: data that relate to known or likely factors predicting the target
• True measures

• The challenge: identifying strong predictors
• Matching predictors with range
• Combinations of predictors
• Approach: Bonferroni adjustment together with validation or cross-validation
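The approach named above can be sketched in two small pieces. The first applies a Bonferroni adjustment, multiplying a p-value by the number of candidate tests so that searching many predictors does not inflate apparent significance. The second generates index sets for k-fold cross-validation. All function names are hypothetical, and the slides do not specify how the test count is determined.

```python
import math

def bonferroni_p(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

def bonferroni_logworth(p_value, n_tests):
    """Logworth (-log10 p) computed on the adjusted p-value."""
    return -math.log10(bonferroni_p(p_value, n_tests))

def k_fold_indices(n_rows, k):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, validation in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, validation

# A predictor that looks strong alone is weaker once 50 candidates were searched.
print(bonferroni_logworth(0.001, 1), bonferroni_logworth(0.001, 50))

# Every row is held out exactly once across the folds.
for train, validation in k_fold_indices(10, 5):
    print(len(train), validation)
```

The adjustment guards against chance findings during the search, while the held-out folds check whether a selected predictor still performs on data it was not chosen on.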

• Stand-in variables
• Create composites (principal components, factor scores, or other reduction measures)
• More data
• Best fit is the right size, but what is the right size?
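One simple way to build a composite is to standardize several related variables and average them row by row. The sketch below is a deliberately plain stand-in for principal components or factor scores, which weight the variables rather than averaging them equally; the function names and sample columns are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize a column to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - m) / s for v in values]

def composite_score(columns):
    """Average the z-scores of several related columns, row by row."""
    standardized = [z_scores(col) for col in columns]
    return [mean(row) for row in zip(*standardized)]

# Two hypothetical measures of the same underlying dimension, on different scales.
tenure_years = [1, 2, 3, 4, 5]
training_hours = [10, 25, 30, 45, 50]
print(composite_score([tenure_years, training_hours]))
```

Replacing several correlated inputs with one composite reduces the number of candidate predictors the tree has to search.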

• Multiway splits: use as many partitions as there are distinct values
• Binary splits: divide the values into two subsets
• Either way, the optimal partitioning has to be found

In theory, multiway splits are no more flexible than binary splits, because any multiway split can be reproduced by a sequence of binary splits. Multiway splits often give more interpretable trees because each split variable tends to be used fewer times. Many practitioners prefer binary splits because an exhaustive search over candidate splits is more feasible.
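The feasibility argument can be made concrete by counting candidate splits. The sketch below counts the ways to split an input with a given number of distinct levels, under the assumption that ordinal inputs may only be cut into contiguous groups while nominal inputs may be grouped freely. The function names are hypothetical.

```python
def bell_number(n):
    """Number of ways to partition n distinct items (via the Bell triangle)."""
    row = [1]
    for _ in range(n):
        nxt = [row[-1]]
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[0]

def ordinal_binary_splits(levels):
    """One cut point between adjacent ordered levels."""
    return max(levels - 1, 0)

def ordinal_multiway_splits(levels):
    """Any non-empty set of cut points: two or more contiguous groups."""
    return 2 ** (levels - 1) - 1 if levels > 0 else 0

def nominal_binary_splits(levels):
    """Unordered pairs of non-empty complementary subsets."""
    return 2 ** (levels - 1) - 1 if levels > 0 else 0

def nominal_multiway_splits(levels):
    """All set partitions except the trivial single-group one."""
    return bell_number(levels) - 1

for levels in (4, 8, 12):
    print(levels,
          ordinal_binary_splits(levels),
          ordinal_multiway_splits(levels),
          nominal_binary_splits(levels),
          nominal_multiway_splits(levels))
```

The binary counts for ordinal inputs grow linearly with the number of levels, while the unrestricted multiway counts grow far faster, which is why an exhaustive search is usually practical only for binary splits.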