1 DATA MINING: DEFINITIONS AND DECISION TREE EXAMPLES Emily Thomas Director of Planning and Institutional Research.

Presentation transcript:

1 DATA MINING: DEFINITIONS AND DECISION TREE EXAMPLES Emily Thomas Director of Planning and Institutional Research

2 WHAT IS DATA MINING?
+ Data mining is the discovery of hidden knowledge, unexpected patterns, and new rules in large databases.
- Data mining is exploratory. Its results lack the protection against spurious conclusions that theory-based, hypothesis-driven statistics provide.
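One way to see why exploratory results are exposed to spurious conclusions is the multiple-comparisons effect: the more relationships are tested, the more likely it is that at least one looks significant by chance. The sketch below is an illustration only, not part of the presentation. It assumes independent tests and a conventional 0.05 significance level, and the function name is hypothetical.

```python
def chance_of_spurious_finding(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    is significant at level `alpha` when no real relationship exists."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Assumed test counts, chosen only to show how quickly the risk grows.
    for num_tests in (1, 10, 50):
        risk = chance_of_spurious_finding(num_tests)
        print(f"{num_tests:>3} tests: {risk:.1%} chance of at least one false positive")
```

Under these assumptions the risk rises from 5% for a single test to more than 90% for fifty, which is why patterns found by mining are usually treated as leads to confirm rather than as conclusions.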

3 WHY USE DATA MINING? In the corporate world: Large amounts of data are captured in enterprise databases. These databases are too large for traditional statistical techniques. Identifying patterns in the data can target profitable, or unprofitable, customers.

4 WHY USE DATA MINING? In institutional research: There are large numbers of variables. We have insufficient time and resources to investigate all the relationships that might be informative. Identifying data patterns can shed light on student behavior.

5 WHY DATA MINING NOW?
Development of large, integrated enterprise databases
Development of data mining techniques and software
Development of simplified user interfaces

6 DATA MINING TECHNIQUES
Decision trees
Rule induction
Nearest neighbors
Neural networks
Clustering
Genetic algorithms
Exploratory factor analysis
Stepwise regression

7 DECISION TREE ANALYSIS
CHAID: Chi-squared Automatic Interaction Detector (SPSS AnswerTree)
1. Select significant independent variables.
2. Identify category groupings or interval breaks to create groups most different with respect to the dependent variable.
3. Select as the primary independent variable the one identifying groups with the most different values of the dependent variable.
4. Select additional variables to extend each branch if there are further significant differences.
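The variable-selection step can be sketched roughly as follows. This is a simplified illustration, not the AnswerTree implementation: it ranks candidate predictors by the Pearson chi-squared statistic against the dependent variable and picks the largest, whereas CHAID proper also merges similar categories and compares adjusted p-values. The records, the field names (`housing`, `aid`, `returned`), and the helper names are all hypothetical and are not data from the presentation.

```python
from collections import Counter


def chi_squared(rows, predictor, outcome):
    """Return (Pearson chi-squared statistic, degrees of freedom)
    for the cross-tabulation of `predictor` against `outcome`."""
    observed = Counter((row[predictor], row[outcome]) for row in rows)
    predictor_totals = Counter(row[predictor] for row in rows)
    outcome_totals = Counter(row[outcome] for row in rows)
    n = len(rows)
    statistic = 0.0
    for p_value, p_count in predictor_totals.items():
        for o_value, o_count in outcome_totals.items():
            expected = p_count * o_count / n
            statistic += (observed[(p_value, o_value)] - expected) ** 2 / expected
    dof = (len(predictor_totals) - 1) * (len(outcome_totals) - 1)
    return statistic, dof


def best_split(rows, predictors, outcome):
    """Pick the predictor whose groups differ most on the outcome.
    Comparing raw statistics is only fair when the degrees of freedom match."""
    return max(predictors, key=lambda name: chi_squared(rows, name, outcome)[0])


def make_rows(housing, aid, returned, count):
    """Build `count` identical hypothetical student records."""
    return [{"housing": housing, "aid": aid, "returned": returned} for _ in range(count)]


# Hypothetical cohort: retention differs by housing but not by financial aid.
rows = (
    make_rows("on", "yes", "yes", 22) + make_rows("on", "no", "yes", 22)
    + make_rows("on", "yes", "no", 3) + make_rows("on", "no", "no", 3)
    + make_rows("off", "yes", "yes", 13) + make_rows("off", "no", "yes", 13)
    + make_rows("off", "yes", "no", 12) + make_rows("off", "no", "no", 12)
)

if __name__ == "__main__":
    for name in ("housing", "aid"):
        statistic, dof = chi_squared(rows, name, "returned")
        print(f"{name}: chi-squared = {statistic:.2f}, dof = {dof}")
    print("first split on:", best_split(rows, ["housing", "aid"], "returned"))
```

Extending a branch, as in step 4, would amount to repeating the same selection on the subset of records within each group and stopping when no remaining predictor shows a significant difference.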

8 TRANSFER RETENTION RATES Percent of new full-time Fall 2002 transfers returning in Spring 2003

9 TRANSFER RETENTION RATES FALL 2002-SPRING 2003

10 SOS 2000: SATISFACTION WITH THE QUALITY OF EDUCATION

11 VERY LARGE INTELLECTUAL GROWTH 19% of students

12 LARGE INTELLECTUAL GROWTH 41% of students

13 LOW OR MODERATE INTELLECTUAL GROWTH 40% of students

14 SOS 2000: SATISFACTION WITH “THIS COLLEGE IN GENERAL”

15 DECISION TREE ADVANTAGES AND DISADVANTAGES
+ Discover unexpected relationships
+ Identify subgroup differences
+ Use categorical or continuous data
+ Accommodate missing data
- Possibly spurious relationships
- Presentation difficulties

16 BIBLIOGRAPHY
AnswerTree 2.0: User’s Guide. SPSS.
Adriaans, P and D Zantinge (1996). Data Mining. Harlow, England and elsewhere: Addison-Wesley.
Bordon, VMH (1995). Segmenting Student Markets with a Student Satisfaction and Priorities Survey. Research in Higher Education 16:2.
Neville, PG (1999). “Decision Trees for Predictive Modeling,” SAS Technical Report, The SAS Institute.
Thomas, EH and N Galambos. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision Tree Analysis. Forthcoming in Research in Higher Education, May 2004.