Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.

Slides:



Advertisements
Similar presentations
Classification: Definition Given a collection of records (training set ) –Each record contains a set of attributes, one of the attributes is the class.
Advertisements

Civil and Environmental Engineering Carnegie Mellon University Sensors & Knowledge Discovery (a.k.a. Data Mining) H. Scott Matthews April 14, 2003.
DATA MINING CS157A Swathi Rangan. A Brief History of Data Mining The term “Data Mining” was only introduced in the 1990s. Data Mining roots are traced.
Week 9 Data Mining System (Knowledge Data Discovery)
© Prentice Hall1 DATA MINING TECHNIQUES Introductory and Advanced Topics Eamonn Keogh (some slides adapted from) Margaret Dunham Dr. M.H.Dunham, Data Mining,
Data Mining Knowledge Discovery in Databases Data 31.
Data Mining By Archana Ketkar.
Business Intelligence: A Managerial Perspective on Analytics (3rd Edition) Chapter 4: Data Mining.
Data Mining – Intro.
Chapter 5 Data mining : A Closer Look.
Data Mining & Data Warehousing PresentedBy: Group 4 Kirk Bishop Joe Draskovich Amber Hottenroth Brandon Lee Stephen Pesavento.
Computer Science Universiteit Maastricht Institute for Knowledge and Agent Technology Data mining and the knowledge discovery process Summer Course 2005.
Chapter 4: Data Mining for Business Intelligence
Chapter 5: Data Mining for Business Intelligence
Data Mining By Andrie Suherman. Agenda Introduction Major Elements Steps/ Processes Tools used for data mining Advantages and Disadvantages.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Data Mining Week 10.
Chapter 5: Data Mining for Business Intelligence
Dr. Awad Khalil Computer Science Department AUC
Chapter 5: Data Mining for Business Intelligence
Chapter 5: Data Mining for Business Intelligence
Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures Recurrent Patterns.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
Data Mining Techniques As Tools for Analysis of Customer Behavior
Chapter 5: Data Mining for Business Intelligence
Chapter 7 DATA, TEXT, AND WEB MINING Pages , 311, Sections 7.3, 7.5, 7.6.
Chapter 4: Data Mining for Business Intelligence
Data Mining CS157B Fall 04 Professor Lee By Yanhua Xue.
Introduction to Data Mining Group Members: Karim C. El-Khazen Pascal Suria Lin Gui Philsou Lee Xiaoting Niu.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Data Mining Process A manifestation of best practices A systematic way to conduct DM projects Different groups has different versions Most common standard.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Chapter 11 Business Intelligence Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall 11-1.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
Data mining. Data mining, at its core, is the transformation of large amounts of data into meaningful patterns and rules.
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
CHAPTER 4 Data Warehousing, Access, Analysis, Mining, and Visualization 2 1.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Chapter 14 Data Mining Transparencies. 2 Chapter Objectives u The concepts associated with data mining. u The main features of data mining operations,
Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-1 Data Mining Methods: Classification Most frequently used DM method Employ supervised.
An Introduction Student Name: Riaz Ahmad Program: MSIT( ) Subject: Data warehouse & Data Mining.
Academic Year 2014 Spring Academic Year 2014 Spring.
Monday, February 22,  The term analytics is often used interchangeably with:  Data science  Data mining  Knowledge discovery  Extracting useful.
Chapter 5: Data Mining.
Chapter 2 Data, Text, and Web Mining. Data Mining Concepts and Applications  Data mining (DM) A process that uses statistical, mathematical, artificial.
Copyright © 2014 Pearson Education, Inc. 5-1 DATA MINING.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 28 Data Mining Concepts.
Business Intelligence and Decision Support Systems (9 th Ed., Prentice Hall) Chapter 6: Artificial Neural Networks for Data Mining.
DATA MINING TECHNIQUES (DECISION TREES ) Presented by: Shweta Ghate MIT College OF Engineering.
Decision Support Systems Data Mining for Business Intelligence.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Business Intelligence and Decision Support Systems (9 th Ed., Prentice Hall) Chapter 6: Artificial Neural Networks for Data Mining.
Business Intelligence: A Managerial Approach (2nd Edition)
Data Mining – Intro.
DATA MINING © Prentice Hall.
Chapter 5: Data Mining for Business Intelligence
Predictive Analytics I: Data Mining Process, Methods, and Algorithms
Week 11 Knowledge Discovery Systems & Data Mining :
Prepared by: Mahmoud Rafeek Al-Farra
Supporting End-User Access
Business Intelligence: A Managerial Approach (2nd Edition)
Data Warehousing Data Mining Privacy
Business Intelligence
Presentation transcript:

Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-2 Why Data Mining ? Why Data Mining تعدين البيانات? More intense competition المنافسة الشديدة at the global scale Availability of quality data on customers, vendors, transactions, Web, etc. Consolidation توحيد and integration of data repositories مستودعات البيانات into data warehouses The exponential increase in data processing and storage capabilities; and decrease in cost

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-3 Definition of Data Mining The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns أنماط in data stored in structured databases. - Fayyad et al., (1996) Keywords in this definition: Process, nontrivial, valid, novel, potentially useful, understandable. Other names: knowledge extraction, pattern analysis, knowledge discovery اكتشاف, information harvesting حصّـد, pattern searching, data dredging,…

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-4 Data Mining at the Intersection of Many Disciplines Data Mining at the Intersection تقاطع of Many Disciplines المجالات

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-5 Data Mining Characteristics/Objectives Source of data for DM is often a consolidated data warehouse (not always!) DM environment is usually a client-server Data is the most critical ingredient عنصر for DM The miner الشخص القائم بالتعدين is often an end user Striking it rich requires creative thinking

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-6 Data in Data Mining Data: a collection of facts usually obtained as the result of experiences, observations, or experiments Data may consist of numbers, words, images, … Data: lowest level of abstraction (from which information and knowledge are derived) -DM with different data types? - Other data types? مرتبه فواصل

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-7 What Does DM Do? DM extract patterns from data Pattern? A mathematical (numeric and/or symbolic) relationship among data items Types of patterns Association جمعية Prediction Cluster (segmentation) تجزئة Sequential (or time series) relationships متتابعة

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-8 A Taxonomy for Data Mining Tasks A Taxonomy التصنيف for Data Mining Tasks

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-9 Data Mining Applications Customer Relationship Management Maximize return on marketing campaigns الحملات Improve customer retention (churn analysis) Maximize customer value Identify and treat most valued customers Banking and Other Financial Automate the loan application process Detecting fraudulent transactions Maximize customer value Optimizing cash reserves with forecasting

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-10 Data Mining Applications (cont.) Retailing and Logistics Optimize inventory levels at different locations Improve the store layout and sales promotions Optimize logistics by predicting seasonal effects Minimize losses due to limited shelf life Manufacturing and Maintenance Predict/prevent machinery failures Identify anomalies in production systems to optimize the use manufacturing capacity Discover novel patterns to improve product quality

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-11 Data Mining Applications Brokerage and Securities Trading Predict changes on certain bond prices Forecast the direction of stock fluctuations Assess the effect of events on market movements Identify and prevent fraudulent activities in trading Insurance Forecast claim costs for better business planning Determine optimal rate plans Optimize marketing to specific customers Identify & prevent منع fraudulent claim activities مطالبة احتيالية

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-12 Data Mining Applications (cont.) Computer hardware and software Science and engineering Government and defense Homeland security and law enforcement تنفيذ Travel industry Healthcare Medicine Entertainment industry Sports Etc. Highly popular application areas for data mining

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-13 Data Mining Process A manifestation of best practices A systematic way to conduct DM projects Different groups has different versions Most common standard processes: CRISP-DM (Cross-Industry Standard Process for Data Mining) SEMMA (Sample, Explore, Modify, Model, and Assess) KDD (Knowledge Discovery in Databases)

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-14 Data Mining Process: CRISP-DM

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-15 Data Mining Process: CRISP-DM Step 1: Business Understanding Step 2: Data Understanding Step 3: Data Preparation (!) إعداد البيانات Step 4: Model Building Step 5: Testing and Evaluation Step 6: Deployment The process is highly repetitive and experimental (DM: art versus science?) Accounts for ~85% of total project time

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-16 Data Preparation – A Critical DM Task

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-17 Data Mining Process: SEMMA

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-18 Data Mining Methods: Classification Most frequently used DM method Part of the machine-learning family Employ supervised learning Learn from past data, classify new data The output variable is categorical (nominal or ordinal) in nature Classification versus regression? Classification versus clustering?

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-19 Assessment Methods for Classification Predictive accuracy الدقة التنبؤية Hit rate Speed Model building; predicting Robustness متانة Scalability التدرجية Interpretability التفسير إمكانية Transparency, explainability

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-20 Classification Techniques Decision tree analysis Statistical analysis Neural networks Support vector machines Case-based reasoning Bayesian classifiers تصنيفات النظرية الافتراضية Genetic algorithms Rough sets مجموعات الخام

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-21 Decision Trees Employs the divide and conquer method Recursively divides a training set until each division consists of examples from one class 1. Create a root node and assign all of the training data to it 2. Select the best splitting attribute 3. Add a branch to the root node for each value of the split. Split the data into mutually exclusive subsets along the lines of the specific split 4. Repeat the steps 2 and 3 for each and every leaf node until the stopping criteria is reached A general algorithm for decision tree building

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-22 Cluster Analysis for Data Mining Used for automatic identification of natural groupings of things Part of the machine-learning family Employ unsupervised learning Learns the clusters of things from past data, then assigns new instances There is not an output variable Also known as segmentation

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-23 Data Mining Myths Data mining … provides instant solutions/predictions is not yet viable for business applications requires a separate, dedicated مخصصة database can only be done by those with advanced degrees is only for large firms that have lots of customer data is another name for the good-old statistics

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-24 Common Data Mining Mistakes 1. Selecting the wrong problem for data mining 2. Ignoring what your sponsor thinks data mining is and what it really can/cannot do 3. Not leaving insufficient time for data acquisition, selection and preparation 4. Looking only at aggregated مُجّمعة results and not at individual records/predictions 5. Being sloppy about keeping track of the data mining procedure and results

Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-25 Common Data Mining Mistakes 6. Ignoring suspicious (good or bad) findings and quickly moving on 7. Running mining algorithms repeatedly and blindly, without thinking about the next stage 8. Naively believing everything you are told about the data 9. Naively believing everything you are told about your own data mining analysis 10. Measuring your results differently from the way your sponsor measures them