Academic Year 2014 Spring

Presentation transcript:

MODULE CC3005NI: Advanced Database Systems “Distributed Database (DDB) and Data Mining (DM)” (PART – 2) Academic Year 2014 Spring

Data Mining:
- Data Mining is a knowledge discovery process: the automated extraction of hidden predictive information or patterns from data in large databases.
- Key Points
  o Knowledge discovery in databases (KDD)
  o Automated process
  o Extraction of, or search for, interesting / useful information, patterns or trends
  o From large databases

Motivation:
- Problem: Data Explosion
  o Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories
  o We are drowning in data, but starving for knowledge!

Motivation:
- Solution: Data Warehousing and Data Mining
  o Data Warehousing and On-Line Analytical Processing (OLAP)
  o Extraction of interesting knowledge (rules, regularities, patterns, future trends, predictions) from large databases.

Data Mining: From Where?
- Data Mining aims to find information from data in:
- Relational Databases
- Data Warehouses
- Advanced databases and information repositories
  o Object Oriented and Object Relational Databases
  o Spatial Databases
  o Time-Series Data and Temporal Data
  o Text Databases and Multimedia Databases
  o Heterogeneous and Legacy Databases
  o WWW

Data Mining: Objectives
- Data Mining aims to provide these capabilities:
- Automated discovery of previously unknown patterns
  o Data Mining tools sift through large amounts of data to discover meaningful new correlations and hidden patterns
- Automated prediction of trends and behaviours
  o Data Mining tools predict future trends and behaviours, allowing businesses to make proactive, knowledge-driven decisions
- The results help businesses make better decisions and gain a competitive advantage

Data Mining: Example
- Database Analysis and Decision Support
- Market Analysis and Management
  o Target Marketing, Customer Relationship Management, Market Basket Analysis, Market Segmentation
- Risk Analysis and Management
  o Forecasting, Customer Retention, Quality Control, Competitive Analysis
- Fraud Detection and Management
  o Identify unusual spending patterns and irregularities
- Other Applications
  o Text Mining (News Group, , Documents) and Web Analysis
  o Intelligent Query Answering

Data Mining: Interdisciplinary Subject
[Diagram: Data Mining at the centre, drawing on Database Technology, Statistics, Visualization, Artificial Intelligence, Machine Learning and Other Disciplines]

Data Mining Process:
- The process of information / knowledge extraction is carried out repetitively, adaptively and progressively. Its stages are:
  o Comprehension of the application domain
  o Preparation of data sets
  o Discovery of patterns
  o Evaluation of patterns and implications
- Comprehension of the Application Domain
  o Develop a good understanding of the application domain.

Data Mining Process:
- Preparation of Data Sets
  o Identify the subset of data in the database / Data Warehouse on which to carry out Data Mining
  o Encode / clean the data to make it suitable input for Data Mining algorithms
- Discovery of Patterns
  o Apply Data Mining techniques to the data set extracted earlier in order to discover repetitive patterns in the data.
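
The encode / clean step above can be illustrated with a minimal Python sketch. The records, field names and cleaning rules are hypothetical and chosen only for illustration; real preparation depends on the data set and on the mining algorithm used.

```python
# Hypothetical raw records selected from a warehouse table (illustrative only).
raw_rows = [
    {"customer": "C1", "age": "34", "segment": "retail"},
    {"customer": "C2", "age": "", "segment": "Retail "},
    {"customer": "C3", "age": "51", "segment": "corporate"},
]

def clean(rows):
    """Drop rows with a missing age and normalise the text fields."""
    cleaned = []
    for row in rows:
        if not row["age"].strip():
            continue  # assumption: incomplete rows are simply discarded
        cleaned.append({
            "customer": row["customer"],
            "age": int(row["age"]),
            "segment": row["segment"].strip().lower(),
        })
    return cleaned

def encode(rows):
    """Map each distinct segment label to an integer code."""
    labels = sorted({r["segment"] for r in rows})
    codes = {label: i for i, label in enumerate(labels)}
    return [(r["age"], codes[r["segment"]]) for r in rows], codes

prepared, segment_codes = encode(clean(raw_rows))
print(prepared)
print(segment_codes)
```

Discarding incomplete rows is only one possible policy; filling in a default value is another, and the choice affects the patterns found later.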

Data Mining Process:
- Evaluation of Patterns and Implications
  o Draw implications from the discovered patterns
  o Evaluate which experiments to carry out next, which hypotheses to formulate, or which consequences to draw in the process of knowledge discovery.

Data Mining: Some Techniques
- Various techniques / algorithms are used for Data Mining, including:
  o Association
  o Classification
  o Sequential Patterns
  o Patterns with Time Series
  o Categorisation and Segmentation

Association Rules
- Association rules discover regular patterns within large data sets, such as the presence of two items within a group of tuples.
- These rules identify situations in which the presence of one item in a transaction is linked, with high probability, to the presence of another item.

Association Rules
- The quality of an association rule can be measured precisely through two properties, SUPPORT and CONFIDENCE.
  o SUPPORT is the percentage of transactions (or baskets) containing both items A and B (A and B can each be a single item or a group of items).
  o CONFIDENCE is the percentage of baskets containing both items A and B among those containing A.
- A rule is considered interesting when both values reach chosen minimum thresholds.
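
The two measures above can be computed directly from a list of baskets. The following is a minimal Python sketch; the function names and the four sample baskets are hypothetical, and the values are returned as fractions rather than percentages.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Fraction of the baskets containing `antecedent` that also contain `consequent`.

    Assumes at least one basket contains `antecedent`.
    """
    both = support(baskets, set(antecedent) | set(consequent))
    return both / support(baskets, antecedent)

# Hypothetical baskets, for illustration only.
baskets = [
    {"milk", "bread", "cereal"},
    {"milk", "bread", "sugar", "eggs"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
]

print(support(baskets, {"milk", "bread"}))       # 3 of 4 baskets
print(confidence(baskets, {"milk"}, {"bread"}))  # every milk basket has bread
print(confidence(baskets, {"bread"}, {"milk"}))  # 3 of 4 bread baskets have milk
```

Note that confidence is not symmetric: the rule milk → bread scores higher here than bread → milk, because one basket contains bread without milk.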

Association – Example: Shopping Habits
[Figure: shopping baskets of Customer 1, Customer 2, Customer 3, ... Customer n: milk + bread + cereal; milk + bread + sugar + eggs; milk + bread + butter; milk + bread + butter. A marketing analyst asks: "Which items are frequently purchased together by my customers?"]
[Table: Boolean representation with one row per basket (up to Basket n) and the columns milk, bread, sugar, butter, cereal, egg; cell values not recoverable]
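
A Boolean representation like the one on this slide can be built in a few lines of Python. This is a sketch under assumptions: the three baskets are taken from the slide's example, the item name "egg" is used for both the basket and the column, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations

ITEMS = ["milk", "bread", "sugar", "butter", "cereal", "egg"]

baskets = [
    ["milk", "bread", "cereal"],
    ["milk", "bread", "sugar", "egg"],
    ["milk", "bread", "butter"],
]

def to_boolean_rows(baskets, items):
    """One row per basket, one 0/1 column per item."""
    return [[int(item in basket) for item in items] for basket in baskets]

def pair_counts(rows, items):
    """Count the baskets in which each pair of items occurs together."""
    counts = Counter()
    for row in rows:
        present = [item for item, flag in zip(items, row) if flag]
        counts.update(combinations(present, 2))
    return counts

rows = to_boolean_rows(baskets, ITEMS)
counts = pair_counts(rows, ITEMS)

for row in rows:
    print(row)
print(counts.most_common(1))
```

Because the pairs are generated in column order, each pair appears under a single key, for example ("milk", "bread") rather than ("bread", "milk").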

Association – Example

How Data Mining (DM) Improves Business
- Strategy 1: Placing milk and bread in close proximity may further encourage customers to purchase these items together in a single visit to the store.
- Strategy 2: Placing milk and bread at opposite ends of the store may entice customers who purchase these items to pick up other items along the way.
- Strategy 3: Sell the two items together as a package at a reduced price.

Classification
- Classification assigns a phenomenon to a predefined class.
- A classifier is an algorithm that carries out classification.
- Classifiers are typically presented as decision trees. In these trees, nodes are labelled with conditions that allow decisions to be made.
- Examples
  o Motor Insurance
  o Health Insurance
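
A decision tree of the kind described above can be sketched in Python as nodes that each hold a condition. The tree below is a hand-written, hypothetical motor insurance example; the attributes `age` and `claims_last_3_years`, the thresholds and the class labels are assumptions, not taken from the slides, and a real classifier would learn the tree from data.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Node:
    """An internal tree node: a condition plus the branch taken for each outcome."""
    condition: Callable[[dict], bool]
    if_true: Union["Node", str]
    if_false: Union["Node", str]

def classify(tree, record):
    """Walk from the root to a leaf; a leaf is a plain class label (str)."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.condition(record) else node.if_false
    return node

# Hypothetical tree for a motor insurance applicant.
tree = Node(
    condition=lambda p: p["age"] < 25,
    if_true="high risk",
    if_false=Node(
        condition=lambda p: p["claims_last_3_years"] > 1,
        if_true="high risk",
        if_false="low risk",
    ),
)

print(classify(tree, {"age": 22, "claims_last_3_years": 0}))
print(classify(tree, {"age": 40, "claims_last_3_years": 0}))
```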

Classification – Example

Sequential Patterns
- Discover patterns between events such that the presence of one set of items/objects is followed by another set in a database of events over a period of time.
- Detecting sequential patterns is equivalent to detecting associations among events that have certain temporal relationships (the time dimension).
- Examples
  o Understanding and analysing long-term customer buying behaviour
  o Medical Diagnosis
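
One simple way to test for a sequential pattern is to check whether the items of the pattern occur in a customer's history in the same order, and then measure how many customers match. The Python sketch below makes that assumption (gaps between items are allowed); the purchase histories and function names are hypothetical.

```python
def contains_in_order(history, pattern):
    """True if the items of `pattern` occur in `history` in the same order (gaps allowed)."""
    remaining = iter(history)
    # `item in remaining` consumes the iterator up to the first match,
    # so each later item must be found after the previous one.
    return all(item in remaining for item in pattern)

def sequence_support(histories, pattern):
    """Fraction of histories that contain `pattern` as an ordered subsequence."""
    return sum(contains_in_order(h, pattern) for h in histories) / len(histories)

# Hypothetical purchase histories, oldest purchase first.
histories = [
    ["laptop", "mouse", "printer"],
    ["laptop", "printer"],
    ["printer", "laptop"],
    ["mouse", "laptop", "bag", "printer"],
]

print(sequence_support(histories, ["laptop", "printer"]))
print(sequence_support(histories, ["printer", "laptop"]))
```

Unlike an association rule, the order matters: the same two items give different support values depending on which one comes first.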

Patterns with Time Series
- Discover links between two time-dependent sets of data, based on the degree of similarity between the patterns that the two time series show.
- Similarities can be detected at corresponding positions within the time series
- Examples
  o Stock market movement (compare market performance of Oct 2001 with Oct 2007)
  o New home owners’ buying patterns within two months of purchase
  o Product selling patterns in different seasons.
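
The slides do not say how the degree of similarity is measured. One common choice, used here purely as an assumption, is the Pearson correlation between two series of equal length. The sales figures below are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical weekly sales of one product in two different periods.
period_a = [10, 12, 15, 19, 24]
period_b = [20, 24, 30, 38, 48]  # same shape as period_a at twice the volume
period_c = [24, 19, 15, 12, 10]  # opposite trend

print(pearson(period_a, period_b))  # close to +1: very similar pattern
print(pearson(period_a, period_c))  # strongly negative: opposite pattern
```

Correlation compares the shape of the two series and ignores their scale, which suits comparisons such as the same month in two different years.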

Categorisation & Segmentation
- Categorisation is the process of partitioning a given collection of events or items into a set of segments/clusters whose members share some common properties.
- Segments/clusters may be predefined, or may be determined during the process of categorisation

Categorisation & Segmentation
- Examples
  o Classification of customer profile: by frequency of visits, types of financing used, amount of purchase, etc.
  o Demographic information: age, income group, place of residence, buying habits, etc.
  o Planning store promotions and advertisements, planning seasonal marketing strategies, planning additional stores.
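
The difference between predefined segments and segments determined during categorisation can be shown on a single attribute, the amount of purchase. In the Python sketch below, the threshold, the purchase amounts and the use of a one-dimensional k-means procedure are all assumptions made for illustration.

```python
def predefined_segment(amount, threshold=50):
    """Predefined segments: the boundary is fixed before looking at the data."""
    return "high spender" if amount >= threshold else "low spender"

def kmeans_1d(values, centres, iterations=10):
    """Segments determined from the data: a plain one-dimensional k-means."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster; keep it if the cluster is empty.
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters

# Hypothetical purchase amounts per customer.
amounts = [12, 15, 14, 90, 95, 100]

print([predefined_segment(a) for a in amounts])
centres, clusters = kmeans_1d(amounts, centres=[0, 50])
print(centres)
print(clusters)
```

With predefined segments the analyst chooses the boundary; with clustering the boundary emerges from the data, and the result depends on the number of starting centres chosen.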

Other Data Mining Approaches

Typical Application Areas of DM
- Finance and Banking
- Retail and Sales
- Credit Card Operations
- Medical Diagnosis and Healthcare
- Insurance
- Others

Integrated DM Environment
- To maximise their potential and performance, Data Mining tools must be fully integrated with the Data Warehouse environment as well as with flexible, interactive business analysis tools.
- OLAP (On-Line Analytical Processing) enables a more sophisticated end-user business model to be applied when navigating the Data Warehouse.
- A Data Mining server can be integrated with the Data Warehouse and the OLAP server. This integration enables operational decisions to be directly implemented and tracked.
- As the warehouse expands with new decisions and results, the organisation can continually mine best practices and apply them to future decisions.

Integrated DM Environment

Thank you!!! Questions are WELCOME