Data Mining By: Tung, Sze Ming (Leo), CS 157B



Definition
- A class of database applications that analyze data in a database using tools that look for trends or anomalies.
- IBM researchers were among the early contributors to the field (for example, association-rule mining work in the early 1990s), but data mining was not invented by any single company.

Purpose
- To look for hidden patterns or previously unknown relationships in a body of data that can be used to predict future behavior.
- Example: Data mining software can help retail companies find customers with common interests.

Background Information
- Many of the techniques used by today's data mining tools have been around for many years, having originated in the artificial intelligence research of the 1980s and early 1990s.
- Data mining tools are only now being applied to large-scale database systems.

The Need for Data Mining
- The amount of raw data stored in corporate data warehouses is growing rapidly.
- For any specific problem, there is often too much potentially relevant data, and too much complexity, to analyze by hand.
- Data mining promises to bridge this analytical gap by giving knowledge workers the tools to navigate the complex analytical space.

The Need for Data Mining, cont'd
- The need for information has resulted in the proliferation of data warehouses that integrate information from multiple sources to support decision making.
- These warehouses often include data from external sources, such as customer demographics and household information.

Approaches to Data Mining
- Association
- Sequence-based analysis
- Clustering
- Classification

Association
- Association is the classic market-basket analysis: it treats the purchase of a number of items (for example, the contents of a shopping basket) as a single transaction and looks for items that tend to be bought together.
- This information can be used to adjust inventories, modify floor or shelf layouts, or introduce targeted promotional activities to increase overall sales or move specific products.
- Example: 80 percent of all transactions in which beer was purchased also included potato chips.
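The beer-and-chips figure above is what association-rule tools usually call the confidence of the rule "beer implies chips". The following is a minimal sketch of that calculation, not a full association-mining algorithm. The basket data and the function names (`count_containing`, `support`, `confidence`) are invented for illustration and are not from the original slides.

```python
from typing import FrozenSet, Iterable, List


def count_containing(transactions: List[FrozenSet[str]], items: Iterable[str]) -> int:
    """Number of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for basket in transactions if wanted <= basket)


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of all transactions that contain every item in `items`."""
    if not transactions:
        return 0.0
    return count_containing(transactions, items) / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    antecedent = frozenset(antecedent)
    base = count_containing(transactions, antecedent)
    if base == 0:
        return 0.0
    both = count_containing(transactions, antecedent | frozenset(consequent))
    return both / base


# Hypothetical shopping baskets, one frozenset per transaction.
baskets = [
    frozenset({"beer", "chips"}),
    frozenset({"beer", "chips", "salsa"}),
    frozenset({"beer", "chips", "bread"}),
    frozenset({"beer", "chips", "milk"}),
    frozenset({"beer", "diapers"}),
    frozenset({"milk", "bread"}),
    frozenset({"chips", "salsa"}),
    frozenset({"bread", "eggs"}),
    frozenset({"milk", "eggs"}),
    frozenset({"salsa", "bread"}),
]

beer_support = support(baskets, {"beer"})
beer_to_chips = confidence(baskets, {"beer"}, {"chips"})
print(f"support(beer) = {beer_support:.2f}, confidence(beer -> chips) = {beer_to_chips:.2f}")
```

In this made-up data, beer appears in five of ten baskets and four of those five also contain chips, which reproduces the 80 percent pattern. Real tools also search over many candidate item sets rather than checking one hand-picked rule.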

Sequence-based analysis
- Traditional market-basket analysis deals with a collection of items as part of a point-in-time transaction.
- Sequence-based analysis adds the order of purchases over time, with the goal of identifying a typical set of purchases that might predict the subsequent purchase of a specific item.
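One simple way to picture sequence-based analysis is to ask: of the customers who bought one item, how many bought another item on a later visit? The sketch below assumes each customer's history is an ordered list of baskets. The data, the customer IDs, and the function name `followed_by_rate` are hypothetical, and real sequence-mining tools search for such patterns automatically instead of testing a single pair.

```python
from typing import Dict, List, Set


def followed_by_rate(histories: Dict[str, List[Set[str]]], earlier: str, later: str) -> float:
    """Among customers who bought `earlier`, the fraction who bought `later` on a subsequent visit."""
    bought_earlier = 0
    then_bought_later = 0
    for visits in histories.values():
        # Position of the first visit in which the customer bought `earlier`.
        first = next((i for i, basket in enumerate(visits) if earlier in basket), None)
        if first is None:
            continue
        bought_earlier += 1
        # Only visits strictly after that one count as "subsequent".
        if any(later in basket for basket in visits[first + 1:]):
            then_bought_later += 1
    return then_bought_later / bought_earlier if bought_earlier else 0.0


# Hypothetical purchase histories: customer id -> baskets in time order.
histories = {
    "c1": [{"printer"}, {"ink"}],
    "c2": [{"printer", "paper"}, {"paper"}, {"ink"}],
    "c3": [{"ink"}, {"printer"}],   # ink came before the printer, so it does not count
    "c4": [{"paper"}],              # never bought a printer
}

rate = followed_by_rate(histories, "printer", "ink")
print(f"printer buyers who later bought ink: {rate:.0%}")
```

The difference from plain market-basket analysis is the `visits[first + 1:]` slice: items bought in the same basket, or earlier, are deliberately ignored.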

Clustering
- Clustering approaches address segmentation problems.
- These approaches assign records with a large number of attributes to a relatively small set of groups or "segments."
- Example: Buying habits of multiple population segments might be compared to determine which segments to target for a new sales campaign.
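The slides do not name a specific clustering algorithm. As one common illustration, here is a minimal k-means sketch that groups customers into segments. The customer records (visits per month, average spend), the starting centroids, and the function names are assumptions made for this example only.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest_index(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def k_means(points: List[Point], centroids: List[Point], iterations: int = 10):
    """Plain k-means: alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in centroids]
        for p in points:
            groups[nearest_index(p, centroids)].append(p)
        # New centroid is the per-attribute mean of its group; an empty group keeps its old centroid.
        centroids = [
            tuple(sum(column) / len(group) for column in zip(*group)) if group else centroids[i]
            for i, group in enumerate(groups)
        ]
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Hypothetical customers: (store visits per month, average spend per visit).
customers = [(1, 10), (2, 12), (1, 14), (10, 100), (11, 95), (12, 105)]

# Starting centroids are picked by hand here so the run is repeatable.
centroids, labels = k_means(customers, [customers[0], customers[3]])
for customer, segment in zip(customers, labels):
    print(customer, "-> segment", segment)
```

With this data the first three customers fall into one segment (infrequent, low spend) and the last three into another (frequent, high spend). In practice the attributes would be scaled first so that one large-valued attribute does not dominate the distance.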

Classification
- The most commonly applied data mining technique.
- The algorithm uses preclassified examples to determine the set of parameters required for proper discrimination.
- Example: A classifier that can identify risky loans could be used to aid in the decision of whether to grant a loan to an individual.
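To make "using preclassified examples to determine parameters" concrete, the sketch below learns a single parameter, a cutoff on debt-to-income percentage, from loans that are already labeled risky or not. This one-threshold rule is a deliberately tiny stand-in for a real classifier. The loan data, the choice of attribute, and the names `fit_threshold` and `predict_risky` are all hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, bool]  # (debt-to-income percent, known to be risky?)


def fit_threshold(examples: List[Example]) -> float:
    """Pick the cutoff that misclassifies the fewest preclassified examples.

    The learned rule is: predict "risky" when the value is at or above the cutoff.
    """
    values = sorted({value for value, _ in examples})
    if len(values) < 2:
        raise ValueError("need at least two distinct attribute values")
    # Candidate cutoffs are the midpoints between neighbouring observed values.
    candidates = [(low + high) / 2 for low, high in zip(values, values[1:])]

    def training_errors(cutoff: float) -> int:
        return sum((value >= cutoff) != risky for value, risky in examples)

    return min(candidates, key=training_errors)


def predict_risky(value: float, cutoff: float) -> bool:
    """Apply the learned rule to a new, unlabeled loan application."""
    return value >= cutoff


# Hypothetical preclassified loans.
labeled_loans = [(10, False), (20, False), (25, False), (45, True), (50, True), (60, True)]

cutoff = fit_threshold(labeled_loans)
print("learned cutoff:", cutoff)
print("applicant at 50 percent flagged as risky:", predict_risky(50, cutoff))
```

Real classifiers (decision trees, neural networks, and so on) learn many parameters over many attributes, but the pattern is the same: fit on labeled history, then apply to new cases.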

Issues of Data Mining
- Present-day tools are strong but require significant expertise to implement effectively.
- Susceptibility to "dirty" or irrelevant data.
- Inability to "explain" results in human terms.

Issues
- Susceptibility to "dirty" or irrelevant data.
- Data mining tools of today simply take everything they are given as factual and draw the resulting conclusions.
- Users must take the necessary precautions to ensure that the data being analyzed is "clean."
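One precaution of the kind described above is a validation pass that sets suspicious records aside before any mining is done. The sketch below is only an illustration: the field names (`customer_id`, `age`, `amount`), the plausibility limits, and the function name `split_clean` are assumptions, and real cleaning rules depend on the data set.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def split_clean(records: List[Record]) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Separate records that pass basic sanity checks from those that do not.

    Returns (clean_records, rejected), where each rejected entry pairs the
    record with the list of problems found in it.
    """
    clean: List[Record] = []
    rejected: List[Tuple[Record, List[str]]] = []
    for record in records:
        problems: List[str] = []
        if not record.get("customer_id"):
            problems.append("missing customer_id")
        age = record.get("age")
        if not isinstance(age, int) or not 0 <= age <= 120:
            problems.append("implausible age")
        amount = record.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append("invalid amount")
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected


# Hypothetical raw records, some of them "dirty".
raw_records = [
    {"customer_id": "C001", "age": 34, "amount": 120.50},
    {"customer_id": "", "age": 41, "amount": 80.00},
    {"customer_id": "C003", "age": 430, "amount": 15.25},
    {"customer_id": "C004", "age": 29, "amount": -5.00},
]

clean_records, rejected_records = split_clean(raw_records)
print(len(clean_records), "clean,", len(rejected_records), "rejected")
for record, problems in rejected_records:
    print(record.get("customer_id") or "(no id)", "->", ", ".join(problems))
```

Keeping the rejected records together with their reasons, rather than silently dropping them, lets someone review whether the rules themselves are too strict.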

Issues, cont'd
- Inability to "explain" results in human terms.
- Many of the tools employed in data mining analysis use complex mathematical algorithms that are not easily mapped into human terms.
- What good does the information do if you don't understand it?

The End