Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.

Presentation transcript:

Data Mining BY JEMINI ISLAM

Data Mining Outline: What is data mining? Why use data mining? How does data mining work? The process of data mining. Tools of data mining.

What is data mining? Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

Cont.. Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns. Data mining is a fairly recent topic in computing, but it applies many older computational techniques from statistics, machine learning and pattern recognition.

Example of data mining: A simple example of data mining is its use in a retail sales department. If a store tracks a customer's purchases and notices that the customer buys a lot of silk shirts, the data mining system will correlate that customer with silk shirts. The sales department can then use that information to begin direct-mail marketing of silk shirts to that customer, or alternatively try to get the customer to buy a wider range of products.

Example Cont.. Another widely used (though hypothetical) example is that of a very large North American chain of supermarkets. Through intensive analysis of the transactions and the goods bought over a period of time, analysts found that beer and diapers were often bought together.

Continue.. The grocery chain could use this newly discovered information in various ways to increase revenue. For example, it could move the beer display closer to the diaper display, or place the high-profit diapers next to the high-profit beers.

Why use data mining? Data is one of the most valuable assets of any corporation, but only if we know how to reveal the valuable knowledge hidden in the raw data. Data mining allows us to extract diamonds of knowledge from historical data and predict useful outcomes from it.

Cont.. Data mining can: * optimize business decisions, * increase the value of each customer and communication, and * improve customer satisfaction with your services.

How does data mining work? Data mining provides the link between the separate transaction and analytical systems found in large-scale information technology. It uses various software to analyze relationships and patterns. Generally, the following four types of relationships are sought:

Classification: The task of finding a function that maps records into one of several discrete classes. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order. This information could be used to increase traffic by having daily specials.

Clustering: The task of identifying groups of records that are similar to one another but different from the rest of the data. For example, data can be mined to identify market segments or consumer affinities.
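To make the clustering idea concrete, here is a minimal sketch using k-means (Lloyd's algorithm), which is one common clustering technique and not one the slides name. The customer points, the starting centroids and the function name `kmeans` are all invented for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of the points assigned to it."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Hypothetical customers: (store visits per month, average basket size).
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8.5, 9.5)]

# Starting centroids are picked by hand here; real tools usually pick them randomly.
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

With this toy data the two groups found correspond to low-activity and high-activity customers, which is the kind of market segment the slide describes.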

Association: Data can be mined to identify associations between items. The beer-diaper example is an example of associative mining.
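The beer-diaper association can be illustrated with two standard measures, support and confidence. The sketch below is only an illustration: the five baskets are made up, and the helper names `support` and `confidence` are assumptions, not part of any tool mentioned in the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, invented for illustration only.
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "diapers", "milk"},
    {"bread", "chips"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    pair_counts.update(combinations(sorted(basket), 2))

most_common_pair, count = pair_counts.most_common(1)[0]
print(most_common_pair, count)
print(support({"beer", "diapers"}, transactions))
print(confidence({"beer"}, {"diapers"}, transactions))
```

A rule such as "beer implies diapers" is usually reported only when both its support and its confidence exceed thresholds chosen by the analyst.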

Sequential Patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes.

The process of data mining: The process of data mining consists of three stages: (1) the initial exploration, (2) model building or pattern identification with validation or verification, and (3) deployment (i.e., the application of the model to new data in order to generate predictions).

Stage 1: Exploration. This stage usually starts with data preparation, which may involve cleaning data, transforming data, selecting subsets of records and, in the case of data sets with large numbers of variables ("fields"), performing some preliminary feature selection operations to bring the number of variables to a manageable range.

Stage 2: Model building and validation. This stage involves considering various models and choosing the best one based on their predictive performance (i.e., explaining the variability in question and producing stable results across samples).

Stage 3: Deployment. This final stage involves taking the model selected as best in the previous stage and applying it to new data in order to generate predictions or estimates of the expected outcome.
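The three stages can be walked through on a toy data set. This is a minimal sketch under stated assumptions: the records are invented, the two candidate models (a constant mean and a least-squares line) are chosen only for illustration, and a single hold-out split stands in for whatever validation scheme a real project would use.

```python
# Stage 1: exploration and preparation. Drop records with missing values.
# The (x, y) records below are hypothetical.
raw = [(1, 2.1), (2, 3.9), (3, None), (4, 8.2), (5, 9.8),
       (6, 12.1), (None, 5.0), (7, 14.0), (8, 15.9)]
clean = [(x, y) for x, y in raw if x is not None and y is not None]
train, valid = clean[::2], clean[1::2]  # simple hold-out split

# Stage 2: model building and validation. Fit candidates on the training
# records and compare them on the held-out validation records.
def fit_mean(data):
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def fit_line(data):
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    slope = (sum((x - mx) * (y - my) for x, y in data)
             / sum((x - mx) ** 2 for x, _ in data))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def mse(model, data):
    """Mean squared prediction error of model on data."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

candidates = {"mean": fit_mean(train), "line": fit_line(train)}
best_name = min(candidates, key=lambda name: mse(candidates[name], valid))
best_model = candidates[best_name]

# Stage 3: deployment. Apply the selected model to new, unseen inputs.
new_inputs = [9, 10]
predictions = [best_model(x) for x in new_inputs]
print(best_name, predictions)
```

Selecting the model on held-out records rather than on the training records is what gives some assurance of "stable results across samples".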

Tools of Data Mining Artificial Neural Networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.

Cont.. Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.
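As a rough illustration of selection, combination and mutation, the sketch below runs a small genetic algorithm on the "OneMax" toy problem (maximize the number of 1 bits in a bit string). The problem, the parameter values and the function name `evolve` are assumptions made for this example, not something taken from the slides.

```python
import random

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Evolve bit strings toward all ones; returns (best individual, initial best fitness)."""
    rng = random.Random(seed)
    fitness = sum  # OneMax: fitness is the number of 1 bits
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    initial_best = max(map(fitness, population))
    for _ in range(generations):
        # Natural selection: only the fitter half may reproduce.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism: carry the current best over unchanged.
        children = [parents[0][:]]
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)        # genetic combination (crossover)
            child = mother[:cut] + father[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]            # mutation
            children.append(child)
        population = children
    return max(population, key=fitness), initial_best

best, initial_best = evolve()
print(sum(best), "ones; best in the initial population had", initial_best)
```

Because the best individual is always copied into the next generation, the best fitness can never decrease from one generation to the next.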

Cont.. Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.

Cont.. (Tools of Data Mining) Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.
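A minimal sketch of the nearest neighbor method follows. It assumes Euclidean distance as the similarity measure and a simple majority vote as the way of combining the k classes; the historical records and the name `knn_classify` are hypothetical.

```python
import math
from collections import Counter

def knn_classify(record, history, k=3):
    """Classify record by majority vote among its k nearest historical records.

    history is a list of (feature_vector, class_label) pairs.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    nearest = sorted(history, key=lambda pair: math.dist(record, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical historical dataset: (visits per month, average spend) -> class.
history = [
    ((1, 10.0), "occasional"),
    ((2, 12.0), "occasional"),
    ((1, 15.0), "occasional"),
    ((8, 40.0), "regular"),
    ((9, 35.0), "regular"),
    ((10, 45.0), "regular"),
]

print(knn_classify((7, 38.0), history, k=3))
print(knn_classify((2, 11.0), history, k=1))
```

In practice the features are usually rescaled first, since a feature with a large numeric range (such as spend) would otherwise dominate the distance.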

Cont.. Rule induction: The extraction of useful if-then rules from data based on statistical significance.

Tools of Data Mining (Cont..) Data visualization: The visual interpretation of complex relationships in multidimensional data. Graphics tools are used to illustrate data relationships.

Conclusion: Data mining is becoming increasingly popular as a business information management tool, where it is expected to reveal knowledge structures that can guide decisions under conditions of limited certainty. Today more and more companies acknowledge the value of this new opportunity and use data mining tools and solutions that help optimize their operations and increase their customers' bottom line.