Data mining. Data mining, at its core, is the transformation of large amounts of data into meaningful patterns and rules.



Definition of Data Mining
"The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases." - Fayyad et al. (1996)
Keywords in this definition: process, nontrivial, valid, novel, potentially useful, understandable.
Other names: knowledge extraction, pattern analysis, knowledge discovery, information harvesting, pattern searching, data dredging, …

Two types of data mining
- Directed data mining (supervised)
- Undirected data mining (unsupervised)

Directed data mining
In directed data mining, you are trying to predict a particular data point: for example, the sale price of a house, given information about other houses for sale in the neighborhood.
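As a rough illustration of the house-price example, the sketch below fits a straight line to a handful of listings and uses it to estimate the price of a new house. The listings, the choice of square footage as the only input, and the helper names `fit_line` and `predict_price` are all assumptions made for this example; real directed data mining would use many more attributes and a proper toolkit.

```python
# Hypothetical listings: (square footage, sale price in dollars).
# These numbers are invented purely for illustration.
listings = [
    (1000, 200000),
    (1500, 250000),
    (2000, 300000),
    (2500, 350000),
]


def fit_line(points):
    """Ordinary least-squares fit of price = slope * size + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance of size and price, and variance of size (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(size, slope, intercept):
    """Estimate the sale price of a house of the given size."""
    return slope * size + intercept


slope, intercept = fit_line(listings)
estimate = predict_price(1800, slope, intercept)
print(f"Estimated price for 1800 sq ft: {estimate:.0f}")
```

The known sale prices play the role of the "directed" target: the model is judged by how well it reproduces them, which is what makes this a supervised task.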

Undirected data mining
In undirected data mining, you are trying to create groups of data or find patterns in existing data. For example, every U.S. census is, in effect, data mining, as the government looks to gather data about everyone in the country and turn it into useful information.
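One common way to "create groups of data" is clustering. The sketch below runs a bare-bones k-means on a few made-up, census-style records of (age, household size). The records, the function name `kmeans`, and the choice to seed the centroids with the first k points are assumptions for this example only; the slides do not prescribe a particular clustering algorithm.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; seeds centroids with the first k points."""
    centroids = [points[i] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, assignments


# Hypothetical records: (age, household size). Invented for illustration.
records = [(25, 1), (62, 2), (27, 1), (60, 2), (24, 2), (65, 1)]

centroids, assignments = kmeans(records, k=2)
for record, group in zip(records, assignments):
    print(record, "-> group", group)
```

No target value is supplied here: the groups emerge from the data alone, which is the distinction between undirected and directed mining.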

Modern data mining started in the mid-1990s, when computing power and the cost of computing and storage finally reached a level where companies could do it in-house, without having to look to outside computer powerhouses.

The term data mining is all-encompassing, referring to dozens of techniques and procedures used to examine and transform data.

Data Mining at the Intersection of Many Disciplines

Data Mining Characteristics/Objectives
- Source of data for DM is often a consolidated data warehouse (not always!)
- DM environment is usually a client-server or a Web-based information systems architecture
- Data is the most critical ingredient for DM, which may include soft/unstructured data
- The miner is often an end user
- Striking it rich requires creative thinking
- Data mining tools' capabilities and ease of use are essential (Web, parallel processing, etc.)

Data in Data Mining
- Data: a collection of facts usually obtained as the result of experiences, observations, or experiments
- Data may consist of numbers, words, images, …
- Data: lowest level of abstraction (from which information and knowledge are derived)
- DM with different data types?
- Other data types?

A Taxonomy for Data Mining Tasks

The ultimate goal of data mining is to create a model: one that can improve the way you read and interpret your existing data and predict your future data. Because data mining offers so many techniques, the major step in creating a good model is determining which type of technique to use. That comes with practice, experience, and some guidance. From there, the model needs to be refined to make it even more useful.

Weka as a data mining tool
Data mining isn't solely the domain of big companies and expensive software. In fact, one piece of software does almost all the same things as those expensive packages: WEKA. WEKA is the product of the University of Waikato (New Zealand) and was first implemented in its modern form in […]. It uses the GNU General Public License (GPL). The software is written in the Java™ language and contains a GUI for interacting with data files and producing visual results (think tables and curves). It also has a general API, so you can embed WEKA, like any other library, in your own applications to do such things as automated server-side data-mining tasks.