Data Mining and Data Warehousing. Copyright Jiawei Han; modified by Charles Ling for CS411a/538a.

Presentation transcript:

1 Data Mining and Data Warehousing
- Introduction
- Data warehousing and OLAP for data mining
- Data preprocessing
- Primitives for data mining
- Concept description
- Mining association rules in large databases
- Classification and prediction
- Cluster analysis
- Mining complex types of data
- Applications and trends in data mining

2 Data Mining and Warehousing, Session 4: Primitives for Data Mining

3 Session 4: Primitives for Data Mining
- Introduction
- What defines a data mining task?
- A data mining query language (DMQL)
- A GUI (graphical user interface) based on a data mining query language
- Summary

4 Motivation
- Data mining is an interactive process
  - the user directs the mining to be performed
- Users could use a set of primitives to communicate with the data mining system.
- Incorporating these primitives in a data mining query language:
  - makes the user's interaction with the system more flexible
  - provides a foundation for the design of graphical user interfaces
  - supports standardization of the data mining industry and practice

5 What Defines a Data Mining Task?
- Task-relevant data
- Type of knowledge to be mined
- Background knowledge
- Pattern interestingness measurements
- Visualization of discovered patterns

6 Task-Relevant Data
- Database or data warehouse name
- Database tables or data warehouse cubes
- Condition for data selection
- Relevant attributes or dimensions
- Data grouping criteria
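
The five items on this slide can be read as the fields of a small specification object. The following is a minimal Python sketch under stated assumptions: the `SALES` rows, the `TaskRelevantData` class, the `select` helper, and the names `sales_db` and `sales` are all hypothetical illustrations, not part of the course material or of any data mining system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

# Hypothetical in-memory table; a real task would read from a database or cube.
SALES = [
    {"city": "Vancouver", "country": "Canada", "item": "tv", "price": 900},
    {"city": "Toronto", "country": "Canada", "item": "radio", "price": 80},
    {"city": "Seattle", "country": "USA", "item": "tv", "price": 950},
    {"city": "Vancouver", "country": "Canada", "item": "phone", "price": 300},
]


@dataclass
class TaskRelevantData:
    source: str                        # database or data warehouse name
    table: str                         # database table or data warehouse cube
    condition: Callable[[dict], bool]  # condition for data selection
    attributes: list                   # relevant attributes or dimensions
    group_by: str                      # data grouping criterion


def select(rows, spec):
    """Filter rows by the condition, keep the relevant attributes, and group them."""
    groups = defaultdict(list)
    for row in rows:
        if spec.condition(row):
            projected = {name: row[name] for name in spec.attributes}
            groups[row[spec.group_by]].append(projected)
    return dict(groups)


spec = TaskRelevantData(
    source="sales_db",  # assumed name, for illustration only
    table="sales",      # assumed name, for illustration only
    condition=lambda row: row["country"] == "Canada",
    attributes=["item", "price"],
    group_by="city",
)
result = select(SALES, spec)
```

Here `result` holds the Canadian sales grouped by city, with only the `item` and `price` attributes retained.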

7 Types of knowledge to be mined
- Characterization and discrimination
- Association
- Classification/prediction
- Clustering
- Outlier analysis
- and so on

8 Background knowledge
- Concept hierarchies
  - schema hierarchy: e.g., street < city < province_or_state < country
  - set-grouping hierarchy: e.g., {20-39} = young, {40-59} = middle_aged
  - operation-derived hierarchy: address, login-name < department < university < country
  - rule-based hierarchy: low_profit(X) <= price(X, P1) and cost(X, P2) and (P1 - P2) < $50
- User's existing knowledge of the data
  - e.g., structural zero
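
Three of the hierarchy types above can be encoded directly as data and small functions. This is a minimal Python sketch, assuming integer ages and dictionary records; the function names (`generalize_age`, `roll_up`, `is_low_profit`) are hypothetical and chosen only for illustration.

```python
# Schema hierarchy from the slide, ordered from the lowest level to the highest.
SCHEMA_HIERARCHY = ["street", "city", "province_or_state", "country"]

# Set-grouping hierarchy from the slide: {20-39} = young, {40-59} = middle_aged.
AGE_GROUPS = [(range(20, 40), "young"), (range(40, 60), "middle_aged")]


def generalize_age(age):
    """Map an integer age to its higher-level concept, or None if no group covers it."""
    for ages, label in AGE_GROUPS:
        if age in ages:
            return label
    return None


def roll_up(record, level):
    """Keep only the attributes at or above the given level of the schema hierarchy."""
    start = SCHEMA_HIERARCHY.index(level)
    return {name: record[name] for name in SCHEMA_HIERARCHY[start:]}


def is_low_profit(price, cost, threshold=50):
    """Rule-based hierarchy: low_profit(X) <= price(X, P1) and cost(X, P2) and (P1 - P2) < $50."""
    return (price - cost) < threshold


address = {
    "street": "1151 Richmond St",  # illustrative values only
    "city": "London",
    "province_or_state": "Ontario",
    "country": "Canada",
}
generalized = roll_up(address, "province_or_state")
```

Rolling the address up to `province_or_state` drops `street` and `city`, which is the kind of generalization a schema hierarchy makes possible.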

9 Pattern interestingness measurements
- Simplicity: e.g., rule length
- Certainty: e.g., confidence, P(A|B) = n(A and B) / n(B)
- Utility: potential usefulness, e.g., support
- Novelty: not previously known, surprising
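
The simplicity, certainty, and utility measures can be computed by simple counting. The sketch below follows the slide's notation, P(A|B) = n(A and B) / n(B), which is the certainty of a rule with B as the condition and A as the conclusion. The `TRANSACTIONS` data and the function names are hypothetical, and novelty is left out because it depends on what the user already knows.

```python
# Hypothetical market-basket data, for illustration only.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def count(transactions, items):
    """n(items): number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))


def support(transactions, items):
    """Utility: fraction of all transactions that contain `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, a, b):
    """Certainty: P(A|B) = n(A and B) / n(B); returns 0.0 when B never occurs."""
    n_b = count(transactions, b)
    if n_b == 0:
        return 0.0
    return count(transactions, set(a) | set(b)) / n_b


def rule_length(a, b):
    """Simplicity: number of distinct items appearing in the rule."""
    return len(set(a) | set(b))


milk_given_bread = confidence(TRANSACTIONS, {"milk"}, {"bread"})
bread_and_milk = support(TRANSACTIONS, {"bread", "milk"})
```

With this data, bread appears in three transactions and bread with milk in two, so `milk_given_bread` is 2/3 and `bread_and_milk` is 2/4.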

10 Visualization of Discovered Patterns
- Different backgrounds or purposes may require different forms of representation
  - e.g., rules, tables, crosstabs, pie/bar charts, etc.
- Concept hierarchies are also important
  - discovered knowledge might be more understandable when represented at a high concept level
- Interactive drill-up/drill-down, pivoting, slicing, and dicing provide different perspectives on the data.
- Different kinds of knowledge require different representations.

11 Summary: Five primitives for specifying a data mining task
- task-relevant data
  - database/data warehouse, relation/cube, selection criteria, relevant dimensions, data grouping
- kind of knowledge to be mined
  - characterization, discrimination, association, ...
- background knowledge
  - concept hierarchies, ...
- interestingness measures
  - simplicity, certainty, utility, novelty
- knowledge presentation and visualization techniques to be used for displaying the discovered patterns
  - rules, tables, reports, charts, graphs, decision trees, cubes, ...
  - drill-down, roll-up, ...
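
One way to picture the five primitives together is as a single task description that a mining system could validate before running. This is only a sketch under stated assumptions: `MiningTask`, its field names, the `KNOWLEDGE_TYPES` set, and the example values are hypothetical and do not reproduce DMQL or any real system's interface.

```python
from dataclasses import dataclass, field

# Knowledge types named in the slides; the identifiers themselves are an assumption.
KNOWLEDGE_TYPES = {
    "characterization", "discrimination", "association",
    "classification", "clustering", "outlier_analysis",
}


@dataclass
class MiningTask:
    task_relevant_data: dict                                   # primitive 1
    knowledge_type: str                                        # primitive 2
    background_knowledge: dict = field(default_factory=dict)   # primitive 3
    interestingness: dict = field(default_factory=dict)        # primitive 4
    presentation: list = field(default_factory=lambda: ["table"])  # primitive 5

    def __post_init__(self):
        # Reject knowledge types that the slides do not list.
        if self.knowledge_type not in KNOWLEDGE_TYPES:
            raise ValueError(f"unknown knowledge type: {self.knowledge_type}")


task = MiningTask(
    task_relevant_data={
        "source": "sales_db",            # illustrative names only
        "table": "sales",
        "dimensions": ["item", "city"],
    },
    knowledge_type="association",
    background_knowledge={"location": ["street", "city", "province_or_state", "country"]},
    interestingness={"min_support": 0.05, "min_confidence": 0.7},  # example thresholds
    presentation=["rules", "table"],
)
```

Keeping the five primitives in one object mirrors the slide's point that a task is fully specified only when all five are given, with defaults standing in where the user has no preference.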