CS590D: Data Mining Prof. Chris Clifton April 21, 2005 Multi-Relational Data Mining.

Presentation transcript:


What is MRDM?
Problem: Data in multiple tables
  – Want rules/patterns/etc. across tables
Solution: Represent as single table
  – Join the data
  – Construct a single view
  – Use standard data mining techniques
Example: “Customer” and “Married-to”
  – Easy single-table representation
Bad Example: Ancestor of
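The single-table approach can be sketched as follows. This is a minimal illustration, not the lecture's own code: the table contents, column names, and the `flatten` helper are all assumptions made up for the example. It joins a toy “Customer” table with a “Married-to” relation so that each customer becomes one row a standard single-table miner could consume.

```python
# Hypothetical toy tables; names and columns are assumptions for illustration.
customers = [
    {"cid": 1, "name": "Smith", "age": 45},
    {"cid": 2, "name": "Jones", "age": 38},
    {"cid": 3, "name": "Lee", "age": 29},
]
married_to = [(1, 2)]  # (cid, spouse_cid) pairs, treated as symmetric


def flatten(customers, married_to):
    """Join each customer with at most one spouse: one output row per customer."""
    by_id = {c["cid"]: c for c in customers}
    spouse_of = {}
    for a, b in married_to:
        spouse_of[a] = b
        spouse_of[b] = a
    rows = []
    for c in customers:
        # by_id.get(None) is None, so unmarried customers get empty spouse columns
        spouse = by_id.get(spouse_of.get(c["cid"]))
        rows.append({
            **c,
            "spouse_name": spouse["name"] if spouse else None,
            "spouse_age": spouse["age"] if spouse else None,
        })
    return rows


single_table = flatten(customers, married_to)
for row in single_table:
    print(row)
```

This works because each customer has at most one spouse, so a single join gives a fixed set of columns. A plausible reading of why “Ancestor of” is the bad example: an ancestor chain has no fixed length, so each extra generation would need another self-join and no fixed-width table captures all of them.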

Relational Data Network

Basis of Solutions: Inductive Logic Programming
ILP Rule:
  – customer(CID,Name,Age,yes) ← Age > 30 ∧ purchase(CID,PID,D,Value,PM) ∧ PM = credit card ∧ Value > 100
Learning methods:
  – Database represented as clauses (rules)
  – Unification: Given rule (function/clause), discover values for which it holds
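To make the rule concrete, here is a small sketch of what it means for the rule to hold for a customer. It reads the rule as: a customer gets the label yes if the customer is older than 30 and has at least one purchase paid by credit card with a value above 100. The data and the `rule_covers` helper are hypothetical; a real ILP system would find the matching bindings by unification rather than by a hand-written loop.

```python
# Hypothetical toy database; the tuples mirror the argument order in the rule.
customers = [  # (CID, Name, Age)
    (1, "Smith", 45),
    (2, "Jones", 25),
    (3, "Lee", 52),
]
purchases = [  # (CID, PID, D, Value, PM)
    (1, 10, "2005-01-03", 150, "credit card"),
    (2, 11, "2005-01-04", 300, "credit card"),
    (3, 12, "2005-01-05", 80, "credit card"),
    (3, 13, "2005-01-06", 200, "cash"),
]


def rule_covers(customer, purchases):
    """True if the rule body holds for this customer under some purchase binding."""
    cid, _name, age = customer
    if age <= 30:
        return False
    # The shared variable CID links the two relations; PM and Value must be
    # satisfied by the same purchase row, not by two different rows.
    return any(
        p_cid == cid and pm == "credit card" and value > 100
        for p_cid, _pid, _date, value, pm in purchases
    )


covered = [c[0] for c in customers if rule_covers(c, purchases)]
print(covered)
```

Customer 3 is the instructive case: one purchase is by credit card and another exceeds 100, but no single purchase satisfies both conditions, so the rule does not apply.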

Example
How do we learn the “daughter” relationship?
  – Is this classification? Association?
Covering Algorithm: “guess” at rule explaining only positive examples
  – Remove positive examples explained by rule
  – Iterate
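The covering loop can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the algorithm as presented in the lecture: the family facts, the examples, and the fixed pool of four candidate literals are invented, and the “guess” step is a plain greedy search that adds whichever literal leaves the fewest negative examples covered.

```python
# Hypothetical background knowledge and examples for daughter(X, Y).
female = {"mary", "ann", "eve"}
parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}
positives = {("mary", "ann"), ("eve", "tom")}
negatives = {("tom", "ann"), ("eve", "ann"), ("ian", "tom")}

# Assumed pool of candidate body literals, each a test on an (X, Y) pair.
literals = {
    "female(X)": lambda x, y: x in female,
    "female(Y)": lambda x, y: y in female,
    "parent(Y,X)": lambda x, y: (y, x) in parent,
    "parent(X,Y)": lambda x, y: (x, y) in parent,
}


def covers(body, example):
    """A rule body covers an example if every literal in it holds."""
    return all(literals[name](*example) for name in body)


def learn_one_rule(pos, neg):
    """Greedily add literals until no negative example is covered (if possible)."""
    body, neg_left = [], set(neg)
    while neg_left:
        scored = []
        for name in literals:
            if name in body:
                continue
            trial = body + [name]
            kept = sum(covers(trial, e) for e in pos)
            wrong = sum(covers(trial, e) for e in neg_left)
            if kept > 0:  # a rule must still explain some positive example
                scored.append((wrong, -kept, name))
        if not scored:
            break  # no usable literal left; the rule may still cover negatives
        _, _, name = min(scored)
        body.append(name)
        neg_left = {e for e in neg_left if covers(body, e)}
    return body


def covering(pos, neg):
    """Learn rules one at a time, removing the positives each rule explains."""
    rules, remaining = [], set(pos)
    while remaining:
        rule = learn_one_rule(remaining, neg)
        explained = {e for e in remaining if covers(rule, e)}
        if not explained:
            break
        rules.append(rule)
        remaining -= explained
    return rules


rules = covering(positives, negatives)
print(rules)
```

On this toy data a single rule explains every positive example, so the outer loop runs once. With data that needs several rules, each pass would learn one rule and then remove the positive examples it explains.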

How to make a good “guess”
Clause subsumption: Generalize
  – A more general clause subsumes a more specific one: daughter(mary,Y) subsumes daughter(mary,ann)
Start with general hypotheses and move to more specific
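For single atoms, the subsumption test amounts to finding a substitution for the variables of the general atom that turns it into the specific one. The sketch below assumes the Prolog convention that names starting with an uppercase letter are variables; the `subsumes` helper and the tuple encoding of atoms are hypothetical. Subsumption between full clauses is harder and is not attempted here.

```python
def is_var(term):
    """Assumed convention: variables start with an uppercase letter."""
    return term[:1].isupper()


def subsumes(general, specific):
    """Return a substitution mapping `general` onto `specific`, or None.

    Atoms are encoded as (predicate, args) tuples. An empty dict is a valid
    (successful) result, so callers should compare against None.
    """
    (g_pred, g_args), (s_pred, s_args) = general, specific
    if g_pred != s_pred or len(g_args) != len(s_args):
        return None
    theta = {}
    for g, s in zip(g_args, s_args):
        if is_var(g):
            # A variable must be bound to the same term everywhere it occurs.
            if theta.setdefault(g, s) != s:
                return None
        elif g != s:
            return None
    return theta


general = ("daughter", ("mary", "Y"))
specific = ("daughter", ("mary", "ann"))
print(subsumes(general, specific))
print(subsumes(specific, general))
```

The relation is one-directional: daughter(mary,Y) maps onto daughter(mary,ann) by binding Y to ann, but the ground atom has no variables to bind, so it cannot be mapped onto the more general one.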

Issues
Search space – efficiency
Noisy data
  – Positive examples labeled as negative
  – Missing data (e.g., a daughter with no parents in the database)
What else might we want to learn?

WARMR: Multi-relational association rules

Multi-Relational Decision Trees