AI Week 23 Machine Learning Data Mining – Week 2 Lee McCluskey, room 2/07


Artform Research Group
Focus on one area: Data Mining involves discovering patterns in large databases or data warehouses for different purposes. It is the science of extracting meaningful information from (large) databases.
Applications: market analysis and retail, decision support, financial analysis, discovering environmental trends.
Two types of learning: Data Mining can be supervised (“Learning from Example”) or unsupervised (“Learning from Observation”).
Data Mining is often part of a larger process aimed at getting more out of data warehouses, and involves data cleansing.
Data cleansing is the process of identifying and removing or correcting corrupted records in a database. This makes the data consistent with other similar data sets in the database. Eg the process may remove invalid post codes and spurious extreme values (eg ).

Association Rule Mining (ARM)
This is an “unsupervised learning activity”: briefly, looking for strong associations between features in data.
Definitions:
A transactional database is a set of “transactions”, eg the details of individual sales.
A transaction can be thought of as an “item-set” where each item is an attribute-value pair, eg {height = 6, temp = 20, weather = warm}.
As a special case we could have nominal item-sets, eg {bread, cheese, milk}.

Association Rule Mining (ARM): Important Definitions
An association rule is an expression X => Y where X, Y are item-sets, and X and Y have no items in common.
The support of an association rule is defined as the proportion of transactions in the database that contain X U Y.
The confidence of an association rule is defined as the probability that a transaction contains Y given that it contains X, that is
confidence(X => Y) = (no of transactions containing X U Y) / (no of transactions containing X)

Example
A trader deals in the following currencies in a series of 8 transactions:
1  Sterling, Yen, Dollar, Euro
2  Dollar, Euro, Rand, Sterling, Ruble
3  Pesos, Euro, Ruble, Rupee, Yen
4  Rupee, Sterling, Ruble, Euro, Dollar
5  Sterling, Dinars, Rand, Yen
6  Pesos, Kroner, Sterling, Dollar
7  Ruble, Rupee, Kroner, Sterling, Pesos
8  Dollar, Euro, Sterling
What is the SUPPORT and CONFIDENCE of the following rules?
{Ruble} → {Rupee}
{Sterling, Euro} → {Ruble}
{Sterling, Euro} → {Ruble, Pesos}
Find an association rule from the set of transactions that has
- at least 2 items in its antecedent,
- better support and better confidence than both rules above.
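The support and confidence definitions can be checked mechanically against the eight currency transactions. The following is a minimal Python sketch, not part of the original slides; the function names `support` and `confidence` and the `transactions` list are illustrative choices, and each transaction is modelled as a Python set.

```python
# The eight currency transactions from the example, one set per transaction.
transactions = [
    {"Sterling", "Yen", "Dollar", "Euro"},
    {"Dollar", "Euro", "Rand", "Sterling", "Ruble"},
    {"Pesos", "Euro", "Ruble", "Rupee", "Yen"},
    {"Rupee", "Sterling", "Ruble", "Euro", "Dollar"},
    {"Sterling", "Dinars", "Rand", "Yen"},
    {"Pesos", "Kroner", "Sterling", "Dollar"},
    {"Ruble", "Rupee", "Kroner", "Sterling", "Pesos"},
    {"Dollar", "Euro", "Sterling"},
]


def support(x, y, db):
    """Proportion of transactions in db that contain every item of X U Y."""
    both = x | y
    return sum(1 for t in db if both <= t) / len(db)


def confidence(x, y, db):
    """(no of transactions containing X U Y) / (no of transactions containing X)."""
    with_x = sum(1 for t in db if x <= t)
    if with_x == 0:
        return 0.0  # assumption: a rule whose antecedent never occurs gets confidence 0
    with_both = sum(1 for t in db if (x | y) <= t)
    return with_both / with_x


# The three rules posed in the exercise, as (antecedent, consequent) pairs.
rules = [
    ({"Ruble"}, {"Rupee"}),
    ({"Sterling", "Euro"}, {"Ruble"}),
    ({"Sterling", "Euro"}, {"Ruble", "Pesos"}),
]

for x, y in rules:
    print(sorted(x), "->", sorted(y),
          "support:", support(x, y, transactions),
          "confidence:", confidence(x, y, transactions))
```

Trying a candidate rule for the last part of the exercise only needs another `(antecedent, consequent)` pair appended to `rules`.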

Aims of ARM
Given a transactional database D, the association rule problem is to find all rules that have supports and confidences greater than certain user-specified thresholds, denoted by minimum support (MinSupp) and minimum confidence (MinConf), respectively.
The aim is the discovery of the most significant associations between the items in a transactional data set.
This process involves primarily the discovery of so-called frequent item-sets, i.e. item-sets whose support in the transactional data set is above MinSupp. Rules are then formed from the frequent item-sets and kept if their confidence is above MinConf.
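The two-stage process (frequent item-sets first, then rules) can be sketched in Python as below. This is a brute-force illustration under stated assumptions, not the algorithm used in the lecture: it enumerates every candidate item-set rather than using an optimised method such as Apriori, the function name `mine_rules` and the toy database are hypothetical, and thresholds are treated as "at least" (>=) rather than strictly greater.

```python
from itertools import combinations


def mine_rules(db, min_supp, min_conf):
    """Return (frequent item-sets with supports, rules) for a list of item-sets."""
    n = len(db)
    items = sorted(set().union(*db))

    # Stage 1: find every item-set whose support is at least min_supp.
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            supp = sum(1 for t in db if candidate <= t) / n
            if supp >= min_supp:
                frequent[candidate] = supp
                found_any = True
        if not found_any:
            break  # no frequent set of this size, so no larger set can be frequent

    # Stage 2: split each frequent item-set into X => Y and keep confident rules.
    rules = []
    for itemset, supp in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                x = frozenset(lhs)
                # Every subset of a frequent item-set is itself frequent,
                # so its support is already stored.
                conf = supp / frequent[x]
                if conf >= min_conf:
                    rules.append((x, itemset - x, supp, conf))
    return frequent, rules


# Hypothetical toy database of nominal item-sets.
toy_db = [
    {"bread", "cheese", "milk"},
    {"bread", "milk"},
    {"bread", "cheese"},
    {"milk"},
]

frequent, rules = mine_rules(toy_db, min_supp=0.5, min_conf=0.6)
for x, y, supp, conf in rules:
    print(sorted(x), "=>", sorted(y), "support", supp, "confidence", round(conf, 2))
```

The enumeration is exponential in the number of distinct items, so it is only suitable for very small examples like this one.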

Contrast: Classification Rule Mining
The output of DM is a (set of) classification rule(s), where the classes are known a priori (supervised learning) and there is only one class on the RHS.
Features => C(1)
….
Features => C(n)

Classification Rule Mining
Size = medium, colour = green, shape = square => c1
Size = small, colour = red, shape = square => c1
Size = small, colour = blue, shape = circle => c1
Size = small, colour = green, shape = triangle => c2
Size = large, colour = white, shape = circle => c2
The aim is to find “hypotheses” that are
Characteristic: true of all members of a class
Discriminating: not true of ANY members of other classes
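The two properties can be tested directly against the five labelled examples. The sketch below is illustrative only: it assumes a hypothesis is a conjunction of attribute = value conditions held in a dictionary, and the function names are hypothetical.

```python
# The five labelled examples from the slide, as (features, class) pairs.
examples = [
    ({"size": "medium", "colour": "green", "shape": "square"}, "c1"),
    ({"size": "small", "colour": "red", "shape": "square"}, "c1"),
    ({"size": "small", "colour": "blue", "shape": "circle"}, "c1"),
    ({"size": "small", "colour": "green", "shape": "triangle"}, "c2"),
    ({"size": "large", "colour": "white", "shape": "circle"}, "c2"),
]


def matches(hypothesis, features):
    """True if every attribute = value condition in the hypothesis holds."""
    return all(features.get(attr) == value for attr, value in hypothesis.items())


def is_characteristic(hypothesis, cls, data):
    """True of ALL members of the class."""
    return all(matches(hypothesis, f) for f, c in data if c == cls)


def is_discriminating(hypothesis, cls, data):
    """Not true of ANY member of the other classes."""
    return not any(matches(hypothesis, f) for f, c in data if c != cls)


hypothesis = {"shape": "square"}
print("characteristic of c1:", is_characteristic(hypothesis, "c1", examples))
print("discriminating for c1:", is_discriminating(hypothesis, "c1", examples))
```

On this data "shape = square" is discriminating for c1 but not characteristic, because one c1 example is a circle.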

Associative Classification
If we fuse ARM and CRM we get “Associative Classification”: use the association technique, but learn about particular items or item-sets.
Associative Classification is a branch of data mining that combines classification and association rule mining. In other words, it utilises association rule discovery methods on classification data sets.
Typically:
Find Association Rules using ARM
Sift out the “Class Association Rules”, the ones that have the class of interest on their Right Hand Sides
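A rough Python sketch of the idea follows, reusing the five size/colour/shape examples from the classification slide. Several things here are assumptions rather than the lecture's method: each row is encoded as a transaction of (attribute, value) items with the class added as one more item, antecedents are limited to at most two items to keep the search small, thresholds are "at least" (>=), and the function name `class_association_rules` is hypothetical. Instead of mining every rule and then sifting, the sketch only generates candidates with a class item on the right hand side, which gives the same class association rules for these antecedent sizes.

```python
from itertools import combinations

rows = [
    ({"size": "medium", "colour": "green", "shape": "square"}, "c1"),
    ({"size": "small", "colour": "red", "shape": "square"}, "c1"),
    ({"size": "small", "colour": "blue", "shape": "circle"}, "c1"),
    ({"size": "small", "colour": "green", "shape": "triangle"}, "c2"),
    ({"size": "large", "colour": "white", "shape": "circle"}, "c2"),
]

# Encode each row as an item-set of (attribute, value) pairs plus a class item.
db = [frozenset(features.items()) | {("class", cls)} for features, cls in rows]


def class_association_rules(db, min_supp, min_conf, max_lhs=2):
    """Return rules (antecedent, class label, support, confidence)."""
    n = len(db)
    feature_items = sorted({i for t in db for i in t if i[0] != "class"})
    class_items = sorted({i for t in db for i in t if i[0] == "class"})
    found = []
    for k in range(1, max_lhs + 1):
        for lhs in combinations(feature_items, k):
            x = frozenset(lhs)
            count_x = sum(1 for t in db if x <= t)
            if count_x == 0:
                continue  # antecedent never occurs, so no rule can be formed
            for c in class_items:
                count_xc = sum(1 for t in db if (x | {c}) <= t)
                supp = count_xc / n
                conf = count_xc / count_x
                if supp >= min_supp and conf >= min_conf:
                    found.append((x, c[1], supp, conf))
    return found


cars = class_association_rules(db, min_supp=0.4, min_conf=1.0)
for x, label, supp, conf in cars:
    print(sorted(x), "=>", label, "support", supp, "confidence", conf)
```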

Example in Road Traffic Control
Data:
Numeric data records from individual CARS (date, time, position, actual speed, expected speed)
Textual data on INCIDENTS (date, time start, time cleared, position, severity, road type, area, incident category, cause, road-effect, traffic-effect, reporter..)

Example in Road Traffic Control
- associations between variations in speeds and near-future incidents
- effect of a particular type of incident (eg roadworks) on average speeds on nearby trunk roads
- looking for predictors of "heavy/slow traffic" incidents: look for associations with speed variations or accidents on roads downstream from the incident position (hence causing the incident)
- looking for associations between speeds around a bypass and a later "heavy traffic" incident within the town bypassed
- extraction of the roads that have most impact in causing congestion
- formulation of rules that can predict conditions after a period of road works or an incident (depending on specific road, type of incident etc).

Conclusions
Data Mining is a powerful set of techniques to help discover hidden knowledge.
It can be supervised or unsupervised.
ARM, CRM and AC are three important classes of technique used in DM.