Targeted Association Mining in Time-Varying Domains


Targeted Association Mining in Time-Varying Domains. Project proposal, Fall 2016. Jennifer Lavergne

Project Goals
Phase 1: Modify the Min-Max Itemset Tree code. Modify the itemset tree code supplied for the project so that it finds emerging and declining patterns.
Phase 2: Prepare datasets for testing. The datasets are already cleaned but still need to be separated into time intervals.
Phase 3: Run emerging/declining pattern mining tests on real-world and synthetic datasets.

Background

Association Mining
Large data repositories contain hidden gems of information; the rarer the information, the harder it is to find.
Goal: find interesting correlations (Agrawal et al.), such as items that sell together, contributors to fatal accidents, and symptoms of rare diseases.

Association Rules
Association rule: an implication {X ⇒ Y, support, confidence}, where X and Y are subsets of the itemset I and X ∩ Y = Ø.
Example: {{bread, milk} ⇒ {cheese}, 30%, 75%}, i.e. support 30% and confidence 75%.
Support(I) = (number of rows in the database that contain I) / (number of rows in the database).
Minsup: the minimum support threshold an itemset I must reach to be considered frequent.
Confidence(X ⇒ Y) = Support(X ∪ Y) / Support(X), for the itemset I = X ∪ Y.
Minconf: a user-specified threshold on confidence; a candidate rule X ⇒ Y is considered interesting when conf(X ⇒ Y) > minconf.
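The support and confidence definitions above can be checked on a small example. The following is a minimal sketch, not the project's code: the function names and the ten-row toy database are made up for illustration, chosen so that the rule {bread, milk} ⇒ {cheese} comes out at 30% support and 75% confidence.

```python
from typing import FrozenSet, Iterable, Sequence


def count(itemset: Iterable[str], db: Sequence[FrozenSet[str]]) -> int:
    """Number of rows that contain every item of `itemset`."""
    target = frozenset(itemset)
    return sum(1 for row in db if target <= row)


def support(itemset: Iterable[str], db: Sequence[FrozenSet[str]]) -> float:
    """Fraction of rows that contain `itemset` (0.0 for an empty database)."""
    return count(itemset, db) / len(db) if db else 0.0


def confidence(x: Iterable[str], y: Iterable[str],
               db: Sequence[FrozenSet[str]]) -> float:
    """Support(X u Y) / Support(X), computed from raw counts."""
    x_count = count(x, db)
    if x_count == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return count(frozenset(x) | frozenset(y), db) / x_count


# Hypothetical toy database: 10 rows in total.
db = (
    [frozenset({"bread", "milk", "cheese"})] * 3
    + [frozenset({"bread", "milk"})]
    + [frozenset({"eggs"})] * 6
)

x, y = {"bread", "milk"}, {"cheese"}
print(support(x | y, db))    # 0.3
print(confidence(x, y, db))  # 0.75
```

Confidence is computed from counts rather than from the two support ratios; the result is the same mathematically, and it avoids a floating-point rounding step.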

Itemset Trees
A data structure that lets users query for a specific itemset and its support (targeted association mining).
Items are mapped to numeric values: {bread} = {1}, {cheese} = {2}.
Numbers must be in ascending order within an itemset, e.g. I = {1, 2, 56, 120}.
Note: the tree can be used to find all rules or only specific rules within a dataset.
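The item-to-number mapping and the ascending-order requirement can be sketched as below. This covers only the encoding step that precedes tree construction, not the itemset tree itself; `build_item_ids` and `encode` are hypothetical helper names, and assigning ids in order of first appearance is an assumption.

```python
from typing import Dict, Iterable, List, Tuple


def build_item_ids(transactions: Iterable[Iterable[str]]) -> Dict[str, int]:
    """Assign each distinct item an integer id, starting at 1, in order of first appearance."""
    ids: Dict[str, int] = {}
    for row in transactions:
        for item in row:
            ids.setdefault(item, len(ids) + 1)
    return ids


def encode(itemset: Iterable[str], ids: Dict[str, int]) -> Tuple[int, ...]:
    """Translate an itemset to its ids, deduplicated and sorted in ascending order."""
    return tuple(sorted({ids[item] for item in itemset}))


rows: List[List[str]] = [["bread", "cheese"], ["milk", "bread"]]
ids = build_item_ids(rows)
print(ids)                             # {'bread': 1, 'cheese': 2, 'milk': 3}
print(encode(["milk", "bread"], ids))  # (1, 3)
```

Storing every itemset as a sorted tuple gives each one a single canonical form, which is what allows a tree to compare and share prefixes.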

Why Emerging? By the time the data is analyzed, the value of the knowledge has greatly depreciated. Velocity: Time to action vs. Value (Hackathorn, 2002)

Time Stream Data
Time stream: a stream of time-sensitive data on which the algorithm mines patterns.
Time window: a window of n time steps that delineates the current "interesting" time frame.
Time step: the increment of time by which the time window advances.
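The relationship between time steps and time windows can be illustrated with a short sketch. It assumes the window advances exactly one time step at a time and that the stream is already available as a list of steps; `sliding_windows` is a hypothetical name.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(steps: Sequence[T], n: int) -> Iterator[List[T]]:
    """Yield every window of n consecutive time steps, advancing one step at a time."""
    if n <= 0:
        raise ValueError("window size must be positive")
    for start in range(len(steps) - n + 1):
        yield list(steps[start:start + n])


# Five time steps and a window of three steps.
for window in sliding_windows(["t0", "t1", "t2", "t3", "t4"], 3):
    print(window)
# ['t0', 't1', 't2']
# ['t1', 't2', 't3']
# ['t2', 't3', 't4']
```

If the stream has fewer than n steps, no window is produced.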

Time Stream Data: Moving Forward
[Figure: the time window at the current time step, and the same window after moving forward one time step.]

Dynamic Data Mining: Pattern Types
[Figure: pattern types shown along a time axis.]

Project Description

Phase 1: Itemset Tree Code
Modify the existing code or write your own so that it:
Processes queries over time streams.
Finds emerging and declining patterns.
Stores supports in a support array whose length equals the number of time steps.
Read the papers on: itemset trees, the Ordered Min-Max Itemset Tree, and Dyn-TARM.
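The idea of a per-pattern support array with one entry per time step can be sketched as follows. The proposal does not give a formal definition of "emerging" or "declining", so the rule used here (compare the last support in the array against the first, with a hypothetical `min_change` threshold) is only an assumption for illustration. It is not the Dyn-TARM criterion, and it scans the raw transactions rather than querying an itemset tree.

```python
from typing import FrozenSet, Iterable, List, Sequence


def support_array(itemset: Iterable[str],
                  steps: Sequence[Sequence[FrozenSet[str]]]) -> List[float]:
    """One support value per time step; `steps[i]` holds the rows of step i."""
    target = frozenset(itemset)
    supports: List[float] = []
    for rows in steps:
        hits = sum(1 for row in rows if target <= row)
        supports.append(hits / len(rows) if rows else 0.0)
    return supports


def classify_trend(supports: Sequence[float], min_change: float = 0.1) -> str:
    """Assumed rule: label by the change in support from the first to the last step."""
    if len(supports) < 2:
        return "stable"
    delta = supports[-1] - supports[0]
    if delta >= min_change:
        return "emerging"
    if delta <= -min_change:
        return "declining"
    return "stable"


ab, other = frozenset({"a", "b"}), frozenset({"c"})
steps = [
    [ab, other, other, other],  # step 0: support 1/4
    [ab, ab, other, other],     # step 1: support 2/4
    [ab, ab, ab, other],        # step 2: support 3/4
]
arr = support_array({"a", "b"}, steps)
print(arr)                  # [0.25, 0.5, 0.75]
print(classify_trend(arr))  # emerging
```

Because the array has exactly one slot per time step, a query over any time window reduces to slicing the array.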

Phase 2: Preparing Datasets
For real-world data: separate the data into time steps based on the dates in the data.
For synthetic data: divide the data into equal parts to create time steps.
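Both splitting strategies can be sketched in a few lines. The record layout (a date paired with a list of items), the choice of one calendar month per time step, and the function names are assumptions made for this example; the proposal does not fix a step size.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_month(records: Iterable[Tuple[date, T]]) -> List[List[T]]:
    """Real-world data: group rows into one time step per calendar month, oldest first."""
    buckets: Dict[Tuple[int, int], List[T]] = defaultdict(list)
    for day, row in records:
        buckets[(day.year, day.month)].append(row)
    return [buckets[key] for key in sorted(buckets)]


def split_equal(rows: Sequence[T], k: int) -> List[List[T]]:
    """Synthetic data: cut rows into k contiguous parts whose sizes differ by at most one."""
    if k <= 0:
        raise ValueError("k must be positive")
    base, extra = divmod(len(rows), k)
    parts: List[List[T]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        parts.append(list(rows[start:start + size]))
        start += size
    return parts


records = [
    (date(2016, 2, 3), ["milk"]),
    (date(2016, 1, 15), ["bread", "milk"]),
    (date(2016, 1, 20), ["cheese"]),
]
print(split_by_month(records))        # [[['bread', 'milk'], ['cheese']], [['milk']]]
print(split_equal(list(range(10)), 3))  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Months with no records produce no time step in this sketch; whether empty steps should be kept is a design decision left open here.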

Phase 3: Tests (both datasets)
Question 1: What effect does modifying the input measures have? (0 < minSup, minConf < 1; 0 < α < 100; time step size; time window size.)
Question 2: What is the trend of each pattern over the time stream? Is its emergence or decline cyclic, or is it sustained throughout the stream?
Question 3: What is the distribution of emerging and declining patterns over each dataset?

Questions? Jennifer Lavergne jjslavergne@louisiana.edu