Date: 2004/03/05. Mining Frequent Episodes for Relating Financial Events and Stock Trends. Anny Ng and Ada Wai-chee Fu, PAKDD 2003. Presenter: Ming Jing Tsai.

Presentation transcript:

Definition
• Events: financial news, political events, ...
• e_1, e_2, e_3, ..., e_k: event types
• Day record D_i: {e_i1, e_i2, e_i3, ..., e_ik}
• Episode: {e_1, e_2, e_3, ..., e_k}; it has at least two elements, and at least one e_j is a stock event type
• Window = x days

Definition
• Window frequency: number of windows that contain an event type
• DB frequency: number of occurrences of an event type in the DB
• Frequency of an episode: number of windows that contain the episode and whose first day contains at least one of the event types in the episode
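As a rough illustration of the frequency definitions above, the following Python sketch counts episode frequency on the seven-day example database used later in the slides. It rests on assumptions: windows at the end of the database are truncated (the windows slide lists windows covering days "6,7" and "7"), an episode is counted in a window when all of its event types occur somewhere in the window and at least one occurs on the first day, and the stock-event requirement is ignored because the example does not distinguish stock events. The names `windows` and `episode_frequency` are hypothetical.

```python
from typing import Dict, List, Set


def windows(db: Dict[int, Set[str]], size: int) -> List[List[Set[str]]]:
    """Return one window per starting day; windows near the end are truncated."""
    days = sorted(db)
    return [[db[d] for d in days[i:i + size]] for i in range(len(days))]


def episode_frequency(db: Dict[int, Set[str]], size: int, episode: Set[str]) -> int:
    """Count windows that contain the whole episode and whose first day
    contains at least one event type of the episode (assumed reading)."""
    count = 0
    for win in windows(db, size):
        all_events = set().union(*win)
        if episode <= all_events and episode & win[0]:
            count += 1
    return count


# Example database from the slides: day -> event types.
db = {1: {"b"}, 2: {"a", "c"}, 3: {"b"}, 4: {"d"}, 5: {"b"}, 6: {"c", "a"}, 7: {"d"}}

freq_bd = episode_frequency(db, 3, {"b", "d"})
freq_ac = episode_frequency(db, 3, {"a", "c"})
```

With window = 3 and min_sup = 3, this reading marks {b, d} as frequent and {a, c} as not frequent, which agrees with the frequent episodes listed on the mining slides.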

Construct event tree
• Header table in descending DB frequency order
• Each event_set pair is sorted in descending DB frequency order
• Node: E = event type, c = count, b = binary bit

Pruning method
• Remove event types whose window frequency < min_sup
• Remove duplicate event types that appear in both the firstdays part and the remainingdays part

An event database (window = 3, min_sup = 3)

Day   Events
1     b
2     a, c
3     b
4     d
5     b
6     c, a
7     d

[Table: DB frequencies]

Windows (window = 3, min_sup = 3)

Window   Days included
1        1, 2, 3
2        2, 3, 4
3        3, 4, 5
4        4, 5, 6
5        5, 6, 7
6        6, 7
7        7

[Column: event_set pairs. Tables: ordered frequent event types, window frequencies]
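The DB frequencies, window frequencies and event_set pairs for this example can be recomputed from the event database. The Python sketch below does so under stated assumptions: ties in DB frequency are broken alphabetically (the slides only show the resulting header order b, a, c, d), and a duplicate event type is dropped from the remainingdays part while kept in the firstdays part (inferred from the node labels on the tree slides). The helper name `event_set_pair` is hypothetical.

```python
from collections import Counter

# Example database from the slides: day -> event types.
db = {1: {"b"}, 2: {"a", "c"}, 3: {"b"}, 4: {"d"}, 5: {"b"}, 6: {"c", "a"}, 7: {"d"}}
WINDOW, MIN_SUP = 3, 3

days = sorted(db)
# One window per starting day; the last windows are truncated.
wins = [[db[d] for d in days[i:i + WINDOW]] for i in range(len(days))]

# DB frequency: occurrences of an event type in the database.
db_freq = Counter(e for events in db.values() for e in events)
# Window frequency: number of windows that contain the event type.
win_freq = Counter(e for w in wins for e in set().union(*w))

# Prune by window frequency, then order by descending DB frequency.
# The alphabetical tie-break is an assumption.
frequent = {e for e, n in win_freq.items() if n >= MIN_SUP}
order = sorted(frequent, key=lambda e: (-db_freq[e], e))


def event_set_pair(win):
    """Split a window into (firstdays part, remainingdays part), both ordered."""
    first = [e for e in order if e in win[0]]
    later = set().union(*win[1:])
    remaining = [e for e in order if e in later and e not in win[0]]
    return first, remaining


pairs = [event_set_pair(w) for w in wins]
for number, (first, remaining) in enumerate(pairs, start=1):
    print(number, first, remaining)
```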

[Figure sequence: the event tree for the example database, built up incrementally as the windows are inserted. Nodes are labelled {event type:count:bit}; header table order: b, a, c, d.]

Node labels in the final frame: {null} {b:3:0} {a:2:1} {c:2:1} {a:2:0} {c:2:0} {b:1:1} {d:2:0} {b:1:1} {a:1:1} {c:1:1} {d:1:1}
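The construction shown in these frames can be sketched in Python as a prefix tree over event_set pairs. This is a minimal sketch, not the authors' implementation. Assumptions: the binary bit is 0 for the firstdays part and 1 for the remainingdays part (inferred from the node labels), children are keyed by (event type, bit), the header-table linked lists are omitted, and the `PAIRS` list was reconstructed from the example database because the event_set pairs column did not survive in the transcript. `Node`, `insert` and `label` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    event: str = ""   # event type E (empty for the null root)
    bit: int = 0      # assumed: 0 = firstdays part, 1 = remainingdays part
    count: int = 0    # number of event_set pairs passing through this node
    children: Dict[Tuple[str, int], "Node"] = field(default_factory=dict)


def insert(root: Node, first: List[str], remaining: List[str]) -> None:
    """Insert one event_set pair as a path: firstdays part, then remainingdays part."""
    node = root
    for key in [(e, 0) for e in first] + [(e, 1) for e in remaining]:
        child = node.children.get(key)
        if child is None:
            child = node.children[key] = Node(event=key[0], bit=key[1])
        child.count += 1
        node = child


def label(node: Node) -> str:
    """Format a node the way the slides do: {event:count:bit}."""
    return f"{{{node.event}:{node.count}:{node.bit}}}"


# Reconstructed event_set pairs for windows 1..7 (window = 3); an assumption.
PAIRS = [
    (["b"], ["a", "c"]),
    (["a", "c"], ["b", "d"]),
    (["b"], ["d"]),
    (["d"], ["b", "a", "c"]),
    (["b"], ["a", "c", "d"]),
    (["a", "c"], ["d"]),
    (["d"], []),
]

root = Node()
for first_part, remaining_part in PAIRS:
    insert(root, first_part, remaining_part)
```

Keying children by (event type, bit) keeps a first-day occurrence of an event type separate from a later-day occurrence, which is why the slides show both {a:2:0} and {a:2:1}.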

Mining frequent episodes
• Header table {h_0, h_1, ..., h_H}
• Recursively mine each linked list kept in the header table, from bottom to top
• Conditional paths are used to build a conditional event tree
• Objective 1: find frequent episodes of the form {a} ∪ {h_i} (first-part frequencies)
• Objective 2: find frequent episodes that contain h_i and at least two other event types (DB frequencies)

Traverse conditional path
• Remove invalid event types
• Adjust the counts of the nodes above h_i in the path to be equal to that of h_i
• If h_i is in the firstdays part, move all event types in the remainingdays part to the firstdays part
• Remove h_i from the path

Generate frequent episodes
• When a conditional event tree contains only a single path, generate:
  - any subset of firstpart ∪ event base set
  - any subset of firstpart ∪ any subset of remainingpart ∪ event base set
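The single-path case can be illustrated with a small Python sketch that enumerates the second form literally: a subset of the firstpart, a subset of the remainingpart, and the event base set. Several points are assumptions rather than statements from the slides: the flag `require_first` (default on) forces the firstpart subset to be non-empty, which follows from the first-day condition when the base set itself has no first-day event; combinations that add nothing to the base set are skipped; and support counts are not checked. The function names are hypothetical.

```python
from itertools import chain, combinations
from typing import FrozenSet, Iterable, Set


def subsets(items: Iterable[str]):
    """Yield every subset of items as a tuple, including the empty one."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def episodes_from_single_path(
    first_part: Iterable[str],
    remaining_part: Iterable[str],
    base: Iterable[str],
    require_first: bool = True,
) -> Set[FrozenSet[str]]:
    """Enumerate (subset of firstpart) ∪ (subset of remainingpart) ∪ base."""
    base_set = frozenset(base)
    result = set()
    for s1 in subsets(first_part):
        if require_first and not s1:
            continue  # assumed: need a first-day event outside the base set
        for s2 in subsets(remaining_part):
            if not s1 and not s2:
                continue  # nothing added to the base set
            result.add(frozenset(s1) | frozenset(s2) | base_set)
    return result


candidates = episodes_from_single_path(["b"], ["a"], {"d"})
```

For a path with firstpart [b], remainingpart [a] and base set {d}, the sketch yields {b, d} and {a, b, d}; with `require_first=False` it also yields {a, d}.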

Mining header d (min_sup = 3)
• Event base set: {d}
• DB frequency: { }
• First_part frequency: { }
• Frequent episodes: {bd, ad, cd}
[Table: W, event_set pairs]

Recursively mining header c
• Event base set: {cd}
• DB frequency: { }
• First_part frequency: { }
• Frequent episodes: {bcd, acd}

Recursively mining header a
• Event base set: {acd}
• DB frequency: { }
• First_part frequency: { }
• Frequent episodes: {bacd}

Mining header c (min_sup = 3)
• Event base set: {c}
• DB frequency: { }
• First_part frequency: { }
• Frequent episodes: {bc}
[Table: W, event_set pairs]

Recursively mining header a (min_sup = 3)
• Event base set: {ac}
• DB frequency: { }
• First_part frequency: { }
• Frequent episodes: {bac}

Mining header a (min_sup = 3)
• Event base set: {a}
• DB frequency: { }
• First_part frequency: { }
• Frequent episodes: {ba}
[Table: W, event_set pairs]

Experiment (synthetic data)

Dataset 2: T20, I5, M1000, D3K

Experiment (real data)
• News events from the Internet: 121 event types, 757 days
• Stock data: Dow Jones, Nasdaq, Hang Seng, 12 top local companies

Experiment (real data)

Episode                                                  Support
Nasdaq downs, PCCW downs                                 151
Nasdaq ups, SHK properties flats, HSBC flats             178
China Mobile downs, Nasdaq downs, HK Electric flats      178