SeqStream: Mining Closed Sequential Pattern over Stream Sliding Windows. Lei Chang, Tengjiao Wang, Dongqing Yang, Hua Luan. ICDM’08.

Presentation transcript:

SeqStream: Mining Closed Sequential Pattern over Stream Sliding Windows. Lei Chang, Tengjiao Wang, Dongqing Yang, Hua Luan. ICDM’08.

Outline. Preliminary. Algorithm. Experimental results. Conclusion.

Preliminary.
The inverse sequence of a sequence s is denoted by s’: s = , s’ = 
The s-projected database is denoted by D_s. The -projected database is {,,,, }.
The size of D_s is denoted by R(D_s). The size of the -projected database is 
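
The definitions above can be illustrated with a minimal sketch. It rests on several assumptions that the slide does not confirm: each element is simplified to a single item, the inverse sequence is taken to be the sequence in reverse order, the projection is a PrefixSpan-style suffix after the earliest match of the prefix, and the size R(D_s) is taken to be the total number of items left in the projected database. The function names and the toy database are hypothetical.

```python
from typing import List, Optional

Seq = List[str]  # simplification (assumption): one item per element


def inverse(seq: Seq) -> Seq:
    """Return the inverse sequence s' (assumed: elements in reverse order)."""
    return list(reversed(seq))


def project(seq: Seq, prefix: Seq) -> Optional[Seq]:
    """Suffix of seq after the earliest occurrence of prefix, or None if absent."""
    pos = 0
    for item in prefix:
        try:
            pos = seq.index(item, pos) + 1
        except ValueError:
            return None
    return seq[pos:]


def projected_database(db: List[Seq], prefix: Seq) -> List[Seq]:
    """Suffixes of every sequence that contains prefix; empty suffixes are kept."""
    result = []
    for seq in db:
        suffix = project(seq, prefix)
        if suffix is not None:
            result.append(suffix)
    return result


def projected_size(db: List[Seq], prefix: Seq) -> int:
    """Assumed size measure: total number of items in the projected database."""
    return sum(len(s) for s in projected_database(db, prefix))


# Hypothetical toy database, not the one from the slides.
db = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
inverse_db = [inverse(s) for s in db]
```

An empty suffix plays the role of the φ entries that appear in the projected databases on the slides: the sequence contains the prefix, but nothing follows it.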

The -projected database is {φ,φ,, }. The size of the -projected database is 6.
The inverse database of D is denoted by D’.
The database in the current sliding window after insertion (but before removal) is denoted by D^.
D^ : {,,,,, }

In the inverse database of D^, the set of sequences from users that appear in the current window is called the insertion database, denoted by D+. The set of sequences from users that appear in the removal window is called the removal database, denoted by D−.

D^ : {,,,,, }
D^’ : {,,,,, }
D+ : {,, }
D− : {,, }

Closed patterns: { :6, :3, :4, :5, :4, :3, :4}
Closed patterns: { :6, :5, :4, :3, :4, :4, :3}
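
For orientation, a sequential pattern is closed when no proper super-sequence has the same support. The sketch below filters a support table down to its closed patterns by brute force. It is only an illustration of the definition, not the SeqStream procedure; the single-item elements, the function names, and the toy supports are assumptions.

```python
from typing import Dict, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(small: Pattern, big: Pattern) -> bool:
    """True if small can be obtained from big by deleting items."""
    remaining = iter(big)
    return all(item in remaining for item in small)


def closed_patterns(support: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Keep patterns that have no proper super-sequence with equal support."""
    closed = {}
    for pattern, sup in support.items():
        absorbed = any(
            other != pattern and other_sup == sup and is_subsequence(pattern, other)
            for other, other_sup in support.items()
        )
        if not absorbed:
            closed[pattern] = sup
    return closed


# Hypothetical supports, not taken from the slides.
support = {("a",): 3, ("b",): 2, ("a", "b"): 2}
result = closed_patterns(support)
```

Here ("b",) is dropped because ("a", "b") contains it and has the same support, while ("a",) stays because its support is strictly higher.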

s_n : A node n of an IST corresponds to the sequence of items on the path from the root node to n; this sequence is denoted by s_n.
c-node : If s_n is a closed sequential pattern in D’, n is a c-node.
t-node : If s_n is not a closed sequential pattern in D’ and n does not have any t-node ancestor, n is a t-node.
i-node : If n is neither a c-node nor a t-node, n is an i-node.
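
A minimal sketch of how a tree node can carry its sequence s_n and one of the three type labels. The class and field names are hypothetical and the paper's actual node layout may differ; the sketch only stores a label and does not decide which type a node should have.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class NodeType(Enum):
    C_NODE = "c-node"
    T_NODE = "t-node"
    I_NODE = "i-node"


# eq/repr are disabled because parent and children reference each other.
@dataclass(eq=False, repr=False)
class ISTNode:
    item: Optional[str] = None            # None marks the root
    parent: Optional["ISTNode"] = None
    node_type: Optional[NodeType] = None  # label only; no classification logic here
    children: Dict[str, "ISTNode"] = field(default_factory=dict)

    def add_child(self, item: str) -> "ISTNode":
        """Return the child for item, creating it if needed."""
        child = self.children.get(item)
        if child is None:
            child = ISTNode(item=item, parent=self)
            self.children[item] = child
        return child

    def sequence(self) -> List[str]:
        """Items on the path from the root down to this node (s_n)."""
        items: List[str] = []
        node: Optional[ISTNode] = self
        while node is not None and node.item is not None:
            items.append(node.item)
            node = node.parent
        items.reverse()
        return items


root = ISTNode()
c = root.add_child("c")
ca = c.add_child("a")
ca.node_type = NodeType.C_NODE
```

Calling `ca.sequence()` walks the parent links up to the root and returns the path in root-to-node order.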

Algorithm. Element insertion. Element removal. State update.

Element insertion
Theorem 2 : If the item of a depth-1 node does not occur in the newly arriving element, then after the element is inserted the nodes under that node keep their attribute values and every t-node under it keeps its type.
Theorem 3 : After a new element is inserted, if the PDBSize and support of a t-node do not change, it remains a t-node.
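
The two theorems act as pruning rules during insertion, which can be sketched as follows. This is an illustration under assumptions, not the paper's code: depth-1 nodes are represented only by their items, a t-node's state is reduced to a (PDBSize, support) pair, and the function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def subtrees_to_visit(depth1_items: Iterable[str], new_element: Set[str]) -> List[str]:
    """Theorem 2 as a filter: only depth-1 nodes whose item occurs in the
    new element need their subtrees revisited."""
    return [item for item in depth1_items if item in new_element]


def t_node_stays(before: Tuple[int, int], after: Tuple[int, int]) -> bool:
    """Theorem 3 as a check: a t-node whose (PDBSize, support) pair is
    unchanged by the insertion remains a t-node."""
    return before == after


# Hypothetical example: four depth-1 nodes, new element contains c and f.
visited = subtrees_to_visit(["a", "b", "c", "f"], {"c", "f"})
```

In this example only the subtrees rooted at c and f are visited; the subtrees under a and b are skipped entirely.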

D_c^’ : {,,,, }
D_f^’ : {φ, φ, }
c : {,,,, }
ca : {,, }
cb : {, φ, }
ce : {, φ, }

Element removal
Theorem 5 : After the removal of e_{tc−w}, a t-node may be deleted, but it never changes to a c-node or an i-node.
For each child node t of a node n, the algorithm computes the s_t-projected database in the removal database D−.

D− : {,, }
D_a− : {,, }
D_b− : {,φ, }
D_c− : {,, }
……
D_f− : {φ, }

State update
Theorem 6 : Given a t-node n in an IST for the inverse database D, there must exist an i-node or a c-node t in the IST.
i-node => c-node
c-node => t-node

Experimental results.

Conclusion. This paper proposes the SeqStream algorithm for mining closed sequential patterns over a sliding window. Open question: is it designed for multiple streams?