SeqStream: Mining Closed Sequential Patterns over Stream Sliding Windows. Lei Chang, Tengjiao Wang, Dongqing Yang, Hua Luan. ICDM’08. 2016/1/2
Outline. Preliminary. Algorithm. Experimental results. Conclusion.
Preliminary. The inverse sequence of a sequence s is denoted by s’. s = , s’ = . An s-projected database is denoted by D s. The -projected database is {,,,, }. The size of D s is denoted by R(D s ). The size of the -projected database is
The -projected database is {φ,φ,, }. The size of the -projected database is 6. The inverse database of D is denoted by D’. The database in the current sliding window after insertion (but before removal) is denoted by D^. D^ : {,,,,, }
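The example sequences on these slides did not survive, so the following is only a minimal sketch of the notation on made-up data. It assumes that every element is a single item, that the inverse sequence is the sequence with its elements in reverse order, that the inverse database collects the inverse sequences, and that R(D s) counts the total number of items in the projected database. The function names and the toy database are hypothetical.

```python
from typing import List, Optional

Sequence = List[str]  # assumption: each element of a sequence is a single item


def inverse(seq: Sequence) -> Sequence:
    """Return the inverse sequence (assumed: same elements, reverse order)."""
    return seq[::-1]


def project(seq: Sequence, prefix: Sequence) -> Optional[Sequence]:
    """Return the suffix of seq after the earliest match of prefix.

    Returns None when prefix is not a subsequence of seq. An empty list
    plays the role of the empty suffix (written as phi on the slides).
    """
    pos = 0
    for item in prefix:
        try:
            pos = seq.index(item, pos) + 1
        except ValueError:
            return None
    return seq[pos:]


def projected_database(db: List[Sequence], prefix: Sequence) -> List[Sequence]:
    """Collect the suffixes of all sequences that contain prefix."""
    suffixes = (project(seq, prefix) for seq in db)
    return [s for s in suffixes if s is not None]


def pdb_size(pdb: List[Sequence]) -> int:
    """Size of a projected database (assumed: total number of items)."""
    return sum(len(s) for s in pdb)


# Hypothetical toy database, not the one from the slides.
db = [list("abc"), list("acb"), list("bc")]
inverse_db = [inverse(s) for s in db]
b_projected = projected_database(db, ["b"])
a_size = pdb_size(projected_database(db, ["a"]))
```

In this toy run the b-projected database keeps one empty suffix, which corresponds to a φ entry in the slide notation, while sequences that do not contain the prefix are dropped entirely.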
In the inverse database of D^, the set of sequences from users that appear in the current window is called the insertion database, denoted by D +. The set of sequences from users that appear in the removal window is called the removal database, denoted by D -.
D^ : {,,,,, } D^’: {,,,,, } D + : {,, } D - : {,, }
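One possible reading of D^, D + and D - can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the stream is modeled as hypothetical (time unit, user, item) events, the window before removal covers time units tc−w through tc, the insertion database holds the inverse sequences of users with an element at tc, and the removal database holds those of users with an element at tc−w.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[int, str, str]  # (time unit, user id, item): hypothetical layout


def window_databases(events: List[Event], tc: int, w: int):
    """Build D^ and the insertion/removal databases for current time tc.

    Assumptions: D^ covers time units tc-w..tc (after insertion, before
    removal); D+ and D- are taken from the inverse database of D^.
    """
    d_hat: Dict[str, List[str]] = defaultdict(list)
    inserted, removed = set(), set()
    for t, user, item in sorted(events):  # process in time order
        if tc - w <= t <= tc:
            d_hat[user].append(item)
            if t == tc:
                inserted.add(user)  # user appears in the newest time unit
            if t == tc - w:
                removed.add(user)  # user appears in the unit about to expire
    inverse_db = {u: seq[::-1] for u, seq in d_hat.items()}
    d_plus = {u: inverse_db[u] for u in sorted(inserted)}
    d_minus = {u: inverse_db[u] for u in sorted(removed)}
    return dict(d_hat), d_plus, d_minus


# Hypothetical stream with three users.
events = [(1, "u1", "a"), (2, "u2", "b"), (3, "u1", "c"),
          (4, "u3", "a"), (4, "u1", "b")]
d_hat, d_plus, d_minus = window_databases(events, tc=4, w=3)
```

A user can fall into both databases at once, as u1 does here, because the same sequence gains an element at one end of the window and is about to lose one at the other.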
closed pattern : { :6, :3, :4, :5, :4, :3, :4} closed pattern : { :6, :5, :4, :3, :4, :4, :3}
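As a reminder of what "closed" means here, the sketch below filters a table of pattern supports using the standard definition: a pattern is closed when no proper super-sequence has the same support. The patterns and supports are hypothetical toy values, elements are assumed to be single items, and the brute-force comparison is for illustration only; it is not how SeqStream computes closedness.

```python
from typing import Dict, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(small: Pattern, big: Pattern) -> bool:
    """True if small can be obtained from big by deleting items."""
    it = iter(big)
    return all(item in it for item in small)  # consumes `it` left to right


def closed_patterns(supports: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Keep patterns that have no longer super-sequence with equal support."""
    closed = {}
    for p, sup in supports.items():
        absorbed = any(
            len(q) > len(p) and q_sup == sup and is_subsequence(p, q)
            for q, q_sup in supports.items()
        )
        if not absorbed:
            closed[p] = sup
    return closed


# Hypothetical supports, not taken from the slides.
supports = {("a",): 3, ("b",): 3, ("a", "b"): 3, ("c",): 2, ("a", "c"): 1}
result = closed_patterns(supports)
```

Here ("a",) and ("b",) are absorbed by ("a", "b") because all three have support 3, while ("c",) stays closed because its only super-sequence has a lower support.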
s n : A node n of an IST corresponds to the sequence spelled out along the path from the root node to n; this sequence is denoted by s n. c-node : n is a c-node if s n is a closed sequential pattern in D’. t-node : n is a t-node if s n is not a closed sequential pattern in D’ and n does not have any t-node ancestor. i-node : n is neither a c-node nor a t-node.
Algorithm. Element insertion. Element removal. State update.
Element insertion. Theorem 2 : If the item of a depth-1 node does not occur in the newly arriving element, the nodes under that node do not change their attribute values, and no t-node under it changes its type after the element is inserted. Theorem 3 : After a new element is inserted, if the PDBSize and support of a t-node do not change, it remains a t-node.
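Theorem 2 suggests a simple pruning step on insertion: only the depth-1 subtrees whose item occurs in the new element need to be revisited. The sketch below shows that partition alone; the function name, the representation of an element as a set of items, and the example items are assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple


def split_depth1_nodes(
    depth1_items: Iterable[str], new_element: Set[str]
) -> Tuple[List[str], List[str]]:
    """Partition depth-1 items into (revisit, skip) for an insertion.

    By Theorem 2, subtrees whose depth-1 item is absent from the new
    element keep their attribute values, so they can be skipped.
    """
    revisit: List[str] = []
    skip: List[str] = []
    for item in depth1_items:
        (revisit if item in new_element else skip).append(item)
    return revisit, skip


# Hypothetical tree with four depth-1 nodes and a new element {c, f}.
revisit, skip = split_depth1_nodes(["a", "b", "c", "f"], {"c", "f"})
```

Theorem 3 would add a second filter inside each revisited subtree: a t-node whose PDBSize and support are unchanged can be left as it is.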
D c ^’ : {,,,, } D f ^’ : {φ, φ, } c : {,,,, } ca : {,, } cb : {, φ, } ce : {, φ, }
Element removal. Theorem 5 : After the removal of e tc−w, a t-node may be deleted, but it never changes into a c-node or an i-node. For each child node t of n, the algorithm computes the s t -projected database in the removal database D −.
D − : {,, } D a − : {,, } D b − : {,φ, } D c − : {,, } …… D f − : {φ, }
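The per-child projection of the removal database can be sketched as a recursive walk over the tree. This is a simplified illustration, not the SeqStream removal procedure: the tree is modeled as hypothetical nested dictionaries keyed by item, elements are assumed to be single items, and the sketch only computes the projected databases without updating any node attributes.

```python
from typing import Dict, List, Tuple

Sequence = List[str]
Tree = Dict[str, "Tree"]  # hypothetical tree: item -> subtree of children


def project_item(db: List[Sequence], item: str) -> List[Sequence]:
    """Suffixes after the first occurrence of item; other sequences drop out."""
    return [seq[seq.index(item) + 1:] for seq in db if item in seq]


def project_along_tree(
    tree: Tree, db: List[Sequence], prefix: Tuple[str, ...] = ()
) -> Dict[Tuple[str, ...], List[Sequence]]:
    """Compute the projected database of db for every node reachable in tree."""
    result: Dict[Tuple[str, ...], List[Sequence]] = {}
    for item, children in tree.items():
        child_db = project_item(db, item)
        path = prefix + (item,)
        result[path] = child_db
        if child_db:  # nothing to project further when no sequence matched
            result.update(project_along_tree(children, child_db, path))
    return result


# Hypothetical removal database and tree, not the ones from the slides.
d_minus = [list("abc"), list("ba")]
tree = {"a": {"b": {}}, "b": {}}
projected = project_along_tree(tree, d_minus)
```

Each child reuses the projected database of its parent, so a sequence is scanned only from the point where the parent's match ended.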
State update. Theorem 6 : Given a t-node n in an IST for the inverse database D’, there must exist an i-node or a c-node t in the IST. i-node => c-node; c-node => t-node.
Experimental results.
Conclusion. This paper proposes the SeqStream algorithm for mining closed sequential patterns over stream sliding windows. Is it designed for multiple streams?