1
Coproduct Transformations on Lattices of Closed Partial Orders. Gemma Casas-Garriga (gcasas@lsi.upc.es), LARCA. MOISES meeting, Valladolid, Sept 2004.
2
Data Description We consider a set of sequences D = {s_1, …, s_N} to be analyzed, where each s_i is a sequence. A sequence is an ordered list of sets of items. [Table: example input sequences with Ids 1, 2, 3]
3
Basic Definitions An episode in D is an acyclic directed graph indicating a partial order between items. The support of a poset in D is the number of input sequences that are compatible with it. [Figure: example poset over items A, B, C, D, next to the input sequences 1, 2, 3] The example poset is compatible with the second and third input sequences.
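The support definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the database `D` is a hypothetical stand-in for the example table (whose contents are not in the transcript), items are assumed not to repeat within a sequence, and "compatible" is read as "every item of the poset occurs, and every edge (u, v) has u in a strictly earlier itemset than v". The names `positions`, `is_compatible` and `poset_support` are made up for this sketch.

```python
from typing import Dict, List, Set, Tuple

Sequence = List[Set[str]]   # an ordered list of itemsets
Edge = Tuple[str, str]      # (u, v): item u must occur strictly before item v


def positions(seq: Sequence) -> Dict[str, int]:
    """Index of the itemset holding each item (assumes no repeated items)."""
    return {item: i for i, itemset in enumerate(seq) for item in itemset}


def is_compatible(items: Set[str], edges: Set[Edge], seq: Sequence) -> bool:
    """True if seq contains every item and respects every edge of the poset."""
    pos = positions(seq)
    if not items.issubset(pos):
        return False
    return all(pos[u] < pos[v] for u, v in edges)


def poset_support(items: Set[str], edges: Set[Edge], database: List[Sequence]) -> int:
    """Number of input sequences compatible with the poset."""
    return sum(is_compatible(items, edges, seq) for seq in database)


# Hypothetical database with three input sequences (not the one from the slides).
D = [
    [{"C"}, {"A"}, {"B"}, {"D"}],
    [{"A"}, {"B"}, {"C"}, {"D"}],
    [{"B"}, {"A"}, {"C"}, {"D"}],
]
# Hypothetical poset: A and B both precede C, and C precedes D.
items = {"A", "B", "C", "D"}
edges = {("A", "C"), ("B", "C"), ("C", "D")}
print(poset_support(items, edges, D))
```

On this toy data the poset is compatible with the second and third sequences only, so its support is 2.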
4
Problem Formulation Goal: to identify posets and their support (alternatively, only those whose support is over a minimum user-specified threshold). Problem: there are many redundant partial orders. For example, both P and P′ are compatible with the same input sequences, but P is more "informative" than P′. [Figure: poset P over items A, B, C, D and poset P′ over items A, B, C, next to the input sequences 1, 2, 3]
5
Problem Formulation If P′ ⪯ P, we say that P is more specific than P′. The specificity relation differs from the classical inclusion of episodes. Goal redefined: to identify the most specific partial orders among those occurring in the same input sequences (alternatively, only those with support over a minimum threshold). [Figure: example posets over items A, B, C, D]
6
Example [Figure: the most specific posets over items A, B, C, D for the three input sequences. Each poset is labelled with the input sequences where it is the most specific: 1,2,3; 2,3; 1,3; 1; 3; 2]
7
Motivation Ordering relationships are useful in many domains: web mining, monitoring of processes, e-commerce... The most specific episodes give a general view of D, summarizing all the input sequences without redundancies.
8
Addressing the Problem Observation: identifying such structures directly from the data is a complex task (the specificity relation is difficult to calculate). Our proposal: – Constructing hybrid episodes out of their maximal paths. – That is, finding those subsequences in D that identify the maximal paths of the final desired episodes. [Figure: example poset over items A, B, C, D and its two maximal paths]
9
Our Proposal [Figure: the example posets over items A, B, C, D, together with the set of all sequences identifying their maximal paths] What are these sequences?
10
Result 1 Theorem: the sequences identifying maximal paths of the most specific posets are a particular case of so-called stable sequences. Stable sequences are maximal among those having the same number of occurrences (support) in D: {s | ∄ s′ s.t. s ⊂ s′ and support(s) = support(s′)}. [Example on the input sequences 1, 2, 3: one sequence is stable; another is not stable because it is contained in a sequence that has the same support] Many algorithms exist for mining stable sequences: CloSpan, BIDE, TSP...
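To make the definition of stable sequences concrete, here is a brute-force sketch in Python. It is not CloSpan, BIDE or TSP; it simply enumerates every subsequence of a small database and keeps those with no proper supersequence of equal support. It assumes sequences of single items (tuples of strings), and the database `D` is hypothetical. The names `is_subsequence`, `seq_support` and `stable_sequences` are illustrative only.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Seq = Tuple[str, ...]


def is_subsequence(s: Seq, t: Seq) -> bool:
    """True if s can be obtained from t by deleting items (order preserved)."""
    it = iter(t)
    return all(x in it for x in s)   # each lookup consumes the iterator


def seq_support(s: Seq, database: List[Seq]) -> int:
    """Number of input sequences that contain s."""
    return sum(is_subsequence(s, t) for t in database)


def stable_sequences(database: List[Seq]) -> Dict[Seq, int]:
    """Brute force: exponential in sequence length, for toy inputs only."""
    candidates = {
        c
        for t in database
        for k in range(1, len(t) + 1)
        for c in combinations(t, k)   # combinations keep the original order
    }
    sup = {s: seq_support(s, database) for s in candidates}
    return {
        s: sup[s]
        for s in candidates
        if not any(
            len(s) < len(t) and sup[s] == sup[t] and is_subsequence(s, t)
            for t in candidates
        )
    }


# Hypothetical database (not the one from the slides).
D = [("C", "A", "B", "D"), ("A", "B", "C", "D"), ("B", "A", "C", "D")]
stable = stable_sequences(D)
for s, n in sorted(stable.items(), key=lambda kv: (-kv[1], kv[0])):
    print("".join(s), n)
```

On this toy data the single item D is not stable, because the sequence A followed by D has the same support of 3; A followed by D is stable, since every extension of it has lower support.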
11
How to construct posets out of stable sequences? [Figure: the stable sequences next to the example posets over items A, B, C, D] Some stable sequences may identify maximal paths of different partial orders.
12
Result 2 We characterize a closure operator working on sets of sequences. A closure operator satisfies the three basic closure axioms: monotonicity, extensivity, and idempotency. Broadly: given any set of sequences S, its closure is the set of maximal sequences that are present in the same input sequences where S is contained. [Example on the input sequences 1, 2, 3: the closure of a set with one sequence is a set with two sequences]
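The informal description of the closure can be sketched as follows. This is a brute-force reading of the sentence above, not the operator as formally defined in the talk: it collects the input sequences that contain every sequence of S, enumerates the subsequences common to all of them, and keeps the maximal ones. It assumes sequences of single items and a hypothetical database `D`; the function name `closure` and the choice to return an empty set when no input contains S are assumptions of this sketch.

```python
from itertools import combinations
from typing import List, Set, Tuple

Seq = Tuple[str, ...]


def is_subsequence(s: Seq, t: Seq) -> bool:
    """True if s can be obtained from t by deleting items (order preserved)."""
    it = iter(t)
    return all(x in it for x in s)


def closure(S: Set[Seq], database: List[Seq]) -> Set[Seq]:
    """Maximal sequences common to all input sequences that contain S."""
    # Input sequences in which every sequence of S is contained.
    covering = [t for t in database if all(is_subsequence(s, t) for s in S)]
    if not covering:
        return set()   # assumption: S occurs nowhere, so nothing is returned
    first = covering[0]
    # Subsequences of the first covering input that also occur in all others.
    common = {
        c
        for k in range(1, len(first) + 1)
        for c in combinations(first, k)
        if all(is_subsequence(c, t) for t in covering)
    }
    # Keep only the maximal ones.
    return {
        s for s in common
        if not any(len(s) < len(t) and is_subsequence(s, t) for t in common)
    }


# Hypothetical database (not the one from the slides).
D = [("C", "A", "B", "D"), ("A", "B", "C", "D"), ("B", "A", "C", "D")]
print(sorted(closure({("A", "D")}, D)))
print(sorted(closure({("A", "B", "D")}, D)))
```

On this toy data the closure of the single sequence A, D is the set {AD, BD, CD}, which mirrors the slide's pattern of one sequence closing to several. Applying `closure` to that result returns the same set, as idempotency requires.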
13
Result 2 A set of sequences S is closed if it coincides with its closure. Lemma: individually, the sequences in a closed set S are stable sequences. [Example on the input sequences 1, 2, 3: a set of two sequences coincides with its closure, and both of its sequences are stable]
14
Result 2 Theorem: the sequences of a closed set identify the maximal paths of the same partial order. [Figure: closed sets of sequences on the left, mapped to the corresponding partial orders over items A, B, C, D on the right]
15
Lattice of Closed Sets of Sequences [Figure: lattice whose nodes are closed sets of sequences, each labelled with the input sequences where it occurs: 1,2,3; 2,3; 1,3; 2; 3; 1]
16
Lattice of Closed Partial Orders [Figure: lattice whose nodes are posets over items A, B, C, D, each labelled with the input sequences where it occurs: 1,2,3; 2,3; 1,3; 1; 3; 2] Moreover, these posets are closed.
17
Formalization A directed graph is modeled as G = (V, E, l), where V is the set of vertices, E ⊆ V × V is the set of edges, and l is an injective labelling function. A poset is an acyclic directed graph such that the relation on V established by the edges E is reflexive, antisymmetric and transitive. A graph morphism between two graphs G = (V, E, l) and G′ = (V′, E′, l′) consists of an injective function h: V → V′ that preserves labels and such that (u, v) ∈ E implies (h(u), h(v)) ∈ E′. [Figure: example posets over items A, B, C, D]
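The morphism definition can be checked mechanically. The sketch below is a minimal illustration: because labels are injective and h must preserve them, the image of each vertex is forced, so it only remains to test the edge condition. The `Graph` class, the function `find_morphism` and the two example graphs are hypothetical; edges here list the strict order pairs only, leaving reflexive pairs implicit.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Graph:
    vertices: Set[int]
    edges: Set[Tuple[int, int]]
    label: Dict[int, str]   # assumed injective: no two vertices share a label


def find_morphism(g: Graph, h: Graph) -> Optional[Dict[int, int]]:
    """Return the label-preserving morphism g -> h, or None if there is none."""
    by_label = {h.label[v]: v for v in h.vertices}
    mapping: Dict[int, int] = {}
    for v in g.vertices:
        target = by_label.get(g.label[v])
        if target is None:          # a label of g is missing in h
            return None
        mapping[v] = target
    # The mapping is injective because g's labelling is injective.
    if all((mapping[u], mapping[v]) in h.edges for u, v in g.edges):
        return mapping
    return None


# Hypothetical posets, written with their transitive edges.
# P: A and B precede C, and C precedes D.
P = Graph(
    vertices={0, 1, 2, 3},
    edges={(0, 2), (1, 2), (2, 3), (0, 3), (1, 3)},
    label={0: "A", 1: "B", 2: "C", 3: "D"},
)
# P_prime: the chain B, C, D.
P_prime = Graph(
    vertices={0, 1, 2},
    edges={(0, 1), (1, 2), (0, 2)},
    label={0: "B", 1: "C", 2: "D"},
)
print(find_morphism(P_prime, P))   # a mapping exists
print(find_morphism(P, P_prime))   # None: label A has no image
```

Under these assumptions a morphism from one poset into another is a natural way to test whether the second is at least as specific as the first.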
18
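Result 3 Coproduct of a family of graphs G1, ..., Gn: [Diagram relating G1, ..., Gn to the graphs G and G′] Example: [Figure: the coproduct of BC and AC, a poset over items A, B, C]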
19
Result 3 [Figure: the coproduct of BCD and AD, a poset over items A, B, C, D] Theorem: a lattice of stable sequences can be transformed into a lattice of closed posets by rewriting each node via coproduct transformations.
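As a rough sketch of the coproduct step, the Python below merges the paths BCD and AD into one poset. It rests on assumptions not stated in the transcript: vertices are identified by their labels (possible because labelling is injective and items do not repeat), the coproduct is modeled as the union of the vertices and edges of the paths, and the result is closed under transitivity. The helper names `path_edges`, `transitive_closure` and `coproduct_of_paths` are made up for this illustration.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def path_edges(path: Iterable[str]) -> Set[Edge]:
    """Consecutive pairs of a path, e.g. B, C, D gives (B, C) and (C, D)."""
    items = list(path)
    return set(zip(items, items[1:]))


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """Add (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closed = set(edges)
    while True:
        new = {(a, d) for a, b in closed for c, d in closed if b == c} - closed
        if not new:
            return closed
        closed |= new


def coproduct_of_paths(paths: Iterable[Iterable[str]]) -> Tuple[Set[str], Set[Edge]]:
    """Union of the paths' vertices and edges, closed under transitivity."""
    vertices: Set[str] = set()
    edges: Set[Edge] = set()
    for path in paths:
        items = list(path)
        vertices |= set(items)
        edges |= path_edges(items)
    return vertices, transitive_closure(edges)


vertices, edges = coproduct_of_paths(["BCD", "AD"])
print(sorted(vertices))
print(sorted(edges))
```

For the two paths BCD and AD this yields a poset in which B precedes C, C precedes D, and A precedes D, with A left unordered relative to B and C.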
20
Conclusions We identify partial orders in sequential data by: – Mining stable sequences and their support (CloSpan, BIDE …). – Grouping stable sequences into closed sets of sequences, according to the closure operator. – Getting the final episodes from those groupings. This transformation represents an important algorithmic simplification. Formally, when items are not repeated, this transformation can be expressed as coproduct transformations.