Coproduct Transformations on Lattices of Closed Partial Orders. Gemma Casas-Garriga. MOISES meeting, Valladolid, Sept 2004.

Presentation transcript:

Coproduct Transformations on Lattices of Closed Partial Orders. Gemma Casas-Garriga. MOISES meeting, Valladolid, Sept 2004. LARCA.

Data Description. We consider a set of sequences D to be analyzed: D = {s_1, …, s_N}, where each s_i is a sequence. A sequence is an ordered list of sets of items. [Example sequence and table of input sequences, indexed by Id: not preserved in the transcript.]

Basic Definitions. An episode in D is an acyclic directed graph indicating a partial order between items. The support of a poset in D is the number of input sequences that are compatible with it. [Figure: example poset over the items A, B, C, D; it is compatible with the second and third input sequences.]
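The support definition above can be sketched in a few lines. This is only an illustration under stated assumptions: every itemset is a single item, no item repeats within a sequence, "compatible" is read as "all items of the poset occur in the sequence and every edge u -> v has u before v", and the database and poset are made up (the slide's own example did not survive in the transcript). The names `is_compatible` and `support` are hypothetical.

```python
def is_compatible(sequence, vertices, edges):
    """Assumed reading of 'compatible': the sequence contains every item of the
    poset, and each edge (u, v) has u occurring before v."""
    position = {item: i for i, item in enumerate(sequence)}
    if not all(v in position for v in vertices):
        return False
    return all(position[u] < position[v] for u, v in edges)


def support(database, vertices, edges):
    """Number of input sequences that are compatible with the poset."""
    return sum(is_compatible(s, vertices, edges) for s in database)


# Hypothetical toy data, not the data from the slides.
database = ["ABCD", "ACBD", "BDAC"]
vertices = {"A", "B", "C", "D"}
edges = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}

print(support(database, vertices, edges))
```

In this toy database the poset is compatible with the first two sequences only, because the third places B before A.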

Problem Formulation. Goal: to identify posets and their support (alternatively, only those posets whose support is over a user-specified minimum threshold). Problem: many partial orders are redundant. For example, both P and P' are compatible with the same input sequences, but P is more "informative" than P'. [Figure: poset P over the items A, B, C, D and poset P' over the items A, B, C, with P' ⪯ P.]

Problem Formulation. If P' ⪯ P, we say that P is more specific than P'. The specificity relation is different from the classical inclusion of episodes. Goal redefined: to identify the most specific partial orders among those occurring in the same input sequences (alternatively, with support over a minimum threshold). [Figure: posets over the items A, B, C, D, including parallel (||) arrangements; details not preserved in the transcript.]

Example. [Figure: posets over the items A, B, C, D, each annotated with the input sequences where the poset is the most specific; the annotations read "1,2,3", "2,3" and "1," (truncated in the transcript).]

Motivation. Ordering relationships are useful in many domains: web mining, monitoring of processes, e-commerce... The most specific episodes give a general view of D, summarizing all the input sequences without redundancies.

Addressing the Problem. Observation: identifying such structures directly from the data is a complex task (the specificity relation is difficult to calculate). Our proposal: construct hybrid episodes out of their maximal paths; that is, find those subsequences in D that identify the maximal paths of the final desired episodes. [Figure: poset over the items A, B, C, D with its two maximal paths; the paths are not preserved in the transcript.]

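Our Proposal. [Figure: the example posets over the items A, B, C, D, together with the set of all sequences identifying their maximal paths; the sequences are not preserved in the transcript.] What are these sequences?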

Result 1. Theorem: sequences identifying maximal paths of the most specific posets are a particular case of so-called stable sequences. Stable sequences are maximal among those having the same number of occurrences (support) in D: {s | ∄ s' s.t. s ⊂ s' and support(s) = support(s')}. [Example sequences not preserved in the transcript: the first is stable; the second is not stable because it is contained in a sequence that has the same support.] Many algorithms exist for mining stable sequences: CloSpan, BIDE, TSP...
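To make the stability condition concrete, here is a brute-force sketch. It is not CloSpan, BIDE or TSP; it enumerates every subsequence, so it only suits tiny inputs. It assumes singleton itemsets (each sequence is a string of items) and a made-up database; `is_subsequence`, `support` and `stable_sequences` are hypothetical names.

```python
from itertools import combinations


def is_subsequence(small, big):
    """True if the items of `small` occur in `big` in the same order."""
    remaining = iter(big)
    return all(item in remaining for item in small)


def support(seq, database):
    """Number of input sequences that contain `seq` as a subsequence."""
    return sum(is_subsequence(seq, s) for s in database)


def stable_sequences(database):
    """Map each stable sequence to its support (exhaustive enumeration)."""
    candidates = set()
    for s in database:
        for k in range(1, len(s) + 1):
            for idx in combinations(range(len(s)), k):
                candidates.add("".join(s[i] for i in idx))
    sup = {c: support(c, database) for c in candidates}
    # Stable: no proper supersequence has the same support.
    return {
        c: n
        for c, n in sup.items()
        if not any(c != d and sup[d] == n and is_subsequence(c, d) for d in candidates)
    }


# Hypothetical toy data, not the data from the slides.
database = ["ABC", "ACB", "BAC"]
print(stable_sequences(database))
```

In this toy database "A" has support 3 but is not stable, because "AC" contains it and also has support 3; "B" is stable because every longer sequence containing it has lower support.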

How to construct posets out of Stable Sequences? [Figure: the stable sequences alongside the example posets over the items A, B, C, D; the sequences are not preserved in the transcript.] Some stable sequences may identify maximal paths of different partial orders.

Result 2. We characterize a closure operator Δ working on sets of sequences. A closure operator satisfies the three basic closure axioms: Monotonicity, Extensivity, and Idempotency. Broadly: given any set of sequences S, Δ(S) returns the set of maximal sequences that are present in the same input sequences where S is contained. [Example of the form Δ({s}) = {s', s''}; the sequences are not preserved in the transcript.]
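The broad description of Δ above can be turned into a small brute-force sketch. It follows only the informal wording on the slide, not a formal definition from the talk, and assumes singleton itemsets and a made-up database; `closure` and `is_subsequence` are hypothetical names.

```python
from itertools import combinations


def is_subsequence(small, big):
    """True if the items of `small` occur in `big` in the same order."""
    remaining = iter(big)
    return all(item in remaining for item in small)


def closure(S, database):
    """Maximal sequences present in every input sequence that contains all of S."""
    supporting = [s for s in database if all(is_subsequence(x, s) for x in S)]
    if not supporting:
        return set()
    first = supporting[0]
    # Every common subsequence is in particular a subsequence of `first`.
    common = set()
    for k in range(1, len(first) + 1):
        for idx in combinations(range(len(first)), k):
            c = "".join(first[i] for i in idx)
            if all(is_subsequence(c, s) for s in supporting):
                common.add(c)
    # Keep only the maximal ones.
    return {c for c in common if not any(c != d and is_subsequence(c, d) for d in common)}


# Hypothetical toy data, not the data from the slides.
database = ["ABC", "ACB", "BAC"]
print(closure({"AB"}, database))
```

On this toy database, applying `closure` to its own output for `{"AB"}` returns the same set, which is the behaviour the idempotency axiom asks for.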

Result 2. A set of sequences S is closed if it coincides with its closure: Δ(S) = S. Lemma: individually, the sequences in a closed set S are stable sequences. [Example of the form Δ({s, s'}) = {s, s'}, where both s and s' are stable sequences; the sequences are not preserved in the transcript.]

Result 2. Theorem: the sequences in a closed set identify the maximal paths of the same partial order. [Figure: closed sets of sequences matched to partial orders over the items A, B, C, D; the sequences are not preserved in the transcript.]

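Lattice of Closed Sets of Sequences. [Figure: lattice whose nodes are closed sets of sequences, shown next to the corresponding posets over the items A, B, C, D; nodes are annotated with the input-sequence lists "1,2,3", "2,3" and "1," (truncated in the transcript).]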

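Lattice of Closed Partial Orders. [Figure: lattice whose nodes are posets over the items A, B, C, D, annotated with the input-sequence lists "1,2,3", "2,3" and "1," (truncated in the transcript).] Moreover, these posets are closed.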

Formalization. A directed graph is modeled as G = (V, E, l), where V is the set of vertices, E ⊆ V × V is the set of edges, and l is an injective labelling function. A poset is an acyclic directed graph such that the relation on V established by the edges E is reflexive, antisymmetric and transitive. A graph morphism between two graphs G = (V, E, l) and G' = (V', E', l') consists of an injective function h: V → V' that preserves labels and such that (u, v) ∈ E implies (h(u), h(v)) ∈ E'. [Figure: example posets over the items A, B, C, D.]
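The morphism definition above can be checked mechanically. Because labelling is injective, a label-preserving map is forced: each vertex can only go to the vertex carrying the same label. The sketch below assumes a graph is stored as a pair (labels, edges), with `labels` a dict from vertex to label; this representation and the name `find_morphism` are illustrative choices, not from the talk.

```python
def find_morphism(g, g_prime):
    """Return the label-preserving morphism h from g to g_prime as a dict,
    or None if no morphism exists."""
    labels, edges = g
    labels_p, edges_p = g_prime
    vertex_by_label = {lab: v for v, lab in labels_p.items()}
    h = {}
    for v, lab in labels.items():
        if lab not in vertex_by_label:
            return None  # no vertex with this label in the target graph
        h[v] = vertex_by_label[lab]
    # h is injective because the labelling of g is injective.
    if all((h[u], h[v]) in edges_p for u, v in edges):
        return h
    return None


# Hypothetical example graphs.
small = ({0: "B", 1: "C"}, {(0, 1)})
big = ({10: "A", 11: "B", 12: "C", 13: "D"}, {(10, 11), (11, 12), (12, 13)})

print(find_morphism(small, big))
print(find_morphism(big, small))
```

Here the small graph maps into the big one, while the reverse direction fails because the labels A and D have no counterpart in the small graph.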

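Result 3. Coproduct of a family of graphs G1, …, Gn. [Figure: diagram relating G1, …, Gn to the graphs G and G'.] Example: [Figure: the graphs labelled BC and AC and their coproduct over the items A, B, C.]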

Result 3. [Figure: coproduct of the graphs labelled BCD and AD, giving a poset over the items A, B, C, D.] Theorem: A lattice of stable sequences can be transformed into a lattice of closed posets by rewriting each node via coproduct transformations.
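One way to picture the coproduct step is as a label-wise union of path graphs, sketched below. This is an interpretation rather than the construction from the talk: it assumes no item repeats, reads each sequence as a path with edges between consecutive items, merges vertices that carry the same label, and adds no transitive edges. The inputs "BCD" and "AD" echo the labels that survive on the slide; the resulting edges are my reading of them. `path_graph` and `coproduct` are hypothetical names.

```python
def path_graph(sequence):
    """Graph of a sequence without repeated items: one vertex per item,
    one edge between each pair of consecutive items."""
    vertices = set(sequence)
    edges = set(zip(sequence, sequence[1:]))
    return vertices, edges


def coproduct(graphs):
    """Assumed reading of the coproduct: union of vertices (identified by
    label) and union of edges."""
    vertices, edges = set(), set()
    for v, e in graphs:
        vertices |= v
        edges |= e
    return vertices, edges


poset = coproduct([path_graph("BCD"), path_graph("AD")])
print(poset)
```

Under this reading, every input path maps into the result by sending each item to itself, and the result contains nothing beyond what the inputs require.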

Conclusions. We identify partial orders in sequential data by: (1) mining stable sequences and their support (CloSpan, BIDE, …); (2) grouping stable sequences into closed sets of sequences, according to the operator Δ; (3) obtaining the final episodes from those groupings. This transformation represents an important algorithmic simplification. Formally, when items are not repeated, this transformation can be expressed as coproduct transformations.