PSYCHO: A Prototype System for Pattern Management
Barbara Catania, Anna Maddalena, Maurizio Mazza
DISI - University of Genoa, Italy
VLDB ’05 – Trondheim (Norway)

What is a pattern?
Any compact, semantically rich representation of raw data

PSYCHO Aim
A customizable system for generating, representing, and manipulating heterogeneous, possibly user-defined patterns

Related Work
Theoretical proposals
–Inductive databases (CINQ project): data mining patterns, no combination
–3World model: data mining patterns represented as linear constraints, combination supported
Standards: only data mining patterns, no combination, no synchronization
–representation: PMML, CWM-DM
–manipulation: Java Data Mining, SQL/MM DM

PSYCHO added value
Powerful pattern model
–user-defined patterns
–pattern hierarchies
–relationship with source data
Pattern Manipulation Language (PML)
–extraction/direct insertion
–validity, synchronization
Pattern Query Language (PQL)
–selection
–combination of patterns of possibly different types
–combination of patterns and data (cross-over queries)

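The validity, synchronization, and cross-over operations listed above can be illustrated with a small Python sketch. It is not PML or PQL syntax, which the slides do not show. It assumes an association rule as the pattern, threshold-based validity, and synchronization understood as recomputing measures against the current source data. All names (`AssociationRule`, `measure`, `synchronize`, `is_valid`, `cross_over`) are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AssociationRule:
    """Illustrative pattern: body -> head, with two quality measures."""
    body: frozenset
    head: frozenset
    support: float = 0.0
    confidence: float = 0.0


def measure(rule, data):
    """Compute (support, confidence) of `rule` over the source transactions."""
    items = rule.body | rule.head
    n_body = sum(1 for t in data if rule.body <= t)
    n_both = sum(1 for t in data if items <= t)
    support = n_both / len(data) if data else 0.0
    confidence = n_both / n_body if n_body else 0.0
    return support, confidence


def synchronize(rule, data):
    """Return a copy of `rule` whose measures reflect the current data."""
    support, confidence = measure(rule, data)
    return replace(rule, support=support, confidence=confidence)


def is_valid(rule, min_support, min_confidence):
    """Assumed validity check: stored measures must meet both thresholds."""
    return rule.support >= min_support and rule.confidence >= min_confidence


def cross_over(rule, data):
    """Cross-over query: the raw transactions that the rule represents."""
    items = rule.body | rule.head
    return [t for t in data if items <= t]


# Hypothetical source data: each transaction is a set of items.
data = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread"}),
    frozenset({"eggs"}),
]
rule = synchronize(AssociationRule(frozenset({"bread"}), frozenset({"milk"})), data)
print(rule.support, rule.confidence)
print(is_valid(rule, min_support=0.4, min_confidence=0.6))
print(len(cross_over(rule, data)))
```

Keeping the measures on the pattern and recomputing them on request mirrors the idea that a stored pattern can drift out of date when its source data changes.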
PSYCHO pattern model
PANDA EU project (IST )
[Diagram: pattern type (name, structure schema, source schema, measure schema, formula, validity period schema); pattern (PID, structure, source, measure, expression, validity period); class (name); mining function (name, code); measure function (name, code)]

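One way to read the pattern model is as a pair of record types: a pattern type that fixes the schemas and the formula, and pattern instances that fill them in. The following Python sketch works under that reading. The field types, the `conforms`, `represents`, and `is_temporally_valid` helpers, and the interval-cluster example are all assumptions made for illustration. Classes, mining functions, and measure functions are left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Dict, Tuple


def _matches(values: Dict[str, Any], schema: Dict[str, type]) -> bool:
    """True if `values` has exactly the schema's fields with matching types."""
    return values.keys() == schema.keys() and all(
        isinstance(values[k], t) for k, t in schema.items()
    )


@dataclass
class PatternType:
    name: str
    structure_schema: Dict[str, type]
    source_schema: Dict[str, type]
    measure_schema: Dict[str, type]
    # Assumed role of the formula: decides whether a source item is
    # represented by a given pattern structure.
    formula: Callable[[Dict[str, Any], Dict[str, Any]], bool]


@dataclass
class Pattern:
    pid: int
    ptype: PatternType
    structure: Dict[str, Any]
    source: str  # hypothetical: name of the source dataset
    measure: Dict[str, Any]
    validity_period: Tuple[date, date]

    def conforms(self) -> bool:
        """Check structure and measure against the pattern type's schemas."""
        return _matches(self.structure, self.ptype.structure_schema) and _matches(
            self.measure, self.ptype.measure_schema
        )

    def represents(self, item: Dict[str, Any]) -> bool:
        """Apply the type's formula to this pattern's structure."""
        return self.ptype.formula(self.structure, item)

    def is_temporally_valid(self, on: date) -> bool:
        start, end = self.validity_period
        return start <= on <= end


# Hypothetical user-defined pattern type: a one-dimensional interval cluster.
interval_cluster = PatternType(
    name="IntervalCluster",
    structure_schema={"low": float, "high": float},
    source_schema={"value": float},
    measure_schema={"avg_distance": float},
    formula=lambda s, item: s["low"] <= item["value"] <= s["high"],
)

p = Pattern(
    pid=1,
    ptype=interval_cluster,
    structure={"low": 0.0, "high": 10.0},
    source="measurements",
    measure={"avg_distance": 1.5},
    validity_period=(date(2005, 1, 1), date(2005, 12, 31)),
)
print(p.conforms(), p.represents({"value": 5.0}), p.is_temporally_valid(date(2005, 6, 1)))
```

Separating the type from its instances is what allows new pattern types to be declared without changing the code that stores or checks individual patterns.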
PSYCHO architecture
[Diagram: PBMS with Pattern Base; PBMS Engine (Formula handler, PML interpreter, Query processor, PDL interpreter); Data sources...; Import/Export; User interface]
Technologies: Oracle 10g + Oracle Data Mining; Java; Jasper + SICStus Prolog

Scenarios
Scenario 1
–manipulation and querying capabilities over (heterogeneous) data mining patterns
Scenario 2
–representation and management of pattern hierarchies
Scenario 3
–representation and management of user-defined patterns

Thanks!
Demo session
–Wednesday (today) ,30
–Friday: 9 – 10,30
Internet Site –
