Rule-based Management of Schema Changes at ETL sources

G. Papastefanatos (1), P. Vassiliadis (2), A. Simitsis (3), T. Sellis (1,4), Y. Vassiliou (1)
(1) National Technical University of Athens, Athens, Hellas (Greece), {gpapas, yv}@dblab.ece.ntua.gr
(2) University of Ioannina, Ioannina, Hellas (Greece), pvassil@cs.uoi.gr
(3) HP Labs, Palo Alto, California, USA, alkis@hp.com
(4) Institute for the Management of Information Systems (Greece), timos@imis.athena-innovation.gr
MEDWa ‘09, Riga, September 2009

Outline
– Motivation
– Graph-based representation of ETL processes
– Regulating ETL Evolution
– Hecataeus Internals
– Conclusions

Data Warehouse Environment

Data Warehouse Schema Evolution
Data warehouses are evolving environments, e.g.:
– A dimension is removed or renamed
– The structure of a dimension table is updated
– A fact table is completely decoupled from a dimension
– The measures of a fact table change
– An ETL source is modified, etc.

Evolving ETL sources…
Schema changes on the sources of ETL processes: design constructs are
– Added, Removed, Modified
ETL processes are affected:
– Syntactically, i.e., they become invalid
– Semantically, i.e., they must conform to the new source database semantics
Adaptation of ETL flows is
– a time-consuming task,
– handled manually, in most cases, by the administrators/developers
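
To make "syntactically affected" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the simplified column name Emp_no (standing in for Emp#), and the helper is_valid are assumptions made for illustration; they are not part of the presented framework. The sketch only shows that a query that ran before a source attribute was removed no longer compiles afterwards.

```python
import sqlite3

# Hypothetical source schema; Emp_no stands in for Emp# to avoid quoting.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (Emp_no INTEGER, Name TEXT)")
conn.execute("INSERT INTO EMP VALUES (1, 'Ann')")

# A query that some ETL activity is assumed to run against the source.
ETL_QUERY = "SELECT Emp_no, Name FROM EMP"


def is_valid(connection, query):
    """Return True if the query still runs against the current schema."""
    try:
        connection.execute(query).fetchall()
        return True
    except sqlite3.OperationalError:
        return False


valid_before = is_valid(conn, ETL_QUERY)

# Simulate the source change "attribute Name is removed" by rebuilding the table.
conn.execute("DROP TABLE EMP")
conn.execute("CREATE TABLE EMP (Emp_no INTEGER)")

valid_after = is_valid(conn, ETL_QUERY)
print(valid_before, valid_after)
```

A semantic effect would not be caught this way: the query would still run but could return data that no longer means what the flow expects.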

We would like to know...
– What part of the process is affected, and how, if e.g. an attribute is deleted?
– Can we predict and handle the impact of changes?
– To what extent can readjustment be automated?

Hecataeus Framework
– Mechanism for performing what-if analysis for potential changes of ETL sources
– Graph-based representation of ETL workflows
– Annotation of the graph with rules for adapting ETL processes to source schema evolution
– Evolution events are mapped to changes on the graph constructs
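
The what-if idea above can be sketched as a graph traversal: an event hits one node, and everything that depends on it, directly or transitively, is potentially affected. The node names, the CONSUMERS mapping, and the function affected_by are hypothetical and chosen only for illustration; this is not the Hecataeus implementation.

```python
from collections import deque

# Hypothetical dependency map: node -> nodes that read from it.
CONSUMERS = {
    "EMP.Name": ["Activity1.Name"],
    "EMP.Emp#": ["Activity1.Emp#"],
    "Activity1.Name": ["DW.EMP_DIM.Name"],
    "Activity1.Emp#": ["DW.EMP_DIM.Emp#"],
}


def affected_by(changed_node, consumers):
    """Breadth-first walk from the changed node to all of its dependents."""
    affected = []
    visited = {changed_node}
    queue = deque([changed_node])
    while queue:
        node = queue.popleft()
        for dependent in consumers.get(node, []):
            if dependent not in visited:
                visited.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected


# What if attribute EMP.Name is deleted at the source?
impact = affected_by("EMP.Name", CONSUMERS)
print(impact)
```

Only the Name path is reported; the Emp# path is untouched, which is the kind of answer the question "what part of the process is affected" asks for.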

ETL Workflow representation

Query representation
Q: SELECT EMP.Emp#, Sum(WORKS.Hours) as T_Hours
   FROM EMP, WORKS
   WHERE EMP.Emp# = WORKS.Emp#
   GROUP BY EMP.Emp#
Figure labels: Join, GB
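
One way to picture query Q as a graph is sketched below. The node kinds and the edge labels ("schema", "from", "map-select", "operand", "where", "group-by") are assumptions made for this sketch, as is the Graph class itself; the slide only shows that the join and the group-by (GB) appear as graph constructs.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node name -> node kind
    edges: list = field(default_factory=list)  # (source, target, label)

    def add_node(self, name, kind):
        self.nodes[name] = kind

    def add_edge(self, source, target, label):
        self.edges.append((source, target, label))

    def targets(self, source, label):
        """Targets of all edges leaving `source` with the given label."""
        return [t for s, t, l in self.edges if s == source and l == label]


g = Graph()
for name, kind in [
    ("EMP", "relation"), ("WORKS", "relation"),
    ("EMP.Emp#", "attribute"), ("WORKS.Emp#", "attribute"),
    ("WORKS.Hours", "attribute"),
    ("Q", "query"), ("Q.Emp#", "attribute"), ("Q.T_Hours", "attribute"),
    ("Join", "condition"), ("SUM", "function"), ("GB", "group-by"),
]:
    g.add_node(name, kind)

# Relations and the query own their attributes.
for owner, attr in [("EMP", "EMP.Emp#"), ("WORKS", "WORKS.Emp#"),
                    ("WORKS", "WORKS.Hours"), ("Q", "Q.Emp#"), ("Q", "Q.T_Hours")]:
    g.add_edge(owner, attr, "schema")

g.add_edge("Q", "EMP", "from")
g.add_edge("Q", "WORKS", "from")
g.add_edge("Q.Emp#", "EMP.Emp#", "map-select")
g.add_edge("Q.T_Hours", "SUM", "map-select")
g.add_edge("SUM", "WORKS.Hours", "operand")
g.add_edge("Q", "Join", "where")            # EMP.Emp# = WORKS.Emp#
g.add_edge("Join", "EMP.Emp#", "operand")
g.add_edge("Join", "WORKS.Emp#", "operand")
g.add_edge("Q", "GB", "group-by")
g.add_edge("GB", "EMP.Emp#", "group-by")

print(g.targets("Q", "from"))
```

With every clause turned into nodes and edges, a change to, say, WORKS.Hours can be traced to SUM and from there to Q.T_Hours.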

Graph Annotation with rules
According to the prevailing policy, the proper action is taken
Figure label: graph evolution
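
A minimal sketch of how a prevailing policy might be resolved for an event on a node. The slide does not name the policies or the resolution order, so the three policy names (propagate, block, prompt), the rule table, the fallback to the enclosing node, and the default are all assumptions for illustration.

```python
from enum import Enum


class Policy(Enum):
    PROPAGATE = "propagate"  # adapt the construct to the change
    BLOCK = "block"          # keep the construct as it was
    PROMPT = "prompt"        # ask the administrator


# Hypothetical annotations: (node, event) -> policy.
RULES = {
    ("Q", "ADD_ATTRIBUTE"): Policy.PROPAGATE,
    ("Q.Name", "DELETE_ATTRIBUTE"): Policy.BLOCK,
}

# Hypothetical containment: attribute node -> enclosing query node.
PARENT_OF = {"Q.Name": "Q", "Q.Emp#": "Q"}

DEFAULT_POLICY = Policy.PROMPT


def prevailing_policy(node, event, parent_of=PARENT_OF, rules=RULES):
    """Use the rule on the node itself, else on an enclosing node, else the default."""
    current = node
    while current is not None:
        if (current, event) in rules:
            return rules[(current, event)]
        current = parent_of.get(current)
    return DEFAULT_POLICY


print(prevailing_policy("Q.Name", "DELETE_ATTRIBUTE"))
print(prevailing_policy("Q.Emp#", "ADD_ATTRIBUTE"))
print(prevailing_policy("Q.Emp#", "RENAME_ATTRIBUTE"))
```

In this sketch the most specific annotation wins; other resolution orders are equally possible and would only change the loop.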

Example
Event: Add attribute Phone to relation EMP
Q (before): SELECT EMP.Emp#, EMP.Name FROM EMP
Q (after): SELECT EMP.Emp#, EMP.Name, Phone FROM EMP
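
The example can be replayed with a small sketch. SelectQuery and on_add_attribute are hypothetical names, and the string policy "propagate" is an assumed label for "adapt the query to the change". Unlike the slide, the sketch emits the new column qualified as EMP.Phone.

```python
from dataclasses import dataclass


@dataclass
class SelectQuery:
    relation: str
    attributes: list

    def to_sql(self):
        cols = ", ".join(f"{self.relation}.{a}" for a in self.attributes)
        return f"SELECT {cols} FROM {self.relation}"


def on_add_attribute(query, relation, attribute, policy):
    """Return the query after an add-attribute event, honouring the policy."""
    if relation != query.relation or policy != "propagate":
        return query  # other relation, or the change is not propagated
    if attribute in query.attributes:
        return query  # already selected, nothing to do
    return SelectQuery(query.relation, query.attributes + [attribute])


q = SelectQuery("EMP", ["Emp#", "Name"])
propagated = on_add_attribute(q, "EMP", "Phone", "propagate")
blocked = on_add_attribute(q, "EMP", "Phone", "block")

print(propagated.to_sql())
print(blocked.to_sql())
```

The original query object is never mutated, so the before and after versions can be compared side by side in a what-if scenario.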

System architecture
Diagram labels (in extraction order): DDL files SQL scripts DB Catalog Parser Create DB Schema Evolution Manager Workload representation Evolution Semantics Validate Workload Graph Viewer DB Schema representation XML, jpeg Import/Export Scenarios Graph Visualization Metric Manager

Evolution Manager Architecture

Research in DB Evolution
DB Schema Evolution
– OODB evolution
– Schema versioning
DW Schema Evolution
– Taxonomy of evolution events
– Versioning
– Materialized Views Evolution
– View adaptation & synchronization
Evolution with respect to Model Mappings

Summarizing
– The problem: adaptation of ETL workflows to evolvable data sources
– Graph-based representation of ETL activities
– Graph enrichment with semantics for evolution events
– Graph annotation with rules for handling evolution events a priori
– Hecataeus: a framework for performing and evaluating evolution scenarios in DW environments

Thank you...
Hecataeus: A tool for visualizing and performing what-if analysis for evolution scenarios
http://www.cs.uoi.gr/~pvassil/projects/hecataeus/