G. Papastefanatos 1, P. Vassiliadis 2, A. Simitsis 3, K. Aggistalis 2, F. Pechlivani 2, Yannis Vassiliou 1 (1) National Technical University of Athens (2) University of Ioannina (3) IBM Almaden Research Center Language Extensions for the Automation of Database Schema Evolution
ICEIS'08, Barcelona, June Outline Motivation Graph-based modeling A language for policy annotation Evaluation Conclusions
ICEIS'08, Barcelona, June Outline Motivation Graph-based modeling A language for policy annotation Evaluation Conclusions
ICEIS'08, Barcelona, June Data Warehouse Environment WWW Act1 Act2 Act3 Act4 Act5
ICEIS'08, Barcelona, June Motivation WWW Act1 Act2 Act3 Act4 Act5
ICEIS'08, Barcelona, June Database Schema Evolution Current database systems are continuously evolving environments, where design constructs are –Added –Removed –Modified Evolution is not handled by current DBMS with an automatic way –Syntactic as well as semantic adaptation of queries and views is a time-consuming task, treated in most of the cases manually Evolution-driven database design is missing
ICEIS'08, Barcelona, June Evolution Effects SW artifacts around the DBMS are affected: –Syntactically – i.e., become invalid –Semantically – i.e., query must conform to the new source database semantics Adaptation of activities, queries and views –time-consuming task –treated in most of the cases manually by the administrators/developers
ICEIS'08, Barcelona, June We would like to know… What part of the process is affected and how if e.g., an attribute is deleted? Can we predict the impact of changes? To what extent can readjustment be automated? Can we perform what-if analysis for potential changes of source configurations?
ICEIS'08, Barcelona, June Outline Motivation Graph-based modeling A language for policy annotation Evaluation Conclusions
ICEIS'08, Barcelona, June Database Schema Evolution – Our approach Mechanism for performing what-if analysis for potential changes of database configurations Graph based representation of database constructs (i.e., relations, views, constraints, queries) Annotation of graph with rules for adapting queries to database schema evolution Evolving databases Queries Database Schema Graph-based modeling for uniform representation Metrics for Evaluating Evolution Design Evolving applications Rules for Handling Evolution
ICEIS'08, Barcelona, June SELECT EMP.Emp#, Sum(WORKS.Hours) as T_Hours FROM EMP, WORKS WHERE EMP.Emp# = WORKS.Emp# GROUP BY EMP.Emp# Graph based representation
ICEIS'08, Barcelona, June Annotating the graph with adaptation rules Set of evolving database constructs: relations attributes constraints 1 Set of potential evolution changes: addition deletion modification 2 Set of reaction policies: propagate block prompt 3
ICEIS'08, Barcelona, June Query Adaptation - Example Annotated Query Graph Event Add attribute Phone to relation EMP Transformed Query Graph Q NameEID Name EID S S EMP S S map-select … ON attribute addition TO EMP THEN propagate Q: SELECT EID, Name FROM EMP Q: SELECT EID, Name, Phone FROM EMP Q NameEID Name EID S S EMP S S map-select … ON attribute addition TO EMP THEN propagate Phone S S map-select
ICEIS'08, Barcelona, June Contributions Based on –a graph-based model for database constructs (relational tables, views, database constraints) and SQL queries we present –a mechanism for the annotation of the graph’s constructs with elements that facilitate what-if analysis and predetermine the reaction to evolution events occurring in the database schema –SQL extensions that enable the implementation of our approach for evolution management Case study
ICEIS'08, Barcelona, June Outline Motivation Graph-based modeling A language for policy annotation Evaluation Conclusions
ICEIS'08, Barcelona, June Annotation of graph constructs with policies {relation, query, view, attribute constraint, condition} specific nodes ON TO THEN {delete update insert} {block propagate prompt}
ICEIS'08, Barcelona, June Global constraints ON DELETE NODE THEN PROPAGATE ON DELETE ATTRIBUTE THEN PROMPT Example: CREATE DATABASE company ON DELETE ATTRIBUTE THEN PROMPT
ICEIS'08, Barcelona, June Top – level constructs Example for relations: CREATE TABLE works (EMP# NUMBER(3), PROJ# NUMBER(3), HOURS NUMBER(5), ON Add Attribute TO works THEN propagate) This statement for relation works allows the addition of attributes and propagates this addition to all queries and views accessing this relation
ICEIS'08, Barcelona, June Top – level constructs Example for views: CREATE VIEW emps-prjs AS SELECT E.Emp#, E.Name, P.Projname FROM Emp E,Works W,Proj P WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj# ON Modify Condition TO emps-prjs THEN block This syntax blocks changes in the WHERE clause of the view definition for view emps-prjs
ICEIS'08, Barcelona, June Top – level constructs Example for queries: Q:SELECT EP.Emp#, EP.Name FROM emps-prjs EP WHERE EP.PRJNAME = ‘Olympic Games’ ON Add Attribute TO emps-prjs THEN block The above syntax blocks the inclusion of added attributes in the underlying view emps-prjs in the select clause of the query.
ICEIS'08, Barcelona, June Specific node annotation CREATE TABLE emp (EMP# NUMBER(3), Name Varchar2(150),..., ON Delete Attribute TO Name THEN block) Q: SELECT E.Emp#, E.Name, P.Projname FROM Emp E,Works W,Proj P WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj# ON Delete Attribute TO Name THEN propagate
ICEIS'08, Barcelona, June Constraints CREATE TABLE emp (EMP# NUMBER(3), Name Varchar2(150), Constraint EMP.PK PRIMARY KEY (EMP#), ON Modify Constraint TO EMP.PK THEN propagate)
ICEIS'08, Barcelona, June Conditions CREATE CONDITION AS CREATE CONDITION Emp_Age_Cond AS AGE>50 CREATE CONDITION Works_Emp_FK AS WORKS.EMP# IN EMP.EMP# CREATE CONDITION Works_Emp_J AS WORKS.EMP#=EMP.EMP# Then, a query SELECT * FROM EMP WHERE AGE_COND would simply use the condition as a macro. Parametric conditions, to allow referring to aliases in SQL queries are straightforward. Reverse eng. of existing code: automatic condition names can be assigned to all the queries.
ICEIS'08, Barcelona, June Outline Motivation Graph-based modeling A language for policy annotation Evaluation Conclusions
ICEIS'08, Barcelona, June Testbed A data mart of an agency for the Greek public sector 6 months of monitoring 52 “queries” (in reports, scripts) over 18 relations When converted to our graph structure: 2500 nodes
ICEIS'08, Barcelona, June Distribution of policies over the annotated nodes Event # of nodes Propagate Block Rename relations 180 Add attributes 6413 Delete Attributes Rename Attributes Domain Modification Condition Modification 021 Total Annotations
ICEIS'08, Barcelona, June Global policy and individual annotations Scope # of operations Annotations Policy Clauses Query scope 4869 Relation Scope Database Scope 362
ICEIS'08, Barcelona, June Outline Motivation Graph-based modeling A language for policy annotation Evaluation Conclusions
ICEIS'08, Barcelona, June Conclusions Coherent framework for propagating potential changes of the database to all the software artifacts around the database based on a model-policy-event scheme Extension to the SQL language specifically tailored for the management of evolution. Limited overhead imposed on both the system and the humans, who design and maintain it. Tested in a real-world scenario of the Greek public sector.
ICEIS'08, Barcelona, June On-going/Future Work Hecataeus: A tool for visualizing and performing what-if analysis for several evolution scenarios. Patterns of evolution sequences
ICEIS'08, Barcelona, June Acknowledgment Information dissemination of this work was supported by the European Union in the framework of the project “Support of Computer Science Studies in the University of Ioannina” of the “Operational Program for Education and Initial Vocational Training” of the 3rd Community Support Framework of the Hellenic Ministry of Education, funded by national sources and by the European Social Fund (ESF).
ICEIS'08, Barcelona, June Gracias ! Sources:
ICEIS'08, Barcelona, June Questions?
ICEIS'08, Barcelona, June Algorithm Propagate changeS (DaWaK 2007) Input: an ETL summary S over a graph G o =(V o,E o ) and an event e Output: a graph G n =(V n,E n ) Variables: a set of events E, and an affected node A Begin dps(S, G o, G n, {e}, A) End dps(S, G n, G o, E, A) { I = Ins_by_policy(affected(E)) D = Del_by_policy(affected(E)) G n = G o – D I E = E–{e} action(affected(E)) if consumer(A) nil for each consumer(A) dps(S,G n,G o,E,consumer (A)) }
ICEIS'08, Barcelona, June Conflict resolution (DaWaK 2007) Graph constructs may have contradictory policies for the same event Policies defined on query graph structures are stronger than policies defined on view graph structures which in turn prevail on policies defined on relation graph structures Rule
ICEIS'08, Barcelona, June Configuration of our setting (DaWaK 2007) 7 ETL processes 53 ETL activities 3 lookup tables 7 source tables 9 target tables
ICEIS'08, Barcelona, June Adapted activities per Event (DaWaK 2007)
ICEIS'08, Barcelona, June Related work DB schema Evolution –Schema Versioning DW schema Evolution Materialized View Evolution Evolution wrt Model Mappings