A Relational Multi-Schema Data Model and Query Language for Full Support of Schema Versioning
Fabio Grandi
Alma Mater Studiorum – Università di Bologna
SEBD 2002

Introduction
  Schema Evolution: automatic recovery of extant data after schema changes
  Schema Versioning: maintenance of past schemas (e.g. for legacy application support)
  Two design options for the implementation of data repositories

Design Options for Extant Data
  Single-pool solution (Roddick, Roddick & Snodgrass [TSQL2 ’95])
    All schema versions are associated with a unique shared data repository
  Multi-pool solution (De Castro, Grandi & Scalas [TDB ’95, IS ’97])
    Each schema version is associated with a different private data repository

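Single-pool - Example
  create schema SV1
  create table R(A int, B int)
    Shared pool: R(A,B), visible as SV1: R(A,B)
  create schema SV2
  alter table R add column C int
    Shared pool: R(A,B,C), visible as SV1: R(A,B) and SV2: R(A,B,C)
  [Figure: the shared data pool before and after the change, with sample A values 123 and 125]
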
Single-pool - Example (2)
  create schema SV3
  alter table R drop column B
    Shared pool: R(A,B,C), visible as SV1: R(A,B), SV2: R(A,B,C) and SV3: R(A,C)
  create schema SV4
  alter table R alter column A char
    Shared pool: R(A’,B,C,A’’), mapped as
      SV1: R(A,B)   <= R(A’,B)
      SV2: R(A,B,C) <= R(A’,B,C)
      SV3: R(A,C)   <= R(A’,C)
      SV4: R(A,C)   <= R(A’’,C)
  [Figure: the shared data pool after each change, with sample values A’ = 125 and A’’ = xy]

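The single-pool example can be emulated with ordinary views over one completed table. The following is a minimal sketch using Python's built-in sqlite3 module, not the implementation discussed in the talk. The names R_pool, A1 (for A’), A2 (for A’’) and the SVn_R views are assumptions made for illustration, and the B and C values are invented; only 125 and xy come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical completed-schema table: one shared pool that keeps every
# column ever defined for R. A1 stands for A' (int), A2 for A'' (char).
conn.execute("CREATE TABLE R_pool (A1 INTEGER, B INTEGER, C INTEGER, A2 TEXT)")
# 125 and 'xy' appear on the slide; the B and C values are invented.
conn.execute("INSERT INTO R_pool VALUES (125, 7, 33, 'xy')")

# Each schema version is a plain view over the shared pool.
conn.executescript("""
    CREATE VIEW SV1_R AS SELECT A1 AS A, B    FROM R_pool;
    CREATE VIEW SV2_R AS SELECT A1 AS A, B, C FROM R_pool;
    CREATE VIEW SV3_R AS SELECT A1 AS A, C    FROM R_pool;
    CREATE VIEW SV4_R AS SELECT A2 AS A, C    FROM R_pool;
""")


def columns(view):
    """Return the column names a schema version exposes."""
    cur = conn.execute(f"SELECT * FROM {view}")
    return [d[0] for d in cur.description]


sv3_rows = conn.execute("SELECT * FROM SV3_R").fetchall()
sv4_rows = conn.execute("SELECT * FROM SV4_R").fetchall()
```

Because all views read the same stored tuple, SV3 and SV4 necessarily agree on C. This is the sense in which the single-pool solution shares one extension among all schema versions.
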
Multi-pool - Example
  Data pool   Schema version   Sample tuple
  DP1         SV1: R(A,B)      A = 123
  DP2         SV2: R(A,B,C)    A = 125, C = 33
  DP3         SV3: R(A,C)      A = 125, C = 55
  DP4         SV4: R(A,C)      A = xy, C = 44

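The multi-pool layout can be sketched in the same way, with one private table per schema version. This is an illustrative sketch only: the table names DPn_R and the POOL_OF mapping are assumptions, and B is stored as NULL because the slide shows no B values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One private data pool per schema version (table names are hypothetical).
conn.executescript("""
    CREATE TABLE DP1_R (A INTEGER, B INTEGER);
    CREATE TABLE DP2_R (A INTEGER, B INTEGER, C INTEGER);
    CREATE TABLE DP3_R (A INTEGER, C INTEGER);
    CREATE TABLE DP4_R (A TEXT, C INTEGER);
    INSERT INTO DP1_R VALUES (123, NULL);
    INSERT INTO DP2_R VALUES (125, NULL, 33);
    INSERT INTO DP3_R VALUES (125, 55);
    INSERT INTO DP4_R VALUES ('xy', 44);
""")

# Assumed mapping from schema version to its private pool.
POOL_OF = {"SV1": "DP1_R", "SV2": "DP2_R", "SV3": "DP3_R", "SV4": "DP4_R"}


def extension(sv):
    """Return the tuples of R as stored in the pool of schema version sv."""
    return conn.execute(f"SELECT * FROM {POOL_OF[sv]}").fetchall()


# An update issued through SV4 touches only DP4.
conn.execute(f"UPDATE {POOL_OF['SV4']} SET C = 45")
```

Unlike the view-based single-pool sketch, the same conceptual tuple can carry different C values in SV2, SV3 and SV4, and a change made through one version leaves the others untouched.
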
Single-pool vs Multi-pool
  Single-pool:
    Simple to implement with the completed-schema solution
    Reduces to a rather “trivial” view mechanism
  Multi-pool:
    More flexible and more complex
    Thus potentially more useful, but...
    Is it feasible? And what is it for?

Multi-pool: what is it for?
  From a conceptual perspective, different schema versions correspond to different points of view
  [Figure: schema versions SV1, SV2, SV3 and SV4 as viewpoints on the same Application Domain]

Schema Versions as Viewpoints
  [Figure: IT Market, US Market and EU Market as viewpoints]
  Different viewpoints on the same data involve
    different structures at the intensional level (schema)
    different values at the extensional level (data)

A Concrete Example
  The “Lark 2.5” car model in the global market (different viewpoints involve different marketing strategies)
    US Market: Lark 2.5, 26 K$
    EU Market: Lark 2.5, 32 K€, Euro4
    IT Market: Lark GT, 31 K€, Euro3

A Multi-schema Query Language
  The full potential of a multi-pool schema versioning approach can be exploited only through a query language that lets users and developers express multi-schema queries, that is, queries involving data belonging to different data pools
  MSQL: a multi-schema SQL extension

The MSQL Query Language
  MSQL basic construct: contextualization of names and data references to schema versions
    [SV: X] denotes the conceptual entity named “X” in schema version “SV”
    SV: X denotes the extension, with respect to schema version “SV”, of the conceptual entity “X”
  The two mechanisms can be combined (e.g. as in SVi: [SVj: X])

The MSQL Query Language (2)
  Examples:
    select * from [SVj: R]
      retrieves the table called R in SVj
    select * from SVi: R
      retrieves the contents of R with respect to SVi
    select * from SVi: [SVj: R]
      retrieves the contents, with respect to SVi, of the table called R in SVj

The MSQL Query Language (3)
  Examples:
    select [SVj: A] from R
      retrieves the column of R called A in SVj
    select SVi: A from R
      retrieves the contents of R.A with respect to SVi
    select SVi: [SVj: A] from R
      retrieves the contents, with respect to SVi, of the column of R called A in SVj

The MSQL Query Language (4)
  Example:
    select SVi: [SVj: A] from SVk: [SVl: R]
      retrieves the values, with respect to SVi, of the column called A in SVj, for the tuples that belong, with respect to SVk, to the table called R in SVl

The MSQL Query Language (5)
  Example:
    set schema USMKT;
    select NAME
    from EUMKT:CAR as EC, ITMKT:CAR as IC
    where IC.PRICE < EC.PRICE
      retrieves the American names of all cars also sold in Europe whose Italian price is lower than the price applied in the rest of Europe

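The intended result of the car query can be approximated in plain SQL if each market keeps its own CAR table and the same car carries the same tuple identifier in every pool. That shared TID column, the per-market table names and the explicit joins are assumptions of this sketch, not something the MSQL query states. The first row uses the Lark 2.5 figures from the concrete example slide; the second row is invented so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One CAR table per market; TID is an assumed shared tuple identifier.
conn.executescript("""
    CREATE TABLE USMKT_CAR (TID INTEGER, NAME TEXT, PRICE INTEGER);
    CREATE TABLE EUMKT_CAR (TID INTEGER, NAME TEXT, PRICE INTEGER);
    CREATE TABLE ITMKT_CAR (TID INTEGER, NAME TEXT, PRICE INTEGER);

    -- TID 1: names and prices (in thousands) from the Lark 2.5 slide
    INSERT INTO USMKT_CAR VALUES (1, 'Lark 2.5', 26);
    INSERT INTO EUMKT_CAR VALUES (1, 'Lark 2.5', 32);
    INSERT INTO ITMKT_CAR VALUES (1, 'Lark GT', 31);

    -- TID 2: invented row, more expensive in Italy than in the rest of Europe
    INSERT INTO USMKT_CAR VALUES (2, 'Finch 1.6', 15);
    INSERT INTO EUMKT_CAR VALUES (2, 'Finch 1.6', 17);
    INSERT INTO ITMKT_CAR VALUES (2, 'Finch 1.6', 18);
""")

# NAME is read from the US pool (the effect of "set schema USMKT"),
# while the price comparison uses the EU and IT pools.
names = [row[0] for row in conn.execute("""
    SELECT U.NAME
    FROM USMKT_CAR AS U
    JOIN EUMKT_CAR AS EC ON EC.TID = U.TID
    JOIN ITMKT_CAR AS IC ON IC.TID = U.TID
    WHERE IC.PRICE < EC.PRICE
    ORDER BY U.TID
""")]
```

With these rows only the Lark qualifies (31 < 32), and it is reported under its US name even though the comparison never reads the US pool's price.
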
The Logical Storage Model
  Provides a simple implementation scheme for a multi-pool schema versioning database (on top of a traditional relational DBMS)
  Is used to formally define the semantics of the MSQL language
  Representation of the SVi data pool:
    RName_i(RID, Name)
    AName_i(AID, Name)
    RSchema_i(RID, AID)
    RExt_i(RID, TID)
    AExt_i(TID, AID, Value)

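A minimal sketch of how the five relations could be laid out on a relational DBMS, and how a contextualized reference such as SVi: [SVj: R] might be resolved against them. The slide gives only the relation signatures, so the following are assumptions of this sketch: RIDs, AIDs and TIDs are shared across pools, the name R is resolved in the naming version and the tuples are read from the extension version, and the relation is called AUTO in the Italian pool purely for illustration.

```python
import sqlite3


def create_pool(conn, sv):
    """Create the five logical-storage relations for schema version sv."""
    conn.executescript(f"""
        CREATE TABLE RName_{sv}(RID INTEGER, Name TEXT);
        CREATE TABLE AName_{sv}(AID INTEGER, Name TEXT);
        CREATE TABLE RSchema_{sv}(RID INTEGER, AID INTEGER);
        CREATE TABLE RExt_{sv}(RID INTEGER, TID INTEGER);
        CREATE TABLE AExt_{sv}(TID INTEGER, AID INTEGER, Value);
    """)


def extension(conn, sv_ext, sv_name, rel_name):
    """Emulate  sv_ext: [sv_name: rel_name]  under the stated assumptions."""
    # Step 1: resolve the relation name in the naming version.
    row = conn.execute(
        f"SELECT RID FROM RName_{sv_name} WHERE Name = ?", (rel_name,)
    ).fetchone()
    if row is None:
        return {}
    # Step 2: read schema, tuples and values from the extension version.
    rows = conn.execute(f"""
        SELECT e.TID, n.Name, v.Value
        FROM RExt_{sv_ext} AS e
        JOIN RSchema_{sv_ext} AS s ON s.RID = e.RID
        JOIN AName_{sv_ext} AS n ON n.AID = s.AID
        JOIN AExt_{sv_ext} AS v ON v.TID = e.TID AND v.AID = s.AID
        WHERE e.RID = ?
    """, (row[0],)).fetchall()
    result = {}
    for tid, name, value in rows:
        result.setdefault(tid, {})[name] = value
    return result


conn = sqlite3.connect(":memory:")
# Two pools holding the same car (RID 1, TID 1) under different viewpoints.
for sv, rel, name, price in [("USMKT", "CAR", "Lark 2.5", 26),
                             ("ITMKT", "AUTO", "Lark GT", 31)]:
    create_pool(conn, sv)
    conn.execute(f"INSERT INTO RName_{sv} VALUES (1, ?)", (rel,))
    conn.executemany(f"INSERT INTO AName_{sv} VALUES (?, ?)",
                     [(1, "NAME"), (2, "PRICE")])
    conn.executemany(f"INSERT INTO RSchema_{sv} VALUES (1, ?)", [(1,), (2,)])
    conn.execute(f"INSERT INTO RExt_{sv} VALUES (1, 1)")
    conn.executemany(f"INSERT INTO AExt_{sv} VALUES (1, ?, ?)",
                     [(1, name), (2, price)])

us_view_of_auto = extension(conn, "USMKT", "ITMKT", "AUTO")
it_view_of_car = extension(conn, "ITMKT", "USMKT", "CAR")
```

Here us_view_of_auto holds the US data for the relation that the Italian pool calls AUTO, and it_view_of_car holds the Italian data for the relation that the US pool calls CAR, which shows name resolution and extension lookup being driven by different schema versions.
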
Conclusions
  Defense of the multi-pool approach (from a conceptual design standpoint)
  Introduction of the Multi-schema Query Language (MSQL)
  Introduction of the Logical Storage Model
Future Work
  Development of implementation strategies
  Study of formal properties (e.g. correctness of a schema evolution process under the multi-pool approach)