Published by Amber Hill. Modified over 9 years ago.
1
Don’t Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources
Mary Tork Roth, Peter Schwarz
IBM Almaden
2
Road Map
Motivation
Garlic Overview
Wrapper Architecture
  – Data Definition
  – Query Planning
  – Query Execution
Good, Bad, and Ugly
3
Motivation
“Real Companies”
Heavy investment in legacy
  – Data management wares
  – Application woes
Need an integrated view of heterogeneous data sources
  – Leverage existing query facilities
  – Work around idiosyncrasies
4
Garlic Architecture
[Figure labels: Query Processor, Garlic Metadata, Relational DB, Object DB, Image Archive, Complex Objects, Client, Wrapper]
5
Wrapper Goals
Small start-up cost
  – Wizards are not the only ones writing
Incremental growth
  – Wrappers must be able to evolve
  – Add new sources without disturbing existing ones
Must be able to optimize queries
  – Enable participation, not delegation
6
Wrapper Overview
[Figure: Data Source, Wrapper, Garlic. Remaining labels as extracted: Objects, Method Invocation, Planning, Work Request, Wrapper Plan, Query Plan, Execution, Execution Plan, Iterator]
7
Modeling Data
Object Data Model
  – Interface and Implementation
  – GDL variant of ODMG-ODL
Wrapper assigns IDs to objects
  – OID = IID + key
Methods
  – default accessor methods
  – stub and generic dispatch
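The "OID = IID + key" rule and generic dispatch can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `"IID:key"` string encoding, the `WRAPPERS` registry, the `CountryWrapper` class, and the `get_<attribute>` naming for default accessors are hypothetical and not taken from Garlic itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OID:
    """Object ID built from an implementation ID (IID) and a wrapper-chosen key."""
    iid: str  # names the implementation, and so the wrapper that owns the object
    key: str  # identifies the object inside the data source; opaque to the middleware

    def encode(self) -> str:
        # Assumed encoding: "IID:key". The IID must not contain ':'.
        return f"{self.iid}:{self.key}"

    @staticmethod
    def decode(text: str) -> "OID":
        iid, key = text.split(":", 1)
        return OID(iid, key)


class CountryWrapper:
    """Hypothetical wrapper over an in-memory table keyed by country code."""

    def __init__(self, rows):
        self._rows = rows  # key -> {attribute name: value}

    def invoke(self, key, method, *args):
        # Generic dispatch: the method is named by a string at run time.
        if method.startswith("get_") and not args:
            return self._rows[key][method[len("get_"):]]
        raise NotImplementedError(method)


# Hypothetical registry mapping each IID to the wrapper that implements it.
WRAPPERS = {
    "country_impl": CountryWrapper({"GR": {"name": "Greece", "visa_required": False}}),
}


def call(oid: OID, method: str, *args):
    """Route a method call to the owning wrapper using the IID half of the OID."""
    return WRAPPERS[oid.iid].invoke(oid.key, method, *args)


greece = OID("country_impl", "GR")
print(call(greece, "get_name"))
```

Because the key half is only ever interpreted by the wrapper that issued it, each wrapper is free to use whatever identifies an object in its own source (a primary key, a file name, a URL).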
8
Modeling Data Example
interface Country {
    attribute string name;
    attribute string airlines_served;
    attribute boolean visa_required;
    attribute Image scene;
}
interface Image {
    attribute readonly string file_name;
    double matches(in string file_name);
    void display(in string device_name);
}
9
Query Planning
Like System-R, bottom-up dynamic programming
Wrapper tells what it can do through methods
  – plan_access() for single collections
  – plan_join() for multi-way joins
  – plan_bind() for inner streams of joins
Input: work request
Output: set of plans, cost, cardinalities?
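As a rough illustration of bottom-up dynamic programming over collections, the following Python sketch keeps the cheapest plan for every subset and builds larger subsets from smaller ones. It is a simplification under stated assumptions: `access_cost` stands in for whatever plan_access() would report, `join_cost` stands in for plan_join(), all numbers are made up, and the sketch enumerates every split of a subset, which may be more than the actual optimizer considers.

```python
from itertools import combinations


def best_plan(collections, access_cost, join_cost):
    """Return (cost, plan) for joining all collections, built bottom-up.

    best[S] holds the cheapest (cost, plan) found for the subset S.
    """
    best = {}
    # Level 1: single-collection plans (the role plan_access() plays).
    for c in collections:
        best[frozenset([c])] = (access_cost[c], c)
    # Levels 2..n: combine two smaller, already-solved subsets.
    for size in range(2, len(collections) + 1):
        for subset in combinations(collections, size):
            whole = frozenset(subset)
            for k in range(1, size):
                for left_members in combinations(subset, k):
                    left = frozenset(left_members)
                    right = whole - left
                    cost = best[left][0] + best[right][0] + join_cost(left, right)
                    if whole not in best or cost < best[whole][0]:
                        best[whole] = (cost, (best[left][1], best[right][1]))
    return best[frozenset(collections)]


# Made-up costs for three hypothetical collections.
ACCESS = {"A": 10, "B": 1, "C": 5}


def toy_join_cost(left, right):
    # Placeholder cost: grows with the number of collections on each side.
    return len(left) * len(right)


cost, plan = best_plan(["A", "B", "C"], ACCESS, toy_join_cost)
print(cost, plan)
```

The point of the structure is that each wrapper is only ever asked about small pieces (one collection, or one join of plans it already produced), while the optimizer owns the enumeration.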
10
Single Collections
Work Request
  – Attributes to project upon
  – Selections, and methods to invoke
Wrapper response
  – Which projections, selections it supports
  – Cost of plan
  – Instances of Wrapper_Plan class
  – Include private data for plan execution
  – Execute a plan which subsumes request?
11
Single Collection Access Plan
  select H.name, H.city, H.daily_rate
  from Hotels H
  where H.class = 5 and H.loc = ‘beach’
[Figure labels: Garlic Optimizer, Web Wrapper, Hotel Repository]
Work Request
  Project: H.OID, H.name, H.city, H.daily_rate, H.class, H.loc
  Preds: H.class = 5, H.loc = ‘beach’
Wrapper Access Plan - Wrapper_Plan class
  Properties
    Project: H.OID, H.name, H.city, H.daily_rate, H.class, H.loc
    Preds: H.class = 5
    Cost:
  Plan details (private)
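In the example above the work request carries two predicates but the returned plan lists only H.class = 5, so the remaining predicate presumably has to be applied outside the wrapper. A minimal Python sketch of that exchange follows. The class and field names (`WorkRequest`, `WrapperPlan`, `WebWrapper`, `SUPPORTED`, `residual_preds`), the cost value, and the shape of the private data are all hypothetical; only the plan_access() name comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class WorkRequest:
    project: list  # attributes the optimizer wants back
    preds: list    # predicates as strings, e.g. "H.class = 5"


@dataclass
class WrapperPlan:
    project: list  # attributes this plan will produce
    preds: list    # predicates this plan will apply at the source
    cost: float
    private: dict = field(default_factory=dict)  # opaque to the optimizer


class WebWrapper:
    # Assumption: the underlying source can only filter on hotel class.
    SUPPORTED = {"H.class"}

    def plan_access(self, request):
        """Return zero or more plans; each states what it actually handles."""
        handled = [p for p in request.preds if p.split()[0] in self.SUPPORTED]
        plan = WrapperPlan(
            project=list(request.project),
            preds=handled,
            cost=100.0,  # made-up number, stands in for a real estimate
            private={"filters": handled},  # kept for use at execution time
        )
        return [plan]


def residual_preds(request, plan):
    """Predicates the plan did not accept and that must be applied elsewhere."""
    return [p for p in request.preds if p not in plan.preds]


request = WorkRequest(
    project=["H.OID", "H.name", "H.city", "H.daily_rate", "H.class", "H.loc"],
    preds=["H.class = 5", "H.loc = 'beach'"],
)
plan = WebWrapper().plan_access(request)[0]
print(plan.preds, residual_preds(request, plan))
```

Note that the plan still projects H.loc even though it does not filter on it, which is what makes it possible to evaluate the leftover predicate on the rows that come back.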
12
Join Plans
Request
  – Plans to join
  – Join Predicate
Wrapper response
  – Join plan with supported predicates
  – Cost of join
13
Join Plans
  select I.name
  from Countries C, Cities I
  where C.name = ‘Greece’ and I.pop < 500 and I.country = C.OID
[Figure labels: Garlic Optimizer, Relational Wrapper, Relational DB]
Work Request
  Input Plans
    Wrapper Access Plan
      Project: C.OID, C.name
      Preds: C.name = ‘Greece’
      Cost:
      Plan details (private)
    Wrapper Access Plan
      Project: I.OID, I.name...
      Preds: I.pop < 500
      Cost:
      Plan details (private)
  Join pred: I.country = C.OID
Wrapper Join Plan - Countries, Cities
  Project: C.OID, C.name, I.OID, I.name, I.pop, I.country
  Preds: C.name = ‘Greece’, I.pop < 500, I.country = C.OID
  Cost:
  Plan details (private)
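The join exchange above can be sketched in Python as a wrapper that receives two of its own access plans plus a join predicate and returns one combined plan. This is an illustration only: the `WrapperPlan` fields, the idea of storing a generated SQL string as the private plan details, and the additive cost are assumptions, and only the plan_join() name comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class WrapperPlan:
    collections: list  # e.g. ["Countries C"]
    project: list
    preds: list
    cost: float
    private: dict = field(default_factory=dict)  # opaque to the optimizer


class RelationalWrapper:
    def plan_join(self, left, right, join_preds):
        """Combine two access plans from this wrapper into one pushed-down join."""
        collections = left.collections + right.collections
        project = left.project + right.project
        preds = left.preds + right.preds + list(join_preds)
        # Hypothetical private detail: the statement to send to the source later.
        sql = "SELECT {} FROM {} WHERE {}".format(
            ", ".join(project), ", ".join(collections), " AND ".join(preds)
        )
        cost = left.cost + right.cost  # placeholder, not a real cost model
        return [WrapperPlan(collections, project, preds, cost, {"sql": sql})]


countries = WrapperPlan(
    ["Countries C"], ["C.OID", "C.name"], ["C.name = 'Greece'"], cost=10.0
)
cities = WrapperPlan(
    ["Cities I"], ["I.OID", "I.name", "I.pop", "I.country"], ["I.pop < 500"], cost=20.0
)
join = RelationalWrapper().plan_join(countries, cities, ["I.country = C.OID"])[0]
print(join.private["sql"])
```

A wrapper for a source that cannot join would simply return an empty list from `plan_join`, leaving the optimizer to perform the join itself.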
14
Inter Site Joins
  select C.pop, H.name
  from Cities C, Hotels H
  where C.name = H.loc
Site A: Cities - C
Site B: Hotels - H
[Figure: three diagrams, each showing sites A and B and Garlic, labeled with C and H; the third also shows H sub and H sub.loc]
15
Bind Plans
Inter wrapper join
Fetch matches
  – Values produced by outer node
  – Inner node invoked for each value or set of values
  – Like semi or filter join
Same request and reply pairs
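The "fetch matches" idea can be shown with a short Python sketch in which the outer side produces values and the inner source is queried once per value. The function names and the toy rows are invented for illustration, and this version handles one value at a time rather than a set of values.

```python
def bind_join(outer_rows, outer_attr, inner_fetch):
    """For each outer row, ask the inner source only for rows matching its value."""
    for outer in outer_rows:
        for inner in inner_fetch(outer[outer_attr]):
            yield {**outer, **inner}


# Toy data standing in for two different sources (values are made up).
CITIES = [
    {"C.name": "Athens", "C.pop": 100},
    {"C.name": "Oslo", "C.pop": 200},
]
HOTELS = [
    {"H.name": "Sea View", "H.loc": "Athens"},
    {"H.name": "Old Town", "H.loc": "Lisbon"},
    {"H.name": "Harbor", "H.loc": "Athens"},
]


def fetch_hotels(loc):
    # Stands in for an inner wrapper plan executed with loc bound to a value.
    return [h for h in HOTELS if h["H.loc"] == loc]


rows = list(bind_join(CITIES, "C.name", fetch_hotels))
for row in rows:
    print(row["C.pop"], row["H.name"])
```

Only hotels whose location matches some city ever leave the inner source, which is the sense in which this resembles a semi or filter join.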
16
Query Execution
Garlic plan looks like tree with wrapper plans as leaves
Wrapper exports iterator interface
  – Translate plan into iterator
  – Methods supported: reset(), advance(), bind()
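A minimal Python sketch of an iterator with the three method names listed above is given below. The signatures and semantics are assumptions (for instance, that `advance()` returns `None` at the end and that `bind()` restarts the scan); the slides name the methods but do not define them.

```python
class WrapperIterator:
    """Hypothetical iterator over an in-memory list of rows."""

    def __init__(self, rows, bind_attr=None):
        self._rows = rows
        self._bind_attr = bind_attr  # attribute compared against the bound value
        self._bound = None
        self._pos = 0

    def bind(self, value):
        """Supply the value for a bind plan and restart the scan."""
        self._bound = value
        self.reset()

    def reset(self):
        """Reposition at the start so the rows can be produced again."""
        self._pos = 0

    def advance(self):
        """Return the next qualifying row, or None when the scan is exhausted."""
        while self._pos < len(self._rows):
            row = self._rows[self._pos]
            self._pos += 1
            if self._bind_attr is None or row[self._bind_attr] == self._bound:
                return row
        return None


def drain(iterator):
    """Collect every remaining row from an iterator."""
    out = []
    row = iterator.advance()
    while row is not None:
        out.append(row)
        row = iterator.advance()
    return out


HOTEL_ROWS = [
    {"name": "Sea View", "loc": "Athens"},
    {"name": "Old Town", "loc": "Lisbon"},
    {"name": "Harbor", "loc": "Athens"},
]

it = WrapperIterator(HOTEL_ROWS, bind_attr="loc")
it.bind("Athens")
print([r["name"] for r in drain(it)])
```

With this shape, the node above a leaf only needs the three calls and never sees how the wrapper talks to its source.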
17
Wrapper Details
Interface files include the GDL
Environment files include parameters specific to wrappers
Libraries
  – Core, shared among several wrappers
  – Implementation, specific to repositories
Dynamically loaded code
Same address space as Garlic
18
Odds and Ends
How easy is it to write a wrapper?
  – Summer student, chemist, and many wrappers written.
Related Work
  – TSIMMIS
    – Uses QDTL, a declarative spec for supported queries
  – DISCO
    – Language for describing capabilities
    – Partial queries
19
Good and Bad
Good
  – Leverages existing query facilities
  – Handles idiosyncrasies
  – Graceful growth and evolution
Bad
  – How easy is it to write wrappers?
  – How unstructured can my repository be?
  – Optimization
    – Centralized vs. Local
    – Selectivity estimation?
20
The Ugly
Cost model for diverse set of sources
Handling failures
  – Unavailable sources
  – Wrappers are buggy and often wrong
  – Want graceful degradation on failures
Replication