
1 Database Integration
© 1998 Singh & Huhns

2 Dimensions of Integration
Existence of global schema
Location transparency: same view of data and behavior at all sites
Uniform access and update language
Uniform interaction protocols
Opacity of replication
Strict semantic guarantees of overall system

3 Full Integration
Distributed databases are the most tightly integrated. They provide
– A global schema
– A unique way to access the DB through that schema
– Location transparency
– Replication managed automatically
– ACID transactions (i.e., a 2PC-type semantics)

4 Federation
Less than full integration
Local schemas and a global schema coexist: access may be through either (at a given site, the local schema and the global schema are visible)
ACID transactions are optional, but still possible if the local transaction managers are open; problems of extraneous conflicts must be solved
Location transparency

5 Multidatabases
Multidatabases are a loose form of federation involving
– No global schema
– A uniform language to access the DB
– The locations of the data are visible to the application
– There may be some support for semantic constraints, depending on what the underlying systems provide

6 Interoperation
Interoperation is the loosest form of integration, in that there is no real integration of the databases
There might be no global schema
Heterogeneous ways to access the DB must coexist
Location transparency is not easy to achieve
Different languages might be required at different databases, because they might have different underlying metamodels
Applications must handle all semantics

7 Workflows
Tasks include
– queries
– transactions
– applications
– administrative activities
Tasks decompose into subtasks that are
– distributed and heterogeneous, but
– coordinated
Subtasks have mutual constraints on
– order
– occurrence
– combinations of the above
– return values

8 Example Problems
Loan application processing
Processing admissions to graduate program
Telecommunications service provisioning often requires
– several weeks
– many operations (48 in all, 23 manual)
– coordination among many operation-support systems and network elements (16 database systems)

9 Traditional Transactions
DB abstraction for activity
ACID properties
– Atomicity: all or none
– Consistency: final state is consistent if initial state is consistent
– Isolation: intermediate states are invisible
– Durability: committed results are permanent
Applicability
– brief activities (seconds, at most)
– simple activities (few updates)
– on centralized architectures
In distributed settings, use mutual (e.g., two-phase) commit to prevent violation of ACID properties
Example: x := x - a; y := y + a
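The pair of updates x := x - a and y := y + a on this slide is the classic case where atomicity matters: either both updates take effect or neither does. The sketch below illustrates this with Python's standard sqlite3 module on a single, centralized database. The table name acct, the helper names, and the rule that balances may not go negative are assumptions made for illustration, not part of the slides.

```python
import sqlite3

def make_db(x, y):
    """Create an in-memory database holding two balances, x and y."""
    conn = sqlite3.connect(":memory:")
    # Assumed consistency rule for this sketch: a balance may not go negative.
    conn.execute(
        "CREATE TABLE acct (name TEXT PRIMARY KEY, bal INTEGER CHECK (bal >= 0))"
    )
    conn.executemany("INSERT INTO acct VALUES (?, ?)", [("x", x), ("y", y)])
    conn.commit()
    return conn

def transfer(conn, a):
    """Run y := y + a; x := x - a as one transaction (all or none)."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute("UPDATE acct SET bal = bal + ? WHERE name = 'y'", (a,))
            conn.execute("UPDATE acct SET bal = bal - ? WHERE name = 'x'", (a,))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit to y was rolled back too.
        return False

def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, bal FROM acct"))
```

The credit to y is deliberately issued first, so a failed debit of x would leave a visible partial update if the two statements were not wrapped in one transaction.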

10 Why Relaxed Tasks?
Consider tasks that are
– Complex, i.e., long-running, prone to failure, update multiple data items, update across multiple systems, and have subtle consistency requirements
– Cooperative, i.e., involve multiple applications and involve human interaction
– Over heterogeneous environments
– Made up of autonomous, unchangeable parts
Relax ACID properties
– For tasks as a whole
– Even if the subtasks are ACID

11 Important Issues
How to come up with a workflow specification?
– Notion of correctness of executions: what, when, how
– Notion of resource constraints
How does a workflow interface with the underlying databases?
– Concurrency control
– Recovery
Normal executions are often easy: just a partial order of activities
Exception conditions and ad hoc flows are harder

12 Extended Transactions
Numerous extended transaction models relax the ACID properties in set ways. They consider features such as
Nesting
– traditional: closed (ACID)
– newer: open (non-ACID)
Constraints among subtransactions, such as
– commit dependencies
– abort dependencies
Atomicity
– contingency procedures to ensure “all”
Consistency restoration
– compensation

13 Extended Transaction Models
Sagas
Poly transactions
Flex transactions
Cooperative transactions
DOM transactions
Split-and-join transactions
ACTA metamodel
Long-running activities
ConTracts

14 Scheduling Approaches
These address the issues of how activities may be scheduled, assuming that desired semantic properties are known
A common thread is the notion of the significant events of a task. These are the events that are relevant for coordination issues. Thus a complex activity may be reduced to a single state, and termination of that activity to a significant event
Tasks or workflows can be modeled in terms of dependencies among the significant events of their subtasks
Example: If the booking assignment transaction fails, then initiate a compensating transaction for the billing transaction

15 Dependency Enforcement
A specified workflow may only be executed if the corresponding dependencies can be enforced. Is this always possible? No!
The stated dependencies may be mutually inconsistent, e.g., in requiring
– e before f and f before e
– e should occur and should not occur
The assumptions made about the significant events may not be realized in the given tasks. For example
– a task commits before it starts
– a task commits and aborts
– a commit should be triggered
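The two kinds of mutual inconsistency named on this slide can be detected mechanically. The following is a minimal sketch, assuming dependencies are given as a list of (e, f) pairs meaning "e before f" plus two sets of events that must or must not occur. The function names and this representation are hypothetical, chosen only for illustration.

```python
def has_ordering_cycle(before):
    """True if the 'e before f' pairs cannot all hold (they form a cycle)."""
    succ = {}
    for e, f in before:
        succ.setdefault(e, set()).add(f)
        succ.setdefault(f, set())
    indegree = {n: 0 for n in succ}
    for n in succ:
        for m in succ[n]:
            indegree[m] += 1
    # Kahn's algorithm: repeatedly remove events that nothing must precede.
    ready = [n for n, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    # Any event left over lies on, or behind, a cycle.
    return removed < len(succ)

def check_dependencies(before, must_occur, must_not_occur):
    """Return a list of problems; an empty list means no inconsistency found."""
    problems = []
    for e in sorted(set(must_occur) & set(must_not_occur)):
        problems.append(f"{e} is required both to occur and not to occur")
    if has_ordering_cycle(before):
        problems.append("ordering requirements are cyclic")
    return problems
```

For instance, check_dependencies([("e", "f"), ("f", "e")], [], []) reports the cyclic ordering from the first bullet, and check_dependencies([], {"e"}, {"e"}) reports the contradiction from the second.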

16 Syntactic Event Attributes
Many of the assumptions can be syntactically tested. These are independent of the exact nature of the tasks or the significant events:
– Whether an event occurs or not
– The mutual ordering of various events
– The consistency of the ordering of the schedule with the ordering of events in the task, e.g., start precedes commit
– The consistency of the schedule with the events in the task, e.g., abort and commit are complementary events
The last is borderline between syntactic and semantic attributes

17 Semantic Event Attributes
There are also certain semantic event attributes that affect the enforceability of a set of dependencies. Events may variously be
Delayable: those which the scheduler can defer
– commit of a transaction
Rejectable: those which the scheduler can prevent
– commit of a transaction
Triggerable: those which the scheduler can cause to occur
– start of a task
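One way to see how these attributes affect enforceability is to ask what a scheduler could do when it must enforce "e before f" but f is attempted first. The sketch below is only an illustration under that assumption. The Attr flag type, the function name, and the three listed options are hypothetical and are not taken from the slides.

```python
from enum import Flag, auto

class Attr(Flag):
    """Semantic attributes of a significant event."""
    NONE = 0
    DELAYABLE = auto()    # the scheduler can defer the event
    REJECTABLE = auto()   # the scheduler can prevent the event
    TRIGGERABLE = auto()  # the scheduler can cause the event to occur

def options_for_order(e_attrs, f_attrs):
    """Options for enforcing 'e before f' when f is attempted before e."""
    options = []
    if Attr.DELAYABLE in f_attrs:
        options.append("delay f until e occurs")
    if Attr.REJECTABLE in f_attrs:
        options.append("reject f")
    if Attr.TRIGGERABLE in e_attrs:
        options.append("trigger e before accepting f")
    return options

# Attribute assignments following the slide's examples.
COMMIT = Attr.DELAYABLE | Attr.REJECTABLE
START = Attr.TRIGGERABLE
```

If the returned list is empty, the scheduler has no means of enforcing the dependency in that situation, which is the sense in which a set of dependencies can be unenforceable.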

18 Transaction Management in Multidatabase Systems

19 MDBS
3 levels of autonomy are possible for a local database (LDB) with respect to the global transaction manager (GTM)
– design, e.g., LDB software is fixed
– execution, e.g., LDB retains full control on execution even if in conflict with GTM
– communication, e.g., LDB decides what (control) information to release
[Figure labels: GTM, LDB, server, Global Transactions, Local Transactions]

20 Global Serializability
Transactions throughout the MDBS are serializable, i.e., the transactions are equivalent to some serial execution
What the GTM can ensure is that the global transactions are serializable
This doesn't guarantee global serializability, because of indirect conflicts:
– GTM does T1: r1(a); r1(c)
– GTM does T2: r2(b); r2(d)
– LDB1 does T3: w3(a); w3(b)
– LDB2 does T4: w4(c); w4(d)
– Since T1 and T2 are read-only, they are serializable.
– LDB1 sees S1 = r1(a); c1; w3(a); w3(b); c3; r2(b); c2
– LDB2 sees S2 = w4(c); r1(c); c1; r2(d); c2; w4(d); c4
– Each LDB has a serializable schedule; yet jointly they put T1 before and after T2
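The claim in this example can be checked by building conflict graphs for S1 and S2. The sketch below assumes conflict serializability as the correctness test: two operations conflict when they touch the same item, come from different transactions, and at least one is a write. Commit events are left out because they create no conflicts. The function names and the tuple encoding of operations are choices made for this illustration.

```python
def conflict_edges(schedule):
    """Return edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (op1, t1, item1) in enumerate(schedule):
        for op2, t2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())
    state = {}  # node -> "active" (on the current path) or "done"

    def visit(n):
        if state.get(n) == "active":
            return True
        if state.get(n) == "done":
            return False
        state[n] = "active"
        found = any(visit(m) for m in succ[n])
        state[n] = "done"
        return found

    return any(visit(n) for n in succ)

# The schedules from the slide, as (operation, transaction, item) triples.
S1 = [("r", 1, "a"), ("w", 3, "a"), ("w", 3, "b"), ("r", 2, "b")]
S2 = [("w", 4, "c"), ("r", 1, "c"), ("r", 2, "d"), ("w", 4, "d")]

E1 = conflict_edges(S1)  # T1 -> T3 -> T2 at LDB1
E2 = conflict_edges(S2)  # T2 -> T4 -> T1 at LDB2
```

Each local graph is acyclic, so each LDB sees a serializable schedule, but has_cycle(E1 | E2) is true: the union contains T1 -> T3 -> T2 -> T4 -> T1. The GTM sees no direct conflict between T1 and T2, which is why it cannot detect the problem on its own.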

21 Global Atomicity
The problem arises because some sites may not release their prepare-to-commit state and may not participate in a global commit protocol
Global Deadlock
Easy to construct scenarios in which a deadlock arises. Assume LDB1 and LDB2 use 2PL. If a deadlock is formed
– solely of global transactions, then the GTM may detect it
– of a combination of local and global transactions, then the GTM won't know of it, and the LDBs won't share control information

22 Tickets
Violations of global serializability occur because of local conflicts that the GTM doesn't see
Fix by always causing conflicts: whenever two GTs execute at a site, they must conflict there. Indirect conflicts become local conflicts visible to the LDB
– Make each GT increment a ticket at each site
Downside:
– Causes all local subtransactions of a global transaction to go through a local hotspot
– GTs are serialized but only because lots are aborted!
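To see why forced conflicts help, take the schedule S2 from the global serializability example and let each global transaction read and then write a ticket item at LDB2. The sketch below is a hedged illustration: it assumes conflict serializability as the local correctness test, and the item name "ticket" and the placement of the ticket operations within the schedule are assumptions. The local transaction T4 takes no ticket.

```python
def conflict_edges(schedule):
    """Return edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (op1, t1, item1) in enumerate(schedule):
        for op2, t2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """True if the directed graph given by edges contains a cycle."""
    nodes = {n for edge in edges for n in edge}
    remaining = set(edges)
    # Repeatedly strip nodes with no incoming edge; a cycle never empties out.
    while nodes:
        sources = {n for n in nodes if all(b != n for _, b in remaining)}
        if not sources:
            return True
        nodes -= sources
        remaining = {(a, b) for a, b in remaining if a not in sources}
    return False

# S2 at LDB2, with each global transaction (T1, T2) incrementing the ticket.
S2_TICKETS = [
    ("w", 4, "c"),
    ("r", 1, "c"), ("r", 1, "ticket"), ("w", 1, "ticket"),
    ("r", 2, "d"), ("r", 2, "ticket"), ("w", 2, "ticket"),
    ("w", 4, "d"),
]

E2_TICKETS = conflict_edges(S2_TICKETS)
```

The ticket adds the direct edge T1 -> T2 at LDB2, which together with T2 -> T4 and T4 -> T1 forms a cycle inside LDB2's own conflict graph. A local scheduler that guarantees serializability would therefore block or abort one of the transactions rather than admit this schedule, which is also the source of the extra aborts listed as a downside.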

23 Rigorous DBMS
Rigorous = Strict. Check that this prevents the bad example.
The GTM must delay all commits until all actions are completed
– possible only if allowed by LDB
– requires an operation-level interface to LDB
Downside:
– Causes all sites to be held up until all are ready to commit
– Essentially like the 2PC approach
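Delaying every commit until all sites have finished their actions has the same shape as a two-phase commit. The following is a minimal in-memory sketch of that shape, assuming each site exposes prepare, commit, and abort operations. The Participant class and its can_commit flag are hypothetical stand-ins for an LDB, and the sketch ignores timeouts, crashes, and logging.

```python
class Participant:
    """Stand-in for one site's subtransaction (hypothetical interface)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Vote yes only if this site is ready to commit its work.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Commit everywhere if every site votes yes; otherwise abort everywhere."""
    # Phase 1: every site is asked to prepare, and all votes are collected.
    votes = [p.prepare() for p in participants]
    # Phase 2: a single no vote forces a global abort.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

The downside noted on the slide is visible here: no site reaches the committed state until the slowest site has voted.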

24 Global Constraints
When there are no global constraints, local serializability is enough
Can split data into local and global
– LDB controls local data
– GTM controls global data (local transactions may read it, but writes go only via the GTM)
Downside: doesn't work in all cases

25 Atomicity & Durability
What happens when a GT fails?
The local sites ensure atomicity and durability of the local subtransactions
With 2PC, GTM can guarantee that all or none commit
Otherwise,
– redo: rerun the writes from log
– retry: rerun all of a subtransaction
– compensate: semantically undo all others
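The compensate option can be sketched as follows, assuming each subtransaction comes paired with a compensating action and that subtransactions commit locally one after another. The function name, the step format, and the booking and billing callables are hypothetical. They echo the earlier scheduling example in which a failed booking leads to compensation of the billing transaction.

```python
def run_global_transaction(steps):
    """Run (name, do, compensate) steps in order; undo committed ones on failure.

    Each 'do' is assumed to be a locally atomic subtransaction: it either
    commits or raises and leaves no effects behind at its own site.
    """
    committed = []
    for name, do, compensate in steps:
        try:
            do()
        except Exception:
            # Semantically undo the already committed steps, newest first.
            for _, undo in reversed(committed):
                undo()
            return False
        committed.append((name, compensate))
    return True

# Hypothetical subtransactions that record what they did.
log = []

def bill():
    log.append("bill customer")

def refund():
    log.append("refund customer")

def book():
    raise RuntimeError("booking failed")

def cancel_booking():
    log.append("cancel booking")

ok = run_global_transaction([
    ("billing", bill, refund),
    ("booking", book, cancel_booking),
])
```

After this run, ok is False and the log shows the billing step followed by its refund. The failed booking is not compensated, because its own site already ensured it left no effects.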

