HyperKVS Group Meeting Oracle Streams Dr. Volker Kuhr
Agenda Description of the environment of Distributed Databases Streams vs. Adv. Replication Oracle Streams Technology –Capture –Propagation –Apply Architecture Benefits of Streams Technology –High Availibility –Managebility –Performance –Guaranteed Future Conclusion
Distributed Databases AFR MEXSAOESPCZ VPM IBM-HOST (GATEWAYS) SHANGHAI CHANGCHUN KVSPF6 P97 PROD KVS is one of the worldwide biggest distributed DB systems using Oracle Adv. Replication
Streams vs. Adv. Replication – Replication: Adv. Replication since 7.3.x available Enhancements until Oracle Version 8i/9i Unique Position until Oracle Version 8i Replication based on Trigger functionality only for DML Static solution (Schema, Tables, Attributes must be the same) – Streams: Streams usable since for productive environments Key Companies are working with Streams Technology –Datawarehouse –Distributed Systems High focus on Enhancements (10gR1, 10gR2, 11g ) very dynamic Also useful for heterogeneous systems.
Streams Basic Elements –Three basic tasks of a stream: Capture Staging Apply (consumption) –A stream can perform multiple tasks across multiple databases. ApplyStagingCapture
You can place events in Streams: –Implicitly: Log-based capture of DML and DDL changes –Explicitly: Direct enqueue of user messages Capture
Propagation (Staging) Streams uses a staging area, which: – Is implemented as a queue in a queue table – Supports the self-describing data type, SYS.AnyData – Stages captured events and user-created events in the same queue – Retains events until they are consumed by all applicable tasks, processes, or applications. Propagation Staging
Apply – Events in the staging queue can be consumed: Implicitly by an apply process Explicitly by an application performing dequeue via open interfaces such as JMS, C, or PL/SQL – The apply process can: Apply data changes directly to the database objects Perform a series of operations based on the event by means of an apply handler Apply
Architecture Goal: – Reduction on simple Implementation Typ: Primary Database with Secondary Databases – Complex Implementation follows later: Primary Database with Extented Secondary Databases
Availibility – With Replication, there might be locking problems at high productive Slave sites Problems only on Slave Sites Tables with frequently changes Heart of the application: tree of documents Business Trend is increasing –Catia V5 –Amount of users / Application with very good consumer acceptance –Huge tables and high frequent changes on tables – Temporarily “lost” data in replication Environment Conflict resolution is handled at Master Site Consolidated information will be written back to slave – In Streams, all DB Sites have more equality & autonomy
Managebility – Replication Changes on Table structure in short Release cycles Lack of Support During DDL Changes on Tables High aggravation on guarantee of global consistence –Conflicts during the time of worldwide changes –“normal” DML Changes – bulk operations – Efficient Backup & Recovery Concept for Streams Well defined strategy for Recovery Issues Cloning of Streams Enhancements in Exp/Imp and data pump functionality – Overhead in Replication Reorganization of MLOG$-, USLOG$ Tables
Performance – Less Network workload Less network overhead than Adv. Replication Information is not written back to slave sites Support of DDL Statements –DDL Statements are (re)executed on slaves with streams – Data Integrity, Constraining Unique Indexes/Constraints, Foreign Keys Optimizer – Faster User Interaktion Capture & Apply mechanism have its own background processes no Snapshot Logs / no Trigger interaction – Advanced Queuing (AQ) & usages of multiple queues Parallelisation vs. single queue concept of Replication mechanism
Guaranteed Future (1/2) – Support of DDL Operations Reduction of massive DML Operations Use of Rolling Partitions –Streams Support for Tablespace & Schema Replication About 130 Tables are in Replication environment –Easy Use of Instantiation The only way to implement huge tables (> 20 Mio Rows) –Journaling Tables –Security Secure Queues in STRMADMIN Schema –Data integrity & Transaction order guaranty Constraints, Foreign Keys
Guaranteed Future (2/2) – Test system Build Up with Streams Semi productive Environment:Coupling with Production System Downstream Capture: Ni impact on production Environment No double data volume ( comparable to Snapshots) No divergence between production and test by time No Loss of Test Data – Directed Networking & Multi-Client Capability Apply Forwarding & Queue Forwarding Table Subsets for different Secondary Sites (clients) – Support of transformations Different Columns of Tables can be replicated with streams Table-, Column- or Schema name can be different on systems –Support for (almost) all Data Types Abstract Data Types (XML as ADT) in progress of new Releases
Conclusion KVS is one of the worldwide biggest distributed DB systems using Oracle Adv. Replication Replication has been the only way of working in this distributed environment. There are several misbehaviours with Adv. Replication Streams is a new technology, based on Advanced Queuing and logminer functionality Streams is usable since Oracle 9i and fully developed in 10g(R2) With Streams, there will be less workload regarding user interaction, but there are background processes using resources There will be lots of benefits using Streams instead of Replication Argumentation is based Availibility, Managebility, Performance and guaranteed future Beside distributed database environments, Streams is also usable in the business of Test&Quality systems, heterogeneous systems and datawarehouse applications.