5 Copyright © 2009, Oracle. All rights reserved. Right-Time Data Warehousing with OWB
Copyright © 2009, Oracle. All rights reserved Objectives After completing this lesson, you should be able to: Identify business case for right-time data warehousing Design near real-time trickle feed mappings Use Streams queue operators Use Change Data Capture code templates
Copyright © 2009, Oracle. All rights reserved Lesson Agenda Does OWB support real-time or near-real-time data warehousing? –Real-time and near-real-time concepts –Batch versus trickle feed –OWB support for advanced queues Building a trickle feed mapping Using CDC code templates for Change Data Capture
Copyright © 2009, Oracle. All rights reserved What Is Meant by Real-Time Data Warehousing Definitions of real-time data warehouse (RTDW) are broad. –One of the defining characteristics of a RTDW is that the data freshness is “as real-time as it needs to be.”* * from The Data Warehouse Institute Two main categories of data freshness: –Pure real time –Near real time Live or nearly live updates
Copyright © 2009, Oracle. All rights reserved What Refresh Frequency Does OWB Support? OWB supports near real-time data warehousing. –We will refer to this as right time data warehousing. OWB supplies “building blocks”; users construct solutions that fit their needs.
Copyright © 2009, Oracle. All rights reserved OWB Supports Multiple Loading Approaches Three loading approaches: Full data refresh into empty tables Incremental data refresh Continuous trickle feed approach New to OWB 11g R2, trickle feed mappings based on Oracle Streams feature of the database server, and the Advanced Queue operator of OWB; this approach can consume changed data in near real time. Similar to batch load approach, though intervals can be very short New to OWB 11g R2, Change Data Capture (CDC) code template framework captures incremental changes from heterogeneous sources “out of the box.”
Copyright © 2009, Oracle. All rights reserved Trickle Feed: Start/Stop Trickle start … stop map logic
Copyright © 2009, Oracle. All rights reserved Mapping Can Consume or Produce Data to And from Queues Trickle feed mappingCDC code template mapping Mappings can consume data from queues. –Near real-time consumers (trickle feed mappings) –Batch style consumers (including Change Data Capture) Mappings can produce data for queues.
Copyright © 2009, Oracle. All rights reserved OWB Supports Two Types of Advanced Queues ANYDATA AQ Near real-time or batch Typed AQ Batch, only For strongly typed payloads
Copyright © 2009, Oracle. All rights reserved Streams Queues Can Stage Logical Change Records (LCR) or Non-LCR Messages Oracle Streams queues can stage either LCRs or non-LCR messages. –However, OWB requires a user to declare whether an advanced queue will stage either LCRs or messages. –In other words, OWB does not support advanced queues staging both LCRs and messages. Advanced queue Streams queues LCR Messageor
Copyright © 2009, Oracle. All rights reserved real-time Batch or Consume from a queue Consume from a Queue
Copyright © 2009, Oracle. All rights reserved Produce to a queue Produce Information into a Queue
Copyright © 2009, Oracle. All rights reserved Propagate Streams tags Permitted Not permitted Streams queues Propagate Non-Streams queues Transformations Rules Queue-to-Queue Propagation
Copyright © 2009, Oracle. All rights reserved Lesson Agenda Does OWB support real-time or near-real-time data warehousing? –Real-time and near-real-time concepts –Batch versus trickle feed –OWB support for advanced queues Building a trickle feed mapping Using CDC Code Templates for Change Data Capture
Copyright © 2009, Oracle. All rights reserved Building a Trickle Feed Mapping We examine the following steps for building a very simple trickle feed mapping for near real-time data warehousing. 1.Create a receiving advanced queue. 2.Create a queue table in the target schema. 3.Deploy the advanced queue and the queue table. 4.Create a mapping the uses the advanced queue. –Choose the queue to serve as a real-time source. –Choose the primary data type. –Map the payload from the queue to a new target table. 5.Deploy the table and the mapping. 6.Drop messages into this queue and watch them load immediately.
Copyright © 2009, Oracle. All rights reserved Building a Trickle Feed Mapping: Queue Operator Wizard
Copyright © 2009, Oracle. All rights reserved Select the Message Queue as Driver
Copyright © 2009, Oracle. All rights reserved Select Batch or Real-Time Source
Copyright © 2009, Oracle. All rights reserved Select Message Type
Copyright © 2009, Oracle. All rights reserved Add Object Type Expander for Payload Expander lets you “see” inside the message, in this case SUBJECT and TEXT.
Copyright © 2009, Oracle. All rights reserved Add the Mapping Target Table
Copyright © 2009, Oracle. All rights reserved Execution Type Automatically Set to Trickle
Copyright © 2009, Oracle. All rights reserved Define the Streams Administrator User Define Streams administrator user in OWB. Add to module.
Copyright © 2009, Oracle. All rights reserved Configure the Mapping Real-Time Parameters Define the degree of parallelism. Generate table instantiation (not used). Define the Streams administrator location. Define the apply process name.
Copyright © 2009, Oracle. All rights reserved Practice 5-1 Overview: Building a Trickle Feed Mapping This practice covers all of the steps for building a trickle feed mapping for near real-time data warehousing. Creating a receiving advanced queue Creating a queue table in the target schema Deploying the advanced queue and the queue table Creating a mapping that uses the advanced queue –Choosing the queue to serve as a real-time source –Choosing the primary data type –Mapping the payload from the queue to a new target table Deploying the table and the mapping Dropping messages into this queue and watching them load immediately
Copyright © 2009, Oracle. All rights reserved Lesson Agenda Does OWB support real-time or near-real-time data warehousing? –Real-time and near-real-time concepts –Batch versus trickle feed –OWB support for advanced queues Building a trickle feed mapping Using CDC code templates for Change Data Capture
Copyright © 2009, Oracle. All rights reserved Change Data Capture Framework Orchestrates process to capture changes in near real-time Open framework supports multiple platforms. –CDC templates seeded for Oracle, DB2 UDB, SQL Server Typical mechanisms –Trigger based (Oracle, IBM, Microsoft) –Log based (Oracle and IBM)
Copyright © 2009, Oracle. All rights reserved Simplified CDC Mapping Steps 1.Choose how to perform CDC (trigger or log mechanism). 2.Select tables upon which to perform CDC. 3.Start the capture process. 4.Define the subscribers to receive the changed data. 5.Define mappings to consume the changes.
Copyright © 2009, Oracle. All rights reserved Choose How to Perform CDC Decide whether you want to perform CDC via a mechanism that employs triggers or logs. Select the CDC code template that matches your need.
Copyright © 2009, Oracle. All rights reserved Select Tables Upon Which to Perform CDC From which tables do you want to capture changed data?
Copyright © 2009, Oracle. All rights reserved Start the CDC Capture
Copyright © 2009, Oracle. All rights reserved Define Subscribers Subscriber consumes the changes.
Copyright © 2009, Oracle. All rights reserved Define Mapping to Consume Changes
Copyright © 2009, Oracle. All rights reserved Define Execution Units and Deploy Define execution unit; assign code template to it. Deploy mapping, target table, location.
Copyright © 2009, Oracle. All rights reserved Using Web Services to Administer CDC: Select Run-Time Services Web services to administer CDC
Copyright © 2009, Oracle. All rights reserved Using Web Services to Administer CDC: Edit the Web Service
Copyright © 2009, Oracle. All rights reserved Quiz Are all of the following statements true? a.Support for right-time data warehousing in OWB 11g Release 2 refers to near real-time, not pure real-time. b.Use trickle feed mappings to achieve near real-time data warehousing. c.OWB 11g Release 2 supports both “Typed” and “ANYDATA” advanced queues in the Oracle database, with Typed AQs used in batch mode only, and ANYDATA AQs used in either batch or near real-time processing. d.The OWB CDC framework provides “out of the box” CDC templates seeded for Oracle, DB2 UDB, and SQL Server
Copyright © 2009, Oracle. All rights reserved Summary In this lesson, you should have learned how to: Identify business case for right-time data warehousing Design near real-time trickle feed mappings Use Streams queue operators Use Change Data Capture code templates
Copyright © 2009, Oracle. All rights reserved Practice 5-2 Overview: Steps for Using CDC Code Templates for CDC 1.Create a module. –Edit the module to select a CDC code template. –Select JOB_HISTORY as the table whose changed data will be captured. 2.Create the mapping that uses the CDC code template to consume changes. –Set CDC properties on the JOB_HISTORY table operator. –Create an execution unit; assign CDC code template to it. 3.Deploy the mapping’s target table and the mapping. 4.Start the Change Data Capture process. 5.Add a subscriber. 6.Add a new record to the JOB_HISTORY table. 7.Start the mapping to observe only the new record inserted.
Copyright © 2009, Oracle. All rights reserved