C Copyright © 2007, Oracle. All rights reserved. Introduction to Data Warehousing Fundamentals.



2 Definition of a Data Warehouse

“A data warehouse is an enterprise structured repository of subject-oriented, time-variant data used for information retrieval and decision support. The data warehouse stores atomic and summary data.”
– Oracle Data Warehouse Method
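
The distinction between atomic and summary data can be illustrated with a small sketch. The table names (`sales_atomic`, `sales_monthly_summary`), the columns, and the sample rows are all hypothetical and are not taken from the Oracle course; an in-memory SQLite database simply stands in for a warehouse.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory 'warehouse' holding atomic rows and a summary table."""
    conn = sqlite3.connect(":memory:")
    # Atomic data: one row per individual sale (hypothetical schema).
    conn.execute(
        "CREATE TABLE sales_atomic (sale_date TEXT, product TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_atomic VALUES (?, ?, ?)",
        [
            ("2007-01-05", "widget", 100.0),
            ("2007-01-20", "widget", 50.0),
            ("2007-02-03", "widget", 75.0),
            ("2007-02-10", "gadget", 20.0),
        ],
    )
    # Summary data: the same facts aggregated by month and product.
    conn.execute(
        """
        CREATE TABLE sales_monthly_summary AS
        SELECT substr(sale_date, 1, 7) AS month,
               product,
               SUM(amount) AS total_amount
        FROM sales_atomic
        GROUP BY substr(sale_date, 1, 7), product
        """
    )
    return conn


def monthly_totals(conn):
    """Read the summary table in a stable order."""
    return conn.execute(
        "SELECT month, product, total_amount "
        "FROM sales_monthly_summary ORDER BY month, product"
    ).fetchall()
```

Keeping both levels means detailed questions can still be answered from the atomic rows, while common aggregate queries read the much smaller summary table.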

3 Typical Data Warehousing Process

The process is iterative.

- Phase I: STRATEGY. Identify business requirements. Define objectives and purpose of DW.
- Phase II: DEFINITION. Project scoping and planning, using building block approach.
- Phase III: ANALYSIS. Information requirements are defined.
- Phase IV: DESIGN. Database structures to hold base data and summaries are created. Translation mechanisms are designed.
- Phase V: BUILD AND DOCUMENT. The warehouse is built and documentation is developed.
- Phase VI: POPULATE, TEST, AND TRAIN. The warehouse is populated and tested. The users are trained on system and tools.
- Phase VII: DISCOVERY AND EVOLUTION. The warehouse is monitored and adjustments are applied, or future extensions are planned.

4 Data Warehouse Compared to OLTP

Property         Data Warehouse                    OLTP
Activities       Analysis                          Processes
Response Time    Seconds to hours                  Subseconds to seconds
Operations       Primarily read-only               DML
Nature of Data   Snapshots over time               Current
Data Organized   By subject, time                  By application
Size             Large to very large               Small to large
Data Sources     Operational, internal, external   Operational, internal


6 Data Warehouse Compared with Data Mart

Property              Data Mart                                Data Warehouse
Scope                 Department                               Enterprise
Subjects              Single-subject, line of business (LOB)   Multiple
Data Source           Few                                      Many
Size (typical)        See notes below                          See notes below
Implementation Time   Months                                   Months to years

7 Independent Versus Dependent Marts

[Diagram labels: Independent, Dependent; Sources, Warehouse, Data marts]

8 Independent Data Mart

[Diagram labels: Operational systems, External data, Flat files; Sales or marketing data mart]

9 Dependent Data Mart

[Diagram labels: Operational systems, External data, Flat files; Data warehouse; Data mart; subject areas Marketing, Sales, Finance, Human Resources]
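
As a rough illustration of the dependent model, the sketch below fills a subject-specific mart from a warehouse table rather than from the source systems. The `warehouse_facts` table, its columns, the `mart_<subject>` naming, and the sample rows are assumptions made for this example only.

```python
import sqlite3


def load_warehouse():
    """Build a tiny in-memory warehouse table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE warehouse_facts (subject TEXT, period TEXT, measure REAL)"
    )
    conn.executemany(
        "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
        [
            ("Sales", "2007-Q1", 1200.0),
            ("Sales", "2007-Q2", 1500.0),
            ("Marketing", "2007-Q1", 300.0),
            ("Finance", "2007-Q1", 900.0),
        ],
    )
    return conn


def build_dependent_mart(conn, subject):
    """Create a mart table for one subject, populated only from the warehouse."""
    # The subject becomes part of a table name, so restrict it to letters.
    if not subject.isalpha():
        raise ValueError("subject must be alphabetic")
    mart = "mart_" + subject.lower()
    conn.execute(f"CREATE TABLE {mart} (period TEXT, measure REAL)")
    conn.execute(
        f"INSERT INTO {mart} "
        "SELECT period, measure FROM warehouse_facts WHERE subject = ?",
        (subject,),
    )
    return mart
```

An independent mart would instead run its own extraction against the operational systems, external data, and flat files.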

10 Purpose of a Staging Area

[Architecture diagram labels: sources (Operational, External, Flat files, Server log files, Clickstream, B2C, B2B); Extract (E); Transform/Load (TL); Load (L); Staging areas; Transformations; Enterprise model (atomic data); RDBMS; Dependent data marts; Metadata repository; Access layers (Publish, Subscribe, Portal)]

11 Data Staging Area

- The construction site for the warehouse
- Required by many implementations
- Can be composed of operational data stores (ODS), flat files, or relational server tables
- Frequently configured as multitier staging

[Diagram labels: Operational systems, Extract, Data staging area, Transform, Load (Transport), Warehouse]

12 Remote Staging Model

- Data staging area within the warehouse environment
- Data staging area in its own environment, avoiding negative impact on the warehouse environment

[Diagram labels: Operational system, Operational environment, Data staging area, Staging environment, Warehouse, Warehouse environment; Extract, Transform, Load]

13 Onsite Staging Model

Data staging area within the operational environment, possibly impacting the operational system.

[Diagram labels: Operational system, Data staging area, Operational environment; Warehouse, Warehouse environment; Extract, Transform, Load]

14 Purpose of an Enterprise Model

[Architecture diagram labels: sources (Operational, External, Flat files, Server log files, Clickstream, B2C, B2B); Extract (E); Transform/Load (TL); Load (L); Staging areas; Transformations; Enterprise model (atomic data); RDBMS; Federated data warehouse; Dependent data marts; Metadata repository; Access layers (Publish, Subscribe, Portal)]

15 Extract, Transform, Load (ETL) Processes

- Extract source data.
- Transform/clean data.
- Index and summarize.
- Load data into warehouse.
- Detect changes.
- Refresh data.

[Diagram labels: Operational systems, ETL (Gateways, Programs, Tools), Warehouse]
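
The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how any Oracle tool implements ETL: the row fields (`id`, `name`, `amount`), the cleaning rules, and the use of a dictionary as the "warehouse" are all hypothetical, and indexing is omitted.

```python
def extract(source_rows):
    """Extract: keep only the fields the warehouse needs."""
    return [
        {"id": r["id"], "name": r["name"], "amount": r["amount"]}
        for r in source_rows
    ]


def transform(rows):
    """Transform/clean: normalize names and coerce amounts to float."""
    return [
        {"id": r["id"], "name": r["name"].strip().upper(), "amount": float(r["amount"])}
        for r in rows
    ]


def summarize(rows):
    """Summarize: total amount per name."""
    totals = {}
    for r in rows:
        totals[r["name"]] = totals.get(r["name"], 0.0) + r["amount"]
    return totals


def detect_changes(warehouse, rows):
    """Detect changes: rows that are new or differ from the stored copy."""
    return [r for r in rows if warehouse.get(r["id"]) != r]


def load(warehouse, rows):
    """Load: insert or overwrite rows keyed by id; return how many were written."""
    for r in rows:
        warehouse[r["id"]] = r
    return len(rows)


def refresh(warehouse, source_rows):
    """Refresh: run extract and transform, then load only the changed rows."""
    cleaned = transform(extract(source_rows))
    return load(warehouse, detect_changes(warehouse, cleaned))
```

Calling `refresh` twice with unchanged source rows writes nothing the second time, which is the point of detecting changes before loading.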

16 ETL Processes

- Must result in data that is relevant, useful, high-quality, accurate, and accessible
- Require a large proportion of warehouse development time and resources

[Diagram labels: Operational systems, ETL (Consolidate, Clean up, Restructure), Warehouse; Relevant, Useful, Quality, Accurate, Accessible]

17 Source Systems

- Production
- Archive
- Internal
- External

[Figure: sample source values, including amounts (12345.00, 12780.00, 2345787.00, 87877.98, 5678.00), percentages (100%, 110%, 230%, 200%, -10%), and company names (ABC CO, GMBH LTD, GBUK INC, FFR ASSOC, MCD CO)]

18 Mappings

- Define which operational attributes to use
- Define how to transform the attributes for the warehouse
- Define where the attributes exist in the warehouse

Example from the diagram (mappings are labeled Metadata):

Source file A        Staging file One
F1   123             Number   USA123
F2   Bloggs          Name     Mr. Bloggs
F3   10/12/56        DOB      10-Dec-56

Mappings, Source file A to Staging file One: F1 to Number, F2 to Name, F3 to DOB
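
The slide's example record can be reproduced with a small mapping table. The field names and the before/after values come from the slide; the transformation rules themselves (prefixing `USA`, prefixing `Mr. `, and reading `10/12/56` as day/month/year) are assumptions inferred from those values, and the function names are made up for this sketch.

```python
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def reformat_dob(value):
    """Turn 'dd/mm/yy' into 'dd-Mon-yy' (assumes day-first input)."""
    day, month, year = value.split("/")
    return f"{day}-{MONTHS[int(month) - 1]}-{year}"


# Each source attribute maps to (target attribute, transformation).
# This table plays the role of the mapping metadata.
MAPPINGS = {
    "F1": ("Number", lambda v: "USA" + v),   # assumed rule: add country prefix
    "F2": ("Name", lambda v: "Mr. " + v),    # assumed rule: add title
    "F3": ("DOB", reformat_dob),
}


def apply_mappings(source_record, mappings=MAPPINGS):
    """Build a staging record from a source record using the mapping table."""
    staging_record = {}
    for source_field, (target_field, rule) in mappings.items():
        staging_record[target_field] = rule(source_record[source_field])
    return staging_record
```

Usage:

```python
apply_mappings({"F1": "123", "F2": "Bloggs", "F3": "10/12/56"})
# {'Number': 'USA123', 'Name': 'Mr. Bloggs', 'DOB': '10-Dec-56'}
```

Holding the rules in a data structure rather than in scattered code mirrors the idea that mappings are metadata: which attribute to use, how to transform it, and where it lands.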

19 Extracting Data

- Data extraction takes selected data fields that pertain to the data warehouse.
- Extraction routines account for the variety of systems from which the data is taken.
- Extraction routines contain data or business rules, audit trails, and error correction facilities.

[Diagram labels: Operational databases, Data mappings and transformations, Data staging area, Warehouse database]

20 Possible Reasons for ETL Failure

- A missing source file
- A system failure
- Inadequate metadata
- Poor mapping information
- Inadequate storage planning
- A source structural change
- No contingency plan
- Inadequate data validation
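
A few of these failure modes (a missing source file, a source structural change, inadequate data validation) can be caught before a load starts. The sketch below is one possible pre-flight check for a CSV source; the expected column names and the rule that no cell may be empty are hypothetical choices for illustration.

```python
import csv
import os

EXPECTED_COLUMNS = ["F1", "F2", "F3"]  # hypothetical source layout


def preflight(path, expected_columns=EXPECTED_COLUMNS):
    """Return a list of problems found in a CSV source file; empty means OK."""
    # Missing source file.
    if not os.path.isfile(path):
        return ["missing source file: " + path]

    problems = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        # Source structural change: header differs from what the mapping expects.
        if header != list(expected_columns):
            return ["source structure changed: " + repr(header)]
        # Basic data validation: right number of cells, none empty.
        # Line numbers assume one record per physical line.
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(expected_columns) or any(cell == "" for cell in row):
                problems.append("invalid row at line %d" % line_no)
    return problems
```

A scheduler could refuse to start the load when the returned list is not empty, which is a simple form of the contingency planning the list calls for.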

21 Typical Warehousing Development Tasks

Source to staging
- Define source metadata
- Define staging area metadata
- Map source to staging area
- Deploy database structures
- Deploy mappings
- Extract data into staging tables

Staging to warehouse
- Define enterprise model (warehouse) metadata
- Map staging area to enterprise model
- Deploy database structures
- Deploy mappings
- Extract data into the enterprise model

Warehouse to data marts
- Define data mart metadata (cubes, dimensions)
- Map enterprise model to data marts
- Deploy database structures
- Deploy mappings
- Extract data into the data mart

Administration
- Refresh warehouse and data mart
- Maintain warehouse and data mart
