Published by Chester Jenkins. Modified over 8 years ago.
1
Enterprise Data Warehouse: A Technical Perspective
Tony Dalwood
Information Architecture & Management
University of South Australia
2
IT Structure
- ISTS – Information Strategy & Technology Services
  - Information Strategy
  - Corporate Information Systems
  - E-Business
  - Information Architecture & Management
  - Technical Services
  - Customer Services
  - Network Services
  - Systems Infrastructure
3
Information Architecture & Management (IAM)
- Merger of DBA team & Information Integration team in Feb 2006
- IAM manages:
  - Corporate System Databases (3 DBAs)
  - Operational Data Store Management
  - Middle Tier Apps
  - Student Portal (myUniSA)
  - Staff “Portal” (UniSAinfo)
  - UniSAinfo Reporting
  - EDW
4
Project Governance
- Steering Group
  - Includes Directors of ISTS, Planning and Assurance Services (PAS), Student & Academic Services (SAS), Finance
- Sponsors Group
  - Director of Planning & Assurance Services
  - Dep. Director Information Strategy
  - Business Project Manager
  - Technical Project Manager
- Reference Group
  - Senior Officers from PAS, HR, Research, SAS, Finance
5
Project Governance
- Project Team
  - Business Project Manager (PAS)
  - Technical Project Manager (ISTS)
  - Design Architect/Dev Team Leader (ISTS)
  - Business Analyst (x1.5) (PAS)
  - Data Quality Manager (0.5) (PAS)
  - Developers (x3 variant) (ISTS)
6
EDW Project Milestones
- Aug 2004 – Business Case submitted by Planning & Assurance Services (PAS) and ISTS to extend current reporting environment to an EDW ($150K)
- Feb 2005 – Project Commenced
- Feb-July 2005 – Data Gathering Workshops
- Sep-Dec 2005 – Technical Research & Proof of Concept (0.5 IT Resource)
- Jan-Feb 2006 – External Consultancy (1 IT Resource)
- May 2006 – First Star Schema complete (Research Publications) (4 IT Resources)
- July 2006 – Three more Star Schemas complete (Research Income, AVCC Data, Research Staff Supervision) (4 IT Resources)
- August 2006 – First “Soft” Production Release (2.5 IT Resources)
- Beyond – Student Data & Finance Data (min 2 IT Resources)
NB: IT Resource figures do not include the part-time Tech Project Manager
7
Project Goals (Business goal | Technical counterpart)
- ‘One’ Source of the truth | Conformed Dimensions, Consolidated Facts
- Performance | Transformed schema design
- External Data | Flexible data sources
- Simplicity | Pre-calculated measures
- Historical Capability | Versioning, Snapshot
- Data Quality | Verification, Validation, Audit Trail
8
By-Products of an EDW Project
- Data Discovery
  - What data do we have
  - How data is used and maintained
  - What is the quality of the data
  - How data can be utilised by more of the organisation
- Enhanced Collaboration
  - Intra and Inter communication between business units, system owners and IT
9
Technical Project Plan
- “Warehousing” Research
- Proof of Concept exercise
- External Assistance
- Implementation of an Architecture
- Development Standards & Procedures
- Build & Implementation of Stage 1
- Review
10
Proof of Concept
- Validate Warehouse research findings
- Proof of Concept covered the following topics:
  - Project methodology
  - Technical architecture
  - Design methodology
  - ETL methodology
  - MetaData options
  - Data Quality approach
  - Security implementation options
11
Project Methodology
12
Technical Architecture
- Inputs into Architecture
  - Business Goals
  - Existing Reporting Environments
  - Technology
  - Time
  - $$
  - Resources/Skills
14
Data Flow Architecture
15
Design Methodology
- Dimensional Modelling chosen as the design philosophy
  - Star Schemas/Snowflakes
  - Facts
  - Dimensions
  - Measures
  - Bridges
- History Retention for Slowly Changing Dimensions
  - Warehouse records are versioned, i.e. never deleted or overwritten
  - Views to identify “current” records
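The versioning idea on this slide can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than the Oracle database the project ran on, and the table, column and view names (dim_staff, valid_to, dim_staff_current) are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_staff (
    staff_id   TEXT NOT NULL,     -- business key from the source system
    version    INTEGER NOT NULL,  -- 1 for the first load, +1 for each change
    school     TEXT,
    valid_from TEXT NOT NULL,
    valid_to   TEXT,              -- NULL marks the open-ended current version
    PRIMARY KEY (staff_id, version)
);
-- The view exposes only the current version of each record.
CREATE VIEW dim_staff_current AS
    SELECT staff_id, version, school, valid_from
    FROM dim_staff
    WHERE valid_to IS NULL;
""")

# Older versions are end-dated, never deleted or overwritten.
conn.executemany(
    "INSERT INTO dim_staff VALUES (?, ?, ?, ?, ?)",
    [
        ("S1", 1, "Education", "2005-02-01", "2006-05-01"),
        ("S1", 2, "Health", "2006-05-01", None),
        ("S2", 1, "Business", "2005-02-01", None),
    ],
)

current = conn.execute(
    "SELECT staff_id, version, school FROM dim_staff_current ORDER BY staff_id"
).fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM dim_staff").fetchone()[0]
```

Reports that only need the latest state query the view, while historical reports read the full table and filter on the validity dates.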
16
Transformation of Design - Source
17
Transformation of Design - Target
18
ETL Methodology
- Scripts vs Tool decision
- Tool chosen for the following reasons:
  - Already licensed for Oracle Internet Developer Suite, which includes Oracle Warehouse Builder
  - Oracle Database environment
  - Oracle technical skills
  - Visibility of Development Environment
  - Auto technical MetaData generation
  - Auto and accessible code generation using PL/SQL
  - Ability to include custom code
  - Integration with Oracle database and related Oracle technology
  - Framework for Beginners
  - Difficult to evaluate other products without expertise
- Smarts & Effort into Modelling and Design – ETL should be a “no brainer”
19
MetaData
- Data about Data
- Oracle Warehouse Builder provides technical metadata
- Business MetaData facility currently restricted to documentation and Cognos catalogs
- Evaluation of MetaData methods to be reviewed at the completion of Stage 1 development
20
Data Quality
- Pre-ETL
  - Technical profile to ensure physical design has mapped appropriate data elements
  - Business profile of source data to identify data attributes e.g. data type, patterns, nulls, min, max, outliers
- ETL
  - Transform to conformed data sets
  - Foreign Key checks
  - Reporting of anomalies
- Post ETL
  - Final Business profile to validate transformations of data
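As a rough illustration of what a business profile of a source column might collect, the sketch below computes the attributes named on the slide (data type, patterns, nulls, min, max, outliers). It is not how the Datiris profiler or Oracle Warehouse Builder works; the function names and the two-standard-deviation outlier rule are assumptions chosen for the example.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def value_pattern(text):
    """Reduce a string to its shape: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", text))


def profile_column(values, outlier_sigmas=2.0):
    """Profile one column given as a list of Python values (None = null)."""
    non_null = [v for v in values if v is not None]
    profile = {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "types": sorted({type(v).__name__ for v in non_null}),
    }
    numbers = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numbers:
        mu, sigma = mean(numbers), pstdev(numbers)
        profile["min"], profile["max"] = min(numbers), max(numbers)
        # Assumed rule: flag values more than outlier_sigmas deviations from the mean.
        profile["outliers"] = [
            v for v in numbers
            if sigma > 0 and abs(v - mu) > outlier_sigmas * sigma
        ]
    strings = [v for v in non_null if isinstance(v, str)]
    if strings:
        profile["patterns"] = dict(Counter(value_pattern(s) for s in strings))
    return profile


amounts = profile_column([10, 12, 11, None, 13, 12, 11, 10, 500])
codes = profile_column(["AB123", "CD456", "x-1", None])
```

Running the same profile before and after the ETL step gives a simple way to compare source and transformed data.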
21
Security
- Security options implemented are:
  - Database Layer
    - Oracle roles to grant or deny access to database objects based on Business rules
    - Oracle views for granular data security where appropriate
  - User Layer
    - Access to end user Cognos catalogues/cubes controlled via Cognos security mechanisms and filesystem access
22
Development Lifecycle
- Business Requirements
- Design Process
  - Logical Design
  - Physical Design
  - Data Mapping
  - Data Profiling
23
Development Lifecycle
- Design & Build ETL Objects & Processes
  - Extraction routines
  - ‘Diff’ routines
    - Tag records as Inserts, Updates or Deletes
  - Build Staging tables
  - Build Target warehouse tables
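A ‘Diff’ routine of the kind listed above could look like the following minimal sketch. It assumes each extract is available as a dictionary keyed by the source system's business key; the function name and the sample publication data are hypothetical, and the project's actual routines were built with Oracle tooling rather than Python.

```python
def diff_snapshots(previous, current):
    """Compare two extracts and tag each changed record as I, U or D.

    Both arguments map a business key to that record's attributes.
    Unchanged records produce no diff entry.
    """
    diff = []
    for key, row in current.items():
        if key not in previous:
            diff.append(("I", key, row))   # new in the source: Insert
        elif previous[key] != row:
            diff.append(("U", key, row))   # attributes changed: Update
    for key, row in previous.items():
        if key not in current:
            diff.append(("D", key, row))   # gone from the source: Delete
    return diff


previous_extract = {1: {"title": "A"}, 2: {"title": "B"}, 3: {"title": "C"}}
current_extract = {1: {"title": "A"}, 2: {"title": "B2"}, 4: {"title": "D"}}
diff_table = diff_snapshots(previous_extract, current_extract)
```

Only the tagged rows need to move into the staging area, which keeps the load proportional to the amount of change rather than to the size of the source.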
24
Standard ETL Process
- Scheduled Extract/Diff process runs to populate a Diff table in the Staging Area
- ETL process then performs a standard set of steps
  - Load Staging from Diff table
  - Stamp Staging record according to Diff type (U, D or I)
    - Updated Record – Tag staging record as new ‘version’ of core record
    - Deleted Record – Tag staging record as ‘Retired’ record in warehouse
    - Inserted Record – Tag staging record to be new record (version 1)
  - Update Core – End date existing “current” record
  - Load new Core – New “current” record from Staging
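The standard steps above can be sketched in a few lines. This is an in-memory illustration only: the core table is a list of dictionaries, and the field names (key, version, valid_from, valid_to, retired) are assumptions rather than the warehouse's real columns.

```python
from datetime import date


def apply_diff(core, diff, load_date):
    """Apply (diff_type, key, attrs) entries to a versioned core table."""
    for diff_type, key, attrs in diff:
        if diff_type not in ("I", "U", "D"):
            raise ValueError(f"unknown diff type: {diff_type!r}")
        versions = [r["version"] for r in core if r["key"] == key]
        current = next(
            (r for r in core if r["key"] == key and r["valid_to"] is None),
            None,
        )
        # Update Core: end-date the existing "current" record, if any.
        if current is not None:
            current["valid_to"] = load_date
            if diff_type == "D":
                current["retired"] = True  # deleted at source: retire, keep history
        # Load new Core: inserts and updates create a new current version.
        if diff_type in ("I", "U"):
            core.append({
                "key": key,
                "version": max(versions, default=0) + 1,  # version 1 for a new key
                "attrs": attrs,
                "valid_from": load_date,
                "valid_to": None,
                "retired": False,
            })
    return core


core = []
apply_diff(core, [("I", "P1", {"title": "Draft"})], date(2006, 5, 1))
apply_diff(core, [("U", "P1", {"title": "Final"})], date(2006, 6, 1))
apply_diff(core, [("D", "P1", {"title": "Final"})], date(2006, 7, 1))
open_versions = [r for r in core if r["valid_to"] is None]
```

After the three loads the core still holds both versions of P1; the delete only end-dates and retires the latest one, so no history is lost.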
25
Development Lifecycle
- Post ETL
  - Measures
  - Summary data
  - Process Flows to execute ETL
  - Security views
  - End User Layer e.g. Catalogues
26
ETL Auditing
- When did a process last run
- How long did it run for
- Did it Succeed, Fail or produce Warnings
- How many records did it alter or insert
- What were the data exceptions
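One way to capture the audit questions above is to wrap each ETL process in a small recorder. The sketch below is an assumption about how such a record might be structured, not a description of the Oracle Warehouse Builder runtime audit tables; the audited helper, the audit_log list and the job names are hypothetical.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone

audit_log = []  # stand-in for an audit table


@contextmanager
def audited(process_name):
    """Record start time, duration, status, row counts and data exceptions."""
    entry = {
        "process": process_name,
        "started_at": datetime.now(timezone.utc),
        "status": "SUCCESS",
        "rows_inserted": 0,
        "rows_updated": 0,
        "exceptions": [],
    }
    start = time.perf_counter()
    try:
        yield entry
        if entry["exceptions"]:
            entry["status"] = "WARNING"  # finished, but with data exceptions
    except Exception as exc:
        entry["status"] = "FAILED"
        entry["exceptions"].append(repr(exc))
        raise
    finally:
        entry["duration_s"] = time.perf_counter() - start
        audit_log.append(entry)


with audited("load_publications") as run:
    run["rows_inserted"] += 2
    run["exceptions"].append("row 17: unknown school code")

try:
    with audited("load_income"):
        raise ValueError("source table missing")
except ValueError:
    pass  # the failure is still recorded in audit_log
```

Each entry answers the five questions on the slide: when the process ran, how long it took, how it ended, how many rows it touched, and which data exceptions it met.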
27
UniSA EDW Toolset
- Oracle Database
- Oracle Warehouse Builder
- Oracle Workflow
- Oracle Enterprise Manager
- Datiris Data profiler
- Cognos Impromptu/Powerplay
- Whiteboard and lots of A3 Paper!!!
28
Oracle Database
- Options assisting Warehouse implementation
  - External tables
  - Materialised Views
  - Query Rewrite
  - Bitmap indexes
  - Partitioning
  - Star Query optimizer options
29
Oracle Warehouse Builder
- Provides the design and development environment and framework for the build and deployment of Warehouse objects and transformation processes
- Consists of Design Repository and Runtime components
31
Oracle Workflow
- Optionally used for job execution with “dependency management”
- Exists as an optional install with RDBMS
- Run as Client/Server or HTTP browser based application
- Workflow engine is a service on the warehouse database server administered by a workflow schema
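“Dependency management” here means a job only starts once the jobs it relies on have finished. The sketch below shows the general idea with Python's standard graphlib module (Python 3.9+); it does not use or imitate the Oracle Workflow API, and the job names are hypothetical, loosely following the ETL steps described earlier in the deck.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs that must complete before it can run.
dependencies = {
    "extract_diff": set(),
    "load_staging": {"extract_diff"},
    "update_core": {"load_staging"},
    "load_core": {"update_core"},
    "build_measures": {"load_core"},
    "refresh_security_views": {"load_core"},
}

# static_order() yields one valid execution order and raises CycleError
# if the dependencies contain a loop.
run_order = list(TopologicalSorter(dependencies).static_order())
```

The two jobs that depend only on load_core may appear in either order, which is also where a workflow engine is free to run them in parallel.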
32
Oracle Enterprise Manager
- Optionally used as the scheduling option for submitting and monitoring Warehouse Builder processes or workflows
- Base OEM comes with RDBMS
- Optionally run as standalone install or Management Server mode using a web console
33
Cognos 7.3 Reporting Suite
- Catalogues
  - Report Developer access layer
- Impromptu
  - Reporting capability
- Powerplay
  - Multi-dimensional analysis
- Upfront
  - Web interface
34
Oracle Warehouse Builder Demonstration
81
OWB 10g Release 2 - Paris
New Features:
- Design Tool
- Graphic Interface Improvements
- Built in Slowly Changing Dimension property
- Data Profiling/Quality utilities
- Better Integrated Workflow Engine
- Job Scheduling within OWB via OEM
82
Project Review
- Sanity Check on whole process, architecture, methodology
- Business & Technical
- Evaluate ROI
- Quantify metrics on time to deliver
- Proposed Future phases
- Usage Statistics
- Hardware adequacy & capacity
83
Useful Technical References
- Links
  - Oracle Business Intelligence & Technical Sites
    - http://www.oracle.com/solutions/business_intelligence/index.html
    - http://www.oracle.com/technology/tech/bi/index.html
  - Rittman Blog
    - http://www.rittman.net/
  - Kimball Tips
    - http://www.kimballgroup.com/html/designtips.html
- Texts
  - Oracle 9iRel2 Data Warehousing - Hobbs
  - Kimball Texts
    - The Data Warehouse Lifecycle Toolkit
    - The Data Warehouse ETL Toolkit
84
Questions?