Data Warehouse Development Methodology


Waterfall Methodology

Waterfall Methodology With infrastructure setup and project management

Iterative Methodology
- Incremental approach:
  - Top-down incremental approach
  - Bottom-up incremental approach

Warehouse Development Approaches
The most challenging aspect of data warehousing lies not in its technical difficulty, but in choosing the best approach to data warehousing for your company’s structure and culture, and dealing with the organizational and political issues that will inevitably arise during implementation.
Among the different approaches to developing a data warehouse are:
- “Big bang” approach
- Incremental approach
  - Top-down incremental approach
  - Bottom-up incremental approach

Top-Down Approach
- Analyze requirements at the enterprise level
- Develop a conceptual information model
- Identify and prioritize subject areas
- Complete a model of the selected subject area
- Map to available data
- Perform a source system analysis
- Implement the base technical architecture
- Establish metadata, extraction, and load processes for the initial subject area
- Create and populate the initial subject area data mart within the overall warehouse framework

Top-Down Incremental Approach
Advantages
This approach has the following advantages:
- Provides a relatively quick implementation and payback. Typically, the scoping, definition study, and initial implementation are scaled down so that they can be completed in six to seven months.
- Offers significantly lower risk because it avoids being as analysis heavy as the “big bang” approach
- Emphasizes high-level business needs
- Achieves synergy among subject areas. Maximum information leverage is achieved as cross-functional reporting and a single version of the truth are made possible.
Disadvantages
This approach has the following disadvantages:
- Requires an increase in up-front costs before the business sees any return on its investment
- Makes it difficult to define the boundaries of the scoping exercise if the business is global
- May not be suitable unless the client needs cross-functional reporting

Bottom-Up Approach
- Define the scope and coverage of the data warehouse and analyze the source systems within this scope
- Define the initial increment based on political pressure, assumed business benefit, and data volume
- Implement the base technical architecture and establish metadata, extraction, and load processes as required by the increment
- Create and populate the initial subject areas within the overall warehouse framework

Bottom-Up Incremental Approach
This approach is similar to the top-down approach, but the emphasis is on the data rather than the business benefit. Here, IT is in charge of the project, either because IT wants to be in charge or because the business has deferred the project to IT.
Advantages
This approach has the following advantages:
- This is a “proof of concept” type of approach, so it is often appealing to IT.
- It is easier to get IT buy-in for this approach because it is focused on IT.
Disadvantages
This approach has the following disadvantages:
- Because the solution model is typically developed from source systems, and these source systems encapsulate the current business processes, the overall extensibility of the model is compromised.
- IT staff is often the last to know about business changes, so IT could be designing something that will be out of date before it is delivered.
- Because the framework of definition in this approach tends to be much narrower, a significant amount of reengineering work is often required for each increment.

Incremental Approach to Warehouse Development
- Multiple iterations
- Shorter implementations
- Validation of each phase
[Diagram: Increment 1 iterates through the Strategy, Definition, Analysis, Design, Build, and Production phases]

Incremental Approach
The incremental approach manages the growth of the data warehouse by developing incremental solutions that comply with the full-scale data warehouse architecture. Rather than starting by building an entire enterprisewide data warehouse as a first deliverable, start with just one or two subject areas, implement them as scalable data marts, and roll them out to your end users. Then, after observing how users are actually using the warehouse, add the next subject area or the next increment of functionality to the system. This is also an iterative process. It is this iteration that keeps the data warehouse in line with the needs of the organization.
Benefits
- Delivers a strategic data warehouse solution through incremental development efforts
- Provides extensible, scalable architecture
- Supports the information needs of the enterprise organization
- Quickly provides business benefit and ensures a much earlier return on investment
- Allows a data warehouse to be built one subject or application area at a time
- Allows the construction of an integrated data mart environment

Methodology
- Ensures a successful data warehouse
- Encourages incremental development
- Provides a staged approach to an enterprisewide warehouse:
  - Safe
  - Manageable
  - Proven
  - Recommended

Methodology
A methodology is a set of detailed steps or procedures to accomplish a defined goal. Employing a methodology for the development of any system is always important, and even more so in a warehouse environment. The warehouse is such a big investment, in every resource you can think of, that its success is essential. To avoid failure of the warehouse implementation, you must employ a methodology and keep to it.
Failure generally has one of two causes. The first is that the warehouse is not delivered on time, and the second is that the warehouse fails to deliver what the business users need. A good method helps to manage expectations by identifying clear deliverables.
On the other hand, don’t become a slave to the steps of a methodology. Practice methodology with a focus on results, not on activities. This achieves consistency of deliverables while recognizing differences in individual working styles.

Architecture
- “Provides the planning, structure, and standardization needed to ensure integration of multiple components, projects, and processes across time.”
- “Establishes the framework, standards, and procedures for the data warehouse at an enterprise level.”
Source: The Data Warehousing Institute

Architecture
From a business and technology view, an architecture defines a collection of components and specifies their relationships. The goal of the architecture activities is a single, integrated data warehouse meeting business information needs. Some of the components of a data warehousing architecture are:
- Data sources
- Data acquisition
- Data management
- Data distribution
- Information directory
- Data access tools

Extraction, Transformation, and Load (ETL)
“Effective data extract, transform and load (ETL) processes represent the number one success factor for your data warehouse project and can absorb up to 70 percent of the time spent on a typical data warehousing project.”
[Diagram: Source, Staging Area, Target]

Extraction, Transformation, and Loading (ETL)
These processes are fundamental to the creation of quality information in the data warehouse. You take data from source systems; clean, verify, validate, and convert it into a consistent state; then move it into the warehouse.
- Extraction: The process of selecting specific operational attributes from the various operational systems.
- Transformation: The process of integrating, verifying, validating, cleaning, and time stamping the selected data into a consistent and uniform format for the target databases. Rejected data is returned to the data owner for correction and reprocessing.
- Loading: The process of moving data from an intermediate storage area into the target warehouse database.
ETL Tools
Specialized tools make these tasks comparatively easy to set up, maintain, and manage. They can also be expensive, which leads many warehouse teams to use custom ETL programs written in COBOL, C++, PL/SQL, or other programming languages or application development tools. Oracle Warehouse Builder (OWB) is Oracle’s ETL tool.
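The three ETL steps described above can be sketched as a small Python program. This is a minimal illustration under stated assumptions, not how OWB or any particular tool works: the customer rows, the field names (cust_id, name, balance), and the EtlResult container are all hypothetical, and in-memory lists stand in for the source, staging area, and target.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical operational rows; the field names are illustrative only.
SOURCE_ROWS = [
    {"cust_id": "101", "name": " Alice ", "balance": "250.00", "fax": "n/a"},
    {"cust_id": "102", "name": "Bob", "balance": "n/a", "fax": ""},    # bad balance
    {"cust_id": "", "name": "Carol", "balance": "75.50", "fax": ""},   # missing key
]

@dataclass
class EtlResult:
    loaded: list = field(default_factory=list)    # rows that reached the target
    rejected: list = field(default_factory=list)  # rows to return to the data owner

def extract(rows, attributes):
    """Extraction: select only the operational attributes the warehouse needs."""
    return [{a: r.get(a, "") for a in attributes} for r in rows]

def transform(rows, load_time):
    """Transformation: validate, clean, convert, and time stamp each row."""
    staged, rejected = [], []
    for r in rows:
        try:
            if not r["cust_id"]:
                raise ValueError("missing cust_id")
            staged.append({
                "cust_id": int(r["cust_id"]),
                "name": r["name"].strip(),
                "balance": float(r["balance"]),
                "loaded_at": load_time,
            })
        except ValueError as err:
            # Rejected data is kept aside with a reason, for correction upstream.
            rejected.append({"row": r, "reason": str(err)})
    return staged, rejected

def load(staged, target):
    """Loading: move rows from the staging area into the target."""
    target.extend(staged)

def run_etl(source_rows):
    result = EtlResult()
    extracted = extract(source_rows, ["cust_id", "name", "balance"])
    staged, result.rejected = transform(extracted, datetime.now(timezone.utc))
    load(staged, result.loaded)
    return result

if __name__ == "__main__":
    outcome = run_etl(SOURCE_ROWS)
    print(len(outcome.loaded), "loaded,", len(outcome.rejected), "rejected")
```

Keeping the rejects in a separate list with a reason mirrors the note that rejected data goes back to the data owner rather than being silently dropped.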

Data Warehouse Architecture Ex., Incremental Implementation
Implementation deliverables:
- Analysis: Confirm and refine requirements
- Design: Gather specifications and prepare the blueprint for the data warehouse or data mart
- Construction: Put in place and test the data warehouse or data mart and all required support tools
- Deployment: Data warehouse or data mart is accepted for use in the business
[Diagram label: Increment n]

Operation and Support
- Data access and reporting
- Refreshing warehouse data
- Monitoring
- Responding to change

Operation
- Present warehouse data to the end user in a meaningful and business-specific manner, and select query tools that are tailored to the users’ requirements for information
- Periodically refresh the warehouse data
- Respond to changing data sources, requirements, and technology
- Monitor, manage, and tune
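One common way to implement a periodic refresh is to copy only the source rows changed since the previous run. The sketch below assumes this incremental style, which the slides do not prescribe; the refresh function, the dictionary-based warehouse, and the id and modified_at fields are hypothetical.

```python
from datetime import datetime

def refresh(warehouse, source_rows, last_refresh):
    """Upsert source rows modified after last_refresh.

    Returns the new high-water mark, or last_refresh if nothing changed.
    """
    changed = [r for r in source_rows if r["modified_at"] > last_refresh]
    for row in changed:
        warehouse[row["id"]] = dict(row)  # insert new keys, overwrite existing ones
    return max((r["modified_at"] for r in changed), default=last_refresh)

if __name__ == "__main__":
    warehouse = {1: {"id": 1, "amount": 10, "modified_at": datetime(2024, 1, 1)}}
    source = [
        {"id": 1, "amount": 15, "modified_at": datetime(2024, 1, 3)},
        {"id": 2, "amount": 7, "modified_at": datetime(2024, 1, 1)},
    ]
    mark = refresh(warehouse, source, datetime(2024, 1, 2))
    print(warehouse[1]["amount"], mark)
```

A full reload is simpler and may be preferable for small tables; the high-water mark only pays off when the sources are large and reliably record modification times.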

Phases of the Incremental Approach
[Diagram: each increment, starting with Increment 1, passes through the Strategy, Definition, Analysis, Design, Build, and Production phases]

Effective and efficient data warehouse project management involves the use of project phases. Project phases identify the tasks to be completed, the resources required, the directing and reporting efforts, and the quality assurance required before moving on to the next phase. Project phasing is a management technique used to focus project teams toward a short-term goal and to communicate progress to senior management.
Strategy
- Define the business objectives and purpose of the data warehouse
- Define the data warehouse team and executive sponsor
- Define success measurements
Definition
- Define the scope and objectives for the incremental development effort
- Identify the technical and data warehouse architecture
- Outline data access methods
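The idea that each phase must pass quality assurance before the next one starts can be sketched as a simple gate. The phase names come from the slide; the Phase enum, the run_increment function, and the qa_passed callback are hypothetical names introduced only for illustration.

```python
from enum import Enum

class Phase(Enum):
    STRATEGY = 1
    DEFINITION = 2
    ANALYSIS = 3
    DESIGN = 4
    BUILD = 5
    PRODUCTION = 6

def run_increment(qa_passed):
    """Walk the phases in order and stop at the first one that fails QA.

    qa_passed is a callable taking a Phase and returning True or False.
    Returns the list of phases that were completed.
    """
    completed = []
    for phase in Phase:  # Enum iteration follows definition order
        if not qa_passed(phase):
            break
        completed.append(phase)
    return completed

if __name__ == "__main__":
    # Example: the increment stalls because Design has not passed review.
    done = run_increment(lambda p: p is not Phase.DESIGN)
    print([p.name for p in done])
```

The list of completed phases is also the kind of short progress summary that phasing is meant to give senior management.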

Strategy Phase Deliverables
- Business goals and objectives
- Data warehouse purpose, objectives, and scope
- Enterprise data warehouse logical model
- Incremental milestones
- Source systems data flows
- Subject area gap analysis

Identifying Warehouse Strategy Phase Deliverables
For each of the data warehouse project phases there are deliverables. The deliverables for the strategy phase focus on defining the business objectives and purpose of the data warehouse solution. The purpose and objectives for the total data warehouse solution are essential to setting and managing expectations. The strategy phase also clearly defines the data warehouse team and the executive sponsor.
- Business goals and objectives: Documents the strategic business goals and objectives
- Data warehouse purpose, objectives, and scope: Documents the purpose and objectives of the enterprise data warehouse, its scope, and how it is intended to be used
- Enterprise data warehouse logical model: High-level, logical information model that diagrams the major entities and relationships for the enterprise
- Incremental milestones: Documents a realistic scope of the data warehouse, acceptable delivery milestones for each increment, and source data availability

Strategy Phase Deliverables
- Data acquisition strategy
- Data quality strategy
- Metadata strategy
- Data access environment
- Training strategy

Identifying Warehouse Strategy Phase Deliverables (continued)
- Source system data flows: Outlines source system data, where it originates, the flow of data between business functions and source systems, degree of reliability, and data volatility
- Subject area gap analysis: Documents the variance between the information requirements and the ability of the data sources to provide that information
- Data acquisition strategy: Documents the approach to extracting, transforming, and loading data from the source systems to the target environments for the initial load and subsequent refreshes
- Data quality strategy: Outlines the approach for data management, error and exception handling, data cleansing, and the audit and control of the data
- Metadata strategy: Documents the strategy of capturing, integrating, and accessing metadata for all components of the warehouse environment
- Data access environment: Documents the identification, selection, and design of tools that support end-user access to the warehouse data
- Training strategy: Outlines the development and end-user training requirements, identifies the technical and business personnel requiring training, and establishes time frames for executing the training plans
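At its simplest, the subject area gap analysis described above is a set difference between what the business requires and what the sources can supply. The sketch below assumes that both sides can be listed as attribute names; the gap_analysis function and the sample subject areas and source systems are hypothetical.

```python
def gap_analysis(requirements, sources):
    """Return, per subject area, the required attributes that no source supplies.

    requirements maps a subject area to the attributes it needs.
    sources maps a source system to the attributes it can provide.
    """
    available = set().union(*sources.values())
    gaps = {}
    for area, needed in requirements.items():
        missing = sorted(set(needed) - available)
        if missing:
            gaps[area] = missing
    return gaps

if __name__ == "__main__":
    requirements = {
        "Sales": ["order_id", "order_date", "region", "discount_reason"],
        "Customers": ["customer_id", "segment"],
    }
    sources = {
        "order_entry": ["order_id", "order_date", "customer_id", "region"],
        "crm": ["customer_id", "segment"],
    }
    print(gap_analysis(requirements, sources))
```

A real gap analysis would also weigh reliability and volatility of each source, which a plain name match cannot capture.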