Data Conversion to a Data warehouse Presented By Sanjay Gunasekaran.

Main Topics
- Brief overview of the Data Warehouse
- Concept of Data Conversion
- Importance of Data Conversion and the steps involved
- Common industry methodology
- Outline and analysis done in the Alternate Plan paper

Data Warehousing
- It is a concept and not a product.
- A method for analyzing massive amounts of data to make better business decisions.
- Helpful in analyzing data (e.g., sales data) and making decisions that affect the company's performance.
- A Data warehouse in general contains summarized, de-normalized and replicated data that is infrequently updated and is optimized for decision support applications.

Comparison between Operational Environment and Data Warehouse

Operational Environment    Data Warehouse
Detailed                   Summarized
Current                    Variable over time
Transaction driven         Analysis driven
Minimum redundancy         Some redundancy
Static structure           Flexible structure
Small amount of data       Huge volumes of data
Constantly updated         Infrequently updated

Data Warehouse Concepts
Multidimensional Model
a) Facts - Table containing aggregate information required for analysis.
b) Dimensions - Classes of descriptors of the facts.
c) Hierarchies - Levels of aggregation of the data.
Databases
a) Relational
   i) Oracle
b) Multi-Dimensional
   i) Oracle Express
   ii) Essbase
   iii) Gentium
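To make the multidimensional model concrete, here is a minimal sketch of a fact table with two dimensions, built in a relational database. It uses Python's built-in sqlite3 module rather than the Oracle products listed above, and every table and column name (product_dim, time_dim, sales_fact, and so on) is hypothetical. The month-to-year and product-to-category columns illustrate hierarchies: the query rolls facts up to the year and category level.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE product_dim (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT,
            category     TEXT          -- hierarchy: product -> category
        );
        CREATE TABLE time_dim (
            time_key INTEGER PRIMARY KEY,
            day      TEXT,
            month    TEXT,             -- hierarchy: day -> month -> year
            year     INTEGER
        );
        CREATE TABLE sales_fact (
            product_key INTEGER REFERENCES product_dim(product_key),
            time_key    INTEGER REFERENCES time_dim(time_key),
            amount      REAL           -- the measure being analyzed
        );
    """)


def sales_by_year_and_category(conn):
    """Roll the facts up along both dimension hierarchies."""
    return conn.execute("""
        SELECT t.year, p.category, SUM(f.amount)
        FROM sales_fact f
        JOIN product_dim p ON p.product_key = f.product_key
        JOIN time_dim t ON t.time_key = f.time_key
        GROUP BY t.year, p.category
        ORDER BY t.year, p.category
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
conn.executemany(
    "INSERT INTO product_dim VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO time_dim VALUES (?, ?, ?, ?)",
    [(1, "2001-01-15", "2001-01", 2001), (2, "2002-03-02", "2002-03", 2002)],
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(1, 1, 100.0), (2, 1, 50.0), (3, 1, 20.0), (1, 2, 70.0)],
)
conn.commit()
rollup = sales_by_year_and_category(conn)
```

Storing the multidimensional design in ordinary relational tables mirrors the "Relational" option above; a multi-dimensional database product would hold the same facts and dimensions in its own storage format.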

Implementation Steps
- Analyze user requirements for the Data warehouse.
- Analyze existing transaction processing data.
- Design the Data warehouse (Multi-dimensional Model).
- Create the Data warehouse (Relational or Multi-dimensional).
- Extract and clean the operational data.
- Migrate and load the data into the warehouse.
- Do decision support analysis on the warehouse data using OLAP tools.
- Create reports for reporting purposes.

Data Warehouse Architecture
Terminology
a) OLTP systems
b) Metadata
c) Data Warehouse
d) Staging Area
e) Extraction, Loading & Migration
f) External Data

Data Warehouse Architecture (Contd..)
OLTP Systems
- Online Transaction Processing Systems, Production Systems.
- Systems used to manage and run the business.
Metadata
- Consists of information about the data that feeds, gets transformed and exists in the Data Warehouse.
Data Warehouse
- Core of the architecture.
- Supports informational processing by providing a solid platform of integrated, historical data from which to do analysis.

Data Warehouse Architecture (Contd..)
Staging Area
- Data Warehouse workbench.
- The place where raw data is brought in, cleaned, combined, archived and eventually exported to either the Data Warehouse or to one or more Data Marts.
Extraction, Cleaning & Loading
- Known as the Data Conversion process.
- The process by which data from the operational systems is moved to the Warehouse.
- One of the most important steps in the implementation of a Data Warehouse.
External Data

Data Conversion
- Loading of data from the operational system into the Data warehouse.
- Process wherein data is extracted, cleaned, combined, archived and eventually loaded into the Data warehouse.
- Complex, time-consuming and unglamorous.
- Comprises the following processes:
  a) Extraction
  b) Cleaning
  c) Loading
- A very important part of the Data warehousing process.

Importance of Data Conversion
- The Data warehouse holds the information that is the key to a corporation's decision-making process.
- Unreliable and "dirty" data can affect the performance of the corporation.
- Examples
  a) Marketing communications
  b) Retail sales
  c) Medical records

Steps in Data Conversion
- Extract data from the operational systems to an intermediate schema (Staging area).
  - The Staging area is the Data warehouse workbench where the data is cleaned, combined, archived and eventually exported to the Data warehouse. It has the same schema structure as the operational system.
- Convert the intermediate schema to "load data".
- Aggregate the "load data".
- Migrate the "load data" from the staging area to the Data Warehouse server (if the staging area is not on the same server as the warehouse).
- Load the data into the Data warehouse.
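The first step above, extraction into a staging area that keeps the source schema, could look roughly like the following. This is only a sketch under stated assumptions: it uses Python's sqlite3 in place of the operational database, keeps staging tables in the same database under a hypothetical stg_ prefix, and the orders table and its columns are invented for illustration.

```python
import sqlite3


def extract_to_staging(conn, table):
    """Copy a source table into a staging table with the same columns.

    The table name is interpolated into SQL, so it must be a trusted
    identifier (never user input). The stg_ prefix is a hypothetical
    naming convention.
    """
    staging = f"stg_{table}"
    conn.execute(f"DROP TABLE IF EXISTS {staging}")
    conn.execute(f"CREATE TABLE {staging} AS SELECT * FROM {table}")
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {staging}").fetchone()[0]


def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 120.0), (2, "Globex", 75.5)],
)
conn.commit()
copied = extract_to_staging(conn, "orders")
```

Because the staging copy has the same columns as the source, cleaning can run against it without touching the operational system.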

Data Conversion Process

Data Conversion
Extraction
- Routines are created to read source data and move it to an intermediate staging area.
- The Staging Area has the same schema as the source. It is important because the data is cleaned there before it is loaded into the warehouse.
Convert intermediate schemas to "Load Data"
- This is the data cleaning process. It comprises:
  - Data examination
  - Data parsing
  - Data correction
  - Record matching
  - Data transformation
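As an illustration of the cleaning sub-steps listed above, the sketch below parses a free-form name, corrects a state value against a lookup, matches duplicate records, and transforms each survivor into a load-ready shape. The record layout, the STATE_FIXES lookup and the matching rule are all assumptions made for this example; real cleaning rules depend on the source data.

```python
# Hypothetical correction lookup: variant spellings -> standard code.
STATE_FIXES = {"minn.": "MN", "minnesota": "MN", "mn": "MN"}


def parse_name(raw):
    """Data parsing: split 'Last, First' or 'First Last' into parts."""
    raw = " ".join(raw.split())          # collapse repeated whitespace
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
    else:
        first, _, last = raw.rpartition(" ")
    return first.title(), last.title()


def correct_state(raw):
    """Data correction: map known variants to one standard code."""
    value = raw.strip()
    return STATE_FIXES.get(value.lower(), value.upper())


def match_key(record):
    """Record matching: records with the same key are treated as one."""
    first, last = parse_name(record["name"])
    return (first.lower(), last.lower(), correct_state(record["state"]))


def clean(records):
    """Data transformation: emit one load-ready row per matched record."""
    seen = {}
    for record in records:
        key = match_key(record)
        if key in seen:
            continue                      # duplicate of an earlier record
        first, last = parse_name(record["name"])
        seen[key] = {
            "first_name": first,
            "last_name": last,
            "state": correct_state(record["state"]),
        }
    return list(seen.values())


raw_records = [
    {"name": "smith,  john", "state": "Minn."},
    {"name": "John Smith", "state": "mn"},
    {"name": "Ann Lee", "state": "WI"},
]
load_data = clean(raw_records)
```

Here the first two raw records differ only in formatting, so they collapse into a single cleaned row.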

Data Conversion (Contd..)
Aggregate "Load data"
- "Load data" is aggregated by executing a series of sorts externally.
Move the "Load data" from the staging area onto the Data warehouse server
- Done only if the Data warehouse server is different from the staging server.
Load the data onto the Data warehouse
- Done using SQL routines or bulk-load utilities.
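The sort-then-aggregate idea and the final load can be sketched as follows. This is an in-memory illustration, not the external sort utilities or bulk loaders a production conversion would use; the row layout (product_id, sale_date, amount) and the sales_fact table are hypothetical, and sqlite3 stands in for the warehouse database.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter


def aggregate_load_data(rows):
    """Sort rows by (product_id, sale_date), then sum amounts per group."""
    key = itemgetter(0, 1)
    ordered = sorted(rows, key=key)       # groupby needs sorted input
    return [
        (product_id, sale_date, sum(row[2] for row in group))
        for (product_id, sale_date), group in groupby(ordered, key=key)
    ]


def load(conn, aggregated):
    """Insert the aggregated rows in one batch inside a transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(product_id INTEGER, sale_date TEXT, amount REAL)"
    )
    with conn:                            # commits on success, rolls back on error
        conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", aggregated)


rows = [
    (2, "2001-01-02", 5.0),
    (1, "2001-01-01", 10.0),
    (1, "2001-01-01", 2.5),
    (2, "2001-01-02", 1.0),
]
aggregated = aggregate_load_data(rows)

conn = sqlite3.connect(":memory:")
load(conn, aggregated)
```

Sorting first means each group's rows are adjacent, so a single pass is enough to total them; that is the same reason the slides describe aggregation as a series of sorts.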

Paper Outline
- Brief explanation of the Data warehousing concept
- Data warehouse architecture
- Data conversion
- Importance of data conversion
- Common industry methodology
- Analysis of the Data conversion process using an example: Sales Order System

Overall Analysis
- The aim of the paper was to outline the Data Conversion process.
- Design a Relational Database, Staging Area and Data Warehouse.
- Move data from the Relational Database to the Staging Area.
- Move data from the Staging Area to the Warehouse.

In-depth Analysis
- Designed the Relational Database to reflect the transaction processing system of a typical organization.
- Designed the Staging Area to reflect only the Sales system.
- Designed the Data Warehouse for the Sales system.
- Built the relational database (source system) for the quoted example (Sales System) in Oracle.
- Built the Staging Area in Oracle.
- Built the Data Warehouse in Oracle (multi-dimensional design in a relational database).
- Created views for the source tables (transparency).
- Created synonyms for the views (as the source tables were on a different server).

In-depth Analysis (Contd..)
- Wrote SQL scripts to first move data from the synonyms created to the Staging area.
- Wrote SQL scripts and procedures to move data from the Staging Area to the Data Warehouse.
  - Data was moved first from the Staging area tables to the dimension tables, namely Product, Location and Customer.
  - The Time dimension table was populated with 10 years of data. Additional scripts were written to populate the time dimension with data every year.
  - Data was then moved from the Staging area to the fact table (Core Table).
- Wrote scripts to check the consistency of the data. These scripts checked the total number of records moved from the Source system to the Staging area and from the Staging area to the Data Warehouse. Additionally, they checked the total amount moved from the database to the Data Warehouse.
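The paper's actual Oracle scripts are not reproduced in these slides. As a rough idea of what populating a time dimension and checking consistency can involve, here is a sketch in Python with sqlite3. The table names (time_dim, stg_sales, sales_fact), the column layout, and the comparison tolerance are all assumptions made for illustration.

```python
import sqlite3
from datetime import date, timedelta


def populate_time_dim(conn, start_year, years):
    """Insert one row per calendar day for the given span of years.

    INSERT OR IGNORE makes the call safe to repeat, for example when
    extending the dimension by another year later on.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS time_dim "
        "(day TEXT PRIMARY KEY, month INTEGER, quarter INTEGER, year INTEGER)"
    )
    day = date(start_year, 1, 1)
    end = date(start_year + years, 1, 1)
    rows = []
    while day < end:
        quarter = (day.month - 1) // 3 + 1
        rows.append((day.isoformat(), day.month, quarter, day.year))
        day += timedelta(days=1)
    with conn:
        conn.executemany("INSERT OR IGNORE INTO time_dim VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def reconcile(conn, source_table, target_table, amount_column):
    """Compare record counts and total amounts between two tables.

    Table and column names are interpolated into SQL, so they must be
    trusted identifiers.
    """
    query = "SELECT COUNT(*), COALESCE(SUM({col}), 0) FROM {tbl}"
    src = conn.execute(query.format(col=amount_column, tbl=source_table)).fetchone()
    tgt = conn.execute(query.format(col=amount_column, tbl=target_table)).fetchone()
    return {
        "counts_match": src[0] == tgt[0],
        "amounts_match": abs(src[1] - tgt[1]) < 1e-9,   # assumed tolerance
    }


conn = sqlite3.connect(":memory:")
days_loaded = populate_time_dim(conn, 2000, 10)

conn.execute("CREATE TABLE stg_sales (amount REAL)")
conn.execute("CREATE TABLE sales_fact (amount REAL)")
conn.executemany("INSERT INTO stg_sales VALUES (?)", [(10.0,), (2.5,)])
conn.executemany("INSERT INTO sales_fact VALUES (?)", [(10.0,), (2.5,)])
conn.commit()
report = reconcile(conn, "stg_sales", "sales_fact", "amount")
```

If the fact table is aggregated during the move, record counts will legitimately differ and only the total-amount check applies, so which checks to enforce depends on how the load is designed.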

Conclusion
- The value of the Data warehouse can only be realized through OLAP analysis and Data Mining.
- Data Conversion is one of the most critical processes in implementing a Data warehouse.
- The Warehouse holds information that is of great value to the enterprise.
- The Data conversion process must be done effectively and efficiently.