Transportation: Loading Warehouse Data Chapter 12.

Presentation transcript:

Transporting Data into the Warehouse
- Loading moves the data into the warehouse.
- Loading can be time-consuming:
  - Consider the load window.
  - Schedule the task; automate all processes.
- The initial load moves large volumes.
- Subsequent refreshes move smaller volumes.
- The business determines the cycle.
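
The load-window point can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the throughput figure, and the volumes are assumptions, not values from the course. It simply checks whether a load plus its postload processing fits into the available window.

```python
from dataclasses import dataclass


@dataclass
class LoadEstimate:
    hours_needed: float
    fits_window: bool


def estimate_load(volume_gb, throughput_gb_per_hour, window_hours, postload_hours=0.0):
    """Estimate whether a load plus its postload processing fits the load window."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    hours_needed = volume_gb / throughput_gb_per_hour + postload_hours
    return LoadEstimate(hours_needed, hours_needed <= window_hours)


# Hypothetical figures: a large first-time load versus a small nightly refresh.
first_time = estimate_load(volume_gb=2000, throughput_gb_per_hour=100,
                           window_hours=8, postload_hours=4)
refresh = estimate_load(volume_gb=50, throughput_gb_per_hour=100,
                        window_hours=8, postload_hours=1)
print(first_time)
print(refresh)
```

With these made-up numbers the first-time load needs far more than one eight-hour window, while the refresh fits comfortably, which is why the two are usually planned separately.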

Extract Processing Environment
- After each time interval, build a new database.
- Run queries.
[Figure: operational database at time intervals T1, T2, T3]

Warehouse Processing Environment
- Build a new database.
- After each time interval, add changes to the database.
- Archive or purge the oldest data.
- Run queries.
[Figure: operational database at time intervals T1, T2, T3]

First-Time Load
- Single event that populates the database with historical data
- Involves a large volume of data
- Employs distinct ETT tasks
- Involves large amounts of processing after the load
[Figure: operational database at time intervals T1, T2, T3]

Refresh
- Performed according to a business cycle
- Simple task
- Less data to load than the first-time load
- Less-complex ETT
- Smaller amounts of postload processing
[Figure: operational database at time intervals T1, T2, T3]

Building the Transportation Process
- Specification:
  - Techniques and tools
  - File transfer methods
  - The load window
  - Time window for other tasks
  - First-time and refresh cycle
  - Connectivity bandwidth

Building the Transportation Process
- Test the proposed techniques
- Document the proposed load
- Gain agreement on the process
- Monitor
- Review
- Revise

Granularity
- Important design and operational issue
- Space requirements:
  - Storage
  - Backup
  - Recovery
  - Load
- Low-level grain: expensive, high level of processing, more disk, detail
- High-level grain: cheaper, less processing, less disk, little detail
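
The storage side of the grain trade-off can be estimated with simple multiplication. The following is a rough sketch under stated assumptions: the retailer, the row width of 100 bytes, and the worst case that every product sells in every store in every period are all hypothetical.

```python
def estimated_rows(products, stores, periods):
    """Upper bound on fact rows if every product sells in every store each period."""
    return products * stores * periods


def estimated_gb(rows, bytes_per_row):
    """Raw data size in gigabytes, ignoring indexes and block overhead."""
    return rows * bytes_per_row / 1e9


# Hypothetical retailer: 10,000 products, 100 stores, one year of history.
daily = estimated_rows(10_000, 100, 365)    # low-level grain: one row per day
monthly = estimated_rows(10_000, 100, 12)   # high-level grain: one row per month

print(daily, estimated_gb(daily, 100))
print(monthly, estimated_gb(monthly, 100))
```

Under these assumptions the daily grain holds roughly thirty times as many rows as the monthly grain, and the same factor carries over to backup, recovery, and load time.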

Transportation Techniques
- Tools
- Utilities and 3GL
- Gateways
- Customized copy programs
- Replication
- FTP
- Manual

Transportation Technique Considerations
- Tools are comprehensive but costly.
- Data-movement utilities are fast and powerful.
- Gateways are not always the fastest method:
  - Access other databases
  - Supply dependent data marts
  - Support a distributed environment
  - Provide real-time access if needed

Using SQL*Loader to Load Data
- Fastest load mechanism
- Direct path
- Parallel and unrecoverable
- Direct-load INSERT (Oracle8)
- Direct-path load API (Oracle8i)
[Figure: SQL*Loader with input files, control file, log files, bad files, and discard files]

Direct-Path Load API in Oracle8i
- Allows ETT and other tools to load Oracle databases efficiently
- Permits load behavior to be customized
- Gives direct-path load performance
- Provides complete access to all direct-load functionality using OCI

More Transportation Technique Considerations
- Use customized programs as a last resort.
- Replication is limited by data-transfer rates.

Postprocessing of Loaded Data
- Create indexes
- Generate keys
- Summarize
- Filter
[Figure: extract, transform, transport, then postprocessing of loaded data]

Indexing Data
- Before load: fast index reenablement
- During load: adds time to load window
- After load: adds time to load window
[Figure: index built on data moving from operational databases through staging files to the warehouse database]

Unique Indexes
- Disable constraints before the load.
- Enable constraints after the load to create the index.
- Sequence: disable constraints, load data, enable constraints, create index, catch errors, reprocess.
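
The sequence for unique indexes can be sketched in a few lines. This is an illustration, not the course's own procedure: it uses Python's built-in sqlite3 module as a stand-in for the warehouse database, and since SQLite has no way to disable a constraint, creating the unique index after the load plays the role of enabling it. The table names `sales` and `sales_rejects` are hypothetical.

```python
import sqlite3


def load_then_index(rows):
    """Load without a unique index, then build it; divert duplicates on failure."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (order_id INTEGER, amount REAL)")
    con.execute("CREATE TABLE sales_rejects (order_id INTEGER, amount REAL)")

    # Load data with no uniqueness check in place.
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    try:
        # Stand-in for "enable constraints": building the index validates the data.
        con.execute("CREATE UNIQUE INDEX sales_pk ON sales (order_id)")
    except sqlite3.IntegrityError:
        # Catch errors: keep the first row per key and move the rest aside.
        dup_filter = "rowid NOT IN (SELECT MIN(rowid) FROM sales GROUP BY order_id)"
        con.execute(f"INSERT INTO sales_rejects SELECT * FROM sales WHERE {dup_filter}")
        con.execute(f"DELETE FROM sales WHERE {dup_filter}")
        # Reprocess: the index build is retried on the cleaned table.
        con.execute("CREATE UNIQUE INDEX sales_pk ON sales (order_id)")

    loaded = con.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    rejected = con.execute("SELECT COUNT(*) FROM sales_rejects").fetchone()[0]
    con.close()
    return loaded, rejected


print(load_then_index([(1, 10.0), (2, 20.0), (2, 25.0), (3, 30.0)]))  # (3, 1)
```

Keeping the first row per key is an arbitrary choice made for the sketch; in practice the rejected rows would be inspected before deciding which version to keep.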

Creating Artificial Keys
- Use generalized or derived keys
- Maintain the uniqueness of a row
- Use an administrative process to assign the key
- Concatenate operational key with number:
  - Easy to maintain
  - Cumbersome keys
  - No clean value for retrieval

Creating Unique Keys for Records
- Assign a number from a list:
  - No semantic meaning
  - Extract operations must reference the table to assign numbers
- Update metadata
- Verdict
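
The two key-generation approaches, concatenating the operational key with a number and assigning a number from a list, might look as follows. Both the function `concatenated_key` and the class `KeyAssigner` are hypothetical names introduced for this sketch, and the in-memory dictionary stands in for the lookup table that extract operations would reference.

```python
from itertools import count


def concatenated_key(operational_key, version):
    """Artificial key built by concatenating the operational key with a number."""
    return f"{operational_key}-{version:03d}"


class KeyAssigner:
    """Hands out meaningless sequential numbers and remembers each assignment."""

    def __init__(self, start=1):
        self._next = count(start)
        self._assigned = {}

    def key_for(self, operational_key):
        # The same source key must always map to the same warehouse key,
        # so every extract consults the stored assignments first.
        if operational_key not in self._assigned:
            self._assigned[operational_key] = next(self._next)
        return self._assigned[operational_key]


print(concatenated_key("CUST42", 1))  # CUST42-001

assigner = KeyAssigner()
print(assigner.key_for("CUST42"), assigner.key_for("CUST77"), assigner.key_for("CUST42"))  # 1 2 1
```

The concatenated form is easy to derive but grows cumbersome, whereas the assigned number is compact but carries no meaning and depends on the lookup being maintained.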

Creating Summary Tables
- CTAS (CREATE TABLE ... AS SELECT)
- pCTAS (parallel CTAS)
[Figure: summary data created in the warehouse and in data marts]
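
A CTAS statement builds a summary table directly from a query over the detail data. The sketch below runs one through Python's built-in sqlite3 module purely for illustration; the `sales` and `sales_by_month` tables and their columns are assumptions, and SQLite has no parallel execution, so this shows only the serial form, not pCTAS.

```python
import sqlite3


def build_summary(rows):
    """Create a monthly summary table from detail rows with CREATE TABLE ... AS SELECT."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

    # The summary table's columns and contents both come from the SELECT.
    con.execute(
        """
        CREATE TABLE sales_by_month AS
        SELECT substr(sale_date, 1, 7) AS month,
               product,
               COUNT(*)    AS sale_count,
               SUM(amount) AS total_amount
        FROM sales
        GROUP BY substr(sale_date, 1, 7), product
        """
    )
    summary = con.execute(
        "SELECT month, product, sale_count, total_amount "
        "FROM sales_by_month ORDER BY month, product"
    ).fetchall()
    con.close()
    return summary


detail = [
    ("2024-01-05", "widget", 10.0),
    ("2024-01-20", "widget", 5.0),
    ("2024-02-01", "widget", 7.5),
    ("2024-01-09", "gadget", 2.0),
]
for row in build_summary(detail):
    print(row)
```

Because the summary is a separate table, it has to be rebuilt or refreshed whenever the detail data changes.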

Verifying Data Integrity
- Load data into an intermediate file.
- Compare target flash totals with totals before load.
- Preserve, inspect, fix, then load.
[Figure: File 1 and File 2 with counts and amounts (flash totals), loaded through an intermediate file into the warehouse]
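
Comparing flash totals comes down to checking that row counts and summed amounts match before and after the load. The following is a minimal sketch, assuming each record is a dictionary with an `amount` field; the function names are hypothetical.

```python
from decimal import Decimal


def flash_totals(records, amount_field="amount"):
    """Return (row count, summed amount) for a batch of records."""
    # Decimal avoids floating-point drift when reconciling money amounts.
    total = sum((Decimal(str(r[amount_field])) for r in records), Decimal("0"))
    return len(records), total


def verify_load(source_records, loaded_records):
    """Compare counts and amounts before and after the load."""
    expected = flash_totals(source_records)
    actual = flash_totals(loaded_records)
    return {"expected": expected, "actual": actual, "ok": expected == actual}


source = [{"amount": "10.50"}, {"amount": "4.25"}]
print(verify_load(source, source)["ok"])      # True
print(verify_load(source, source[:1])["ok"])  # False
```

When the check fails, the intermediate file is preserved so the discrepancy can be inspected and fixed before the data is loaded into the warehouse.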

Steps for Verifying Data Integrity

Standard Quality Assurance Checks
- Load status
- Completion of the process
- Completeness of the data
- Data reconciliation
- Violations
- Reprocessing
- Comparison of counts and amounts

Summary
This lesson discussed the following topics:
- First-time load considerations
- Techniques for transporting data
- Tasks involved in the postload processing stage