ETL and Metadata

The ETL Process
Source Systems -> Extract -> Transform (Staging Area) -> Load -> Presentation System

Source Data
Record the name, location, and data that exist in the TPS environment.
- File names and location
- Layout
- Attribute meaning
Template columns: Source | Business Owner | IS Owner | Platform | Location | Data Source Description

Extraction
Copy specific data directly from the source tables into a working dataset in the staging area.
Template columns: Target Table | Target Column | Data Type | Len | Target Column Description | Source System | Source Table / File | Source Col / Field | Data Txform Notes
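The extraction step can be sketched in a few lines. This is only a minimal illustration using Python's built-in sqlite3 module; the `orders` and `stg_orders` tables, their columns, and the `extract_orders` helper are hypothetical names that do not come from the slides.

```python
import sqlite3

# Hypothetical source (TPS) database with one operational table.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_code TEXT, "
    "amount REAL, internal_note TEXT)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "C001", 120.0, "rush"), (2, "C002", 75.5, None), (3, "C001", 30.0, "gift")],
)

# Hypothetical staging database that holds the working dataset.
staging = sqlite3.connect(":memory:")
staging.execute(
    "CREATE TABLE stg_orders (order_id INTEGER, customer_code TEXT, amount REAL)"
)


def extract_orders(src, stg):
    """Copy only the mapped columns from the source table into staging."""
    rows = src.execute(
        "SELECT order_id, customer_code, amount FROM orders"
    ).fetchall()
    stg.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)
    stg.commit()
    return len(rows)


copied = extract_orders(source, staging)
print(f"rows copied to staging: {copied}")
```

The source is read as-is and no cleaning happens here; the unmapped `internal_note` column is simply left behind, which mirrors the idea of copying only specific data into the staging area.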

Transformation (Dimension Tables)
- Generate a surrogate key in a primary-surrogate table. Make this permanent.
- Insert the surrogate key into the working dimension tables.
- Conduct any editing/cleaning operations you need (usually on the working table).
- Generate any derived attributes you need.
- Generate and retain process logs.
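One way to picture the dimension steps is a permanent key map next to a working table that is rebuilt on each run. The sketch below assumes sqlite3 and uses hypothetical names (`customer_key_map`, `wrk_customer`, `get_or_create_key`, `process_log`); it illustrates the idea and is not the method prescribed by the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Permanent primary-to-surrogate key map: rows are never deleted.
conn.execute(
    "CREATE TABLE customer_key_map ("
    "customer_sk INTEGER PRIMARY KEY, customer_code TEXT UNIQUE NOT NULL)"
)
# Working dimension table: emptied and refilled on every run.
conn.execute(
    "CREATE TABLE wrk_customer (customer_sk INTEGER, customer_code TEXT, name TEXT)"
)
process_log = []  # retained log of what each run did


def get_or_create_key(conn, code):
    """Return the existing surrogate key for a source key, or assign a new one."""
    row = conn.execute(
        "SELECT customer_sk FROM customer_key_map WHERE customer_code = ?", (code,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO customer_key_map (customer_code) VALUES (?)", (code,)
    )
    process_log.append(f"new surrogate key {cur.lastrowid} for {code}")
    return cur.lastrowid


def load_working_dimension(conn, source_rows):
    conn.execute("DELETE FROM wrk_customer")
    for code, name in source_rows:
        sk = get_or_create_key(conn, code)
        # Example cleaning step on the working row: trim and title-case the name.
        conn.execute(
            "INSERT INTO wrk_customer VALUES (?, ?, ?)",
            (sk, code, name.strip().title()),
        )
    conn.commit()


load_working_dimension(conn, [("C001", " alice smith "), ("C002", "BOB JONES")])
load_working_dimension(conn, [("C002", "Bob Jones"), ("C003", "carol wu")])
keys = dict(conn.execute("SELECT customer_code, customer_sk FROM customer_key_map"))
```

Because the key map outlives the working table, `C002` keeps the same surrogate key across both runs, which is the point of making the mapping permanent.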

Transformation (Fact Tables)
- Join all dimensions to the fact table (using original primary keys).
- Insert surrogate keys.
- Generate derived facts.
- Generate indicator flags.
Template columns: Chg Flag | Fact Group | Derived Fact Name | Derived Fact Description | Type | Agg Rule | Formula | Constraints | Transformations
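The fact-table steps can be sketched with plain Python dictionaries. Everything here is assumed for illustration: the key maps, the staged sales rows, the `revenue` and `margin` derived facts, and the `loss_flag` indicator are hypothetical and not taken from the slides.

```python
# Hypothetical key maps from the dimension step: original key -> surrogate key.
customer_keys = {"C001": 1, "C002": 2}
product_keys = {"P10": 101, "P20": 102}

# Hypothetical staged fact rows, still carrying the original primary keys.
staged_sales = [
    {"customer_code": "C001", "product_code": "P10",
     "quantity": 2, "unit_price": 50.0, "unit_cost": 30.0},
    {"customer_code": "C002", "product_code": "P20",
     "quantity": 1, "unit_price": 20.0, "unit_cost": 25.0},
]


def transform_fact(row):
    """Swap original keys for surrogate keys, then add derived facts and a flag."""
    revenue = row["quantity"] * row["unit_price"]
    cost = row["quantity"] * row["unit_cost"]
    return {
        # Join to each dimension on the original key, keep only the surrogate key.
        "customer_sk": customer_keys[row["customer_code"]],
        "product_sk": product_keys[row["product_code"]],
        "quantity": row["quantity"],
        # Derived facts.
        "revenue": revenue,
        "margin": revenue - cost,
        # Indicator flag: marks rows sold below cost.
        "loss_flag": "Y" if revenue < cost else "N",
    }


fact_rows = [transform_fact(r) for r in staged_sales]
for fact in fact_rows:
    print(fact)
```

A lookup that raises `KeyError` on an unknown original key is a deliberate simplification; a real load would need a policy for fact rows whose dimension member is missing.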

Target Data
Describe the presentation data structure.
- Model
- Metadata
- Usage and constraints
Template columns: Table Name | Column Name | Data Type | Len | Nulls? | Column Description | PK Order | FK
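Part of the target-data description can be read straight from the database catalog. The sketch below assumes a SQLite target and hypothetical tables (`dim_customer`, `fact_sales`); it uses the `PRAGMA table_info` and `PRAGMA foreign_key_list` statements to fill several of the template columns. Column descriptions still have to be written by hand, and length is omitted because SQLite does not enforce declared lengths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_sk INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE fact_sales ("
    "customer_sk INTEGER NOT NULL REFERENCES dim_customer (customer_sk), "
    "sale_date TEXT NOT NULL, "
    "revenue REAL, "
    "PRIMARY KEY (customer_sk, sale_date))"
)


def describe_table(conn, table):
    """Build one metadata row per column of a target table.

    The table name is interpolated directly, so it must come from a trusted list.
    """
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fk_columns = {row[3] for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
    described = []
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    for _cid, name, dtype, notnull, _default, pk in conn.execute(
        f"PRAGMA table_info({table})"
    ):
        described.append({
            "table_name": table,
            "column_name": name,
            "data_type": dtype,
            "nulls": not notnull,
            "pk_order": pk or None,  # position within the primary key, if any
            "fk": name in fk_columns,
        })
    return described


target_metadata = describe_table(conn, "fact_sales")
for column in target_metadata:
    print(column)
```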

Flow Documentation
- DFD for the ETL process
- ERD for Source, Staging and Target databases
- Metadata
- Usage notes