Object-Oriented Frameworks for Migrating Structured Data April 2004.



2 Information Integration Paradigms
Data Migration
    For a Relational Database (Java)
        Commercial ETL tools
    For an Object-Oriented Database (Java)
Data Mediation
    Object-Oriented Framework (C++)
    Rule-Based Framework (C)

3 Data Migration for a Relational Database
Java data mapping framework.
Splits, flattens, and joins structured flat files.
Maps data into Oracle using JDBC.
Alternatively, pre-processes non-standard data before loading with a commercial ETL tool.
Minimal 2,000-line framework, developed in 6 months.
Most mappings are one line of code.
Custom functions can be added as needed.
New data sources can be added with minimal code.
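The slide does not show how the framework flattens structured files, so the following is only a minimal sketch of the general idea. It assumes a hypothetical pipe-delimited format in which "H" lines carry header fields and "D" lines carry detail fields; the class name FlattenSketch and the record layout are illustrative, not taken from the framework.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: flatten header/detail records into one row per detail. */
final class FlattenSketch {

    /**
     * Splits each line on '|', remembers the most recent header record ("H"),
     * and joins its fields onto every following detail record ("D").
     */
    static List<List<String>> flatten(List<String> lines) {
        List<List<String>> rows = new ArrayList<>();
        String[] header = null;
        for (String line : lines) {
            String[] parts = line.split("\\|", -1); // -1 keeps trailing empty fields
            if (parts[0].equals("H")) {
                header = parts;
            } else if (parts[0].equals("D")) {
                if (header == null) {
                    throw new IllegalArgumentException("Detail record before any header: " + line);
                }
                List<String> row = new ArrayList<>();
                for (int i = 1; i < header.length; i++) {
                    row.add(header[i]); // header fields come first
                }
                for (int i = 1; i < parts.length; i++) {
                    row.add(parts[i]);  // then the detail fields
                }
                rows.add(row);
            } else {
                throw new IllegalArgumentException("Unknown record type: " + line);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data in the assumed format.
        List<String> lines = List.of(
            "H|123456789|Acme Corp",
            "D|Main St|Springfield",
            "D|Oak Ave|Shelbyville");
        for (List<String> row : flatten(lines)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Each flattened row could then be handed to a mapping step that writes it to a relational table.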

4 Sample Data Mapping Code

    private void CompanyTable() throws DataFormatException, IOException, SQLException {
        table("COMPANY_LOAD");
        function("COMPANY_ID", Oracle.NextSequence("SEQ_COMPANY_ID"));
        field("DUNS_NUMBER", "DUNS_NO");
        field("COMPANY", "COMPANY");
        field("TRADE_NAME", "TRADE_NAME");
        ...
        function("TIME_CREATED", Time.Begin(), DATE_TIME);
        row();
        table();
    }
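The framework's internals are not shown in the slides. As a rough illustration of how one-line mappings like those above could accumulate into a single JDBC insert, here is a sketch under stated assumptions: the class MappingSketch and the methods rowSql and rowValues are hypothetical, and only the names table, field, and function echo the slide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of one-line mappings that accumulate into a parameterized INSERT. */
final class MappingSketch {
    private final Map<String, String> source;                          // current source record
    private final Map<String, Object> columns = new LinkedHashMap<>(); // target column -> value
    private String table;

    MappingSketch(Map<String, String> sourceRecord) {
        this.source = sourceRecord;
    }

    /** Starts mapping into the named target table. */
    void table(String name) {
        table = name;
        columns.clear();
    }

    /** Copies a source field into a target column unchanged. */
    void field(String targetColumn, String sourceField) {
        columns.put(targetColumn, source.get(sourceField));
    }

    /** Stores a computed value in a target column. */
    void function(String targetColumn, Object value) {
        columns.put(targetColumn, value);
    }

    /** Returns the INSERT text for the mapped row, with one placeholder per column. */
    String rowSql() {
        if (table == null || columns.isEmpty()) {
            throw new IllegalStateException("No table or columns mapped");
        }
        String names = String.join(", ", columns.keySet());
        String marks = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    /** Values in column order, suitable for PreparedStatement.setObject(i + 1, value). */
    List<Object> rowValues() {
        return new ArrayList<>(columns.values());
    }

    public static void main(String[] args) {
        // Made-up source record; the field names follow the slide.
        MappingSketch m = new MappingSketch(Map.of(
            "DUNS_NO", "123456789", "COMPANY", "Acme Corp", "TRADE_NAME", "Acme"));
        m.table("COMPANY_LOAD");
        m.function("COMPANY_ID", 1L); // stand-in for a sequence value
        m.field("DUNS_NUMBER", "DUNS_NO");
        m.field("COMPANY", "COMPANY");
        m.field("TRADE_NAME", "TRADE_NAME");
        System.out.println(m.rowSql());
        System.out.println(m.rowValues());
    }
}
```

Keeping the column order in a LinkedHashMap lets the placeholder list and the bound values stay aligned without extra bookkeeping.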

5 Design for Data Mapping Code
[Class diagram: Dun & Bradstreet Mapping, Header, and Source classes built on Data Mapping, Data Header, Data Source (Oracle Source, Delimited Source), Data Target (Oracle Target, Delimited Target), Data Functions (Time, Oracle, Math, ...), Data Tables, Data Row, Data Fields, Data Set, Data Interface, Data Connection, Data File, Data Log, Data Status, Data Exception, and Data Debug, connected by "is", "has", "uses", "are", and "writes" relationships.]

6 Promises of Commercial ETL Tools
Code generators or engine-based tools can reduce or eliminate coding.
Graphical interfaces can help users visualize mappings and reduce errors.
Some ETL tools interface with data design and metadata tools for easier structuring and standardization.

7 ETL Commercial Tool Investigation
Independently investigated about 20 commercial ETL tools, using methods that ranged from web site surveys to standardized comparisons.
“Given the generally high pricing for the predominant ETL tools, it is sometimes challenging to build a clear business case to make this switch.” – Gartner, May
“Augmentation in the form of custom code remains a requirement for the majority of ETL tool deployments.” – Gartner, May 2002.

8 Limitations of Commercial ETL Tools
Code generators create more code, and more complex code.
    For the sample Dun & Bradstreet data, Oracle Warehouse Builder requires 3 times as much PL/SQL code as the custom Java framework needs.
Engine-based tools are expensive and limited in handling complex data formats.
    Initial costs for Informatica would exceed $300K.
    Its complex flat file component showed limited functionality during another team’s trial use.

9 Data Migration for an Object-Oriented Database
Declarative data mapping framework compiles object models into Java.
Declares attributes, functions, relationships, and translations for data and queries.
Splits, flattens, and joins structured flat files.
Maps data into ObjectStore using its API.
16,000-line framework, developed in 1 year.
Most mappings are one line of code.
Custom functions can be added as needed.
New data sources can be added with minimal code.

10 Sample Data Mapping Declarations

    class CCR (method DelimitedText) {
        string DUNS = DUNS;
        string DUNS4 = String.Concat(DUNS, Plus4);
        string Status = Status;
        string Name = LegalBusinessName;
        string StreetAddress1 = StreetAddress1;
        string StreetAddress2 = StreetAddress2;
        ...
        relationship CCR Parent inverse Subsidiaries where ParentDUNS4 == CCR.DUNS4;
        relationship set(CCR) Subsidiaries inverse Parent where DUNS4 == CCR.ParentDUNS4;
    }
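The slides do not show the Java that the declarative compiler generates. The sketch below is only one plausible reading of the two inverse relationship declarations: a record's Parent is the record whose DUNS4 equals its ParentDUNS4, and Subsidiaries is the reverse direction. The class RelationshipSketch, the nested Ccr class, and the link method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: resolve inverse Parent/Subsidiaries links by matching DUNS4 keys. */
final class RelationshipSketch {

    /** Minimal stand-in for a CCR record; only the key fields are modeled. */
    static final class Ccr {
        final String duns4;
        final String parentDuns4;                       // null for a top-level company
        Ccr parent;                                     // "relationship CCR Parent"
        final List<Ccr> subsidiaries = new ArrayList<>(); // "relationship set(CCR) Subsidiaries"

        Ccr(String duns4, String parentDuns4) {
            this.duns4 = duns4;
            this.parentDuns4 = parentDuns4;
        }
    }

    /** Sets both directions of the relationship in one pass over an index keyed by DUNS4. */
    static void link(List<Ccr> records) {
        Map<String, Ccr> byDuns4 = new HashMap<>();
        for (Ccr r : records) {
            byDuns4.put(r.duns4, r);
        }
        for (Ccr r : records) {
            Ccr p = (r.parentDuns4 == null) ? null : byDuns4.get(r.parentDuns4);
            if (p != null) {
                r.parent = p;          // child -> parent
                p.subsidiaries.add(r); // parent -> child (the inverse)
            }
        }
    }

    public static void main(String[] args) {
        // Made-up keys for illustration.
        Ccr top = new Ccr("1111111110000", null);
        Ccr sub = new Ccr("2222222220000", "1111111110000");
        link(List.of(top, sub));
        System.out.println(sub.parent.duns4);
        System.out.println(top.subsidiaries.size());
    }
}
```

Updating both sides in the same step is what keeps the two declarations consistent with each other; a record whose parent key matches nothing is simply left unlinked here.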

11 Design for Data Mapping Declarations
[Diagram: Data File, Data Class Source, Data Class Precompile, Data Class, Data Definition, String Functions, Math Functions, ..., Data Warehouse, Data Warehouse Reader, Data Warehouse Writer, and Object Store, connected by "accesses", "generates", "reads", "writes", and "uses" relationships.]

12 Summary
Custom data migration frameworks in Java:
    Declarative syntax simplifies data mapping.
    Handle non-standard structured data formats.
    Require minimal code for new sources.
    Can be used to pre-process non-standard data formats before using a commercial tool.
Commercial ETL tools:
    GUI is valuable for standard row and column data.
    Limited in handling complex structured data.
    Can be very expensive.