Published by Reynold Blake
1
Object-Oriented Frameworks for Migrating Structured Data
April 2004
2
Information Integration Paradigms
- Data Migration
  - For a Relational Database (Java)
  - Commercial ETL tools
  - For an Object-Oriented Database (Java)
- Data Mediation
  - Object-Oriented Framework (C++)
  - Rule-Based Framework (C)
3
Data Migration for a Relational Database
- Java data mapping framework.
- Splits, flattens and joins structured flat files.
- Maps data into Oracle using JDBC, or pre-processes non-standard data before loading with a commercial ETL tool.
- Minimal 2,000-line framework, built in 6 months.
- Most mappings are one line of code.
- Custom functions can be added as needed.
- New data sources can be added with minimal code.
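To make "flattens and joins structured flat files" concrete, here is a minimal sketch of one way such a step could work. The record layout (header lines prefixed `H|`, detail lines prefixed `D|`) and the class name `FlatFileFlattener` are assumptions made for illustration; the slides do not describe the actual file formats or the framework's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the record layout ("H|" header lines followed by
// "D|" detail lines) is an assumption, not the format used in the slides.
class FlatFileFlattener {

    // Joins each detail record with the most recent header record,
    // producing one flat, pipe-delimited row per detail line.
    static List<String> flatten(List<String> lines) {
        List<String> rows = new ArrayList<>();
        String header = null;
        for (String line : lines) {
            if (line.startsWith("H|")) {
                header = line.substring(2);      // remember the parent fields
            } else if (line.startsWith("D|")) {
                if (header == null) {
                    throw new IllegalArgumentException("Detail before header: " + line);
                }
                rows.add(header + "|" + line.substring(2));
            }
            // Other record types are ignored in this sketch.
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
            "H|123456789|Acme Corp",
            "D|2003|Active",
            "D|2004|Inactive");
        flatten(input).forEach(System.out::println);
    }
}
```

Each output row repeats the header fields, so the result is a plain row-and-column shape that could be loaded directly or handed to an ETL tool.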
4
Sample Data Mapping Code

    private void CompanyTable() throws DataFormatException, IOException, SQLException {
        table("COMPANY_LOAD");
        function("COMPANY_ID", Oracle.NextSequence("SEQ_COMPANY_ID"));
        field("DUNS_NUMBER", "DUNS_NO");
        field("COMPANY", "COMPANY");
        field("TRADE_NAME", "TRADE_NAME");
        ...
        function("TIME_CREATED", Time.Begin(), DATE_TIME);
        row();
        table();
    }
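The sample above shows only the calling side. As a rough sketch of how one-line `table`, `field` and `function` calls could accumulate into a single parameterized INSERT, consider the following. The class `MappingSketch`, its methods `insertSql` and `values`, and the argument order (target column first, source field second) are all assumptions; the slides do not show the framework's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a one-line-per-column mapping API. The argument
// order (target column first, source field second) is an assumption.
class MappingSketch {
    private final Map<String, String> source;   // one parsed input record
    private final Map<String, Object> columns = new LinkedHashMap<>();
    private String table;

    MappingSketch(Map<String, String> source) {
        this.source = source;
    }

    // Starts a mapping for the named target table.
    void table(String name) {
        table = name;
        columns.clear();
    }

    // Copies a source field into a target column unchanged.
    void field(String targetColumn, String sourceField) {
        columns.put(targetColumn, source.get(sourceField));
    }

    // Stores a computed value in a target column.
    void function(String targetColumn, Object value) {
        columns.put(targetColumn, value);
    }

    // Builds a parameterized INSERT covering the mapped columns.
    String insertSql() {
        String names = String.join(", ", columns.keySet());
        String marks = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    // Returns the values in column order, ready to bind to the statement.
    List<Object> values() {
        return new ArrayList<>(columns.values());
    }

    public static void main(String[] args) {
        MappingSketch m = new MappingSketch(Map.of("DUNS_NO", "123456789", "COMPANY", "Acme"));
        m.table("COMPANY_LOAD");
        m.function("COMPANY_ID", 1L);
        m.field("DUNS_NUMBER", "DUNS_NO");
        m.field("COMPANY", "COMPANY");
        System.out.println(m.insertSql());
        System.out.println(m.values());
    }
}
```

In a real loader the generated SQL would be passed to a JDBC `PreparedStatement` and the values bound with `setObject`; that step is omitted here so the sketch runs without a database.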
5
Design for Data Mapping Code
[Class diagram. Components: Dun & Bradstreet Mapping, Dun & Bradstreet Header, Dun & Bradstreet Source, Data Mapping, Data Header, Data Status, Data Exception, Data Debug, Data Log, Data File, Data Functions (Time Functions, Oracle Functions, Math Functions, ...), Data Target, Oracle Target, Delimited Target, Data Source, Oracle Source, Delimited Source, Data Interface, Data Connection, Data Set, Data Tables, Data Row, Data Fields. Relationship labels: is, are, has, uses, writes.]
6
Promises of Commercial ETL Tools
- Code generators or engine-based tools can reduce or eliminate coding.
- Graphical interfaces can help users visualize mappings and reduce errors.
- Some ETL tools interface with data design and metadata tools for easier structuring and standardization.
7
Commercial ETL Tool Investigation
- Independently investigated about 20 commercial ETL tools, with methods ranging from web site surveys to standardized comparisons.
- “Given the generally high pricing for the predominant ETL tools, it is sometimes challenging to build a clear business case to make this switch.” – Gartner, May 2002.
- “Augmentation in the form of custom code remains a requirement for the majority of ETL tool deployments.” – Gartner, May 2002.
8
Limitations of Commercial ETL Tools
- Code generators create more, and more complex, code.
  - Oracle Warehouse Builder required 3 times as much PL/SQL code as the custom Java framework needed for the sample Dun & Bradstreet data.
- Engine-based tools are expensive and limited in handling complex data formats.
  - Initial costs for Informatica would be > $300K.
  - Its complex flat file component showed limited functionality during another team’s trial use.
9
Data Migration for an Object-Oriented Database
- Declarative data mapping framework compiles object models into Java.
- Declares attributes, functions, relationships, and translations for data and queries.
- Splits, flattens and joins structured flat files.
- Maps data into ObjectStore using its API.
- 16,000-line framework, built in 1 year.
- Most mappings are one line of code.
- Custom functions can be added as needed.
- New data sources can be added with minimal code.
10
Sample Data Mapping Declarations

    class CCR (method DelimitedText) {
        string DUNS = DUNS;
        string DUNS4 = String.Concat(DUNS, Plus4);
        string Status = Status;
        string Name = LegalBusinessName;
        string StreetAddress1 = StreetAddress1;
        string StreetAddress2 = StreetAddress2;
        ...
        relationship CCR Parent inverse Subsidiaries where ParentDUNS4 == CCR.DUNS4;
        relationship set(CCR) Subsidiaries inverse Parent where DUNS4 == CCR.ParentDUNS4;
    }
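The two `relationship ... inverse ...` declarations describe a parent link and its inverse set, joined on DUNS4. A minimal Java sketch of what resolving such a pair could look like is shown here. The class `Ccr` and the `link` method are hypothetical; the slides do not show the Java that the framework generates from the declarations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: field names mirror the declarations, but the
// linking code is an assumption about what a compiler for those
// declarations could generate.
class Ccr {
    final String duns4;
    final String parentDuns4;                          // null when there is no parent
    Ccr parent;                                        // relationship CCR Parent
    final List<Ccr> subsidiaries = new ArrayList<>();  // relationship set(CCR) Subsidiaries

    Ccr(String duns4, String parentDuns4) {
        this.duns4 = duns4;
        this.parentDuns4 = parentDuns4;
    }

    // Resolves Parent and its inverse Subsidiaries by matching each
    // record's ParentDUNS4 against an index keyed on DUNS4.
    static void link(List<Ccr> records) {
        Map<String, Ccr> byDuns4 = new HashMap<>();
        for (Ccr r : records) {
            byDuns4.put(r.duns4, r);
        }
        for (Ccr r : records) {
            Ccr p = r.parentDuns4 == null ? null : byDuns4.get(r.parentDuns4);
            if (p != null) {
                r.parent = p;
                p.subsidiaries.add(r);   // keep both sides consistent
            }
        }
    }

    public static void main(String[] args) {
        Ccr parent = new Ccr("1234567890000", null);
        Ccr child = new Ccr("9876543210000", "1234567890000");
        link(List.of(parent, child));
        System.out.println(child.parent == parent);
        System.out.println(parent.subsidiaries.size());
    }
}
```

Setting both sides in the same step is what makes the relationships inverses of each other: a record appears in a parent's subsidiaries exactly when that parent is its `parent`. Records whose parent is not in the data set are left unlinked.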
11
Design for Data Mapping Declarations
[Component diagram. Components: Data Definition, Data Class Precompile, Data Class, Data Class Source, Data File, Data Warehouse, Data Warehouse Writer, Data Warehouse Reader, Object Store, String Functions, Math Functions, ... Relationship labels: accesses, generates, reads, writes, uses.]
12
Summary
- Custom data migration frameworks in Java:
  - Declarative syntax simplifies data mapping.
  - Handle non-standard structured data formats.
  - Require minimal code for new sources.
  - Can be used to pre-process non-standard data formats before using a commercial tool.
- Commercial ETL tools:
  - GUI is valuable for standard row and column data.
  - Limited in handling complex structured data.
  - Can be very expensive.