A DFDL Proposal based on Commercial Data Processing Requirements


1 A DFDL Proposal based on Commercial Data Processing Requirements
Mike Beckerle, Technology Office

2 Ascential Software, Inc.
GGF Sponsor
Enterprise Data Integration
High-volume parallel processing
Commercial Record-Oriented Data
Complex formats: XML, Cobol, C, ad-hoc.
Clusters and Intra-Enterprise Grids
Deployments have 100s of computers
Apps are performance critical!
“Do what’s right for the customer.”
Open standards for data format description

3 DFDL Dream Roadmap
DFDL is one of the most important things the GGF is working on!
2004: GGF, initial implementations, draft std.
2005: ANSI/ISO process begins

4 Chronology/Thought Process
Somewhere in MikeB’s brain…
The DFDL-WG really needs to see the crazy list of attributes for commercial data that we run into all the time.
Hmmm. We also already integrate metadata from SQL, Cobol, SAS, EDI, and various other sources; we use a common model for that.
I’ve gathered a very comprehensive list of the representation attributes.
XML has XSDL and the information set idea; ASCL has several similar things internally.
So…

5 Requirements Came from:
Ascential DataStage Products
  Cobol/Mainframe, Relational, XML, ad-hoc data sources are commonly handled
Mercator Products
  EDI data formats, esp. X.12
OMG CWM (Common Warehouse Metamodel)
RDBMS SQL data model
SAS (new GGF sponsor!!!)
XSDL and XML
Lots of Internationalization and Unicode experience

6 How to Read/Interpret this Document
Doc is NOT a response to any other DFDL-WG proposals
Was prepared in parallel, not in response
There are still lots of TBDs
Attributes list is quite comprehensive.
Character sets covered comprehensively.

7 Themes
Information Set / Abstract Data Model Goals distinct from Representation Layer Goals
Read/Write Symmetry
Completeness: Describe anything
Without making common cases too hard
Handle commercial data formats directly
[Diagram labels: DFDL Information Set; Representation Stream as Data Blocks; Mapping to Binary Stream]
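
The read/write symmetry theme can be illustrated with a minimal sketch in which one layout description drives both parsing and writing. This is not DFDL syntax and not the proposal's design: the `TextField` class, the `parse` and `unparse` functions, and the two-field layout are all hypothetical names chosen for illustration, and only fixed-length character fields are handled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TextField:
    """Hypothetical description of one fixed-length character field."""
    name: str
    length: int      # length in bytes
    encoding: str    # a Python codec name, e.g. "cp037" for EBCDIC


def parse(fields: List[TextField], data: bytes) -> Dict[str, str]:
    """Representation -> abstract values, driven by the description."""
    record, offset = {}, 0
    for f in fields:
        record[f.name] = data[offset:offset + f.length].decode(f.encoding)
        offset += f.length
    return record


def unparse(fields: List[TextField], record: Dict[str, str]) -> bytes:
    """Abstract values -> representation, driven by the same description."""
    out = bytearray()
    for f in fields:
        raw = record[f.name].encode(f.encoding)
        if len(raw) != f.length:
            raise ValueError(f"{f.name}: expected {f.length} bytes, got {len(raw)}")
        out += raw
    return bytes(out)


# Hypothetical layout: a 4-byte EBCDIC field followed by a 2-byte ASCII field.
layout = [TextField("job", 4, "cp037"), TextField("tag", 2, "ascii")]
data = bytes.fromhex("F1F2F3F4") + b"OK"
record = parse(layout, data)
assert unparse(layout, record) == data  # the round trip preserves the bytes
```

Because both directions read the same description, a format described once can be both consumed and produced without a second, hand-written serializer.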

8 Value of DFDL Information Set
[Diagram labels: XML Info. Set; Java; C/C++; Fortran; DFDL Information Set; Representation Stream as Data Blocks (FB, VBS, etc.); Mapping to Binary]

9 Record Format Complexity
A typical field definition within a record:
Name: SMF6JNM
Length: 4 bytes EBCDIC
Description: When SMF6INDC contains a X'1', this field contains a four-digit EBCDIC job number. When SMF6INDC contains a X'3' or greater, the job number has more than four digits, and this field contains zeroes. The correct job number is then found in SMF6JBID.
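
The conditional rule in this field description can be sketched as a small function. This is only an illustration under stated assumptions: the function name `job_number` is hypothetical, the three fields are assumed to have been extracted from the record already, SMF6INDC is assumed to arrive as an integer, SMF6JBID is assumed to be EBCDIC text, and Python's `cp037` codec stands in for whichever EBCDIC code page the data actually uses. The description does not say what an indicator value of 2 means, so the sketch rejects anything it does not cover.

```python
def job_number(smf6indc: int, smf6jnm: bytes, smf6jbid: bytes) -> str:
    """Pick the job number according to the indicator field.

    Assumption: fields are already sliced out of the record;
    "cp037" is one EBCDIC code page, used here as a stand-in.
    """
    if smf6indc == 0x1:
        # SMF6JNM holds a four-digit EBCDIC job number.
        return smf6jnm.decode("cp037")
    if smf6indc >= 0x3:
        # SMF6JNM holds zeroes; the real job number is in SMF6JBID.
        return smf6jbid.decode("cp037").strip()
    # Other indicator values are not covered by the field description.
    raise ValueError(f"unhandled SMF6INDC value: {smf6indc:#x}")


# Example with made-up field contents.
short = job_number(0x1, bytes.fromhex("F1F2F3F4"), b"")
long_ = job_number(0x3, bytes.fromhex("F0F0F0F0"), "JOB12345".encode("cp037"))
```

The point of the example is that the meaning of one field depends on the value of another, which a format description has to be able to express.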

10 Favorite(?) Data Attributes
yyEarliestYear
  Is “03” 1903, or 2003?
overpunchedASCIISignStyle
  e.g., +120 decimal
  Hex F1.F2.C0 in EBCDIC = “12{”
  Hex D in ASCII = “12{”
digitGroupingScheme=“3,2”
  12,12,34, (Thai)
  ,89 (much of Europe)
  121,234, (US)
calendar
  Q: How many days old is someone born on CE?
  A: Depends on what country they were born in!
  Greece and Turkey both converted to the Gregorian calendar since 1923.
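
Two of these attributes can be sketched in code. The function names below are hypothetical, and the semantics are assumptions rather than the proposal's definitions: `yyEarliestYear` is read as the start of a 100-year window into which a two-digit year is placed, and the overpunch decoder implements one common convention ("{" and "A" to "I" for a positive final digit 0 to 9, "}" and "J" to "R" for a negative one). The attribute name suggests that several sign styles exist; only this one is shown.

```python
def expand_two_digit_year(yy: int, earliest_year: int) -> int:
    """Place a two-digit year in the window [earliest_year, earliest_year + 99]."""
    if not 0 <= yy <= 99:
        raise ValueError("yy must be in 0..99")
    candidate = earliest_year - earliest_year % 100 + yy
    if candidate < earliest_year:
        candidate += 100
    return candidate


_POSITIVE = "{ABCDEFGHI"   # index = value of the final digit, sign +
_NEGATIVE = "}JKLMNOPQR"   # index = value of the final digit, sign -


def decode_overpunched(text: str) -> int:
    """Decode a decimal whose last character carries both digit and sign."""
    if not text:
        raise ValueError("empty field")
    body, last = text[:-1], text[-1]
    if last in _POSITIVE:
        sign, digit = 1, _POSITIVE.index(last)
    elif last in _NEGATIVE:
        sign, digit = -1, _NEGATIVE.index(last)
    elif last in "0123456789":
        sign, digit = 1, int(last)   # no overpunch: treated as unsigned
    else:
        raise ValueError(f"not an overpunch character: {last!r}")
    return sign * int(body + str(digit))


# "03" is 1903 or 2003 depending on where the window starts.
assert expand_two_digit_year(3, 1900) == 1903
assert expand_two_digit_year(3, 1950) == 2003

# The EBCDIC bytes F1 F2 C0 decode (via code page cp037) to the text "12{".
ebcdic_text = bytes.fromhex("F1F2C0").decode("cp037")
assert decode_overpunched(ebcdic_text) == 120
```

In both cases the raw bytes alone are ambiguous, and the attribute supplies the extra information needed to recover the intended value.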

11 Next Steps
Clean up separation of rep from abstract layer
Factoring of binary rep attributes from character rep attributes
Clarify attribute inheritance idiom
Attributed type trees are central to the proposal, but not clearly explained in this draft.
Expression language
Esp. the library it has available
Find common ground with other DFDL proposals

