Ontologizing EDI doug foxvog 23 July 2004
Ontologizing EDI
  What is EDI?
  EDI Data Types
  Ontologizing of EDI
  Ontologizing Invoice Message Type
  Summary
EDI (Electronic Data Interchange)
  EDI is a system for standardized business message passing
  Used by hundreds of thousands of corporations
  Goal: look into ontologizing EDI
  Can Invoice, PurchaseOrder, Functional Acknowledgment, and PaymentOrder messages be mapped to Semantic Web (SW) languages?
EDI
  Obtained ASC X12 EDI Workbook (23 MB HTML)
    – 318 Message Types (Transaction Sets)
    – 1019 Message Segment Types
    – 34 “Composite” Segment Types
    – 1466 Data Element Types
        Code Sets
        Numbers
        Text
        Parts of Data Segments
    – Up to 1500 (or more) codes per code set
    – 578 External Code Sets (must find elsewhere)
  Hardcopy is the size of a large telephone book
    – Without External Code Sets
Message Type Description
  Transaction Set defines:
    – Segment Types in message
    – Order of Segment Types
    – Optionality of Segment Types
    – Conditionality: A if B; A if not B; A only if B; A only if not B
    – Repetitions of Segment Types: maximum number or unlimited
    – Repetition of groups of Segment Types
  The same Segment Type in a different part of a message has a different meaning with respect to the message
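The layout rules listed on this slide (order, optionality, conditionality, repetition) can be sketched as a small checker. This is a minimal illustration, not X12 syntax: `SegmentSlot`, `violates`, `check_message`, and the segment names in `LAYOUT` are all hypothetical, each segment type is assumed to occupy a single position, and groups and nested loops are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SegmentSlot:
    """One position in a transaction set layout (hypothetical model, not X12 syntax)."""
    name: str                                    # illustrative segment type name
    mandatory: bool = False                      # optionality
    max_repeat: Optional[int] = 1                # None means unlimited repetitions
    condition: Optional[Tuple[str, str]] = None  # (kind, other segment name)


def violates(kind: str, a_present: bool, b_present: bool) -> bool:
    """True when the conditionality rule relating segment A to segment B is broken."""
    if kind == "if":           # A if B: B present requires A
        return b_present and not a_present
    if kind == "if_not":       # A if not B: B absent requires A
        return not b_present and not a_present
    if kind == "only_if":      # A only if B: A present requires B
        return a_present and not b_present
    if kind == "only_if_not":  # A only if not B: A and B may not both appear
        return a_present and b_present
    raise ValueError(f"unknown condition kind: {kind}")


def check_message(layout, segments):
    """Return a list of layout violations for a message given as ordered segment names."""
    order = {slot.name: i for i, slot in enumerate(layout)}
    errors = [f"unknown segment {s}" for s in segments if s not in order]
    positions = [order[s] for s in segments if s in order]
    if positions != sorted(positions):
        errors.append("segments out of order")
    for slot in layout:
        count = segments.count(slot.name)
        if slot.mandatory and count == 0:
            errors.append(f"missing mandatory segment {slot.name}")
        if slot.max_repeat is not None and count > slot.max_repeat:
            errors.append(f"{slot.name} repeated {count} times, maximum {slot.max_repeat}")
        if slot.condition is not None:
            kind, other = slot.condition
            if violates(kind, count > 0, other in segments):
                errors.append(f"{slot.name} breaks rule '{kind} {other}'")
    return errors


# Hypothetical layout; names and limits are for illustration only.
LAYOUT = [
    SegmentSlot("HEADER", mandatory=True),
    SegmentSlot("CURRENCY"),
    SegmentSlot("DATE_TIME", max_repeat=10, condition=("only_if", "CURRENCY")),
    SegmentSlot("LINE_ITEM", mandatory=True, max_repeat=None),
]

print(check_message(LAYOUT, ["HEADER", "CURRENCY", "DATE_TIME", "LINE_ITEM", "LINE_ITEM"]))
print(check_message(LAYOUT, ["LINE_ITEM", "HEADER", "DATE_TIME"]))
```

A checker like this captures only the layout; it says nothing about what a segment means at a given position, which is the part the transaction set leaves unstated.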
Invoice Message Examined
  90 Segment Type Fields
  9 Mandatory Fields
  Groups of Fields repeat zero to 200,000 times (or unlimited)
  Nested loops of Segment Type groups
  Field Types repeat with different semantics
    – Date/Time Field in 8 places
    – Reference Information in 6 places
  49 Segment Types
  191 Data Element Types (used by Segments)
Invoice Message Contents
  Header Information
  Parties Identified
  Ungrouped Information
  Industry Codes (multiple Code Sets)
  Reference, Vessel, & Accounting Blocks
  …,000 Line-Item Blocks
    – Many Sub-blocks within Line-Item Block
  Summary Information
    – Accounting, Tax, & Shipping Blocks repeat
Example Segment Type: Currency
  Has 21 Data Elements
    – 2 Currency Codes
    – 1 Exchange Rate
    – 1 Currency Market Code
    – 2 Entity Identifiers
    – 5 Date/Time Sets (!), each with
        Date/Time Qualifier Code
        Date Field
        Time Field
Example Data Element Types
  EntityIdentifier: code identifying type of
    – Organizational Entity
    – Physical Location
    – Property
    – Person
    – Entity type relative to transaction (e.g. Employer)
    1500 available codes
  Date/TimeQualifier: code identifying
    – Type of thing date applies to
    – How date applies to thing (ends, promised for, …)
    1416 available codes
Ontologizing EDI for SWS
  Purpose would be ontology mapping
    – Relay info from EDI invoice
    – Produce EDI-format purchase order
    – Produce EDI-format payment message
    – Send & detect EDI acknowledgement messages
  How reasonable would such a mapping be, considering such a huge dataset?
    – What can we get by with for these four message types?
  What have others done?
Ontologizing at Different Levels
  Transaction Set (Message Type)
    – Meaning of segments relative to TS unstated
  Data Segment Groups
    – Appear unlabeled in Transaction Set file
  Data Segment Types
    – Format of each provided in file
    – Data elements in each
    – Data element dependencies stated
  Data Element Codes
    – Can be concepts, relations, or a mixture
    – Affect meaning of Segments
    – Some applicable only to certain Segment or TS types
Data Segment Ontologizing
  EDI file describes only message layout
  Ontology must focus on meaning
  Relationships among Data Elements must be expressed
    – Currency segment is defined with 2 Currency codes
    – If both are present, one is the source currency, which is being converted into the second currency
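The Currency example can be sketched as a step from positional fields to explicit statements. This is a minimal sketch under stated assumptions: the `CurrencySegment` fields are a simplification of the real segment, the predicate names (`denominatedIn`, `convertedFrom`, `convertedTo`, `exchangeRate`) are hypothetical, and treating the first code as the source currency is an assumption, since the slide does not say which of the two is the source.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class CurrencySegment:
    """Layout-level view: positional fields with no stated relationship (simplified)."""
    first_currency: str
    exchange_rate: Optional[Decimal] = None
    second_currency: Optional[str] = None


def currency_meaning(seg: CurrencySegment, subject: str) -> List[Triple]:
    """Turn the positional fields into explicit statements about `subject`."""
    if seg.second_currency is None:
        # Only one code present: it simply names the currency in use.
        return [(subject, "denominatedIn", seg.first_currency)]
    # Both codes present: a conversion.
    # Assumption: the first code is the source currency.
    triples = [
        (subject, "convertedFrom", seg.first_currency),
        (subject, "convertedTo", seg.second_currency),
    ]
    if seg.exchange_rate is not None:
        triples.append((subject, "exchangeRate", str(seg.exchange_rate)))
    return triples


print(currency_meaning(CurrencySegment("USD"), "invoice1"))
print(currency_meaning(CurrencySegment("USD", Decimal("0.81"), "EUR"), "invoice1"))
```

The same field (`first_currency`) yields a different predicate depending on whether the second code is present, which is the kind of dependency the layout file leaves implicit.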
Data Element Ontologizing
  Some are homogeneous code sets, so it is easy to encode the whole set
    – Currency Code: over 160 currencies
    – Currency Exchange Code: 6 exchanges
  Some are heterogeneous
    – Time Code: UTC+2, EastEuropeTime, EastEurSummerTime
    – Entity Identifier Code: Org., Person, Location, Participant type
        Has multiple internal taxonomies
  Some are text
    – CityName: cities need to be ontologized
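The difference in effort between the two kinds of code set can be sketched as follows. This is an illustration only: the function names, the `instanceOf` predicate, and the concept names are hypothetical, the three currency codes are sample values, and the time-code keys are the labels from the slide rather than actual EDI code values.

```python
def encode_homogeneous(codes, concept):
    """Every code in the set is an instance of the same concept: one rule covers all."""
    return [(code, "instanceOf", concept) for code in sorted(codes)]


def encode_heterogeneous(code_to_concept):
    """Each code needs its own hand-assigned concept."""
    return [(code, "instanceOf", concept)
            for code, concept in sorted(code_to_concept.items())]


def still_unmapped(codes, code_to_concept):
    """Codes that nobody has classified yet: the remaining manual work."""
    return sorted(set(codes) - set(code_to_concept))


# Sample currency codes; the full set has over 160 entries.
currency_triples = encode_homogeneous({"USD", "EUR", "JPY"}, "Currency")

# Hypothetical concept assignments for the time codes named on the slide.
TIME_CODE_CONCEPTS = {
    "UTC+2": "FixedUtcOffset",
    "EastEuropeTime": "NamedTimeZone",
    "EastEurSummerTime": "SummerTimeZone",
}
time_triples = encode_heterogeneous(TIME_CODE_CONCEPTS)

print(currency_triples)
print(time_triples)
print(still_unmapped(["UTC+2", "EastEuropeTime", "SomeOtherCode"], TIME_CODE_CONCEPTS))
```

For a homogeneous set the mapping is a single line regardless of size; for a heterogeneous set the dictionary grows with every code and must be written by hand.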
How Much Needs to be Ontologized?
  How to determine the subset needed/appropriate for Web Services?
  Meaning of many Data Element Codes needs ontologizing: how to select?
  Large variety of topics to be covered
    – An ontology is needed for each
    – Ontologies need to be tied together
Should EDI be Ontologized?
  How much effort to cover enough to establish communication with EDI systems?
  Has anyone else ontologized EDI, or a significant portion thereof?
  Are companies moving away from EDI to other systems?
  Is this effort to aid buggy-whip manufacturers?
Ontologizing Invoice
  Topics include
    – Time
    – Temporal relations
    – Geographical regions
    – Physical products
    – Measured quantities
    – Meta-information
    – Currencies
    – Contracts
    – Reports
    – Banking
    – Taxes
    – Delivery
  Need ontologies for each of these
Ontologizing Invoice
  The Data Segment types included in the Invoice Transaction Set were ontologized
    – To different levels by different students
  Some of the Data Element types had their codes ontologized
    – This would be needed for ontology mapping
  Different ontology languages used
    – WSML, FLORA, RDF-S, CycL
  Data Elements within Currency and Date/Time Data Segment types expressed in several languages
Date – Time Segment
  Date and/or Time
  Date/Time Format specifier
  Date/Time Qualifier
    – Type of thing timestamped
    – How timestamp relates to thing
        Date/Time of event
        Start/End date/time
        Expected/Promised/Scheduled/Requested … time
        Effective date, expiration date, due date, date of birth
        Corrected/Former
    – Combination of above with type
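Since a Date/Time Qualifier combines a type of thing with a relation to the timestamp, one way to ontologize it is to split each code into those two parts. The sketch below assumes this decomposition; the qualifier codes, thing types, and relation names are hypothetical, and the Date/Time Format specifier is ignored by passing the date as an already-normalized string.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """How a timestamp relates to the thing it is attached to (illustrative subset)."""
    OCCURS = "occursAt"
    STARTS = "startsAt"
    ENDS = "endsAt"
    PROMISED = "promisedFor"
    DUE = "dueAt"


@dataclass(frozen=True)
class Qualifier:
    thing_type: str      # type of thing timestamped
    relation: Relation   # how the timestamp relates to that thing


# Hypothetical qualifier codes; real code values are not reproduced here.
QUALIFIERS = {
    "Q-SHIP-PROMISED": Qualifier("Shipment", Relation.PROMISED),
    "Q-CONTRACT-END": Qualifier("Contract", Relation.ENDS),
    "Q-PAYMENT-DUE": Qualifier("Payment", Relation.DUE),
}


def interpret(code: str, date: str):
    """Return (thing type, relation name, date) for a qualified timestamp."""
    qualifier = QUALIFIERS[code]
    return (qualifier.thing_type, qualifier.relation.value, date)


print(interpret("Q-CONTRACT-END", "2004-07-23"))
```

Splitting the qualifier this way lets the relation taxonomy and the thing-type taxonomy be maintained separately, rather than enumerating every combination as its own concept.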
Comparing Invoice to PurchaseOrder and PayOrder
  Compared across Invoice, Purchase Order, and Pay Order: Segment Types and Data Element Types
  Much overlap:
    – Data Elements: 86 in all 3; 71 in two of three; 295 total used
    – Segment Types: 19 in all 3; 15 in two of three; 105 total used
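The overlap figures imply how many types are specific to a single message type. The arithmetic below assumes "in two of three" means exactly two; the function name is hypothetical.

```python
def overlap_summary(total, in_all_three, in_two_of_three):
    """Return (count used by only one message type, fraction shared by two or more)."""
    in_one_only = total - in_all_three - in_two_of_three
    shared_fraction = (in_all_three + in_two_of_three) / total
    return in_one_only, shared_fraction


elements_only_one, elements_shared = overlap_summary(295, 86, 71)
segments_only_one, segments_shared = overlap_summary(105, 19, 15)

print(f"Data Elements in only one message type: {elements_only_one} "
      f"(shared: {elements_shared:.1%})")
print(f"Segment Types in only one message type: {segments_only_one} "
      f"(shared: {segments_shared:.1%})")
```

Under that assumption, 138 Data Element types and 71 Segment Types appear in only one of the three message types, so Data Elements are reused across messages more heavily than Segment Types.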
What has been Ontologized?
  All Invoice Data Segments (to some extent)
  Dates & Times
  Temporal Relations
  Currency Types
  Currency Markets
  Geopolitical Entities
    – Country list to relate to Currencies
  Agent types graphically placed in taxonomy; not encoded yet
Summary
  EDI is a massive set of descriptions of message formats
  An individual message type permits inclusion of thousands of different codes which are syntactically meaningful
  Heterogeneous codes require individual attention
  Portions of EDI already converted to RDF
  EDI may be being phased out
  Questionable whether we should encode
Questions?