Augmenting Traditional Conceptual Models to Accommodate XML Structures
Stephen W. Liddle, Information Systems Department
Reema Al-Kamha & David W. Embley, Computer Science Department
Brigham Young University, Provo, Utah

2 of 25 5 November 2007 ER 2007, Auckland, New Zealand

Outline
- Background
- XML modeling criteria
- Missing modeling constructs
- C-XML
- Augmenting ER and UML
- Conclusion

The Need for Greater Abstraction
- The history of modeling and programming languages is the history of our attempts to scale the mountain of abstraction
- XML needs better abstractions

XML Schema/Model Mismatch
- XML features not explicitly supported in traditional conceptual models:
  - Ordered lists of concepts
  - Choice of concept from among several
  - Nested information hierarchies
  - Mixed content
  - Use of content from another model

Our Contributions
- Proposed conceptual representation for XML structures
- Representation of these concepts in ER and UML

XML Modeling Criteria
- Graphical notation
- Formal foundation
- Structure independence
- Reflection of the mental model
- N-ary relationship sets
- Logical level mapping
- Cardinality for all participants
- Irregular & heterogeneous structure
- Document-centric data
- Views
- Constraints
- Ordering

Missing Modeling Constructs (1)
- Sequence structure
  - Parent concept
  - Ordered child concepts
  - Constrained recurrence of children
  - Constrained recurrence of sequence itself

    <xs:element name="MiddleName" type="xs:string" minOccurs="0" maxOccurs="2"/>

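As an illustrative sketch of the sequence structure, the fragment below places the slide's MiddleName element inside an ordered group. The parent name `Name` and the siblings `FirstName` and `LastName` are assumptions introduced for illustration; they do not come from the slides. The occurrence attributes on `xs:sequence` constrain the recurrence of the sequence itself, while those on the child element constrain the recurrence of that child.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Parent concept (hypothetical name, not from the slides) -->
  <xs:element name="Name">
    <xs:complexType>
      <!-- Recurrence of the sequence itself: exactly once in this sketch -->
      <xs:sequence minOccurs="1" maxOccurs="1">
        <!-- Ordered child concepts; FirstName and LastName are assumed -->
        <xs:element name="FirstName" type="xs:string"/>
        <!-- Recurrence of a child: zero to two middle names -->
        <xs:element name="MiddleName" type="xs:string" minOccurs="0" maxOccurs="2"/>
        <xs:element name="LastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
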
Missing Modeling Constructs (2)
- Choice structure
  - Parent concept
  - Choose one child concept from several alternatives
  - Constrained recurrence of chosen child
  - Constrained recurrence of choice itself

    <xs:element name="PhoneNumber" type="xs:string" minOccurs="1" maxOccurs="2" />

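A comparable sketch for the choice structure follows. The parent `Contact` and the alternative `Email` are hypothetical names chosen only to give the slide's PhoneNumber element a context. Making the choice optional is likewise an assumption, shown to illustrate a recurrence constraint on the choice itself.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Parent concept (hypothetical name, not from the slides) -->
  <xs:element name="Contact">
    <xs:complexType>
      <!-- Recurrence of the choice itself: optional, at most once -->
      <xs:choice minOccurs="0" maxOccurs="1">
        <!-- If chosen, PhoneNumber may appear once or twice -->
        <xs:element name="PhoneNumber" type="xs:string" minOccurs="1" maxOccurs="2"/>
        <!-- Assumed alternative, added for illustration -->
        <xs:element name="Email" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
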
Missing Modeling Constructs (3)
- Mixed attribute
  - Allows character and element data to be intertwined
- Any and anyAttribute structures
  - Insert structures from other namespaces
  - Constrained recurrence

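The constructs on this slide might appear together in XML Schema roughly as sketched below. The element names `Note` and `Emphasis` are assumptions made for illustration. Setting `mixed="true"` lets character data interleave with child elements, and the `xs:any` and `xs:anyAttribute` wildcards admit content from other namespaces, with occurrence attributes bounding how often the wildcard may match.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element with mixed content -->
  <xs:element name="Note">
    <xs:complexType mixed="true">
      <xs:sequence>
        <!-- Child elements may be intertwined with text (assumed name) -->
        <xs:element name="Emphasis" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <!-- Wildcard: elements from other namespaces, any number of times -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Wildcard: attributes from other namespaces -->
      <xs:anyAttribute namespace="##other" processContents="lax"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
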
Missing Modeling Constructs (4)
- Nesting of hierarchical structures
  - Key organizational characteristic of XML
  - Arbitrarily complex nesting possible

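To make the nesting point concrete, here is a small sketch with three levels of hierarchy expressed through anonymous complex types. The names `Department`, `Employee`, `Project`, `Name`, and `Title` are hypothetical and are not taken from the slides. Each level could itself contain further nested structures.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Level 1 (all names in this sketch are assumed) -->
  <xs:element name="Department">
    <xs:complexType>
      <xs:sequence>
        <!-- Level 2: nested inside Department -->
        <xs:element name="Employee" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Name" type="xs:string"/>
              <!-- Level 3: nested inside Employee -->
              <xs:element name="Project" minOccurs="0" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="Title" type="xs:string"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```
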
C-XML Augmentations
- C-XML foundation proposed at ER 2004
- Augmented hypergraph of relationship sets and object sets
  - Sequence
  - Choice
  - Any/anyAttribute
  - Mixed content

C-XML Example

C-XML Modeling Sufficiency
- Graphical notation
- Formal foundation
- Structure independence
- Reflection of the mental model
- N-ary relationship sets
- Logical level mapping
- Cardinality for all participants
- Irregular & heterogeneous structure
- Document-centric data
- Views
- Constraints
- Ordering

C-XML Advantages w.r.t. XML
- Structure independence
  - Associate multiple sequences/choices with a single concept
  - Intermix ordinary relationship sets with sequences/choices
  - Generalized sequence, choice, mixed content
- Reflection of mental model
  - Represent hierarchical and non-hierarchical structure
  - No attribute vs. element decision
  - Both ordered and unordered related concepts
  - Choice distinguished from generalization/specialization
  - Full range of cardinality constraints

Other Conceptual Models for XML
- We are not the only ones working on this
  - See surveys by Sengupta & Wilde (2003) and Necasky (2006)
- XML Schema itself responds in part to the need for better conceptual models of XML
- But there is more to be done

Methodology
- The market wants standards
- Consider popular conceptual models
  - ER
  - UML (class diagrams)
- Find related work that maps those models to XML
  - XER (XML for ER) by Sengupta et al.
  - Conrad et al. work on UML
- Compare their work with C-XML to identify problems
- Extend their work to XML Schema
  - XER → ER-XML
  - Conrad → UML-XML

XER (Best Attempt)

XER Issues
- Lacks some cardinality constraints
- Missing any and anyAttribute constructs
- No representation for composite keys
- Can only apply key designation to attributes
- Combines notions of choice and generalization/specialization in one construct
  - This is a conceptual mismatch for various reasons
- Need to allow for anonymous entities
- We made many assumptions to present a cleaned-up version of XER as ER-XML

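For context on the composite-key issue, XML Schema itself can express a key over several fields with `xs:key`. The sketch below is only an illustration of that target construct. The element and attribute names (`Enrollments`, `Enrollment`, `studentId`, `courseId`) are hypothetical and do not come from XER or the slides.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical container element -->
  <xs:element name="Enrollments">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Enrollment" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="studentId" type="xs:string" use="required"/>
            <xs:attribute name="courseId" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Composite key: the pair (studentId, courseId) must be unique -->
    <xs:key name="EnrollmentKey">
      <xs:selector xpath="Enrollment"/>
      <xs:field xpath="@studentId"/>
      <xs:field xpath="@courseId"/>
    </xs:key>
  </xs:element>
</xs:schema>
```
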
ER-XML (XER++)

Conrad UML/XML (Best Attempt)

Conrad UML/XML Issues
- Key construct: Extended UML aggregation to map choice and sequence structures
- Lack of anyAttribute construct
- Lack of mixed content construct
- Lacks key constraints
  - Unless you count OCL
- Sequence/choice representation can only be applied to classes, not attributes

UML-XML (Conrad++)

C-XML Advantages
- Formal foundation
  - XER not formally defined
  - UML formal description not fully developed
- Reflection of mental model
  - No attribute/element distinction required
- Views
  - Hypergraphs are amenable to view translations
  - C-XML has high-level views (object sets and relationship sets) as a first-class construct
- Logical level mapping
  - We have mapped both XML Schema → C-XML and C-XML → XML Schema
- Cardinality for all participants
  - C-XML handles cardinality constraints for more participants than even XML Schema supports

Conclusion
- Problem: some XML Schema concepts are missing in traditional conceptual models
- Solution: enrich conceptual models with ability to
  - Order a list of concepts
  - Choose alternatives from among several
  - Specify mixed content
  - Use content data from another data model

Conclusion
- C-XML answers the need
- Our solution can be adapted to ER and UML languages
  - Though these adaptations aren’t quite as good at satisfying modeling requirements as C-XML