Presentation is loading. Please wait.

Presentation is loading. Please wait.

Exploring Remote Object Coherence in XML Web Services

Similar presentations


Presentation on theme: "Exploring Remote Object Coherence in XML Web Services"— Presentation transcript:

1 Exploring Remote Object Coherence in XML Web Services
Robert van Engelen and Wei Zhang Florida State University Madhusudhan Govindaraju State University of New York at Binghamton 9/20/2006 ICWS 2006

2 Outline Motivation “O/X” impedance mismatch
Object-level coherence requirements Mapping programming language types to XML Static-dynamic algorithms for XML (de)serialization Results and conclusions XML has been acc 9/20/2006 ICWS 2006

3 Motivation Can object-level coherence be achieved to ensure data structures and object graphs preserve their states and structure when moved in XML (de)serialization format? 9/20/2006 ICWS 2006

4 O/X Impedance Mismatch(1)
Inability to support derivation by restriction in programming languages Such as restricted value and patterns Not being able to map XML names to identifiers in all cases Unsupported XML schema components Unsupported types No naming mechanism to implement XML namespaces Serializing object graphs The first 3 are being addressed. Once we addressed these issues, we move to the object-coherence 9/20/2006 ICWS 2006

5 Namespaces vs Packages
XML namespace concept can not be translated to c++ namespaces or Java packages Schema snippet: Using namespace a; Class X { Y_TYPE: y; What to be here for namespace b? } C++ class: <complexType name=“X”> <sequence> <element name=“a:y” type=“a:Y_TYPE”/> <element ref=“b:x” /> </sequence> … </complexType> 9/20/2006 ICWS 2006

6 Serializing a graph of object
requires object-level coherence to ensure application-level interoperability 9/20/2006 ICWS 2006

7 XML Web Services Client Server XML Serialization XML Deserialization
Object graph x Object graph Object graph y XML Serialization XML Deserialization Send request XML Server: transform y into y’ Receive request Object graph x’ Object graph Object graph y’ XML Deserialization XML Serialization Receive response XML Send response 9/20/2006 ICWS 2006

8 Web Service Tools: Lossy Serialization in XML
XML Serialization Object graph x Object graph y XML Deserialization Internal binary format X XML Serialization in XML can be lossy, because object graph x cannot be recovered from its serialized XML form if we are not careful about the choice of mapping 9/20/2006 ICWS 2006

9 Basic Requirement: Object Graph Isomorphism
Mapping M Object graph x Object graph y Inverse mapping M -1 Representation X Representation Y Object graphs x and y are isomorph if M is bijective and has an inverse M-1 Object-level coherence requires bijective mappings, so that distributed and serialized object graphs are always isomorph 9/20/2006 ICWS 2006

10 Object Coherence Requires Lossless XML Serialization
Serialization method M Object graph x Object graph y Deserialization method M -1 Internal binary format X XML Serialization is lossless if an object graph x can be recovered from its serialized XML form y = M(x) by deserializing x = M-1(y) We consider structural equivalence only (location in memory is irrelevant) e.g. as in JavaRMI, IIOP, and DCOM 9/20/2006 ICWS 2006

11 Static Structure Analysis for SOAP RPC Encoding
TREE DAG XML Schema <complexType name=“X”> <sequence> <element name=“y” type=“tns:Y”/> <element name=“z” type=“tns:Z”/> </sequence> … </complexType> <complexType name=“X”> <sequence>…</sequence> </complexType> <complexType name=“Y”> <sequence>…</sequence> </complexType <complexType name=“X”> <sequence>…</sequence> </complexType Example XML Instance <x> <y>…</y> <z>…</z> … </x> <x> <y href=“123”/> … <y href=“123”/> … </x> … <mref id=“123”>…</mref> <x> <x href=“456”/> … <x href=“456”/> … </x> … <mref id=“456”>…</mref> X Y Z X Y X Soap1.1 backpatching Soap 1.2 ok Supporting href/id 9/20/2006 ICWS 2006

12 Static Structure Analysis for Document/Literal Encoding
DAG XML Schema <complexType name=“X”> <sequence> <element name=“y” type=“tns:Y”/> <element name=“z” type=“tns:Z”/> </sequence> … </complexType> <complexType name=“X”> <sequence>…</sequence> <attribute name=“ref” type=“REF”/> </complexType> <complexType name=“Y”> <sequence>…</sequence> <attribute name=“id” type=“ID”/> </complexType <complexType name=“X”> <sequence>…</sequence> <attribute name=“id” type=“ID”/> <attribute name=“ref” type=“REF”/> </complexType Example XML Instance <x> <y>…</y> <z>…</z> … </x> <x ref=“123”>…</x> … <x ref=“123”>…</x> … <y id=“123”>…</y> <x ref=“456”>…</x> … <x ref=“456”>…</x> … <x id=“456”>…</x> X Y Z X Y X 9/20/2006 ICWS 2006

13 Mapping Programming language types to XML(1)
Problems with RPC encoding Multi-ref serialization with href and id attributes violates XML schema validation constraints Serialization of nil references, multi-referenced objects, and (sparse) multi-dimensional arrays are not precisely defined Multiple structures may map to same XML However, there are two problems with RPC encoding. The first issue is that the multiref serialization with href and id attributes violates XML schema validation constraints, because these attributes are not part of the schema of a typical data structure. The second problem is that the serialization of nil references, multi-referenced objects, and (sparse) multi-dimensional arrays is not precisely defined, which leads to interoperability problems. More specifically, SOAP-RPC encoding requires the following considerations for effective object-coherent serialization: 9/20/2006 ICWS 2006

14 Mapping Programming language types to XML(2)
Mapping SOAP RPC encoding style requirements Mapping primitive types to built-in primitive XSD types and vice versa Mapping records (structs, classes) to xs:complexType Mapping arrays to SOAP arrays by ensuring support for SOAP 1.1 partial and sparse arrays Object graph must be serialized with multi-reference encoding Choosing a naming mechanism for XML namespaces resolution to avoid name clashes prefix_name However, there are two problems with RPC encoding. The first issue is that the multiref serialization with href and id attributes violates XML schema validation constraints, because these attributes are not part of the schema of a typical data structure. The second problem is that the serialization of nil references, multi-referenced objects, and (sparse) multi-dimensional arrays is not precisely defined, which leads to interoperability problems. More specifically, SOAP-RPC encoding requires the following considerations for effective object-coherent serialization: 9/20/2006 ICWS 2006

15 Mapping Programming language types to XML(3)
Mapping SOAP Document/Literal encoding style requirements Mapping XML attribute definitions within an xs:complexType Support for repetitions of xs:element in xs:complexType sequences Support for use of xs:choice and xs:any Support for xs:group and xs:attributeGroup Support for substitution group Support for top-level element, attribute definitions and references Support for mixed contents However, there are two problems with RPC encoding. The first issue is that the multiref serialization with href and id attributes violates XML schema validation constraints, because these attributes are not part of the schema of a typical data structure. The second problem is that the serialization of nil references, multi-referenced objects, and (sparse) multi-dimensional arrays is not precisely defined, which leads to interoperability problems. More specifically, SOAP-RPC encoding requires the following considerations for effective object-coherent serialization: 9/20/2006 ICWS 2006

16 A static-dynamic approach to XML (de)serialization
gSOAP project: A static-dynamic approach to XML (de)serialization 9/20/2006 ICWS 2006

17 Static Structure Analysis for Compiled XML Serialization
Construct a data model Compiler determines which data type instances can refer to other data type instances at run time Generate type-specific points-to analysis algorithm Only needed for pointer types and structs/classes with pointer fields Generate type-specific optimized serialization code Serializer uses id-ref linking based on pointer analysis Source code type definitions Compiler generates generates Points-to analysis Optimized serializer Ptr hash table (graph edges) 9/20/2006 ICWS 2006

18 Constructing a Plausible Data Model from Source Code
typedef int SSN; struct Node { int val; int *ptr; float num; struct Node *next; }; val ptr num next Node SSN Source code type definitions Data model graph (arcs denote all possible pointer references) 9/20/2006 ICWS 2006

19 Generating an XML Schema Definition for Serialized XML
<simpleType name=“SSN”> <restriction base=“int”/> </simpleType> <complexType name=“Node”> <sequence> <element name=“val” type=“int”/> <element name=“ptr” type=“int” minOccurs=“0”/> <element name=“num” type=“float”/> <element name=“next” type=“tns:Node” minOccurs=“0”/> </sequence> </complexType> val ptr num next Node SSN Data model graph 9/20/2006 ICWS 2006

20 Generating Code for Runtime Points-To Analysis
serialize_pointerToint(int *p) { if (p != NULL) { // lookup and mark (p,TYPE_int) // as target in ptr hash table } } serialize_Node(struct Node *p) { if (p != NULL) { // lookup and mark (p,TYPE_Node) // as target in ptr hash table mark_embedded(&p->val,TYPE_int); serialize_pointerToint(p->ptr); // skip p->num serialize_pointerToNode(p->next); } } val ptr num next Node SSN Data model graph 9/20/2006 ICWS 2006

21 Generating Serialization Code
put_pointerToint(int *p) { if (p != NULL) { // lookup (p,TYPE_int) // if embedded, then output “ref” // if single, then output value // if multi, then output value // with “id” and mark embedded } } put_Node(struct Node *p) { if (p != NULL) { // lookup (p,TYPE_Node) // if embedded, then output “ref” // if single, then output value // if multi, then output value // with “id” and mark embedded put_int(&p->val); put_pointerToint(p->ptr); put_float(&p->num) put_pointerToNode(p->next); } } val ptr num next Node SSN 9/20/2006 ICWS 2006

22 Serialization Example
val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A Node loc=A Node loc=B =789 Ptr hash table after runtime points-to analysis by the generated algorithm constructed from the data model SSN loc=C ID PtrLoc PtrType PtrSize PtrCount RefType 1 A Node 16 2 multi B int 4 embedded 3 single C 9/20/2006 ICWS 2006

23 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 SSN loc=C 9/20/2006 ICWS 2006

24 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> </Node> SSN loc=C 9/20/2006 ICWS 2006

25 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> </Node> SSN loc=C 9/20/2006 ICWS 2006

26 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> </Node> SSN loc=C 9/20/2006 ICWS 2006

27 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> </Node> SSN loc=C 9/20/2006 ICWS 2006

28 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> </next> </Node> SSN loc=C 9/20/2006 ICWS 2006

29 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> </next> </Node> SSN loc=C 9/20/2006 ICWS 2006

30 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> </next> </Node> SSN loc=C 9/20/2006 ICWS 2006

31 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> <num>2.3</num> </next> </Node> SSN loc=C 9/20/2006 ICWS 2006

32 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> <num>2.3</num> <next ref=“#_1”/> </next> </Node> SSN loc=C 9/20/2006 ICWS 2006

33 Example (cont’d) val =123 ptr =B num =1.4 next =B val =456 ptr =C
multi, id=1 single val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A multi, id=1 Node loc=A Node loc=B single embedded, id=2 =789 <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> <num>2.3</num> <next ref=“#_1”/> </next> </Node> SSN loc=C 9/20/2006 ICWS 2006

34 Compiled XML Deserialization
For each data type, generate optimized deserialization code: Generate a recursive descent LL(1) XML parser that is optimized for the data type Embed deserialization operations as semantic actions in the parser Invoke id hash table lookups as semantic actions to resolve id-ref references on the fly The pointer remapper ensures the consistency of pointers in the object graph when nodes are moved in the construction process Source code type definitions Compiler generates LL(1) parser & deserializer Pointer remapper Id hash table (resolve id-ref) 9/20/2006 ICWS 2006

35 Deserialization Example
<Node id=“_1”> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =? ptr =? num =? next =? id=1 Node 9/20/2006 ICWS 2006

36 Example (cont’d) <Node id=“_1”> <val>123</val> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =? num =? next =? id=1 Node 9/20/2006 ICWS 2006

37 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =? num =? next =? id=1 Node ref=2 (unresolved) 9/20/2006 ICWS 2006

38 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =? num =1.4 next =? id=1 Node ref=2 (unresolved) 9/20/2006 ICWS 2006

39 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> </next> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =? num =1.4 next =B val =? ptr =? num =? next =? id=1 Node Node ref=2 (unresolved) 9/20/2006 ICWS 2006

40 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> </next> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =B num =1.4 next =B val =456 ptr =? num =? next =? id=1 Node id=2 Node 9/20/2006 ICWS 2006

41 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> </next> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =B num =1.4 next =B val =456 ptr =C num =? next =? id=1 Node id=2 Node =789 9/20/2006 ICWS 2006 SSN

42 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> <num>2.3</num> </next> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =? id=1 Node id=2 Node =789 9/20/2006 ICWS 2006 SSN

43 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> <num>2.3</num> <next ref=“#_1”/> </next> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A id=1 Node Node =789 9/20/2006 ICWS 2006 SSN

44 Example (cont’d) <Node id=“_1”> <val>123</val> <ptr ref=“#_2”/> <num>1.4</num> <next> <val id=“_2”>456</val> <ptr>789</ptr> <num>2.3</num> <next ref=“#_1”/> </next> </Node> Recursive descent parsing with object deserialization operations as semantic actions val =123 ptr =B num =1.4 next =B val =456 ptr =C num =2.3 next =A id=1 Node Node =789 9/20/2006 ICWS 2006 SSN

45 Performance Results Measured on roundtrip message containing a struct
of size 1.2KB Performance results in number of roundtrip messages per second of a benchmark client/server application on various platforms. 9/20/2006 ICWS 2006

46 Performance Results(2)
Measured on echoString With various array sizes End-to-end performance (in milliseconds) of a gSOAP benchmark application with and without multi-ref encoding. The data show that gSOAP’s multi-ref implementation adds just 2% to 3% overhead, which is minimal. 9/20/2006 ICWS 2006

47 Performance Comparison to Other XML Parsers
Desrialization is expensive Measured on echoString Of size 1KB Experimental parsers 9/20/2006 ICWS 2006

48 Conclusions Object-level coherence in XML Web Service can be achieved with specialized algorithms Design space of mapping issues is large Algorithms to (de)serialize non-primitive data structures ensuring object-level coherence and performance guarantees Work best with SOAP 1.2 RPC encoding For SOAP document/literal style, we suggest use of id-ref Mention XML:ID standard 9/20/2006 ICWS 2006

49 Thank You 9/20/2006 ICWS 2006

50 Implementation: the gSOAP Toolkit
Web service applications are build in two stages: Generate service definitions in familiar C/C++ header file format Generate client/server stubs and skeleton source code and serialization code Note: the soapcpp2 compiler can also be used to generate WSDL from header files Service defs: service.wsdl wsdl2h tool Header file defs: service.h soapcpp2 compiler soapClient.cpp soapServer.cpp soapC.cpp 9/20/2006 ICWS 2006

51 1. Analyze WSDL/Schema and Generate C/C++ Header File
Service defs: service.wsdl wsdl2h tool Header file defs: service.h Documented methods and class diagrams 9/20/2006 ICWS 2006

52 2. Analyze Header File and Generate Stubs and Serializers
Header file defs: service.h gSOAP application Native C/C++ data soapcpp2 compiler client/server API LL(1) XML parsers and (de)serializers soapClient.cpp soapServer.cpp soapC.cpp schema-opt serializer schema-opt deserializer Plugins stats gSOAP engine HTTP/S + TCP/IP + UDP logging WSSE XML data Network fabric 9/20/2006 ICWS 2006


Download ppt "Exploring Remote Object Coherence in XML Web Services"

Similar presentations


Ads by Google