Efficient XML Interchange What is it? Why is it? How does it fit in?

What is Efficient XML Interchange?
Alternative Representation of XML Infoset
– support full XML (Infoset) data model
– not a subset
– no really, not a subset!
Interchange Format
– optimized for data exchange
– transmission, storage, processing
– can use Schema, conventional compression

Why?
Expand the Web
– limited uptake of XML & friends in certain domains
   performance is a problem
– noteworthy domains
   mobile, embedded, scientific, …
Lesson From Binary XML Formats
– real need, and real solutions
– widely applicable, win-win
– multiple formats cause segregation, limit adoption

Integration into XML Stack
Same Data Model
– merely an alternative encoding
Open Issues
– format, or encoding?
– content negotiation?
– schema knowledge vs content negotiation
– modes, configurability (e.g. simple types)

WebAPI / EXI?
Impact on…
– APIs
   initialisation: encoding modes, schema info?
– XMLHttpRequest
   again: modes, schema info?
   diversity of formats?
– Are data models in sync?
   HTML as XML?
– REX
   fragment support?

Efficient XML Interchange Format Basics

Efficient XML Interchange
Goal(s)
– maintain XML (Infoset) data model
– seamless integration into XML software stack
– improve compaction AND processing
Observation:
– smallness has multiple benefits
– e.g. energy consumption during transmission
– allows XML deployment in new scenarios
Underlying Philosophy:
– exploit a-priori knowledge of (likely) content

How does it work?
Exploit Knowledge, at Several Different Levels
– XML knowledge
   copious syntactic redundancy
– Schema knowledge
   schema describes content in detail
– heuristics
   e.g. (declared) elements >> processing instructions
   e.g. repeated string elements
   e.g. small numbers >> large numbers
Cooperation with Conventional Compression
– heavily biased data stream as compressor input
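The "small numbers >> large numbers" heuristic can be illustrated with a variable-length integer code in which small values take fewer bytes than large ones. The following is a minimal Python sketch, not the EXI codec itself; the function names encode_uint and decode_uint are hypothetical and chosen only for this illustration.

```python
def encode_uint(n):
    """Encode a non-negative integer using 7 payload bits per byte, low bits first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low = n & 0x7F              # lowest 7 bits of the remaining value
        n >>= 7
        if n:
            out.append(low | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low)         # high bit clear: this is the last byte
            return bytes(out)


def decode_uint(data):
    """Decode one integer; return (value, number of bytes consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated input")


if __name__ == "__main__":
    for n in (5, 127, 128, 300, 1_000_000):
        print(n, len(encode_uint(n)), "byte(s)")
```

Values below 128 fit in a single byte, so a stream dominated by small numbers stays compact while large numbers remain representable.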

EXI Base Format
Coding Grammars
– generic grammar: describe full XML Infoset
   arbitrary elements, PIs, comments, entity references, etc.
– schema-derived grammar
   describes a specific format
– content-derived grammar
   add rules depending on encountered elements
– splice these together, at very fine granularity
   allow anything, but know what is (currently) likely
   likely content: more efficient encoding
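The idea of a content-derived grammar ("add rules depending on encountered elements") can be sketched as follows. This is a simplified Python illustration under stated assumptions: the class ContentDerivedGrammar and its method are hypothetical names, and only start-element events are modelled. The first time an element name is seen it is coded through the generic SE(*) rule with the name as a literal; afterwards a specific SE(name) rule exists and only a short index is needed.

```python
class ContentDerivedGrammar:
    """Toy grammar that learns a specific rule for each element it encounters."""

    def __init__(self):
        self.known = []  # element names seen so far, most recent first

    def encode_start_element(self, name):
        if name in self.known:
            # A learned rule SE(name) exists: emit its index only.
            return ("SE(%s)" % name, self.known.index(name))
        # No specific rule yet: use the generic rule and send the name itself,
        # then add a rule so the next occurrence is cheaper.
        self.known.insert(0, name)
        return ("SE(*)", name)


if __name__ == "__main__":
    grammar = ContentDerivedGrammar()
    for element in ["price", "price", "desc", "price"]:
        print(grammar.encode_start_element(element))
```

A decoder that applies the same learning step after every SE(*) event stays in sync with the encoder without any rule being transmitted explicitly.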

EXI Base Format
Built-in, Generic Element Grammar
[Figure: grammar diagram with states StartTag and Element; transition labels EE, AT(*), NS, SE(*), CH, ER, CM, PI; within Element: SE(*), CH, ER, CM, PI]

EXI Base Format
A Schema-Based Grammar
[Figure: grammar diagram with transitions AT(color), SE(desc), SE(quantity), SE(price), EE]
Element Content Model:
– (optional) attribute color
– (optional) element desc
– (mandatory) elements quantity, price
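The content model above can be written down as a small state machine. The Python sketch below is illustrative only: the state names, the assumed ordering (color, then desc, then quantity, then price) and the cost function are assumptions made for this example, not taken from the EXI specification. It counts only the bits needed to pick an event in each state, ignoring the values themselves. A state with a single possible event costs zero bits.

```python
# Each state maps an allowed event to the next state.
GRAMMAR = {
    "start":          {"AT(color)": "after_color",
                       "SE(desc)": "after_desc",
                       "SE(quantity)": "after_quantity"},
    "after_color":    {"SE(desc)": "after_desc",
                       "SE(quantity)": "after_quantity"},
    "after_desc":     {"SE(quantity)": "after_quantity"},
    "after_quantity": {"SE(price)": "after_price"},
    "after_price":    {"EE": "end"},
    "end":            {},
}


def bits_for(choices):
    """Bits needed to distinguish `choices` alternatives (0 if only one)."""
    return (choices - 1).bit_length()


def cost_in_bits(events):
    """Validate an event sequence and sum the bits spent on event codes."""
    state, total = "start", 0
    for event in events:
        transitions = GRAMMAR[state]
        if event not in transitions:
            raise ValueError("%s not allowed in state %s" % (event, state))
        total += bits_for(len(transitions))
        state = transitions[event]
    if state != "end":
        raise ValueError("element is incomplete")
    return total


if __name__ == "__main__":
    print(cost_in_bits(["AT(color)", "SE(quantity)", "SE(price)", "EE"]))
    print(cost_in_bits(["SE(quantity)", "SE(price)", "EE"]))
```

Because the mandatory elements leave no choice, the structure of a whole element costs only a few bits in this model.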

EXI Base Format
Merged Generic & Schema Derived Grammar
[Figure: grammar diagram combining schema-derived transitions SE(desc), SE(quantity), SE(price), EE with generic transitions SE(*), CH, ER, CM, PI; states labelled quantity and desc]
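One way to picture the merged grammar is a two-part event code: events the schema expects get a short code, and an extra "escape" value leads to a second part that selects one of the generic events. The Python sketch below is a deliberate simplification under that assumption; the function names and the exact bit layout are hypothetical, and the actual EXI event code layout is more elaborate.

```python
GENERIC = ["SE(*)", "CH", "ER", "CM", "PI"]  # always-allowed fallback events


def width_for(choices):
    """Bits needed to distinguish `choices` alternatives."""
    return (choices - 1).bit_length()


def to_bits(value, width):
    """Render `value` as a bit string of exactly `width` bits ('' if width is 0)."""
    return format(value, "0%db" % width) if width else ""


def event_code(schema_events, event):
    """Return the bit string for `event` in a state that expects `schema_events`."""
    first_width = width_for(len(schema_events) + 1)  # +1 for the escape value
    if event in schema_events:
        # Likely content: first part only.
        return to_bits(schema_events.index(event), first_width)
    # Unlikely content: escape value, then an index into the generic events.
    escape = to_bits(len(schema_events), first_width)
    return escape + to_bits(GENERIC.index(event), width_for(len(GENERIC)))


if __name__ == "__main__":
    expected = ["SE(desc)", "SE(quantity)"]
    for ev in ["SE(quantity)", "CM"]:
        print(ev, event_code(expected, ev))
```

In this model anything remains encodable, but expected events are cheaper than unexpected ones, which matches "allow anything, but know what is (currently) likely".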

Other, Major EXI Features
Simple Type Values
– optimized codecs
– type assignment through grammar
   generic text coding always available
– string / value tables
Bit-Packed vs byte-aligned codec
– biased input into deflate compression
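The string / value table idea can be sketched as follows: a string is sent as a literal the first time it occurs and as a small table index on every later occurrence. This is a minimal Python illustration with hypothetical function names and token labels ("lit", "ref"); it does not reproduce the table partitioning or byte layout of the real format.

```python
def encode_strings(values):
    """Replace repeated strings with references into a growing table."""
    table = {}   # string -> index assigned at first occurrence
    tokens = []
    for value in values:
        if value in table:
            tokens.append(("ref", table[value]))  # seen before: index only
        else:
            table[value] = len(table)
            tokens.append(("lit", value))         # first occurrence: literal
    return tokens


def decode_strings(tokens):
    """Rebuild the original strings by replaying the same table updates."""
    table = []
    out = []
    for kind, payload in tokens:
        if kind == "lit":
            table.append(payload)
            out.append(payload)
        else:
            out.append(table[payload])
    return out


if __name__ == "__main__":
    colors = ["red", "blue", "red", "red", "blue"]
    tokens = encode_strings(colors)
    print(tokens)
    print(decode_strings(tokens) == colors)
```

The table is never transmitted: encoder and decoder build it identically from the stream, so repeated values cost only an index.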

Impact on the XML Stack
Questions
– content negotiation, HTTP header integration?
   what do you need?
   what would be a problem?
   pre-shared schemas
– which formats? samples?
   (X)HTML? AJAX?
– need hooks in the specification?
– options / variables
   different schemas, different options?