Office OpenXML April 2007 Adam Farquhar. 2 Outline Office OpenXML Importance to Library and Archive community History Relation to other standards Design.

Slides:



Advertisements
Similar presentations
Microsoft Office System UK Developers Conference Radisson Edwardian, Heathrow 29 th & 30 th June 2005.
Advertisements

Copyright © 2003 Pearson Education, Inc. Slide 8-1 Created by Cheryl M. Hughes, Harvard University Extension School Cambridge, MA The Web Wizards Guide.
June 28-29, 2005IHE Interoperability Workshop 1 Integrating the Healthcare Enterprise Cross-enterprise Document Sharing for Imaging (XDS-I) Rita Noumeir.
1 of 18 Information Dissemination New Digital Opportunities IMARK Investing in Information for Development Information Dissemination New Digital Opportunities.
1 jNIK IT tool for electronic audit papers 17th meeting of the INTOSAI Working Group on IT Audit (WGITA) SAI POLAND (the Supreme Chamber of Control)
13 September 2012 SDMX Technical Working Group1 Report of the SDMX Technical Standards Working Group SDMX Expert Group Meeting, Paris, September 2012.
28 April 2004Second Nordic Conference on Scholarly Communication 1 Citation Analysis for the Free, Online Literature Tim Brody Intelligence, Agents, Multimedia.
Repository models and policies for preservation Steve Hitchcock Preserv Project Intelligence Agents Multimedia Group, School of Electronics and Computer.
Welcome To The HL7 Clinical Decision Support Meeting! Co-Chair Robert A. Jenders, MD, MS 15 January 2003 (Please sign in.)
Extending Eclipse CDT for Remote Target Debugging Thomas Fletcher Director, Automotive Engineering Services QNX Software Systems.
E-Content Service Group Virtual Meeting Digital Preservation: How to Get Started.
The National Standards and Quality System Jean-Louis Racine The World Bank Cambridge, England April 19, 2007 Knowledge Economy Forum VI Technology Acquisition.
Rue du Rhône 114- CH-1204 Geneva - T: F: Ecma TC43: Universal 3D.
Document architecture standards in Ecma Isabelle Valet-Harper Ecma TC45 Chairman.
Rue du Rhône 114- CH-1204 Geneva - T: F: Document Formats: Ecma Office OpenXML How Ecma.
ECMA Open XML File Formats and the Evolution of Open File Formats Mark Lange Senior Policy Counsel Microsoft EMEA.
1 Ecma TC45 – Office Open XML Formats Jean Paoli & Isabelle Valet-Harper – Microsoft, TC45 co-Chairs Adam Farquhar – British Library, TC45 Vice Chair Brian.
The creation of "Yaolan.com" A Site for Pre-natal and Parenting Education in Chinese by James Caldwell DAE Interactive Marketing a Web Connection Company.
18 Copyright © 2005, Oracle. All rights reserved. Distributing Modular Applications: Introduction to Web Services.
Introduction to Planets Hans Hofman Nationaal Archief Netherlands Prague, 17 October 2008.
XP New Perspectives on Microsoft Office Word 2003 Tutorial 6 1 Microsoft Office Word 2003 Tutorial 6 – Creating Form Letters and Mailing Labels.
XP New Perspectives on Microsoft Office Word 2003 Tutorial 7 1 Microsoft Office Word 2003 Tutorial 7 – Collaborating With Others and Creating Web Pages.
UKOLN, University of Bath
10. Juni 1998reto ambühler ( WELCOME TO THE GATHERING PLACE.
1 © 2009 University of Wisconsin-Extension, Cooperative Extension, Program Development and Evaluation Response Rate in Surveys Key resource: Dillman, D.A.,
4. Internet Programming ENG224 INFORMATION TECHNOLOGY – Part I
OASIS OData Technical Committee. AGENDA Introduction OASIS OData Technical Committee OData Overview Work of the Technical Committee Q&A.
Niagara Portal Introduction January 2007 Scott Muench - Technical Sales Manager.
Configuration management
Software change management
Information Systems Today: Managing in the Digital World
Chapter Information Systems Database Management.
“The Honeywell Web-based Corrective Action Solution”
Request Tracker IT Partners Conference Oliver Thomas 19 April 2005.
Reliable Interoperation between Open Office & MS office by UOML Alex Wang Chair/OASIS UOML TC Chairman / Sursen Co.
Assembling, Repurposing And Manipulating Document Content Using The New Office File Format Brian Jones OFF 304 Program Manager Microsoft Corporation.
The Life Cycle of Standards Presented by the International Electrotechnical Commission.
DOCUMENT TYPES. Digital Documents Converting documents to an electronic format will preserve those documents, but how would such a process be organized?
Heppenheim Producer-Archive Interface Specification Status of standardisation project Main characteristics, major changes, items pending.
31242/32549 Advanced Internet Programming Advanced Java Programming
Chapter 11 Designing Effective Output
Dr. Alexandra I. Cristea XHTML.
Building an EMS Database on a Company Intranet By: Nicholas Bollons Sally Goodman.
A lesson approach © 2011 The McGraw-Hill Companies, Inc. All rights reserved. a lesson approach Microsoft® PowerPoint 2010 © 2011 The McGraw-Hill Companies,
Benchmark Series Microsoft Excel 2013 Level 2
ECMA-376 Office Open XML Open Formats and Standardization Jean Paoli GM, Interoperability and XML Architecture Co-Chair Ecma TC45 Office Open XML Microsoft.
With Microsoft Access 2010© 2011 Pearson Education, Inc. Publishing as Prentice Hall1 PowerPoint Presentation to Accompany GO! with Microsoft ® Access.
Adaptability of learning objects by appropriate knowledge representation Anastas Misev Institute of Informatics Faculty of Natural Science and Mathematics.
Interoperability via OpenXML Wolfgang Keber DIaLOGIKa – Germany
Preservation and Long-term access through Networked Services Adam Farquhar, The British Library iPres2006 Cornell University, October 2006.
Microsoft Office Open XML Formats Brian Jones Lead Program Manager Microsoft Corporation.
Project Proposal: Academic Job Market and Application Tracker Website Project designed by: Cengiz Gunay Client: Cengiz Gunay Audience: PhD candidates and.
Addressing Metadata in the MPEG-21 and PDF-A ISO Standards NISO Workshop: Metadata on the Cutting Edge May 2004 William G. LeFurgy U.S. Library of Congress.
What Agencies Should Know About PDF/A September 20, 2005 Susan J. Sullivan, CRM
1 XML as a preservation strategy Experiences with the DiVA document format Eva Müller, Uwe Klosa Electronic Publishing Centre Uppsala University Library,
What Agencies Should Know About PDF/A-1 April 6, 2006 Mark Giguere
Evolving MARC 21 for the future Rebecca Guenther CCS Forum, ALA Annual July 10, 2009.
Open XML Formats Jessica Gruber Consultant Microsoft Corporation.
Open XML Formats Fabio Santini.NET Developer Evangelist Microsoft Italy.
Joint Information Systems Committee 09/03/2016 | | Slide 1 Toolkit and Demonstrator Calls Section Title Tish Roberts JISC programme Manager.
R2R ↔ NODC Steve Rutz NODC Observing Systems Team Leader May 12, 2011 Presented by L. Pikula, IODE OceanTeacher Course Data Management for Information.
Amiens, France, Summer 2007 Office Format Wars a mid-term survey.
10 Questions and Answers about.
OASIS ODF 1.2 Rob Weir Co-Chair, OASIS ODF TC
LibreOffice Intro Name Surname
INTRO. To I.T Razan N. AlShihabi
Office Open XML Formats: Enabling Solutions
A Comparative Study for Open Document Formats
Lightweight tools for on-line course development
QoS Metadata Status 106th OGC Technical Committee Orléans, France
Presentation transcript:

Office OpenXML April 2007 Adam Farquhar

2 Outline Office OpenXML Importance to Library and Archive community History Relation to other standards Design criteria Structure of standard Working with the specification Conclusion

3 Office OpenXML – A standard file format for Office Documents Office OpenXML is an open standard for word-processing documents, presentations, and spreadsheets High Fidelity Migration from legacy Microsoft binary formats Faithfully represent in XML the pre-existing corpus of word-processing, presentations and spreadsheets documents Millions of users created billions of documents over the past 20 years Interoperability, Platform independence, Internationalization, Accessibility Extensive review and modifications during the standardization process Enable new range of applications - Integration with business data Clear definition of conformance Support Custom XML Schemas (e.g. Birth Certificate, HL7) Long-term preservation Full specification, no application or system dependencies, clear path for migration, future evolution/maintenance in Ecma & ISO

4 What is wrong with legacy binary formats? Designed to be manipulated by a single vendors software Direct serialisation of in-memory data structures Evolved over many years in response to customer needs Augmented through acquisitions New features re-used existing attributes Result – the software is the specification! We are renting our content from Microsoft

5 Why should libraries or archives be involved? Address a root cause of digital obsolescence Formats have been deeply coupled with the programs that create them Formats are often poorly specified and complex Programs have a shorter lifespan than content Raise awareness about digital preservation Especially among software vendors The standard identifies preservation as a key issue Represent our interests Provide an independent voice Save pain down the line Compare bulk de-acidification, treating caustic inks

6 Office Open XML - History Start In 2000, Microsoft became serious about using XML in its Office file formats Consumers, governments, libraries, archives become increasingly vocal about the need for a full specification for the Office file formats Microsoft Office 2003 MS Office XML formats published on Danish Government site IDA (2004) Microsoft should consider the merits of submitting XML formats to an international standards body of their choicehttp://europa.eu.int/idabc/en/document/2592/5588 IDA & EC explicit ask to put the evolution of the formats in the control of a standards body to build translators to/from ODF Governments recommend eventual submission to ISO Now Dec PEGSCO Report Microsoft has adopted a pure XML format The Open XML (ECMA) standard is freely available The Open Specification Promise enables both Open Source and Commercial software to implement Open XML

7 Ecma-376 Office Open XML Standardization November , co-submission of Office Open XML Formats to Ecma International Co-sponsors: Apple, Barclays Capital, BP, British Library, Essilor, Intel Corporation, Microsoft Corporation, NextPage Inc., Statoil ASA, Toshiba Participants represented a wide range of interest December , Ecma General Assembly accepts standardization: Ecma TC 45 created Goal: To create an Ecma Office Open XML Formats standard To contribute the Ecma Office Open XML Formats standard to ISO/IEC JTC 1 for approval and adoption by ISO and IEC To ensure future evolution of Office Open XML Open process Technical Committee open to any Ecma member Novell, US Library of Congress joined TC45 after creation

8 Ecma Standardization Dec15, st face to face meeting – Brussels Microsoft submit initial 2000 page draft of Office Open XML Weekly 2 hour conference call – participants Face 2 Ecma, Apple, British Lib, Toshiba, Microsoft, Statoil Initial and Interim drafts posted publicly on Ecma web site External feedback – SC34 experts, others Final standard 6000 pages Ecma GA: Overwhelming positive vote - Approval Submit to ISO Ecma Secretary General Jan van den Beld (left) receives initial draft of office document standard from TC45 Chair Jean Paoli (center) Adam Farquhar (right), TC45 Vice-Chair, Head of e-Architecture for the British Library

9 Ecma-376 Office Open XML Adoption Many Office suites - Multiple platforms Microsoft Office Default Save Format is Open XML (+ free updates for Office 2000, XP, 2003) – Dec 2006/Jan 2007 Open Office – Novell support Open XML in Open Office – Novell edition – Availability Feb 2007 Corel – announcement of support of Open XML - Availability mid 2007 Gnumeric – open source Spreadsheet supports OpenXML Sun – working on OpenXML import filter for spreadsheets OpenXMLDeveloper.org (hundred of developers, multiple platforms)

10 ISO Standardization Ecma General Assembly approval Dec Overwhelming Positive vote for approving sending Open XML to JTC1 ISO Fast Track ISO Fast-Track Process JTC1 Fast Track procedure - Approved for Ecma Standards >75% of Ecma standards approved as ISO/IEC standards Ballot time Jan 5 – Ecma submit Office Open XML to ISO/JTC1 Feb 5 – End of 30-day review period, to determine perceived contradictions Feb 28 – Ecma provides feedback on comments & perceived contradictions 5-month letter ballot – Technical Review through September 2 nd

11 The Highlander myth How many document format standards should their be? Some say they can be only one (The Highlander Principle) As sensible as the movie! Where otherwise immortals slay each other! In fact, there are many standard formats now: HTML, PDF/A, ODF, OOXML CGM, SVG; JPEG, PNG; TIFF/IT, PDF/X And many more widely used formats And there will continue to be many No format is immortal Formats address different needs Innovation is not over

12 The simple office document myth Have you heard – office documents are simple! In fact, they can be extraordinarily complex Office documents can contain: Multiple character sets Left-to-right, right-to-left, bi-directional text Images, sound, video, vector graphics Annotations and changes from multiple authors Arbitrary metadata and XML components Complex mathematical equations Animated transitions Embedded data, database connections, queries, cached data Embedded components from other applications

13 The monolithic specification myth and the proportionality principle Six thousand pages! Thats too big for anyone to use. In fact, the standard follows a proportionality principle Easy jobs should be easy! A developer can take the standard and implement tools within a week (assuming knowledge of zip, xml) Examples: update addresses or copyright notices, replace logos, extract text stream, produce simple documents Hard jobs can be hard! A implementing a full office suite will take many person-years Examples: provide high-performance calculation engine, provide full OOXML->ODF translation, develop an MS Office competitor But all of these are now possible!

14 The ECMA-376 Specification The committee worked to make it readable! White Paper (14p) Part 1: Fundamentals (165p) Accessible with simple examples Part 2: Open Packaging Conventions (125p) Part 3: Primer (466p) Many examples, diagrams, explanations Part 4: Mark-up Language specification (5756p) Detailed, but most uses require only small subsets Part 5: Compatibility and extensions (34p)

15 Open XML Format Architecture Sample.docx Document Parts Most parts are XML Each XML part is a discrete, compressed component Can add, extract and modify individual parts without using Office programs Corruption or absence of any part does not prohibit the file from being opened Developer view: modular file User view: single document

16 OpenXML Mark-up approach Very different mark-up approach from ODF, HTML Flatter structure Local edits result in local changes Basis for text is a run A run is contiguous text with identical properties This is three runs This is three runs.

17 The principle of proportionality confirmed New open source project from Julien Chable Bulk of code serves to manipulate packages A few minutes sufficed to write a tool to extract addresses from any OOXML document

18 Conclusion The Digital Library community has influenced Office OpenXML Key vendors are more aware of digital preservation The Office OpenXML Standard Co-exists with existing and future document standards Plays a key role preserving billions of legacy office documents Follows the Proportionality Principle Enables innovation Is progressing through ISO Continues to evolve through an open process Now we own our content!

19 Questions?

20 Royalty-free File Format Licensing Office File Format Licensing Royalty-free license for XML file formats Royalty-free license for Binary file formats Fundamentals of Office file format licensing The technical documentation is available for anyone The schemas are based on the W3C XML standard The license is royalty-free The license is perpetual The license is very brief and available to everyone