OpenXML: What is it?
- XML-based file format that describes documents, presentations, spreadsheets, etc.
- Replacement for the binary file formats used in previous versions of Office
Why use OpenXML?
- Readable: plaintext representation
- Smaller: compressed as a ZIP archive
- Straightforward: images are represented within tags
- All the benefits of regular XML!
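
To make the "ZIP archive of plaintext XML" point concrete, here is a minimal Python sketch. It builds a toy package in memory and reads it back with the standard `zipfile` module. The part contents are trimmed, hypothetical stand-ins; a real .docx contains more parts (relationships, styles, and so on) and this toy package is not claimed to open in Word.

```python
import io
import zipfile

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, heavily trimmed part contents (assumption: illustration only).
PARTS = {
    "[Content_Types].xml": (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"/>'
    ),
    "word/document.xml": (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W}">'
        "<w:body><w:p><w:r><w:t>Hello, OpenXML</w:t></w:r></w:p></w:body>"
        "</w:document>"
    ),
}


def build_package(parts):
    """Write the parts into an in-memory, deflate-compressed ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, xml in parts.items():
            archive.writestr(name, xml)
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the names of all parts stored in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return archive.namelist()


def read_part(package_bytes, name):
    """Return one part as text; the XML is readable without any Office tooling."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return archive.read(name).decode("utf-8")


package = build_package(PARTS)
print(list_parts(package))
print(read_part(package, "word/document.xml"))
```

The same `list_parts` and `read_part` helpers should work on the bytes of a real .docx file, since the container is an ordinary ZIP archive.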
Docx Structure (Containers)
- Paragraph
  - Most basic unit
  - One for each paragraph break (hard return) in the document
  - Container element
- Run
  - Region of content with a common set of properties
  - All runs must be contained within a paragraph
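
The paragraph/run nesting can be sketched in Python with the standard `xml.etree.ElementTree` parser. The sample markup below is an assumed, simplified body fragment (real documents carry many more attributes and property elements); `w:p`, `w:r`, and `w:t` are the WordprocessingML paragraph, run, and text elements.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Assumed sample: two paragraphs, the first split into two runs.
BODY_XML = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:r><w:t>First paragraph, </w:t></w:r>
    <w:r><w:t>second run.</w:t></w:r>
  </w:p>
  <w:p>
    <w:r><w:t>Second paragraph.</w:t></w:r>
  </w:p>
</w:body>
""".strip()


def paragraphs_as_runs(body_xml):
    """Return one list per paragraph, holding the text of each run inside it."""
    body = ET.fromstring(body_xml)
    result = []
    for paragraph in body.findall("w:p", NS):
        runs = []
        for run in paragraph.findall("w:r", NS):
            # A run may hold several w:t elements; join them into one string.
            runs.append("".join(t.text or "" for t in run.findall("w:t", NS)))
        result.append(runs)
    return result


for number, runs in enumerate(paragraphs_as_runs(BODY_XML), start=1):
    print(f"paragraph {number}: {len(runs)} run(s): {runs}")
```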
Docx Structure (Root Elements)
- Text
  - Basic block of text
  - Normal formatting can be applied through formatting tags (e.g. <w:b/> for bold)
  - Must be contained within a run
- Images
  - Pictures, Clip Art, SmartArt, shapes, charts, etc.
  - Additional transformations can be applied to the base image (rotation, reflection, etc.)
Docx Structure (example)
- This is bold text.
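
As a sketch of how a sentence such as "This is bold text." could be encoded, the Python snippet below assumes that only the word "bold" is bold, which splits the sentence into three runs. The markup is an assumed reconstruction for illustration, not the markup Word necessarily produces for a given document.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Assumed encoding: one paragraph, three runs, the middle run marked bold.
PARAGRAPH_XML = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t xml:space="preserve">This is </w:t></w:r>'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>bold</w:t></w:r>"
    '<w:r><w:t xml:space="preserve"> text.</w:t></w:r>'
    "</w:p>"
)


def is_bold(run):
    """True if the run's own properties (w:rPr) switch bold on.

    Simplification: bold inherited from styles is ignored.
    """
    bold = run.find("w:rPr/w:b", NS)
    if bold is None:
        return False
    # <w:b/> means on; an explicit w:val of "0" or "false" means off.
    return bold.get(f"{{{W}}}val", "true") not in ("0", "false")


def describe_runs(paragraph_xml):
    """Return (text, is_bold) for every run in the paragraph."""
    paragraph = ET.fromstring(paragraph_xml)
    described = []
    for run in paragraph.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        described.append((text, is_bold(run)))
    return described


print(describe_runs(PARAGRAPH_XML))
```

Because formatting lives in the run properties, a change of formatting in the middle of a sentence forces a new run, while the text itself stays in `w:t` elements.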
Dissecting a Word 2007 Document
- Demo
Working with OpenXML documents
- Microsoft SDK for OpenXML
  - Provides strongly typed bindings for accessing document parts
  - Allows the developer to create or change documents without having Word open
- Word Object Model
  - Coming up next…
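
The idea of changing a document without Word running can be illustrated in Python. This is not the Microsoft SDK (which targets .NET); it is a rough sketch using only the standard library, and the toy package and the helper names (`make_sample_package`, `replace_text`, `document_text`) are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the conventional w: prefix when serializing


def make_sample_package():
    """Build a toy package in memory (assumption: not a complete .docx)."""
    document = (
        f'<w:document xmlns:w="{W}"><w:body>'
        "<w:p><w:r><w:t>Draft</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", document)
    return buffer.getvalue()


def replace_text(package_bytes, old, new):
    """Return a new package with `old` replaced by `new` inside each w:t element.

    Simplification: text that Word has split across several runs is not matched.
    """
    source = zipfile.ZipFile(io.BytesIO(package_bytes))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "word/document.xml":
                root = ET.fromstring(data)
                for text in root.iter(f"{{{W}}}t"):
                    if text.text:
                        text.text = text.text.replace(old, new)
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # All other parts are copied through unchanged.
            target.writestr(item.filename, data)
    return output.getvalue()


def document_text(package_bytes):
    """Concatenate the text of every w:t element in the main document part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))


edited = replace_text(make_sample_package(), "Draft", "Final")
print(document_text(edited))
```

A typed SDK adds value over this kind of raw XML handling mainly by knowing the part relationships and element types, so the caller does not have to track part names and namespaces by hand.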
Office Plugins
- Visual Studio Tools for Office (VSTO)
  - Add-on for Visual Studio 2005
  - Develop Office add-ins just like any other application
  - Use the WYSIWYG editor to create the GUI
  - Access the document through the Word object model
Word Object Model
- InlineShapes
  - Collection of references to all images in the document
- Paragraphs
  - Directly correspond to OpenXML paragraph tags
- Ranges
  - Contiguous area in the document
  - Can access the actual text of the document through the Text property
Creating a plugin demo
- Visual Studio Tools for Office Demo
How we’re using it…
- OpenXML SDK to parse the document/presentation for accessibility errors
- VSTO SE to create an add-in that checks accessibility
- Word Object Model to highlight regions of text and manipulate the document
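
One kind of accessibility error that can be found by parsing the XML is an image with no alternative text. The slides do not say which checks are actually performed, so the Python sketch below is only an assumed example: it looks at the `descr` attribute of each `wp:docPr` element, which is where WordprocessingML stores a drawing's alternative text. The sample markup is trimmed and hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WP = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"

# Assumed sample: two inline drawings, only the first has alternative text.
SAMPLE_BODY = (
    f'<w:body xmlns:w="{W}" xmlns:wp="{WP}">'
    "<w:p><w:r><w:drawing><wp:inline>"
    '<wp:docPr id="1" name="Picture 1" descr="Company logo"/>'
    "</wp:inline></w:drawing></w:r></w:p>"
    "<w:p><w:r><w:drawing><wp:inline>"
    '<wp:docPr id="2" name="Picture 2"/>'
    "</wp:inline></w:drawing></w:r></w:p>"
    "</w:body>"
)


def images_missing_alt_text(body_xml):
    """Return the names of drawings whose alternative text is absent or blank."""
    root = ET.fromstring(body_xml)
    missing = []
    for properties in root.iter(f"{{{WP}}}docPr"):
        if not (properties.get("descr") or "").strip():
            missing.append(properties.get("name"))
    return missing


print(images_missing_alt_text(SAMPLE_BODY))
```

In an add-in, a list like this could then drive the highlighting step, with the object model used to select and mark the offending image for the author.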
Conclusions
- Any questions?