23. 5. 2003, ETD 2003, Berlin. LaTeX as an Archiving Format: Benefits and Problems. Experiences from the MathDiss International Project and the EMANI project.


Slide 1: LaTeX as an Archiving Format: Benefits and Problems
Experiences from the MathDiss International Project and the EMANI project
Thomas Fischer, State and University Library Göttingen

Slide 2: Overview
Basic Considerations on File Types
  – Types of file formats
  – Purposes of file formats
  – Criteria for File Formats for Archiving
  – File formats in Mathematics
Experiences from MathDiss International
Experiences from the EMANI project
Conclusions

Slide 3: Types of file formats
Binary formats
  Examples: PostScript, PDF, DVI, Word documents
Mark-Up formats
  Examples:
  – SGML family: HTML, XML, MathML
  – Microsoft’s Rich Text Format (RTF)
  – some versions of WordPerfect files
  – TeX family: TeX, LaTeX, AMSTeX

Slide 4: Purposes of file formats I
On-screen rendering
  – Display data on screens with different sizes and resolutions
  – Preferably zoomable
Printing
  – Prepare data for printing on different devices, such as laser or bubble-jet printers
Data exchange
  – Transport data from one repository to another
  – Some sort of error checking is necessary

Slide 5: Purposes of file formats II
Discovery and Retrieval
  – Support search functions internally (using the dedicated programme)
  – Allow external text extraction (e.g. for indexing)
Archiving
  – Preserve the intellectual contents and the “look and feel” of the file (essentially “eternally”)
  – Make it available to the scientific community

Slide 6: Criteria for File Formats for Archiving I
Deterioration of the storage media is not considered here: it is essentially manageable
Error tolerance
  – Does the change of a single byte make the file unreadable?
Long-term stability
  – Is the file format changing constantly?
  – Is the dedicated programme “backwards compatible”, i.e. able to read older files?
Full open specification
  – Is a full and complete specification of the file format publicly available (at no cost)?
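The error-tolerance question can be tried out on a small scale. The following is a minimal sketch, not part of the original slides: it uses a zlib-compressed byte string as a stand-in for a compressed binary container and a few lines of LaTeX source as the mark-up counterpart. The sample text and the helper names `flip_byte` and `readable_after_damage` are illustrative assumptions.

```python
import zlib


def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of data in which every bit of one byte is inverted."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def readable_after_damage(compressed: bytes, index: int) -> bool:
    """Damage one byte of a zlib stream and report whether it still decompresses."""
    try:
        zlib.decompress(flip_byte(compressed, index))
    except zlib.error:
        return False
    return True


# Illustrative LaTeX source (an assumption, not taken from the project data).
source = (
    b"\\documentclass{article}\n"
    b"\\begin{document}\n"
    b"Let $f(x) = x^2$ and consider $\\int_0^1 f(x)\\,dx = 1/3$.\n"
    b"\\end{document}\n"
)
packed = zlib.compress(source)

# Mark-up text: compare the source with a copy that has one damaged byte.
damaged_text = flip_byte(source, len(source) // 2).decode("latin-1")
changed = sum(a != b for a, b in zip(source.decode("latin-1"), damaged_text))

# Compressed stream: apply the same kind of damage to the packed bytes.
still_readable = readable_after_damage(packed, len(packed) // 2)

print("characters changed in plain source:", changed)
print("compressed copy still readable:", still_readable)
```

In the plain source the damage stays local to one character, whereas a compressed stream carries a checksum and typically fails as a whole. Real PDF or PostScript files are more complex than a single zlib stream, so this only illustrates the principle.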

Slide 7: Criteria for File Formats for Archiving II
System independence
  – Does the file require a specific hardware platform or operating system?
Ease of handling
  – Does the format transform easily to other formats (for delivery)?
  – Do documents consist of several individual files?
Independence of commercial interests and influence
  – Is the file format owned by a commercial company?
  – Are the programmes for creating and rendering the format only available from a commercial company? (The same one?)
Minor considerations
  – Bulkiness
  – Options for navigation
  – Copy and paste

Slide 8: File formats in Mathematics
TeX: Basic starting format for writing mathematics
DVI: Result of compiling TeX, readable with a DVI viewer in a dedicated environment
PostScript: Result of transformation from DVI
PDF: Result of transformation from DVI or PostScript
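The chain TeX, DVI, PostScript, PDF can be written down as a sequence of command lines. The sketch below is an illustration rather than the workflow used in the projects: it assumes the common tools `latex`, `dvips` and `ps2pdf` are installed, and the file name `thesis.tex` is hypothetical. It only builds and prints the commands; `run_chain` would execute them.

```python
import subprocess
from pathlib import Path
from typing import List


def conversion_commands(tex_file: str) -> List[List[str]]:
    """Build the command lines for TeX -> DVI -> PostScript -> PDF."""
    stem = Path(tex_file).stem
    return [
        ["latex", tex_file],                            # TeX -> DVI
        ["dvips", f"{stem}.dvi", "-o", f"{stem}.ps"],   # DVI -> PostScript
        ["ps2pdf", f"{stem}.ps", f"{stem}.pdf"],        # PostScript -> PDF
    ]


def run_chain(tex_file: str) -> None:
    """Run each step in order and stop at the first failing tool."""
    for command in conversion_commands(tex_file):
        subprocess.run(command, check=True)


# Hypothetical input file; only the commands are printed here.
for command in conversion_commands("thesis.tex"):
    print(" ".join(command))
```

Each step depends on the output of the previous one, so a missing class or style file at the first step stops the whole chain.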

Slide 9: Experiences from MathDiss International I
PostScript
Largest single document: 159 pages KB
Compressed:
  – Zip: 950 KB
  – StuffIt: 613 KB
Portable but bulky format; requires compression for delivery

Slide 10: Experiences from MathDiss International II
PDF I
Largest PDF file: 196 pages, KB
Compressed: Zip: KB, StuffIt: KB, StuffItX: KB
“There was an error when opening this page. The error occurred when analysing an image.”

Slide 11: Experiences from MathDiss International III
PDF II
Rendering:
Copying: Â__r¿.á_Å_ÂM_ÉÏ__ßíÓr_LÇvÁiÂÒá ¿ Å_¸º___RàlÂÄÀA_.Ór_uÂÄÀ_¸.ß_âµãrÂÒ_Rñ.À_Ç.ÂÄÁL_r¸.ß_ß_Â1àlÂÄÀ ôyÂÒâµãL__ÓL_rÇ àrÂÒß#_Ü_.é_ÂÄ_LàlÓL_rÇ_ß__rÀ__.Ár».ÂÄ_Dß_é_ÂÄÀµàlÂÒ_

Slide 12: Experiences from MathDiss International IV
TeX
  – Multiple files for a single document
  – Dedicated environment necessary
  – Required macro (.sty) or other files are often missing
  – Correct files compile and display nicely
Most complex bundle: 74 files: eight files with no extension (including three makefiles), one .aux, one .bak, three .bc, one .fot, eight .hex, one .jpeg, two .log, three .make, three .md, one .meta, four .mi, 30 .pre, seven .psi, one .tx. The actual dissertation is hidden in the .bak file.
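An inventory like the one above can be produced mechanically when a bundle arrives. This is a small sketch under stated assumptions: the file names are invented for illustration, and the set of extensions treated as "expected" in a TeX submission is a guess, not a rule from the project.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, List, Set

# Assumed set of extensions an archive might expect in a TeX bundle.
EXPECTED: Set[str] = {".tex", ".sty", ".cls", ".clo", ".bib", ".eps", ".ps"}


def count_by_extension(file_names: Iterable[str]) -> Counter:
    """Count files per extension; files without one are grouped together."""
    return Counter(
        Path(name).suffix.lower() or "(no extension)" for name in file_names
    )


def unfamiliar_extensions(counts: Counter) -> List[str]:
    """List the extensions that are not in the expected set."""
    return sorted(ext for ext in counts if ext not in EXPECTED)


# Hypothetical bundle contents.
names = [
    "Makefile", "thesis.bak", "thesis.aux", "thesis.log",
    "fig1.jpeg", "part1.pre", "part2.pre", "macros.sty",
]
counts = count_by_extension(names)
print(counts)
print(unfamiliar_extensions(counts))
```

A listing of unfamiliar extensions does not say which file holds the actual text, but it tells the archive where a manual look is needed.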

Slide 13: Experiences from the EMANI project
Analysis of TeX files from Springer-Verlag for different journals.
Example: Numerische Mathematik
Additional files needed:
  – svjour.cls: a general class file definition for all Springer journals
  – svnummat.clo: a special class option file for “Numerische Mathematik”
  – TOTAL00.NUM: a somewhat obscure file that “redefines the things for journals to produce totally camera ready output”

Slide 14: Conclusions I
PDF
Advantages:
  – Easy to handle
  – Convenient reader
Disadvantages:
  – Files can become very large
  – Error tolerance is limited
  – Acrobat system is owned by Adobe and is not open source

Slide 15: Conclusions II
TeX
Advantages:
  – Original source
  – Fairly small
  – Open source
Disadvantages:
  – Needs environment with special files
  – Multiple files, possibility of missing files

Slide 16: Conclusions III
General
The archive should provide guidance for the creation of files (e.g. the MathDiss start file)
The archive needs an “Ingest” system that checks
  – Completeness of files and successful compilation (TeX)
  – Possibly crippling adjustments of security settings, rendering quality (PDF)
The archive needs management for related complex files and versioning
For general acceptance, TeX needs a combined format that can be read using a single reader (e.g. IBM techexplorer)
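The completeness check for TeX submissions could start from something like the following sketch. It is an assumption about how such an ingest step might look, not a description of an existing system: it scans the main file for `\documentclass`, `\usepackage`, `\input` and `\include`, and reports every referenced file that is not in the bundle. The names `referenced_files` and `missing_from_bundle` and the sample bundle are hypothetical.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Set

# Extension to assume when a command argument names a file without one.
DEFAULT_EXTENSION = {
    "documentclass": ".cls",
    "usepackage": ".sty",
    "input": ".tex",
    "include": ".tex",
}
COMMAND_RE = re.compile(
    r"\\(documentclass|usepackage|input|include)\s*(?:\[[^\]]*\])?\s*\{([^}]*)\}"
)


def referenced_files(tex_source: str) -> Set[str]:
    """Collect the file names that a TeX source refers to."""
    without_comments = re.sub(r"(?<!\\)%.*", "", tex_source)
    names: Set[str] = set()
    for command, argument in COMMAND_RE.findall(without_comments):
        for name in argument.split(","):
            name = name.strip()
            if not name:
                continue
            if not Path(name).suffix:
                name += DEFAULT_EXTENSION[command]
            names.add(name)
    return names


def missing_from_bundle(bundle_dir: Path, main_file: str) -> List[str]:
    """Return the referenced files that are not present in the bundle."""
    source = (bundle_dir / main_file).read_text(encoding="latin-1")
    return sorted(
        name for name in referenced_files(source)
        if not (bundle_dir / name).is_file()
    )


# Hypothetical bundle: a main file and one chapter, but no class file.
main_tex = "\n".join([
    r"\documentclass{svjour}",
    r"\usepackage{amsmath}  % usually supplied by the TeX installation",
    r"\begin{document}",
    r"\input{intro}",
    r"\end{document}",
])
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "main.tex").write_text(main_tex, encoding="latin-1")
    (bundle / "intro.tex").write_text("Introduction.\n", encoding="latin-1")
    missing = missing_from_bundle(bundle, "main.tex")

print(missing)
```

The sketch only looks inside the bundle, so standard packages that the local TeX installation provides are reported as well. A fuller check would also ask the installation (for example with `kpsewhich`), follow nested `\input` files, and finish with a trial compilation.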

Slide 17: Thank You!
Questions, remarks?
Thomas Fischer, SUB Göttingen