LaTeX as an Archiving Format: Benefits and Problems. Experiences from the MathDiss International Project and the EMANI project. ETD 2003, Berlin, 23. 5. 2003


1 LaTeX as an Archiving Format: Benefits and Problems
Experiences from the MathDiss International Project and the EMANI project
Thomas Fischer, State and University Library Göttingen

2 Overview
Basic Considerations on File Types
– Types of file formats
– Purposes of file formats
– Criteria for File Formats for Archiving
– File formats in Mathematics
Experiences from MathDiss International
Experiences from the EMANI project
Conclusions

3 Types of file formats
Binary formats
Examples: PostScript, PDF, DVI, Word documents
Mark-Up formats
Examples:
– SGML family: HTML, XML, MathML
– Microsoft’s Rich Text Format (RTF)
– some versions of WordPerfect files
– TeX family: TeX, LaTeX, AMSTeX

4 Purposes of file formats I
On-screen rendering
– Display data on screens with different sizes and resolutions
– Preferably zoomable
Printing
– Prepare data for printing to different devices, such as laser or bubble-jet printers
Data exchange
– Transport data from one repository to another
– Some sort of error checking is necessary
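The slide does not say which kind of error checking is meant for data exchange. One common option is to send a checksum along with each file and recompute it on arrival. The sketch below assumes SHA-256 as the checksum; the function names `sha256_of` and `transfer_is_intact` are hypothetical and not part of either project.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large PostScript or PDF files
        # do not have to fit into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(path, expected_hex):
    """Compare the received file against the digest sent by the source repository."""
    return sha256_of(path) == expected_hex.lower()
```

The sending repository would publish the digest next to the file, and the receiving repository would call `transfer_is_intact` after the transfer.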

5 Purposes of file formats II
Discovery and Retrieval
– Support search functions internally (using the dedicated programme)
– Allow external text extraction (e.g. for indexing)
Archiving
– Preserve the intellectual contents and the “look and feel” of the file (essentially “eternally”)
– Make it available to the scientific community

6 Criteria for File Formats for Archiving I
Deterioration of the storage media is not considered here: it is essentially manageable
Error tolerance
– Does the change of a single byte make the file unreadable?
Long term stability
– Is the file format changing constantly?
– Is the dedicated programme “backwards compatible”, i.e. able to read the older files?
Full open specification
– Is a full and complete specification of the file format publicly available (at no cost)?
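The single-byte question under error tolerance can be tried out directly. The sketch below is an illustration under stated assumptions, not a test used in the projects: it damages one byte of a plain mark-up source and one byte of the same text packed with zlib, which stands in for any compressed binary container. The helper names are hypothetical.

```python
import zlib


def flip_byte(data: bytes, position: int) -> bytes:
    """Return a copy of data with every bit of one byte inverted."""
    damaged = bytearray(data)
    damaged[position] ^= 0xFF
    return bytes(damaged)


def differing_bytes(original: bytes, damaged: bytes) -> int:
    """Count positions at which two equally long byte strings differ."""
    return sum(a != b for a, b in zip(original, damaged))


def compressed_survives(compressed: bytes, position: int) -> bool:
    """True if the zlib stream still decompresses after one byte is damaged."""
    try:
        zlib.decompress(flip_byte(compressed, position))
        return True
    except zlib.error:
        return False


if __name__ == "__main__":
    source = b"\\section{Introduction} Let $x \\in X$ be given. " * 20
    packed = zlib.compress(source)
    # Plain mark-up: the damage stays local to one character.
    print(differing_bytes(source, flip_byte(source, 10)), "byte(s) differ in the plain source")
    # Compressed stream: count how many single-byte damages it tolerates.
    tolerated = sum(compressed_survives(packed, i) for i in range(len(packed)))
    print(tolerated, "of", len(packed), "single-byte damages still decompress")
```

In the plain source the damage affects one character and the rest stays readable, whereas a compressed stream typically fails as a whole.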

7 Criteria for File Formats for Archiving II
System independence
– Does the file require a specific hardware platform or operating system?
Ease of handling
– Does the format transform easily to other formats (for delivery)?
– Do documents consist of several individual files?
Independence of commercial interests and influence
– Is the file format owned by a commercial company?
– Are the programmes for creating and rendering the format only available from a commercial company? (The same one?)
Minor considerations
– Bulkiness
– Options for navigation
– Copy and paste

8 File formats in Mathematics
TeX
– Basic starting format for writing mathematics
DVI
– Result of compiling TeX, readable with a DVI viewer in a dedicated environment
PostScript
– Result of transformation from DVI
PDF
– Result of transformation from DVI or PostScript
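The chain TeX, DVI, PostScript, PDF can be written down as three conversion steps. The slide does not name the tools; the sketch below assumes the usual command-line programs `latex`, `dvips` and `ps2pdf` are installed, and the function names are hypothetical.

```python
import subprocess


def conversion_commands(stem):
    """Return the commands that turn stem.tex into DVI, PostScript and PDF."""
    return [
        # TeX source -> DVI (assumes a LaTeX document; no prompts on errors)
        ["latex", "-interaction=nonstopmode", f"{stem}.tex"],
        # DVI -> PostScript
        ["dvips", "-o", f"{stem}.ps", f"{stem}.dvi"],
        # PostScript -> PDF (ps2pdf is part of Ghostscript)
        ["ps2pdf", f"{stem}.ps", f"{stem}.pdf"],
    ]


def run_chain(stem):
    """Run each step in order and stop at the first one that fails."""
    for command in conversion_commands(stem):
        subprocess.run(command, check=True)
```

Calling `run_chain("thesis")` in a directory containing `thesis.tex` would attempt all three steps. A failure at the first step usually points to a missing file or environment problem.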

9 Experiences from MathDiss International I
PostScript
Largest single document: 159 pages, 40,099 KB
Compressed:
– Zip: 950 KB
– StuffIt: 613 KB
Portable but bulky format, requires compression for delivery

10 Experiences from MathDiss International II
PDF I
Largest PDF file: 196 pages, 30,740 KB
Compressed:
– Zip: 5,013 KB
– StuffIt: 3,375 KB
– StuffItX: 2,514 KB
Error message: “There was an error when opening this page. The error occurred when analysing an image.”
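The sizes reported for the largest PostScript and PDF files are easier to compare as compression ratios. The short calculation below uses the figures from the two slides, reading the dots in the slide figures as thousands separators; the names `SIZES_KB` and `compression_ratios` are introduced here for illustration only.

```python
# Sizes in KB as given on the slides for the largest PostScript and PDF files.
SIZES_KB = {
    "PostScript": {"original": 40099, "Zip": 950, "StuffIt": 613},
    "PDF": {"original": 30740, "Zip": 5013, "StuffIt": 3375, "StuffItX": 2514},
}


def compression_ratios(sizes):
    """Return original size divided by compressed size, per tool, to one decimal."""
    original = sizes["original"]
    return {
        tool: round(original / size, 1)
        for tool, size in sizes.items()
        if tool != "original"
    }


if __name__ == "__main__":
    for fmt, sizes in SIZES_KB.items():
        print(fmt, compression_ratios(sizes))
```

By this calculation the PostScript file shrinks by a factor of roughly 42 to 65, the PDF file by a factor of roughly 6 to 12.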

11 Experiences from MathDiss International III
PDF II
Rendering: (page image not reproduced in this transcript)
Copying: text copied from the rendered page comes out as unreadable characters:
Â__r¿.á_Å_ÂM_ÉÏ__ßíÓr_LÇvÁiÂÒá ¿._r__Å.¸äßíÅÒË+@ÃÓrÀjå¡ÂÒÀ_¸L<Lá ¿ Å_¸º___RàlÂÄÀA_.Ór_uÂÄÀ_¸.ß_âµãrÂÒ_Rñ.À_Ç.ÂÄÁL_r¸.ß_ß_Â1àlÂÄÀ ôyÂÒâµãL__ÓL_rÇ àrÂÒß#_Ü_.é_ÂÄ_LàlÓL_rÇ_ß__rÀ__.Ár».ÂÄ_Dß_é_ÂÄÀµàlÂÒ_

12 Experiences from MathDiss International IV
TeX
Multiple files for a single document
Dedicated environment necessary
Macro (.sty) or other required files are often missing
Correct files compile and display nicely
Most complex bundle: 74 files: eight files with no extension (including three makefiles), one .aux, one .bak, three .bc, one .fot, eight .hex, one .jpeg, two .log, three .make, three .md, one .meta, four .mi, 30 .pre, seven .psi, one .tx. The actual dissertation is hidden in the .bak file.
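An inventory like the one given for the 74-file bundle can be produced automatically by counting files per extension. This is a minimal sketch, not the procedure used in MathDiss International; the function names and the label `(none)` for files without an extension are assumptions.

```python
from collections import Counter
from pathlib import Path


def extension_inventory(filenames):
    """Count files per extension; files without one are grouped under '(none)'."""
    return Counter(Path(name).suffix.lower() or "(none)" for name in filenames)


def inventory_of_directory(directory):
    """Build the same inventory for all regular files below a directory."""
    paths = (p for p in Path(directory).rglob("*") if p.is_file())
    return extension_inventory(p.name for p in paths)
```

A bundle whose inventory shows no `.tex` file at all, as in the case where the dissertation sat in a `.bak` file, could then be flagged for manual inspection.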

13 Experiences from the EMANI project
Analysis of TeX files from Springer Verlag for different journals
Example: Numerische Mathematik
Additional files needed:
– svjour.cls: a general class file definition for all Springer journals
– svnummat.clo: a special class option file for “Numerische Mathematik”
– TOTAL00.NUM: a somewhat obscure file that “redefines the things for journals to produce totally camera ready output”

14 Conclusions I
PDF
Advantages:
– Easy to handle
– Convenient reader
Disadvantages:
– Files can become very large
– Error tolerance is limited
– Acrobat system is owned by Adobe and is not open source

15 Conclusions II
TeX
Advantages:
– Original source
– Fairly small
– Open source
Disadvantages:
– Needs environment with special files
– Multiple files, possibility of missing files

16 Conclusions III
General
Archive should provide guidance for creation of files (e.g. MathDiss start file)
Archive needs an “Ingest” system that checks
– Completeness of files and successful compilation (TeX)
– Possibly crippling adjustments of security settings or rendering quality (PDF)
Archive needs management for related complex files and versioning
For general acceptance, TeX needs a combined format that can be read using a single reader (e.g. IBM techexplorer)
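The completeness check of such an ingest system could start by listing the files a TeX source refers to and comparing them with what was submitted. The sketch below is one possible approach, not the MathDiss or EMANI implementation. It looks only at `\documentclass`, `\usepackage`, `\input` and `\include`, does not skip commented-out lines, and cannot tell whether a missing `.sty` or `.cls` is supplied by the local TeX installation. All names are hypothetical.

```python
import re
from pathlib import Path

# Commands that pull in other files, with the extension TeX assumes by default.
DEFAULT_EXTENSION = {
    "documentclass": ".cls",
    "usepackage": ".sty",
    "input": ".tex",
    "include": ".tex",
}

REFERENCE = re.compile(
    r"\\(documentclass|usepackage|input|include)(?:\[[^\]]*\])?\{([^}]*)\}"
)


def referenced_files(tex_source):
    """Return the file names a TeX source asks for (simplified parsing)."""
    names = set()
    for command, argument in REFERENCE.findall(tex_source):
        # \usepackage{a,b} may list several packages at once.
        for name in argument.split(","):
            name = name.strip()
            if not name:
                continue
            if not Path(name).suffix:
                name += DEFAULT_EXTENSION[command]
            names.add(name)
    return names


def missing_from_submission(tex_source, submitted_names):
    """List referenced files that are not among the submitted file names."""
    return sorted(referenced_files(tex_source) - set(submitted_names))
```

For a source starting with `\documentclass{svjour}`, a submission without `svjour.cls` would be reported, and the ingest system could then ask the author for the file or check the archive's own TeX environment.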

17 Thank You!
Questions, remarks?
Thomas Fischer, SUB Göttingen
fischer@mail.sub.uni-goettingen.de

