Cornell July 25, 2002 NUMDAM Pierre Bérard Institut Fourier, CNRS–Université Joseph Fourier & Cellule MathDoc, CNRS–Université Joseph Fourier Grenoble.

1 Cornell July 25, 2002 NUMDAM Pierre Bérard Institut Fourier, CNRS–Université Joseph Fourier & Cellule MathDoc, CNRS–Université Joseph Fourier Grenoble (France)

2 Cellule MathDoc (www-mathdoc.ujf-grenoble.fr). An institute for scientific information and communication in mathematics, supported by the Centre National de la Recherche Scientifique (CNRS) and the Ministère de la Recherche. General mission: documentation issues in mathematics at the national level in France, in cooperation with mathematics libraries and institutes.

4 NUMDAM: Digitisation of Ancient Mathematics Documents (NUMérisation de Documents Anciens Mathématiques). A digitisation programme supported by CNRS and the Ministère de la Recherche, managed by the Cellule MathDoc.

5 NUMDAM: aims. Reinforce French mathematical journals (visibility, accessibility, durability). Hand down digitised archives of the French mathematical heritage to future generations, and take part in international efforts with the same goal. Work towards making this digitised mathematical heritage freely accessible.

6 Political choices. Database freely accessible on the web. Full text freely accessible after a moving wall (whose length depends on the serial). Planned interoperability between retro-digitised and natively digital collections. National and international cooperation as far as possible.
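The moving-wall rule can be illustrated with a minimal sketch. The slides do not say how the wall is measured or how long it is for any serial, so the whole-calendar-year convention, the function name `is_full_text_free` and the five-year value below are all assumptions made for illustration only.

```python
from datetime import date


def is_full_text_free(publication_year: int, wall_years: int, today: date) -> bool:
    """Return True when an article has passed its serial's moving wall.

    Assumption: the wall is counted in whole calendar years, so an article
    published in year Y becomes free at the start of year Y + wall_years.
    """
    return publication_year + wall_years <= today.year


# Hypothetical five-year wall, evaluated on the date of the talk.
talk_date = date(2002, 7, 25)
print(is_full_text_free(1962, 5, talk_date))  # True: 1967 <= 2002
print(is_full_text_free(2000, 5, talk_date))  # False: 2005 > 2002
```

Because the wall length depends on the serial, a real service would presumably look up `wall_years` per journal rather than pass a constant.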

7 Technical choices. Scan from first to last page at 600 dpi. Uncorrected OCR at 99.9% (mathematical formulae and images excluded). Multi-page files for logical units (TIFF, PDF with hidden text, DjVu). End-of-article bibliographies processed (corrected OCR at 99.99%, plus mark-up of the “author”, “title” and “year” fields). Database: cataloguing data for each article, summary (if present), end-of-article bibliography (if present), hidden OCRed text. Structured data exchange in XML. Links to and from the JFM, ZM and MR databases as far as possible. Future enhancements planned as technology becomes available.
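As a rough illustration of what "structured data exchange in XML" could look like for one article, here is a sketch using Python's standard `xml.etree.ElementTree`. The actual NUMDAM schema is not shown in the slides: every element and attribute name below is hypothetical, and the bibliography entry is placeholder data. Only the identifier, author, year, volume and page range come from the Dwork example given later in the talk.

```python
import xml.etree.ElementTree as ET


def build_article_record(article_id, authors, year, volume, first_page,
                         last_page, references=()):
    """Build an XML element holding cataloguing data for one article.

    All element and attribute names are hypothetical; the slides do not
    show the real NUMDAM schema.
    """
    article = ET.Element("article", {"id": article_id})
    for name in authors:
        ET.SubElement(article, "author").text = name
    ET.SubElement(article, "year").text = str(year)
    ET.SubElement(article, "volume").text = str(volume)
    pages = ET.SubElement(article, "pages")
    pages.set("first", str(first_page))
    pages.set("last", str(last_page))
    # The bibliography is only recorded "if present".
    if references:
        bibliography = ET.SubElement(article, "bibliography")
        for ref in references:
            entry = ET.SubElement(bibliography, "reference")
            # The three fields the slide says are marked up.
            for field in ("author", "title", "year"):
                if field in ref:
                    ET.SubElement(entry, field).text = ref[field]
    return article


record = build_article_record(
    "PMIHES_1962__12__5_0", ["Bernard Dwork"], 1962, 12, 5, 68,
    references=[{"author": "PLACEHOLDER AUTHOR",
                 "title": "PLACEHOLDER TITLE",
                 "year": "1900"}],
)
print(ET.tostring(record, encoding="unicode"))
```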

8 Production choices. An external operator carries out the technical processing. Study, segmentation, cataloguing, quality control and display are done in house. Quality and durability policy: prefer standard, easily convertible formats (TIFF, XML) as sources for any future processing, and avoid being tied to a proprietary system. Archive high-quality images from which the text can later be regenerated (formula OCR, structure recognition).

9 NUMDAM Phase I: journals

10 NUMDAM Phase I: chronology (most recent first).
Spring 2003: end of the industrial phase of NUMDAM Phase I; public access to articles via the web.
Autumn 2002: start of NUMDAM Phase II; work on copyright issues continues.
August 2002: first 50,000 pages delivered by the vendor.
Feb. to May 2002: setting up the production chain (vendor) and quality control (Cellule MathDoc); work on copyright issues.
Dec. 2001: choice of vendor validated by CNRS.
Nov. 2000 to Oct. 2001: cataloguing and checking the database.
Oct. 2000 to May 2001: writing the schedule of conditions for the vendor.
July 2000: funding by CNRS.

11 NUMDAM Phase II. Take an active part in the Digital Mathematics Library project. Cooperate with other digitisation projects (Gallica–BnF, possibly the digitisation part of EMANI). Take an inventory of resources and cooperate with historians and mathematicians to make scientific choices and establish priorities, in order to: digitise all French mathematics journals (Annales de l’Institut Henri Poincaré, Annales de l’Université de Toulouse, Comptes Rendus de l’Académie, Journal de l’École polytechnique, ...), and possibly some mathematically important general science journals; digitise important seminar series (séminaires Bourbaki, Cartan, séminaire de Probabilités de Strasbourg, ...); digitise a substantial set of important monographs.

12 NUMDAM programme: overview (diagram; labels only). Software developments: SQL, XML, quality control, authors id and ©, display, links. Database maintenance. Quality control. Schedule of technical conditions. Vendor: digitisation, segmentation, treatments (OCR and bibliographies). Display: search and browsing. Links: JFM, MR, ZM. Examination of collections and setting up the database. Copyright issues and negotiations with publishers.

13 Quality control procedure (flowchart; labels only, arrows lost). Files received from vendor (TIFF; XML, TIFF, PDF and DjVu). Automatic control (Perl) and log of errors (LOG). Sorting of samples (Perl) and samples (TIFF files; XML, TIFF, PDF, DjVu). Visual control with a check-list (PHP) and log of errors in a MySQL database. Synthesis. Rejection or validation.
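One possible reading of this flowchart is: an automatic check on the delivered files, a sample drawn for visual inspection, and a final decision based on both logs. The sketch below follows that reading in Python, although the slide names Perl and PHP for the actual tools. The file extensions, the completeness check, the sampling fraction and all function names are assumptions; the slide does not describe what the real controls test.

```python
import random
from collections import defaultdict
from pathlib import PurePath

# Assumed extensions for the delivered formats (TIFF, XML, PDF, DjVu).
EXPECTED_FORMATS = {".tif", ".xml", ".pdf", ".djvu"}


def automatic_control(filenames):
    """Return a log of errors: one line per item missing an expected format."""
    formats_by_item = defaultdict(set)
    for name in filenames:
        path = PurePath(name)
        formats_by_item[path.stem].add(path.suffix.lower())
    log = []
    for item in sorted(formats_by_item):
        for suffix in sorted(EXPECTED_FORMATS - formats_by_item[item]):
            log.append(f"{item}: missing {suffix} file")
    return log


def draw_sample(items, fraction, seed):
    """Pick a reproducible random sample of items for visual control."""
    ordered = sorted(items)
    if not ordered:
        return []
    size = min(len(ordered), max(1, round(len(ordered) * fraction)))
    return sorted(random.Random(seed).sample(ordered, size))


def synthesis(automatic_log, visual_log):
    """Reject the delivery if either control logged an error."""
    return "rejection" if automatic_log or visual_log else "validation"


delivery = ["a.tif", "a.xml", "a.pdf", "a.djvu", "b.tif", "b.xml"]
errors = automatic_control(delivery)
print(errors)  # ['b: missing .djvu file', 'b: missing .pdf file']
print(synthesis(errors, visual_log=[]))  # rejection
```

The fixed `seed` makes a sample reproducible, which may help when a rejected batch is redelivered and checked again.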

14 NUMDAM programme: XML description of physical volumes

15 Publications Mathématiques de l’Institut des Hautes Études Scientifiques. Physical volume: year 1962, volume 12.

16 A paper in a physical volume: article by Bernard Dwork in Publications Mathématiques IHÉS, 12 (1962), 5-68.

18 Bibliographies

19 Cross-linking (diagram; labels only). External databases: JFM, MR, ZM, ... NUMDAM side: DB of articles and DB of images, EDBM, SQL, PDF, DjVu. Example identifiers shown: MR 28#3039, ZM 0173.48601, MR 10,592e, ZM 0032.39402, PMIHES_1962__12__5_0.
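The identifier PMIHES_1962__12__5_0 appears to encode the journal, year, volume and first page of the Dwork article (Publications Mathématiques IHÉS, 12 (1962), 5-68). The sketch below splits an identifier on that assumption. The layout is inferred from this one example rather than from any documented specification, and the two empty fields and the trailing field are deliberately left unnamed because the slides do not say what they hold.

```python
def parse_article_id(identifier: str) -> dict:
    """Split an identifier such as 'PMIHES_1962__12__5_0' into named parts.

    Assumed layout (inferred from the slide's example only):
    journal _ year _ ? _ volume _ ? _ first page _ ?
    """
    parts = identifier.split("_")
    if len(parts) != 7:
        raise ValueError(f"unexpected identifier layout: {identifier!r}")
    journal, year, _unknown_a, volume, _unknown_b, first_page, _suffix = parts
    return {
        "journal": journal,
        "year": int(year),
        "volume": volume,
        "first_page": first_page,
    }


print(parse_article_id("PMIHES_1962__12__5_0"))
# {'journal': 'PMIHES', 'year': 1962, 'volume': '12', 'first_page': '5'}
```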

20 MR and NUMDAM: MR-lookup. Diagram labels: NUMDAM database, MR-lookup, MR.
Record sent: |Publications IHES|Shih||13||1962||PMIHES_1962__13__5_0||
Record returned: |Inst. Hautes Etudes Sci. Publ. Math.|Shih||13||1962||PMIHES_1962__13__5_0| 26#1893|Homologie des espaces fibr\'es.
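The pipe-delimited records on this slide can be read mechanically. The sketch below assigns field names by position, based only on the single example shown: journal, author, volume, year, NUMDAM identifier, then the review number and title that the lookup fills in. The field names and the function `parse_lookup_line` are assumptions, and the empty positions are left uninterpreted.

```python
def parse_lookup_line(line: str) -> dict:
    """Parse one pipe-delimited lookup record into a dict.

    Field positions are inferred from the example on the slide; the
    empty fields at positions 3, 5 and 7 are not interpreted.
    """
    fields = line.strip().split("|")
    if len(fields) != 11 or fields[0] != "":
        raise ValueError(f"unexpected record layout: {line!r}")
    return {
        "journal": fields[1],
        "author": fields[2],
        "volume": fields[4],
        "year": fields[6],
        "numdam_id": fields[8],
        # Empty before the lookup, filled in by the answer.
        "review_number": fields[9].strip() or None,
        "title": fields[10].strip() or None,
    }


query = r"|Publications IHES|Shih||13||1962||PMIHES_1962__13__5_0||"
answer = (r"|Inst. Hautes Etudes Sci. Publ. Math.|Shih||13||1962||"
          r"PMIHES_1962__13__5_0| 26#1893|Homologie des espaces fibr\'es.")

print(parse_lookup_line(query)["review_number"])   # None
print(parse_lookup_line(answer)["review_number"])  # 26#1893
```

Matching the `numdam_id` of the answer against the query is what would let the review number be stored next to the right article.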

21 JFM, ZM and NUMDAM: ZM-lookup. New identification tool in development in the LIMES framework (EU project). Diagram labels: NUMDAM database, ZM-lookup, ZM.
Record sent: |Publications IHES|Shih||13||1962||PMIHES_1962__13__5_0||
Record returned: |Inst. Hautes Etudes Sci. Publ. Math.|Shih||13||1962||PMIHES_1962__13__5_0| 0105.16903|Homologie des espaces fibr\'es.

22 Identification of authors: two purposes. Improve search facilities by setting up a reference list of authors. Provide a tool to help address copyright issues.

23 Internal tool...

28 NUMDAM: search interface based on EDBM (in development)

29 Links to JFM, MR and ZM; abstract if available.

30 NUMDAM URLs.
Main: www-mathdoc.ujf-grenoble.fr/NUMDAM/
Visitors (sample files): www-mathdoc.ujf-grenoble.fr/NUMDAM/Visitors/ (Login: VISITORS, Pwd: v\to\num)
LiNuM (books at BnF, Cornell, Göttingen, Michigan): www-mathdoc.ujf-grenoble.fr/LiNuM/
Journal de Mathématiques Pures et Appliquées 1836-1880 (BnF): www-mathdoc.ujf-grenoble.fr/JMPA/
Search NUMDAM database: math-sahel.ujf-grenoble.fr/NUMDAM/Public/Bd/consultation.htm
Inventory: math-sahel.ujf-grenoble.fr/NUMDAM/Public/Inventaire/inventaire.htm

31 Thank you for your attention...

