IntroductionDC Metadata towards an e-research cyberinfrastructure The case of French ETDs
IntroductionDC Summary Introducing ARTIST and authors of this collective work Main actors operating on French ETDs Their roles; Their metadata 3 case studies Creating metadata: thinking about reusability Thematic survey around biodiversity: thinking vocabularies Institutional survey: thinking ontologies Conclusion
IntroductionDC Authors Jacques DUCLOY – INIST, ARTIST Yann Nicolas – ABES Diane Le Hénaff - INRA Muriel FOULONNEAU – now CCSD Luc GRIVEL – Univ. Paris 1 Jean-Paul DUCASSE – Univ. Lyon 2 + several small contributions and comments
IntroductionDC ARTIST members Networked workshop Appropriation by Research Communities of Technologies of Scientific & Technical Information Community of: Researchers & Engineers Information Science & Information Technologies working in research world
IntroductionDC ARTIST topics How to build Digital Library or scholarly publishing applications dealing with e-science? New approaches to doing research or science within a cyberinfrastructure
IntroductionDC ARTIST e-Science experimentations Scientific forum Sample: cooperative linguistic discussion Carl Lagoze paper about DL What is a Digital Library? Scientific journal: AMETIST Appropriation, Mutualisation, Experimentation Digital writing Richer on-line version Experiment becomes “reproducible” Paper view Scientific focus and evaluation purpose Digital Library experimentations (metadata -> DL) Cooperative writing: this article
IntroductionDC This article: metadata for e-science Not only for: (scientific) information retrieval But also for: + research evaluation + federative digital libraries + research policy oriented studies + scientific surveys
IntroductionDC main entities dealing with French ETDs Universities And their national organization EPST (National Research Institute) Sample CNRS : 30,000 people European framework Each country gets its own organization + European actions (Delos, DRIVER…) Francophone framework International framework Networked Digital Library of Theses and Dissertations (NDLTD), ePrints application profile
IntroductionDC Translation difficulties showing different approaches Thèse = ETD Thesis (English context) Dissertation (US context) Veille scientifique Scientific survey (using informetric tools) In order to discover innovations Pilotage de la recherche research policy oriented studies Strong role of French administration
ActorsDC Actor: Cyberthèses Cyberdocs OpenOffice + XSLT Word + style Xml / TEI PDF Jury ETD
ActorsDC Cyberthèses metadata Thesis document: Xml, TEI-Lite Xml version must be readable by a human being Metadata: DC ETD-ms (Electronic Thesis & Dissertation Metadata Standard) Further related axis: TEI header, Latex -> MathML
ActorsDC Actor: ABES Ministry of Education Agency Star University: ETD Union catalog Persistent Identifier Sudoc portal Preservation Dissemination (CINES)
IntroductionDC Abes / Star metadata TEF (Thèses Electroniques Françaises) AFNOR standard (French member of ISO) Dublin Core Qualified With several ETD adaptations (jury…) METS Rights Using Schematron
ActorsDC Actor: CCSD Centre pour la Communication Scientifique Directe HAL: National Archive International Archive (Arxiv, Driver…) Local Archive Thematic Archive TEL … Inserm Researcher author Researcher reader
IntroductionDC CCSD metadata At the beginning: local schema Strong relationship between: Author University, laboratory, research team DC export / OAI-PMH
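The DC export over OAI-PMH mentioned on this slide can be sketched as follows. This is a minimal illustration, not CCSD's implementation: the base URL `https://repository.example.org/oai` and the sample record are hypothetical, and only the standard `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository (base_url is hypothetical)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_dc_records(xml_text):
    """Extract the Dublin Core fields of each record as {element: [values]}."""
    root = ET.fromstring(xml_text)
    records = []
    for dc in root.iter("{%s}dc" % OAI_DC_NS):
        fields = {}
        for child in dc:
            if child.tag.startswith("{%s}" % DC_NS):
                name = child.tag.split("}", 1)[1]
                fields.setdefault(name, []).append((child.text or "").strip())
        records.append(fields)
    return records


# Invented sample response, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:type>Thesis</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(parse_dc_records(SAMPLE))
```

Simple DC flattens the author, university, laboratory and team relationships of the local schema into repeated text fields, which is one reason the export alone is not enough for the uses discussed later.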
IntroductionDC Hal: institutional repository A French advantage related to open archives « Protocole d’accord Universités EPST sur les dépôts par les chercheurs » (agreement between universities and EPSTs on deposits by researchers) In CNRS each researcher must produce an activity report which is generated by Hal/CCSD Some scientific heads can request a CCSD deposit
ActorsDC Actor: INIST/CNRS Institut de l’Information Scientifique et Technique Pascal & Francis bibliographic data bases 15,000,000 XML records Scientific Portals, Scientific information analysis Vocabularies: termSciences
IntroductionDC INIST - metadata Bibliographic records Exodic Origin: CCF based MARC format Translated into SGML in 1992, now with a DCQ approach Strong links between authors and affiliations termSciences ISO (TMF)
ActorsDC Thesis Cyberthèses STAR Local archive CCSD Articles OAI-PMH The landscape we would like to have INIST
IntroductionDC Case study: sharing theses and their metadata Thinking about reusability by several actors who interoperate during the ETD life cycle Inra unit Ecole doctorale Inra University Abes/star ETD Univ. Lab.
IntroductionDC Sharing metadata Administrative metadata are required for a fairly complex workflow Contents must be matched A given person could have different names… Different ways of naming units… Different classification schemes…
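The matching problem for person names can be illustrated with a small sketch. It assumes only two input conventions, "Given Surname" and "Surname, Given", and reduces each name to a crude key (surname, first initial). The function names are hypothetical and the heuristic is deliberately naive: multi-word surnames without a comma, such as "Le Hénaff", are not handled.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Return a crude matching key (surname, first initial) for a person name."""
    # Strip accents so that "Hénaff" and "Henaff" compare equal.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Treat dots and hyphens as separators: "J.-P." becomes "J P".
    plain = plain.replace(".", " ").replace("-", " ")
    if "," in plain:
        # "Surname, Given" convention.
        surname, given = plain.split(",", 1)
    else:
        # "Given Surname" convention: the last token is taken as the surname.
        parts = plain.split()
        if not parts:
            return ("", "")
        surname, given = parts[-1], " ".join(parts[:-1])
    return (surname.strip().lower(), given.strip()[:1].lower())


def group_variants(names):
    """Group name strings that share the same matching key."""
    groups = defaultdict(list)
    for n in names:
        groups[name_key(n)].append(n)
    return dict(groups)


if __name__ == "__main__":
    variants = ["Jean-Paul DUCASSE", "Ducasse, J.-P.", "Luc GRIVEL", "Grivel, L."]
    for key, members in group_variants(variants).items():
        print(key, members)
```

A key of this kind will also merge distinct people who share a surname and an initial, so in practice it can only propose candidates for a shared authority file, not decide matches.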
IntroductionDC Case study 2: BiodivERsA BiodivERsA: European research policy network about biodiversity x * 10 funding agencies, y * 100 research programs, z * 1000 projects, x1 * results … distributed network of CRIS CRIS: Current Research Information System
IntroductionDC Vocabulary adaptations CRIS… Archive DL Thematic CRIS Thematic Archive Thematic DL Global DL Classification schemas must be matched for computation purpose (funding evaluation)
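Matching classification schemes for computation can be sketched as a crosswalk table that maps each local code to a shared topic before aggregating. Everything below is invented for illustration: the scheme names `cris_a` and `cris_b`, the codes, the topics and the funding figures are hypothetical and do not come from BiodivERsA.

```python
from collections import Counter

# Hypothetical crosswalk: (local scheme, local code) -> shared topic.
CROSSWALK = {
    ("cris_a", "ECO-01"): "biodiversity/ecosystems",
    ("cris_b", "B12"): "biodiversity/ecosystems",
    ("cris_b", "B14"): "biodiversity/genetics",
}


def to_common(scheme, code, crosswalk=CROSSWALK):
    """Map a local classification code to the shared topic, or None if unmapped."""
    return crosswalk.get((scheme, code))


def funding_by_topic(projects, crosswalk=CROSSWALK):
    """Sum funding per shared topic and list the projects that could not be mapped."""
    totals = Counter()
    unmapped = []
    for project in projects:
        topic = to_common(project["scheme"], project["code"], crosswalk)
        if topic is None:
            unmapped.append(project)
        else:
            totals[topic] += project["funding"]
    return dict(totals), unmapped


if __name__ == "__main__":
    projects = [
        {"id": "p1", "scheme": "cris_a", "code": "ECO-01", "funding": 100},
        {"id": "p2", "scheme": "cris_b", "code": "B12", "funding": 50},
        {"id": "p3", "scheme": "cris_b", "code": "B14", "funding": 70},
        {"id": "p4", "scheme": "cris_b", "code": "Z99", "funding": 10},
    ]
    print(funding_by_topic(projects))
```

Returning the unmapped projects separately keeps the gaps in the crosswalk visible, which matters when the totals feed a funding evaluation.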
IntroductionDC Case study 3 Affiliation must be managed with ontologies UHP, CNRS, INRIA; CRIN, Loria, Inria Lorraine, Inria Sophia; YT, Cortex, Oméga, Orpailleur
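One way to read "managed with ontologies" is as a part-of graph in which a unit can have several parent bodies. The sketch below is an assumption-laden illustration: the slide only lists the names, so the edges chosen here (two teams attached to Loria, Loria attached to three parent institutions) are assumptions made for the example, not relations stated in the source.

```python
# Assumed part-of edges, for illustration only: child -> list of parents.
# A joint laboratory can have several parent institutions.
PART_OF = {
    "Orpailleur": ["Loria"],
    "Cortex": ["Loria"],
    "Loria": ["UHP", "CNRS", "INRIA"],
}


def ancestors(unit, part_of=PART_OF):
    """Return every body the unit belongs to, directly or transitively."""
    seen = set()
    stack = [unit]
    while stack:
        current = stack.pop()
        for parent in part_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def belongs_to(unit, institution, part_of=PART_OF):
    """True if the unit is the institution itself or one of its descendants."""
    return unit == institution or institution in ancestors(unit, part_of)


if __name__ == "__main__":
    print(sorted(ancestors("Orpailleur")))
    print(belongs_to("Cortex", "CNRS"))
```

With such a graph, a thesis affiliated with a team can be counted for each parent institution in an institutional survey, which a flat affiliation string cannot support.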
IntroductionDC A technical conclusion Metadata (DCQ) is good but not sufficient We need vocabulary adaptations, ontologies Sharing several repositories (vocabularies, affiliations…) Managing metadata history etc
IntroductionDC The very conclusion We need to help people work together and make compromises We need researchers to become owners of their scientific information system… We need librarians to appropriate technologies and to help researchers appropriate the librarian's point of view We need computer science engineers to appropriate library and publishing issues That is what we try to help to do with ARTIST
IntroductionDC Thank you for listening Thank you for your questions…