Integration of an Automatic Indexing System within the Document Flow of a Grey Literature Repository. Jindřich Mynarz, Ctibor Škuta.

Presentation transcript:

Integration of an Automatic Indexing System within the Document Flow of a Grey Literature Repository
Jindřich Mynarz, Ctibor Škuta
National Technical Library
Grey Literature 12 Conference,

Indexing of Grey Literature
  self-publishing, self-indexing
  the Web made publishing easier; can it make indexing easier as well?
  make non-professional indexing better through technology
  increase grey literature visibility and support navigation interfaces

Automatic Indexing
  conditional on full-text availability
  machine learning based on analysis of language corpora
  automatic term assignment
  automatic suggestions of indexing terms
  lessen the cognitive overhead involved in indexing
  human feedback to correct the obvious mistakes

Implementation
  re-use of existing components
    o combination and extension
  open source, open formats
  subject headings system + digital repository + automatic indexer + text corpus + glue code = automatic indexing system

Subject Heading System
  Polythematic Structured Subject Headings System
    o universal Czech-English controlled vocabulary managed and used at the National Technical Library
    o expressed in RDF data format via SKOS vocabulary
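To make the SKOS representation concrete, here is a minimal sketch of reading a bilingual concept from RDF/XML using only the Python standard library. The record, its example.org URIs, and its labels are invented for illustration and are not taken from the actual subject headings data; the function name read_concepts is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record: URIs and labels are made up for illustration only.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/headings/1">
    <skos:prefLabel xml:lang="cs">knihovny</skos:prefLabel>
    <skos:prefLabel xml:lang="en">libraries</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/headings/0"/>
  </skos:Concept>
</rdf:RDF>
"""


def read_concepts(rdf_xml):
    """Return {concept URI: {"labels": {lang: label}, "broader": [URIs]}}."""
    root = ET.fromstring(rdf_xml)
    concepts = {}
    for node in root.iter(f"{{{SKOS}}}Concept"):
        uri = node.get(f"{{{RDF}}}about")
        # One preferred label per language tag (here: Czech and English).
        labels = {
            label.get(XML_LANG): label.text
            for label in node.findall(f"{{{SKOS}}}prefLabel")
        }
        # Links to broader concepts give the vocabulary its hierarchy.
        broader = [
            link.get(f"{{{RDF}}}resource")
            for link in node.findall(f"{{{SKOS}}}broader")
        ]
        concepts[uri] = {"labels": labels, "broader": broader}
    return concepts


if __name__ == "__main__":
    for uri, data in read_concepts(SAMPLE).items():
        print(uri, data["labels"], data["broader"])
```

A plain XML parser only handles this one RDF/XML serialization shape; a production setup would more likely use a dedicated RDF library.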

Digital Repository
  CDS Invenio
    o open source, modular architecture
    o extensions to the interface for entering new documents and the search interface

Automatic Indexer
  Maui Indexer
    o automatic term assignment with a controlled vocabulary
    o extensions for Czech language (stemmer, stopwords)
    o indexing model for Czech language with usage of PSH
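The idea of term assignment against a controlled vocabulary can be illustrated with a toy sketch. This is not Maui's algorithm or API: it only counts stemmed vocabulary matches, whereas the slides describe a trained indexing model. The vocabulary, stopword list, suffix list, and sample sentence below are all invented for illustration, and the suffix stripping is far cruder than a real Czech stemmer.

```python
import re
from collections import Counter

# All three resources below are hypothetical stand-ins.
VOCABULARY = ["knihovna", "repozitář", "indexace"]
STOPWORDS = {"a", "v", "do", "na", "se", "je"}
SUFFIXES = ("ami", "ách", "ou", "a", "e", "i", "u", "y")  # longest first


def stem(word):
    """Strip one common ending, keeping at least four characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def suggest_terms(text, top_n=3):
    """Rank vocabulary headings by how often their stem occurs in text."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(stem(t) for t in tokens if t not in STOPWORDS)
    scored = [
        (counts[stem(heading)], heading)
        for heading in VOCABULARY
        if counts[stem(heading)] > 0
    ]
    # Most frequent first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [heading for _, heading in scored[:top_n]]


TEXT = "Knihovny ukládají dokumenty do repozitáře. Indexace v knihovnách je pracná."

if __name__ == "__main__":
    print(suggest_terms(TEXT))
```

Only headings from the vocabulary can ever be suggested, which is the property that distinguishes term assignment from free keyword extraction.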

Text Corpus
  National Repository of Grey Literature
    o maintained by the National Technical Library
    o aggregates documents from partner institutions
    o in some cases, metadata are created by the users

Glue Code
  code to tie all pieces together
  web services
    o loose coupling
    o re-use of existing code
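One way to picture loose coupling through a web service is a small HTTP endpoint that accepts document text and returns suggested headings, so the repository never calls the indexer directly. The sketch below is an assumption about what such glue could look like, not the authors' code or an Invenio interface: the JSON request and response shape, the app and call names, and the placeholder suggest_terms function are all hypothetical.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def suggest_terms(text):
    """Placeholder for the real indexer call (hypothetical headings)."""
    headings = ["grey literature", "indexing", "repositories"]
    lowered = text.lower()
    return [h for h in headings if h in lowered]


def app(environ, start_response):
    """WSGI endpoint: POST {"text": "..."} returns {"suggestions": [...]}."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    try:
        raw = environ["wsgi.input"].read(length)
        text = json.loads(raw.decode("utf-8"))["text"]
        if not isinstance(text, str):
            raise TypeError("text must be a string")
    except (ValueError, KeyError, TypeError):
        error = json.dumps({"error": "expected JSON with a 'text' field"})
        start_response("400 Bad Request", [("Content-Type", "application/json")])
        return [error.encode("utf-8")]
    body = json.dumps({"suggestions": suggest_terms(text)}).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call(wsgi_app, method, body=b""):
    """Invoke a WSGI app in-process, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = method
    environ["CONTENT_LENGTH"] = str(len(body))
    environ["wsgi.input"] = io.BytesIO(body)
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = wsgi_app(environ, start_response)
    return captured["status"], b"".join(chunks)


if __name__ == "__main__":
    request = json.dumps({"text": "Indexing grey literature"}).encode("utf-8")
    print(call(app, "POST", request))
```

Because the repository side only needs to know the request and response format, the indexer behind the endpoint can be replaced or retrained without touching the submission interface. To serve the endpoint over HTTP for local experiments, the standard library's wsgiref.simple_server.make_server can wrap app.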

User Interface Design Considerations
  opt-in indexing procedure
  suggest indexing headings
  autocomplete headings' fragments
  learn by example: show example documents indexed with the heading in question
  extending search interface
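Autocompletion of heading fragments can be sketched as a prefix lookup over a sorted list of headings. The slides do not say how completion is implemented, so treating a "fragment" as a case-insensitive prefix is an assumption here, as are the class name HeadingAutocomplete and the sample headings.

```python
import bisect


class HeadingAutocomplete:
    """Case-insensitive prefix completion over a fixed list of headings."""

    def __init__(self, headings):
        # Sorting by case-folded form puts all matches for a prefix
        # into one contiguous run that binary search can locate.
        self._entries = sorted((h.casefold(), h) for h in headings)
        self._keys = [key for key, _ in self._entries]

    def complete(self, fragment, limit=5):
        """Return up to `limit` headings starting with `fragment`."""
        prefix = fragment.casefold()
        if not prefix:
            return []
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key, heading in self._entries[start:start + limit]:
            if not key.startswith(prefix):
                break  # past the end of the matching run
            matches.append(heading)
        return matches


# Hypothetical headings for illustration.
HEADINGS = [
    "Information science",
    "Informatics",
    "Indexing",
    "Libraries",
    "information retrieval",
]

if __name__ == "__main__":
    completer = HeadingAutocomplete(HEADINGS)
    print(completer.complete("inform"))
```

Restricting completions to existing headings nudges non-professional indexers toward the controlled vocabulary instead of free-text keywords.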

Further Possibilities and Challenges
  indexing must be reflected in end-user interfaces
  continuous enhancements of the individual parts of the document processing pipeline
  user-generated indexing feeding back into the development of the subject headings system

Thank you for your attention!