| 29 | Machine-based issuing of DNB Subject Categories and DDC Short Numbers for Medicine | 25. April 2016 1 Machine-based issuing of DNB Subject Categories.

Presentation transcript:

Machine-based issuing of DNB Subject Categories and DDC Short Numbers for Medicine in the German National Library
Frank Busse

Outline
1. General Information
2. Automatic Classification of DNB Subject Categories
3. Automatic Classification of DDC Short Numbers for Medicine

General Information

Automated Cataloguing – why?

Timeline
• 2009 Start of PETRUS project
• 2010 Ceasing of intellectual cataloguing of online publications
• 2012 Automatic classification / DNB Subject Categories
• 2014 Automatic indexing
• 2015 Automatic classification / DDC Short Numbers
• 2015 PETRUS project completed

Subject Cataloguing at the DNB
Further information:
• Subject cataloguing
• DNB Subject Categories
• Subject headings
• DDC numbers

Automatic Classification of DNB Subject Categories

DNB Subject Categories
• Since 2004
• Based on Dewey Decimal Classification (DDC)
• 102 categories

Examples of Subject Categories
• Economics
• 560 Paleontology
• 640 Home and family management

Automatic Classification
• Start: 2012
• Method: machine learning / SVM
• Document type:
  • All online publications / without fiction
  • PDF (since 2012)
  • EPUB (since 2015)
• Language: Ger/Eng
• Volume: online publications (03/2016)

Machine Learning
• Supervised learning (learning by example)
• Pattern recognition
• Generalization of rules
• Classifying unknown objects
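The idea of learning by example and then classifying unknown objects can be illustrated with a very small linear SVM. This is only a sketch under stated assumptions: it is not the Averbis software, the training texts are invented, the one-vs-rest setup and the Pegasos-style update are choices made here for brevity, and only the category numbers 560, 610 and 640 are taken from the slides.

```python
import math
import re
from collections import Counter

def featurize(text):
    """Bag-of-words vector, L2-normalised, stored as a sparse dict."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}

def dot(w, x):
    return sum(w.get(tok, 0.0) * v for tok, v in x.items())

def train_binary(samples, epochs=50, lam=0.01):
    """Linear SVM (hinge loss + L2 penalty) with Pegasos-style sub-gradient steps.
    samples: list of (feature dict, label) with label in {-1, +1}."""
    w, t = {}, 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            violated = y * dot(w, x) < 1.0   # margin check before the update
            shrink = 1.0 - 1.0 / t           # effect of the L2 penalty
            for tok in w:
                w[tok] *= shrink
            if violated:
                step = 1.0 / (lam * t)
                for tok, v in x.items():
                    w[tok] = w.get(tok, 0.0) + step * y * v
    return w

def train_one_vs_rest(labelled_texts):
    """One binary SVM per category: this category against all others."""
    data = [(featurize(text), label) for text, label in labelled_texts]
    labels = sorted({label for _, label in data})
    return {
        label: train_binary([(x, 1 if y == label else -1) for x, y in data])
        for label in labels
    }

def predict(models, text):
    """Pick the category whose SVM gives the highest score."""
    x = featurize(text)
    return max(models, key=lambda label: dot(models[label], x))

# Invented toy training data; real training uses full texts and ToCs.
TRAINING = [
    ("clinical study overweight children therapy", "610"),
    ("patients diagnosis therapy hospital", "610"),
    ("fossil dinosaur cretaceous sediment", "560"),
    ("fossil mammals extinction record", "560"),
    ("household budget cooking family", "640"),
    ("family cleaning household tips", "640"),
]

models = train_one_vs_rest(TRAINING)
print(predict(models, "therapy for overweight patients"))  # prints 610
```

The unseen text shares words only with the medical examples, so the 610 model scores it highest. A production system would add linguistic analysis and far more training data per category.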

Software
• Averbis GmbH / Freiburg im Breisgau
• Averbis Extraction Platform (AEP)
• Version 2.2.2a
• Improvements and further development

Workflow
Training:
• Base
• Create a model
• Software:
  • Averbis Software
Routine:
• Daily processing of new online publications
• Retro-processing
• Software:
  • Averbis Software
  • DNB Interface
  • CBS

Routine

Training
[Workflow diagram: Selection, Training data, Parameter setting, Linguistic analysis, Training]

Training Data
• Online publications & digitised Tables of Contents (ToC)
• Since 2004
• Language: Ger/Eng
• April 2016: Online publications & ToC


Parameter Setting
• Language
• Text length
• Metadata weighting
• Exclusion conditions
• etc.
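How such parameters might act on a document before training can be sketched as follows. The slides do not show the actual settings, so every field name, default value and the `prepare` helper below are hypothetical. Metadata weighting is modelled here simply by repeating the title.

```python
from dataclasses import dataclass

@dataclass
class TrainingParams:
    # All fields and defaults are hypothetical, for illustration only.
    languages: frozenset = frozenset({"ger", "eng"})  # language
    min_chars: int = 200                              # text length, lower bound
    max_chars: int = 50_000                           # text length, upper bound
    title_weight: int = 3                             # metadata weighting
    exclude_fiction: bool = True                      # exclusion condition

@dataclass
class Document:
    title: str
    full_text: str
    language: str
    is_fiction: bool = False

def prepare(doc, params):
    """Return the text to train on, or None if the document is excluded."""
    if doc.language not in params.languages:
        return None
    if params.exclude_fiction and doc.is_fiction:
        return None
    if len(doc.full_text) < params.min_chars:
        return None
    body = doc.full_text[: params.max_chars]
    # Repeating the title gives its words more weight in a bag-of-words model.
    return " ".join([doc.title] * params.title_weight + [body])

params = TrainingParams(min_chars=10, title_weight=2)
doc = Document("Obesity", "A study of overweight children.", "eng")
print(prepare(doc, params))
```

Keeping the settings in one object makes it easy to retrain with a different combination and compare the resulting models.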


Quality Management
[Diagram: sample check, data analysis, improvement]
Two ways of generating sample data:
• Intellectual supervision
• Comparison with printed edition
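The comparison with the printed edition can be pictured as a lookup: where an online publication has a printed counterpart that was classified intellectually, the two categories are compared. The sketch below assumes both assignments are available as dictionaries keyed by a shared record identifier. The identifiers, data and function name are invented for illustration.

```python
from collections import Counter

def compare_with_printed(machine, printed):
    """machine, printed: dicts mapping a record id to a subject category.
    Returns the agreement rate on shared records and a count of
    (printed category, machine category) disagreements."""
    shared = sorted(machine.keys() & printed.keys())
    confusions = Counter(
        (printed[rid], machine[rid])
        for rid in shared
        if machine[rid] != printed[rid]
    )
    correct = len(shared) - sum(confusions.values())
    rate = correct / len(shared) if shared else 0.0
    return rate, confusions

# Invented example records.
machine = {"rec1": "610", "rec2": "610", "rec3": "560", "rec4": "640"}
printed = {"rec1": "610", "rec2": "640", "rec3": "560", "rec5": "560"}

rate, confusions = compare_with_printed(machine, printed)
print(f"{rate:.2f}", dict(confusions))
```

The confusion counts feed the data analysis step: category pairs that are mixed up often point to where training data or parameters need improvement.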

Results
Classified objects:
Sample check: (18%)
Result: 75% correct

DDC Short Numbers for Medicine

DDC Short Numbers for Medicine
• Developed in 2006/2007
• Classification of printed medical theses
• Fast and time-saving

Example
Book content: Study Overweight Children Kiel
DNB-SC 610
DDC Short Number

DDC Short Numbers
• Start: Oct
• Method: machine learning / SVM
• Document type:
  • Subject Category 610 "Medicine and health"
  • Online publications (PDF / EPUB)
• Language: Ger/Eng
• Volume: online publications (03/2016)

Results October – December 2015
Classified objects:
Sample check: 574 (14%)
Result: 74% correct
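The sample figures on this slide allow a rough back-of-the-envelope reading. If 574 checked objects are 14% of all classified objects, the total is about 4,100 (the 14% is itself rounded). If the sample was drawn at random, which the slides do not state, a 95% Wilson interval puts the true share of correct classifications at roughly 70% to 77%. The sketch below does this arithmetic. The choice of the Wilson interval is an assumption made here, not a method from the talk.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives about 95% coverage."""
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return centre - half, centre + half

sample_size = 574     # from the slide
sample_share = 0.14   # from the slide (rounded)
accuracy = 0.74       # from the slide (rounded)

implied_total = round(sample_size / sample_share)    # classified objects, approx.
correct_in_sample = round(accuracy * sample_size)    # correct objects in the sample
low, high = wilson_interval(accuracy, sample_size)

print(implied_total, correct_in_sample, f"{low:.3f}", f"{high:.3f}")
```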

Future challenges
• Improve results
• Development of DDC Short Numbers for other DNB Subject Categories
• No "automatic DDC" with this tool

Thank you for your attention!
Questions?
Frank Busse
German National Library
Section Automatic Indexing, Online Publications