Unlock the books with IntelligentCAPTURE Xavier Baumgartner University of St. Gallen.



Outline
1 Background of the Project:
– Euregio Bodensee - Library Cooperation
– Project AGI and VLB (Vorarlberger Landesbibliothek)
– IBH = Internationale Bodensee-Hochschule
2 Project Partners:
– AGI:
– Libraries

Outline
3 Project Tools:
– intelligentCAPTURE
– IC CAI-Engine
– intelligentSEARCH
4 Project Results:
– Library catalogue:
– Portal:

1 Background
Euregio Bodensee
- Region extending for roughly 50 km around Lake Constance (Bodensee)
- Covers the southern German districts of Konstanz, Sigmaringen, Ravensburg, Lindau, Oberallgäu and Bodenseekreis
- Austrian province of Vorarlberg
- Swiss cantons of St. Gallen, Schaffhausen, Appenzell Innerrhoden and Appenzell Ausserrhoden
- Principality of Liechtenstein

1 Background Euregio Bodensee - Library Cooperation

1 Background
IBH = Internationale Bodensee-Hochschule (International Lake Constance University)
- Virtual university
- Network of 24 independent universities
- Aim: promote cooperation among member universities in the fields of science, research and infrastructure
- Use synergies to mutual advantage

2 Project Partners
AGI - Information Management Consultants
- Focused on information and knowledge management
- Consulting
- Software development and long-term maintenance
- Uses advanced recognition technologies in:
  - Automatic indexing and text mining (CAI)
  - Machine translation (MT)
  - Optical character recognition (OCR)
  - Recognition of text structures in PDF documents
  - Voice recognition

2 Project Partners
AGI - Information Management Consultants
Products:
- Based on the IBM technical platform Lotus Notes & Domino
- intelligentCAPTURE -> tool for document capture and machine indexing
- IC INDEX -> tool for developing topic maps, taxonomies, thesauri and classifications
- intelligentSEARCH -> tool for information retrieval and visualization

2 Project Partners
Libraries
- University of Applied Sciences Dornbirn
- University of Applied Sciences Kempten
- University of Applied Sciences Liechtenstein
- Zurich Central Library (for the University of Zurich)
- University of Applied Sciences Konstanz
- University of St. Gallen

3 Project tools
intelligentCAPTURE
- intelligentCAPTURE software installed locally and connected to a scanner
- Workflow:
  - Identification of the document via barcode
  - Scanning of the book's table of contents
  - Character recognition (OCR)
  - Quick check of the OCR result

3 Project tools
intelligentCAPTURE
- Workflow (cont.):
  - Generation of PDF file
  - Compression of files
  - Automatic indexing (CAI engine)
  - Transfer of PDF file to file system
  - Export of indexing results and PDF files:
    - to the local library system
    - to the local intelligentSEARCH database
    - to the central database, hosted by AGI
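The workflow on the two slides above can be sketched as an ordered list of steps followed by a fan-out to three export targets. This is only an illustration of the sequence: the class `CaptureJob`, the function `run_workflow`, the step identifiers and the sample barcode are all hypothetical names, not part of intelligentCAPTURE.

```python
from dataclasses import dataclass, field

# Step names paraphrase the slides; the identifiers themselves are hypothetical.
WORKFLOW = [
    "identify_by_barcode",
    "scan_table_of_contents",
    "ocr",
    "check_ocr_result",
    "generate_pdf",
    "compress_files",
    "automatic_indexing",
    "transfer_pdf_to_file_system",
]

# The three export targets named on the slide.
EXPORT_TARGETS = [
    "local library system",
    "local intelligentSEARCH database",
    "central database (hosted by AGI)",
]


@dataclass
class CaptureJob:
    """Hypothetical record of one captured book."""
    barcode: str
    steps_done: list = field(default_factory=list)
    exported_to: list = field(default_factory=list)


def run_workflow(barcode: str) -> CaptureJob:
    """Record each step in order, then export to every target."""
    job = CaptureJob(barcode=barcode)
    for step in WORKFLOW:
        # A real system would do the actual work here; the sketch only logs it.
        job.steps_done.append(step)
    job.exported_to.extend(EXPORT_TARGETS)
    return job


job = run_workflow("000123")  # made-up barcode
print(job.steps_done)
print(job.exported_to)
```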

3 Project tools
IC CAI-Engine
- Automatic indexing is much more specific and comprehensive than indexing the title alone or intellectual indexing with a controlled vocabulary
- Document analysis based on methods and procedures from computational linguistics
- All words are reduced to their linguistic base form (morphemes)
- Uses large semantic nets (thesauri, topic maps etc.)
- Statistical rules for relevance ranking
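A minimal sketch of the three ideas on this slide: reduction to a base form, lookup in a thesaurus, and a statistical ranking. It is not the CAI engine's method. The tiny `BASE_FORMS` and `THESAURUS` tables, the stopword list and the plain term-frequency ranking are assumptions chosen to keep the example short.

```python
import re
from collections import Counter

# Hypothetical toy data; the real engine relies on large semantic nets.
BASE_FORMS = {"libraries": "library", "books": "book", "indexing": "index"}
THESAURUS = {"library": "Libraries", "index": "Indexing", "book": "Books"}
STOPWORDS = {"the", "and", "in", "of"}


def base_form(word: str) -> str:
    """Map an inflected word to its base form (table lookup, not real morphology)."""
    return BASE_FORMS.get(word, word)


def index_text(text: str, top_n: int = 3) -> list:
    """Return thesaurus descriptors for the text, ranked by term frequency."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    lemmas = [base_form(w) for w in words if w not in STOPWORDS]
    counts = Counter(lemmas)
    ranked = [(THESAURUS[t], n) for t, n in counts.most_common() if t in THESAURUS]
    return ranked[:top_n]


toc = "Libraries and indexing. Books in libraries. The library index."
print(index_text(toc))
# → [('Libraries', 3), ('Indexing', 2), ('Books', 1)]
```

Plain term frequency stands in here for the "statistical rules"; the slides do not say which statistics the engine actually applies.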

3 Project tools
IC CAI-Engine
- Output of the most important terms in groups:
  - geographical terms
  - personal/corporate terms
  - branches / areas of activity
  - descriptors: words from the internal thesaurus
  - important words and phrases from the text
- Libraries: use a broad generic thesaurus of approx. 300,000 German terms and a smaller English thesaurus
- Languages: German and English in use, French and Spanish available
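The grouped output described above could look roughly like the following. The lookup sets, the group keys and the function `group_terms` are hypothetical; the slides do not describe how the engine decides which group a term belongs to.

```python
# Hypothetical lookup sets standing in for the engine's semantic nets.
GEOGRAPHICAL = {"Vorarlberg", "St. Gallen", "Konstanz"}
PERSONAL_CORPORATE = {"AGI", "IBM"}
BRANCHES = {"Publishing"}
DESCRIPTORS = {"Information retrieval", "Libraries"}


def group_terms(terms: list) -> dict:
    """Sort extracted terms into the five output groups named on the slide."""
    groups = {
        "geographical": [],
        "personal/corporate": [],
        "branches": [],
        "descriptors": [],
        "words and phrases": [],
    }
    for term in terms:
        if term in GEOGRAPHICAL:
            groups["geographical"].append(term)
        elif term in PERSONAL_CORPORATE:
            groups["personal/corporate"].append(term)
        elif term in BRANCHES:
            groups["branches"].append(term)
        elif term in DESCRIPTORS:
            groups["descriptors"].append(term)
        else:
            # Anything not found in a controlled list is kept as free text.
            groups["words and phrases"].append(term)
    return groups


print(group_terms(["Vorarlberg", "AGI", "Libraries", "table of contents"]))
```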

[Diagram: Library 1, Library 2 and Library 3, each with iCAPT, PDF, ILS and indexing components, connected to AGI]

3 Project tools
intelligentSEARCH
- Search engine with a simple (Google-like) interface and IBM GTR (Global Text Retrieval) as core engine
- Search terms entered by the user are automatically expanded semantically
- Main features of GTR:
  - Operators: Boolean, adjacency, near, paragraph, sentence
  - Right and left truncation, wildcard
  - Fuzzy searching
  - Sorting by relevance
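To make the operator types concrete, here is a small sketch of truncation, wildcard, fuzzy and Boolean matching over a toy vocabulary. It uses Python's standard `fnmatch` and `difflib` modules and has nothing to do with the GTR API or query syntax, which the slides do not show; `VOCAB`, `DOCS` and all function names are made up for the example.

```python
import difflib
from fnmatch import fnmatchcase

# Hypothetical index vocabulary and document sets.
VOCAB = ["library", "libraries", "librarian", "thesaurus", "retrieval", "catalogue"]
DOCS = {
    "library": {1, 2, 3},
    "thesaurus": {2, 4},
    "retrieval": {3, 4},
}


def truncation(pattern: str) -> list:
    """Right/left truncation and wildcards, e.g. 'librar*', '*aurus', 'catalog?e'."""
    return [w for w in VOCAB if fnmatchcase(w, pattern)]


def fuzzy(term: str, cutoff: float = 0.8) -> list:
    """Tolerate misspellings by returning vocabulary words similar to the term."""
    return difflib.get_close_matches(term, VOCAB, n=3, cutoff=cutoff)


def boolean_and(a: str, b: str) -> set:
    """Documents containing both terms."""
    return DOCS.get(a, set()) & DOCS.get(b, set())


def boolean_or(a: str, b: str) -> set:
    """Documents containing either term."""
    return DOCS.get(a, set()) | DOCS.get(b, set())


print(truncation("librar*"))                  # right truncation
print(truncation("*aurus"))                   # left truncation
print(fuzzy("retreival"))                     # misspelled query term
print(boolean_and("library", "thesaurus"))    # AND
print(boolean_or("thesaurus", "retrieval"))   # OR
```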

3 Project tools
intelligentSEARCH
- Features developed by AGI:
  - Highlighting
  - Interfaces to the library system, booksellers, and the web via Google
  - Query expansion by semantic nets
  - Visualization and browsing of topic maps
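Query expansion and highlighting can be illustrated together: the query is widened with related terms from a semantic net, and every expanded term found in the text is marked. The `SEMANTIC_NET` entries, the `**` markers and both function names are assumptions for this sketch, not intelligentSEARCH behaviour.

```python
import re

# Hypothetical semantic net: each term maps to related or translated terms.
SEMANTIC_NET = {
    "library": ["libraries", "bibliothek"],
    "index": ["indexing", "register"],
}


def expand(query: str) -> list:
    """Add related terms from the semantic net to each query word."""
    terms = []
    for word in query.lower().split():
        terms.append(word)
        terms.extend(SEMANTIC_NET.get(word, []))
    return terms


def highlight(text: str, terms: list) -> str:
    """Wrap every whole-word occurrence of a term in ** markers (case-insensitive)."""
    # Longer terms first, so 'libraries' is tried before 'library'.
    alternatives = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "**" + m.group(0) + "**", text)


terms = expand("library")
print(terms)
# → ['library', 'libraries', 'bibliothek']
print(highlight("The Bibliothek holds books", terms))
# → The **Bibliothek** holds books
```

Because the expansion runs before matching, a query for "library" also hits the German term, which is the effect the slides attribute to semantic expansion.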

4 Project Results
- Library OPAC Vorarlberger Landesbibliothek:
- Portal:

4 Project results
- Portal with semantic search engine (intelligentSEARCH)
- Content: automatically indexed contents pages of books and other publications; PDF files of the contents pages
- Search terms expanded semantically
- Relevance ranking
- Highlighting

4 Project results
- Links to libraries holding the book, to booksellers, and to internet search engines
- View topic maps