Document management (aka ‘digital libraries’) The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve.

Presentation transcript:

Document management (aka ‘digital libraries’) The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve Jones, Te Taka Keegan, Annika Hinze

Our work includes…
• Document management
• Content management
• Metadata management
• Multimedia documents
• Alerting and event notification support
• OCR-ing services
• Document & collection visualization
• User needs analysis
• Text mining
• Automatic metadata extraction

Greenstone software
• ‘Digital library’ construction, use, and maintenance software
• Developed at Waikato
• Open source
• Widely used internationally (UNESCO, FAO, Texas A&M Uni, Kyrgyz Republic, …)
Digital library: a collection of digital objects (text, video, audio) along with methods for access and retrieval [user], and for selection, organisation, and maintenance [librarian]

Greenstone software features
• “Library” = set of separate collections; “Collection” = set of separate documents
• Multigigabyte collections
• Hierarchical document model
• Multimedia: picture, voice, music, video collections
• Multi-language documents: Unicode throughout
• Multi-language interfaces: French, Chinese, Arabic …
• Web browser or CD-ROM
• Searching: full-text and fielded, ranked or boolean
• Browsing: hierarchical indexes created from metadata
• Metadata: Dublin Core + collection-specific extensions
• Plugins: different document types and metadata specifications
• Classifiers: create browsing indexes (collection editor decides)
• Compression techniques throughout: uses MG
• Distributed collections: coming soon, with CORBA
• Open-source software: free, extensible
Feature groups: Collections, Documents, Access, Importing, Distributing
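
The idea of a classifier that creates a browsing index from metadata can be illustrated with a minimal Python sketch. This is not Greenstone's own code or API: the sample documents, the `dc.Title` / `dc.Creator` / `dc.Subject` key names, and the `classify` function are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical documents with Dublin Core style metadata (illustrative only).
DOCS = [
    {"dc.Title": "Farming snails", "dc.Creator": "FAO", "dc.Subject": "Agriculture"},
    {"dc.Title": "Basic sanitation", "dc.Creator": "WHO", "dc.Subject": "Health"},
    {"dc.Title": "Food safety", "dc.Creator": "FAO", "dc.Subject": "Health"},
]

def classify(docs, field):
    """Group document titles under each distinct value of a metadata field."""
    index = defaultdict(list)
    for doc in docs:
        value = doc.get(field)
        if value is not None:  # documents lacking the field are left out
            index[value].append(doc["dc.Title"])
    # Sort bucket labels and the titles inside each bucket for stable browsing.
    return {label: sorted(titles) for label, titles in sorted(index.items())}

# The collection editor decides which fields become browsing indexes.
by_subject = classify(DOCS, "dc.Subject")
by_creator = classify(DOCS, "dc.Creator")
```

Each call yields one browsable index, so choosing a different metadata field gives a different view of the same collection without touching the documents themselves.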

Greenstone supports: multilanguage documents

Greenstone supports: hierarchically structured documents
Example: a book

Greenstone supports: collection design, maintenance
Designing a collection with the Gatherer

Greenstone supports: CD-ROM access
NGOs, e.g.
• UNESCO
• Global Help Project
• United Nations University
• World Health Organization
• Pan American Health Organization

Greenstone supports: a wide (and growing) set of file formats
DOC, PDF, XLS, LaTeX, Refer, MARC, …
Highly extensible through the ‘plugin’ mechanism
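
A plugin mechanism of this general kind can be sketched as a registry that maps file extensions to import handlers. The Python below is an assumption-laden illustration, not Greenstone's plugin interface: `PLUGINS`, `plugin`, `import_text`, `import_html`, and `import_document` are hypothetical names.

```python
from pathlib import Path

PLUGINS = {}  # maps a lowercase file extension to a handler function

def plugin(*extensions):
    """Register the decorated function as the handler for the given extensions."""
    def register(func):
        for ext in extensions:
            PLUGINS[ext.lower()] = func
        return func
    return register

@plugin(".txt")
def import_text(path, raw):
    return {"source": path, "format": "text", "text": raw.decode("utf-8")}

@plugin(".htm", ".html")
def import_html(path, raw):
    # A fuller HTML plugin would strip markup and harvest metadata; omitted here.
    return {"source": path, "format": "html", "text": raw.decode("utf-8")}

def import_document(path, raw):
    """Dispatch to the plugin registered for the file's extension."""
    handler = PLUGINS.get(Path(path).suffix.lower())
    if handler is None:
        raise ValueError(f"no plugin registered for {path}")
    return handler(path, raw)

doc = import_document("notes.TXT", b"hello")
```

Supporting a new format then means registering one more handler; the dispatch code stays unchanged, which is the sense in which such a design is extensible.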

Mobile document access
• Handheld information access
• Browsing methods for varying screen sizes
• Studies on search behaviour (on- and off-line)
• Support for non-text documents (FunkyZoom views of maps, images)

Browsing and exploration: hierarchical phrase index
• What’s in this collection?
• Is it any good?
• What coverage for topic X?
• My query returned too much/little, what now?
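
A toy version of a hierarchical phrase index can show how such questions might be answered: start from a word, then drill down into the longer phrases that contain it. The sketch below simply counts word n-grams; it is a deliberate simplification, not the algorithm Greenstone uses, and the sample texts and the `phrase_counts` and `expansions` functions are hypothetical.

```python
from collections import Counter

def phrase_counts(texts, max_len=3):
    """Count every word n-gram of length 1..max_len across the texts."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts

def expansions(counts, phrase, min_count=2):
    """Return phrases one word longer that contain `phrase`, most frequent first."""
    target = tuple(phrase.lower().split())
    n = len(target)
    found = [
        (" ".join(p), c)
        for p, c in counts.items()
        if len(p) == n + 1 and c >= min_count
        and (p[:n] == target or p[1:] == target)
    ]
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(found, key=lambda item: (-item[1], item[0]))

# Made-up sample texts, for illustration only.
TEXTS = [
    "forest management in tropical forest regions",
    "sustainable forest management",
    "tropical forest products",
]
counts = phrase_counts(TEXTS)
frequent = expansions(counts, "forest")
```

Calling `expansions` again on one of the returned phrases moves one level deeper, which gives a reader a rough sense of topic coverage without issuing a query.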

Recent and proposed projects
• Making documents mobile: moving between large online collections and a PDA
• Text mining: extracting quality metadata from legacy documents
• User needs analysis: what sort of documents do a given set of users require, and how can the collection be managed?
• Visualization: making it easy to ‘see’ what’s in a collection, and supporting effective browsing

Recent and proposed projects
• Multi-language collections: tailoring a document collection interface and interaction mechanisms to the language of its users
• Alerting services: bringing potentially useful documents to the user’s attention, without overwhelming them
• Supporting unusual users: collections for the physically disabled, illiterate or semi-literate, children, …
• Audio and image collections: novel browsing and searching mechanisms

Recent and proposed projects
• Storage and searching: developed highly efficient techniques for storing, indexing, and searching text documents; implemented in Greenstone, but portable to other document management software
• Usability analysis: how easy is it to use your current document collection? How can access be improved?
• And a host of wacky and cool things: collaging document collections, music retrieval systems, ‘aerial’ views of documents, …