Published by Nicholas Carpenter. Modified over 9 years ago.
1
Document management (aka ‘digital libraries’)
The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve Jones, Te Taka Keegan, Annika Hinze
2
Our work includes…
- Document management
- Content management
- Metadata management
- Multimedia documents
- Alerting and event notification support
- OCR-ing services
- Document & collection visualization
- User needs analysis
- Text mining
- Automatic metadata extraction
3
Greenstone software
- ‘Digital library’ construction, use, and maintenance software
- Developed at Waikato (www.greenstone.org)
- Open Source
- Widely used internationally (UNESCO, FAO, Texas A&M Uni, Kyrgyz Republic, …)
Digital library: a collection of digital objects (text, video, audio) along with methods for access and retrieval (for the user) and for selection, organisation, and maintenance (for the librarian)
4
Greenstone software features
- “Library” = set of separate collections
- “Collection” = set of separate documents
- Multigigabyte collections
- Hierarchical document model
- Multimedia: picture, voice, music, video collections
- Multi-language documents: Unicode throughout
- Multi-language interfaces: French, Chinese, Arabic …
- Web browser or CD-ROM
- Searching: full-text and fielded, ranked or boolean
- Browsing: hierarchical indexes created from metadata
- Metadata: Dublin Core + collection-specific extensions
- Plugins: different document types and metadata specifications
- Classifiers: create browsing indexes (collection editor decides)
- Compression techniques throughout: uses MG
- Distributed collections: coming soon, with CORBA
- Open-source software: free, extensible
Feature groups: Collections, Documents, Access, Importing, Distributing
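To make the idea of browsing indexes created from metadata concrete, here is a minimal sketch in Python. It is not Greenstone's classifier code: the documents, the `dc.Subject` values, the `/` separator, and the `build_hierarchy` function are all hypothetical, chosen only to show how a metadata field could be turned into a hierarchical index.

```python
# Hypothetical documents with Dublin Core style metadata (made-up data).
documents = [
    {"id": "d1", "dc.Title": "Water Supply Handbook", "dc.Subject": "Health/Water"},
    {"id": "d2", "dc.Title": "Basic Sanitation", "dc.Subject": "Health/Sanitation"},
    {"id": "d3", "dc.Title": "Growing Maize", "dc.Subject": "Agriculture/Crops"},
]


def build_hierarchy(docs, field, separator="/"):
    """Group document ids into a nested dict keyed by the parts of a metadata value."""
    root = {}
    for doc in docs:
        node = root
        # Walk (and create as needed) one level per part of the metadata value.
        for part in doc[field].split(separator):
            node = node.setdefault(part, {})
        # Documents are stored under the reserved "_docs" key of the final level.
        node.setdefault("_docs", []).append(doc["id"])
    return root


index = build_hierarchy(documents, "dc.Subject")
print(index)
```

In this sketch the person editing the collection would choose which metadata field to index, which loosely mirrors the slide's note that the collection editor decides which browsing indexes exist.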
5
Greenstone supports: multilanguage documents
6
Greenstone supports: hierarchically structured documents
Example: a book
7
Greenstone supports: collection design and maintenance
Example: designing a collection with the Gatherer
8
Greenstone supports: CD-ROM access
NGOs, e.g.
- UNESCO
- Global Help Project
- United Nations University
- World Health Organization
- Pan American Health Organization
9
Greenstone supports: a wide (and growing) set of file formats
- DOC
- PDF
- XLS
- LaTeX
- Refer
- MARC
- …
Highly extensible through ‘plugin’ mechanism
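One common way to get this kind of extensibility is a registry that maps file types to import handlers. The sketch below illustrates that general pattern only. It does not reproduce Greenstone's plugin API: `PLUGINS`, `plugin`, `import_text`, `import_pdf`, and `import_document` are hypothetical names, and the handlers return stub records instead of parsing real files.

```python
from pathlib import Path

# Hypothetical registry: file extension -> handler function.
PLUGINS = {}


def plugin(*extensions):
    """Decorator that registers a handler for one or more file extensions."""
    def register(func):
        for ext in extensions:
            PLUGINS[ext.lower()] = func
        return func
    return register


@plugin(".txt")
def import_text(path):
    # Stub: a real handler would read the file and extract text and metadata.
    return {"format": "text", "source": path.name}


@plugin(".pdf")
def import_pdf(path):
    # Stub: a real handler would call a PDF conversion tool here.
    return {"format": "pdf", "source": path.name}


def import_document(filename):
    """Dispatch a file to the handler registered for its extension."""
    path = Path(filename)
    handler = PLUGINS.get(path.suffix.lower())
    if handler is None:
        raise ValueError(f"no plugin registered for {path.suffix!r}")
    return handler(path)


print(import_document("report.PDF"))
```

Under this design, supporting a new format means adding one decorated function, with no change to the dispatch code.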
10
Mobile document access
- Handheld information access
- Browsing methods for varying screen sizes
- Studies on search behaviour (on- and off-line)
- Support for non-text documents (FunkyZoom views of maps, images)
11
Browsing and exploration: hierarchical phrase index
- What’s in this collection?
- Is it any good?
- What coverage for topic X?
- My query returned too much/little, what now?
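As a rough illustration of how a phrase hierarchy can help answer questions like these, the Python sketch below counts word sequences and lists the longer phrases that extend a given phrase. It is a toy under stated assumptions, not the algorithm behind Greenstone's phrase index: the sample texts and the `phrase_counts` and `expansions` functions are hypothetical.

```python
from collections import Counter


def phrase_counts(texts, max_len=3):
    """Count every contiguous word sequence of up to max_len words."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def expansions(counts, phrase):
    """Return phrases one word longer than `phrase` that begin or end with it."""
    target = tuple(phrase.lower().split())
    n = len(target)
    found = {
        p: c for p, c in counts.items()
        if len(p) == n + 1 and (p[:n] == target or p[1:] == target)
    }
    # Most frequent first; ties broken alphabetically.
    return sorted(found.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample texts standing in for a collection.
texts = ["forest management plan", "tropical forest management", "forest fire"]
counts = phrase_counts(texts)
result = expansions(counts, "forest")
for phrase, count in result:
    print(" ".join(phrase), count)
```

Starting from a single word and repeatedly expanding the chosen phrase lets a reader narrow a topic step by step, which is one way to react when a query returns too much.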
12
Recent and proposed projects
- Making documents mobile: moving between large online collections and a PDA
- Text mining: extracting quality metadata from legacy documents
- User needs analysis: what sort of documents do a given set of users require, and how can the collection be managed?
- Visualization: making it easy to ‘see’ what’s in a collection, and supporting effective browsing
13
Recent and proposed projects
- Multi-language collections: tailoring a document collection interface and interaction mechanisms to the language of its users
- Alerting services: bringing potentially useful documents to the user’s attention, without overwhelming them
- Supporting unusual users: collections for the physically disabled, illiterate or semi-literate, children, …
- Audio and image collections: novel browsing and searching mechanisms
14
Recent and proposed projects
- Storage and searching: developed highly efficient techniques for storing, indexing, and searching text documents; implemented in Greenstone, but portable to other document management software
- Usability analysis: how easy is it to use your current document collection? How can access be improved?
- And a host of wacky and cool things: collaging document collections, music retrieval systems, ‘aerial’ views of documents, …