Document management (aka ‘digital libraries’) The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve.


1 Document management (aka ‘digital libraries’) The Greenstone Group: Professor Ian Witten (leader); David Bainbridge, Dave Nichols, S.J. Cunningham, Steve Jones, Te Taka Keegan, Annika Hinze

2 Our work includes…
• Document management
• Content management
• Metadata management
• Multimedia documents
• Alerting and event notification support
• OCR-ing services
• Document & collection visualization
• User needs analysis
• Text mining
• Automatic metadata extraction

3 Greenstone software
• ‘Digital library’ construction, use, and maintenance software
• Developed at Waikato (www.greenstone.org)
• Open source
• Widely used internationally (UNESCO, FAO, Texas A&M Uni, Kyrgyz Republic, …)
Digital library: a collection of digital objects (text, video, audio) along with methods for access and retrieval [user] and for selection, organisation, and maintenance [librarian]

4 Greenstone software features
• “Library” = set of separate collections; “Collection” = set of separate documents
• Multigigabyte collections
• Hierarchical document model
• Multimedia: picture, voice, music, video collections
• Multi-language documents: Unicode throughout
• Multi-language interfaces: French, Chinese, Arabic, …
• Web browser or CD-ROM
• Searching: full-text and fielded, ranked or boolean
• Browsing: hierarchical indexes created from metadata
• Metadata: Dublin Core + collection-specific extensions
• Plugins: different document types and metadata specifications
• Classifiers: create browsing indexes (collection editor decides)
• Compression techniques throughout: uses MG
• Distributed collections: coming soon, with CORBA
• Open-source software: free, extensible
Slide labels: Collections, Documents, Access, Importing, Distributing
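To make the "classifiers create browsing indexes from metadata" idea concrete, here is a minimal Python sketch. It is not Greenstone's implementation or API: the function names, the `dc.Title` / `dc.Subject` field names, the `/` separator, and the sample documents are all hypothetical, chosen only to show how a flat A-Z index and a hierarchical index could be derived from Dublin Core-style metadata.

```python
from collections import defaultdict

# Hypothetical documents with Dublin Core-style metadata (illustrative data only).
DOCS = [
    {"id": "d1", "dc.Title": "Apple Growing", "dc.Subject": "Agriculture/Fruit"},
    {"id": "d2", "dc.Title": "Banana Pests", "dc.Subject": "Agriculture/Fruit"},
    {"id": "d3", "dc.Title": "Clean Water", "dc.Subject": "Health/Sanitation"},
]


def az_classifier(docs, field="dc.Title"):
    """Group document ids under the first letter of a metadata field."""
    index = defaultdict(list)
    for doc in sorted(docs, key=lambda d: d[field].lower()):
        index[doc[field][0].upper()].append(doc["id"])
    return dict(index)


def hierarchy_classifier(docs, field="dc.Subject", sep="/"):
    """Build a nested dict from path-like metadata values.

    Each path component becomes a level of the browsing hierarchy;
    the ids of the documents filed at a node are kept under "_docs".
    """
    root = {}
    for doc in docs:
        node = root
        for part in doc[field].split(sep):
            node = node.setdefault(part, {})
        node.setdefault("_docs", []).append(doc["id"])
    return root


if __name__ == "__main__":
    print(az_classifier(DOCS))
    print(hierarchy_classifier(DOCS))
```

The collection editor's choice then amounts to picking which classifier functions to run and on which metadata fields.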

5 Greenstone supports: multilanguage documents

6 Greenstone supports: hierarchically structured documents A book

7 Greenstone supports: collection design, maintenance Designing a collection with the Gatherer

8 Greenstone supports: CD-ROM access
NGOs, e.g.
• UNESCO
• Global Help Project
• United Nations University
• World Health Organization
• Pan American Health Organization

9 Greenstone supports: a wide (and growing) set of file formats
• DOC, PDF, XLS, LaTeX, Refer, MARC, …
• Highly extensible through ‘plugin’ mechanism
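A plugin mechanism of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Greenstone's plugin interface: the class names, the `import_file` helper, and the `.txt` / `.ref` extensions are hypothetical, and the Refer parser handles only the `%A` (author) and `%T` (title) fields.

```python
from pathlib import Path


class Plugin:
    """Hypothetical base class: claims files by extension and builds a record."""

    extensions = ()

    def can_handle(self, path):
        return Path(path).suffix.lower() in self.extensions

    def process(self, path, data):
        raise NotImplementedError


class TextPlugin(Plugin):
    """Plain text: the first line is used as the title."""

    extensions = (".txt",)

    def process(self, path, data):
        text = data.decode("utf-8")
        lines = text.splitlines()
        title = lines[0].strip() if lines else Path(path).stem
        return {"source": str(path), "title": title, "text": text}


class ReferPlugin(Plugin):
    """Tiny subset of the Refer bibliographic format (%A author, %T title)."""

    extensions = (".ref",)  # assumed extension, for illustration only

    def process(self, path, data):
        record = {"source": str(path), "authors": []}
        for line in data.decode("utf-8").splitlines():
            if line.startswith("%A "):
                record["authors"].append(line[3:].strip())
            elif line.startswith("%T "):
                record["title"] = line[3:].strip()
        return record


PLUGINS = [TextPlugin(), ReferPlugin()]


def import_file(path, data, plugins=PLUGINS):
    """Hand the file to the first plugin that claims it; None if nobody does."""
    for plugin in plugins:
        if plugin.can_handle(path):
            return plugin.process(path, data)
    return None


if __name__ == "__main__":
    print(import_file("paper.ref", b"%A A. Author\n%T An Example Title\n"))
```

Supporting a new format then means adding one more `Plugin` subclass to the list, which is the sense in which such a design is extensible.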

10 Mobile document access
• Handheld information access
• Browsing methods for varying screen sizes
• Studies on search behaviour (on- and off-line)
• Support for non-text documents (FunkyZoom views of maps, images)

11 Browsing and exploration: hierarchical phrase index
• What’s in this collection?
• Is it any good?
• What coverage for topic X?
• My query returned too much/little, what now?
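One way to picture a hierarchical phrase index is as a table of word sequences that a reader can expand one word at a time. The Python sketch below is a simplified stand-in, not the algorithm used in Greenstone: it counts plain word n-grams, and the function names, the `max_len` and `min_count` parameters, and the sample texts are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample texts (illustrative data only).
TEXTS = [
    "forest management plan",
    "community forest management",
    "forest fire control",
]


def phrase_counts(texts, max_len=3):
    """Count every word sequence of length 1..max_len across the texts."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def expand(counts, phrase, min_count=2):
    """Phrases one word longer than `phrase` that start or end with it.

    Results are sorted by descending frequency, then alphabetically.
    """
    n = len(phrase)
    longer = []
    for cand, count in counts.items():
        if len(cand) != n + 1 or count < min_count:
            continue
        if cand[:n] == phrase or cand[1:] == phrase:
            longer.append((" ".join(cand), count))
    return sorted(longer, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    counts = phrase_counts(TEXTS)
    print(expand(counts, ("forest",)))
    print(expand(counts, ("forest", "management"), min_count=1))
```

Browsing starts from a single word and repeatedly calls `expand` on the chosen phrase, which gives a rough answer to "what coverage for topic X?" without running a query.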

12 Recent and proposed projects
• Making documents mobile: moving between large online collections and a PDA
• Text mining: extracting quality metadata from legacy documents
• User needs analysis: what sort of documents do a given set of users require, and how can the collection be managed?
• Visualization: making it easy to ‘see’ what’s in a collection, and supporting effective browsing

13 Recent and proposed projects
• Multi-language collections: tailoring a document collection interface and interaction mechanisms to the language of its users
• Alerting services: bringing potentially useful documents to the user’s attention, without overwhelming them
• Supporting unusual users: collections for the physically disabled, illiterate or semi-literate, children, …
• Audio and image collections: novel browsing and searching mechanisms

14 Recent and proposed projects
• Storage and searching: we have developed highly efficient techniques for storing, indexing, and searching text documents; these are implemented in Greenstone, but portable to other document management software
• Usability analysis: how easy is it to use your current document collection? How can access be improved?
• And a host of wacky and cool things: collaging document collections, music retrieval systems, ‘aerial’ views of documents, …
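For readers unfamiliar with text indexing, the storage-and-searching item can be illustrated with a textbook inverted index. This Python sketch is deliberately naive: it is uncompressed, ranks by raw term frequency, and is not the technique implemented in Greenstone. The function names and sample documents are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample collection (illustrative data only).
SAMPLE = {
    "a": "water pump repair",
    "b": "water water quality",
    "c": "pump design",
}


def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def boolean_and(index, query):
    """Boolean search: ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], {}))
    for term in terms[1:]:
        result &= set(index.get(term, {}))
    return result


def ranked(index, query):
    """Ranked search: documents ordered by summed term frequency."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    # Ties are broken by document id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    idx = build_index(SAMPLE)
    print(boolean_and(idx, "water pump"))
    print(ranked(idx, "water pump"))
```

The same index serves both query styles: boolean search intersects posting lists, while ranked search scores every document that matches at least one term.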

