Published by Pedro Howick. Modified over 10 years ago.
1
Brief Notes from Kew
Mark Jackson, Software Applications Manager
2
Focussing on...
- Herbarium digitisation
- electronic Plant Information Centre
3
Kew Herbarium
- Guesstimated holdings:
  - 7 million specimens
  - 250,000 types
- Less than 5% of specimens databased
- A variety of personal databases
4
Preparation for Digitisation
- Computerise transactions
- Agree and document policy and procedures
- Establish core fields (HISPID pending ABCD)
- Develop hardware and software infrastructure (e.g. catalogue database, mass storage)
5
Digitisation Strategy
- Curators to barcode, database and image types for loan
- Repatriation & research projects
  - to use infrastructure and core fields
  - data to be imported into Catalogue (eventually)
- Pursue digitisation projects
www.kew.org/data/repatbr
6
Specimen imaging
- Decision to try to match Cibachrome prints in terms of quality (e.g. suitable for many diagnostic purposes)
  - 600 dpi delivers 200 MB images
- Stored as uncompressed (but bzipped) TIFFs
- Acquisition of mass storage
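The 200 MB figure can be checked with a little arithmetic. The sketch below assumes an A3 scan area (297 x 420 mm, the HerbScan flatbed size) and 24-bit RGB colour; neither assumption is stated on the slide, and the class and method names are illustrative only.

```java
class ScanSizeEstimate {
    static final double MM_PER_INCH = 25.4;

    // Number of pixels along one edge of the given length at the given resolution.
    static long pixels(double lengthMm, int dpi) {
        return Math.round(lengthMm / MM_PER_INCH * dpi);
    }

    // Size in bytes of the raw, uncompressed raster (no TIFF header overhead).
    static long uncompressedBytes(double widthMm, double heightMm, int dpi, int bytesPerPixel) {
        return pixels(widthMm, dpi) * pixels(heightMm, dpi) * bytesPerPixel;
    }

    public static void main(String[] args) {
        // Assumed: A3 sheet, 600 dpi, 3 bytes per pixel (24-bit RGB).
        long bytes = uncompressedBytes(297, 420, 600, 3);
        System.out.println(bytes + " bytes");
    }
}
```

Under these assumptions the raster is 7016 x 9921 pixels, or 208,817,208 bytes, which is consistent with the roughly 200 MB master images quoted above.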
7
HerbScan
- A3 flatbed scanner, inverted
- Cradle for specimens
- Distributed throughout Herbarium
8
Pros and cons
Camera
- £30-40,000
- 200 MB images barely achievable
- 1 image per minute
- Fixed
- Versatile
HerbScan
- £7,500
- 200 MB images easily achievable
- 10 images per hour
- Some mobility
- Suited to flat items
200 MB master images (600 dpi scans), based on capturing the level of detail of Cibachromes.
9
[Diagram. Labels: HerbCat, Client, Image Server, Images, Metadata, image enquiries, HerbCat enquiries]
10
Focussing on...
- Herbarium digitisation
- electronic Plant Information Centre
11
ePIC
- UK government funding for delivery of services electronically
- Resource-discovery interface to multiple Kew data sources (not necessarily at Kew)
- Data sources are heterogeneous
- Simple interface overlaying other systems
[Diagram. Labels: Interface, Data source]
15
Architecture
- Interface (Java servlet)/JSPs
- Multi-threaded Java server
- Request queue
- Handlers:
  - one per data source
  - one for logging
  - one for spell-checking
- Configuration files (XML)
- Data sources
[Diagram. Flow labels: Requests, Results]
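One way to picture the queue-and-handlers arrangement is the minimal sketch below. It is not the ePIC code: `Handler`, `ListSource`, `dispatch` and the source names are hypothetical, in-memory lists stand in for real data sources, and the executor's internal work queue plays the role of the request queue. The logging and spell-checking handlers are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FanOutSketch {
    // Hypothetical contract: one handler per data source.
    interface Handler {
        String name();
        List<String> handle(String query);
    }

    // Stand-in data source backed by an in-memory list of records.
    static final class ListSource implements Handler {
        private final String name;
        private final List<String> records;

        ListSource(String name, List<String> records) {
            this.name = name;
            this.records = records;
        }

        public String name() { return name; }

        public List<String> handle(String query) {
            String needle = query.toLowerCase();
            List<String> hits = new ArrayList<>();
            for (String r : records) {
                if (r.toLowerCase().contains(needle)) hits.add(r);
            }
            return hits;
        }
    }

    // Queue the query for every handler, run them concurrently, and merge
    // the results in handler order, each hit prefixed with its source name.
    static List<String> dispatch(String query, List<? extends Handler> handlers) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, handlers.size()));
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (Handler h : handlers) {
                Callable<List<String>> task = () -> h.handle(query);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (int i = 0; i < pending.size(); i++) {
                for (String hit : pending.get(i).get()) {
                    results.add(handlers.get(i).name() + ": " + hit);
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each source sits behind the same small interface, heterogeneous back ends can be added without changing the dispatching code.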
16
Texts
- Web documents indexed using Lucene
- Flora Zambesiaca digitised and marked up with XML
- Experimentation with options for query and output via Java servlet
  - using XSL to output selections
  - using Lucene to index the XML
  - importing the XML into a database
- Other texts: jury still out, but Lucene route looks promising
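The "XSL to output selections" option might look roughly like the sketch below, which uses the JDK's built-in `javax.xml.transform` API. The element and attribute names (`flora`, `taxon`, `rank`, `name`) and the sample records are invented for illustration and are not the actual Flora Zambesiaca markup.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FloraXslSketch {
    // Hypothetical marked-up text; illustrative sample records only.
    static final String FLORA_XML =
        "<flora>"
        + "<taxon rank='genus'><name>Acacia</name></taxon>"
        + "<taxon rank='species'><name>Acacia nigrescens</name></taxon>"
        + "<taxon rank='species'><name>Brachystegia boehmii</name></taxon>"
        + "</flora>";

    // Stylesheet that emits, one per line, the names of taxa at the requested rank.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='rank'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='flora/taxon[@rank=$rank]'>"
        + "<xsl:value-of select='name'/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Apply the stylesheet to the sample XML, selecting taxa of the given rank.
    static String select(String rank) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            t.setParameter("rank", rank);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(FLORA_XML)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a servlet, `rank` would come from the request and the output would go to the response writer; the Lucene and database options would need their own libraries and are not shown.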
17
Feedback
- Email mechanisms
- Web usability testing/focus groups
- Logging
  - Quantitative success
    - levels of usage, patterns & trends
    - beware: crawlers, testing & development staff, harvesters
    - referring URLs, Google link: popularity of site
    - country, domain
  - Qualitative success
    - success of queries, esp. zero hits (spelling, common names, families)
    - performance & system monitoring
    - number of queries per session, return visits
    - results pages viewed
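Two of the logging points, discounting crawlers and watching for zero-hit queries, can be illustrated with the sketch below. The `Entry` fields and the user-agent substring heuristic are assumptions made for the example, not the log format or filter actually used.

```java
import java.util.List;
import java.util.Locale;

class QueryLogStats {
    // Hypothetical log record: who asked, what they asked, how many results came back.
    static final class Entry {
        final String userAgent;
        final String query;
        final int hits;

        Entry(String userAgent, String query, int hits) {
            this.userAgent = userAgent;
            this.query = query;
            this.hits = hits;
        }
    }

    // Crude heuristic (an assumption): treat agents naming themselves as bots as crawlers.
    static boolean looksLikeCrawler(String userAgent) {
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return ua.contains("bot") || ua.contains("crawler") || ua.contains("spider");
    }

    // Fraction of non-crawler queries that returned no results.
    static double zeroHitRate(List<Entry> entries) {
        int counted = 0;
        int zeroHits = 0;
        for (Entry e : entries) {
            if (looksLikeCrawler(e.userAgent)) continue;
            counted++;
            if (e.hits == 0) zeroHits++;
        }
        return counted == 0 ? 0.0 : (double) zeroHits / counted;
    }
}
```

Listing the zero-hit queries themselves, rather than only the rate, is what would reveal the misspellings, common names and family names mentioned above.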
18
World distribution of queries
19
Future
- More data sources, including texts and images
- Hierarchical browsing front-end based around revamped Brummitt Families & Genera with phylogenetic classification
- Looking forward to
  - using the GBIF Names Service…
  - links with DiGIR/BioCASE resources...
www.kew.org/epic