Published by Adele Moore. Modified over 9 years ago.
1
The Natural History Museum
http://www.nhm.ac.uk
Speaker: Charles Hussey, Science Data Co-ordinator
Department of Information and Library Systems
c.hussey@nhm.ac.uk
The Trustees of The Natural History Museum, 2002
2
Data Access - challenges and opportunities (Personal view of what is achievable)
Move towards networks connecting distributed sources
Two components to this presentation:
1. Start by drawing upon work for the European Natural History Specimen Information Network (ENHSIN)
2. Then look at some of the approaches we have taken within The NHM
3
Acknowledgements
Nicolas Bailly, MNHN Paris, ENHSIN
David Gee, originator of DSML
Dilshat Hewzullah, NHM, DSML & Querying distributed databases
Anne Hume, NHM, Online databases and DSML
Andrew Jones, University of Cardiff, SPICE for Species 2000
Mike Lowndes, NHM, Museum Information Locator System
Rachel Perkins, NHM, Collections Level Descriptions
Mike Sadka, NHM, Fast-track programme
Darrell Siebert, NHM, Fish Collection Database
Chris Sleep, NHM, DSML
Neil Thomson, NHM, BioCASE
4
Nature of Data
First Challenge: Integrating disparate sources
What do we have to deal with?
NHM Survey in 2000: 87 institutions responded; 33 different products; 40% using bespoke solutions; 5 using spreadsheets
BioCISE Survey in 1998/99: 292 institutions responded; 60 different products; 75% using bespoke solutions; only 8% providing web access to unit level data
5
Nature of Data
First Challenge: Integrating disparate sources
Do data providers have the means to:
1. Implement and maintain a local Internet server providing 24-hour-a-day access?
2. Compile metadata (collections level or unit level)?
3. Supply additional data (such as resolving localities or providing elements of higher taxonomy)?
4. Maintain the quality of datasets?
5. Construct views of their data or implement wrappers?
6. Handle version control?
6
Nature of Data
Second Challenge: Comparing like with like
1. Authorities for names
2. Personal names
3. Geographic co-ordinates
4. Place names
5. Language and spelling
7
Architectures
1. Single client/server database used by all providers and users
2. Central summary system
3. Central Gateway to distributed databases
4. Peer-to-peer databases
5. Web directory pointing to data sources
8
Architectures
1. Single client/server database used by all providers and users
Single database, subscribers have local client
Allows detailed and complex interaction with data
Example: NHM Palaeontology Collections Management System
Example: Packages for Observers: Recorder 2000, MapMate
9
Architectures
2. Central summary system
Contributors maintain their own systems and post copies of data to a centrally maintained database
Example: NBN Species Dictionary
10
Architectures
3. Central Gateway to distributed databases
No central database, but a “Common Access System” may store metadata
Example: Species 2000
Example: Biodiversity on the Web
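The gateway pattern on this slide can be illustrated with a minimal sketch. It assumes each distributed source is reachable through a simple search function; the `Gateway` class, its method names, and the metadata dictionaries are hypothetical and are not taken from Species 2000 or Biodiversity on the Web. The gateway keeps only descriptive metadata about each source, forwards every query to all registered sources, and merges the answers.

```python
from concurrent.futures import ThreadPoolExecutor


class Gateway:
    """Hypothetical central gateway: stores source metadata, never the data."""

    def __init__(self):
        # name -> (metadata dict, search function); no specimen records held here
        self.sources = {}

    def register(self, name, metadata, search):
        """Add a distributed source with its descriptive metadata."""
        self.sources[name] = (metadata, search)

    def query(self, term):
        """Send the query to every source and merge the hits."""

        def ask(item):
            name, (_metadata, search) = item
            try:
                return [(name, hit) for hit in search(term)]
            except Exception:
                # An unavailable source should not break the whole query.
                return []

        with ThreadPoolExecutor() as pool:
            batches = list(pool.map(ask, self.sources.items()))
        # Results come back grouped by source, in registration order.
        return [hit for batch in batches for hit in batch]
```

Treating a failed source as returning no hits is one possible design choice; a real gateway would more likely report which sources did not respond.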
11
Biodiversity on the Web Selection of Searchable Databases
12
Architectures
4. Peer-to-peer databases
Multiple Z39.50 servers and clients
Example: Species Analyst
Example: AHDS
13
Architectures
5. Web directory pointing to data sources
Essentially, a portal
Example: BIODIV
14
Other Issues
Scalability
Sustainability
Access
Quality Control
Terminology Control
“Gaps” in data: still parts of collection not yet databased; collection not suitable for databasing at unit level; inadequate data dictionary; data not available for a specimen; data needs interpretation
Indicators for Quality
15
A Case in Point: Wrapping a dataset for ENHSIN Pilot
Copy table from Access to SQL Server
Restructure table to add “new” fields
Perform conversions:
  Place = Waterbody + Locality (verbatim) + Site.Ref.
  Split Collection date to DAY, MONTH, YEAR
  Convert Lat & Long to decimal degrees
  Convert Altitude to metres and deal with altitude ranges
  Shape = Material + “(” + Preservation Method + “)”
  Collector = Collector Surname + Initials + Title
  Determiner = Determiner Surname + Initials + Title
Populate blank fields with static data by creating view (e.g. for Kingdom, Collection Name, Contact Info.)
Delete fields not required after conversion
Rename fields to match ENHSIN element names
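The conversion steps on this slide can be sketched in a few lines of Python. This is an illustration only: the slide gives no field formats, so the input field names (`CollectionDate`, `SiteRef`, `PreservationMethod` and so on), the DD/MM/YYYY date layout, altitudes recorded in feet, and the separators used when joining fields are all assumptions. The actual work was done in SQL Server, not Python.

```python
FEET_TO_METRES = 0.3048  # assumption: source altitudes are recorded in feet


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


def altitude_to_metres(text):
    """Turn '300' or a range such as '300-450' into a (min_m, max_m) pair.

    Negative altitudes are not handled in this sketch.
    """
    values = [float(part) * FEET_TO_METRES for part in text.split("-")]
    return min(values), max(values)


def join_nonblank(*parts, sep=" "):
    """Concatenate fields, skipping any that are empty or missing."""
    return sep.join(p.strip() for p in parts if p and p.strip())


def wrap_record(rec):
    """Apply the slide's conversions to one record (a dict of source fields)."""
    # Assumed date layout: DD/MM/YYYY
    day, month, year = (int(p) for p in rec["CollectionDate"].split("/"))
    low, high = altitude_to_metres(rec["Altitude"])
    return {
        "Place": join_nonblank(
            rec["Waterbody"], rec["Locality"], rec["SiteRef"], sep=", "
        ),
        "Day": day,
        "Month": month,
        "Year": year,
        "Latitude": dms_to_decimal(*rec["Lat"]),
        "Longitude": dms_to_decimal(*rec["Long"]),
        "AltitudeMinMetres": low,
        "AltitudeMaxMetres": high,
        # Shape = Material + "(" + Preservation Method + ")"
        "Shape": f'{rec["Material"]} ({rec["PreservationMethod"]})',
        "Collector": join_nonblank(
            rec["CollectorSurname"], rec["CollectorInitials"], rec["CollectorTitle"]
        ),
    }
```

The output keys here are placeholders; in the pilot the final step was to rename fields to the ENHSIN element names, which the slide does not list.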
16
NHM Initiatives
Imaging of Primary Sources: Zoology Accession Ledgers; Entomology Card indexes (VIADOCS project)
Rapid Data Entry: Fish Collection; Botany Pilot
Collections Level Description: Darwin Centre; Entomology Index to Collections
Integrated Access: Data Locator
28
Links
ENHSIN: http://www.nhm.ac.uk/science/rco/enhsin/index.html
SPICE Project: http://www.systematics.reading.ac.uk/spice
Biodiversity on the Web: http://www.biodiversity.org.uk/ibs/
Species Analyst: http://habanero.nhm.ukans.edu
NBN Species Dictionary: http://yaw.nhm.ac.uk/nhm/
AHDS Gateway: http://prospero.ahds.ac.uk:8080/ahds_live/
BIODIV: http://www.br.fgov.be/biodiv/
NHM Collection Level Descriptions: http://www.nhm.ac.uk/cld/index.shtml
NHM Data Locator: http://internt.nhm.ac.uk/cgi-bin/locator/
Online databases at NHM: http://www.nhm.ac.uk/science/projects.html