The Natural History Museum Speaker: Charles Hussey Science Data Co-ordinator Department of Information and Library Systems

1 The Natural History Museum http://www.nhm.ac.uk
Speaker: Charles Hussey, Science Data Co-ordinator, Department of Information and Library Systems, c.hussey@nhm.ac.uk
© The Trustees of The Natural History Museum, 2002

2 Data Access - challenges and opportunities (Personal view of what is achievable)
Move towards networks connecting distributed sources
Two components to this presentation:
1. Start by drawing upon work for European Natural History Specimen Information Network
2. Then look at some of the approaches we have taken within The NHM

3 Acknowledgements
Nicolas Bailly, MNHN Paris, ENHSIN
David Gee, originator of DSML
Dilshat Hewzullah, NHM, DSML & Querying distributed databases
Anne Hume, NHM, Online databases and DSML
Andrew Jones, University of Cardiff, SPICE for Species 2000
Mike Lowndes, NHM, Museum Information Locator System
Rachel Perkins, NHM, Collections Level Descriptions
Mike Sadka, NHM, Fast-track programme
Darrell Siebert, NHM, Fish Collection Database
Chris Sleep, NHM, DSML
Neil Thomson, NHM, BioCASE

4 Nature of Data: What do we have to deal with?
First Challenge: Integrating disparate sources
NHM Survey in 2000: 87 institutions responded; 33 different products; 40% using bespoke solutions; 5 using spreadsheets
BioCISE Survey in 1998/99: 292 institutions responded; 60 different products; 75% using bespoke solutions; only 8% providing web access to unit level data

5 Nature of Data
First Challenge: Integrating disparate sources
Do data providers have the means to:
1. Implement and maintain a local Internet Server providing 24-hour a day access?
2. Compile metadata (collections level or unit level)?
3. Supply additional data (such as resolving localities or providing elements of higher taxonomy)?
4. Maintain quality of datasets?
5. Construct views of their data or implement wrappers?
6. Handle version control?

6 Nature of Data
Second Challenge: Comparing like with like
1. Authorities for names
2. Personal names
3. Geographic co-ordinates
4. Place names
5. Language and spelling
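The "comparing like with like" problem for personal names, place names and spelling can be illustrated with a small normalisation sketch. This is not taken from the presentation: the helper names `fold` and `person_key` are hypothetical, and the rules (accent folding, reducing forenames to initials) are assumptions chosen only to show the idea of building a comparison key before matching records from different providers.

```python
import re
import unicodedata


def fold(text):
    """Lower-case, strip accents and collapse punctuation/whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", unaccented.lower()).strip()


def person_key(name):
    """Reduce 'Surname, Forenames' or 'Forenames Surname' to (surname, initials)."""
    if "," in name:
        surname, rest = name.split(",", 1)
    else:
        parts = name.split()
        if not parts:
            return ("", "")
        surname, rest = parts[-1], " ".join(parts[:-1])
    initials = "".join(word[0] for word in fold(rest).split())
    return (fold(surname), initials)


# Three spellings of the same person map to one key.
same_person = {person_key(n) for n in ["Hussey, C.", "C. Hussey", "Charles Hussey"]}
# Two spellings of the same place name map to one string.
same_place = fold("São  Tomé") == fold("SAO TOME")
```

Keys like these only suggest candidate matches; two different people can share a surname and initial, so a real system would still need authority files or manual review.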

7 Architectures
1. Single client/server database used by all providers and users
2. Central summary system
3. Central Gateway to distributed databases
4. Peer-to-peer databases
5. Web directory pointing to data sources

8 Architectures
1. Single client/server database used by all providers and users
Single database, subscribers have local client
Allows detailed and complex interaction with data
Example: NHM Palaeontology Collections Management System
Example: Packages for Observers – Recorder 2000, MapMate

9 Architectures
2. Central summary system
Contributors maintain their own systems and post copies of data to centrally maintained database
Example: NBN Species Dictionary

10 Architectures
3. Central Gateway to distributed databases
No central database, but “Common Access System” may store metadata
Example: Species 2000
Example: Biodiversity on the Web
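The gateway architecture (no central database, with metadata held by a common access system) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Provider` and `Gateway` classes, the `groups` metadata field and the in-memory `stub_search` function are all hypothetical, and the stubs stand in for remote queries that a real gateway would send over the network. It does not describe how Species 2000 or Biodiversity on the Web are implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Provider:
    name: str
    groups: List[str]  # metadata held by the gateway, not the data itself
    search: Callable[[str], List[Dict[str, str]]]  # stands in for a remote query


class Gateway:
    def __init__(self) -> None:
        self.providers: List[Provider] = []

    def register(self, provider: Provider) -> None:
        self.providers.append(provider)

    def query(self, group: str, name: str) -> List[Dict[str, str]]:
        """Forward the query to relevant providers and merge their answers."""
        results: List[Dict[str, str]] = []
        for provider in self.providers:
            if group not in provider.groups:
                continue  # metadata says this source is not relevant
            try:
                hits = provider.search(name)
            except Exception:
                continue  # one unavailable source should not fail the whole query
            results.extend({"source": provider.name, **hit} for hit in hits)
        return results


def stub_search(records):
    """Build an in-memory search function that imitates a remote database."""
    def search(name):
        return [r for r in records if name.lower() in r["name"].lower()]
    return search


gateway = Gateway()
gateway.register(Provider("museum_a", ["fish"], stub_search([{"name": "Salmo trutta"}])))
gateway.register(Provider("museum_b", ["fish", "plants"],
                          stub_search([{"name": "Salmo salar"}, {"name": "Quercus robur"}])))
hits = gateway.query("fish", "salmo")
```

The gateway stores only the registry, so each provider keeps control of its own data; the cost is that every query depends on the providers being online, which is the 24-hour access question raised earlier.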

11 Biodiversity on the Web: Selection of Searchable Databases

12 Architectures
4. Peer-to-peer databases
Multiple Z39.50 servers and clients
Example: Species Analyst
Example: AHDS

13 Architectures
5. Web directory pointing to data sources
Essentially, a portal
Example: BIODIV

14 Other Issues
Scalability
Sustainability
Access
Quality Control
Terminology Control
Indicators for Quality
“Gaps” in data:
Still parts of collection not yet databased
Collection not suitable for databasing at unit level
Inadequate data dictionary
Data not available for a specimen
Data needs interpretation

15 A Case in Point: Wrapping a dataset for ENHSIN Pilot
Copy table from Access to SQL Server
Restructure table to add “new” fields
Perform conversions:
Place = Waterbody + Locality (verbatim) + Site Ref.
Split Collection date to DAY, MONTH, YEAR
Convert Lat & Long to decimal degrees
Convert Altitude to metres and deal with altitude ranges
Shape = Material + "(" + Preservation Method + ")"
Collector = Collector Surname + Initials + Title
Determiner = Determiner Surname + Initials + Title
Populate blank fields with static data by creating view (e.g. for Kingdom, Collection Name, Contact Info.)
Delete fields not required after conversion
Rename fields to match ENHSIN element names
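The conversions on this slide could be sketched per record as follows. The slide describes work done in Access and SQL Server; this Python version is a swapped-in illustration, not the method used for the pilot. All field names (source and target), the separators, the altitude text format and the static values are assumptions, and the static data is merged in code here rather than through a database view.

```python
import re
from datetime import date

FEET_TO_METRES = 0.3048
# Hypothetical static values; the slide fills these through a view.
STATIC_FIELDS = {"Kingdom": "Animalia", "CollectionName": "Fish Collection"}


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


def altitude_to_metres(text):
    """Parse '350 m', '1200 ft' or a range like '100-200 ft' into (min_m, max_m)."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)(?:\s*-\s*(\d+(?:\.\d+)?))?\s*(m|ft)\s*", text)
    if match is None:
        return None  # leave unparseable values for manual review
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    factor = FEET_TO_METRES if match.group(3) == "ft" else 1.0
    return (round(low * factor, 1), round(high * factor, 1))


def join_nonblank(parts, sep=", "):
    """Concatenate the non-empty parts, skipping blanks."""
    return sep.join(p.strip() for p in parts if p and p.strip())


def wrap_record(src):
    """Map one source row (a dict) to the converted, renamed fields."""
    collected = src["CollectionDate"]
    altitude = altitude_to_metres(src["Altitude"])
    record = {
        "Place": join_nonblank([src["Waterbody"], src["Locality"], src["SiteRef"]]),
        "Day": collected.day, "Month": collected.month, "Year": collected.year,
        "DecimalLatitude": dms_to_decimal(*src["Latitude"]),
        "DecimalLongitude": dms_to_decimal(*src["Longitude"]),
        "AltitudeMinM": altitude[0] if altitude else None,
        "AltitudeMaxM": altitude[1] if altitude else None,
        # Material + "(" + Preservation Method + ")", with a space for readability
        "Shape": f'{src["Material"]} ({src["PreservationMethod"]})',
        "Collector": join_nonblank(
            [src["CollectorSurname"], src["CollectorInitials"], src["CollectorTitle"]]),
    }
    record.update(STATIC_FIELDS)
    return record


# Invented example row, for illustration only.
example = wrap_record({
    "Waterbody": "River Thames", "Locality": "Kew Bridge", "SiteRef": "TQ1977",
    "CollectionDate": date(1998, 7, 14),
    "Latitude": (51, 29, 46, "N"), "Longitude": (0, 10, 35, "W"),
    "Altitude": "100-200 ft",
    "Material": "Whole specimen", "PreservationMethod": "Alcohol",
    "CollectorSurname": "Smith", "CollectorInitials": "J.", "CollectorTitle": "Dr",
})
```

Only the target fields are returned, which covers the "delete fields not required" and "rename fields" steps in one pass; the Determiner field would follow the same pattern as Collector.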

16 NHM Initiatives
Imaging of Primary Sources
Zoology Accession Ledgers
Entomology Card indexes (VIADOCS project)
Rapid Data Entry
Fish Collection
Botany Pilot
Collections Level Description
Darwin Centre
Entomology Index to Collections
Integrated Access
Data Locator

28 Links
ENHSIN: http://www.nhm.ac.uk/science/rco/enhsin/index.html
SPICE Project: http://www.systematics.reading.ac.uk/spice
Biodiversity on the Web: http://www.biodiversity.org.uk/ibs/
Species Analyst: http://habanero.nhm.ukans.edu
NBN Species Dictionary: http://yaw.nhm.ac.uk/nhm/
AHDS Gateway: http://prospero.ahds.ac.uk:8080/ahds_live/
BIODIV: http://www.br.fgov.be/biodiv/
NHM Collection Level Descriptions: http://www.nhm.ac.uk/cld/index.shtml
NHM Data Locator: http://internt.nhm.ac.uk/cgi-bin/locator/
Online databases at NHM: http://www.nhm.ac.uk/science/projects.html

