Biodiversity Informatics Biodiversity informatics and the manipulation of biological information Jim Croft

Outline: ‘Biodiversity Informatics’; Australia’s Virtual Herbarium as a model of use and management of biodiversity knowledge; New ways of managing biological knowledge; Information management issues; Current trends and future directions in biodiversity knowledge management

Biodiversity Informatics Management of our knowledge of biodiversity using modern techniques of data and information management

Taxonomy of Database Interoperability: Multi-database systems: Non-federated; Federated (Loosely coupled: multiple schemas; Tightly coupled: unified schema). [ Autonomous ]. Sheth & Larson (1990)

Tightly Coupled: Central administration; Semantic consistency (schemas, authority files); Common technology; Difficult to implement; Proprietary solutions tolerated; Expensive

Loosely Coupled: Closer to reality; Independent management; Suited to scientific systems; Common publication syntax (export schema); Less functionality … Doable; Need open standards

Intermediate Coupling: Scientific independence. Common syntax & semantics for the exchange of information: import/export; HISPID, Darwin Core, TDWG/CODATA ABCD. Leverage existing open standards: participation in wider, more loosely coupled federations; simplicity; distribution of effort
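
The "common syntax & semantics" idea on this slide can be sketched as a field-mapping step at each custodian. This is only an illustration, not the AVH or HISPID implementation: the target names are Darwin Core terms, while the two local column layouts, the `to_exchange_record` helper and the sample values are hypothetical.

```python
# Darwin Core term names are real; the local column names and the sample
# values below are hypothetical, chosen only to show the mapping step.
FIELD_MAPS = {
    "herbarium_a": {
        "taxon": "scientificName",
        "coll_date": "eventDate",
        "collector": "recordedBy",
        "coll_no": "recordNumber",
        "loc": "locality",
    },
    "herbarium_b": {
        "species_name": "scientificName",
        "date_collected": "eventDate",
        "collected_by": "recordedBy",
        "field_number": "recordNumber",
        "site": "locality",
    },
}


def to_exchange_record(source, local_record):
    """Rename a local record's fields to the shared exchange terms.

    Fields without a mapping are not exported; they stay in the
    custodian's own database.
    """
    mapping = FIELD_MAPS[source]
    return {mapping[key]: value for key, value in local_record.items() if key in mapping}


record_a = to_exchange_record(
    "herbarium_a",
    {"taxon": "Acacia salicina", "collector": "A. Collector", "coll_no": "123", "shelf": "B-14"},
)
record_b = to_exchange_record(
    "herbarium_b",
    {"species_name": "Acacia salicina", "collected_by": "A. Collector", "field_number": "123"},
)
print(record_a == record_b)
```

Each herbarium keeps its own internal schema; only the export view has to agree, which is what keeps the coupling loose.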

Data Refinement (diagram labels): data; information; knowledge; action. Increasing refinement & utility of data. The real world; observations. Envir. decision making: conservation; restoration biology; resource mgmt; utilization. Policy & strategy: government; corporate; individual

Herbarium Specimens

Specimen Data Capture

Specimen Data: The core information is from herbarium specimens. Beyond taxonomy & names. Collections data: scientific name; collection date; collector name & number; location; soils; habitat (incl. topography); vegetation community; associated species

A Herbarium Database Structure

What do we want to know? What species does a plant belong to? What is its name? What other species is it related to? What does it look like? Where does it grow? Where might it grow? What other species grow with it? What species grow in a defined area? How did they get there?

What is a Virtual Herbarium? An on-line digital representation of a scientific collection of preserved plant specimens and botanical information

What is the AVH? Spread across Australian herbaria. Data distributed; resides with custodians. Each herbarium has a portal to receive requests and to deliver data. A common single-query AVH interface in each herbarium polls all herbaria. Major Australian Herbaria
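
A minimal sketch of the polling pattern described on this slide, assuming in-memory stand-ins for the herbarium portals. The `HerbariumPortal` class, the `query_all` function, the codes `H1`/`H2` and the sample records are all hypothetical; the actual AVH portal protocol is not given in the slides.

```python
from concurrent.futures import ThreadPoolExecutor


class HerbariumPortal:
    """Stand-in for one herbarium's portal; a real one would be a network service."""

    def __init__(self, code, records):
        self.code = code
        self._records = records  # the data stays with the custodian

    def search(self, scientific_name):
        # Tag each hit with its custodian so merged results stay attributable.
        return [
            dict(record, custodian=self.code)
            for record in self._records
            if record["scientificName"] == scientific_name
        ]


def query_all(portals, scientific_name):
    """Send the same query to every portal and merge the answers."""
    merged = []
    with ThreadPoolExecutor(max_workers=max(1, len(portals))) as pool:
        # pool.map yields results in the same order as the portals list.
        for hits in pool.map(lambda portal: portal.search(scientific_name), portals):
            merged.extend(hits)
    return merged


portals = [
    HerbariumPortal("H1", [{"scientificName": "Acacia salicina", "locality": "site 1"}]),
    HerbariumPortal("H2", [
        {"scientificName": "Acacia salicina", "locality": "site 2"},
        {"scientificName": "Platyzoma microphyllum", "locality": "site 3"},
    ]),
]
results = query_all(portals, "Acacia salicina")
print([hit["custodian"] for hit in results])
```

Because every herbarium exposes the same query interface, the single-query front end can run at any of them.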

AVH Partners: State Herbarium of South Australia; Queensland Herbarium; Australian National Herbarium; Northern Territory Herbarium; Tasmanian Herbarium; National Herbarium of Victoria; National Herbarium of New South Wales; Western Australian Herbarium; Australian Biological Resources Study. Industry Partner: KE Software

Why is there an AVH? Pressure on herbaria to work more efficiently; demand for access to larger amounts of data; demand to access data more quickly; demand to view data in different ways; pressure on herbaria to appear and to be more responsive to community needs

What is the AVH task? > 18,000 species of higher plants; > 64,000 available names; extensive synonymy (4 names per plant); 8 major government-funded herbaria; similar number of university herbaria; > 6,500,000 specimens in Aust. herbaria; data elements per specimen; several Kb per specimen (excl. images)
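
The figures on this slide can be checked with a little arithmetic. The species, name and specimen counts are the slide's own lower bounds; the 4 KB per specimen is an assumed stand-in for "several Kb", so the volume is an order-of-magnitude estimate only.

```python
SPECIES = 18_000          # from the slide (lower bound)
NAMES = 64_000            # from the slide (lower bound)
SPECIMENS = 6_500_000     # from the slide (lower bound)
ASSUMED_KB_PER_SPECIMEN = 4  # assumption: the slide only says "several Kb"

# Synonymy: average number of available names per species.
names_per_species = NAMES / SPECIES

# Text data volume, excluding images, in decimal gigabytes (1 GB = 1,000,000 KB).
total_gb = SPECIMENS * ASSUMED_KB_PER_SPECIMEN / 1_000_000

print(round(names_per_species, 1), total_gb)
```

The ratio of names to species comes out a little under four, consistent with the slide's "4 names per plant".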

Herbarium database status

The AVH Agreement: $10M over 5 years to database all major Australian herbarium collections. $10 million: $4 million Commonwealth; $4 million State/Territory; $2 million private. Initial focus on capture of herbarium specimen data. Ultimate aim: a complete flora information system

Australia’s Virtual Herbarium On-line access to herbarium specimen information and botanical knowledge

Australian Plant Name Index (APNI)

Acacia salicina

Research Potential: Plant distribution analysis. Pultenaea distribution classes in eastern Australia (map legend: Incurved; Recurved; ?)

Flora Information Systems: On-line systems; often regionally based; integrating plant names and synonyms, descriptive Flora treatments, illustrations, distributions, etc.

Botanical illustrations

National Plant Photograph Index: Search all records on-line; digital images available (‘best of class’); 35,000 images of Australian plants and vegetation

Type Images on demand: High resolution image of type specimen of Austrobaileya downloaded over the Internet from the Herbarium of the New York Botanical Garden

Flora & Revision Databases New ways of managing and delivering botanical information

A Flora in XML. Example in HTML: Platyzoma microphyllum R.Br., Prodr. 160 (1810) Gleichenia platyzoma F.Muell., Veg. Chatham.-Isl. 63 (1864). T: Facing Island, Qld, R.Brown Iter Austral. 102; lecto: BM. Illus.: S.B.Andrews… Rhizome short-creeping… Sporangia in zones in distal half of frond. Fig. 55 Widespread across northern Australia… Grows in sandy or swampy soils.... Map 135. W.A.: 14.4 km NW of Mt… Example in XML: Platyzoma microphyllum R.Br, Prodr Gleichenia platyzoma F.Muell. Veg. Chatham.-Isl T: Facing Island, Qld, … Illus.: S.B.Andrews… Rhizome short-creeping… Sporangia in zones in distal half of frond. Fig. 55 Widespread across northern Australia… Grows in sandy or swampy soils... Map 135. W.A.: 14.4 km NW of Mt…
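
The slide's markup did not survive in this transcript, so the following is only a sketch of what semantic tagging of the same treatment could look like. The element names (`treatment`, `acceptedName`, `synonym`, `description`, `distribution`, `habitat`) are hypothetical and are not the schema used in the presentation; the text values are taken from the slide, with its ellipses kept as-is.

```python
import xml.etree.ElementTree as ET

# Build the treatment with one element per kind of information, so that
# software can address "the habitat" or "the synonyms" directly.
treatment = ET.Element("treatment")
ET.SubElement(treatment, "acceptedName").text = "Platyzoma microphyllum R.Br., Prodr. 160 (1810)"
ET.SubElement(treatment, "synonym").text = "Gleichenia platyzoma F.Muell., Veg. Chatham.-Isl. 63 (1864)"
ET.SubElement(treatment, "description").text = "Rhizome short-creeping…"
ET.SubElement(treatment, "distribution").text = "Widespread across northern Australia…"
ET.SubElement(treatment, "habitat").text = "Grows in sandy or swampy soils..."

# Serialise, then parse again to show the parts can be pulled out by name.
xml_text = ET.tostring(treatment, encoding="unicode")
parsed = ET.fromstring(xml_text)

print(parsed.findtext("habitat"))
print([synonym.text for synonym in parsed.findall("synonym")])
```

In HTML the same paragraph would be tagged only for appearance (bold, italic), which is why the XML form suits databases and multiple outputs better.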

A Flora XML Schema fragment

A Flora database structure

A Flora database report

An old process of publication (diagram labels): W-P file; Editors; W-P file; Botanist; Publisher; C-R Copy; Book, etc.

A new process of publication (diagram labels): W-P file; Editors; W-P file; Botanist; Publisher; C-R Copy; Book, etc.; XML file; Database; XML file; Outputs

A future process of publication (diagram labels): Editors; Botanist; Publisher; C-R Copy; Book, etc.; XML file; Database; Outputs; Database; Outputs
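
One way to read the "XML file, Database, Outputs" path in these diagrams is single-source publishing: one structured record, several renderings. The sketch below assumes that reading. The element names and the two renderer functions are hypothetical, and the sample text is taken from the Platyzoma example earlier in the deck.

```python
import html
import xml.etree.ElementTree as ET

# One structured source record (hypothetical element names).
SOURCE_XML = (
    "<treatment>"
    "<acceptedName>Platyzoma microphyllum R.Br.</acceptedName>"
    "<habitat>Grows in sandy or swampy soils...</habitat>"
    "</treatment>"
)


def render_html(root):
    """Web output: escape the text and wrap it in presentation tags."""
    name = html.escape(root.findtext("acceptedName", default=""))
    habitat = html.escape(root.findtext("habitat", default=""))
    return f"<p><b>{name}</b> {habitat}</p>"


def render_text(root):
    """Plain-text output, e.g. for a checklist or a print pre-press step."""
    name = root.findtext("acceptedName", default="")
    habitat = root.findtext("habitat", default="")
    return f"{name}\n  {habitat}"


root = ET.fromstring(SOURCE_XML)
html_out = render_html(root)
text_out = render_text(root)
print(html_out)
print(text_out)
```

Corrections are then made once, in the source record, and every output is regenerated from it.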

Interactive Identification Using computers to identify and name plant species and display information about them
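
Interactive (multi-access) identification can be sketched as progressive filtering: the user records character states in any order, and the program keeps only the taxa that match. The taxa and the character matrix below are hypothetical placeholders, and `remaining_taxa` is an illustrative helper, not the software shown in the presentation.

```python
# Hypothetical character matrix: taxon -> {character: state}.
TAXA = {
    "Taxon A": {"leaf_margin": "incurved", "habit": "shrub"},
    "Taxon B": {"leaf_margin": "recurved", "habit": "shrub"},
    "Taxon C": {"leaf_margin": "recurved", "habit": "tree"},
}


def remaining_taxa(observations):
    """Return the taxa consistent with every character state observed so far."""
    return sorted(
        name
        for name, states in TAXA.items()
        if all(states.get(character) == state for character, state in observations.items())
    )


# Characters can be supplied in any order, unlike a printed dichotomous key.
print(remaining_taxa({"leaf_margin": "recurved"}))
print(remaining_taxa({"leaf_margin": "recurved", "habit": "tree"}))
```

With no observations every taxon remains, and each added observation can only narrow the list.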

Interactive Plant Identification

Current trends, future directions ?

Trends in Biodiversity Information Management: Nomenclatural → Taxonomic; Regional → Global; Text-based → Image-based; Taxon-based → Spatially-based; Individual effort → Partnerships; Single user → Multiuser; Standalone → Networked; Centralized → Distributed; Proprietary System → Open System; Idiosyncratic Design → Standard Architecture; Nonstandard data content → Standard data content; Conventional → Innovative; Developmental → Stable; Access charges → Freely available

Global Organization: Several parallel and complementary initiatives: Global Biodiversity Information Facility (GBIF); Taxonomic Databases Working Group (TDWG); Global Taxonomy Initiative (GTI); International Organization for Plant Information (IOPI); Species 2000; All Species Foundation (ALL)

Data Flow within GBIF Network (diagram labels): Service Metadata; Collection Node; Collection Nodes; GBIF Portal; Participant Node; Service Metadata; Participant Node; Service Metadata; Specimen Index Data; Detailed Specimen Data; Aggregated Data; Detailed Specimen Data; Aggregated Data; User Browser; HTML Data

Requirements for Interoperability Standards…

Standards for Interoperability of Biodiversity Databases: URL, UML, ABCD, URI, XHTML, HTTP, UDDI, XSLT, XPath, RDF, PNG, SVG, DOM, CSS, SAX, HISPID, ITF, BNF, Z39.50, WAIS, ASN.1, XML Schema, Dublin Core, RDFS, Z39.19, SOAP, CGI, RMI, Darwin Core, WSDL