II Course on GBIF Node Management, Arusha, Tanzania, 31st October and 1st November 2008
Isabel CALABUIG, Node Manager, Danish Biodiversity Information Facility
Engaging biodiversity data holders in GBIF
Engaging Biodiversity Data Holders
INTRODUCTION
The Danish node from a lessons-learned perspective, with special emphasis on our work to engage biodiversity data holders.
General topics that arise when assisting data holders in making their data freely available for search on the internet.
SUMMARY
1. Introduction to DanBIF as an inspiration for upcoming / developing nodes.
2. Open Access to Biodiversity Data.
3. Sensitive data.
4. Intellectual property rights: how to address them.
5. Benefits for data holders from digitisation and publication of data.
6. Social strategies.
7. Citation: current and future approach.
8. Conclusions.
FACTS
Danish Biodiversity Information Facility: Danish national node of GBIF July
Funding: three-year co-financing contract
– The Faculty of Science, University of Copenhagen
– The Danish Natural Science Research Council
Common national coverage, but:
Host: Natural History Museum of Denmark (Zoological Museum), University of Copenhagen
Organisational structure (as inspiration for other upcoming nodes):
Board
Secretariat
– Node manager, Biologist Isabel Calabuig
– IT Manager, Engineer Mihail Carausu
– Scientific Communications Officer, Biologist Endsleff
– Data manager, Biologist XXX (2 days a week)
Official members of the network (growing)
– e.g. research institutions; currently 45 institutions/companies/organisations = the data holders & data users
Growing community of other data holders and users, and other interested people; currently approx. 160 people
After 7 years: this seems an effective structure for data service, interaction / consultancy & expansion!
OPEN ACCESS Official, international statements exist to support the work of the node:
OPEN ACCESS
“Bits of Power: Issues in Global Access to Scientific Data”, USA National Research Council, 1997:
“The value of data lies in their use. Full and open access to scientific data should be adopted as the international norm for the exchange of scientific data derived from publicly funded research.”
OPEN ACCESS
“Science, Technology and Innovation for the XXI century”, OECD Committee for Science and Technology Policy meeting at Ministerial Level
Final Communiqué:
“Recognising that an optimum international exchange of data, information and knowledge contributes decisively to the advancement of scientific research and innovation”
“Recognising that open access to, and unrestricted use of, data promotes scientific progress and facilitates the training of researchers”
“Recognising that open access will maximise the value derived from public investments in data collection efforts”
OPEN ACCESS
“Science, Technology and Innovation for the XXI century”, OECD Committee for Science and Technology Policy meeting at Ministerial Level (continued)
DECLARE THEIR COMMITMENT TO: “Work towards the establishment of access regimes for digital research data from public funding in accordance with the following objectives and principles: Openness: balancing the interests of open access to data to increase the quality and efficiency of research and innovation with the need for restriction of access in some instances to protect social, scientific and economic interests.”
INVITE THE OECD: “To develop a set of OECD guidelines based on commonly agreed principles to facilitate optimal cost-effective access to digital research data from public funding, to be endorsed by the OECD Council at a later stage.”
OPEN ACCESS
GBIF Governing Board
OPEN ACCESS
Decision VIII/11 of the Conference of the Parties to the Convention on Biological Diversity, on Scientific and Technical Cooperation and the Clearing-House Mechanism:
“Invites Parties and other Governments, as appropriate, to provide free and open access to all past, present and future public-good research results, assessments, maps and databases on biodiversity, in accordance with national and international legislation”
OPEN ACCESS
“Declaration on Open Access”, Scientific Council of the European Research Council:
“Stresses the attractiveness of policies mandating the public availability of research results – in open access repositories – reasonably soon (ideally, 6 months, and in any case no later than 12 months) after publication.”
“The ERC Scientific Council hopes that research funders across Europe will join forces in establishing common open-access rules and in building European open access repositories that will help make these rules operational.”
Engaging the data holders
When communicating with the data holders of your country, there are:
– Issues to be aware of and to agree how to deal with: sensitive data; IPR
– Benefits to emphasise regarding digitisation and data sharing in GBIF: cleaning & maintenance of data; exposure & usefulness of data
SENSITIVE DATA
Sensitive data within GBIF are:
Permanently sensitive
– The location of rare, endangered, showy or fragile taxa.
– Commercially valuable taxa.
– Specific information about the collectors of specimens/data.
Temporarily sensitive
– Data subject to ongoing research or awaiting publication.
– Incomplete or unchecked data.
Approximately 10% of the data is considered to be sensitive.
SENSITIVE DATA: RESOURCES AVAILABLE
– Survey on users holding sensitive data
– Report on dealing with Sensitive Primary Species Occurrence Data
– Guide to Best Practices for Generalising Primary Species-Occurrence Data: Chapman, A. D. and O. Grafton, 2008, Guide to Best Practices for Generalising Primary Species-Occurrence Data, Copenhagen: Global Biodiversity Information Facility, 27 pp, ISBN:
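One common way to publish a sensitive record without revealing the exact locality is to generalise its coordinates. The sketch below is only an illustration, not the procedure prescribed by the guide cited above: the function name, the one-decimal default and the sample record (with invented coordinates) are assumptions. The field names follow Darwin Core terms.

```python
def generalise_occurrence(record, decimals=1):
    """Return a copy of an occurrence record with coarsened coordinates.

    Hypothetical helper: rounding to 1 decimal place hides the locality
    to within roughly 11 km in latitude.
    """
    generalised = dict(record)  # leave the original record untouched
    for term in ("decimalLatitude", "decimalLongitude"):
        generalised[term] = round(float(record[term]), decimals)
    # Document what was done so that data users know the precision is reduced.
    generalised["dataGeneralizations"] = (
        f"Coordinates rounded to {decimals} decimal place(s)"
    )
    return generalised


# Illustrative record; the coordinates are made up for the example.
record = {
    "scientificName": "Sicista betulina",
    "decimalLatitude": 55.6761,
    "decimalLongitude": 12.5683,
}
public_record = generalise_occurrence(record)
print(public_record["decimalLatitude"], public_record["decimalLongitude"])
```

The full-precision record stays with the data holder; only the generalised copy would be served to the network.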
IPR
GBIF principles regarding Intellectual Property Rights (extracted from the Memorandum of Understanding of GBIF for , Paragraph 8):
1. All users should have equal access to data in databases affiliated with or developed by GBIF, in a free and open way.
2. GBIF should not assert any proprietary rights to the data published through GBIF. GBIF may, though, claim appropriate Intellectual Property Rights available over any tool developed by GBIF. Nevertheless, GBIF should seek to promote non-exclusive transfer to research institutions of such informatics technology.
3. GBIF should respect conditions set by data providers that affiliate their databases to GBIF.
4. GBIF should seek to ensure that the source of data is acknowledged and should request that such attribution is maintained in subsequent use of the data.
5. Nothing should restrict the right of owners of databases affiliated with GBIF to block access to any data.
IPR
GBIF Data Sharing & Data Use Agreements
You may also want to apply:
– Conservation Commons
– Creative Commons: provides a legal document – they are authorised...
IPR example (legal notice of the GBIF Spanish Node web site)
Copyright Notice
The contents of this World Wide Web server and under the GBIF.ES domain are under a Creative Commons license, as stated in the following page: ncsa/ 2.5/es/deed.en
This license may be summarized in the following terms: You are free to copy, distribute and perform the work, or to make any derivative works, under the following conditions:
1. You must attribute the work in the manner specified by the author or licensor.
2. You may not use this work for commercial purposes.
3. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
The Coordination Unit of GBIF Spanish Node is not responsible for the accuracy, reliability, or completeness of the data shown on its web site. No responsibility will be accepted for uses or misuses of the data.
How to cite this resource
Usage of the data hosted in this web server implies in all cases the explicit recognition and indication of the source, as follows: Coordination Unit of GBIF Spanish Node (Unidad de Coordinación de GBIF en España), [Resource or web page consulted], [Date of request].
An example of this kind of citation could be: Coordination Unit of GBIF Spanish Node (Unidad de Coordinación de GBIF en España), 'Report on Natural History Collections in Spain', /03/08.
To cite data from the GBIF Network (like those obtained from the Coordination Unit Data Server), check the instructions provided in this page.
GBIF ES is the Spanish node of the Global Biodiversity Information Facility, sponsored by the Ministry of Education and Science and managed by the Spanish High Council of Scientific Research. Some rights reserved.
IPR
IPR Experts meeting, Madrid, 2004
The GBIF Secretariat organised an experts meeting in Madrid in March 2004 to discuss Intellectual Property Rights issues regarding biodiversity data sharing and the role of GBIF.
All the documents related to the meeting, including the presentations shown, are available in the CIRCA system:
– tings/biodiversity_databases&vm=detailed&sb=Title
BENEFITS & SOCIAL STRATEGIES
This session:
– The benefits of joining as a data provider
– The strategies and tools to illustrate and communicate these benefits to the data holder
Illustrated by the activities carried out by DanBIF, hopefully as useful inspiration for your future work as nodes in engaging the data holders.
Immediate advantages to the data holder
Better management of the physical specimens as well, if held in a natural history (NH) collection: know what you have.
Better management of data, once it is in digital form and properly organised:
– Data enriched through cleaning, intelligent indexing and comparison with other types of data, e.g. resolving incomplete years (19?; 199?) and correcting the formatting of georeferences.
– Data is securely catalogued, managed, and maintained on a provider server.
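The two cleaning examples mentioned on this slide (incomplete years and georeference formatting) can be sketched as follows. This is a minimal illustration under stated assumptions, not a DanBIF or GBIF tool: the function names are hypothetical, and a trailing "?" is assumed to stand for the unknown final digits of a four-digit year.

```python
def year_range(text):
    """Turn a possibly incomplete label year into a (first, last) range.

    Assumption: trailing '?' marks unknown final digits, so '199?' means
    1990-1999 and '19?' means 1900-1999. Returns None if unparseable.
    """
    digits = text.strip().rstrip("?")
    if not digits.isdigit() or len(digits) > 4:
        return None
    missing = 4 - len(digits)
    first = int(digits) * 10 ** missing
    return (first, first + 10 ** missing - 1)


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -value if hemisphere.upper() in ("S", "W") else value


print(year_range("199?"))   # (1990, 1999)
print(year_range("19?"))    # (1900, 1999)
print(year_range("1987"))   # (1987, 1987)
print(round(dms_to_decimal(55, 40, 34, "N"), 4))
```

Keeping the uncertainty as an explicit range, rather than guessing a single year, lets the data holder review flagged records later.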
Highlight to data holders: their contribution = their gain
Their biodiversity data in the global context
– Contribution to the global community
– Immortalisation of their data
– Use of data in monitoring, conservation & management, e.g. land use
What can they learn from you?
– Digitisation of collections
– Georeferencing
– Visualisation of data
– Database building and the Darwin Core standard
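When introducing data holders to the Darwin Core standard, a small worked example often helps. The sketch below writes records to CSV using Darwin Core term names as column headers. The choice of terms, the helper name and the sample values are assumptions made for illustration; a real export would follow whatever the publishing tool requires.

```python
import csv
import io

# A small, illustrative subset of Darwin Core terms used as column headers.
DWC_TERMS = [
    "occurrenceID", "basisOfRecord", "institutionCode", "catalogNumber",
    "scientificName", "eventDate", "country",
    "decimalLatitude", "decimalLongitude",
]


def to_darwin_core_csv(rows):
    """Serialise dict records to CSV text with Darwin Core headers.

    Keys that are not in DWC_TERMS are ignored; missing terms are left empty.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_TERMS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample record; "localShelf" is a local field that is not exported.
rows = [{
    "occurrenceID": "example:0001",
    "basisOfRecord": "PreservedSpecimen",
    "institutionCode": "EXAMPLE",
    "catalogNumber": "0001",
    "scientificName": "Fagus sylvatica",
    "eventDate": "1987-06-14",
    "country": "Denmark",
    "localShelf": "B12",
}]
csv_text = to_darwin_core_csv(rows)
print(csv_text)
```

Mapping local column names onto shared terms like these is usually the main step a data holder has to take before their database can be served through a provider.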
BENEFITS to highlight
Feedback on, and continuous updating of, their data:
– Taxonomic revisions
– Geographic information completed and updated, etc.
– Find specimen data held by other providers!
– Create research networks!
– Get responses from other researchers on their data, and vice versa!
Recognition for their work: CITATION (more later)
Better possibilities to obtain funding and ensure the sustainability of their projects and NH collections.
BENEFITS
Better possibilities to obtain funding and ensure the sustainability of their projects and NH collections: an example in Spain.
Diagram elements:
– Herbaria association
– GBIF seed money
– Government digitisation grants (€)
– Digital data in the GBIF Network
– Data available for: decision-making; research; addressing the country's international obligations…
– Grants for scientific infrastructures with demonstrated public benefit (+€)
– Staff & better management
This whole process was catalysed by the coordination unit of the Spanish GBIF Node.
Data and information effectively available
Much data and information has been produced but is not easy to find, access, and use; in other words, it is not effectively available! The gigantic task of mobilising billions of data records still lies ahead.
Sources: biological collections; observations; reports; grey literature; databases; geography; scientific publications.
Make the data holders proud of being part of this endeavour!
SOCIAL STRATEGIES
There are many types of data holders!
Customised national/regional/thematic approach: use their language, know their priorities, philosophy...
Examples of datasets mediated by DanBIF
– Natural History Museum of Denmark (309,710)
– Aarhus University Herbarium (120,728)
– Danish Mycological Society (80,602)
Examples of larger datasets on the way
– Danish Ornithological Society (5,000,000)
– African Vertebrates (several museums) (500,000)
– The Danish Nature and Environment Database (6,000,000+)
SOCIAL STRATEGIES
Importance of community building:
– working as a community helps to address common problems
– promotes future collaboration
– makes the exchange of information easier, etc.
Help the data holders & users with access to and use of the global network of biodiversity knowledge and data.
– Create a contact network
– Create a communication portal, maybe like
Enhance biodiversity research
International conferences
– Latest conference (Århus, April 5-6, 2008): Biodiversity Informatics and Climate Change Impacts on Life
– Next conference (2009/2010): Macroecology
Suggested node activities
Provide IT consultancy for institutions (incl. basic and applied research)
– Digitisation of natural history collections and other biodiversity data
– Database design and web implementation
Create a catalogue
– Overview of natural history and other kinds of collections (both digitised and not digitised) and data sets from museums, research, monitoring & management, industry etc.
– Online questionnaire about natural history collections etc.: the Collections Metadata Inventory, also to be published and searchable
Find the data holders and their data sets (specimens and observations)
– Identification and technical preparation of biological databases for upload and search (or link) via the GBIF and DanBIF portals
Identify potential data providers: Who? What? Where? How?
[Diagram labels: Zool. Museum; Botanical Garden; Zoo Aalborg; National Environmental Research Institute; Royal Veterinary & Agricultural University; Herbarium Aarhus; Risoe; Botanical Museum; further holders still to be identified ("?"); Collections Metadata Inventory]
Find & engage data providers: Collections Metadata Repository
WHAT IS IT?
A catalogue of biological collections and databases in Denmark
– including information on the fraction of collection items that have so far been stored in electronic format
– and on what software was used for the purpose
FOR THE NODE SECRETARIAT:
Point of departure for providing assistance on serving specimen-level biodiversity data in Danish repositories
– for search, query and analysis through an internet interface
Find & engage data providers: Collections Metadata Repository
FOR THE CONTRIBUTOR:
– Local management / overview of collections & DBs (also useful for obtaining external funds)
– Follow the development of collections & databasing
– Searchable internet archive that lists and provides an overview of collections & DBs for the community of curators (incl. assistants), taxonomists, biodiversity researchers, etc.
Metadata Repository: what is it?
“Collections Metadata Inventory”
[Diagram labels: Experts List; Major Taxa List; Collections List; Search Facility; Online Questionnaire; Report Tool; Geography]
DanBIF Collections Metadata Inventory: examples of items in the form
– Description of collection: 21 items in total (listed below)
– Information on the institution that holds the collections and archives: 17 items in total
– Description of electronic archive based on collection: 16 items in total
Description of collection:
1a. Name of collection
1b. Establishment year
2. Principal organismal group(s)
3. Reference experts
4. Geographical origin of data
5. Keywords and/or special features of collection
6. Main topics for research uses
7. Major publications based on collection
8. Previous scientific use of collection
9. Collection type
10. Collection unit type
11. Principal countable units
12. Size of collection
13. Development status
14. Annual growth
15. Principal taxon scale
16. Number of taxa in total, based on scale indicated in question 15
17. Does the collection include nomenclatural types?
18. Data format
19. Units documented electronically
20. Taxa documented electronically
21. Describe in a few lines the general properties of the collection
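For nodes planning a similar inventory, the form can be thought of as a simple record structure. The sketch below models a handful of the questionnaire items and derives the digitised fraction that the repository reports. The class, its field names and the sample figures are hypothetical and do not reflect the actual DanBIF database schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionEntry:
    """A few items from the 'Description of collection' part of the form."""
    name: str                              # item 1a
    establishment_year: Optional[int]      # item 1b
    organism_groups: List[str]             # item 2
    collection_size: int                   # item 12, in countable units
    units_documented_electronically: int   # item 19
    keywords: List[str] = field(default_factory=list)  # item 5

    def digitised_fraction(self) -> float:
        """Share of collection units that already exist in electronic form."""
        if self.collection_size <= 0:
            return 0.0
        return self.units_documented_electronically / self.collection_size


# Invented example entry, not a real collection.
entry = CollectionEntry(
    name="Example University Herbarium",
    establishment_year=1950,
    organism_groups=["Vascular plants"],
    collection_size=200_000,
    units_documented_electronically=50_000,
)
print(f"{entry.digitised_fraction():.0%} digitised")
```

A figure like this gives the node secretariat a quick way to see where digitisation assistance is most needed.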
PR & dissemination: strategic areas
– Direct contact with data providers (investigative activity)
– Create a website / web portal
– Be listed on other related websites
– Newsletter and other information material
– News service via lists
– Information meetings / lectures / conferences / workshops
– Articles
– Exhibitions and other public communication activities, e.g. in the zoos in Aalborg ( ), Givskud (2007) and Odense (2008), in the Zoological Museum (2003-4), and at the Fungi Festival in the Copenhagen Botanical Garden (2005)
– Radio and TV
– Logo material
REMEMBER: All activities that inform about a node & GBIF make the node & GBIF better known!
DanBIF and GBIF activities
– Info on DanBIF secretariat, board, network, data providers
– Info about GBIF
– Links to biodiversity resources / networks / projects
– Databases online
– Newsletter
– Hosting and developing other portals, e.g. Natural History Guide; Centre for Invasive Species; Entomological Society; Project Birch Mouse
Build a web portal! GBIF provides a free tool.
GBIF & Data Citation
– Directs us to the sources of biodiversity data
– Recognition: credit to those sources
– Career visibility
– Opportunities for further funding
Current practice: Print record persistence -> citation metrics Electronic publication of datasets has not received the same bibliometric status. No standards in place
European initiatives:
The German National Library of Science and Technology (TIB) project: Publication and Citation of Scientific Primary Data.
– Deals with registration of datasets, assigns a Digital Object Identifier (DOI), and catalogues them in the TIB library.
UK CLADDIER Project (Citation, Location and Deposition in Discipline and Institutional Repositories).
– Links publications held in institutional repositories to the underlying data held in specialist repositories.
GBIF and citation:
GBIF MoU, Paragraph 3 (Objectives): ... GBIF’s goals include making biodiversity data universally available, while fully acknowledging the contribution made by those gathering and furnishing these data.
GBIF MoU, Paragraph 8: ... GBIF should seek to ensure that the source of data is acknowledged and should request that such attribution be maintained in any subsequent use of the data.
Current GBIF practice:
The GBIF Data Use Agreement suggests the following format to cite data retrieved from the GBIF network:
Biodiversity occurrence data provided by: xyz institution(s). (Accessed through GBIF Data Portal, YYYY-MM-DD)
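The suggested format is simple enough to generate automatically, for instance when a portal hands a download to a user. The sketch below fills in the template quoted above; the function name is hypothetical, and joining several institutions with commas is an assumption, since the agreement text only says "xyz institution(s)".

```python
from datetime import date


def format_gbif_citation(institutions, accessed):
    """Fill in the citation template suggested by the GBIF Data Use Agreement.

    Assumption: multiple providers are separated by commas.
    """
    providers = ", ".join(institutions)
    return (
        f"Biodiversity occurrence data provided by: {providers}. "
        f"(Accessed through GBIF Data Portal, {accessed.isoformat()})"
    )


citation = format_gbif_citation(
    ["Natural History Museum of Denmark", "Aarhus University Herbarium"],
    date(2008, 10, 31),
)
print(citation)
# Biodiversity occurrence data provided by: Natural History Museum of Denmark,
# Aarhus University Herbarium. (Accessed through GBIF Data Portal, 2008-10-31)
```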
Challenge: The current citation practice focuses on giving attribution to institutions but not to the scientists or those persons furnishing data. Based on “print citation format”
Data Citation Task Group
Task Group Objectives
– Identify the key issues and hurdles related to citation of biodiversity data served on-line.
– Formulate a set of recommendations, guidelines and advice for GBIF members and data users on how to cite biodiversity data served via the GBIF portal.
– Provide advice and recommendations on next steps and actions needed regarding citation of biodiversity data.
Several issues to deal with:
– Journal restrictions (fewer pages, fewer words)
– Citable and persistent identifiers (e.g. GUIDs) for a given “download event” vs. individual IDs?
– Persistence (data not changed and held in a trusted repository)
– Scientific verifiability (re-use of data)
– Move away from copyright of databases
Issues...
– How to cite: individual records, collectors/observers, database?
– An individual provider making several databases available vs. multiple providers made available through one internet entry point or “aggregator”...
– Particular records that are extracted from one database and inserted into another?
– How to capture and preserve the status of dynamic databases at a given time, and to provide/preserve versions?
– Need to develop standards and formats (and e.g. allow for inclusion in the Science Citation Index)
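One possible way to approach the "download event" and versioning issues listed above is to fingerprint the exact set of records that was downloaded, so that a citation can later be checked against the data. The sketch below is purely illustrative and is not a GBIF mechanism: the function name and the sample records are assumptions, and a content hash on its own is not a registered persistent identifier such as a DOI.

```python
import hashlib
import json


def snapshot_fingerprint(records):
    """Return a SHA-256 fingerprint of a set of downloaded records.

    Records are serialised with sorted keys and then sorted, so the result
    does not depend on field order or on the order of the records.
    """
    canonical = sorted(json.dumps(r, sort_keys=True) for r in records)
    payload = "\n".join(canonical).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Invented records standing in for one download from a dynamic database.
download = [
    {"occurrenceID": "example:0001", "scientificName": "Fagus sylvatica"},
    {"occurrenceID": "example:0002", "scientificName": "Quercus robur"},
]
fingerprint = snapshot_fingerprint(download)
print(fingerprint[:16])  # a short prefix could accompany the access date in a citation
```

If the underlying database changes, a later download of the "same" query yields a different fingerprint, which makes the need for preserved versions visible.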
LESSONS LEARNED – unforeseen “barriers”, new approaches
Listen to the potential providers
– Provide special services: it pays back ten-fold!
Make them comfortable with regard to
– Access to data / ownership
– Sensitive data; IPR
Busy researchers
– Keep it short
– Home-party approach per department / institution
– Patience, patience, patience, but KEEP STIRRING THE POT and EMPHASISING THE BENEFITS
Generation / IT / databasing barriers
– Invest time in a strategic approach towards different “categories” of people
– Paper version of the Metadata Inventory
LESSONS LEARNED – unforeseen “barriers”
[Diagram labels: Physical collections & observations; databasing / digitisation need (advice, resources in time and money, tools); internet portal; $ $ $]
Questions?
Isabel CALABUIG, Node Manager
Danish Biodiversity Information Facility - DANBIF
Natural History Museum of Denmark, Zoological Museum, University of Copenhagen
Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
Tel: Web:
My thanks to: Alberto G. Talaván, GBIF; Beatriz Torres, GBIF; Lotte Endsleff, DanBIF